Search results for: instruction cache

Number of results: 56814

2002
Ravi Bhargava, Juan Rubio, Lizy K. John

Performing multiple, accurate, low-latency predictions is crucial to improving instruction throughput in future wide-issue microprocessors. However, demands of wide-issue processing coupled with implementation challenges posed by high clock frequencies present obstacles to these prediction goals. This paper proposes the Traveling Speculation framework to accommodate predictions in a wide-issue ...

2016
Mikael Hirki, Zhonghong Ou, Kashif N. Khan, Jukka K. Nurminen, Tapio Niemi

It has been a common myth that x86-64 processors suffer in terms of energy efficiency because of their complex instruction set. In this paper, we aim to investigate whether this myth holds true, and determine the power consumption of the instruction decoders of an x86-64 processor. To that end, we design a set of microbenchmarks that specifically trigger the instruction decoders by exceeding th...

Journal: IEEE Trans. Computers 1999
Eric Rotenberg, Steve Bennett, James E. Smith

As the instruction issue width of superscalar processors increases, instruction fetch bandwidth requirements will also increase. It will eventually become necessary to fetch multiple basic blocks per clock cycle. Conventional instruction caches hinder this effort because long instruction sequences are not always in contiguous cache locations. Trace caches overcome this limitation by caching tra...
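The idea in this abstract, storing a dynamic instruction sequence that spans several non-contiguous basic blocks under a single lookup key, can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not the authors' design: the `BLOCKS` program, the `TraceCache` class, and the three-block limit are all hypothetical, and a real trace cache is a hardware structure with finite capacity and replacement.

```python
from typing import Dict, List, Tuple

# Hypothetical toy program: basic blocks keyed by start address.
# Each entry is (instruction addresses, taken target, fall-through target).
BLOCKS = {
    0:  ([0, 1, 2], 40, 3),
    3:  ([3, 4], 80, 5),
    40: ([40, 41], 80, 42),
    80: ([80, 81, 82], 0, 83),
}

TraceKey = Tuple[int, Tuple[bool, ...]]


class TraceCache:
    """Toy trace cache: one line per (start address, branch outcomes) pair."""

    def __init__(self, max_blocks: int = 3) -> None:
        self.max_blocks = max_blocks
        self.lines: Dict[TraceKey, List[int]] = {}

    def _build(self, start: int, outcomes: Tuple[bool, ...]) -> List[int]:
        # Walk the dynamic path block by block, as a conventional
        # instruction cache would have to do over several cycles.
        trace: List[int] = []
        pc = start
        for taken in outcomes:
            if pc not in BLOCKS:
                break  # path leaves the toy program
            insts, taken_target, fall_through = BLOCKS[pc]
            trace.extend(insts)
            pc = taken_target if taken else fall_through
        return trace

    def fetch(self, start: int, predicted: List[bool]) -> Tuple[List[int], bool]:
        """Return (instructions, hit flag) for a start address and predictions."""
        key = (start, tuple(predicted[: self.max_blocks]))
        if key in self.lines:
            return self.lines[key], True  # whole trace delivered at once
        trace = self._build(key[0], key[1])
        self.lines[key] = trace  # fill the line for later fetches
        return trace, False
```

On the first `fetch(0, [True, True, True])` the trace is assembled from blocks 0, 40, and 80 and reported as a miss; a repeated fetch with the same start address and predictions returns the same instructions as a hit.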

2002
Anand S. Rajan, Shiwen Hu, Juan Rubio

This paper studies the level 1 cache performance of Java programs by analyzing memory reference traces of the SPECjvm98 applications executed by the Latte Java Virtual Machine. We study in detail Java programs’ cache performance of different access types in three JVM phases, under two execution modes, using three cache configurations and two application data sets. We observe that the poor data ...

1999
Tomohiro Yoneda

As a benchmark circuit for timed asynchronous circuit verification, we have developed an abstracted version of TITAC 2 instruction cache sub-system and its formal specification. This document shows all the figures of the gate level sub-circuits which compose the abstracted instruction cache. A time Petri net model for the formal specification is also shown with the detailed explanation. The tex...

1997

Instruction prefetching can effectively reduce instruction cache misses, thus improving the performance. In this paper, we propose a prefetching scheme, which employs a branch predictor to run ahead of the execution unit and to prefetch potentially useful instructions. Branch prediction based (BP-based) prefetching has a separate small fetching unit, allowing it to compute and predict targets a...

1993
Andrew Naylor, Arthur Abnous

This paper describes the design and implementation of a very long instruction word (VLIW) microprocessor. The VIPER (VLIW integer processor) contains four pipelined functional units, and can achieve 0.25 cycle-per-instruction performance. The processor is capable of performing multiway branch operations, two load/store operations or up to four ALU operations in each clock cycle, with full regis...

1997
Toru Kisuki, Masaki Wakabayashi, Junji Yamamoto, Keisuke Inoue, Hideharu Amano

The shared cache structures and snoop cache structures for single-chip multiprocessors are evaluated and compared using an instruction level simulator. Simulation results show that 1-port large shared cache achieves the best performance if there is no delay penalty for arbitration and accessing the bus. However, if 1-clock delay is assumed for accessing the shared cache, a snoop cache...

1998
Greg Snider

Keywords: design space exploration, VLIW, systolic array, cache

This paper addresses the problem of automated design of a computer system for an embedded application. The computer system to be designed consists of a VLIW processor and/or a customized systolic array, along with a cache subsystem comprising a data cache, instruction cache and second-level unified cache. Several algorithms for "walking" the...

2000
Richard E. Ladner, Ray Fortna, Bao-Hoang Nguyen

An experimental comparison of cache aware and cache oblivious static search tree algorithms is presented. Both cache aware and cache oblivious algorithms outperform classic binary search on large data sets because of their better utilization of cache memory. Cache aware algorithms with implicit pointers perform best overall, but cache oblivious algorithms do almost as well and do not have to be...
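To make "implicit pointers" concrete, the sketch below stores a static search tree in an array in breadth-first order, so the children of the node at index `i` are found at `2*i` and `2*i + 1` by arithmetic rather than through stored pointers. This is an illustrative assumption, not necessarily the layout evaluated in the paper, and the function names are hypothetical. Python will not show the cache behavior itself; the code only demonstrates the layout and checks it against classic binary search.

```python
from bisect import bisect_left
from typing import List, Optional


def build_bfs_layout(sorted_keys: List[int]) -> List[Optional[int]]:
    """Place sorted keys into a 1-indexed array in breadth-first tree order."""
    n = len(sorted_keys)
    tree: List[Optional[int]] = [None] * (n + 1)  # slot 0 is unused
    keys = iter(sorted_keys)

    def fill(i: int) -> None:
        # In-order walk of the implicit tree assigns keys in sorted order,
        # which yields a valid binary search tree over the array indices.
        if i <= n:
            fill(2 * i)
            tree[i] = next(keys)
            fill(2 * i + 1)

    fill(1)
    return tree


def search_bfs(tree: List[Optional[int]], key: int) -> bool:
    """Search the implicit tree; child positions are computed, not stored."""
    i, n = 1, len(tree) - 1
    while i <= n:
        if key == tree[i]:
            return True
        i = 2 * i + (1 if key > tree[i] else 0)
    return False


def search_sorted(sorted_keys: List[int], key: int) -> bool:
    """Classic binary search on the sorted array, used as a baseline."""
    pos = bisect_left(sorted_keys, key)
    return pos < len(sorted_keys) and sorted_keys[pos] == key
```

Both functions should agree on every query; any performance difference on large inputs would come from how the two layouts touch memory, which has to be measured in a lower-level language.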
