نتایج جستجو برای: instruction cache
تعداد نتایج: 56814 فیلتر نتایج به سال:
During processor design, it is often necessary to evaluate multiple cache configurations. This paper describes the design and implementation of a retargetable on-line cache simulator. The cache simulator has been implemented using a retargetable instruction set simulator from the SimnML [9] processor description language. The retargetability helps in cache simulation and evaluation much before ...
The performance of statically scheduled VLIW processors is highly sensitive to the instruction scheduling performed by the compiler. In this work we identify a major deficiency in existing instruction scheduling for VLIW processors. Unlike most dynamically scheduled processors, a VLIW processor with no load-use hardware interlocks will completely stall upon a cache-miss of any of the operations...
This paper shows that code expanding optimizations have strong and non-intuitive implications on instruction cache design. Three types of code expanding optimizations are studied in this paper: instruction placement, function inline expansion, and superscalar optimizations. Overall, instruction placement reduces the miss ratio of small caches. Function inline expansion improves the performance ...
Cache memories are mandatory to bridge the growing gap between CPU speed and main memory access time. Standard cache organizations improve the average execution time but are difficult to predict for worst case execution time (WCET) analysis. This paper proposes a different cache architecture, intended to ease WCET analysis. The cache stores complete methods and cache misses occur only on method...
This paper shows that code expanding optimizations have strong and non-intuitive implications on instruction cache design. Three types of code expanding optimizations are studied in this paper: instruction placement, function inline expansion, and superscalar optimizations. Overall, instruction placement reduces the miss ratio of small caches. Function inline expansion improves the performance ...
Online transaction processing (OLTP) is a multibillion dollar industry with high-end database servers employing state-of-the-art processors to maximize performance. Unfortunately, recent studies show that CPUs are far from realizing their maximum intended throughput because of delays in the processor caches. When running OLTP, instruction-related delays in the memory subsystem account for 25 to...
Due to unfortunate circumstances this lecture was not scribed, following are several points that I remember were brought up. If anyone has something to add please tell me. In this session we discussed three papers: Alternative Fetch and Issue Policies for the Trace Cache Fetch Mechanism-describes several enhancements to the original University of Michigan view of the trace cache. Path-Based Nex...
Code Layout as a Source of Noise in Jvm Performance Dayong Gu and Clark Verbrugge and Etienne Gagnon
We describe the effect of a particular form of “noise” in benchmarking. We investigate the source of anomalous measurement data in a series of optimization strategies that attempt to improve runtime performance in the garbage collector of a Java virtual machine. The results of our experiments can be explained in terms of the difference in code layout, and hence instruction and data cache behavi...
This paper compares the performance characteristics of the Alpha 21164 to the previous-generation 21064 microprocessor. Measurements on the 21164-based AlphaServer 8200 system are compared to the 21064based DEC 7000 server using several commercial and technical workloads. The data analyzed includes cycles per instruction, multiple-issued instructions, branch predictions, stall components, cache...
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید