A Hardware-based Cache Pollution Filtering Mechanism for Aggressive Prefetches
Authors
Abstract
Aggressive hardware- and software-based prefetch algorithms have been proposed to hide memory access latency and bridge the widening speed gap between processors and memory subsystems. As smaller L1 caches prevail in deep-submicron processor designs in order to maintain short cache access cycles, cache pollution caused by ineffective prefetches is becoming a major challenge. When prefetching is too aggressive, ineffective prefetches can not only offset the benefits of benign prefetches through pollution but also saturate bus bandwidth, leading to overall performance degradation. In this paper, a hardware-based cache pollution filtering mechanism is proposed that dynamically differentiates good from bad prefetches using a history table. Two schemes for triggering prefetches — Per-Address (PA) based and Program Counter (PC) based — are proposed and evaluated. Our cache pollution filters work in tandem with both hardware and software prefetchers. As the analysis of our simulation results shows, the cache pollution filters reduce the number of ineffective prefetches by over 90%, alleviating the excessive memory bandwidth consumption they induce. IPC improves by up to 9% as a result of reduced cache pollution and less competition for the limited number of cache ports.
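The history-table idea in the abstract can be illustrated with a small sketch. This is not the paper's exact design: the table size, the 2-bit saturating counters, the threshold, and all names below are illustrative assumptions. In the PC-based scheme, the table is indexed by the program counter of the load that triggered the prefetch; a counter is strengthened when a prefetched line is referenced before eviction (a good prefetch) and weakened when it is evicted unused (a polluting prefetch), and future prefetches from that PC are suppressed once its history turns bad.

```python
class PollutionFilter:
    """Hypothetical PC-indexed cache pollution filter.

    Each entry is a 2-bit saturating counter (0..3). Prefetches are
    issued only for PCs whose counter is at or above `threshold`;
    counters start at the threshold so every PC is initially allowed.
    """

    def __init__(self, entries=256, threshold=2):
        self.entries = entries          # assumed table size
        self.threshold = threshold      # assumed confidence threshold
        self.table = [threshold] * entries

    def _index(self, pc):
        # Simple direct-mapped hash of the triggering load's PC.
        return pc % self.entries

    def allow_prefetch(self, pc):
        # Filter decision: issue the prefetch only if this PC's
        # history of prefetches has been mostly useful.
        return self.table[self._index(pc)] >= self.threshold

    def record_useful(self, pc):
        # Prefetched line was demand-referenced before eviction.
        i = self._index(pc)
        self.table[i] = min(3, self.table[i] + 1)

    def record_polluting(self, pc):
        # Prefetched line was evicted without ever being used.
        i = self._index(pc)
        self.table[i] = max(0, self.table[i] - 1)
```

A PA-based variant would index the table by the prefetched block address instead of the triggering PC; the update and filtering logic stays the same.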
Similar resources
Combining Cooperative Software/Hardware Prefetching and Cache Replacement
Data prefetching is an effective technique to hide memory latency and thus bridge the increasing processor-memory performance gap. Our previous work presents guided region prefetching (GRP), a hardware/software cooperative prefetching technique which cost-effectively tolerates L2 latencies. The compiler hints improve L2 prefetching accuracy and reduce bus bandwidth consumption compared to hardw...
A Static Filter for Reducing Prefetch Traffic
The growing difference between processor and main memory cycle time necessitates the use of more aggressive techniques to reduce or hide main memory access latency. Prefetching data into higher speed memories is one such technique. However, speculative prefetching can significantly increase memory traffic. We present a new technique, called Static Filtering (SF), to reduce the traffic generated by a ...
Branch History Guided Instruction Prefetching
Instruction cache misses stall the fetch stage of the processor pipeline and hence affect instruction supply to the processor. Instruction prefetching has been proposed as a mechanism to reduce instruction cache (I-cache) misses. However, a prefetch is effective only if accurate and initiated sufficiently early to cover the miss penalty. This paper presents a new hardware-based instruction pref...
Transactional Distributed Shared Memory
We present a new transaction-based approach to distributed shared memory, an object caching framework, language extensions to support our approach, path-expression-based prefetches, and an analysis to generate path expression prefetches. To our knowledge, this is the first prefetching approach that can prefetch objects whose addresses have not been computed or predicted. Our approach makes aggr...
Cache Showdown: The Good, Bad, and Ugly
Prefetching algorithms have been mainly studied in the context of the Coverage and Accuracy metrics. While this is an appropriate metric for prefetching into separate stream buffers, it is a poor assessment of prefetching into a shared cache structure where cache pollution can become a serious factor. Traditionally, prefetches have been categorized as "good" or "bad" if they are accessed or are...
Journal:
Volume/Issue:
Pages: -
Publication year: 2003