Instruction Fetch Energy Reduction Using Forward-Branch Bufferable Innermost Loop Buffer

نویسندگان

  • Bin-Hua Tein
  • I-Wei Wu
  • Chung-Ping Chung
چکیده

Recently, several loop buffer designs have been proposed to reduce instruction fetch energy due to size and location advantage of loop buffer. Nevertheless, on design complexity dictates most loop buffer designs to store only innermost loops without forward branch or instructions within innermost loops before a forward branch. While program modeling shows that typical programs can best be represented with a simple loop model, many of then contain forward branches in their innermost loops. For example, MiBench spends 71% of execution time on innermost loops, and 27% of these innermost loops consist of forward branch(es). Hence, existing designs lead to limitation in reduction of instruction fetch energy. We propose a simple and effective way to cope with this complexity: since using BTB is a norm in most designs, if we add an extra bit in BTB, indicating if the loop buffer stores the fall-through or target trace after a within-the-innermost-loop forward branch, then much of the complexity can be avoided. Results with MiBench indicate that up to 14.1% of further reduction in instruction fetch energy, and only 1.8% hardware overhead in BTB, is introduced compared with the design without forward branch handling.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Block Based Fetch Engine for Superscalar Processors

The implementation of modern high performance computer is increasingly directed toward parallelism in the hardware. However, most of the current fetch units are limited to one branch prediction per cycle and therefore, can fetch no more than one basic block per cycle. While fetching a single basic block each cycle is sufficient for implementations that issue small number of instructions per cyc...

متن کامل

Integrated I-cache Way Predictor and Branch Target Buffer to Reduce Energy Consumption

In this paper, we present a Branch Target Buuer (BTB) design for energy savings in set-associative instruction caches. We extend the functionality of a BTB by caching way predictions in addition to branch target addresses. Way prediction and branch target prediction are done in parallel. Instruction cache energy savings are achieved by accessing one cache way if the way prediction for a fetch i...

متن کامل

The Precomputed Branch Architecture

Accurate instruction fetch and branch prediction is increasingly important on today’s superscalar architectures. Fetch prediction is the process of determining the next instruction to request from the memory subsystem. Branch prediction is the process of predicting the likely out-come of branch instructions. A branch target buffer (BTB) is often used to provide target addresses for taken branch...

متن کامل

The Basic Block Reassembling Instruction Stream Buffer with LWBTB for X86 ISA

The potential performance of superscalar processors can be exploited only when processor is fed with sufficient instruction bandwidth. The front-end units, the Instruction Stream Buffer (ISB) and the fetcher, are the key elements for achieving this goal. Current ISBs could not support instruction streaming beyond a basic block. In x86 processors, the split-line instruction problem worsens this ...

متن کامل

Using a serial cache for energy efficient instruction fetching

Computer Science Department, University of California, Los Angeles Department of Computer Science and Engineering, University of California, San Diego Abstract The design of a high performance fetch architecture can be challenging due to poor interconnect scaling and energy concerns. Way prediction has been presented as one means of scaling the fetch engine to shorter cycle times, while providi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006