The Cost of Uncore in Throughput-Oriented Many-Core Processors

Author

Gabriel H. Loh

Abstract

Traditional techniques for achieving performance, such as extracting more instruction-level parallelism or increasing clock frequencies, are losing their effectiveness due to the power wall. Multi-core processors have been put forth as a more power-performance-efficient means of continuing performance scaling while coping with the realities of a power-limited design. Extrapolating the increase in the number of cores leads us to “many-core” systems, potentially containing hundreds of cores. The multi-/many-core approach is no panacea, however. As the number of cores increases, the overall system will need to provide more cache resources to feed all of these cores, and an increasingly complex interconnection network to tie them together. These additional “uncore” components are not free, and, unless carefully controlled, they may limit the effectiveness of many-core systems. We introduce a simple extension to Hill and Marty’s recent Amdahl’s Law-based multi-core cost/performance model to account for the uncore components. From this model, we conclude that to sustain the scalability of future many-core systems, the uncore components must be designed to scale sub-linearly with respect to the overall core count.
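The abstract does not give the extended model's equations, so the following is only an illustrative sketch of the idea, not the paper's formulation. It uses Hill and Marty's symmetric multicore speedup (a chip budget of n base core equivalents, cores of r BCEs each, and per-core performance assumed to be sqrt(r)), and adds a hypothetical uncore cost of c * k**alpha BCEs for k cores. The names `c`, `alpha`, `max_cores_with_uncore`, and `speedup_with_uncore` are assumptions introduced here for illustration.

```python
import math


def perf(r: float) -> float:
    """Hill and Marty's assumed performance of one core built from r BCEs."""
    return math.sqrt(r)


def symmetric_speedup(f: float, n: float, r: float) -> float:
    """Hill-Marty symmetric speedup: parallel fraction f, n BCEs, r BCEs per core."""
    cores = n / r
    return 1.0 / ((1.0 - f) / perf(r) + f / (perf(r) * cores))


def max_cores_with_uncore(budget: float, r: float, c: float, alpha: float) -> int:
    """Largest core count k with k*r + c*k**alpha <= budget.

    The uncore cost c*k**alpha is a hypothetical stand-in, not the paper's model:
    alpha < 1 is sub-linear, alpha == 1 linear, alpha > 1 super-linear.
    """
    k = 0
    while (k + 1) * r + c * (k + 1) ** alpha <= budget:
        k += 1
    return k


def speedup_with_uncore(f: float, budget: float, r: float, c: float, alpha: float) -> float:
    """Symmetric speedup when part of the budget is spent on uncore (assumed cost)."""
    k = max_cores_with_uncore(budget, r, c, alpha)
    if k == 0:
        return 0.0  # the budget cannot fit even one core plus its uncore
    return 1.0 / ((1.0 - f) / perf(r) + f / (perf(r) * k))


if __name__ == "__main__":
    # Compare uncore scaling exponents under the same 256-BCE budget.
    for alpha in (0.5, 1.0, 2.0):
        k = max_cores_with_uncore(256, 1, 1.0, alpha)
        s = speedup_with_uncore(0.99, 256, 1, 1.0, alpha)
        print(f"alpha={alpha}: cores={k}, speedup={s:.1f}")
```

Under these assumed parameters, the faster the uncore cost grows with core count, the fewer cores fit in the same budget, which is the qualitative effect the abstract describes.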

Similar resources

System-wide Performance Counter Measurements: Offcore, Uncore, and Northbridge Performance Events in Modern Processors

Modern processors often have many processing cores in one package (or socket). Traditional hardware performance counters measure values only on a single core. A chip package has many resources that are package-wide and thus need a separate performance reporting mechanism. The values for these shared and off-core resources are reported as “offcore”, “uncore” or “northbridge” events.

An Analysis of Core- and Chip-Level Architectural Features in Four Generations of Intel Server Processors

This paper presents a survey of architectural features among four generations of Intel server processors (Sandy Bridge, Ivy Bridge, Haswell, and Broadwell) with a focus on performance with floating point workloads. Starting on the core level and going down the memory hierarchy we cover instruction throughput for floating-point instructions, L1 cache, address generation capabilities, core clock ...

Ultra-Low-Energy DSP Processor Design for Many-Core Parallel Applications

Background and Objectives: Digital signal processors are widely used in energy constrained applications in which battery lifetime is a critical concern. Accordingly, designing ultra-low-energy processors is a major concern. In this work and in the first step, we propose a sub-threshold DSP processor. Methods: As our baseline architecture, we use a modified version of an existing ultra-low-power...

Parallel Packet Processing on Multi-core and Many-core Processors

The Service-oriented Router (SoR), a highly functional router based on a novel router architecture, enables unprecedented web services traditional routers were unable to provide. The SoR performs Deep Packet Inspection (DPI) to analyze Layer 7 information, which is becoming increasingly difficult due to the substantial increase in Internet traffic. Meanwhile, multi-core processors and general-p...

A Clustering Approach to Scientific Workflow Scheduling on the Cloud with Deadline and Cost Constraints

One of the main features of High Throughput Computing systems is the availability of high power processing resources. Cloud Computing systems can offer these features through concepts like Pay-Per-Use and Quality of Service (QoS) over the Internet. Many applications in Cloud computing are represented by workflows. Quality of Service is one of the most important challenges in the context of sche...

Publication date: 2008
