Fine-Grain Software Distributed Shared Memory on SMP Clusters

Authors

Daniel J. Scales

Kourosh Gharachorloo

Anshu Aggarwal

Abstract

Commercial SMP nodes are an attractive building block for software distributed shared memory systems. The advantages of using SMP nodes include fast communication among processors within the same SMP node and potential gains from clustering, where remote data fetched by one processor is used by other processors on the same node. The widespread availability of SMP servers with small numbers of processors has led several researchers to consider their use as building blocks for Shared Virtual Memory (SVM) systems. These systems exploit the SMP cache-coherence hardware to support fine-grain communication within a node, and use software to support communication across nodes at a coarser page-size granularity. Our goal is to explore the use of SMP nodes in the context of the Shasta system. A unique aspect of Shasta compared to most other software distributed shared memory systems is that shared data can be kept coherent at a fine granularity. Shasta implements this coherence by inserting inline code that checks the cache state of shared data before each load or store. In addition, Shasta allows the coherence granularity to be varied across different shared data structures in a single application. This approach alleviates potential inefficiencies that arise from the fixed large granularity of communication typical in most software systems. This paper describes a major extension to the Shasta system that supports fine-grain shared memory across SMP nodes. Allowing processors to efficiently share memory within the same SMP is complicated by race conditions that arise because the inline state check is non-atomic with respect to the actual load or store of shared data. We present a novel and efficient protocol that avoids such race conditions without the use of costly synchronization in the inline checking code. This protocol is fully functional and runs on a prototype cluster of Alpha multiprocessors connected through Digital’s Memory Channel network. To characterize the benefits of using SMP nodes in the context of Shasta, we also present detailed performance results for nine SPLASH-2 applications running on this cluster.
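The inline state check and the per-structure coherence granularity mentioned in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Shasta's actual code: the names `State`, `SharedRegion`, `load`, `store`, and `_miss` are hypothetical, the three-state scheme is a simplification chosen for illustration, and the miss handler merely counts misses where a real system would run a software protocol across nodes.

```python
from enum import Enum


class State(Enum):
    # Hypothetical per-block states, assumed for illustration only.
    INVALID = 0
    SHARED = 1
    EXCLUSIVE = 2


class SharedRegion:
    """Toy model of one shared data structure with its own block size."""

    def __init__(self, size, block_size):
        self.block_size = block_size
        self.data = [0] * size
        n_blocks = (size + block_size - 1) // block_size
        self.state = [State.INVALID] * n_blocks
        self.misses = 0

    def _miss(self, block, wanted):
        # Stand-in for the software protocol: count the miss and grant
        # the requested state. No messages are sent in this sketch.
        self.misses += 1
        self.state[block] = wanted

    def load(self, addr):
        block = addr // self.block_size
        # Inline check performed before the load.
        if self.state[block] is State.INVALID:
            self._miss(block, State.SHARED)
        # The check above and the access below are separate steps. With
        # several processors sharing one node, the state could change in
        # between, which is the kind of race the abstract refers to.
        return self.data[addr]

    def store(self, addr, value):
        block = addr // self.block_size
        # Inline check performed before the store.
        if self.state[block] is not State.EXCLUSIVE:
            self._miss(block, State.EXCLUSIVE)
        self.data[addr] = value


# Two regions of the same size with different coherence granularities.
fine = SharedRegion(size=64, block_size=8)
coarse = SharedRegion(size=64, block_size=32)
for addr in range(64):
    fine.load(addr)
    coarse.load(addr)
```

In this toy run, reading all 64 elements in order takes 8 misses with a block size of 8 and 2 misses with a block size of 32, which shows how granularity can be chosen per data structure. The sketch is single-threaded and does not attempt to reproduce the race-avoidance protocol the paper presents.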

Similar resources

Comparative Evaluation of Fine- and Coarse-Grain Approaches for Software Distributed Shared Memory

Symmetric multiprocessors (SMPs) connected with low-latency networks provide attractive building blocks for software distributed shared memory systems. Two distinct approaches have been used: the fine-grain approach that instruments application loads and stores to support a small coherence granularity, and the coarse-grain approach based on virtual memory hardware that provides coherence at a p...

A Taxonomy of Programming Models for Symmetric Multiprocessors and SMP Clusters

The basic processing element, from PCs to large systems, is rapidly becoming a symmetric multiprocessor (SMP). As a result, the nodes of a parallel computer will often be an SMP. The resulting mixed hardware models (combining shared-memory and distributed memory) provide a challenge to system software developers to provide users with programming models that are portable, understandable, and eff...

Fine-Grain Distributed Shared Memory on Clusters of Workstations

Shared memory, one of the most popular models for programming parallel platforms, is becoming ubiquitous both in low-end workstations and high-end servers. With the advent of low-latency networking hardware, clusters of workstations strive to offer the same processing power as high-end servers for a fraction of the cost. In such environments, shared memory has been limited to page-based systems...

SilkRoad: A Multithreaded Runtime System with Software Distributed Shared Memory for SMP Clusters

Multithreaded parallel system with software Distributed Shared Memory (DSM) is an attractive direction in cluster computing. In these systems, distributing workloads and keeping the shared memory operations efficient are critical issues. Distributed Cilk (Cilk 5.1) is a multithreaded runtime system for SMP clusters with the support of divide-and-conquer programming paradigm. However, there is n...

Overcoming performance bottlenecks in using OpenMP on SMP clusters

This paper presents a new parallel programming environment called ParADE to enable easy, portable, and high-performance computing for SMP clusters. Different from the prior studies, ParADE separates the programming model from the execution model: it enables shared-address-space programming while it realizes hybrid execution of message-passing and shared-address-space. To overcome the poor perfo...

Publication date: 1998
