Fine-Grain Software Distributed Shared Memory on SMP Clusters
نویسندگان
چکیده
Commercial SMP nodes are an attractive building block for software distributed shared memory systems. The advantages of using SMP nodes include fast communication among processors within the same SMP node and potential gains from clustering where remote data fetched by one processor is used by other processors on the same node. The widespread availability of SMP servers with small numbers of processors has led several researchers to consider their use as building blocks for Shared Virtual Memory (SVM) systems. These systems exploit the SMP cache-coherence hardware to support fine-grain communication within a node, and use software to support communication across nodes at a coarser page-size granularity. Our goal is to explore the use of SMP nodes in the context of the Shasta system. A unique aspect of Shasta compared to most other software distributed shared memory systems is that shared data can be kept coherent at a fine granularity. Shasta implements this coherence by inserting inline code that checks the cache state of shared data before each load or store. In addition, Shasta allows the coherence granularity to be varied across different shared data structures in a single application. This approach alleviates potential inefficiences that arise from the fixed large granularity of communication typical in most software systems. This paper describes a major extension to the Shasta system that supports fine-grain shared memory across SMP nodes. Allowing processors to efficiently share memory within the same SMP is complicated by race conditions that arise because the inline state check is non-atomic with respect to the actual load or store of shared data. We present a novel and efficient protocol that avoids such race conditions without the use of costly synchronization in the inline checking code. The above protocol is fully functional and runs on a prototype cluster of Alpha multiprocessors connected through Digital’s Memory Channel network. To characterize the benefits of using SMP nodes in the context of Shasta, we also present detailed performance results for nine SPLASH-2 applications running on this cluster.
منابع مشابه
Comparative Evaluation of Fine- and Coarse-Grain Approaches for Software Distributed Shared Memory
Symmetric multiprocessors (SMPs) connected with low-latency networks provide attractive building blocks for software distributed shared memory systems. Two distinct approaches have been used: the fine-grain approach that instruments application loads and stores to support a small coherence granularity, and the coarse-grain approach based on virtual memory hardware that provides coherence at a p...
متن کاملA Taxonomy of Programming Models for Symmetric Multiprocessors and SMP Clusters
The basic processing element, from PCs to large systems, is rapidly becoming a symmetric multiprocessor (SMP). As a result, the nodes of a parallel computer will often be an SMP. The resulting mixed hardware models (combining shared-memory and distributed memory) provide a challenge to system software developers to provide users with programming models that are portable, understandable, and eff...
متن کاملFine-Grain Distributed Shared Memory on Clusters of Workstations
Shared memory, one of the most popular models for programming parallel platforms, is becoming ubiquitous both in low-end workstations and high-end servers. With the advent of low-latency networking hardware, clusters of workstations strive to offer the same processing power as high-end servers for a fraction of the cost. In such environments, shared memory has been limited to page-based systems...
متن کاملSilkRoad: A Multithreaded Runtime System with Software Distributed Shared Memory for SMP Clusters
Multithreaded parallel system with software Distributed Shared Memory (DSM) is an attractive direction in cluster computing. In these systems, distributing workloads and keeping the shared memory operations efficient are critical issues. Distributed Cilk (Cilk 5.1) is a multithreaded runtime system for SMP clusters with the support of divide-and-conquer programming paradigm. However, there is n...
متن کاملOvercoming performance bottlenecks in using OpenMP on SMP clusters
This paper presents a new parallel programming environment called ParADE to enable easy, portable, and high-performance computing for SMP clusters. Different from the prior studies, ParADE separates the programming model from the execution model: it enables shared-address-space programming while it realizes hybrid execution of message-passing and shared-address-space. To overcome the poor perfo...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1998