An architecture for tolerating processor failures in shared-memory multiprocessors



Similar resources

An Architecture for Tolerating Processor Failures in Shared Memory Multiprocessors

In this paper, we focus on the problem of recovering from processor failures in shared-memory multiprocessors. We propose an architecture designed to tolerate processor failures transparently. The Recoverable Shared Memory (RSM) is the main component of this architecture, which provides a hardware-supported backward error recovery mechanism. This technique copes with standard caches and cache cohe...


Tolerating Processor Failures in a Distributed Shared-Memory Multiprocessor

Scaling transistor geometries and increasing levels of integration lead to rising transient- and permanent-fault rates. Future server platforms must combine reliable computation with cost and performance scalability, without sacrificing application portability. Processor reliability, for both transient and permanent faults, represents the most challenging aspect of designing reliable, available ser...


Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors

The large latency of memory accesses is a major obstacle to obtaining high processor utilization in large-scale shared-memory multiprocessors. Although the provision of coherent caches in many recent machines has alleviated the problem somewhat, cache misses still occur frequently enough that they significantly lower performance. In this paper we evaluate the effectiveness of non-binding softwa...


A Novel Lightweight Directory Architecture for Scalable Shared-Memory Multiprocessors

There are two important hurdles that restrict the scalability of directory-based shared-memory multiprocessors: the directory memory overhead and the long L2 miss latencies due to the indirection introduced by the accesses to directory information, usually stored in main memory. This work presents a lightweight directory architecture aimed at facing these two important problems. Our proposal ta...


Dynamic Data Replication for Tolerating Single Node Failures in Shared Virtual Memory Clusters of Workstations

In this paper we investigate how shared memory clusters can take advantage of replication to tolerate single system failures. We start from a shared virtual memory protocol (GeNIMA) that has been optimized for low-latency, high-bandwidth system area networks. We propose a set of extensions that keep shared data consistent in the presence of failures and support SMP nodes. Our scheme uses dyn...



Journal

Journal title: IEEE Transactions on Computers

Year: 1996

ISSN: 0018-9340

DOI: 10.1109/12.543705