Designing SSI clusters with hierarchical checkpointing and single I/O space

Authors

  • Kai Hwang
  • Hai Jin
  • Edward Chow
  • Cho-Li Wang
  • Zhiwei Xu
Abstract

... (SSI) in a workstation cluster. In a cluster of computers, local area networks or high-bandwidth switch networks using optical fibers physically connect a collection of node computers. The workstations in a cluster can work collectively as an integrated computing resource (that is, an SSI), or they can operate separately as individual computers. Present clusters are usually small and provide only limited SSI services. Future clusters will likely increase in scalability and offer more SSI support, as Figure 1 illustrates. The implication is that future clusters could replace the MPP, SMP, or CC-NUMA architectures (see “The cluster as a computer architecture” sidebar for key characteristics of these computer platforms). We focus on clusters with high availability through SSI support, distributed RAID (redundant arrays of inexpensive disks) with parity checks, and hierarchical checkpointing with adaptive recovery. In particular, we developed a single I/O address space among all disks and peripheral devices attached in the cluster. This enables direct remote disk access, which is a necessary step to implement a ...

Similar resources

Design and Analysis of Clusters with Single I/O Space

Support of Single System Image (SSI) services is the main approach that enables better utilization of PC/workstation clusters. Some SSI services can be easily built with the support of other low-level, elementary SSI services. In this paper, we describe a Single I/O Space architecture for achieving an SSI at the I/O subsystem level. Furthermore, we demonstrate how the Single I/O Space can facil...

Performance Effect Analysis of False Sharing Problem in Clusters with Single I/O Space

Single I/O space is an important characteristic of a single system image (SSI) in a cluster of workstations/PCs, especially for I/O-intensive applications. Based on a study of the different cluster I/O architectures, a false sharing problem arises in the distributed RAID with single I/O space. Identification of the false sharing problem plays an important role in the performance improveme...

Modeling of Hierarchical Distributed Systems with Fault-Tolerance

Abstract: This paper addresses some fault-tolerant issues pertaining to hierarchically distributed systems. Since each of the levels in a hierarchical system could have various characteristics, different fault-tolerance schemes could be appropriate at different levels. In this paper, we use stochastic Petri nets (SPNs) to investigate various fault-tolerant schemes in this context. The basic ...

Single I/O Space for Scalable Cluster Computing

In this paper, we propose a novel Single I/O Space architecture for achieving a Single System Image (SSI) at the I/O subsystem level. This is very much desired in a scalable cluster computing environment using commodity components. Our design achieves a single address space for all blocks of data in the cluster, which can tolerate all single disk failures. While traditional approaches focused o...

Single Assignment Capacitated Hierarchical Hub Set Covering Problem for Service Delivery Systems Over Multilevel Networks

The present study introduced a novel hierarchical hub set covering problem with capacity constraints. This study showed the significance of fixed charge costs for locating facilities, assigning hub links and designing a productivity network. The proposed model employs mixed integer programming to locate facilities and establish links between nodes according to the travel time between an origin-...

Journal:
  • IEEE Concurrency

Volume: 7

Publication date: 1999