Composable Reliability for Asynchronous Systems

Authors

  • Sunghwan Yoo
  • Charles Edwin Killian
  • Terence Kelly
  • Hyoun Kyu Cho
  • Steven Plite
Abstract

Distributed systems often employ replication to solve two different kinds of availability problems: first, to prevent the loss of data through the permanent destruction or disconnection of a distributed node, and second, to allow prompt retrieval of data when some distributed nodes respond slowly. For simplicity, many systems further handle crash-restart failures and timeouts by treating them as a permanent disconnection followed by the birth of a new node, relying on peer replication rather than persistent storage to preserve data. We posit that for applications deployed in modern managed infrastructures, delays are typically transient and failed processes and machines are likely to be restarted promptly, so it is often desirable to resume crashed processes from persistent checkpoints. In this paper we present MaceKen, a synthesis of complementary techniques including Ken, a lightweight and decentralized rollback-recovery protocol that transparently masks crash-restart failures by careful handling of messages and state checkpoints; and Mace, a programming toolkit supporting development of distributed applications and application-specific availability via replication. MaceKen requires near-zero additional developer effort: systems implemented in Mace can immediately benefit from the Ken protocol by virtue of following the Mace execution model. Moreover, Ken allows multiple, independently developed application components to be seamlessly composed, preserving strong global reliability guarantees. Our implementation is available as open...
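To make the idea of masking crash-restart failures through "careful handling of messages and state checkpoints" more concrete, here is a minimal sketch of one common way such a scheme can work. It is an illustration under stated assumptions, not MaceKen's or Ken's actual implementation: the `Node` class, its method names, the JSON checkpoint file, and the per-sender sequence numbers are all hypothetical. The sketch assumes each message handler runs to completion, after which the node atomically saves its state together with its unacknowledged outbound messages; senders retransmit until acknowledged, and receivers discard duplicates.

```python
import json
import os
import tempfile


class Node:
    """Hypothetical event-driven node that checkpoints after every handler run."""

    def __init__(self, path):
        self.path = path
        # Defaults for a fresh node; overwritten below if a checkpoint exists.
        self.state = {"count": 0}
        self.seen = {}     # sender -> highest sequence number already processed
        self.outbox = []   # outbound messages not yet acknowledged
        self.next_seq = 0
        if os.path.exists(path):
            # Restart after a crash: resume from the last persistent checkpoint.
            with open(path) as f:
                saved = json.load(f)
            self.state = saved["state"]
            self.seen = saved["seen"]
            self.outbox = saved["outbox"]
            self.next_seq = saved["next_seq"]

    def _checkpoint(self):
        # Write to a temporary file, then rename it over the old checkpoint,
        # so a crash leaves either the old or the new checkpoint, never a mix.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(
                {
                    "state": self.state,
                    "seen": self.seen,
                    "outbox": self.outbox,
                    "next_seq": self.next_seq,
                },
                f,
            )
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)

    def handle(self, sender, seq, payload):
        """Process one inbound message; True means it is safe to acknowledge."""
        if seq <= self.seen.get(sender, -1):
            return True  # duplicate of a message whose effects are checkpointed
        # Example application logic (assumed): accumulate and report a total.
        self.state["count"] += payload
        self.outbox.append({"seq": self.next_seq, "body": self.state["count"]})
        self.next_seq += 1
        self.seen[sender] = seq
        # State, dedup table, and outbound messages are saved as one unit.
        self._checkpoint()
        return True

    def pending(self):
        """Messages to (re)transmit until the receiver acknowledges them."""
        return list(self.outbox)

    def ack(self, seq):
        """Drop an acknowledged outbound message and persist the change."""
        self.outbox = [m for m in self.outbox if m["seq"] != seq]
        self._checkpoint()
```

In this sketch, a crash in the middle of a handler loses only in-memory changes: the restarted node reloads the previous checkpoint, the sender retransmits because it never received an acknowledgment, and the message is processed once. A crash after the checkpoint is equally harmless, since the retransmitted message is recognized as a duplicate and the saved outbox is resent.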

Similar resources

On Composition and Implementation of Sequential Consistency (Extended Version)

It has been proved that to implement a linearizable shared memory in synchronous message-passing systems it is necessary to wait for a time proportional to the uncertainty in the latency of the network for both read and write operations, while waiting during read or during write operations is sufficient for sequential consistency. This paper extends this result to crash-prone asynchronous syste...

On Composition and Implementation of Sequential Consistency

To implement a linearizable shared memory in synchronous message-passing systems it is necessary to wait for a time linear in the uncertainty in the latency of the network for both read and write operations. Waiting only for one of them suffices for sequential consistency. This paper extends this result to crash-prone asynchronous systems, proposing a distributed algorithm building a sequential...

Dataflow formalisation of real-time streaming applications on a Composable and Predictable Multi-Processor SOC

Embedded systems often contain multiple applications, some of which have real-time requirements and whose performance must be guaranteed. To efficiently execute applications, modern embedded systems contain Globally Asynchronous Locally Synchronous (GALS) processors, network on chip, DRAM and SRAM memories, and system software, e.g. microkernel and communication libraries. In this paper we desc...

Bit Error Performance for Asynchronous DS CDMA Systems Over Multipath Rayleigh Fading Channels (RESEARCH NOTE)

In recent years, there has been considerable interest in the use of CDMA in mobile communications. Bit error rate is one of the most important parameters in the evaluation of CDMA systems. In this paper, we develop a technique to find an accurate approximation to the probability of bit error for asynchronous direct-sequence code division multiple-access (DS/CDMA) systems by modeling the multipl...

Model-Predictive Controllers for Performance Management of Composable Conveyor Systems

Composable Conveyors expose fundamental new problems that must be addressed as the nation transforms its advanced manufacturing infrastructure. Unanticipated fluctuations in workloads caused by the increasingly open and interconnected advanced manufacturing systems make it significantly challenging to appropriately configure and adapt the operating parameters of conveyor systems that are deplo...

Publication date: 2012