Transactional Failure Recovery for a Distributed Key-Value Store

Authors

  • Muhammad Yousuf Ahmad
  • Bettina Kemme
  • Ivan Brondino
  • Marta Patiño-Martínez
  • Ricardo Jiménez-Peris
Abstract

With the advent of cloud computing, many applications have embraced the ensuing paradigm shift towards modern distributed key-value data stores, like HBase, in order to benefit from the elastic scalability on offer. However, many applications still hesitate to make the leap from the traditional relational database model simply because they cannot compromise on the standard transactional guarantees of atomicity, isolation, and durability. To get the best of both worlds, one option is to integrate an independent transaction management component with a distributed key-value store. In this paper, we discuss the implications of this approach for durability. In particular, if the transaction manager provides durability (e.g., through logging), then we can relax durability constraints in the key-value store. However, if a component fails (e.g., a client or a key-value server), then we need a coordinated recovery procedure to ensure that commits are persisted correctly. In our research, we integrate an independent transaction manager with HBase. Our main contribution is a failure recovery middleware for the integrated system, which tracks the progress of each commit as it is flushed down by the client and persisted within HBase, so that we can recover reliably from failures. During recovery, commits that were interrupted by the failure are replayed from the transaction management log. Importantly, the recovery process does not interrupt transaction processing on the available servers. Using a benchmark, we evaluate the impact of component failure, and subsequent recovery, on application performance.
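The recovery idea in the abstract (log each commit, track which commits have been persisted, replay the rest after a failure) can be illustrated with a minimal sketch. This is not the authors' middleware: the names `CommitRecord`, `RecoveryTracker`, `log_commit`, `mark_persisted`, `truncate`, and `recover` are hypothetical, the store is modeled as a plain Python dict rather than HBase, and the sketch assumes that replaying writes in commit order is idempotent.

```python
from dataclasses import dataclass, field


@dataclass
class CommitRecord:
    """One committed transaction as kept in the transaction manager's log."""
    commit_ts: int
    writes: dict  # key -> value written by the transaction


@dataclass
class RecoveryTracker:
    """Hypothetical tracker of which logged commits are durable in the store."""
    log: list = field(default_factory=list)      # commit records in commit order
    persisted: set = field(default_factory=set)  # commit_ts values confirmed persisted

    def log_commit(self, commit_ts, writes):
        # Durability comes from this log, so the store itself may buffer writes.
        self.log.append(CommitRecord(commit_ts, dict(writes)))

    def mark_persisted(self, commit_ts):
        # Called once the store confirms the commit's writes are on stable storage.
        self.persisted.add(commit_ts)

    def truncate(self):
        # Drop the log prefix whose commits are all confirmed as persisted.
        while self.log and self.log[0].commit_ts in self.persisted:
            self.persisted.discard(self.log.pop(0).commit_ts)

    def recover(self, store):
        # Replay, in commit order, every logged commit not confirmed as persisted.
        replayed = []
        for record in self.log:
            if record.commit_ts not in self.persisted:
                store.update(record.writes)
                replayed.append(record.commit_ts)
        return replayed
```

In this sketch, a caller would invoke `recover(store)` with the surviving store state after a failure; only commits whose persistence was never confirmed are re-applied, which is why the log can be truncated safely once a prefix is fully persisted.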

Similar resources

CubicRing: Enabling One-Hop Failure Detection and Recovery for Distributed In-Memory Storage Systems

In-memory storage has the benefits of low I/O latency and high I/O throughput. Fast failure recovery is crucial for large-scale in-memory storage systems, bringing network-related challenges including false detection due to transient network problems, traffic congestion during the recovery, and top-of-rack switch failures. This paper presents CubicRing, a distributed structure for cube-based net...

Failure Handling in Transactional Workflows Utilizing

Transactional workflows have been previously specified using commercially-available workflow management systems (WFMSs). WFMSs have facilitated this specification by providing task coordination and execution capabilities. However, these WFMSs presently have limitations in terms of heterogeneous distributed system integration, non-proprietary cross-platform support, flexible ACID property support, and...

Client Zone 1 Zone 2 Zone 3 Shard A Shard

Application programmers increasingly prefer distributed storage systems with strong consistency and distributed transactions (e.g., Google’s Spanner) for their strong guarantees and ease of use. Unfortunately, existing transactional storage systems are expensive to use, in part because they require costly replication protocols, like Paxos, for fault tolerance. In this p...

Recovery and Page Coherency for a Scalable Multicomputer Object Store

This paper presents scalable algorithms for recovery and page coherency in multicomputer object stores. Recovery and coherency are central to object store engineering and distributed memory multicomputers are fundamental to scalable computation. Efficient recovery is implemented through a combination of local logging and a localisation of the transactional workspace model. A vector of update co...

Warp: Lightweight Multi-Key Transactions for Key-Value Stores

Traditional NoSQL systems scale by sharding data across multiple servers and by performing each operation on a small number of servers. Because transactions on multiple keys necessarily require coordination across multiple servers, NoSQL systems often explicitly avoid making transactional guarantees in order to avoid such coordination. Past work on transactional systems control this coordinatio...

Publication date: 2013