In Search of I/O-Optimal Recovery from Disk Failures

Authors

Osama Khan

Randal C. Burns

James S. Plank

Cheng Huang

Abstract

We address the problem of minimizing the I/O needed to recover from disk failures in erasure-coded storage systems. The principal result is an algorithm that finds the optimal I/O recovery from an arbitrary number of disk failures for any XOR-based erasure code. We also describe a family of codes with high fault tolerance and low recovery I/O; for example, one instance tolerates up to 11 failures and recovers a lost block in 4 I/Os. While we have determined I/O-optimal recovery for any given code, it remains an open problem to identify codes with the best recovery properties. We describe our ongoing efforts toward characterizing space overhead versus recovery I/O tradeoffs and generating codes that realize these bounds.
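The abstract does not say how the algorithm works, so the following is only an illustrative sketch of the problem it solves, not the authors' method. It assumes a hypothetical toy XOR code (symbol numbering and the `EQUATIONS` list are invented for this example) and uses exhaustive search over read sets, with a GF(2) elimination check to decide whether a given read set determines every lost symbol.

```python
from itertools import combinations

def to_mask(symbols):
    """Pack a collection of symbol indices into an integer bitmask."""
    m = 0
    for s in symbols:
        m |= 1 << s
    return m

def recoverable(equations, failed, reads, n):
    """True if every failed symbol is determined once `reads` are known.

    Each equation is a set of symbols whose XOR is zero. Read symbols are
    treated as known constants, so only the unread columns matter.
    """
    unknown = to_mask(set(range(n)) - set(reads))
    basis = {}  # leading bit -> row, an echelon basis over GF(2)
    for eq in equations:
        v = to_mask(eq) & unknown
        while v:
            h = v.bit_length() - 1
            if h not in basis:
                basis[h] = v
                break
            v ^= basis[h]
    for f in failed:
        v = 1 << f  # unit vector: "this failed symbol alone"
        while v:
            h = v.bit_length() - 1
            if h not in basis:
                return False
            v ^= basis[h]
    return True

def min_read_set(equations, failed, n):
    """Smallest set of surviving symbols that recovers all failed ones.

    Brute force, exponential in n: suitable for toy codes only.
    """
    survivors = sorted(set(range(n)) - set(failed))
    for k in range(len(survivors) + 1):
        for reads in combinations(survivors, k):
            if recoverable(equations, failed, reads, n):
                return set(reads)
    return None

# Hypothetical toy code (an assumption, not taken from the paper):
# 0..2 = row 0 data, 3..5 = row 1 data, 6/7 = row parities, 8/9 = extra parities.
N = 10
EQUATIONS = [
    {0, 1, 2, 6},  # row 0 parity
    {3, 4, 5, 7},  # row 1 parity
    {0, 4, 8},     # extra parity over symbols 0 and 4
    {1, 3, 9},     # extra parity over symbols 1 and 3
]
FAILED = [0, 3]    # one disk holding symbols 0 and 3 is lost

best = min_read_set(EQUATIONS, FAILED, N)
print(sorted(best), len(best))
```

In this toy setup, rebuilding the two lost symbols from row parity alone would read six symbols, while the search finds a read set of four. Exhaustive search is exponential in the number of symbols, which is why an efficient algorithm for the general case is a meaningful result.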

Similar resources

S-Code: Lowest Density MDS Array Codes for RAID-6

RAID, a storage architecture designed to exploit I/O parallelism and provide data reliability, has been deployed widely in computing systems as a storage building block. In large scale storage systems, in particular, RAID-6 is gradually replacing RAID-5 as the dominant form of disk arrays due to its capability of tolerating concurrent failures of any two disks. MDS (maximum distance separable) ...

Flat Datacenter Storage

Flat Datacenter Storage (FDS) is a high-performance, fault-tolerant, large-scale, locality-oblivious blob store. Using a novel combination of full bisection bandwidth networks, data and metadata striping, and flow control, FDS multiplexes an application’s large-scale I/O across the available throughput and latency budget of every disk in a cluster. FDS therefore makes many optimizations around ...

RAMCube: Exploiting Network Proximity for RAM-Based Key-Value Store

Disk-based storage is becoming increasingly problematic in meeting the needs of large-scale cloud applications. Recently RAM-based storage is proposed by aggregating the RAM of thousands of commodity servers in data center networks (DCN). These studies focus on improving performance with high throughput I/O, low latency RPC and fast failure recovery. RAM-based storage brings great DCN-related c...

Hierarchical RAID: Design, performance, reliability, and recovery

Hierarchical RAID (HRAID) extends the RAID paradigm to mask the failure of whole Storage Nodes (SNs) or bricks, where each SN is a disk array with a certain RAID level. HRAIDk/l with N SNs and M disks per SN tolerates k SN failures and l disk failures per SN with Maximum Distance Separable (MDS) erasure codes, which introduce the minimum level of redundancy at each level. For N = M there are k i...

Using Disk Add-Ons to Withstand Simultaneous Disk Failures with Fewer Replicas

Contemporary storage systems that utilize replication often maintain more than two replicas of each data item, reducing the risk of permanent data loss due to simultaneous disk failures. The price of the additional copies is smaller usable storage space, increased network traffic, and higher power consumption. We propose to alleviate this problem with SIMFAIL, a storage system that maintains on...

Publication date: 2011
