In-network redundancy generation for opportunistic speedup of data backup

Authors

  • Lluis Pamies-Juarez
  • Anwitaman Datta
  • Frédérique E. Oggier
Abstract

Erasure coding is a storage-efficient alternative to replication for achieving reliable data backup in distributed storage systems. During the storage process, traditional erasure codes require a single source node to create and upload all the redundant data to the different storage nodes. However, such a source node may have limited communication and computation capabilities, which constrain the throughput of the storage process. Moreover, the source node and the different storage nodes might not be able to send and receive data simultaneously – e.g., nodes might be busy in a datacenter setting, or simply be offline in a peer-to-peer setting – which can further threaten the efficacy of the overall storage process. In this paper we propose an “in-network” redundancy generation process that leverages the self-repairing property of the novel self-repairing codes (SRC). This in-network redundancy generation allows storage nodes to generate new redundant data by exchanging partial information among themselves, improving the throughput of the storage process. The process is carried out asynchronously, utilizing spare bandwidth and computing resources from the storage nodes. We analytically show that the performance of this technique relies on an efficient usage of the spare node resources, and we derive a set of scheduling algorithms to maximize their utilization. We experimentally show that our algorithms can, depending on the environment characteristics, increase the throughput of the storage process significantly with respect to the classical naive storage approach.

Keywords: distributed storage; erasure codes; backup
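The in-network idea described in the abstract can be illustrated with a toy sketch. This is not the SRC construction from the paper: it assumes a plain XOR parity as a stand-in for a self-repairing code, and the node names (`n1`, `n2`, `n3`) and function names are hypothetical. It only shows how a storage node can build a redundant fragment from fragments held by its peers, so that the source uploads fewer fragments.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def naive_backup(o1: bytes, o2: bytes):
    """Classical approach: the source computes and uploads every fragment."""
    nodes = {"n1": o1, "n2": o2, "n3": xor_blocks(o1, o2)}
    source_uploads = len(nodes)  # all three fragments leave the source
    return nodes, source_uploads


def in_network_backup(o1: bytes, o2: bytes):
    """Toy in-network approach: the source uploads only the data fragments."""
    nodes = {"n1": o1, "n2": o2}
    source_uploads = len(nodes)  # only two fragments leave the source
    # n3 builds the redundant fragment from what n1 and n2 forward to it,
    # using the storage nodes' bandwidth instead of the source's.
    nodes["n3"] = xor_blocks(nodes["n1"], nodes["n2"])
    return nodes, source_uploads


o1, o2 = b"backup-A", b"backup-B"
naive_nodes, naive_uploads = naive_backup(o1, o2)
net_nodes, net_uploads = in_network_backup(o1, o2)
```

Both approaches end with the same fragments on the storage nodes, but in this sketch the source sends two fragments instead of three. The scheduling question the abstract raises (which nodes exchange data, and when they have spare resources) is not modeled here.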

Similar resources

Adaptive Protection Based on Intelligent Distribution Networks with the Help of Network Factorization in the Presence of Distributed Generation Resources

Factorizing a system is one of the best ways to make a system intelligent. Factorizing the protection system, providing the right connecting agents, and transmitting the information faster and more reliably can improve the performance of a protection system and maintain system reliability against distributed generation resources. This study presents a new method for coordinating network protect...

A Network Differential Backup and Restore System based on a Novel Duplicate Data Detection algorithm

The ever-growing volume and value of data has raised increasing pressure for long-term data protection in storage systems. Moreover, the redundancy in data further aggravates such pressure in these systems. It has become a serious problem to protect data while eliminating data redundancy, saving storage space and network bandwidth as well. Data deduplication techniques greatly optimize storage ...

On the Impact of the Data Redundancy Strategy on the Recoverability of Friend-to-Friend Backup Systems

Social network-based systems, also known as Friend-to-Friend (F2F) systems, are a promising approach to develop backup solutions that provide high reliability with a much lower consumption of bandwidth and storage than P2P ones. F2F backup systems can use two data redundancy strategies to handle peer failure events, namely: replication and erasure coding. In this paper we evaluate the use of th...

Optimum Location for Backup Land Uses From the Perspective of Passive Defense in Urmia City: A Case Study

This study aims to find the optimal location for backup land uses from the perspective of passive defense in Urmia City. This is an applied, descriptive, and analytical study. Data were collected using documentary data, a field study, and a questionnaire. Participants were 10 experts in this subject, who were selected purposefully. Using the analytic network process (ANP), a dynamic and...

Adaptive redundancy management for durable P2P backup

We propose a redundancy management mechanism for peer-to-peer backup applications. Since, in a backup system, data is read over the network only during restore processes caused by data loss, redundancy management targets data durability rather than attempting to make each piece of information available at any time. Each peer determines, in an on-line manner, an amount of redundancy sufficient ...

Journal title:
  • Future Generation Comp. Syst.

Volume 29  Issue 

Pages  -

Publication date 2013