A Distributed Consistent Global Checkpoint Algorithm for Distributed Mobile Systems
نویسنده
چکیده
A distributed coordinated checkpointing algorithm for distributed mobile systems is presented. A consistent global checkpoint is a set of states in which no message is recorded as received in one process and as not yet sent in another process. It is used for rollback when process failure occurs. A consistent global checkpoint must be obtained for any checkpoint initiation by any process. This paper shows a checkpoint algorithm in which the amount of information piggybacked on program messages does not depend on the number of mobile processes. The number of checkpoints is minimized under two assumptions: ( I ) one consistent global checkpoint is taken for concurrent checkpoint initiations and (2) a checkpoint is initiated at each handoff by mobile processes. This algorithm is thus optimal among the generalizations of Chandy and Lamport’s distributed snapshot algorithm under the latter assumption.
منابع مشابه
A Review of Checkpointing Fault Tolerance Techniques in Distributed Mobile Systems
Fault Tolerance Techniques enable systems to perform tasks in the presence of faults. A checkpoint is a local state of a process saved on stable storage. In a distributed system, since the processes in the system do not share memory, a global state of the system is defined as a set of local states, one from each process. In case of a fault in distributed systems, checkpointing enables the execu...
متن کاملA Distributed First and Last Consistent Global Checkpoint Algorithm
Distributed coordinated checkpointing algorithms are discussed. The first global checkpoint for a checkpoint initiation is a set containing the checkpoint for each process in which any checkpoint before the element is not consistent with the initiation. The last global checkpoint for a checkpoint initiation is a set containing the checkpoint for each process in which any checkpoint after the el...
متن کاملNecessary and sufficient conditions for transaction-consistent global checkpoints in a distributed database system
Checkpointing and rollback recovery are well-known techniques for handling failures in distributed systems. The issues related to the design and implementation of efficient checkpointing and recovery techniques for distributed systems have been thoroughly understood. For example, the necessary and sufficient conditions for a set of checkpoints to be part of a consistent global checkpoint has be...
متن کاملAn optimistic checkpointing and message logging approach for consistent global checkpoint collection in distributed systems
Checkpointing and rollback recovery are widely used techniques for achieving fault-tolerance in distributed systems. In this paper, we present a novel checkpointing algorithm which has the following desirable features: A process can independently initiate consistent global checkpointing by saving its current state, called a tentative checkpoint. Other processes come to know about a consistent g...
متن کاملReview of Some Checkpointing Schemes for Distributed and Mobile Computing Environments
Mr Raman Kumar Mewar University, Chittorgargh (Raj) Email: [email protected] Dr Parveen Kumar Amity University Gurgaon (Haryana) Email: [email protected] ---------------------------------------------------------------------ABSTRACT------------------------------------------------------Fault Tolerance Techniques facilitate systems to carry out tasks in the incidence of faults. A checkpoint is a...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2001