Search results for: coordinated checkpointing

Number of results: 48092

1995
Geert Deconinck Johan Vounckx Rudy Lauwereins Jean A. Peperstraete

We propose a method to incorporate coordinated checkpointing and rollback in high performance computing applications on massively parallel computers. A library allows the user to specify which data-items (including files) belong to the contents of the checkpoint, and to trigger the checkpointing in the application. The recovery-line management on the distributed disk system takes care of which ...
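As a rough illustration of the idea of letting the user declare which data items belong to a checkpoint and then trigger it from the application, here is a minimal single-process sketch. It is not the library described in the abstract: the class `CheckpointRegistry` and its methods `register`, `checkpoint`, and `restore` are hypothetical names, and coordination across processes and recovery-line management are deliberately left out.

```python
import os
import pickle
import tempfile


class CheckpointRegistry:
    """Hypothetical helper: tracks the data items that make up a checkpoint."""

    def __init__(self, path):
        self.path = path
        self._getters = {}  # item name -> zero-argument callable returning its value

    def register(self, name, getter):
        """Declare that `name` belongs to the checkpoint contents."""
        self._getters[name] = getter

    def checkpoint(self):
        """Save all registered items; write to a temp file, then rename atomically."""
        snapshot = {name: get() for name, get in self._getters.items()}
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "wb") as f:
            pickle.dump(snapshot, f)
            f.flush()
            os.fsync(f.fileno())
        # os.replace swaps the file in one step, so a crash never leaves a half-written checkpoint.
        os.replace(tmp, self.path)
        return snapshot

    def restore(self):
        """Load the most recently committed checkpoint."""
        with open(self.path, "rb") as f:
            return pickle.load(f)
```

The application would call `register` once per data item and `checkpoint()` at points it considers safe; after a failure, `restore()` returns the last committed values.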

2006
Chaoguang Men Dongsheng Wang Yunlong Zhao

In this paper, the concept of “computing checkpoint” is introduced, and then an efficient coordinated checkpoint algorithm is proposed. The algorithm combines two approaches to reducing the overhead of coordinated checkpointing: minimizing the number of processes that take checkpoints, and making the checkpointing process non-blocking. Through piggybacking the...

2015
Raman Kumar Parveen Kumar

Raman Kumar, Mewar University, Chittorgarh (Raj); Parveen Kumar, Amity University, Gurgaon (Haryana). Abstract: Fault tolerance techniques enable systems to carry out tasks in the presence of faults. A checkpoint is a...

2014
Monika Nagpal

In this paper, a three phase minimum-process coordinated checkpointing algorithm for nondeterministic mobile distributed systems is proposed, where no useless checkpoints are taken. An effort has been made to minimize the blocking of processes and synchronization message overhead and to capture the partial transitive dependencies during the normal execution by piggybacking dependency vectors on...
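The general idea of capturing transitive dependencies by piggybacking dependency vectors can be sketched as follows. This is a generic illustration under stated assumptions, not the three-phase algorithm of the abstract: the `Process` class, its message format, and the `minimum_set` helper are hypothetical, and blocking, mobility, and the checkpoint phases themselves are not modeled.

```python
class Process:
    """Hypothetical process that tracks which other processes it depends on."""

    def __init__(self, pid, n):
        self.pid = pid
        self.dep = [False] * n  # dep[j] is True if this process depends on process j
        self.dep[pid] = True    # a process trivially depends on itself

    def send(self):
        # Piggyback a copy of the current dependency vector on the outgoing message.
        return {"sender": self.pid, "dep": list(self.dep)}

    def receive(self, msg):
        # Direct dependency on the sender.
        self.dep[msg["sender"]] = True
        # Transitive dependencies: merge in everything the sender depended on.
        self.dep = [a or b for a, b in zip(self.dep, msg["dep"])]

    def minimum_set(self):
        # Processes that would have to checkpoint if this process initiated one.
        return {j for j, depends in enumerate(self.dep) if depends}
```

For example, with four processes where P2 sends to P1 and P1 then sends to P0, `P0.minimum_set()` contains 0, 1, and 2, while P3, which exchanged no messages, is left out and would not need to take a checkpoint.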

1999
Luís Moura Silva João Gabriel Silva

Checkpointing is a very effective technique to ensure the continuity of long-running applications in the presence of failures. However, one of the drawbacks of coordinated checkpointing is the high latency for committing output from the application to the external world. Enhancing the checkpointing scheme with a message logging protocol is a good solution to reduce the output latency. The ide...

1994
Thomas Eirich

The paper discusses problems of checkpointing in distributed object systems and presents an algorithm suited optimally to their fine-grained structure. Usually, checkpoint algorithms assume nodes or processes as system units. This assumption results in a coarse-grained structure of checkpointing. We will show that this difference in granularity makes usual checkpoint algorithms inadequate. The ...

2011
Liu Guoliang Chen Shuyu Zhang Xiaoqin

Checkpointing and rollback recovery, an effective method of fault tolerance, has been widely used in parallel and distributed computer systems. We present a nonblocking coordinated checkpointing algorithm for distributed systems that differs from the conventional approach of first taking temporary checkpoints and then converting them to permanent ones by proce...

1998
Nuno F. Neves Ravishankar Iyer Jane Liu Laxmikant Kale

Distributed systems are being used to support the execution of applications ranging from long-running scientific simulators to e-commerce on the Internet. In this type of environment, the failure of one of its components, either a computer or the network, may prevent other components from completing their tasks. Since the probability of failure increases with the number of computers and executi...
