Search results for: checkpointing

Number of results: 2665

Journal: Scalable Computing: Practice and Experience 2014
Bakhta Meroufel Ghalem Belalem

Cloud computing is a new benchmark towards enterprise application development that can facilitate the execution of workflows in business process management system. The workflow technology can manage the business processes efficiently satisfying the requirements of modern enterprises. Besides the scheduling, the fault tolerance is a very important issue in the workflow management. In this paper,...

2007
John Paul Walters Vipin Chaudhary

As computational clusters increase in size, their mean-time-to-failure reduces. Typically checkpointing is used to minimize the loss of computation. Most checkpointing techniques, however, require a central storage for storing checkpoints. This severely limits the scalability of checkpointing. We propose a scalable replication-based MPI checkpointing facility that is based on LAM/MPI. We extend...

2012
A. Vani Vathsala Hrushikesha Mohanty

Web Services are built on service-oriented architecture which is based on the notion of building applications by discovering and orchestrating services available on the web. Complex business processes can be realized by discovering and orchestrating already available services on the web. In order to make these orchestrated web services resilient to faults; we proposed a simple and elegant check...

1995
Gilbert Cabillic Gilles Muller Isabelle Puaut

This paper presents the design and implementation of a consistent checkpointing scheme for Distributed Shared Memory (DSM) systems. Our approach relies on the integration of checkpoints within synchronization barriers already existing in applications; this avoids the need to introduce an additional synchronization mechanism. The main advantage of our checkpointing mechanism is that performance...

2014
Matthew Forshaw A. Stephen McGough Nigel Thomas

Checkpointing is a fault-tolerance mechanism commonly used in High Throughput Computing (HTC) environments to allow the execution of long-running computational tasks on compute resources subject to hardware and software failures and interruptions from resource owners. With increasing scrutiny of the energy consumption of IT infrastructures, it is important to understand the impact of checkpoint...

1996
D. Manivannan Mukesh Singhal

In this paper, we propose a quasi-synchronous checkpointing algorithm and a low-overhead recovery algorithm based on it. The checkpointing algorithm preserves process autonomy by allowing them to take checkpoints asynchronously and uses communication-induced checkpoint coordination for the progression of the recovery line which helps bound rollback propagation during a recovery. Thus, it has th...

2014
Andreas Löscher Nicolas Tsiftes Thiemo Voigt Vlado Handziski

Developing sensornet software is difficult partly because of the limited visibility of the system state of deployed nodes. Sensornet checkpointing is a method that allows developers to save and restore full system state of nodes. We present four extensions to sensornet checkpointing—compression, binary diffs, selective checkpointing, and checkpoint inspection—that reduce the time required for c...

2000
Cheng-Min LIN

This work presents two novel algorithms to prevent rollback propagation for independent checkpointing: an efficient adaptive independent checkpointing algorithm and an optimized adaptive independent checkpointing algorithm. The last opportunity strategy that yields a better performance than the conservation strategy is also employed to prevent useless checkpoints for both causal rewinding paths...

2006
William W. Symes

The optimal checkpointing algorithm (Griewank and Walther, 2000) minimizes the computational complexity of the adjoint state method. Applied to reverse time migration, optimal checkpointing eliminates (or at least drastically reduces) the need for disk i/o, which is quite extensive in more straightforward implementations. This paper describes optimal checkpointing in a form which applies both t...

1998
Mangesh Kasbekar Chandramouli Narayanan Chita R Das

This paper presents a reflective approach to checkpointing concurrent object oriented programs. We describe a checkpointing and rollback library for multithreaded programs written in C++. We demonstrate some of the unique features offered by this library, such as selective checkpointing and selective rollbacks of threads of a process that are achievable only through the use of reflection.
