Search results for: checkpointing

Number of results: 2665

2013
Monika Nagpal Praveen Kumar

Checkpointing is one of the commonly used techniques to provide fault tolerance in distributed systems so that the system can operate even if one or more components have failed. However, mobile computing systems are constrained by low bandwidth, mobility, lack of stable storage, frequent disconnections and limited battery life. Hence checkpointing protocols which have fewer checkpoints are pref...

2004
Vamsi Kambhampati Indrajit Ray Eunjong Kim

Secure checkpointing appears to be a useful technique for designing survivable systems. These are fault-tolerant systems that are robust against malicious security attacks. Secure checkpointing, however, is not easily done. Without adequate protection, the checkpointing process can be attacked and compromised. The checkpointing data can be subjected to malicious attacks and be a source of secur...

2016
Michel Schanen Oana Marin Hong Zhang Mihai Anitescu

Adjoints are an important computational tool for large-scale sensitivity evaluation, uncertainty quantification, and derivative-based optimization. An essential component of their performance is the storage/recomputation balance in which efficient checkpointing methods play a key role. We introduce a novel asynchronous two-level adjoint checkpointing scheme for multistep numerical time discreti...

2010
Surender Kumar Parveen Kumar

Checkpointing is an efficient fault tolerance technique used in distributed systems. Mobile computing raises many new issues, such as high mobility, lack of stable storage on mobile hosts (MHs), low bandwidth of wireless channels, limited battery life and disconnections that make the traditional checkpointing protocols unsuitable for such systems. Several checkpointing algorithms have been repo...

2007
Andrey Smirnov

Efficient use of cluster resources is very important because of the high demand for parallel computation. Checkpointing allows cluster computing time to be managed more efficiently. This article discusses the problems of checkpointing parallel programs and presents an implementation of automatic parallel checkpointing systems for MPI programs. It is based on simple user-space portable chec...

Journal: J. Inf. Sci. Eng. 2010
Mehdi Lotfi Seyed Ahmad Motamedi

Blocking coordinated checkpointing is a well-known method for achieving fault tolerance in cluster computing systems. In this work, we introduce a new approach for blocking coordinated checkpointing using two-level checkpointing. The first level of checkpointing is local checkpointing, in which computing nodes save the checkpoints on local disk. If a transient failure occurs in the computing node, t...

Journal: JSW 2008
Nianen Chen Shangping Ren

Checkpointing is a commonly used approach to provide system fault-tolerance. However, using a constant checkpointing frequency may compromise the system’s overall performance when there are multiple types of QoS requirements involved. Hence, it is important that the checkpointing frequency is customizable and runtime adaptable. However, for open distributed and embedded applications, often ther...

1995
Pierre Sens

This paper describes performance measurements of an implementation of independent checkpointing in a network of workstations. Independent checkpointing is a simple technique for providing fault tolerance in distributed systems. Because processes do not coordinate during checkpointing, this technique has a low run-time overhead. To avoid the classical domino effect, our implementation relies on a...

Journal: J. Parallel Distrib. Comput. 1997
Adam Beguelin Erik Seligman Peter Stephan

We have explored methods for checkpointing and restarting processes within the Distributed object migration environment (Dome), a C++ library of data parallel objects that are automatically distributed over heterogeneous networks of workstations (NOWs). System level checkpointing methods, although transparent to the user, were rejected because they lack support for heterogeneity. We have implem...

2010
Surender Kumar R. K. Chauhan Parveen Kumar Lalit Kumar R K Chauhan V K Gupta

Checkpointing is an efficient way of implementing fault tolerance in distributed systems. Mobile computing raises many new issues, such as high mobility, lack of stable storage on mobile hosts (MHs), low bandwidth of wireless channels, limited battery life and disconnections that make the traditional checkpointing protocols unsuitable for such systems. Minimum process non-blocking coordinated c...
