Towards Optimal Multi-Level Checkpointing
Similar resources
Optimal checkpointing interval for two-level recovery schemes
Keywords: Failure recovery, Two checkpoints, Checkpointing interval, Markov process, Rollback overhead. 1. INTRODUCTION In computer and database information systems, some errors often occur due to noises, human errors, software bugs, and hardware faults, and make these systems inherently...
High-level python abstractions for optimal checkpointing in inversion problems
Inversion and PDE-constrained optimization problems often rely on solving the adjoint problem to calculate the gradient of the objective function. This requires storing large amounts of intermediate data, setting a limit to the largest problem that might be solved with a given amount of memory available. Checkpointing is an approach that can reduce the amount of memory required by redoing parts...
Multi-level checkpointing and silent error detection for linear workflows
We focus on High Performance Computing (HPC) workflows whose dependency graph forms a linear chain, and we extend single-level checkpointing in two important directions. Our first contribution targets silent errors, and combines in-memory checkpoints with both partial and guaranteed verifications. Our second contribution deals with multi-level checkpointing for failstop errors. We present sophi...
Towards Multi-User Multi-Level Interaction
The necessity of incorporating experts from various domains in order to understand and draw meaningful conclusions from complex and massive amounts of data is an undisputed fact. In order to create and effectively use such a collaborative information workspace it is vital to understand the interaction processes involved. Established, high-level interaction patterns work well for single user, si...
Towards Application-Level Multi-Homing
Hundreds of millions of users are attracted to many of the thriving Internet applications and services, such as e-mail, online social networking, or video sharing. In an attempt to effectively serve the large number of users, the application providers started deploying their own application-specific network infrastructures. We demonstrate that the significant proliferation of applications and t...
Journal
Journal title: IEEE Transactions on Computers
Year: 2017
ISSN: 0018-9340
DOI: 10.1109/tc.2016.2643660