
Define The Terms: Rollback Propagation.: Coordinated Checkpointing

The document discusses various concepts in distributed computing, including rollback propagation, outside world processes, and types of checkpointing. It compares coordinated and uncoordinated checkpointing, details the Koo–Toueg coordinated checkpointing algorithm, and explains the rollback recovery algorithm. Additionally, it describes the Juang–Venkatesan algorithm for asynchronous checkpointing and recovery, emphasizing the importance of consistent checkpoints and message logging.

FT 6 Distributed Computing

1. Define the term: rollback propagation.

In a distributed system, messages exchanged during failure-free operation create dependencies between processes. Upon a failure of one or more processes, these dependencies may force some of the processes that did not fail to roll back as well. This is commonly called rollback propagation.
2. What is meant by an “outside world process (OWP)”?

A distributed system often interacts with the outside world to receive input data or deliver the outcome of a computation. The outside world is modeled as an outside world process (OWP): a special process that interacts with the rest of the system through message passing. A common approach is to save each input message on stable storage before allowing the application program to process it.

3. What are the two types of communication-induced checkpointing?

The two types of communication-induced checkpointing are model-based checkpointing and index-based checkpointing.

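4. Formulate the different types of messages.

In-transit messages: sent but not yet received.
Lost messages: the send is recorded, but the receive has been undone by a rollback of the receiver.
Orphan messages: the receive is recorded, but the send is not.
Duplicate messages: produced when logged messages are replayed during recovery.
Delayed messages: the receive is not recorded because the receiver was down or had rolled back when the message arrived.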
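
The message types can be made concrete with a small sketch that classifies one message against a set of local checkpoints. It uses only the standard definitions of in-transit messages (send recorded, receive not recorded) and orphan messages (receive recorded, send not recorded). The `Message` class, the event indices, and the `classify` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Message:
    sender: str
    receiver: str
    send_event: int            # index of the send event at the sender
    recv_event: Optional[int]  # index of the receive event, None if not yet received


def classify(msg: Message, cut: Dict[str, int]) -> str:
    """Classify msg against a global checkpoint.

    cut[p] is the index of the last event of process p that is
    included in p's local checkpoint (an assumption of this sketch).
    """
    send_recorded = msg.send_event <= cut[msg.sender]
    recv_recorded = (
        msg.recv_event is not None and msg.recv_event <= cut[msg.receiver]
    )
    if send_recorded and recv_recorded:
        return "recorded"    # both ends are inside the checkpoint
    if send_recorded:
        return "in-transit"  # sent before the checkpoint, not yet received
    if recv_recorded:
        return "orphan"      # received, but the send is not recorded
    return "after-cut"       # both events happen after the checkpoint


cut = {"P": 2, "Q": 1}
print(classify(Message("P", "Q", send_event=1, recv_event=3), cut))  # in-transit
print(classify(Message("P", "Q", send_event=3, recv_event=1), cut))  # orphan
```

A global checkpoint is consistent only if no message is classified as an orphan.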
5. Compare coordinated checkpointing and uncoordinated checkpointing.

Coordinated checkpointing:
Processes synchronize their checkpointing activities so that the local checkpoints they take together form a consistent global state. This simplifies recovery, because the system can always restart from the most recent global checkpoint, at the cost of coordination overhead during normal operation.

Uncoordinated checkpointing:
Processes take checkpoints independently, without synchronization. This reduces checkpointing overhead and lets each process choose when to checkpoint, but the checkpoints may not form a consistent global state, which complicates recovery. Additional mechanisms, such as message logging, are then needed to handle the dependencies between processes.

6. i) Summarize the Koo–Toueg coordinated checkpointing algorithm.

The Koo–Toueg algorithm is a coordinated checkpointing and recovery technique that takes a consistent set of checkpoints and avoids the domino effect and livelock problems during recovery. It has two parts: the checkpointing algorithm and the recovery algorithm.

Assumptions: channels are FIFO, end-to-end protocols are assumed, communication failures do not partition the network, a single process initiates the algorithm, and no process fails during the execution of the algorithm.

Two kinds of checkpoints: permanent and tentative.
Permanent checkpoint: a local checkpoint that is part of a consistent global checkpoint.
Tentative checkpoint: a temporary checkpoint that becomes a permanent checkpoint when the algorithm terminates successfully.

Checkpointing algorithm
First phase

An initiating process Pi takes a tentative checkpoint and requests all other processes to take tentative
checkpoints. Each process informs Pi whether it succeeded in taking a tentative checkpoint. A process
says “no” to a request if it fails to take a tentative checkpoint, which could be due to several reasons,
depending upon the underlying application. If Pi learns that all the processes have successfully taken
tentative checkpoints, Pi decides that all tentative checkpoints should be made permanent; otherwise,
Pi decides that all the tentative checkpoints should be discarded.

Second phase

Pi informs all the processes of the decision it reached at the end of the first phase. A process, on
receiving the message from Pi, will act accordingly.

Therefore, either all or none of the processes advance the checkpoint by taking permanent checkpoints.

The algorithm requires that after a process has taken a tentative checkpoint, it cannot send messages
related to the underlying computation until it is informed of Pi’s decision.

Correctness: the set of permanent checkpoints is consistent for two reasons:

Either all or none of the processes take permanent checkpoints.
No process sends a message after taking a tentative checkpoint until it receives Pi’s decision.

Optimization: not every process may need to take a checkpoint, for example a process whose state has not changed since its last checkpoint.
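
The two phases of the checkpointing algorithm can be sketched as a small in-memory simulation. This is only an illustration under simplifying assumptions: processes are plain objects, messages are replaced by direct method calls, and the `can_checkpoint` flag is a hypothetical stand-in for whatever makes a process reply “no”.

```python
class Process:
    def __init__(self, name, state=0, can_checkpoint=True):
        self.name = name
        self.state = state
        self.can_checkpoint = can_checkpoint  # hypothetical: False simulates a "no" reply
        self.permanent = state   # last permanent checkpoint
        self.tentative = None    # pending tentative checkpoint, if any
        self.blocked = False     # True while application sends are forbidden

    def take_tentative(self):
        """Phase 1: try to take a tentative checkpoint and reply yes/no."""
        if not self.can_checkpoint:
            return False
        self.tentative = self.state
        self.blocked = True      # no application messages until the decision arrives
        return True

    def on_decision(self, commit):
        """Phase 2: make the tentative checkpoint permanent or discard it."""
        if commit and self.tentative is not None:
            self.permanent = self.tentative
        self.tentative = None
        self.blocked = False


def coordinated_checkpoint(initiator, others):
    participants = [initiator] + list(others)
    # Phase 1: everyone attempts a tentative checkpoint and reports back.
    replies = [p.take_tentative() for p in participants]
    commit = all(replies)
    # Phase 2: the initiator's decision is applied by every process.
    for p in participants:
        p.on_decision(commit)
    return commit
```

Because the decision is applied to every participant, either all permanent checkpoints advance or none do, which mirrors the all-or-none property stated above.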

ii) Explain the rollback recovery algorithm.

The rollback recovery algorithm

The algorithm restores the system to a consistent state after a failure. It assumes a single initiator and that the checkpointing and rollback recovery algorithms are not invoked concurrently. It runs in two phases:
In the first phase, the initiating process asks all other processes whether they are willing to restart from their previous checkpoints. The rollback happens only if every process agrees.
In the second phase, the initiating process sends its final decision to all processes, and each process acts accordingly on receiving it.

First phase

An initiating process Pi sends a message to all other processes to check if they all are willing to restart
from their previous checkpoints. A process may reply “no” to a restart request due to any reason (e.g., it
is already participating in a checkpoint or recovery process initiated by some other process). If Pi learns
that all processes are willing to restart from their previous checkpoints, Pi decides that all processes
should roll back to their previous checkpoints.

Otherwise, Pi aborts the rollback attempt and it may attempt a recovery at a later time.

Second phase

Pi propagates its decision to all the processes. On receiving Pi’s decision, a process acts accordingly.

During the execution of the recovery algorithm, a process cannot send messages related to the
underlying computation while it is waiting for Pi’s decision.
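
A minimal sketch of the two-phase rollback decision follows. The dictionaries and the `willing` map are hypothetical stand-ins for the real request and reply messages, chosen only to show that the rollback is all-or-nothing.

```python
def rollback_recovery(states, checkpoints, willing):
    """Simulate one rollback attempt by an initiator.

    states:      process name -> current state (updated in place)
    checkpoints: process name -> previous checkpoint
    willing:     process name -> reply to the restart request (True = "yes")
    Returns True if all processes rolled back, False if the attempt was aborted.
    """
    # Phase 1: collect the replies to the restart request.
    decision = all(willing[name] for name in states)
    # Phase 2: propagate the decision; each process acts accordingly.
    if decision:
        for name in states:
            states[name] = checkpoints[name]
    return decision


states = {"P1": 14, "P2": 9}
checkpoints = {"P1": 10, "P2": 7}
print(rollback_recovery(states, checkpoints, {"P1": True, "P2": False}))  # False
print(states)  # unchanged: {'P1': 14, 'P2': 9}
```

If the attempt is aborted, the initiator can simply call the function again later, matching the retry described above.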

7. Describe the Juang–Venkatesan algorithm for asynchronous checkpointing and recovery.

Assumptions: communication channels are reliable, deliver messages in FIFO order, and have infinite buffers; message transmission delay is arbitrary but finite.
The underlying computation is event-driven: a process P in state s receives a message m, processes it, moves to state s’, and sends out messages. The triplet (s, m, msgs_sent) therefore represents the state of P.

Two types of log storage are maintained:

Volatile log: fast to access, but lost if the processor crashes. Its contents are moved to the stable log periodically.
Stable log: slower to access, but survives a crash.

Asynchronous checkpointing:
After executing an event, a process records the triplet without any synchronization with other processes. A local checkpoint consists of a set of such records, which are first stored in the volatile log and then moved to the stable log.
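
The logging scheme can be sketched as follows. This is an illustrative model only: both logs are in-memory lists, `stable_log` merely stands in for stable storage, and the class and method names are hypothetical.

```python
class LoggedProcess:
    def __init__(self, initial_state):
        self.state = initial_state
        self.volatile_log = []  # fast, but lost on a crash
        self.stable_log = []    # stands in for stable storage

    def handle(self, message, transition):
        """Process one event and record its triplet (s, m, msgs_sent).

        transition(state, message) must return (new_state, msgs_sent).
        """
        s = self.state
        new_state, msgs_sent = transition(s, message)
        self.volatile_log.append((s, message, tuple(msgs_sent)))
        self.state = new_state
        return msgs_sent

    def flush(self):
        """Periodically move the volatile records to the stable log."""
        self.stable_log.extend(self.volatile_log)
        self.volatile_log.clear()

    def crash(self):
        """A crash loses everything that was not flushed."""
        self.volatile_log.clear()


def add(state, message):
    # Toy event handler: add the message to the state and send the new total.
    return state + message, [state + message]


p = LoggedProcess(0)
p.handle(1, add)
p.handle(2, add)
p.flush()
p.handle(3, add)
p.crash()
print(p.stable_log)  # [(0, 1, (1,)), (1, 2, (3,))]
```

Only the two flushed events survive the crash, so recovery can restart no later than the state after the second event.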
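Recovery algorithm

Notation:
$RCVD_{i \leftarrow j}(CkPt_i)$: the number of messages received by $p_i$ from $p_j$, from the beginning of the computation to checkpoint $CkPt_i$.
$SENT_{i \rightarrow j}(CkPt_i)$: the number of messages sent by $p_i$ to $p_j$, from the beginning of the computation to checkpoint $CkPt_i$.

Idea:
From the set of checkpoints, find a set of consistent checkpoints, based on the number of messages sent and received. Process $p_i$ must roll back further whenever $RCVD_{i \leftarrow j}(CkPt_i) > SENT_{j \rightarrow i}(CkPt_j)$, because it would otherwise have received messages that $p_j$ has not recorded as sent.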
Example: Since $RCVD_{X \leftarrow Y}(CkPt_X) = 3 > 2$ (2 is the value received in the ROLLBACK(Y, 2) message from Y), X will set $CkPt_X$ to ex2, satisfying $RCVD_{X \leftarrow Y}(ex2) = 2 \le 2$. Since $RCVD_{Z \leftarrow Y}(CkPt_Z) = 2 > 1$, Z will set $CkPt_Z$ to ez1, satisfying $RCVD_{Z \leftarrow Y}(ez1) = 1 \le 1$. At Y, $RCVD_{Y \leftarrow X}(CkPt_Y) = 1 < 2$ and $RCVD_{Y \leftarrow Z}(CkPt_Y) = 1 = SENT_{Z \rightarrow Y}(CkPt_Z)$.

Hence, Y need not roll back further. In the second iteration, Y sends ROLLBACK(Y, 2) to X and ROLLBACK(Y, 1) to Z; Z sends ROLLBACK(Z, 1) to Y and ROLLBACK(Z, 0) to X; X sends ROLLBACK(X, 0) to Z and ROLLBACK(X, 1) to Y. Note that if Y rolls back beyond ey3 and loses the message from X that caused ey3, X can resend this message to Y because ex2 is logged at X and this message is available in the log. The second and third iterations progress in the same manner. Note that the set of recovery points chosen at the end of the first iteration, {ex2, ey2, ez1}, is consistent, and no further rollback occurs.
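
The per-message rollback step in the example can be sketched as a small function: on receiving ROLLBACK(j, c), a process moves its checkpoint back to the latest event at which it had received at most c messages from j. The function name and the list encoding are hypothetical. The counts for ex2, ex3, ez1, ez2 and the checkpoint at Y follow the example; the counts for earlier events, and the choice of ex3 and ez2 as the starting checkpoints of X and Z, are assumptions made only so that the sketch runs.

```python
def rollback_point(rcvd_counts, current, sent_by_peer):
    """Return the event index a process should roll back to.

    rcvd_counts[k]: messages received from the peer up to and including event k
    current:        index of the process's current checkpoint event
    sent_by_peer:   the value c carried by ROLLBACK(peer, c)
    """
    k = current
    # Step back while the process has received more than the peer recorded sending.
    while k > 0 and rcvd_counts[k] > sent_by_peer:
        k -= 1
    return k


# X handles ROLLBACK(Y, 2): RCVD at ex3 is 3 > 2, RCVD at ex2 is 2 <= 2.
print(rollback_point([0, 1, 2, 3], current=3, sent_by_peer=2))  # 2, i.e. ex2
# Z handles ROLLBACK(Y, 1): RCVD at ez2 is 2 > 1, RCVD at ez1 is 1 <= 1.
print(rollback_point([0, 1, 2], current=2, sent_by_peer=1))     # 1, i.e. ez1
# Y handles a ROLLBACK from X carrying 2: RCVD at its checkpoint is 1 < 2, so Y stays.
print(rollback_point([0, 0, 1], current=2, sent_by_peer=2))     # 2, no rollback
```

In the full algorithm each process applies this step for every ROLLBACK message it receives and repeats the exchange over several iterations; the sketch covers a single step only.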
