Unit II Notes

distributed computing

CS8603-Distributed System

UNIT II
MESSAGE ORDERING & SNAPSHOTS
Message ordering and group communication: Message ordering paradigms –
Asynchronous execution with synchronous communication –Synchronous program order on an
asynchronous system –Group communication – Causal order (CO) - Total order. Global state and
snapshot recording algorithms: Introduction –System model and definitions –Snapshot
algorithms for FIFO channels

2.1. Message ordering and Group Communication


2.1.1. Message Ordering Paradigm
⚫ IPC is at the core of any distributed system.
⚫ IPC is done via message passing.
Notations:
⚫ The distributed system is modelled as a graph (N, L).
⚫ A message is represented as mi.
⚫ For mi, the send and receive events are represented as si and ri.
⚫ Generally, send and receive events are represented as s and r.
⚫ We also use M, send(M), and receive(M).

⚫ Corresponding events: a ∼ b denotes that a and b occur at the same process,
where a and b are any two events (either send or receive) at that process.

⚫ Send–receive pairs: T = {(s, r) ∈ Ei × Ej | s corresponds to r}


⚫ Message delivery order is important, because it determines messaging behavior.
⚫ Middleware is used to provide certain well-defined message delivery behavior to the
programming language.
⚫ This is very useful to the programmer, who can code the logic with respect to this behavior.
⚫ Several message delivery orders are:
⚫ Non-FIFO
⚫ FIFO
⚫ Causal
⚫ Synchronous

⚫ A-execution: an execution (E, ≺) for which the causality relation ≺ is a partial order.

⚫ Asynchronous Executions:
⚫ There are no causality cycles; on any logical link, delivery is not necessarily FIFO.
⚫ Such executions are also called non-FIFO executions.
⚫ Due to the physical nature of the link, it may deliver the messages sent on it in FIFO
order (all physical links obey FIFO).
⚫ e.g., the network layer IPv4 connectionless service.

Fig: Not a FIFO execution.

⚫ FIFO Execution:
⚫ An A-execution in which, for all (s, r) and (s′, r′) ∈ T,
(s ∼ s′ and r ∼ r′ and s ≺ s′) =⇒ r ≺ r′.
⚫ A logical link may be inherently non-FIFO.
⚫ We can assume a connection-oriented service at the transport layer, e.g., TCP.
⚫ To implement FIFO over a non-FIFO link:
⚫ use a (seq_num, conn_id) pair per message; the receiver uses a buffer to order
messages.

Fig: A FIFO execution.
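The receiver-side buffering just described can be sketched as follows. This is a minimal sketch with assumed names (`FifoReceiver`, `on_arrival`); it shows how stamping each message with (seq_num, conn_id) lets the receiver deliver in send order even when the link reorders messages.

```python
# Sketch: enforcing FIFO delivery over a non-FIFO link.
# The sender stamps each message with (seq_num, conn_id); the receiver
# buffers out-of-order arrivals until the sequence gap is filled.

class FifoReceiver:
    def __init__(self):
        self.next_seq = {}   # conn_id -> next expected sequence number
        self.buffer = {}     # conn_id -> {seq_num: message}
        self.delivered = []  # messages handed to the application, in order

    def on_arrival(self, conn_id, seq_num, message):
        self.next_seq.setdefault(conn_id, 0)
        self.buffer.setdefault(conn_id, {})[seq_num] = message
        # Deliver every consecutive message starting at the expected number.
        while self.next_seq[conn_id] in self.buffer[conn_id]:
            n = self.next_seq[conn_id]
            self.delivered.append(self.buffer[conn_id].pop(n))
            self.next_seq[conn_id] = n + 1

r = FifoReceiver()
r.on_arrival("c1", 1, "m1")   # arrives out of order: buffered
r.on_arrival("c1", 0, "m0")   # fills the gap: both are delivered
print(r.delivered)            # ['m0', 'm1']
```

Note that messages are reordered per connection (`conn_id`), so independent logical links do not block one another.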

⚫ Causally Ordered (CO) Execution:
⚫ A CO execution is an A-execution in which, for all (s, r) and (s′, r′) ∈ T,
(r ∼ r′ and s ≺ s′) =⇒ r ≺ r′.
⚫ If send events s and s′ are related by causality ordering (not physical time ordering),
their corresponding receive events r and r′ occur in the same order at all common
destinations.
⚫ If s and s′ are not related by causality, then CO is vacuously satisfied.

Fig. a

⚫ Figure (a) shows an execution that violates CO, because s1 ≺ s3 and at the common
destination P1 we have r3 ≺ r1.
⚫ Figure (b) shows an execution that satisfies CO. Only s1 and s2 are related by
causality, but the destinations of the corresponding messages are different.
⚫ Figure (c) shows an execution that satisfies CO. No send events are related by
causality.
⚫ Figure (d) shows an execution that satisfies CO. s2 and s1 are related by causality,
but the destinations of the corresponding messages are different. Similarly for s2 and
s3.

⚫ CO alternate definition:
⚫ If send(m1) ≺ send(m2), then for each common destination d of messages m1 and
m2, delivered(m1) ≺ delivered(m2) must be satisfied.

⚫ Message arrival vs. delivery:
⚫ A message m that arrives in the OS buffer at Pi may have to be delayed until the
messages that were sent to Pi causally before m was sent (the “overtaken”
messages) have arrived.
⚫ The event of the application processing an arrived message is referred to as a
delivery event (instead of as a receive event).
⚫ No message is overtaken by a chain of messages between the same (sender, receiver)
pair.
⚫ Causal Order: Other Characterizations
⚫ Message Order (MO):
⚫ An A-execution in which, for all (s, r) and (s′, r′) ∈ T, s ≺ s′ =⇒ ¬(r′ ≺ r).
⚫ Fig (a): s1 ≺ s3 but ¬(r3 ≺ r1) is false ⇒ MO is not satisfied.

⚫ Empty-Interval (EI) property:
⚫ (E, ≺) is an EI execution if for each (s, r) ∈ T, the open interval set
{x ∈ E | s ≺ x ≺ r} in the partial order is empty.
⚫ Fig (b): consider M2. There is no event x such that s2 ≺ x ≺ r2, and this holds for
all messages ⇒ EI.
⚫ For each EI pair (s, r), there exists some linear extension < of ≺ such that the
corresponding interval {x ∈ E | s < x < r} is also empty.

⚫ Common Past and Future:
⚫ An execution (E, ≺) is CO iff for each pair (s, r) ∈ T and each event e ∈ E:
⚫ Weak common past: e ≺ r =⇒ ¬(s ≺ e)
⚫ Weak common future: s ≺ e =⇒ ¬(e ≺ r)
⚫ Synchronous Executions (SYNC)
⚫ There is a handshake between sender and receiver.
⚫ Communication is instantaneous ⇒ a modified definition of causality, where s and r
are atomic and simultaneous, neither preceding the other.

Fig: Execution in an asynchronous system and its equivalent instantaneous communication
in a synchronous system.

⚫ Synchronous Executions: Definition
⚫ Causality in a synchronous execution:
⚫ The synchronous causality relation ≪ on E is the smallest transitive relation that
satisfies the following:
⚫ S1. If x occurs before y at the same process, then x ≪ y.
⚫ S2. If (s, r) ∈ T, then for all x ∈ E, [(x ≪ s ⇐⇒ x ≪ r) and (s ≪ x ⇐⇒ r ≪ x)].
⚫ S3. If x ≪ y and y ≪ z, then x ≪ z.
⚫ Synchronous execution (or S-execution):
⚫ An execution (E, ≪) for which the causality relation ≪ is a partial order.
⚫ Timestamping a synchronous execution:
⚫ An execution (E, ≺) is synchronous iff there exists a mapping from E to T
(scalar timestamps) such that
⚫ for any message M, T(s(M)) = T(r(M));
⚫ for each process Pi, if ei ≺ e′i then T(ei) < T(e′i).

2.1.2. Asynchronous Execution with Synchronous Communication

⚫ Synchronous send and receive primitives between pairs of processes produce synchronous
order.
⚫ One question here is:
⚫ Will a program written for an asynchronous system (an A-execution) run correctly if
run with synchronous primitives?
⚫ The answer is:
⚫ An algorithm that runs on an asynchronous system may deadlock on a synchronous
system.

Fig: A communication program for an asynchronous system deadlocks when using
synchronous primitives.
⚫ Executions realizable with synchronous communication (RSC):
⚫ An execution is realizable with synchronous communication if each send event is
immediately followed by its corresponding receive event.
⚫ Non-separated linear extension of (E, ≺):
⚫ A linear extension of (E, ≺) such that for each pair (s, r) ∈ T, the interval
{x ∈ E | s ≺ x ≺ r} is empty.
⚫ s2 r2 s3 r3 s1 r1 is a linear extension that is non-separated.
⚫ s2 s1 r2 s3 r3 r1 is a linear extension that is separated.
⚫ RSC execution:
⚫ An A-execution (E, ≺) is an RSC execution iff there exists a non-separated
linear extension of the partial order (E, ≺).
⚫ If an adjacent send event and its corresponding receive event are viewed
atomically, then that pair of events shares a common past and a common
future with each other.
⚫ Crown:
⚫ The characterization of executions in terms of a graph structure is called a crown.
⚫ It leads to a feasible test for an RSC execution.
⚫ Definition:
⚫ Let E be an execution. A crown of size k in E is a sequence <(si, ri), i ∈ {0, …, k−1}>
of pairs of corresponding send and receive events such that: s0 ≺ r1, s1 ≺ r2, …,
sk−2 ≺ rk−1, sk−1 ≺ r0.
Examples:

a. The crown is <(s1, r1), (s2, r2)> as we have s1 ≺ r2 and s2 ≺ r1.
b. The crown is <(s1, r1), (s3, r3), (s2, r2)> as we have s1 ≺ r3 and s3 ≺ r2 and s2 ≺ r1.
c. The crown is <(s1, r1), (s2, r2), (s3, r3)> as we have s1 ≺ r2 and s2 ≺ r3 and s3 ≺ r1.

Some observations
• In a crown, si and ri+1 may or may not be on the same process.
• A non-CO execution must have a crown.
• CO executions (that are not synchronous) also have a crown (see Fig (b)).
• Cyclic dependencies of a crown ⇒ messages cannot be scheduled serially ⇒ not RSC.

⚫ This can be generalized to state that a non-CO execution must have a crown of size
at least 2 (examples in figures a and b).
⚫ A CO execution that is not synchronous has a crown of size at least 3 (example in figure c).

⚫ To determine whether an execution is RSC, we need to determine whether there exist any
cyclic dependencies among its messages.

⚫ Crown Test for RSC executions:
⚫ Define the ‹→ : T × T relation on messages in the execution (E, ≺) as follows:
‹→([s, r], [s′, r′]) iff s ≺ r′. Observe that the condition s ≺ r′ (which has the form
used in the definition of a crown) is implied by all four of the conditions: (i) s ≺ s′,
(ii) s ≺ r′, (iii) r ≺ s′, and (iv) r ≺ r′.
⚫ Now define a directed graph G‹→ = (T, ‹→), where the vertex set is the set of
messages T and the edge set is defined by ‹→.
⚫ Observe that ‹→ : T × T is a partial order iff G‹→ has no cycle, i.e., there must not
be a cycle with respect to ‹→ on the set of corresponding (s, r) events.
⚫ Observe from the definition of a crown that G‹→ has a directed cycle iff (E, ≺) has
a crown.
⚫ Crown criterion:
⚫ An A-computation is RSC, i.e., it can be realized on a system with synchronous
communication, iff it contains no crown.
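The crown criterion reduces the RSC test to cycle detection on the message graph G‹→. Below is a minimal sketch (assumed names `is_rsc`, `edge`): each message is a vertex, there is an edge m → m′ whenever s(m) ≺ r(m′), and the execution is RSC iff the graph is acyclic.

```python
# Sketch of the crown test: the execution is RSC iff G‹→ has no cycle.
def is_rsc(messages, edge):
    # messages: list of message ids; edge(m, m2): True iff s(m) ≺ r(m2)
    graph = {m: [m2 for m2 in messages if m2 != m and edge(m, m2)]
             for m in messages}
    WHITE, GREY, BLACK = 0, 1, 2
    color = {m: WHITE for m in messages}

    def has_cycle(m):                 # depth-first search for a back edge
        color[m] = GREY
        for m2 in graph[m]:
            if color[m2] == GREY:
                return True           # back edge: a crown exists
            if color[m2] == WHITE and has_cycle(m2):
                return True
        color[m] = BLACK
        return False

    return not any(color[m] == WHITE and has_cycle(m) for m in messages)

# Example (c) above: s1 ≺ r2, s2 ≺ r3, s3 ≺ r1 forms a crown of size 3,
# so the execution is not RSC.
crown_edges = {(1, 2), (2, 3), (3, 1)}
print(is_rsc([1, 2, 3], lambda a, b: (a, b) in crown_edges))  # False
```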

⚫ Timestamps for an RSC execution:
⚫ An execution (E, ≺) is RSC iff there exists a mapping from E to T (scalar
timestamps) such that
⚫ for any message M, T(s(M)) = T(r(M));
⚫ for each (a, b) in (E × E) \ T, a ≺ b =⇒ T(a) < T(b).

⚫ Hierarchy of Message Ordering Paradigms

⚫ RSC ⊂ CO ⊂ FIFO ⊂ A.
⚫ An A-execution is RSC iff it is an S-execution.

⚫ The smaller classes impose more restrictions on the possible message orderings. The
degree of concurrency is greatest in A and least in SYNC.
⚫ A program using synchronous communication is easiest to develop and verify. A
program using non-FIFO communication, resulting in an A-execution, is hardest to
design and verify.

Fig: Hierarchy of message ordering paradigms. (a) Venn diagram. (b) Example executions.

2.1.3. Synchronous Program Order on an Asynchronous System


⚫ Rendezvous is a synchronization mechanism based on procedural decomposition.
⚫ It is synchronous communication among an arbitrary number of asynchronous processes.
⚫ It is similar to a procedure call.
⚫ The difference here is that the caller and the called procedure belong to different tasks.
⚫ The called procedure is usually called an entry point of the corresponding task.
⚫ A call to an entry point is synchronous,
⚫ i.e., the caller is blocked until completion.
⚫ Rendezvous is an architecture for creating multi-user applications.
⚫ It provides support for managing a multi-user session.
⚫ Functionality:
⚫ Performing fundamental input and output activities.
⚫ Controlling the degree to which the multiple users either share or do not share both
information and control.

⚫ Binary rendezvous:
⚫ Implemented by using tokens.
⚫ There is a token for each enabled interaction; interactions are scheduled online,
atomically, and in a distributed manner, using crown-free scheduling.
⚫ Bagrodia’s Algorithm for Binary Rendezvous:
⚫ Assumption:
⚫ Receive commands are forever enabled from all processes.
⚫ A send command, once enabled, remains enabled until it completes, i.e., it is not
possible that a send command gets disabled (by its guard getting falsified) before
the send is executed.
⚫ To prevent deadlock, process identifiers are used to introduce asymmetry to break
potential crowns that arise.
⚫ Each process attempts to schedule only one send event at any time.
⚫ Message types:
⚫ M, ack(M), request(M), permission(M)
⚫ A process blocks when it knows it can successfully synchronize the current message.
⚫ The figure shows how the high- and low-priority processes block.
⚫ Each process maintains a queue that is processed in FIFO order, but only when the
process is unblocked.
⚫ When a process is blocked waiting for a particular message that it is currently
synchronizing, any other message that arrives is queued up.
⚫ ack(M), request(M), and permission(M) are control messages.
⚫ Lower Priority Process:
⚫ Messages M and ack(M) are involved, in that order.
⚫ The sender issues send(M) and blocks until ack(M) arrives.
⚫ Thus, when sending to a lower priority process, the sender blocks waiting for
the partner process to synchronize and send an acknowledgement.

⚫ Higher Priority Process:
⚫ Messages request(M), permission(M), and M are involved, in that order.
⚫ The sender issues send(request(M)), does not block, and awaits permission.
⚫ When permission(M) arrives, the sender issues send(M).
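The two sender-side cases above can be sketched as a toy function. This is only an illustration of which control messages are exchanged; the transport itself is elided, and the assumption that a larger process id means higher priority is mine (the algorithm only requires some fixed asymmetry between process ids).

```python
# Toy sketch of the sender side of Bagrodia's binary-rendezvous scheme.
# Assumption (for illustration): larger process id = higher priority.

def sender_steps(sender_id, dest_id):
    """Return the sequence of protocol steps the sender performs."""
    if sender_id > dest_id:
        # Sending to a lower-priority partner: send M, block until ack(M).
        return ["send M", "wait ack(M)"]
    else:
        # Sending to a higher-priority partner: ask permission first,
        # then send M once permission(M) arrives.
        return ["send request(M)", "wait permission(M)", "send M"]

print(sender_steps(2, 1))  # ['send M', 'wait ack(M)']
print(sender_steps(1, 2))  # ['send request(M)', 'wait permission(M)', 'send M']
```

The id-based asymmetry is exactly what breaks the potential crowns: a higher-priority process always ends up blocking on lower-priority ones, so no cyclic wait can form.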

 Bagrodia’s Algorithm

Bagrodia’s Algorithm- Examples


⚫ Figure shows two examples of how the algorithm breaks cyclic waits to schedule messages.
⚫ Observe that in all cases in the algorithm, a higher priority process blocks on lower priority
processes, irrespective of whether the higher priority process is the intended sender or the
receiver of the message being scheduled.
⚫ In Figure (a), at process Pk, the receive of the message from Pj effectively gets permuted
before Pk’s own send(M) event due to step 2(bi).
⚫ In Figure (b), at process Pj , the receive of the request(M) message from Pk effectively
causes M to be permuted before Pj’s own message that it was attempting to schedule with
Pi, due to step 2(bii).

2.1.4. Group Communication
⚫ Processes across a distributed system cooperate to solve a joint task.
⚫ They need to communicate with each other as a group.
⚫ So there is a need for support for group communication.
⚫ Message Unicast – sending a message to a particular destination process.
⚫ Message Broadcast – sending a message to all members in the distributed system.
⚫ Message Multicast – a message is sent to a certain subset of the processes in the
system, identified as a group.
⚫ Groups are dynamic:
⚫ They may be created and destroyed.
⚫ A process may join or leave a group.
⚫ A process may belong to multiple groups.
⚫ Groups allow processes to deal with collections of processes as one abstraction.
⚫ A process should only send a message to a group and need not know or care who its
members are.
⚫ Group communication can be implemented in several ways.
⚫ One to many
⚫ Many to one
⚫ Many to many

⚫ A spanning tree network protocol is used for both broadcast and multicast.
⚫ It is an efficient mechanism for distributing information.
⚫ Some features are not provided by hardware-assisted or network-protocol-assisted
multicast. Among them are:
⚫ Application-specific ordering semantics on the order of delivery of messages.
⚫ Adapting groups to dynamically changing membership.
⚫ Sending multicasts to an arbitrary set of processes at each send event.
⚫ Providing various fault-tolerance semantics.
⚫ One to Many
⚫ A message is sent by one sender to multiple receivers.
⚫ The one-to-many scheme is also known as multicast communication.
⚫ A special case of multicast communication is broadcast communication.
⚫ Group Management in One-to-Many
⚫ Two types:
⚫ Closed group
⚫ Open group
⚫ Closed group

⚫ Only the members of the group can send a message to the group.
⚫ An outside process cannot send a message to the group as a whole, although
it may send a message to an individual member of the group.
⚫ Open group
⚫ Any process in the system can send a message to the group as a whole.
⚫ Whether an open or closed group is used depends upon the application.
⚫ Group Addressing in One-to-Many
⚫ A two-level naming scheme is normally used for group addressing:
⚫ a high-level group name, and
⚫ a low-level group name.
⚫ The high-level group name is an ASCII string and is location independent.
⚫ The low-level group name depends upon the underlying hardware.
⚫ User applications use high-level group names in programs.
⚫ Buffered and Unbuffered Multicast in One-to-Many
⚫ Multicasting is an asynchronous communication mechanism,
⚫ because a multicast send cannot be synchronous for the following reasons:
⚫ It is unrealistic to expect a sending process to wait until all the receiving
processes that belong to the multicast group are ready to receive the multicast
message.
⚫ The sending process may not be aware of all the receiving processes that
belong to the multicast group.
⚫ Unbuffered Multicast
⚫ The message is not buffered for the receiving process.
⚫ It is lost if the receiving process is not in a state ready to receive it.
⚫ Therefore, the message is received only by those processes of the multicast group
that are ready to receive it.
⚫ Buffered Multicast
⚫ The message is buffered for the receiving process.
⚫ So, each process of the multicast group will eventually receive the message.
⚫ Semantics in One-to-Many
⚫ Two types of semantics:
⚫ Send-to-all semantics:
⚫ A copy of the message is sent to each process of the multicast group,
and the message is buffered until it is accepted by the process.
⚫ Bulletin-board semantics:
⚫ A message to be multicast is addressed to a channel instead of being
sent to every individual process of the multicast group.
⚫ Flexible reliability in multicast communication:
⚫ Different applications require different degrees of reliability.
⚫ In one-to-many communication, the degree of reliability is normally expressed in the
following forms:
⚫ 0-reliable:
⚫ No response is expected by the sender from any of the receivers.
⚫ 1-reliable:
⚫ The sender expects a response from any one of the receivers.
⚫ m-out-of-n reliable:
⚫ The multicast group consists of n receivers and the sender expects a
response from m (1 < m < n) of the receivers.

⚫ All-reliable:
⚫ The sender expects a response message from all the receivers in
the multicast group.
⚫ Atomic multicast:
⚫ Atomic multicast has an all-or-nothing property.
⚫ When a message is sent to a group by atomic multicast, it is either received by all the
correct processes that are members of the group, or else it is not received by any of
them.
⚫ Many-to-One Communication:
⚫ Multiple senders send messages to a single receiver.
⚫ The single receiver may be selective or non-selective.
⚫ Selective receiver:
⚫ Specifies a unique sender.
⚫ A message exchange takes place only if that sender sends a message.
⚫ Non-selective receiver:
⚫ Specifies a set of senders.
⚫ If any one sender in the set sends a message to this receiver, a message
exchange takes place.
⚫ The many-to-one scheme is non-deterministic.
⚫ Many-to-Many Communication:
⚫ Multiple senders send messages to multiple receivers.
⚫ Ordered message delivery ensures that all messages are delivered to all receivers in
an order acceptable to the application.
⚫ For example,
⚫ Suppose 2 senders send messages to update the same record of a database to
2 server processes, each having a replica of the database.
⚫ If the messages of the 2 senders are received by the 2 servers in different
orders, then the final value of the updated record of the database may be
different in its 2 replicas.
⚫ Message Ordering in Many-to-Many Communication:
⚫ R1 and R2 may receive m1 and m2 in different orders.
⚫ The figure shows message delivery with no ordering constraints.
⚫ Some message ordering is required:
⚫ Absolute ordering
⚫ Consistent/total ordering
⚫ Causal ordering
⚫ FIFO ordering

⚫ Absolute Ordering in Many-to-Many Communication:

⚫ Here all messages are delivered to all receiver processes in the exact order in
which they were sent.
⚫ Rule: mi must be delivered before mj if Ti < Tj.
⚫ Implementation:
⚫ A clock synchronized among the machines is required.
⚫ A sliding time window is used to commit the delivery of messages whose
timestamps fall in this window.
⚫ The window size is chosen taking into consideration the maximum
possible time that may be required by a message to go from one machine to
another machine in the network.
⚫ Example: distributed simulation.
⚫ Drawbacks:
⚫ Too strict a constraint.
⚫ No absolutely synchronized clocks exist.
⚫ There is no guarantee of catching all tardy messages.

⚫ Consistent/Total Ordering in Many-to-Many Communication:

⚫ It ensures that all messages are delivered to all receiver processes in the same order.
⚫ Rule: messages are received in the same order at all receivers, regardless of their
timestamps.
⚫ Implementation:
⚫ A message is sent to a sequencer, assigned a sequence number, and finally
multicast to the receivers.
⚫ Messages are delivered in increasing sequence-number order at each receiver.
⚫ Example: replicated database updates.
⚫ Drawback: it is a centralized algorithm.
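The sequencer-based implementation above can be sketched in a few lines. This is a minimal sketch with assumed names (`Sequencer`, `Receiver`); the point is that every receiver delivers in sequence-number order, so all replicas see the same total order even if the network reorders the multicasts.

```python
# Sketch of sequencer-based total ordering: every multicast is stamped by
# a central sequencer; receivers deliver strictly in sequence order.
import itertools

class Sequencer:
    def __init__(self):
        self.counter = itertools.count()

    def stamp(self, message):
        return (next(self.counter), message)  # (sequence number, payload)

class Receiver:
    def __init__(self):
        self.next_expected = 0
        self.pending = {}      # out-of-order messages, keyed by seq number
        self.delivered = []

    def on_multicast(self, seq, message):
        self.pending[seq] = message
        while self.next_expected in self.pending:
            self.delivered.append(self.pending.pop(self.next_expected))
            self.next_expected += 1

seq = Sequencer()
m1, m2 = seq.stamp("update A"), seq.stamp("update B")
r1, r2 = Receiver(), Receiver()
r1.on_multicast(*m1); r1.on_multicast(*m2)   # arrives in order
r2.on_multicast(*m2); r2.on_multicast(*m1)   # network reorders
print(r1.delivered == r2.delivered)          # True: same total order
```

The single sequencer is exactly the drawback noted above: it is a centralized component and a potential bottleneck.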

⚫ Causal Ordering in Many-to-Many Communication:

⚫ If 2 message-sending events are not causally related, the 2 messages may
be delivered to the receivers in any order.
⚫ Two message-sending events are said to be causally related if they are correlated by
the happened-before relation.
⚫ Rule: the happened-before relation:
⚫ If eki and eli are events of the same process and k < l, then eki → eli.
⚫ If ei = send(m) and ej = receive(m), then ei → ej.
⚫ If e → e′ and e′ → e′′, then e → e′′.
⚫ Implementation: use of vector timestamps on messages.
⚫ Example: distributed file system.
⚫ Drawbacks:
⚫ The vector is an overhead.
⚫ Broadcast is assumed.
2.1.5. Causal order (CO)
 a → b iff ta < tb (vector timestamps).
 Events a and b are causally related iff ta < tb or tb < ta; otherwise they are concurrent.
 Note that this is still not a total order.
Uses of Vector Clocks in CO of Messages
 If send(m1) → send(m2), then every recipient of both messages m1 and m2 must “deliver”
m1 before m2.
o “deliver” – when the message is actually given to the application for processing.
Birman–Schiper–Stephenson Protocol
 To broadcast m from process i, increment Ci[i] and timestamp m with VTm = Ci.
 When j ≠ i receives m, j delays delivery of m until both
– Cj[i] = VTm[i] – 1, and
– Cj[k] ≥ VTm[k] for all k ≠ i.
 Delayed messages are queued at j, sorted by vector time. Concurrent messages are sorted by
receive time.
 When m is delivered at j, Cj is updated according to the vector clock rule.
Problem with Vector Clocks:
 Message size increases, since each message needs to be tagged with the vector.
 The size can be reduced in some cases by only sending the values that have changed.
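The Birman–Schiper–Stephenson delivery condition above can be sketched directly. This is a minimal sketch (assumed representation: vector clocks as Python lists indexed by process id); it checks that m is the next broadcast from its sender i and that the receiver has already delivered everything m causally depends on.

```python
# Sketch of the BSS delivery test and clock update.

def can_deliver(Cj, VTm, i):
    """May the receiver (clock Cj) deliver a broadcast from i stamped VTm?"""
    if Cj[i] != VTm[i] - 1:          # m must be the next broadcast from i
        return False
    return all(Cj[k] >= VTm[k]       # receiver has all causally earlier msgs
               for k in range(len(Cj)) if k != i)

def deliver(Cj, VTm):
    """Vector clock rule applied on delivery: component-wise maximum."""
    return [max(a, b) for a, b in zip(Cj, VTm)]

# Process 2 has seen nothing: the first broadcast from process 0 is
# deliverable, but a second broadcast from 0 must wait for the first.
C2 = [0, 0, 0]
print(can_deliver(C2, [1, 0, 0], i=0))  # True
print(can_deliver(C2, [2, 0, 0], i=0))  # False: an earlier one is missing
print(deliver(C2, [1, 0, 0]))           # [1, 0, 0]
```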
2.1.6. Total Ordering
 A system of clocks that satisfies the Clock Condition can be used to totally order system
events.
 To totally order the events in a system, the events are ordered according to their times of
occurrence. In case two or more events occur at the same time, an arbitrary total ordering ≺
of processes is used. To do this, the relation ⇒ is defined as follows:
 If a is an event in process Pi and b is an event in process Pj, then a ⇒ b if and only if either:
o Ci⟨a⟩ < Cj⟨b⟩, or
o Ci⟨a⟩ = Cj⟨b⟩ and Pi ≺ Pj.
 There is a total ordering because, for any two events in the system, it is clear which
happened first.
 The total ordering of events is very useful for distributed system implementation.
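The relation ⇒ above amounts to comparing (timestamp, process id) pairs lexicographically, which a short sketch makes concrete (the function name is mine):

```python
# Sketch of Lamport's total order ⇒: compare scalar timestamps, and break
# ties with an arbitrary fixed ordering of process ids.

def totally_ordered_before(a, b):
    """a and b are (timestamp, process_id) pairs; True iff a ⇒ b."""
    Ca, Pa = a
    Cb, Pb = b
    return Ca < Cb or (Ca == Cb and Pa < Pb)

events = [(3, 2), (1, 1), (3, 1), (2, 0)]
events.sort()                  # lexicographic sort realizes ⇒
print(events)                  # [(1, 1), (2, 0), (3, 1), (3, 2)]
```

Because the tie-break is total, any two distinct events are comparable, which is exactly why ⇒ is a total order even though the underlying happened-before relation is only partial.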

2.2. Global State and Snapshot Algorithm
2.2.1. Introduction
Recording the global state of a distributed system is useful in many applications, including:
 Distributed Garbage Collection
 Deadlock Detection
 Termination Detection
 Debugging

Distributed garbage collection:


An object is considered to be garbage if there are no longer any references to it anywhere in
the distributed system. The memory taken up by that object can be reclaimed once it is known to be
garbage. To check that an object is garbage, we must verify that there are no references to it
anywhere in the system. Process p1 has two objects that both have references – one has a reference
within p1 itself, and p2 has a reference to the other. Process p2 has one garbage object, with no
references to it anywhere in the system. It also has an object for which neither p1 nor p2 has a
reference, but there is a reference to it in a message that is in transit between the processes.
Distributed deadlock detection:
A distributed deadlock occurs when each of a collection of processes waits for another
process to send it a message, and where there is a cycle in the graph of this ‘waits-for’ relationship.
Processes p1 and p2 are each waiting for a message from the other, so this system will never make
progress.
Distributed termination detection:
The problem here is how to detect that a distributed algorithm has terminated. Detecting
termination is a problem that sounds deceptively easy to solve: it seems at first only necessary to
test whether each process has halted. Consider two processes p1 and p2, each of which may request
values from the other. Instantaneously, we may find that a process is either active or passive – a
passive process is not engaged in any activity of its own but is prepared to respond with a value
requested by the other. Suppose we discover that p1 is passive and that p2 is passive: this alone does
not establish termination, since a request message may still be in transit between them.
The phenomena of termination and deadlock are similar in some ways, but they are different
problems. First, a deadlock may affect only a subset of the processes in a system, whereas all
processes must have halted for the system to have terminated. Second, process passivity is not the
same as waiting in a deadlock cycle: a deadlocked process is attempting to perform a further action,
for which another process waits; a passive process is not engaged in any activity.
Distributed debugging:
Distributed systems are complex to debug and care needs to be taken in establishing what
occurred during the execution.
2.2.2 System model and definitions
Global states and consistent cuts:

The state of the collection of processes is much harder to address. The essential problem is
the absence of global time. If all processes had perfectly synchronized clocks, then we could agree
on a time at which each process would record its state – the result would be an actual global state of
the system.
The global history of the system is the union of the individual process histories:
H = h0 ∪ h1 ∪ … ∪ hN–1
A cut of the system’s execution is a subset of its global history that is a union of prefixes of
process histories:
C = h1^c1 ∪ h2^c2 ∪ … ∪ hN^cN
where hi^ci denotes the prefix of process i’s history containing its first ci events.
A consistent cut cannot violate temporal causality by implying that a result occurred before
its cause, as in message m1 being received before the cut but sent after the cut.
A consistent global state is one that corresponds to a consistent cut. We may characterize the
execution of a distributed system as a series of transitions between global states of the system:
S0 → S1 → S2 → …
A linearization or consistent run is an ordering of the events in a global history that is
consistent with the happened-before relation → on H. Note that a linearization is also a run.
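The consistency condition on cuts (no receive inside the cut whose send is outside) can be checked mechanically. Below is a small sketch with an assumed representation: a cut is given by how many events of each process history it keeps, and each message is a (send_event, receive_event) pair, where an event is (process_id, index_in_history).

```python
# Sketch: a cut is consistent iff no message is received inside the cut
# but sent outside it.

def is_consistent(cut, messages):
    """cut[p] = number of events of process p inside the cut."""
    def inside(event):
        p, idx = event
        return idx < cut[p]
    return all(inside(s) or not inside(r) for s, r in messages)

# m1 is sent as event 0 of p0 and received as event 0 of p1.
m1 = ((0, 0), (1, 0))
print(is_consistent({0: 0, 1: 1}, [m1]))  # False: receive without its send
print(is_consistent({0: 1, 1: 1}, [m1]))  # True
```

A message whose send is inside the cut but whose receive is outside it is simply in transit across the cut, which is allowed; that is why only the receive-without-send case is rejected.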
Global state predicates, stability, safety and liveness:
Detecting a condition such as deadlock or termination amounts to evaluating a global state
predicate. A global state predicate is a function that maps from the set of global states of the
processes in the system to {True, False}.
A predicate is stable if, once the system enters a state in which the predicate is True, it
remains True in all future states reachable from that state. By contrast, when we monitor or debug
an application we are often interested in non-stable predicates, such as that in our example of
variables whose difference is supposed to be bounded. Two further notions are relevant to global
state predicates: safety and liveness.
Safety with respect to α is the assertion that α evaluates to False for all states S reachable
from S0. Liveness with respect to β is the property that, for any linearization L starting in the state
S0, β evaluates to True for some state SL reachable from S0.
2.2.3. Snapshot algorithms for FIFO channels
The ‘snapshot’ algorithm of Chandy and Lamport:
Chandy and Lamport describe a ‘snapshot’ algorithm for determining global states of
distributed systems. The goal of the algorithm is to record a set of process and channel states (a
‘snapshot’) for a set of processes pi ( i = 1, 2, .., N ) such that, even though the combination of
recorded states may never have occurred at the same time, the recorded global state is consistent.
The algorithm records state locally at processes; it does not give a method for gathering the
global state at one site.
The algorithm assumes that:
 Neither channels nor processes fail – communication is reliable so that every message sent is
eventually received intact, exactly once.
 Channels are unidirectional and provide FIFO-ordered message delivery.
 The graph of processes and channels is strongly connected (there is a path between any two
processes).
 Any process may initiate a global snapshot at any time.
 The processes may continue their execution and send and receive normal messages while
the snapshot takes place.
The algorithm proceeds through the use of special marker messages, which are distinct from
any other messages the processes send, and which the processes may send and receive while they
proceed with their normal execution.
The algorithm is defined through two rules, the marker receiving rule and the marker sending
rule.
The marker sending rule obligates processes to send a marker after they have recorded their
state, but before they send any other messages.
The marker receiving rule obligates a process that has not recorded its state to do so. In that
case, this is the first marker that it has received. It notes which messages subsequently arrive on the
other incoming channels.
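The two rules can be condensed into a per-process sketch. This is a minimal sketch with assumed names (`SnapshotProcess`, `MARKER`); normal application handling of non-marker messages is elided, and only the recording logic is shown.

```python
# Condensed sketch of the Chandy–Lamport rules at one process.
MARKER = object()

class SnapshotProcess:
    def __init__(self, incoming, state):
        self.incoming = incoming        # names of incoming channels
        self.state = state              # current local state
        self.recorded_state = None
        self.channel_state = {}         # channel -> recorded messages
        self.recording = set()          # channels still being recorded
        self.outbox = []                # (channel, msg) pairs to send

    def record_state(self, out_channels):
        # Marker sending rule: record local state, then send a marker on
        # every outgoing channel before any other message.
        self.recorded_state = self.state
        self.recording = set(self.incoming)
        for c in out_channels:
            self.outbox.append((c, MARKER))

    def on_receive(self, channel, msg, out_channels):
        if msg is MARKER:
            if self.recorded_state is None:
                # Marker receiving rule, first marker: record state now;
                # the channel the marker arrived on is recorded as empty.
                self.record_state(out_channels)
                self.channel_state[channel] = []
            self.recording.discard(channel)   # done recording this channel
        elif channel in self.recording:
            # Message in transit across the cut: part of channel state.
            self.channel_state.setdefault(channel, []).append(msg)

# Mirroring the widget example below: p1 initiates, then receives
# "five widgets" on c1 before p2's marker arrives on c1.
p = SnapshotProcess(incoming=["c1"], state="<$1000, 0>")
p.record_state(out_channels=["c2"])
p.on_receive("c1", "five widgets", ["c2"])
p.on_receive("c1", MARKER, ["c2"])
print(p.channel_state)   # {'c1': ['five widgets']}
```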
The algorithm for a system of two processes, p1 and p2 , connected by two unidirectional
channels, c1 and c2 . The two processes trade in ‘widgets’. Process p1 sends orders for widgets over
c2 to p2. Sometime later, process p2 sends widgets along channel c1 to p1.

Two processes and their initial states:


Process p2 has already received an order for five widgets, which it will shortly dispatch to p1.

Process p1 records its state in the actual global state S0, when the state of p1 is <$1000, 0>.
Following the marker sending rule, process p1 then emits a marker message over its outgoing
channel c2 before it sends the next application-level message: (Order 10, $100), over channel c2 .

The system enters actual global state S1. Before p2 receives the marker, it emits an
application message (five widgets) over c1 in response to p1’s previous order, yielding a new actual
global state S2. Now process p1 receives p2’s message (five widgets), and p2 receives the marker.
Following the marker receiving rule, p2 records its state as <$50, 1995> and that of channel c2 as
the empty sequence. Following the marker sending rule, it sends a marker message over c1. When
process p1 receives p2’s marker message, it records the state of channel c1 as the single message
(five widgets) that it received after it first recorded its state. The final actual global state is S3. The
final recorded state is p1 : <$1000, 0>; p2 : <$50, 1995>; c1 : <(five widgets)>; c2 : < >. Note that
this state differs from all the global states through which the system actually passed.
Termination of the snapshot algorithm:
A process that has received a marker message records its state within a finite time and sends
marker messages over each outgoing channel within a finite time. If there is a path of
communication channels and processes from a process pi to a process pj (j ≠ i), then it is clear on
these assumptions that pj will record its state a finite time after pi recorded its state.

Characterizing the observed state:


The snapshot algorithm selects a cut from the history of the execution. The cut, and
therefore the state recorded by this algorithm, is consistent.
Consider a sequence of H messages m1, m2, …, mH (H ≥ 1) giving rise to the relation
ei → ej. By FIFO ordering over the channels that these messages traverse, and by the marker
sending and receiving rules, a marker message would have reached pj ahead of each of m1, m2, …,
mH.
Stability and the reachability of the observed state:
The reachability property of the snapshot algorithm is useful for detecting stable predicates.
In general, any non-stable predicate we establish as being true in the state Ssnap may or may
not have been true in the actual execution whose global state we recorded. However, if a stable
predicate is True in the state Ssnap, then we may conclude that the predicate is True in the state
Sfinal, since by definition a stable predicate that is True of a state S is also True of any state
reachable from S.
