contributed articles

DOI:10.1145/3513259

Deconstructing the Bakery to Build a Distributed State Machine

A rigorous journey from the bakery algorithm to a distributed state machine.

BY LESLIE LAMPORT

In this article, the reader and I will journey between two concurrent algorithms of the 1970s that are still studied today. The journey begins at the bakery algorithm⁹ and ends at an algorithm for implementing a distributed state machine.¹² I hope we enjoy the voyage and perhaps even learn something.

The bakery algorithm ensures processes execute a critical section of code one at a time. A process trying to execute that code chooses a number it believes to be higher than the numbers chosen by other such processes. The process with the lowest number goes first, with ties broken by process name. In the distributed state-machine algorithm, each process maintains a logical clock, with the clocks being synchronized by having a process include its clock value in the messages it sends. Commands to the state machine are ordered according to the value of a process’s clock when it issues a command, with ties broken by process name.

The similarity between the bakery algorithm’s numbers and the state-machine algorithm’s clocks has been noticed, but I know of no previous rigorous connection between them. Our trip makes this connection, going from the bakery algorithm to the state-machine algorithm through a sequence of algorithms, each (except the first) derived from the preceding one.

The first algorithm on the journey is a straightforward generalization of the bakery algorithm, mainly by allowing a process to read other processes’ numbers in an arbitrary order. We then deconstruct this algorithm by having each process maintain multiple copies of its number, one for each other process. Next is a distributed version of the deconstructed algorithm obtained by having each copy of a process i’s number kept by the process that reads it, where i writes the value stored at another process by sending a message to that process. We then modify this distributed algorithm to ensure that numbers increase with each execution of the critical section. Finally, we arrive at the distributed state-machine algorithm by forgetting about critical sections and just using the numbers as logical clocks.

Not only do our algorithms date from the 1970s, but the path between them is one that could have been followed at that time. The large amount of related work done since then has neither influenced nor obviated any part of the route. At the end of our journey, a concluding section discusses that related work and why the algorithms that begin and end our path are still studied today. The correctness proofs in our journey are informal, much as they would have been in the 1970s. More modern, rigorous proofs are discussed in the concluding section.

58 COMMUNICATIONS OF THE ACM | SEPTEMBER 2022 | VOL. 65 | NO. 9



The Original Bakery Algorithm

The bakery algorithm solves the mutual-exclusion problem introduced and solved by Edsger Dijkstra.³ The problem assumes a set of processes that alternate between executing a noncritical and a critical section of code. A process must eventually exit the critical section, but it may stay forever in the noncritical section. The basic requirement is that, at most, one process can be executing the critical section at any time. A solution to the mutual-exclusion problem lies at the heart of almost all multiprocess programming.

The bakery algorithm assumes processes are named by numbers from 1 through N. Figure 1 contains the code for process number i, almost exactly as it appeared in the original paper. The values of the variables number and choosing are arrays indexed by process number, with number[i] and choosing[i] initially equal to 0 for every process i. The relation << is lexicographical ordering on pairs of numbers, so (1, 3) << (2, 2) << (2, 4); it is an irreflexive total ordering on the set of all pairs of integers.

Figure 1. Process i of the original bakery algorithm.

Mutual exclusion can be achieved very simply by not allowing any process to ever enter the critical section. A mutual-exclusion algorithm needs to also satisfy some progress condition. The condition Dijkstra’s algorithm satisfies is deadlock freedom, meaning that if one or more processes try to enter the critical section, one of them must succeed. Most later algorithms satisfy the stronger requirement of starvation freedom, meaning that every process that tries to enter the critical section eventually does so. Before discussing mutual exclusion, we show that the bakery algorithm is starvation free. But first, some terminology.

We say that a process is in the doorway when it is executing statement M. After it finishes executing M until it exits its critical section, we say that it is inside the bakery. When it is at any other place in its code, we say that it is outside the bakery.

We first show that the algorithm is deadlock free. If it weren’t, it would eventually reach a state in which every process is either forever in its noncritical section or forever inside the bakery. Eventually, choosing[i] would equal 0 for all i, so every process inside the bakery would be waiting forever at statement L3. But this is impossible because the waiting process i with the smallest value of (number[i], i) would eventually enter the critical section. Hence, the algorithm is deadlock free.

To show that the algorithm is starvation free, it suffices to obtain a contradiction by assuming that a process i remains forever inside the bakery and outside the critical section. By deadlock freedom, other processes must continually enter and leave the critical section, since they cannot halt there. However, once a process j is outside the bakery, to enter the bakery again it must execute statement M and set number[j] to be greater than number[i]. At that point, process j must remain forever inside the bakery because it will loop forever if it reaches L3 with k = i. Eventually, number[i] will be less than number[j] for every process j in the bakery, so i will enter its critical section. This is the contradiction that proves starvation freedom.

Essentially, the same proof shows that the other mutual-exclusion algorithms we derive from the bakery algorithm also satisfy starvation freedom. So, we will say little more about starvation freedom. We now explain why the bakery algorithm satisfies mutual exclusion. For brevity, we abbreviate (number[i], i) << (number[j], j) as i << j.

Here is a naive proof that i and j cannot both be in their critical sections at the same time. For i to enter the critical section, it must find number[j] = 0 or i << j when executing L3 for k = j. Similarly, for j to enter the critical section, it must find number[i] = 0 or j << i when executing L3 for k = i. Since a process’s number is non-zero when it executes L3, this means that for i and j both to be in their critical sections, i << j and j << i must be true, which is impossible.

This argument is flawed because it assumes that both i and j were inside the bakery when the other process executed L3 for the appropriate value of k. Suppose process i read number[j] while j was in the doorway (executing M) but had not yet set number[j]. It is possible for j to have read number[i] = 0 in L3 and entered the critical section, and for i then to have chosen number[i] to make i << j and entered the critical section.

The flaw in the argument is corrected by statement L2. Since choosing[j] equals 1 when j is in the doorway, process i executed L3 after L2 found that j was not in the doorway; similarly, j executed L3 after finding i not in the doorway. If, in both cases, the two processes were inside the bakery when L2 was executed, then the naive argument is correct. If one of them, say j, was not inside the bakery, it must have been outside the bakery. Since i was then inside the bakery, with its current value of number[i], process j must have chosen number[j] to be greater than the current value of number[i], making i << j true. Hence, j could not have exited the L3 loop for k = i and entered the critical section while i was still in the bakery. Therefore, i and j cannot both be in the critical section.

Observe that the choosing variable serves only to ensure that, when process i executes L3 for k = j, there had been an instant when i was already inside the bakery and j was not in the doorway. This will be important later.

The most surprising property of the bakery algorithm is that it does not require reading or writing a memory register to be an atomic action. Carefully examining the proof of mutual exclusion shows that it just requires that number[i] and choosing[i] are what were later called safe registers,¹³ ensuring only that a read not overlapping a write obtains the current register value. A read that does overlap a write can obtain any value the register might contain.

It is most convenient to describe a safe register in terms of atomic actions. We represent writing a value v to the register as two actions: the first sets its value to a special constant ¿ and the second sets it to v. We represent a read as a single atomic action that obtains the value of the register if that value does not equal ¿. A read of number[i] when it equals ¿ can return any natural number, and a read of choosing[i] when it equals ¿ can return 0 or 1.

Generalization of the Original Algorithm

Two generalizations of the bakery algorithm were obvious when it was published. The first is that, in statement M, it is not necessary to set number[i] to 1 + maximum(. . .). It could be set to any number greater than that maximum. It can also be set to the maximum if that makes (number[j], j) << (number[i], i) for all j, but we will not bother with that generalization. We rewrite statement M using :> to mean “is assigned a value greater than.”

The second obvious generalization is that statements L2 and L3 for different values of k do not have to be executed in the order specified by the for statement. Since the proof of mutual exclusion considers each pair of processes by themselves, the only requirement is that, for any value of k, statement L2 must be executed before L3. For different values of k, those statements can be executed concurrently by different subprocesses. Also, there is no reason to execute them for k = i because their if tests always equal false.

These two generalizations have appeared elsewhere.⁵,¹⁰ There is another, less obvious generalization that seems to be new: The assignment of 0 to number[i] after the process leaves the critical section does not need to be completed before the process enters the noncritical section. In fact, that assignment need not even be completed if the process leaves the noncritical section to enter its critical section again. As long as that assignment is completed or aborted (leaving the register equal to ¿) before number[i] is assigned a new value in statement M, it just appears to other processes as if process i is still in the critical section or is executing the assignment statement immediately after the critical section. Therefore, mutual exclusion is still satisfied. To maintain starvation freedom, the write of 0 must eventually be completed if i remains forever in the noncritical section. There seems to be no simple way to describe in pseudo-code these requirements for setting number[i] to 0 upon completing the critical section. We simply add the mysterious keyword asynchronously and refer to this discussion for its explanation.

The generalized algorithm is in Figure 2. Processes are explicitly declared; the outer process statement indicates that there are processes numbered from 1 through N and shows the code for process number i. Variables are declared with their initial values. The inner process statement declares that process i has N - 1 subprocesses j with numbers from 1 through N, with none numbered i, and gives the code for subprocess j. That statement is executed by forking the subprocesses and continuing to the next statement (the critical section) when all subprocesses have terminated. Harmful or not, gotos have been eliminated. The outer loop is described as a while statement. The loops at L2 and L3 have been described with await statements, each of which repeatedly evaluates its predicate and terminates when it is true. The :> in statement M and the asynchronously statement are explained above.

Figure 2. A generalization of the original bakery algorithm.

The Deconstructed Bakery Algorithm

We have assumed that number[i] and choosing[i] are safe registers, written only by i and read by multiple readers. Such a register is easily implemented with safe registers having a single reader by keeping a copy of the register’s value in a separate register for each reader.

We deconstruct the generalized bakery algorithm by implementing the safe registers choosing[i] and number[i] with single-reader registers localCh[j][i] and localNum[j][i], for each j ≠ i. Note the counterintuitive subscript order, with localCh[j][i] and localNum[j][i] containing the copies of choosing[i] and number[i] read by process j.

The pseudo-code of the deconstructed algorithm is in Figure 3. The reads of choosing[j] and number[j] by process i in the generalized algorithm are replaced by reads of localCh[i][j] and localNum[i][j]. The variable number[i] is now read only by process i, and we have eliminated choosing[i] because process i never reads it. Ad hoc notation is used in statement M to indicate that number[i] is set to be greater than the values of all localNum[i][j].

Figure 3. The deconstructed bakery algorithm.

We have explicitly indicated the two atomic actions that represent writing a value v to the safe register localNum[j][i], first setting its value to ¿ and then to v. We have not bothered to do that for the writes to localCh[j][i]. The localCh[j][i] and localNum[j][i] writes are performed by subprocesses of process i, except that the N - 1 separate writes of ¿ to all the registers localNum[j][i] are represented by an assignment statement

localNum[*][i] := ¿

of the main process i. (This will be more convenient for our next version of the bakery algorithm.) To set number[i] to 0 after i exits the critical section, all the registers localNum[j][i] are set to ¿ by the main process, and each is set to 0 by a separate process. We require that the setting of localNum[j][i] to 0 has been either completed or aborted when localNum[j][i] is set to number[i] by subprocess (i, j). Again, this is not made explicit in the pseudo-code.

A proof of correctness for the deconstructed algorithm can be obtained by simple modifications to the proof for the original algorithm. For the original algorithm, we defined process i to be in the doorway while executing statement M, which ended with assigning the value of number[i]. Since number[i] has been replaced by the registers localNum[j][i], process i now has a separate doorway for each other process j. We say that i is in the doorway with respect to j from when it begins executing statement M until its subprocess j assigns number[i] to localNum[j][i]. We say that i is inside the bakery with respect to j from when it leaves the doorway with respect to j until it exits the critical section. The definition of i outside the bakery is the same as before.

To transform the proof of correctness of the original bakery algorithm to a proof of correctness of the deconstructed algorithm, we replace every statement that i or j is in the doorway or inside the bakery with the statement that it is there with respect to the other process. The modified proof shows that the function of statement L2 is to ensure that, at some time between i coming inside the bakery with respect to j and executing L3 for j, process j was not in the doorway with respect to i.
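
To make the doorway, L2 and L3 steps concrete, here is a minimal Python sketch of the original bakery algorithm as the text describes it. It is an illustration under stated assumptions, not the code of Figure 1: processes are numbered 0 through N - 1 rather than 1 through N, the names `precedes`, `process`, `run`, `N` and `ROUNDS` are invented for the example, and CPython list elements behave like atomic registers, so the sketch does not exercise the safe-register property. The counters `in_cs` and `max_in_cs` exist only to check mutual exclusion.

```python
import threading
import time

N = 3         # number of processes (assumption: kept small for a quick run)
ROUNDS = 20   # critical-section executions per process (assumption)

number = [0] * N     # number[i] == 0 means process i is outside the bakery
choosing = [0] * N   # choosing[i] == 1 while process i is in the doorway

in_cs = 0        # processes currently in the critical section
max_in_cs = 0    # largest value in_cs ever reached; should stay at 1
entries = 0      # total critical-section executions


def precedes(a, b):
    """The relation a << b: lexicographic ordering on pairs."""
    return a < b   # Python compares tuples lexicographically


def process(i):
    global in_cs, max_in_cs, entries
    for _ in range(ROUNDS):
        # Doorway (statement M): choose a number above every number read.
        choosing[i] = 1
        number[i] = 1 + max(number)
        choosing[i] = 0
        for k in range(N):
            if k == i:
                continue
            # L2: wait until k is not in the doorway.
            while choosing[k] != 0:
                time.sleep(0)   # yield to other threads
            # L3: wait until k is outside the bakery or i goes before k.
            while number[k] != 0 and precedes((number[k], k), (number[i], i)):
                time.sleep(0)
        # Critical section: these updates are protected only by the algorithm.
        in_cs += 1
        max_in_cs = max(max_in_cs, in_cs)
        entries += 1
        in_cs -= 1
        number[i] = 0   # leave the bakery


def run():
    """Run all processes to completion; return (entries, max_in_cs)."""
    threads = [threading.Thread(target=process, args=(i,)) for i in range(N)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return entries, max_in_cs
```

Calling `run()` once returns the total number of critical-section executions and the largest number of processes ever observed in the critical section together; if mutual exclusion holds, the second value is 1.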


The Distributed Bakery Algorithm

We now implement the deconstructed bakery algorithm with a distributed algorithm. Each main process i is executed at a separate node, which we call node i, in a network of processes that communicate by message passing. The variable localNum[j][i], which is process j’s copy of number[i], is kept at node j. It is set by process i to the value v by sending the message v to j. The setting of localNum[j][i] to ¿ in the deconstructed bakery algorithm is implemented by the action of sending that message, and localNum[j][i] is set to v by process j when it receives the message. Thus, we are implementing the deconstructed algorithm by having process j obtain a previous value of localNum[j][i] on a read when localNum[j][i] equals ¿. Since the deconstructed algorithm allows such a read to obtain any value, this is a correct implementation.

For now, we assume that process i can write the value of localCh[j][i] atomically by a magical action at a distance. We will remove this magic later. We assume that messages sent from a process i to any other process j are received in the order that they are sent. We represent the messages in transit from i to j by a first-in, first-out (FIFO) queue q[i][j]. We let ∅ be the empty queue, and we define the following operations on a queue Q:
• Append(Q, val) appends the element val to the end of Q.
• Head(Q) is the value at the beginning of Q.
• Behead(Q) removes the element at the beginning of Q.
• Head(Q) and Behead(Q) are undefined if Q equals ∅.

The complete algorithm is in Figure 4. The shading highlights uses of localCh, whose magical properties need to be dealt with. Along with the main process i, there are concurrently executed processes (i, j) at node i, for each j ≠ i. Process (i, j) receives and acts upon the messages sent to i by j.

Figure 4. The Distributed Bakery Algorithm, with magic.

The main process i of the distributed algorithm is obtained directly from the deconstructed algorithm by replacing the assignments of ¿ to each localNum[j][i] with the sending of a message to j, except for two changes. The first is that statement M and the following sending of messages to other processes (represented by appending number[i] to all the message queues q[i][j]) have been made a single atomic action. We can do this because we can view the end of each message queue q[i][j], onto which messages are appended, to be part of process i’s local state. A folk theorem⁴ says that, for reasoning about a multiprocess algorithm, we can combine any number of actions that access only a process’s local state into a single atomic action. That folk theorem has been formalized in a number of results starting with one by Lipton,¹⁵ and perhaps the most directly applicable being Lamport.¹⁴ In our algorithm, making this action appear atomic just requires preventing other processes at node i from acting on any incoming messages while the action is being executed.

The other significant change to the deconstructed algorithm is that the asynchronously statement has disappeared. The setting of localNum[j][i] is performed by the receipt of messages sent by i to j. FIFO message delivery ensures that it is set to 0 before its subsequent setting to a non-zero value. Also, since localNum[j][i] is now set by process (j, i) upon receipt of the message, the assignment to it in subprocess j of i has been removed.

Correctness of the deconstructed algorithm also depends on the assignment to localNum[j][i] being performed before process i sets localCh[j][i] to 0. Since the assignment to localNum[j][i] is now performed at node j, the ordering of those two operations is no longer trivially implied by the code. To maintain that ordering, subprocess j of i must learn that process (j, i) has set localNum[j][i] to number[i] before it can set localCh[j][i] to 0. This is done by having (j, i) send a message to i with some value ack that is not a natural number. Process (j, i) sets the value of localNum[j][i] and sends the ack message to i as a single atomic action. When process (i, j) at node i receives the ack message, it sets ackRcvd[i][j] to 1 to notify subprocess j of process i that the ack has arrived. The setting of localNum[j][i] to number[i] in the deconstructed algorithm is replaced by statement L0 that waits for ackRcvd[i][j] to equal 1.

The rest of the code for the main process i is the same as that of the corresponding process of the deconstructed algorithm, except that after i leaves the critical section, the asynchronous setting of all the registers localNum[j][i] to 0 is replaced by sending the message 0 to all the processes j, and ackRcvd[i][j] is reset to 0 for all j.

The asynchronously executed process (i, j) receives messages sent by j via q[j][i]. For an ack message, it sets ackRcvd[i][j] to 1; for a message with a value of number[j] it sets localNum[i][j] and, if the value is non-zero, sends an ack to j.

The one remaining problem is the magical atomic reading and writing of the register localCh[i][j]. The value of that register is used only in statement L2. The purpose of L2 is to ensure that, before the execution of L3, there existed a time T when i was in the bakery with respect to j and j was not in the doorway with respect to i. We now show that statement L2 is unnecessary, because executing L0 ensures the existence of such a time T.

The execution of statement M by j and the sending of number[j] in a message to i are part of a single atomic action, and j enters the bakery with respect to i when that message is received at node i. Therefore, j is in the doorway with respect to i exactly when there is a message with a non-zero integer in q[j][i]. Let’s call that message a doorway message. Process i enters the bakery with respect to j when its message containing number[i] is received at node j, an action that appends to q[j][i] the ack that L0 is waiting to arrive. If there is no doorway message in q[j][i] at that time, then immediately after execution of that action is the time T whose existence we need to show, since it occurred before the receipt of the ack that L0 was waiting for. If there is a doorway message in q[j][i], then the required time T is right after that message was received at node i. Because of FIFO message delivery, that time was also before the receipt of the ack that L0 is waiting for. In both cases, executing L0 ensures there was some time T after i entered inside the bakery with respect to j when j was not in the doorway with respect to i. Hence, statement L2 is redundant.

Because L2 is the only place where the value of localCh[i][j] is read, we can eliminate localCh and all statements that set it. Removing all the grayed statements in Figure 4 gives us the distributed bakery algorithm, with no magic.

The first paper devoted to distributed mutual exclusion was apparently that of Ricart and Agrawala.¹⁹ Their algorithm can be viewed as an optimization and simplification of our algorithm. It delays the sending of ack messages in such a way that a process can enter its critical section when it receives an ack from every other process, so it does not have to keep track of other processes’ numbers. The number 0 messages sent upon exiting the critical section can therefore be eliminated, yielding an algorithm with fewer messages. Although nicer than our algorithm, the Ricart-Agrawala algorithm is not directly on the path we are traveling.

A Distributed State Machine

In a distributed state machine,¹² there is a set of processes at separate nodes in a network, each wanting to execute state-machine commands. The processes must agree on the order in which all the commands are executed. To execute a command, a process must know the entire sequence of preceding commands.

A distributed mutual-exclusion algorithm can be used to implement a distributed state machine by having a process execute a single command in the critical section. The order in which processes enter the critical section determines the ordering of the commands. It is easy to devise a protocol that has a process in its critical section send its current command to all other processes, which order it after all preceding commands. Starting with this idea and the distributed bakery algorithm, we will obtain the distributed state-machine algorithm¹² by eliminating the critical section.

The bakery algorithm is based on the idea that if two processes are trying to enter the critical section at about the same time, then the process i with the smaller value of (number[i], i) enters first. We now make that true no matter when the two processes enter the critical section. Define a version of the bakery algorithm to be number-ordered if it satisfies this condition: If process i enters the critical section with number[i] = n_i and process j later enters the critical section with number[j] = n_j, then (n_i, i) << (n_j, j). We now make the distributed bakery number-ordered. We can do that because we have generalized the bakery algorithm to set number[i] to any number greater than the maximum of the values of number[j] it reads, not just to the next-largest number.

We add to the distributed bakery algorithm a variable maxNum, where maxNum[i][j] is the largest value localNum[i][j] has equaled, for j ≠ i. We let maxNum[i][i] be the largest value number[i] has equaled. We then make two changes to the algorithm. First, we replace statement M with the statement in Figure 5.

Second, in process (i, j), if localNum[i][j] is assigned a non-zero value, then maxNum[i][j] is assigned that same value. The FIFO ordering of messages assures the new value of maxNum[i][j] will be greater than its previous value. Clearly, localNum[i][j] always equals maxNum[i][j] or 0. The value of number[i] chosen this way is therefore allowed by statement M of the distributed algorithm, so this is a correct implementation of that algorithm. We now show that it is number-ordered.

Suppose i enters the critical section with number[i] = n_i and j later enters the critical section with number[j] = n_j. It’s evident that (n_i, i) << (n_j, j) if i = j, so we can assume i ≠ j. The proof of mutual exclusion for the deconstructed algorithm shows that either (i) (n_i, i) << (n_j, j) or (ii) j chose n_j after reading a value of localNum[j][i] written after i set it to n_i. In our modified version of the distributed algorithm, j reads maxNum[j][i], not localNum[j][i], to set number[j], and maxNum[j][i] never decreases. Therefore, (n_i, i) << (n_j, j) is true also in case (ii), so the algorithm is number-ordered.


Figure 5. A new version of statement M.

Since the algorithm is number-ordered, we don’t need the critical section to implement a distributed state machine. We can order the commands by the value that (number[i], i) would have had when i entered the critical section to execute the command. Process i can send the command it is executing in the messages containing the value of number[i] that it sends to other processes. In fact, we don’t need number[i] at all. When we send that message, number[i] has the same value as maxNum[i][i]. We can eliminate everything in the main process i except the atomic statement containing statement M, which can now be written as in Figure 6, where Cmd is process i’s current command.

Figure 6. A newer version of statement M.

There is one remaining problem. Process i saves the messages containing commands that it sends and receives, accumulating a set of triples (v, j, Cmd) indicating that process j issued a command Cmd with number[j] having the value v. It knows that those commands are ordered by (v, j). However, to execute the command in (v, j, Cmd), it has to know that it has received all commands (w, k, Dmd) with (w, k) << (v, j). Process i knows that, for each process k, it has received all commands (w, k, Dmd) with w ≤ maxNum[i][k]. However, suppose i has received no commands from k. How can i be sure that k hasn’t sent a command in a message that i hasn’t yet received? The answer is to use the distributed bakery algorithm’s ack messages. Here’s how.

For convenience, we let process i keep maxNum[i][i] always equal to the maximum of the values maxNum[i][j] (including j = i). It does this by increasing maxNum[i][i], if necessary,

maxNum[i][j] to v (possibly increasing maxNum[i][i]) and sends back to j the message (maxNum[i][i], ack). When that message is received, j sets maxNum[j][i] accordingly (increasing maxNum[j][j] if necessary). When i has received all the ack messages for a command it issued with maxNum[i][i] equal to v, all its values of maxNum[i][j] will be ≥ v, so process i knows it has received all commands ordered before its current command. It can therefore execute all of them, in the appropriate order, and then execute its current command.

This is almost identical to the distributed state-machine algorithm,¹² where maxNum[i][i] is called process i’s clock. (The sketch of the algorithm given there is not detailed enough to mention the other registers maxNum[i][j].) The one difference is that, when process i receives a message from j with a new value v of maxNum[i][j], the algorithm requires maxNum[i][i] to be set to a value > v, whereas ≥ v suffices. The algorithm remains correct if the value of maxNum[i][i] increases by any amount at any time. Thus, the registers maxNum[i][i] could be logical clocks that are also used for other purposes.

We have described all the pieces of a distributed state-machine algorithm but have not put them together into pseudo-code. “The precise algorithm is straightforward, and we will not bother to describe it.”¹²

Ancient and Recent History

In addition to being the author of this article, I am the author of the starting

write operations to shared memory.²² A number of them improve the bakery algorithm, the most significant improvement being a bound on the chosen numbers.⁶,²¹ But all improvements seem to add impediments to our path, except for one: Moses and Patin¹⁷ optimized the bakery algorithm by allowing process i to stop waiting for process j at statement L3 if it reads two different values of number[j]. However, it is irrelevant to our path because it optimizes a case that cannot occur in the distributed bakery algorithm.

Mutual-exclusion algorithms based on read and write operations have been of no practical use for decades, since modern computers provide special instructions to implement mutual exclusion more efficiently. Now, they are studied mainly as concurrent programming exercises. The bakery algorithm is of interest because it was the first mutual-exclusion algorithm not to assume lower-level mutual exclusion, which is implied by atomic reads and writes of shared memory. The distributed state-machine algorithm is interesting because it preserves causality. But it too is less important than the problem it solves.

The most important contribution of my state-machine paper was the observation that any desired form of cooperation in a network of computers can be obtained by implementing a distributed state machine. The obvious next step was to make the implementation fault tolerant. The work addressing that problem is too extensive to discuss here. Fault-tolerant state-machine algorithms have become the standard building block for implementing reliable distributed systems.²⁰

There was no direct connection between the creation of the bakery algorithm and of the state-machine algorithm. The bakery algorithm was inspired by a bakery in the neighborhood where I grew up. A machine dispensed numbers to its customers that determined the order in which they were served. The state-machine algorithm was inspired by an algorithm of Paul Johnson and Robert Thomas.⁷
when receiving a message with the and ending algorithms of our journey. They used the << relation and process
value of maxNum[i][j] from another The bakery algorithm is among hun- identifiers to break ties, but I don’t
process j. Upon receiving a message dreds of algorithms that implement know if that was inspired by the bak-
(v, Cmd) from process j, process i sets mutual exclusion using only read and ery algorithm.
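As a concrete companion to the state-machine algorithm described above, here is a minimal Python sketch of one way the pieces might fit together. It is not the article's Figure 6 or pseudo-code. Several things are assumptions made for illustration: the names `Process`, `issue`, `receive`, `try_execute`, `deliver_all`, and `demo`; the single shared FIFO queue standing in for the network; the choice to advance a process's own clock by exactly one when it issues a command; the reading in which process i owns the row maxNum[i][*]; and the rule that a process may execute any saved command whose number is at most the minimum of its maxNum row, which generalizes the text's rule for a process's own command.

```python
from collections import deque


class Process:
    """One replica; max_num models the row maxNum[pid][*] (an assumed reading)."""

    def __init__(self, pid, n, network):
        self.pid = pid
        self.n = n
        self.network = network      # shared FIFO queue of (dest, kind, src, v, cmd)
        self.max_num = [0] * n      # max_num[pid] plays the role of this process's clock
        self.pending = []           # saved triples (v, j, cmd) not yet executed
        self.executed = []          # commands applied locally, in execution order

    def issue(self, cmd):
        # Assumption: a new command gets a number one larger than the current clock.
        v = self.max_num[self.pid] + 1
        self.max_num[self.pid] = v
        self.pending.append((v, self.pid, cmd))
        for j in range(self.n):
            if j != self.pid:
                self.network.append((j, "cmd", self.pid, v, cmd))
        self.try_execute()

    def receive(self, kind, src, v, cmd):
        # Record the value heard from src and keep our own entry equal to the
        # maximum of the row (raising it to a value >= v, which the text says suffices).
        self.max_num[src] = max(self.max_num[src], v)
        self.max_num[self.pid] = max(self.max_num[self.pid], v)
        if kind == "cmd":
            self.pending.append((v, src, cmd))
            # Acknowledge with our current clock value, which is >= v.
            self.network.append((src, "ack", self.pid, self.max_num[self.pid], None))
        self.try_execute()

    def try_execute(self):
        # With FIFO channels, every command numbered <= min(max_num) has arrived,
        # so those commands can be executed in (v, j) order.
        horizon = min(self.max_num)
        for triple in sorted(t for t in self.pending if t[0] <= horizon):
            self.executed.append(triple[2])
            self.pending.remove(triple)


def deliver_all(network, procs):
    """Deliver queued messages in FIFO order until the queue is empty."""
    while network:
        dest, kind, src, v, cmd = network.popleft()
        procs[dest].receive(kind, src, v, cmd)


def demo():
    network = deque()
    procs = [Process(i, 3, network) for i in range(3)]
    procs[0].issue("a")             # "a" and "b" are issued concurrently,
    procs[2].issue("b")             # both with number 1; the tie is broken by process id
    deliver_all(network, procs)
    procs[1].issue("c")
    deliver_all(network, procs)
    return [p.executed for p in procs]


if __name__ == "__main__":
    for pid, log in enumerate(demo()):
        print(pid, log)
```

In this run all three processes execute "a" before "b", because (1, 0) is ordered before (1, 2). Process 1 executes "c" once both acks arrive. Processes 0 and 2 have saved "c" but hold it, since neither has yet heard a value of at least 2 from the other; in this sketch they would execute it after any later message exchange raises those entries.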

SEPTEMBER 2022 | VOL. 65 | NO. 9 | COMMUNICATIONS OF THE ACM 65

The path between the two algorithms that we followed is not the one I originally took. That journey began when I was looking for an example of a distributed algorithm for notes I was writing. Stephan Merz suggested the mutual-exclusion algorithm I had used to illustrate the state-machine algorithm. I found it to be too complicated, so I simplified it. (I did not remember the Ricart-Agrawala algorithm and was only later reminded of it by a referee.) After stripping away things that were not needed for that particular state machine, I arrived at the distributed bakery algorithm. It was obviously related to the original bakery algorithm, but it was still not clear exactly how.

I wanted to make the distributed algorithm an implementation of the bakery algorithm. I started with the generalization of having subprocesses of each process interact independently with the other processes; that was essentially how I had been describing the bakery algorithm for years. Delaying the setting of number[i] to 0 was required because the distributed algorithm’s message that accomplished it could be arbitrarily delayed. It took me a while to realize that I should deconstruct the multi-reader register number[i] into multiple single-reader registers, and that both the original bakery algorithm and the distributed algorithm implemented that deconstructed algorithm.

The path back from the distributed bakery algorithm to the distributed state-machine algorithm was easy. It may have helped that I had previously used the idea of modifying the bakery algorithm to make values of number[i] keep increasing. Paradoxically, that was done to keep those values from getting too large.10

Correctness of a concurrent algorithm is expressed with two classes of properties: safety properties, such as mutual exclusion, that assert what the algorithm may do, and liveness properties, such as starvation freedom, that assert what the algorithm must do.1 Safety properties depend on the actions the algorithm can perform; liveness properties depend as well on assumptions, often implicit, about what actions the algorithm must perform.

The kind of informal behavioral reasoning I have used here is notoriously unreliable. I believe the best rigorous proofs of safety properties are usually based on invariants: predicates that are true of every state of every possible execution.2 Invariance proofs that the bakery algorithm satisfies mutual exclusion have often been used to illustrate formalisms or tools.5,11 An informal sketch of such a proof for the decomposed bakery algorithm is in an expanded version of this article, which is available on the Web.8 Elegant rigorous proofs of progress properties can be written using temporal logic.18

Rigorous proofs are longer than informal ones and can intimidate readers not used to them. I almost never write one until I believe that what I want to prove is true. For the correctness of our algorithms, that belief was based on the reasoning embodied in the informal proofs I presented, the same kind of reasoning I used when I discovered the bakery and distributed state-machine algorithms.

I understood the two algorithms well enough to be confident in the correctness of the non-distributed versions of the bakery algorithm and of the derivation of the state-machine algorithm from the distributed bakery algorithm. Model checking convinced me of the correctness of the distributed bakery algorithm and confirmed the confidence my informal invariance proof had given me that the deconstructed algorithm satisfies mutual exclusion.

More recently, Stephan Merz wrote a formal, machine-checked version of my informal invariance proof. He also wrote a machine-checked proof that the actions of the distributed bakery algorithm implement the actions of the deconstructed bakery algorithm under a suitable data refinement. These two proofs show that the deconstructed algorithm satisfies mutual exclusion. The proofs are available on the Web.16

References
1. Alpern, B. and Schneider, F. Defining liveness. Information Processing Letters 21, 4 (Oct. 1985), 181–185.
2. Ashcroft, E. Proving assertions about parallel programs. Journal of Computer and System Sciences 10, 1 (Feb. 1975), 110–135.
3. Dijkstra, E. Solution of a problem in concurrent programming control. Commun. ACM 8, 9 (Sept. 1965), 569.
4. Harel, D. On folk theorems. Commun. ACM 23, 7 (July 1980), 379–389.
5. Hesselink, W. Mechanical verification of Lamport’s bakery algorithm. Science of Computer Programming 78, 9 (2013), 1622–1638.
6. Jayanti, P., Tan, K., Friedland, G., and Katz, A. Bounding Lamport’s bakery algorithm. In L. Pacholski and P. Ruzicka, eds., SOFSEM 2001: 28th Conference on Current Trends in Theory and Practice of Informatics 2234, Lecture Notes in Computer Science, Springer (2001), 261–270.
7. Johnson, P. and Thomas, R. The maintenance of duplicate data bases. Request for Comment RFC #677, NIC #31507, ARPANET Network Working Group (January 1975).
8. Lamport, L. Online supplemental material for Deconstructing the bakery to build a distributed state machine, http://lamport.azurewebsites.net/pubs/bakery/deconstruction.html.
9. Lamport, L. A new solution of Dijkstra’s concurrent programming problem. Commun. ACM 17, 8 (Aug. 1974), 453–455.
10. Lamport, L. Concurrent reading and writing. Commun. ACM 20, 11 (Nov. 1977), 806–811.
11. Lamport, L. Proving the correctness of multiprocess programs. IEEE Transactions on Software Engineering SE-3, 2 (Mar. 1977), 125–143.
12. Lamport, L. Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21, 7 (July 1978), 558–565.
13. Lamport, L. On interprocess communication. Distributed Computing 1 (1986), 77–101.
14. Lamport, L. A theorem on atomicity in distributed algorithms. Distributed Computing 4, 2 (1990), 59–68.
15. Lipton, R. Reduction: A method of proving properties of parallel programs. Commun. ACM 18, 12 (Dec. 1975), 717–721.
16. Merz, S. Online TLA+ specifications and proofs for “deconstructing the bakery to build a distributed state machine.” https://members.loria.fr/SMerz/papers/distributed-bakery.html.
17. Moses, Y. and Patkin, K. Mutual exclusion as a matter of priority. Theoretical Computer Science 751 (2018), 46–60.
18. Pnueli, A. The temporal logic of programs. In Proceedings of the 18th Annual Symposium on the Foundations of Computer Science, IEEE (Nov. 1977), 46–57.
19. Ricart, G. and Agrawala, A. An optimal algorithm for mutual exclusion in computer networks. Commun. ACM 24, 1 (1981), 9–17.
20. Schneider, F. Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Computing Surveys 22, 4 (December 1990), 299–319.
21. Taubenfeld, G. The black-white bakery algorithm and related bounded-space, adaptive, local-spinning and FIFO algorithms. In R. Guerraoui, (Ed.), Proceedings of Distributed Computing, 18th Intern. Conf. 3274, Lecture Notes in Computer Science, Springer (Oct. 4, 2004), 56–70.
22. Taubenfeld, G. Concurrent programming, mutual exclusion. In M-Y Kao, (Ed.) Encyclopedia of Algorithms, 2016 Edition, 421–425. Springer, 2016.

Leslie Lamport is the recipient of the 2013 ACM A.M. Turing Award.

This work is licensed under a http://creativecommons.org/licenses/by/4.0/

Watch the author discuss this work in the exclusive Communications video. https://cacm.acm.org/videos/deconstructing-the-bakery
