DC Chap 4
We then look at two widely used models for communication: Remote Procedure
Call (RPC), and Message-Oriented Middleware (MOM).
Communication Foundations
Drawbacks
• Focus on message-passing only
• Often unneeded or unwanted functionality
• Violates access transparency
Layered Protocols
Figure 2: A typical message as it appears on the network.
Low-level layers
Recap
• Physical layer: contains the specification and implementation of bits, and
their transmission between sender and receiver
• Data link layer: prescribes the transmission of a series of bits into a frame
to allow for error and flow control
• Network layer: describes how packets in a network of computers are to
be routed.
Observation
For many distributed systems, the lowest-level interface is that of the network
layer.
Transport Layer
Important
The transport layer provides the actual communication facilities for most
distributed systems.
Middleware layer
Observation
Middleware is invented to provide common services and protocols that can be
used by many different applications
• A rich set of communication protocols
• (Un)marshaling of data, necessary for integrated systems
• Naming protocols, to allow easy sharing of resources
• Security protocols for secure communication
• Scaling mechanisms, such as for replication and caching
Note
What remains are truly application-specific protocols... such as?
Middleware is an application that logically lives (mostly) in the OSI application layer, but
which contains many general-purpose protocols that warrant their own layers, independent
of other, more specific applications.
Types of communication
Distinguish...
Transient versus persistent
Places for synchronization
• At request submission
• At request delivery
• After request processing
Client/Server
Some observations
Client/Server computing is generally based on a model of transient
synchronous communication:
• Client and server have to be active at the time of communication
• Client issues request and blocks until it receives reply
• Server essentially waits only for incoming requests, and subsequently
processes them
Messaging
Message-oriented middleware
Aims at high-level persistent asynchronous communication:
• Processes send each other messages, which are queued
• Sender need not wait for immediate reply, but can do other things
• Middleware often ensures fault tolerance
Communication Remote procedure call
Conclusion
Communication between caller and callee can be hidden by using a procedure-call
mechanism.
1. Client procedure calls client stub.
2. Stub builds message; calls local OS.
3. OS sends message to remote OS.
4. Remote OS gives message to stub.
5. Stub unpacks parameters; calls server.
6. Server does local call; returns result to stub.
7. Stub builds message; calls OS.
8. OS sends message to client's OS.
9. Client's OS gives message to stub.
10. Client stub unpacks result; returns to client.
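The ten steps can be sketched in a single process. This is a minimal illustration, not a real RPC system: the names add, PROCEDURES, transport, and client_stub_add are hypothetical, JSON is an assumed message format, and the transport function merely stands in for the OS-to-OS message transfer.

```python
import json

def add(a, b):
    """The actual procedure, living at the server."""
    return a + b

PROCEDURES = {"add": add}  # hypothetical server-side dispatch table

def server_stub(request_bytes):
    # Steps 5-7: unpack the parameters, do the local call, pack the result.
    request = json.loads(request_bytes.decode())
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode()

def transport(request_bytes):
    # Stand-in for steps 3-4 and 8-9 (message transfer between the two OSes).
    return server_stub(request_bytes)

def client_stub_add(a, b):
    # Step 2: build the request message. Step 10: unpack the reply.
    request = json.dumps({"proc": "add", "args": [a, b]}).encode()
    reply = transport(request)
    return json.loads(reply.decode())["result"]

# Step 1: to the caller this looks like an ordinary local call.
print(client_stub_add(2, 3))
```

The caller never sees a message; only the stubs know that parameters are packed, shipped, and unpacked.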
Conclusion
Client and server need to properly interpret messages, transforming them into
machine-dependent representations.
Parameter
Passing Value Parameters (1)
Figure 4-8. (c) The message after being inverted. The little numbers in
boxes indicate the address of each byte.
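Why machine-dependent representations matter can be shown with Python's struct module. The sketch below assumes a 32-bit integer parameter with value 5; it packs the value in little-endian and big-endian byte order and then misreads the little-endian bytes as big-endian.

```python
import struct

value = 5
little = struct.pack("<i", value)  # little-endian layout: b'\x05\x00\x00\x00'
big = struct.pack(">i", value)     # big-endian layout:    b'\x00\x00\x00\x05'

# A receiver that assumes the wrong byte order reads a different number.
misread = struct.unpack(">i", little)[0]

print(little, big, misread)
```

Here misread equals 5 · 2^24 = 83886080, which is why sender and receiver must agree on one representation for the message.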
Conclusion
Full access transparency cannot be realized.
Asynchronous RPCs
Essence
Try to get rid of the strict request-reply behavior, but let the client continue
without waiting for an answer from the server.
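A minimal sketch of this idea, assuming a thread pool as a stand-in for the RPC runtime. The function remote_square is hypothetical, and its sleep only mimics network and server latency.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def remote_square(x):
    # Hypothetical stand-in for a remote procedure.
    time.sleep(0.1)
    return x * x

with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(remote_square, 7)  # issue the call without waiting
    local_work = sum(range(10))             # the client continues meanwhile
    result = future.result()                # pick up the answer when needed

print(local_work, result)
```

The client blocks only at future.result(), that is, at the point where it actually needs the answer.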
Variations on RPC
Writing a Client and a Server (1)
Client
from socket import socket, AF_INET, SOCK_STREAM

class Client:
    def run(self):
        # HOST and PORT are assumed to be defined elsewhere.
        s = socket(AF_INET, SOCK_STREAM)
        s.connect((HOST, PORT))   # connect to server (block until accepted)
        s.send(b"Hello, world")   # send some data
        data = s.recv(1024)       # receive the response
        print(data)               # print what you received
        s.send(b"")               # tell the server to close
        s.close()                 # close the connection
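The matching server is not shown here. The following is a sketch of what one could look like, not the original listing. It assumes an echo-style reply with "*" appended, binds to port 0 on 127.0.0.1 so the OS picks a free port, and runs the server in a thread so the example is self-contained.

```python
import threading
from socket import socket, AF_INET, SOCK_STREAM

HOST = "127.0.0.1"  # assumption: both ends run on the local machine

def serve(listener):
    conn, _addr = listener.accept()    # block until a client connects
    with conn:
        while True:
            data = conn.recv(1024)     # receive a request
            if not data:               # empty read: the client has closed
                break
            conn.sendall(data + b"*")  # reply with the data plus "*"

listener = socket(AF_INET, SOCK_STREAM)
listener.bind((HOST, 0))               # port 0: let the OS choose a free port
listener.listen(1)
port = listener.getsockname()[1]
thread = threading.Thread(target=serve, args=(listener,))
thread.start()

client = socket(AF_INET, SOCK_STREAM)
client.connect((HOST, port))           # connect to the server
client.sendall(b"Hello, world")        # send some data
reply = client.recv(1024)              # receive the response
client.close()                         # closing ends the server loop
thread.join()
listener.close()
print(reply)
```

Both sides must be running at the same time, and the client blocks in recv until the reply arrives, which is the transient synchronous pattern described earlier.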
Alternative: ZeroMQ
Provides a higher level of expression by pairing sockets: one for sending
messages at process P and a corresponding one at process Q for receiving
messages. All communication is asynchronous.
Three patterns
• Request-reply
• Publish-subscribe
• Pipeline
Request-reply
import zmq

def server():
    context = zmq.Context()
    socket = context.socket(zmq.REP)          # create reply socket
    socket.bind("tcp://*:12345")              # bind socket to address

    while True:
        message = socket.recv()               # wait for incoming message
        if b"STOP" not in message:            # if not told to stop ...
            reply = message.decode() + "*"    # append "*" to message
            socket.send(reply.encode())       # send it away (encoded)
        else:
            break                             # break out of loop and end

def client():
    context = zmq.Context()
    socket = context.socket(zmq.REQ)          # create request socket

    socket.connect("tcp://localhost:12345")   # connect to the server
    socket.send(b"Hello world")               # send message
    message = socket.recv()                   # block until response
    socket.send(b"STOP")                      # tell server to stop
    print(message.decode())                   # print result
Publish-subscribe
import multiprocessing
import zmq, time

def server():
    context = zmq.Context()
    socket = context.socket(zmq.PUB)            # create a publisher socket
    socket.bind("tcp://*:12345")                # bind socket to the address
    while True:
        time.sleep(5)                           # wait 5 seconds between messages
        t = "TIME " + time.asctime()
        socket.send(t.encode())                 # publish the current time

def client():
    context = zmq.Context()
    socket = context.socket(zmq.SUB)            # create a subscriber socket
    socket.connect("tcp://localhost:12345")     # connect to the server
    socket.setsockopt(zmq.SUBSCRIBE, b"TIME")   # subscribe to TIME messages

    for i in range(5):                          # five iterations
        message = socket.recv()                 # receive a message related to subscription
        print(message.decode())                 # print the result
Pipeline
import pickle, random, time
import zmq

# NWORKERS (the number of worker processes) is assumed to be defined elsewhere.

def producer():
    context = zmq.Context()
    socket = context.socket(zmq.PUSH)           # create a push socket
    socket.bind("tcp://127.0.0.1:12345")        # bind socket to address

    while True:
        workload = random.randint(1, 100)       # compute workload
        socket.send(pickle.dumps(workload))     # send workload to worker
        time.sleep(workload/NWORKERS)           # balance production by waiting

def worker(id):
    context = zmq.Context()
    socket = context.socket(zmq.PULL)           # create a pull socket
    socket.connect("tcp://localhost:12345")     # connect to the producer

    while True:
        work = pickle.loads(socket.recv())      # receive work from a source
        time.sleep(work)                        # pretend to work
Queue-based messaging
Four possible combinations
Message-oriented middleware
Essence
Asynchronous persistent communication through support of middleware-level
queues. Queues correspond to buffers at communication servers.
Operations
Operation    Description
General model
Queue managers
Queues are managed by queue managers. An application can put messages
only into a local queue. Getting a message is possible by extracting it from a
local queue only ⇒ queue managers need to route messages.
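The division of labor can be sketched as follows. This is an in-memory illustration under stated assumptions: the QueueManager class, its put, route, and get methods, and the shared network dictionary are all hypothetical names, and real queue managers would forward messages over a network rather than through shared memory.

```python
from collections import deque

class QueueManager:
    """Hypothetical queue manager with one send queue and named receive queues."""

    def __init__(self, name, network):
        self.name = name
        self.network = network   # shared dict: manager name -> QueueManager
        self.outgoing = deque()  # local send queue
        self.incoming = {}       # local receive queues, by queue name
        network[name] = self

    def put(self, dest_manager, dest_queue, message):
        # Applications only ever append to the local send queue.
        self.outgoing.append((dest_manager, dest_queue, message))

    def route(self):
        # Forward each pending message to the manager owning its destination.
        while self.outgoing:
            dest_manager, dest_queue, message = self.outgoing.popleft()
            target = self.network[dest_manager]
            target.incoming.setdefault(dest_queue, deque()).append(message)

    def get(self, queue):
        # Applications only ever extract from a local receive queue.
        return self.incoming[queue].popleft()

network = {}
qm_a = QueueManager("A", network)
qm_b = QueueManager("B", network)

qm_a.put("B", "orders", "order #1")  # sender returns immediately
qm_a.route()                         # managers move the message later
print(qm_b.get("orders"))
```

The message sits in the send queue of A until route runs, so sender and receiver never need to be active at the same moment.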
Routing
Message broker
Observation
Message queuing systems assume a common messaging protocol: all
applications agree on message format (i.e., structure and data representation)
Example: AMQP
Lack of standardization
Advanced Message-Queuing Protocol was intended to play the same role as,
for example, TCP in networks: a protocol for high-level messaging with
different implementations.
Basic model
A client sets up a (stable) connection, which is a container for several (possibly
ephemeral) one-way channels. Two one-way channels can form a session. A
link is akin to a socket, and maintains state about message transfers.
Example: Advanced Message Queuing Protocol (AMQP)
Communication Message-oriented communication
import rabbitpy

def producer():
    connection = rabbitpy.Connection()                  # Connect to RabbitMQ server
    channel = connection.channel()                      # Create new channel on the connection

    exchange = rabbitpy.Exchange(channel, 'exchange')   # Create an exchange
    exchange.declare()

    queue1 = rabbitpy.Queue(channel, 'example1')        # Create 1st queue
    queue1.declare()

    queue2 = rabbitpy.Queue(channel, 'example2')        # Create 2nd queue
    queue2.declare()

    queue1.bind(exchange, 'example-key')                # Bind queue1 to a single key
    queue2.bind(exchange, 'example-key')                # Bind queue2 to the same key

    message = rabbitpy.Message(channel, 'Test message')
    message.publish(exchange, 'example-key')            # Publish the message using the key
    exchange.delete()
import rabbitpy

def consumer():
    connection = rabbitpy.Connection()
    channel = connection.channel()

    queue = rabbitpy.Queue(channel, 'example1')

    # While there are messages in the queue, fetch them using Basic.Get
    while len(queue) > 0:
        message = queue.get()
        print('Message Q1: %s' % message.body.decode())
        message.ack()

    queue = rabbitpy.Queue(channel, 'example2')

    while len(queue) > 0:
        message = queue.get()
        print('Message Q2: %s' % message.body.decode())
        message.ack()
Application-level multicasting
Essence
Organize nodes of a distributed system into an overlay network and use that
network to disseminate data:
• Oftentimes a tree, leading to unique paths
• Alternatively, also mesh networks, requiring a form of routing
• Link stress: How often does an ALM message cross the same physical
link? Example: message from A to D needs to cross ⟨Ra,Rb⟩ twice.
• Stretch: Ratio in delay between ALM-level path and network-level path.
Example: messages B to C follow path of length 73 at ALM, but 47 at
network level ⇒ stretch = 73/47.
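Both metrics are easy to compute once the physical route of each overlay hop is known. In the sketch below the topology is an assumption made up for illustration (the routers Ra, Rb, Rd and the two overlay hops are hypothetical); only the delays 73 and 47 come from the example above.

```python
from collections import Counter

def link_stress(overlay_hops):
    """Count how often one ALM message crosses each physical link."""
    stress = Counter()
    for physical_path in overlay_hops:
        for u, v in zip(physical_path, physical_path[1:]):
            stress[frozenset((u, v))] += 1  # links are undirected
    return stress

def stretch(alm_delay, network_delay):
    """Ratio of the delay along the overlay to the delay in the network."""
    return alm_delay / network_delay

# Assumed routes: A reaches D through overlay node B.
hops_a_to_d = [
    ["A", "Ra", "Rb", "B"],        # overlay hop A -> B
    ["B", "Rb", "Ra", "Rd", "D"],  # overlay hop B -> D
]

stress = link_stress(hops_a_to_d)
print(stress[frozenset(("Ra", "Rb"))])  # crossings of the link <Ra,Rb>
print(round(stretch(73, 47), 3))        # stretch for the B-to-C example
```

With these assumed routes the link ⟨Ra,Rb⟩ is crossed twice, and the stretch 73/47 is roughly 1.55.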
Flooding
Essence
P simply sends a message m to each of its neighbors. Each neighbor will
forward that message, except to P, and only if it had not seen m before.
Flooding-based multicasting
Communication Multicast communication
Variation
Let Q forward a message with a certain probability $p_{flood}$, possibly even
dependent on its own number of neighbors (i.e., node degree) or the degree of
its neighbors.
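Flooding and its probabilistic variation can be sketched as a small simulation. The function flood, the four-node graph, and the choice that the source always forwards are assumptions for illustration. Setting p_flood to 1.0 gives plain flooding.

```python
import random

def flood(graph, source, p_flood=1.0, rng=None):
    """Return the set of reached nodes and the number of messages sent."""
    rng = rng or random.Random(0)
    seen = {source}
    frontier = [(source, None)]  # (node, neighbor it received m from)
    messages = 0
    while frontier:
        node, sender = frontier.pop()
        # The source always forwards; others forward with probability p_flood.
        if node != source and rng.random() >= p_flood:
            continue
        for neighbor in graph[node]:
            if neighbor == sender:
                continue         # never send m back to where it came from
            messages += 1
            if neighbor not in seen:
                seen.add(neighbor)  # first time seen: neighbor may forward
                frontier.append((neighbor, node))
    return seen, messages

# Hypothetical overlay: a ring of four nodes.
graph = {"P": ["Q", "R"], "Q": ["P", "S"], "R": ["P", "S"], "S": ["Q", "R"]}

reached, sent = flood(graph, "P")
print(sorted(reached), sent)
```

Lowering p_flood reduces the number of messages but also risks that some nodes are never reached.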
Epidemic protocols
Assume there are no write–write conflicts
• Update operations are performed at a single server
• A replica passes updated state to only a few neighbors
• Update propagation is lazy, i.e., not immediate
• Eventually, each update should reach every replica
Anti-entropy
Principle operations
• A node P selects another node Q from the system at random.
• Pull: P only pulls in new updates from Q
• Push: P only pushes its own updates to Q
• Push-pull: P and Q send updates to each other
Observation
For push-pull it takes O(log(N)) rounds to disseminate updates to all N nodes
(round = when every node has taken the initiative to start an exchange).
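The logarithmic behavior can be observed with a small simulation. This is a sketch under simplifying assumptions: a single update, a single source (node 0), and exchanges that take effect immediately within a round. The function name push_pull_rounds is hypothetical.

```python
import random

def push_pull_rounds(n, seed=0):
    """Count rounds until one update has reached all n nodes."""
    rng = random.Random(seed)
    updated = [False] * n
    updated[0] = True                     # single source of the update
    rounds = 0
    while not all(updated):
        rounds += 1
        for p in range(n):                # every node initiates one exchange
            q = rng.randrange(n - 1)
            if q >= p:
                q += 1                    # choose Q != P uniformly at random
            if updated[p] or updated[q]:  # push-pull: both end up updated
                updated[p] = updated[q] = True
    return rounds

for n in (100, 1000, 10000):
    print(n, push_pull_rounds(n))
```

The printed round counts grow slowly with n, in line with the O(log(N)) observation; exact values depend on the random seed.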
Anti-entropy: analysis
Basics
Consider a single source, propagating its update. Let $p_i$ be the probability that
a node has not received the update after the $i$-th round.
Anti-entropy performance
Rumor spreading
Basic model
A server S having an update to report, contacts other servers. If a server is
contacted to which the update has already propagated, S stops contacting
other servers with probability $p_{stop}$.
Observation
If s is the fraction of ignorant servers (i.e., which are unaware of the update), it
can be shown that with many servers
$s = e^{-(1/p_{stop} + 1)(1 - s)}$
Formal analysis
Notations
Let s denote the fraction of nodes that have not yet been updated (i.e., susceptible);
i the fraction of updated (infected) and active nodes; and r the fraction of
updated nodes that gave up (removed).
1) $ds/dt = -s \cdot i$
2) $di/dt = s \cdot i - p_{stop} \cdot (1 - s) \cdot i$
⇒ $di/ds = -(1 + p_{stop}) + p_{stop}/s$
⇒ $i(s) = -(1 + p_{stop}) \cdot s + p_{stop} \cdot \ln(s) + C$
Wrap up
$i(1) = 0 \Rightarrow C = 1 + p_{stop} \Rightarrow i(s) = (1 + p_{stop}) \cdot (1 - s) + p_{stop} \cdot \ln(s)$. We are
looking for the case $i(s) = 0$, which leads to $s = e^{-(1/p_{stop} + 1)(1 - s)}$.
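The final equation has s on both sides, so it is solved numerically. The sketch below uses plain fixed-point iteration. The function name ignorant_fraction, the starting value 0.5, and the iteration count are assumptions; the start is chosen away from the trivial solution s = 1.

```python
import math

def ignorant_fraction(p_stop, iterations=200):
    """Solve s = exp(-(1/p_stop + 1) * (1 - s)) by fixed-point iteration."""
    k = 1 / p_stop + 1
    s = 0.5  # start away from the trivial solution s = 1
    for _ in range(iterations):
        s = math.exp(-k * (1 - s))
    return s

for inverse_p_stop in range(1, 7):
    s = ignorant_fraction(1 / inverse_p_stop)
    print(inverse_p_stop, round(s, 6), round(10_000 * s))
```

For $p_{stop} = 1$ the iteration settles at roughly s ≈ 0.2, meaning about a fifth of the servers never learn of the update.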
Rumor spreading
The effect of stopping
Consider 10,000 nodes

1/p_stop   s          Ns
1          0.203188   2032
2          0.059520   595
3          0.019827   198
4          0.006977   70
5          0.002516   25
6          0.000918   9
Gossip-based data dissemination
Note
If we really have to ensure that all servers are eventually updated, rumor
spreading alone is not enough.
Deleting values
Fundamental problem
We cannot remove an old value from a server and expect the removal to
propagate. Instead, a mere removal will be undone in due time by the epidemic
algorithm, because other servers still hold the old value and will spread it again.
Solution
Removal has to be registered as a special update by inserting a death
certificate
Deleting values
When to remove a death certificate (it is not allowed to stay forever)
• Run a global algorithm to detect whether the removal is known
everywhere, and then collect the death certificates (looks like garbage
collection)
• Assume death certificates propagate in finite time, and associate a
maximum lifetime for a certificate (can be done at risk of not reaching all
servers)
Note
It is necessary that a removal actually reaches all servers.
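The problem and the death-certificate solution can be sketched with two replicas. This is an illustration under assumptions: the Replica class, the TOMBSTONE marker, integer timestamps, and a newest-timestamp-wins push-pull exchange are all hypothetical choices, not a prescribed design.

```python
TOMBSTONE = object()  # hypothetical marker standing in for a death certificate

class Replica:
    def __init__(self):
        self.store = {}  # key -> (timestamp, value)

    def write(self, key, value, now):
        self.store[key] = (now, value)

    def delete(self, key, now):
        self.store[key] = (now, TOMBSTONE)  # register removal as an update

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None or entry[1] is TOMBSTONE else entry[1]

    def exchange(self, other):
        # Push-pull: both replicas keep the newest entry for every key.
        for key in set(self.store) | set(other.store):
            entries = [r.store[key] for r in (self, other) if key in r.store]
            newest = max(entries, key=lambda entry: entry[0])
            self.store[key] = other.store[key] = newest

    def expire(self, now, max_lifetime):
        # Drop death certificates older than the maximum lifetime.
        for key, (stamp, value) in list(self.store.items()):
            if value is TOMBSTONE and now - stamp > max_lifetime:
                del self.store[key]

a, b = Replica(), Replica()
a.write("x", 1, now=0)
a.exchange(b)

del a.store["x"]         # mere removal at a ...
a.exchange(b)
undone = a.read("x")     # ... is undone: b hands the value back

a.delete("x", now=5)     # removal registered as a death certificate
a.exchange(b)
propagated = b.read("x") # the removal has now reached b
print(undone, propagated)
```

Calling expire too early on one replica while another still holds the old value would let that value resurface, which is the risk attached to a maximum lifetime.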