
Distributed Systems

1 Introduction

▶ Introduction

▶ Processes

▶ Communication

▶ Coordination

▶ Distributed programming

2
Introduction
Distributed systems

• Definitions
• Design goals
• Classification
• Pitfalls
Basic Concepts
1 Introduction

Definition 1:
A distributed system is a collection of autonomous computing elements that
appears to its users as a single coherent system, i.e., the elements work together
to achieve a common goal or provide a unified service.

Realization of distributed systems

4
Basic Concepts
1 Introduction

Definition 2:
A distributed system is a networked computer system in which processes and
resources are sufficiently spread across multiple computers ⇐ Expansive view

Examples
• Google (Search, Mail, etc.)
• Finance and commerce (Banks, Amazon, eBay, etc.)
• Content Delivery Network (CDN)
• Telecommunication

Decentralized systems: processes and resources are necessarily spread across
multiple computers. E.g., federated learning, distributed ledger
(blockchain), etc. ⇐ Integrative view

5
Design Goals
1 Introduction

A distributed system should:


• make resources easily accessible
• hide the fact that resources are distributed across a network
• be open, and
• be scalable

6
Distribution transparency
Design Goals

The distribution of processes and resources is transparent, that is, invisible, to end users
and applications.

Distribution transparency
Transparency Description
Access Hide differences in data representation and how an object is accessed
Location Hide where an object is located
Relocation Hide that an object may be moved to another location while in use
Migration Hide that an object may move to another location
Replication Hide that an object is replicated
Concurrency Hide that an object may be shared by several independent users
Failure Hide the failure and recovery of an object

7
Degree of distribution transparency
Distribution transparency

Trade-off between a high degree of transparency and the performance of a system.

Example
Video streams (failure to access server)
• How to hide transmission delays for wide-area distributed systems?
• How to distinguish a slow system from a failing one?

Note:
Distribution transparency is a nice goal, but achieving it is a different story.

8
Openness
Design Goals

An open system can interact with services of other systems, irrespective of the
underlying environment.

Openness
• Interoperability – components from different manufacturers can co-exist and work
together through a common standard.
• Portability – an application can be executed on a different distributed system
without modification.
• Extensibility – new components can be added, or existing ones replaced, without
affecting the other components.

Lack of openness ⇒ open-source

9
Separating policy from mechanism
Openness

Implementing openness: policies
• What level of consistency do we require for client-cached data?
• Which QoS requirements do we adjust in the face of varying bandwidth?
• What level of secrecy do we require for communication?

Implementing openness: mechanisms
• Allow (dynamic) setting of caching policies
• Provide adjustable QoS parameters per data stream
• Offer different encryption algorithms

Large distributed systems ⇒ reasonable defaults and self-configurable systems

10
Dependability
Design Goals

The system operates as expected.

Partial failures and fault tolerance

Requirements
• Availability – the probability of operating correctly at any given moment.
• Reliability – continue to work without interruption.
• Safety – no catastrophic event happens when the system temporarily fails.
• Maintainability – how easily a failed system can be repaired.

11
Scalability
Design Goals

Scalability
Three components:
• Size scalability – add users/resources without noticeable loss of performance.
• Geographical scalability – users and resources may lie far apart, but communication
delay is hardly noticed.
• Administrative scalability – can be easily managed even if it spans many
independent administrative organizations.

12
Scaling techniques
Scalability

Performance problems caused by limited capacity of servers and network?


• Scale up – increasing memory, upgrading CPUs, or replacing network modules.
• Scale out – expanding the distributed system

Scaling out
• Hiding communication latencies – avoid waiting for responses to remote-service
requests as much as possible. However, the approach doesn’t fit every
application.
• Partitioning and distribution – partition data and computations across multiple
machines. Example: World Wide Web documents
• Replication – availability, load balance and latency
— Problem: consistency

13
Classification of distributed systems
1 Introduction

• Distributed computing systems


— Cluster computing
— Grid computing
— Cloud computing
• Distributed information systems
• Distributed pervasive systems

14
Cluster computing
Distributed computing systems

Cluster computing plays a crucial role in distributed systems by enabling efficient
resource utilization, parallel processing, fault tolerance, and scalability.

Distributed computing systems are used for high-performance computing tasks.

Cluster computing
• Homogeneous: same OS, near-identical hardware
• Single managing node

15
Grid computing
Distributed computing systems

Whereas cluster computing focuses on a single organization's resources, grid computing
enables efficient resource sharing, collaboration, and scalability across organizational
boundaries, making it a powerful paradigm for tackling large-scale computational
challenges.

Grid computing
• Fabric layer – shared heterogeneous resources
• Connectivity layer – communication and security
protocols to authenticate and transfer data
• Resource layer – protocols for operating on a single
shared resource. E.g., get configuration, create a
process
• Collective layer – coordinates sharing of resources.
E.g., allocation and scheduling of tasks onto multiple
resources
• Application layer – applications running on the grid.

16
Cloud computing
Distributed computing systems

Cloud computing provides a scalable and flexible computing model, enabling efficient
resource utilization and empowering users to focus on their core business or tasks
without the burden of infrastructure management.

Cloud computing
Four layers
• Hardware – physical storage, processors, network
devices, etc.
• Infrastructure – provides virtually unlimited “raw”
computing, storage, and network resources
• Platform – set of tools or middleware that are used to
develop or deploy applications on the cloud
• Application – running applications.

17
Distributed information systems
Classification

• The traditional environments, where databases play an important role.


• The database and the processing components of the system are separated.
— Transaction processing systems

Transaction primitives

Primitive Description
BEGIN_TRANSACTION Mark the start of a transaction
END_TRANSACTION Terminate the transaction and try to commit
ABORT_TRANSACTION Kill the transaction and restore the old values
READ Read data from a file, a table, or otherwise
WRITE Write data to a file, a table, or otherwise

18
Transaction processing
Distributed information systems

Properties of transactions
Transactions adhere to the so-called ACID properties:
• Atomicity – All operations either succeed, or all of them fail. When the transaction
fails, the state of the object will remain unaffected by the transaction.
• Consistency – A transaction establishes a valid (predefined) state transition, i.e.,
corruption or errors in your data do not create unintended consequences for the
integrity of your table.
• Isolation – Concurrent transactions do not interfere with each other. Each
transaction occurs as though the transactions were executed one by one.
• Durability – After the execution of a transaction, its effects are made permanent,
i.e., changes to the state survive failures.
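As an illustration, here is a minimal sketch of the primitives using Python's standard sqlite3 module (the account table and amounts are hypothetical); commit() plays the role of END_TRANSACTION and rollback() of ABORT_TRANSACTION:

import sqlite3

conn = sqlite3.connect(':memory:')  # throwaway in-memory database
conn.execute('CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)')
conn.execute('INSERT INTO account VALUES (1, 1000)')
conn.commit()

try:
    # BEGIN_TRANSACTION is implicit on the first modifying statement
    conn.execute('UPDATE account SET balance = balance - 1500 WHERE id = 1')
    (balance,) = conn.execute('SELECT balance FROM account WHERE id = 1').fetchone()
    if balance < 0:
        raise ValueError('insufficient funds')
    conn.commit()    # END_TRANSACTION: try to commit
except ValueError:
    conn.rollback()  # ABORT_TRANSACTION: restore the old values

(balance,) = conn.execute('SELECT balance FROM account WHERE id = 1').fetchone()
print(balance)  # 1000: the aborted transfer left no trace (atomicity)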
19
Distributed pervasive systems
Classification

Emerging next-generation of distributed systems in which nodes are small, mobile, and
often embedded in a larger system, characterized by the fact that the system naturally
blends into the user’s environment.

Pervasive systems
Types (with overlapping characteristics):
• Ubiquitous computing systems
• Mobile computing systems
• Sensor networks

20
Ubiquitous computing systems
Pervasive systems

Pervasive and continuously present, i.e., the user continuously interacts with the
system.
Core requirements
• Distribution – Devices are networked, distributed, and accessible transparently
(hidden from view)
• Interaction – Interaction between users and devices is highly unobtrusive (implicit)
• Context awareness – The system is aware of a user’s context to optimize interaction
• Autonomy – Devices operate autonomously without human intervention, and are
thus highly self-managed
• Intelligence – The system as a whole can handle a wide range of dynamic actions and
interactions (AI)

21
Mobile computing systems
Pervasive systems

Pervasive, but emphasis is on the fact that devices are inherently mobile.

Mobile computing systems


Characteristics of mobile computing systems:
• Wireless communication – smart phones, remote controls, pagers, active badges, car
equipment, various GPS-enabled devices, and so on.
• Mobile – Change location over time.
• MEC – supported by Mobile Edge Computing

22
Sensor networks
Pervasive systems

Pervasive, with emphasis on the actual (collaborative) sensing and actuation of the
environment.

Sensor networks
Characteristics of sensor networks:
• 10s - 1000s of small nodes, each equipped with one or more sensing devices.
• Wireless and often battery powered
• Limited resources, i.e., small memory/compute/communication capacity, which helps
keep power consumption low.

23
Sensor networks as distributed system
Pervasive systems

Sensor networks

24
Pitfalls
1 Introduction

Developing a distributed system is a formidable task.

False assumptions
• The network is reliable
• The network is secure
• The network is homogeneous
• The topology does not change
• Bandwidth is infinite
• Transport cost is zero
• There is one administrator

25
Summary
1 Introduction

• The goal of a distributed system is to spread processes and resources across different
computers because it is sufficient, not necessary, to do so (unlike decentralized systems).

• Design goals for distributed systems include sharing resources, ensuring openness,
distribution transparency, and scalability.

• Different types of distributed systems exist which can be classified as being oriented
toward supporting computations, information processing, and pervasiveness.

26
Distributed Systems
2 Processes

▶ Introduction

▶ Processes

▶ Communication

▶ Coordination

▶ Distributed programming

27
Processes
Distributed systems

• Threads
• Virtualization
• Clients and Servers
• Code Migration
Threads
2 Processes

We build virtual processors in software, on top of physical processors:

Basic concepts
• Processor – Provides a set of instructions along with the capability of automatically
executing a series of those instructions.
• Thread – A minimal software processor in whose context a series of instructions can
be executed. Saving a thread context implies stopping the current execution and
saving all the data needed to continue the execution at a later stage.
• Process – A software processor in whose context one or more threads may be
executed. Executing a thread means executing a series of instructions in the context
of that thread.

29
Context switching
Threads

Context switching
• Threads share the same address space. Thread context switching can be done
entirely independent of the operating system.

• Creating and destroying threads is much cheaper than doing so for processes.

• Process switching is generally more expensive as it involves getting the OS in the


loop, i.e., trapping to the kernel.

30
Context switching
Threads

Context switching
• Threads use the same address space: more prone to errors
• No support from OS/HW to protect threads from using each other’s memory
• Thread context switching may be faster than process context switching

31
Python example
2 Processes
from multiprocessing import Process
from time import *
from random import *

def sleeper(name):
    t = gmtime()
    s = randint(1, 20)
    txt = str(t.tm_min) + ':' + str(t.tm_sec) + ' ' + name + ' is going to sleep for ' + str(s) + ' seconds'
    print(txt)

    sleep(s)
    t = gmtime()
    txt = str(t.tm_min) + ':' + str(t.tm_sec) + ' ' + name + ' has woken up'
    print(txt)

if __name__ == '__main__':
    p = Process(target=sleeper, args=('eve',))
    q = Process(target=sleeper, args=('bob',))
    p.start(); q.start()
    p.join(); q.join()

Output:
46:9 eve is going to sleep for 4 seconds
46:9 bob is going to sleep for 13 seconds
46:13 eve has woken up
46:22 bob has woken up

32
Python example
Threads

from multiprocessing import Process
from threading import Thread
from time import *
from random import *

shared_x = randint(10, 99)

def sleeping(name):
    global shared_x
    s = randint(1, 20)
    t = gmtime()
    txt = str(t.tm_min) + ':' + str(t.tm_sec) + ' ' + name + ' is going to sleep for ' + str(s) + ' seconds'
    print(txt)

    sleep(s)
    shared_x = shared_x + 1

    t = gmtime()
    txt = str(t.tm_min) + ':' + str(t.tm_sec) + ' ' + name + ' has woken up, seeing shared x being ' + str(shared_x)
    print(txt)

33
Python example
2 Processes

def sleeper(name):
    print(name + ' sees shared x being ' + str(shared_x))
    sleeplist = list()
    for i in range(3):
        subsleeper = Thread(target=sleeping, args=(name + ' ' + str(i),))
        sleeplist.append(subsleeper)

    for s in sleeplist: s.start()
    for s in sleeplist: s.join()

    print(name, 'sees shared x being', shared_x)

Output:
eve sees shared x being 68
bob sees shared x being 68
48:14 eve 0 is going to sleep for 19 seconds
48:14 eve 1 is going to sleep for 10 seconds
48:14 eve 2 is going to sleep for 11 seconds
48:14 bob 0 is going to sleep for 18 seconds
48:14 bob 1 is going to sleep for 11 seconds
48:14 bob 2 is going to sleep for 5 seconds
48:19 bob 2 has woken up, seeing shared x being 69
48:24 eve 1 has woken up, seeing shared x being 69
48:25 bob 1 has woken up, seeing shared x being 70
48:25 eve 2 has woken up, seeing shared x being 70
48:32 bob 0 has woken up, seeing shared x being 71
bob sees shared x being 71
48:33 eve 0 has woken up, seeing shared x being 71
eve sees shared x being 71

34
Threads and operating systems
Threads

Should an OS kernel provide threads, or should they be implemented as user-level
packages?
User-space solution
• All operations can be completely handled within a single process ⇒ implementations
can be extremely efficient.
• All services provided by the kernel are done on behalf of the process in which a
thread resides ⇒ if the kernel decides to block a thread, the entire process will be
blocked.

35
Threads and operating systems
Threads

Kernel solution:
The whole idea is to have the kernel contain the implementation of a thread package,
i.e., thread operation (creation, deletion, synchronization, etc.) require a system call.

Kernel solution
• Operations that block a thread are no longer a problem: the kernel schedules
another available thread within the same process.
• The problem is the loss of efficiency because each thread operation requires a trap
to the kernel (context switching becomes expensive).

36
Threads and Distributed Systems
Threads

Multithreaded Web client


Hiding network latencies:
• Web browser scans an incoming HTML page, and finds that more files need to be
fetched.
• Each file is fetched by a separate thread, each doing a (blocking) HTTP request.
• As files come in, the browser displays them.

Multiple request-response calls to other machines


• A client does several calls at the same time, each one by a different thread.
• It then waits until all results have been returned.
• Note: if calls are to different servers, we may have a linear speed-up.
37
Threads and Distributed Systems
Threads

Improve performance
• Having a single-threaded server prohibits simple scale-up to a multiprocessor
system.
• As with clients: hide network latency by reacting to next request while previous one
is being replied.
• Starting a thread is much cheaper than starting a new process.

38
Threads and Distributed Systems
Threads

Why multithreading is popular: dispatcher/worker model
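A minimal sketch of the dispatcher/worker model using only Python's standard threading and queue modules; the request strings and pool size are placeholders:

import queue
import threading

requests = queue.Queue()

def worker(wid):
    # each worker blocks on the shared queue and handles one request at a time
    while True:
        req = requests.get()
        if req is None:  # sentinel: no more work
            break
        print('worker', wid, 'handling', req)

# dispatcher: hand incoming requests to the worker pool via the queue
workers = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for w in workers:
    w.start()
for req in ['req-1', 'req-2', 'req-3', 'req-4']:
    requests.put(req)
for _ in workers:
    requests.put(None)  # stop the workers once the queue drains
for w in workers:
    w.join()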

39
Virtualization
2 Processes

Virtualization is becoming increasingly important.


Virtualization
• Hardware changes faster than software
• Ease of portability and code migration (heterogeneous environments)
• Isolation of failing or attacked components
Principle: mimic the behavior of another system

40
Mimicking interfaces
Virtualization

Mimicking interfaces
Four types of interfaces at three different levels:
1. Instruction set architecture: the set of machine instructions, with two subsets:
— Privileged instructions: allowed to be executed only by the operating system.
— General instructions: can be executed by any program.
2. System calls as offered by an operating system.
3. Library calls, known as an application programming interface (API)
41
Ways of virtualization
Virtualization

Ways of virtualization
1. Process VM – Instructions can be interpreted (as is the case for the Java runtime
environment).
2. Native VMM – mimics the instruction set of directly on the hardware ⇒ a complete
operating system and its applications can be supported
3. Hosted VMM – Low-level instructions, but delegating most work to a full-fledged OS
(Example: VMware, VirtualBox).
42
Containers
Virtualization

Applications rely on specific libraries and other support software.

Containers
• Reduced instance of virtualization
• A container holds only the necessary OS
components (binaries/images/libraries) that are
needed for that specific application to run.
• Virtualizes the software environment for an
application.
• Applications and processes operating in different
containers need to be isolated from each other

43
Client-server
2 Processes

Interaction
• Application-level – A networked application with its own protocol. Example: calendar
synchronization.
• Direct remote access – A general solution to allow access to remote applications, i.e.,
the server provides a convenient user interface.

44
Client-side software
Client-server

Generally tailored for distribution transparency


• Access transparency – client stub provides the same interface, hiding differences in
machine architectures and communication
• Location/migration transparency – client can be informed when the server changes
location (to rebind server), but can hide the information
• Replication transparency – collect all responses and pass a single response.

• Failure transparency – client can mask server and communication failures. Example:
repeatedly attempt to connect to a server, or try another server
45
Servers
Client-server

General organization
A process implementing a specific service on behalf of a collection of clients. It waits
for an incoming request from a client and subsequently ensures that the request is
taken care of, after which it waits for the next incoming request.

Two basic types


• Iterative server – Server handles the request before attending to the next request.
• Concurrent server – Uses a dispatcher, which picks up an incoming request that is
then passed on to a separate thread/process.

46
Out-of-band communication
Client-server

Is it possible to interrupt a server once it has accepted (or is in the process of accepting) a
service request?
Solution 1: Use a separate port for urgent data
• Server has a separate thread/process for urgent messages
• Urgent message comes in ⇒ associated request is put on hold
• Note: this requires the OS to support priority-based scheduling

Solution 2: Use facilities of the transport layer


• Example: TCP allows for urgent messages in same connection
• Urgent messages can be caught using OS signaling techniques

47
Servers and state
Client-server

Stateless servers
• Do not keep information about the status of a client after having handled a request.
• No disruption of the service offered by the server if information is lost.
• Soft state: maintain state on behalf of the client, but only for a limited time.

Stateful servers
• Keeps track of the status of its clients
• Record that a file has been opened, so that prefetching can be done
• Knows which data a client has cached, and allows clients to keep local copies of
shared data ((client, file) table).

The performance of stateful servers can be extremely high, provided clients are allowed
to keep local copies.
48
Server clusters
Client-server

Note: The first tier is generally responsible for passing requests to an appropriate server:
request dispatching.

49
Request Handling
Client-server

TCP handoff
Having the first tier handle all communication from/to the cluster may lead to a
bottleneck.

50
Code Migration
2 Processes

Reasons to migrate code:


• Performance
• Flexibility
• Privacy and security

Performance
• Ensuring that servers in a data center are sufficiently loaded (e.g., to prevent waste
of energy) ⇒ Load distribution/algorithms
• Minimizing communication by ensuring that computations are close to where the
data is (think of MEC).

51
Flexibility
Code Migration

Flexibility
• Moving code to a client when needed (design flexibility)
• dynamically moving code requires a protocol for downloading and initializing.
• downloaded code should be executable on the client’s machine.
Avoids pre-installing software and increases dynamic configuration.

52
Privacy and security
Code Migration

Example: federated machine learning

Privacy and security


• One cannot move data to another location, for whatever reason (often legal ones) ⇒
move the code to the data

53
Models for code migration
Code Migration

54
Strong and weak mobility
Code Migration

Weak mobility
Move only code and data segment (a transferred program is always started anew):
• Relatively simple, especially if code is portable
• Distinguish code shipping (push) from code fetching (pull)

Strong mobility
Move component, including execution state (execution segment can be transferred)
• Migration – move entire object from one machine to the other
• Cloning – start a clone, and set it in the same execution state. Example: fork()

55
Heterogeneous systems
Code Migration

Main problem
• The target machine may not be suitable to execute the migrated code
• The definition of process/thread/processor context is highly dependent on local
hardware, operating system and runtime system

Solution
Migrate not only processes, but entire computing environments.
• Interpreted languages, effectively having their own VM
• Virtual machine monitors (think of virtual machines)

56
Distributed Systems
3 Communication

▶ Introduction

▶ Processes

▶ Communication

▶ Coordination

▶ Distributed programming

57
Communication
Distributed systems

• Remote procedure call
• Message-oriented communication
• Multicast communication
Layered protocols
3 Communication

Open Systems Interconnection (OSI) Reference Model

59
Top layers
Layered protocols

Standard Internet protocols


• Application layer – Essentially, everything else: e-mail protocols, Web access
protocols, file-transfer protocols, and so on. It enables user interaction.
• Presentation layer – Prescribes how data is represented in a way that is independent
of the hosts on which communicating applications are running. It translates data to
the app. layer.
• Session layer – Provides support for sessions between applications, i.e., coordinates
conversations between applications.

60
Transport layers
Layered protocols

The transport layer provides the actual communication facilities for most distributed
systems. It determines how data should be delivered.
Standard Internet protocols
• TCP – connection-oriented, reliable, stream-oriented communication
• UDP – unreliable (best-effort) datagram communication

61
Low-level layers
Layered protocols

Recap
• Physical layer – contains the specification and implementation of bits, and their
transmission between sender and receiver.
• Data link layer – prescribes the transmission of a series of bits into a frame to allow
for error and flow control
• Network layer – describes how packets in a network of computers are to be routed
(i.e., path determination).

62
Types of communication
3 Communication

Distinguish
• Transient versus persistent communication
• Asynchronous versus synchronous communication

63
Types of communication
3 Communication

Transient versus persistent


• Transient communication – a message is stored by the communication system only
as long as the sending and receiving application are executing.
• Persistent communication – a message is stored by the communication middleware
as long as it takes to deliver it to the receiver.
64
Types of communication
3 Communication

Asynchronous versus Synchronous


• Asynchronous communication – the sender continues immediately after it has
submitted its message for transmission.
• Synchronous communication – the sender is blocked until its request is known to be
accepted.
65
Types of communication
3 Communication

Places for synchronization


• At request submission
• At request delivery
• After request processing
66
Client/Server
Types of communication

Client/Server computing is generally based on a model of transient synchronous
communication:
• Client and server have to be active at time of communication.
• Client issues request and blocks until it receives reply.
• Server essentially waits only for incoming requests, and subsequently processes
them.

67
Messaging
Types of communication

Message-oriented middleware
Aims at high-level persistent asynchronous communication:
• Processes send each other messages, which are queued
• Sender need not wait for immediate reply, but can do other things

68
Remote Procedure Call (RPC)
3 Communication

• Basic RPC operation


• Parameter passing
• RPC variations

69
Basic RPC operations
Remote Procedure Call
• Application developers are familiar with simple procedure model
• Well-engineered procedures operate in isolation (black box)
• Client and server stubs pack the parameters into a message and request that
message to be sent

Communication between caller and callee can be hidden by using the procedure-call
mechanism.
70
Basic operations
Remote Procedure Call

1. Client procedure calls client stub.
2. Stub builds message; calls local OS.
3. OS sends message to remote OS.
4. Remote OS gives message to stub.
5. Stub unpacks parameters and calls server.
6. Server makes local call and returns result to stub.
7. Stub builds message; calls OS.
8. OS sends message to client’s OS.
9. Client’s OS gives message to stub.
10. Client stub unpacks result and returns to the client.
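These ten steps can be observed with Python's standard xmlrpc package, where both stubs are generated for you; a sketch, assuming the server runs on localhost port 8000:

# server side: register an ordinary procedure; the library acts as the server stub
from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):
    return a + b

server = SimpleXMLRPCServer(('localhost', 8000))
server.register_function(add)
server.serve_forever()

# client side: the proxy object is the client stub; it marshals the arguments,
# sends the message, and unpacks the result (steps 1-3 and 9-10 above)
import xmlrpc.client

proxy = xmlrpc.client.ServerProxy('http://localhost:8000/')
print(proxy.add(2, 3))  # looks like a local call, executes remotely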

71
Parameter passing
Remote Procedure Call

Parameter marshaling
There’s more than just wrapping parameters into a message:
• Client and server need to properly interpret messages, transforming them into
machine-dependent representations. (e.g., ordering)
• Client and server have to agree on the same encoding:
— How are basic data values represented (integers, floats, characters) and
— How are complex data values represented (arrays)

Stub generation
Define protocols (message formats, data representations, delivery method)

72
Parameter passing
Remote Procedure Call

Pointers and references


• Different address space between server and client

Support
• Forbid pointers and reference parameters
• Copy the entire data structure to which the parameter is referring, effectively
replacing the copy-by-reference mechanism by copy-by-value
• Global references: client and the server have access to the same file system

73
Asynchronous RPCs
Remote Procedure Call

Essence
Try to get rid of the strict request-reply behavior, but let the client continue without
waiting for an answer from the server.
• No result to return
• Multiple RPCs need to be performed

74
Sending out multiple RPCs
Remote Procedure Call
Sending an RPC request to a group of servers.

Consideration
• Client may be unaware of the existence of multiple servers. Example: a fault-tolerant
system using a multicast address
• Should the client proceed after one response or after all responses have been received?
75
Message-Oriented Communication
3 Communication

• Transient Messaging
• Message-Queuing System
• Message Brokers

76
Transient messaging: sockets
Message-Oriented Communication

77
Queue-based messaging
Message-Oriented Communication
Four possible combinations:

78
Message-oriented middleware
Message-Oriented Communication

Essence
Asynchronous persistent communication through support of middleware-level queues.
Queues correspond to buffers at communication servers.

79
General model
Message-Oriented Communication

Queue managers
Queues are managed by queue managers. An application can put messages only into a
local queue. Getting a message is possible by extracting it from a local queue only ⇒
queue managers need to route messages.

Routing – special queue managers that forward incoming messages to other queue
managers.

80
Message broker
Message-Oriented Communication

Message queuing systems assume a common messaging protocol: all applications agree
on message format (i.e., structure and data representation)
Message broker
Centralized component that takes care of application heterogeneity in an MQ system:
• Transforms incoming messages to target format
• Very often acts as an application gateway

81
Message broker
Message-Oriented Communication

General architecture

82
Application-level multicasting
Multicast communication
Organize nodes of a distributed system into an overlay network and use that network to
disseminate data.

• Link stress – defined per link; counts how often a packet crosses the same
physical link. Example: a message from A to D needs to cross 〈Ra, Rb〉 twice.
• Stretch – ratio in delay between the ALM-level path and the network-level path.
Example: messages from B to C follow a path of length 73 at the ALM level, but 47 at
the network level ⇒ stretch = 73/47.
83
Flooding
Multicast communication

Rather than broadcasting, multicasting minimizes the use of intermediate nodes for which
the message is not intended.
• Construct an overlay network per multicast group.
— A node needs to maintain a separate list of neighbors for each group it belongs to.

Flooding
For an overlay corresponding to a multicast group, we need to broadcast a message.
• P simply sends a message m to each of its neighbors. Each neighbor will forward that
message, except to P, and only if it had not seen m before.

84
Flooding performance
Multicast communication

Overlay network:
• G = (V, E)
• For an undirected graph,
Mtot = δ(v0) + Σv∈V∖{v0} (δ(v) − 1) ≈ 2|E| − |V| + 1, where δ(v) is the number of
neighbors of node v.

Performance
For a fully connected graph:
• We have |E| = |V|(|V| − 1)/2, leading to an order of |V|² messages ⇒ O(N²)
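A small sanity check of the message count, assuming a hypothetical 5-node overlay given as adjacency lists; flood() simulates the rule "on first receipt, forward to every neighbor except the sender" and counts the messages sent:

# overlay as adjacency lists (undirected)
G = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 4], 3: [1, 4], 4: [2, 3]}

def flood(G, v0):
    seen = set()
    msgs = 0
    frontier = [(v0, None)]  # (node, sender) pairs
    while frontier:
        node, sender = frontier.pop()
        if node in seen:
            continue  # node already saw m: it does not forward again
        seen.add(node)
        for nb in G[node]:
            if nb != sender:  # forward to all neighbors except the sender
                msgs += 1
                frontier.append((nb, node))
    return msgs

E = sum(len(nbs) for nbs in G.values()) // 2
print(flood(G, 0), 2 * E - len(G) + 1)  # both print 8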

85
Probabilistic approach
Multicast communication
Assumption: no information on the structure of the overlay network.
• Random graph representation – a graph in which any two vertices are joined by an
edge with probability pedge.
• |E| = pedge · |V| · (|V| − 1)/2

• Probabilistic flooding: forward a message with probability pflood:
— the total number of messages sent will drop linearly in pflood
86
Epidemic protocols
Multicast communication

Infected node spreads message/exchanges update to/with other (susceptible) nodes.


Two forms of epidemics
• Anti-entropy – Each replica regularly chooses another replica at random, and
exchanges state differences, leading to identical states
• Gossiping – A replica which has just been updated (i.e., has been contaminated), tells
a number of other replicas about its update (contaminating them as well).

87
Anti-entropy
Multicast communication

Principle operations
A node P selects another node Q from the system at random.
• Push – P only pushes its own updates to Q
• Pull – P only pulls in new updates from Q
• Push-pull – P and Q send updates to each other

For push-pull it takes O(log(N)) rounds to disseminate updates to all N nodes (a round ⇒
every node has taken the initiative to start an exchange).
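A toy simulation of push-pull anti-entropy, assuming every node contacts one random peer per round; for N = 1024 it typically reports on the order of log2(N) rounds:

import random

N = 1024
updated = {0}  # initially only node 0 has the update

rounds = 0
while len(updated) < N:
    rounds += 1
    for p in range(N):  # every node takes the initiative once per round
        q = random.randrange(N)
        # push-pull: after the exchange, both hold the update if either did
        if p in updated or q in updated:
            updated.update({p, q})
print(rounds, 'rounds until all', N, 'nodes are updated')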

88
Gossip-based data dissemination
Multicast communication

Principle operations
A server S that has an update to report contacts other servers. If it contacts a server to
which the update has already propagated, S stops contacting other servers with
probability pstop.

If s is the fraction of ignorant servers (i.e., servers unaware of the update), it can be
shown that with many servers
s = e^(−(1/pstop + 1)(1 − s))
Note: it cannot guarantee that all nodes will actually be updated.
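The residual fraction s can be computed numerically by fixed-point iteration on the equation above; a short sketch:

import math

def ignorant_fraction(p_stop, iterations=1000):
    s = 0.5  # any starting guess in (0, 1)
    for _ in range(iterations):
        s = math.exp(-(1 / p_stop + 1) * (1 - s))
    return s

for p_stop in (1.0, 0.5, 0.25, 0.1):
    print(p_stop, ignorant_fraction(p_stop))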

89
Effect of stopping
Multicast communication

90
Distributed Systems
4 Coordination

▶ Introduction

▶ Processes

▶ Communication

▶ Coordination

▶ Distributed programming

91
Coordination
Distributed systems

• Logical clocks
• Mutual exclusion
• Election algorithm
Clock synchronization
4 Coordination
Centralized systems – time is unambiguous: the system clock keeps time and all entities
use it
Distributed systems
Each node has its own clock
• Problem: an event occurring after another may be assigned an earlier timestamp.
Example: make

93
Physical clock
4 Coordination

How to tell time?


• Use astronomical metrics (solar day)
— Accurate clocks are atomic oscillators
— Coordinated Universal Time (UTC) — international standard based on atomic time
◦ Add leap seconds to be consistent with astronomical time
◦ UTC broadcast on radio (satellite and earth)
— Most clocks are less accurate (e.g., mechanical watches)
◦ Computers use crystal-based clocks (accurate to about one part in a million)
◦ which results in clock drift
— Need to synchronize machines with each other

94
Clock synchronization
4 Coordination

Clock drift
• Each clock has a maximum drift rate ρ
• Two clocks may drift apart at a rate of up to 2ρ
• To limit the skew to δ
— re-synchronize every δ/(2ρ)

95
Network Time Protocol
4 Coordination

When contacting the server, message delays will have outdated the reported time.
Estimation
• Offset θ (assumption: δTreq = T2 − T1 ≈ T4 − T3 = δTres)
— θ = T3 + ((T2 − T1) + (T3 − T4))/2 − T4 = ((T2 − T1) + (T3 − T4))/2
◦ if θ < 0, the clock is set backward
• Delay δ
— δ = ((T4 − T1) − (T3 − T2))/2
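A worked example with hypothetical timestamps T1 = 1000.0, T2 = 1000.6, T3 = 1000.7, T4 = 1001.1 (seconds):

T1, T2, T3, T4 = 1000.0, 1000.6, 1000.7, 1001.1  # hypothetical timestamps

theta = ((T2 - T1) + (T3 - T4)) / 2  # offset: (0.6 + (-0.4)) / 2 = 0.1 s
delta = ((T4 - T1) - (T3 - T2)) / 2  # delay:  (1.1 - 0.1) / 2 = 0.5 s
print(theta, delta)  # the client advances its clock by about 0.1 s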

96
Lamport’s logical clocks
4 Coordination

What usually matters is not that all processes agree on exactly what time it is, but that
they agree on the order in which events occur. Requires a notion of ordering.
The happened-before relation
• If a and b are two events in the same process, and a comes before b, then a → b.
• If a is the sending of a message, and b is the receipt of that message, then a → b.
• If a → b and b → c, then a → c.
This introduces a partial ordering of events in a system with concurrently operating
processes.

97
Logical clocks
4 Coordination

How do we maintain a global view of the system’s behavior that is consistent with the
happened-before relation?
The notion of time
Attach a timestamp C(e) to each event e, satisfying the following properties:
P1 If a and b are two events in the same process, and a → b, then we demand that
C(a) < C(b).
P2 If a corresponds to sending a message m, and b to the receipt of that message, then
also C(a) < C(b).
Problem: How to attach a timestamp to an event when there’s no global clock ⇒
maintain a consistent set of logical clocks, one per process.

98
Logical clocks
4 Coordination

Solution
Each process Pi maintains a local counter Ci and adjusts this counter
1. For each new event that takes place within Pi , Ci is incremented by 1.
2. Each time a message m is sent by process Pi , the message receives a timestamp
ts(m) = Ci.
3. Whenever a message m is received by a process Pj , Pj adjusts its local counter Cj
to max{Cj , ts(m)} + 1.
Notes:
• Property P1 is satisfied by (1); Property P2 by (2) and (3).
• It can still occur that two events happen at the same time. Avoid this by breaking ties
through process IDs.
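A minimal sketch of the three rules as a Python class; the two-process scenario at the end is hypothetical:

class LamportClock:
    def __init__(self):
        self.c = 0

    def local_event(self):  # rule 1: any new local event increments the counter
        self.c += 1
        return self.c

    def send(self):  # rule 2: sending is an event; the message carries ts(m) = Ci
        self.c += 1
        return self.c

    def receive(self, ts_m):  # rule 3: adjust to the maximum, then increment
        self.c = max(self.c, ts_m) + 1
        return self.c

p, q = LamportClock(), LamportClock()
ts = p.send()         # P sends m with ts(m) = 1
print(q.receive(ts))  # Q adjusts: max(0, 1) + 1 = 2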

99
Example
Logical clocks

Consider three processes with event counters operating at different rates

100
Implementation
Logical clocks

Note
Adjustments take place in the middleware layer

101
Example: Totally ordered multicast
Logical clocks

Problem
Concurrent updates on a replicated database are seen in the same order everywhere
• P1 adds $100 to an account (initial value: $1000)
• P2 increments account by 1%
• There are two replicas

Result
In absence of proper synchronization: replica #1 ← $1111, while replica #2 ← $1110
(propagation delay).
102
Example: Totally ordered multicast
Logical clocks

Solution
• Process Pi sends timestamped message mi to all others. The message itself is put in
a local queue Qi .
• Any incoming message at Pj is queued in Qj , according to its timestamp, and
acknowledged to every other process.

103
Vector clocks
4 Coordination

Observation
Lamport’s clocks do not guarantee that if C(a) < C(b), then a causally preceded b.

Obs.
Event a: m1 is received at T = 16;
Event b: m2 is sent at T = 20.

Note
We cannot conclude that a causally precedes b.

104
Vector clocks
4 Coordination

Definition
We say that b may causally depend on a if ts(a) < ts(b), with:
• for all k, ts(a)[k] ≤ ts(b)[k] and
• there exists at least one index k′ for which ts(a)[k′] < ts(b)[k′]

Precedence vs. dependency


• We say that a causally precedes b.
• b may causally depend on a, as there may be information from a that is propagated
into b.

105
Capturing potential causality
Vector clocks

Solution
Each Pi maintains a vector VCi
• VCi[i] is the local logical clock at process Pi.
• If VCi[j] = k then Pi knows that k events have occurred at Pj.

Maintaining vector clocks

1. Before executing an event, Pi executes VCi[i] ← VCi[i] + 1.
2. When process Pi sends a message m to Pj, it sets m’s (vector) timestamp ts(m)
equal to VCi after having executed step 1.
3. Upon the receipt of a message m, process Pj sets
VCj[k] ← max{VCj[k], ts(m)[k]} for each k, after which it executes step 1 and
then delivers the message to the application.
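A sketch of the three steps as a Python class, assuming the number of processes n is fixed and processes are indexed 0..n−1:

class VectorClock:
    def __init__(self, i, n):
        self.i = i          # this process's index
        self.vc = [0] * n

    def event(self):        # step 1: local event
        self.vc[self.i] += 1

    def send(self):         # step 2: execute step 1, then stamp the message
        self.event()
        return list(self.vc)  # ts(m) is a copy of VCi

    def receive(self, ts_m):  # step 3: component-wise max, then step 1
        self.vc = [max(a, b) for a, b in zip(self.vc, ts_m)]
        self.event()

p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
m = p0.send()  # ts(m) = [1, 0]
p1.receive(m)
print(p1.vc)   # [1, 1]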
107
Example
Vector clocks

108
Causally ordered multicasting
4 Coordination

We can now ensure that a message is delivered only if all causally preceding messages
have already been delivered.
Adjustment
Pi increments VCi[i] only when sending a message, and Pj "adjusts" VCj when
receiving a message (i.e., effectively does not change VCj[j]).

Pj postpones delivery of m until:

• ts(m)[i] = VCj[i] + 1
• ts(m)[k] ≤ VCj[k] for all k ≠ i
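The two conditions translate directly into a small predicate; a sketch where ts_m is the vector timestamp of m sent by Pi and vc_j is Pj's current vector:

def can_deliver(ts_m, vc_j, i):
    # m from Pi is deliverable at Pj iff it is the next message expected from Pi
    # and Pj has already seen everything that m causally depends on
    if ts_m[i] != vc_j[i] + 1:
        return False
    return all(ts_m[k] <= vc_j[k] for k in range(len(ts_m)) if k != i)

print(can_deliver([1, 1, 0], [1, 0, 0], 1))  # True: next from P1, no missing deps
print(can_deliver([1, 1, 1], [1, 0, 0], 1))  # False: depends on an unseen event at P2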

109
Causally ordered multicasting
4 Coordination

Enforcing causal communication

110
Mutual exclusion
4 Coordination

A number of processes in a distributed system want exclusive access to some resource.

Basic solutions
• Permission-based – a process wanting to enter its critical region, or access a
resource, needs permission from other process(es).
• Token-based – a token is passed between processes. The one who has the token may
proceed in its critical region, or pass it on when not interested.

111
Centralized
Mutual exclusion

(a) Process P1 asks the coordinator for permission to access a shared resource.
Permission is granted.
(b) Process P2 then asks permission to access the same resource. The coordinator does
not reply.
(c) When P1 releases the resource, it tells the coordinator, which then replies to P2.
Note: re-election during failure.

112
Distributed: Ricart & Agrawala
Mutual exclusion

Principle
The same as Lamport/total ordering except that acknowledgments aren’t sent. Instead,
replies (i.e. grants) are sent only when
• The receiving process has no interest in the shared resource; or
• The receiving process is waiting for the resource, but has lower priority (known
through comparison of timestamps).
In all other cases, reply is deferred, implying some more local administration.

113
Example
Ricart & Agrawala

(a) Two processes want to access a shared resource at the same moment.
(b) P0 has the lowest timestamp, so it wins.
(c) When process P0 is done, it sends an OK also, so P2 can now go ahead.

114
Token ring algorithm
Mutual exclusion

Essence
Organize processes in a logical ring, and let a token be passed between them. The one
that holds the token is allowed to enter the critical region (if it wants to).
An overlay network constructed as a logical ring with a circulating token

115
Election algorithms
4 Coordination

Many algorithms require some process to act as a coordinator. The question is how to
select this special process dynamically.

Note
• In many systems, the coordinator is chosen manually (e.g., file servers). This leads to
centralized solutions ⇒ single point of failure.

116
Election algorithms
4 Coordination

Assumptions
1. All processes have unique IDs
2. All processes know the IDs of all processes in the system (but not whether they are up or down)
3. Election means identifying the process with the highest ID that is up

117
Election by bullying
Election algorithms

Each process has an associated priority (weight). The process with the highest priority
should always be elected as the coordinator.

How do we find the heaviest process?


Consider N processes {P0, ..., PN−1} and let id(Pk) = k. When a process Pk notices
that the coordinator is no longer responding to requests, it initiates an election:
1. Pk sends an ELECTION message to all processes with higher identifiers:
Pk+1, Pk+2, ..., PN−1.
2. If no one responds, Pk wins the election and becomes coordinator.
3. If one of the higher-ups answers, it takes over and Pk ’s job is done.
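A toy, message-free sketch of this logic, assuming the set of processes that are up is known; a real implementation exchanges ELECTION/OK/COORDINATOR messages with timeouts:

def bully_election(k, up, n):
    # process k initiates; returns the id of the elected coordinator
    higher = [p for p in range(k + 1, n) if p in up]  # send ELECTION upward
    if not higher:
        return k  # nobody higher answers: k wins and becomes coordinator
    # a higher process answers and takes over; it restarts the election itself
    return bully_election(min(higher), up, n)

up = {0, 1, 2, 4, 5}             # processes 3, 6, and 7 are down (hypothetical)
print(bully_election(1, up, 8))  # 5: the highest process that is up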

118
Election by bullying
Election algorithms

119
Election in a ring
Election algorithms

Process priority is obtained by organizing processes into a (logical) ring. The process with
the highest priority should be elected as coordinator.
• Any process can start an election by sending an election message to its successor. If a
successor is down, the message is passed on to the next successor.
• If a message is passed on, the sender adds itself to the list. When it gets back to the
initiator, every process has had a chance to make its presence known.
• The initiator sends a coordinator message around the ring containing a list of all
living processes. The one with the highest priority is elected as coordinator.

120
Election in a ring
Election algorithms

• The solid line shows the election messages initiated by P6


• The dashed one, the messages by P3

121
Distributed Systems
5 Distributed programming

▶ Introduction

▶ Processes

▶ Communication

▶ Coordination

▶ Distributed programming

122
Socket programming for distributed systems
5 Distributed programming

Socket
A socket is a virtual endpoint through which entities can perform inter-process
communication. Sockets may connect processes on the same
machine, or processes on different continents. E.g., client/server.

Socket API
A standard API for accessing network services provided by lower layers (4-3-2).

123
TCP Client-server interaction
Sockets

124
Socket operations
Sockets

Operations offered by the socket API


• Specify endpoints (both TCP and UDP)
• TCP
— Open a connection (client-side)
— Wait for a connection (server-side)
— Send/receive data on a connection
— Be notified when data arrive
— Gracefully terminate or abort a connection
• UDP
— Send/receive a datagram

125
Families of socket
Sockets

import socket

# create a TCP socket (SOCK_STREAM)
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print('Socket created')

Families of socket
• AF_INET – IPv4 Internet Protocols (32 bit addresses).
• AF_INET6 – IPv6 Internet Protocols (128 bit addresses).
• AF_UNIX – communication within the same machine

126
Types of Sockets
Sockets

import socket

# create a TCP socket (SOCK_STREAM)
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print('Socket created')

Types of Sockets
• SOCK_STREAM – Reliable, bidirectional flow. Example: TCP.
• SOCK_DGRAM – Unreliable, connectionless datagram flow. Example: UDP.
• SOCK_RAW – Provides access to internal network protocol. Example: ICMP.

127
Assigning address (server)
Sockets

bind(address) – binds a local network address, i.e., a (host_address, port_number)
pair, to a socket.
Parameters
• Host address – for example, 'localhost', '127.0.0.1', or '' (any local address).
• Port number – any available port number. Example: 12345

...
port = 12345  # port number
# Bind the socket to any address on the machine and the given port
s.bind(('', port))
print("socket bound to %s" % (port))
...

128
Request listening (server)
Sockets

listen([backlog]) – listens for connections on a bound socket (IP and port).
Parameter
• The method accepts a queue size through the parameter backlog. This denotes the
maximum number of connections that the operating system can queue for this
socket.

...
s.listen(5)  # maximum queue size of 5
...

129
Accepting request (server)
Sockets

accept() – receives connections to a bound socket, IP and port. Waits until a connection
is received.
...
conn, addr = s.accept()
...

Return values
• The return value is a pair, (conn, addr).
— conn – a new socket object usable to send and receive data on the connection.
— addr – the address bound to the socket on the other end of the connection

130
Sending connection request (client)
Sockets

connect(address) – connects to a given (host_ip, port) pair. The method initiates a
3-way handshake.
Parameters
• Host IP – address of the local or remote server.
• Port – port number of the local or remote host.

...
host_ip = input("Enter target host: ")  # takes the server address from input
port = 12345  # server port number
c.connect((host_ip, port))
...

131
Sending data (TCP)
Sockets

send(bytes) – used to send data from one socket to another socket. The method can
only be used with a connected socket (e.g., TCP).
Parameters
• Bytes – The data to be sent in bytes. In case the data is in string format, the encode()
method of str can be called to convert it into bytes.

...
# Send data to server
data = "Hello Server!"
s.send(data.encode('utf-8'))
...

132
Sending data (UDP)
Sockets

sendto(bytes, address) – is used to send datagrams to a UDP socket.


Parameters
• Bytes – the data to be sent in bytes. In case the data is in string format, the encode()
method of str can be called to convert it into bytes.
• Address – a tuple consisting of IP address and port number.

...
# Send data to server
data = "Hello Server!"
s.sendto(data.encode('utf-8'), ("127.0.0.1", 12345))
...

133
Receiving data (TCP)
Sockets

recv(bufsize) – used to receive data from sockets.

Parameters
• Bufsize – number of bytes to receive.

...
bufferSize = 1024
# Receive incoming data
data = s.recv(bufferSize)
print(data.decode('utf-8'))
...

Return value
• Returns the received data as bytes object.

134
Receiving data (UDP)
Sockets

recvfrom(bufsize) – used to receive data from UDP sockets.

Parameters
• Bufsize – number of bytes to receive.

...
bufferSize = 1024
# Receive incoming datagrams
data, addr = s.recvfrom(bufferSize)
print(data.decode('utf-8'))
...

Return value
• Returns a bytes object read from an UDP socket and the address of the client socket
as a tuple.
135
Closing a connection
Sockets

close() – closes the connection with the host. Once that happens, all future operations
on the socket object will fail. The remote end will receive no more data.

...
# Close the connection with the client
conn.close()
...

136
Exercise: Echo protocol
Sockets

• Implement a simple echo server using TCP and UDP


— The server continuously echoes back the characters received on the connection.
— The connection is closed by the client when it is done.
• Implement a client using TCP and UDP

137
Echo client algorithm
Sockets

1. Read server IP and port number from standard input or use constants
2. Create a TCP socket
3. Connect the socket to the server
4. Repeat
— Read a line of text from the keyboard
— Send the line to the server
— Receive the response and print on the screen/console
Until ’stop’ or ’close’ line is read
5. Close the socket
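A minimal sketch of this client (with the server address as constants rather than read from input):

import socket

HOST, PORT = '127.0.0.1', 12345  # assumed echo server address

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((HOST, PORT))
while True:
    line = input('> ')
    if line in ('stop', 'close'):
        break
    s.send(line.encode('utf-8'))
    print(s.recv(1024).decode('utf-8'))
s.close()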

138
Sequential Echo server algorithm
Sockets

1. Read server port number from standard input or use constants


2. Create a TCP socket
3. Bind the socket to a specific TCP port and any IP address available on the local host
4. Put the socket in passive mode
5. Repeat forever
— Accept a new connection
— Repeat
◦ Read data from the connection
◦ Send the same data back to the client
Until the connection is closed by the client
— Close the connection
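A minimal sketch of this sequential server (port as a constant; one client is served at a time):

import socket

PORT = 12345

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(('', PORT))  # any IP address available on the local host
s.listen(5)         # passive mode
while True:
    conn, addr = s.accept()     # accept a new connection
    while True:
        data = conn.recv(1024)  # empty bytes: connection closed by the client
        if not data:
            break
        conn.send(data)         # echo the same data back
    conn.close()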

139
Echo server on UDP
Sockets

1. Read server port number from standard input or use constants


2. Create a UDP socket
3. Bind the socket to a specific UDP port and any IP address available on the local host
4. Repeat forever
— Read a datagram
— Send the same datagram back to the client

140
Echo client on UDP
Sockets

1. Read the server IP address and port number from standard input or use constants
2. Create a UDP socket
3. Repeat
— Read a line of text from the keyboard
— Send the line to the server
— Receive the response and print on the screen/console
Until ’stop’ or ’close’ line is read
4. Close the socket

141
Exercise
Sockets

• Implement Chat service


— Using TCP
— Using UDP

142
Concurrency
Sockets

• Concurrent TCP server


— Processes
— Threads

• Exercise: Implement concurrent Echo service

143
File handling
Sockets

• Opening
— file_object = open(filename) – default is read
◦ E.g., myfile = open("article.txt", "r")
— File access modes
◦ read only ('r'), write only ('w'), append ('a'), read and write ('r+'), etc.
• Reading
— myfile.read([n]) – read n bytes. If n is not specified it reads entire file.
— myfile.readline([n]) – reads a line of input. If n is specified reads at most n bytes.
— myfile.readlines() – reads all lines and returns them in a list.
• Writing
— myfile.write(data) – writes the string data on the file
— myfile.writelines(L) – writes all the string data from list L line-by-line.
• Closing
— myfile.close()

144
File transfer
Sockets

• TCP File transfer


— Use the os library to check whether the file exists locally (os.path.isfile(file_name))
before trying to send it.
— Notify the file receiver about the size of the file (in bytes) before the transfer. Use
os.path.getsize(file_name).

• Exercise: Implement FTP service

145
FTP server algorithm
Sockets

1. ...
2. Put the socket in passive mode
3. Repeat forever
— Accept a new connection
— Read the file_name from the connection
— Check if the file exists
◦ Get file size
◦ Send file size to the client
◦ Open the file for reading
◦ Repeat
– Send bytes
Until all bytes are transferred
◦ Close the file
— Close the connection

146
FTP client algorithm
Sockets

1. ...
2. Connect the socket to the server
3. Read the file_name from the keyboard
4. Send file_name to the server
5. Receive the file_size from the server
6. Open a new_file for writing
7. Repeat
◦ Receive bytes
Until all bytes are received
8. Close the new_file
9. Close the socket

147
Mini-project
Sockets

1. Implement group chat service.

— The service must support more than two users, chatting at the same time.
— When any user joins the service, he/she should use a username, and the group must be
notified.
— All users must receive the messages (line-by-line) on their screen with the timestamp, the
username and the chat text.
— Users can join in the service using one of the two protocols (TCP and UDP), i.e., the service
should be listening on both TCP and UDP. Note: There can only be one group.

148
Mini-project
Sockets

2. Implement a distributed system that uses a master server to receive client requests and
two computing/storage servers.

— Client operations – store and access data/files remotely


— Master node – receive files from the remote users, and save on computing/storage nodes.
While saving data/files on the two servers, the master node should use load balancing
and/or other classification techniques so that the fetching process can be efficient.
— Communication – the master should support both TCP and UDP services for the client. The
compute/storage nodes support either TCP or UDP but different from each other. Example:
M <–> C1 ⇒ TCP and M <–> C2 ⇒ UDP, or vice versa.

149
