DCC Unit 1 Digital Notes
This document is confidential and intended solely for the educational purpose of
RMK Group of Educational Institutions. If you have received this document
through email in error, please notify the system manager. This document
contains proprietary information and is intended only to the respective group /
learning community as intended. If you are not the addressee you should not
disseminate, distribute or copy through e-mail. Please notify the sender
immediately by e-mail if you have received this document by mistake and delete
this document from your system. If you are not the intended recipient you are
notified that disclosing, copying, distributing or taking any action in reliance on
the contents of this information is strictly prohibited.
22CS401
Distributed and Cloud
Computing
Department: CSE
Batch/Year/Sem: 2023-2027/II/IV
Created by: Dr. M. Vedaraj,
Associate Professor/CSE
Mrs. V. Sharmila,
Assistant Professor/CSE
Mrs. D. Sterlin Rani,
Assistant Professor/CSE
Date: 18-12-2024
Table of Contents
1. Contents
2. Course Objectives
6. CO PO/PSO Mapping
Syllabus
22CS401 DISTRIBUTED AND CLOUD COMPUTING (L 2, T 0, P 2, C 3)
Unit I : INTRODUCTION 6+6
Definition - Relation to computer system components - Message-passing systems versus shared
memory systems - Primitives for distributed communication - Synchronous versus asynchronous
executions. A model of distributed computations: A distributed program - A model of distributed
executions - Models of communication networks - Global state of a distributed system.
List of Exercise/Experiments:
1. Implement a simple distributed program that communicates between two nodes using Java's
RMI (Remote Method Invocation) API.
2. Develop a distributed program that uses Java's messaging API (JMS) to communicate
between nodes. Explore the different messaging paradigms (pub/sub, point-to-point) and
evaluate their performance and scalability.
3. Develop a model of a distributed program using Java's concurrency and synchronization
primitives.
Unit II : LOGICAL TIME, GLOBAL STATE, AND SNAPSHOT ALGORITHMS 6+6
Logical time–Scalar Time–Vector Time-Efficient implementations of vector clocks–Virtual Time.
Global state and snapshot recording algorithms: System model-Snapshot algorithms for FIFO
channels and non-FIFO channels.
List of Exercise/Experiments:
1. Develop a program in Java that implements vector clocks to synchronize the order of events
between nodes in a distributed system.
2. Implement a snapshot algorithm for recording the global state of the distributed system
using vector clocks, for both FIFO and non-FIFO channels. Test the algorithm by recording
snapshots at various points in the system's execution and analyzing the resulting global
state.
Unit III : DISTRIBUTED MUTUAL EXCLUSION ALGORITHMS 6+6
Introduction-Lamport’s algorithm-Ricart–Agrawala algorithm-Quorum-based mutual exclusion
algorithms-Maekawa’s algorithm-Suzuki–Kasami’s broadcast algorithm.
List of Exercise/Experiments:
1. Implement Lamport's algorithm for mutual exclusion in a distributed system using Java's RMI
API.
2. Develop a program in Java that implements Maekawa's algorithm for mutual exclusion in a
distributed system.
3. Implement Suzuki-Kasami's broadcast algorithm in Java to achieve reliable message delivery
in a distributed system.
Unit IV : CLOUD INFRASTRUCTURE AND VIRTUALIZATION 6+6
Data Center Infrastructure and Equipment – Virtual Machines – Containers – Virtual Networks -
Virtual Storage.
List of Exercise/Experiments:
1. Set up a virtualized data center using a hypervisor like VMware or VirtualBox and create
multiple virtual machines (VMs) on it. Configure the VMs with different operating systems,
resources, and network configurations, and test their connectivity and performance.
2. Deploy a containerized application on a virtual machine using Docker or Kubernetes.
Unit V : AUTOMATION AND ORCHESTRATION 6+6
Automation - Orchestration: Automated Replication and Parallelism - The MapReduce Paradigm:
The MapReduce Programming Paradigm – Splitting Input – Parallelism and Data size – Data
access and Data Transmission – Apache Hadoop – Parts of Hadoop – HDFS Components –
Block Replication and Fault Tolerance – HDFS and MapReduce - Microservices.
List of Exercise/Experiments:
1. Set up and configure a single-node Hadoop cluster.
2. Run the word count program in Hadoop.
3. Deploy a microservices architecture using a container orchestration tool like Kubernetes or
Docker Swarm.
Course Outcomes
K Level legend:
K6 - Evaluation
K5 - Synthesis
K4 - Analysis
K3 - Application
K2 - Comprehension
K1 - Knowledge
CO – PO/PSO Mapping Matrix

    | PO1 | PO2 | PO3 | PO4 | PO5 | PO6 | PO7 | PO8 | PO9 | PO10 | PO11 | PO12 | PSO1 | PSO2 | PSO3
CO1 |  3  |  3  |  2  |  1  |  1  |  -  |  -  |  2  |  2  |  2   |  -   |  2   |  3   |  2   |  -
CO2 |  3  |  2  |  2  |  2  |  2  |  -  |  -  |  2  |  2  |  2   |  -   |  2   |  3   |  2   |  -
CO3 |  3  |  3  |  2  |  2  |  2  |  -  |  -  |  2  |  2  |  2   |  -   |  2   |  3   |  2   |  -
CO4 |  3  |  2  |  2  |  2  |  2  |  -  |  -  |  2  |  2  |  2   |  -   |  2   |  3   |  3   |  -
CO5 |  3  |  2  |  2  |  2  |  2  |  -  |  -  |  2  |  2  |  2   |  -   |  2   |  3   |  2   |  -
Lecture Plan - Unit I

No. | Topic | Periods | Date(s) | Mode of Delivery | CO
1 | Definition - Relation to computer system components | 1 | 18-12-2024 | Chalk & Talk | CO1
2 | Message-passing systems versus shared memory systems - Primitives for distributed communication | 3 | 19-12-2024, 20-12-2024, 21-12-2024 | Video Lectures and Practical | CO1
3 | Synchronous versus asynchronous executions | 1 | 23-12-2024 | Chalk & Talk | CO1
4 | A model of distributed computations: A distributed program | 3 | 24-12-2024, 26-12-2024, 27-12-2024 | Video Lectures and Practical | CO1
5 | A model of distributed executions - Models of communication networks | 3 | 28-12-2024, 02-01-2025, 03-01-2025 | PPT & Practical | CO1
6 | Global state of a distributed system | 1 | 04-01-2025 | Chalk & Talk | CO1
Activity Based Learning - Unit I
1. Live assessment using RMK NEXTGEN
Definition
A distributed system can be characterized in several ways:
• A collection of computers that do not share common memory or a common physical clock, that communicate by message passing over a communication network, and where each computer has its own memory and runs its own operating system. Typically the computers are semi-autonomous and are loosely coupled while they cooperate to address a problem collectively.
• A collection of independent computers that appears to the users of the system as a single coherent computer.
• A wide range of computers, from weakly coupled systems such as wide-area networks, to strongly coupled systems such as local area networks, to very strongly coupled systems such as multiprocessor systems.
Features
• No common physical clock - It introduces the element of "distribution" in the system and gives rise to the inherent asynchrony amongst the processors.
• Autonomy and heterogeneity - The processors are "loosely coupled" in that they have different speeds and each can be running a different operating system. They are usually not part of a dedicated system, but cooperate with one another by offering services or solving a problem jointly.
Unit I INTRODUCTION
The figure below shows the relationships of the software components that run on each of
the computers and use the local operating system and network protocol stack for
functioning. The distributed software is also termed as middleware. A distributed execution
is the execution of processes across the distributed system to collaboratively achieve a
common goal. An execution is also sometimes termed a computation or a run. The
distributed system uses a layered architecture to break down the complexity of system
design. The middleware is the distributed software that drives the distributed system, while
providing transparency of heterogeneity at the platform level
The middleware layer does not contain the traditional application layer functions of the
network protocol stack, such as http, mail, ftp, and telnet. Various primitives and calls to
functions defined in various libraries of the middleware layer are embedded in the user
program code
There are several standards such as Object Management Group’s (OMG) Common Object
Request Broker Architecture (CORBA), and the Remote Procedure Call (RPC) mechanism.
The RPC mechanism conceptually works like a local procedure call, with the difference that
the procedure code may reside on a remote machine, and the RPC software sends a
message across the network to invoke the remote procedure. It then awaits a reply, after
which the procedure call completes from the perspective of the program that invoked it.
Currently deployed commercial versions of middleware often use CORBA, DCOM
(Distributed Component Object Model), Java, and RMI (Remote Method Invocation)
technologies. The Message-Passing Interface (MPI) developed in the research community is
an example of an interface for various communication functions.
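As a concrete illustration of the RMI technology mentioned above, the following minimal sketch defines a remote interface, exports an implementation through the RMI registry, and then invokes it as if it were a local call. The class names, binding name, and port are illustrative assumptions, not taken from the source; a real deployment would run the server and client in separate JVMs on separate machines.

    import java.rmi.Remote;
    import java.rmi.RemoteException;
    import java.rmi.registry.LocateRegistry;
    import java.rmi.registry.Registry;
    import java.rmi.server.UnicastRemoteObject;

    // Sketch: a remote method invocation that looks like a local call to the
    // caller, but executes on the machine that exported the object.
    interface Greeting extends Remote {
        String greet(String name) throws RemoteException;
    }

    public class RmiSketch {
        static class GreetingImpl implements Greeting {
            public String greet(String name) { return "Hello, " + name; }
        }

        public static void main(String[] args) throws Exception {
            // Server side: export the implementation and bind it in the registry.
            Greeting stub = (Greeting) UnicastRemoteObject.exportObject(new GreetingImpl(), 0);
            Registry registry = LocateRegistry.createRegistry(1099); // illustrative port
            registry.rebind("greeting", stub);

            // Client side: look up the stub and call it like a local procedure;
            // the call travels over the network and awaits the reply, exactly
            // as the RPC description above says.
            Greeting remote = (Greeting) LocateRegistry.getRegistry("localhost", 1099)
                                                       .lookup("greeting");
            System.out.println(remote.greet("distributed world"));
        }
    }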
Characteristics of parallel systems
1. Multiprocessor systems - The processors are in very close physical proximity and are connected by an interconnection network. Inter-process communication across processors is traditionally through read and write operations on the shared memory, although message-passing primitives may also be supported. All the processors usually run the same operating system, and both the hardware and software are very tightly coupled.
The architecture is shown in Figure 1.3 (a).
2. Multicomputer parallel systems - The processors are usually of the same type, and are housed within the same box/container with a shared memory. The interconnection network to access the memory may be a bus, although it is usually a multistage switch with a symmetric and regular design.
3. Array processors belong to a class of parallel computers that are physically co-located,
are very tightly coupled, and have a common system clock.
Flynn’s taxonomy
• Single instruction stream, single data stream (SISD) This mode corresponds to the
conventional processing in the von Neumann paradigm with a single CPU, and a single
memory unit connected by a system bus.
• Single instruction stream, multiple data stream (SIMD) This mode corresponds to
the processing by multiple homogenous processors which execute in lock-step on different
data items. Applications that involve operations on large arrays and matrices, such as
scientific applications, can best exploit systems that provide the SIMD mode of operation
because the data sets can be partitioned easily.
• Multiple instruction stream, single data stream (MISD) This mode corresponds to the
execution of different operations in parallel on the same data. This is a specialized mode of
operation with limited but niche applications, e.g., visualization.
• Multiple instruction stream, multiple data stream (MIMD) In this mode, the various
processors execute different code on different data. This is the mode of operation in
distributed systems as well as in the vast majority of parallel systems. There is no common
clock among the system processors.
Coupling: The degree of coupling among a set of modules, whether hardware or software, is
measured in terms of the interdependency and binding and/or homogeneity among the
modules. When the degree of coupling is high (low), the modules are said to be tightly
(loosely) coupled. SIMD and MISD architectures generally tend to be tightly coupled.
Parallelism/speedup of a program on a specific system
This is a measure of the relative speedup of a specific program on a given machine. The speedup depends on the number of processors and the mapping of the code to the processors. It is expressed as the ratio of the time T1 with a single processor to the time Tn with n processors.
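Written as a formula, with T1 the completion time on a single processor and Tn the completion time with n processors:

\[ S(n) = \frac{T_1}{T_n} \]

For example, a program that runs in 100 s on one processor and in 25 s on eight processors has speedup S(8) = 100/25 = 4.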
Parallelism within a parallel/distributed program
This is an aggregate measure of the percentage of time that all the processors are executing CPU instructions productively, as opposed to waiting for communication (either via shared memory or message-passing) operations to complete.
Concurrency
Granularity of a program
The ratio of the amount of computation to the amount of communication within the parallel/distributed program is termed its granularity. If the degree of parallelism is coarse-grained (fine-grained), there are relatively many more (fewer) productive CPU instruction executions, compared to the number of times the processors communicate either via shared memory or message passing and wait to get synchronized with the other processors.
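Stated compactly:

\[ \text{granularity} = \frac{\text{amount of computation}}{\text{amount of communication}} \]

Coarse-grained programs have a high value of this ratio; fine-grained programs have a low value.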
Message-passing systems versus shared memory systems
Shared memory systems are those in which there is a (common) shared address space throughout the system. Communication among processors takes place via shared data variables, and control variables for synchronization among the processors. Semaphores and monitors that were originally designed for shared memory uniprocessors and multiprocessors are examples of how synchronization can be achieved in shared memory systems. All multicomputer (NUMA as well as message-passing) systems that do not have a shared address space provided by the underlying architecture and hardware necessarily communicate by message passing. However, the shared memory abstraction can be simulated on top of message passing; for a distributed system, this simulated abstraction is called distributed shared memory.
Emulating message passing on a shared memory system (MP → SM)
The shared address space can be partitioned into disjoint parts, one part being assigned to each processor. "Send" and "receive" operations can be implemented by writing to and reading from the destination/sender processor's address space, respectively. Specifically, a separate location can be reserved as the mailbox for each ordered pair of processes. A Pi–Pj message-passing can be emulated by a write by Pi to the mailbox and then a read by Pj from the mailbox. In the simplest case, these mailboxes can be assumed to have unbounded size. The write and read operations need to be controlled using synchronization primitives to inform the receiver/sender after the data has been sent/received.
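As an illustrative sketch of this mailbox idea (the class and method names below are hypothetical, not from the source), a bounded mailbox for one ordered pair of processes can be modeled in Java with a BlockingQueue, which supplies the synchronization that the write and read operations need:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Sketch: emulating message passing on shared memory (MP -> SM).
    // One mailbox is reserved for the ordered pair (Pi, Pj); "send" is a
    // write into the mailbox and "receive" is a read from it.
    public class Mailbox<T> {
        private final BlockingQueue<T> slot;

        public Mailbox(int capacity) {
            this.slot = new ArrayBlockingQueue<>(capacity);
        }

        // Pi emulates Send() by writing into the mailbox (blocks if full).
        public void send(T message) throws InterruptedException {
            slot.put(message);
        }

        // Pj emulates Receive() by reading from the mailbox (blocks if empty).
        public T receive() throws InterruptedException {
            return slot.take();
        }

        public static void main(String[] args) throws InterruptedException {
            Mailbox<String> piToPj = new Mailbox<>(16);
            Thread pi = new Thread(() -> {
                try { piToPj.send("hello from Pi"); } catch (InterruptedException ignored) {}
            });
            Thread pj = new Thread(() -> {
                try { System.out.println(piToPj.receive()); } catch (InterruptedException ignored) {}
            });
            pi.start(); pj.start();
            pi.join(); pj.join();
        }
    }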
Emulating shared memory on a message-passing system (SM → MP)
This involves the use of "send" and "receive" operations. Each shared location can be modeled as a separate process: a "write" to a shared location is emulated by sending an update message to the corresponding owner process, and a "read" from a shared location is emulated by sending a query message to the owner process. This emulation is expensive: the latencies involved in read and write operations may be high.
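A minimal sketch of this owner-process emulation follows (all names are hypothetical; a real distributed shared memory system would add caching and consistency protocols). A write is a one-way update message; a read is a query message whose reply carries the value back, and the round trip is exactly the latency cost noted above.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.LinkedBlockingQueue;

    // Sketch: emulating shared memory on message passing (SM -> MP).
    // The owner thread holds the "shared" locations; other threads never
    // touch them directly, they only exchange messages with the owner.
    public class OwnerProcess implements Runnable {
        private static final class Request {
            final String location;
            final Integer newValue;                 // null => this is a read
            final CompletableFuture<Integer> reply; // completed for reads
            Request(String location, Integer newValue, CompletableFuture<Integer> reply) {
                this.location = location; this.newValue = newValue; this.reply = reply;
            }
        }

        private final BlockingQueue<Request> inbox = new LinkedBlockingQueue<>();
        private final Map<String, Integer> store = new HashMap<>();

        // "write" is emulated by sending an update message to the owner.
        public void write(String location, int value) {
            inbox.add(new Request(location, value, null));
        }

        // "read" is emulated by sending a query message and awaiting the reply.
        public int read(String location) throws Exception {
            CompletableFuture<Integer> reply = new CompletableFuture<>();
            inbox.add(new Request(location, null, reply));
            return reply.get();
        }

        @Override public void run() {
            try {
                while (true) {
                    Request r = inbox.take();
                    if (r.newValue != null) store.put(r.location, r.newValue);
                    else r.reply.complete(store.getOrDefault(r.location, 0));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        public static void main(String[] args) throws Exception {
            OwnerProcess owner = new OwnerProcess();
            Thread t = new Thread(owner); t.setDaemon(true); t.start();
            owner.write("x", 42);
            System.out.println(owner.read("x")); // prints 42
        }
    }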
Within the multiprocessor system, the processors communicate via shared memory. Between
two computers, the communication is by message passing.
Primitives for distributed communication
Message send and message receive communication primitives are denoted Send() and
Receive(), respectively.
A Send primitive has at least two parameters – the destination, and the buffer in the user
space, containing the data to be sent.
A Receive primitive has at least two parameters – the source from which the data is to be
received (this could be a wildcard), and the user buffer into which the data is to be received.
There are two ways of sending data when the Send primitive is invoked:
• Buffered option
• Unbuffered option.
The buffered option, which is the standard option, copies the data from the user buffer to the kernel buffer. The data later gets copied from the kernel buffer onto the network.
In the unbuffered option, the data gets copied directly from the user buffer onto the
network. For the Receive primitive, the buffered option is usually required because the data
may already have arrived when the primitive is invoked, and needs a storage place in the
kernel.
Blocking primitives A primitive is blocking if control returns to the invoking process after
the processing for the primitive (whether in synchronous or asynchronous mode) completes.
Non-blocking primitives A primitive is non-blocking if control returns to the invoking process immediately after invocation, even though the operation has not completed. For non-blocking primitives, a return parameter on the primitive call returns a system-generated handle which can later be used to check the status of completion of the call. The process can check for the completion of the call in two ways. First, it can keep checking whether the handle has been flagged or posted. Second, it can issue a Wait with a list of handles as parameters. The Wait call usually blocks until one of the parameter handles is posted.
After issuing a primitive in non-blocking mode, the process performs whatever other actions it can; when it eventually needs to know the status of completion of the call, the usual programming practice is to issue a blocking Wait() call.
If at the time that Wait() is issued, the processing for the primitive (whether synchronous or asynchronous) has completed, the Wait returns immediately. The completion of the processing of the primitive is detectable by checking the value of handle_k. If the processing of the primitive has not completed, the Wait blocks and waits for a signal to wake it up. When the processing for the primitive completes, the communication subsystem software sets the value of handle_k and wakes up (signals) any process with a Wait call blocked on this handle_k. This is called posting the completion of the operation.
Figure: A non-blocking send primitive. When the Wait call returns, at least one of its parameters is posted.
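The handle-plus-Wait pattern maps naturally onto futures. The following small sketch uses java.util.concurrent to mimic it (the setup is illustrative, not the primitives from the text): Future plays the role of the system-generated handle, isDone() is the check for whether the handle has been posted, and get() is the blocking Wait.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    // Sketch: a non-blocking operation returning a handle, plus a blocking Wait.
    public class NonBlockingSketch {
        public static void main(String[] args) throws Exception {
            ExecutorService subsystem = Executors.newSingleThreadExecutor();

            // "Non-blocking send": the work is handed off; a handle comes back.
            Future<String> handle = subsystem.submit(() -> {
                Thread.sleep(100);            // stands in for the data transfer
                return "operation complete";  // posting the handle
            });

            // The process does whatever other actions it can here...
            System.out.println("control returned immediately; posted? " + handle.isDone());

            // ...and then issues the blocking Wait on the handle.
            System.out.println(handle.get());
            subsystem.shutdown();
        }
    }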
There are four versions of the Send primitive:
• Synchronous blocking
• Synchronous non-blocking
• Asynchronous blocking
• Asynchronous non-blocking
For the Receive primitive, there are blocking synchronous and non-blocking synchronous
versions. These versions of the primitives are illustrated in Figure.
Three time lines are shown for each process:
(1) for the process execution
(2) for the user buffer from/to which data is sent/received
(3) for the kernel/communication subsystem.
Blocking synchronous Send Figure (a) - The data gets copied from the user buffer to
the kernel buffer and is then sent over the network. After the data is copied to the
receiver’s system buffer and a Receive call has been issued, an acknowledgement back to
the sender causes control to return to the process that invoked the Send operation and
completes the Send.
Non-blocking synchronous Send Figure (b) - Control returns back to the invoking
process as soon as the copy of data from the user buffer to the kernel buffer is initiated. A
parameter in the non-blocking call also gets set with the handle of a location that the user
process can later check for the completion of the synchronous send operation. The location
gets posted after an acknowledgement returns from the receiver, as per the semantics
described for (a). The user process can keep checking for the completion of the non-
blocking synchronous Send by testing the returned handle, or it can invoke the blocking
Wait operation on the returned handle.
Blocking asynchronous Send - Figure (c) The user process that invokes the Send is
blocked until the data is copied from the user’s buffer to the kernel buffer
Non-blocking asynchronous Send - Figure (d) The user process that invokes the Send
is blocked until the transfer of the data from the user’s buffer to the kernel buffer is
initiated. Control returns to the user process as soon as this transfer is initiated, and a
parameter in the non-blocking call also gets set with the handle of a location that the user
process can check later using the Wait operation for the completion of the asynchronous
Send operation. The asynchronous Send completes when the data has been copied out of
the user’s buffer. The checking for the completion may be necessary if the user wants to
reuse the buffer from which the data was sent.
Blocking Receive - Figure (a) The Receive call blocks until the data expected arrives and
is written in the specified user buffer. Then control is returned to the user process.
Non-blocking Receive - Figure (b) The Receive call will cause the kernel to register the
call and return the handle of a location that the user process can later check for the
completion of the non-blocking Receive operation. This location gets posted by the kernel
after the expected data arrives and is copied to the user-specified buffer. The user process
can check for the completion of the non-blocking Receive by invoking the Wait operation on
the returned handle.
Processor synchrony
Processor synchrony indicates that all the processors execute in lock-step with their clocks
synchronized. As this synchrony is not attainable in a distributed system, what is more
generally indicated is that for a large granularity of code, usually termed as a step, the
processors are synchronized. This abstraction is implemented using some form of barrier
synchronization to ensure that no processor begins executing the next step of code until all
the processors have completed executing the previous steps of code assigned to each of
the processors.
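A minimal sketch of barrier synchronization using Java's built-in CyclicBarrier follows; it illustrates the abstraction on threads within one machine and is not code from the source. No thread begins step s+1 until all threads have finished step s, which is exactly the step-level processor synchrony described above.

    import java.util.concurrent.BrokenBarrierException;
    import java.util.concurrent.CyclicBarrier;

    // Sketch: barrier synchronization across "processors" (threads here).
    public class BarrierSteps {
        public static void main(String[] args) {
            final int processors = 3;
            CyclicBarrier barrier = new CyclicBarrier(processors,
                    () -> System.out.println("--- all processors reached the barrier ---"));

            for (int p = 0; p < processors; p++) {
                final int id = p;
                new Thread(() -> {
                    try {
                        for (int step = 1; step <= 2; step++) {
                            System.out.println("processor " + id + " executing step " + step);
                            barrier.await(); // wait until every processor finishes this step
                        }
                    } catch (InterruptedException | BrokenBarrierException e) {
                        Thread.currentThread().interrupt();
                    }
                }).start();
            }
        }
    }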
Causal precedence relation
The events of a distributed execution are partially ordered by the causal precedence relation, Lamport's "happened before" relation, denoted →. For any two events ei and ej, if ei → ej, then event ej is directly or transitively dependent on event ei; graphically, this means that there exists a path consisting of message arrows and process-line segments in the space–time diagram that starts at ei and ends at ej. The relation → denotes the flow of information in a distributed computation: ei → ej means that all the information available at ei is potentially accessible at ej.
For any two events ei and ej, ei ↛ ej denotes the fact that event ej is not directly or transitively dependent on event ei. That is, event ei does not causally affect event ej. Event ej is not aware of the execution of ei, or of any event executed after ei on the same process. Note the following two rules:
• for any two events ei and ej, ei ↛ ej ⇏ ej ↛ ei;
• for any two events ei and ej, ei ↛ ej ⇏ ej → ei.
For any two events ei and ej, if ei ↛ ej and ej ↛ ei, then events ei and ej are said to be concurrent, and the relation is denoted ei ║ ej.
For any two events ei and ej in a distributed execution, either ei → ej, or ej → ei, or ei ║ ej.
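In symbols, the concurrency relation defined above is:

\[ e_i \parallel e_j \iff (e_i \not\rightarrow e_j) \wedge (e_j \not\rightarrow e_i) \]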
Logical vs. physical concurrency
In a distributed computation, two events are logically concurrent if and only if they do not causally affect each other. Physical concurrency means that the events occur at the same instant in physical time. Logically concurrent events need not occur at the same instant in physical time.
For a global snapshot to be meaningful, the states of all the components of the distributed system must be recorded at the same instant. This would be possible only if the local clocks at the processes were perfectly synchronized, or if there were a global system clock that could be instantaneously read by the processes. However, both are impossible.
Even if the states of all the components in a distributed system have not been recorded at the same instant, such a state will be meaningful provided every message that is recorded as received is also recorded as sent. The basic idea is that an effect should not be present without its cause. A message cannot be received if it was not sent; that is, the state should not violate causality. Such states are called consistent global states and are meaningful global states. Inconsistent global states are not meaningful.
A global state is a consistent global state iff it satisfies the
following condition:
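In the standard formulation (with LS_i the local state of process p_i, SC_ij the recorded state of the channel C_ij from p_i to p_j, and ⊕ denoting exclusive-or):

\[ \text{C1: } \mathrm{send}(m_{ij}) \in LS_i \;\Rightarrow\; \big(m_{ij} \in SC_{ij}\big) \oplus \big(\mathrm{rec}(m_{ij}) \in LS_j\big) \]
\[ \text{C2: } \mathrm{send}(m_{ij}) \notin LS_i \;\Rightarrow\; \big(m_{ij} \notin SC_{ij}\big) \wedge \big(\mathrm{rec}(m_{ij}) \notin LS_j\big) \]

Condition C1 says that every message recorded as sent is either still in transit in the channel or already recorded as received (but not both); condition C2 says that a message not recorded as sent must appear neither in the channel nor as received.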
In the distributed execution, a global state GS1 consisting of local states {LS_1^1, LS_2^3, LS_3^3, LS_4^2} is inconsistent because the state of p2 has recorded the receipt of message m_12, whereas the state of p1 has not recorded its send. On the contrary, a global state GS2 consisting of local states {LS_1^2, LS_2^4, LS_3^4, LS_4^2} is consistent; all the channels are empty except C_21, which contains message m_21.
A global state is transitless iff all the channel states recorded in it are empty, i.e., ∀i, ∀j: SC_ij = ∅. A global state that is both consistent and transitless is said to be strongly consistent.
PART - A Questions
A tightly coupled system is a system architecture and computing method in which all hardware and software components are linked together so that every component is dependent on the others. Tightly coupled system architecture encourages application and code interdependence.
5. Give examples for loosely coupled and tightly coupled systems. (CO1,K1)
The middleware is the distributed software that drives the distributed system, while providing transparency of heterogeneity at the platform level. The middleware layer does not contain the traditional application layer functions of the network protocol stack, such as http, mail, ftp, and telnet. Various primitives and calls to functions defined in various libraries of the middleware layer are embedded in the user program code.
10. List out the middleware standards in distributed systems. (CO1, K1)
There are several standards such as Object Management Group's (OMG) Common Object Request Broker Architecture (CORBA), the Remote Procedure Call (RPC) mechanism, DCOM (Distributed Component Object Model), Java RMI (Remote Method Invocation), and the Message-Passing Interface (MPI).
The RPC mechanism conceptually works like a local procedure call, with the difference
that the procedure code may reside on a remote machine, and the RPC software sends a
message across the network to invoke the remote procedure. It then awaits a reply, after
which the procedure call completes from the perspective of the program that invoked it.
12. What is distributed shared memory? (CO1,K1)
All multicomputer (NUMA as well as message-passing) systems that do not have a shared
address space provided by the underlying architecture and hardware necessarily
communicate by message passing. For a distributed system, this abstraction is called
distributed shared memory.
13. What is the motivation for using distributed systems? (CO1,K1)
• Inherently distributed computations
• Resource sharing
• Access to geographically remote data and resources
• Enhanced reliability
• Increased performance/cost ratio
• Scalability
• Modularity and incremental expandability
14. What is shared memory? (CO1,K1)
Shared memory systems are those in which there is a (common) shared address space
throughout the system. Communication among processors takes place via shared data
variables, and control variables for synchronization among the processors. Semaphores and
monitors that were originally designed for shared memory uniprocessors and multiprocessors
are examples of how synchronization can be achieved in shared memory systems.
15. Why is message passing required in distributed systems? (CO1,K1)
All multicomputer (NUMA as well as message-passing) systems that do not have a shared
address space provided by the underlying architecture and hardware necessarily communicate
by message passing.
Shared memory vs message passing:
• Shared memory is used to communicate between single-processor and multiprocessor systems, whereas message passing is most commonly utilized in a distributed setting, when communicating processes are spread over multiple devices linked by a network.
• With shared memory, system calls are only required to establish the shared memory, whereas message passing is performed via the kernel.
• With shared memory, the code for reading and writing the data from the shared memory must be written explicitly by the developer, whereas no such code is required with message passing, because the message-passing facility offers a method for communication and synchronization of activities executed by the communicating processes.
Resource sharing: Resources such as peripherals, complete data sets in databases, and special libraries, as well as data (variables/files), cannot be fully replicated at all the sites because it is often neither practical nor cost-effective. They cannot be placed at a single site either, because access to that site might prove to be a bottleneck. Therefore, such resources are typically distributed across the system.
18. Draw a block diagram of a distributed system. (CO1,K1)
27. When are two events said to be logically concurrent and physically concurrent? (CO1,K1)
Two events are logically concurrent if and only if they do not causally affect each other.
Physical concurrency has a connotation that the events occur at the same instant in physical
time.
Interconnection networks
Two popular interconnection networks are the Omega network and the Butterfly network. Each is a multi-stage network formed of 2×2 switching elements. Each 2×2 switch allows data on either of the two input wires to be switched to the upper or the lower output wire.
In a single step, however, only one data unit can be sent on an output wire. So if the data from both the input wires is to be routed to the same output wire in a single step, there is a collision. Various techniques such as buffering or more elaborate interconnection designs can address collisions.
The two outgoing edges from any switch <x, s> are as follows. There is an edge from switch <x, s> to switch <y, s+1> if (i) x = y, or (ii) x XOR y has exactly one 1 bit, which is in the (s+1)th MSB. For stage s, apply the rule above for M/2^s switches.
Consider the Butterfly network in Figure 1.4(b), with n = 8 and M = 4. There are three stages, s = 0, 1, 2, and the interconnection pattern is defined between s = 0 and s = 1 and between s = 1 and s = 2. The switch number x varies from 0 to 3 in each stage, i.e., x is a 2-bit string.
Consider the first stage interconnection (s = 0) of a butterfly of size M, and hence having log2(2M) stages. For stage s = 0, as per rule (i), the first output line from switch 00 goes to the input line of switch 00 of stage s = 1. As per rule (ii), the second output line of switch 00 goes to the input line of switch 10 of stage s = 1. Similarly, x = 01 has one output line go to an input line of switch 11 in stage s = 1.
For stage s = 1 connecting to stage s = 2, we apply the rules considering only M/2^1 = M/2 switches, i.e., we build two butterflies of size M/2 – the "upper half" and the "lower half" switches. The recursion terminates for M/2^s = 1, when there is a single switch.
Butterfly routing function: In a stage-s switch, if the (s+1)th MSB of the destination address j is 0, the data is routed to the upper output wire; otherwise it is routed to the lower output wire.
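A small sketch of this routing rule in Java (the method name and setup are illustrative, not from the source):

    // Sketch: the butterfly routing decision described above. numBits is the
    // number of address bits (log2 of the number of input/output lines), s is
    // the stage (0 for the first stage), and j is the destination address.
    public class ButterflyRouting {
        // Returns 0 for the upper output wire, 1 for the lower output wire.
        static int outputWire(int j, int s, int numBits) {
            // Extract the (s+1)-th most significant bit of j.
            return (j >> (numBits - 1 - s)) & 1;
        }

        public static void main(String[] args) {
            int numBits = 3; // n = 8 lines, as in the example above
            int j = 0b110;   // destination address 6
            for (int s = 0; s < numBits; s++) {
                System.out.println("stage " + s + ": "
                        + (outputWire(j, s, numBits) == 0 ? "upper" : "lower"));
            }
        }
    }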
Assessment Schedule (Proposed Date & Actual Date)