Unit-2 Notes

Uploaded by

tecnologyhub96

UNIT-2

INTER PROCESS COMMUNICATION AND REMOTE INVOCATION


SYLLABUS: Introduction to Inter-Process Communication - API for Internet
Protocols - External Data Representation and Marshalling - Multicast
communication – Request-Reply protocols - Remote Procedure Call - Remote
Method Invocation.

Interprocess Communication in Distributed Systems


Interprocess Communication (IPC) in distributed systems is crucial for
enabling processes across different nodes to exchange data and coordinate
activities. This article explores various IPC methods, their benefits, and
challenges in modern distributed computing environments.

Important Topics for Interprocess Communication in Distributed Systems


• What is Interprocess Communication in a Distributed system?
• Characteristics of Inter-process Communication in Distributed Systems
• Types of Interprocess Communication in Distributed Systems
• Benefits of Interprocess Communication in Distributed Systems
• Challenges of Interprocess Communication in Distributed Systems
• Example of Interprocess Communication in Distributed System
What is Interprocess Communication in a Distributed system?
Interprocess communication in a distributed system is the exchange of data
between two or more independent processes running in a distributed
environment. Interprocess communication over the Internet is available in
two forms: datagram communication and stream communication.
Characteristics of Inter-process Communication in Distributed Systems
There are mainly five characteristics of inter-process communication in a
distributed environment/system.
• Synchronous System Calls: In synchronous communication, both sender and
receiver use blocking system calls to transmit the data. The sender waits
until the acknowledgment is received from the receiver, and the receiver
waits until the message arrives.
• Asynchronous System Calls: In asynchronous communication, both sender
and receiver use non-blocking system calls to transmit the data, which
means the sender does not wait for the receiver’s acknowledgment.
• Message Destination: A local port is a message destination within a
computer, specified as an integer. A port has exactly one receiver but can
have many senders. A process may use multiple ports from which to receive
messages. Any process that knows the number of a port can send a
message to it.
• Reliability: Reliable communication is defined in terms of validity and
integrity. Validity means that messages are delivered even if a
reasonable number of packets are dropped or lost.
• Integrity: Messages must arrive at the destination uncorrupted and
without duplication.
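To make the idea of a port as a message destination concrete, here is a minimal sketch in Java, assuming both processes run on the same machine and talk over the loopback interface. The class name PortDemo and the 2-second timeout are our own choices for illustration. The receiver owns one port and blocks in receive(), while the sender only needs to know the port number.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: one receiver bound to a local port, one sender.
class PortDemo {

    // Sends `text` as a single datagram to a receiver on the loopback
    // interface and returns what the receiver read.
    static String sendAndReceive(String text) {
        // Port 0 asks the OS for any free port; the receiver owns that port.
        try (DatagramSocket receiver = new DatagramSocket(0, InetAddress.getLoopbackAddress());
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000); // give up after 2 s instead of blocking forever

            byte[] out = text.getBytes(StandardCharsets.UTF_8);
            // The message destination is an (address, port) pair.
            sender.send(new DatagramPacket(out, out.length,
                    InetAddress.getLoopbackAddress(), receiver.getLocalPort()));

            byte[] buf = new byte[1024];
            DatagramPacket in = new DatagramPacket(buf, buf.length);
            receiver.receive(in); // blocking receive: returns when a datagram arrives
            return new String(in.getData(), 0, in.getLength(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAndReceive("hello"));
    }
}
```

Note that send() returns as soon as the datagram is handed to the operating system, so the sender does not wait for the receiver. This matches the asynchronous style described above, while receive() shows the blocking style.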
Types of Interprocess Communication in Distributed Systems
Below are the types of interprocess communication (IPC) commonly used in
distributed systems:
• Message Passing:
o Definition: Message passing involves processes communicating by
sending and receiving messages. Messages can be structured data
packets containing information or commands.
o Characteristics: It is a versatile method suitable for both
synchronous and asynchronous communication. Message passing
can be implemented using various protocols such as TCP/IP, UDP,
or higher-level messaging protocols like AMQP (Advanced
Message Queuing Protocol) or MQTT (Message Queuing
Telemetry Transport).

• Remote Procedure Calls (RPC):


o Definition: RPC allows one process to invoke a procedure (or
function) in another process, typically located on a different
machine over a network.
o Characteristics: It abstracts the communication between processes
by making it appear as if a local procedure call is being made. RPC
frameworks handle details like parameter marshalling, network
communication, and error handling.
• Sockets:
o Definition: Sockets provide a low-level interface for network
communication between processes running on different computers.
o Characteristics: They allow processes to establish connections,
send data streams (TCP) or datagrams (UDP), and receive
responses. Sockets are fundamental for implementing higher-level
communication protocols.
• Message Queuing Systems:
o Description: Message queuing systems facilitate asynchronous
communication by allowing processes to send messages to and
receive messages from queues.
o Characteristics: They decouple producers (senders) and consumers
(receivers) of messages, providing fault tolerance, scalability, and
persistence of messages. Examples include Apache Kafka,
RabbitMQ, and AWS SQS.
• Publish-Subscribe Systems:
o Description: Publish-subscribe (pub-sub) systems enable
communication between components without requiring them to
directly know each other.
o Characteristics: Publishers publish messages to topics, and
subscribers receive messages based on their interest in specific
topics. This model supports one-to-many communication and is
scalable for large-scale distributed systems. Examples include
MQTT and Apache Pulsar.
These types of IPC mechanisms each have distinct advantages and are chosen
based on factors such as communication requirements, performance
considerations, and the nature of the distributed system architecture. Successful
implementation often involves selecting the most suitable IPC type or
combination thereof to meet specific application needs.
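The publish-subscribe model described above can be sketched in a few lines of Java. This is only an in-memory illustration under our own assumptions: the class TinyBroker and its method names are made up, and a real system such as MQTT or Apache Pulsar places the broker on the network. It shows that publishers and subscribers only share a topic name and never refer to each other directly.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory broker; real brokers are separate network services.
class TinyBroker {
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    // Register interest in a topic.
    void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
    }

    // Deliver a message to every subscriber of the topic.
    // Returns how many subscribers received it.
    int publish(String topic, String message) {
        List<Consumer<String>> handlers = subscribers.getOrDefault(topic, List.of());
        for (Consumer<String> handler : handlers) {
            handler.accept(message);
        }
        return handlers.size();
    }

    public static void main(String[] args) {
        TinyBroker broker = new TinyBroker();
        broker.subscribe("alerts", m -> System.out.println("A got " + m));
        broker.subscribe("alerts", m -> System.out.println("B got " + m));
        broker.publish("alerts", "disk full"); // one-to-many: both subscribers receive it
        broker.publish("metrics", "cpu=40");   // no subscribers, so nothing is delivered
    }
}
```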
Benefits of Interprocess Communication in Distributed Systems
Below are the benefits of IPC in Distributed Systems:
• Facilitates Communication:
o IPC enables processes or components distributed across different
nodes to communicate seamlessly.
o This allows for building complex distributed applications where
different parts of the system can exchange information and
coordinate their activities.
• Integration of Heterogeneous Systems:
o IPC mechanisms provide a standardized way for integrating
heterogeneous systems and platforms.
o Processes written in different programming languages or running
on different operating systems can communicate using common
IPC protocols and interfaces.
• Scalability:
o Distributed systems often need to scale horizontally by adding
more nodes or instances.
o IPC mechanisms, especially those designed for distributed
environments, can facilitate scalable communication patterns such
as publish-subscribe or message queuing, enabling efficient scaling
without compromising performance.
• Fault Tolerance and Resilience:
o IPC techniques in distributed systems often include mechanisms
for handling failures and ensuring resilience.
o For example, message queues can buffer messages during network
interruptions, and RPC frameworks can retry failed calls or
implement failover strategies.
• Performance Optimization:
o Effective IPC can optimize performance by minimizing latency and
overhead associated with communication between distributed
components.
o Techniques like shared memory or efficient message passing
protocols help in achieving low-latency communication.
Challenges of Interprocess Communication in Distributed Systems
Below are the challenges of IPC in Distributed Systems:
• Network Latency and Bandwidth:
o Distributed systems operate over networks where latency (delay in
transmission) and bandwidth limitations can affect IPC
performance.
o Minimizing latency and optimizing bandwidth usage are critical
challenges, especially for real-time applications.
• Reliability and Consistency:
o Ensuring reliable and consistent communication between
distributed components is challenging.
o IPC mechanisms must handle network failures, message loss, and
out-of-order delivery while maintaining data consistency across
distributed nodes.
• Security:
o Securing IPC channels against unauthorized access, eavesdropping,
and data tampering is crucial.
o Distributed systems often transmit sensitive data over networks,
requiring robust encryption, authentication, and access control
mechanisms.
• Complexity in Error Handling:
o IPC errors, such as network timeouts, connection failures, or
protocol mismatches, must be handled gracefully to maintain
system stability.
o Designing robust error handling and recovery mechanisms adds
complexity to distributed system implementations.
• Synchronization and Coordination:
o Coordinating actions and ensuring synchronization between
distributed components can be challenging, especially when using
shared resources or implementing distributed transactions.
o IPC mechanisms must support synchronization primitives and
consistency models to avoid race conditions and ensure data
integrity.
Example of Interprocess Communication in Distributed System
Let’s consider a scenario to understand the Interprocess Communication in
Distributed System:
Consider a distributed system where you have two processes running on
separate computers, a client process (Process A) and a server process (Process
B). The client process needs to request information from the server process and
receive a response.
IPC Example using Remote Procedure Calls (RPC):
1. RPC Setup:
• Process A (Client): Initiates an RPC call to Process B (Server).
• Process B (Server): Listens for incoming RPC requests and
responds accordingly.
2. Steps Involved:
• Client-side (Process A):
o The client process prepares an RPC request, which includes
the name of the remote procedure to be called and any
necessary parameters.
o It sends this request over the network to the server process.
• Server-side (Process B):
o The server process (Process B) listens for incoming RPC
requests.
o Upon receiving an RPC request from Process A, it executes
the requested procedure using the provided parameters.
o After processing the request, the server process prepares a
response (if needed) and sends it back to the client process
(Process A) over the network.
3. Communication Flow:
• Process A and Process B communicate through the RPC
framework, which manages the underlying network
communication and data serialization.
• The RPC mechanism abstracts away the complexities of network
communication and allows the client and server processes to
interact as if they were local.
4. Example Use Case:
• Process A (Client) could be a web application requesting user data
from a database hosted on Process B (Server).
• Process B (Server) receives the request, queries the database,
processes the data, and sends the results back to Process A (Client)
via RPC.
• The client application then displays the retrieved data to the user.
In this example, RPC serves as the IPC mechanism facilitating communication
between the client and server processes in a distributed system. It allows
processes running on different machines to collaborate and exchange data
transparently, making distributed computing more manageable and scalable.
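The steps above can be sketched as a hand-written, one-procedure RPC in Java. This is only a sketch under stated assumptions and is not how any particular RPC framework works. The class MiniRpc, the text request format "add <a> <b>" and the use of a loopback TCP socket are all our own choices. The client stub add() plays the role of Process A and serveOnce() plays the role of Process B.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical one-procedure RPC over a loopback TCP connection.
class MiniRpc {

    // Server side (Process B): handle one request of the form "add <a> <b>".
    static void serveOnce(ServerSocket server) {
        try (Socket s = server.accept();
             BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
             PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
            String[] req = in.readLine().split(" ");   // unmarshal the request
            if (req[0].equals("add")) {
                // Execute the requested procedure and send the reply.
                out.println(Integer.parseInt(req[1]) + Integer.parseInt(req[2]));
            } else {
                out.println("ERROR unknown procedure");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Client stub (Process A): looks like a local call, but goes over the network.
    static int add(int port, int a, int b) {
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port);
             PrintWriter out = new PrintWriter(s.getOutputStream(), true);
             BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()))) {
            out.println("add " + a + " " + b);         // marshal and send the request
            return Integer.parseInt(in.readLine());    // wait for the reply and unmarshal it
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Starts a one-shot server on a free loopback port and calls it once.
    static int demo(int a, int b) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread serverThread = new Thread(() -> serveOnce(server));
            serverThread.start();
            int result = add(server.getLocalPort(), a, b);
            serverThread.join();
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(2, 3));
    }
}
```

The caller of add() never touches sockets or message formats, which is the transparency that RPC aims for. Real frameworks generate such stubs automatically and also deal with timeouts, retries, and richer data types.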
API For Internet Protocols
Differences between TCP and UDP
Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) are
both Transport Layer protocols of the Internet protocol suite. TCP is a
connection-oriented, reliable protocol, whereas UDP is a connectionless and
unreliable protocol. In this article, we will discuss the differences
between TCP and UDP.
What is Transmission Control Protocol (TCP)?
TCP (Transmission Control Protocol) is one of the main protocols of the
Internet protocol suite. It lies between the Application and Network Layers
and provides a reliable delivery service. It is a connection-oriented
protocol that supports the exchange of messages between different devices
over a network. TCP works together with the Internet Protocol (IP), which
defines how packets of data are sent between computers.

Transmission Control Protocol


Features of TCP
• TCP keeps track of the segments being transmitted or received by
assigning sequence numbers to them.
• Flow control limits the rate at which a sender transfers data, so that
the sender does not overwhelm the receiver.
• TCP implements an error control mechanism for reliable data transfer.
• TCP takes into account the level of congestion in the network.
Applications of TCP
• World Wide Web (WWW): When you browse websites, TCP ensures
reliable data transfer between your browser and web servers.
• Email: TCP is used for sending and receiving emails. Protocols
like SMTP (Simple Mail Transfer Protocol) handle email delivery across
servers.
• File Transfer Protocol (FTP): FTP relies on TCP to transfer large files
reliably. Whether you’re uploading or downloading files, TCP ensures
data integrity.
• Secure Shell (SSH): SSH sessions, commonly used for remote
administration, run over TCP. SSH itself provides the encryption
between client and server.
• Streaming Media: On-demand services like Netflix, YouTube, and Spotify
commonly deliver video and music over HTTP, which runs on TCP. TCP
manages data segments and retransmissions, and buffering in the player
keeps playback smooth.
Advantages of TCP
• It maintains a reliable connection between sender and receiver.
• It delivers data in the order in which it was sent.
• Its operation does not depend on the operating system.
• It is independent of the routing protocols used in the underlying
network.
• It can reduce the rate at which data is sent to match the speed of the
receiver.
Disadvantages of TCP
• It is slower than UDP and it takes more bandwidth.
• It is slow at the start of a transfer, because a connection must be set
up first.
• Its connection and acknowledgment overhead is often unnecessary on
small networks such as LANs and PANs.
• It does not support multicast or broadcast.
• Because data is delivered in order, a single missing segment holds up
everything behind it. For example, a page does not finish loading until
the missing data has been retransmitted.
What is User Datagram Protocol (UDP)?
User Datagram Protocol (UDP) is a Transport Layer protocol and a part of
the Internet Protocol suite, referred to as the UDP/IP suite. Unlike TCP,
it is an unreliable and connectionless protocol, so there is no need to
establish a connection before data transfer. UDP provides low-latency,
loss-tolerant communication over the network and enables
process-to-process communication.

User Datagram Protocol


Features of UDP
• Used for simple request-response communication when the size of data is
small and hence there is less concern about flow and error control.
• It is a suitable protocol for multicasting, because it is connectionless
and a single datagram can be addressed to a group of receivers.
• UDP is used for some routing update protocols like RIP (Routing
Information Protocol).
• Normally used for real-time applications which cannot tolerate uneven
delays between sections of a received message.
Application of UDP
• Real-Time Multimedia Streaming: UDP is well suited to streaming live
audio and video content. Its low latency keeps playback smooth, even
if occasional data loss occurs.
• Online Gaming: Many online games rely on UDP for fast
communication between players.
• DNS (Domain Name System) Queries: When your device looks
up domain names (like converting “www.example.com” to an IP
address), UDP handles these requests efficiently.
• Network Monitoring: Tools that monitor network performance often
use UDP for lightweight, rapid data exchange.
• Multicasting: Because UDP is connectionless, it is suitable for
multicasting scenarios where data needs to be sent to multiple recipients
simultaneously.
• Routing Update Protocols: Some routing protocols, like RIP (Routing
Information Protocol), utilize UDP for exchanging routing information
among routers.
Advantages of UDP
• It does not require any connection for sending or receiving data.
• Broadcast and Multicast are available in UDP.
• UDP can operate on a large range of networks.
• UDP is well suited to live and real-time data.
• UDP can deliver the parts of the data that arrive even if other parts
are missing.
Disadvantages of UDP
• There is no way to acknowledge the successful transfer of data.
• UDP has no mechanism to track the sequence of data.
• UDP is connectionless, and due to this, it is unreliable for
transferring data.
• When the network is congested, routers may drop UDP packets, and UDP
does not retransmit them as TCP does.
• UDP discards packets in which it detects errors instead of recovering
them.
Which Protocol is Better: TCP or UDP?
The answer to this question is difficult because it depends entirely on the
task and on the type of data being delivered. UDP is better in the case
of online gaming, as it keeps lag low. TCP is better if we are
transferring data like photos, videos, etc., because it ensures that the
data arrives complete and correct. In general, both TCP and UDP are useful,
and each has advantages for particular kinds of work. That is why it is
difficult to say which one is better.

Difference Between TCP and UDP


Where TCP is Used?
• Sending Emails
• Transferring Files
• Web Browsing

Where UDP is Used?


• Gaming
• Video Streaming
• Online Video Chats
Differences between TCP and UDP

Basis | Transmission Control Protocol (TCP) | User Datagram Protocol (UDP)
Type of Service | TCP is a connection-oriented protocol. Connection orientation means that the communicating devices should establish a connection before transmitting data and should close the connection after transmitting the data. | UDP is a datagram-oriented protocol. There is no overhead for opening a connection, maintaining a connection, or terminating a connection. UDP is efficient for broadcast and multicast types of network transmission.
Reliability | TCP is reliable as it guarantees the delivery of data to the destination. | The delivery of data to the destination cannot be guaranteed in UDP.
Error checking mechanism | TCP provides extensive error-checking mechanisms, because it provides flow control and acknowledgment of data. | UDP has only the basic error-checking mechanism using checksums.
Acknowledgment | An acknowledgment segment is present. | No acknowledgment segment.
Sequence | Sequencing of data is a feature of TCP. This means that packets arrive in order at the receiver. | There is no sequencing of data in UDP. If the order is required, it has to be managed by the application layer.
Speed | TCP is comparatively slower than UDP. | UDP is faster, simpler, and more efficient than TCP.
Retransmission | Retransmission of lost packets is possible in TCP. | There is no retransmission of lost packets in UDP.
Header Length | TCP has a variable-length header of 20 to 60 bytes. | UDP has a fixed-length header of 8 bytes.
Weight | TCP is heavy-weight. | UDP is lightweight.
Handshaking Techniques | Uses a handshake with SYN, SYN-ACK, and ACK. | It is a connectionless protocol, i.e. no handshake.
Broadcasting | TCP doesn’t support broadcasting. | UDP supports broadcasting.
Protocols | TCP is used by HTTP, HTTPS, FTP, SMTP, and Telnet. | UDP is used by DNS, DHCP, TFTP, SNMP, RIP, and VoIP.
Stream Type | The TCP connection is a byte stream. | UDP carries a stream of separate messages (datagrams).
Overhead | Low but higher than UDP. | Very low.
Applications | This protocol is primarily used in situations where safe and trustworthy communication is necessary, such as email, web surfing, and military services. | This protocol is used in situations where quick communication is necessary but where dependability is not a concern, such as VoIP, game streaming, and video and music streaming.

External data representation and marshalling


External Data Representation:
Data structures are used to represent the information held in running
applications, whereas the information in messages moving between components
in a distributed system consists of a sequence of bytes. So, conversion is
required from the data structure to a sequence of bytes before the
transmission of data. On the arrival of the message, the data must be
converted back into its original data structure.
Different types of data are handled in computers, and these types are not
represented in the same way on every computer that sends or receives data.
Not all computers store primitive values like integers in the same order,
and different architectures also represent floating-point numbers
differently. Integers are ordered in two ways: big-endian order, in which
the Most Significant Byte (MSB) is placed first, and little-endian order,
in which the Most Significant Byte (MSB) is placed last, that is, the Least
Significant Byte (LSB) is placed first. One more issue is the set of codes
used to represent characters. Many applications on UNIX systems use ASCII
character coding, which uses one byte per character, whereas Unicode
encodings such as UTF-16 use two or more bytes per character and allow for
the representation of texts in many different languages.
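Let’s see the two byte orders side by side. The following is a small sketch that uses Java’s standard ByteBuffer class; the class name EndianDemo and the sample value 0x0A0B0C0D are our own choices for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo: the same 32-bit integer laid out in both byte orders.
class EndianDemo {

    // Returns the four bytes of `value` in the requested byte order.
    static byte[] encode(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    // Formats bytes as two-digit hex values separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        int value = 0x0A0B0C0D;
        // Most significant byte first.
        System.out.println(hex(encode(value, ByteOrder.BIG_ENDIAN)));    // 0A 0B 0C 0D
        // Least significant byte first.
        System.out.println(hex(encode(value, ByteOrder.LITTLE_ENDIAN))); // 0D 0C 0B 0A
    }
}
```

If a sender writes the integer in one order and the receiver reads it in the other, the receiver sees a different number, which is why an agreed external format is needed.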
There should be a means to convert all of this data to a standard format so
that it can be sent successfully between computers. One of two methods can
be used. In the first, values are converted to an agreed-upon external
format before transmission and converted to the local format on receipt; if
the two computers are known to be of the same type, the conversion can be
skipped. In the second, values are sent in the sender’s format, along with
an indication of the format used, and the recipient converts them if
necessary. It’s worth noting, though, that the bytes themselves are never
altered during transmission. To support Remote Procedure Call (RPC) or
Remote Method Invocation (RMI) mechanisms, any data type that can be passed
as a parameter or returned as a result must be able to be converted, with
the individual primitive data values represented in an agreed format. An
agreed standard for representing data structures and primitive values is
called an external data representation.
• Marshalling: Marshalling is the process of taking a collection of data
items and assembling them into an external data representation
suitable for transmission in a message.
• Unmarshalling: The converse of this process is unmarshalling, which
involves disassembling the transmitted data upon arrival to recreate the
original data structures at the destination.
Approaches:
There are three approaches to external data representation and marshalling
that allow different kinds of data to be exchanged between computers.
1. Common Object Request Broker Architecture (CORBA):
CORBA is a specification defined by the Object Management Group (OMG)
and a widely used middleware standard for distributed systems.
It allows systems with diverse architectures, operating systems, programming
languages, and computer hardware to work together. It allows software
applications and their objects to communicate with one another. It is a
standard for creating and using distributed objects. It is made up of five
major components. The components and their functions are given below:
• Object Request Broker (ORB): It provides a communication
infrastructure for the objects to communicate across a network.
• Interface Definition Language (IDL): It is a specification language
used to describe the interface of a software component independently of
the language in which the component is written. For example, it
allows communication between software components written in C++ and
Java.
• Dynamic Invocation Interface (DII): Using DII, client applications are
permitted to use server objects without knowing their types at
compile time. Here the client obtains a reference to a CORBA object and
then makes invocation requests dynamically on that object.
• Interface Repository (IR): As the name implies, interface definitions
are stored in the interface repository. The purpose of the IR is to let a
client find out about an object whose interface is not known at compile
time, obtain information about that interface, and then build a request
to be sent through the ORB.
• Object Adapter (OA): It is used to access ORB services like object
reference generation.

Data Representation in CORBA:


Common Data Representation (CDR) is used to represent the structured and
primitive data types that are passed as arguments or results during remote
invocations on CORBA distributed objects. It allows clients and servers
written in different programming languages to communicate with one another.
For example, it handles conversion between little-endian and big-endian
byte orders.
CDR can represent 15 primitive types, including short (16-bit), long
(32-bit), unsigned short, unsigned long, float (32-bit), double (64-bit),
char, boolean (TRUE, FALSE), octet (8-bit), and any (which can represent any
basic or constructed type), as well as a variety of composite types.
CORBA CDR Constructed Types:
Let’s have a look at the constructed types and their representation:
• sequence: the length (unsigned long) followed by the elements in order
• string: the length (unsigned long) followed by the characters in order
(can also have wide characters)
• array: the array elements in order; the length is fixed, so it is not
specified
• struct: the components in the order of their declaration
• enumerated: an unsigned long; the values are specified by the order in
which they are declared
• union: a type tag followed by the selected member
Example:
struct Person {
    string name;
    string place;
    long year;
};
Marshalling in CORBA:
Marshalling operations can be generated automatically from the specification
of the types of the data items to be transmitted in a message.
CORBA IDL describes the types of data structures and primitive data items
and provides a notation for specifying the types of the arguments and
results of RMI methods.
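To see how the rules above lay out the Person struct, here is a simplified sketch in Java. It follows only the description given in these notes: a string is its length followed by its characters, and struct fields follow in declaration order. It additionally assumes big-endian order and zero padding of each string to a 4-byte boundary. The class CdrSketch is hypothetical and the sample values are arbitrary. Details of the full CORBA specification, such as the byte-order flag carried in the message header, are left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Simplified, hypothetical CDR-style marshaller for the Person struct.
class CdrSketch {

    // string: 4-byte length, then the characters, zero-padded to a 4-byte boundary.
    static void writeString(DataOutputStream out, String s) throws IOException {
        byte[] chars = s.getBytes(StandardCharsets.US_ASCII);
        out.writeInt(chars.length);
        out.write(chars);
        int padding = (4 - chars.length % 4) % 4;
        out.write(new byte[padding]);
    }

    // struct Person: fields are written in declaration order (name, place, year).
    static byte[] marshalPerson(String name, String place, int year) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            writeString(out, name);
            writeString(out, place);
            out.writeInt(year); // IDL long (32-bit); DataOutputStream writes big-endian
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] msg = marshalPerson("John", "England", 1876);
        // name: 4 + 4 bytes, place: 4 + 7 + 1 padding bytes, year: 4 bytes
        System.out.println(msg.length + " bytes"); // 24 bytes
    }
}
```

Notice that the message carries no field names or type tags. The receiver can unmarshal it only because both sides already know the struct definition from the IDL.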
2. Java’s Object Serialization:
Java Remote Method Invocation (RMI) allows you to pass both objects and
primitive data values as arguments and results of method invocations. In
Java, the term serialization refers to the activity of flattening an object
(an instance of a class) or a set of related objects into a serial form
suitable for saving to disk or sending in a message.
Java provides a mechanism called object serialization. This allows an object
to be represented as a sequence of bytes containing the object’s data
together with information about the type of the object and the types of the
data stored in it. After the serialized object is written to a file, it can
be read from the file and deserialized; that is, the type information and
the bytes that represent the object and its data are used to recreate the
object in memory.
Moreover, an object can be serialized on one platform and deserialized on a
completely different platform, because the serialized form does not depend
on the platform on which the JVM runs.
For example, the Java class equivalent to the Person struct defined in CORBA
IDL might be:
Java
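import java.io.Serializable;

// Java counterpart of the Person struct defined in CORBA IDL.
public class Person implements Serializable {
    public String name;
    public String place;
    public int year;   // corresponds to "long year" in the IDL struct

    public void letter() {
        System.out.println("Issue a letter to " + name + " " + place);
    }
}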
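A round trip through serialization can be sketched as follows. This is a minimal illustration that writes to an in-memory byte array instead of a file or a network message. The classes SerializationDemo and Contact are hypothetical stand-ins, so the block is self-contained and does not depend on the Person class above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo of a serialize/deserialize round trip.
class SerializationDemo {

    // Stand-in for a Person-like class; it must implement Serializable.
    static class Contact implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final String place;
        final int year;

        Contact(String name, String place, int year) {
            this.name = name;
            this.place = place;
            this.year = year;
        }
    }

    // Flatten the object into a sequence of bytes.
    static byte[] serialize(Contact contact) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(contact);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuild the object from its bytes and the type information inside them.
    static Contact deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Contact) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = serialize(new Contact("John", "England", 1876));
        Contact copy = deserialize(data);
        System.out.println(copy.name + ", " + copy.place + ", " + copy.year);
    }
}
```

The receiver needs the Contact class on its classpath, but it does not need to be told in advance which class is arriving, because the class name travels inside the serialized bytes.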

3. Extensible Markup Language (XML):


Clients communicate with web services using XML, which is also used to
define the interfaces and other aspects of web services. However, XML is
utilized in a variety of other applications, including archiving and
retrieval systems; while an XML archive is larger than a binary archive, it
has the advantage of being readable on any machine. Other XML applications
include the design of user interfaces and the encoding of operating system
configuration files.
In contrast to HTML, which employs a fixed set of tags, XML is extensible in
the sense that users can define their own tags. If an XML document is meant
to be used by several applications, the tag names must be agreed between
them.
Clients, for example, typically communicate with web services via SOAP
messages. SOAP is an XML format whose tags are published for use by web
services and their clients. Some external data representations (such as
CORBA CDR) do not need to be self-describing, because it is assumed that the
client and server exchanging a message have prior knowledge of the order and
types of the information it contains.
XML, on the other hand, was designed to be used by a variety of applications
for a variety of purposes. This is made possible by the inclusion of tags
and the use of namespaces to define the meaning of the tags. Furthermore,
the use of tags allows applications to select only the portions of a
document that they need to process.
Example:
XML definition of the Person struct:
<person id="9865">
    <name>John</name>
    <place>England</place>
    <year>1876</year>
    <!-- comment -->
</person>
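Because the XML form is self-describing, a receiver can pick out only the fields it needs by tag name. Here is a small sketch using the DOM parser that ships with the JDK. The class PersonXml and its helper method field() are our own names for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo: unmarshalling fields from the person document above.
class PersonXml {
    static final String XML =
          "<person id=\"9865\">"
        + "<name>John</name><place>England</place><year>1876</year>"
        + "<!-- comment -->"
        + "</person>";

    // Returns the text inside the first element with the given tag name.
    static String field(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            return root.getElementsByTagName(tag).item(0).getTextContent();
        } catch (Exception e) {
            // Covers parser configuration, I/O, parse errors, and a missing tag.
            throw new IllegalStateException("cannot read <" + tag + ">", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(field(XML, "name"));
        // Values arrive as text and are converted by the receiver.
        int year = Integer.parseInt(field(XML, "year"));
        System.out.println(year);
    }
}
```

The year travels as the characters "1876" and not as a 32-bit binary value, which illustrates why a textual representation is usually longer than a binary one.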

Usage:
Marshalling is used to implement various remote procedure call (RPC)
protocols, where separate processes and threads often have distinct data
formats, necessitating marshalling between them.
In the Microsoft Component Object Model (COM), interface pointers are
marshalled in order to pass data across COM object boundaries. DCOM
(Distributed Component Object Model) is the distributed counterpart of COM.
The same thing happens in the .NET framework when a
common-language-runtime-based type has to communicate with unmanaged types.
Scripts and applications based on the Cross-Platform Component Object Model
(XPCOM) technology are two further examples where marshalling is crucial.
The Mozilla Application Framework makes heavy use of XPCOM, which
in turn makes considerable use of marshalling.
So, XML (Extensible Markup Language) is a text-based format for expressing
structured data. It was designed to represent data sent in messages
exchanged by clients and servers in web services.
In the first two approaches, CORBA CDR and Java’s object serialization, the
primitive data types are marshalled into a binary form. In the third
approach (XML), the primitive data types are expressed textually. A data
value’s textual representation will typically be longer than its binary
representation. The HTTP protocol is another example of the textual
approach.
Type information is included in both Java serialization and XML, but in
different ways. The Java serialized form contains all of the necessary type
information, whereas XML documents can refer to namespaces, which are
externally defined sets of names (with types).
Group Communication in Distributed Systems
In distributed systems, efficient group communication is crucial for
coordinating activities among multiple entities. This article explores the
challenges and solutions involved in facilitating reliable and ordered message
delivery among members of a group spread across different nodes or networks.

Important Topics for Group Communication in Distributed Systems
• What is Group Communication in Distributed Systems?
• Importance of Group Communication in Distributed Systems
• Types of Group Communication in a Distributed System
• Reliable Multicast Protocols for Group Communication
• Scalability and Performance for Group Communication
• Challenges of Group Communication in Distributed Systems
• FAQs for Group Communication in Distributed Systems
What is Group Communication in Distributed Systems?
Group communication in distributed systems refers to the process of exchanging
information among multiple nodes or entities that are geographically dispersed
or located on different machines within a network. It involves mechanisms and
protocols designed to facilitate communication and coordination among
members of a group, where each member typically plays a specific role or
performs particular tasks within the distributed system.
Importance of Group Communication in Distributed Systems
Group communication is critically important in distributed systems due to
several key reasons:
 Coordination and Synchronization:
o Distributed systems often involve multiple nodes or entities that
need to collaborate and synchronize their activities.
o Group communication mechanisms facilitate the exchange of
information, coordination of tasks, and synchronization of state
among these distributed entities.
o This ensures that all parts of the system are aware of the latest
updates and can act in a coordinated manner.
 Efficient Information Sharing:
o In distributed systems, different nodes may generate or process
data that needs to be shared among multiple recipients.
o Group communication allows for efficient dissemination of
information to all relevant parties simultaneously, reducing latency
and ensuring consistent views of data across the system.
 Fault Tolerance and Reliability:
o Group communication protocols often include mechanisms for
ensuring reliability and fault tolerance.
o Messages can be replicated or acknowledged by multiple nodes to
ensure that communication remains robust even in the face of node
failures or network partitions.
o This enhances the overall reliability and availability of the
distributed system.
 Scalability:
o As distributed systems grow in size and complexity, the ability to
scale effectively becomes crucial.
o Group communication mechanisms are designed to handle
increasing numbers of nodes and messages without compromising
performance or reliability.
o They enable the system to maintain its responsiveness and
efficiency as it scales up.
Types of Group Communication in a Distributed System
Below are the three types of group communication in distributed systems:
1. Unicast Communication

Unicast communication refers to the point-to-point transmission of data
between two nodes in a network. In the context of distributed systems:
 Definition: Unicast involves a sender (one node) transmitting a message
to a specific receiver (another node) identified by its unique network
address.
 Characteristics:
o One-to-One: Each message has a single intended recipient.
o Direct Connection: The sender establishes a direct connection to
the receiver.
o Efficiency: Suitable for scenarios where targeted communication is
required, such as client-server interactions or direct peer-to-peer
exchanges.
 Use Cases:
o Request-Response: Common in client-server architectures where
clients send requests to servers and receive responses.
o Peer-to-Peer: Direct communication between two nodes in a
decentralized network.
 Advantages:
o Efficient use of network resources as messages are targeted.
o Simplified implementation due to direct connections.
o Low latency since messages are sent directly to the intended
recipient.
 Disadvantages:
o Not scalable for broadcasting to multiple recipients without
sending separate messages.
o Increased overhead if many nodes need to be contacted
individually.
2. Multicast Communication

Multicast communication involves sending a single message from one sender to
multiple receivers simultaneously within a network. It is particularly useful in
distributed systems where the same information must be delivered to a group of
nodes:
 Definition: A sender transmits a message to a multicast group, which
consists of multiple recipients interested in receiving the message.
 Characteristics:
o One-to-Many: Messages are sent to multiple receivers in a single
transmission.
o Efficient Bandwidth Usage: Reduces network congestion
compared to multiple unicast transmissions.
o Group Membership: Receivers voluntarily join and leave
multicast groups as needed.
 Use Cases:
o Content Distribution: Broadcasting updates or notifications to
subscribers.
o Collaborative Systems: Real-time collaboration tools where
changes made by one user need to be propagated to others.
 Advantages:
o Saves bandwidth and network resources by transmitting data only
once.
o Simplifies management by addressing a group rather than
individual nodes.
o Supports scalable communication to a large number of recipients.
 Disadvantages:
o Requires mechanisms for managing group membership and
ensuring reliable delivery.
o Vulnerable to network issues such as packet loss or congestion
affecting all recipients.
3. Broadcast Communication
Broadcast communication involves sending a message from one sender to all
nodes in the network, ensuring that every node receives the message:

 Definition: A sender transmits a message to all nodes within the network
without the need for specific recipients.
 Characteristics:
o One-to-All: Messages are delivered to every node in the network.
o Broadcast Address: Uses a special network address (e.g., IP
broadcast address) to reach all nodes.
o Global Scope: Suitable for disseminating information to all
connected nodes simultaneously.
 Use Cases:
o Network Management: Broadcasting status updates or
configuration changes.
o Emergency Alerts: Disseminating critical information to all
recipients in a timely manner.
 Advantages:
o Ensures that every node receives the message without requiring
explicit recipient lists.
o Efficient for scenarios where global dissemination of information is
necessary.
o Simplifies communication in small-scale networks or LAN
environments.
 Disadvantages:
o Prone to network congestion and inefficiency in large networks.
o Security concerns, as broadcast messages are accessible to all
nodes, potentially leading to unauthorized access or information
leakage.
o Requires careful network design and management to control the
scope and impact of broadcast messages.
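The difference between the three types can be sketched with a small in-memory model. This is only an illustration under stated assumptions: ToyNetwork, its method names, and the node and group names are hypothetical, and no real network traffic is involved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory "network": each node has an inbox of messages.
class ToyNetwork {
    private final Map<String, List<String>> inboxes = new HashMap<>();
    private final Map<String, Set<String>> groups = new HashMap<>();

    void addNode(String node) {
        inboxes.put(node, new ArrayList<>());
    }

    // Receivers join and leave multicast groups voluntarily.
    void join(String group, String node) {
        groups.computeIfAbsent(group, g -> new HashSet<>()).add(node);
    }

    void leave(String group, String node) {
        Set<String> members = groups.get(group);
        if (members != null) {
            members.remove(node);
        }
    }

    // Unicast: one-to-one, addressed to a single receiver.
    void unicast(String to, String message) {
        inboxes.get(to).add(message);
    }

    // Multicast: one-to-many, delivered to current group members only.
    void multicast(String group, String message) {
        Set<String> members = groups.getOrDefault(group, Collections.emptySet());
        for (String member : members) {
            inboxes.get(member).add(message);
        }
    }

    // Broadcast: one-to-all, delivered to every node in the network.
    void broadcast(String message) {
        for (List<String> inbox : inboxes.values()) {
            inbox.add(message);
        }
    }

    List<String> inbox(String node) {
        return inboxes.get(node);
    }
}
```

With nodes A, B, and C, where only A and B have joined a group, a multicast to that group reaches A and B, a unicast to C reaches only C, and a broadcast reaches all three.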
Reliable Multicast Protocols for Group Communication
Reliable multicast protocols are essential in distributed systems to ensure that
messages sent from a sender to multiple recipients are delivered reliably,
consistently, and in a specified order. These protocols are designed to handle the
complexities of group communication, where it is crucial that every member of a
multicast group receives each message correctly. Reliable multicast protocols
are commonly distinguished by the ordering guarantee they provide:
 FIFO Ordering:
o Ensures that messages are delivered to all group members in the
order they were sent by the sender.
o Achieved by sequencing messages and delivering them
sequentially to maintain the correct order.
 Causal Ordering:
o Preserves the causal relationships between messages based on their
dependencies.
o Ensures that if one message causally precedes another (for
example, a message and the reply to it), every member delivers
them in that order.
 Total Order and Atomicity:
o Guarantees that all group members receive messages in the same
global order.
o Ensures that operations based on the multicast messages (like
updates to shared data) appear atomic or indivisible to all
recipients.
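FIFO ordering by sequencing messages can be sketched as follows. This is a minimal illustration, assuming each sender numbers its messages from 0 and the receiver holds back any message that arrives early; the class FifoReceiver and its method names are hypothetical and not taken from a specific protocol.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical receiver that delivers each sender's messages in send order.
class FifoReceiver {
    // Next sequence number expected from each sender (starts at 0).
    private final Map<String, Integer> nextSeq = new HashMap<>();
    // Hold-back queue: early messages, keyed by sender and sequence number.
    private final Map<String, Map<Integer, String>> heldBack = new HashMap<>();
    // Messages handed to the application, in delivery order.
    private final List<String> delivered = new ArrayList<>();

    void receive(String sender, int seq, String payload) {
        int expected = nextSeq.getOrDefault(sender, 0);
        if (seq < expected) {
            return; // duplicate of an already delivered message
        }
        Map<Integer, String> queue =
                heldBack.computeIfAbsent(sender, s -> new HashMap<>());
        queue.put(seq, payload);
        // Deliver as many consecutive messages as are now available.
        while (queue.containsKey(expected)) {
            delivered.add(queue.remove(expected));
            expected++;
        }
        nextSeq.put(sender, expected);
    }

    List<String> delivered() {
        return delivered;
    }
}
```

If message 1 from a sender arrives before message 0, it stays in the hold-back queue until message 0 arrives, at which point both are delivered in order. Causal and total ordering need additional information (for example vector timestamps or a sequencer), which this sketch does not attempt.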
Scalability and Performance for Group Communication
Scalability and performance are critical aspects of group communication in
distributed systems, where the system must handle increasing numbers of nodes,
messages, and participants while continuing to operate efficiently. The main
considerations are outlined below:
1. Scalability
Scalability in group communication refers to the system’s ability to efficiently
accommodate growth in terms of:
 Number of Participants: As the number of nodes or participants in a
group increases, the system should be able to manage communication
without significant degradation in performance.
 Volume of Messages: Handling a larger volume of messages being
exchanged among group members, ensuring that communication remains
timely and responsive.
 Geographical Distribution: Supporting communication across
geographically dispersed nodes or networks, which may introduce
additional latency and bandwidth challenges.
2. Challenges in Scalability
 Communication Overhead: As the group size increases, the overhead
associated with managing group membership, message routing, and
coordination can become significant.
 Network Bandwidth: Ensuring that the network bandwidth can handle
the increased traffic generated by a larger group without causing
congestion or delays.
 Synchronization and Coordination: Maintaining consistency and
synchronization among distributed nodes becomes more complex as the
system scales up.
3. Strategies for Scalability
 Partitioning and Sharding: Dividing the system into smaller partitions
or shards can reduce the scope of communication and management tasks,
improving scalability.
 Load Balancing: Distributing workload evenly across nodes or partitions
to prevent bottlenecks and ensure optimal resource utilization.
 Replication and Caching: Replicating data or messages across multiple
nodes can reduce access latency and improve fault tolerance, supporting
scalability.
 Scalable Protocols and Algorithms: Using efficient communication
protocols and algorithms designed for large-scale distributed systems,
such as gossip protocols or scalable consensus algorithms.
4. Performance
Performance in group communication involves optimizing various aspects to
achieve:
 Low Latency: Minimizing the time delay between sending and receiving
messages within the group.
 High Throughput: Maximizing the rate at which messages can be
processed and delivered across the system.
 Efficient Resource Utilization: Using network bandwidth, CPU, and
memory resources efficiently to support fast and responsive
communication.
5. Challenges in Performance
 Message Ordering: Ensuring that messages are delivered in the correct
order while maintaining high throughput can be challenging, especially in
protocols that require strict ordering guarantees.
 Concurrency Control: Managing concurrent access to shared resources
or data within the group without introducing contention or bottlenecks.
 Network Conditions: Adapting communication strategies to varying
network conditions, such as bandwidth limitations or packet loss, to
maintain optimal performance.
6. Strategies for Performance
 Optimized Message Routing: Using efficient routing algorithms to
minimize the number of network hops and reduce latency.
 Asynchronous Communication: Employing asynchronous messaging
patterns to decouple sender and receiver activities, improving
responsiveness.
 Caching and Prefetching: Pre-fetching or caching frequently accessed
data or messages to reduce latency and improve response times.
 Parallelism: Leveraging parallel processing techniques to handle
multiple tasks or messages concurrently, enhancing throughput.
Challenges of Group Communication in Distributed Systems
Group communication in distributed systems poses several challenges due to the
inherent complexities of coordinating activities across multiple nodes or entities
that may be geographically dispersed or connected over unreliable networks.
Here are some of the key challenges:
 Reliability: Ensuring that messages are reliably delivered to all intended
recipients despite network failures, node crashes, or temporary
disconnections. Reliable delivery becomes especially challenging when
nodes join or leave the group dynamically.
 Scalability: As the number of group members increases, managing
communication becomes more challenging. Scalability issues arise in
terms of bandwidth consumption, message processing overhead, and the
ability to maintain performance as the system scales.
 Concurrency and Consistency: Ensuring consistency of shared data
across distributed nodes while allowing concurrent updates can be
difficult. Coordinating access to shared resources to prevent conflicts and
maintain data integrity requires robust synchronization mechanisms.
 Fault Tolerance: Dealing with node failures, network partitions, and
transient communication failures without compromising the overall
reliability and availability of the system. This involves mechanisms for
detecting failures, managing group membership changes, and ensuring
that communication continues uninterrupted.

Request/Reply Protocol:
 The Request-Reply Protocol is also known as the RR protocol.
 It works well for systems that involve simple RPCs.
 In simple RPCs, the parameters and result values fit in a single packet
buffer, and both the duration of a call and the interval between calls
are short.
 The protocol is based on the idea of using implicit acknowledgements
instead of explicit acknowledgements.
 A reply from the server is treated as the acknowledgement (ACK) of the
client's request message, and the client's next call is treated as the
acknowledgement (ACK) of the server's reply message to the previous
call made by that client.
 To handle failures such as lost messages, the RR protocol uses
timeout-based retransmission.
 If a client does not get a reply message within the predetermined
timeout period, it retransmits the request message.
 Servers can provide exactly-once semantics by holding replies in a reply
cache. The cache is used to filter out duplicate request messages, and the
cached reply is retransmitted without processing the request again.
 If there is no mechanism for filtering duplicate messages, the RR
protocol combined with timeout-based retransmission provides
at-least-once call semantics.
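The role of the reply cache can be sketched as follows. This is a simplified, single-process illustration: the classes ReplyCachingServer and RetryingClient, the request identifier format, and the way lost replies are simulated are all assumptions made for the example, and no real timers or network are used.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical server side of the RR protocol with a reply cache.
class ReplyCachingServer {
    private final Map<String, String> replyCache = new HashMap<>();
    private final Function<String, String> procedure;
    private int executions = 0;

    ReplyCachingServer(Function<String, String> procedure) {
        this.procedure = procedure;
    }

    // requestId is assumed to identify a request uniquely,
    // for example client id plus a per-client sequence number.
    String handle(String requestId, String argument) {
        String cached = replyCache.get(requestId);
        if (cached != null) {
            return cached; // duplicate request: resend reply, do not re-execute
        }
        executions++;
        String reply = procedure.apply(argument);
        replyCache.put(requestId, reply);
        return reply;
    }

    int executions() {
        return executions;
    }
}

// Hypothetical client that retransmits when no reply arrives in time.
class RetryingClient {
    // lostReplies simulates how many replies are lost before one gets through.
    static String call(ReplyCachingServer server, String requestId,
                       String argument, int lostReplies, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String reply = server.handle(requestId, argument);
            if (attempt > lostReplies) {
                return reply; // the reply reached the client
            }
            // Otherwise the reply was lost: the timeout expires and the
            // same request (same requestId) is retransmitted.
        }
        throw new IllegalStateException("no reply after " + maxAttempts + " attempts");
    }
}
```

Even if two replies are lost and the request is sent three times, the procedure runs only once because the second and third requests are answered from the cache. Without the cache, each retransmission would execute the procedure again, which corresponds to at-least-once semantics.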
Remote Procedural Call (RPC) Mechanism in Distributed System
RPC is an effective mechanism for building distributed client-server systems.
RPC enhances the power and ease of programming of the client/server computing
concept. It is a protocol that allows one program to request a service from
another program on another computer in a network without having to know the
details of the network. The program that makes the request is called a client,
and the program that provides the service is called a server.
The calling parameters are sent to the remote process during a Remote
Procedure Call, and the caller waits for a response from the remote procedure.
The flow of activities during an RPC call between two networked systems is
depicted in the diagram below.

Transparency of RPC:
 Syntactic transparency: A remote procedure call should have the same
syntax as a local procedure call.
 Semantic transparency: The semantics (i.e. the meaning) of a remote
procedure call should be the same as those of a local procedure call.
Working of RPC:
There are 5 elements used in the working of RPC:
 Client
 Client Stub
 RPC Runtime
 Server Stub
 Server

 Client: The client process initiates the RPC. The client makes an ordinary
procedure call, which invokes the corresponding procedure in the client stub.
 Client Stub: Stubs are used by RPC to achieve semantic transparency.
The client calls the client stub, which performs the following tasks:
o On receiving a call request from the client, it packs (marshals)
the parameters and the specification of the remote/target
procedure into a message.
o On receiving the result of the procedure's execution, it unpacks
(unmarshals) the result values and passes them to the client.
 RPC Runtime: The RPC Runtime is in charge of message transmission
between client and server via the network. Retransmission,
acknowledgement, routing, and encryption are all tasks performed by it.
On the client side, it sends call request messages from the client stub to
the server machine, and it receives result messages from the server
machine and passes them to the client stub. On the server side, it
receives call request messages from the client machine and passes them
to the server stub, and it receives result messages from the server stub
and sends them to the client machine.
 Server Stub: The server stub performs the following tasks:
o It unpacks (unmarshals) the call request message received from
the local RPC Runtime and makes a regular call to invoke the
required procedure in the server.
o On receiving the result of the server's procedure execution, it
packs (marshals) the result into a message and asks the local RPC
Runtime to transmit it to the client stub, where it is unpacked.
 Server: After receiving a call request from the client machine, the server
stub passes it to the server. The server executes the required procedure
and returns the result to the server stub so that it can be passed to the
client machine using the local RPC Runtime.
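The interplay of the five elements can be sketched in a single process. This is only an illustration under stated assumptions: the classes Calculator, ServerStub, LoopbackRuntime, and ClientStub and the message layout (procedure name followed by two ints) are hypothetical, and the "runtime" simply hands the bytes over instead of using a network.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Server: the actual procedure to be executed.
class Calculator {
    int add(int a, int b) {
        return a + b;
    }
}

// Server stub: unmarshals the request, calls the server, marshals the result.
class ServerStub {
    private final Calculator server = new Calculator();

    byte[] dispatch(byte[] request) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(request));
            String procedure = in.readUTF();
            if (!procedure.equals("add")) {
                throw new IllegalArgumentException("unknown procedure: " + procedure);
            }
            int a = in.readInt();
            int b = in.readInt();
            int result = server.add(a, b); // regular local call on the server
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            new DataOutputStream(bytes).writeInt(result);
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// Stand-in for the RPC Runtime: a real one would send bytes over a network.
class LoopbackRuntime {
    private final ServerStub serverStub = new ServerStub();

    byte[] sendAndReceive(byte[] request) {
        return serverStub.dispatch(request);
    }
}

// Client stub: looks like a local procedure to the client.
class ClientStub {
    private final LoopbackRuntime runtime = new LoopbackRuntime();

    int add(int a, int b) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF("add"); // specification of the target procedure
            out.writeInt(a);     // marshalled parameters
            out.writeInt(b);
            byte[] reply = runtime.sendAndReceive(bytes.toByteArray());
            return new DataInputStream(new ByteArrayInputStream(reply)).readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The client simply writes new ClientStub().add(2, 3) and receives 5, without seeing the marshalling or the message exchange.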
RPC process:
 The client, the client stub, and one instance of the RPC Runtime all
run on the client machine.
 The client invokes the client stub by passing parameters in the usual
way. The client stub resides in the address space of the client.
 At this point, the user can access RPC by making a normal local
procedure call. The RPC Runtime is in charge of message transmission
between client and server via the network. Retransmission,
acknowledgement, routing, and encryption are all tasks performed by it.
 On the server side, after the server operation completes, the values are
returned to the server stub, which packs (marshals) the return values
into a message. The server stub then hands the message to the
transport layer.
 The transport layer transmits the resulting message to the client
transport layer, which passes it back to the client stub.
 The client stub unpacks (unmarshals) the return values from the
resulting packet, and execution returns to the caller at this point.
When the client process makes a request by calling a local procedure, that
procedure packs the arguments/parameters into a request message so that they
can be sent to the remote server. The remote server then executes the requested
procedure locally (based on the request that arrived from the client machine)
and, after execution, returns a response to the client in the form of a message.
Until then the client is blocked, but as soon as the response arrives from the
server side it can extract the result from the message. In some cases, RPCs can
also be executed asynchronously, in which case the client is not blocked while
waiting for the response.
Parameters can be passed in two ways: by value or by reference. Call by value
sends the actual values of the variables. In pass by reference, the function is
called with the addresses of the variables, so the parameters that receive
those addresses must be pointers.
The language designers are usually the ones who decide which parameter
passing method to use. It sometimes depends on the data type being passed. In
C, integers and other scalar types are always passed by value, whereas arrays
are effectively passed by reference, because an array argument is passed as a
pointer to its first element.
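The following sketch contrasts a local call, where the callee sees the caller's array, with a simulated remote call, where the argument is marshalled and the callee only sees a copy. It assumes Java object serialization as the marshalling mechanism; the class ParameterPassingDemo and its methods are hypothetical names introduced for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical demo class for local versus marshalled argument passing.
class ParameterPassingDemo {
    // The callee modifies the array it was given.
    static void incrementFirst(int[] values) {
        values[0]++;
    }

    // Simulates what happens to an argument on its way to a remote
    // procedure: it is marshalled into bytes and unmarshalled into a copy.
    static int[] marshalledCopy(int[] values) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(values);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (int[]) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] local = {1};
        incrementFirst(local);                 // caller's array is changed
        System.out.println("local: " + local[0]);

        int[] original = {1};
        int[] remoteCopy = marshalledCopy(original);
        incrementFirst(remoteCopy);            // only the copy is changed
        System.out.println("original after simulated remote call: " + original[0]);
    }
}
```

Because the client and server do not share an address space, an address sent in a message would be meaningless on the other side, which is why the marshalled argument behaves like a value even when the local language would pass a reference.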
Remote Method Invocation
Remote Method Invocation (RMI) is an API that allows an object to invoke a
method on an object that exists in another address space, which could be on the
same machine or on a remote machine. Through RMI, an object running in a
JVM present on a computer (Client-side) can invoke methods on an object
present in another JVM (Server-side). RMI creates a public remote server object
that enables client and server-side communications through simple method calls
on the server object.
Stub Object: The stub object on the client machine builds an information block
and sends this information to the server.
The block consists of:
 An identifier of the remote object to be used
 Method name which is to be invoked
 Parameters to the remote JVM
Skeleton Object: The skeleton object passes the request from the stub object to
the remote object. It performs the following tasks:
 It calls the desired method on the real object present on the server.
 It forwards the parameters received from the stub object to the method.
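The information block and the division of work between stub and skeleton can be sketched without any networking. This is not how the real RMI classes are implemented; SearchService, InvocationBlock, ToySkeleton, and ToyStub are hypothetical names, and the block is passed as an in-memory object rather than being sent to another JVM.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical service interface, standing in for a remote interface.
interface SearchService {
    String query(String search);
}

// The information block built by the stub.
class InvocationBlock {
    final String objectId;     // identifier of the remote object
    final String methodName;   // method to be invoked
    final String[] parameters; // parameters for the call

    InvocationBlock(String objectId, String methodName, String[] parameters) {
        this.objectId = objectId;
        this.methodName = methodName;
        this.parameters = parameters;
    }
}

// Skeleton: finds the real object and calls the requested method on it.
class ToySkeleton {
    private final Map<String, SearchService> objects = new HashMap<>();

    void export(String objectId, SearchService target) {
        objects.put(objectId, target);
    }

    String dispatch(InvocationBlock block) {
        SearchService target = objects.get(block.objectId);
        if (target == null) {
            throw new IllegalArgumentException("unknown object: " + block.objectId);
        }
        if (block.methodName.equals("query") && block.parameters.length == 1) {
            // Forward the parameters received from the stub to the method.
            return target.query(block.parameters[0]);
        }
        throw new IllegalArgumentException("unknown method: " + block.methodName);
    }
}

// Stub: implements the same interface and turns calls into blocks.
class ToyStub implements SearchService {
    private final ToySkeleton skeleton;
    private final String objectId;

    ToyStub(ToySkeleton skeleton, String objectId) {
        this.skeleton = skeleton;
        this.objectId = objectId;
    }

    @Override
    public String query(String search) {
        InvocationBlock block =
                new InvocationBlock(objectId, "query", new String[] { search });
        return skeleton.dispatch(block);
    }
}
```

The caller works only with the SearchService interface, so a ToyStub can be used wherever a local implementation would be used.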
Working of RMI
The communication between client and server is handled by two intermediate
objects: the Stub object (on the client side) and the Skeleton object (on the
server side), as shown in the figure below.

These are the steps to be followed, in order, to implement the remote
interface:
1. Defining a remote interface
2. Implementing the remote interface
3. Creating Stub and Skeleton objects from the implementation class using
rmic (RMI compiler)
4. Start the rmiregistry
5. Create and execute the server application program
6. Create and execute the client application program.
Step 1: Defining the remote interface
The first thing to do is to create an interface that describes the methods that
can be invoked by remote clients. This interface should extend the Remote
interface, and every method declared in the interface should declare that it
throws RemoteException.
Example:

// Creating a Search interface
import java.rmi.*;

public interface Search extends Remote
{
    // Declaring the method prototype
    public String query(String search) throws RemoteException;
}

Step 2: Implementing the remote interface


The next step is to implement the remote interface. To implement the remote
interface, the class should extend the UnicastRemoteObject class of the
java.rmi.server package. Also, a constructor needs to be declared that throws
java.rmi.RemoteException, because the parent class constructor can throw it.

// Java program to implement the Search interface
import java.rmi.*;
import java.rmi.server.*;

public class SearchQuery extends UnicastRemoteObject
                         implements Search
{
    // Default constructor to throw RemoteException
    // from its parent constructor
    SearchQuery() throws RemoteException
    {
        super();
    }

    // Implementation of the query method of the Search interface
    public String query(String search)
                        throws RemoteException
    {
        String result;
        if (search.equals("Reflection in Java"))
            result = "Found";
        else
            result = "Not Found";

        return result;
    }
}

Step 3: Creating Stub and Skeleton objects from the implementation class
using rmic
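The rmic tool invokes the RMI compiler, which creates the Stub and Skeleton
classes. Its syntax is rmic classname. For the above program, the following
command needs to be executed at the command prompt:
rmic SearchQuery
This step applies to older JDKs. Since Java 5, stubs for objects exported
through UnicastRemoteObject are generated dynamically at run time, and recent
JDK releases no longer include the rmic tool.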
Step 4: Start the rmiregistry
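Start the registry service by issuing the following command at the command
prompt (this form is for Windows; on Unix-like systems run rmiregistry &):
start rmiregistry
The server program in Step 5 creates its own registry on port 1900 with
createRegistry, so for this particular example a separately started registry is
not strictly required.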
Step 5: Create and execute the server application program
The next step is to create the server application program and execute it in a
separate command prompt.
 The server program uses the createRegistry method of the LocateRegistry
class to create an rmiregistry within the server JVM, with the port number
passed as an argument.
 The rebind method of the Naming class is used to bind the remote object to
a name, replacing any existing binding for that name.

// Java program for server application
import java.rmi.*;
import java.rmi.registry.*;

public class SearchServer
{
    public static void main(String args[])
    {
        try
        {
            // Create an object of the interface
            // implementation class
            Search obj = new SearchQuery();

            // Create the rmiregistry within the server JVM
            // on port number 1900
            LocateRegistry.createRegistry(1900);

            // Bind the remote object to the name geeksforgeeks
            Naming.rebind("rmi://localhost:1900" +
                          "/geeksforgeeks", obj);
        }
        catch (Exception ae)
        {
            System.out.println(ae);
        }
    }
}

Step 6: Create and execute the client application program


The last step is to create the client application program and execute it in a
separate command prompt. The lookup method of the Naming class is used to
get a reference to the Stub object.

// Java program for client application
import java.rmi.*;

public class ClientRequest
{
    public static void main(String args[])
    {
        String answer, value = "Reflection in Java";
        try
        {
            // lookup method to find reference of remote object
            Search access =
                (Search) Naming.lookup("rmi://localhost:1900" +
                                       "/geeksforgeeks");

            // Invoke the remote method through the stub
            answer = access.query(value);
            System.out.println("Article on " + value +
                               " " + answer + " at GeeksforGeeks");
        }
        catch (Exception ae)
        {
            System.out.println(ae);
        }
    }
}

Note: The above client and server programs are executed on the same machine, so
localhost is used. To access the remote object from another machine, replace
localhost with the IP address of the machine where the remote object is
present.
Save each file under the name of the class it contains:
Search.java, SearchQuery.java, SearchServer.java and ClientRequest.java
Important Observations:
1. RMI is a pure Java solution to Remote Procedure Calls (RPC) and is used
to create distributed applications in Java.
2. Stub and Skeleton objects are used for communication between the client
side and the server side.
