CC UNIT-1
Distributed Systems
1. Layered Architecture in Distributed Systems
In a layered architecture, the system is organized into layers, each with a specific responsibility; each layer interacts primarily with the layers directly above and below it.
Presentation Layer
o Function: Handles user interaction and presentation of data. It is
responsible for user interfaces and client-side interactions.
o Responsibilities: Rendering data, accepting user inputs, and
sending requests to the underlying layers.
Application Layer
o Function: Contains the business logic and application-specific
functionalities.
o Responsibilities: Processes requests from the presentation
layer, executes business rules, and provides responses back to
the presentation layer.
Middleware Layer
o Function: Facilitates communication and data exchange
between different components or services.
o Responsibilities: Manages message passing, coordination, and
integration of various distributed components.
Data Access Layer
o Function: Manages data storage and retrieval from databases or
other data sources.
o Responsibilities: Interacts with databases or file systems,
performs data queries, and ensures data integrity and
consistency.
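The flow between layers can be illustrated with a minimal Python sketch (hypothetical class and method names; the middleware layer is omitted for brevity):

    # Minimal layered-architecture sketch: each layer calls only the layer below it.
    class DataAccessLayer:
        def __init__(self):
            self._db = {"user:1": {"name": "Alice"}}   # stand-in for a real database
        def get(self, key):
            return self._db.get(key)

    class ApplicationLayer:
        def __init__(self, data_access):
            self.data_access = data_access
        def get_user_profile(self, user_id):
            record = self.data_access.get(f"user:{user_id}")
            # business rule: hide missing users behind a uniform response
            return record or {"error": "not found"}

    class PresentationLayer:
        def __init__(self, application):
            self.application = application
        def render_user(self, user_id):
            profile = self.application.get_user_profile(user_id)
            return f"User page: {profile}"

    ui = PresentationLayer(ApplicationLayer(DataAccessLayer()))
    print(ui.render_user(1))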
Advantages of Layered Architecture in Distributed Systems
Separation of Concerns: Each layer focuses on a specific aspect of the
system, making it easier to develop, test, and maintain.
Modularity: Changes in one layer do not necessarily affect others,
allowing for more flexible updates and enhancements.
Reusability: Layers can be reused across different applications or
services within the same system.
Scalability: Different layers can be scaled independently to handle
increased load or performance requirements.
Disadvantages of Layered Architecture in Distributed Systems
Performance Overhead: Each layer introduces additional overhead due
to data passing and processing between layers.
Complexity: Managing interactions between layers and ensuring proper
integration can be complex, particularly in large-scale systems.
Rigidity: The strict separation of concerns might lead to rigidity, where
changes in the system’s requirements could require substantial
modifications across multiple layers.
2. Peer-to-Peer (P2P) Architecture in Distributed Systems
In a P2P architecture, every node (peer) acts as both a client and a server, communicating directly with other peers without relying on a central server.
Key Features of Peer-to-Peer (P2P) Architecture in Distributed Systems
Decentralization
o Function: There is no central server or authority. Each peer
operates independently and communicates directly with other
peers.
o Advantages: Reduces single points of failure and avoids central
bottlenecks, enhancing robustness and fault tolerance.
Resource Sharing
o Function: Peers share resources such as processing power,
storage space, or data with other peers.
o Advantages: Increases resource availability and utilization
across the network.
Scalability
o Function: The network can scale easily by adding more peers.
Each new peer contributes additional resources and capacity.
o Advantages: The system can handle growth in demand without
requiring significant changes to the underlying infrastructure.
Self-Organization
o Function: Peers organize themselves and manage network
connections dynamically, adapting to changes such as peer
arrivals and departures.
o Advantages: Facilitates network management and resilience
without central coordination.
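A toy, in-process sketch of these ideas (hypothetical names, no real networking): each peer both shares resources and fetches them directly from other peers, with no central server involved:

    # Each peer offers resources and requests them from other peers (no central server).
    class Peer:
        def __init__(self, name):
            self.name = name
            self.files = {}          # resources this peer shares
            self.neighbours = []     # directly known peers
        def share(self, filename, data):
            self.files[filename] = data
        def fetch(self, filename):
            if filename in self.files:
                return self.files[filename]
            for peer in self.neighbours:      # ask neighbours directly
                if filename in peer.files:
                    return peer.files[filename]
            return None

    a, b = Peer("A"), Peer("B")
    a.neighbours.append(b)
    b.share("song.mp3", b"...bytes...")
    print(a.fetch("song.mp3"))   # A obtains the file directly from B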
Advantages of Peer-to-Peer (P2P) Architecture in Distributed
Systems
Fault Tolerance: The decentralized nature ensures that the failure of one
or several peers does not bring down the entire network.
Cost Efficiency: Eliminates the need for expensive central servers and
infrastructure by leveraging existing resources of the peers.
Scalability: Easily accommodates a growing number of peers, as each
new peer enhances the network’s capacity.
Disadvantages of Peer-to-Peer (P2P) Architecture in
Distributed Systems
Security: Decentralization can make it challenging to enforce security
policies and manage malicious activity, as there is no central authority to
oversee or control the network.
Performance Variability: The quality of services can vary depending on
the peers’ resources and their availability, leading to inconsistent
performance.
Complexity: Managing connections, data consistency, and network
coordination without central control can be complex and may require
sophisticated protocols.
3. Data-Centric Architecture in Distributed Systems
Data-Centric Architecture is an architectural style that focuses on the central
management and utilization of data. In this approach, data is treated as a
critical asset, and the system is designed around data management, storage,
and retrieval processes rather than just the application logic or user
interfaces.
The core idea of Data-Centric Architecture is to design systems where
data is the primary concern, and various components or services are
organized to support efficient data management and manipulation.
Data is centrally managed and accessed by multiple applications or
services, ensuring consistency and coherence across the system.
Key Principles of Data-Centric Architecture in Distributed Systems
Centralized Data Management:
o Function: Data is managed and stored in a central repository or
database, making it accessible to various applications and
services.
o Principle: Ensures data consistency and integrity by maintaining
a single source of truth.
Data Abstraction:
o Function: Abstracts the data from the application logic, allowing
different services or applications to interact with data through
well-defined interfaces.
o Principle: Simplifies data access and manipulation while hiding
the underlying complexity.
Data Normalization:
o Function: Organizes data in a structured manner, often using
normalization techniques to reduce redundancy and improve
data integrity.
o Principle: Enhances data quality and reduces data anomalies by
ensuring consistent data storage.
Data Integration:
o Function: Integrates data from various sources and systems to
provide a unified view and enable comprehensive data analysis.
o Principle: Supports interoperability and facilitates
comprehensive data analysis across diverse data sources.
Scalability and Performance:
o Function: Designs the data storage and management systems
to handle increasing volumes of data efficiently.
o Principle: Ensures the system can scale to accommodate
growing data needs while maintaining performance.
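A minimal sketch (hypothetical names, in Python) of two services sharing one central repository through a common data-access interface, the single source of truth:

    # One central repository; every service reads and writes through the same interface.
    class CentralRepository:
        def __init__(self):
            self._store = {}
        def write(self, key, value):
            self._store[key] = value
        def read(self, key):
            return self._store.get(key)

    class OrderService:
        def __init__(self, repo):
            self.repo = repo
        def place_order(self, order_id, item):
            self.repo.write(f"order:{order_id}", item)

    class ReportingService:
        def __init__(self, repo):
            self.repo = repo
        def order_report(self, order_id):
            return self.repo.read(f"order:{order_id}")

    repo = CentralRepository()
    OrderService(repo).place_order(42, "keyboard")
    print(ReportingService(repo).order_report(42))   # both services see the same data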
Advantages and Disadvantages of Data-Centric Architecture in Distributed Systems
Advantages:
o Consistency: Centralized data management helps maintain a
single source of truth, ensuring data consistency across the
system.
o Integration: Facilitates easy integration of data from various
sources, providing a unified view and enabling better decision-
making.
o Data Quality: Data normalization and abstraction help improve
data quality and reduce redundancy, leading to more accurate
and reliable information.
o Efficiency: Centralized management can optimize data access
and retrieval processes, improving overall system efficiency.
Disadvantages:
o Single Point of Failure: Centralized data repositories can
become a bottleneck or single point of failure, potentially
impacting system reliability.
o Performance Overhead: Managing large volumes of centralized
data can introduce performance overhead, requiring robust
infrastructure and optimization strategies.
o Complexity: Designing and managing a centralized data system
can be complex, especially when dealing with large and diverse
datasets.
o Scalability Challenges: Scaling centralized data systems to
accommodate increasing data volumes and access demands
can be challenging and may require significant infrastructure
investment.
4. Service-Oriented Architecture (SOA) in Distributed Systems
Service-Oriented Architecture (SOA) is a design paradigm in distributed
systems where software components, known as “services,” are provided and
consumed across a network. Each service is a discrete unit that performs a
specific business function and communicates with other services through
standardized protocols.
In SOA, the system is structured as a collection of services that are
loosely coupled and interact through well-defined interfaces. These
services are independent and can be developed, deployed, and managed
separately.
They communicate over a network using standard protocols such as
HTTP, SOAP, or REST, allowing for interoperability between different
systems and technologies.
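As a small illustration, a consumer can invoke a RESTful service over HTTP with nothing but standard protocols; the sketch below uses Python's standard library, and the URL is a placeholder rather than a real endpoint:

    # Consuming a (hypothetical) RESTful service over HTTP using only the standard library.
    import json
    import urllib.request

    def get_customer(customer_id):
        url = f"https://services.example.com/customers/{customer_id}"   # placeholder endpoint
        with urllib.request.urlopen(url) as response:                   # plain HTTP GET
            return json.loads(response.read().decode("utf-8"))          # well-defined JSON interface

    # print(get_customer(7))   # would only work against a real running service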
5. Event-Based Architecture in Distributed Systems
In an event-based (event-driven) architecture, components communicate by producing and consuming events: producers publish events and consumers react to them asynchronously, typically through a message broker or event bus, so producers and consumers remain decoupled.
Advantages and Disadvantages of Event-Based Architecture in Distributed Systems
Advantages:
o Scalability: Supports scalable and responsive systems by
decoupling event producers from consumers.
o Flexibility: Allows for dynamic and real-time processing of
events, adapting to changing conditions.
o Responsiveness: Enables systems to react immediately to
events, improving responsiveness and user experience.
Disadvantages:
o Complexity: Managing event flow, ensuring reliable delivery,
and handling event processing can be complex.
o Event Ordering: Ensuring correct processing order of events
can be challenging, especially in distributed systems.
o Debugging and Testing: Troubleshooting issues in an event-driven system can be difficult due to its asynchronous and distributed nature.
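A minimal in-process sketch of the publish/subscribe pattern behind event-based systems (illustrative names; production systems typically route events through a broker such as Kafka or RabbitMQ):

    # Producers publish events to a bus; consumers subscribe and react to them.
    class EventBus:
        def __init__(self):
            self._subscribers = {}
        def subscribe(self, event_type, handler):
            self._subscribers.setdefault(event_type, []).append(handler)
        def publish(self, event_type, payload):
            for handler in self._subscribers.get(event_type, []):
                handler(payload)          # the producer never knows who consumes the event

    bus = EventBus()
    bus.subscribe("order_placed", lambda order: print("Billing:", order))
    bus.subscribe("order_placed", lambda order: print("Shipping:", order))
    bus.publish("order_placed", {"id": 42})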
6. Client-Server Architecture in Distributed Systems
Key Principles of Client-Server Architecture in Distributed Systems
Separation of Concerns:
o Function: Clients handle user interactions and requests, while
servers manage resources, data, and business logic.
o Principle: Separates user interface and client-side processing
from server-side data management and processing, leading to a
clear division of responsibilities.
Centralized Management:
o Function: Servers centralize resources and services, making
them accessible to multiple clients.
o Principle: Simplifies resource management and maintenance by
concentrating them in one or more server locations.
Request-Response Model:
o Function: Clients send requests to servers, which process these
requests and send back responses.
o Principle: Defines a communication pattern where the client and
server interact through a well-defined protocol, often using HTTP
or similar standards.
Scalability:
o Function: Servers can be scaled to handle increasing numbers
of clients or requests.
o Principle: Servers can be upgraded or expanded to improve
performance and accommodate growing demand.
Security:
o Function: Security mechanisms are often implemented on the
server side to control access and manage sensitive data.
o Principle: Centralizes security policies and controls, making it
easier to enforce and manage security measures.
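A minimal request-response sketch using only Python's standard library (the port and response text are arbitrary choices): the server centralizes the resource, and the client sends a request and waits for the response.

    # Minimal client-server request/response over HTTP (standard library only).
    import threading
    import urllib.request
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):                          # server side: process request, send response
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"hello from the server")

    server = HTTPServer(("127.0.0.1", 8080), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # client side: send a request to the server and read the response
    with urllib.request.urlopen("http://127.0.0.1:8080/") as resp:
        print(resp.read().decode())
    server.shutdown()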
Advantages and Disadvantages of Client-Server Architecture in Distributed Systems
Advantages:
o Centralized Control: Easier to manage and update resources
and services from a central location.
o Simplified Maintenance: Updates and changes are made on
the server side, reducing the need for client-side modifications.
o Resource Optimization: Servers can be optimized for
performance and reliability, serving multiple clients efficiently.
o Security Management: Centralized security policies and
controls make it simpler to protect resources and data.
Disadvantages:
o Single Point of Failure: Servers can become a single point of
failure, impacting all connected clients if they go down.
o Scalability Challenges: Handling a large number of client
requests can overwhelm servers, requiring careful load
management and scaling strategies.
o Network Dependency: Clients depend on network connectivity
to access server resources, which can impact performance and
reliability.
o Performance Bottlenecks: High demand on servers can lead to
performance bottlenecks, requiring efficient resource
management and optimization.
Types of Distributed Systems:
1. Client-Server Systems:
o Clients request services, and servers provide them.
o Examples: Web servers, file servers.
2. Peer-to-Peer Systems:
o Every node acts as both a client and a server.
o Examples: File-sharing networks like BitTorrent.
3. Cluster Computing:
o A group of tightly coupled computers working together.
o Examples: High-performance computing (HPC) clusters.
4. Grid Computing:
o Large-scale resource sharing across geographically dispersed locations.
o Examples: SETI@home, Folding@home.
5. Cloud Computing:
o Services delivered over the internet using shared resources.
o Examples: AWS, Google Cloud.
Mainframe computing in the 1950s and the internet explosion in the 1990s came together to give rise to cloud computing. The term “cloud computing” gained popularity after businesses such as Amazon, Google, and Salesforce began offering web-based services in the early 2000s. The concept provides on-demand, internet-based access to computational resources, with the goal of delivering scalability, adaptability, and cost-effectiveness.
These days, cloud computing is pervasive, driving a wide range of services across markets and transforming how data is processed, stored, and retrieved.
Cloud computing architecture refers to the components and sub-components required for cloud computing. These components typically include:
1. Front end (fat client, thin client)
2. Back-end platforms (servers, storage)
3. Cloud-based delivery and a network (Internet, intranet, intercloud)
IaaS (Infrastructure as a Service)
IaaS provides users with virtualized computing resources over the internet. It offers a flexible, scalable environment for businesses to run applications without having to invest in or manage physical hardware.
Flexibility and Control: IaaS provides virtualized computing resources such as VMs, storage, and networks, giving users control over the operating system and applications.
Reduced Hardware Expenses: By eliminating the need to invest in physical infrastructure, IaaS delivers cost savings to businesses.
Scalability of Resources: Hardware resources can be scaled up or down on demand, balancing optimal performance with cost efficiency.
Amazon Web Services (AWS): Offers computing power, storage, and networking.
Microsoft Azure: Provides virtual machines, networking, and storage.
Google Cloud Platform (GCP): Offers scalable compute resources, storage, and networking.
IBM Cloud: Provides virtual servers, cloud storage, and Kubernetes services.
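As a hedged illustration, provisioning an IaaS virtual machine programmatically might look like the following sketch using boto3, AWS's Python SDK (the AMI ID is a placeholder, and configured AWS credentials and permissions are assumed):

    # Launch a single virtual machine on an IaaS platform (AWS EC2 via boto3).
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")   # region is an arbitrary choice
    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",   # placeholder AMI ID, not a real image
        InstanceType="t3.micro",
        MinCount=1,
        MaxCount=1,
    )
    print(response["Instances"][0]["InstanceId"])         # id of the newly provisioned VM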
PaaS (Platform as a Service)
PaaS provides a platform that enables developers to build, deploy, and manage applications
without dealing with the underlying infrastructure. It abstracts away hardware management and
focuses on application development.
Google App Engine: A fully managed platform for app development and hosting.
Microsoft Azure App Services: A cloud platform for building, deploying, and scaling web apps.
Heroku: A platform for building and deploying apps using several programming languages.
Red Hat OpenShift: A Kubernetes-based platform for building and scaling containerized
applications.
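For illustration, the developer's contribution to a PaaS deployment can be as small as the application code itself; the sketch below is a minimal Flask app of the kind platforms such as App Engine or Heroku can host (Flask is assumed to be installed; the platform supplies the runtime, scaling, and infrastructure):

    # Minimal application code for a PaaS platform; no server or OS management involved.
    from flask import Flask

    app = Flask(__name__)

    @app.route("/")
    def hello():
        return "Hello from a platform-managed app!"

    if __name__ == "__main__":
        app.run()   # run locally; in production the platform runs and scales the app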
SaaS (Software as a Service)
SaaS delivers software applications over the internet on a subscription basis. Users don’t need to
install or maintain the software on their devices as the application is hosted and managed by the
provider.
Collaboration and Accessibility: SaaS lets users access applications easily without requiring local installation. The software is fully managed by the provider and delivered as a service over the internet, encouraging effortless collaboration and ease of access.
Automation of Updates: SaaS providers handle software maintenance and push updates automatically, ensuring users always have the latest features and security patches.
Cost Efficiency: SaaS is a cost-effective solution because it reduces IT support overhead and eliminates the need for individual software licenses.
Google Workspace (formerly G Suite): Cloud-based productivity tools like Gmail, Docs,
Sheets, and Drive.
Microsoft 365: A suite of productivity tools such as Word, Excel, PowerPoint, and Teams.
Salesforce: A CRM platform for sales, marketing, and customer service.
Dropbox: A cloud storage and file-sharing platform.
Slack: A messaging and collaboration platform for teams.
FaaS (Function as a Service)
FaaS, often referred to as serverless computing, enables developers to run functions or pieces of code in response to events, without worrying about managing servers or the underlying infrastructure.
Event-Driven Execution: FaaS lets developers run code in response to events, without having to worry about maintaining servers or the underlying infrastructure.
Cost Efficiency: FaaS follows a pay-per-execution (“pay as you run”) model, charging only for the computing resources actually used.
Scalability and Agility: Serverless architectures scale effortlessly to handle varying workloads, promoting agility in development and deployment.
Examples: AWS Lambda, Google Cloud Functions, Microsoft Azure Functions, IBM Cloud
Functions
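For example, an AWS Lambda function written in Python is just a handler that the provider invokes once per event (the event fields shown are simplified assumptions):

    # A function-as-a-service handler: the provider runs it per event and bills per invocation.
    def lambda_handler(event, context):
        name = event.get("name", "world")        # event payload supplied by the trigger
        return {"statusCode": 200, "body": f"Hello, {name}!"}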
1. Public Cloud
Definition: A cloud environment provided by third-party cloud service providers and made
available to the public over the internet.
Examples: Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP).
Key Characteristics:
o Resources are shared among multiple users (multi-tenancy).
o Cost-effective as users pay only for what they use.
o Minimal setup and maintenance requirements.
Disadvantages of Public Cloud
Since multiple users share the same resources in a public cloud, it gives rise to data security and privacy concerns.
Reliability can be compromised because the same servers serve a wide range of users, which can lead to outages and malfunctions.
There are service and licensing limitations, because users are offered only general services that may be insufficient for complex IT tasks.
2. Private Cloud
Definition: A cloud environment dedicated to a single organization, either hosted on-premises or
by a third-party provider.
Advantages:
o It provides high security and data privacy, since only authorized users can access the resources.
o It offers high scalability and flexible deployment options that allow companies to customize their infrastructure as needed.
o It supports legacy systems that cannot be moved to the public cloud.
Disadvantages:
o It is not the right choice for small companies due to its cost: maintaining the infrastructure in-house requires staff training and significant spending on hardware and software.
o It offers only fixed scalability, determined by the choice of hardware.
3. Hybrid Cloud
Definition: A combination of public and private cloud environments, allowing data and
applications to be shared between them.
Advantages:
o One of the major advantages of a hybrid cloud is that it comes at a reasonable cost.
o It enhances the scalability and flexibility of resources.
o It offers improved security.
Disadvantages:
o Setting up a hybrid cloud is a complex process because two or more clouds must be integrated.
This model suits organizations that have multiple use cases and want to separate critical data from non-critical data.
4. Community Cloud
Definition: A cloud environment shared by a specific group of organizations with common goals
or compliance requirements.
1. On-demand self-service: Cloud computing services do not require human administrators; users themselves can provision, monitor, and manage computing resources as needed.
2. Broad network access: Computing services are provided over standard networks and are accessible from heterogeneous devices.
3. Rapid elasticity: IT resources can scale out and in quickly on an as-needed basis. Capacity is provided whenever the user requires it and is released as soon as the requirement ends.
4. Resource pooling: IT resources (e.g., networks, servers, storage, applications, and services) are pooled and shared across multiple applications and tenants in a non-dedicated manner, so multiple clients are served from the same physical resources.
5. Measured service: Resource utilization is tracked for each application and tenant, giving both the user and the provider an account of what has been used. This supports monitoring, billing, and effective use of resources.
6. Multi-tenancy: Cloud computing providers can support multiple tenants (users or organizations)
on a single set of shared resources.
7. Resilient computing: Cloud computing services are typically designed with redundancy and fault tolerance in mind, which ensures high availability and reliability.
8. Flexible pricing models: Cloud providers offer a variety of pricing models, including pay-per-use, subscription-based, and spot pricing, allowing users to choose the option that best suits their needs.
9. Security: Cloud providers invest heavily in security measures to protect their users’ data and ensure the privacy of sensitive information.
10. Automation: Cloud computing services are often highly automated, allowing users to deploy and manage resources with minimal manual intervention.
11. Sustainability: Cloud providers are increasingly focused on sustainable practices, such as energy-efficient data centers and the use of renewable energy sources, to reduce their environmental impact.
Advantages of Cloud Computing
1. Cost Efficiency:
o Reduced Capital Expenditure: With cloud computing, businesses don’t need to invest
in expensive hardware and infrastructure. Instead, they can pay for services on a
subscription or usage basis, reducing upfront costs.
o Pay-as-you-go Model: Users pay only for the resources they use, which can help control
costs, particularly for small to medium-sized businesses that don’t need constant
resources.
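A toy illustration of the pay-as-you-go idea in Python (the rates below are made-up assumptions, not a real price list):

    # Pay-as-you-go: cost is usage multiplied by rate, with nothing paid for idle capacity.
    hours_used = 200            # the VM ran 200 hours this month
    rate_per_hour = 0.10        # assumed $0.10 per VM-hour
    storage_gb = 50             # 50 GB stored
    rate_per_gb_month = 0.02    # assumed $0.02 per GB-month
    monthly_bill = hours_used * rate_per_hour + storage_gb * rate_per_gb_month
    print(f"Monthly bill: ${monthly_bill:.2f}")   # $21.00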
Limitations and Disadvantages of Cloud Computing
1. Dependence on the Internet: Access to cloud services depends on the availability of secure and fast internet connectivity, making offline use nearly impossible.
2. Security Issues: While most cloud services are highly secure and reliable, there is always an underlying threat from cyber criminals who may gain access to the data.
3. Limited Expertise: There is huge demand for qualified and skilled cloud professionals, and organisations struggle to keep up with cloud technology.
4. Dependency on Vendors: Organisations opting for cloud services become dependent on the vendors providing them, which may not always suit their organisational structure.
5. Hidden Costs: While cloud services are cost-effective, some vendors may charge hidden fees that become an additional burden to the user.
o Over-Provisioning: Companies might inadvertently scale their cloud resources higher than necessary, leading to increased costs. If resources are not carefully managed, businesses could end up paying for unused capacity.
o Service Complexity: Some cloud services can be complex and require additional training or professional services to manage effectively, which might increase operational costs.
6. Legal Compliance: Some countries have stringent laws and regulations regarding cloud technology, and some cloud service vendors may not be fully compliant with such rules.
LEGAL ISSUES
Cloud computing introduces several legal issues that businesses, users, and providers must
carefully consider. These issues often vary depending on the jurisdiction, type of cloud service,
and data being handled. Key legal challenges include:
Compliance with Privacy Laws: Cloud providers and users must comply with regulations like
GDPR (EU), CCPA (California), HIPAA (US healthcare), etc. Non-compliance can lead to fines
and reputational damage.
Cross-border Data Transfers: Data stored in the cloud might be transferred to or accessed from
multiple jurisdictions, raising issues about conflicting data protection laws.
Data Ownership: It's essential to clarify who owns the data stored in the cloud to avoid disputes.
Applicable Law: Cloud services often operate globally, leading to confusion about which
country's laws govern the contract or data.
Litigation and Disputes: Resolving disputes can be complex when data and operations span
multiple jurisdictions.
Ownership of Uploaded Content: Users must ensure they retain rights to their data or content
stored in the cloud.
Copyright Infringement: Cloud platforms might be held liable for hosting copyrighted content
uploaded by users.
4. Security and Liability
5. Regulatory Compliance
Sector-specific Regulations: Industries like finance and healthcare have stricter compliance
requirements. Cloud providers must tailor their services accordingly.
Audits and Oversight: Ensuring regulatory bodies have access to necessary data or logs can be
challenging in cloud environments.
Portability Issues: Migrating data from one cloud provider to another can be costly or
technically difficult, raising concerns about vendor lock-in.
Exit Strategies: Contracts should address data retrieval and deletion after termination.
7. Contractual Clarity
Standardized Agreements: Many cloud providers use standardized contracts that may not
address specific needs or concerns of businesses.
Indemnification Clauses: Ensuring that liability is appropriately assigned in case of disputes or
third-party claims is critical.
Law Enforcement Requests: Governments may require access to data stored in the cloud, even
if stored outside their jurisdiction (e.g., CLOUD Act in the US).
Transparency: Users need clarity on how providers handle such requests and protect user data.
Retention Policies: Legal requirements for data retention may conflict with user expectations or privacy.
Virtualization in Cloud Computing
Virtualization uses a software layer called a hypervisor to create and manage virtual machines.
The hypervisor abstracts the underlying hardware resources (CPU, memory, storage, and
network) and allocates them to multiple VMs, each running its own operating system and
applications.
Types of Virtualization
1. Server Virtualization
o Divides a physical server into multiple VMs.
o Example: VMware ESXi, Microsoft Hyper-V.
o Benefits: Better resource utilization, reduced hardware costs, and easier server
management.
2. Storage Virtualization
o Combines physical storage from multiple devices into a single virtual storage pool.
o Example: Software-defined storage solutions like VMware vSAN.
o Benefits: Enhanced storage management, scalability, and flexibility.
3. Network Virtualization
o Creates a virtual network that abstracts physical network components.
o Example: Software-defined networking (SDN).
o Benefits: Improved network control, dynamic provisioning, and cost-efficiency.
4. Desktop Virtualization
o Enables users to access their desktop environments from anywhere.
o Example: Virtual Desktop Infrastructure (VDI) platforms.
o Benefits: Centralized management and enhanced security.
5. Application Virtualization
o Runs applications in a virtual environment separate from the underlying OS.
o Example: Docker containers, Kubernetes orchestration.
o Benefits: Consistent performance, portability, and simplified deployment.
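As a sketch of application virtualization (item 5 above), the Docker SDK for Python can run a program inside an isolated container. This assumes the docker Python package is installed and a local Docker daemon is running:

    # Run a short-lived program inside a container, isolated from the host OS.
    import docker

    client = docker.from_env()                     # connect to the local Docker daemon
    logs = client.containers.run(
        "python:3.11-slim",                        # container image providing the runtime
        ["python", "-c", "print('hello from an isolated container')"],
        remove=True,                               # clean up the container afterwards
    )
    print(logs.decode().strip())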
Benefits of Virtualization
1. Resource Optimization
o Improves utilization of hardware resources by allocating them dynamically.
2. Scalability
o Allows cloud providers to scale resources up or down based on demand.
3. Isolation
o VMs operate independently, ensuring that one VM's issues do not affect others.
4. Disaster Recovery
o Virtualized environments are easier to back up and restore.
5. Flexibility
o Supports diverse operating systems and applications on a single physical server.
6. Environment Testing
o Developers can create isolated environments to test applications without affecting production systems.
Challenges of Virtualization
1. Performance Overhead
o Virtualization adds a layer of abstraction, which can slightly reduce performance
compared to bare-metal systems.
2. Security Risks
o Hypervisor vulnerabilities can lead to security breaches, such as VM escape attacks.
3. Complexity
o Managing a virtualized infrastructure can require specialized skills and tools.
4. Licensing Costs
o Commercial virtualization software can be expensive.
VMware Cloud
o Offers enterprise-grade virtualization solutions for private and hybrid clouds.
Grid Computing
Grid computing is a type of distributed computing that involves connecting and utilizing
geographically dispersed and heterogeneous resources to work together as a unified system. It is
often used for large-scale computations, where tasks are broken into smaller chunks and
distributed across multiple systems for parallel processing.
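The divide-and-conquer pattern can be illustrated locally with a process pool standing in for remote grid nodes; a real grid would dispatch the chunks to remote machines through middleware such as the Globus Toolkit (names below are illustrative):

    # Split a large task into chunks, process them in parallel, then combine the results.
    from concurrent.futures import ProcessPoolExecutor

    def process_chunk(chunk):
        return sum(x * x for x in chunk)           # stand-in for a heavy computation

    if __name__ == "__main__":
        data = list(range(1_000_000))
        chunks = [data[i:i + 250_000] for i in range(0, len(data), 250_000)]
        with ProcessPoolExecutor() as pool:        # each chunk could run on a different node
            partial_results = list(pool.map(process_chunk, chunks))
        print(sum(partial_results))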
Key Characteristics of Grid Computing:
1. Resource Sharing:
o Grid computing allows sharing of computational power, data storage, and software
resources across a network.
2. Heterogeneity:
o Involves diverse resources (hardware and software) that work together.
3. Geographical Distribution:
o Resources can be located anywhere in the world.
4. Parallel Processing:
o Tasks are divided into smaller subtasks and executed simultaneously across multiple
nodes.
5. Fault Tolerance:
o Built-in mechanisms to handle node failures by redistributing tasks.
6. Scalability:
o The system can grow by adding more resources to the grid.
7. Decentralization:
o No central control; tasks and resources are distributed.
Components of Grid Computing:
1. Resources:
o Computing nodes, storage devices, and specialized hardware like GPUs.
2. Grid Middleware:
o Software that manages resource allocation, job scheduling, and communication between
nodes.
o Examples: Globus Toolkit, Apache Hadoop.
3. Grid Nodes:
o Individual computers or servers connected to the grid.
4. Users:
o Individuals or organizations leveraging the grid for computational tasks.
5. Network:
o The communication infrastructure connecting all nodes.
Types of Grids:
1. Computational Grid:
o Focused on providing high computational power for intensive tasks.
o Example: Simulating climate models.
2. Data Grid:
o Optimized for storing and managing large datasets.
o Example: CERN's LHC Data Grid.
3. Collaborative Grid:
o Supports real-time collaboration and resource sharing among users.
o Example: Virtual laboratories for scientific research.
4. Access Grid:
o Focused on facilitating virtual meetings and distributed group communication.
o Example: Distributed conferencing platforms.
Applications of Grid Computing:
1. Scientific Research:
o Particle physics simulations, genome sequencing, and climate modeling.
2. Engineering:
o Structural analysis, computational fluid dynamics, and material simulations.
3. Financial Analysis:
o Risk analysis, algorithmic trading, and economic forecasting.
4. Healthcare:
o Drug discovery, medical imaging, and protein folding simulations.
Advantages of Grid Computing:
1. Cost Efficiency:
o Utilizes existing resources, reducing the need for expensive supercomputers.
2. Flexibility:
o Combines resources from different platforms and locations.
3. High Performance:
o Achieves faster processing by parallelizing tasks.
4. Scalability:
o Easily adds new nodes to meet increased demands.
5. Fault Tolerance:
o Redundant resources ensure reliability even if some nodes fail.
Cluster Computing
Cluster computing refers to a set of computers, called nodes, that work together as a single
system to perform tasks. These nodes are tightly coupled and interconnected through a high-
speed local area network (LAN). Cluster computing is designed to deliver improved
performance, scalability, and fault tolerance for computational tasks by utilizing the collective
resources of multiple systems.
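On real clusters this cooperation is commonly expressed with MPI. The sketch below uses mpi4py (an MPI implementation and the mpi4py package are assumed; it would be launched with something like mpirun -n 4 python cluster_sum.py):

    # Each node (MPI rank) computes a partial result; rank 0 gathers and combines them.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()                 # this node's id within the job
    size = comm.Get_size()                 # number of nodes (ranks) in the job

    local_sum = sum(range(rank, 1_000_000, size))     # each rank handles a slice of the work
    total = comm.reduce(local_sum, op=MPI.SUM, root=0)

    if rank == 0:
        print("Total computed across", size, "ranks:", total)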
Storage Clusters:
o Focused on providing large-scale, reliable data storage.
o Example: Distributed file systems like Ceph or GlusterFS.
Key Characteristics of Cluster Computing:
1. Parallel Processing:
o Tasks are divided into smaller chunks and executed concurrently across multiple nodes.
2. High-Speed Interconnect:
o Uses low-latency, high-bandwidth networks like InfiniBand or 10/40/100 Gigabit
Ethernet to ensure fast communication between nodes.
3. Scalability:
o Designed to scale horizontally by adding more nodes to accommodate larger workloads.
4. Centralized Storage:
o Shared file systems like Lustre or GPFS ensure efficient data access across nodes.
5. Specialized Hardware:
o Often includes Graphics Processing Units (GPUs), Field Programmable Gate Arrays
(FPGAs), or specialized accelerators for intensive computations.
6. Fault Tolerance:
o Redundancy and error-handling mechanisms ensure that the system continues operating
in the event of node failures.
Key Components of a Cluster:
1. Compute Nodes:
o Perform the actual computation. Each node typically consists of multiple CPUs or GPUs.
3. Interconnect Network:
o High-speed communication links between nodes for efficient data transfer and
synchronization.
4. Storage:
o High-performance, centralized storage systems for managing and accessing large
datasets.
6. Operating System:
o Most HPC clusters run Linux-based operating systems optimized for performance and
scalability.
Applications of Cluster Computing:
2. Engineering Simulations:
o Computational fluid dynamics (CFD), structural analysis, and material simulations.
4. Finance:
o Risk analysis, financial modeling, and market simulations.
Advantages of Cluster Computing:
1. High Performance:
o Capable of solving complex problems in significantly less time.
2. Cost-Effectiveness:
o Uses commodity hardware to achieve supercomputer-like performance.
3. Scalability:
o Can handle increasing workloads by adding more nodes.
4. Customizability:
o Tailored to specific computational requirements.
5. Resource Optimization:
o Efficient use of resources through parallel processing.
Examples of Cluster Computing Systems:
1. Supercomputers:
o Summit (USA), Fugaku (Japan).
2. HPC Systems:
o Cray XC40, IBM Power Systems.