
Unit 1: Introduction to Parallel Computing

What is Parallel Computing?

Parallel computing involves multiple processors working together at the same time to solve a
problem. It breaks large problems into smaller, independent parts that are processed
simultaneously. After processing, the results are combined to complete the task.

● Key Features of Parallel Computing:
○ Splits problems into smaller, independent parts.
○ Uses multiple processors that work at the same time.
○ Communicates through shared memory or by passing messages.
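
To make the split-process-combine idea concrete, here is a minimal sketch in Python (an illustration added to these notes, not part of the original material) that divides a list of numbers into chunks, sums the chunks on separate processes, and combines the partial results. The worker count and chunk size are arbitrary choices for the example.

```python
# Minimal illustration of the split -> compute in parallel -> combine pattern.
from concurrent.futures import ProcessPoolExecutor

def partial_sum(chunk):
    # Each worker processes one independent piece of the problem.
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    n_workers = 4
    chunk_size = len(data) // n_workers
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, chunks))  # parts processed simultaneously

    total = sum(partials)  # combine the results to complete the task
    print(total)
```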

Types of Parallel Computing:

1. Bit-level parallelism:
○ Reduces the number of instructions needed by increasing the size of data the
processor can handle in one step.
2. Instruction-level parallelism:
○ Hardware-based: The processor decides which instructions to run in parallel
during runtime.
○ Software-based: The compiler (software) decides which instructions to run in
parallel.
3. Task parallelism:
○ Runs different tasks at the same time on the same data using multiple processors (see the sketch after this list).
4. Superword-level parallelism:
○ Groups similar instructions into one operation to perform tasks faster.
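
The following is a rough sketch of task parallelism (an illustrative Python example, not from the notes): two different tasks run at the same time on the same data. Worker threads stand in for separate processors here only to keep the sketch short; a real task-parallel system would schedule the tasks on different cores or machines.

```python
# Task parallelism sketch: different tasks, same data, executed concurrently.
from concurrent.futures import ThreadPoolExecutor

def compute_mean(values):
    return sum(values) / len(values)

def compute_span(values):
    return max(values) - min(values)

if __name__ == "__main__":
    data = [3, 1, 4, 1, 5, 9, 2, 6]
    with ThreadPoolExecutor(max_workers=2) as pool:
        mean_future = pool.submit(compute_mean, data)   # task 1
        span_future = pool.submit(compute_span, data)   # task 2
    print(mean_future.result(), span_future.result())
```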

Types of Parallel Applications:

● Fine-grained parallelism: Subtasks communicate frequently (many times per second).


● Coarse-grained parallelism: Subtasks communicate less often.
● Embarrassing parallelism: Subtasks rarely or never communicate.
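
A hedged sketch of an embarrassingly parallel workload (illustrative Python, not from the notes): each subtask below depends only on its own input and never communicates with the others, so the jobs can be farmed out with no coordination at all.

```python
# Embarrassingly parallel sketch: independent subtasks, no communication.
import random
from multiprocessing import Pool

def simulate(seed):
    # Stand-in for one independent job (e.g. one simulation run per seed).
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(100_000))

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        results = pool.map(simulate, range(8))  # 8 jobs that never talk to each other
    print(results)
```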

What is a Parallel Computer?

A parallel computer is a system with multiple processors working together to solve a problem.

● Examples:
○ Supercomputers with hundreds or thousands of processors.
○ Workstations with multiple processors.
○ Embedded systems.
● How It Works:
○ Processors are connected and exchange data.
○ They work together on the same problem.

Parallel vs. Distributed Systems:

● Parallel Systems:
○ Processors are close together and solve one problem jointly.
● Distributed Systems:
○ Processors are spread across a large area and work on separate tasks.
○ Main goals:
■ Use all available resources.
■ Share information over a network.

Parallel Computer Models:

1. Single Machine Model:


○ Based on the Von Neumann architecture.
○ One CPU connected to memory, running instructions one at a time.
○ Also called a sequential machine.
2. Multicomputer Model:
○ A network of multiple computers (each with its CPU and memory).
○ Computers communicate by sending messages.
○ Each computer runs its own program but can share data through the network.
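
The message-passing style of the multicomputer model can be sketched as follows (a simplified Python illustration; a real multicomputer exchanges messages over a network, whereas this toy uses two local processes connected by a pipe).

```python
# Toy message-passing sketch: two "computers" (processes) with private memory
# that cooperate only by sending and receiving messages.
from multiprocessing import Process, Pipe

def worker(conn):
    numbers = conn.recv()     # receive a message containing the work to do
    conn.send(sum(numbers))   # send the result back as another message
    conn.close()

if __name__ == "__main__":
    parent_conn, child_conn = Pipe()
    p = Process(target=worker, args=(child_conn,))
    p.start()
    parent_conn.send([1, 2, 3, 4])  # no shared memory: data travels as a message
    print(parent_conn.recv())       # -> 10
    p.join()
```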

Classes of Parallel Computer Architecture

Parallel computers can be grouped based on various features like the type of processors, how
they connect with each other, how they control operations, and how they handle input/output
tasks.

Types of Parallel Computers:

1. Distributed-Memory MIMD Computer:


○ MIMD stands for Multiple Instructions, Multiple Data.
○ Each processor can execute its own instructions using its own local data.
○ Distributed memory means each processor has its own memory, instead of
using one shared memory.
○ The cost of sending data between processors depends on their distance and the
network's traffic.
○ Examples: IBM SP, Intel Paragon, Thinking Machines CM5.
2. Shared-Memory MIMD Computer:
○ Also called multiprocessor systems.
○ All processors share a common memory (via a bus or multiple buses).
○ These systems are efficient for message passing, as the shared memory allows quick communication between processors.
○ Examples: Silicon Graphics Challenge, Sequent Symmetry, multiprocessor
workstations.
○ Memory Hierarchy: Often, frequently used data is stored in a faster cache close
to each processor for quick access.
3. SIMD (Single Instruction, Multiple Data) Computer:
○ All processors execute the same instruction but on different pieces of data.
○ This is useful for problems like image processing and numerical simulations,
where tasks are similar.
○ Example: MasPar MP-1 and MP-2.

Concepts and Terminology

Von Neumann Architecture

● First introduced by John von Neumann in 1945.


● It's also known as the Princeton architecture.
● Key Features:
○ Control Unit (manages operations)
○ Arithmetic and Logic Unit (ALU) (performs calculations)
○ Memory Unit (stores data and instructions)
○ Registers (temporary storage)
○ Inputs/Outputs (communication with outside devices)
● How It Works:
○ Both data and instructions are stored as binary digits in the same memory.
○ Instructions are fetched from memory one by one and executed in order.
○ This process repeats until there are no more instructions to execute.
● Common Today: This architecture is the foundation of most computers still used today.

Central Processing Unit (CPU)

The CPU is the part of the computer that executes program instructions. It is sometimes called
the microprocessor. The CPU has three main components:

● ALU (Arithmetic and Logic Unit): Does calculations and logical operations.
● CU (Control Unit): Controls the ALU, memory, and input/output devices. It makes sure
everything in the computer works together.
● Registers: Fast storage areas in the CPU where data is kept temporarily before being
processed.
Registers:

1. MAR (Memory Address Register): Holds the location of data in memory.


2. MDR (Memory Data Register): Holds data that is being transferred to/from memory.
3. AC (Accumulator): Stores results from arithmetic and logic operations.
4. PC (Program Counter): Holds the address of the next instruction to execute.
5. CIR (Current Instruction Register): Holds the current instruction being processed.

Arithmetic and Logic Unit (ALU)

The ALU performs math operations (like addition and subtraction) and logical operations (like
AND, OR, NOT).

Control Unit (CU)

The Control Unit directs the operations of the ALU, memory, and input/output devices. It tells
these components how to respond to the program instructions it reads from memory. It also
provides the timing and control signals needed for other parts of the computer to work together.

Buses

Buses are pathways that carry data between the CPU, memory, and input/output devices.

● Address Bus: Carries the memory addresses between the processor and memory.
● Data Bus: Carries data between the processor, memory, and input/output devices.
● Control Bus: Carries control signals from the CPU and status signals to coordinate the
computer’s activities.

Memory Unit

The Memory Unit consists of RAM (Random Access Memory), which is fast and directly
accessed by the CPU. It is used to store data temporarily while the CPU works with it. RAM is
divided into sections, each with a unique address. Data from permanent storage (like a hard
drive) is loaded into RAM so that the CPU can work faster.
Flynn’s Classical Taxonomy

Flynn's Taxonomy is a classification system for computer architectures, created by Michael J. Flynn in 1966. It classifies computers based on two factors: the number of instruction streams (commands) and the number of data streams (information). Each factor can be either single or multiple.

Flynn’s Taxonomy Classes:

1. SISD (Single Instruction stream, Single Data stream): A single instruction is executed
on a single data stream (like a traditional computer).
2. SIMD (Single Instruction stream, Multiple Data stream): The same instruction is
applied to multiple data streams at the same time (useful for tasks like image
processing).

3. MISD (Multiple Instruction Stream, Single Data Stream)

● MISD means multiple instructions are executed, but only one data stream is used.
● This setup is rare and not commonly used in modern computing.
● It would involve different instructions working on the same data at different times, but this
setup isn't efficient in practice.

4. MIMD (Multiple Instruction Stream, Multiple Data Stream)

● MIMD means multiple instructions are executed at the same time, each on different data
streams.
● This architecture is widely used in parallel computing and is common in modern
supercomputers and multi-core processors.

Single Instruction Stream, Single Data Stream (SISD)

● SISD is a traditional, sequential computer that processes instructions one at a time.


● It has one control unit (CU) that fetches one instruction from memory, and then that
instruction is executed on one data stream.
● Examples: Old mainframes, minicomputers, workstations, single-core PCs.
● Characteristics of SISD:
○ Non-parallel (Serial processing).
○ Single instruction: One instruction is processed at a time.
○ Single data: One piece of data is processed at a time.
○ Deterministic execution: The same results every time.
Single Instruction Stream, Multiple Data Streams (SIMD)

● SIMD means one instruction is applied to multiple pieces of data at the same time.
● SIMD can be executed using techniques like pipelining or parallelism (multiple units
processing data simultaneously).
● Flynn divided SIMD into three types:
1. Array Processor:
○ All processing units receive the same instruction.
○ Each unit has its own memory and registers.
○ Modern version: SIMT (Single Instruction, Multiple Threads).
2. Pipelined Processor:
○ All units receive the same instruction.
○ They process pieces of data sequentially from a central resource (like memory or
registers).
○ Packed SIMD: A type where each unit processes a piece of data and writes it
back to memory.
3. Associative Processor:
○ All units receive the same instruction, but each unit makes its own decision
based on local data whether to execute or skip the instruction.
○ Modern name: Predicated (or Masked) SIMD.
○ Example: GPUs today use features from more than one of these types (SIMT
and Associative processing).

SIMD (Single Instruction, Multiple Data)

● SIMD involves executing one instruction on multiple data elements simultaneously.


● Characteristics:
○ Single Instruction: All processing units execute the same instruction.
○ Multiple Data: Each processing unit operates on different data elements.
○ Best suited for: Specialized problems like graphics or image processing.
○ Execution: Synchronous (lockstep) and deterministic.
○ Two varieties:
■ Processor Arrays: Each processor has its own memory.
■ Vector Pipelines: Multiple stages process the data sequentially.
○ Examples:
■ Processor Arrays: Thinking Machines CM-2, MasPar MP-1 & MP-2,
ILLIAC IV.
■ Vector Pipelines: IBM 9000, Cray X-MP, Y-MP, C90, Fujitsu VP, NEC
SX-2.
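
As a loose illustration of the SIMD idea (assuming the third-party NumPy package; this is an analogy added to the notes, not one of the machines listed above), a single array expression applies the same arithmetic to many data elements at once, and NumPy builds commonly map such operations onto the CPU's vector/SIMD instructions.

```python
# SIMD-style thinking: one operation applied to many data elements in lockstep.
import numpy as np

a = np.arange(8, dtype=np.float32)      # [0, 1, 2, ..., 7]
b = np.full(8, 2.0, dtype=np.float32)   # [2, 2, 2, ..., 2]

c = a * b + 1.0   # the same multiply-add is applied element-wise to all 8 lanes
print(c)          # [ 1.  3.  5.  7.  9. 11. 13. 15.]
```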

MISD (Multiple Instruction Streams, Single Data Stream)


● MISD involves multiple instructions being executed on a single data stream.
● Characteristics:
○ Multiple Instructions: Each processor has its own instruction stream, working
on the same data.
○ Single Data: One data stream is processed by all units.
○ Rare Architecture: Few examples exist. Mainly used for fault tolerance.
○ Examples:
■ Space Shuttle Flight Control Computer.
○ Potential Use Cases:
■ Multiple frequency filters working on one signal.
■ Cryptography algorithms working on a single coded message.

MIMD (Multiple Instruction Streams, Multiple Data Streams)

● MIMD involves multiple processors executing different instructions on different data streams.
● Characteristics:
○ Multiple Instructions: Each processor executes different instructions.
○ Multiple Data: Each processor operates on different data.
○ Execution: Can be synchronous or asynchronous, deterministic or
non-deterministic.
○ Most Common: Widely used in modern supercomputers, networked clusters,
multi-core PCs.
○ Examples:
■ Supercomputers, multi-core processors, multi-core PCs, networked
parallel clusters.
○ Note: Many MIMD systems also have SIMD execution sub-components.

Parallel Computing Terminology

● CPU: A modern CPU may have multiple cores, each with its own instruction stream.
Cores may be organized in sockets, and there is usually memory sharing across
sockets.
● Node: A node is a standalone computer or unit with multiple CPUs/cores, memory, and
network interfaces. Nodes are often networked together in supercomputers.
● Task: A task is a unit of computational work, often in the form of a program or set of
instructions. Parallel programs involve multiple tasks running on multiple processors.
● Pipelining: Dividing a task into steps processed by different units, similar to an
assembly line. Inputs flow through each step, creating parallelism.
● Shared Memory: All processors have direct access to common physical memory, and
parallel tasks can directly access and modify memory locations.
● Symmetric Multi-Processor (SMP): A shared memory architecture where multiple
processors have equal access to all resources, like memory and disk.
● Distributed Memory: Each processor has local memory, and tasks can only access
their local memory. Communication is needed to access memory on other machines.
● Communications: Parallel tasks often need to exchange data. This can be done via
shared memory or through network communication.

Synchronization

● Definition: The coordination of parallel tasks to ensure they execute in a desired sequence.
● Impact: It can increase wall-clock execution time because tasks may have to wait.
● Commonly Associated With: Task coordination and communications between tasks.
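
A minimal synchronization sketch (illustrative Python using only the standard library): a lock forces tasks to take turns updating shared state, so some tasks wait, which is exactly the wall-clock cost mentioned above.

```python
# Synchronization sketch: a lock coordinates tasks that share one counter.
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        with lock:          # tasks wait here for their turn (synchronization cost)
            counter += 1

threads = [threading.Thread(target=add_many, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 400000; without the lock, lost updates could corrupt the result
```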

Computational Granularity

● Definition: A measure of the ratio of computation to communication.


● Types:
○ Coarse: Large computational work done between communications.
○ Fine: Small computational work done between communications.

Observed Speedup

● Definition: A simple metric to measure the performance of a parallel program.


● Formula: Speedup = (wall-clock time of serial execution) / (wall-clock time of parallel execution).
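
A short worked example with made-up numbers (a hypothetical illustration, not measurements from the notes):

```python
# Hypothetical timings: the same program measured serially and in parallel.
serial_time = 120.0    # wall-clock seconds, serial execution
parallel_time = 30.0   # wall-clock seconds, parallel execution

speedup = serial_time / parallel_time
print(speedup)  # 4.0 -> the parallel version is observed to run 4x faster
```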

Parallel Overhead

● Definition: The additional execution time required for managing parallel tasks, which is
not related to the useful computation.
● Includes:
○ Task start-up time
○ Synchronization
○ Data communication
○ Software overhead from libraries, operating systems, etc.
○ Task termination time
Massively Parallel

● Definition: Refers to hardware with a very large number of processing elements.


● Current Scale: The largest systems have hundreds of thousands to millions of
processing elements.

Embarrassingly Parallel

● Definition: A class of parallel tasks that are highly independent and require little to no
coordination between tasks.
● Examples: Simultaneously solving similar independent tasks like data processing or
simulations.

Scalability

● Definition: The ability of a parallel system to show a proportional increase in speedup with the addition of more resources (like more processors).
● Factors Affecting Scalability:
○ Hardware (memory-CPU bandwidth)
○ Application algorithms
○ Parallel overhead
○ Application-specific characteristics

Uses, Limitations, and Costs of Parallel Computing

Uses of Parallel Computing

● Primary Goal: To increase computation power and solve problems faster by distributing
work across multiple processors.
● Advantages:
○ Reduces time and cost by enabling simultaneous task processing.
○ Solves larger problems that serial computing cannot handle.
○ Efficient use of non-local resources (e.g., cloud resources or the Internet).
○ Improves overall hardware utilization, reducing waste in computing power.
○ Essential for managing large datasets and real-time dynamic simulations.

Applications of Parallel Computing


1. Real-Time Simulation:
○ Engineering: Aircraft design, motor controller design, space robotics.
○ Computer Gaming: For simulation in real time.
2. Science & Engineering:
○ Modeling complex problems in physics, biosciences, chemistry, geology,
mechanical and electrical engineering, etc.
3. Industrial & Commercial:
○ Data analysis, AI, oil exploration, web search, financial modeling, medical
imaging, image processing, and advanced graphics.
4. Global Applications:
○ Used extensively in research, finance, logistics, aerospace, telecommunications,
defense, and healthcare.

Limitations of Parallel Computing

● Memory Access:
○ Accessing local memory is cheaper than accessing remote memory (from
different nodes).
○ Locality: Frequent access to local data is crucial for efficient parallel software.
○ Impact of Locality: The ratio of remote to local access costs can vary, with
remote access being up to 1000 times more expensive in some cases.

Unit 2: Introduction to Distributed Systems

What are Distributed Systems?

● A distributed system is a group of independent computers (called nodes) that work together to perform tasks, appearing to users as one system.
● These nodes communicate with each other to achieve common goals.
● Nodes can be hardware (like a computer) or software processes.
● The system works by passing messages between nodes to process data and perform
tasks.

Key Features:
● The size of distributed systems can range from a few devices to millions of computers
spread across different locations.
● The network connecting these devices can be wired or wireless.
● These systems are dynamic, meaning computers can join or leave at any time, affecting
performance.

Operational Layers in Distributed Systems:

1. Application Layer: The user applications themselves, whose speed or quality the system aims to increase. Their performance depends on both computation and storage.
2. Middleware Layer: This acts as a bridge between the application and resource layers. It
handles tasks like communication, scheduling, security, and reliability.
3. Resource Layer: Includes all computing nodes and storage units, managing resources
through hardware and operating systems.
4. Network Layer: Responsible for routing and transferring data packets, ensuring network
services reach the resource layer.

Middleware in Distributed Systems:

● Middleware is software that sits above the operating systems of the nodes in a
distributed system.
● It helps manage resources, enabling efficient sharing across the network and offering
services like:
○ Communication (e.g., Remote Procedure Call - RPC)
○ Security
○ Accounting
○ Failure recovery
● Middleware simplifies development by providing common services so developers don't
need to recreate them.

Types of Distributed Systems:

1. Distributed Computing Systems: Systems that share computational tasks across multiple computers.
2. Distributed Information Systems: Systems that distribute data across multiple nodes.
3. Pervasive Systems: Systems integrated into everyday life through technology.

High-Performance Distributed Computing:

● Early high-performance computing used multiprocessor machines with shared memory, but today distributed-memory systems are common.
● Distributed-memory systems overcome the limitations of shared memory by having
multiple computers share resources over a network.

Cluster Computing:
● A group of similar computers (e.g., workstations or PCs) connected by a high-speed
local network, all running the same operating system.

Grid Computing:

● A system where resources from different organizations work together to form a virtual
organization, allowing collaboration across institutions.
● It uses a multi-layer architecture:
1. Fabric Layer: Interfaces with local resources.
2. Connectivity Layer: Supports communication between resources.
3. Resource Layer: Manages individual resources.
4. Collective Layer: Manages access to multiple resources.
5. Application Layer: Includes the applications that use the grid environment.

What is Cloud Computing?

● Cloud computing is the outsourcing of computing resources (hardware, storage, etc.) to run applications and store data.
● Utility computing is a similar concept, where customers upload tasks to a data center
and pay based on the resources used. This laid the foundation for cloud computing.

IBM's Definition:

● A cloud is a pool of virtualized computer resources that can run various tasks, from
backend jobs to interactive applications.

Cloud Layers:

1. Hardware Layer:
○ This layer includes the physical resources like processors, routers, power, and
cooling systems, typically managed in data centers. Users generally do not
interact directly with this layer.
2. Infrastructure Layer:
○ The backbone of cloud computing, providing virtualized storage and computing
resources. It involves managing virtual servers and storage devices.
3. Platform Layer:
○ This layer offers tools for developing and deploying applications in the cloud.
Developers use vendor-specific APIs to upload and execute their applications on
the cloud.
4. Application Layer:
○ This is where actual applications run, like text processors, spreadsheets, and
presentation software, and are accessible to users for further customization.

Types of Cloud Services:

1. Infrastructure-as-a-Service (IaaS):
○ Provides hardware and infrastructure like virtual servers and storage.
2. Platform-as-a-Service (PaaS):
○ Offers a platform for developers to build and deploy applications.
3. Software-as-a-Service (SaaS):
○ Provides access to software applications hosted on the cloud, like email or office
applications.

Distributed Information Systems (Simplified)

What are Distributed Information Systems?

● Distributed systems often arise in organizations with networked applications that need to
work together, but where integration between these applications is difficult.
● Server-Client Model:
○ A server runs a networked application (like a database) and makes it available to
remote clients. Clients send requests, and the server processes them.
● Distributed Transactions:
○ Clients can bundle multiple requests into a single larger request to be executed
as a distributed transaction. Either all requests succeed or none of them do.

Enterprise Application Integration (EAI):

● As applications became more complex, direct communication between applications became necessary, leading to a whole industry focused on EAI, helping different systems work together.

Distributed Transaction Processing:

● In distributed systems, operations on databases are usually done as transactions. A transaction is a unit of work that ensures either complete success or complete failure.

Transaction Primitives:

● Examples of transaction primitives (the core actions in a transaction), illustrated in the sketch after this list:
○ BEGIN TRANSACTION: Starts the transaction.
○ COMMIT: Confirms and saves the transaction.
○ ROLLBACK: Reverts the changes made during the transaction if something
goes wrong.
● Transactions in Distributed Systems:
○ A transaction in distributed systems is an operation where either all actions are
executed successfully, or none of them are executed. This ensures consistency
and reliability.
○ Transactions are often divided into sub-transactions, forming a nested
transaction structure. This allows operations to be parallelized and distributed
across different machines.
● Nested Transactions:
○ Nested transactions break down a large transaction into smaller, manageable
sub-transactions that can be executed independently.
○ Example: Booking a trip can involve multiple sub-transactions, like booking
individual flights. Each flight booking is a sub-transaction that can be handled
independently.
● Role of TP Monitors:
○ Transaction Processing (TP) monitors coordinate the commitment of
sub-transactions, ensuring they follow a standardized protocol like distributed
commit. This protocol ensures that all sub-transactions either commit (succeed)
or rollback (fail).
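
To make the BEGIN/COMMIT/ROLLBACK primitives listed above concrete, here is a hedged single-node sketch using Python's built-in sqlite3 module; a real distributed transaction would additionally involve a TP monitor and a distributed commit protocol, which this toy does not show.

```python
# Transaction sketch: either every statement takes effect, or none of them do.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 100)")
conn.commit()

try:
    # BEGIN TRANSACTION happens implicitly on the first modifying statement.
    conn.execute("UPDATE accounts SET balance = balance - 30 WHERE name = 'alice'")
    conn.execute("UPDATE accounts SET balance = balance + 30 WHERE name = 'bob'")
    conn.commit()        # COMMIT: confirm and save both updates together
except sqlite3.Error:
    conn.rollback()      # ROLLBACK: undo all changes if any step failed

print(conn.execute("SELECT * FROM accounts").fetchall())
conn.close()
```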

Enterprise Application Integration (EAI)

1. Communication Models:
○ With the decoupling of applications from their databases, inter-application
communication became crucial. This communication is managed through
several middleware technologies.
2. Middleware Communication Types:
○ Remote Procedure Call (RPC): Allows an application component to call another
application component remotely, as if it were a local procedure.
■ Drawback: Requires both the caller and callee to be running at the same
time.
○ Remote Method Invocation (RMI): Similar to RPC, but operates on objects
instead of procedures.
■ Drawback: Tight coupling between the caller and callee.
○ Message-Oriented Middleware (MOM): Enables asynchronous communication.
Applications send messages to predefined logical contact points, and the system
ensures that messages reach their intended recipients. This is part of
publish-subscribe systems.
3. Methods of Application Integration:
○ File Transfer: One application produces a file that another application reads.
■ Challenges: Agreement on file format, file management, and handling
updates.
○ Shared Database: All applications access the same database.
■ Challenges: Designing a common schema and performance bottlenecks.
○ Remote Procedure Call (RPC): Allows one application to invoke a procedure on another without direct access (a sketch follows after this list).
■ Challenges: Requires both applications to be running at the same time.
○ Messaging: Ensures that requests and responses are delivered asynchronously,
even if the systems are temporarily unavailable.
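
As a concrete illustration of the RPC integration style described above, here is a minimal sketch using Python's standard-library xmlrpc modules; the file names, port number, and procedure are made up for the example. Note the drawback already mentioned: both sides must be running at the same time.

```python
# rpc_server.py -- exposes one procedure to remote callers (illustrative only).
from xmlrpc.server import SimpleXMLRPCServer

def add(x, y):
    return x + y

server = SimpleXMLRPCServer(("localhost", 8000), allow_none=True)
server.register_function(add, "add")
server.serve_forever()   # blocks, waiting for remote calls
```

```python
# rpc_client.py -- calls the remote procedure as if it were a local one.
import xmlrpc.client

proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
print(proxy.add(2, 3))   # the call travels over the network and returns 5
```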

Pervasive Systems

1. Pervasive Systems:
○ With the spread of mobile and embedded computing devices, pervasive systems have emerged. These systems aim to blend seamlessly into the environment, often without requiring direct interaction from the user.
○ Devices are small, battery-powered, mobile, and often connect wirelessly to the
network, forming part of the Internet of Things (IoT).
2. Key Requirements for Pervasive Systems:
○ Context Awareness: Devices must be aware of environmental changes, such as
network availability, and adjust their behavior accordingly.
○ Ad-Hoc Composition: Devices should be configurable, either by the user or
automatically, to form a useful suite of applications.
○ Sharing: Devices should easily share and access information, adapting to
intermittent and changing connectivity.
3. Types of Pervasive Systems:
○ Ubiquitous Computing Systems: Devices are networked and continuously
present in the environment. These systems are designed to be transparent, with
minimal user interaction, and must be context-aware and autonomous.
○ Mobile Computing Systems: These systems rely on mobile devices like
smartphones and tablets that use wireless communication for network
connectivity.
○ Sensor Networks: These networks are composed of devices that collect and
share data through sensors.

In mobile computing, the location of a device is assumed to change over time. A changing location affects many aspects of the system design and has a profound effect on communication.

Unit 3: Parallel Computer Memory

Memory Hierarchies

Memory in parallel computers is organized in layers or levels to balance speed and storage.
These layers can be categorized as follows:

1. Primary Memory
○ CPU Registers: Small, very fast memory locations within the CPU used for
immediate data manipulation.
○ Cache Memory: A smaller, faster memory that stores copies of frequently
accessed data from the main memory to speed up access.
○ Physical/Main Memory: The main memory (RAM) where active programs and
data are stored. It’s slower than cache but has larger capacity.
2. Secondary/Auxiliary Memory
○ Solid State Memory: Fast, non-volatile storage (e.g., SSDs).
○ Magnetic Memory: Traditional, slower storage (e.g., hard drives).

Parallel Computer Memory Architecture

There are three main types of memory architectures used in parallel computing systems:

1. Shared Memory
2. Distributed Memory
3. Hybrid Distributed-Shared Memory

Shared Memory

General Characteristics:

● In shared memory systems, all processors can access a single global memory address
space, meaning any changes made to memory by one processor are visible to all other
processors.
● These systems allow multiple processors to work independently while sharing the same
memory resources.

Key Types:

1. Uniform Memory Access (UMA):


○ All processors have uniform, equal access time to all memory locations.
○ Symmetric Multiprocessor (SMP): Every processor has equal access to
memory and peripheral devices, and all processors are identical.
■ Cache Coherent UMA (CC-UMA): Ensures that when one processor
updates a memory location, all other processors are aware of this update
through cache coherency mechanisms.
○ Asymmetric Multiprocessor: Only one or a few processors have access to
peripheral devices.
○ Common in shared memory systems where processors are identical and can
access memory with equal speed.

Non-Uniform Memory Access (NUMA)

● In the NUMA system, memory access time depends on the memory's location.
● Memory is distributed across processors, called local memories.
● Global address space is formed by all these local memories, accessible by every
processor.
● Often created by linking two or more SMPs (Symmetric Multiprocessors), so one SMP
can access the memory of another.
● Memory access across links is slower compared to local access.
● When cache coherency is maintained, it's called CC-NUMA (Cache Coherent NUMA).

Cache Only Memory Architecture (COMA)

● A special NUMA model where all distributed memory is turned into cache memory.

Advantages of Shared Memory Architecture

1. Global Address Space: Simplifies memory management and programming, as everything appears in one address space.
2. Fast Data Sharing: Close proximity between processors and memory ensures quick
and uniform data sharing between tasks.

Disadvantages of Shared Memory Architecture

1. Scalability Issues: Adding more CPUs increases memory traffic and makes cache
coherence more challenging to manage.
2. Synchronization Challenges: Programmers must ensure correct access to shared
memory to avoid conflicts.

Distributed Memory

● In a distributed memory system, multiple computers (nodes) are connected by a message-passing network.
● Each node has its own local memory, which is private and not shared with other
processors.
● No global address space exists; processors operate independently.
● Communication between processors happens via message-passing, and
synchronization is the programmer’s responsibility.
● These systems are often called NORMA (No-Remote-Memory-Access) machines.
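
Below is a hedged sketch of the message-passing programming style used on such machines, assuming the third-party mpi4py package (not mentioned in the notes): each rank owns only its local data and must explicitly send and receive messages to share it.

```python
# mpi_sketch.py -- run with, e.g.: mpiexec -n 2 python mpi_sketch.py
# Each process (rank) has private local memory; data moves only via messages.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    local_data = [1, 2, 3, 4]                 # exists only in rank 0's local memory
    comm.send(sum(local_data), dest=1, tag=0)
elif rank == 1:
    partial = comm.recv(source=0, tag=0)      # explicit communication, no shared memory
    print("received partial result:", partial)
```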

Hybrid Distributed-Shared Memory


● The largest and most powerful computers combine both shared memory and
distributed memory architectures.
○ Shared memory components could include shared memory machines or GPUs.
○ Distributed memory connects multiple shared memory machines or GPUs.
● Data transfer between different machines requires network communication.
● This hybrid architecture is expected to dominate in high-performance computing for the
foreseeable future.
