CH5 Parallel Processing

Parallel processing enables simultaneous execution of multiple tasks by breaking them into smaller subtasks processed concurrently by multiple processors, enhancing performance, resource utilization, and scalability. Processor organizations include SISD, SIMD, MISD, and MIMD, with SMP being a common architecture that allows multiple processors to share memory and I/O resources. Cache coherence and synchronization are critical in multiprocessor systems to maintain data consistency and efficient operation.

PARALLEL PROCESSING
• Parallel processing refers to the simultaneous execution of multiple tasks or instructions.
• It involves breaking down a problem or task into smaller subtasks that can be executed in parallel.
• These subtasks are processed concurrently by multiple processors or computing resources (a minimal code sketch follows this list).
Advantages of parallel processing
• Increased performance: significantly speeds up the execution of large and complex tasks.
• Resource utilization: efficient use of computing resources by distributing workloads.
• Scalability: enables systems to handle larger workloads and scale with increasing demands.
• Real-time processing: crucial for applications that require immediate responses, such as simulations and data analysis.
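For concreteness, here is a minimal sketch (added for illustration, not part of the original slides) of breaking one task into subtasks that run concurrently on several worker processes, using Python's standard multiprocessing module; the chunk size, worker count, and the subtask function are arbitrary choices made for this example.

from multiprocessing import Pool

def subtask(chunk):
    # Hypothetical subtask: sum one chunk of the input data.
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    # Break the problem into smaller subtasks (four chunks of the input).
    chunks = [data[i:i + 250_000] for i in range(0, len(data), 250_000)]
    with Pool(processes=4) as pool:
        # The chunks are processed concurrently by separate worker processes.
        partial_sums = pool.map(subtask, chunks)
    print(sum(partial_sums))  # same result as sum(data), computed in parallel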
MULTIPLE PROCESSOR ORGANIZATION
• Single instruction, single data (SISD) stream
• Single processor executes a single instruction stream to
operate on data stored in a single memory
• Uniprocessors fall into this category
The processing unit operates on a single data stream (DS) from a
memory unit (MU).
CONT’D
• Single instruction, multiple data (SIMD) stream
• A single machine instruction controls the simultaneous execution of a number of
processing elements on a lockstep basis
• Vector and array processors fall into this category
There is still a single control unit, now feeding a single instruction stream to multiple
processing units (PUs); a rough software analogy is sketched below.
• Each PU may have its own dedicated memory, or there may be a shared memory.
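As a rough software-level analogy (an added illustration, not from the slides), a NumPy vectorized operation applies one operation to many data elements at once; on most CPUs such operations typically map to the processor's SIMD units. NumPy is assumed to be available here.

import numpy as np

data = np.arange(8)    # multiple data elements
result = data + 1      # one operation applied to every element
print(result)          # [1 2 3 4 5 6 7 8]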
CONT’D
• Multiple instruction, single data (MISD) stream
• A sequence of data is transmitted to a set of processors, each of which
executes a different instruction sequence
• Not commercially implemented
CONT’D
• Multiple instruction, multiple data (MIMD) stream
• A set of processors simultaneously execute different instruction
sequences on different data sets
• SMPs, clusters, and NUMA systems fit this category
There are multiple control units, each feeding a separate instruction
stream to its own PU.
The result is either a shared-memory multiprocessor or a distributed-memory
multicomputer.
CONT’D

• A taxonomy of parallel processor architecture


MIMD
• The processors are general purpose
• Able to process all of the instructions necessary to perform the
appropriate data transformation
• Can be further subdivided by the means by which the processors
communicate
Symmetric multiprocessor (SMP)
• Multiple processors share a single memory or pool of memory by
means of a shared bus or other interconnection mechanism.
• The memory access time to any region of memory is
approximately the same for each processor.
• Processors communicate with each other via that memory (see the sketch below).
Non-uniform memory access (NUMA)
• A more recent development
• The memory access time to different regions of memory differs.
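A minimal sketch (added, not from the slides) of the shared-memory communication idea behind SMP, using Python's multiprocessing primitives; the worker function and counter name are hypothetical and chosen only for this illustration.

from multiprocessing import Process, Value, Lock

def worker(shared_counter, lock):
    # Each worker updates the same shared memory location.
    for _ in range(10_000):
        with lock:                      # synchronize access to the shared data
            shared_counter.value += 1

if __name__ == "__main__":
    counter = Value("i", 0)             # one integer visible to all workers
    lock = Lock()
    procs = [Process(target=worker, args=(counter, lock)) for _ in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(counter.value)                # 20000: both workers used the same memory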
SYMMETRIC MULTIPROCESSOR (SMP)

A stand-alone computer with the following characteristics:
• Two or more similar processors of comparable capacity.
• Processors share the same memory and I/O facilities and are connected by a
bus or other internal connection; memory access time is approximately the
same for each processor.
• All processors share access to I/O devices, either through the same channels
or through different channels giving paths to the same devices.
• All processors can perform the same functions (hence "symmetric").
• The system is controlled by an integrated operating system that provides
interaction between processors and their programs at the job, task, file,
and data element levels.
SYMMETRIC MULTIPROCESSOR (CONT’D)
• Advantages of an SMP organization over a uniprocessor:
Performance:
Multiple processors can do more work in parallel than a single processor.
Availability:
Failure of a single processor does not halt the machine.
Incremental growth:
The performance of a system can be enhanced by adding an additional
processor.
Scaling:
A range of products with different price and performance
characteristics can be offered.
Symmetric Multiprocessor Organization
CONT’D
• The most common organization is the time-shared bus.
The simplest mechanism for constructing a multiprocessor
system.
The bus consists of control, address, and data lines.
To facilitate DMA transfers from I/O subsystems to processors, the
following features are provided:
• Addressing: it must be possible to distinguish modules on the bus
to determine the source and destination of data.
• Arbitration: A mechanism is provided to arbitrate competing
requests for bus control, using some sort of priority scheme.
• Time-sharing: when one module is controlling the bus, other
modules are locked out and must, if necessary, suspend operation
until bus access is achieved.
Several attractive features of the bus organization
• Simplicity
• Simplest approach to multiprocessor organization
• Flexibility
• Generally easy to expand the system by attaching
more processors to the bus
• Reliability
• The bus is essentially a passive medium and the
failure of any attached device should not cause
failure of the whole system
Disadvantages of the bus organization
• Main drawback is performance
• All memory references pass through the common bus
• Performance is limited by bus cycle time
• Each processor should have cache memory
• Reduces the number of bus accesses
• Leads to problems with cache coherence
• If a word is altered in one cache it could conceivably
invalidate a word in another cache
• To prevent this the other processors must be alerted that an
update has taken place
• Typically addressed in hardware rather than the operating
system
MULTIPROGRAMMING AND MULTIPROCESSING

Multiprocessing vs. multiprogramming:
• Multiprocessing: the availability of more than one processor per system to
execute multiple sets of instructions simultaneously. Multiprogramming:
running multiple programs in the system's main memory at the same time.
• Multiprocessing: job processing takes less time. Multiprogramming: job
processing takes more time.
• Multiprocessing: multiple processes can execute simultaneously.
Multiprogramming: only one process can run at a time.
• Multiprocessing: uses multiple processors to do the job. Multiprogramming:
uses a batch OS that keeps the CPU fully utilized during execution.
• Multiprocessing: requires more than one CPU. Multiprogramming: requires
only one CPU.
Multiprocessor Operating System Design
Considerations
Multiprocessor operating system must provide all the functionality of a
multiprogramming system plus additional features to accommodate multiple
processors. Among the key design issues:
• Simultaneous concurrent processes
• OS routines need to be reentrant to allow several processors to execute the
same OS code simultaneously
• OS tables and management structures must be managed properly to avoid
deadlock or invalid operations
• Scheduling
• Any processor may perform scheduling so conflicts must be avoided
• Scheduler must assign ready processes to available processors
CONT’D
Synchronization
• With multiple active processes having potential access to shared address spaces
or I/O resources, care must be taken to provide effective synchronization
• Synchronization is a facility that enforces mutual exclusion and event ordering
(a minimal sketch follows this list)
Memory management
• OS needs to exploit the available hardware parallelism to achieve the best
performance
• Paging mechanisms on different processors must be coordinated to enforce
consistency when several processors share a page or segment and to decide on
page replacement
Reliability and fault tolerance
• OS should provide graceful degradation in the face of processor failure
• The scheduler and other portions of the operating system must recognize the
loss of a processor and restructure accordingly
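A minimal sketch (added, not from the slides) of the two facilities just named, using Python's threading module: a Lock enforces mutual exclusion on shared data, and an Event enforces event ordering; the producer/consumer functions are hypothetical names chosen for this example.

import threading

shared_log = []
lock = threading.Lock()
ready = threading.Event()

def producer():
    with lock:                  # mutual exclusion on the shared list
        shared_log.append("produced")
    ready.set()                 # signal that the event has occurred

def consumer():
    ready.wait()                # event ordering: proceed only after the signal
    with lock:
        shared_log.append("consumed after producer")

t_consumer = threading.Thread(target=consumer)
t_producer = threading.Thread(target=producer)
t_consumer.start()
t_producer.start()
t_consumer.join()
t_producer.join()
print(shared_log)               # ['produced', 'consumed after producer']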
CACHE COHERENCE
• Cache coherence:
• The uniformity of shared resource data that ends up stored in
multiple local caches
• Cache coherence problem:
• The challenge of keeping multiple local caches synchronized
when one of the processors updates its local copy of data that
is shared among multiple caches
CACHE COHERENCE SOLUTIONS
• Software solutions
• Attempt to avoid the need for additional hardware circuitry and
logic by relying on the compiler and operating system to deal
with the problem
• Attractive because the overhead of detecting potential problems
is transferred from run time to compile time, and the design
complexity is transferred from hardware to software
• However, compile-time software approaches generally must
make conservative decisions, leading to inefficient cache
utilization
CACHE COHERENCE SOLUTIONS (CONT’D)
• Hardware-based solutions
• Referred to as cache coherence protocols
• Because the problem is only dealt with when it actually
arises, there is more effective use of caches, leading to
improved performance over a software approach
• Approaches are transparent to the programmer and the
compiler, reducing the software development burden
• Can be divided into two categories:
• Directory protocols
• Snoopy protocols (a toy write-invalidate sketch follows this list)
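The following toy model is an added illustration of the write-invalidate idea behind snoopy protocols, not an implementation of any real protocol: when one cache writes a shared word, the other caches "snoop" the invalidate broadcast on the shared bus and drop their stale copies. The Cache and Bus classes are invented for this sketch.

class Cache:
    def __init__(self, name, bus):
        self.name = name
        self.lines = {}                  # address -> locally cached value
        self.bus = bus
        bus.attach(self)

    def read(self, memory, addr):
        if addr not in self.lines:       # miss: fetch from shared memory
            self.lines[addr] = memory[addr]
        return self.lines[addr]

    def write(self, memory, addr, value):
        memory[addr] = value             # write through to shared memory
        self.lines[addr] = value
        self.bus.broadcast_invalidate(self, addr)   # announce the write on the bus

    def snoop_invalidate(self, addr):
        self.lines.pop(addr, None)       # drop a stale copy, if one is held


class Bus:
    def __init__(self):
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)

    def broadcast_invalidate(self, writer, addr):
        for cache in self.caches:
            if cache is not writer:      # every other cache snoops the bus
                cache.snoop_invalidate(addr)


memory = {0x10: 1}
bus = Bus()
c0, c1 = Cache("P0", bus), Cache("P1", bus)
print(c0.read(memory, 0x10), c1.read(memory, 0x10))   # 1 1: both caches hold a copy
c0.write(memory, 0x10, 42)                            # P0 updates the shared word
print(c1.read(memory, 0x10))                          # 42: P1's stale copy was invalidated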
THREADS AND PROCESSES
A thread is concerned with scheduling and execution, whereas a process is
concerned with both scheduling/execution and resource ownership.
Process:
• An instance of a program running on a computer
• Two key characteristics: resource ownership and scheduling/execution
Thread:
• Dispatchable unit of work within a process
• Includes processor context (which includes the program counter and stack
pointer) and its own data area for a stack
• Executes sequentially and is interruptible so that the processor can turn to
another thread
Thread switch:
• The act of switching processor control between threads within the same
process
• Typically less costly than a process switch
Process switch:
• Operation that switches the processor from one process to another by saving
all the process control data, registers, and other information for the first
and replacing them with the process information for the second
A minimal sketch contrasting threads and processes follows.
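This added example (not from the slides) contrasts the two units using Python's threading and multiprocessing modules: the thread shares the parent's address space, while the separate process gets its own copy of the data, so only the thread's change is visible to the parent.

import threading
import multiprocessing

items = []

def append_item():
    items.append("added")

if __name__ == "__main__":
    t = threading.Thread(target=append_item)          # shares the parent's memory
    t.start()
    t.join()

    p = multiprocessing.Process(target=append_item)   # gets its own address space
    p.start()
    p.join()

    # Only the thread's update is visible here; the process changed its own copy.
    print(items)                                       # ['added']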
IMPLICIT AND EXPLICIT MULTITHREADING
• All commercial processors and most experimental ones use explicit
multithreading: they concurrently execute instructions from different
explicit threads.
• Implicit multithreading is the concurrent execution of multiple threads
extracted from a single sequential program.
APPROACHES TO EXPLICIT MULTITHREADING
• Interleaved (fine-grained)
• Processor deals with two or more thread contexts at a time
• Switches threads at each clock cycle
• If a thread is blocked, it is skipped
• Blocked (coarse-grained)
• A thread is executed until an event causes a delay
• Effective on an in-order processor
• Avoids pipeline stalls
• Simultaneous multithreading (SMT)
• Instructions are simultaneously issued from multiple threads to the
execution units of a superscalar processor
• Chip multiprocessing
• The processor is replicated on a single chip
• Each processor handles separate threads
• Advantage is that the available logic area on a chip is used effectively
CLUSTERS
• An alternative to SMP as an approach to providing high performance and high
availability
• Particularly attractive for server applications
• Defined as: a group of interconnected whole computers working together as a
unified computing resource that can create the illusion of being one machine
• Each computer in a cluster is called a node
• Benefits:
• Absolute scalability
• Incremental scalability
• High availability
• Superior price/performance
CLUSTER COMPUTER ARCHITECTURE
CLUSTERS COMPARED TO SMP
• Both provide a configuration with multiple processors to support
high-demand applications
• Both solutions are available commercially
SMP:
• Easier to manage and configure
• Much closer to the original single-processor model for which nearly all
applications are written
• Less physical space and lower power consumption
• Well established and stable
Clustering:
• Far superior in terms of incremental and absolute scalability
• Superior in terms of availability
• All components of the system can readily be made highly redundant
NONUNIFORM MEMORY ACCESS (NUMA)
• Alternative to SMP and clustering
• Nonuniform memory access (NUMA)
• All processors have access to all parts of main memory
using loads and stores
• A processor's memory access time differs depending on which
region of main memory is being accessed
• Different processors access different regions of memory at
different speeds
• Cache-coherent NUMA (CC-NUMA)
• A NUMA system in which cache coherence is maintained
among the caches of the various processors
NUMA PROS AND CONS
Pros:
• Main advantage of a CC-NUMA system is that it can deliver effective
performance at higher levels of parallelism than SMP without requiring
major software changes
• Bus traffic on any individual node is limited to a demand that the bus can
handle
Cons:
• Does not transparently look like an SMP; software changes will be required
to move an operating system and applications from an SMP to a CC-NUMA system
• If many of the memory accesses are to remote nodes, performance begins to
break down
• Availability is a concern
