Week 7
(CS 526)
Muhammad Awais
• Parallel Task
– A task that can be executed by multiple processors safely
(producing correct results)
• Serial Execution
– Execution of a program sequentially, one statement at a
time.
– In the simplest sense, this is what happens on a single-processor
machine.
Some General Parallel Terminologies
• Parallel Execution
– Execution of a program by more than one task (threads)
– Each task is able to execute the same or a different
statement at the same moment in time.
• Shared Memory
– where all processors have direct (usually bus-based) access
to common physical memory
– In a programming sense, it describes a model where parallel
tasks all have the same "picture" of memory (see the sketch
after this list)
• Distributed Memory
– Network-based memory access for physical memory that is
not common.
– Tasks can only logically "see" local machine memory and
must use communications to access memory on other
machines.
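A minimal sketch of the shared-memory model, written with POSIX threads in C (the threading API is an assumption, not something these slides specify). Every thread reads and writes the same array directly, so no explicit data transfer is needed; on a distributed-memory machine the same exchange would require messages (e.g., MPI).

/* Shared-memory sketch: every thread sees the same array directly. */
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define N 16

static double data[N];               /* common memory, visible to all threads */

static void *fill_slice(void *arg) {
    long id = (long)arg;
    int chunk = N / NTHREADS;
    /* Each task writes only its own slice, so the tasks stay safe
       (produce correct results) without any locking. */
    for (int i = (int)id * chunk; i < ((int)id + 1) * chunk; i++)
        data[i] = 2.0 * i;
    return NULL;
}

int main(void) {
    pthread_t tid[NTHREADS];
    for (long t = 0; t < NTHREADS; t++)
        pthread_create(&tid[t], NULL, fill_slice, (void *)t);
    for (int t = 0; t < NTHREADS; t++)
        pthread_join(tid[t], NULL);
    /* The main thread reads the results directly; no data transfer needed. */
    printf("data[N-1] = %.1f\n", data[N - 1]);
    return 0;
}

(Compile with the -pthread flag, e.g. gcc shared_mem.c -pthread; the file name is illustrative.)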
Some General Parallel Terminologies
• Communications
– Parallel tasks typically need to exchange data. This can be
accomplished through shared memory or over a network.
– However, the actual event of data exchange is commonly
referred to as communications (regardless of the method
employed).
• Synchronization
– The coordination of parallel tasks in real time, very often
associated with communications
– Often implemented by establishing a synchronization point
within an application where a task may not proceed further
until another task (or tasks) reaches the same or logically
equivalent point.
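A minimal sketch of a synchronization point, again with POSIX threads (an assumed API): the data is "communicated" through shared memory, and a barrier keeps any task from proceeding until all tasks have reached it.

/* Synchronization sketch: a barrier is the synchronization point. */
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4

static double partial[NTHREADS];     /* data exchanged through shared memory */
static pthread_barrier_t barrier;

static void *worker(void *arg) {
    long id = (long)arg;
    partial[id] = (double)(id + 1);  /* each task computes its own piece */

    /* No task proceeds past this point until every task has reached it. */
    pthread_barrier_wait(&barrier);

    if (id == 0) {                   /* after the barrier, task 0 may safely read all pieces */
        double sum = 0.0;
        for (int i = 0; i < NTHREADS; i++)
            sum += partial[i];
        printf("sum = %.1f\n", sum);
    }
    return NULL;
}

int main(void) {
    pthread_t tid[NTHREADS];
    pthread_barrier_init(&barrier, NULL, NTHREADS);
    for (long t = 0; t < NTHREADS; t++)
        pthread_create(&tid[t], NULL, worker, (void *)t);
    for (int t = 0; t < NTHREADS; t++)
        pthread_join(tid[t], NULL);
    pthread_barrier_destroy(&barrier);
    return 0;
}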
Some General Parallel Terminologies
• Granularity
– In parallel computing, granularity is a measure of the ratio
of computation to communication.
– Coarse: relatively large amounts of computational work are
done between communication events
– Fine: relatively small amounts of computational work are
done between communication events
• Observed Speedup:
– Observed speedup of a code which has been parallelized is defined as

    wall-clock time of serial execution
    -------------------------------------
    wall-clock time of parallel execution
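For example (hypothetical timings): if the serial version of a code takes 120
seconds of wall-clock time and the parallel version takes 30 seconds, the
observed speedup is 120 / 30 = 4.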
1. Fine-grain parallelism
2. Coarse-grain parallelism
Fine-grain Parallelism
• Relatively small amounts of computational work are done between
communication events
• Low computation to communication ratio
• Implies high communication overhead and less opportunity for
performance enhancement
• If granularity is too fine it is possible that the overhead required for
communications and synchronization between tasks takes longer
than the computation.
Coarse-grain Parallelism
• Relatively large amounts of computational
work are done between
communication/synchronization events
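A sketch contrasting the two grain sizes with POSIX threads (function names such as fine_grain_sum are illustrative, not from the slides): both versions compute the same sum, but the fine-grain version synchronizes on every element, while the coarse-grain version does a large block of computation and synchronizes only once per task, giving a much higher computation-to-communication ratio.

/* Granularity sketch: the same reduction at two different grain sizes. */
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define N 1000000

static double x[N];
static double sum;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Fine grain: one synchronization event per element of work. */
static void *fine_grain_sum(void *arg) {
    long id = (long)arg;
    for (long i = id; i < N; i += NTHREADS) {
        pthread_mutex_lock(&lock);      /* overhead paid on every iteration */
        sum += x[i];
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Coarse grain: a large block of computation, then one synchronization. */
static void *coarse_grain_sum(void *arg) {
    long id = (long)arg;
    double local = 0.0;
    for (long i = id; i < N; i += NTHREADS)
        local += x[i];                  /* pure computation, no communication */
    pthread_mutex_lock(&lock);          /* single synchronization event per task */
    sum += local;
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void) {
    pthread_t tid[NTHREADS];
    void *(*kernels[2])(void *) = { fine_grain_sum, coarse_grain_sum };
    const char *names[2] = { "fine grain", "coarse grain" };

    for (long i = 0; i < N; i++) x[i] = 1.0;

    for (int k = 0; k < 2; k++) {
        sum = 0.0;
        for (long t = 0; t < NTHREADS; t++)
            pthread_create(&tid[t], NULL, kernels[k], (void *)t);
        for (int t = 0; t < NTHREADS; t++)
            pthread_join(tid[t], NULL);
        printf("%-12s sum = %.0f\n", names[k], sum);
    }
    return 0;
}

Both kernels produce the same result; the fine-grain one simply spends most of its time on locking overhead, which is exactly the situation described on the fine-grain slide above.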
Amdahl's Law
• If P is the fraction of the code that can be parallelized, then

                      1
    Max. speedup = -------
                    1 - P

• Taking into account the number of processors N performing the
parallel fraction of work, the speedup becomes 1 / (P/N + (1 - P)),
which gives:

        N     P = .50   P = .90   P = .99
       10       1.82      5.26      9.17
      100       1.98      9.17     50.25
     1000       1.99      9.91     90.99
    10000       1.99      9.91     99.02
    (N = number of processors; table entries are the speedup)

• Equivalently, with F = serial fraction (F = 1 - P), the maximum
speedup is 1/F.
E.g., 1/0.05 (5% serial) = 20 speedup (maximum)
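A small sketch (in C, to match the earlier examples) that evaluates
speedup = 1 / (P/N + (1 - P)) and reproduces the table above:

/* Amdahl's law: speedup = 1 / (P/N + (1 - P)) for N processors,
   approaching 1 / (1 - P) as N grows. */
#include <stdio.h>

static double amdahl(double p, double n) {
    return 1.0 / (p / n + (1.0 - p));
}

int main(void) {
    double fractions[] = {0.50, 0.90, 0.99};   /* parallel fraction P */
    int procs[] = {10, 100, 1000, 10000};      /* number of processors N */

    printf("%8s   P=.50   P=.90   P=.99\n", "N");
    for (int i = 0; i < 4; i++) {
        printf("%8d", procs[i]);
        for (int j = 0; j < 3; j++)
            printf("  %6.2f", amdahl(fractions[j], procs[i]));
        printf("\n");
    }
    return 0;
}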
Maximum Speedup (Amdahl's Law)
Maximum speedup is usually p with p processors
(linear speedup).
e.g., if f == 1 (all of the code is serial), then the speedup will be 1
no matter how many processors are used.
Speedup
• However, certain problems demonstrate increased performance
by increasing the problem size. For example:
– 2D grid calculations: 85 seconds (85%)
– Serial fraction: 15 seconds (15%)
• Increasing the problem size grows the parallel (grid) portion while
the serial part stays roughly fixed, so the serial fraction shrinks
and the attainable speedup rises.
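As a worked illustration (the scaled-up numbers are hypothetical): with the
problem above the serial fraction is 15/100 = 0.15, so the maximum speedup is
1/0.15 ≈ 6.7. If the grid work grows tenfold (850 seconds) while the serial
part stays at 15 seconds, the serial fraction falls to 15/865 ≈ 1.7% and the
maximum speedup rises to roughly 58.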
• Massively Parallel
– Refers to the hardware that comprises a given parallel
system - having many processors (hundreds of processors or more)
Some General Parallel Terminologies
• Scalability
– Refers to a parallel system's (hardware and/or software)
ability to demonstrate a proportionate increase in
parallel speedup with the addition of more processors.