Parallel Models of Computation
Developing a standard parallel model of computation for analyzing algorithms has proven
difficult because different parallel computers tend to vary significantly in their organizations.
In spite of this difficulty, useful parallel models have emerged, along with a deeper
understanding of the modeling process. In this section we describe three important principles
that have come out of this process.
1. Work-efficiency. In designing a parallel algorithm, it is more important to make it efficient than to make it as fast as possible. The work, or cost, of a parallel algorithm is the processor-time product: an algorithm that runs in time t using P processors performs work W = Pt. A parallel algorithm is work-efficient if it performs at most a constant factor more work than the best sequential algorithm for the same problem. For example, an algorithm that runs in time √n using √n processors (work n) may be efficient, whereas an algorithm that runs in time log n using n² processors (work n² log n) is not efficient. The first algorithm is better than the second - even though it is slower - because its work, or cost, is smaller. Of course, given two parallel algorithms that perform the same amount of work, the faster one is generally better.
2. Emulation. The notion of work-efficiency leads to another important observation: a
model can be useful without mimicking any real or even realizable machine. Instead, it
suffices that any algorithm that runs efficiently in the model can be translated into an
algorithm that runs efficiently on real machines. As an example, consider the widely used
parallel random-access machine (PRAM) model. In the PRAM model, a set of processors
share a single memory system. In a single unit of time, each processor can perform an
arithmetic, logical, or memory access operation. This model has often been criticized as
unrealistically powerful, primarily because no shared memory system can perform
memory accesses as fast as processors can execute local arithmetic and logical operations.
The important observation, however, is that for a model to be useful we only require that
algorithms that are efficient in the model can be mapped to algorithms that are efficient
on realistic machines, not that the model is realistic. In particular, any algorithm that runs efficiently in a P-processor PRAM model can be translated into an algorithm that runs efficiently on more realistic P-processor machines, typically with at most a logarithmic slowdown.
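To make the model concrete, the following minimal Python sketch (not from the original text) simulates a PRAM-style summation. Each pass of the outer loop stands for one synchronous time step in which every active processor adds one pair of elements, so n values are summed in O(log n) steps; the code is a sequential simulation of the model, not a real parallel program.

    # PRAM-style tree summation, simulated sequentially. Each while-iteration
    # represents one synchronous PRAM step; on a real PRAM, all iterations of
    # the inner for-loop would execute simultaneously on separate processors.
    def pram_sum(values):
        a = list(values)
        n = len(a)
        stride = 1
        while stride < n:              # O(log n) synchronous steps
            for i in range(0, n - stride, 2 * stride):
                a[i] += a[i + stride]  # the processor handling index i
            stride *= 2
        return a[0]

    print(pram_sum(range(16)))  # 120

Note that the algorithm performs only n - 1 additions in total, so scheduling it onto about n/log n processors keeps it work-efficient in the sense of principle 1.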
Algorithmic Techniques
A major advance in parallel algorithms has been the identification of fundamental algorithmic
techniques. Some of these techniques are also used by sequential algorithms, but play a more
prominent role in parallel algorithms, while others are unique to parallelism. Here we list
some of these techniques with a brief description of each.
1. Divide-and-conquer. A divide-and-conquer algorithm splits a problem into subproblems, solves the subproblems recursively, and then combines the solutions; in parallel, the independent subproblems can be solved at the same time. For example, to find the convex hull of a set of n points in the plane, one can split the points into the leftmost n/2 and the rightmost n/2, recursively find the convex hull of each set in parallel, and then merge the two resulting hulls (a sketch of this pattern appears after this list). Divide-and-conquer has proven to be one of the most powerful techniques for solving problems in parallel, with applications ranging from linear systems to computer graphics and from factoring large numbers to n-body simulations.
2. Randomization. The use of random numbers is ubiquitous in parallel algorithms.
Intuitively, randomness is helpful because it allows processors to make local decisions
which, with high probability, add up to good global decisions. For example, suppose we
want to sort a collection of integer keys. This can be accomplished by partitioning the keys into buckets and then sorting within each bucket. For this to work well, the buckets must represent non-overlapping intervals of integer values and must contain approximately the same number of keys. Randomization is used to determine the boundaries of the intervals. First, each processor selects a random sample of its keys. Next, all of the selected keys are sorted together. Finally, these sample keys are used as the boundaries; the sample sort sketch after this list illustrates the process. Such random sampling is
also used in many parallel computational geometry, graph, and string matching
algorithms. Other uses of randomization include symmetry breaking, load balancing, and
routing algorithms.
3. Parallel Pointer Manipulations. Many of the traditional sequential techniques for
manipulating lists, trees, and graphs do not translate easily into parallel techniques. For
example, techniques such as traversing the elements of a linked list, visiting the nodes of a
tree in postorder, or performing a depth-first traversal of a graph appear to be inherently
sequential. Fortunately, each of these techniques can be replaced by efficient parallel
techniques. These parallel techniques include pointer jumping, the Euler-tour technique,
ear decomposition, and graph contraction. For example, one way to label each node of
an n-node list (or tree) with the label of the last node (or root) is to use pointer jumping.
In each pointer-jumping step, every node simultaneously replaces its pointer with that of its successor (or parent). After at most ⌈log n⌉ steps, every node points to the same node, the end of the list (or the root of the tree); the pointer-jumping sketch after this list shows the idea on a small successor array.
4. Others. Other useful techniques include finding small graph separators for partitioning
data among processors to reduce communication, hashing for balancing load across
processors and mapping addresses to memory, and iterative techniques as a replacement
for direct methods for solving linear systems.
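The divide-and-conquer pattern from item 1 can be sketched as follows. Merging two convex hulls is intricate, so this illustration substitutes mergesort, where the merge step is short; the structure - split, recurse on both halves in parallel, combine - is the same. The thread pool is only a stand-in: CPython threads give no real speedup for compute-bound work, and a production version would use processes or a fork-join runtime. Function names and the depth cutoff are illustrative, not from the original text.

    from concurrent.futures import ThreadPoolExecutor

    def merge(left, right):
        # Combine two sorted lists into one sorted list.
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        return out + left[i:] + right[j:]

    def psort(keys, pool, depth=2):
        if len(keys) <= 1:
            return list(keys)
        mid = len(keys) // 2
        if depth == 0:  # limit task creation; recurse sequentially below this
            return merge(psort(keys[:mid], pool, 0), psort(keys[mid:], pool, 0))
        # The two halves are independent, so solve them in parallel.
        left = pool.submit(psort, keys[:mid], pool, depth - 1)
        right = pool.submit(psort, keys[mid:], pool, depth - 1)
        return merge(left.result(), right.result())

    with ThreadPoolExecutor(max_workers=8) as pool:
        print(psort([5, 3, 8, 1, 9, 2, 7, 4], pool))  # [1, 2, 3, 4, 5, 7, 8, 9]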
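The random sampling described in item 2 is the heart of sample sort. The sketch below picks random splitters, partitions the keys into non-overlapping buckets, and sorts each bucket; in a real implementation each bucket would be handled by a different processor, and the oversampling factor controls how balanced the buckets are with high probability. The bucket count and oversampling parameters are illustrative choices, not values from the original text.

    import bisect
    import random

    def sample_sort(keys, buckets=4, oversample=8):
        if len(keys) <= buckets * oversample:
            return sorted(keys)
        # Draw a random sample (conceptually, each processor contributes
        # a few of its own keys) and sort the sample.
        sample = sorted(random.sample(keys, buckets * oversample))
        # Every oversample-th sample key becomes a bucket boundary.
        splitters = sample[oversample::oversample][:buckets - 1]
        bins = [[] for _ in range(buckets)]
        for k in keys:
            # bisect_left counts the splitters less than k, i.e. k's bucket.
            bins[bisect.bisect_left(splitters, k)].append(k)
        # The buckets cover disjoint, ordered intervals, so sorting each one
        # (one per processor, in parallel) and concatenating sorts the input.
        return [k for b in bins for k in sorted(b)]

    data = [random.randrange(10000) for _ in range(1000)]
    assert sample_sort(data) == sorted(data)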
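Finally, the pointer jumping of item 3 can be shown on a successor array. Each round below computes all new pointers from the old ones before installing any of them, mimicking one synchronous parallel step; after about log n rounds every node points at the end of the list. The representation of the list as an array of successor indices is an illustrative convention.

    def point_to_end(succ):
        # succ[i] is the successor of node i; the last node points to itself.
        n = len(succ)
        rounds = 0
        while any(succ[i] != succ[succ[i]] for i in range(n)):
            # One parallel step: every node jumps to its successor's successor.
            succ = [succ[succ[i]] for i in range(n)]
            rounds += 1
        return succ, rounds

    # The list 3 -> 0 -> 4 -> 2 -> 1 ends at node 1.
    print(point_to_end([4, 1, 1, 0, 2]))  # ([1, 1, 1, 1, 1], 2)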
These techniques have led to efficient parallel algorithms in most problem areas for which
efficient sequential algorithms are known. In fact, some of the techniques originally
developed for parallel algorithms have led to improvements in sequential algorithms.