Sheet 2: Problem 1: Matrix Multiplication Using CREW PRAM

Uploaded by

bassemsadakah.bs
Copyright © All Rights Reserved
Sheet 2

Problem 1: Matrix Multiplication using CREW PRAM


a) Pseudo-code for n being a power of two

Algorithm 1 CREW Matrix Multiplication (n is a power of 2)

Input: n × n matrices A and B, where n is a power of 2
Output: n × n matrix C = A · B
// Phase 1: Parallel computation of products
for all processors P(i, j, k) where 0 ≤ i, j, k ≤ n − 1 do in parallel
    temp[i, j, k] ← A[i, k] × B[k, j]
end for
// Phase 2: Parallel reduction to compute sums
for l = 1 to log2 n do
    stride ← 2^(l−1)
    for all processors P(i, j, k) where 0 ≤ i, j, k ≤ n − 1 do in parallel
        // Only every (2 × stride)-th slot accumulates, so each slot has a single writer
        if k mod (2 × stride) = 0 then
            temp[i, j, k] ← temp[i, j, k] + temp[i, j, k + stride]
        end if
    end for
end for
// Phase 3: Copy results to C
for all processors P(i, j) where 0 ≤ i, j ≤ n − 1 do in parallel
    C[i, j] ← temp[i, j, 0]
end for
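The three phases can be checked with a small sequential simulation. This is only a sketch: the function name crew_matmul is hypothetical, the loops stand in for processors that would run simultaneously on a PRAM, and the reduction uses the test k mod (2 × stride) = 0 so that every pair of partial sums is combined in each round.

```python
def crew_matmul(A, B):
    """Sequentially simulate the three phases for n a power of two."""
    n = len(A)
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"

    # Phase 1: one product per simulated processor P(i, j, k).
    temp = [[[A[i][k] * B[k][j] for k in range(n)]
             for j in range(n)]
            for i in range(n)]

    # Phase 2: log2(n) rounds of pairwise sums along the k axis.
    # In a round, slot k (a multiple of 2*stride) reads slot k+stride,
    # which nobody writes in that round, so loop order does not matter.
    stride = 1
    while stride < n:
        for i in range(n):
            for j in range(n):
                for k in range(0, n, 2 * stride):
                    temp[i][j][k] += temp[i][j][k + stride]
        stride *= 2

    # Phase 3: the full sum for C[i][j] now sits in temp[i][j][0].
    return [[temp[i][j][0] for j in range(n)] for i in range(n)]
```

For example, crew_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]) gives [[19, 22], [43, 50]].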

b) Solution when n is not a power of two


For n not a power of two:

1. Pad the matrices with zeros to size m × m, where m is the smallest power of two such that m ≥ n

2. Perform matrix multiplication on the padded matrices using the algorithm from part (a)

3. Extract the n × n result (the top-left block) from the m × m product matrix

Complexity Analysis:

• m < 2n, because m is the smallest power of two with m ≥ n

• Time complexity remains O(log m) = O(log n)

• Processor complexity becomes O(m^3) = O((2n)^3) = O(n^3)

• Memory complexity similarly remains O(n^3)
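The padding steps can be sketched as follows. The helper names (next_power_of_two, pad, multiply_padded) are hypothetical, and naive_matmul is only a stand-in for the part (a) algorithm so that the example runs on its own.

```python
def next_power_of_two(n):
    """Smallest power of two m with m >= n (for n >= 1)."""
    m = 1
    while m < n:
        m *= 2
    return m

def pad(M, m):
    """Embed the n x n matrix M in the top-left corner of an m x m zero matrix."""
    n = len(M)
    return [row + [0] * (m - n) for row in M] + [[0] * m for _ in range(m - n)]

def naive_matmul(A, B):
    """Stand-in for the part (a) algorithm; any m x m multiplier would do."""
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def multiply_padded(A, B, multiply=naive_matmul):
    n = len(A)
    m = next_power_of_two(n)                # step 1: choose m, with n <= m < 2n
    C = multiply(pad(A, m), pad(B, m))      # step 2: multiply the padded matrices
    return [row[:n] for row in C[:n]]       # step 3: keep the top-left n x n block
```

The zero rows and columns contribute nothing to the products, so the top-left block of the padded product equals A · B.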

c) Cost-optimality analysis
The original solution uses O(n^3) processors and O(log n) time, giving a cost of
O(n^3 log n), which is not cost-optimal (the standard sequential algorithm runs in O(n^3) time).
Cost-optimal version:

Algorithm 2 Cost-Optimal CREW Matrix Multiplication

Input: n × n matrices A and B
Output: n × n matrix C = A · B
for all processors P(i, j) where 0 ≤ i, j ≤ n − 1 do in parallel
    sum ← 0
    for k = 0 to n − 1 do
        sum ← sum + A[i, k] × B[k, j]
    end for
    C[i, j] ← sum
end for

This version:

• Uses O(n^2) processors (one per output element)

• Each processor performs O(n) work sequentially

• Runs in O(n) time

• Total cost is O(n^2) × O(n) = O(n^3), matching the sequential algorithm

• Still fits the CREW model (entries of A and B are read concurrently, but each C[i, j] is written by exactly one processor)
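To make the cost comparison concrete, here is a small sketch that simulates the one-processor-per-output-element scheme and evaluates the dominant cost terms of both approaches. The function names are hypothetical, constant factors are ignored, and cost_tree assumes n is a power of two.

```python
def cost_optimal_matmul(A, B):
    """Simulate n^2 processors, each computing one C[i][j] in n sequential steps."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):            # the body below is the work of processor P(i, j)
            total = 0
            for k in range(n):
                total += A[i][k] * B[k][j]
            C[i][j] = total           # the only write to C[i][j]
    return C

def cost_tree(n):
    """Dominant cost term of the tree-reduction version: n^3 processors x log2(n) steps."""
    log_n = n.bit_length() - 1        # exact log2 for a power of two
    return n ** 3 * log_n

def cost_per_element(n):
    """Dominant cost term of the version above: n^2 processors x n steps."""
    return n ** 2 * n
```

For n = 8 the tree version has cost 512 × 3 = 1536, while the per-element version has cost 64 × 8 = 512, which equals the n^3 operation count of the sequential algorithm.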

Problem 2: Parallel array initialization


Algorithm 3 CREW Array Initialization

Input: Array A of size n, value X
Output: Array A initialized with value X
for all processors P(i) where 1 ≤ i ≤ n do in parallel
    A[i] ← X
end for

Properties:

• Uses O(n) processors (one per array element)

• Runs in O(1) time (constant time initialization)

• Operates on CREW PRAM (concurrent reads of X allowed, no write conflicts)

• Cost-optimal with total cost O(n) matching sequential initialization
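A minimal sketch of the initialization step, simulated sequentially. The function name parallel_init and the writes counter are hypothetical; the counter exists only to check that each cell receives exactly one write, which is the exclusive-write condition.

```python
def parallel_init(n, x):
    """Simulate one PRAM step in which processor P(i) writes x into its own cell."""
    A = [None] * n
    writes = [0] * n                  # number of writes each cell receives in this step
    for i in range(n):                # index i here is P(i + 1) in the 1-based pseudo-code
        A[i] = x                      # every processor reads the same x (concurrent read)
        writes[i] += 1
    # Exclusive write: no cell is written more than once in the step.
    assert all(w == 1 for w in writes)
    return A
```

With n processors and one step, the cost is n × 1 = n, the same as a sequential loop over the array.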
