
Parallel Programming and MPI
A course for IIT-M. September 2008
R Badrinath, STSD Bangalore
(ramamurthy.badrinath@hp.com)

© 2006 Hewlett-Packard Development Company, L.P.


The information contained herein is subject to change without notice
Context and Background
• IIT- Madras has recently added a good deal of compute power.
• Why –
− Further R&D in sciences, engineering
− Provide computing services to the region
− Create new opportunities in education and skills
−…
• Why this course –
− Update skills to program modern cluster computers
• Length – 2 theory and 2 practice sessions, 4 hrs each

2 September 2008 IIT-Madras


Audience Check

3
Contents
1. MPI_Init
2. MPI_Comm_rank
3. MPI_Comm_size
4. MPI_Send
5. MPI_Recv
6. MPI_Bcast
7. MPI_Create_comm
8. MPI_Sendrecv
9. MPI_Scatter
10. MPI_Gather
………………

Instead we
• Understand Issues
• Understand Concepts
• Learn enough to pick up from the manual
• Go by motivating examples
• Try out some of the examples

4 September 2008 IIT-Madras


Outline
• Sequential vs Parallel programming
• Shared vs Distributed Memory
• Parallel work breakdown models
• Communication vs Computation
• MPI Examples
• MPI Concepts
• The role of IO

5 September 2008 IIT-Madras


Sequential vs Parallel
• We are used to sequential programming – C, Java, C++,
etc. E.g., Bubble Sort, Binary Search, Strassen
Multiplication, FFT, BLAST, …
• Main idea – Specify the steps in perfect order
• Reality – We are used to parallelism a lot more than
we think – as a concept; not for programming
• Methodology – Launch a set of tasks; communicate to
make progress. E.g., sorting 500 answer papers by
making 5 equal piles, having them sorted by 5 people,
then merging them together.

6 September 2008 IIT-Madras


Shared vs Distributed Memory
Programming
• Shared Memory – All tasks access the same memory, hence the
same data (e.g., pthreads)
• Distributed Memory – All memory is local. Data sharing is by
explicitly transporting data from one task to another (send-receive
pairs in MPI, e.g.)

[Figure: tasks, each with its own program and memory, connected by a communications channel]

• HW – Programming model relationship – Tasks vs CPUs
• SMPs vs Clusters

7 September 2008 IIT-Madras


Designing Parallel Programs

8
Simple Parallel Program – sorting numbers
in a large array A
• Notionally divide A into 5 pieces
[0..99; 100..199; 200..299; 300..399; 400..499].
• Each part is sorted by an independent sequential
algorithm and left within its region.
• The resultant parts are merged by simply reordering
among adjacent parts.

9 September 2008 IIT-Madras


What is different – Think about…
• How many people doing the work. (Degree of
Parallelism)
• What is needed to begin the work. (Initialization)
• Who does what. (Work distribution)
• Access to work part. (Data/IO access)
• Whether they need info from each other to finish their
own job. (Communication)
• When are they all done. (Synchronization)
• What needs to be done to collate the result.

10 September 2008 IIT-Madras


Work Break-down
• Parallel algorithm
• Prefer simple intuitive breakdowns
• Usually highly optimized sequential algorithms are not
easily parallelizable
• Breaking work often involves some pre- or post-
processing (much like divide and conquer)
• Fine vs large grain parallelism and relationship to
communication

11 September 2008 IIT-Madras


Digression – Let’s get a simple MPI Program to work
#include <mpi.h>
#include <stdio.h>

int main()
{
    int total_size, my_rank;

    MPI_Init(NULL, NULL);

    MPI_Comm_size(MPI_COMM_WORLD, &total_size);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    printf("\n Total number of programs = %d, out of which "
           "rank of this process is %d\n", total_size, my_rank);

    MPI_Finalize();
    return 0;
}
12 September 2008 IIT-Madras
Getting it to work
• Compile it:
− mpicc -o simple simple.c   # If you want HP-MPI, set your path
                             # to include /opt/hpmpi/bin
• Run it
− This depends a bit on the system
− mpirun -np 2 simple
− qsub -l ncpus=2 -o simple.out /opt/hpmpi/bin/mpirun <your
program location>/simple
− [Fun: qsub -l ncpus=2 -I hostname ]

• Results are in the output file.


• What is mpirun ?
• What does qsub have to do with MPI?... More about qsub in a separate
talk.

13 September 2008 IIT-Madras


What goes on
• Same program is run at the same time on 2 different
CPUs
• Each is slightly different in that each returns different
values for some simple calls like MPI_Comm_rank.
• This gives each instance its identity
• We can make different instances run different pieces
of code based on this identity difference
• Typically it is an SPMD model of computation

14 September 2008 IIT-Madras


Continuing work breakdown…
Simple Example: Find shortest distances
PROBLEM:
Find shortest path distances

[Figure: a small weighted directed graph on 5 nodes]

Let nodes be numbered 0, 1, …, n-1

Let us put all of this in a matrix:
A[i][j] is the distance from i to j

     0   2   1  ..   6
     7   0  ..  ..  ..
     1   5   0   2   3
    ..  ..   2   0   2
    ..  ..  ..  ..   0

15 September 2008 IIT-Madras


Floyd’s (sequential) algorithm
for (k=0; k<n; k++)
  for (i=0; i<n; i++)
    for (j=0; j<n; j++)
      a[i][j] = min( a[i][j], a[i][k] + a[k][j] );

Observation:
For a fixed k,
Computing i-th row needs i-th row and k-th row

16 September 2008 IIT-Madras
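For reference, a self-contained version of the sequential algorithm above (a sketch: N, INF and the min helper are illustrative assumptions; INF stands for "no edge" and is kept small enough that INF+INF does not overflow):

#define N   5
#define INF 1000000   /* assumed "no edge" marker, small enough that INF+INF fits in an int */

static int min(int x, int y) { return x < y ? x : y; }

/* Sequential Floyd: on return, a[i][j] is the shortest distance from i to j */
void floyd(int a[N][N])
{
    int i, j, k;
    for (k = 0; k < N; k++)
        for (i = 0; i < N; i++)
            for (j = 0; j < N; j++)
                a[i][j] = min(a[i][j], a[i][k] + a[k][j]);
}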


Parallelizing Floyd
• Actually we just need n² tasks, with each task iterating
n times (once for each value of k).
• After each iteration we need to make sure everyone
sees the matrix.
• ‘Ideal’ for shared memory programming..
• What if we have fewer than n² tasks?... Say p<n.
• Need to divide the work among the p tasks.
• We can simply divide up the rows.

17 September 2008 IIT-Madras


Dividing the work
• Each task gets [n/p] rows, with the last possibly getting
a little more.
[Figure: the rows of A divided into blocks of [n/p] rows; task T0 owns the first block, and task Tq owns the block starting at row q x [n/p]. The i-th row sits in this task's block, while the k-th row may belong to another task – remember the observation.]

18 September 2008 IIT-Madras
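The pseudocode on the next slide uses helpers such as GET_BLOCK_OWNER, LOW_END_OF_MY_BLOCK and GET_MY_BLOCK_SIZE. A minimal sketch of how they could be written, assuming the simple distribution above (each task gets n/p rows and the last task also takes the remainder) – the formulas are an assumption, not taken from the course material:

static int block_size(int n, int p)  { return n / p; }

/* LOW_END_OF_MY_BLOCK: global index of the first row owned by task id */
static int low_end_of_block(int id, int n, int p)
{
    return id * block_size(n, p);
}

/* GET_MY_BLOCK_SIZE: number of rows owned by task id */
static int my_block_size(int id, int n, int p)
{
    return (id == p - 1) ? n - low_end_of_block(id, n, p)
                         : block_size(n, p);
}

/* GET_BLOCK_OWNER: which task owns global row k */
static int block_owner(int k, int n, int p)
{
    int owner = k / block_size(n, p);
    return owner < p - 1 ? owner : p - 1;   /* the last task owns the tail */
}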


/* “id” is TASK NUMBER; each node has only the part of A that
   it owns.  This is approximate code. */

The MPI Model…
- All nodes run the same code!!  p replica tasks!!  …Distributed Memory Model
- Sometimes they need to do different things
- Note that each node calls its own matrix by the same name a[ ][ ],
  but has only [n/p] rows of it.

for (k=0; k<n; k++) {
  current_owner_task = GET_BLOCK_OWNER(k);
  if (id == current_owner_task) {
    k_here = k - LOW_END_OF_MY_BLOCK(id);
    for (j=0; j<n; j++)
      rowk[j] = a[k_here][j];
  }
  /* rowk is broadcast by the owner and received by others..
     The MPI code will come here later */
  for (i=0; i<GET_MY_BLOCK_SIZE(id); i++)
    for (j=0; j<n; j++)
      a[i][j] = min(a[i][j], a[i][k] + rowk[j]);
}

19 September 2008 IIT-Madras
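The broadcast left for "later" in the comment above is a single collective call; a minimal sketch (every task, owner or not, makes the same call):

  /* rowk is broadcast by the owner; everyone else receives into rowk */
  MPI_Bcast(rowk, n, MPI_INT, current_owner_task, MPI_COMM_WORLD);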
The MPI model
• Recall MPI tasks are typically created when the jobs
are launched – not inside the MPI program (no
forking).
− mpirun usually creates the task set
− mpirun -np 2 a.out <args to a.out>
− a.out is run on all nodes and a communication channel is
set up between them
• Functions allow for tasks to find out
− Size of the task group
− One's own position within the group

20 September 2008 IIT-Madras


MPI Notions [ Taking from the example ]
• Communicator – A group of tasks in a program
• Rank – Each task’s ID in the group
− MPI_Comm_rank() … /* use this to set “id” */
• Size – Of the group
− MPI_Comm_size() … /* use to set “p” */
• Notion of send/receive/broadcast…
− MPI_Bcast() … /* use to broadcast rowk[] */

• For actual syntax use a good MPI book or manual


• Online resource: http://www-unix.mcs.anl.gov/mpi/www/

21 September 2008 IIT-Madras


MPI Prologue to our Floyd example
int a[MAX][MAX];
int n = 20;   /* real size of the matrix, can be read in */
int id, p;

MPI_Init(&argc, &argv);

MPI_Comm_rank(MPI_COMM_WORLD, &id);
MPI_Comm_size(MPI_COMM_WORLD, &p);
.
.   /* This is where all the real work happens */
.
MPI_Finalize();   /* Epilogue */

22 September 2008 IIT-Madras


This is the time to try out several
simple MPI programs using the
few functions we have seen.
- use mpicc
- use mpirun

23
Visualizing the execution

[Figure: the job is launched; the scheduler ensures 1 task per CPU; multiple tasks/CPUs may be on the same node; tasks run on CPUs.]

• MPI_Init, MPI_Comm_rank, MPI_Comm_size etc…
• Other initializations, like reading in the array
• For initial values of k, the task with rank 0 broadcasts row k, others receive
• For each value of k they do their computation with the correct rowk
• Loop above for all values of k
• Task 0 receives all blocks of the final array and prints them out
• MPI_Finalize
24 September 2008 IIT-Madras
Communication vs Computation
• Often communication is needed between iterations to complete
the work.
• Often, the more tasks there are, the more communication there
can be.
− In Floyd, bigger “p” indicates that “rowk” will be sent to a larger
number of tasks.
− If each iteration depends on more data, it can get very busy.
• This may mean network contention; i.e., delays.
• Try to count the number of “a”s in a string. Time vs p
• This is why for a fixed problem size increasing number of
CPUs does not continually increase performance
• This needs experimentation – problem specific

25 September 2008 IIT-Madras


Communication primitives
• MPI_Send(sendbuffer, senddatalength,
datatype, destination, tag,
communicator);
• MPI_Send(“Hello”, strlen(“Hello”),
MPI_CHAR, 2 , 100,
MPI_COMM_WORLD);
• MPI_Recv(recvbuffer, recvdatalength,
MPI_CHAR, source, tag,
MPI_COMM_WORLD,
&status);
• Send-Recv happen in pairs.

26 September 2008 IIT-Madras
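Putting a matching pair together, a minimal sketch (the ranks, the tag 100 and the 64-byte buffer are illustrative choices, not from the original slides):

#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    int rank;
    char buf[64];
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Send the string (including the terminating NUL) to rank 1, tag 100 */
        MPI_Send("Hello", strlen("Hello") + 1, MPI_CHAR, 1, 100, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Receive at most 64 chars from rank 0 with the matching tag */
        MPI_Recv(buf, 64, MPI_CHAR, 0, 100, MPI_COMM_WORLD, &status);
        printf("Rank 1 received: %s\n", buf);
    }

    MPI_Finalize();
    return 0;
}

Run it with at least two tasks, e.g. mpirun -np 2 ./a.out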


Collectives
• Broadcast is one-to-all communication
• Both receivers and sender call the same function
• All MUST call it. All end up with SAME result.
• MPI_Bcast (buffer, count, type, root, comm);
• Examples
− MPI_Bcast(&k, 1, MPI_INT, 0,
MPI_COMM_WORLD);
− Task 0 sends its integer k and all others receive it.
− MPI_Bcast(rowk, n, MPI_INT, current_owner_task,
MPI_COMM_WORLD);
− current_owner_task sends rowk to all others.

27 September 2008 IIT-Madras


Try out a simple MPI program with
send-recvs and broadcasts.

Try out Floyd’s algorithm.


What if you have to read a file to
initialize Floyd’s algorithm?

28
A bit more on Broadcast
Ranks:          0                        1                        2
x before:       0                        1                        2
All call:  MPI_Bcast(&x,1,..,0,..);  MPI_Bcast(&x,1,..,0,..);  MPI_Bcast(&x,1,..,0,..);
x after:        0                        0                        0
29 September 2008 IIT-Madras
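The picture above as a runnable sketch: every rank starts with x equal to its own rank, every rank calls MPI_Bcast with root 0, and every rank ends up with rank 0's value (run with 3 tasks to match the picture):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, x;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    x = rank;                                      /* before: x is 0, 1, 2, ... */
    MPI_Bcast(&x, 1, MPI_INT, 0, MPI_COMM_WORLD);  /* every task makes the same call */
    printf("Rank %d now has x = %d\n", rank, x);   /* after: x is 0 everywhere */

    MPI_Finalize();
    return 0;
}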


Other useful collectives
• MPI_Reduce(&values,&results,count,type,operator,
root,comm);
• MPI_Reduce(&x, &res, 1, MPI_INT, MPI_SUM,
9, MPI_COMM_WORLD);

• Task number 9 gets, in the variable res, the sum of
whatever was in x in all of the tasks (including itself).
• Must be called by ALL tasks.

30 September 2008 IIT-Madras
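Back to the earlier exercise of counting the “a”s in a string: a sketch of where MPI_Reduce fits, assuming the string has already been split so that each task holds its own chunk (my_chunk, my_chunk_len and id are assumed to have been set up elsewhere):

/* Each task counts the 'a's in its own chunk; the partial counts are
   then summed into rank 0 with MPI_Reduce. */
int i, my_count = 0, total_count = 0;

for (i = 0; i < my_chunk_len; i++)
    if (my_chunk[i] == 'a')
        my_count++;

MPI_Reduce(&my_count, &total_count, 1, MPI_INT, MPI_SUM,
           0, MPI_COMM_WORLD);              /* root 0 ends up with the total */

if (id == 0)
    printf("Total number of 'a's = %d\n", total_count);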


Scattering as opposed to broadcasting
• MPI_Scatterv(sndbuf, sndcount[], send_disp[], type,
recvbuf, recvcount, recvtype, root, comm);
• All nodes MUST call

[Figure: Rank0 holds the full send buffer; its pieces are scattered to Rank0, Rank1, Rank2 and Rank3.]
31 September 2008 IIT-Madras
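For the common case of equal-sized contiguous pieces, plain MPI_Scatter is enough; a small sketch (4 tasks with 4 ints each is an illustrative choice):

#include <mpi.h>
#include <stdio.h>

#define PER_TASK 4

int main(int argc, char **argv)
{
    int rank, np, i;
    int sendbuf[4 * PER_TASK];       /* only meaningful on the root; assumes np == 4 */
    int recvbuf[PER_TASK];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &np);

    if (rank == 0)
        for (i = 0; i < np * PER_TASK; i++)
            sendbuf[i] = i;

    /* Every task (root included) must call this; each gets its own PER_TASK-int piece */
    MPI_Scatter(sendbuf, PER_TASK, MPI_INT,
                recvbuf, PER_TASK, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Rank %d got %d..%d\n", rank, recvbuf[0], recvbuf[PER_TASK - 1]);

    MPI_Finalize();
    return 0;
}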


Common Communication pitfalls!!
• Make sure that communication primitives are called by
the right number of tasks.
• Make sure they are called in the right sequence.
• Make sure that you use the proper tags.
• If not, you can easily get into deadlock (“My program
seems to be hung”)

32 September 2008 IIT-Madras


More on work breakdown
• Finding the right work breakdown can be challenging
• Sometimes dynamic work breakdown is good
• Master (usually task 0) decides who will do what and
collects the results.
• E.g., you have a huge number of 5x5 matrices to
multiply (chained matrix multiplication).
• E.g., Search for a substring in a huge collection of
strings.

33 September 2008 IIT-Madras


Master-slave dynamic work assignment
[Figure: task 0 (the master) exchanges work items and results with slave tasks 1, 2, …]
34 September 2008 IIT-Madras


Master slave example – Reverse strings
Slave(){
  char work[MAX];
  int n;
  MPI_Status stat;
  do {
    MPI_Recv(work, MAX, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &stat);
    n = strlen(work);
    if (n == 0) break;   /* zero-length string: detecting the end */

    reverse(work);

    MPI_Send(work, n+1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
  } while (1);

  MPI_Finalize();
}

35 September 2008 IIT-Madras


Master slave example – Reverse strings
Master(){ /* rank 0 task */
  char *work, res[MAX];
  int i, n, unfinished_work;
  MPI_Status status;

  initialize_work_items();
  for (i = 1; i < np; i++) {            /* Initial work distribution */
    work = next_work_item();
    n = strlen(work) + 1;
    MPI_Send(work, n, MPI_CHAR, i, 0, MPI_COMM_WORLD);
  }
  unfinished_work = np - 1;             /* one outstanding item per slave */
  while (unfinished_work != 0) {
    MPI_Recv(res, MAX, MPI_CHAR, MPI_ANY_SOURCE, 0,
             MPI_COMM_WORLD, &status);
    process(res);
    work = next_work_item();
    if (work == NULL) {                 /* no more work: tell this slave to stop */
      MPI_Send("", 1, MPI_CHAR, status.MPI_SOURCE, 0, MPI_COMM_WORLD);
      unfinished_work--;
    } else {
      n = strlen(work) + 1;
      MPI_Send(work, n, MPI_CHAR, status.MPI_SOURCE, 0, MPI_COMM_WORLD);
    }
  }
}
36 September 2008 IIT-Madras
Master slave example
int main(int argc, char **argv)
{
  ...
  MPI_Comm_rank(MPI_COMM_WORLD, &id);
  MPI_Comm_size(MPI_COMM_WORLD, &np);
  if (id == 0)
    Master();
  else
    Slave();
  ...
}

37 September 2008 IIT-Madras


Matrix Multiply and Communication
Patterns

38
Block Distribution of Matrices
• Matrix Multiply:
− Cij = Σk (Aik * Bkj)
• Each task owns a block – its own part of A, B and C
• The old formula holds for blocks!
• BMR Algorithm – Example:
C21 = A20*B01 + A21*B11 + A22*B21 + A23*B31
Each term is a smaller block – a submatrix
39 September 2008 IIT-Madras
Block Distribution of Matrices
• Matrix Multiply:
− Cij = Σk (Aik * Bkj)
• BMR Algorithm, for the block
C21 = A20*B01 + A21*B11 + A22*B21 + A23*B31 :
− A22 is row-broadcast
− A22*B21 is added into C21
− B_1 (the column of B blocks) is rolled up one slot; our task now has B31
− Now repeat the above block of steps, except the item to broadcast is A23
Each is a smaller block – a submatrix
40 September 2008 IIT-Madras
Attempt doing this with just Send-Recv and Broadcast

41
Communicators and Topologies
• The BMR example shows the limitations of broadcast..
although there is a pattern
• Communicators can be created on subgroups of
processes.
• Communicators can be created that have a topology
− Will make programming natural
− Might improve performance by matching to hardware

42 September 2008 IIT-Madras


for (k = 0; k < s; k++) {
  sender = (my_row + k) % s;
  if (sender == my_col) {
    MPI_Bcast(&my_A, m*m, MPI_INT, sender, row_comm);
    T = my_A;
  } else {
    MPI_Bcast(&T, m*m, MPI_INT, sender, row_comm);
  }
  my_C = my_C + T x my_B;   /* block multiply-accumulate (pseudocode) */
  MPI_Sendrecv_replace(my_B, m*m, MPI_INT, dest, 0,
                       source, 0, col_comm, &status);
}

43 September 2008 IIT-Madras


Creating topologies and communicators
• Creating a grid
• MPI_Cart_create(MPI_COMM_WORLD, 2,
dim_sizes, istorus, canreorder, &grid_comm);
− int dim_sizes[2]; int istorus[2]; int canreorder; MPI_Comm
grid_comm;

• Divide a grid into rows – each with its own communicator

• MPI_Cart_sub(grid_comm, free, &row_comm);
− MPI_Comm row_comm; int free[2];

44 September 2008 IIT-Madras
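A sketch of how the grid and the row/column communicators used in the BMR code could be set up; the names (s, my_row, my_col, row_comm, col_comm) follow the earlier pseudocode, and a perfect-square number of tasks is assumed:

#include <mpi.h>
#include <math.h>

MPI_Comm grid_comm, row_comm, col_comm;
int s, my_row, my_col;

void setup_grid(void)
{
    int p, grid_rank, dim_sizes[2], istorus[2], coords[2], free_dims[2];

    MPI_Comm_size(MPI_COMM_WORLD, &p);
    s = (int)sqrt((double)p);             /* assumes p is a perfect square */

    dim_sizes[0] = dim_sizes[1] = s;      /* an s x s grid of tasks */
    istorus[0]   = istorus[1]   = 1;      /* wrap around, so rolling B is a shift on a torus */
    MPI_Cart_create(MPI_COMM_WORLD, 2, dim_sizes, istorus, 1, &grid_comm);

    MPI_Comm_rank(grid_comm, &grid_rank);
    MPI_Cart_coords(grid_comm, grid_rank, 2, coords);
    my_row = coords[0];
    my_col = coords[1];

    free_dims[0] = 0; free_dims[1] = 1;   /* keep the column dimension: one comm per grid row */
    MPI_Cart_sub(grid_comm, free_dims, &row_comm);

    free_dims[0] = 1; free_dims[1] = 0;   /* keep the row dimension: one comm per grid column */
    MPI_Cart_sub(grid_comm, free_dims, &col_comm);
}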


Try implementing the BMR
algorithm with communicators

45
A brief on other MPI Topics – The last leg
• MPI+Multi-threaded / OpenMP
• One sided Communication
• MPI and IO

46 September 2008 IIT-Madras


MPI and OpenMP
• Grain
• Communication
• Where does the interesting pragma (… omp for) fit in our MPI Floyd?
• How do I assign exactly one MPI task per CPU?

47 September 2008 IIT-Madras
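One plausible answer for our MPI Floyd (a sketch, not the course's official solution): within one iteration of k, the rows of this task's block are independent once rowk has been broadcast, so the row loop can be shared among OpenMP threads while MPI handles the broadcast between nodes; compile with the compiler's OpenMP flag (e.g., -fopenmp):

  /* Inside the k loop, after rowk has been broadcast:
     threads split this task's rows among themselves */
  #pragma omp parallel for private(j)
  for (i = 0; i < GET_MY_BLOCK_SIZE(id); i++)
      for (j = 0; j < n; j++)
          a[i][j] = min(a[i][j], a[i][k] + rowk[j]);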


One-Sided Communication
• Have no corresponding send-recv pairs!
• RDMA
• Get
• Put

48 September 2008 IIT-Madras
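A very small sketch of the idea using the MPI-2 one-sided calls (the window size and the value 42 are illustrative):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, local = -1, value = 42;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each task exposes one int in a window; rank 0 "puts" a value directly
       into rank 1's window, with no matching receive on rank 1. */
    MPI_Win_create(&local, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);                    /* open an access epoch */
    if (rank == 0)
        MPI_Put(&value, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
    MPI_Win_fence(0, win);                    /* complete the one-sided transfers */

    if (rank == 1)
        printf("Rank 1's window now holds %d\n", local);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}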


IO in Parallel Programs
• Typically a root task, does the IO.
− Simpler to program
− Natural because of some post processing occasionally needed
(sorting)
− All nodes generating IO requests might overwhelm the
fileserver, essentially sequentializing it.
• Performance not the limitation for Lustre/SFS.
• Parallel IO interfaces such as MPI-IO can make use of
parallel filesystems such as Lustre.

49 September 2008 IIT-Madras
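For contrast with the root-does-all-IO pattern, a minimal MPI-IO sketch in which every task writes its own block of one shared file (the filename and block size are illustrative):

#include <mpi.h>

#define PER_TASK 4

int main(int argc, char **argv)
{
    int i, rank, buf[PER_TASK];
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (i = 0; i < PER_TASK; i++)
        buf[i] = rank * PER_TASK + i;

    MPI_File_open(MPI_COMM_WORLD, "out.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Offset is in bytes: each rank writes past the blocks of the lower ranks */
    MPI_File_write_at(fh, (MPI_Offset)rank * PER_TASK * sizeof(int),
                      buf, PER_TASK, MPI_INT, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}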


MPI-BLAST exec time vs other time [4]
[Chart]

50 September 2008 IIT-Madras


How IO/Comm Optimizations help MPI-BLAST [4]
[Chart]

51 September 2008 IIT-Madras


What did we learn?
• Distributed Memory Programming Model
• Parallel Algorithm Basics
• Work Breakdown
• Topologies in Communication
• Communication Overhead vs Computation
• Impact of Parallel IO

52 September 2008 IIT-Madras


What MPI Calls did we see here?
1. MPI_Init
2. MPI_Finalize
3. MPI_Comm_size
4. MPI_Comm_rank
5. MPI_Send
6. MPI_Recv
7. MPI_Sendrecv_replace
8. MPI_Bcast
9. MPI_Reduce
10. MPI_Cart_create
11. MPI_Cart_sub
12. MPI_Scatter

53 September 2008 IIT-Madras


References
1. Parallel Programming in C with MPI and OpenMP, M J Quinn,
TMH. This is an excellent practical book. Motivated much of the
material here, specifically Floyd’s algorithm.
2. The BMR algorithm for matrix multiply and the topology ideas are
motivated by
http://www.cs.indiana.edu/classes/b673/notes/matrix_mult.html
3. MPI online manual
http://www-unix.mcs.anl.gov/mpi/www/
4. Efficient Data Access For Parallel BLAST, IPDPS’05

54 September 2008 IIT-Madras
