EE8218 Lab 2
ASSIGNMENT No. 02
*By signing above you attest that you have contributed to this submission and
confirm that all work you have contributed to this submission is your own work.
Any suspicion of copying or plagiarism in this work will result in an investigation
of Academic Misconduct and may result in a “0” on the work, an “F” in the
course, or possibly more severe penalties, as well as a Disciplinary Notice on
your academic record under the Student Code of Academic Conduct, which can
be found online at: http://www.ryerson.ca/senate/policies/pol60.pdf.
Objective:
This lab is about installing and getting familiar with the MPI software, and about visualizing the
performance improvement gained by running a program in parallel on a network of computers.
Introduction:
Message Passing Interface (MPI) is a portable message-passing standard usable from many
programming languages such as FORTRAN, C, C++, and Java. MPI software allows a computer
program to run in parallel on a network of computers, increasing the computational power of the
system. Parallel computing is especially important for handling big data, since a sequential
algorithm (single processor) has many limitations when it comes to big data; depending on the
data, it might take years to solve a problem. However, dividing the problem and solving the
pieces in parallel on a network of computers helps solve the problem much faster.
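As a minimal, self-contained illustration of this model, the sketch below starts several MPI
processes and has each one report its rank. The build and launch commands in the comment
assume a typical MPICH install like the one used in this lab, and the hostfile name is a
placeholder.
/* hello_mpi.c -- minimal MPI sketch: each process reports its rank.
 * Build and run (assumed MPICH-style commands; hostfile name is a placeholder):
 *   mpicc hello_mpi.c -o hello_mpi
 *   mpiexec -f hosts -n 4 ./hello_mpi
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int myid, numprocs;
    MPI_Init(&argc, &argv);                   /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);     /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs); /* total number of processes */
    printf("Process %d of %d is alive\n", myid, numprocs);
    MPI_Finalize();                           /* shut down the MPI runtime */
    return 0;
}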
That being said, there are certain things one has to keep in mind for parallel computing. The data
has to be large and independent, since communication time dominates for small or dependent
data, resulting in poor performance on the network of computers. Similarly, the data has to be
divided equally across the network. If the data is not distributed equally, the process with the
smallest task will finish quickly and stay idle until the rest of the processors finish their tasks,
which results in a long overhead. In this report, these problems are explicitly discussed and
demonstrated with results; a sketch of an even split appears below.
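A minimal sketch of such an equal division is shown below; the function and variable names are
illustrative and not taken from the lab code.
/* Sketch (illustrative names): dividing n rows of work as evenly as
 * possible among p processes. When n does not divide evenly, the first
 * (n % p) processes take one extra row, so no process is left idle for
 * long waiting on a much larger task elsewhere. */
int rowsForRank(int rank, int n, int p) {
    int base = n / p;                    /* rows every process receives */
    int extra = (rank < n % p) ? 1 : 0;  /* spread the remainder evenly */
    return base + extra;
}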
Experiment:
In order to visualize and compare the performance of parallel computing, pairs of square matrices
of size 1000x1000, 3000x3000, 5000x5000, and 6000x6000 were initialized. The two matrices in
each pair were then multiplied together using a sequential algorithm as well as a parallel
algorithm on networks of 4 and 6 computers.
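For reference, the sequential baseline is the standard triple-loop matrix multiplication, sketched
below for an n x n case (the function name is illustrative; the parallel MPI version is listed in
Appendix A). Its cost grows with the cube of the matrix size, which is why the larger cases take
so much longer.
/* Sequential baseline: standard triple-loop multiply of two n x n
 * matrices on a single processor; roughly n^3 multiply-adds (C99 VLAs). */
void multiplySequential(int n, int a[n][n], int b[n][n], int c[n][n]) {
    int i, j, k;
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++) {
            c[i][j] = 0;
            for (k = 0; k < n; k++)
                c[i][j] += a[i][k] * b[k][j];
        }
}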
Size of Matrices Computational Time (seconds) using the sequential algorithm
1000 x 1000 10.36 Seconds
3000 x 3000 339.92 Seconds (~5.7 Minutes)
5000 x 5000 1577.78 Seconds (~26.3 Minutes)
6000 x 6000 2523.55 Seconds (~42 Minutes)
Table 1 above represents the size of the two square matrices used for the application and the time
the system took to multiply those two matrices using the sequential algorithm.
Size of Matrices Computational Time (seconds) in a network of 4 computers
1000 x 1000 3.44 Seconds
3000 x 3000 147.04 Seconds (~2.45 Minutes)
5000 x 5000 678.63 Seconds (~11.5 Minutes)
6000 x 6000 979.52 Seconds (~16.5 Minutes)
Table 2 above represents the size of two square matrices used for the application and the time the
system took to multiply those two matrices in a network of 4 computers.
Result Comparison:
As the matrices become bigger, there are more values to compute, and the sequential algorithm
waits for each command to complete before executing the next, which results in a very slow
response. The parallel algorithm, in contrast, divides the workload across the network of
computers and executes the pieces at the same time. After completing its share of the work, each
computer sends its result to the main host computer, which combines the results, finishing the
whole job much faster.
Size of Matrices Sequential 4 Computer Network 6 Computer Network
1000 x 1000 10.36 3.44 2.64
3000 x 3000 339.92 147.04 125.08
5000 x 5000 1577.78 678.63 525.6
6000 x 6000 2523.55 979.52 625.6
Table 3 above compares the computation time (in seconds) of the sequential algorithm, the
4-computer network, and the 6-computer network for each matrix size.
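The speedup of a parallel run can be read directly from Table 3 as speedup = sequential time /
parallel time. For example, for the 3000x3000 matrices the 4-computer network gives
339.92 / 147.04 ≈ 2.31x and the 6-computer network gives 339.92 / 125.08 ≈ 2.72x, well below
the ideal 4x and 6x.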
Hence, the graph below represents the computation time for all matrices, 1000x1000, 3000x3000,
5000x5000 and 6000x6000.
[Bar graph: computation time in seconds (y-axis, 0 to 3000) versus matrix size (1000X1000,
3000X3000, 5000X5000, 6000X6000) for the Sequential, 4 Computer network, and 6 computer
network runs.]
Conclusion:
Based on the above results we see that in parallel computing, even when using a 4-computer or a
6-computer network, we still do not get a response 4 or 6 times faster. There could be several
reasons for this. First, in this lab the maximum matrix size used was 6000x6000, which is still not
big enough. It could also be a result of the network speed, since we are communicating over a
network of computers; to achieve a better result, we would have to use a faster network, such as
fiber optics.
However, we do see from the above graph that as the matrix size increases, the time difference
between the sequential run, the 4-computer network, and the 6-computer network increases.
Appendix A:
Program source code used to verify that the program works
#include "/usr/local/mpich-3.1.4/include/mpi.h"
#include <stdio.h>
#include <math.h>
#define sizeOfMatrix 1000
int matrix1[sizeOfMatrix][sizeOfMatrix];
int matrix2[sizeOfMatrix][sizeOfMatrix];
int result[sizeOfMatrix][sizeOfMatrix];
int row, colum;
int n, myid, numprocs;
int tempArray[sizeOfMatrix * sizeOfMatrix];
int myid, numprocs, temp;
double startwtime = 0.0, endwtime;
int namelen, ierr, icount;
int i, j, k;
int columNumber = 0;
char processor_name[MPI_MAX_PROCESSOR_NAME];
MPI_Status status;
void initialize();
void display();
/* NOTE: the body of main() between the declarations above and the timing /
 * finalize code below was missing from the original listing. The section
 * marked "reconstructed sketch" is an assumption: a typical row-wise MPI
 * matrix multiply that fits the buffers declared above, assuming
 * sizeOfMatrix divides evenly by the number of processes. */
int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Get_processor_name(processor_name, &namelen);

    if (myid == 0) {
        initialize();             /* fill matrix1 and matrix2 with random values */
        startwtime = MPI_Wtime(); /* start the wall-clock timer on the master */
    }

    /* --- reconstructed sketch begins --- */
    int rowsPerProc = sizeOfMatrix / numprocs; /* rows handled by each process */

    /* every process needs all of matrix2 */
    MPI_Bcast(matrix2, sizeOfMatrix * sizeOfMatrix, MPI_INT, 0, MPI_COMM_WORLD);

    /* hand each process its block of rows of matrix1 */
    MPI_Scatter(matrix1, rowsPerProc * sizeOfMatrix, MPI_INT,
                tempArray, rowsPerProc * sizeOfMatrix, MPI_INT,
                0, MPI_COMM_WORLD);

    /* multiply the local rows by matrix2, writing each finished row back
     * over the input row (safe: output row i needs only input row i,
     * which is fully read before it is overwritten) */
    for (i = 0; i < rowsPerProc; i++) {
        int rowBuf[sizeOfMatrix];
        for (j = 0; j < sizeOfMatrix; j++) {
            temp = 0;
            for (k = 0; k < sizeOfMatrix; k++)
                temp += tempArray[i * sizeOfMatrix + k] * matrix2[k][j];
            rowBuf[j] = temp;
        }
        for (j = 0; j < sizeOfMatrix; j++)
            tempArray[i * sizeOfMatrix + j] = rowBuf[j];
    }

    /* collect the finished rows back into result on the master */
    MPI_Gather(tempArray, rowsPerProc * sizeOfMatrix, MPI_INT,
               result, rowsPerProc * sizeOfMatrix, MPI_INT,
               0, MPI_COMM_WORLD);
    /* --- reconstructed sketch ends --- */

    if (myid == 0) {
        // display();
        endwtime = MPI_Wtime();
        printf("wall clock time = %f\n", endwtime - startwtime);
    }
    MPI_Finalize();
    return 0;
}
void initialize() {
    for (row = 0; row < sizeOfMatrix; row++) {
        for (colum = 0; colum < sizeOfMatrix; colum++) {
            // matrix1[row][colum] = 1;
            // matrix2[row][colum] = 2;
            matrix1[row][colum] = rand() % 100;
            matrix2[row][colum] = rand() % 100;
        }
    }
}
void display() {
    int counter = 0;
    for (counter = 0; counter < 3; counter++) {
        for (row = 0; row < sizeOfMatrix; row++) {
            for (colum = 0; colum < sizeOfMatrix; colum++) {
                if (counter == 0)
                    printf(" %d ", matrix1[row][colum]);
                else if (counter == 1)
                    printf(" %d ", matrix2[row][colum]);
                else if (counter == 2)
                    printf(" %d ", result[row][colum]);
            }
            printf("\n");
        }
        printf("\n");
    }
}