
THE MAPREDUCE-BASED APPROACH TO IMPROVE THE SHORTEST PATH COMPUTATION

Copyright © 2023 the author(s). This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract: Sequential algorithms for problems on graph networks are not only intricate to design but also have considerable computational complexity. Such algorithms must therefore be parallelized to share the work and reduce computation time. For these reasons, it is crucial to parallelize algorithms that find the shortest path in extended graphs. A study of an algorithm that finds the shortest path from a source node to all other nodes on the MapReduce architecture is thus essential for dealing with the many real-world problems that involve huge input data. MapReduce processes (Key, Value) pairs independently of one another, so multiple processes can be assigned to execute simultaneously on the Hadoop system to reduce computation time.

Keywords: Shortest path, Graph, Algorithm, Hadoop, MapReduce.

2020 AMS Subject Classification: 68M14, 68Q10

1. INTRODUCTION
Consider an extended graph G = (V, E) with a set of vertices V and a set of edges E, where edges
can be directed or undirected. Each edge (u, v) ∈ E has a weight w(u, v). The shortest path
problem has three cases:
(a) Finding the shortest path from a source node to all other nodes (1-n);
(b) Finding the shortest path from a source node to a destination node (1-1);
(c) Finding the shortest path between every pair of vertices (n-n).
To solve these problems effectively on computers, it is crucial to build parallel algorithms.
The common approach is to convert sequential algorithms into parallel algorithms, or to
convert parallel algorithms into other, more suitable parallel algorithms that are equivalent to the
original algorithms.
In [3], [4], the authors construct parallel all-pairs shortest path algorithms on a MapReduce
architecture. Parallel shortest path algorithms based on A* on a MapReduce architecture are
implemented in [5], [6], [7], [8]. In [9], [10], [11], [12], the authors describe the parallel
data-processing paradigm with Hadoop.

In this paper, we implement a parallel version of the single-source shortest path algorithm (1-n),
based on adjacency lists, on the MapReduce architecture. In addition to the abstract, introduction,
conclusion, and references, the paper presents five algorithms: Algorithm 1: Find the shortest path
algorithm; Algorithm 2: BFS algorithm; Algorithm 3: Mapper algorithm; Algorithm 4: Reducer
algorithm; Algorithm 5: Random graph create algorithm.
We develop an experimental program on a Hadoop system and then use specific data to evaluate
and compare the results of the new parallel algorithms with the sequential algorithm.

2. HADOOP AND MAPREDUCE

Hadoop has four modules:

- Hadoop Common: The Java libraries and utilities required by the other modules.
These libraries provide file system and OS layer abstractions, and contain the Java code needed
to start Hadoop.
- Hadoop YARN: A framework for managing the processes and resources of clusters.
- Hadoop Distributed File System (HDFS): A distributed file system that provides
high-throughput access for data mining applications.
- Hadoop MapReduce: A YARN-based system for parallel processing of large data sets.
MapReduce:
MapReduce works on a simple principle: an operation takes a set of key/value pairs as input and
produces a set of key/value pairs as output. MapReduce expresses a computation using just two
functions, Map and Reduce. The Map function takes a key/value pair as input and outputs a set of
intermediate key/value pairs. These intermediate pairs are then grouped, and the pairs with the
same key are passed to the Reduce function, which combines the values of each key into
aggregated values.
The Map task is distributed across the storage nodes; the distribution is done automatically by
splitting the input data. The Reduce task is also distributed, by grouping the intermediate
key/value pairs that share a key. A MapReduce cluster basically consists of a Master node and
Worker nodes. The Master node is responsible for managing and regulating the Workers
[9-12, 14, 15].
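The principle described above can be illustrated with a minimal in-memory sketch in Python. It is not Hadoop code: the names run_mapreduce, map_edges and reduce_min are hypothetical and exist only for this illustration, and the whole job runs in a single process.

```python
from collections import defaultdict

def run_mapreduce(records, map_fn, reduce_fn):
    """Run one in-memory map -> group -> reduce pass over (key, value) records."""
    # Map: each input pair may produce any number of intermediate pairs.
    intermediate = []
    for key, value in records:
        intermediate.extend(map_fn(key, value))
    # Group: collect the intermediate values that share a key.
    groups = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)
    # Reduce: one call per distinct key.
    output = []
    for key in sorted(groups):
        output.extend(reduce_fn(key, groups[key]))
    return output

def map_edges(node, edges):
    # Emit (neighbor, weight) for every outgoing edge of `node`.
    for neighbor, weight in edges:
        yield neighbor, weight

def reduce_min(node, weights):
    # Keep the lightest incoming edge of each node.
    yield node, min(weights)

records = [(1, [(2, 7), (3, 5)]), (2, [(3, 7), (4, 6)])]
result = run_mapreduce(records, map_edges, reduce_min)
```

Because map_fn sees one pair at a time and reduce_fn sees one key at a time, both loops could be split across workers without changing the result, which is the property the paper relies on.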
According to the Hadoop documentation [6, 9], Hadoop is an Apache open source framework
inspired by the Google File System [6], [10]. It allows parallel processing of distributed data sets
across a cluster of multiple nodes connected in a master/slave architecture. Hadoop consists
of two main components: HDFS [6], [7], [11] and MapReduce [11], [12].
The first component is the Hadoop Distributed File System (HDFS). HDFS is designed to
support very large data files. It is distributed, scalable and fault-tolerant. A Big Data
file uploaded into HDFS is split into blocks of a size defined by the client, and the blocks are
replicated across the cluster nodes. The master node (NameNode) manages the distributed file
system, its namespace and its metadata, while the slave nodes (DataNodes) manage the storage of
the blocks and periodically report their status to the NameNode.

Figure 1. Typical components of a Hadoop cluster

As shown in Figure 1 [5], a Hadoop system consists of a master node and several slave nodes.
The master comprises a NameNode and a JobTracker, which manage data storage and the
scheduling of computation jobs, respectively. A slave node can be a DataNode for data storage
and/or a TaskTracker for processing computation jobs. A secondary NameNode can be used to
replicate the data of the NameNode to provide a high quality of service.

The second component is the MapReduce programming model for intensive parallel computation
on large data sets. To ensure good parallelism, the input and output data need
to be stored in HDFS. In the MapReduce framework, the master node works as the JobTracker
and the slave nodes as TaskTrackers. The JobTracker is responsible for coordinating
the job execution. Each TaskTracker runs the tasks submitted to it by the JobTracker.
Figure 2 [8] shows the execution workflow of a MapReduce job. The workflow is
made up of two main phases:
(a) The Map phase, which contains the following steps:
1. The input file is split into several pieces of data. Each piece is called a split or a chunk.
2. Each slave node hosts a map task, called a mapper, which reads the content of the
corresponding input split from the distributed file system.
3. Each mapper converts the content of its input split into a sequence of key-value pairs and
calls the user-defined Map procedure for each key-value pair. The intermediate pairs it
produces are buffered in memory.
4. Periodically, the buffered intermediate key-value pairs are written to a local intermediate
file, called a segment file. In each file, the data items are sorted by key. A mapper node
can host several segment files; their number depends on the number of reducer nodes.
Intermediate data destined for different reducer nodes are written into different files.
A partitioning function ensures that pairs with the same key are always
allocated to the same segment file.
(b) The Reduce phase, which contains the following steps:
1. On the completion of a map task, the reducer node pulls its corresponding
segments.
2. When a reducer has read all intermediate data, it sorts the data by key. All occurrences of the
same key are grouped together. If the amount of intermediate data is too large to fit in
memory, an external sort is used. The reducer then merges the data to produce
a single value for each key.
3. Each reducer iterates over the sorted intermediate data and passes each key-value pair to
the reduce function.
4. Each reducer writes its result to the distributed file system.
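The partitioning, sorting and grouping steps listed above can be sketched in Python as follows. This is an illustration under stated assumptions, not Hadoop's implementation: the functions partition and shuffle are hypothetical names, and the choice of a CRC32 hash modulo the number of reducers is an assumption made here to get a deterministic partitioning function.

```python
import zlib
from itertools import groupby
from operator import itemgetter

def partition(key, num_reducers):
    # Deterministic hash (assumed here), so equal keys always land in the same segment.
    return zlib.crc32(str(key).encode("utf-8")) % num_reducers

def shuffle(pairs, num_reducers):
    # Map side: route each pair to the segment of its destination reducer.
    segments = [[] for _ in range(num_reducers)]
    for key, value in pairs:
        segments[partition(key, num_reducers)].append((key, value))
    # Reduce side: sort each segment by key, then group equal keys together.
    grouped = []
    for segment in segments:
        segment.sort(key=itemgetter(0))
        grouped.append([(k, [v for _, v in g])
                        for k, g in groupby(segment, key=itemgetter(0))])
    return grouped

# Tentative (node, label) pairs such as a mapper might emit.
pairs = [(3, 14), (4, 13), (3, 5), (4, 16), (5, 15)]
grouped = shuffle(pairs, num_reducers=2)
```

Each entry of grouped is the input of one reducer: a list of keys, each with all of its values, so a reducer that takes the minimum per key never needs data held by another reducer.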

Figure 2. MapReduce execution overview

3. FIND THE SHORTEST PATH ALGORITHM

3.1. Adjacency – List

In many applications of graph theory, the adjacency list is the most appropriate
representation of the graph. In this representation, for each vertex v of the graph we
store a list of its neighbors, which we denote by {Node v, u ∈ V | (v, u) ∈ E, w(v, u)} with
G = (V, E, w).
Example 1: An undirected graph with 12 vertices is shown in Figure 3, and its adjacency list is
given in Table 1.

Table 1. Adjacency list

Vertex   Adjacent vertices (neighbor,weight)
1        2,7    3,5
2        3,7    4,6
3        4,11   5,10   6,10
4        1,4    2,1    8,5
5        7,20
6        3,6    7,2    8,15   11,10
7        11,7
8        3,4    4,18   6,10   9,20
10       8,6
11       10,15  12,5
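As a small illustration, the adjacency list of Table 1 can be held in a Python dictionary. The names GRAPH, vertices and edge_count are hypothetical, and the entry for vertex 11 assumes weight 15 for the edge to vertex 10, as given in Table 2.

```python
# Table 1 as a dict: vertex -> list of (neighbor, weight) pairs.
GRAPH = {
    1: [(2, 7), (3, 5)],
    2: [(3, 7), (4, 6)],
    3: [(4, 11), (5, 10), (6, 10)],
    4: [(1, 4), (2, 1), (8, 5)],
    5: [(7, 20)],
    6: [(3, 6), (7, 2), (8, 15), (11, 10)],
    7: [(11, 7)],
    8: [(3, 4), (4, 18), (6, 10), (9, 20)],
    10: [(8, 6)],
    11: [(10, 15), (12, 5)],
}

def vertices(graph):
    """All vertices, including those that appear only as neighbors."""
    found = set(graph)
    for edges in graph.values():
        found.update(neighbor for neighbor, _ in edges)
    return sorted(found)

def edge_count(graph):
    """Number of (vertex, neighbor) entries stored in the adjacency list."""
    return sum(len(edges) for edges in graph.values())
```

Vertices 9 and 12 have no row of their own because they have no outgoing entries; vertices(GRAPH) still reports them because they occur as neighbors.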
3.2. Find the shortest path algorithm
The algorithm for finding the shortest path from a vertex a to a vertex z in the graph is as follows:

Algorithm 1: Find the shortest path algorithm

Input: A connected graph G = (V, E, w) with w(i, j) ≥ 0 ∀ (i, j) ∈ E, a specified starting vertex a and a destination vertex z.
Output: L(z), the length of the shortest path from a to z, and the shortest path itself (if L(z) < +∞).
Step 1. Initialize:
Assign L(a) := 0. ∀ x ≠ a assign L(x) := +∞. Assign T := V.
Assign P(x) := ∅, ∀ x ∈ V (P(x) is the vertex preceding x on the shortest path from a to x).
Step 2. Calculate m := min{L(u) | u ∈ T}.
If m = +∞, return "There is no path from a to z". Finish.
Else (m < +∞), select v ∈ T such that L(v) = m, assign T := T - {v} and go to Step 3.
Step 3.
- If z = v, then L(z) is the shortest distance from a to z. Tracing the predecessors back from z,
we obtain the shortest path as follows: assign z1 = P(z), z2 = P(z1), …, zk = P(zk-1), a = P(zk).
It follows that the shortest path is a → zk → zk-1 → … → z1 → z. Finish.
- Else (z ≠ v), go to Step 4.
Step 4. For every x ∈ T adjacent to v (a successor of v):
If L(x) > L(v) + w(v, x), then assign L(x) := L(v) + w(v, x) and P(x) := v.
Go back to Step 2.

Theorem 1. The complexity of the algorithm is O(n^3) [13].
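A minimal Python sketch of Algorithm 1 is given below for illustration. It assumes a binary heap in place of the linear search for the minimum in Step 2, which is a substitution made here and not the paper's formulation, so its running time is not the one stated in Theorem 1. The names GRAPH and shortest_path are hypothetical; the graph is the one of Table 1, with weight 15 on the edge from vertex 11 to vertex 10 as in Table 2.

```python
import heapq
import math

GRAPH = {
    1: [(2, 7), (3, 5)],
    2: [(3, 7), (4, 6)],
    3: [(4, 11), (5, 10), (6, 10)],
    4: [(1, 4), (2, 1), (8, 5)],
    5: [(7, 20)],
    6: [(3, 6), (7, 2), (8, 15), (11, 10)],
    7: [(11, 7)],
    8: [(3, 4), (4, 18), (6, 10), (9, 20)],
    10: [(8, 6)],
    11: [(10, 15), (12, 5)],
}

def shortest_path(graph, a, z):
    """Return (L(z), path) from a to z, or (inf, []) if z cannot be reached."""
    L = {a: 0}          # Step 1: labels; a missing entry means +infinity
    P = {}              # predecessor of each vertex on its shortest path
    done = set()        # vertices already removed from T
    heap = [(0, a)]
    while heap:
        m, v = heapq.heappop(heap)      # Step 2: vertex of T with the smallest label
        if v in done:
            continue                    # stale heap entry, label was improved later
        done.add(v)
        if v == z:                      # Step 3: trace predecessors back to a
            path = [z]
            while path[-1] != a:
                path.append(P[path[-1]])
            return m, path[::-1]
        for x, w in graph.get(v, []):   # Step 4: relax the edges leaving v
            if x not in done and m + w < L.get(x, math.inf):
                L[x] = m + w
                P[x] = v
                heapq.heappush(heap, (L[x], x))
    return math.inf, []
```

For example, shortest_path(GRAPH, 1, 10) follows the entries of the adjacency list as directed edges and returns the same label and path for vertex 10 as the final reducer output in Section 4.2.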

Example 2: Consider the graph of Example 1. Applying the shortest path algorithm, each vertex is
labeled with the triple (v, L(v), P(v)), where v is the vertex, L(v) is the length of the shortest
path to v, and P(v) is the vertex preceding v on the shortest path to v. The result is shown in
Figure 3.

3.3. BFS algorithm

Given a graph G = (V, E), a tree T is called a covering tree or spanning tree of G if T is a
spanning subgraph of G. In this algorithm, Q denotes the queue of vertices and adjacent(x)
the list of vertices adjacent to vertex x. The BFS algorithm is described as follows.

Algorithm 2: BFS algorithm

Input: A connected graph G = (V, E) and a specified starting vertex v
Output: The breadth-first indices of the vertices of G (a spanning tree of G) starting with v = 1,
or a report that G is not connected
Initialize: queue Q with the start vertex v; T is the tree T = ({v}, ∅)
While (Q is not empty AND T does not span all vertices) do
   x <= remove the first vertex from Q
   for each vertex y adjacent to x do
      if y ∉ T then
      Begin
         place edge (x, y) and vertex y on T
         place y on Q
      End
Endwhile
If T spans all vertices of G then return T as the spanning tree of G
Else (Q is empty) return "graph G is not connected"
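The BFS algorithm above can be sketched in Python as follows. The function name bfs_spanning_tree and the small 5-vertex graph are hypothetical and used only for illustration; the graph is assumed to be undirected and stored as a dict of neighbor sets.

```python
from collections import deque

def bfs_spanning_tree(adjacent, start):
    """Return (tree_edges, connected) for an undirected graph given as neighbor sets."""
    in_tree = {start}
    tree_edges = []
    queue = deque([start])
    # Stop early once the tree spans every vertex, as in Algorithm 2.
    while queue and len(in_tree) < len(adjacent):
        x = queue.popleft()
        for y in sorted(adjacent[x]):      # sorted only to make the order reproducible
            if y not in in_tree:
                in_tree.add(y)
                tree_edges.append((x, y))
                queue.append(y)
    return tree_edges, len(in_tree) == len(adjacent)

# Hypothetical 5-vertex undirected graph used only for illustration.
adjacent = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 5}, 4: {2}, 5: {3}}
edges, connected = bfs_spanning_tree(adjacent, 1)
```

When the queue empties before every vertex is reached, the second return value is False, which corresponds to the "graph G is not connected" outcome.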

Example 3: The graph is shown in Figure 3. Applying the BFS algorithm yields a spanning tree
of G (Figure 4).

Figure 3. Result of the shortest path algorithm

Figure 4. The resulting spanning tree of G (dashed edges)

4. FIND THE SHORTEST PATH ALGORITHM ON MAPREDUCE

So far, parallel shortest path algorithms have been implemented on multi-core
processors with shared external memory. What is new in this approach is that the
parallel shortest path algorithm is implemented on the MapReduce architecture. With the Hadoop
framework, the algorithm can run efficiently in parallel on several machines. The MapReduce
shortest path algorithm is based on the adjacency list.
4.1. Proposed MapReduce shortest path algorithm
The problem investigated in this section is finding the set of shortest paths from the source node
to all other nodes in the graph on the MapReduce architecture.
Map stage:
The mapper class takes the entire file as input and parses it line by line.
Reduce stage:
The output of the mapper is the input to the reducer class. The reducer class takes the
minimum of all the path weights and adds it to the adjacency list of the keyId node.
Data representation:
A connected graph G = (V, E, w) with w(i, j) ≥ 0 ∀ (i, j) ∈ E and a specified source node v.
Initialize: the adjacency list is represented as follows:
{Node i | ∀ i ∈ V, Node label, Node status} TAB {Node j | ∀ j adjacent to i, w(i, j)}
Where:
- Node label: the label L. Assign L(v) := 0. ∀ x ≠ v assign L(x) := ∞ (INF).
- Node status: assign Node status = Unmarked ∀ x ∈ V.
Example 4: For the graph shown in Figure 3, the adjacency list representation is given in
Table 2.
Table 2. Initialize: Adjacency – List

{Node i | ∀ i ∈ V, Node label, Node status}   {Node j | ∀ j adjacent to i, w(i, j)}
{1,0,UNMARKED}                                 {2,7} {3,5}
{2,INF,UNMARKED}                               {3,7} {4,6}
{3,INF,UNMARKED}                               {4,11} {5,10} {6,10}
{4,INF,UNMARKED}                               {1,4} {2,1} {8,5}
{5,INF,UNMARKED}                               {7,20}
{6,INF,UNMARKED}                               {3,6} {7,2} {8,15} {11,10}
{7,INF,UNMARKED}                               {11,7}
{8,INF,UNMARKED}                               {3,4} {4,18} {6,10} {9,20}
{10,INF,UNMARKED}                              {8,6}
{11,INF,UNMARKED}                              {10,15} {12,5}
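To make the record layout concrete, the following Python sketch parses and writes one line of this representation. It assumes that the key and the value are separated by a single TAB character, as stated in the data representation; the function names parse_record and format_record are hypothetical.

```python
import math
import re

def parse_record(line):
    """Split one '{node,label,status[,path]} TAB {j,w} {j,w} ...' line into values."""
    key_text, _, value_text = line.partition("\t")
    fields = key_text.strip().strip("{}").split(",")
    node = int(fields[0])
    label = math.inf if fields[1] == "INF" else int(fields[1])
    status = fields[2]
    path = fields[3] if len(fields) > 3 else ""     # the path field is optional
    edges = [(int(j), int(w)) for j, w in re.findall(r"\{(\d+),(\d+)\}", value_text)]
    return node, label, status, path, edges

def format_record(node, label, status, path, edges):
    """Inverse of parse_record: build the text line for one record."""
    label_text = "INF" if label == math.inf else str(label)
    key_fields = [str(node), label_text, status] + ([path] if path else [])
    value = " ".join("{%d,%d}" % (j, w) for j, w in edges)
    return "{" + ",".join(key_fields) + "}\t" + value
```

A record without neighbors, such as a tentative label emitted by the mapper, simply has an empty value part after the TAB.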

Algorithm 3: Mapper algorithm

Input: (Key, Value) from the adjacency list
- Key: {Node i | ∀ i ∈ V, Node label, Node status, [Path]}, where Path is the path from the source
node to the visiting node
- Value: {Node j | ∀ j adjacent to i, w(i, j)}
Output: (Key, Value)
BEGIN
1. // Scan the adjacency list representation
For (int i = 1; i <= |V|; i++)
   If (Node label <> INF) and (Node status = Unmarked)
   Begin
      1.1. Emit (Key, Value) = (Key, Value)
           // but with Node status = Marked in the field Node status
      1.2. For each node j adjacent to node i:
           Emit (Key), Key = {Node j, Node label, Node status, Path}
           /* Where
              - Node label = Node label of i + w(i, j)   (the corresponding edge weight)
              - Node status = Unmarked
              - Path = Path + "-" + node i   (the path extended to node i)
           */
   End
   Else Emit (Key, Value) = (Key, Value)
2. Sort: sort the (Key, Value) pairs by the field Node i | i ∈ V
END.
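The mapper can be sketched in Python as follows. This is an illustration under assumptions, not the paper's Hadoop implementation: a record is a tuple (node, label, status, path, edges), the names mapper and map_phase are hypothetical, and the path stored with a node is assumed to end at that node itself, which is the convention of the final reducer output in Section 4.2.

```python
import math

def mapper(record):
    """Yield the records that Algorithm 3 produces for one input record."""
    node, label, status, path, edges = record
    if label != math.inf and status == "UNMARKED":
        # Step 1.1: re-emit the node itself, now marked.
        yield (node, label, "MARKED", path, edges)
        # Step 1.2: propose a tentative label for every neighbor j.
        for j, w in edges:
            yield (j, label + w, "UNMARKED", path + "-" + str(j), [])
    else:
        # Unreached or already marked nodes pass through unchanged.
        yield record

def map_phase(records):
    """Apply the mapper to every record and sort the output by node (step 2)."""
    out = [r for record in records for r in mapper(record)]
    return sorted(out, key=lambda r: r[0])
```

A tentative record carries an empty adjacency list; the node's own record is the only one that carries its neighbors, so the reducer can restore them after choosing the minimum label.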

Algorithm 4: Reducer algorithm

Input: (Key, Value) pairs; the output of the mapper is the input to the reducer class
Output: (Key, Value) pairs; the output of the reducer is the input to the mapper class. The
MapReduce job is run repeatedly until all (Key, Value) pairs have Node status = Marked
BEGIN
1. Assign set S = {s1, s2, …, sn} = {1, 2, …, |V|}
∀ (Key, Value) pairs
Begin
   For (int i = 1; i <= |V|; i++)
   Begin
      ∀ Node j | j = si
         Emit (Key, Value)
         // Where
         - Node labelmin = min{Node labelj ∀ j = si}
         - (Key, Value) = {Node k | k = j, Node labelmin, Node status, [Path]}
         - If Value = Null (empty) then
              Value = {l | l adjacent to k, w(k, l)}
           Else Value = Value
   End;
End;
2. If (Node status = Marked) for all nodes then stop and return the final Output
   Else return Output; the output of the reducer becomes the input to the mapper class and
   the MapReduce job is run again
END.
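A minimal Python sketch of the reducer step for a single node is shown below. It assumes that the reducer receives all records of one node as tuples (node, label, status, path, edges); the function name reducer is hypothetical, and the tie-breaking rule in favor of a Marked record is an assumption made here, since the pseudocode does not say which record wins when labels are equal.

```python
import math

def reducer(node, records):
    """Merge all records that share `node` into one, keeping the smallest label."""
    # On equal labels prefer MARKED, so an expanded node is not expanded again.
    best = min(records, key=lambda r: (r[1], r[2] != "MARKED"))
    # The adjacency list travels only with the node's own record.
    edges = next((r[4] for r in records if r[4]), [])
    return (node, best[1], best[2], best[3], edges)

# Records for node 4 as they appear in the second mapper output of Table 3.
node4 = [
    (4, 13, "UNMARKED", "1-2", []),
    (4, 16, "UNMARKED", "1-3", []),
    (4, math.inf, "UNMARKED", "", [(1, 4), (2, 1), (8, 5)]),
]
merged = reducer(4, node4)
```

The merged record keeps label 13 and path 1-2 from the best tentative record and takes the neighbors {1,4} {2,1} {8,5} from the node's own record, which is the "Value = Null" case of Algorithm 4.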

Theorem 2. The algorithm for finding the shortest path from one vertex to all other vertices in
MapReduce is correct.
Proof:
The entire graph is read from HDFS, transferred from the Mappers to the Reducers, and then
written back to HDFS with updated distance values.
- Mapper:
The Mapper processes a single vertex u: for each vertex v in u's adjacency list, a Key with the
label Node label + w(u, v) is computed and sent to the Reducers (step 1.2 of Algorithm 3).
Once a vertex v has been examined and mapped by the mapper, it is marked as Marked and
will not be mapped in subsequent iterations. This plays the same role as the assignment
T := T - {v} in Step 2 of Algorithm 1.
In subsequent iterations, the Mapper keeps re-computing paths for vertices whose shortest path
was already found, through the statement Path = Path + "-" + node i in step 1.2 of Algorithm 3.
Like BFS, the program distinguishes between "Marked" and "Unmarked" vertices. Marked
vertices are those that could potentially help reduce the distance for another vertex. We define a
vertex to be marked if and only if its distance value changed in the previous iteration. The only
exception to this rule is the source vertex a, which is set to "Marked" before the first iteration.
Note that a vertex that was marked in one iteration could become unmarked in the next, and vice
versa.
- Reducer:
For every vertex v, regardless of whether its shortest path was already found in previous
iterations, the statement (Key, Value) = {Node k | k = j, Node labelmin, Node status, [Path]} in
Algorithm 4 creates a (Key, Value) pair whose Node label has the smallest value, as expressed
by the formula Node labelmin = min{Node labelj ∀ j = si}. So after each iteration of the
Reducer function, Algorithm 4 updates the shortest path of every vertex v: the Reduce function is
executed to recompute the shortest distance whether or not it was found in previous iterations.
From the above analysis, it can be seen that in each iteration the algorithms update the
path labels and mark vertices as Marked or Unmarked. Thus, MapReduce
executes iteratively until all vertices have been marked as "Marked". The final output is the
solution of the problem.
The mapping and reducing processes on (Key, Value) pairs are independent of one another, so
multiple processes can be assigned to execute simultaneously on the Hadoop system to reduce
computation time.
■
The next part is the implementation of the Mapper and Reducer for a specific graph.
4.2. How to perform MapReduce on a specific graph
Table 3. Input and Output for MapReduce
Mapper Input: Table 2
Mapper Output (sorted)
Key                             Value
{1,0,MARKED,1}                  {2,7} {3,5}
{2,7,UNMARKED,1}
{2,INF,UNMARKED}                {3,7} {4,6}
{3,5,UNMARKED,1}
{3,INF,UNMARKED}                {4,11} {5,10} {6,10}
{4,INF,UNMARKED}                {1,4} {2,1} {8,5}
{5,INF,UNMARKED}                {7,20}
{6,INF,UNMARKED}                {3,6} {7,2} {8,15} {11,10}
{7,INF,UNMARKED}                {11,7}
{8,INF,UNMARKED}                {3,4} {4,18} {6,10} {9,20}
{10,INF,UNMARKED}               {8,6}
{11,INF,UNMARKED}               {10,15} {12,5}
The output of the mapper is the input to the reducer class.
The output emitted by the reducer is
Key                             Value
{1,0,MARKED,1}                  {2,7} {3,5}
{2,7,UNMARKED,1}                {3,7} {4,6}
{3,5,UNMARKED,1}                {4,11} {5,10} {6,10}
{4,INF,UNMARKED}                {1,4} {2,1} {8,5}
{5,INF,UNMARKED}                {7,20}
{6,INF,UNMARKED}                {3,6} {7,2} {8,15} {11,10}
{7,INF,UNMARKED}                {11,7}
{8,INF,UNMARKED}                {3,4} {4,18} {6,10} {9,20}
{10,INF,UNMARKED}               {8,6}
{11,INF,UNMARKED}               {10,15} {12,5}
The output of the reducer is the input to the mapper class.
Mapper Output (sorted)
Key                             Value                         Note
{1,0,MARKED,1}                  {2,7} {3,5}
{2,7,MARKED,1}                  {3,7} {4,6}
{3,14,UNMARKED,1-2}
{3,5,MARKED,1}                  {4,11} {5,10} {6,10}          adjacency list kept unchanged
{4,13,UNMARKED,1-2}                                           minimum label is chosen
{4,16,UNMARKED,1-3}
{4,INF,UNMARKED}                {1,4} {2,1} {8,5}
{5,15,UNMARKED,1-3}
{5,INF,UNMARKED}                {7,20}
{6,15,UNMARKED,1-3}
{6,INF,UNMARKED}                {3,6} {7,2} {8,15} {11,10}
{7,INF,UNMARKED}                {11,7}
{8,INF,UNMARKED}                {3,4} {4,18} {6,10} {9,20}
{10,INF,UNMARKED}               {8,6}
{11,INF,UNMARKED}               {10,15} {12,5}
The output of the mapper is the input to the reducer class.
The MapReduce job is repeated until all nodes are marked, and then it stops.
The output emitted by the reducer after the final iteration is
Key                             Value
{1,0,MARKED,1}                  {2,7} {3,5}
{2,7,MARKED,1-2}                {3,7} {4,6}
{3,5,MARKED,1-3}                {4,11} {5,10} {6,10}
{4,13,MARKED,1-2-4}             {1,4} {2,1} {8,5}
{5,15,MARKED,1-3-5}             {7,20}
{6,15,MARKED,1-3-6}             {3,6} {7,2} {8,15} {11,10}
{7,17,MARKED,1-3-6-7}           {11,7}
{8,18,MARKED,1-2-4-8}           {3,4} {4,18} {6,10} {9,20}
{9,38,MARKED,1-2-4-8-9}
{10,39,MARKED,1-3-6-7-11-10}    {8,6}
{11,24,MARKED,1-3-6-7-11}       {10,15} {12,5}
{12,29,MARKED,1-3-6-7-11-12}
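The whole iteration can be simulated in a single Python process, as sketched below. This is an illustration under assumptions rather than the Hadoop program: the names GRAPH and run are hypothetical, the path stored with a node ends at the node itself (the convention of the final table above), ties between equal labels go to the Marked record, and the loop stops when no reached node is still Unmarked.

```python
import math
from collections import defaultdict

GRAPH = {
    1: [(2, 7), (3, 5)],
    2: [(3, 7), (4, 6)],
    3: [(4, 11), (5, 10), (6, 10)],
    4: [(1, 4), (2, 1), (8, 5)],
    5: [(7, 20)],
    6: [(3, 6), (7, 2), (8, 15), (11, 10)],
    7: [(11, 7)],
    8: [(3, 4), (4, 18), (6, 10), (9, 20)],
    10: [(8, 6)],
    11: [(10, 15), (12, 5)],
}

def run(graph, source):
    """Repeat map and reduce passes until no reached node is UNMARKED."""
    nodes = set(graph) | {j for edges in graph.values() for j, _ in edges}
    # state: node -> (label, status, path, edges)
    state = {n: (math.inf, "UNMARKED", "", graph.get(n, [])) for n in nodes}
    state[source] = (0, "UNMARKED", str(source), graph.get(source, []))
    iterations = 0
    while any(s == "UNMARKED" and l != math.inf for l, s, _, _ in state.values()):
        # Map: mark each reached node and propose labels for its neighbors.
        groups = defaultdict(list)
        for node, (label, status, path, edges) in state.items():
            if label != math.inf and status == "UNMARKED":
                groups[node].append((label, "MARKED", path, edges))
                for j, w in edges:
                    groups[j].append((label + w, "UNMARKED", path + "-" + str(j), []))
            else:
                groups[node].append((label, status, path, edges))
        # Reduce: keep the smallest label per node, restore its adjacency list.
        state = {}
        for node, recs in groups.items():
            label, status, path, _ = min(recs, key=lambda r: (r[0], r[1] != "MARKED"))
            edges = next((r[3] for r in recs if r[3]), [])
            state[node] = (label, status, path, edges)
        iterations += 1
    return state, iterations

final_state, rounds = run(GRAPH, 1)
```

In this simulation a node whose label improves after it was marked, such as node 11, becomes Unmarked again and is expanded once more, which is why several passes are needed before the labels settle.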
Figure 5. Result on file part-00000

4.3. Experimental results
We developed experimental programs on a Hadoop 3.3.0 system and used specific data to
evaluate and compare the results with those of the sequential algorithm and of other previous
parallel algorithms. The experimental results show that this approach achieves a significant gain
in computation time. The Mapper and Reducer were implemented for a specific graph (M100);
the result is shown in Figure 5.
Random graphs (Figure 7) are created as our database to test the algorithms. Input: NumNode,
Expansion coefficient.
Example 5: Input: NumNode = 5, Expansion coefficient = 2
Output: Node 1 is adjacent to Node 2 and Node 3; Node 2 is adjacent to Node 3 and Node 4;
Node 3 is adjacent to Node 4 and Node 5; Node 4 is adjacent to Node 5. The weights
w(Node i, Node j) are random (see Algorithm 5).

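Algorithm 5: Random graph create algorithm

Input: NumNode, Expansion coefficient
Output: Graph (Namefile.txt)

#include <cstdlib>
#include <fstream>
using namespace std;

// Writes a random graph in the adjacency list format used as mapper input.
void createGraph(int numNode, int expansion, unsigned seed) {
    ofstream f("Namefile.txt");
    srand(seed);                          // seed the generator once
    // Source node 1: label 0, weights drawn from [2, 100]
    f << "{" << 1 << ",0,UNMARKED}";
    for (int i = 1; i <= expansion; i++)
        if (i + 1 <= numNode) {           // skip neighbors beyond the last node
            int w = rand() % (100 - 2 + 1) + 2;
            f << " {" << i + 1 << "," << w << "}";
        }
    f << endl;
    // Remaining nodes: label INFINITY, weights drawn from [0, 100]
    for (int i = 2; i <= numNode; i++) {
        f << "{" << i << ",INFINITY,UNMARKED}";
        for (int j = i + 1; j <= i + expansion; j++)
            if (j <= numNode) {
                int w = rand() % (100 - 0 + 1) + 0;
                f << " {" << j << "," << w << "}";
            }
        f << endl;
    }
    f.close();
}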
Figure 6. Result on HDFS (file part-00000)

We ran experiments on random graphs of the following sizes: 10000 nodes and 49985 edges;
15000 nodes and 74985 edges; and 20000 nodes and 99985 edges (Expansion coefficient = 5 in
all three cases). The simulation results demonstrate that the runtime of the parallel algorithm
on the MapReduce architecture is better than that of the sequential
algorithm.
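The edge counts quoted above follow from the construction in Algorithm 5, where node i is linked to nodes i+1 through i+Expansion coefficient, clipped at NumNode. The short Python check below is only an illustration; the function names edge_count and edge_count_closed_form are hypothetical.

```python
def edge_count(num_node, expansion):
    """Count edges when node i links to nodes i+1 .. i+expansion, clipped at num_node."""
    return sum(min(expansion, num_node - i) for i in range(1, num_node + 1))

def edge_count_closed_form(num_node, expansion):
    # Valid when num_node > expansion: only the last `expansion` nodes are clipped.
    return expansion * num_node - expansion * (expansion + 1) // 2
```

For the graph of Example 5 (NumNode = 5, Expansion coefficient = 2) the same count gives 7 edges, matching the adjacency listed there.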
Figure 7. Create database (Random graph create algorithm)

Figure 8. Graph with NumNode =100, Expansion coefficient=4

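The simulation results show that the parallel algorithm performs relatively better on large graphs
than on small ones: doubling the graph from 10000 to 20000 nodes increases the run time from
123 to 174 minutes, a factor of about 1.4 (Table 4).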
Table 4. The run time

Graph   10000 nodes   15000 nodes   20000 nodes
Time    123 mins      141 mins      174 mins

Figure 9. Chart of the run time of the graphs (bar chart of the values in Table 4)

5. CONCLUSION AND FURTHER WORKS

This paper presents new parallel algorithms (Algorithms 3, 4 and 5) based on actual
requirements and proves their soundness. In addition, the paper parallelizes existing
algorithms and indicates the advantages of the new algorithms over previous ones.
In particular, this work develops experimental programs on a Hadoop 3.3.0 parallel system
and uses specific data to evaluate and compare the results of the new parallel algorithms with
those of sequential algorithms.
Compared to other papers, the algorithms offer the following novelties:
- The algorithms create a random graph.
- The algorithms are generalized.
- The algorithms are demonstrated on a Hadoop 3.3.0 system.
- The algorithms are proven.
As part of future work:
- Proving the complexity of the MapReduce shortest path algorithm for a
given graph size.
- Applying the MapReduce shortest path approach to a real road network.
- Suggesting parallel algorithms on MapReduce architectures for other problems: combinatorial
listing algorithms, finding the shortest path in extended graphs, finding the maximum flow on a
network graph, the traveling salesman problem, or genetic algorithms.
ACKNOWLEDGMENT
We would like to express our sincere thanks to the University of Science and Education, The
University of Da Nang for their great support for this work through project code T2023-TN-07.
CONFLICT OF INTERESTS
The authors declare that there is no conflict of interests.
REFERENCES
[1] Seyed H. Roosta, Parallel processing and parallel algorithms: theory and computation, Springer, (2000).
[2] Robert Sedgewick, Algorithms in C part 5: graph algorithms (third edition), Addison-Wesley, (2000).
[3] V. Dragomir, All-pair shortest path modified matrix multiplication based algorithm for a one-chip MapReduce architecture, U.P.B. Sci. Bull., Series C, 78, 4, (2016) pp. 95-108.
[4] Voichiţa Dragomir, Gheorghe M. Ştefan, All-pair shortest path on a hybrid Map-Reduce based architecture, Proceedings of The Romanian Academy, Series A, The Publishing House of the Romanian Academy, Volume 20, (2019), pp. 411–417.
[5] Sabeur Aridhi, Vincent Benjamin, Philippe Lacomme, Libo Ren, Shortest path resolution using Hadoop, MOSIM'14, Nancy, France, (2014).
[6] Wilfried Yves Hamilton Adoni, Tarik Nahhal, Brahim Aghezzaf and Abdeltif Elbyed, The MapReduce-based approach to improve the shortest path computation in large-scale road networks: the case of A* algorithm, Journal of Big Data, Springer, open access, (2018).
[7] Wilfried Yves Hamilton Adoni, Tarik Nahhal, Brahim Aghezzaf, and Abdeltif Elbyed, MRA*: Parallel and Distributed Path in Large-Scale Graph Using MapReduce-A* Based Approach, Springer International Publishing AG (2017), pp. 390–401.
[8] Sabeur Aridhi, Philippe Lacomme, Libo Ren, Benjamin Vincent, A MapReduce-based approach for shortest path problem in large-scale networks, Elsevier, Journal of Engineering Applications of Artificial Intelligence 41 (2015) 151–165.
[9] Hadoop, A. Welcome to Apache Hadoop. http://hadoop.apache.org/. Accessed 10 Mar 2017.
[10] Ghemawat S, Gobioff H, Leung ST. The Google file system. In: ACM SIGOPS operating systems review, vol. 37. New York: ACM; (2003). pp 29–43.
[11] Dean J, Ghemawat S. MapReduce: simplified data processing on large clusters. Commun ACM. (2008), 51(1) pp 107–13.
[12] Vavilapalli VK, Seth S, Saha B, Curino C, O'Malley O, Radia S, Reed B, Baldeschwieler E, Murthy AC, Douglas C, Agarwal S, Konar M, Evans R, Graves T, Lowe J, Shah H. Apache Hadoop YARN: yet another resource negotiator. In: Proceedings of the 4th annual symposium on cloud computing. Santa Clara: ACM Press; (2013). pp 1–16.
[13] Nguyen Dinh Lau, Tran Quoc Chien, Le Manh Thanh, Improved Computing Performance for Algorithm Finding the Shortest Path in Extended Graph, Proceedings of the 2014 international conference on foundations of computer science (FCS'14), USA, (2014), pp 14-20.
[14] M. Hena, N. Jeyanthi, A Three-Tier Authentication Scheme for Kerberized Hadoop Environment, Cybernetics and Information Technologies, Volume 21, No 4, (2021) pp 119-136.
[15] Davit Petrosyan, Hrachya Astsatryan, Serverless High-Performance Computing over Cloud, Cybernetics and Information Technologies, Volume 22, No 3, (2022), pp 82-92.

No ratings yet
The Map Reduce Programming
15 pages
Unit - III Advanced Analytics Technology and Tools
No ratings yet
Unit - III Advanced Analytics Technology and Tools
44 pages
Survey Paper On Traditional Hadoop and Pipelined Map Reduce: Dhole Poonam B, Gunjal Baisa L
No ratings yet
Survey Paper On Traditional Hadoop and Pipelined Map Reduce: Dhole Poonam B, Gunjal Baisa L
5 pages
Term Paper Java
No ratings yet
Term Paper Java
14 pages
An Introduction To Hadoop
No ratings yet
An Introduction To Hadoop
12 pages
Towards Efficient Mapreduce Using Mpi
No ratings yet
Towards Efficient Mapreduce Using Mpi
10 pages


