Mod 4 Notes DS
R. L. JALAPPA INSTITUTE OF TECHNOLOGY
(Approved by AICTE, New Delhi, Affiliated to VTU, Belagavi & Accredited by NAAC “A” Grade)
Kodigehalli, Doddaballapur- 561 203
Department of CS&E - PG
Subject Code: MCS103
Subject Name: DATA STRUCTURES AND APPLICATIONS
Module Number: 04
Name of the Module: GRAPH ALGORITHMS
Scheme: 2024
Prepared by: Dr. Mamatha C M
Professor
Institute Vision
To be a premier Institution by imparting quality Technical education, Professional Training and
Research.
Institute Mission
M1: To provide an outstanding Teaching, Learning and Research environment through Innovative
Practices in Quality Education.
M2: Develop Leaders with high level of Professionalism to have career in the Industry, Zeal for
Higher Education, focus on Entrepreneurial and Societal activities.
Department Vision- PG
To nurture students with advanced expertise and research-oriented skills in Computer Science and
Engineering, empowering them to drive technological innovation and thrive in an evolving global
landscape.
Department Mission- PG
M1: To foster advanced skills in specialized domains of Computer Science and Engineering, equipping
students with the necessary expertise to address contemporary challenges and meet the evolving
demands of the global industry.
M2: To promote cutting-edge research and technological innovation, while cultivating entrepreneurship
and consultancy skills that empower students to contribute to the technological needs of industries,
governments, and society.
PROGRAMME SPECIFIC OUTCOMES (PSOs)
PSO1: Students will have a knowledge of Advanced Software, Hardware, Network Models,
Algorithms
PSO2: Students will be able to develop applications in the areas related to Artificial Intelligence,
Machine Learning, Data Science and IoT for efficient design of computer-based systems.
PROGRAMME EDUCATIONAL OBJECTIVES (PEOs)
PEO1: Our Graduates will have prospective careers in the IT Industry.
PEO2: Our Graduates will exhibit a high level of Professionalism and Leadership skills in work
Environment.
PEO3: Our Graduates will pursue Research, and focus on Entrepreneurship.
Module-4
Graph Algorithms: Bellman - Ford Algorithm; Single source shortest paths in a DAG; Johnson’s Algorithm for
sparse graphs; Flow networks and Ford-Fulkerson method; Maximum bipartite matching. Polynomials and the FFT:
Representation of polynomials; The DFT and FFT; Efficient implementation of FFT.
Bellman-Ford Algorithm
How it works:
1. Initialization:
Assign a large positive value (infinity) to the distance of every vertex except the source vertex, whose distance
is set to 0.
2. Relaxation loop:
Iterate through all edges in the graph, in any fixed order.
For each edge (u, v), check whether relaxing the edge (i.e., updating the distance to v based on the distance to
u and the edge weight) gives a shorter path.
Repeat this pass over all edges V-1 times, where V is the number of vertices.
3. Negative-cycle check:
Make one further pass over all edges. If any distance can still be reduced, the graph contains a negative-weight
cycle reachable from the source.
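The steps above can be sketched in Python as follows. This is a minimal sketch, not a reference implementation: the function name bellman_ford, the (u, v, weight) edge-list format, and the small 4-vertex sample graph are assumptions made for illustration. The final pass over the edges reports whether a negative-weight cycle is reachable from the source.

```python
import math

def bellman_ford(num_vertices, edges, source):
    """Return (dist, has_negative_cycle); edges is a list of (u, v, weight)."""
    # Initialization: every distance is "infinite" except the source.
    dist = [math.inf] * num_vertices
    dist[source] = 0

    # Relaxation loop: check every edge, V-1 times.
    for _ in range(num_vertices - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w

    # One extra pass: any further improvement means a negative-weight cycle.
    has_negative_cycle = any(dist[u] + w < dist[v] for u, v, w in edges)
    return dist, has_negative_cycle

# Hypothetical sample graph (not the graph used in the example that follows).
sample_edges = [(0, 1, 4), (0, 2, 5), (1, 2, -3), (2, 3, 2)]
distances, negative_cycle = bellman_ford(4, sample_edges, 0)
print(distances, negative_cycle)
```

An edge whose start vertex still has distance infinity never triggers an update, which matches the behaviour described in the example: edges out of unreached vertices are checked but change nothing.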
Example
The first four edges that are checked in our graph are A->C, A->E, B->C, and C->A. These first four edge checks do
not lead to any updates of the shortest distances because the starting vertex of all these edges has an infinite distance.
After the edges from vertices A, B, and C are checked, the edges from D are checked. Since the starting point (vertex
D) has distance 0, the updated distances for A, B, and C are the edge weights going out from vertex D.
The next edges to be checked are the edges going out from vertex E, which leads to updated distances for vertices B
and C.
The Bellman-Ford algorithm has now checked all edges once. The algorithm will check all edges 3 more times
before it is finished, because Bellman-Ford checks all edges as many times as there are vertices in the graph,
minus 1.
The algorithm starts checking all edges a second time, starting with the edges going out from vertex A.
Checking the edges A->C and A->E does not lead to updated distances.
The next edge to be checked is B->C, going out from vertex B. This leads to an updated distance from vertex D to C
of 5-4=1.
Checking the next edge C->A, leads to an updated distance 1-3=-2 for vertex A.
The check of edge C->A in round 2 of the Bellman-Ford algorithm is actually the last check that leads to
an updated distance for this specific graph. The algorithm will continue to check all edges 2 more times
without updating any distances.
Checking all edges V−1 times in the Bellman-Ford algorithm may seem like a lot, but a shortest path can
contain at most V−1 edges, so this many rounds guarantees that the shortest distances will always be found.
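Time complexity: Bellman-Ford relaxes all E edges in each of the V-1 rounds, so its total running time is O(VE).
When the graph is a DAG, the vertices can instead be topologically sorted in O(V + E) time and each edge relaxed
once in that order, which takes O(E) time, for a total of O(V + E).
Disadvantage: shortest paths are undefined when a negative-weight cycle is reachable from the source, and
Bellman-Ford can only detect this case. It is also expensive for the all-pairs problem, since running it from every
vertex costs O(V^2 E). Johnson’s Algorithm solves the all-pairs problem more efficiently on sparse graphs.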
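The topological-order idea mentioned for DAGs can be sketched as follows. This is an illustrative sketch only: the function name dag_shortest_paths, the use of Kahn's algorithm for the topological sort, and the sample graph are assumptions, not part of the original notes. Each edge is relaxed exactly once, in topological order of its start vertex.

```python
import math
from collections import deque

def dag_shortest_paths(num_vertices, edges, source):
    """Single-source shortest paths in a DAG; edges is a list of (u, v, weight)."""
    adj = [[] for _ in range(num_vertices)]
    indegree = [0] * num_vertices
    for u, v, w in edges:
        adj[u].append((v, w))
        indegree[v] += 1

    # Topological sort (Kahn's algorithm): repeatedly remove vertices of indegree 0.
    queue = deque(v for v in range(num_vertices) if indegree[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if len(order) != num_vertices:
        raise ValueError("graph contains a cycle, so it is not a DAG")

    # Relax the outgoing edges of each vertex once, in topological order.
    dist = [math.inf] * num_vertices
    dist[source] = 0
    for u in order:
        for v, w in adj[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist

# Hypothetical DAG with one negative edge.
dag_dist = dag_shortest_paths(4, [(0, 1, 2), (0, 2, 6), (1, 2, -3), (2, 3, 1)], 0)
print(dag_dist)
```

Negative edge weights are fine here because a DAG cannot contain a cycle of any kind.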
Johnson’s Algorithm for sparse graphs
The problem is to find the shortest paths between every pair of vertices in a given weighted directed graph whose
weights may be negative. Using Johnson’s algorithm, we can find all-pairs shortest paths in O(V^2 log V + VE)
time. Johnson’s algorithm uses both Dijkstra and Bellman-Ford as subroutines. If we apply Dijkstra’s single-source
shortest path algorithm once for every vertex, treating each vertex in turn as the source, we can find all-pairs
shortest paths in O(V^2 log V + VE) time (using a Fibonacci-heap priority queue).
But Dijkstra’s algorithm does not work with negative-weight edges. The idea of Johnson’s algorithm is to reweight
all edges so that they become non-negative, then apply Dijkstra’s algorithm from every vertex.
How to transform a given graph into a graph with all non-negative weight edges?
The idea of Johnson’s algorithm is to assign a weight to every vertex. Let the weight assigned to vertex u be h[u].
We reweight edges using vertex weights. For example, for an edge (u, v) of weight w(u, v), the new weight becomes
w(u, v) + h[u] – h[v]. The useful property of this reweighting is that every path between a given pair of vertices is
increased by the same amount, so shortest paths remain shortest, and (for a suitable choice of h[]) all negative
weights become non-negative. Consider any path between two vertices s and t: its weight is increased by exactly
h[s] – h[t], because the h[] values of the intermediate vertices on the path cancel each other.
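The cancellation can be written out explicitly. The notation below is introduced only for this derivation: the path is p = (v_0, v_1, ..., v_k) with v_0 = s and v_k = t, and \hat{w} denotes the reweighted edge weight w(u, v) + h[u] – h[v].

```latex
\begin{aligned}
\hat{w}(p) &= \sum_{i=1}^{k} \bigl( w(v_{i-1}, v_i) + h[v_{i-1}] - h[v_i] \bigr) \\
           &= \sum_{i=1}^{k} w(v_{i-1}, v_i) + \sum_{i=1}^{k} \bigl( h[v_{i-1}] - h[v_i] \bigr) \\
           &= w(p) + h[v_0] - h[v_k] \\
           &= w(p) + h[s] - h[t]
\end{aligned}
```

The second sum telescopes, so the shift h[s] – h[t] depends only on the endpoints and not on the path taken.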
How do we calculate h[] values?
The Bellman-Ford algorithm is used for this purpose. Following is the complete algorithm. A new vertex is added to
the graph and connected to all existing vertices by zero-weight edges. The shortest distance values from the new
vertex to all existing vertices are the h[] values.
Algorithm:
1. Let the given graph be G. Add a new vertex s to the graph and add a zero-weight edge from s to every vertex
of G. Let the modified graph be G’.
2. Run the Bellman-Ford algorithm on G’ with s as the source. Let the distances calculated by Bellman-Ford be
h[0], h[1], .. h[V-1]. If we find a negative weight cycle, then return. Note that a negative weight cycle cannot
be created by the new vertex s, as there is no edge into s. All its edges go out from s.
3. Reweight the edges of the original graph. For each edge (u, v), assign the new weight as “original weight + h[u]
– h[v]”.
4. Remove the added vertex s and run Dijkstra’s algorithm from every vertex.
The following property is always true about h[] values as they are the shortest distances:
h[v] <= h[u] + w(u, v)
The property simply means that the shortest distance from s to v must be smaller than or equal to the shortest
distance from s to u plus the weight of the edge (u, v). The new weights are w(u, v) + h[u] – h[v]. The value of the
new weights must be greater than or equal to zero because of the inequality “h[v] <= h[u] + w(u, v)”.
We add a source s and add zero-weight edges from s to all vertices of the original graph. In the following diagram
s is 4. We calculate the shortest distances from 4 to all other vertices using the Bellman-Ford algorithm. The
shortest distances from 4 to 0, 1, 2 and 3 are 0, -5, -1 and 0 respectively, i.e., h[] = {0, -5, -1, 0}. Once we get
these distances, we remove the source vertex 4 and reweight the edges using the following formula:
w(u, v) = w(u, v) + h[u] – h[v].
Time Complexity: The main steps in the algorithm are the Bellman-Ford algorithm, called once, and Dijkstra, called V
times. The time complexity of Bellman-Ford is O(VE) and the time complexity of one Dijkstra run is O(V log V + E)
with a Fibonacci-heap priority queue. So the overall time complexity is O(V^2 log V + VE).
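The steps of Johnson’s algorithm can be sketched in Python as below. This is a sketch under stated assumptions: the function name johnson and the (u, v, weight) edge format are illustrative, and the sample graph is hypothetical (the diagram is not reproduced in these notes), chosen because it yields the h[] values {0, -5, -1, 0} quoted above. Dijkstra is implemented with Python's binary heap (heapq), which gives O((V + E) log V) per run rather than the Fibonacci-heap bound.

```python
import heapq
import math

def johnson(num_vertices, edges):
    """All-pairs shortest paths; returns (h, dist) or None on a negative cycle."""
    # Steps 1-2: add vertex s with zero-weight edges, then run Bellman-Ford from s.
    s = num_vertices
    augmented = list(edges) + [(s, v, 0) for v in range(num_vertices)]
    h = [math.inf] * (num_vertices + 1)
    h[s] = 0
    for _ in range(num_vertices):  # G' has V+1 vertices, so V rounds
        for u, v, w in augmented:
            if h[u] + w < h[v]:
                h[v] = h[u] + w
    if any(h[u] + w < h[v] for u, v, w in augmented):
        return None  # negative-weight cycle found

    # Step 3: reweight each original edge to w + h[u] - h[v] (always >= 0).
    adj = [[] for _ in range(num_vertices)]
    for u, v, w in edges:
        adj[u].append((v, w + h[u] - h[v]))

    # Step 4: Dijkstra from every vertex, then undo the reweighting.
    dist = []
    for src in range(num_vertices):
        d = [math.inf] * num_vertices
        d[src] = 0
        heap = [(0, src)]
        while heap:
            du, u = heapq.heappop(heap)
            if du > d[u]:
                continue  # stale heap entry
            for v, w in adj[u]:
                if du + w < d[v]:
                    d[v] = du + w
                    heapq.heappush(heap, (d[v], v))
        dist.append([d[v] - h[src] + h[v] for v in range(num_vertices)])
    return h[:num_vertices], dist

# Hypothetical graph chosen to reproduce h[] = {0, -5, -1, 0}.
sample = [(0, 1, -5), (0, 2, 2), (0, 3, 3), (1, 2, 4), (2, 3, 1)]
h_values, all_pairs = johnson(4, sample)
print(h_values)
print(all_pairs[0])
```

The last line of step 4 subtracts h[src] – h[v] from each reweighted distance, which recovers the distance in the original graph; unreachable vertices stay at infinity.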
Flow networks and Ford-Fulkerson method
The Ford-Fulkerson algorithm is a widely used algorithm to solve the maximum flow problem in a flow network.
The maximum flow problem involves determining the maximum amount of flow that can be sent from a source
vertex to a sink vertex in a directed weighted graph, subject to capacity constraints on the edges.
The algorithm works by iteratively finding an augmenting path, which is a path from the source to the sink in the
residual graph, i.e., the graph obtained by subtracting the current flow from the capacity of each edge (together
with reverse edges that allow previously sent flow to be cancelled). The algorithm then increases the flow along
this path by the maximum possible amount, which is the minimum residual capacity of the edges along the path.
It stops when no augmenting path remains.
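A minimal sketch of the method is given below. Ford-Fulkerson does not prescribe how the augmenting path is found; this sketch uses breadth-first search, which is the Edmonds-Karp variant. The function name max_flow, the capacity-matrix input format, and the 4-vertex sample network are assumptions made for illustration.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Maximum flow using BFS augmenting paths; capacity[u][v] is the edge capacity."""
    n = len(capacity)
    residual = [row[:] for row in capacity]  # residual capacities, updated in place
    total = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            return total  # no augmenting path remains

        # Bottleneck: minimum residual capacity along the path found.
        bottleneck = float("inf")
        v = sink
        while v != source:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Augment: reduce forward residuals, increase reverse residuals.
        v = sink
        while v != source:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck

# Hypothetical network: vertex 0 is the source, vertex 3 is the sink.
network = [
    [0, 3, 2, 0],
    [0, 0, 1, 2],
    [0, 0, 0, 3],
    [0, 0, 0, 0],
]
flow_value = max_flow(network, 0, 3)
print(flow_value)
```

The reverse-residual update is what lets a later augmenting path undo part of an earlier one.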
Maximum bipartite matching
What? A matching in a bipartite graph is a set of edges chosen in such a way that no two edges share an
endpoint. A maximum matching is a matching of maximum size (maximum number of edges). If any edge is added
to a maximum matching, it is no longer a matching. There can be more than one maximum matching for a given
bipartite graph.
Why?
Many real-world problems can be formulated as bipartite matching. For example, consider the following
problem:
“There are M job applicants and N jobs. Each applicant has a subset of jobs that he/she is interested in. Each job
opening can only accept one applicant and a job applicant can be appointed for only one job. Find an assignment of
jobs to applicants such that as many applicants as possible get jobs.”
How: The approach builds on the Ford-Fulkerson algorithm for the maximum flow problem described above.
Maximum Bipartite Matching and Max Flow Problem:
The Maximum Bipartite Matching (MBP) problem can be solved by converting it into a flow network. Following are
the steps.
1) Build a Flow Network: There must be a source and a sink in a flow network. So we add a source and add edges
from the source to all applicants. Similarly, we add edges from all jobs to the sink. Each applicant is also connected
by an edge to every job he/she is interested in. The capacity of every edge is 1 unit.
2) Find the maximum flow: We use the Ford-Fulkerson algorithm to find the maximum flow in the flow network built
in step 1. The value of the maximum flow equals the size of the maximum bipartite matching we are looking for.
How to implement the above approach?
Let us first define the input and output forms. The input is a 0/1 matrix (the biadjacency matrix of the bipartite
graph) given as a 2D array ‘bpGraph[M][N]’ with M rows (for M job applicants) and N columns (for N jobs). The
value bpGraph[i][j] is 1 if the i’th applicant is interested in the j’th job, otherwise 0. The output is the maximum
number of applicants that can be assigned jobs.
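A possible implementation is sketched below. Because every capacity is 1, an augmenting path in the flow network amounts to finding a job for one more applicant, possibly by moving already-assigned applicants to other jobs; the sketch does this with a depth-first search instead of building the flow network explicitly. The names max_bipartite_matching, try_assign and match_job, and the 3x3 sample matrix, are assumptions made for illustration.

```python
def max_bipartite_matching(bp_graph):
    """Return (size, match_job), where match_job[j] is the applicant given job j or -1."""
    m = len(bp_graph)
    n = len(bp_graph[0]) if m else 0
    match_job = [-1] * n

    def try_assign(applicant, seen):
        # Look for a job for this applicant, re-assigning others if needed.
        for job in range(n):
            if bp_graph[applicant][job] and not seen[job]:
                seen[job] = True
                # The job is free, or its current holder can move to another job.
                if match_job[job] == -1 or try_assign(match_job[job], seen):
                    match_job[job] = applicant
                    return True
        return False

    size = 0
    for applicant in range(m):
        if try_assign(applicant, [False] * n):
            size += 1
    return size, match_job

# Hypothetical input: 3 applicants (rows) and 3 jobs (columns).
bpGraph = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
matching_size, assignment = max_bipartite_matching(bpGraph)
print(matching_size, assignment)
```

In the sample, applicant 1 is only interested in job 0, so applicant 0 is moved from job 0 to job 1 to make room.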
The DFT and FFT
DFT:
Represents a signal's frequency components by calculating a set of complex numbers based on a specific formula.
Requires O(N^2) operations for N data points, making it computationally expensive for large datasets.
Directly applies the Fourier transform formula to discrete data.
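A direct evaluation of the DFT can be sketched as follows, assuming the usual definition X[k] = sum over t of x[t] * exp(-2*pi*i*k*t/N). The function name dft is illustrative. The nested loop (N outputs, each a sum of N terms) is where the O(N^2) cost comes from.

```python
import cmath

def dft(x):
    """Direct O(N^2) discrete Fourier transform of a sequence of numbers."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

# A constant signal has all of its energy in the zero-frequency term.
spectrum = dft([1, 1, 1, 1])
print([round(abs(value), 6) for value in spectrum])
```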
FFT:
An algorithm that leverages the inherent symmetries in the DFT to significantly decrease the number of
computations needed.
Utilizes a divide-and-conquer strategy, breaking down the DFT calculation into smaller, more manageable
sub-problems.
Has a computational complexity of O(N log N), making it much faster than the direct DFT for large data sets.
Efficient implementation of FFT:
Cooley-Tukey Algorithm:
The most widely used FFT algorithm, which recursively splits the input data into even and odd indices, then
combines the results to calculate the DFT.
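The even/odd split can be sketched recursively as below. This is a simple radix-2 sketch that assumes the input length is a power of two; it favours readability over speed (practical implementations are usually iterative and in-place), and the function name fft is illustrative.

```python
import cmath

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) is assumed to be a power of two."""
    n = len(x)
    if n <= 1:
        return list(x)
    if n % 2:
        raise ValueError("input length must be a power of two")

    even = fft(x[0::2])  # DFT of the even-indexed samples
    odd = fft(x[1::2])   # DFT of the odd-indexed samples

    out = [0] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-size results using a twiddle factor.
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

result = fft([1, 2, 3, 4])
print(result)
```

Each level of recursion does O(N) work in the combining loop and there are log2(N) levels, which gives the O(N log N) total.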
Radix-2 FFT:
A common variant of the Cooley-Tukey algorithm where the data is divided into groups of 2 at each stage, utilizing
"butterfly operations" to perform the necessary calculations efficiently.