Chap 22
Representation of a graph and the graph search algorithms: BFS, DFS.
The computation of:
1. A minimum-weight spanning tree of a graph: the least-weight way of connecting all of the vertices together when each edge has an associated weight.
2. A maximum flow of material in a network (directed graph) having a specified source of material, a specified sink, and specified capacities for the amount of material that can traverse each directed edge.
When describing the running time of a graph algorithm on G = (V, E), we measure the size of the input in terms of the number of vertices |V| and the number of edges |E| of the graph.
Only inside asymptotic notation, the symbol V denotes |V| and the symbol E denotes |E|. E.g., O(|V||E|) = O(VE).
V[G], E[G]: the vertex set and the edge set of a graph G, respectively.
Chapter 22.
Elementary Graph Algorithms
Graph Representation
Given graph G = (V, E), which may be either directed or undirected.
Two common ways to represent a graph for algorithms:
1. Adjacency lists.
2. Adjacency matrix.
The running time of an algorithm is often expressed in terms of both |V| and |E|.
In asymptotic notation, and only in asymptotic notation, we'll drop the cardinality bars. Example: O(V + E).
Adjacency lists
Array Adj of |V| lists, one per vertex.
Vertex u's list has all vertices v such that (u, v) ∈ E. (Works for both directed and undirected graphs.)
If edges have weights, can put the weights in the lists. Weight: w : E → R.
We'll use weights later on for spanning trees and shortest paths.
Space: Θ(V + E).
Time: to list all vertices adjacent to u: Θ(degree(u)).
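A minimal sketch of the adjacency-list representation, assuming vertices are hashable labels and the graph arrives as a list of (u, v) pairs. The function name `build_adjacency_lists` and the `directed` flag are illustrative, not from the notes.

```python
def build_adjacency_lists(vertices, edges, directed=True):
    """Map each vertex u to the list of all v such that (u, v) is an edge."""
    adj = {u: [] for u in vertices}      # one list per vertex
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)             # undirected edge shows up in both lists
    return adj


# Small undirected example graph (hypothetical data).
adj = build_adjacency_lists([1, 2, 3, 4], [(1, 2), (1, 3), (3, 4)], directed=False)
print(adj[1])  # listing the neighbors of 1 takes time proportional to degree(1)
```

Weights could be stored by appending (v, w) pairs instead of bare vertices.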
Adjacency Matrix
|V| × |V| matrix A = (a_ij), where a_ij = 1 if (i, j) ∈ E, and a_ij = 0 otherwise.
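The matrix form can be sketched the same way, assuming vertices are numbered 0 to n-1 so they can index rows and columns directly. The name `build_adjacency_matrix` is illustrative.

```python
def build_adjacency_matrix(n, edges, directed=True):
    """Return an n x n list of lists with a[i][j] = 1 iff (i, j) is an edge."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = 1
        if not directed:
            a[j][i] = 1                  # symmetric for undirected graphs
    return a


matrix = build_adjacency_matrix(3, [(0, 1), (1, 2)])
for row in matrix:
    print(row)
```

Testing whether (i, j) is an edge is a single lookup, at the cost of |V|² space regardless of how many edges exist.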
Breadth-First Search
Input:
Graph G = (V, E), either directed or undirected, and source vertex s ∈ V.
Output:
d[v] = distance (smallest number of edges) from s to v, for all v ∈ V.
Also π[v] = u such that (u, v) is the last edge on a shortest path from s to v; u is v's predecessor. The set of edges {(π[v], v) : v ≠ s} forms a tree.
Later, breadth-first search will be generalized to graphs with edge weights. For now, let's keep it simple:
Compute only d[v], not π[v].
Omit the colors of vertices.
Idea: Send a wave out from s. It first hits all vertices 1 edge from s. From there, it hits all vertices 2 edges from s. And so on.
Use a FIFO queue Q to maintain the wavefront: v ∈ Q if and only if the wave has hit v but has not come out of v yet.
Can show that Q consists of vertices with d values i, i, ..., i, i+1, i+1, ..., i+1.
Only 1 or 2 distinct values. If 2, they differ by 1 and all the smallest come first.
Since each vertex gets a finite d value at most once, the values assigned to vertices are monotonically increasing over time.
The actual proof of correctness is a bit trickier. See book.
BFS may not reach all vertices.
Time = O(V + E).
O(V) because every vertex is enqueued at most once.
O(E) because every vertex is dequeued at most once and we examine (u, v) only when u is dequeued. Therefore, every edge is examined at most once if directed, at most twice if undirected.
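The wavefront idea can be sketched as follows, assuming the graph is a dict of adjacency lists with every vertex present as a key. The name `bfs_distances` is illustrative. This computes only d[v], in the simplified form described above, with infinity marking vertices the wave never reaches.

```python
from collections import deque
import math


def bfs_distances(adj, s):
    """Return d, where d[v] is the smallest number of edges from s to v."""
    d = {v: math.inf for v in adj}       # infinity means "not reached yet"
    d[s] = 0
    q = deque([s])                       # FIFO queue holding the wavefront
    while q:
        u = q.popleft()
        for v in adj[u]:
            if d[v] == math.inf:         # first time the wave hits v
                d[v] = d[u] + 1
                q.append(v)
    return d


# Hypothetical example; "x" is unreachable from "s".
graph = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": [], "x": []}
dist = bfs_distances(graph, "s")
print(dist)
```

Each vertex enters the queue at most once and each adjacency list is scanned once, matching the O(V + E) bound.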
Depth-First Search
Input:
Graph G = (V, E), either directed or undirected. No source vertex given.
Discovery and finish times:
Unique integers from 1 to 2|V|.
For all v, d[v] < f[v].
In other words, 1 ≤ d[v] < f[v] ≤ 2|V|.
Time = Θ(V + E).
Every vertex is discovered exactly once, and each adjacency list is scanned exactly once.
Θ, not just O, since DFS is guaranteed to examine every vertex and edge.
DFS forms a depth-first forest comprised of ≥ 1 depth-first trees.
Each tree is made of edges (u, v) such that u is gray and v is white when (u, v) is explored.
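A minimal recursive sketch of DFS with timestamps, assuming a dict of adjacency lists and a graph small enough for Python's default recursion limit. The names `dfs`, `visit`, and `parent` are illustrative. The main loop visits vertices in dict order, so the timestamps depend on that order.

```python
def dfs(adj):
    """Return (d, f, parent): discovery times, finishing times, forest edges."""
    color = {u: "white" for u in adj}
    d, f, parent = {}, {}, {}
    time = 0

    def visit(u):
        nonlocal time
        time += 1
        d[u] = time                      # u is discovered
        color[u] = "gray"
        for v in adj[u]:
            if color[v] == "white":      # (u, v) becomes a tree edge
                parent[v] = u
                visit(v)
        color[u] = "black"
        time += 1
        f[u] = time                      # u is finished

    for u in adj:                        # no source given: try every vertex
        if color[u] == "white":
            parent[u] = None             # u is the root of a new tree
            visit(u)
    return d, f, parent


# Hypothetical six-vertex directed graph.
g = {"u": ["v", "x"], "v": ["y"], "x": ["v"], "y": ["x"], "w": ["y", "z"], "z": ["z"]}
d, f, parent = dfs(g)
print(d)
print(f)
```

On this graph the forest has two trees, rooted at u and w.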
Theorem (Parenthesis theorem)
For all u, v, exactly one of the following holds:
1. d[u] < f[u] < d[v] < f[v] or d[v] < f[v] < d[u] < f[u], and neither of u and v is a descendant of the other.
2. d[u] < d[v] < f[v] < f[u] and v is a descendant of u.
3. d[v] < d[u] < f[u] < f[v] and u is a descendant of v.
So d[u] < d[v] < f[u] < f[v] cannot happen.
Like parentheses:
OK: ()[] ([]) [()]
Not OK: ([)] [(])
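The three cases can be read straight off the timestamps. The sketch below assumes d and f are dicts produced by some DFS and that u and v are distinct; the name `interval_relation` and the sample timestamps are illustrative.

```python
def interval_relation(d, f, u, v):
    """Classify distinct vertices u, v by their [d, f] intervals."""
    if d[u] < d[v] < f[v] < f[u]:
        return "v is a descendant of u"
    if d[v] < d[u] < f[u] < f[v]:
        return "u is a descendant of v"
    if f[u] < d[v] or f[v] < d[u]:
        return "disjoint"
    # Partial overlap such as d[u] < d[v] < f[u] < f[v] cannot come from a DFS.
    raise ValueError("intervals partially overlap: not valid DFS timestamps")


# Hypothetical timestamps: a = ( b = ( ) c = ( ) ) and e = ( ) afterwards.
d = {"a": 1, "b": 2, "c": 4, "e": 7}
f = {"a": 6, "b": 3, "c": 5, "e": 8}
print(interval_relation(d, f, "a", "c"))
print(interval_relation(d, f, "b", "c"))
print(interval_relation(d, f, "e", "a"))
```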
Corollary
v is a proper descendant of u if and only if d[u] < d[v] < f[v] < f[u].
Theorem (White-path theorem)
v is a descendant of u if and only if at time d[u], there is a path from u to v consisting of only white vertices. (Except for u, which was just colored gray.)
Classification of edges
Tree edge: in the depth-first forest. Found by exploring (u, v).
Back edge: (u, v), where u is a descendant of v.
Forward edge: (u, v), where v is a descendant of u, but not a tree edge.
Cross edge: any other edge. Can go between vertices in the same depth-first tree or in different depth-first trees.
In an undirected graph, there may be some ambiguity since (u, v) and (v, u) are the same edge. Classify by the first type above that matches.
Theorem In DFS of an undirected graph, we get only tree and back edges. No forward or cross edges.
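For a directed graph, one common way to apply these definitions during the search is to look at the color of v when (u, v) is explored: white gives a tree edge, gray a back edge, and black either a forward edge (if d[u] < d[v]) or a cross edge. The sketch below assumes a dict of adjacency lists; the name `classify_edges` is illustrative, and it does not handle the undirected tie-breaking rule.

```python
def classify_edges(adj):
    """Return {(u, v): kind} for every edge of a directed graph."""
    color = {u: "white" for u in adj}
    d, kind = {}, {}
    time = 0

    def visit(u):
        nonlocal time
        time += 1
        d[u] = time                      # only discovery order is needed here
        color[u] = "gray"
        for v in adj[u]:
            if color[v] == "white":
                kind[(u, v)] = "tree"
                visit(v)
            elif color[v] == "gray":
                kind[(u, v)] = "back"    # v is an ancestor of u (or u itself)
            elif d[u] < d[v]:
                kind[(u, v)] = "forward" # v was discovered below u and finished
            else:
                kind[(u, v)] = "cross"
        color[u] = "black"

    for u in adj:
        if color[u] == "white":
            visit(u)
    return kind


# Hypothetical six-vertex directed graph with a self-loop on z.
g = {"u": ["v", "x"], "v": ["y"], "x": ["v"], "y": ["x"], "w": ["y", "z"], "z": ["z"]}
kinds = classify_edges(g)
for edge, k in kinds.items():
    print(edge, k)
```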
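Strongly Connected Components
The algorithm for strongly connected components (SCCs) uses G^T, the transpose of G: G^T = (V, E^T), where E^T = {(u, v) : (v, u) ∈ E}. That is, G^T is G with all edges reversed.
Can create G^T in Θ(V + E) time if using adjacency lists.
Observation: G and G^T have the same SCCs. (u and v are reachable from each other in G if and only if they are reachable from each other in G^T.)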
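Reversing every edge in one pass over the adjacency lists can be sketched as below, assuming every vertex appears as a key of the dict. The name `transpose` is illustrative.

```python
def transpose(adj):
    """Return the adjacency lists of the graph with every edge reversed."""
    gt = {u: [] for u in adj}
    for u in adj:
        for v in adj[u]:
            gt[v].append(u)              # edge (u, v) becomes (v, u)
    return gt


g = {"a": ["b", "c"], "b": ["c"], "c": []}
gt = transpose(g)
print(gt)
```

Each vertex and each edge is touched once, which is where the Θ(V + E) bound comes from.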
Component Graph
G^SCC = (V^SCC, E^SCC).
V^SCC has one vertex for each SCC in G.
E^SCC has an edge if there's an edge between the corresponding SCCs in G.
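A small sketch of building the component graph, assuming the SCCs are already known and passed in as a list of vertex lists. The name `component_graph` and the use of list indices as SCC names are illustrative choices.

```python
def component_graph(adj, components):
    """Return the edge set of the component graph, using SCC indices as vertices."""
    comp_of = {v: i for i, comp in enumerate(components) for v in comp}
    edges = set()
    for u in adj:
        for v in adj[u]:
            if comp_of[u] != comp_of[v]: # keep only edges between different SCCs
                edges.add((comp_of[u], comp_of[v]))
    return edges


# Hypothetical graph with SCCs {a, b} and {c, d}.
g = {"a": ["b"], "b": ["a", "c"], "c": ["d"], "d": ["c"]}
scc_edges = component_graph(g, [["a", "b"], ["c", "d"]])
print(scc_edges)
```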
Lemma
G^SCC is a dag (directed acyclic graph). More formally, let C and C′ be distinct SCCs in G, let u, v ∈ C and u′, v′ ∈ C′, and suppose there is a path u ⇝ u′ in G. Then there cannot also be a path v′ ⇝ v in G.
Proof
Suppose there is a path v′ ⇝ v in G. Then there are paths u ⇝ u′ ⇝ v′ and v′ ⇝ v ⇝ u in G. Therefore, u and v′ are reachable from each other, so they are not in separate SCCs. Contradiction. Therefore, there does not exist a path from v′ to v.
SCC(G)
  call DFS(G) to compute finishing times f[u] for all u
  compute G^T
  call DFS(G^T), but in the main loop, consider vertices in order of decreasing f[u] (as computed in first DFS)
  output the vertices in each tree of the depth-first forest formed in second DFS as a separate SCC
Time: Θ(V + E).
Idea: By considering vertices in the second DFS in decreasing order of finishing times from the first DFS, we are visiting vertices of the component graph in topological sort order.
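The four steps of SCC(G) can be sketched in Python as below. This is one possible rendering, assuming a dict of adjacency lists and a graph small enough for recursive DFS. The name `strongly_connected_components` is illustrative. Rather than storing f[u], it records vertices in the order they finish, which gives the same ordering.

```python
def strongly_connected_components(adj):
    """Return a list of SCCs, each a list of vertices."""
    visited = set()

    def visit(u, graph, out):
        visited.add(u)
        for v in graph[u]:
            if v not in visited:
                visit(v, graph, out)
        out.append(u)                    # u finishes here

    # Step 1: first DFS on G, recording vertices by increasing finishing time.
    order = []
    for u in adj:
        if u not in visited:
            visit(u, adj, order)

    # Step 2: compute the transpose.
    gt = {u: [] for u in adj}
    for u in adj:
        for v in adj[u]:
            gt[v].append(u)

    # Steps 3 and 4: second DFS on the transpose, roots taken in order of
    # decreasing finishing time; each tree is one SCC.
    visited.clear()
    components = []
    for u in reversed(order):
        if u not in visited:
            tree = []
            visit(u, gt, tree)
            components.append(tree)
    return components


# Hypothetical eight-vertex directed graph.
g = {
    "a": ["b"], "b": ["c", "e", "f"], "c": ["d", "g"], "d": ["c", "h"],
    "e": ["a", "f"], "f": ["g"], "g": ["f", "h"], "h": ["h"],
}
sccs = strongly_connected_components(g)
print([sorted(c) for c in sccs])
```

On this graph the components come out in the order {a, b, e}, {c, d}, {f, g}, {h}, which is a topological order of the component graph of G.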
To prove that it works, first deal with 2 notational issues:
Will be discussing d[u] and f[u]. These always refer to the first DFS.
Extend notation for d and f to sets of vertices U ⊆ V:
d(U) = min {d[u] : u ∈ U} (earliest discovery time)
f(U) = max {f[u] : u ∈ U} (latest finishing time)
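The set versions of d and f are just a min and a max over the per-vertex timestamps. A tiny sketch, with illustrative helper names `d_set` and `f_set` and made-up timestamps:

```python
def d_set(d, U):
    """Earliest discovery time among the vertices of U."""
    return min(d[u] for u in U)


def f_set(f, U):
    """Latest finishing time among the vertices of U."""
    return max(f[u] for u in U)


# Hypothetical timestamps from a DFS that visits a, then b, then c.
d = {"a": 1, "b": 2, "c": 4}
f = {"a": 6, "b": 3, "c": 5}
U = {"b", "c"}
print(d_set(d, U), f_set(f, U))
```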
Lemma:
Let C and C′ be distinct SCCs in G = (V, E). Suppose there is an edge (u, v) ∈ E such that u ∈ C and v ∈ C′. Then f(C) > f(C′).
Proof
Two cases, depending on which SCC had the first discovered vertex during the first DFS.
If d(C) < d(C′), let x be the first vertex discovered in C. At time d[x], all vertices in C and C′ are white. Thus, there exist paths of white vertices from x to all vertices in C and C′. By the white-path theorem, all vertices in C and C′ are descendants of x in the depth-first tree. By the parenthesis theorem, f[x] = f(C) > f(C′).
If d(C) > d(C′), let y be the first vertex discovered in C′. At time d[y], all vertices in C′ are white and there is a white path from y to each vertex in C′, so all vertices in C′ become descendants of y. Again, f[y] = f(C′). At time d[y], all vertices in C are white. By the earlier lemma (G^SCC is a dag), since there is an edge (u, v) from C to C′, we cannot have a path from C′ to C. So no vertex in C is reachable from y. Therefore, at time f[y], all vertices in C are still white. Therefore, for all w ∈ C, f[w] > f[y], which implies that f(C) > f(C′).
Corollary:
Let C and C′ be distinct SCCs in G = (V, E). Suppose there is an edge (u, v) ∈ E^T, where u ∈ C and v ∈ C′. Then f(C) < f(C′).
Proof
(u, v) ∈ E^T implies (v, u) ∈ E. Since the SCCs of G and G^T are the same, the lemma applied to the edge (v, u) gives f(C′) > f(C).
Corollary
Let C and C′ be distinct SCCs in G = (V, E), and suppose that f(C) > f(C′). Then there cannot be an edge from C to C′ in G^T.
Proof
It's the contrapositive of the previous corollary.
Now we have the intuition to understand why the SCC procedure works.
When we do the second DFS, on G^T, start with SCC C such that f(C) is maximum. The second DFS starts from some x ∈ C, and it visits all vertices in C. The corollary says that since f(C) > f(C′) for all C′ ≠ C, there are no edges from C to C′ in G^T.
Therefore, DFS will visit only vertices in C. Which means that the depth-first tree rooted at x contains exactly the vertices of C.
The next root chosen in the second DFS is in SCC C′ such that f(C′) is maximum over all SCCs other than C. DFS visits all vertices in C′, but the only edges out of C′ go to C, which we've already visited. Therefore, the only tree edges will be to vertices in C′.
We can continue the process. Each time we choose a root for the second DFS, it can reach only:
vertices in its SCC: get tree edges to these,
vertices in SCCs already visited in second DFS: get no tree edges to these.
We are visiting vertices of (G^T)^SCC in reverse of topologically sorted order.