
CS8451 – DESIGN AND ANALYSIS OF ALGORITHMS

UNIT III

DYNAMIC PROGRAMMING AND GREEDY TECHNIQUE

Dynamic programming – Principle of optimality – Coin changing problem, Computing a Binomial Coefficient – Floyd's algorithm – Multistage graph – Optimal Binary Search Trees – Knapsack Problem and Memory functions. Greedy Technique – Container loading problem – Prim's algorithm and Kruskal's Algorithm – 0/1 Knapsack problem, Optimal Merge pattern – Huffman Trees.

3.1 DYNAMIC PROGRAMMING

Dynamic programming was invented in the 1950s by Richard Bellman, a U.S. mathematician, as a general method for optimizing multistage decision processes. It is a planning technique for solving problems with overlapping subproblems, which arise from a recurrence relating a solution of a given problem to solutions of its smaller subproblems of the same type. Rather than solving overlapping subproblems again and again, dynamic programming suggests solving each of the smaller subproblems only once and recording the results in a table from which we can then obtain a solution to the original problem. It can also help avoid using extra space.

In the bottom-up approach, all the smaller subproblems are solved. The top-down approach avoids solving unnecessary subproblems; combined with a table of recorded results, it is called the method of memory functions.
R
3.1.1 COMPUTING A BINOMIAL COEFFICIENT
Dynamic programming can also be applied to nonoptimization problems. In elementary combinatorics, the binomial coefficient, denoted C(n, k), is the number of combinations (subsets) of k elements that can be chosen from an n-element set (0 ≤ k ≤ n).
The binomial formula is

(a + b)^n = C(n,0)a^n + ... + C(n,i)a^(n-i)b^i + ... + C(n,n)b^n

and the binomial coefficients satisfy the recurrence

C(n,k) = C(n-1,k-1) + C(n-1,k) for n > k > 0, and C(n,0) = C(n,n) = 1.

The recurrence can be computed by filling a table whose row i (0 ≤ i ≤ n) holds C(i, j) for 0 ≤ j ≤ min(i, k):

        0   1   2   ...   k-1          k
  0     1
  1     1   1
  2     1   2   1
  ...
  k     1   ...                         1
  ...
  n-1   1   ...       C(n-1,k-1)   C(n-1,k)
  n     1   ...                      C(n,k)
Algorithm Binomial(n, k)
// Computes C(n, k) by dynamic programming
// Input: A pair of nonnegative integers n ≥ k ≥ 0
// Output: The value of C(n, k)
for i ← 0 to n do
    for j ← 0 to min(i, k) do
        if j = 0 or j = i
            C[i, j] ← 1
        else
            C[i, j] ← C[i-1, j-1] + C[i-1, j]
return C[n, k]

The time efficiency is determined by the basic operation, addition. Since the first k+1 rows of the table form a triangle while the remaining n-k rows form a rectangle, we split the sum expressing A(n, k), the total number of additions, into two parts:

A(n,k) = Σ(i=1..k) Σ(j=1..i-1) 1 + Σ(i=k+1..n) Σ(j=1..k) 1
       = Σ(i=1..k) (i-1) + Σ(i=k+1..n) k
       = (k-1)k/2 + k(n-k) ∈ Θ(nk)
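
As a concrete illustration, here is a minimal Python sketch of the same bottom-up computation (the function name binomial and the full (n+1) × (k+1) table are our own choices, not part of the pseudocode above):

def binomial(n, k):
    """Compute C(n, k) bottom-up, as in the pseudocode above."""
    # C[i][j] will hold C(i, j); only entries with j <= min(i, k) are used
    C = [[0] * (k + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(min(i, k) + 1):
            if j == 0 or j == i:
                C[i][j] = 1          # boundary values C(i,0) = C(i,i) = 1
            else:
                C[i][j] = C[i - 1][j - 1] + C[i - 1][j]
    return C[n][k]

print(binomial(5, 2))                # prints 10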

3.2 WARSHALL’S AND FLOYD’S ALGORITHMS


This section discusses Warshall's algorithm for computing the transitive closure of a directed graph and Floyd's algorithm for the all-pairs shortest-paths problem.

3.2.1 Warshall’s algorithm:


The adjacency matrix A = {aij} of a directed graph is the Boolean matrix that has 1 in its ith row and jth column iff there is a directed edge from the ith vertex to the jth vertex.

Definition: The transitive closure of a directed graph with n vertices is the n × n Boolean matrix T = {tij}, in which the element in the ith row (1 ≤ i ≤ n) and the jth column (1 ≤ j ≤ n) is 1 if there exists a nontrivial directed path (i.e., a directed path of positive length) from the ith vertex to the jth vertex; otherwise, tij is 0.

The transitive closure can be generated with DFS or BFS performed from each vertex, but Warshall's algorithm constructs the transitive closure of a given digraph with n vertices through a series of n × n matrices:

R(0), R(1), ..., R(k-1), R(k), ..., R(n)


The element rij(k) in the ith row and jth column of matrix R(k) is 1 iff there exists a directed path from the ith vertex to the jth vertex with each intermediate vertex, if any, numbered not higher than k. The series starts with R(0), which does not allow any intermediate vertices in its paths; R(1) may use the first vertex as intermediate, so it may contain more ones than R(0). Thus, R(n) reflects paths that can use all n vertices of the digraph as intermediate and hence is nothing other than the digraph's transitive closure.

R(k) is computed from its immediate predecessor R(k-1). Let rij(k), the element in the ith row and jth column of matrix R(k), be equal to 1. This means that there exists a path from the ith vertex vi to the jth vertex vj with each intermediate vertex numbered not higher than k:

vi, (a list of intermediate vertices each numbered not higher than k), vj.

There are two possibilities. First, the list of intermediate vertices does not contain the kth vertex; then the path from vi to vj has intermediate vertices numbered not higher than k-1, and therefore rij(k-1) = 1. Second, the path does contain the kth vertex vk among its intermediate vertices. Since vk occurs only once in the list, the path has the form

vi, (vertices numbered ≤ k-1), vk, (vertices numbered ≤ k-1), vj.

This means there exists a path from vi to vk with each intermediate vertex numbered not higher than k-1, hence rik(k-1) = 1, and a path from vk to vj with each intermediate vertex numbered not higher than k-1, hence rkj(k-1) = 1. That is, if rij(k) = 1, then either rij(k-1) = 1, or both rik(k-1) = 1 and rkj(k-1) = 1.
Thus the formula is:

rij(k) = rij(k-1) or (rik(k-1) and rkj(k-1))

This formula yields Warshall's algorithm, which implies:
 If an element rij is 1 in R(k-1), it remains 1 in R(k).
 If an element rij is 0 in R(k-1), it has to be changed to 1 in R(k) iff the element in its row i and column k and the element in its column j and row k are both 1's in R(k-1).

Algorithm Warshall(A[1..n, 1..n])
// Implements Warshall's algorithm for computing the transitive closure
// Input: The adjacency matrix A of a digraph with n vertices
// Output: The transitive closure of the digraph
R(0) ← A
for k ← 1 to n do
    for i ← 1 to n do
        for j ← 1 to n do
            R(k)[i, j] ← R(k-1)[i, j] or (R(k-1)[i, k] and R(k-1)[k, j])
return R(n)

The time efficiency is in Θ(n^3). The algorithm can be sped up by restricting its innermost loop; another way to make it run faster is to treat matrix rows as bit strings and apply bitwise OR operations.
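
The following is a minimal Python sketch of Warshall's algorithm using a single matrix updated in place, which is safe here because an element can only change from 0 to 1 (the function name and the sample digraph are ours):

def warshall(adj):
    """Transitive closure of a digraph given as a 0/1 adjacency matrix."""
    n = len(adj)
    R = [row[:] for row in adj]      # copy, so the input is kept intact
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # is there an i -> j path with intermediates numbered <= k?
                R[i][j] = R[i][j] or (R[i][k] and R[k][j])
    return R

# sample digraph with edges a->b, b->d, d->a, d->c
A = [[0, 1, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0],
     [1, 0, 1, 0]]
for row in warshall(A):
    print(row)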


As far as space is concerned, separate matrices for recording intermediate results are unnecessary: the computation can be carried out in a single matrix.

3.2.2 Floyd's algorithm for the all-pairs shortest-paths problem:

The problem is to find the distances (the lengths of the shortest paths) from each vertex to all other vertices. It is convenient to record the lengths of shortest paths in an n × n matrix D called the distance matrix: the element dij in the ith row and jth column of this matrix indicates the length of the shortest path from the ith vertex to the jth vertex (1 ≤ i, j ≤ n).

Floyd's algorithm is applicable to both undirected and directed weighted graphs, provided they do not contain a cycle of negative length.

It computes the distance matrix of a weighted graph with n vertices through a series of n × n matrices:

D(0), D(1), ..., D(k-1), D(k), ..., D(n)

D(0) allows no intermediate vertices, while D(n) allows all the vertices as intermediates. The element dij(k) is the length of the shortest path from vi to vj with intermediate vertices numbered not higher than k. D(k) is obtained from its immediate predecessor D(k-1); such a path has the form

vi, (a list of intermediate vertices each numbered not higher than k), vj.

We can partition all such paths into two disjoint subsets: those that do not use the kth vertex vk as an intermediate and those that do. Since the paths of the first subset have their intermediate vertices numbered not higher than k-1, the shortest of them has length dij(k-1).

As for the second subset, if the graph does not contain a cycle of negative length, each path in the subset uses vertex vk as an intermediate vertex exactly once:

vi, (vertices numbered ≤ k-1), vk, (vertices numbered ≤ k-1), vj.


The length of the shortest path from vi to vk is dik(k-1) and from vk to vj is dkj(k-1), so the length of the shortest path among the paths that use the kth vertex is dik(k-1) + dkj(k-1). Hence,

dij(k) = min{ dij(k-1), dik(k-1) + dkj(k-1) } for k ≥ 1, with dij(0) = wij.

Algorithm Floyd(W[1..n, 1..n])
// Implements Floyd's algorithm for the all-pairs shortest-paths problem
// Input: The weight matrix W of a graph with no negative-length cycle
// Output: The distance matrix of the shortest paths' lengths
D ← W
for k ← 1 to n do
    for i ← 1 to n do
        for j ← 1 to n do
            D[i, j] ← min{ D[i, j], D[i, k] + D[k, j] }
return D
The time efficiency is Θ(n^3). The principle of optimality holds for such optimization problems.
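
A hedged Python sketch of Floyd's algorithm follows; inf stands for the ∞ entries of the weight matrix, and the sample graph is an assumed example:

from math import inf

def floyd(W):
    """All-pairs shortest-path lengths; W has 0 on the diagonal and
    inf where there is no edge."""
    n = len(W)
    D = [row[:] for row in W]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # shortest i -> j path using intermediates numbered <= k
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D

W = [[0,   inf, 3,   inf],
     [2,   0,   inf, inf],
     [inf, 7,   0,   1  ],
     [6,   inf, inf, 0  ]]
for row in floyd(W):
    print(row)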

3.3 OPTIMAL BINARY SEARCH TREES

A binary search tree is one of the most important data structures in computer science. One of its principal applications is to implement a dictionary, a set of elements with the operations of searching, insertion, and deletion. If the probabilities of searching for the elements of a set are known—e.g., from accumulated data about past searches—it is natural to pose a question about an optimal binary search tree, for which the average number of comparisons in a search is the smallest possible. For simplicity, we limit our discussion to minimizing the average number of comparisons in a successful search; the method can be extended to include unsuccessful searches as well.

As an example, consider four keys A, B, C, and D to be searched for with probabilities 0.1, 0.2, 0.4, and 0.3, respectively. The average number of comparisons in a successful search in the first of two candidate trees is 0.1·1 + 0.2·2 + 0.4·3 + 0.3·4 = 2.9, and for the second one it is 0.1·2 + 0.2·1 + 0.4·2 + 0.3·3 = 2.1. Neither of these two trees is, in fact, optimal. (Can you tell which binary tree is optimal?)

For our tiny example, we could find the optimal tree by generating all 14 binary search trees with these keys. As a general algorithm, this exhaustive-search approach is unrealistic: the total number of binary search trees with n keys is equal to the nth Catalan number

C(n) = (1/(n+1)) · C(2n, n) for n > 0, C(0) = 1,

which grows to infinity as fast as 4^n / n^1.5.

So let a1, ..., an be distinct keys ordered from the smallest to the largest, and let p1, ..., pn be the probabilities of searching for them. Let C(i, j) be the smallest average number of comparisons made in a successful search in a binary search tree T(i, j) made up of keys ai, ..., aj, where i, j are some integer indices, 1 ≤ i ≤ j ≤ n. Following the classic dynamic programming approach, we will find values of C(i, j) for all smaller instances of the problem, although we are interested just in C(1, n). To derive a recurrence underlying a dynamic programming algorithm, we consider all possible ways to choose a root ak among the keys ai, ..., aj. For such a binary search tree, the root contains key ak, the left subtree T(i, k-1) contains keys ai, ..., ak-1 optimally arranged, and the right subtree T(k+1, j) contains keys ak+1, ..., aj, also optimally arranged. (Note how we are taking advantage of the principle of optimality here.)
The two-dimensional table used by the algorithm contains, for computing C(i, j), the needed values in row i in the columns to the left of column j, and in column j in the rows below row i; the pairs of entries whose sums are computed are compared in order to find the smallest one, which is recorded as the value of C(i, j). This suggests filling the table along its diagonals, starting with all zeros on the main diagonal and the given probabilities pi, 1 ≤ i ≤ n, right above it, and moving toward the upper right corner.

The algorithm we just sketched computes C(1, n), the average number of comparisons for successful searches in the optimal binary tree. If we also want to get the optimal tree itself, we need to maintain another two-dimensional table to record the value of k for which the minimum is achieved. This table has the same shape and is filled in the same manner, starting with entries R(i, i) = i for 1 ≤ i ≤ n. When the table is filled, its entries indicate the indices of the roots of the optimal subtrees, which makes it possible to reconstruct an optimal tree for the entire set given.


EXAMPLE: Let us illustrate the algorithm by applying it to the four-key set we used at the beginning of this section:

Key          A    B    C    D
Probability  0.1  0.2  0.4  0.3

The recurrence for filling the table is

C(i, j) = min{i ≤ k ≤ j} { C(i, k-1) + C(k+1, j) } + Σ(s=i..j) ps for 1 ≤ i ≤ j ≤ n,

with C(i, i-1) = 0 and C(i, i) = pi. Thus, out of the two possible binary trees containing the first two keys, A and B, the root of the optimal tree has index 2 (i.e., it contains B), and the average number of comparisons in a successful search in this tree is 0.4.

ALGORITHM OptimalBST(P[1..n])
// Finds an optimal binary search tree by dynamic programming
// Input: An array P[1..n] of search probabilities for a sorted list of n keys
// Output: Average number of comparisons in successful searches in the
//         optimal BST and table R of subtrees' roots in the optimal BST
for i ← 1 to n do
    C[i, i - 1] ← 0
    C[i, i] ← P[i]
    R[i, i] ← i
C[n + 1, n] ← 0
for d ← 1 to n - 1 do            // diagonal count
    for i ← 1 to n - d do
        j ← i + d
        minval ← ∞
        for k ← i to j do
            if C[i, k - 1] + C[k + 1, j] < minval
                minval ← C[i, k - 1] + C[k + 1, j]; kmin ← k
        R[i, j] ← kmin
        sum ← P[i]; for s ← i + 1 to j do sum ← sum + P[s]
        C[i, j] ← minval + sum
return C[1, n], R
The algorithm's space efficiency is clearly quadratic; the time efficiency of this version of the algorithm is cubic (why?). A more careful analysis shows that entries in the root table are always nondecreasing along each row and column. This limits values for R(i, j) to the range R(i, j - 1), ..., R(i + 1, j) and makes it possible to reduce the running time of the algorithm to Θ(n^2).
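
The following Python sketch mirrors OptimalBST, padding the arrays so the 1-based indices of the pseudocode can be used directly (the function name is ours; the probabilities are the A, B, C, D example above):

from math import inf

def optimal_bst(P):
    """P[1..n] holds search probabilities (P[0] is unused).
    Returns C[1][n] and the root table R."""
    n = len(P) - 1
    C = [[0.0] * (n + 2) for _ in range(n + 2)]   # C[i][i-1] = 0 by default
    R = [[0] * (n + 2) for _ in range(n + 2)]
    for i in range(1, n + 1):
        C[i][i] = P[i]
        R[i][i] = i
    for d in range(1, n):                         # diagonal count
        for i in range(1, n - d + 1):
            j = i + d
            minval, kmin = inf, i
            for k in range(i, j + 1):             # try every root a_k
                val = C[i][k - 1] + C[k + 1][j]
                if val < minval:
                    minval, kmin = val, k
            R[i][j] = kmin
            C[i][j] = minval + sum(P[i:j + 1])
    return C[1][n], R

avg, R = optimal_bst([0, 0.1, 0.2, 0.4, 0.3])     # keys A, B, C, D
print(avg, R[1][4])    # about 1.7; root index 3, i.e., key C is the optimal root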
3.4 THE KNAPSACK PROBLEM AND MEMORY FUNCTIONS

The knapsack problem states: given n items of known weights w1, w2, ..., wn and values v1, v2, ..., vn and a knapsack of capacity W, find the most valuable subset of the items that fits into the knapsack. In dynamic programming we obtain the solution by solving smaller subinstances.

Let v[i, j] be the value of an optimal solution to the subinstance, i.e., the value of the most valuable subset of the first i items that fits into a knapsack of capacity j. We can divide all such subsets into two categories: those that do not include the ith item and those that do.

1. Among the subsets that do not include the ith item, the value of an optimal subset is, by definition, v[i-1, j].


2. Among the subsets that do include the ith item (hence j - wi ≥ 0), an optimal subset is made up of this item and an optimal subset of the first i-1 items that fits into the knapsack of capacity j - wi. The value of such an optimal subset is vi + v[i-1, j-wi].

Thus, the value of an optimal solution among all feasible subsets of the first i items is the maximum of these two values. Of course, if the ith item does not fit into the knapsack, the value of an optimal subset selected from the first i items is the same as the value of an optimal subset selected from the first i-1 items:

v[i, j] = max{ v[i-1, j], vi + v[i-1, j-wi] }   if j - wi ≥ 0
v[i, j] = v[i-1, j]                             if j - wi < 0

The initial conditions are v[0, j] = 0 for j ≥ 0 and v[i, 0] = 0 for i ≥ 0. Our goal is to find v[n, W], the maximal value of a subset of the n items that fits into the knapsack of capacity W.

In the worked example, the maximal value is v[4, 5] = $37. Since v[4, 5] ≠ v[3, 5], item 4 was included in an optimal solution, along with an optimal subset for filling the 5 - 2 = 3 remaining units of knapsack capacity. Next, v[3, 3] = v[2, 3], so item 3 is not part of an optimal subset. Since v[2, 3] ≠ v[1, 3], item 2 is part of an optimal selection, which leaves element v[1, 3-1] to specify its remaining composition. Finally, v[1, 2] ≠ v[0, 2], so item 1 is part of the solution. Therefore {item 1, item 2, item 4} is the subset of value $37.
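
A Python sketch of the bottom-up computation together with the traceback just described; the weights (2, 1, 3, 2), values ($12, $10, $20, $15), and capacity W = 5 are the assumed instance behind the v[4, 5] = $37 trace above:

def knapsack(weights, values, W):
    """Bottom-up 0/1 knapsack; returns the whole table V[0..n][0..W]."""
    n = len(weights)
    V = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, W + 1):
            if j >= weights[i - 1]:     # item i fits into capacity j
                V[i][j] = max(V[i - 1][j],
                              values[i - 1] + V[i - 1][j - weights[i - 1]])
            else:
                V[i][j] = V[i - 1][j]
    return V

def traceback(V, weights, W):
    """Recover an optimal subset by comparing V[i][j] with V[i-1][j]."""
    items, j = [], W
    for i in range(len(weights), 0, -1):
        if V[i][j] != V[i - 1][j]:      # item i was taken
            items.append(i)
            j -= weights[i - 1]
    return sorted(items)

V = knapsack([2, 1, 3, 2], [12, 10, 20, 15], 5)
print(V[4][5], traceback(V, [2, 1, 3, 2], 5))   # 37 [1, 2, 4]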
Memory Functions:

The direct top-down approach to finding a solution to such a recurrence leads to an algorithm that solves common subproblems more than once and hence is very inefficient. The other approach is bottom-up: it fills a table with solutions to all smaller subproblems, but each of them is solved only once. The drawback of bottom-up is that it solves all smaller subproblems even when that is not necessary for getting the solution. To combine the strengths of top-down and bottom-up—a method that solves only the subproblems that are necessary, and solves each of them only once—we use memory functions.


This method solves a given problem top-down but maintains a table of the kind used by bottom-up algorithms. Initially, all the table's entries are initialized with a special "null" symbol to indicate that they have not yet been calculated (virtual initialization). Whenever a new value needs to be calculated, the method checks the corresponding entry in the table first: if the entry is not null, it is retrieved from the table; otherwise, it is computed by the recursive call, and the result is then recorded in the table.

Algorithm MFKnapsack(i, j)
// Implements the memory function method for the knapsack problem
// Input: Nonnegative integers i (number of the first items considered)
//        and j (knapsack capacity)
// Output: The value of an optimal feasible subset of the first i items
// Note: Uses as global variables the input arrays Weights[1..n], Values[1..n],
//       and table V[0..n, 0..W] whose entries are initialized with -1's
//       except for row 0 and column 0, which are initialized with 0's
if V[i, j] < 0
    if j < Weights[i]
        value ← MFKnapsack(i - 1, j)
    else
        value ← max(MFKnapsack(i - 1, j),
                    Values[i] + MFKnapsack(i - 1, j - Weights[i]))
    V[i, j] ← value
return V[i, j]
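
The same memory-function scheme in Python; instead of globals, the tables are passed as arguments, and -1 plays the role of the "null" marker (names and the sample instance are ours):

def mf_knapsack(i, j, weights, values, V):
    """Top-down 0/1 knapsack; V is initialized to -1 except for
    row 0 and column 0, which hold 0's."""
    if V[i][j] < 0:                     # not computed yet
        if j < weights[i - 1]:
            value = mf_knapsack(i - 1, j, weights, values, V)
        else:
            value = max(
                mf_knapsack(i - 1, j, weights, values, V),
                values[i - 1]
                + mf_knapsack(i - 1, j - weights[i - 1], weights, values, V))
        V[i][j] = value
    return V[i][j]

weights, values, W = [2, 1, 3, 2], [12, 10, 20, 15], 5
V = [[0] * (W + 1)] + [[0] + [-1] * W for _ in range(len(weights))]
print(mf_knapsack(len(weights), W, weights, values, V))   # 37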


3.5 GREEDY TECHNIQUE

The change-making problem is a good example of the greedy concept: give change for a specific amount n with the least number of coins of the denominations d1 > d2 > ... > dm. For example, with d1 = 25, d2 = 10, d3 = 5, d4 = 1, to give change for 48 cents the first step is to give one d1, then two d2's and three d4's, which yields an optimal solution with the least number of coins (see the sketch after this list). The greedy technique is applicable to optimization problems. It suggests constructing a solution through a sequence of steps, each expanding a partially constructed solution obtained so far, until a complete solution to the problem is reached. On each step the choice made must be:
 Feasible, i.e., it has to satisfy the problem's constraints.
 Locally optimal, i.e., it has to be the best local choice among all feasible choices available on that step.
 Irrevocable, i.e., once made, it cannot be changed on subsequent steps of the algorithm.
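
A minimal Python sketch of this greedy change-making step; it is provably optimal for canonical coin systems such as {25, 10, 5, 1}, but for arbitrary denominations the greedy answer may be suboptimal:

def greedy_change(n, denominations):
    """Give change for n using the largest denominations first."""
    coins = []
    for d in sorted(denominations, reverse=True):
        while n >= d:                   # take coin d as long as it fits
            coins.append(d)
            n -= d
    return coins

print(greedy_change(48, [25, 10, 5, 1]))   # [25, 10, 10, 1, 1, 1]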

3.5.1 PRIM'S ALGORITHM

A spanning tree of a connected graph is its connected acyclic subgraph that contains all the vertices of the graph. A minimum spanning tree of a weighted connected graph is its spanning tree of the smallest weight, where the weight of a tree is defined as the sum of the weights of all its edges. The minimum spanning tree problem is the problem of finding a minimum spanning tree for a given weighted connected graph.

Exhaustive search faces two serious obstacles: first, the number of spanning trees grows exponentially with the graph size; second, generating all spanning trees for a given graph is not easy. In fact, it is more difficult than finding a minimum spanning tree for a weighted graph by using one of the several efficient algorithms available for this problem.

(Figure: a graph and its spanning trees, of which T1 is the minimum spanning tree.)


Prim's algorithm constructs a minimum spanning tree through a sequence of expanding subtrees. The initial subtree in such a sequence consists of a single vertex selected arbitrarily from the set V of the graph's vertices. On each iteration, we expand the current tree in the greedy manner by simply attaching to it the nearest vertex not in that tree. The algorithm stops after all the graph's vertices have been included in the tree being constructed. Since each iteration adds exactly one vertex, the total number of iterations is n - 1, where n is the number of vertices.

Algorithm Prim(G)
// Input: A weighted connected graph G = <V, E>
// Output: ET, the set of edges composing a minimum spanning tree of G
VT ← {v0}        // the tree can be initialized with any vertex
ET ← ∅
for i ← 1 to |V| - 1 do
    find a minimum-weight edge e* = (v*, u*) among all the edges (v, u)
        such that v is in VT and u is in V - VT
    VT ← VT ∪ {u*}
    ET ← ET ∪ {e*}
return ET

The algorithm provides each vertex not in the current tree with information about the shortest edge connecting it to a tree vertex. Vertices that are not adjacent to any tree vertex are labeled ∞ (infinity). The vertices not in the tree are divided into two sets: the fringe and the unseen. The fringe contains only the vertices that are not in the tree but are adjacent to at least one tree vertex; the next vertex is selected from the fringe. The unseen vertices are all the other vertices of the graph (they are yet to be "seen"). Ties can be broken arbitrarily.
After we identify a vertex u* to be added to the tree, we need to perform two operations:
 Move u* from the set V - VT to the set of tree vertices VT.
 For each remaining vertex u in V - VT that is connected to u* by a shorter edge than u's current distance label, update its labels by u* and the weight of the edge between u* and u, respectively.
Prim's algorithm always yields a minimum spanning tree. The proof is by induction. Since T0 consists of a single vertex, it must be a part of any minimum spanning tree. Assume that Ti-1 is part of some minimum spanning tree T. We prove, by contradiction, that Ti, generated from Ti-1 by the algorithm, is also a part of a minimum spanning tree. Assume that no minimum spanning tree of the graph can contain Ti. Let ei = (v, u) be the minimum-weight edge from a vertex in Ti-1 to a vertex not in Ti-1 used by Prim's algorithm to expand Ti-1 to Ti. By our assumption, ei cannot belong to the minimum spanning tree T. Therefore, if we add ei to T, a cycle must be formed.

In addition to edge ei = (v, u), this cycle must contain another edge (v', u') connecting a vertex v' ∈ Ti-1 to a vertex u' not in Ti-1. If we now delete the edge (v', u') from this cycle, we obtain another spanning tree of the entire graph whose weight is less than or equal to the weight of T, since the weight of ei is less than or equal to the weight of (v', u'). Hence this new tree is a minimum spanning tree containing Ti, which contradicts the assumption that no minimum spanning tree contains Ti.
The efficiency of this algorithm depends on the data structures chosen for the graph itself and for the priority queue of the set V - VT, whose vertex priorities are the distances to the nearest tree vertices. For example, if the graph is represented by its weight matrix and the priority queue is implemented as an unordered array, the algorithm's running time will be in Θ(|V|^2).

The priority queue can be implemented by a min-heap, a complete binary tree in which every element is less than or equal to its children. Deletion of the smallest element from, and insertion of a new element into, a min-heap of size n are O(log n) operations.

If a graph is represented by its adjacency linked lists and the priority queue is implemented as a min-heap, the running time of the algorithm is in O(|E| log |V|). This is because the algorithm performs |V| - 1 deletions of the smallest element and makes |E| verifications and, possibly, changes of an element's priority in a min-heap of size not greater than |V|. Each of these operations is O(log |V|). Hence the running time is in

(|V| - 1 + |E|) O(log |V|) = O(|E| log |V|)

because, in a connected graph, |V| - 1 ≤ |E|.
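
A Python sketch of Prim's algorithm over adjacency lists, with Python's heapq module as the priority queue; stale heap entries are skipped instead of being decreased, which preserves the O(|E| log |V|) bound (the graph encoding and names are ours):

import heapq

def prim(graph, start):
    """graph: {v: [(weight, u), ...]}, undirected; returns the MST edges."""
    in_tree = {start}
    heap = [(w, start, u) for w, u in graph[start]]
    heapq.heapify(heap)
    mst = []
    while heap and len(in_tree) < len(graph):
        w, v, u = heapq.heappop(heap)   # cheapest fringe edge
        if u in in_tree:
            continue                    # stale entry, skip it
        in_tree.add(u)
        mst.append((v, u, w))
        for w2, x in graph[u]:
            if x not in in_tree:
                heapq.heappush(heap, (w2, u, x))
    return mst

g = {'a': [(3, 'b'), (5, 'c')],
     'b': [(3, 'a'), (1, 'c'), (4, 'd')],
     'c': [(5, 'a'), (1, 'b'), (6, 'd')],
     'd': [(4, 'b'), (6, 'c')]}
print(prim(g, 'a'))   # [('a', 'b', 3), ('b', 'c', 1), ('b', 'd', 4)]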


3.5.2 KRUSKAL’S ALGORITHM

Kruskal's algorithm looks at a minimum spanning tree for a weighted connected graph G = <V, E> as an acyclic subgraph with |V| - 1 edges for which the sum of the edge weights is the smallest. The algorithm constructs a minimum spanning tree as an expanding sequence of subgraphs, which are always acyclic but are not necessarily connected on the intermediate stages of the algorithm.

The algorithm begins by sorting the graph's edges in increasing order of their weights. Then, starting with the empty subgraph, it scans this sorted list, adding the next edge on the list to the current subgraph if such an inclusion does not create a cycle and simply skipping the edge otherwise.

Algorithm Kruskal(G)
// Input: A weighted connected graph G = <V, E>
// Output: ET, the set of edges composing a minimum spanning tree of G
sort E in increasing order of the edge weights
ET ← ∅; ecounter ← 0    // initialize the set of tree edges and its size
k ← 0                   // initialize the number of processed edges
while ecounter < |V| - 1 do
    k ← k + 1
    if ET ∪ {eik} is acyclic
        ET ← ET ∪ {eik}; ecounter ← ecounter + 1
return ET

Kruskal's algorithm is not simpler than Prim's, because on each iteration it has to check whether the newly added edge forms a cycle. Each connected component of the subgraph generated is a tree, because it has no cycles.
There are efficient algorithms for performing these checks, called union-find algorithms. With them, the running time of Kruskal's algorithm is dominated by the time needed for sorting the edge weights of the given graph, and hence is in O(|E| log |E|).
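
A Python sketch of Kruskal's algorithm; the cycle check is done with the union-find structure discussed next (here: union by size with path halving), and the sample edge list is ours:

def kruskal(n, edges):
    """n vertices numbered 0..n-1; edges is a list of (weight, u, v)."""
    parent = list(range(n))
    size = [1] * n

    def find(x):                        # root of x's tree
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):       # scan edges in increasing weight order
        ru, rv = find(u), find(v)
        if ru != rv:                    # adding (u, v) creates no cycle
            if size[ru] < size[rv]:     # union by size
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
            mst.append((u, v, w))
    return mst

edges = [(3, 0, 1), (5, 0, 2), (1, 1, 2), (4, 1, 3), (6, 2, 3)]
print(kruskal(4, edges))   # [(1, 2, 1), (0, 1, 3), (1, 3, 4)]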

Disjoint subsets and union-find algorithms

Kruskal's algorithm requires a dynamic partition of some n-element set S into a collection of disjoint subsets S1, S2, ..., Sk. After initialization, each subset consists of a different single element of S; the collection is then subjected to a sequence of intermixed union and find operations. Here we deal with an abstract data type of a collection of disjoint subsets of a finite set with the following operations:

 makeset(x): creates a one-element set {x}. It is assumed that this operation can be applied to each of the elements of set S only once.
 find(x): returns a subset containing x.


 union(x, y): constructs the union of the disjoint subsets Sx and Sy containing x and y, respectively, and adds it to the collection to replace Sx and Sy, which are deleted from it.

For example, let S = {1, 2, 3, 4, 5, 6}. Then makeset(i) creates the set {i}, and applying this operation six times creates the single-element sets {1}, {2}, {3}, {4}, {5}, {6}. Performing union(1,4) and union(5,2) yields {1,4}, {5,2}, {3}, {6}, and then union(4,5) and union(3,6) gives {1,4,5,2}, {3,6}.

Implementations use one element from each of the disjoint subsets in a collection as that subset's representative. There are two principal alternatives for implementing this data structure: the first, called the quick find, optimizes the time efficiency of the find operation; the second, called the quick union, optimizes the union operation.
(Figure: the quick find linked-list representation of subsets {1, 4, 5, 2} and {3, 6}. Each list header stores the list's size and pointers to its first and last nodes: list 1 has size 4 and contains 1, 4, 5, 2; list 3 has size 2 and contains 3, 6; lists 2, 4, 5, and 6 are empty.)

Subset representatives:

Element index    Representative
1                1
2                1
3                3
4                1
5                1
6                3


For makeset(x), the time is in Θ(1), and for n elements it will be in Θ(n). The efficiency of find(x) is also in Θ(1), while union(x, y) takes longer, since it must update the representative of every element in one of the lists and delete that list. A sequence of unions such as

union(2,1), union(3,2), ..., union(i+1, i), ..., union(n, n-1)

runs in Θ(n^2) time, which is slow compared to a better way: the operation called union by size unites the lists based on their lengths, i.e., the shorter of the two lists is attached to the longer one, so a single union takes at most Θ(n) time and a sequence of n-1 unions takes Θ(n log n) time in total.

Indeed, each time an element ai has its representative updated, the resulting set has at least twice as many elements as the set containing ai before the union. Hence, if ai's representative is updated Ai times, the resulting set has at least 2^Ai elements. Since the entire set S has n elements, 2^Ai ≤ n and hence Ai ≤ log2 n. Therefore, the total number of possible updates of the representatives for all n elements of S will not exceed n log2 n.

Thus, for union by size, the time efficiency of a sequence of at most n - 1 unions and m finds is in O(n log n + m).

The quick union represents each subset by a rooted tree. The nodes of the tree contain the subset's elements, with the root's element considered the subset's representative; the tree's edges are directed from children to their parents.

makeset(x) requires Θ(1) time, and for n elements it is Θ(n); a union(x, y) is Θ(1), and find(x) is in O(n), because a tree representing a subset can degenerate into a linked list with n nodes.

A better union operation attaches a smaller tree to the root of a larger one, where the tree size can be measured either by the number of nodes or by its height (union by rank). Then each find takes O(log n) time. Thus, for quick union, the time efficiency of a sequence of at most n - 1 unions and m finds is in O(n + m log n).
(Figure: forest representation of subsets {1, 4, 5, 2} and {3, 6}, and the result of union(5, 6).)

An even better efficiency can be obtained by combining either variety of quick union with path compression. This modification makes every node encountered during the execution of a find operation point to the tree's root.
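
A Python sketch of the forest-based scheme, combining union by size with path compression and replaying the example above (class and method names are ours):

class DisjointSets:
    """Forest-based union-find with union by size and path compression."""
    def __init__(self):
        self.parent = {}
        self.size = {}

    def makeset(self, x):
        self.parent[x] = x              # each element starts as a root
        self.size[x] = 1

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:   # path compression pass
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.size[rx] < self.size[ry]:   # attach smaller tree to larger
            rx, ry = ry, rx
        self.parent[ry] = rx
        self.size[rx] += self.size[ry]

ds = DisjointSets()
for i in range(1, 7):
    ds.makeset(i)
ds.union(1, 4); ds.union(5, 2); ds.union(4, 5); ds.union(3, 6)
print(ds.find(2) == ds.find(1), ds.find(6) == ds.find(3))   # True True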

3.5.3 DIJKSTRA’S ALGORITHM



The single-source shortest-paths problem is considered: for a given vertex, called the source, in a weighted connected graph, find shortest paths to all its other vertices. The answer is a set of paths, each leading from the source to a different vertex in the graph, though some paths may, of course, have edges in common.

Dijkstra's algorithm is applicable to graphs with nonnegative weights only. It finds shortest paths to a graph's vertices in order of their distance from a given source: first it finds the shortest path from the source to the vertex nearest to it, then to a second nearest, and so on. In general, before its ith iteration commences, the algorithm has already identified the shortest paths to i - 1 other vertices nearest to the source. These vertices form a subtree Ti, and the next vertex is chosen from among the vertices adjacent to the vertices of Ti, the fringe vertices. To identify the ith nearest vertex, the algorithm computes, for every fringe vertex u, the sum of the distance to the nearest tree vertex v (the weight of the edge (v, u)) and the length of the shortest path from the source to v, and then selects the fringe vertex u* with the smallest such sum (the smallest d value). Ties can be broken arbitrarily.
After we have identified a vertex u* to be added to the tree, we need to perform two operations:


 Move u* from the fringe to the set of tree vertices.
 For each remaining fringe vertex u that is connected to u* by an edge of weight w(u*, u) such that du* + w(u*, u) < du, update the labels of u by u* and du* + w(u*, u), respectively.

Algorithm Dijkstra(G, s)
// Input: A weighted connected graph G = <V, E> with nonnegative weights
//        and its vertex s
// Output: The length dv of a shortest path from s to v and its penultimate
//         vertex pv for every vertex v in V
Initialize(Q)                    // initialize vertex priority queue to empty
for every vertex v in V do
    dv ← ∞; pv ← null
    Insert(Q, v, dv)             // initialize vertex priority in the queue
ds ← 0; Decrease(Q, s, ds)       // update priority of s with ds
VT ← ∅
for i ← 0 to |V| - 1 do
    u* ← DeleteMin(Q)            // delete the minimum priority element
    VT ← VT ∪ {u*}
    for every vertex u in V - VT that is adjacent to u* do
        if du* + w(u*, u) < du
            du ← du* + w(u*, u); pu ← u*
            Decrease(Q, u, du)
The time efficiency depends on the data structures used for implementing the priority queue and for representing the input graph. It is in Θ(|V|^2) for graphs represented by their weight matrix with the priority queue implemented as an unordered array, and in O(|E| log |V|) for graphs represented by their adjacency linked lists with the priority queue implemented as a min-heap.
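
A Python sketch of Dijkstra's algorithm with heapq; as in the Prim sketch, stale heap entries are skipped in place of the Decrease operation, which heapq does not provide (the graph encoding is ours):

import heapq

def dijkstra(graph, s):
    """graph: {v: [(weight, u), ...]} with nonnegative weights.
    Returns shortest distances from s and penultimate vertices."""
    dist = {v: float('inf') for v in graph}
    pred = {v: None for v in graph}
    dist[s] = 0
    heap = [(0, s)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue                    # stale entry, skip it
        done.add(u)
        for w, x in graph[u]:
            if d + w < dist[x]:         # relax edge (u, x)
                dist[x] = d + w
                pred[x] = u
                heapq.heappush(heap, (dist[x], x))
    return dist, pred

g = {'a': [(3, 'b'), (7, 'd')],
     'b': [(3, 'a'), (2, 'c')],
     'c': [(2, 'b'), (1, 'd')],
     'd': [(7, 'a'), (1, 'c')]}
print(dijkstra(g, 'a')[0])   # {'a': 0, 'b': 3, 'c': 5, 'd': 6}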

3.6 HUFFMAN TREES

Suppose we have to encode a text that comprises characters from some n-character alphabet by assigning to each of the text's characters some sequence of bits called the codeword. There are two types of encoding: fixed-length encoding, which assigns to each character a bit string of the same length m, and variable-length encoding, which assigns codewords of different lengths to different characters. The latter introduces a problem that fixed-length encoding does not have: how do we tell how many bits of an encoded text represent the first (or, more generally, the next) character?

To avoid this complication, we use a prefix-free (or simply prefix) code: no codeword is a prefix of a codeword of another character. Hence, with such an encoding, we can simply scan a bit string until we get the first group of bits that is a codeword for some character, replace these bits by this character, and repeat this operation until the bit string's end is reached.
Huffman's algorithm

Step 1: Initialize n one-node trees and label them with the characters of the alphabet. Record the frequency of each character in its tree's root to indicate the tree's weight.

Step 2: Repeat the following operation until a single tree is obtained: find two trees with the smallest weights, make them the left and right subtrees of a new tree, and record the sum of their weights in the root of the new tree as its weight.

A tree constructed by the above algorithm is called a Huffman tree. It defines a Huffman code: label each left edge 0 and each right edge 1; the codeword of a character is then the sequence of 0's and 1's on the path from the root to that character's leaf.


Example: Consider the five-character alphabet {A, B, C, D, _} with the following occurrence probabilities:

Character    A     B     C     D     _
Probability  0.35  0.1   0.2   0.2   0.15

The resulting codewords are as follows:

Character    A     B     C     D     _
Probability  0.35  0.1   0.2   0.2   0.15
Codeword     11    100   00    01    101

Hence DAD is encoded as 011101, and 10011011011101 is decoded as BAD_AD.

With the occurrence probabilities given and the codeword lengths obtained, the expected number of bits per character in this code is

2·0.35 + 3·0.1 + 2·0.2 + 2·0.2 + 3·0.15 = 2.25


The compression ratio, a standard measure of a compression algorithm's effectiveness, is (3 - 2.25)/3 · 100% = 25%: the code uses 25% less memory than a fixed-length encoding, which would need 3 bits for each of the five characters.

Huffman's encoding is one of the most important file-compression methods. It is simple and yields an optimal, i.e., minimal-length, encoding for the given character frequencies.

Its drawback, the need to know the character frequencies in advance, can be overcome by the so-called dynamic Huffman encoding, in which the coding tree is updated each time a new character is read from the source text.

Huffman's code is not limited to data compression. The sum Σ(i=1..n) li·wi, where li is the length of the simple path from the root to the ith leaf and wi is the leaf's weight, is called the weighted path length, and Huffman's algorithm minimizes it. From this, optimal decision trees can be obtained, which are used for game applications.
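
A Python sketch of Huffman's algorithm using heapq; a running counter breaks ties between equal weights, so the exact bit patterns may differ from the table above under different tie-breaking, but the codeword lengths, and hence the 2.25 bits-per-character average, are the same:

import heapq

def huffman(freqs):
    """freqs: {char: probability}. Returns {char: codeword}."""
    # heap entries are (weight, counter, tree); the counter breaks ties
    heap = [(w, i, ch) for i, (ch, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)     # two smallest-weight trees
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, count, (t1, t2)))
        count += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):         # internal node
            walk(tree[0], prefix + '0')     # left edge labeled 0
            walk(tree[1], prefix + '1')     # right edge labeled 1
        else:
            codes[tree] = prefix or '0'     # single-character corner case
    walk(heap[0][2], '')
    return codes

p = {'A': 0.35, 'B': 0.1, 'C': 0.2, 'D': 0.2, '_': 0.15}
codes = huffman(p)
print(codes)
print(sum(len(codes[c]) * w for c, w in p.items()))   # 2.25 (up to rounding)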
