Module 5 Algorithm Analysis and Design
Module V
• Introduction to Complexity Theory
o Tractable and Intractable Problems
o Complexity Classes - P, NP, NP-Hard and NP-Complete Classes
o NP Completeness proof of Clique Problem and Vertex Cover Problem
o Approximation algorithm
• Bin Packing
• Graph Coloring
o Randomized Algorithms (Definitions of Monte Carlo and Las Vegas algorithms)
• Randomized version of Quick Sort algorithm with analysis
Tractable and Intractable Problems
• When the complexity is expressed as some polynomial function of the input size, the problem is tractable.
• When the complexity is expressed as some exponential function of the input size, the problem is intractable.
• The complexity of an intractable problem grows faster than that of a tractable problem.
• Solutions to tractable problems are implemented in practice. They have polynomial time complexity.
• According to the Cook-Karp thesis, a problem that is in P is called tractable and one that is not in P is called intractable.
• Example of tractable problem
• PATH problem: Given a directed graph G, determine whether a directed path exists from vertex s to vertex t
o Time complexity = O(n + m), where n is the total number of vertices and m is the number of edges (m is at most n^2, so this is polynomial in n)
• Example of intractable problem
• Knapsack Problem
o Time Complexity = O(2^n)
• Traveling Salesman Problem
o Time Complexity = O(n^2 2^n)
o Class P
• Class P consists of those problems that are solvable in polynomial time.
• P problems can be solved in time O(n^k). Here n is the size of the input and k is some constant.
• Example:
• PATH Problem: Given a directed graph G, determine whether a directed path exists from s to t.
o Algorithm
• Inputs: <G, s, t>, where G is a directed graph and s, t are two nodes
1. Place a mark on node s and enqueue it into an empty queue.
2. Repeat step 3 until the queue is empty.
3. Dequeue the front element a. Mark all unmarked neighbors of a and enqueue them into the queue.
4. If t is marked, then accept. Otherwise reject.
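o Complexity Calculation
• Steps 1 and 4 execute exactly once.
• Steps 2 and 3 execute at most n times, where n is the number of nodes in G, because each node is enqueued at most once. Scanning the neighbors examines each of the m edges at most once.
• Time complexity = O(n + m), which is at most O(n^2).
• This is a polynomial time algorithm.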
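The marking procedure above can be sketched in Python as follows. This is only an illustrative sketch: the adjacency-list dictionary and the name `path_exists` are assumptions, not part of the notes.

```python
from collections import deque

def path_exists(graph, s, t):
    """Breadth-first marking, following steps 1-4 of the PATH algorithm.

    graph: dict mapping each node to a list of its out-neighbours
    (an assumed representation).
    """
    marked = {s}                 # step 1: mark s and enqueue it
    queue = deque([s])
    while queue:                 # step 2: repeat until the queue is empty
        a = queue.popleft()      # step 3: dequeue the front element
        for b in graph.get(a, []):
            if b not in marked:
                marked.add(b)
                queue.append(b)
    return t in marked           # step 4: accept iff t is marked

# Hypothetical example graph
example = {"s": ["a", "b"], "a": ["t"], "b": [], "t": []}
print(path_exists(example, "s", "t"))
```

Each node enters the queue at most once and each edge is scanned at most once, which is where the polynomial bound comes from.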
• Other Examples:
• Single Source Shortest Path problem using Dijkstra's Greedy method.
• Multistage Graph problem implemented using forward or backward dynamic
programming.
• Minimum cost spanning tree using Prim's or Kruskal's method.
• Network flow problem using Ford-Fulkerson algorithm.
o Class NP
• For some problems only exponential or factorial time algorithms are known; no polynomial time solution has been found. If a proposed solution to such a problem can still be verified in polynomial time, the problem is in NP.
• NP is the class of decision problems that have a polynomial time verifier. Every problem in P is also in NP; for the hard problems in NP, only non-polynomial time algorithms are currently known.
• Example:
• Hamiltonian path (HAMPATH) Problem
o A Hamiltonian path in a directed graph G is a directed path that goes through each node exactly once.
[Figure: example directed graph]
Module V CST 306 - Algorithm Analysis and Design (S6 CSE)
o The Hamiltonian path of the above graph is as follows
[Figure: the Hamiltonian path highlighted in the example graph]
• Algorithm
• Inputs: <G, s, t> and a certificate P = (p_1, p_2, ..., p_m), a list of m nodes, where m is the number of nodes in G
1. Check whether s = p_1 and t = p_m. If either fails, reject.
2. Check for repetition of nodes in the list P. If any are found, reject.
3. For each i from 1 to m-1, check whether (p_i, p_{i+1}) is an edge in G. If any are not, reject.
4. If all tests have been passed, then accept.
• This verification algorithm runs in polynomial time. Therefore the HAMPATH problem is an NP problem.
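A minimal sketch of the HAMPATH verifier, assuming the graph is given as a dictionary from each node to the set of its successors. The function name and the example graph are hypothetical. The check that the list covers every node is made explicit here, since a Hamiltonian path must visit all nodes.

```python
def verify_hampath(graph, s, t, p):
    """Polynomial-time check that the list p is a Hamiltonian path from s to t.

    graph: dict mapping every node to the set of its successors (assumed format).
    """
    # Check 1: the path starts at s and ends at t
    if not p or p[0] != s or p[-1] != t:
        return False
    # Check 2: no node is repeated
    if len(set(p)) != len(p):
        return False
    # A Hamiltonian path must contain every node of G
    if set(p) != set(graph):
        return False
    # Check 3: every consecutive pair is an edge of G
    return all(p[i + 1] in graph[p[i]] for i in range(len(p) - 1))

# Hypothetical example graph
g = {1: {2, 3}, 2: {3}, 3: {4}, 4: set()}
print(verify_hampath(g, 1, 4, [1, 2, 3, 4]))
```

The verifier only inspects the certificate; it never searches for a path, which is why it stays polynomial.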
• CLIQUE Problem
o A clique in an undirected graph is a sub-graph where every two nodes are
connected by an edge.
o A k-clique is a clique that contains k nodes.
o Example: [Figure: example graph containing a clique]
• Whether P = NP is one of the greatest unsolved problems in theoretical computer science.
• NP-Hard
• A decision problem X is NP-Hard if every problem in NP is polynomial time reducible to X.
Y ≤p X for every Y in NP
• It means that X is at least as hard as all problems in NP.
• If X can be solved in polynomial time, then all problems in NP can also be solved in polynomial time.
o Class NP-Complete
• If a problem is in NP and is also NP-Hard, then that problem is NP-Complete.
• Example:
• CIRCUIT-SAT problem: Given a Boolean circuit C, is there an assignment to the variables that causes the circuit to output 1?
• SAT (Satisfiability) problem: Given a Boolean expression φ, is there an assignment to the variables that causes the expression to output 1?
• 3-CNF-SAT
o Literal: A variable or its negation in a Boolean formula
o Clause: OR of one or more literals
• Ex: (x1 ∨ ¬x2 ∨ x3)
o 3-CNF formula: AND of clauses, each having exactly three distinct literals
o NP Completeness Proof
• Steps to prove that the given problem is NP Complete
1. Prove that the given problem is in NP
o Write a polynomial time verification algorithm.
2. Prove that the given problem is NP Hard
o Write a polynomial time reduction algorithm from a known NP-Hard (or NP-Complete) problem to the given problem.
• CLIQUE problem is NP Complete: Proof
• Step 1: Write a polynomial time verification algorithm to prove that the given problem is in NP
o Algorithm: Let G = (V, E). We use the set V' ⊆ V of k vertices in the clique as a certificate for G.
1. Test whether V' is a set of k vertices in the graph G.
2. Check whether for each pair u, v ∈ V', the edge (u, v) belongs to E.
3. If both steps pass, then accept. Otherwise reject.
o This algorithm executes in polynomial time. Therefore the CLIQUE problem is an NP problem.
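The CLIQUE verifier above could look like this in Python. It is a sketch under the assumption that the undirected graph is given as a vertex list and an edge list; `verify_clique` and the example graph are illustrative names only.

```python
from itertools import combinations

def verify_clique(vertices, edges, k, candidate):
    """Check in polynomial time that `candidate` is a k-clique of G = (vertices, edges)."""
    edge_set = {frozenset(e) for e in edges}   # undirected edges
    cand = set(candidate)
    # Test 1: the certificate is a set of exactly k vertices of G
    if len(cand) != k or not cand <= set(vertices):
        return False
    # Test 2: every pair of certificate vertices is joined by an edge
    return all(frozenset(pair) in edge_set for pair in combinations(cand, 2))

# Hypothetical example graph
V = [1, 2, 3, 4]
E = [(1, 2), (1, 3), (2, 3), (3, 4)]
print(verify_clique(V, E, 3, [1, 2, 3]))
```

The pair check examines at most k(k-1)/2 pairs, so the running time is polynomial in the size of G.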
• Step 2: Write a polynomial time reduction algorithm from the 3-CNF-SAT problem to the CLIQUE problem (3-CNF-SAT ≤p CLIQUE)
o Algorithm
• Let φ = C_1 ∧ C_2 ∧ ... ∧ C_k be a Boolean formula in 3-CNF with k clauses.
• Each clause C_r has exactly three distinct literals l_1^r, l_2^r, l_3^r.
• Construct a graph G such that φ is satisfiable iff G has a clique of size k.
• The graph G is constructed as follows
• For each clause C_r = (l_1^r ∨ l_2^r ∨ l_3^r) in φ, we place a triple of vertices v_1^r, v_2^r and v_3^r into V.
• Put an edge between v_i^r and v_j^s if the following two conditions hold
o v_i^r and v_j^s are in different triples (that is, r ≠ s)
o l_i^r is not the negation of l_j^s.
o The graph G equivalent to φ is as follows
[Figure: graph G constructed from an example formula φ with three clauses]
o If G has a clique of size k, then φ has a satisfying assignment. Here k = 3.
o G can easily be constructed from φ in polynomial time.
o So the CLIQUE problem is NP Hard.
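The construction of G from φ can be sketched as below. The encoding is an assumption made for illustration: variable x_i is the integer i and its negation is -i, so a clause is a tuple of three nonzero integers. The brute-force `has_clique` helper is exponential and is included only to check tiny instances; it is not part of the reduction.

```python
from itertools import combinations

def cnf_to_clique_graph(clauses):
    """Build the CLIQUE instance for a 3-CNF formula.

    A vertex (r, i) stands for literal i of clause r.
    """
    vertices = [(r, i) for r, clause in enumerate(clauses)
                for i in range(len(clause))]
    edges = set()
    for (r, i), (s, j) in combinations(vertices, 2):
        # Edge iff different triples and the literals are not complementary
        if r != s and clauses[r][i] != -clauses[s][j]:
            edges.add(((r, i), (s, j)))
    return vertices, edges

def has_clique(vertices, edges, k):
    """Exponential brute-force check, for small examples only."""
    edge_set = {frozenset(e) for e in edges}
    return any(all(frozenset(p) in edge_set for p in combinations(sub, 2))
               for sub in combinations(vertices, k))

# Hypothetical formula: (x1 v ~x2 v ~x3) ^ (~x1 v x2 v x3) ^ (x1 v x2 v x3)
phi = [(1, -2, -3), (-1, 2, 3), (1, 2, 3)]
V, E = cnf_to_clique_graph(phi)
print(has_clique(V, E, len(phi)))
```

The graph has 3k vertices and at most (3k)^2 candidate edges, so building it takes polynomial time, as the proof requires.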
• Conclusion
o CLIQUE problem is NP and NP Hard. So it is NP-Complete
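• VERTEX COVER problem is NP Complete: Proof
• A vertex cover of an undirected graph G = (V, E) is a subset V' ⊆ V such that every edge in E has at least one endpoint in V'. The VERTEX COVER problem asks whether G has a vertex cover of size k.
o Step 1: Write a polynomial time verification algorithm to prove that the given problem is in NP
• Inputs: <G, k, V'>
• Verifier Algorithm:
1. count = 0
2. for each vertex v in V', remove all edges incident to v from set E
a. increment count by 1
3. if count = k and E is empty then the given solution is correct
4. else the given solution is wrong
• This algorithm executes in polynomial time. Therefore the VERTEX COVER problem is an NP problem.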
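A sketch of the VERTEX COVER verifier steps above, assuming the graph is passed as an edge list. The function name and example edges are hypothetical.

```python
def verify_vertex_cover(edges, k, cover):
    """Check that `cover` has k vertices and touches every edge."""
    remaining = {frozenset(e) for e in edges}
    count = 0
    for v in set(cover):
        # Remove all edges incident to v
        remaining = {e for e in remaining if v not in e}
        count += 1
    # Accept iff exactly k vertices were used and no edge is left uncovered
    return count == k and not remaining

# Hypothetical example graph
edges = [(1, 2), (1, 3), (2, 4)]
print(verify_vertex_cover(edges, 2, [1, 2]))
```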
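o Step 2: Write a polynomial time reduction algorithm from the CLIQUE problem to the VERTEX COVER problem (CLIQUE ≤p VERTEX COVER)
• G = (V, E) has a clique of size k iff its complement graph G' has a vertex cover of size |V| - k.
[Figure: example graph G and its complement G']
• The vertex cover of G' is {1, 2}.
• The size of the vertex cover of G' is 2.
• Hence G has a clique of size |V| - 2 = 5.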
• This reduction algorithm (CLIQUE to VERTEX COVER) is a polynomial time algorithm.
• So the VERTEX COVER problem is NP Hard.
• Conclusion
o VERTEX COVER problem is NP and NP-Hard.
o So it is NP-Complete.
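The CLIQUE to VERTEX COVER reduction is usually done through the complement graph: G has a clique of size k exactly when the complement of G has a vertex cover of size |V| - k. The sketch below assumes that construction. The 7-vertex graph is a made-up example and not the figure from the notes.

```python
from itertools import combinations

def complement_edges(vertices, edges):
    """Edges of the complement graph: all vertex pairs that are not edges of G."""
    present = {frozenset(e) for e in edges}
    return [(u, v) for u, v in combinations(vertices, 2)
            if frozenset((u, v)) not in present]

def clique_to_vertex_cover(vertices, edges, k):
    """Map a CLIQUE instance (G, k) to a VERTEX COVER instance (G', |V| - k)."""
    return vertices, complement_edges(vertices, edges), len(vertices) - k

# Hypothetical graph: vertices 3..7 form a 5-clique, 1 and 2 hang off it
V = list(range(1, 8))
E = list(combinations([3, 4, 5, 6, 7], 2)) + [(1, 3), (2, 4)]
_, comp, cover_size = clique_to_vertex_cover(V, E, 5)
print(cover_size, all(1 in e or 2 in e for e in comp))
```

In this example every complement edge touches vertex 1 or 2, so {1, 2} is a vertex cover of size 7 - 5 = 2. Building the complement takes O(|V|^2) time.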
o Examples
1. Consider the following algorithm to determine whether or not an undirected graph has a clique of size k. First, generate all subsets of the vertices containing exactly k vertices. Next, check whether any of the sub-graphs induced by these subsets is complete (i.e., forms a clique). Why is this not a polynomial-time algorithm for the clique problem, thereby implying that P = NP?
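One way to approach the question is to count the subsets: there are C(n, k) of them, and k is part of the input rather than a constant. The small sketch below (the function name is an assumption) shows how quickly the count grows when k is about n/2.

```python
from math import comb

def subsets_to_check(n, k):
    """Number of k-vertex subsets the brute-force clique algorithm examines."""
    return comb(n, k)

# With k = n // 2 the count grows exponentially in n
for n in (10, 20, 40, 80):
    print(n, subsets_to_check(n, n // 2))
```

For a fixed constant k the count is O(n^k), which is polynomial, but for k growing with n it is not, so the algorithm does not show that CLIQUE is in P.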
• Approximation Algorithm
o Approximate Solution: A feasible solution with value close to the value of the optimal solution is called an approximate solution.
o Approximation Algorithms: An algorithm that returns a near optimal solution is called an Approximation Algorithm.
o Approximation algorithms have two main properties:
• They run in polynomial time
• They produce solutions close to the optimal solutions
o Approximation algorithms are useful to give approximate solutions to NP complete optimization problems.
o They are also useful to give fast approximations to problems that run in polynomial time.
o Approximation Ratio / Approximation Factor
• For a given problem, C is the result obtained by the algorithm and C* is the optimal result.
• The approximation ratio of an algorithm is the ratio between the result obtained by the algorithm and the optimal result.
• For a maximization problem, 0 < C ≤ C*, Approximation Ratio = C*/C
• For a minimization problem, 0 < C* ≤ C, Approximation Ratio = C/C*
• The approximation ratio of an approximation algorithm is never less than 1.
• Approximation ratio and computational time are inversely related: a smaller ratio generally costs more running time.
• Approximation ratio and quality of the result are also inversely related: the closer the ratio is to 1, the better the result.
o k-Approximation Algorithm: An algorithm with approximation ratio k is called a k-approximation algorithm.
• A 1-approximation algorithm produces an optimal solution.
• An approximation algorithm with a large approximation ratio may return a solution that is much worse than optimal.
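The two ratio formulas can be captured in one small helper. This is an illustrative sketch; the function name and the sample values are assumptions.

```python
def approximation_ratio(c, c_opt, maximize=False):
    """Approximation ratio for an algorithm result c and optimal result c_opt.

    Minimization: C / C*.  Maximization: C* / C.  Both are >= 1.
    """
    if c <= 0 or c_opt <= 0:
        raise ValueError("results must be positive")
    return c_opt / c if maximize else c / c_opt

# Hypothetical values: a minimization heuristic returns 6 when the optimum is 4
print(approximation_ratio(6, 4))                  # 1.5
# A maximization heuristic returns 8 when the optimum is 10
print(approximation_ratio(8, 10, maximize=True))  # 1.25
```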
o Different Types of Approximation Algorithms
• Absolute Approximation Algorithm: An algorithm is an Absolute Approximation Algorithm iff |C* - C| ≤ k, for some constant k
• f(n)-Approximation Algorithm: An algorithm is an f(n)-Approximation Algorithm iff |C* - C| / |C*| ≤ f(n), for C* > 0
• ε-Approximation Algorithm: An ε-Approximation Algorithm is an f(n)-Approximation Algorithm for which f(n) ≤ ε for some constant ε
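o Bin Packing
• Given n items of different sizes and bins of a fixed capacity, pack all the items into the minimum number of bins.
• Applications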
• Loading of containers like trucks.
• Placing data on multiple disks.
• Job scheduling.
• Packing advertisements in fixed length radio/TV station breaks.
• Storing a large collection of music onto tapes/CD's, etc.
• Example: Apply different Bin packing approximation algorithms on the following items with bin capacity = 10. Assume the sizes of the items are {5, 7, 5, 2, 4, 2, 5, 1, 6}.
• Solution
• Minimum number of bins >= Ceil((Total Weight) / (Bin Capacity)) = Ceil(37 / 10) = 4
• Next Fit
[Figure: step-by-step packing of {5, 7, 5, 2, 4, 2, 5, 1, 6} into bins of capacity 10]
Final bins: {5}, {7}, {5, 2}, {4, 2}, {5, 1}, {6}
Number of bins required = 6
• First Fit
[Figure: step-by-step packing of {5, 7, 5, 2, 4, 2, 5, 1, 6} into bins of capacity 10]
Final bins: {5, 5}, {7, 2, 1}, {4, 2}, {5}, {6}
Number of bins required = 5
• Best Fit
[Figure: step-by-step packing of {5, 7, 5, 2, 4, 2, 5, 1, 6} into bins of capacity 10]
Final bins: {5, 5}, {7, 2, 1}, {4, 2}, {5}, {6}
Number of bins required = 5
• Worst Fit
[Figure: step-by-step packing of {5, 7, 5, 2, 4, 2, 5, 1, 6} into bins of capacity 10]
Final bins: {5, 5}, {7, 2}, {4, 2}, {5, 1}, {6}
Number of bins required = 5
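The four online heuristics named above can be sketched as follows, using their usual definitions, which are assumed here: Next Fit only tries the most recently opened bin, First Fit takes the first bin with room, Best Fit takes the fullest bin with room, and Worst Fit takes the emptiest bin with room. All function names are illustrative.

```python
def next_fit(items, capacity):
    bins = []
    for x in items:
        # Only the last opened bin is considered
        if bins and sum(bins[-1]) + x <= capacity:
            bins[-1].append(x)
        else:
            bins.append([x])
    return bins

def _fit(items, capacity, choose):
    """Shared loop: `choose` picks one bin among those with enough room."""
    bins = []
    for x in items:
        fits = [b for b in bins if sum(b) + x <= capacity]
        if fits:
            choose(fits).append(x)
        else:
            bins.append([x])
    return bins

def first_fit(items, capacity):
    return _fit(items, capacity, lambda fits: fits[0])

def best_fit(items, capacity):
    # Fullest bin that still has room (least space left over)
    return _fit(items, capacity, lambda fits: max(fits, key=sum))

def worst_fit(items, capacity):
    # Emptiest bin that has room (most space left over)
    return _fit(items, capacity, lambda fits: min(fits, key=sum))

items = [5, 7, 5, 2, 4, 2, 5, 1, 6]
for algo in (next_fit, first_fit, best_fit, worst_fit):
    result = algo(items, 10)
    print(algo.__name__, len(result), result)
```

On the example items this gives 6 bins for Next Fit and 5 bins for the other three, against the lower bound of 4.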
• First Fit Decreasing and Best Fit Decreasing
Items sorted in decreasing order: {7, 6, 5, 5, 5, 4, 2, 2, 1}
[Figure: step-by-step packing of the sorted items into bins of capacity 10, shown once for each variant]
Final bins (both variants): {7, 2, 1}, {6, 4}, {5, 5}, {5, 2}
Number of bins required = 4
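Sorting the items in decreasing order before running First Fit is commonly called First Fit Decreasing. The sketch below assumes that this is one of the variants traced on the sorted list {7, 6, 5, 5, 5, 4, 2, 2, 1}; the function name is illustrative.

```python
def first_fit_decreasing(items, capacity):
    """Sort items largest first, then place each in the first bin with room."""
    bins = []
    for x in sorted(items, reverse=True):
        for b in bins:
            if sum(b) + x <= capacity:
                b.append(x)
                break
        else:
            # No existing bin had room, so open a new one
            bins.append([x])
    return bins

packed = first_fit_decreasing([5, 7, 5, 2, 4, 2, 5, 1, 6], 10)
print(len(packed), packed)
```

On this input the result uses 4 bins, which matches the lower bound Ceil(37 / 10) = 4, so it is optimal here.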
o Graph Coloring
• Different Graph coloring problems
• Vertex coloring
• Edge coloring
• Face coloring
• Vertex coloring
• Assignment of colors to vertices in a graph such that no two adjacent vertices share the same color
• A graph is 0-colorable iff V = ∅
• A graph is 1-colorable iff E = ∅
[Figure: a 2-colorable graph with vertices colored Red and Blue, χ(G) = 2]
[Figure: a 3-colorable graph with colors including Blue and Green, χ(G) = 3]
• Chromatic Number: It is the minimum number of colours with which a graph can be
coloured.
o χ(G) = 1 if G is a null graph. A null graph is a graph that contains vertices but no edges.
o For all other graphs, χ(G) >= 2.
o Four Color Theorem: For every planar graph, the chromatic number is less than or equal to 4.
• A graph is k-colorable if its vertices can be properly colored using at most k colors.
• A graph whose chromatic number is k is called a k-chromatic graph.
• A subset of vertices assigned the same color is called a color class.
• Edge coloring
• Given a graph G = (V, E), assign a color to each edge so that no two adjacent edges share the same color
[Figure: an edge-colored graph using Red and Blue]
• Face Coloring
• For a planar graph, assign a color to each face/region so that no two faces that share a boundary have the same color.
[Figure: a face-colored planar graph]
Algorithm Approximate_Graph_Coloring(G, n)
{
for i = 1 to n do
{
for c = 1 to n do
{
If no vertex adjacent to v_i has color c
{
Color v_i with c
Break
}
}
}
}
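The greedy coloring pseudocode above translates to Python roughly as follows. The adjacency-dictionary input and the function name are assumptions made for this sketch; colors are the integers 1..n.

```python
def greedy_coloring(adj):
    """Give each vertex the smallest color not used by its colored neighbours.

    adj: dict mapping each vertex to a list of adjacent vertices.
    """
    n = len(adj)
    color = {}
    for v in adj:                          # for i = 1 to n
        used = {color[u] for u in adj[v] if u in color}
        for c in range(1, n + 1):          # for c = 1 to n
            if c not in used:
                color[v] = c
                break
    return color

# Hypothetical graph: a triangle 1-2-3 with a pendant vertex 4 attached to 3
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
print(greedy_coloring(adj))
```

The result is always a proper coloring, but the number of colors depends on the vertex order, so it is an approximation of the chromatic number rather than the exact value.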
• Randomized Algorithm
o Deterministic Algorithm: The output as well as the running time are functions of the input only.
[Figure: Input → Algorithm → Output]
o Randomized Algorithm: The output or the running time are functions of the input and the random bits chosen.
[Figure: Input and Random bits → Algorithm → Output]
• An algorithm that uses random numbers to decide what to do next anywhere in its logic is called a Randomized Algorithm.
• Typically, this randomness is used to reduce time complexity or space complexity in other standard algorithms.
• The computer is not capable of generating truly random numbers
• The computer can only generate pseudorandom numbers, that is, numbers that are generated by a formula
• Pseudorandom numbers look random, but are perfectly predictable if you know the formula
• Ordinary pseudorandom numbers are not suitable for security applications
• Devices for generating truly random numbers do exist. They are based on radioactive decay, or on lava lamps
• A randomized algorithm hopes to achieve good performance in the "average case" over all possible choices of random bits.
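• Las Vegas algorithm example: find an 'a' in an array A of n elements (the analysis assumes half of the elements are 'a')
Algorithm findingA_LV(A, n)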
{
repeat
{
Randomly choose one element out of n elements
}until('a' is found)
}
• This algorithm succeeds with probability 1. The number of iterations varies and can be arbitrarily large, but the expected number of iterations is
$\lim_{n \to \infty} \sum_{i=1}^{n} \frac{i}{2^{i}} = 2$
• The expected number of trials before success is 2.
• Therefore the expected time complexity = O(1)
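A sketch of the Las Vegas search, assuming (as the expected value of 2 suggests) that half of the array elements are 'a'. The function name, the `rng` parameter and the test array are illustrative assumptions.

```python
import random

def find_a_las_vegas(arr, rng=random):
    """Keep sampling random positions until an 'a' is found.

    Always returns a correct index; only the number of trials is random.
    Assumes arr contains at least one 'a', otherwise it never terminates.
    """
    trials = 0
    while True:
        trials += 1
        i = rng.randrange(len(arr))
        if arr[i] == "a":
            return i, trials

# Hypothetical input: half of the 100 elements are 'a'
arr = ["a", "b"] * 50
rng = random.Random(0)
runs = [find_a_las_vegas(arr, rng)[1] for _ in range(10000)]
print(sum(runs) / len(runs))   # close to the expected value 2
```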
• Monte Carlo algorithm example: the same search, limited to at most k trials
Algorithm findingA_MC(A, n, k)
{
i = 0;
repeat
{
Randomly select one element out of n elements
i = i + 1;
}until(i = k or 'a' is found);
}
• If an 'a' is found, the algorithm succeeds, else the algorithm fails. After k iterations, the probability of finding an 'a' is Pr[find a] = 1 - (1/2)^k
• This algorithm does not guarantee success, but the run time is bounded. The number of iterations is always less than or equal to k.
• Therefore the time complexity = O(k)
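The Monte Carlo version can be sketched the same way, again assuming half of the elements are 'a'. The names `find_a_monte_carlo` and `success_probability` are illustrative.

```python
import random

def find_a_monte_carlo(arr, k, rng=random):
    """Try at most k random positions; return an index of 'a' or None on failure."""
    for _ in range(k):
        i = rng.randrange(len(arr))
        if arr[i] == "a":
            return i
    return None   # bounded running time, but the answer may be missing

def success_probability(k):
    """Pr[find a] = 1 - (1/2)^k when half of the elements are 'a'."""
    return 1 - 0.5 ** k

for k in (1, 5, 10):
    print(k, success_probability(k))
```

The loop runs at most k times regardless of the input, which is the O(k) bound, while the failure probability (1/2)^k shrinks exponentially in k.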
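• How many times does the while loop run before finding a central pivot?
o A central pivot divides the array so that each side has at least n/4 elements. About n/2 of the n elements are central pivots, so the probability that a randomly chosen element is a central pivot is 1/2.
o Therefore, the expected number of times the while loop runs is 2.
o Each iteration takes O(n) time to count the elements smaller and greater than the chosen pivot. Thus, the expected time complexity of step 2 is O(n).
• What is the overall time complexity in the worst case?
o In the worst case, each partition divides the array such that one side has n/4 elements and the other side has 3n/4 elements. The worst case height of the recursion tree is log_{4/3} n, which is O(log n).
o T(n) < T(n/4) + T(3n/4) + O(n)
o Each level of the recursion tree does O(n) work, so the solution of the above recurrence is O(n log n).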
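The analysis above refers to a randomized Quick Sort that keeps picking random pivots until it finds a central one. A minimal sketch is given below, assuming that "central" means neither side of the partition holds more than 3n/4 elements. The function name and the list-based (not in-place) partition are choices made for this illustration.

```python
import random

def randomized_quicksort(a, rng=random):
    """Quick Sort that retries random pivots until the split is balanced."""
    n = len(a)
    if n <= 1:
        return list(a)
    while True:
        pivot = rng.choice(a)
        less = [x for x in a if x < pivot]
        equal = [x for x in a if x == pivot]
        greater = [x for x in a if x > pivot]
        # Central pivot: neither side is larger than 3n/4.
        # A median element always passes, so the loop terminates.
        if len(less) <= 3 * n / 4 and len(greater) <= 3 * n / 4:
            break
    return (randomized_quicksort(less, rng) + equal
            + randomized_quicksort(greater, rng))

data = [5, 3, 8, 1, 9, 2, 7, 3]
print(randomized_quicksort(data, random.Random(1)))
```

Each retry costs O(n) for the partition, and the expected number of retries is constant, which gives the O(n) expected cost per level used in the recurrence.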
o Advantage:
• For many problems, a randomized algorithm is the simplest and the fastest
• Good approximate solutions to many NP-hard/NP-Complete problems can be obtained easily with randomized algorithms