Advanced Algorithms Course. Lecture Notes. Part 4: Using Linear Programming For Approximation Algorithms
over, wLP ≤ w(S*) implies w(S) ≤ 2w(S*), since by rounding we have at
most doubled the variable values from the LP relaxation. This gives us yet
another algorithm with approximation ratio 2. We know already simpler
2-approximation algorithms for Weighted Vertex Cover, but this was only
an example to demonstrate the technique of LP relaxation and rounding.
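To make the technique concrete, here is a minimal sketch of LP relaxation plus rounding for Weighted Vertex Cover, using scipy.optimize.linprog; the function names and the tiny example graph are only illustrative, not part of the lecture.

    import numpy as np
    from scipy.optimize import linprog

    def lp_vertex_cover(n, edges, weights):
        # LP relaxation: minimize sum_i w_i x_i subject to
        # x_u + x_v >= 1 for every edge (u, v) and 0 <= x_i <= 1.
        # linprog expects A_ub @ x <= b_ub, so we encode the edge
        # constraint as -x_u - x_v <= -1.
        A_ub = np.zeros((len(edges), n))
        for k, (u, v) in enumerate(edges):
            A_ub[k, u] = A_ub[k, v] = -1.0
        b_ub = -np.ones(len(edges))
        res = linprog(weights, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * n)
        return res.x  # optimal fractional solution

    def rounded_cover(x):
        # Rounding: take every vertex with x_i >= 1/2. Every constraint
        # x_u + x_v >= 1 forces max(x_u, x_v) >= 1/2, so this is a
        # vertex cover, and its weight is at most twice the LP value.
        return {i for i, xi in enumerate(x) if xi >= 0.5}

    # Tiny example: path 0-1-2 with unit weights; {1} is an optimal cover.
    x = lp_vertex_cover(3, [(0, 1), (1, 2)], np.ones(3))
    print(rounded_cover(x))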
The following part is not in the course book.
The LP formulation of Vertex Cover (the unweighted case where all
wi = 1) is also useful in another respect: Let P, Q, R be the sets of nodes i with xi > 0.5, xi = 0.5, and xi < 0.5, respectively, in the optimal LP solution. Clearly, the nodes in R have all their neighbors in P. Let S* be some minimum size vertex cover, and define A := P \ S*, B := R ∩ S*. If |A| > |B| then we decrease all xi, i ∈ A, and increase all xi, i ∈ B, by the same amount ε > 0. When ε is chosen small enough, no constraint xi + xj ≥ 1 on an edge will be violated by this change. (To see this, just inspect the different cases of edges.) On the other hand, since |A| > |B|, this change yields a better LP solution, contradicting its optimality. Thus we have |A| ≤ |B|. Observe that (S* \ B) ∪ A is also a vertex cover, and since |A| ≤ |B|, this vertex cover is no larger than S*. This shows that P ∪ Q contains some minimum vertex cover entirely. Obviously we get P ∪ Q from the LP solution by rounding, thus we also know |P ∪ Q| ≤ 2w(S*).
What is the point? The LP-plus-rounding solution is not only some approximate vertex cover, but it also contains some optimal vertex cover as a subset. (This is not a trivial consequence; we needed this tricky proof to show it!) In other words, if we are interested in an optimal solution, it suffices to further consider only the vertices in P ∪ Q as candidates. This is particularly interesting if some minimum vertex cover S* is much smaller than the graph. Then we may even afford exhaustive search on the small candidate set P ∪ Q. Since one can forget about the vertices outside P ∪ Q, this set is called a problem kernel, and the computation described above is known as the Nemhauser-Trotter kernelization for Vertex Cover. In general (for any set minimization problem), such an approximate solution that also contains some optimal solution is called a safe approximation.
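As a sketch (assuming an optimal fractional solution x is available, e.g. from the hypothetical lp_vertex_cover above), computing the kernel amounts to comparing each xi with 0.5; the tolerance eps is only there to absorb floating-point noise from an LP solver, not part of the method.

    def nt_kernel(x, eps=1e-9):
        # Partition the vertices by their fractional LP values.
        P = {i for i, xi in enumerate(x) if xi > 0.5 + eps}
        Q = {i for i, xi in enumerate(x) if abs(xi - 0.5) <= eps}
        R = {i for i, xi in enumerate(x) if xi < 0.5 - eps}
        # P ∪ Q contains some minimum vertex cover entirely, so an
        # exhaustive search may ignore all vertices in R.
        return P, Q, R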
An algorithm is called an approximation algorithm if it runs in polynomial time and gives a solution close to optimum.
The approximation ratio is the ratio of the values of the output and of
an optimal solution, minimized or maximized (depending on what type of
problem we have) over all instances. It can be analyzed by relating simple
upper and lower bounds on the values of solutions. Some approaches to the
design of approximation algorithms are greedy rules, solving dual problems (pricing methods), and LP relaxation followed by rounding; there are many more techniques.
All NP-complete decision problems are equally hard up to polynomial factors in their time complexities, but they can behave very differently as optimization problems. Even different optimization criteria for the
same problem can lead to different complexities. Some problems are approx-
imable within a constant factor, or within a factor that mildly grows with
some input parameters, and some can be solved with arbitrary accuracy in
polynomial time. In the latter case we speak of polynomial-time approxima-
tion schemes. One should also notice that the proved approximation ratios
are only worst-case results. The quality of solutions to specific instances is
often much better. On the other hand, there exist problems for which we
cannot even find any good approximation in polynomial time. One example
is finding maximum cliques in graphs. However, such hardness-of-approximation results require much deeper proof methods than in the theory of NP-completeness. We cannot touch on this topic in the course.
If you encounter a problem and wonder how well it might be solvable
approximately: There is material on the Web, e.g., A compendium of NP
optimization problems edited by Crescenzi and Kann.
Now we leave this topic and look at the polynomial side of life...
and f+(v) := Σ_{e=(v,u)∈E} f(e). (As a mnemonic aid: f−(v) is consumed by node v, and f+(v) is generated by node v.) The value of the flow f is defined as v(f) := f+(s). The Maximum Flow problem is to compute a flow with
maximum value.
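As a small illustration of these definitions (a sketch; it assumes edges are tuples (u, v), and f and cap are dicts keyed by edge):

    def is_flow(n, edges, cap, f, s, t):
        # Capacity constraints: 0 <= f(e) <= c_e on every edge.
        if any(not (0 <= f[e] <= cap[e]) for e in edges):
            return False
        # Conservation: f-(v) = f+(v) at every node except s and t.
        for v in range(n):
            if v in (s, t):
                continue
            f_in = sum(f[e] for e in edges if e[1] == v)
            f_out = sum(f[e] for e in edges if e[0] == v)
            if f_in != f_out:
                return False
        return True

    def flow_value(edges, f, s):
        # v(f) = f+(s) - f-(s); the second sum is 0 if no edge enters s.
        return (sum(f[e] for e in edges if e[0] == s)
                - sum(f[e] for e in edges if e[1] == s))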
The problem has obvious applications. Imagine that we want to continuously ship goods or send information in a network, from s to t. Nothing gets lost on the way, but the bandwidths of the links are limited. The maximum flow is the flow that achieves the highest possible throughput. The problem can be formulated as an LP; however, we can solve it more efficiently than a generic LP solver could.
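For illustration, the LP formulation can be written down directly; the following sketch (with illustrative names, again using scipy) maximizes the net flow out of s subject to capacity and conservation constraints.

    import numpy as np
    from scipy.optimize import linprog

    def max_flow_lp(n, edges, cap, s, t):
        # One variable f_e per edge e, with bounds 0 <= f_e <= c_e.
        m = len(edges)
        # Maximize f+(s) - f-(s), i.e. minimize its negative.
        c = np.zeros(m)
        for k, (u, v) in enumerate(edges):
            if u == s: c[k] -= 1.0
            if v == s: c[k] += 1.0
        # Conservation: f-(w) = f+(w) at every node w except s and t.
        A_eq, b_eq = [], []
        for w in range(n):
            if w in (s, t):
                continue
            row = np.zeros(m)
            for k, (u, v) in enumerate(edges):
                if v == w: row[k] += 1.0
                if u == w: row[k] -= 1.0
            A_eq.append(row)
            b_eq.append(0.0)
        res = linprog(c, A_eq=np.array(A_eq) if A_eq else None,
                      b_eq=np.array(b_eq) if b_eq else None,
                      bounds=[(0, cap[k]) for k in range(m)])
        return -res.fun, res.x  # flow value, flow on each edge

    # Example: s=0, t=3; the maximum flow value here is 5.
    val, flows = max_flow_lp(4, [(0, 1), (0, 2), (1, 3), (2, 3), (1, 2)],
                             [3, 2, 2, 3, 1], 0, 3)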
For an algorithm we need some more preparations. If there is already
some flow f in G, we define the residual graph Gf as follows. Gf has the same nodes as G. For every edge e of G with f(e) < ce, Gf has the same edge with capacity ce − f(e), called a forward edge. The difference is obviously the remaining capacity available on e. For every edge e of G with f(e) > 0, Gf has the opposite edge with capacity f(e), called a backward edge. The meaning of backward edges is less obvious: We can undo any amount of flow up to f(e) on e by sending it back in the opposite direction. We call ce − f(e) on forward edges and f(e) on backward edges the residual capacity.
Now let P be any simple directed s-t path in Gf, and let b be the smallest residual capacity of all edges in P. For every forward edge e in P, we may increase f(e) in G by b, and for every backward edge e in P, we may decrease f(e) in G by b. It is not hard to check that the resulting function f′ on the edges is still a flow in G. We call f′ an augmented flow, obtained by these changes. Note that v(f′) = v(f) + b > v(f).
We are ready to state the famous Ford-Fulkerson algorithm: Initially let f := 0. As long as a directed s-t path in Gf exists, augment the flow f (as described above).
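Here is a runnable sketch of the algorithm (names illustrative). It keeps the residual capacities in an n × n matrix, and, as an assumption beyond the lecture text, picks the augmenting path by BFS, i.e. the Edmonds-Karp variant; a plain DFS would match the algorithm exactly as stated.

    from collections import deque

    def ford_fulkerson(cap, s, t):
        # cap[u][v] = capacity of edge (u, v), 0 if absent.
        n = len(cap)
        residual = [row[:] for row in cap]  # residual capacities, updated in place
        value = 0
        while True:
            # BFS for a directed s-t path in the residual graph Gf.
            parent = [-1] * n
            parent[s] = s
            queue = deque([s])
            while queue and parent[t] == -1:
                u = queue.popleft()
                for v in range(n):
                    if residual[u][v] > 0 and parent[v] == -1:
                        parent[v] = u
                        queue.append(v)
            if parent[t] == -1:
                break  # no augmenting path left: f is maximum
            # b = smallest residual capacity along the path found.
            b, v = float("inf"), t
            while v != s:
                u = parent[v]
                b = min(b, residual[u][v])
                v = u
            # Augment: push b along the path, opening b of backward capacity.
            v = t
            while v != s:
                u = parent[v]
                residual[u][v] -= b
                residual[v][u] += b
                v = u
            value += b
        return value, residual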
But does the Ford-Fulkerson algorithm output a maximum flow as de-
sired? You should know from basic courses that greedy algorithms (like this)
in general fail to find an optimal solution. This is not the case here, but we
have to prove that. The assertion is precisely: If no s-t path in Gf exists, then f is an optimal flow. The proof is not only a duty; it will also give us
useful structural insights.
Maximum Flow versus Minimum Cut
Using the earlier notations, we define an s-t cut in G = (V, E) as a partition of V into sets A, B with s ∈ A, t ∈ B. The capacity of a cut is defined as c(A, B) := Σ_{e=(u,v): u∈A, v∈B} ce. Note that only those directed edges going from A to B count for the capacity.
For subsets S ⊆ V we define f+(S) := Σ_{e=(u,v): u∈S, v∉S} f(e) and f−(S) := Σ_{e=(u,v): u∉S, v∈S} f(e). Remember that v(f) = f+(s) − f−(s) by definition. (Actually we have f−(s) = 0 if no edge goes into s.) We can generalize this equation to any cut: v(f) = Σ_{u∈A} (f+(u) − f−(u)), which follows easily from the conservation constraints. When we rewrite the last expression for v(f) as a sum of flows on edges, then, for edges e with both nodes in A, the terms +f(e) and −f(e) cancel out in the sum. It remains v(f) = f+(A) − f−(A). It follows v(f) ≤ f+(A) = Σ_{e=(u,v): u∈A, v∉A} f(e) ≤ Σ_{e=(u,v): u∈A, v∉A} ce = c(A, B).
In words: The flow value v(f ) is bounded by the capacity of any cut (which
is also intuitive).
Next we show that, for the flow f returned by Ford-Fulkerson, there
exists a cut with v(f ) = c(A, B). This implies that the algorithm in fact
computes a maximum flow!
Clearly, when the Ford-Fulkerson algorithm stops, no directed s-t path exists in Gf. Now we specify a cut as desired: Let A be the set of nodes v such that some directed s-v path exists in Gf, and B = V \ A. Since s ∈ A and t ∈ B, this is actually a cut. For every edge e = (u, v) with u ∈ A, v ∈ B we have f(e) = ce (otherwise v would be in A). For every edge e = (u, v) with u ∈ B, v ∈ A we have f(e) = 0 (otherwise u would be in A, because of the backward edge (v, u) in Gf). Altogether we obtain v(f) = f+(A) − f−(A) = f+(A) = c(A, B). In words: The flow value v(f) equals the capacity of a minimum cut (which is still intuitive).
The last statement is the famous Max-Flow Min-Cut Theorem.
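Continuing the Ford-Fulkerson sketch from above, the proof is constructive: the side A of a minimum cut consists exactly of the nodes still reachable from s in the final residual graph.

    def min_cut(cap, s, t):
        value, residual = ford_fulkerson(cap, s, t)
        n = len(cap)
        A, stack = {s}, [s]
        while stack:  # collect nodes reachable from s in Gf
            u = stack.pop()
            for v in range(n):
                if residual[u][v] > 0 and v not in A:
                    A.add(v)
                    stack.append(v)
        B = set(range(n)) - A
        # Max-Flow Min-Cut Theorem: v(f) = c(A, B) (exact for integer capacities).
        assert value == sum(cap[u][v] for u in A for v in B)
        return value, A, B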