lgo
Lecture notes
Ulrich Brenner
Research Institute for Discrete Mathematics, University of Bonn
Summer term 2023
April 9, 2024
Preface
Continuous updates of these lecture notes can be found on the webpage of the lecture course:
http://www.or.uni-bonn.de/lectures/ss22/lgo_ss22.html
These lecture notes are based on a number of textbooks and lecture notes from earlier courses.
See e.g. the lecture notes by Tim Nieberg (winter term 2012/2013) and Stephan Held (winter
term 2013/2014 and 2017/18) that are available online on the teaching web pages of the Research
Institute for Discrete Mathematics, University of Bonn (http://www.or.uni-bonn.de/lectures).
Recommended textbooks:
• Chvátal [1983]: Still a good introduction into the field of linear programming.
• Korte and Vygen [2018]: Chapters 3–5 contain the most important results of this lecture
course. Very compact description.
• Matoušek and Gärtner [2007]: Very good description of the linear programming part. For
some results, proofs are missing, and the book does not consider integer programming.
• Schrijver [1986]: Comprehensive textbook covering both linear and integer programming.
Proofs are short but precise.
Prerequisites of this course are the lectures “Algorithmische Mathematik I” and “Lineare
Algebra I/II”. The lecture “Algorithmische Mathematik I” is covered by the textbook by
Hougardy and Vygen [2018]. The results concerning Linear Algebra that are used in this course
can be found, e.g., in the textbooks by Anthony and Harvey [2012], Bosch [2007], and Fischer
[2009].
We also make use of some basic results of complexity theory as they are taught in the
lecture course “Einführung in die Diskrete Mathematik”. These results on complexity theory
can be found e.g. in Chapter 15 of the textbook by Korte and Vygen [2018].
The notation concerning graphs is based on the notation proposed in the textbook by Korte
and Vygen [2018].
Please report any errors in these lecture notes to brenner@or.uni-bonn.de
Contents
1 Introduction 5
1.1 A First Example 5
1.2 Optimization Problems 6
1.3 Possible Outcomes 8
1.4 Integrality Constraints 8
1.5 Modeling of Optimization Problems as (Integral) Linear Programs 9
1.6 Polyhedra 12
2 Duality 17
2.1 Dual LPs 17
2.2 Fourier-Motzkin Elimination 18
2.3 Farkas' Lemma 21
2.4 Strong Duality 24
2.5 Complementary Slackness 27
4 Simplex Algorithm 45
4.1 Feasible Basic Solutions 46
4.2 The Simplex Method 48
4.3 Efficiency of the Simplex Algorithm 57
4.4 Dual Simplex Algorithm 58
4.5 Network Simplex 59
5 Sizes of Solutions 65
5.1 Gaussian Elimination 67
6 Ellipsoid Method 69
6.1 Idealized Ellipsoid Method 69
6.2 Error Analysis 74
6.3 Ellipsoid Method for Linear Programs 78
6.4 Separation and Optimization 80
1 Introduction
1.1 A First Example

Assume that a farmer has 10 hectares of farmland where he can grow two kinds of crops: corn
and wheat (or a combination of both). For each hectare of corn he gets a revenue of 2 units of
money and for each hectare of wheat he gets 3 units of money. Planting corn in an area of one
hectare takes him 1 day while planting wheat takes him 2 days per hectare. In total, he has 16
days for the work on his field. Moreover, each hectare planted with corn needs 5 units of water
and each hectare planted with wheat needs 2 units of water. In total he has 40 units of water.
How can he maximize his revenue?
If x1 is the number of hectares planted with corn and x2 is the number of hectares planted with
wheat, we can write the corresponding optimization problem in the following compact way:

    max   2x1 + 3x2
    s.t.   x1 +  x2 ≤ 10
           x1 + 2x2 ≤ 16
          5x1 + 2x2 ≤ 40
           x1, x2 ≥ 0

This is what we call a linear program (LP). In such an LP, we are given a linear objective
function (in our case (x1 , x2 ) 7→ 2x1 + 3x2 ) that has to be maximized or minimized under a
number of linear constraints. These constraints can be given by linear inequalities (but not
strict inequalities “<”) or by linear equations. However, a linear equation can easily be replaced
by a pair of inequalities (e.g. 4x1 + 3x2 = 7 is equivalent to 4x1 + 3x2 ≤ 7 and 4x1 + 3x2 ≥ 7),
so we may assume that all constraints are given by linear inequalities.
In our example, there were only two variables, x1 and x2 . In this case, linear programs can be
solved graphically. Figure 1 illustrates the method. The grey area is the set

    {(x1, x2) ∈ R² | x1, x2 ≥ 0, x1 + x2 ≤ 10, x1 + 2x2 ≤ 16, 5x1 + 2x2 ≤ 40},

which is the set of all feasible solutions of our problem. We can solve the problem by moving
the green line, which is orthogonal to the cost vector (2, 3)t (shown in red), in the direction of (2, 3)t
as long as it intersects the feasible area. We end up with x1 = 4 and x2 = 6, which is in fact an
optimum solution.
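For readers who want to experiment, the example can also be checked numerically. The following
is a minimal sketch (assuming NumPy and SciPy are available) that solves the farmer's LP with
scipy.optimize.linprog; since linprog minimizes, the objective is negated.

import numpy as np
from scipy.optimize import linprog

# Farmer LP: max 2*x1 + 3*x2 s.t. x1 + x2 <= 10, x1 + 2*x2 <= 16, 5*x1 + 2*x2 <= 40, x >= 0.
c = np.array([2.0, 3.0])                  # objective coefficients
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [5.0, 2.0]])
b = np.array([10.0, 16.0, 40.0])

# linprog minimizes, so we maximize c^t x by minimizing -c^t x.
res = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)                    # expected: [4. 6.] and 26.0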
Figure 1: Graphical solution of the farmer's LP. The constraint lines are x1 + x2 = 10,
x1 + 2x2 = 16 and 5x1 + 2x2 = 40.
1.2 Optimization Problems

In this lecture course, we consider optimization problems with linear objective functions and
linear constraints. The constraints can be written in a compact way using matrices:
Linear Programming
Instance: A matrix A ∈ Rm×n , vectors c ∈ Rn and b ∈ Rm .
Task: Find a vector x ∈ Rn with Ax ≤ b maximizing ct x.
Notation: Unless stated differently, we always let A = (aij)i=1,...,m; j=1,...,n ∈ Rm×n, b = (b1, . . . , bm) ∈ Rm
and c = (c1, . . . , cn) ∈ Rn.
Remark: Real vectors are simply ordered sets of real numbers. But when we multiply vectors
with each other or with matrices, we have to interpret them as n × 1-matrices (column vectors)
or as 1 × n-matrices (row vectors). By default, we consider vectors as column vectors in this
context, so if we want to use them as row vectors, we have to transpose them (“ct ”).
We often write linear programs in the following way:

    max  ct x
    s.t. Ax ≤ b                                        (1)

This is the standard inequality form. The standard equation form of a linear program is:

    max  ct x
    s.t. Ax = b                                        (2)
         x ≥ 0

Both standard forms can be transformed into each other: If we are given a linear program in
standard equation form, we can replace each equation by a pair of inequalities and the constraint
x ≥ 0 by −In x ≤ 0 (where In is always the n × n identity matrix). This leads to a formulation
of the same linear program in standard inequality form.
The transformation from the standard inequality form into the standard equation form is slightly
more complicated: Assume we are given the following linear program in standard inequality
form
    max  ct x
    s.t. Ax ≤ b                                        (3)
We replace each variable xi by two variables zi and z̄i . Moreover, for each of the m constraints
we introduce a new variable x̃i (a so-called slack variable). With variables z = (z1 , . . . , zn ),
z̄ = (z̄1 , . . . , z̄n ) and x̃ = (x̃1 , . . . , x̃m ), we state the following LP in standard equation form:
    max  ct z − ct z̄
    s.t. [A | −A | Im] (z, z̄, x̃)t = b                  (4)
         z, z̄, x̃ ≥ 0
Note that [A | −A | Im] is the m × (2n + m)-matrix that we get by concatenating the matrices
A, −A and Im. Any solution z, z̄ and x̃ of the LP (4) gives a solution of the LP (3) with the
same cost by setting xj := zj − z̄j (for j ∈ {1, . . . , n}).
On the other hand, if x is a solution of LP (3), then we get a solution of LP (4) with the same
cost by setting zj := max{xj, 0}, z̄j := −min{xj, 0} (for j ∈ {1, . . . , n}) and x̃i := bi − Σ_{j=1}^n aij xj
(for i ∈ {1, . . . , m}, where Σ_{j=1}^n aij xj ≤ bi is the i-th constraint of Ax ≤ b).
Note that (in contrast to the first transformation) this second transformation (from the standard
inequality form into the standard equation form) leads to a different solution space because we
have to introduce new variables.
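This second transformation can be written down mechanically. The following is a small sketch
(assuming NumPy; the function names are ours, not part of any standard library) that builds the
matrix [A | −A | Im] and recovers x = z − z̄ from a solution of the equation form:

import numpy as np

def to_equation_form(A, b, c):
    """Turn max{c^t x | Ax <= b} into max{d^t w | Mw = b, w >= 0} with w = (z, zbar, s)."""
    m, n = A.shape
    M = np.hstack([A, -A, np.eye(m)])          # [A | -A | I_m]
    d = np.concatenate([c, -c, np.zeros(m)])   # objective: c^t z - c^t zbar
    return M, b, d

def back_to_inequality_form(w, n):
    """Recover x from a solution w = (z, zbar, s) of the equation form."""
    z, zbar = w[:n], w[n:2 * n]
    return z - zbar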
1.3 Possible Outcomes

There are three possible outcomes for a linear program max{ct x | Ax ≤ b}:
• The linear program is infeasible, i.e. there is no x ∈ Rn with Ax ≤ b.
• The linear program is feasible but unbounded, i.e. for every K ∈ R there is an x with Ax ≤ b and ct x > K.
• The linear program has an optimum solution, i.e. there is an x∗ with Ax∗ ≤ b and ct x∗ ≥ ct x for all x with Ax ≤ b.
We will see that deciding if a linear program is feasible is as hard as computing an optimum
solution to a feasible and bounded linear program (see Section 2.4).
1.4 Integrality Constraints

In many applications, we need an integral solution. This leads to the following class of problems:

Integer Linear Programming
Instance: A matrix A ∈ Rm×n, vectors c ∈ Rn and b ∈ Rm.
Task: Find a vector x ∈ Zn with Ax ≤ b maximizing ct x.
Replacing the constraint x ∈ Rn by x ∈ Zn makes a huge difference. We will see that
there are polynomial-time algorithms for Linear Programming while Integer Linear
Programming is NP-hard.
Of course, one can also consider optimization problems where we have integrality constraints
only for some of the variables. These linear optimization problems are called Mixed Integer
Linear Programs.
1.5 Modeling of Optimization Problems as (Integral) Linear Programs

We consider some examples of how optimization problems can be modeled as LPs or ILPs. Many
flow problems can easily be formulated as linear programs:
Definition 2 Let G be a directed graph with capacities u : E(G) → R>0 and let s and t
be two vertices of G. A feasible s-t-flow in (G, u) is a mapping f : E(G) → R≥0 with
• f(e) ≤ u(e) for all e ∈ E(G), and
• Σ_{e∈δ+G(v)} f(e) − Σ_{e∈δ−G(v)} f(e) = 0 for all v ∈ V(G) \ {s, t}.
The value of an s-t-flow f is val(f) = Σ_{e∈δ+G(s)} f(e) − Σ_{e∈δ−G(s)} f(e).
Maximum-Flow Problem
Instance: A directed Graph G, capacities u : E(G) → R>0 , vertices s, t ∈ V (G) with s ̸= t.
Task: Find an s-t-flow f : E(G) → R≥0 of maximum value.
    max  Σ_{e∈δ+G(s)} xe − Σ_{e∈δ−G(s)} xe
    s.t. xe ≥ 0                                      for e ∈ E(G)
         xe ≤ u(e)                                   for e ∈ E(G)               (7)
         Σ_{e∈δ+G(v)} xe − Σ_{e∈δ−G(v)} xe = 0       for v ∈ V(G) \ {s, t}
It is well known that the value of a maximum s-t-flow equals the capacity of a minimum cut
separating s from t. We will see in Section 2.5 that this result also follows from properties of the
linear program formulation. Moreover, if the capacities are integral, there is always a maximum
flow that is integral (see Section 8.4).
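The LP (7) can be set up mechanically from an edge list. The following sketch (assuming SciPy;
the function and the small example instance are ours) builds the LP and solves it with linprog:

import numpy as np
from scipy.optimize import linprog

def max_flow_lp(nodes, edges, u, s, t):
    """Solve the max-flow LP (7); edges is a list of pairs (v, w), u maps edges to capacities."""
    m = len(edges)
    c = np.zeros(m)
    for k, (v, w) in enumerate(edges):      # objective: flow out of s minus flow into s
        c[k] = (v == s) - (w == s)
    A_eq, b_eq = [], []
    for x in nodes:
        if x in (s, t):
            continue
        row = np.zeros(m)
        for k, (v, w) in enumerate(edges):  # flow conservation at the internal vertex x
            row[k] = (v == x) - (w == x)
        A_eq.append(row)
        b_eq.append(0.0)
    A_eq = np.array(A_eq) if A_eq else None
    b_eq = np.array(b_eq) if b_eq else None
    res = linprog(-c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, u[e]) for e in edges])
    return dict(zip(edges, res.x)), -res.fun

# Example: two parallel s-t-paths of capacities 1 and 2; the maximum flow value is 3.
edges = [('s', 'a'), ('a', 't'), ('s', 'b'), ('b', 't')]
u = {('s', 'a'): 1, ('a', 't'): 1, ('s', 'b'): 2, ('b', 't'): 2}
print(max_flow_lp(['s', 'a', 'b', 't'], edges, u, 's', 't'))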
In some cases, we first have to modify a given optimization problem slightly in order to get a
linear program formulation. See the following example of a modified version of the Maximum-
Flow Problem where we have two sources and want to maximize the minimal out-flow of
both sources.
The objective function here is not a linear function but the minimum of two linear functions. To
see how such a problem can be written as an LP, we assume, slightly more generally, that we are
given the following optimization problem:
max min{ct x + d, et x + f }
s.t. Ax ≤ b
This problem is equivalent to the following linear program with one additional variable σ:

    max  σ
    s.t. σ ≤ ct x + d
         σ ≤ et x + f
         Ax ≤ b

And of course, this trick also works if we want to compute the minimum of more than two
linear functions.
More or less the same trick can be applied to the following problem in which the objective
function contains absolute values of linear functions:
min |ct x + d|
s.t. Ax ≤ b
for some c ∈ Rn and d ∈ R. Again the problem can be written equivalently as a linear program
in the following form:
max −σ
s.t. −σ − ct x ≤ d
−σ + ct x ≤ −d
Ax ≤ b
The two additional constraints on σ ensure that we have σ ≥ max{ct x + d, −ct x − d} = |ct x + d|.
Other problems allow a formulation as an ILP but presumably not as an LP:

Vertex Cover Problem
Instance: An undirected graph G and vertex costs c : V(G) → R≥0.
Task: Find a set X ⊆ V(G) with {v, w} ∩ X ≠ ∅ for all {v, w} ∈ E(G) minimizing Σ_{v∈X} c(v).

This problem is known to be NP-hard (see standard textbooks like Korte and Vygen [2018]),
so we cannot hope for a polynomial-time algorithm. Nevertheless, the problem can easily be
formulated as an integer linear program:
    min  Σ_{v∈V(G)} c(v) xv
    s.t. xv + xw ≥ 1     for {v, w} ∈ E(G)              (8)
         xv ∈ {0, 1}     for v ∈ V(G)
For each vertex v ∈ V (G), we have a 0-1-variable xv which is 1 if and only if v should be in the
set X, i.e. if (xv )v∈V (G) is an optimum solution to (8), the set X = {v ∈ V (G) | xv = 1} is an
optimum solution to the Vertex Cover Problem.
This example shows that Integer Linear Programming itself is an NP-hard problem. By
skipping the integrality constraints (xv ∈ {0, 1}) we get the following linear program:
    min  Σ_{v∈V(G)} c(v) xv
    s.t. xv + xw ≥ 1     for {v, w} ∈ E(G)
         xv ≥ 0          for v ∈ V(G)                   (9)
         xv ≤ 1          for v ∈ V(G)
We call this linear program an LP-relaxation of (8). In this particular case, the relaxation
gives a 2-approximation of the Vertex Cover Problem: For any solution x of the relaxed
problem, we get an integral solution x̃ by setting

    x̃v = 1 if xv ≥ 1/2    and    x̃v = 0 if xv < 1/2.
It is easy to check that this yields a feasible solution of the ILP with Σ_{v∈V(G)} c(v) x̃v ≤ 2 Σ_{v∈V(G)} c(v) xv.
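The rounding procedure is easy to implement. The following sketch (assuming SciPy; solver
tolerances are ignored) solves the LP-relaxation (9) with linprog and rounds at 1/2:

import numpy as np
from scipy.optimize import linprog

def lp_rounding_vertex_cover(vertices, edges, cost):
    """Sketch of the 2-approximation: solve the LP relaxation (9) and round at 1/2."""
    idx = {v: i for i, v in enumerate(vertices)}
    # Each edge {v, w} gives x_v + x_w >= 1, i.e. -x_v - x_w <= -1.
    A = np.zeros((len(edges), len(vertices)))
    for k, (v, w) in enumerate(edges):
        A[k, idx[v]] = A[k, idx[w]] = -1.0
    b = -np.ones(len(edges))
    c = np.array([cost[v] for v in vertices])
    res = linprog(c, A_ub=A, b_ub=b, bounds=[(0, 1)] * len(vertices))
    return {v for v in vertices if res.x[idx[v]] >= 0.5}   # rounded cover

# Example: a triangle with unit costs; the LP optimum is 3/2, the rounded cover has cost 3.
print(lp_rounding_vertex_cover(['a', 'b', 'c'],
                               [('a', 'b'), ('b', 'c'), ('a', 'c')],
                               {'a': 1, 'b': 1, 'c': 1}))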
Obviously, in minimization problems relaxing some constraints can only decrease the value of
an optimum solution. We call the supremum of the ratio between the values of the optimum
solutions of an ILP and its LP-relaxation the integrality gap of the relaxation. The rounding
procedure described above also proves that in this case the integrality gap is at most 2. Indeed,
2 is the integrality gap as the example of a complete graph with weights c(v) = 1 for all vertices
v shows. For the Maximum-Flow Problem with integral edge capacities, the integrality gap
is 1 because there is always an optimum flow that is integral.
The following problem is NP-hard as well: In the Stable Set Problem, we are given an undirected
graph G with vertex weights c : V(G) → R≥0 and look for a set X ⊆ V(G) that contains no two
adjacent vertices and maximizes Σ_{v∈X} c(v). Skipping the integrality constraints in the obvious
ILP formulation (with a 0-1-variable xv for each vertex v) yields the following LP-relaxation:
    max  Σ_{v∈V(G)} c(v) xv
    s.t. xv + xw ≤ 1     for {v, w} ∈ E(G)
         xv ≥ 0          for v ∈ V(G)                   (11)
         xv ≤ 1          for v ∈ V(G)
Unfortunately, in this case, the LP-relaxation is of no use. Even if G is a complete graph (where a
feasible solution of the Stable Set Problem can contain at most one vertex), setting xv = 1/2
for all v ∈ V(G) would be a feasible solution of the LP-relaxation. This example shows that the
integrality gap is at least n/2. Hence, this LP-relaxation does not provide any useful information
about a good ILP solution.
1.6 Polyhedra
Definition 4 For x1, . . . , xk ∈ Rn and λ1, . . . , λk ∈ R with λi ≥ 0 (i ∈ {1, . . . , k}) and Σ_{i=1}^k λi = 1,
the vector Σ_{i=1}^k λi xi is called a convex combination of x1, . . . , xk.
Remark: It is easy to check that the convex hull of a set X ⊆ Rn is the (inclusion-wise)
minimal convex set containing X.
• X = Rn
of A by a1, . . . , am. If aj = 0 for some j ∈ {1, . . . , m}, then bj ≥ 0 (where b = (b1, . . . , bm)) because
otherwise X = ∅. Hence we have

    X = ∩_{j=1}^m {x ∈ Rn | atj x ≤ bj} = ∩_{j=1,...,m : aj ≠ 0} {x ∈ Rn | atj x ≤ bj},
In other words, the dimension of X ⊆ Rn is n minus the maximum size of a set of linearly
independent vectors that are orthogonal to any difference of elements in X. For example, the
empty set and sets consisting of exactly one vector have dimension 0. The set Rn has dimension
n.
Observation: The dimension of a set X ⊆ Rn is the largest d for which X contains elements
v0 , v1 , . . . , vd such that v1 − v0 , v2 − v0 , . . . , vd − v0 are linearly independent.
Observation: A non-empty set X ⊆ Rn is a convex cone if and only if X is convex and for all
x ∈ X and λ ∈ R≥0 we have λx ∈ X.
sufficiently large λ, the i-th entry of λAx would be greater than bi which is a contradiction to
the assumption that X is a convex cone. Therefore, X = {x ∈ Rn | Ax ≤ 0}. 2
Let x1, . . . , xm ∈ Rn be vectors. The cone generated by x1, . . . , xm is the set

    cone({x1, . . . , xm}) := { Σ_{i=1}^m λi xi | λ1, . . . , λm ≥ 0 }.
2 Duality
2.1 Dual LPs

Consider the following linear program (P):

    max  12x1 + 10x2
    s.t.  4x1 +  2x2 ≤ 5
          8x1 + 12x2 ≤ 7                                (P)
          2x1 −  3x2 ≤ 1

How can we find upper bounds on the value of an optimum solution? By combining the first
two constraints we can get the following bound for any feasible solution (x1, x2):

    12x1 + 10x2 = 2 · (4x1 + 2x2) + 1/2 · (8x1 + 12x2) ≤ 2 · 5 + 1/2 · 7 = 13.5.
We can even do better by combining the last two inequalities:

    12x1 + 10x2 = 7/6 · (8x1 + 12x2) + 4/3 · (2x1 − 3x2) ≤ 7/6 · 7 + 4/3 · 1 = 9.5.
More generally, for computing upper bounds we ask for non-negative numbers u1 , u2 , u3 such
that
12x1 + 10x2 = u1 · (4x1 + 2x2 ) + u2 · (8x1 + 12x2 ) + u3 · (2x1 − 3x2 ).
Then, 5 · u1 + 7 · u2 + 1 · u3 is an upper bound on the value of any solution of (P), so we want
to choose u1, u2, u3 in such a way that 5 · u1 + 7 · u2 + 1 · u3 is minimized.
This leads us to the following linear program (D):

    min  5u1 + 7u2 + u3
    s.t. 4u1 +  8u2 + 2u3 = 12
         2u1 + 12u2 − 3u3 = 10                          (D)
         u1, u2, u3 ≥ 0

This linear program is called the dual linear program of (P). Any solution of (D) yields
an upper bound on the optimum value of (P), and in this particular case it turns out that
u1 = 0, u2 = 7/6, u3 = 4/3 (the second solution from above) with value 9.5 is an optimum solution
of (D) because x1 = 11/16, x2 = 1/8 is a solution of (P) with value 9.5.
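The claimed values can be verified numerically. The following sketch (assuming SciPy) solves
(P) and (D) with linprog and should report the value 9.5 for both:

import numpy as np
from scipy.optimize import linprog

A = np.array([[4.0, 2.0], [8.0, 12.0], [2.0, -3.0]])
b = np.array([5.0, 7.0, 1.0])
c = np.array([12.0, 10.0])

# Primal (P): max c^t x s.t. Ax <= b (x unrestricted in sign).
p = linprog(-c, A_ub=A, b_ub=b, bounds=[(None, None)] * 2)
# Dual (D): min b^t u s.t. A^t u = c, u >= 0.
d = linprog(b, A_eq=A.T, b_eq=c, bounds=[(0, None)] * 3)
print(-p.fun, d.fun)   # both should be 9.5
print(p.x, d.x)        # roughly (11/16, 1/8) and (0, 7/6, 4/3)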
For a general linear program (P)
max ct x
s.t. Ax ≤ b
in standard inequality form we define its dual linear program (D) as
min bt y
s.t. At y = c
y ≥ 0
In this context, we call the linear program (P) primal linear program.
Remark: Note that the dual linear program does not only depend on the objective function
and the solution space of the primal linear program but on its description by linear inequalities.
For example adding redundant inequalities to the system Ax ≤ b will lead to more variables in
the dual linear program.
Proposition (weak duality): If x is a feasible solution of (P) and y is a feasible solution of (D),
then ct x ≤ bt y.
Proof: ct x = (At y)t x = yt Ax ≤ yt b. 2
Remark: The term “dual” implies that applying the transformation from (P) to (D) twice
yields (P) again. This is not exactly the case but it is not very difficult to see that dualizing (D)
(after transforming it into standard equational form) gives a linear program that is equivalent
to (P) (see the exercises).
2.2 Fourier-Motzkin Elimination

Consider the following system of inequalities:

3x + 2y + 4z ≤ 10
3x + 2z ≤ 9
2x − y ≤ 5
(12)
−x + 2y − z ≤ 3
−2x ≤ 4
2y + 2z ≤ 7
Assume that we just want to decide if a feasible solution x, y, z exists. The goal is to get rid of
the variables one after the other. To get rid of x, we first reformulate the inequalities such that
we can easily see lower and upper bounds for x:
    x ≤ 10/3 − 2/3 y − 4/3 z
    x ≤ 3 − 2/3 z
    x ≤ 5/2 + 1/2 y
    x ≥ −3 + 2y − z                                     (13)
    x ≥ −2
    2y + 2z ≤ 7
This system of inequalities has a feasible solution if and only if the following system (that does
not contain x) has a solution:

    −3 + 2y − z ≤ 10/3 − 2/3 y − 4/3 z
    −3 + 2y − z ≤ 3 − 2/3 z
    −3 + 2y − z ≤ 5/2 + 1/2 y
    −2 ≤ 10/3 − 2/3 y − 4/3 z
    −2 ≤ 3 − 2/3 z
    −2 ≤ 5/2 + 1/2 y
    2y + 2z ≤ 7
Note that this method, which is called Fourier-Motzkin elimination, is in general very
inefficient. If m is the number of inequalities in the initial system, it may be necessary to state
m²/4 inequalities in the system with one variable less (this is the case if there are m/2 inequalities
that gave an upper bound on the variable we got rid of and m/2 inequalities that gave a lower
bound).
Nevertheless, the Fourier-Motzkin elimination can be used to get a certificate that a given
system of inequalities does not have a feasible solution. In the proof of the following theorem
we give a general description of one iteration of the method:
Theorem 4 Let A ∈ Rm×n and b ∈ Rm (with n ≥ 1). Then there are Ã ∈ Rm̃×(n−1) and
b̃ ∈ Rm̃ with m̃ ≤ m + m²/4 such that
(a) Each inequality in the system Ãx̃ ≤ b̃ is a positive linear combination of inequalities
from Ax ≤ b
(b) The system Ax ≤ b has a solution if and only if Ãx̃ ≤ b̃ has a solution.
Proof: Denote the entries of A by aij, i.e. A = (aij)i=1,...,m; j=1,...,n. We will show how to get rid of
the variable with index 1. To this end, we partition the index set {1, . . . , m} of the rows into
three disjoint sets U, L, and N:

    U := {i | ai1 > 0},   L := {i | ai1 < 0},   N := {i | ai1 = 0}.
We can assume that |ai1 | = 1 for all i ∈ U ∪ L (otherwise we divide the corresponding inequality
by |ai1 |).
For vectors ãi = (ai2, . . . , ain) and x̃ = (x2, . . . , xn) (that are empty if n = 1), we replace the
inequalities that correspond to indices in U and L by

    ãtu x̃ + ãtl x̃ ≤ bu + bl     for u ∈ U, l ∈ L.       (17)

Obviously, each of these |U| · |L| new inequalities is simply the sum of two of the given inequalities
(and hence a positive linear combination of them).
The inequalities with index in N are rewritten as

    ãtl x̃ ≤ bl     for l ∈ N.                            (18)
The inequalities in (17) and (18) form a set of inequalities Ãx̃ ≤ b̃ with n − 1 variables, and each
solution of Ax ≤ b gives a solution of Ãx̃ ≤ b̃ by restricting x = (x1 , . . . , xn ) to (x2 , . . . , xn ).
On the other hand, if x̃ = (x2, . . . , xn) is a solution of Ãx̃ ≤ b̃, then we can set x̃1 to any value
in the (non-empty) interval

    [ max_{l∈L} (ãtl x̃ − bl) ,  min_{u∈U} (bu − ãtu x̃) ],

where we set the minimum of an empty set to ∞ and the maximum of an empty set to −∞.
Then, x = (x̃1 , x2 , . . . , xn ) is a solution of Ax ≤ b. 2
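One iteration of the method, as described in the proof, can be implemented directly. The
following is a sketch (assuming NumPy) that eliminates the variable with index 1:

import numpy as np

def fourier_motzkin_step(A, b):
    """One Fourier-Motzkin step: eliminate variable x_1 from Ax <= b (cf. Theorem 4)."""
    U = [i for i in range(len(b)) if A[i, 0] > 0]   # rows giving upper bounds on x_1
    L = [i for i in range(len(b)) if A[i, 0] < 0]   # rows giving lower bounds on x_1
    N = [i for i in range(len(b)) if A[i, 0] == 0]
    rows, rhs = [], []
    for u in U:
        for l in L:
            # Normalize so that |a_{u1}| = |a_{l1}| = 1 and add the two inequalities.
            rows.append(A[u, 1:] / A[u, 0] + A[l, 1:] / (-A[l, 0]))
            rhs.append(b[u] / A[u, 0] + b[l] / (-A[l, 0]))
    for i in N:
        rows.append(A[i, 1:])
        rhs.append(b[i])
    if not rows:                      # no inequalities left: every x is feasible
        return np.zeros((0, A.shape[1] - 1)), np.zeros(0)
    return np.array(rows), np.array(rhs)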
Theorem 6 (Farkas’ Lemma, most general case) For A ∈ Rm1 ×n1 , B ∈ Rm1 ×n2 , C ∈
Rm2 ×n1 , D ∈ Rm2 ×n2 , a ∈ Rm1 and b ∈ Rm2 exactly one of the two following systems has
a feasible solution:
System 1:
Ax + By ≤ a
Cx + Dy = b (19)
x ≥ 0
System 2:

    ut A + vt C ≥ 0t
    ut B + vt D = 0t
    u ≥ 0                                               (20)
    ut a + vt b < 0
Proof: The first system is equivalent to
Ax + By ≤ a
Cx + Dy ≤ b
−Cx − Dy ≤ −b
−In1 x ≤ 0
By Theorem 5, this system has a solution if and only if the following system does not have a
solution:
Obviously, this system has a solution if and only if the second system of the theorem has a
solution. 2
Corollary 7 (Farkas' Lemma, further variants) For A ∈ Rm×n and b ∈ Rm, the following
statements hold:
(a) Exactly one of the two systems Ax = b, x ≥ 0 and ut A ≥ 0t, ut b < 0 has a solution.
(b) Exactly one of the two systems Ax = b and ut A = 0t, ut b < 0 has a solution.
Proof: Restrict the statement of Theorem 6 to the vector b and the matrix C (for part (a)) or D
(for part (b)). 2
Remark: Statement (a) of Corollary 7 has a nice geometric interpretation. Let C be the cone
generated by the columns of A. Then, the vector b is either in C or there is a hyperplane (given
by the normal u) that separates b from C.
As an example, consider A = (2 3; 1 1), i.e. the columns of A are (2, 1)t and (3, 1)t, and let
b1 = (5, 2)t and b2 = (1, 3)t (see Figure 2). The vector b1 is in the cone generated by the columns
of A (because (5, 2)t = (2, 1)t + (3, 1)t) while b2 can be separated from the cone by a hyperplane
orthogonal to u = (1, −2)t.
Hence, we get the following corollary.
Figure 2: The vector b1 = (5, 2)t lies in the cone generated by the columns (2, 1)t and (3, 1)t of A,
while b2 = (1, 3)t is separated from this cone by the hyperplane orthogonal to u = (1, −2)t.
Corollary 8 Let a1, . . . , an ∈ Rm. Then for any vector b ∈ Rm exactly one of the following
statements holds:
(a) b ∈ cone({a1, . . . , an}).
(b) There is a hyperplane {x ∈ Rm | ut x = 0} separating b from cone({a1, . . . , an}), i.e. there is
a vector u ∈ Rm with ut ai ≥ 0 for all i ∈ {1, . . . , n} and ut b < 0.
Proof: Statement (b) is equivalent to the existence of a vector u ∈ Rm with ut ai ≥ 0 for all
i ∈ {1, . . . , n} and ut b < 0. Thus the corollary follows from Corollary 7 (a). 2
2.4 Strong Duality
max ct x (P )
s.t. Ax ≤ b
and
min bt y (D)
s.t. At y = c
y ≥ 0
4. Both (P) and (D) have a feasible solution. Then both have an optimal solution, and
for an optimal solution x̃ of (P) and an optimal solution ỹ of (D), we have
ct x̃ = bt ỹ.
ũ := (1/z) u. This implies At ũ = c and ũ ≥ 0, so ũ is a feasible solution of (D). Therefore (D) is
feasible. It is bounded as well because of the weak duality.
It remains to show that there are feasible solutions x of (P) and y of (D) such that ct x ≥ bt y.
This is the case if (and only if) the following system has a feasible solution:
Ax ≤ b
At y = c
−ct x + bt y ≤ 0
y ≥ 0
By Theorem 6, this is the case if and only if the following system (with variables u ∈ Rm ,
v ∈ Rn and w ∈ R) does not have a feasible solution:
    ut A − w ct = 0t
    vt At + w bt ≥ 0t
    ut b + vt c < 0                                     (23)
    u ≥ 0
    w ≥ 0
Hence, assume that system (23) has a feasible solution u, v and w.
Case 1: w = 0. Then (again by Farkas’ Lemma) the system
Ax ≤ b
At y = c
y ≥ 0
does not have a feasible solution, which is a contradiction because both (P) and (D) have a
feasible solution.
Case 2: w > 0. Then
0 > wut b + wv t c ≥ ut (−Av) + v t (At u) = 0,
which is a contradiction. 2
Remark: Theorem 9 shows in particular that if a linear program max{ct x | Ax ≤ b} is feasible
and bounded, then there is a vector x̃ with Ax̃ ≤ b such that ct x̃ = sup{ct x | Ax ≤ b}.
The following table gives an overview of the possible combinations of states of the primal and
dual LPs ("✓" means that the combination is possible, "x" means that it is not possible):

                                                    (D)
                              Feasible, bounded   Feasible, unbounded   Infeasible
 (P)  Feasible, bounded              ✓                    x                 x
      Feasible, unbounded            x                    x                 ✓
      Infeasible                     x                    ✓                 ✓
Remark: The previous theorem can be used to show that computing a feasible solution of
a linear program is in general as hard as computing an optimum solution. Assume that we
want to compute an optimum solution of the program (P) in the theorem. To this end, we can
compute any feasible solution of the following linear program:
max ct x
s.t. Ax ≤ b
At y = c (24)
ct x ≥ bt y
y ≥ 0
Here x and y are the variables. We can ignore the objective function in the modified LP because
we just need any feasible solution. The constraints At y = c, ct x ≥ bt y and y ≥ 0 guarantee that
any vector x from a feasible solution of the new LP is an optimum solution of (P).
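The following sketch (assuming SciPy) illustrates this reduction: it feeds the combined system (24)
to linprog with a zero objective, so only feasibility matters, and returns the x-part of any feasible
solution, which is then an optimum solution of (P):

import numpy as np
from scipy.optimize import linprog

def optimum_via_feasibility(A, b, c):
    """Find an optimum of max{c^t x | Ax <= b} by solving the feasibility system (24)."""
    m, n = A.shape
    # Variables are (x, y) in R^(n+m).  Inequalities: Ax <= b and b^t y - c^t x <= 0.
    A_ub = np.vstack([np.hstack([A, np.zeros((m, m))]),
                      np.hstack([-c, b])[None, :]])
    b_ub = np.concatenate([b, [0.0]])
    # Equalities: A^t y = c.  Sign constraints: y >= 0, x free.
    A_eq = np.hstack([np.zeros((n, n)), A.T])
    res = linprog(np.zeros(n + m), A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=c,
                  bounds=[(None, None)] * n + [(0, None)] * m)
    return res.x[:n] if res.success else None

# On the farmer LP from the introduction this returns an optimum solution with value 26.
A = np.array([[1.0, 1.0], [1.0, 2.0], [5.0, 2.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([10.0, 16.0, 40.0, 0.0, 0.0])
print(optimum_via_feasibility(A, b, np.array([2.0, 3.0])))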
                         Primal LP                      Dual LP
 Variables               x1, . . . , xn                 y1, . . . , ym
 Matrix                  A                              At
 Right-hand side         b                              c
 Objective function      max ct x                       min bt y
 Constraints             Σ_{j=1}^n aij xj ≤ bi          yi ≥ 0
                         Σ_{j=1}^n aij xj ≥ bi          yi ≤ 0
                         Σ_{j=1}^n aij xj = bi          yi ∈ R
                         xj ≥ 0                         Σ_{i=1}^m aij yi ≥ cj
                         xj ≤ 0                         Σ_{i=1}^m aij yi ≤ cj
                         xj ∈ R                         Σ_{i=1}^m aij yi = cj
max{ct x | Ax ≤ b, x ≥ 0} min{bt y | y t A ≥ c, y ≥ 0}
max{ct x | Ax ≥ b, x ≥ 0} min{bt y | y t A ≥ c, y ≤ 0}
max{ct x | Ax = b, x ≥ 0} min{bt y | y t A ≥ c}
Let x be a feasible solution of max{ct x | Ax ≤ b} and y a feasible solution of min{bt y | At y = c, y ≥ 0}.
Then the following statements are equivalent:
(a) x and y are both optimum solutions.
(b) ct x = bt y.
(c) yt (b − Ax) = 0.
Proof: The equivalence of the statements (a) and (b) follows from Theorem 9. To see the
equivalence of (b) and (c) note that y t (b − Ax) = y t b − y t Ax = y t b − ct x, so ct x = bt y is
equivalent to y t (b − Ax) = 0. 2
With the notation of the theorem, let at1, . . . , atm be the rows of A and b = (b1, . . . , bm). Then,
the theorem implies that for an optimum primal solution x and an optimum dual solution y and
i ∈ {1, . . . , m} we have yi = 0 or ati x = bi (since Σ_{i=1}^m yi (bi − ati x) must be zero and yi (bi − ati x)
cannot be negative for any i ∈ {1, . . . , m}).
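In this form, complementary slackness is easy to check numerically. The following sketch
(assuming NumPy) tests yi (bi − ati x) = 0 for all i on the duality example of Section 2.1:

import numpy as np

def complementary_slackness_holds(A, b, c, x, y, tol=1e-9):
    """Check y^t (b - Ax) = 0 componentwise for feasible x (Ax <= b) and y (A^t y = c, y >= 0)."""
    slack = b - A @ x                      # non-negative if x is feasible
    return bool(np.all(np.abs(y * slack) <= tol))

# The duality example of Section 2.1: x = (11/16, 1/8), y = (0, 7/6, 4/3).
A = np.array([[4.0, 2.0], [8.0, 12.0], [2.0, -3.0]])
b = np.array([5.0, 7.0, 1.0])
c = np.array([12.0, 10.0])
print(complementary_slackness_holds(A, b, c,
                                    np.array([11/16, 1/8]),
                                    np.array([0.0, 7/6, 4/3])))   # True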
Similarly, let x be a feasible solution of max{ct x | Ax ≤ b, x ≥ 0} and y a feasible solution of
min{bt y | yt A ≥ ct, y ≥ 0}. Then the following statements are equivalent:
(a) x and y are both optimum solutions.
(b) ct x = bt y.
(c) yt (b − Ax) = 0 and xt (At y − c) = 0.
Proof: The equivalence of the statements (a) and (b) follows again from Theorem 9. To
see the equivalence of (b) and (c) note that 0 ≤ y t (b − Ax) and 0 ≤ xt (At y − c). Hence
y t (b − Ax) + xt (At y − c) = y t b − y t Ax + xt At y − xt c = y t b − xt c is zero if and only if
0 = y t (b − Ax) and 0 = xt (At y − c). 2
Corollary: A feasible linear program max{ct x | Ax ≤ b} is bounded if and only if c is contained
in the cone generated by the rows of A.
Proof: The linear program is bounded if and only if its dual linear program is feasible. This is
the case if and only if there is a vector y ≥ 0 with yt A = ct, which is equivalent to the statement
that c is in the cone generated by the rows of A. 2
Theorem 11 allows us to strengthen the statement of the previous Corollary. Let x be an
optimum solution of the linear program max{ct x | Ax ≤ b} and y an optimum solution of its
dual min{bt y | At y = c, y ≥ 0}. Denote the row vectors of A by at1 , . . . , atm . Then yi = 0 if
ati x < bi (for i ∈ {1, . . . , m}), so c is in fact in the cone generated only by these rows of A where
ati x = bi (see Figure 3 for an illustration).
Figure 3: The vector c lies in the cone generated by the rows a1 and a2 of A, which correspond
to the constraints at1 x ≤ b1 and at2 x ≤ b2 that are tight at the optimum solution of
max{ct x | Ax ≤ b}.
Theorem 14 Let max{ct x | Ax ≤ b} and min{bt y | At y = c, y ≥ 0} be a pair of feasible and
bounded primal and dual LPs, and let i ∈ {1, . . . , m}. Then exactly one of the following two
statements holds:
(a) The primal LP max{ct x | Ax ≤ b} has an optimum solution x∗ with ati x∗ < bi.
(b) The dual LP min{bt y | At y = c, y ≥ 0} has an optimum solution y∗ with yi∗ > 0.
Proof: By complementary slackness, at most one of the statements can be true. Let δ =
max{ct x | Ax ≤ b} be the value of an optimum solution. Assume that (a) does not hold. This
means that
max −ati x
Ax ≤ b
−ct x ≤ −δ
has an optimum solution with value −bi . Hence, also its dual LP
min bt y − δu
At y − uc = −ai
y ≥ 0
u ≥ 0
must have an optimum solution of value −bi . Therefore, there are y ∈ Rm and u ∈ R with y ≥ 0
and u ≥ 0 with y t A − uct = −ati and y t b − uδ = −bi . Let ỹ = y + ei (i.e. ỹ arises from y by
increasing the i-th entry by one). If u = 0, then ỹt A = yt A + ati = 0t and ỹt b = yt b + bi = 0, so if y∗
is an optimum solution of min{bt y | At y = c, y ≥ 0}, then y∗ + ỹ is also an optimum solution and
has a positive i-th entry. If u > 0, then (1/u) ỹ is an optimum solution of min{bt y | At y = c, y ≥ 0}
(because (1/u) ỹt A = (1/u) yt A + (1/u) ati = ct and (1/u) ỹt b = (1/u) yt b + (1/u) bi = δ) and has a
positive i-th entry. 2
Corollary: If the LPs max{ct x | Ax ≤ b} and min{bt y | At y = c, y ≥ 0} are both feasible and
bounded, then there are optimum solutions x∗ and y∗ such that for every i ∈ {1, . . . , m} we have
ati x∗ < bi or yi∗ > 0.
Proof: By Theorem 14, for any inequality ati x ≤ bi there is a pair of optimum solutions
x(i) ∈ Rn, y(i) ∈ Rm such that ati x(i) < bi or y(i)i > 0. Since the convex combination of optimum
LP solutions is again an optimum solution, we can set x∗ := (1/m) Σ_{i=1}^m x(i) and y∗ := (1/m) Σ_{i=1}^m y(i)
and get a pair of optimum solutions fulfilling the conditions of the theorem. 2
As an application, consider again the LP formulation of the Maximum-Flow Problem:

    max  Σ_{e∈δ+G(s)} xe − Σ_{e∈δ−G(s)} xe
    s.t. xe ≥ 0                                      for e ∈ E(G)
         xe ≤ u(e)                                   for e ∈ E(G)               (25)
         Σ_{e∈δ+G(v)} xe − Σ_{e∈δ−G(v)} xe = 0       for v ∈ V(G) \ {s, t}
Its dual LP reads:

    min  Σ_{e∈E(G)} u(e) ye
    s.t. ye ≥ 0                  for e ∈ E(G)
         ye + zv − zw ≥ 0        for e = (v, w) ∈ E(G), {s, t} ∩ {v, w} = ∅
         ye + zv ≥ 0             for e = (v, t) ∈ E(G), v ≠ s
         ye − zw ≥ 0             for e = (t, w) ∈ E(G), w ≠ s                    (26)
         ye − zw ≥ 1             for e = (s, w) ∈ E(G), w ≠ t
         ye + zv ≥ −1            for e = (v, s) ∈ E(G), v ≠ t
         ye ≥ 1                  for e = (s, t) ∈ E(G)
         ye ≥ −1                 for e = (t, s) ∈ E(G)
In a simplified way its dual LP can be written with two dummy variables zs = −1 and zt = 0:
    min  Σ_{e∈E(G)} u(e) ye
    s.t. ye ≥ 0              for e ∈ E(G)
         ye + zv − zw ≥ 0    for e = (v, w) ∈ E(G)      (27)
         zs = −1
         zt = 0
We will use the dual LP to show the Max-Flow-Min-Cut Theorem. We call a set δ+(R) with
R ⊂ V(G), s ∈ R and t ∉ R an s-t-cut, and Σ_{e∈δ+G(R)} u(e) its capacity.

Theorem (Max-Flow-Min-Cut Theorem): The maximum value of a feasible s-t-flow in (G, u)
equals the minimum capacity of an s-t-cut.
Proof: If x is a feasible solution of the primal problem (25) (i.e. x encodes an s-t-flow) and
δ + (R) is an s-t-cut, then
    Σ_{e∈δ+G(s)} xe − Σ_{e∈δ−G(s)} xe = Σ_{v∈R} ( Σ_{e∈δ+G(v)} xe − Σ_{e∈δ−G(v)} xe )
                                      = Σ_{e∈δ+G(R)} xe − Σ_{e∈δ−G(R)} xe ≤ Σ_{e∈δ+G(R)} u(e).
The first equation follows from the flow conservation rule (i.e. Σ_{e∈δ+G(v)} xe − Σ_{e∈δ−G(v)} xe = 0) applied
to all vertices in R \ {s} and the second one from the fact that flow values on edges inside R
cancel out in the sum. The last inequality follows from the fact that flow values are between 0
and u.
Thus, the capacity of any s-t-cut is an upper bound for the value of an s-t-flow. We will show
that for any maximum s-t-flow there is an s-t-cut whose capacity equals the value of the flow.
Let x̃ be an optimum solution of the primal problem (25) and ỹ, z̃ be an optimum solution of
the dual problem (27). In particular x̃ defines a maximum s-t-flow. Consider the set R := {v ∈
V (G) | z̃v ≤ −1}. Then s ∈ R and t ̸∈ R.
If e = (v, w) ∈ δ+G(R), then z̃v < z̃w, so ỹe ≥ z̃w − z̃v > 0. By complementary slackness
this implies x̃e = u(e). On the other hand, if e = (v, w) ∈ δ−G(R), then z̃v > z̃w and hence
ỹe + z̃v − z̃w ≥ z̃v − z̃w > 0, so again by complementary slackness x̃e = 0. This leads to:
    Σ_{e∈δ+G(s)} x̃e − Σ_{e∈δ−G(s)} x̃e = Σ_{v∈R} ( Σ_{e∈δ+G(v)} x̃e − Σ_{e∈δ−G(v)} x̃e )
                                      = Σ_{e∈δ+G(R)} x̃e − Σ_{e∈δ−G(R)} x̃e = Σ_{e∈δ+G(R)} u(e).     2
3 The Structure of Polyhedra
Proposition: For a matrix A ∈ Rm×(n+k) and a vector b ∈ Rm, the set {x ∈ Rn | ∃y ∈ Rk : A (x, y)t ≤ b}
is a polyhedron.
Proof: Exercise. 2
Remark: The set P = {x ∈ Rn | ∃y ∈ Rk : A (x, y)t ≤ b} is called a projection of {z ∈ Rn+k |
Az ≤ b} to Rn.
More generally, the image of a polyhedron {x ∈ Rn | Ax ≤ b} under an affine linear mapping
f : Rn → Rk , which is given by D ∈ Rk×n , d ∈ Rk and x 7→ Dx + d is also a polyhedron:
{y ∈ Rk | ∃x ∈ Rn : Ax ≤ b and y = Dx + d}
is a polyhedron.
    {y ∈ Rk | ∃x ∈ Rn : Ax ≤ b and y = Dx + d}
       = { y ∈ Rk | ∃x ∈ Rn : (A 0 ; D −Ik ; −D Ik) (x, y)t ≤ (b, −d, d)t }
3.2 Faces
(a) F is a face of P .
Proof:
Set z := x∗ + ϵ(x∗ − y) (see Figure 4). Then ct z > δ, so z ̸∈ P . Therefore, there must
be an inequality at x ≤ β of the system Ax ≤ b such that at z > β. We claim that this
inequality cannot belong to Ãx ≤ b̃. To see this assume that at x ≤ β belongs to Ãx ≤ b̃.
If at x∗ ≤ at y then
at z = at x∗ + ϵat (x∗ − y) ≤ at x∗ < β.
But if at x∗ > at y then

    at z = at x∗ + ϵ at (x∗ − y) < at x∗ + ((β − at x∗) / (at (x∗ − y))) · at (x∗ − y) = β.
In both cases, we get a contradiction, so the inequality at x ≤ β belongs to A′ x ≤ b′ .
Therefore, at y = at (x∗ + 1ϵ (x∗ − z)) = (1 + 1ϵ )β − 1ϵ at z < β, which means that A′ y ̸= b′ .
Figure 4: The point z = x∗ + ϵ(x∗ − y) lies outside of P (the vector c and the face F containing x∗
are indicated).
(a) Let c ∈ Rn be a vector such that max{ct x | x ∈ P } < ∞. Then the set of all vectors
x where the maximum of ct x over P is attained is a face of P .
(b) F is a polyhedron.
We are in particular interested in the largest and the smallest faces of a polyhedron.
3.3 Facets
Proof: If P = {x ∈ Rn | Ax = b}, then P does not have a facet (the only face of P is P itself,
see Corollary 20 (d)), so both statements are trivial.
Hence assume that P ̸= {x ∈ Rn | Ax = b}.
Let A′ x ≤ b′ be a minimal system of inequalities such that P = {x ∈ Rn | Ax = b, A′ x ≤ b′ }.
Let at x ≤ β be an inequality in A′ x ≤ b′ , and let A′′ x ≤ b′′ be the rest of the system A′ x ≤ b′
without at x ≤ β.
We will show that at x ≤ β is facet-defining.
Let y ∈ Rn be a vector with Ay = b, A′′y ≤ b′′ and at y > β. Such a vector exists because
otherwise A′′x ≤ b′′ would be a smaller system of inequalities than A′x ≤ b′ with P = {x ∈ Rn |
Ax = b, A′′x ≤ b′′}, which is a contradiction to the definition of A′x ≤ b′.
Moreover, let ỹ ∈ P be a vector with A′ ỹ < b′ (such a vector ỹ exists because P is full-
dimensional in the linear subspace {x ∈ Rn | Ax = b} and because of the minimality of the
system A′ x ≤ b′ ). Consider the vector
    z = ỹ + ((β − at ỹ) / (at y − at ỹ)) · (y − ỹ).

Then, at z = at ỹ + ((β − at ỹ) / (at y − at ỹ)) · (at y − at ỹ) = β. Furthermore, 0 < (β − at ỹ) / (at y − at ỹ) < 1.
Thus, z is the convex
that is met by all elements of F with equality (e.g. the vector z ∈ F fulfills all inequalities in
A′′ x ≤ b′′ with strict inequality).
On the other hand, by Proposition 19 any facet is defined by an inequality of A′ x ≤ b′ . 2
Corollary 22 Let P ⊆ Rn be a polyhedron.
In particular, this means that the smallest possible representation of a full-dimensional polyhe-
dron P = {x ∈ Rn | Ax ≤ b} is unique (up to swapping inequalities and multiplying inequalities
with positive constants). If possible, we want to describe any polyhedron by facet-defining
inequalities because according to the Theorem 21, this gives such a smallest possible description
of the polyhedron (with respect to the number of inequalities).
Proof: “⇒:” Let F be a minimal face of P . By Proposition 19, we know that there is a
subsystem A′ x ≤ b′ of Ax ≤ b with F = {x ∈ P | A′ x = b′ }. Choose A′ x ≤ b′ maximal with
this property. Let Ãx ≤ b̃ be a minimal subsystem of Ax ≤ b such that F = {x ∈ Rn | A′ x =
b′ , Ãx ≤ b̃}.
We have to show the following claim:
Claim: Ãx ≤ b̃ is an empty system of inequalities.
Proof of the Claim: Assume that at x ≤ β is an inequality in Ãx ≤ b̃. The inequality at x ≤ β
is not redundant, so by Theorem 21, F′ = {x ∈ Rn | A′x = b′, Ãx ≤ b̃, at x = β} is a facet of F,
and hence, by Corollary 20, F′ is a face of P. On the other hand, we have F′ ≠ F, because
at x = β is not valid for all elements of F (otherwise we could have added at x ≤ β to the set of
inequalities A′ x ≤ b′ ). This is a contradiction to the minimality of F . This proves the claim.
“⇐:” Assume that F = {x ∈ Rn | A′ x = b′ } ⊆ P (for a subsystem A′ x ≤ b′ of Ax ≤ b) is
non-empty.
Then, F cannot contain a proper subset as a face (see Corollary 20 (d)).
Moreover, F = {x ∈ Rn | A′ x = b′ } = {x ∈ P | A′ x = b′ }, so by Proposition 19 the set F is a
face of P . Since any proper subset of F that is a face of P would also be a face of F and we
know that F does not contain proper subsets as faces, F is a minimal face of P . 2
(a) x′ is a vertex of P .
Proof:
i ∈ {1, . . . , k}, then at x′ = Σ_{i=1}^k λi at x(i) < β, which is a contradiction. But then, we have
x(i) ∈ {x ∈ P | A′x = b′} = {x′} for all i ∈ {1, . . . , k}, which is a contradiction, too.
“(d) ⇒ (b)”: Let A′ x ≤ b′ be a maximal subsystem of Ax ≤ b such that A′ x′ = b′ . Assume
that A′ does not contain n linearly independent rows. Then, there is a vector d that is
orthogonal to all rows in A′ . Hence, for any ϵ > 0, we have A′ (x′ + ϵd) = A′ (x′ − ϵd) = b′ .
For any inequality at x ≤ β that is in Ax ≤ b but not in A′ x ≤ b′ , we have at x′ < β.
Therefore, if ϵ > 0 is sufficiently small, at (x′ + ϵd) ≤ β and at (x′ − ϵd) ≤ β are valid for
inequalities at x ≤ β in Ax ≤ b but not in A′ x ≤ b′ . In other words, we have (x′ + ϵd) ∈ P
and (x′ − ϵd) ∈ P . 2
Examples:
Corollary 26 If the linear program max{ct x | Ax ≤ b} is feasible and bounded and the
polyhedron P = {x ∈ Rn | Ax ≤ b} is pointed, then there is a vertex x′ of P such that
ct x′ = max{ct x | Ax ≤ b}. 2
3.5 Cones
Proof: Let {a1, . . . , ak} be an inclusion-wise minimal set of vectors in X such that c ∈
cone({a1, . . . , ak}). This means that there are positive numbers λ1, . . . , λk such that c = Σ_{i=1}^k λi ai.
We show that the vectors a1, . . . , ak are linearly independent. If this is not the case, there are
numbers γ1, . . . , γk such that Σ_{i=1}^k γi ai = 0. We can assume that at least one γi is positive.
Choose σ maximal such that λi − σγi ≥ 0 for all i ∈ {1, . . . , k}. Then, in particular, for at least
one i ∈ {1, . . . , k}, we have λi − σγi = 0. Therefore, c = Σ_{i=1}^k (λi − σγi) ai is a representation of c
with fewer vectors, which is a contradiction to the minimality of the set {a1, . . . , ak}. 2
Proof: Obviously, at most one of the statements can be valid. Let A be the matrix with rows
at1 , . . . , atm .
If c ∈ cone({a1 , . . . , am }) then by the previous theorem, c can be written as a non-negative
combination of linearly independent vectors from at1 , . . . , atm .
Hence, assume that c ̸∈ cone({a1 , . . . , am }), so there is no vector v ∈ Rm , v ≥ 0 such that
ct = v t A. By Farkas’ Lemma (Theorem 6), this implies that there is a vector ũ ∈ Rn such that
Aũ ≥ 0 and ct ũ < 0. This implies that the following LP (with u ∈ Rn as variable vector) has a
feasible solution:
max ct u
s.t. ct u ≤ −1
−ct u ≤ 1
−Au ≤ 0
Moreover, the LP is bounded (-1 is the value of an optimum solution). Hence, the optimum is
attained on a face of the solution polyhedron. By Theorem 23, we can write a minimal face
where the optimum solution value is attained as a set F = {u ∈ Rn | A′ u = b′ } where A′ u ≤ b′
is a subsystem of ct u ≤ −1, −ct u ≤ 1, −Au ≤ 0 consisting of t linearly independent vectors.
Hence, any vector u ∈ F fulfills the condition of (b). 2
Theorem 29 (Farkas-Minkowski-Weyl Theorem) A convex cone is polyhedral if and only
if it is finitely generated.
• {a1 , . . . , am } ⊆ Hu , and
• There are n − 1 linearly independent vectors ai1 , . . . , ain−1 in {a1 , . . . , am } such that
ut aij = 0 for j ∈ {1, . . . , n − 1}
The set H is finite because there are at most (m choose n−1) such half-spaces, and by Theorem 28 the
set cone({a1 , . . . , am }) is the intersection of these half-spaces. Hence, cone({a1 , . . . , am }) is a
polyhedron.
“⇒:” Let C = {x ∈ Rn | Ax ≤ 0} be a polyhedral cone. We have to show that C is finitely
generated. Let CA be the cone generated by the rows of A. By the first part of the proof, we
know that CA (as any other finitely generated cone) is polyhedral. Hence, there are vectors
d1 , . . . , dk ∈ Rn such that CA = {x ∈ Rn | dt1 x ≤ 0, . . . , dtk x ≤ 0}. Let CB = cone({d1 , . . . , dk })
be the cone generated by d1 , . . . , dk .
Claim: C = CB .
Proof of the claim: “CB ⊆ C”: Every row vector of A is contained in CA . Hence Adi ≤ 0 for all
i ∈ {1, . . . , k}. Therefore, di ∈ C (for i ∈ {1, . . . , k}) and thus (as C is a cone) CB ⊆ C.
“C ⊆ CB ”: Assume that there is a y ∈ C \ CB . Again by the first part, CB is polyhedral. Thus,
there must be a vector w ∈ Rn with wt di ≤ 0 (for i = 1, . . . , k) and wt y > 0. This implies
w ∈ CA , and therefore wt x ≤ 0 for all x ∈ C. Obviously, together with wt y > 0 this is a
contradiction to the assumption y ∈ C. 2
Remark: For a set S ⊆ Rn we call the set S o = {x ∈ Rn | xt y ≤ 0 for all y ∈ S}, the polar
cone of S (in particular it obviously is a convex cone). For a polyhedral cone C = {x ∈ Rn |
Ax ≤ 0} its polar cone C o is the cone generated by the rows of A (see exercises). We have just
seen in the proof that C oo = C for a polyhedral cone C.
3.6 Polytopes
Proof: “⇒:” Let X = {x ∈ Rn | Ax ≤ b} be a non-empty polytope. We can write X as follows:

    X = { x ∈ Rn | (x, 1)t ∈ C }

where

    C = { (x, λ)t ∈ Rn+1 | λ ≥ 0, Ax − λb ≤ 0 }.

The set C is a polyhedral cone, so by Theorem 29 it is finitely generated by a set of vectors
(x1, λ1)t, . . . , (xk, λk)t. Since X is bounded, C cannot contain a vector (x, 0)t with x ≠ 0 (such a
vector would give an unbounded direction in X), so we can assume that all λi are positive (for
i ∈ {1, . . . , k}). We can even assume that we have λi = 1 for all i ∈ {1, . . . , k} because otherwise
we could scale the i-th generator by the factor 1/λi. Thus, we have

    x ∈ X  ⇔  ∃µ1, . . . , µk ≥ 0 : (x, 1)t = µ1 (x1, 1)t + · · · + µk (xk, 1)t.

This implies that X is the convex hull of x1, . . . , xk.
“⇐:” Let X = conv({x1, . . . , xk}) be the convex hull of x1, . . . , xk. We have to show that X is a
polytope. Let C = cone({(x1, 1)t, . . . , (xk, 1)t}) be the cone generated by (x1, 1)t, . . . , (xk, 1)t.
Then, we have

    x ∈ X  ⇔  (x, 1)t ∈ C.

By Theorem 29, C is polyhedral, so we can write C as C = { (x, λ)t | Ax + bλ ≤ 0 }. This shows
X = {x ∈ Rn | Ax + b ≤ 0}, so X is a polyhedron.
It is even a polytope, because for M = max{||xi|| | i ∈ {1, . . . , k}} and x ∈ X, we can write x as
x = Σ_{i=1}^k λi xi with λ1, . . . , λk ≥ 0 and Σ_{i=1}^k λi = 1, so ||x|| ≤ Σ_{i=1}^k λi ||xi|| ≤ M Σ_{i=1}^k λi = M.
2
Proof: Let P be a polytope with vertex set X. Since P is convex and X ⊆ P , we have
conv(X) ⊆ P . It remains to show that P ⊆ conv(X). Theorem 30 implies that conv(X) is a
polytope, so in particular a polyhedron. Assume that there is a vector y ∈ P \ conv(X). Then,
there is a half-space Hy = {x ∈ Rn | ct x ≤ δ} such that conv(X) ⊆ Hy and y ̸∈ Hy . This means
that ct y > ct x for all x ∈ X, so the maximum of the function ct x over P will not be attained at
a vertex. This is a contradiction to Corollary 26. 2
Notation: For two vector sets X, Y ⊆ Rn , we define their Minkowski sum as:
X + Y := {z ∈ Rn | ∃x ∈ X ∃y ∈ Y : z = x + y}.
In particular if X = ∅ or Y = ∅, we get X + Y = ∅.
P = conv(V ) + cone(E).
4 Simplex Algorithm
The Simplex Algorithm by Dantzig [1951] is the oldest algorithm for solving general linear
programs. Geometrically it works as follows: Given a polyhedron P and a linear objective
function, we start with any vertex of P. Then we walk along a one-dimensional face of P to
another vertex and repeat this until we find a vertex where the objective function attains a
maximum.
If we want to have a chance to follow this main strategy, we need a pointed polyhedron. That
is why in this section we consider linear programs in standard equation form:
max ct x
s.t. Ax = b (28)
x ≥ 0
max ct x
s.t. Ãx ≤ b (29)
x ≥ 0
4.1 Feasible Basic Solutions
Notation: We denote the index set of the columns of a matrix A ∈ Rm×n by {1, . . . , n}. For a
subset B ⊆ {1, . . . , n}, we denote by AB the sub-matrix of A containing exactly the columns
with index in B. Similarly, for a vector x ∈ Rn , we denote by xB the sub-vector of x containing
the entries with index in B. Note that xB is a vector of length |B| but its entries are not indexed
from 1 to |B|, but the indices are the elements of B, so for example for B = {2, 4, 9} we have
xB = (x2 , x4 , x9 ).
(a) A set B ⊆ {1, . . . , n} with |B| = m such that AB is non-singular is called a basis of A. The
basic solution of Ax = b for B is the vector x ∈ Rn with xB = A−1B b and xN = 0, where
N := {1, . . . , n} \ B.
(b) If x is a basic solution of Ax = b for B, then the variables xj with j ∈ B are called
basic variables and the variables xj with j ∈ N are called non-basic variables.
(c) A basic solution x is called feasible if x ≥ 0. A basis is called feasible if its basic
solution is feasible.
Remark: We also use the above definition for inequality systems of the type Ãx ≤ b, x ≥ 0
(with à ∈ Rm×ñ ). E.g. we call a vector x∗ ∈ Rñ with Ãx∗ ≤ b and x∗ ≥ 0 a basic solution if x∗ , s∗
with s∗ := b − Ãx∗ is a basic solution for Ãx + Im s = b, x ≥ 0, s ≥ 0 (with n := ñ + m variables).
In particular, in a feasible basic solution of Ãx ≤ b, x ≥ 0, the number of tight constraints
(including non-negativity constraints) must be at least n − m = ñ, and in a non-degenerated
feasible basic solution, the number of tight constraints must be exactly ñ. This is because
each positive non-slack variable and each positive slack variable is associated with a non-tight
constraint.
Example: Consider the following system of equations:
x 1 + x2 + s 1 = 1
2x1 + x2 + s2 = 2 (30)
x1 , x 2 , s 1 , s 2 ≥ 0
The variables are x1 , x2 , s1 , and s2 . We denoted the last two variables by s1 and s2 because
they can be interpreted as slack variables for the following system of inequalities: x1 + x2 ≤
1, 2x1 + x2 ≤ 2, x1 , x2 ≥ 0.
If we write the system of equations in matrix notation, we get:

    ( 1 1 1 0 ; 2 1 0 1 ) (x1, x2, s1, s2)t = (1, 2)t.
For B = {1, 2}, we get AB = (1 1; 2 1) with feasible basic solution (1, 0, 0, 0). So in particular
this feasible basic solution is degenerated. If we choose instead B = {2, 3}, we get AB = (1 1; 1 0)
and the corresponding basic solution is (0, 2, −1, 0), which is, of course, infeasible.
Figure 5 illustrates these two basic solutions. However, note that the figure does not show the
solution space (which is 4-dimensional) but only the solution space of the problem without the
slack variables s1 and s2, i.e. the solution space of the system x1 + x2 ≤ 1, 2x1 + x2 ≤ 2, x1, x2 ≥ 0.
So the two points (1, 0) and (0, 2) are basic solutions only in the sense of the remark stated
after the last definition.
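The two basic solutions of (30) can be recomputed directly. The following sketch (assuming
NumPy; column indices are 0-based) solves AB xB = b for the chosen basis and sets the remaining
entries to zero:

import numpy as np

def basic_solution(A, b, B):
    """Basic solution of Ax = b for the basis B (a list of column indices)."""
    x = np.zeros(A.shape[1])
    x[B] = np.linalg.solve(A[:, B], b)   # requires A_B to be non-singular
    return x

# System (30) with variables (x1, x2, s1, s2).
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [2.0, 1.0, 0.0, 1.0]])
b = np.array([1.0, 2.0])
print(basic_solution(A, b, [0, 1]))   # B = {1, 2}: (1, 0, 0, 0), feasible and degenerated
print(basic_solution(A, b, [1, 2]))   # B = {2, 3}: (0, 2, -1, 0), infeasible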
Figure 5: The feasible region of x1 + x2 ≤ 1, 2x1 + x2 ≤ 2, x1, x2 ≥ 0 with the degenerated basic
solution (1, 0) and the infeasible basic solution (0, 2).
In this example we could easily make the degenerated basic solution non-degenerated by
skipping the redundant constraint 2x1 + x2 ≤ 2. This is always possible if we only have two
non-slack variables, but already in three dimensions there are instances where we cannot
get rid of degenerated basic solutions. As an example consider Figure 6. If the pyramid defines
the set of all feasible solutions, the marked vector is a degenerated basic solution, because four
constraints are fulfilled with equality while there are only three non-slack variables.
Note that the example (30) shows that the same vertex of a polyhedron can belong to a
degenerated or a non-degenerated basic solution, depending on how we describe the polyhedron
by a system of inequalities.
Figure 6: A degenerated basic solution at the apex of a pyramid.
Proposition: Let P = {x ∈ Rn | Ax = b, x ≥ 0} with A ∈ Rm×n of rank m. A vector x′ ∈ P is a
vertex of P if and only if x′ is a feasible basic solution of Ax = b, x ≥ 0.
Proof: The vector x′ is a vertex of P if and only if it is a feasible solution of the following
system and fulfills n linearly independent inequalities of the system with equality:
Ax ≤ b
−Ax ≤ −b
−In x ≤ 0
This is the case if and only if x′ ≥ 0, Ax′ = b and x′N = 0 for a set N ⊆ {1, . . . , n} with
|N | = n − m such that with B = {1, . . . , n} \ N the matrix AB has full rank. This is equivalent
to being a feasible basic solution. 2
4.2 The Simplex Method

Before we describe the algorithm in general, we will present some examples (which are taken
from Matoušek and Gärtner [2007]).
Consider the following linear program:
max x1 + x2
s.t. −x1 + x2 + x3 = 1
x1 + x4 = 3
x2 + x5 = 2
x1 , x 2 , x 3 , x 4 , x 5 ≥ 0
    ( −1 1 1 0 0 ; 1 0 0 1 0 ; 0 1 0 0 1 ) (x1, x2, x3, x4, x5)t = (1, 3, 2)t
We first need a basis to start with. We simply choose B = {3, 4, 5}, which gives us the basic
solution x = (0, 0, 1, 3, 2). We write the constraints and the objective function in a so-called
simplex tableau:
x3 = 1 + x1 − x2
x4 = 3 − x1
x5 = 2 − x2
z = x1 + x2
The first three rows describe an equation system that is equivalent to the given one but each
basic variable is written as a combination of the non-basic variable. The last line describes the
objective function.
We will try to increase non-basic variables (which are zero in the current solution) with a
positive coefficient in the objective function. Hence, here we could use x1 or x2 , and we choose
x2 . Equation x3 = 1 + x1 − x2 is the critical constraint that prevents us from increasing to
something bigger than 1 (without increasing x1 ). If we set x2 to something bigger than 1,
x3 would become negative. The constraint x5 = 2 − x2 only gives an upper bound of 2 for
the value of x2 . Since the bound induced by non-negativity of x3 is tighter (so the constraint
x3 = 1 + x1 − x2 is critical), we replace 3 in the basis by 2. The new basic variable x2 can be
written as a combination of the non-basic variables by using the first constraint: x2 = 1 + x1 − x3 .
The new base is B = {2, 4, 5} with a new basic solution x = (0, 1, 0, 3, 1). This is the new
simplex tableau:
x2 = 1 + x1 − x3
x4 = 3 − x1
x5 = 1 − x1 + x3
z = 1 + 2x1 − x3
Increase x1 . x5 = 1−x1 +x3 is critical. x1 = 1+x3 −x5 . New base B = {1, 2, 4}. x = (1, 2, 0, 2, 0).
x1 = 1 + x3 − x5
x2 = 2 − x5
x4 = 2 − x3 + x5
z = 3 + x3 − 2x5
Increase x3 . x4 = 2−x3 +x5 is critical. x3 = 2−x4 +x5 . New base B = {1, 2, 3}. x = (3, 2, 2, 0, 0).
x1 = 3 − x4
x2 = 2 − x5
x3 = 2 − x4 + x5
z = 5 − x4 − x5
The value of the objective function for any feasible solution (x1 , . . . , x5 ) is 5 − x4 − x5 . Since we
have found a solution where x4 = x5 = 0 and we have the constraint that xi ≥ 0 (i = 1, . . . , 5),
our solution is an optimum solution.
Unbounded instance:
As a second example, consider:
max x1
s.t. x 1 − x2 + x3 = 1
−x1 + x2 + x4 = 2
x1 , x 2 , x 3 , x 4 ≥ 0
Quite obviously this LP is unbounded (one can choose x1 arbitrarily large and set x2 = x1,
x3 = 1, and x4 = 2).
Again we use the “slack variables” (here x3 and x4 ) for a first basis. This gives B = {3, 4} and
x = (0, 0, 1, 2).
x3 = 1 − x1 + x2
x4 = 2 + x1 − x2
z = x1
Increase x1. x3 = 1 − x1 + x2 is critical. x1 = 1 + x2 − x3. New base B = {1, 4}. x = (1, 0, 0, 3).

x1 = 1 + x2 − x3
x4 = 3 − x3
z = 1 + x2 − x 3
We can increase x2 as much as we want (provided that we increase x1 by the same amount).
Thus the simplex tableau shows that the linear program is unbounded.
Degeneracy:
A final example shows what may happen if we get a degenerated basic solution.
max x2
s.t. −x1 + x2 + x3 = 0
x1 + x4 = 2
x1 , x 2 , x 3 , x 4 ≥ 0
x3 = x1 − x2
x4 = 2 − x1
z = x2
We want to increase x2 . x3 = x1 − x2 is critical. x2 = x1 − x3 . We will replace 3 by 2 in the
basis. However, we cannot increase x2 . New base B = {2, 4}. x = (0, 0, 0, 2).
x2 = x1 − x3
x4 = 2 − x1
z = x1 − x3
Increase x1. x4 = 2 − x1 is critical. x1 = 2 − x4. New base B = {1, 2}. x = (2, 2, 0, 0).

x1 = 2 − x4
x2 = 2 − x3 − x4
z = 2 − x3 − x4
Again, we have found an optimum solution because all coefficients of the non-basic variables in
the objective function z = 2 − x3 − x4 are negative.
After these three examples, we will now describe the simplex method in general.
For a basis B, the simplex tableau is a system T (B) of m + 1 linear equations with variables
x1 , . . . , xn and z with this form
xB = p + QxN
(31)
z = z0 + rt xN
• xB is the vector of the basic variables, N = {1, . . . , n} \ B, and xN is the vector of the
non-basic variables,
• p = A−1B b ∈ Rm (indexed by B),
• Q = −A−1B AN ∈ Rm×(n−m),
• z0 = ctB A−1B b ∈ R, and
• r = cN − (ctB A−1B AN)t ∈ Rn−m.
Note that the entries of p are not necessarily numbered from 1 to m but that p uses B as the
set of indices (and for r, we have a corresponding statement). In particular, the rows of Q are
indexed by B and the columns by N . We denote the entries of Q by qij (where i ∈ B and
j ∈ N ).
Then xB = A−1B b − A−1B AN xN, which is equivalent to AB xB = b − AN xN and to Ax = b.
Moreover, z = ctB A−1B b + (ctN − ctB A−1B AN) xN = ctB A−1B (b − AN xN) + ctN xN = ctB xB + ctN xN = ct x
for any solution x of Ax = b. 2
Remark: It is easy to check that there is only one simplex tableau for every feasible basis B.
The cost function z0 + rt xN does not directly depend on the basic variables but only on the
non-basic variables. Their impact on the overall cost is given by the vector r = cN − (ctB A−1B AN)t.
An entry of r is called the reduced cost of its corresponding non-basic variable.
If all reduced costs are non-positive, we have already found an optimum solution:
Lemma 35 Let T (B) be a simplex tableau for a feasible basis B. If r ≤ 0, then the basic
solution of B is optimum.
Lemma 36 Let T(B) be a simplex tableau for a feasible basis B. If there is an index α ∈ N
with rα > 0 and qiα ≥ 0 for all i ∈ B, then the linear program is unbounded.
Proof: Let x be the feasible basic solution for B. Let K ∈ R with K > ct x be a constant. Define
a new feasible solution x̃ as follows: x̃α := (K − ct x) / rα, x̃i := xi for i ∈ N \ {α}, and x̃j := pj + qjα x̃α
for j ∈ B. It is easy to check that x̃ is a feasible solution with ct x̃ ≥ K. Hence, the linear
program is unbounded. 2
In the following, we denote the entries of A by aij (i ∈ {1, . . . , m}, j ∈ {1, . . . , n}). The column
of A with index j is denoted by a·j .
Lemma 37 Let T(B) be a simplex tableau for a feasible basis B, let α ∈ N, and let β ∈ B with
qβα < 0 and pβ/qβα = max{pi/qiα | qiα < 0, i ∈ B}. Then B̃ := (B \ {β}) ∪ {α} is a feasible basis.
Proof: We have to show that AB̃ has full rank and that it is feasible, i.e. that its basic solution
is non-negative.
(i) All but one of the columns of AB̃ belong to AB. Hence, the matrix A−1B AB̃ contains all unit
vectors ei with the possible exception of eβ because we removed the β-th column from
AB. However, this removed column has been replaced by the α-th column a·α of A, so
the remaining column of A−1B AB̃ is A−1B a·α. But this is exactly the column with index
α of −Q = A−1B AN. By construction, qβα ≠ 0, so all columns of A−1B AB̃ are linearly
independent.
(ii) We have to show that the basic solution of B̃ is non-negative. We increase xα to −pβ/qβα and
set the basic variables xB to p − q·α · (pβ/qβα), where q·α is the column with index α of Q. For
i ∈ B with qiα ≥ 0 (so in particular i ≠ β) we have pi − qiα · (pβ/qβα) ≥ pi ≥ 0. For i ∈ B with
qiα < 0 we have pβ/qβα ≥ pi/qiα, so pi ≥ qiα · (pβ/qβα), with equality in the last inequality for i = β.
This leads to xβ = 0 and xB ≥ 0, so we get a feasible basic solution for B̃. 2
xB = p + QxN
z = z0 + rt xN
for the basis B; // See equation (31) and the following notation.
5 if r ≤ 0 then
return x̃ = x; // x̃ is optimum (see Lemma 35).
6 Choose an index α ∈ N with rα > 0;
// Here we can apply different pivot rules.
7 if qiα ≥ 0 for all i ∈ B then
return “unbounded”; // By Lemma 36, the LP is unbounded.
8 Choose an index β ∈ B with qβα < 0 and pβ/qβα = max{pi/qiα | qiα < 0, i ∈ B};
// Again, we can apply different pivot rules.
9 Set B = (B \ {β}) ∪ {α};
// See Lemma 37 proving that we get a new feasible basis.
10 go to line 3
max −(xn+1 + xn+2 + · · · + xn+m )
s.t. Ãx̃ = b (32)
x̃ ≥ 0
For this linear program, it is trivial to find a feasible basis ({n + 1, . . . , n + m} will work), so
we can solve it by the Simplex Algorithm. If the value of its optimum solution is negative,
this means that the original linear program does not have a feasible solution. Otherwise, the
Simplex Algorithm will provide a basic solution for the original linear program. In this case,
the solution of the new LP computed by the Simplex Algorithm could contain variables
from xn+1 , . . . , xn+m as basic variables but their value must be 0 and hence they can be replaced
easily by variables from x1 , . . . , xn .
In lines 6 and 8, we may have a choice between different candidates to enter or leave the basis.
The elements chosen in these steps are called pivot elements, and the rules by which we choose
them are called pivot rules. Several different pivot rules for the entering variable have been
proposed:
• Largest coefficient rule: For the entering variable choose α such that rα is maximized.
This is the rule that was proposed by Dantzig in his first description of the Simplex
Algorithm.
• Largest increase rule: Choose the entering variable such that the increase of the
objective function is maximized. Finding an α with that property takes more time because
it is not sufficient to consider the vector r only.
• Steepest edge rule: Choose the entering variable in such a way that we move the
feasible basic solution in a direction as close to the direction of the vector c as possible.
This means we maximize
ct (xnew − xold )
||xnew − xold ||
where xold is the basic feasible solution of the current basis and xnew is the basic feasible
solution of the basis after the exchange step. This rule is even more time-consuming but
in many practical experiments it turned out to lead to a small number of exchange steps.
Here, we only analyze a pivot rule that is quite inefficient in practice but has the nice property
that we can show that the Simplex Algorithm terminates at all, if we follow that rule. If all
exchange steps improve the value of the current solution, we can be sure that the algorithm will
terminate because we can never visit the same basic solution twice, and there is only a finite
(though exponential) number of basic solutions. However, exchange steps do not necessarily
change the value of the solution. Therefore, depending on the pivot rules, it is possible that
the Simplex Algorithm runs in an endless loop by considering the same sequence of bases
forever. This behavior is called cycling (see page 30 ff. of Chvátal [1983] for an example that
this can really happen). The good news is that we can avoid cycling by using an appropriate
pivot rule.
If the algorithm does not terminate, it has to consider the same basis B twice. The computation
between two occurrences of B is called a cycle. Let F ⊆ {1, . . . , n} be the indices of the variables
that have been added to (and hence removed from) the basis during one cycle. We call xF the
cycle variables.
Lemma 38 If the Simplex Algorithm cycles, all basic solutions during the cycling
are the same and all cycle variables are 0.
Proof: The value of a solution considered in Simplex Algorithm never decreases, so during
cycling it cannot increase either. Let B be a feasible basis that occurs in the cycle, and let
B ′ = (B ∪ {α}) \ {β} be the next basis. The only non-basic variable that could be increased is
xα . However, if it indeed was increased, then, because rα > 0, this would increase the value of
the solution. This shows that the non-basic variables remain zero. But then, all variables remain
unchanged because the basic variables are determined uniquely by the non-basic variables. 2
A pivot rule that is able to avoid cycling is Bland’s rule (Bland [1977]) that can be described
as follows: In line 6 of the Simplex Algorithm, we choose α among all elements in N with
rα > 0 such that α is minimal. In line 8, we choose β among all elements in B with qβα < 0
and pβ/qβα = max{pi/qiα | qiα < 0, i ∈ B} such that β is minimal.
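The whole method with Bland's rule fits into a few lines. The following sketch (assuming NumPy;
numerical tolerances and the Phase-I computation of a starting basis are left out) recomputes the
tableau in every iteration instead of updating it, which is inefficient but matches the description
above:

import numpy as np

def simplex_blands_rule(A, b, c, B):
    """Simplex Algorithm with Bland's rule for max{c^t x | Ax = b, x >= 0},
    started from a feasible basis B (list of column indices)."""
    m, n = A.shape
    B = list(B)
    while True:
        N = [j for j in range(n) if j not in B]
        AB_inv = np.linalg.inv(A[:, B])
        p = AB_inv @ b                              # current basic solution (indexed by B)
        Q = -AB_inv @ A[:, N]
        r = c[N] - (c[B] @ AB_inv @ A[:, N])        # reduced costs
        if np.all(r <= 1e-12):                      # optimal (Lemma 35)
            x = np.zeros(n)
            x[B] = p
            return x
        alpha = min(N[k] for k in range(len(N)) if r[k] > 1e-12)   # Bland: smallest entering index
        a = N.index(alpha)
        if np.all(Q[:, a] >= -1e-12):               # unbounded (Lemma 36)
            return None
        # Bland: among the rows attaining the maximum ratio, pick the smallest leaving index.
        ratios = [(p[i] / Q[i, a], B[i]) for i in range(m) if Q[i, a] < -1e-12]
        best = max(t for t, _ in ratios)
        beta = min(j for t, j in ratios if t >= best - 1e-12)
        B[B.index(beta)] = alpha                    # exchange step (Lemma 37)

# Example from Section 4.2: starting basis {3, 4, 5} (0-based: [2, 3, 4]).
A = np.array([[-1.0, 1.0, 1.0, 0.0, 0.0],
              [ 1.0, 0.0, 0.0, 1.0, 0.0],
              [ 0.0, 1.0, 0.0, 0.0, 1.0]])
b = np.array([1.0, 3.0, 2.0])
c = np.array([1.0, 1.0, 0.0, 0.0, 0.0])
print(simplex_blands_rule(A, b, c, [2, 3, 4]))   # optimum (3, 2, 2, 0, 0)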
Theorem 39 With Bland’s rule as pivot rule in lines 6 and 8, the Simplex Algorithm
terminates after a finite number of steps.
Proof: Assume that the algorithm cycles while using Bland’s rule. We use the notation from
above and consider the set F of the indices of the cycle variables. Let π be the largest element
of F , and let B be the basis just before π enters the basis. Let p,Q,r and z0 be the entries of
the simplex tableau T (B). Let B ′ be the basis just before π leaves it. Let p′ ,Q′ ,r′ and z0′ be the
entries of the simplex tableau T (B ′ ).
Let N = {1, . . . , n} \ B be the set of the non-basic variables (so in particular π ∈ N ). According
to Bland’s rule we choose the smallest index and π = max(F ), so when B is considered, π is
the only candidate in F to enter the basis. In other words:
Let α be the index entering B ′ . Again by Bland’s rule, π must have been the only candidate
among all elements of F to leave B ′ . Since p′j = 0 for all j ∈ B ′ ∩ F , this means that
′ ′
qπα < 0 and qjα ≥ 0 for j ∈ B ′ ∩ (F \ {π}). (34)
Roughly spoken, we will get a contradiction because (33) says that in a feasible basic solution
increasing a non-basic variable in xF \{π} or decreasing xπ (to something negative!) will not
55
improve the result. On the other hand, (34) says that increasing xα while decreasing xπ (again
to something negative) will improve the result.
We will formalize this statement by considering the following auxiliary linear program:
max ct x
s.t. Ax = b
xF \{π} ≥ 0 (35)
xπ ≤ 0
xN \F = 0
≥0 if j ∈ F \ {π}
xj
≤0 if j = π
56
4.3 Efficiency of the Simplex Algorithm
We have seen that Bland’s rule guarantees that the Simplex Algorithm will terminate. What
can we say about the running time? Consider for some ϵ with 0 < ϵ < 12 the following example:
max xn
−x1 ≤ 0
x1 ≤ 1
ϵxj−1 − xj ≤ 0 for j ∈ {2, . . . , n}
ϵxj−1 + xj ≤ 1 for j ∈ {2, . . . , n}
Of course, adding non-negativity constraints for all variables would not change the problem.
The polyhedron defined by these inequalities is called Klee-Minty cube (Klee and Minty
[1972]). It turns out that the Simplex Algorithm with Bland’s rule (depending on the initial
solution) may consider 2n bases before finding the optimum solution. In particular, this example
shows that we don’t get a polynomial-time algorithm.
The bad news is that for any of the above pivot rules instances have been found where the
Simplex Algorithm with that particular pivot rule has exponential running time (see e.g.
Goldfarb and Sit [1979] for the steepest edge rule).
Assume that you are given an optimum pivot rule that guides you to an optimum solution
with a smallest possible number of iterations. Then, the number of iterations depends on the
following property of the instances:
Obviously, if we don’t make any assumptions on the starting solution, the number of iterations
performed by the Simplex Algorithm optimizing over a polyhedron P will be at least the
combinatorial diameter of P , even with an optimum pivot rule.
It is an open question what the largest combinatorial diameter of a d-dimensional polyhedron
with n facets is. In 1957, W. Hirsch conjectured that the combinatorial diameter could be
at most n − d. This conjecture was open for decades but it has been disproved by Santos
[2011] who showed that there is a 20-dimensional polyhedron with 40 facets and combinatorial
diameter 21. More generally, he proved that there are counter-examples to the Hirsch conjecture
with arbitrarily many facets. Nevertheless, it is still possible that the combinatorial diameter
is always polynomially (or even linearly) bounded in the dimension and the number of facets.
An upper bound for the combinatorial diameter is O(n2+log d ), which was proven by Kalai and
Kleitman [1992]. This bound has been improved by Todd [2014] to O((n − d)log d ). For an
57
overview of this topic see Section 3.3 of Ziegler [2007].
In practical experiments, the Simplex Algorithm typically turns out to be very efficient. It
could also be proved that the average running time (with a specified probabilistic model) is
polynomial (see Borgwardt [1982]). Moreover, Spielmann and Teng [2005] have shown that the
expected running time on a slight perturbation of a worst-case instance can be bounded by a
polynomial (“smoothed analysis”, see also Dadush and Hulberts [2019])
If the linear program max{ct x | Ax = b, x ≥ 0} is feasible and bounded then the Simplex
Algorithm does not only provide an optimum primal solution but we can also get an optimum
solution of the dual linear program min{bt y | At y ≥ c}. To see this, let B the feasible basis
corresponding to the optimum computed by the Simplex Algorithm. Set ỹ = A−t B cB (where
A−tB = (A t −1
B ) ). This leads to A t
B ỹ = c B and At
N ỹ = A t
A −t
N B B c ≥ c N where the last inequality
t −1 t
follows from the fact that in T (B) we have 0 ≥ r = cN − (cB AB AN ) . So the vector ỹ is feasible
for the dual LP, and it is an optimum solution because together with the (primal) basic solution
x̃ for the basis B, it satisfies the complementary slackness condition (ỹ t A − ct )x̃ = 0.
In fact, the condition r ≤ 0 in the simplex tableau T (B) guarantees the existence of a dual
solution y with y t AB = ctB . In the Dual Simplex Algorithm, we start with a feasible basic
dual solution, i.e. a feasible dual solution for which a basis B exists with y t AB = ctB . If ctB A−1
B
is a feasible dual solution, we call B a dual feasible basis. Then, we compute the corresponding
simplex tableau T (B) (which exists for any basis not just a feasible basis). Thus the vector r
will have no positive entry. Note that B may not be feasible, so entries of p can be negative.
Now the algorithm swaps elements between the basis and the rest of the variables similarly to
the simplex algorithm but instead of keeping p non-negative it keeps r non-positive.
For any basis B such that in T (B) the vector r has no positive entry, the following properties
(that are easy to prove) are the basis of the Dual Simplex Algorithm:
58
• z0 is the current solution value of the dual solution.
• If there is a β ∈ B with pβ < 0 such that qβj ≤ 0 for all j ∈ N , then the primal LP is
infeasible.
r
• For β ∈ B with pβ < 0 and α ∈ N with qβα > 0 with qrβα α
≥ qβjj for all j ∈ N with qβj > 0,
then (B \ {β}) ∪ {α} is a dual feasible basis. Then the value of the dual solution is changed
−p
by qβαβ rα . In particular, if rα ̸= 0 then the value of the dual solution gets smaller.
The Dual Simplex Algorithm simply applies the exchange steps in the last item until we
get a feasible basis. The algorithm can be considered as the Simplex Algorithm applied to
the dual LP. Thus it can also run into cycling and its efficiency is not better then the efficiency
of the Simplex Algorithm.
However, in some applications, the Dual Simplex Algorithm is very useful: If you add
an additional constraint to the primal LP, then a primal solution can become infeasible, so in
the Primal Simplex Algorithm we have to start from scratch. However, the dual solution
is still feasible. It is possibly not optimal but often it can be made optimal with just some
iterations of the Dual Simplex Algorithm.
The Network Simplex Algorithm can be seen as the Simplex Algorithm applied to
Min-Cost-Flow-Problems. Even for this special case, we cannot prove a polynomial running
time but it turns out that, in practice, the Network Simplex Algorithm is among the
fastest algorithms for Min-Cost-Flow-Problems. Though it is a variant of the Simplex
Algorithm, it can be described as a pure combinatorial algorithm.
•
P P
e∈δ + (v) f (e) −
G e∈δ − (v) f (e) = b(v) for all v ∈ V (G).
G
Notation: We call b(v) the balance of v. If b(v) > 0, we call it the supply of v, and if b(v) < 0,
we call it the demand of v. Nodes v of G with b(v) > 0 are called sources, nodes v with
b(v) < 0 are called sinks.
During this chapter, n is always the number of nodes and m the number of edges of the graph
G.
59
Minimum-Cost Flow Problem
↔ ↔
Definition 16 Let G be a directed graph. We define the graph G by V (G) = V (G) and
↔ ← ←
E(G) = E(G)∪{ ˙ e | e ∈ E(G)} where e is an edge from w to v if e is an edge from v to
← ↔
w. e is called the reverse edge of e. Note that G may have parallel edges even if G does
not contain any parallel edges. If we have edge costs c : E(G) → R these are extended
↔ ←
canonically to edges in E(G) by setting c( e ) = −c(e).
Let (G, u, b, c) be an instance of the Minimum-Cost Flow Problem and let f be a
b-flow in (G, u). Then, the residual graph Gu,f is defined by V (Gu,f ) := V (G) and
← ↔
E(Gu,f ) := {e ∈ E(G) | f (e) < u(e)}∪{ ˙ e ∈ E(G) | f (e) > 0}. For e ∈ E(G) we define
←
the residual capacity of e by uf (e) = u(e) − f (e) and the residual capacity of e by
←
uf ( e ) = f (e).
The residual graph contains the edges where flow can be increased as forward edges and edges
where flow can be reduced as reverse edges. In both cases, the residual capacity is the maximum
value by which the flow can be modified. If P is a subgraph of the residual graph, then an
augmentation along P by γ means that we increase the flow on forward edges in P (i.e. edges
in E(G) ∩ E(P )) by γ and reduce it on reverse edges in P by γ. Note that the resulting mapping
is only a flow if γ is at most the minimum of the residual capacity of the edges in P .
60
Lemma 40 Let (G, u, b, c) be an instance of the Minimum-Cost Flow Problem. A
b-flow f is a spanning tree solution if and only if x̃ ∈ RE(G) with x̃e = f (e) is a vertex of
the polytope
X X
E(G)
x∈R | 0 ≤ xe ≤ u(e) (e ∈ E(G)), xe − xe = b(v) (v ∈ V (G)) . (36)
+ −
e∈δ (v) e∈δ (v)
Proof: “⇒:” Let f be a spanning tree solution and x̃ ∈ RE(G) with x̃e = f (e) for e ∈ E(G).
Consider all inequalities xe ≥ 0 for edges e with f (e) = 0, xe ≤ u(e) for edges e with f (e) = u(e)
and for each connectedP componentPof (V (G), {e ∈ E(G) | 0 < f (e) < u(e)}) for all but one
vertex the equation e∈δ+ (v) xe − e∈δ− (v) xe = b(v). These are |E(G)| linearly independent
inequalities that are fulfilled with equality by x̃. Hence x̃ is a vertex.
“⇐:” Let f be a b-flow. Assume that x̃ ∈ RE(G) with x̃e = f (e) is a vertex of the polytope (36).
Assume that (V (G), {e ∈ E(G) | 0 < f (e) < u(e)}) contains an undirected cycle C. Choose
an ϵ > 0 such that ϵ ≤ min{min{f (e), u(e) − f (e)} | e ∈ E(C)}. Fix one of the two possible
orientations of C. We call an edge of C a forward edge if its orientation is the same as the
chosen orientation, otherwise it is called backward edge. Set x′e = ϵ for all forward edges and
x′e = −ϵ for all backward edges. For all edges e ∈ E(G) \ E(C), we set x′e = 0. Then x̃ + x′
and x̃ − x′ belong to the polytope (36) and x̃ = 12 ((x̃ + x′ ) + (x̃ − x′ )), so by Proposition 25, x̃
cannot be a vertex. Hence, we have a contradiction. 2
Proof: Since the polyhedron (36) is in fact a polytope, it is pointed, so there is an optimum
solution that is a vertex. Together with Lemma 40, this proves the statement. 2
61
Definition 18 Let (G, u, b, c) be an instance of the Minimum-Cost Flow Problem
where we assume that G is connected. A spanning tree structure is a quadruple
(r, T, L, U ) where r ∈ V (G), E(G) = T ∪˙ L ∪˙ U , |T | = |V (G)| − 1, and (V (G), T ) does
not contain any undirected cycle.
The b-flow f associated to the spanning tree structure (r, T, L, U ) is defined by
• f (e) = 0 for e ∈ L,
• f (e) = v∈Ce b(v) + e′ ∈U ∩δ− (Ce ) u(e′ ) − e′ ∈U ∩δ+ (Ce ) u(e′ ) for e ∈ T where we
P P P
G G
denote by Ce vertex set of the the connected component of (V (G), T \ {e}) containing
v (for e = (v, w)).
Let (r, T, L, U ) be a spanning tree structure and f the b-flow associated to it. The structure
(r, T, L, U ) is called feasible if 0 ≤ f (e) ≤ u(e) for all e ∈ T .
An edge (v, w) ∈ T is called downward if v is on the undirected r-w-path in T , otherwise
is is called upward.
A feasible spanning tree structure (r, T, L, U ) is called strongly feasible if 0 < f (e) for
every downward edge e ∈ T and f (e) < u(e) for every upward edge e ∈ T (where f is
again the b-flow associated to (r, T, L, U )).
We call the unique function π : V (G) → R with π(r) = 0 and cπ (e) := c(e)+π(v)−π(w) = 0
for all e = (v, w) ∈ T the potential associated to the spanning tree structure
(r, T, L, U ).
Remarks:
• Obviously, the b-flow associated to the spanning tree structure (r, T, L, U ) fulfills the flow
conservation rule, but it may be infeasible.
↔ ↔
• π(v) is the length of the r-v-path in (G, c ) consisting of edges of T and their reverse
edges, only.
• In a strongly feasible tree structure, we can send a positive flow from each vertex v to r
along tree edges such that that the new flow remains non-negative and fulfills the capacity
constraints.
62
Proof: Since the potential π just encodes the distances to r in T , a breadth-first search in
the edges of T and the reverse edges of T is sufficient.
We can compute f by scanning the vertices in an order of non-increasing distance to r in T . 2
Proposition 43 Let (r, T, L, U ) be a feasible spanning tree structure and π the potential
associated to it. If cπ (e) ≥ 0 for all e ∈ L and cπ (e) ≤ 0 for all e ∈ U , then the b-flow
associated to (r, T, L, U ) is optimum.
Proof: The flow associated to (r, T, L, U ) is a basic solution of the standard linear program-
ming formulation for the minimum-cost flow problem. The criterion in the proposition is
equivalent to the statement that the reduced costs of all non-basic variables are non-positive.
This is equivalent to the optimality of the solution. 2
↔ ←
For an edge e = (v, w) ∈ E(G) \ T with e ̸∈ T , we call e together with the w-v path consisting
of edges of T and reverse edges of edges of T only, the fundamental circuit of e. The vertex
closest to r in the fundamental circuit is called the peak of e.
Algorithm 2 gives a summary of the Network Simplex Algorithm. As an input, we need
a strongly feasible tree structure. However, even if there is a feasible b-flow, such a strongly
feasible tree structure may not exist. But we can modify the instance such that we can easily
find a strongly feasible tree structure (r, T, L, U ). We add artificial expensive edges between r
and all other nodes. For each sink v ∈ V (G) \ {r}, we add an edge (r, v) with u((r, v)) = −b(v).
For all other nodes v ∈ V (G) \ {r} we add an edges (v, r) with u((v, r)) = b(v) + 1. Then,
we get a strongly feasible spanning tree structure by setting L to the set of all old edges (i.e.
without the artificial edges connecting r) and by setting U = ∅. If the weight on the artificial
edges is high enough (1 + n maxe∈E(G) |c(e)| would be sufficient) and there is a solution that
does not use these edges at all, no optimum solution will send flow along these new edges, so
the new instance is equivalent.
Proof: It is easy to check that after the modification in the lines 7 to 14 f and π are still the
b-flow and the potential associated to (r, T, L, U ).
We will show that the spanning tree structure (r, T, L, U ) remains strongly feasible. By the
choice of γ in line 5 it remains feasible.
For an edge e = (v, w) on T let ẽ = (v, w) if e is an upward edge and ẽ = (w, v) if e is a
downward edge. We have to show that after an iteration of the algorithm, for all edges e ∈ T ,
the edge ẽ has a positive residual capacity. This is obvious for all edges outside C. For the edges
on the path on C from the head of e′ to the peak of C, this is also obvious because we augment
63
Algorithm 2: Network Simplex Algorithm
Input: An instance (G, u, b, c) of the Minimum-Cost Flow Problem and a strongly
feasible spanning tree structure (r, T, L, U ).
Output: A minimum-cost flow f .
1 Compute the b-flow f and the potential π associated to (r, T, L, U );
2 Let e0 be an edge with e0 ∈ L and cπ (e0 ) < 0 or an edge with e0 ∈ U and cπ (e0 ) > 0;
3 if No such edge exists then
return f
←
4 Let C be the fundamental circuit of e0 (if e0 ∈ L) or of e0 (if e0 ∈ U ) and let ρ = cπ (e0 );
5 Let γ = mine′ ∈E(C) uf (e′ ), and let e′ the last edge where this minimum is attained when
C is traversed (starting at the peak);
←
6 Let e1 be the corresponding edge in the input graph, i.e. e′ = e1 or e′ =e1 ;
7 Remove e0 from L or U ;
8 Set T = (T ∪ {e0 }) \ {e1 };
9 if e′ = e1 then
Set U = U ∪ {e1 };
10 else
Set L = L ∪ {e1 };
11 Augment f along γ by C;
12 Let X be the connected component of (V (G), T \ {e0 }) that contains r;
13 if e0 ∈ δ + (X) then
Set π(v) = π(v) + ρ for v ∈ V (G) \ X;
14 if e0 ∈ δ − (X) then
Set π(v) = π(v) − ρ for v ∈ V (G) \ X;
15 go to line 2;
by γ = uf (e′ ) which is smaller than the residual capacities on this path (by the choice of e′ ).
For the remaining edges e on C − e′ , the residual capacity uf (ẽ) is, after the augmentation, at
least γ. Thus, is if γ > 0, we are done. But if γ = 0, then e′ must be on the path from the peak
to e0 , so for the edges e on the path from the peak to the tail of e′ we had uf (ẽ) > 0 before
the augmentation (because (r, T, L, U ) was strongly feasible), so this is still the case after the
augmentation.
We will show that we never consider the same spanning tree structure twice. In each iteration,
the cost of the flow is reduced by γ|ρ|, so if γ > 0, then we are done. Hence assume that γ = 0.
− +
P e0 ̸= e1 (since all capacities are positive). Moreover, e0 ∈ L ∩ δ (X) or e0 ∈ U ∩ δ (X),
Then
so v∈V (G) π(v) will get larger (and it will never get smaller). This shows that we can never
get the same spanning tree structure twice.Since there is only a finite number of spanning tree
structures, this proves that the algorithm will terminate after a finite number of iterations.
By Proposition 43, the output of the algorithm is optimal when the algorithm terminates. 2
64
5 Sizes of Solutions
Before we will describe polynomial-time algorithms for solving linear programs we have to make
sure that we can store the output and all intermediate results with numbers whose sizes are
polynomial in the input size. To this end we have to define the size of numbers. Assuming that
all numbers are given in a binary representation, we define for
• r= p
q
with p, q ∈ Z, relatively prime: size(r) := size(p) + size(q),
Remark: In order to get a description of a fraction r of with size(r) bits, we have to write
r as pq for numbers p, q ∈ Z that are relatively prime. Therefore, in any computation, when a
fraction pq arises, we apply the Euclidean Algorithm to p and q and divide p and q by their
greatest common divisor. The Euclidean Algorithm has polynomial running time, so during
any algorithm, we can assume that any fraction r is stored by using just size(r) bits.
Proof: Both statements are obvious if the numbers r1 , . . . , rn are integers. Hence assume that
ri = pqii for non-zero numbers pi and qi that are relatively prime (i = 1, . . . , n).
n
n
n
n n n
Q Q Q P P P
(a) size ri ≤ size pi + size qi ≤ size(pi ) + size(qi ) = size(ri ).
i=1 i=1 i=1 i=1 i=1 i=1
!
n
Q n
P n
P n
P Q
(b) We have size qi ≤ size(qi ) ≤ size(ri ), and size pi qj ≤
i=1 i=1 i=1 i=1 j∈{1,...,n}\{i}
!
n n n n n
1
P Q P P P Q
size |pi | qj ≤ size(ri ). Since ri = n
Q pi qj , this proves the
i=1 j=1 i=1 i=1 qi i=1 j∈{1,...,n}\{i}
i=1
claim. 2
65
Proposition 46 For x, y ∈ Qn , we have
Proof:
(a) We have
n
X n
X n
X
size(x + y) = n + size(xi + yi ) ≤ n + 2 size(xi ) + 2 size(yi ) = 2(size(x) + size(y)) − 3n.
i=1 i=1 i=1
(b) We have
n
! n n n
!
X X X X
t
size(x y) = size xi yi ≤2 size(xi yi ) ≤ 2 size(xi ) + size(yi )
i=1 i=1 i=1 i=1
= 2(size(x) + size(y)) − 4n.
2
p
Proof: Write the entries aij of A as aij = qijij where pij and qij are relatively prime (i, j =
1, . . . , n). Let det(A) = pq where p and q are relatively prime, too.
Then |det(A)| ≤ ni=1 nj=1 (|pij | + 1) and |q| ≤ ni=1 nj=1 |qij |. Therefore,
Q Q Q Q
size(q) ≤ size(A)
Qn Qn
and |p| = |det(A)||q| ≤ i=1 j=1 (|pij | + 1)|qij | . We can conclude
n X
X n
size(p) ≤ (size(pij ) + 1 + size(qij )) = size(A).
i=1 j=1
66
Proof: By Corollary 20 the maximum of ct x over P = {x ∈ Rn | Ax ≤ b} must be attained in
a minimal face of P . Let F be a minimal face where the maximum is attained. By Proposition 23,
we can write F = {x ∈ Rn | Ãx = b̃} for some subsystem Ãx ≤ b̃ of Ax ≤ b. We can assume
that the rows of à are linearly independent. Choose B ⊆ {1, . . . , n} such that ÃB is a regular
square matrix. Then x ∈ Rn with xB = Ã−1 B b̃ and xN = 0 (with N = {1, . . . , n} \ B) is an
optimum solution of the linear program. By Cramer’s rule the entries of xB can be written
det(Ã )
as xj = det(Ã j ) where Ãj arises from ÃB by replacing the j-th column by b̃. Thus, we have
B
Proof: According to the proof of the previous proposition there is an optimum solution x
such that for each entry xj of x we have size(xj ) ≤ 4n(size(A) + size(b)). Since every positive
number smaller than 2−4n(size(A)+size(b)) has a size larger than 4n(size(A) + size(b)), this proves
the claim. 2
Assume that we want solve an equation system Ax = b. We can do this by applying the Gaussian
Elimination. This algorithm performs three kinds of operations to the matrix A:
It should be well-known (see e.g. textbooks Hougardy and Vygen [2018] or Korte and Vygen
[2018]) that with these steps O(mn(rank(A) + 1)) elementary arithmetical operations are
sufficient to transform A into an upper (right) triangular matrix. Then it is easy to check if the
equation system is feasible, and, in case that it is feasible, to compute a solution. However, in
order to show that Gaussian Elimination is a polynomial-time algorithm, we have to show that
the numbers that arise during the algorithm aren’t too big.
The intermediate matrices that occur during the algorithm are of the type
B C
, (37)
0 D
67
where B is an upper triangular matrix. Then, an elementary step of the Gaussian Elimination
consist of choosing a non-zero entry of D (called pivot element; if no such entry exists, we are
done) and to swap rows and/or columns such that this element is at position (1, 1) of D. Then
we add a multiple of the first row of D to the other rows of D such that the entry at position
(1, 1) is the only non-zero entry of the first column of D.
We want to prove that the numbers that occur during the algorithm can be encoded using a
polynomial number of bits. We can assume that we don’t need any swapping operation because
swapping columns or rows doesn’t change the numbers in the matrix.
B C
Assume that our current matrix is à = where B is a k × k-matrix. Then for each
0 D
entry dij of D we have
det(Ã1,...,k,k+i 1,...,k
1,...,k,k+j ) = dij · det(Ã1,...,k ). (38)
where Mji11,...,j
,...,it
t
denotes the submatrix of a matrix M induced by the rows i1 , . . . , it and the
columns j1 , . . . , jt . To see the correctness of (38), apply Laplace’s formula to the last row of
Ã1,...,k,k+i
1,...,k,k+j which contains dij as the only non-zero element. Since the determinant does not
change if we add the multiple of a row to another row, this leads to
det(A1,...,k,k+i
1,...,k,k+j )
dij =
det(A1,...,k
1,...,k )
By Proposition 47 and Proposition 45, this implies size(dij ) ≤ 4 size(A). Since all entries of the
matrix occur as entries of such a matrix D, this shows that the sizes of all numbers that are
considered during the Gaussian Elimination are bounded by 4size(A).
Note that we have to apply the Euclidean Algorithm to any intermediate result in order to
get small representations of the numbers. But this is not a problem because the Euclidean
Algorithm is polynomial as well.
Finally, we get the result:
In particular this result shows that the following problems can be solved with a polynomial
running time:
• Solving a system of linear equations.
• Computing the determinant of a matrix.
• Computing the rank of a matrix.
• Computing the inverse of a regular matrix.
• Checking if a set of rational vectors is linearly independent.
68
6 Ellipsoid Method
The Ellipsoid Method (proposed by Khachiyan [1979]) was the first polynomial-time algorithm
for linear programming. The algorithm solves the problem of finding a feasible solution of a
linear program. As we have seen in Section 2.4, this is sufficient to solve as well the optimization
problem.
E = {M x + s | x ∈ B n }
But (using the previous remark) this is equivalent to the statement that there is a positive
definite n × n-matrix Q and a vector s ∈ Rn such that E = {x ∈ Rn | (x − s)t Q−1 (x − s) ≤ 1}.
2
The Ellipsoid Algorithm just finds an element in an polytope or ends with the assertion
that the polytope is empty. On the other hand, it can be applied to more general sets K ⊆ Rn
69
provided that K is a compact convex set and that for any x ∈ Rn \ K we can find a half-space
containing K such that x is on the border of the half-space.
Basically, the algorithms works as follows: We always keep track of an ellipsoid containing K.
Then we check if the center c of the ellipsoid is contained in K. If this is the case, we are done.
Otherwise, we compute the intersection X of the ellipsoid and a half-space containing K such
that c is on the border of the half-space. Then, we find a new (smaller) ellipsoid containing X.
For the 1-dimensional space, the ellipsoid method contains the binary search as a special case.
However, for technical reasons, we assume in the following that the dimension of our solution
space is at least 2.
We start with a special case that is easier to handle: We assume that our given ellipsoid is
the ball B n (with radius 1 and center 0). We want to find a small ellipsoid E covering the
intersection of B n with the half-space {x ∈ Rn | x1 ≥ 0} (the gray area in Figure 7).
(0, 1)
B2
(c, 0) (1, 0)
E
(0, −1)
For symmetry reasons, we choose the center of the new smaller ellipsoid on the vector e1 at a
position c · e1 (where c is still to be determined). Our candidates for the ellipsoid are of the form
( n
)
X
E = x ∈ Rn | α2 (x1 − c)2 + β 2 x2i ≤ 1
i=2
1
where we also have to choose α and β. The matrix Q is then a diagonal matrix with entry α2
at position (1, 1) and β12 on all other diagonal positions.
To keep E small, we want e1 to lie on the border of E. This condition leads to α2 (1 − c)2 = 1
and hence
1
α2 = . (39)
(1 − c)2
70
be on the border of E. This condition leads to α2 c2 + β 2 = 1 and thus
c2 1 − 2c
β 2 = 1 − α 2 c2 = 1 − 2
= . (40)
(1 − c) (1 − c)2
p
The volume of an ellipsoid E = {x ∈ Rn | (x−s)t Q−1 (x−s) ≤ 1} is vol(E) = det(Q)×vol(B n )
(a result from measure theory, see e.g. Proposition 6.1.2 in Cohn [1980]).
p
Therefore, our goal is to choose α, β and c in such a way that det(Q) = α−1 β −(n−1) is
minimized.
(1−c)2n
Thus, we want to find a c minimizing (1−2c)n−1
.
2n 2n 2n−1
d (1−c)
We have dc (1−2c)n−1
= 2(n−1)(1−c)
(1−2c)n
− 2n(1−c)
(1−2c)n−1
which is zero if 2(n−1)(1−c)
1−2c
= 2n. This leads to
2(n − 1) − 2c(n − 1) = 2n − 4cn and c(2n − (n − 1)) = 1. Thus, we minimize the volume by
1
setting c = n+1 .
(n+1)2 n2 −1
Then, α2 = n2
and β 2 = n2
.
1 + x ≤ ex for any x ∈ R. 2
71
Lemma 53 (Half-Ellipsoid Lemma) Let E = p + {x ∈ Rn | xt Q−1 x ≤ 1} be an ellipsoid
and a ∈ Rn with at Qa = 1. Then,
2
n t t 1 ′ n n −1 t −1 2 t
E ∩{x ∈ R | a x ≥ a p} ⊆ E = p+ Qa+ x ∈ R | x Q + aa x ≤ 1 .
n+1 n2 n−1
1
vol(E ′ )
Moreover, vol(E)
≤ e− 2(n+1) .
E ∩ {x ∈ Rn | at x ≥ at p}
= (p + M B n ) ∩ {x ∈ Rn | at x ≥ at p}
= p + (M B n ∩ {x ∈ Rn | at (x + p) ≥ at p})
= p + (M B n ∩ {x ∈ Rn | at x ≥ 0})
= p + M (B n ∩ M −1 {x ∈ Rn | at x ≥ 0})
= p + M (B n ∩ {x ∈ Rn | at M x ≥ 0})
= p + M (B n ∩ {x ∈ Rn | et1 x ≥ 0})
2
1 n n −1 t 2
Lem. 52
t
⊆ p+ M e1 + M x ∈ R | x In + e1 e x ≤ 1
n+1 n2 n−1 1
2
1 n n −1 −1 t 2 t −1
= p+ M e1 + x ∈ R | (M x) In + e1 e M x ≤ 1
n+1 n2 n−1 1
2
1 n n −1 t −1 2 t
= p+ Qa + x ∈ R | x Q + aa x ≤ 1
n+1 n2 n−1
n o
We can write the ellipsoid E ′ in standard form as E ′ = p + 1
n+1
Qa + x ∈ Rn | xt Q̃−1 x ≤ 1
2
with Q̃ = n2n−1 Q − n+1
2
Qaat Qt because
n2 − 1 n2
2 −1 t 2 t t
Q + aa Q− Qaa Q
n2 n−1 n2 − 1 n+1
2 2 4
= In − aat Qt + aat Q − 2 a at Qa at Qt
n+1 n−1 n − 1 | {z }
=1
= In .
q
vol(E ′ )
Therefore, vol(E) = det( Q̃)
det(Q)
.
2 n n
n2 n2
We have det( Q̃) n 2 2
det(Q)
= det n2 −1
In − n+1
aat Qt = n2 −1
det In − n+1
aat Qt = n2 −1
(1 −
2
n+1
). To see the last equality note that the matrix aat Qt has eigenvalue 1 for the eigenvector
72
a (because at Qt a = 1) while all other eigenvalues are zero (the rank of aat Qt is 1).q Since the
determinant is the product of all eigenvalues, this implies the last equation. Hence, det( Q̃)
det(Q)
≤
2 n2 n−1
1
≤ e− 2(n+1) (see the proof of the Half-Ball Lemma for
n 2 1 n n2 2
n2 −1
(1 − n+1 ) 2 = n+1 n2 −1
details of the last steps). 2
n o 2
Remark: The ellipsoid E ′ = p+ n+1 1
Qa+ x ∈ Rn | xt Q̃−1 x ≤ 1 with Q̃ = n2n−1 Q − n+12
Qaat Qt
is called Löwner-John ellipsoid. It is in fact the smallest ellipsoid containing E ∩ {x ∈ Rn |
at x ≥ at p}.
A separation oracle for a convex set K ⊆ Rn is a black-box algorithm which, given x ∈ Rn ,
either returns an a ∈ Rn with at y > at x for all y ∈ K or asserts x ∈ K.
Observation: Given A ∈ Qm×n and b ∈ Qm , a separation oracle for {x ∈ Rn | Ax ≤ b} can be
implemented in O(mn) arithmetical operations.
Proof: As an invariant, we will prove that during the k-th iteration of the algorithm, the
set K is contained in the set pk + {x ∈ Rn | xt A−1k x ≤ 1}. For k = 0, this is true because R is
big enough. For the step from k to k + 1, we apply the Half-Ellipsoid Lemma (Lemma 53) to
t
Q = Ak and a = √ āt (this scaling leads to at Ak a = āāt A k ā
Ak ā
= 1).
ā Ak ā
73
We have vol({x ∈ Rn | xt x ≤ R2 }) ≤ vol([−R, R]n ) = 2n Rn , and in each iteration, the
1
− 2(n+1)
volume of Ek = {x ∈ Rn | xt A−1
k x ≤ 1} is reduced at least by the factor e , so we get
k
− 2(n+1)
vol(Ek ) ≤ e 2n Rn .
k
Thus, we have to find a smallest k such that e− 2(n+1) 2n Rn ≤ ϵ which is equivalent to 2(n+1)k
≥
2n Rn 1 1
ln ϵ and k ≥ 2(n + 1)(n ln(2R) + ln( ϵ )). This shows that O(n(n ln(R) + ln( ϵ ))) iterations
are sufficient. 2
We cannot compute square roots exactly, so during the algorithm, we have to work with rounded
intermediate solutions. Let pek and Ãk be the exact values and pk and Ak be the rounded values
(and the same for the corresponding ellipsoids Ẽk and Ek ). Note that pek and Ãk are based on
the rounded values pk−1 and Ak−1 .
Let δ be an upper bound on the maximum absolute rounding error for the entries in pek and
Ãk , so ∥pk − pek ∥∞ ≤ δ and ∥Ak − Ãk ∥∞ ≤ δ. So δ (that will be defined later) describes the
precision of the rounding. When we round the entries in Ãk , we do it in such a way that the
matrix remains symmetric. Let Γk = Ak − Ãk and ∆k = pk − pek .
In the following, we write by ∥∥˙ the Euclidean norm for vectors and the induced operator norm
for the matrices. When considering matrices, we often make use of the fact that the Frobenius
norm is an upper bound for the operator norm induced by the Euclidean norm.
−1
For any x ∈ K we can assume that (x − pek )t Ãk (x − pek ) ≤ 1 and we want to prove the same
for pk and Ak . To this end, we have to increase the ellipsoid slightly by scaling Ãk .
−1 −1
We have (x − pk )t A−1 t
k (x − pk ) = (x − pk ) Ãk (x − pk ) + (x − pk )t (A−1
k − Ãk )(x − pk ). We
analyze the two summands separately:
−1 −1 −1 −1
(x − pk )t Ãk (x − pk ) = (x − p˜k )t Ãk (x − p˜k ) + |2∆tk Ãk (x − p˜k )| + ∆tk Ãk ∆k
−1 −1
≤ 1 + 2∥∆k ∥ · ∥Ãk ∥ (R + ∥p˜k ∥) + ∥∆k ∥2 · ∥Ãk ∥ (41)
√ −1 −1
≤ 1 + 2 nδ ∥Ãk ∥ (R + ∥p˜k ∥) + nδ 2 ∥Ãk ∥.
And:
−1 −1
(x − pk )t (A−1
k − Ãk )(x − pk ) ≤ ∥x − pk ∥2 · ∥A−1
k − Ãk ∥
−1 −1
≤ (R + ∥pk ∥)2 ∥Ak (Ak − Ãk )Ãk ∥
−1 (42)
≤ (R + ∥pk ∥)2 ∥A−1
k ∥ · ∥Ãk ∥ · ∥Γk ∥
2 −1 −1
≤ (R + ∥pk ∥) ∥Ak ∥ · ∥Ãk ∥ · nδ
1
We adjust Ãk by multiplying it by µ = 1 + 2n(n+1)
, so we replace Ãk by µÃk (which we call Ãk
again). Then
−1 1 2n(n + 1) 1
(x − p̃k )t Ãk (x − p̃k ) = 1 = < 1 − . (43)
1+ 2n(n+1)
2n2 + 2n + 1 4n2
74
and (E
]k+1 also refers to the scaled version of Ãk ):
n2
vol(E
] k+1 ) 1
− 2(n+1) 1 1 1 1
≤e 1+ ≤ e− 2(n+1) e 4(n+1) = e− 4(n+1) . (44)
vol(Ek ) 2n(n + 1)
Thus,
q
vol(Ek+1 ) vol(E
]k+1 ) vol(Ek+1 ) 1 −1
= ≤ e− 4(n+1) det(Ak+1 A
]k+1 ) (45)
vol(Ek ) vol(Ek ) vol(E]k+1 )
We have
−1 −1
det(Ak+1 A
] k+1 ) = det In + (Ak+1 − A
] k+1 )A]k+1
(∗) −1
n
≤ ∥In + (Ak+1 − A ] k+1 )Ak+1 ∥
]
−1
n
≤ (1 + ∥Γk+1 ∥ · ∥A
] k+1 ∥)
−1
n
≤ (1 + nδ∥A
]k+1 ∥)
−1
2 δ∥A ∥
≤ en
^ k+1 ,
Qn
where inequality (∗) follows from Hadamard’s inequality (| det(A)| ≤ i=1 ∥ai ∥ for an n × n-
matrix with columns a1 , . . . , an , see exercises).
This implies
vol(Ek+1 ) 1 1 2
−1
≤ e− 4(n+1) · e 2 n δ∥Ak+1 ∥ .
^
vol(Ek )
−1 1
Hence, if we had 12 δ∥A
] k+1 ∥ <
1
8(n+1)3
, then we had vol(Ek+1 )
vol(Ek )
< e− 8(n+1) .
Therefore, and by equations (41) and (42) our goal is to choose δ such that we get the following
inequalities:
√ fk −1 ∥ (R + ∥p˜k ∥) + nδ 2 ∥A
fk −1 ∥ + (R + ∥pk ∥)2 ∥A−1 ∥ · ∥A
fk −1 ∥nδ ≤
• 2 nδ ∥A k
1
4n2
−1
• δ∥A
]k+1 ∥ ≤
1
4(n+1)3
1
Proposition 55 Assume that δ is chosen such that δ ≤ 12n4k
in iteration k of the
Ellipsoid Method. Then, we have:
(c) ∥Ak ∥ ≤ R2 2k , ∥A
fk ∥ ≤ R2 2k .
75
Proof: We have
n2 − 1 1 āāt
−1 2
A
]k+1 = A−1
k + .
n2 µ n − 1 āt Ak ā
−1
Thus, as a sum of a positive definite matrix and a positive semidefinite matrix A
]k+1 is positive
n2 2 t
definite. Therefore Ak+1 = n2 −1 µ(Ak − n+1 bk bk ) is positive definite.
]
Thus,
n2 − 1 1 āāt
−1 2
∥A
]k+1 ∥≤ ∥A−1
k ∥+ ∥ t ∥ ≤ 3∥Ak −1 ∥
n2 µ n − 1 ā Ak ā
Let λ be a smallest eigenvalue of Ak+1 and v a vector with ∥v∥ = 1 such that λ = v t Ak+1 v.
Then:
v t Ak+1 v ≥ v t A
]k+1 v − nδ
≥ min{ut A ] n
k+1 u | u ∈ R , ∥u∥ = 1} − nδ
1
≥ −1 − nδ
∥Ak+1 ∥
]
1
≥ − nδ
3∥Ak −1 ∥
1 1
≥ − nδ
3 R−2 4k
1
≥ ,
R−2 4k+1
provided that:
R2
1 1
nδ ≤ − . (46)
3 4 4k
n2
∥Ak+1 ∥ ≤ ∥A
] k+1 ∥ + ∥Γk+1 ∥ ≤ 2−1
µ ∥Ak ∥ + nδ ≤ R2 2k+1
n
| {z }
≤ 32
n2
We also get ∥A
]k+1 ∥ ≤ n2 −1
µ∥Ak ∥ ≤ R2 2k+1 , so we have proved (c).
76
We can write Ak = M M t with a regular matrix M . Then,
r s
∥Ak ā∥ t
ā Ak Ak ā (M t ā)t Ak (M t ā) p k
∥bk ∥ = √ t = t
= t t t
≤ ∥Ak ∥ ≤ R2 2 , (47)
ā Ak ā ā Ak ā (M ā )(M ā)
where the first inequality follows from the fact that ∥Ak ∥ = max{xT Ak x | ∥x∥ = 1} because Ak
is positive semidefinite (see exercises).
Therefore, we get by induction (using the fact that p0 = 0)
1 √ k √ k 1
∥pk+1 ∥ ≤ ∥pk ∥ + ∥bk ∥ + nδ ≤ ∥pk ∥ + R2 2 + nδ ≤ R2k + R2 2 + √ k ≤ R2k+1 .
n+1 3 n4
−1
Lemma 56 Let δ be positive with δ < 26(N (R,ϵ)+1) 16n3 where N (R, ϵ) :=
1
⌈8(n + 1)(n ln(2R) + ln( ϵ ))⌉. Then, in iteration k of the Ellipsoid Algorithm, we
k
have K ⊆ pk + Ek and vol(Ek ) < e− 8(n+1) 2n Rn .
1 1
Proof: By the choice of δ, we have nδ ≤ 12 4k
.
Moreover,
√ fk −1 ∥ (R + ∥p˜k ∥) + nδ 2 ∥A
fk −1 ∥ +(R + ∥pk ∥)2 ∥A−1 ∥ · ∥A
fk −1 ∥ nδ ≤ δn26k ≤
• 2 nδ ∥A k
1
4n2
| {z } |{z} | {z } |{z} | {z } | {z }
≤R−2 4k ≤R2k ≤R−2 4k ≤R2k ≤R−2 4k ≤R−2 4k
77
−1
• δ ∥A
] ∥≤ 1
| k+1
{z } 4(n+1)3
≤R−2 4k
Hence, by the above analysis, Ek (with rounded numbers) always contains the set K, and
1
is reduced at least by a factor of e− 8(n+1) in each iteration, so after
the volume of Ek
O n n ln R + ln 1ϵ iterations, the algorithm terminates with a correct output. 2
There number of calls of the separation oracle can be reduced to O(n ln( nR
ϵ
)) (see Lee, Sidford,
and Wong [2015] for an algorithm that only needs O(n ln( ϵ )) oracle calls and O(n3 lnO(1) ( nR
nR
ϵ
))
additional time).
We first want to use the Ellipsoid Algorithm just to check if a given polyhedron P is empty.
This can be done directly, provided that P is in fact a polytope and if we have the assertion
that if P is non-empty, its volume cannot be arbitrarily small. The following proposition implies
that we can assume these properties:
(a) P = ∅ ⇔ PR,ϵ = ∅.
2ϵ
n
(b) If P ̸= ∅, then vol(PR,ϵ ) ≥ n2size(A)
.
Proof:
78
y t A = 0 and y t b = −1. Then, by Proposition 48
min 11t y
At y = 0
bt y = −1
y ≥ 0
has an optimum solution y such that the absolute value of any entry of y is at most
24n(size(A)+size(b)) . Thus, y t (b + ϵ11) < −1 + (n + 1)24n(size(A)+size(b)) ϵ < 0. Again by Farkas’
Lemma, this implies that Ax ≤ b + ϵ11 does not have a feasible solution. In particular,
there is no feasible solution in [−R, R]n , so PR,ϵ = ∅.
(b) If P ̸= ∅, then PR−1,0 ̸= ∅ (with the same proof as in (a) for R). But for any z ∈ PR−1,0 , we
ϵ
have {x ∈ Rn | ||x − z||∞ < n2size(A) } ⊆ PR,ϵ . Hence vol(PR,ϵ ) ≥ vol({x ∈ Rn | ||x − z||∞ <
n
ϵ 2ϵ
2
n2size(A)
}) = n2size(A) .
√
Proof: We can apply the Ellipsoid Algorithm to K = PR,ϵ with R = ⌈ n(1+24n(size(A)+size(b)) )⌉
n −1
and ϵ′ = n2size(A)
2ϵ
(for ϵ = 2n24n(size(A)+size(b)) as a lower bound for the volume). We need
N (R, ϵ′ ) = O(n(n ln(R) + ln( ϵ1′ ))) iterations, which is polynomial in the input size.
Moreover, it is sufficient to set the bound on the absolute rounding error to any value δ <
′ −1
26(N (R,ϵ )+1) 16n3 , so also the number of bits that we have to compute during the algorithm
is polynomial. 2
Proof: By Theorem 59, we can check in polynomial time if a given linear program has a
feasible solution. We will show that this is sufficient for computing a feasible solution if one exists.
Assume that we are given m inequalities ati x ≤ bi with ai ∈ Qn and bi ∈ Q (i ∈ {1, . . . , m}). First
check if the system is feasible. If it infeasible, we are done. Otherwise, perform for i = 1, . . . , m
the following steps: Check if the system remains feasible if we replace ati x ≤ bi by ati x = bi . If
this is the case, replace ati x ≤ bi by ati x = bi . Otherwise, the inequality is redundant, and we
can skip it. We end up with a feasible system of equations with the property that any solution
of this system of equations is also a solution of the given system of inequalities. However, the
system of equations can be solved in polynomial time by using Gaussian Elimination (see
79
Section 5.1). Hence, for any linear program, we can compute in polynomial-time a feasible
solution if one exists.
In Section 2.4 we have seen that the task of computing an optimum solution for a bounded
feasible linear program can be reduced to the computation of a feasible solution of a modified
linear program (see the LP (24)). Thus, we can also compute an optimum solution. 2
Remark: By Proposition 23, the method described in the previous proof computes a solution
in a minimal face of the solution polyhedron P . In particular, if P is pointed, we compute a
vertex of P .
An advantage of the Ellipsoid Algorithm is that it does not necessarily need a complete
description of a solution space K ⊆ Rn but only needs a separation oracle that provides a linear
inequality satisfied by all elements of K but not by a given vector x ∈ Rn \ K. This allows us
to use the method e.g. for linear program with an exponential number of constraints.
Example: Consider the Maximum-Matching Problem. A matching in an undirected
graph is a set M ⊆ E(G) such that |δG (v) ∩ M | ≤ 1 for all v ∈ V (G). In the Maximum-
Matching Problem we are given an undirected graph G and ask for a matching with
maximum cardinality. It can be formulated as the following integer linear program:
P
max x
P e∈E(G) e
e∈δG (v) xe ≤ 1 v ∈ V (G)
xe ∈ {0, 1} e ∈ E(G)
In the LP-relaxation, we simply replace the constraint “xe ∈ {0, 1}” by “xe ≥ 0”. However, this
allows us e.g. in the graph K3 (i.e. the complete graph on three vertices) to set all values xe to
1
2
. To avoid such solutions, we may add the following constraints:
P |U |−1
e∈E(G[U ]) xe ≤ 2
U ⊆ V (G), |U | odd
are indeed the convex combinations of the solutions of the ILP formulation. In other words, the
vertices of the solution polyhedron of the LP are the integer solutions. We won’t prove this
statement here, see Edmonds [1965] for a proof. Hence, solving the linear program would be
sufficient to solve the matching problem. The number of constraints is exponential in the size
of the graph, but the good news is that there is a separation oracle with polynomial running
80
time for this linear program (see Padberg and Rao [1982]). We will see how such a separation
oracle can be used for solving the optimization problem.
In the remainder of this chapter, we always consider closed convex sets K for which numbers r
and R with 0 < r < R2 exist such that rB n ⊆ K ⊆ RB n . We call sets for which such numbers r
and R exist, r-R-sandwiched sets.
We will consider relaxed versions both of linear optimization problems and of separation
problems. In the weak optimization problem we are given a set K ⊆ Rn , a number ϵ > 0
and a vector c ∈ Qn . The task is to find an x ∈ K with ct x ≥ max{ct z | z ∈ K} − ϵ.
In order to apply the Ellipsoid Algorithm directly to an optimization problem, we need the
property that the set of almost optimum solutions cannot have an arbitrarily small volume.
The following lemma guarantees this for r-R-sandwiched sets:
n−1
ϵ 1
rn−1 vol(Bn−1 )
2 ct z
ϵ n−1 ϵ 1
′ n−1
vol(conv(A ∪ {z})) ≥ r vol(Bn−1 )
2ct z 2∥c∥ n
n−1
ϵ 1 ϵ 1
≥ rn−1 n .
2∥c∥R| n 2∥c∥ n
Here we use the fact that conv(A′ ∪ {z}) is an n-dimensional pyramid with height at least ϵ
2∥c∥
n−1 n−1
and a base of ((n − 1)-dimensional) volume 2cϵt z r vol(Bn−1 ). 2
This result allows us to find a polynomial-time algorithm for the weak optimization problem
provided that we can solve the corresponding separation problem efficiently:
81
Proposition 62 Given a polynomial-time separation oracle for an r-R-sandwiched
convex set K ⊆ Rn with running time polynomial in size(R), size(r) and size(x)
(where x is the input vector for the oracle), a number ϵ > 0 and a vector c, there is a
polynomial-time algorithm (w.r.t. size(R), size(r), size(c) and size(ϵ)) that computes a
vector v ∈ K with ct v ≥ sup{ct x | x ∈ K} − ϵ.
Proof: Apply the Ellipsoid Algorithm to find an almost optimum vector in K. Use the
previous lemma that shows that the set of almost optimum vectors in K cannot be arbitrarily
small. 2
A weak separation oracle for a convex set K ⊆ Rn is an algorithm which, given x ∈ Rn
and η with 0 < η < 21 , either asserts x ∈ K or finds v ∈ Rn with v t z ≤ 1 for all z ∈ K and
v t x ≥ 1 − η.
Remark: For the previous proposition, it would be enough to have a weak separation oracle
for K.
Notation: For K ⊆ Rn , we define K ∗ := {y ∈ Rn | y t x ≤ 1 for all x ∈ K}.
Proof: Claim: K ∗∗ = K
Proof of the claim: For x ∈ K, we have y t x ≤ 1 for all y ∈ K ∗ which implies x ∈ K ∗∗ . Therefore,
we have K ⊆ K ∗∗ .
Now let z ∈ Rn \ K. And let w ∈ K be a vector such that ∥z − w∥2 is smallest possible over
vectors in K (w exists because K is convex and closed). Let u = z − w. Then, for all x ∈ K, we
have ut x ≤ ut w < ut z. Moreover, since 0 ∈ K, we have ut w ≥ 0. By scaling u, we can assume
that ut z > 1 while ut x ≤ 1 for all x ∈ K. But then u ∈ K ∗ and ut z > 1 which implies z ̸∈ K ∗∗ .
Thus K ∗∗ ⊆ K. This prove the claim.
Now, let x ∈ Rn be an instance for the weak separation oracle. If x = 0, we can assert x ∈ K,
x
and if ∥x∥ > R we can choose v = ∥x∥ 2 . Therefore, we can assume that 0 < ∥x∥ ≤ R.
We can solve the (strong) separation problem for K ∗ (see the exercises). Since K ∗ is a closed
convex R1 - 1r -sandwiched set, we can apply Proposition 62 to it, and thus, we can solve the weak
η
optimization problem for K ∗ with c = ∥x∥ x
2 and ϵ = R in polynomial time. Thus, we get a vector
xt t xt
v0 ∈ K ∗ with v
∥x∥ 0
x
≥ max{ ∥x∥ v | v ∈ K ∗ } − Rη . If v
∥x∥ 0
≥ 1
∥x∥
− Rη , then v0t x ≥ 1 − η ∥x∥
R
≥ 1 − η,
t
and v0t z ≤ 1 for all z ∈ K (since v0 ∈ K ∗ ). Otherwise max{ ∥x∥
x
v | v ∈ K ∗} ≤ 1
∥x∥
, so
82
max{xt v | v ∈ K ∗ } ≤ 1, which implies x ∈ K ∗∗ . Together with the above claim, this implies
x ∈ K. Therefore, we have a weak separation oracle for K in polynomial running time. 2
It turns out that for rational r-R-sandwiched polyhedra P an exact polynomial-time separati-
on algorithm also provides an exact polynomial-time optimization algorithm, provided that
appropriate bounds on the sizes of the vertices of P are given:
83
84
7 Interior Point Methods
The Ellipsoid Algorithm gives a polynomial-time algorithm for solving linear programs but
in practice it is typically much less efficient than the Simplex Algorithm. In contrast, the
algorithm that we will describe in this section is efficient both in theory and practice.
The term “interior point method” refers to several quite different algorithms. They all have in
common that during the algorithm we always consider vectors in the interior of the polyhedron
of feasible solutions (in contrast to the Simplex Algorithm where we always have vectors
on the border of the polyhedron). Here, we restrict ourselves to one variant and follow the
description by Mehlhorn and Saxena [2015]. The first version of the algorithm has been proposed
by Karmakar [1984].
We consider an LP max{ct x | Ax ≤ b} in standard inequality form.
To simplify the notation, we write the slack variables s explicitly, so we consider the following
problem:
max ct x
s.t. Ax + s = b (48)
s ≥ 0
min bt y
s.t. At y = c (49)
y ≥ 0
Ax + s = b
At y = c
yts = 0 (50)
y ≥ 0
s ≥ 0
Note that y t s = 0 is not a linear constraint. Without this constraint (i.e. for the system
Ax + s = b, At y = c, y ≥ 0, s ≥ 0), the term y t s is exactly the difference between the
85
(dual) value of the dual solution y and the (primal) value of the primal solution x, s because
bt y − ct x = xt At y + st y − ct x = xt c + st y − ct x = st y.
The system (50) has a solution only if both the primal and the dual linear program are feasible
and bounded, so for the moment we assume that this is the case. In Section 7.1, we will see
what to do to enforce these properties.
In the interior point methods, one generally considers vectors in the interior of the solution
space. In the system (50), the only inequalities are y ≥ 0 and s ≥ 0, so during the algorithm,
we always have solutions x, s, y with y > 0 and s > 0. We will replace the condition y t s = 0 by
2
the condition σ 2 := m yi s i
− 1 ≤ 14 for some number µ > 0. During the iterations of the
P
i=1 µ
algorithm, we will decrease µ more and more towards 0.
To summarize, during the algorithm, we have a number µ > 0 and vectors x, s, y meeting the
following invariants
Ax + s = b
At y = c
Pm yi si 2
1 (51)
i=1 µ
−1 ≤ 4
y > 0
s > 0
(II) Reduce µ by a constant factor and adapt x, y and s to this new value of µ such that we
again get a solution of (51). Iterate this step until µ is small enough (Section 7.2).
We will show how we can modify (51) to an equivalent problem that can be solved easily,
provided that we are allowed to choose µ. This modification will in particular make both the
primal and the dual LP feasible. This is equivalent to the statement that one of them is feasible
and bounded. We will show how to modify the dual LP (49) such that the modified version is
feasible and bounded.
In a first step, we make the LP (49) bounded (in such a way that we do not change the
problem if the given LP was bounded). By Theorem 48, we know that if (49) is feasible and
bounded, then there is a W with W ∈ 2Θ(m(size(A)+size(c))) such that there is an optimum solution
y = (y1 , . . . , ym ) ≥ 0 with yi ≤ W (i = 1, . . . , m). So in this case there is a vector y ≥ 0 with
11t y ≤ mW and At y = c. Equivalently (after dividing everything by W ), we can ask for a vector
y ≥ 0 with 11t y ≤ m and At y = W1 c. By relaxing the constraint 11t y ≤ m to 11t y ≤ m + 1 and
86
by adding a slack variable ym+1 ≥ 0 this leads to the following LP which is equivalent to (49)
provided that (49) is bounded:
min bt y
s.t. At y = W1 c
11t y + ym+1 = m+1 (52)
y ≥ 0
ym+1 ≥ 0
In a second step, we will make the LP feasible. To this end, we add a new variable ym+2
such that setting all variables to 1 will get us a feasible solution. Let H be a constant (to be
determined later). Then, we state the following LP:
min bt y + Hy
m+2
1
t
s.t. A y + W
c − A 11 ym+2 = W1 c
t
t
11 y + ym+1 + ym+2 = m + 2
(53)
y ≥ 0
ym+1 ≥ 0
ym+2 ≥ 0
The goal is to choose H that big that if this LP has a feasible solution with ym+2 = 0 at all,
then in any optimum solution ym+2 = 0 will hold. In fact, by Corollary 49 we know that there
is a constant l such that if there is an optimum solution of (53) with ym+2 > 0, then there is an
optimum solution with ym+2 ≥ 2−4ml(size(A)+size(c)+size(W )) . On the other hand, bt y ≤ ∥b∥1 (m + 2)
in any feasible solution of (53), so if we set H = (∥b∥1 (m + 2) + 1)24ml(size(A)+size(c)+size(W )) , then
we enforce that ym+2 = 0 in any optimum solution (if a solution with ym+2 = 0 exists).
Proof: Setting y = 11, ym+1 = 1 and ym+2 = 1 gives a feasible solution, and due to the
constraint 11y + ym+1 + ym+2 = m + 2 and the non-negativity constraints the LP is bounded. 2
In addition, we can use an optimum solution of (53) to check if the initial dual LP was feasible
and/or bounded, and if it is feasible and bounded, we can find an optimum solution of it:
Let y1 , . . . , ym+2 be an optimum solution of (53) where all non-zero entries have an absolute
values of at least 2−4ml(size(A)+size(c)+size(W )) . If ym+2 > 0, then we know that (52) has no feasible
solution (otherwise there was a feasible solution of (53) with ym+2 = 0 which is cheaper). Thus,
the LP (49) has no feasible solution either. On the other hand, if ym+2 = 0, then the initial dual
LP must be feasible. Assume that this is the case, then we still have to check if the initial dual
LP was bounded. If ym+1 > 0, the initial dual program must be bounded. If ym+1 = 0, then the
initial dual LP can be bounded or unbounded. To decide if it is bounded, we can replace c by
the all-zero vector and first solve this new problem. Then, by Farkas’ Lemma, the LP (49) is
bounded if and only if the value of an optimum solution of the new problem is non-negative.
87
If we dualize the LP (53), we get the following LP (with variables x ∈ Rn , s ∈ Rm and additional
variables xn+1 , sm+1 , and sm+2 ):
max W1 ct x + (m + 2)xn+1
Ax + xn+1 11 + s = b
1 t t
W
c − 11 A x + xn+1 + sm+2 = H
xn+1 + sm+1 = 0 (54)
s ≥ 0
sm+1 ≥ 0
sm+2 ≥ 0
Instead of the primal-dual pair (48) and (49), we will consider the pair (53) and (54). Due to
the modification, both LPs are feasible and bounded.
For the new pair
2 of LPs we can easily find feasible solutions and a number µ such that
Pm+2 yi si
i=1 µ
− 1 ≤ 41 : We set y1 = y2 = · · · = ym = ym+1 = ym+2 = 1 which is obviously
µ
feasible for (53). For (54), we set x1 = x2 = · · · = xn = 0. Moreover, we choose sm+1 = ym+1 =µ
(where µ itself is still to be determined). This leads to xn+1 = −µ, sm+2 = H + µ, and
si = bi − xn+1 = bi + µ (i = 1, . . . m).
As a consequence of this choice, we get:
y i si bi
−1 = i = 1, . . . , m
µ µ
ym+1 sm+1
−1 = 0
µ
ym+2 sm+2 H
−1 =
µ µ
Therefore, !
m+2
X 2 m
yi si 1 X
σ2 = −1 = 2 H2 + b2i .
i=1
µ µ i=1
p Pm 2 2 1
Hence, by choosing µ = 2 H 2 + i=1 bi , we enforce σ ≤ 4
. Moreover, since µ > |bi |, we have
si = bi + µ > 0 for i ∈ {1, . . . , m}.
So what did we get so far? We have replaced the primal-dual pair (48) and (49) by the pair (53)
and (54) such that optimum solutions of these modified problems directly lead to a solution of
the original problem. Moreover, the new primal-dual pair consists of two feasible and bounded
problems.
We will write (53) as
min b̃t y
s.t. Ãt y = c̃ (55)
y ≥ 0
88
and (54) as
max c̃t x
s.t. Ãx + s = b̃ (56)
s ≥ 0
Ãx + s = b̃
Ãt y = c̃
Pm+2 yi si 2
i=1 µ
− 1 ≤ 14 (57)
y > 0
s > 0
In this section, we will describe a solution for the following problem: Given a solution
µ(k) , x(k) , y (k) , s(k) of (57) we want to compute a new solution µ(k+1) , x(k+1) , y (k+1) , s(k+1) of
(57) where µ(k+1) = (1 − δ)µ(k) for some δ that does not depend on the solution (to be
determined later).
In a first version, we describe the step without considering the sizes of the numbers that occur
during the computation. Afterwards, we will show how we can round intermediate solutions in
such a way that the numbers can be written with a polynomial number of bits.
We write x(k+1) = x(k) + f , y (k+1) = y (k) + g, and s(k+1) = s(k) + h. Think of the entries of f , g
and h as relatively small values. Assuming that µ(k+1) is fixed, we describe how to compute
appropriate values for f , g and h. The first two conditions of (57) lead to Ãf + h = 0 and
(k) (k)
Ãt g = 0. In addition we want to choose f and h such that (yi + gi )(si + hi ) is close to µ(k+1)
(k) (k) (k) (k) (k) (k)
(i = 1 . . . , m + 2). Since (yi + gi )(si + hi ) = yi si + gi si + yi hi + gi hi and the product gi hi
(k) (k) (k) (k)
is small (provided that gi and hi are small) we simply demand yi si + gi si + yi hi = µ(k+1)
(i = 1 . . . , m + 2). Hence, we want to compute f , g and h such that
Ãt g = 0
Ãf + h = 0 (58)
(k) (k) (k) (k)
si gi + yi hi = µ(k+1) − yi si i = 1, . . . , m + 2
Note that y (k) and s(k) are constant in this context. In this formulation, we skipped the
constraints that y (k+1) > 0 and s(k+1) > 0. We will see what we can do to get positive values,
anyway.
89
Let f , g and h be a solution of (58). By construction, we have
g t h = −g t Ãf = 0t f = 0. (60)
This implies
t
b̃t y (k+1) − c̃t x(k+1) = Ã(x(k) + f ) + (s(k) + h) (y (k) + g) − c̃t (x(k) + f )
t
= Ã(x(k) + f ) (y (k) + g) + (m + 2)µ(k+1) − c̃t (x(k) + f ) (61)
t
= x(k) + f Ãt y (k) + (m + 2)µ(k+1) − c̃t (x(k) + f )
= (m + 2)µ(k+1)
(k)
Proof: Let S be an (m + 2) × (m + 2)-diagonal matrix with si as entry at position (i, i)
(k)
and Y be an (m + 2) × (m + 2)-diagonal matrix with yi as entry at position (i, i).
Then, the last condition of (58) is equivalent to
which is equivalent to
g + S −1 Y h = S −1 µ(k+1) 11m+2 − y (k) .
This implies
Ãt g + Ãt S −1 Y h = Ãt S −1 µ(k+1) 11m+2 − Ãt y (k) , (62)
and hence
Ãt S −1 Y h = Ãt S −1 µ(k+1) 11m+2 − c̃. (63)
With h = −Ãf this leads to
However, the matrix Ãt S −1 Y Ã is invertible, so f = (Ãt S −1 Y Ã)−1 (c̃ − Ãt S −1 µ(k+1) 11m+2 ) is
the unique solution of this last equation. In particular, if (58) has a solution, this is the only
choice for f . By setting h = −Ãf , we fulfill the second constraint of (58). Finally, we set
g = S −1 µ(k+1) 11m+2 − y (k) − S −1 Y h (again the only choice) satisfying the third constraint of
(58).
Since we have chosen g and h such that (62) and (63) are met, we also have Ãt g = 0, so the
solution satisfies the first condition of (58). 2
90
In the above proof we have to solve an equation system −Ãt S −1 Y Ãf = Ãt S −1 µ(k+1) 11m+2 − c̃
in order to compute f . This equation system depends on the previous solutions s(k) and y (k) , so
here the sizes of the numbers to store the intermediate solutions could get too big. At the end
of this section, we will describe how to handle such issues.
s 2 s 2 sm+2
m+2 (k) (k) m+2 (k+1) (k+1) P gi hi 2
yi s i y s
We have σ (k) = − 1 and σ (k+1) =
P P
µ(k)
i i
µ(k+1)
− 1 = µ(k+1)
.
i=1 i=1 i=1
It remains to show that y (k+1) > 0 and s(k+1) > 0 and σ (k+1) ≤ 21 .
We first show that for an appropriate choice of µ(k+1) we get σ (k+1) ≤ 21 .
µ(k) 1
Lemma 68 (a) For i = 1, . . . , m + 2 we have (k) (k) ≤ 1−σ (k)
.
yi si
Proof:
2 2
(k) (k) (k) (k)
Pm+2 yi s i yi s i
(a) We have (σ (k) )2 = i=1 µ(k)
− 1 , so µ(k)
−1 ≤ (σ (k) )2 which implies
(k) (k) (k) (k)
yi si yi s i
1− µ(k)
≤ σ (k) and µ(k)
≥ 1 − σ (k) for i = 1, . . . , m + 2. This proves the claim.
(b) The statement is simply a special case of the Cauchy-Schwarz inequality that can be
proved as follows:
m+2
!2
(k) (k)
X y s
(σ (k) 2
) (m + 2) − 1 − i (k)i
i=1
µ
!2
m+2 (k) (k) 2 m+2 (k) (k)
X y i si X y s
= (m + 2) 1− − 1 − i (k)i
i=1
µ(k) i=1
µ
m+2 (k) (k) 2 m+2
X m+2 (k) (k) (k) (k)
X y i si X y s yj sj
= (m + 1) 1− − 2 1 − i (k)i · 1−
i=1
µ(k) i=1 j=i+1
µ µ(k)
m+2 (k) (k)
!2
X m+2
X y s
(k) (k)
y j sj
= 1 − i (k)i − 1−
i=1 j=i+1
µ µ(k)
≥ 0
91
Lemma 69 If δ = √1 (i.e. µ(k+1) = (1 − √1 )µ(k) ) then σ (k+1) < 21 .
8 m+2 8 m+2
r r
(k) (k)
si yi
Proof: Let Gi := gi (k) and Hi := hi (k) (for i ∈ {1, . . . , m + 2}).
yi µ(k+1) si µ(k+1)
v v
um+2 um+2
u X gi hi 2 uX
σ (k+1) = t
(k+1)
= t (Gi Hi )2
i=1
µ i=1
v !
u 1 m+2 m+2
u
X X
= t (G2 + Hi2 )2 − (G2i − Hi2 )2
4 i=1 i i=1
v
um+2 m+2
1u X 1X 2
≤ t 2
(G + Hi )2 2
≤ (G + Hi2 )
2 i=1 i 2 i=1 i
m+2 m+2
g t h=0 1X 1X 1 (k) (k) 2
= (Gi + Hi )2 = g i s i + hi y i
2 i=1 2 i=1 yi(k) s(k)
i µ
(k+1) | {z }
(k) (k)
=µ(k+1) −yi si
m+2 2 !2
(k) (k)
1X µ(k) µ(k+1) yi si
= −
2 i=1 yi(k) s(k)
i µ
(k+1) µ(k) µ(k)
m+2
!2
(k) (k)
1 X µ(k) 1 y s
= (k) (k)
−δ + 1 − i (k)i
2 i=1 yi si 1 − δ µ
| {z }
1
≤
1−σ (k)
(k) (k) 2
m+2
! m+2
! !
(k) (k)
1 X y s yi si
X
≤ (m + 2)δ 2
− 2δ 1 − i (k)i
+ 1−
2(1 − δ)(1 − σ (k) ) i=1
µ i=1
µ(k)
(k) (k) 2
m+2 m+2
! !
(k) (k)
1 X y s X y s
≤ (m + 2)δ 2 + 2δ 1 − i (k)i + 1 − i (k)i
2(1 − δ)(1 − σ (k) ) µ µ
|i=1 {z
√
} i=1
| {z }
2
≤σ (k) m+2 =(σ )
(k)
1 √ 2
≤ (k)
(m + 2)δ 2 + 2δσ (k) m + 2 + σ (k)
2(1 − δ)(1 − σ )
1 √ 2
(k)
= m + 2δ + σ
2(1 − δ)(1 − σ (k) )
σ (k) ≤ 12
2 √ 2
√
1 1 8 m+2 1 1
≤ m + 2δ + = √ +
1−δ 2 8 m+2−1 8 2
1
≤ .
2
2
92
Lemma 70 We have y (k+1) > 0 and s(k+1) > 0.
(k+1) (k+1)
Proof: Claim: We have yi si > 0 for i = 1, . . . , m + 2.
Proof of the Claim:
(k+1) (k+1)
Assume that yj sj ≤ 0 for a j ∈ {1, . . . , m + 2}. Then,
m+2
!2 (k+1) (k+1)
!2
(k+1) (k+1)
(k+1) 2
X yi si yj sj
σ = −1 ≥ −1 ≥ 1,
i=1
µ(k+1) µ(k+1)
(k) (k)
which is a contradiction to the fact that si , yi , and µ(k+1) are positive. 2
93
7.3 Finding an Optimum Solution
Pm+2 yi si 2
Proof: By the condition i=1 µ
− 1 ≤ 41 , we get
µ 3µ
≤ y i si ≤ < 2µ
2 2
for all i ∈ {1, . . . , m + 2}. Therefore, st y = m+2
P
i=1 yi si ≤ 2(m + 2)µ.
(a) Since $y^*$ is an optimum and $y$ a feasible solution of the dual LP, we have $\tilde{b}^t y \ge \tilde{b}^t y^*$ and thus
$$s^t y = \tilde{b}^t y - x^t \tilde{A}^t y = \tilde{b}^t y - \tilde{c}^t x \ge \tilde{b}^t y^* - \tilde{c}^t x = \tilde{b}^t y^* - x^t \tilde{A}^t y^* = s^t y^*.$$
Let $i \in \{1, \dots, m+2\}$ with $y_i < \frac{\eta}{4(m+2)}$. We have
$$s_i \ge \frac{\mu}{2 y_i} > \frac{2(m+2)\mu}{\eta} \ge \frac{s^t y}{\eta}.$$
Assume that $y_i^* > 0$, so $y_i^* \ge \eta$. This implies
$$s^t y^* \ge s_i y_i^* > \frac{s^t y}{\eta} \cdot \eta = s^t y \ge s^t y^*,$$
which is a contradiction. Therefore, $y_i^* = 0$.
(b) The case is very similar to part (a): Since $x^*, s^*$ is an optimum and $x, s$ a feasible solution of the primal LP, we have $\tilde{c}^t x \le \tilde{c}^t x^*$ and thus
$$s^t y = \tilde{b}^t y - \tilde{c}^t x \ge \tilde{b}^t y - \tilde{c}^t x^* = \tilde{b}^t y - (x^*)^t \tilde{A}^t y = y^t s^*.$$
Let $i \in \{1, \dots, m+2\}$ with $s_i < \frac{\eta}{4(m+2)}$. We have
$$y_i \ge \frac{\mu}{2 s_i} > \frac{2(m+2)\mu}{\eta} \ge \frac{s^t y}{\eta}.$$
Assume that $s_i^* > 0$, so $s_i^* \ge \eta$. This implies
$$y^t s^* \ge s_i^* y_i > \eta \cdot \frac{s^t y}{\eta} = s^t y \ge y^t s^*,$$
which is a contradiction. Therefore, $s_i^* = 0$. 2
There are several ways to find an optimum solution. Before we describe a method to round an interior point directly to an optimum solution, we will present a simpler but less efficient method: We choose k big enough such that $\mu^{(k)} < \frac{\eta^2}{32(m+2)^2}$. Then, for each $i \in \{1, \dots, m+2\}$, we have $y_i^{(k)} < \frac{\eta}{4(m+2)}$ or $s_i^{(k)} < \frac{\eta}{4(m+2)}$. Let $\bar{A}^t y = \bar{c}$ be the subsystem of $\tilde{A}^t y = \tilde{c}$ consisting of the rows with indices i for which $s_i^{(k)} < \frac{\eta}{4(m+2)}$, so $s_i^* = 0$. For all other rows, we know that $y_i^* = 0$, so we can ignore them when computing an optimum solution for the dual LP. If $\bar{A}^t y = \bar{c}$ has only one solution, we compute it and get an optimum solution of the modified dual LP (53) (provided that the result is non-negative). Otherwise, we check if $y_{i_0}^{(k)} < \frac{\eta}{4(m+2)}$ for some $i_0 \in \{1, \dots, m\}$. In this case we know that if the initial dual LP has an optimal solution, then there is one with $y_{i_0} = 0$. Hence we can start the whole process again but now without the variable $y_{i_0}$, without the row of A with index $i_0$ and without the entry of b with index $i_0$. Hence we have reduced the instance size, so this method will terminate after at most m iterations.
What can we do if there is no $i \in \{1, \dots, m\}$ with $y_i^{(k)} < \frac{\eta}{4(m+2)}$? To handle this case, we first make sure that the system $\tilde{A}x = \tilde{b}$ does not have a feasible solution. If it has a feasible solution (which can be checked by Gaussian Elimination), we modify $\tilde{b}$ slightly to a vector $b^*$ such that $\tilde{A}x = b^*$ has no feasible solution. To this end, choose n linearly independent rows of A. These rows will define the solution of $\tilde{A}x = \tilde{b}$. Then, any modification of b outside these rows will make the system $\tilde{A}x = \tilde{b}$ infeasible. We simply add an $\epsilon > 0$ to one of these entries of b. If $\epsilon$ is small enough, then an optimum solution of the dual LP with respect to $b^*$ will still be an optimum solution of the original dual LP. To see that we can write $\epsilon$ with a polynomial number of bits, observe that the absolute value of the difference between the costs of two basic solutions of an LP is either 0 or can be bounded from below by some value $2^{-L}$ where L is polynomial in the input size. This follows from the fact that any basic solution can be written with a polynomial number of bits. Thus, the same is true for any difference u of two basic solutions and for the scalar product $\tilde{b}^t u$. Hence, $\tilde{b}^t u$ is either zero or its absolute value is at least $2^{-L}$. This implies that we can choose $\epsilon$ in such a way that it can be written with polynomially many bits and that no suboptimal solution can become optimal by the modification.
Now assume that the initial dual LP is bounded and feasible. Then, we can compute optimum solutions $x^*, y^*, s^*$ of the modified LPs (53) and (54) by expanding optimum solutions of the initial primal and dual problems in a canonical way. In particular, we will set $x_{n+1}$ to 0. Then $\tilde{A}x^* + s^* = b^*$ but $\tilde{A}x = b^*$ has no feasible solution. Hence, there must be an $i_0 \in \{1, \dots, m\}$ with $s_{i_0}^* > 0$, so $y_{i_0}^{(k)} < \frac{\eta}{4(m+2)}$ and $y_{i_0}^* = 0$. Again, we get rid of at least one dual variable and can restart the whole procedure on a smaller instance.
Now, we describe how we can avoid iterating the whole process:
Consider again the two problems (55) and (56). Theorem 14 implies that we can partition the index set $\{1, \dots, m+2\}$ of the dual variables into $\{1, \dots, m+2\} = B\,\dot\cup\,N$ such that for $i \in B$ there is an optimum dual solution $y^*$ with $y_i^* > 0$ and for $i \in N$ there is an optimum primal solution $x^*, s^*$ with $s_i^* > 0$. Moreover, it is easy to see that there are also solutions $x^*, s^*$ and $y^*$ with these properties such that in addition the size of their entries is $O(\text{size}(\tilde{A}) + \text{size}(\tilde{b}) + \text{size}(\tilde{c}))$. Hence, in Lemma 71, for any $i \in \{1, \dots, m+2\}$ we can either have $y_i < \frac{\eta}{4(m+2)}$ or $s_i < \frac{\eta}{4(m+2)}$ but not both. Now we choose k big enough such that $\mu^{(k)} < \frac{\eta^2}{32(m+2)^2\Delta}$ for some $\Delta \ge 1$ that will be determined later. Then, for each $i \in \{1, \dots, m+2\}$, exactly one of the inequalities $y_i < \frac{\eta}{4(m+2)\Delta}$ and $s_i < \frac{\eta}{4(m+2)\Delta}$ holds. Therefore, we can find the partitioning $\{1, \dots, m+2\} = B\,\dot\cup\,N$. In particular, we have $y_i \ge \frac{\eta}{4(m+2)}$ for each $i \in B$ and $y_i < \frac{\eta}{4(m+2)\Delta}$ for each $i \in N$.
Let $A_B$ be the submatrix of $\tilde{A}$ consisting of the rows with indices in B, and $A_N$ be the submatrix of $\tilde{A}$ consisting of the remaining rows. By $y_B^{(k)}, y_N^{(k)}, b_B, b_N$ we denote the corresponding subvectors of the vectors $y^{(k)}$ and b. As in the description of the Simplex Algorithm, the entries of e.g. $y_B^{(k)}$ are not necessarily indexed from 1 to |B| but their index set is the set $B \subseteq \{1, \dots, m+2\}$. We can assume that $A_B$ has full column rank.
In the following, the vector norm is the Euclidean norm ∥ · ∥2 and the matrix norm is the norm
induced by the Euclidean norm.
Theorem 72 Set $\Delta = \max\{\sqrt{m+2}\,\|A_B(A_B^tA_B)^{-1}A_N^t\|,\ 1\}$. Let k be big enough such that $\mu^{(k)} < \frac{\eta^2}{32(m+2)^2\Delta}$. Let $Y_B$ be a diagonal matrix whose rows and columns are indexed with B such that the entry at position (i, i) is $y_i^{(k)}$. Define
$$d_y := Y_B A_B \left(A_B^t (Y_B)^2 A_B\right)^{-1} A_N^t y_N^{(k)}$$
and $\tilde{y}_B = Y_B d_y + y_B^{(k)}$, and let $\tilde{y} \in \mathbb{R}^{m+2}$ be the vector which arises from $\tilde{y}_B$ by adding zeros for the entries with index in N. Then:
(a) $\tilde{A}^t \tilde{y} = \tilde{c}$.
(b) $\tilde{y}_B > 0$.
(c) $\tilde{y}$ is an optimum dual solution.
Proof:
(b)
$$\begin{aligned}
\|d_y\| &= \left\|Y_B A_B\left(A_B^t(Y_B)^2A_B\right)^{-1}A_N^t y_N^{(k)}\right\|\\
&= \Bigl\|Y_B A_B\left(A_B^t(Y_B)^2A_B\right)^{-1}\underbrace{A_B^tY_B\, Y_B^{-1}A_B\left(A_B^tA_B\right)^{-1}}_{=I_n}A_N^t y_N^{(k)}\Bigr\|\\
&\le \underbrace{\left\|Y_B A_B\left(A_B^t(Y_B)^2A_B\right)^{-1}A_B^tY_B\right\|}_{=1}\cdot\left\|Y_B^{-1}A_B\left(A_B^tA_B\right)^{-1}A_N^t y_N^{(k)}\right\|\\
&\le \underbrace{\left\|Y_B^{-1}\right\|}_{\le\frac{4(m+2)}{\eta}}\cdot\underbrace{\left\|A_B\left(A_B^tA_B\right)^{-1}A_N^t\right\|}_{\le\frac{\Delta}{\sqrt{m+2}}}\cdot\underbrace{\left\|y_N^{(k)}\right\|}_{<\frac{\eta\sqrt{m+2}}{4(m+2)\Delta}}\\
&\le 1.
\end{aligned}$$
(c) By (a), we have Ãt ỹ = c̃, and by (b), we know that ỹB > 0, so we have ỹ ≥ 0. Hence ỹ
is a feasible dual solution. Moreover, we know that there is a feasible primal solution in
which the slack variables si are zero for i ∈ B. Hence, by complementary slackness, ỹ is
an optimum dual solution. 2
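In terms of linear algebra, the rounding of Theorem 72 is a single solve. The following sketch (our own, with hypothetical numpy inputs and floating-point arithmetic instead of the exact computations that the size bounds require) produces the candidate optimum dual solution from the partition B, N:

```python
import numpy as np

def round_to_optimal_dual(A_tilde, y, B, N):
    """Sketch of the rounding in Theorem 72.

    A_tilde: (m+2) x n matrix, y: current interior dual iterate y^(k),
    B, N: index lists partitioning {0, ..., m+1}; A_B must have full column rank.
    Returns the candidate optimum dual solution y~ (zero on N)."""
    A_B, A_N = A_tilde[B, :], A_tilde[N, :]
    Y_B = np.diag(y[B])

    # d_y = Y_B A_B (A_B^t Y_B^2 A_B)^{-1} A_N^t y_N
    M = A_B.T @ Y_B @ Y_B @ A_B
    d_y = Y_B @ A_B @ np.linalg.solve(M, A_N.T @ y[N])

    y_tilde = np.zeros_like(y)
    y_tilde[B] = Y_B @ d_y + y[B]   # y~_B = Y_B d_y + y_B^(k)
    return y_tilde
```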
Interior point methods can be motivated as follows: Consider the primal LP
$$\max\ c^tx \quad \text{s.t.}\quad Ax + s = b,\ s \ge 0. \qquad (64)$$
Replacing the sign constraint on s by a logarithmic barrier term leads, for a parameter $\mu > 0$, to the problem $\max\{c^tx + \mu\sum_{i=1}^m\ln(s_i) \mid Ax + s = b\}$ with the Lagrange function
$$L(x,s,y) = c^tx + \mu\sum_{i=1}^m\ln(s_i) + y^t(b - Ax - s).$$
For any fixed vector y the maximum value $\tilde{L}(y) = \max\{L(x,s,y) \mid x \in \mathbb{R}^n,\ s \in \mathbb{R}^m,\ s > 0\}$ is an upper bound on $\max\{c^tx + \mu\sum_{i=1}^m\ln(s_i) \mid Ax + s = b\}$, and hence one asks for $\min\{\tilde{L}(y) \mid y \in \mathbb{R}^m\}$. In this setting, one can show that $c^tx + \mu\sum_{i=1}^m\ln(s_i)$ is maximum if all partial derivatives of L(x, s, y) are zero, i.e. all the following terms must be zero:
L(x, s, y) are zero, i.e. all the following terms must be zero:
m
∂L(x, s, y) X
= cj − yi aij for j ∈ {1, . . . , n}
∂xj i=1
∂L(x, s, y) µ
= − yi for i ∈ {1, . . . , m}
∂si si
m
∂L(x, s, y) X
= bi − aij xj − si for i ∈ {1, . . . , m}
∂yi j=1
This leads to y t A = c, yi sj = µ (for j ∈ {1, . . . , n}), and Ax + s = b. Note that we do not get
explicit sign constraints on y but if we request s to be positive, then y will be positive. For
a decreasing sequence of µ(k) values, the sequence of corresponding vectors x(k) , s(k) and y (k)
(k) (k)
satisfying (y (k) )t A = c, yi sj = µ(k) (for j ∈ {1, . . . , n}), and Ax(k) + s(k) = b(k) is called a
central path. However, note that also the sequence of solutions we constructed in the first
(k) (k)
three subsections with only yi sj ≈ µ(k) is called central path.
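As a minimal illustration (a toy example of our own): for the LP $\max\{x \mid x + s = 1,\ s \ge 0\}$ with $m = n = 1$, the conditions read $y = 1$, $ys = \mu$, and $x + s = 1$, so the central path is $y(\mu) = 1$, $s(\mu) = \mu$, $x(\mu) = 1 - \mu$; for $\mu \to 0$ it converges to the primal optimum $x = 1$ and the dual optimum $y = 1$.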
8 Integer Linear Programming
Imposing integrality constraints on all or some variables of a linear program allows us to model many new conditions that could not be described by linear constraints. For example, even if we only consider Binary Linear Programs (i.e. all integrality constraints are of the type x ∈ {0, 1}), we can already model a number of logical conditions on variables x and y.
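For example, for binary variables $x, y \in \{0,1\}$: the implication "if $x = 1$ then $y = 1$" is expressed by the linear constraint $y \ge x$, the condition "at most one of x and y" by $x + y \le 1$, and "$x = 1$ or $y = 1$" by $x + y \ge 1$.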
On the other hand, we have already seen that there are NP-hard optimization problems that
can be modeled as (mixed) integer linear programs. Hence, we cannot hope for polynomial-time
algorithms to solve general ILPs.
Fig. 8: A polyhedron P (given by the red hyperplanes) and its integer hull PI (green). The
black dots indicate the integral vectors.
Observations:
• For a rational polyhedral cone (i.e. a cone C = {x ∈ Rn | Ax ≤ 0} with A ∈ Qm×n ), we
have CI = C (because a polyhedral cone is rational if and only if it is generated by a
finite number of integral vectors).
rational polyhedra $P = \{x \in \mathbb{R}^n \mid Ax \le b\} \subseteq \mathbb{R}^n$ such that $P_I$ has $\Omega(\phi^{n-1})$ vertices, where $\phi = \text{size}(A) + \text{size}(b)$.
In this section, our goal is to find a certificate that a given system of equations does not have
any integral solution (which will be the result of Corollary 77).
The following operations on matrices are called elementary unimodular column operations:
• exchanging two columns,
• multiplying a column by −1,
• adding an integral multiple of one column to another column.
Proof: We may assume that A is integral. Assume that we have already transformed A into a matrix $\begin{pmatrix} F & 0\\ G & H\end{pmatrix}$ where F is a lower triangular matrix with positive diagonal. Let $h_{11}, \dots, h_{1k}$ be the first row of H. Apply elementary unimodular column operations to H such that all $h_{1j}$ are non-negative and such that $\sum_{j=1}^k h_{1j}$ is as small as possible. We may assume that $h_{11} \ge h_{12} \ge \dots \ge h_{1k}$. Then, $h_{11} > 0$ because A has rank m. Moreover, $h_{1j} = 0$ for $j \in \{2,\dots,k\}$ because otherwise subtracting the j-th column of H from its first column would reduce $\sum_{j=1}^k h_{1j}$. Hence, we have obtained a larger lower triangular matrix F′.
We iterate this step and end up with a matrix $\begin{pmatrix} B & 0\end{pmatrix}$ where B is a lower triangular matrix with positive diagonal. Denote the entries of B by $b_{ij}$ ($i = 1,\dots,m$, $j = 1,\dots,m$). Finally, we perform for $i = 2,\dots,m$ the following steps: For $j = 1,\dots,i-1$ add an integer multiple of the i-th column of B to the j-th column of B such that $b_{ij}$ is non-negative and less than $b_{ii}$. 2
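The proof is constructive. As an illustration (a sketch of our own, using exact integer arithmetic but making no attempt to control the size of intermediate entries), the following routine brings an integral matrix of full row rank into Hermite normal form using exactly the three elementary unimodular column operations:

```python
import numpy as np

def hermite_normal_form(A):
    """Sketch: transform an integral matrix A of full row rank into Hermite
    normal form [B 0] by elementary unimodular column operations (swapping
    columns, negating a column, adding an integer multiple of one column
    to another)."""
    H = np.array(A, dtype=object)   # exact Python integers
    m, n = H.shape
    for i in range(m):
        # zero out the entries right of the pivot via Euclidean column steps
        for j in range(i + 1, n):
            while H[i, j] != 0:
                q = H[i, i] // H[i, j]
                H[:, i] -= q * H[:, j]           # add integer multiple of column j
                H[:, [i, j]] = H[:, [j, i]]      # swap columns i and j
        if H[i, i] < 0:
            H[:, i] = -H[:, i]                   # make the diagonal entry positive
        # reduce the entries left of the diagonal modulo the diagonal entry
        for j in range(i):
            q = H[i, j] // H[i, i]
            H[:, j] -= q * H[:, i]
    return H
```

For instance, hermite_normal_form([[6, 4], [10, 2]]) yields the matrix [[2, 0], [8, 14]].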
Proof: “⇒:” If x and $y^tA$ are integral vectors and Ax = b, then $y^tAx = y^tb$ is also integral.
“⇐:” Assume that $b^ty$ is integral for each $y \in \mathbb{Q}^m$ for which $A^ty$ is integral. Then, Ax = b must have a (fractional) solution, since otherwise, by Farkas' Lemma (Corollary 7), there would be a vector $y \in \mathbb{Q}^m$ with $y^tA = 0$ and $y^tb = -\frac{1}{2}$. Thus, we may assume that the rows of A are linearly independent, so A has rank m.
It is easy to check that the statement to be proved holds for A if and only if it holds for any matrix $\tilde{A}$ where $\tilde{A}$ arises from A by applying an elementary unimodular column operation. Hence, we can assume that A is in Hermite normal form $[B\ 0]$. Thus $B^{-1}[B\ 0] = [I_m\ 0]$ is an integral matrix. Therefore, by our assumption (applied to the rows of $B^{-1}$), $B^{-1}b$ is an integral vector. Since $[B\ 0]\binom{B^{-1}b}{0} = b$, the vector $x := \binom{B^{-1}b}{0}$ is an integral solution of $[B\ 0]\,x = b$. 2
8.3 TDI Systems
Theorem 79 Let $P = \{x \in \mathbb{R}^n \mid Ax \le b\}$ be a rational polyhedron. Then, the following statements are equivalent:
(a) P is integral.
(b) Each face of P contains an integral vector.
(c) Each minimal face of P contains an integral vector.
(d) Each supporting hyperplane of P contains an integral vector.
(e) Each rational supporting hyperplane of P contains at least one integral vector.
(f) $\max\{c^tx \mid x \in P\}$ is attained by an integral vector for each c for which the maximum is finite.
(g) $\max\{c^tx \mid x \in P\}$ is an integer for each integral vector c for which the maximum is finite.
Proof: The following implications are obvious: “(b) ⇔ (c)”, “(b) ⇒ (d)”, “(d) ⇒ (e)”, and “(f) ⇒ (g)”.
“(a) ⇒ (b):” Assume that P is integral. Let F = P ∩ H be a face of P where $H = \{x \in \mathbb{R}^n \mid c^tx = \delta\}$ is a supporting hyperplane of P. Then, any $z \in F$ is a convex combination of integral vectors $v_1, \dots, v_k$ of P. If $v_i \in P \setminus F$ (so $c^tv_i < \delta$) for some $i \in \{1,\dots,k\}$, then (since $c^tz = \delta$) there must be a $j \in \{1,\dots,k\}$ with $c^tv_j > \delta$, which is a contradiction to $v_j \in P$. Thus, all $v_i$ must be in F, so in particular F contains an integral vector.
“(c) ⇒ (f):” Follows from Corollary 20.
“(f) ⇒ (a):” Assume that (f) holds but P ̸= PI . Then, there is an x∗ ∈ P \ PI . By Theorem 74,
PI is a polyhedron, so there is an inequality at x ≤ β that is valid for PI but not for x∗ , so
at x∗ > β. This is a contradiction to (f) because max{at x | x ∈ P } is finite (by Proposition 75)
but is not attained by any integral vector.
So far, we have proved that (a),(b),(c), and (f) are equivalent.
“(e) ⇒ (c):” We may assume that A and b are integral. Let F = {x ∈ Rn | A′ x = b′ } be a
minimal face of P (where A′ x ≤ b′ is a subsystem of Ax ≤ b). If there is no integral vector x
with A′ x = b′ , then, by Corollary 77, there must be a rational vector y such that c := (A′ )t y is
integral while δ := y t b′ is not an integer. Moreover, we may assume that all entries of y are
positive (otherwise we add an appropriate integral vector to y). Since c is integral but δ is not
integral, the rational hyperplane H := {x ∈ Rn | ct x = δ} does not contain any integral vector.
We will show that H ∩ P = F which implies that H is a supporting hyperplane. By construction,
we have F ⊆ H, so we have to show that H∩P ⊆ F . Let x ∈ H∩P . Then, y t A′ x = ct x = δ = y t b′ ,
so y t (A′ x − b′ ) = 0. Thus, since all components of y are positive, A′ x = b′ , so x ∈ F .
Now, we know that (a),(b),(c),(d),(e), and (f) are equivalent.
“(g) ⇒ (e):” Let H = {x ∈ Rn | ct x = δ} be a rational supporting hyperplane of P , so
max{ct x | x ∈ P } = δ. Assume that H does not contain any integral vector. Then, by
Corollary 77, there is a positive number γ for which γc is integral but γδ is not integral. Then
max{(γc)t x | x ∈ P } = γ max{ct x | x ∈ P } = γδ ̸∈ Z, so the statement of (g) is false.
Since “(f) ⇒ (g)” is trivial, this shows the equivalence of all statements. 2
Note that total dual integrality is in fact a property of the system of inequalities, not just of the polyhedron that is defined by them. For example, the systems
$$\begin{pmatrix} 1 & 1\\ 1 & 0\\ 1 & -1 \end{pmatrix}\begin{pmatrix} x_1\\ x_2 \end{pmatrix} \le \begin{pmatrix} 0\\ 0\\ 0 \end{pmatrix}
\qquad\text{and}\qquad
\begin{pmatrix} 1 & 1\\ 1 & -1 \end{pmatrix}\begin{pmatrix} x_1\\ x_2 \end{pmatrix} \le \begin{pmatrix} 0\\ 0 \end{pmatrix}$$
define the same polyhedron. But it is easy to check that the first system of inequalities is TDI while the second one is not TDI.
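To make the difference concrete (a small check of our own): take $c = (1,0)^t$. In both cases $\max\{c^tx \mid x \in P\} = 0$, so the dual $\min\{0^ty \mid A^ty = c,\ y \ge 0\}$ is finite. For the first system, $y = (0,1,0)^t$ is an integral optimum dual solution. For the second system, the only vector with $y_1(1,1)^t + y_2(1,-1)^t = (1,0)^t$ and $y \ge 0$ is $y = (\tfrac{1}{2},\tfrac{1}{2})^t$, so no integral optimum dual solution exists, although the optimum value 0 is of course an integer. This also illustrates why total dual integrality asks for integral optimum dual solutions rather than just integral optimum values.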
Theorem 81 Let A ∈ Qm×n and b ∈ Zm such that Ax ≤ b is totally dual integral. Then,
the polyhedron P = {x ∈ Rn | Ax ≤ b} is integral.
Proof: If Ax ≤ b is TDI, then by definition the minimum in $\min\{b^ty \mid A^ty = c,\ y \ge 0\}$ is attained by an integral vector for each integral vector c for which it is finite; since b is integral, this minimum is then an integer. By duality, this implies that $\max\{c^tx \mid Ax \le b\}$ is an integer for each integral vector c for which the maximum is finite. Thus, by the implication “(g) ⇒ (a)” of Theorem 79, P is integral. 2
$$\begin{aligned}
\min\{b^ty + \beta\gamma \mid A^ty + \gamma a = c,\ y \ge 0,\ \gamma \ge 0\} &= \max\{c^tx \mid Ax \le b,\ a^tx \le \beta\}\\
&= \max\{c^tx \mid Ax \le b\}\\
&= \min\{b^ty \mid A^ty = c,\ y \ge 0\}
\end{aligned}$$
The last minimization problem has an optimum solution $y^*$ that is integral. Together with $\gamma^* = 0$, this gives an optimum integral solution for the first minimization problem. 2
Hence, if a system Ax ≤ b is not TDI, then no proper subsystem A′ x ≤ b′ with {x ∈ Rn | Ax ≤
b} = {x ∈ Rn | A′ x ≤ b′ } can be TDI. We call a system Ax ≤ b minimally TDI if it is TDI
but no proper subsystem of Ax ≤ b defining the same polyhedron is TDI.
is finite. Let $x^*, y^*, \lambda^*, \mu^*$ be optimum primal and dual solutions. Set $\tilde{c} := c + \lceil\mu^*\rceil a$. Then,
$$\max\{\tilde{c}^tx \mid Ax \le b,\ a^tx \le \beta\} = \min\{b^ty + \beta\lambda \mid y \ge 0,\ \lambda \ge 0,\ A^ty + \lambda a = \tilde{c}\} \qquad (66)$$
is finite because $x^*$ is feasible for the maximum and $y^*$ and $\lambda^* + \lceil\mu^*\rceil - \mu^*$ are feasible for the minimum.
Since $Ax \le b,\ a^tx \le \beta$ is a TDI-system, the minimum in equation (66) has an integer optimum solution $\tilde{y}, \tilde{\lambda}$. Then, $y := \tilde{y}$, $\lambda := \tilde{\lambda}$, $\mu := \lceil\mu^*\rceil$ is an integer optimum solution for the minimum in (65): it is obviously feasible, and its cost is:
The inequality follows from the fact that y ∗ , λ∗ + ⌈µ∗ ⌉ − µ∗ is feasible for the minimum in (66)
and ỹ, λ̃ is an optimum solution for the minimum in (66). Hence, the minimum in (65) has an
integral optimum solution, so Ax ≤ b, at x = β is TDI. 2
Example: The set of the standard unit vectors together with their negatives is a Hilbert basis
of the cone Rn .
is integral and an element of P. Thus (since $\{b_1,\dots,b_k\} \subseteq H$), b can be written as a non-negative integral combination of the elements of H. This shows that H is a Hilbert basis. 2
Notation: For a system of inequalities Ax ≤ b and a face F of {x ∈ Rn | Ax ≤ b}, we call
a row of A active, if the corresponding inequality in Ax ≤ b is satisfied with equality for all
x ∈ F.
Proof: “⇒:” Suppose that Ax ≤ b is TDI. Let F be a minimal face of P and let a1 , . . . , at be
the rows of A that are active for F . We have to show that {a1 , . . . , at } is a Hilbert basis. Let
c be an integral vector in cone({a1 , . . . , at }). We have to write c as an integral non-negative
combination of $a_1, \dots, a_t$. The maximum in the LP-duality equation
$$\max\{c^tx \mid Ax \le b\} = \min\{b^ty \mid A^ty = c,\ y \ge 0\} \qquad (67)$$
is attained by every vector x in F. Since Ax ≤ b is TDI, the dual problem has an integral
optimum solution y. By complementary slackness, the entries of y at positions corresponding
to rows that are not active in F are 0. Thus, c is an integral non-negative combination of
a1 , . . . , a t .
“⇐:” Assume that for each minimal face F of P , the rows that are active in F form a Hilbert
basis. Let c be an integral vector for which the optima in (67) are finite. We have to show that
the minimum is attained by an integral vector. Let F be a minimal face of P such that each
vector in F attains the maximum in the duality equation. Let a1 , . . . , at be rows of A that are
active in F. Then, by complementary slackness, $c \in \text{cone}(\{a_1,\dots,a_t\})$. Since $a_1,\dots,a_t$ form a Hilbert basis, we can write $c = \sum_{i=1}^t \lambda_i a_i$ for certain non-negative integral numbers $\lambda_1, \dots, \lambda_t$.
We can extend (λ1 , . . . , λt ) with zero-components to a vector y ∈ Zm with y ≥ 0, At y = c and
bt y = xt At y = ct x for all x ∈ F . In other words, y is an integral optimum solution of the dual
LP. 2
Theorem 86 The rational system of inequalities Ax ≤ 0 is TDI if and only if the rows
of A form a Hilbert basis.
Proof: Follows from the previous Theorem with b = 0 (note that in the unique minimal face
of {x ∈ Rn | Ax ≤ 0} all rows of A are active). 2
Theorem 87 (Giles and Pulleyblank [1979]) For each rational polyhedron P ⊆ Rn there
exists a rational TDI-system Ax ≤ b with A ∈ Zm×n and P = {x ∈ Rn | Ax ≤ b}. The
vector b can be chosen to be integral if and only if P is integral.
Proof: We can assume w.l.o.g. that P ≠ ∅. For each minimal face F of P, we define
$$C_F := \{c \in \mathbb{R}^n \mid c^tz = \max\{c^tx \mid x \in P\} \text{ for all } z \in F\}.$$
Then, $C_F$ is a polyhedral cone. To see this, assume that $P = \{x \in \mathbb{R}^n \mid \tilde{A}x \le \tilde{b}\}$ is some description of P. Then $C_F$ is generated by the rows of $\tilde{A}$ that are active in F.
Let F be a minimal face, and let a1 , . . . , at be an integral Hilbert basis generating CF . Choose
x0 ∈ F , and define βi := ati x0 for i = 1, . . . , t. Then, βi = max{ati x | x ∈ P } (i = 1, . . . , t). Let
SF be the system at1 x ≤ β1 , . . . , att x ≤ βt . All inequalities in SF are valid for P . Let Ax ≤ b
be the union of the systems SF over all minimal faces F of P . Then, P ⊆ {x ∈ Rn | Ax ≤ b}.
On the other hand, if x∗ ∈ Rn \ P , then there is a supporting hyperplane of P separating x∗
from P , and this supporting hyperplane touches P in a minimal face, so there is an inequality
in Ax ≤ b that is violated by x∗ . Hence, P = {x ∈ Rn | Ax ≤ b}. Moreover, by Theorem 85,
Ax ≤ b is TDI.
If P is integral, then all the $\beta_i$ can be chosen to be integral because we can choose the vectors $x_0 \in F$ as integral vectors. On the other hand, if b is integral, then by Theorem 81, P is integral.
2
In this section, we want to identify integral matrices A such that Ax ≤ b, x ≥ 0 is TDI for
any vector b. It will turn out that these are exactly the totally unimodular matrices (see
Corollary 92).
In particular, a regular square matrix is unimodular if and only if it is integral and its determinant
is −1 or 1. Moreover, by Cramer’s rule, the inverse of any unimodular square matrix is an
integral matrix.
Exercise: Check that any series of elementary unimodular column operations, applied to a
matrix A (see Chapter 8.2), can be performed by multiplying A from the right by an appropriate
regular unimodular square matrix.
Theorem 88 Let A be a totally unimodular matrix, and let b be an integral vector. Then,
the polyhedron P = {x ∈ Rn | Ax ≤ b} is integral.
Proof: Let F be a minimal face of P . We will show that F contains an integral vector. By
the implication “(c) ⇒ (a)” of Theorem 79 this is sufficient to prove that P is integral.
By Proposition 23, we can write the minimal face as F = {x ∈ Rn | A′ x = b′ } where A′ x ≤ b′
is a subsystem of Ax ≤ b. We can assume that A′ has full row rank. By permuting coordinates,
we can write $A' = \begin{pmatrix} U & V \end{pmatrix}$ for some matrix U with $\det(U) \in \{-1, 1\}$. Thus $x := \begin{pmatrix} U^{-1}b'\\ 0 \end{pmatrix}$ is an integral vector in F. 2
Theorem 89 Let A ∈ Zm×n be a matrix with rank m. Then A is unimodular if and only
if for each integral vector b the polyhedron {x ∈ Rn | Ax = b, x ≥ 0} is integral.
Proof: “⇒:” Assume that A is unimodular, and let b be an integral vector. Let x′ be a vertex
of {x ∈ Rn | Ax = b, x ≥ 0}. This means that there are n linearly independent constraints
in the system Ax ≤ b, −Ax ≤ −b, −In x ≤ 0 that are satisfied by x′ with equality. Thus, the
columns of A corresponding to non-zero entries of x′ are linearly independent. This set of
columns can be extended to a regular m × m-submatrix B of A. Then, the restriction of x′ to
coordinates corresponding to B is B −1 b. This is integral (because det(B) ∈ {−1, 1}). The other
entries of x′ are zero, so x′ is integral.
“⇐:” Suppose that {x ∈ Rn | Ax = b, x ≥ 0} is integral for every integral vector b. Let B be
a regular m × m-submatrix of A. We have to show that det(B) ∈ {−1, 1}. To this end, it
is sufficient to show that B −1 u is integral for every integral vector u (by Cramer’s rule). So
let u be an integral vector. Then, there is an integral vector y such that z := y + B −1 u ≥ 0.
Then, b := Bz is integral. Let z ′ be a vector with Az ′ = Bz = b that arises from z by adding
zero-entries. Then, z ′ is a feasible (i.e. non-negative) basic solution of Ax = b, so it is a vertex
of {x ∈ Rn | Ax = b, x ≥ 0}. Therefore z ′ is integral, which also shows that z is integral. This
implies that B −1 u = z − y is integral. 2
Proof: The matrix A is totally unimodular if and only if $[I_m\ A]$ is unimodular. Let b be an integral vector. Then, the vertices of $\{x \in \mathbb{R}^n \mid Ax \le b,\ x \ge 0\}$ are integral if and only if the vertices of $\{z \in \mathbb{R}^{m+n} \mid [I_m\ A]\,z = b,\ z \ge 0\}$ are integral. Thus, the statement follows from Theorem 89. 2
Corollary 91 An integral matrix A is totally unimodular if and only if for all integral vectors b and c both optima in the duality equation
$$\max\{c^tx \mid Ax \le b,\ x \ge 0\} = \min\{b^ty \mid A^ty \ge c,\ y \ge 0\}$$
are attained by integral vectors (if they are finite).
Proof: Follows directly from Hoffman and Kruskal's Theorem (Theorem 90) using the fact that a matrix is totally unimodular if and only if its transposed matrix is totally unimodular. 2
Proof: “⇒:” If A is totally unimodular, then also At is totally unimodular. Thus, by Theo-
rem 90, min{bt y | At y ≥ c, y ≥ 0} is attained by an integral vector for each vector b and each
integral vector c for which the minimum is finite. This implies that the system Ax ≤ b, x ≥ 0 is
TDI for each vector b.
“⇐:” Suppose that Ax ≤ b, x ≥ 0 is TDI for each vector b. By Theorem 81 this implies that the
polyhedron {x ∈ Rn | Ax ≤ b, x ≥ 0} is integral for each integral vector b. By Theorem 90, this
means that A is totally unimodular. 2
The following theorem provides a certificate to show that a matrix is totally unimodular.
Proof: “⇒:” Let A be totally unimodular and $R \subseteq \{1,\dots,n\}$. Let $d \in \{0,1\}^n$ be the characteristic vector of R, i.e.
$$d_r = \begin{cases} 1 & \text{for } r \in R\\ 0 & \text{for } r \in \{1,\dots,n\}\setminus R \end{cases}$$
Since A is totally unimodular, the matrix $\begin{pmatrix} A\\ -A\\ I_n \end{pmatrix}$ is also totally unimodular. Thus, the polytope
$$P := \left\{x \in \mathbb{R}^n \,\middle|\, Ax \le \left\lceil\tfrac{1}{2}Ad\right\rceil,\ Ax \ge \left\lfloor\tfrac{1}{2}Ad\right\rfloor,\ x \le d,\ x \ge 0\right\}$$
is integral (its right-hand sides are integral, so we can apply Theorem 88). Since $\frac{1}{2}d \in P$, the polytope is non-empty and therefore contains an integral vector z. Define $R_1 := \{r \in R \mid z_r = 0\}$ and $R_2 := \{r \in R \mid z_r = 1\}$. For $i \in \{1,\dots,m\}$, this yields
$$\sum_{j\in R_1} a_{ij} - \sum_{j\in R_2} a_{ij} = \sum_{j=1}^n a_{ij}(d_j - 2z_j) \in \{-1, 0, 1\}$$
because $\lfloor\tfrac{1}{2}Ad\rfloor \le Az \le \lceil\tfrac{1}{2}Ad\rceil$. Hence $R = R_1\,\dot\cup\,R_2$ is a partition of the required kind.
“⇐:” Assume that for each $R \subseteq \{1,\dots,n\}$ there are sets $R_1, R_2 \subseteq R$ with $R = R_1\,\dot\cup\,R_2$ as described in the theorem. We show by induction on k that every k × k-submatrix of A has determinant −1, 0, or 1. For k = 1 this follows from the criterion for |R| = 1.
Let k > 1. Let $B = (b_{ij})_{i,j\in\{1,\dots,k\}}$ be a submatrix of A. We can assume that B is non-singular because otherwise its determinant is 0.
By Cramer's rule, each entry of $B^{-1}$ is $\frac{\det(B')}{\det(B)}$ where B′ arises from B by replacing a column by a unit vector. By the induction hypothesis, det(B′) ∈ {−1, 0, 1}. Hence, all entries of the matrix $B^* := (\det(B))B^{-1}$ are in {−1, 0, 1}.
Let $b^*$ be the first column of $B^*$. Then, $Bb^* = \det(B)e_1$ where $e_1$ is the first unit vector. We define $R := \{j \in \{1,\dots,k\} \mid b^*_j \ne 0\}$. For $i \in \{2,\dots,k\}$, we have $0 = (Bb^*)_i = \sum_{j\in R} b_{ij}b^*_j$, so $|\{j \in R \mid b_{ij} \ne 0\}|$ is even.
Let $R = R_1\,\dot\cup\,R_2$ such that $\sum_{j\in R_1}b_{ij} - \sum_{j\in R_2}b_{ij} \in \{-1,0,1\}$ for all $i \in \{1,\dots,k\}$. Thus, for $i \in \{2,\dots,k\}$, we have (since $|\{j\in R \mid b_{ij}\ne 0\}|$ is even): $\sum_{j\in R_1}b_{ij} - \sum_{j\in R_2}b_{ij} = 0$. If we also had $\sum_{j\in R_1}b_{1j} - \sum_{j\in R_2}b_{1j} = 0$, then the columns of B would not be linearly independent. Hence, $\sum_{j\in R_1}b_{1j} - \sum_{j\in R_2}b_{1j} \in \{-1,1\}$ and thus $Bx \in \{e_1, -e_1\}$ where the vector $x \in \{-1,0,1\}^k$ is defined by
$$x_j = \begin{cases} 1 & \text{for } j \in R_1\\ -1 & \text{for } j \in R_2\\ 0 & \text{for } j \in \{1,\dots,k\}\setminus R \end{cases}$$
Therefore, $b^* = \det(B)B^{-1}e_1 \in \{\det(B)x,\ -\det(B)x\}$. But both $b^*$ and x are non-zero vectors with entries −1, 0, 1 only, so we can conclude that det(B) ∈ {−1, 1}. 2
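Directly checking the definition is of course exponential, but for tiny matrices it is a handy sanity check. The following brute-force test (a sketch of our own, only suitable for very small matrices because it enumerates all square submatrices and uses floating-point determinants) verifies total unimodularity:

```python
import numpy as np
from itertools import combinations

def is_totally_unimodular(A):
    """Brute-force check: every square submatrix of A must have
    determinant -1, 0 or 1.  Exponential running time."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                d = int(round(np.linalg.det(A[np.ix_(rows, cols)])))
                if d not in (-1, 0, 1):
                    return False
    return True
```

Applied to the incidence matrix of an odd cycle it returns False, while for incidence matrices of small bipartite graphs it returns True, in accordance with the results below.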
This result allows us to prove total unimodularity for some quite important matrices: The incidence matrix of an undirected graph G is the matrix $A_G = (a_{v,e})_{v \in V(G),\, e \in E(G)}$ which is defined by
$$a_{v,e} = \begin{cases} 1, & \text{if } v \in e\\ 0, & \text{if } v \notin e \end{cases}$$
The incidence matrix of a directed graph G is the matrix $A_G = (a_{v,e})_{v \in V(G),\, e \in E(G)}$ which is defined by
$$a_{v,(x,y)} = \begin{cases} -1, & \text{if } v = x\\ 1, & \text{if } v = y\\ 0, & \text{if } v \notin \{x, y\} \end{cases}$$
Proof: Let G be an undirected graph and AG its incidence matrix. Since a matrix is TU if
and only if its transposed matrix is TU, we can apply Theorem 93 to the rows of AG : AG is TU
if and only if for each X ⊆ V (G) there is a partition X = A∪B ˙ with E(G[A]) = E(G[B]) = ∅.
The last condition is satisfied if and only if G is bipartite. 2
Applications:
• The previous theorem can be used to show König's Theorem: The maximum cardinality of a matching in a bipartite graph equals the minimum cardinality of a vertex cover. To see this, let G be a bipartite graph and $A_G$ its incidence matrix. Then, a maximum matching is given by an integral solution of $\max\{\mathbb{1}_m^t x \mid A_G x \le \mathbb{1}_n,\ x \ge 0\}$ and a minimum vertex cover by an integral solution of $\min\{\mathbb{1}_n^t y \mid A_G^t y \ge \mathbb{1}_m,\ y \ge 0\}$. By the previous theorem, $A_G$ is TU, so by Corollary 91 both optima are attained by integral vectors.
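As a quick numerical illustration (a sketch of our own; it relies on scipy's LP solver returning a vertex solution, which by total unimodularity is then integral up to numerical tolerance):

```python
import numpy as np
from scipy.optimize import linprog

# Small bipartite example (a 4-cycle); vertices 0..3, edges as vertex pairs.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
n, m = 4, len(edges)

# Incidence matrix A_G: rows = vertices, columns = edges.
A = np.zeros((n, m))
for j, (u, v) in enumerate(edges):
    A[u, j] = A[v, j] = 1

# Matching LP: max 1^t x  s.t.  A x <= 1, x >= 0  (linprog minimizes, so negate c).
matching = linprog(c=-np.ones(m), A_ub=A, b_ub=np.ones(n), bounds=(0, None))
# Vertex cover LP: min 1^t y  s.t.  A^t y >= 1, y >= 0.
cover = linprog(c=np.ones(n), A_ub=-A.T, b_ub=-np.ones(m), bounds=(0, None))

print(matching.x, -matching.fun)   # integral optimum (up to tolerance), value 2
print(cover.x, cover.fun)          # integral optimum (up to tolerance), value 2
```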
Proof: Again, we apply Theorem 93 to the transpose of the incidence matrix. For any set $R \subseteq \{1, \dots, m\}$ we can choose $R_1 := R$ and $R_2 := \emptyset$, satisfying the constraints of Theorem 93. 2
Remark: This result gives a reason for the existence of integral optimum solutions of flow problems. These results can be extended to more general linear functions on the edges of directed graphs (see exercises).
The general strategy of cutting-plane methods can be described as follows: Assume that we are given a polyhedron P and we want to optimize a linear function over the integral vectors in P. To this end, we first find an optimum solution $x^*$ over P. If this belongs to $P_I$, we are done, because then we can also easily compute an integral solution of the same cost. Otherwise, we look for a hyperplane separating $x^*$ from $P_I$, so we ask for a vector c and a number δ such that $c^tx \le \delta$ for all $x \in P_I$ but $c^tx^* > \delta$. Then, we add the constraint $c^tx \le \delta$, solve the linear program again and iterate these steps until we get an integral solution.
How can we find half-spaces that contain $P_I$ but not necessarily P? An easy observation is that if H is a half-space that contains P, then $P_I$ is contained in $H_I$. This motivates the following definition:
Definition 29 Let $P \subseteq \mathbb{R}^n$ be a convex set. Let M be the set of all rational half-spaces $H = \{x \in \mathbb{R}^n \mid c^tx \le \delta\}$ with $P \subseteq H$. Then, we define
$$P' := \bigcap_{H \in M} H_I.$$
We set $P^{(0)} := P$ and $P^{(i)} := \left(P^{(i-1)}\right)'$ for $i \in \mathbb{N}\setminus\{0\}$. $P^{(i)}$ is the i-th Gomory-Chvátal-truncation of P.
$$x^* = \frac{1}{\alpha}\bigl(\alpha x^* - (\alpha-1)y\bigr) + \frac{\alpha-1}{\alpha}\,y.$$
Since $c^t\bigl(\alpha x^* - (\alpha-1)y\bigr) \le c^ty = \lfloor\delta\rfloor$, this shows that $x^*$ is a convex combination of two integral vectors in H, so $x^* \in H_I$. 2
Proposition 97 Let P = {x ∈ Rn | Ax ≤ b} be a rational polyhedron. Then
Let ũ be an optimum solution of the minimum. Since ũt A = ct is integral, this leads to
ũt Az ≤ ⌊ũt b⌋, so
ct z = ũt Az ≤ ⌊ũt b⌋ ≤ ⌊δ⌋.
By the previous lemma, this implies z ∈ HI . Since this is true for any half-space H containing
P , it also shows z ∈ P ′ . 2
Cuts that are given by inequalities of the type ut Ax ≤ ⌊ut b⌋ (for some vector u ≥ 0 with ut A
integral) are called Gomory-Chvátal cuts. They have been used for the first algorithms for
integer linear programming based on cutting planes (see Gomory [1963]).
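As a small illustration of how such a cut is generated (a sketch of our own; in practice the multiplier vector u would come from an optimal dual solution or a simplex tableau, here it is simply given):

```python
import numpy as np

def gomory_chvatal_cut(A, b, u):
    """Return the Gomory-Chvatal cut  (u^t A) x <= floor(u^t b)  for a given
    multiplier vector u >= 0 with u^t A integral (this is checked)."""
    u = np.asarray(u, dtype=float)
    assert np.all(u >= 0)
    lhs = u @ A
    assert np.allclose(lhs, np.round(lhs)), "u^t A must be integral"
    return np.round(lhs), np.floor(u @ b)

# Example: P = {x >= 0 | 2 x1 + 2 x2 <= 3}.  With u = (1/2, 0, 0) we obtain
# the cut x1 + x2 <= 1, which is valid for P_I but cuts off e.g. (3/2, 0).
A = np.array([[2.0, 2.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([3.0, 0.0, 0.0])
print(gomory_chvatal_cut(A, b, [0.5, 0.0, 0.0]))   # (array([1., 1.]), 1.0)
```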
Since Ax ≤ b is TDI, the minimum is attained by an integral vector ỹ. Thus,
ut Ax̃ = ỹ t Ax̃ ≤ ỹ t ⌊b⌋ ≤ ⌊ỹ t b⌋ ≤ ⌊ut b⌋.
This shows P ′ ⊇ {x ∈ Rn | Ax ≤ ⌊b⌋}. 2
Proof: Follows from the previous theorem and the fact that any rational polyhedron can be
described by a rational TDI system with integral matrix (Theorem 87). 2
Dunkel and Schulz [2013] have shown that for any polytope P (no matter whether it is rational or not), P′ is a polytope.
Now assume in addition that P is rational. Since U is unimodular, U x is integral if and only if
x is integral. This implies
(f (P ))I = conv({y ∈ Zn | y = U x, x ∈ P })
= conv({y ∈ Rn | y = U x, x ∈ P, x ∈ Zn })
= conv({y ∈ Rn | y = U x, x ∈ PI })
= f (PI ).
By Theorem 87, we can assume that Ax ≤ b is TDI, A is integral and b is rational. Then, for
any integral vector c for which min{bt y | y t AU −1 = ct , y ≥ 0} is feasible and bounded, also
min{bt y | y t A = ct U, y ≥ 0} is feasible and bounded and ct U is integral. Hence AU −1 x ≤ b is
TDI. Thus, Theorem 98 implies
(f (P ))′ = {x ∈ Rn | AU −1 x ≤ b}′ = {x ∈ Rn | AU −1 x ≤ ⌊b⌋} = f (P ′ ).
2
Remark: This shows as well that (f (P ))(i) = f (P (i) ) for a rational polyhedron P and i ∈ N.
Theorem 103 (Schrijver [1980]) For every rational polyhedron P , there is a number t
with P (t) = PI .
PI = {x ∈ Rn | Cx ≤ d} with some integral matrix C and some rational vector d. If PI = ∅, we
choose C = A and d = b − A′ 11n where A′ arises from A by taking the absolute value of each
entry. Note that {x ∈ Rn | Ax + A′ 11n ≤ b} = ∅ because any vector x∗ with Ax∗ + A′ 11n ≤ b
could be rounded down to an integral vector x with Ax ≤ b.
Let ct x ≤ δ be an inequality in Cx ≤ d. Then, we claim that there is an s ∈ N with
P (s) ⊆ H := {x ∈ Rn | ct x ≤ δ}. The theorem is a direct consequence of this claim.
Proof of the claim: Observe that there is an integral number β ≥ δ with P ⊆ {x ∈ Rn | ct x ≤ β}.
If PI = ∅, this is true by construction. In the case PI ≠ ∅, it follows from the fact that ct x is
bounded over P if and only if it is bounded over PI (Proposition 75).
Assume that the claim is false, so there is an integer γ with δ < γ ≤ β for which there is an
s0 ∈ N with P (s0 ) ⊆ {x ∈ Rn | ct x ≤ γ} but there is no s ∈ N with P (s) ⊆ {x ∈ Rn | ct x ≤ γ−1}.
Then, max{ct x | x ∈ P (s) } = γ for all s ≥ s0 . To see this, assume that max{ct x | x ∈ P (s) } < γ
for some s. Then there is an ϵ > 0 with P (s) ⊆ {x ∈ Rn | ct x ≤ γ − ϵ}. This implies
max{ct x | x ∈ P (s+1) } ≤ γ − 1 because {x ∈ Rn | ct x ≤ γ − ϵ}I ⊆ {x ∈ Rn | ct x ≤ γ − 1}.
Define F := P (s0 ) ∩ {x ∈ Rn | ct x = γ}. Then, dim(F ) < n = dim(P ), so we can apply the
induction hypothesis to F , which implies that there is a number s1 with F (s1 ) = FI . Thus,
F (s1 ) = FI ⊆ PI ∩ {x ∈ Rn | ct x = γ} = ∅.
Proof: We claim that there is a rational polytope Q with P ⊆ Q such that QI = PI . This
is sufficient to show the theorem because we can apply the previous theorem to Q. To prove
the claim, let W be a hypercube containing P . Then, there are finitely many integral vectors
z ∈ W \ P . For each such vector z we choose a rational hyperplane separating z from P . The
set of all the inequalities corresponding to these separating hyperplanes together with the
inequalities that define W give a description of a set Q with the desired properties. 2
For a polyhedron P, the smallest number t with $P^{(t)} = P_I$ (if there is such a number) is called the Chvátal rank of P.
Nevertheless, they are of great practical relevance. Algorithm 5 describes the approach for
integer linear programs but it can be applied to mixed integer linear programs, too. The
algorithm stores a number L which is the cost of the best integral solution found so far (so in
the beginning it is −∞). In each iteration of the main loop, the algorithm chooses a polyhedron
Pj , which is a subset of the given polyhedron P0 , and solves the corresponding linear program. If
this LP is bounded and feasible, the algorithm first checks if the value c∗ of an optimum solution
x∗ is larger than L. If this is not the case, the algorithm can reject the polyhedron Pj because
it cannot contain a better integral solution than the best current solution (this is the bounding
part). If c∗ > L and x∗ is integral, we have found a better integral solution and can update L.
Otherwise, we choose a non-integral component x∗i of x∗ and compute sub-polyhedra P2j+1 and
P2j+2 of Pj with additional constraints that arise by rounding x∗i up or down (branching step).
19 if L > −∞ then
20 return x̃;
21 else
22 return “There is no feasible solution”;
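The following compact sketch (our own; it uses scipy's LP solver for the relaxations, a last-in-first-out choice of the next sub-polyhedron, and no initial heuristic solution) implements the scheme of Algorithm 5 for maximization problems whose LP relaxations are bounded:

```python
import math
import numpy as np
from scipy.optimize import linprog

def branch_and_bound(c, A, b):
    """Sketch of branch-and-bound for max{c^t x | A x <= b, x integral}.
    Assumes all LP relaxations are bounded; branches on the first
    fractional component."""
    L, best = -math.inf, None
    K = [(A, b)]                      # list of open sub-polyhedra P_j
    while K:
        A_j, b_j = K.pop()            # last-in-first-out choice of P_j
        res = linprog(-np.asarray(c), A_ub=A_j, b_ub=b_j, bounds=(None, None))
        if not res.success:           # P_j empty: reject
            continue
        x, value = res.x, -res.fun
        if value <= L:                # bounding step
            continue
        frac = [i for i, xi in enumerate(x) if abs(xi - round(xi)) > 1e-6]
        if not frac:                  # integral solution: update best
            L, best = value, np.round(x)
            continue
        i = frac[0]                   # branching step on component x_i
        row = np.zeros(len(x))
        row[i] = 1.0
        K.append((np.vstack([A_j, row]), np.append(b_j, math.floor(x[i]))))
        K.append((np.vstack([A_j, -row]), np.append(b_j, -math.ceil(x[i]))))
    return best, L

# Hypothetical test instance: max{x1 + x2 | 2 x1 + 2 x2 <= 3, x >= 0, x integral}.
A0 = np.array([[2.0, 2.0], [-1.0, 0.0], [0.0, -1.0]])
b0 = np.array([3.0, 0.0, 0.0])
print(branch_and_bound([1.0, 1.0], A0, b0))   # optimum value 1.0
```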
Example: Consider the following ILP:
$$\max\{-x_1 + 2x_2 \mid -4x_1 + 6x_2 \le 9,\ x_1 + x_2 \le 4,\ x_1, x_2 \ge 0,\ x_1, x_2 \in \mathbb{Z}\}$$
Figure 9 illustrates what the algorithm may do on this instance. Since the optimum solution of the LP-relaxation is not integral, we create in the first branching step two sub-polytopes $P_1 = \{(x_1, x_2) \mid x_2 \le 2\} \cap P_0$ and $P_2 = \{(x_1, x_2) \mid x_2 \ge 3\} \cap P_0 = \emptyset$. In $P_1$ we still do not find an integral optimum solution, so we branch again and get the polytopes $P_3$ and $P_4$. In $P_4$ we get an integral optimum $x^* = (1, 2)$ with cost 3. In $P_3$ we get a non-integral optimum solution $(0, 1.5)$ whose cost is not better than the best integral solution found so far (provided that we considered $P_4$ before $P_3$), so the algorithm will stop here.
A branch-and-bound computation is often represented by a so-called branch-and-bound tree.
This is in fact rather an arborescence than a tree. Its nodes are the polyhedra Pj that are
considered during the computation, and P0 is the root. For any Pj , the nodes P2j+1 and P2j+2
are its children (if they exist).
In line 5 of the algorithm, we have to choose the next LP to be solved, and in line 15 we have to
decide which non-integral component is used for creating new sub-problems. There are different
strategies for these steps (branching rules). For example, it is often reasonable to store the
elements of K in a last-in-first-out queue and to choose the last element that has been added to
K. In the branch-and-bound tree, this corresponds to a leaf with the biggest distance to the
root. This strategy can reduce the time until the first feasible solution has been found. Another reasonable branching rule consists of choosing a polyhedron $P_j$ for which $\max\{c^tx \mid x \in P_j\}$ is as large as possible. Note that the maximum over all these values for all $P_j \in K$ gives an upper bound U on the best possible solution that can still be computed. Hence, by choosing a $P_j$ with $\max\{c^tx \mid x \in P_j\} = U$, we get a chance to reduce U. This can be useful if we do not want to compute an exact optimum solution but rather stop as soon as U − L is small enough.
For the choice of $x_i^*$ a common strategy is to choose $x_i^*$ such that $\left|x_i^* - \lfloor x_i^*\rfloor - \frac{1}{2}\right|$ is minimized. Another, more time-consuming approach is to choose $x_i^*$ such that the effect on the objective function is maximized (strong branching).
Further remarks:
• In order to get at least a finite algorithm, we have to guarantee that in line 8 we always find an integral optimum solution if $P_j$ is integral.
• Instead of initializing L with −∞, it is often possible to compute some reasonable integral
solution by some heuristics. In particular this is often the case for combinatorial problems.
• If one only wants to compute an approximate solution, the condition $c^* > L$ can be strengthened to $c^* > L(1 + \epsilon)$. This way, some optimum solution may be cut off, but the running time can sometimes be reduced drastically. In many cases, an optimum solution is computed quite early but then much time is needed just in order to prove that the solution is optimum.

Fig. 9: The polytopes considered by the branch-and-bound algorithm on the example instance: $P_0$, $P_1$ (with $P_2 = \emptyset$), and the sub-polytopes of the second branching step, together with the lines $-4x_1 + 6x_2 = 9$, $x_1 + x_2 = 4$, $x_2 = 2$, $x_2 = 3$, $x_1 = 0$, and $x_2 = 1$, and the respective LP optima $x^*$.
• The branch-and-bound strategy can be combined with a cutting-plane algorithm (see the
previous section). For each sub-polyhedron Pj , one can try to find hyperplanes separating
some non-integral vectors in Pj from (Pj )I . This combination is called branch-and-cut
method. For example, this approach has been used for solving quite large Traveling Salesman
Problems (see Padberg and Rinaldi [1991]).
Bibliography
Adler, I., Karp, R.M., Shamir, R. [1987]: A simplex variant solving an m × d linear program in
O(min(m2 , d2 )) expected number of steps. Journal of Complexity, 3, 372–387, 1987.
Ahuja, R.K., Magnanti, T.L., and Orlin, J.B. [1993]: Network Flows: Theory, Algorithms, and Applications.
Prentice Hall, 1993.
Anthony, M., and Harvey, M. [2012]: Linear Algebra: Concepts and Methods. Cambridge University
Press, 2012.
Bárány, I., Howe, R., and Lovász, L. [1992]: On integer points in polyhedra: a lower bound. Combina-
torica, 12, 135–142, 1992.
Bertsimas, D., and Tsitsiklis, J.N. [1997]: Introduction to Linear Optimization. Athena Scientific, 1997.
Bertsimas, D., and Weismantel, R. [2005]: Optimization over Integers. Dynamic Ideas, 2005.
Bland, R.G. [1977]: New finite pivoting rules for the simplex method. Mathematics of Operations
Research, 2, 103–107, 1977.
Borgwardt, K. [1982]: The average number of pivot steps required by the simplex method is polynomial.
Zeitschrift für Operations Research, 26, 157–177, 1982.
Chvátal, V. [1983]: Linear programming. Series of books in the mathematical sciences, W.H. Freeman,
1983.
Cunningham, W.H. [1976]: A network simplex method. Mathematical Programming, 11, 105–116,
1976.
Dadush, D. and Huiberts, S. [2019]: A Friendly Smoothed Analysis of the Simplex Method. SIAM Journal on Computing, 49, 5, 449–499, 2019.
Dantzig, G.B. [1951]: Maximization of a linear function of variables subject to linear inequalities. In:
Koopmans, T.C. (ed.), Activity Analysis of Production and Allocation, 359–373, Wiley, 1951.
Dunkel, J. and Schulz, A.S. [2013]: The Gomory-Chvátal closure of a non-rational polytope is a rational
polytope. Mathematics of Operations Research, 38, 1, 63–91, 2013.
Edmonds, J. [1965]: Maximum matching and a polyhedron with 0,1-vertices. Journal of Research of
the National Bureau of Standards, B, 69, 125–130, 1965.
Eisenbrand, F. [2003]: Fast integer programming in fixed dimension. Lecture Notes in Computer
Science, 2832, 196–207, 2003.
Fischer, G. [2009]: Lineare Algebra: Eine Einführung für Studienanfänger. 18th edition, Springer,
2013.
Ghouila-Houri, A. [1962]: Caractérisation des matrices totalement unimodulaires. Comptes Rendus
Hebdomadaires des Séances de l’Académie des Sciences (Paris), 254, 1192-1194, 1962.
Giles, F.R. and Pulleyblank, W.R. [1979]: Total dual integrality and integer polyhedra. Linear Algebra
and Its Applications, 25, 191–196, 1979.
Goldfarb, D. and Sit, W.Y. [1979]: Worst case behavior of the steepest edge simplex method. Discrete
Applied Mathematics, 1, 4, 277–285, 1979.
Gomory, R.E. [1963]: An algorithm for integer solutions of linear programs. In: Recent Advances in
Mathematical Programing (R.L. Graves, P. Wolfe, eds.), McGraw-Hill, 269–302, 1963.
Grötschel, M., Lovász, L. and Schrijver, A. [1981]: The ellipsoid method and its consequences in
combinatorial optimization. Combinatorica, 1, 169–197, 1981.
Guenin, B., Könemann, J., and Tunçel, L. [2014]: A Gentle Introduction to Optimization. Cambridge
University Press, 2014.
Hoffman, A. and Kruskal, J. [1956]: Integral boundary points of convex polyhedra. Linear Inequalities
and Related Systems (H. Kuhn, A. Tucker, eds.), Annals of Mathematics Studies, 38, 223–246, 1956.
Hougardy, S., and Vygen, J. [2018]: Algorithmische Mathematik. Second edition, Springer, 2018.
Kafer, S. [2022]: Polyhedral Diameters and Applications to Optimization. PhD Thesis, University of
Waterloo, 2022.
Kalai, G., and Kleitman, D. [1992]: A quasi-polynomial bound for the diameter of graphs of polyhedra.
Bulletin of the American Mathematical Society, 26, 315–316, 1992.
Khachiyan, L. [1979]: A polynomial algorithm for linear programming. Soviet Mathematics Doklady,
20, 191–194, 1979.
Klee, V., and Minty, G.J. [1972]: How good is the simplex algorithm? In: Inequalities III (O. Shisha,
ed.), Academic Press, 159–175, 1972.
Korte, B., and Vygen, J. [2018]: Combinatorial Optimization: Theory and Algorithms. Sixth edition,
Springer, 2018.
Lee, T., Sidford, A., Wong, S.C. [2015]: A Faster Cutting Plane Method and its Implications for
Combinatorial and Convex Optimization. arxiv.org/abs/1508.04874, Symposium on Foundations of
Computer Science, 2015.
Lenstra, H.W. [1983]: Integer programming with a fixed number of variables. Mathematics of Operations
Research, 8, 538–548, 1983.
Matoušek, J., and Gärtner, B. [2007]: Understanding and Using Linear Programming. Springer, 2007.
Megiddo, N. [1984]: Linear programming in linear time when the dimension is fixed. Journal of the
ACM, 31, 114–127, 1984.
Mehlhorn, K., and Saxena, S. [2015]: A still simpler way of introducing the interior-point method for
linear programming. Computer Science Review, 22, 1–11, 2016.
Padberg, M. [1999]: Linear Optimization and Extensions. Second edition, Springer, 1999
Padberg, M., and Rao, M. [1982]: Odd minimum cut-sets and b-matchings. Mathematics of Operations
Research, 7, 67–80, 1982.
Padberg, M., and Rinaldi, G. [1991]: A Branch-and-Cut Algorithm for the Resolution of Large-Scale
Symmetric Traveling Salesman Problems. SIAM Review, 33, 1, 60–100, 1991.
Panik, M.J. [1996]: Linear Programming: Mathematics, Theory and Algorithms. Kluwer Academic
Publishers, 1996.
Roos, C., Terlaky, T., Vial, J.-P. [2005]: Interior Point Methods for Linear Optimization. Second
edition, Springer, 2005.
Rubin, D. [1970]: On the unlimited number of faces in integer hulls of linear programs with a single
constraint. Operations Research, 18, 5, 940 – 946, 1970.
Santos, F. [2011]: A counterexample to the Hirsch conjecture. Annals of Mathematics, 176, 1, 383–412,
2011.
Sierksma, G., and Zwols, Y. [2015]: Linear and Integer Optimization. Theory and Practice. Third
edition, CRC Press, 2015.
Spielman, D.A., and Teng, S.-H. [2005]: Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time. Journal of the ACM, 51, 3, 385–463, 2004.
Strang, G. [1980]: Linear Algebra and Its Applications. Second edition, Academic Press, 1980.
Tardos, É. [1986]: A strongly polynomial algorithm to solve combinatorial linear programs. Operations
Research, 34, 2, 250 – 256, 1986.
Terlaky, T. [2001]: An easy way to teach interior point methods. European Journal of Operational
Research, 130, 1–19, 2001.
Todd, M. [2014]: An improved Kalai-Kleitman bound for the diameter of a polyhedron. SIAM Journal on Discrete Mathematics, 28, 4, 1944–1947, 2014.
Vanderbei, R.J. [2014]: Linear Programming: Foundations and Extensions. Fourth edition, Springer,
2014.
Ye, Y. [1992]: On the finite convergence of interior-point algorithms for linear programming. Mathe-
matical Programming, 57, 325–335, 1992.
Ye, Y. [1997]: Interior Point Algorithms. Theory and Analysis. Wiley, 1997.