Lectures 10 and 11 focus on constrained optimization problems, defining optimality conditions and illustrating them with examples. The Karush-Kuhn-Tucker (KKT) conditions are introduced as necessary conditions for optimality in constrained problems, along with the concept of Lagrange multipliers. The lectures emphasize the importance of constraint qualifications to derive these optimality conditions.

Lectures 10 and 11: Constrained optimization problems and their optimality conditions

Coralia Cartis, Mathematical Institute, University of Oxford

C6.2/B2: Continuous Optimization
Problems and solutions

minimize f(x) subject to x ∈ Ω ⊆ Rⁿ.

f : Ω → R is (sufficiently) smooth.
f is the objective; x are the variables.
Ω is the feasible set, determined by finitely many (equality and/or inequality) constraints.

x∗ is a global minimizer of f over Ω ⇐⇒ f(x) ≥ f(x∗) for all x ∈ Ω.

x∗ is a local minimizer of f over Ω ⇐⇒ there exists N(x∗, δ) such that f(x) ≥ f(x∗) for all x ∈ Ω ∩ N(x∗, δ).
• N(x∗, δ) := {x ∈ Rⁿ : ‖x − x∗‖ ≤ δ}.
Example problem in one dimension

Example: min f(x) subject to a ≤ x ≤ b.

[Figure: graph of f(x) on [a, b], marking the interior points x₁ and x₂.]

The feasible region Ω is the interval [a, b].
The point x₁ is the global minimizer; x₂ is a local (non-global) minimizer; x = a is a constrained local minimizer.
An example of a nonlinear constrained problem

min_{x∈R²} (x₁ − 2)² + (x₂ − 0.5(3 − √5))² subject to
−x₁ − x₂ + 1 ≥ 0, x₂ − x₁² ≥ 0.

[Figure: contours of f and the feasible set Ω, bounded by the constraint curves c₁ and c₂, with the solution x∗ on the boundary of Ω.]

x∗ = 0.5(−1 + √5, 3 − √5)ᵀ; Ω is the feasible set.
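This example recurs throughout the lectures, and it is small enough to hand to a general-purpose solver as a sanity check. A minimal sketch in Python (assuming numpy and scipy are available; the solver choice and starting point are arbitrary):

import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 2)**2 + (x[1] - 0.5*(3 - np.sqrt(5)))**2
cons = [{'type': 'ineq', 'fun': lambda x: -x[0] - x[1] + 1},  # linear constraint >= 0
        {'type': 'ineq', 'fun': lambda x: x[1] - x[0]**2}]    # parabola constraint >= 0

res = minimize(f, x0=np.zeros(2), method='SLSQP', constraints=cons)
x_star = 0.5 * np.array([-1 + np.sqrt(5), 3 - np.sqrt(5)])
print(res.x, x_star)  # the two should agree to solver tolerance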
Optimality conditions for constrained problems

Optimality conditions are algebraic characterizations of solutions −→ suitable for computations. They:
provide a way to guarantee that a candidate point is optimal (sufficient conditions);
indicate when a point is not optimal (necessary conditions).

minimize_{x∈Rⁿ} f(x) subject to cE(x) = 0, cI(x) ≥ 0.   (CP)

f : Rⁿ → R, cE : Rⁿ → Rᵐ and cI : Rⁿ → Rᵖ are (sufficiently) smooth;
• cI(x) ≥ 0 ⇔ cᵢ(x) ≥ 0, i ∈ I.
• Ω := {x : cE(x) = 0, cI(x) ≥ 0} is the feasible set of the problem.
Optimality conditions for constrained problems

unconstrained problem −→ x̂ stationary point (∇f(x̂) = 0).
constrained problem −→ x̂ Karush-Kuhn-Tucker (KKT) point.

Definition: x̂ is a KKT point of (CP) if there exist ŷ ∈ Rᵐ and λ̂ ∈ Rᵖ such that (x̂, ŷ, λ̂) satisfies
∇f(x̂) = Σ_{j∈E} ŷⱼ ∇cⱼ(x̂) + Σ_{i∈I} λ̂ᵢ ∇cᵢ(x̂),
cE(x̂) = 0, cI(x̂) ≥ 0,
λ̂ᵢ ≥ 0, λ̂ᵢ cᵢ(x̂) = 0, for all i ∈ I.

• Let A := E ∪ {i ∈ I : cᵢ(x̂) = 0} be the index set of active constraints at x̂; cⱼ(x̂) > 0 (constraint j inactive at x̂) ⇒ λ̂ⱼ = 0. Then
Σ_{i∈I} λ̂ᵢ ∇cᵢ(x̂) = Σ_{i∈I∩A} λ̂ᵢ ∇cᵢ(x̂).

• J(x) = (∇cᵢ(x)ᵀ)ᵢ is the Jacobian matrix of the constraints c. Thus
Σ_{j∈E} ŷⱼ ∇cⱼ(x̂) = JE(x̂)ᵀŷ and Σ_{i∈I} λ̂ᵢ ∇cᵢ(x̂) = JI(x̂)ᵀλ̂.
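To make the definition concrete, here is a minimal numerical KKT check for the example problem above (which has E = ∅, so only the multipliers λ̂ appear). A sketch: the gradients are hard-coded, the tolerance is an arbitrary choice, and the multiplier value √5 − 1 is the one quoted on the illustration slide below.

import numpy as np

sqrt5 = np.sqrt(5)
grad_f = lambda x: np.array([2*(x[0] - 2), 2*(x[1] - 0.5*(3 - sqrt5))])
c      = lambda x: np.array([x[1] - x[0]**2, -x[0] - x[1] + 1])  # c1, c2
jac_c  = lambda x: np.array([[-2*x[0], 1.0], [-1.0, -1.0]])      # rows: grad c_i

def is_kkt(x, lam, tol=1e-8):
    stationarity    = np.allclose(grad_f(x), jac_c(x).T @ lam, atol=tol)
    primal_feasible = np.all(c(x) >= -tol)
    dual_feasible   = np.all(lam >= -tol)
    complementarity = np.allclose(lam * c(x), 0.0, atol=tol)
    return stationarity and primal_feasible and dual_feasible and complementarity

x_star   = 0.5 * np.array([-1 + sqrt5, 3 - sqrt5])
lam_star = np.array([sqrt5 - 1, sqrt5 - 1])
print(is_kkt(x_star, lam_star))  # True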
Optimality conditions for constrained problems ...

x̂ KKT point −→ ŷ and λ̂ are the Lagrange multipliers of the equality and inequality constraints, respectively.
ŷ and λ̂ −→ sensitivity analysis.

L : Rⁿ × Rᵐ × Rᵖ → R is the Lagrangian function of (CP),
L(x, y, λ) := f(x) − yᵀcE(x) − λᵀcI(x), x ∈ Rⁿ.

Thus ∇ₓL(x, y, λ) = ∇f(x) − JE(x)ᵀy − JI(x)ᵀλ,
and x̂ KKT point of (CP) =⇒ ∇ₓL(x̂, ŷ, λ̂) = 0
(i.e., x̂ is a stationary point of L(·, ŷ, λ̂)).
• duality theory...
An illustration of the KKT conditions

min_{x∈R²} (x₁ − 2)² + (x₂ − 0.5(3 − √5))² subject to
−x₁ − x₂ + 1 ≥ 0, x₂ − x₁² ≥ 0.   (∗)

x∗ = ½(−1 + √5, 3 − √5)ᵀ:
• global solution of (∗),
• KKT point of (∗).

∇f(x∗) = (−5 + √5, 0)ᵀ,
∇c₁(x∗) = (1 − √5, 1)ᵀ,
∇c₂(x∗) = (−1, −1)ᵀ.

[Figure: the feasible set Ω with the gradients ∇f(x∗), ∇c₁(x∗) and ∇c₂(x∗) drawn at x∗.]

∇f(x∗) = λ∗₁ ∇c₁(x∗) + λ∗₂ ∇c₂(x∗), with λ∗₁ = λ∗₂ = √5 − 1 > 0.
c₁(x∗) = c₂(x∗) = 0: both constraints are active at x∗.
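The multiplier values quoted above can be recovered numerically: with both constraints active, the stationarity condition is a 2×2 linear system for (λ∗₁, λ∗₂). A short sketch using the gradients from this slide:

import numpy as np

sqrt5 = np.sqrt(5)
grad_f  = np.array([-5 + sqrt5, 0.0])
grad_c1 = np.array([1 - sqrt5, 1.0])
grad_c2 = np.array([-1.0, -1.0])

# solve grad_f = lam1 * grad_c1 + lam2 * grad_c2
lam = np.linalg.solve(np.column_stack([grad_c1, grad_c2]), grad_f)
print(lam)  # both approximately sqrt(5) - 1 = 1.236...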
An illustration of the KKT conditions ...

min_{x∈R²} (x₁ − 2)² + (x₂ − 0.5(3 − √5))² subject to
−x₁ − x₂ + 1 ≥ 0, x₂ − x₁² ≥ 0.   (∗)

x := (0, 0)ᵀ is NOT a KKT point of (∗)!

c₁(x) = 0: active at x.
c₂(x) = 1: inactive at x.

=⇒ λ₂ = 0 and ∇f(x) = λ₁ ∇c₁(x), with λ₁ ≥ 0.

[Figure: the gradients ∇f(0), ∇c₁(0) and ∇c₂(0) drawn at the origin.]

⇓

Contradiction with ∇f(x) = (−4, √5 − 3)ᵀ and ∇c₁(x) = (0, 1)ᵀ.
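The contradiction can be restated numerically: no multiple of ∇c₁(x) = (0, 1)ᵀ, nonnegative or otherwise, can reproduce ∇f(x), since the first component of ∇f(x) is −4 ≠ 0. A two-line sketch:

import numpy as np

grad_f0  = np.array([-4.0, np.sqrt(5) - 3])
grad_c10 = np.array([0.0, 1.0])
# best multiplier in the least-squares sense:
lam1 = grad_c10 @ grad_f0 / (grad_c10 @ grad_c10)
print(np.allclose(grad_f0, lam1 * grad_c10))  # False: no KKT multiplier exists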
Optimality conditions for constrained problems ...

In general, we need the constraints/feasible set of (CP) to satisfy a regularity assumption, called a constraint qualification, in order to derive optimality conditions.

Theorem 16 (First order necessary conditions) Under suitable constraint qualifications,
x∗ local minimizer of (CP) =⇒ x∗ KKT point of (CP).

Proof of Theorem 16 (for equality constraints only): Let I = ∅.
Then the KKT conditions become: cE(x∗) = 0 (which is trivial, as x∗ is feasible) and ∇f(x∗) = JE(x∗)ᵀy∗ for some y∗ ∈ Rᵐ, where JE is the Jacobian matrix of the constraints cE.
Consider feasible perturbations/paths x(α) around x∗, where α is a (sufficiently small) scalar, x(α) ∈ C¹(Rⁿ) and
x(0) = x∗, x(α) = x∗ + αs + O(α²), s ≠ 0 and c(x(α)) = 0.(†)
(†) requires constraint qualifications, namely, assuming the existence of s ≠ 0 with the above properties.
Optimality conditions for constrained problems ...

Proof of Theorem 16 (for equality constraints only): (continued)

For any i ∈ E, by Taylor's theorem for cᵢ(x(α)) around x∗,
0 = cᵢ(x(α)) = cᵢ(x∗ + αs + O(α²))
  = cᵢ(x∗) + ∇cᵢ(x∗)ᵀ(x∗ + αs − x∗) + O(α²)
  = α∇cᵢ(x∗)ᵀs + O(α²),
where we used cᵢ(x∗) = 0. Dividing both sides by α, we deduce
0 = ∇cᵢ(x∗)ᵀs + O(α),
for all α sufficiently small. Letting α → 0, we obtain
∇cᵢ(x∗)ᵀs = 0 for all i ∈ E,
and so JE(x∗)s = 0. [In other words, any feasible direction s (which is assumed to exist) satisfies JE(x∗)s = 0.]
Optimality conditions for constrained problems ...

Proof of Theorem 16 (for equality constraints only): (continued)

Now expanding f, we deduce
f(x(α)) = f(x∗) + ∇f(x∗)ᵀ(x∗ + αs − x∗) + O(α²)
        = f(x∗) + α∇f(x∗)ᵀs + O(α²).
Since x∗ is a local minimizer of f, we have f(x(α)) ≥ f(x∗) for all α sufficiently small. Thus α∇f(x∗)ᵀs + O(α²) ≥ 0 for all α sufficiently small. Considering α > 0, we divide by α to obtain ∇f(x∗)ᵀs + O(α) ≥ 0; now letting α → 0, we deduce ∇f(x∗)ᵀs ≥ 0. Similarly, considering α < 0, we obtain ∇f(x∗)ᵀs ≤ 0. Thus
∇f(x∗)ᵀs = 0 for all s such that JE(x∗)s = 0.   (1)
By the rank-nullity theorem, (1) implies that ∇f(x∗) must belong to the range space of JE(x∗)ᵀ (i.e., the span of the columns of JE(x∗)ᵀ), and so ∇f(x∗) = JE(x∗)ᵀy∗ for some y∗. The next slide details this argument.
Optimality conditions for constrained problems ...

Proof of Theorem 16 (for equality constraints only): (continued)

By the rank-nullity theorem, there exist y∗ ∈ Rᵐ and s∗ ∈ Rⁿ such that
∇f(x∗) = JE(x∗)ᵀy∗ + s∗,   (2)
where s∗ belongs to the null space of JE(x∗) (so JE(x∗)s∗ = 0).
Taking the inner product of (2) with s∗, we deduce
(s∗)ᵀ∇f(x∗) = (s∗)ᵀJE(x∗)ᵀy∗ + (s∗)ᵀs∗, or equivalently,
(s∗)ᵀ∇f(x∗) = (y∗)ᵀJE(x∗)s∗ + ‖s∗‖².
From (1) and JE(x∗)s∗ = 0, we deduce (s∗)ᵀ∇f(x∗) = 0. Thus
‖s∗‖² = 0 and so s∗ = 0. Again from (2): ∇f(x∗) = JE(x∗)ᵀy∗. □

For (CP) with equalities only (I = ∅): s is a feasible descent direction at x ∈ Ω if ∇f(x)ᵀs < 0 and JE(x)s = 0.
For general (CP): s is a feasible descent direction at x ∈ Ω if ∇f(x)ᵀs < 0, JE(x)s = 0 and ∇cᵢ(x)ᵀs ≥ 0 for all i ∈ I ∩ A(x).
Constraint qualifications

Proof of Th 16: we used (first-order) Taylor expansions to linearize f and cᵢ along feasible paths/perturbations x(α). This is only correct if the linearized approximation captures the essential geometry of the feasible set. CQs ensure this is the case.

Examples:
(CP) satisfies the Slater Constraint Qualification (SCQ) ⇐⇒ ∃ x s.t. cE(x) = Ax − b = 0 and cI(x) > 0 (i.e., cᵢ(x) > 0, i ∈ I).
(CP) satisfies the Linear Independence Constraint Qualification (LICQ) ⇐⇒ ∇cᵢ(x), i ∈ A(x), are linearly independent (at relevant x).

Both SCQ and LICQ fail for
Ω = {(x₁, x₂) : c₁(x) = 1 − x₁² − (x₂ − 1)² ≥ 0; c₂(x) = −x₂ ≥ 0}.
Here Ω = {(0, 0)ᵀ}, and at x = (0, 0)ᵀ we have TΩ(x) = {(0, 0)} and F(x) = {(s₁, 0) : s₁ ∈ R}; thus TΩ(x) ≠ F(x). (TΩ and F are defined on the next slide; a numerical check of the LICQ failure follows below.)
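A quick check that LICQ indeed fails here: both constraints are active at the only feasible point x = (0, 0)ᵀ, and their gradients there, ∇c₁(x) = (0, 2)ᵀ and ∇c₂(x) = (0, −1)ᵀ, are parallel. A sketch in Python:

import numpy as np

x = np.array([0.0, 0.0])
grad_c1 = np.array([-2*x[0], -2*(x[1] - 1)])  # gradient of 1 - x1^2 - (x2-1)^2
grad_c2 = np.array([0.0, -1.0])               # gradient of -x2
J = np.vstack([grad_c1, grad_c2])
print(np.linalg.matrix_rank(J))  # 1, not 2: the active gradients are dependent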
Constraint qualifications...

Tangent cone to Ω at x: [See Chapter 12, Nocedal & Wright]
TΩ(x) = {s : s is a limiting direction of a feasible sequence}   [‘geometry’ of Ω]
s = lim_{k→∞} (zᵏ − x)/tᵏ, where zᵏ ∈ Ω, tᵏ > 0, tᵏ → 0 and zᵏ → x as k → ∞.

Set of linearized feasible directions:   [‘algebra’ of Ω]
F(x) = {s : sᵀ∇cᵢ(x) = 0, i ∈ E; sᵀ∇cᵢ(x) ≥ 0, i ∈ I ∩ A(x)}

Want TΩ(x) = F(x) ←− [ensured if a CQ holds]

Example: min_{(x₁,x₂)} x₁ + x₂ s.t. x₁² + x₂² − 2 = 0. (A worked check of this example is sketched below.)
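For the example just posed, stationarity gives (1, 1)ᵀ = y(2x₁, 2x₂)ᵀ, so x₁ = x₂, and feasibility then leaves x = ±(1, 1)ᵀ; the minimizer is x∗ = (−1, −1)ᵀ with y∗ = −1/2. A numerical confirmation (a sketch; the starting point is an arbitrary choice):

import numpy as np
from scipy.optimize import minimize

res = minimize(lambda x: x[0] + x[1], x0=[-1.0, 0.0], method='SLSQP',
               constraints=[{'type': 'eq', 'fun': lambda x: x[0]**2 + x[1]**2 - 2}])
print(res.x)  # approximately (-1, -1)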
Optimality conditions for constrained problems ...

If the constraints of (CP) are linear in the variables, no constraint qualification is required.

Theorem 17 (First order necessary conditions for linearly constrained problems) Let (cE, cI)(x) := Ax − b in (CP). Then
x∗ local minimizer of (CP) =⇒ x∗ KKT point of (CP).

Let A = (AE, AI) and b = (bE, bI) correspond to the equality and inequality constraints, respectively.
KKT conditions for linearly-constrained (CP): x∗ is a KKT point ⇔ there exists (y∗, λ∗) such that
∇f(x∗) = AEᵀy∗ + AIᵀλ∗,
AEx∗ − bE = 0, AIx∗ − bI ≥ 0,
λ∗ ≥ 0, (λ∗)ᵀ(AIx∗ − bI) = 0.
Optimality conditions for convex problems

(CP) is a convex programming problem if and only if
f(x) is a convex function, cᵢ(x) is a concave function for all i ∈ I, and cE(x) = Ax − b.

• cᵢ is a concave function ⇔ (−cᵢ) is a convex function.
• (CP) convex problem ⇒ Ω is a convex set.
• (CP) convex problem ⇒ any local minimizer of (CP) is global.

First order necessary conditions are also sufficient for optimality when (CP) is convex.
Theorem 18 (Sufficient optimality conditions for convex problems) Let (CP) be a convex programming problem. Then
x̂ KKT point of (CP) =⇒ x̂ is a (global) minimizer of (CP).
Optimality conditions for convex problems

Proof of Theorem 18.

f convex =⇒ f(x) ≥ f(x̂) + ∇f(x̂)ᵀ(x − x̂), for all x ∈ Rⁿ.   (3)

(3) + [∇f(x̂) = Aᵀŷ + Σ_{i∈I} λ̂ᵢ ∇cᵢ(x̂)] =⇒
f(x) ≥ f(x̂) + (Aᵀŷ)ᵀ(x − x̂) + Σ_{i∈I} λ̂ᵢ (∇cᵢ(x̂)ᵀ(x − x̂)),
i.e.,
f(x) ≥ f(x̂) + ŷᵀA(x − x̂) + Σ_{i∈I} λ̂ᵢ (∇cᵢ(x̂)ᵀ(x − x̂)).   (4)

Let x ∈ Ω be arbitrary =⇒ Ax = b and cI(x) ≥ 0.
Ax = b and Ax̂ = b =⇒ A(x − x̂) = 0.   (5)

cᵢ concave =⇒ cᵢ(x) ≤ cᵢ(x̂) + ∇cᵢ(x̂)ᵀ(x − x̂)
=⇒ ∇cᵢ(x̂)ᵀ(x − x̂) ≥ cᵢ(x) − cᵢ(x̂)
=⇒ λ̂ᵢ (∇cᵢ(x̂)ᵀ(x − x̂)) ≥ λ̂ᵢ (cᵢ(x) − cᵢ(x̂)) = λ̂ᵢ cᵢ(x) ≥ 0,
since λ̂ ≥ 0, λ̂ᵢ cᵢ(x̂) = 0 and cI(x) ≥ 0.

Thus, from (4) and (5), f(x) ≥ f(x̂). □
Example: Optimality conditions for QP problems

A Quadratic Programming (QP) problem has the form
minimize_{x∈Rⁿ} cᵀx + ½xᵀHx s.t. Ax = b, Ãx ≥ b̃.   (QP)

H symmetric positive semidefinite =⇒ (QP) is a convex problem.

The KKT conditions for (QP):
x̂ KKT point of (QP) ⇐⇒ ∃ (ŷ, λ̂) ∈ Rᵐ × Rᵖ such that
Hx̂ + c = Aᵀŷ + Ãᵀλ̂,
Ax̂ = b, Ãx̂ ≥ b̃,
λ̂ ≥ 0, λ̂ᵀ(Ãx̂ − b̃) = 0.

“An example of a nonlinear constrained problem” is a convex problem; removing the constraint x₂ − x₁² ≥ 0 makes it a convex (QP).
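When (QP) has equality constraints only, the KKT conditions reduce to a single linear system in (x̂, ŷ) that can be solved directly. A minimal sketch (the data H, A, b, c below are hypothetical, chosen only for illustration):

import numpy as np

# min c^T x + 0.5 x^T H x  s.t.  Ax = b; KKT system:
#   [H  -A^T] [x]   [-c]
#   [A    0 ] [y] = [ b]
H = np.array([[2.0, 0.0], [0.0, 2.0]])  # H positive definite here
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([-1.0, -1.0])

n, m = H.shape[0], A.shape[0]
K = np.block([[H, -A.T], [A, np.zeros((m, m))]])
sol = np.linalg.solve(K, np.concatenate([-c, b]))
x_hat, y_hat = sol[:n], sol[n:]
print(x_hat, y_hat)  # x_hat = (0.5, 0.5), y_hat = 0: the KKT point of this convex QP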
Example: Duality theory for QP problems

For simplicity, let A := 0 and H ≻ 0 in (QP). Primal problem:
minimize_{x∈Rⁿ} cᵀx + ½xᵀHx s.t. Ãx ≥ b̃.   (QP)

The KKT conditions for (QP):
Hx̂ + c = Ãᵀλ̂,
Ãx̂ ≥ b̃,
λ̂ ≥ 0, λ̂ᵀ(Ãx̂ − b̃) = 0.

Dual problem:
maximize_{(x,λ)} −½xᵀHx + b̃ᵀλ s.t. −Hx + Ãᵀλ = c and λ ≥ 0.

Optimal value of the primal problem = optimal value of the dual problem (provided they exist).
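A numerical spot-check of the zero-duality-gap claim on a tiny hypothetical instance (data chosen only for illustration), evaluating the primal and dual objectives at a primal-dual pair assembled from the KKT conditions:

import numpy as np

# primal: min 0.5 ||x||^2  s.t.  x1 + x2 >= 2  ->  x* = (1, 1), value 1
H = np.eye(2); A = np.array([[1.0, 1.0]]); b = np.array([2.0]); c = np.zeros(2)
x = np.array([1.0, 1.0])                               # primal solution (constraint active)
lam = np.linalg.lstsq(A.T, H @ x + c, rcond=None)[0]   # multiplier from H x + c = A^T lam
primal = c @ x + 0.5 * x @ H @ x
dual   = -0.5 * x @ H @ x + b @ lam
print(primal, dual, lam)  # 1.0, 1.0, lam = 1 >= 0: no duality gap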
Optimality conditions for nonconvex problems

• When (CP) is not convex, the KKT conditions are not in general sufficient for optimality
−→ we also need a positive definite Hessian of the Lagrangian function along “feasible” directions.

• More on second-order optimality conditions later on.
Second-order optimality conditions

• When (CP) is not convex, the KKT conditions are not in general sufficient for optimality.

Assume some CQ holds. Then, at a given point x∗, the set of feasible directions for (CP) at x∗ is
F(x∗) = {s : JE(x∗)s = 0, sᵀ∇cᵢ(x∗) ≥ 0, i ∈ A(x∗) ∩ I}.

If x∗ is a KKT point, then for any s ∈ F(x∗),
sᵀ∇f(x∗) = sᵀJE(x∗)ᵀy∗ + Σ_{i∈A(x∗)∩I} λᵢ sᵀ∇cᵢ(x∗)
         = (JE(x∗)s)ᵀy∗ + Σ_{i∈A(x∗)∩I} λᵢ sᵀ∇cᵢ(x∗)
         = Σ_{i∈A(x∗)∩I} λᵢ sᵀ∇cᵢ(x∗) ≥ 0.   (6)
Second-order optimality conditions...

If x∗ is a KKT point, then for any s ∈ F(x∗), either
sᵀ∇f(x∗) > 0
−→ so f can only increase and stay feasible along s,
or sᵀ∇f(x∗) = 0
−→ we cannot decide from first-order information whether f increases or not along such s.

From (6), we see that the directions of interest are those with
JE(x∗)s = 0 and sᵀ∇cᵢ(x∗) = 0 for all i ∈ A(x∗) ∩ I with λ∗ᵢ > 0:
F(λ∗) = {s ∈ F(x∗) : sᵀ∇cᵢ(x∗) = 0, ∀i ∈ A(x∗) ∩ I with λ∗ᵢ > 0},
where λ∗ is a Lagrange multiplier of the inequality constraints.
Then note that sᵀ∇f(x∗) = 0 for all s ∈ F(λ∗).
Second-order optimality conditions ...

Theorem 19 (Second-order necessary conditions)
Let some CQ hold for (CP). Let x∗ be a local minimizer of (CP), and (y∗, λ∗) Lagrange multipliers of the KKT conditions at x∗. Then
sᵀ∇²ₓₓL(x∗, y∗, λ∗)s ≥ 0 for all s ∈ F(λ∗),
where L(x, y, λ) = f(x) − yᵀcE(x) − λᵀcI(x) is the Lagrangian function, and so
∇²ₓₓL(x, y, λ) = ∇²f(x) − Σ_{j=1}^{m} yⱼ ∇²cⱼ(x) − Σ_{i=1}^{p} λᵢ ∇²cᵢ(x).

Theorem 20 (Second-order sufficient conditions)
Assume that x∗ is a feasible point of (CP) and (y∗, λ∗) are such that the KKT conditions are satisfied by (x∗, y∗, λ∗). If
sᵀ∇²ₓₓL(x∗, y∗, λ∗)s > 0 for all s ∈ F(λ∗), s ≠ 0,
then x∗ is a local minimizer of (CP). [See proofs in Nocedal & Wright]
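Revisiting the circle example min x₁ + x₂ s.t. x₁² + x₂² = 2 from the constraint-qualification slide: at x∗ = (−1, −1)ᵀ with y∗ = −1/2 we get ∇²ₓₓL(x∗, y∗) = I, which is positive definite on the null space of the constraint Jacobian, so Theorem 20 certifies a local minimizer. A numerical version of that check (a sketch):

import numpy as np

x = np.array([-1.0, -1.0]); y = -0.5
H_L = -y * 2 * np.eye(2)          # Hess_xx L = -y * Hess c, since f is linear
J = np.array([[2*x[0], 2*x[1]]])  # Jacobian of c(x) = x1^2 + x2^2 - 2

# Null-space basis of J via SVD; with equalities only, F(lambda*) = null(J).
_, _, Vt = np.linalg.svd(J)
Z = Vt[1:].T                      # columns span null(J)
print(np.all(np.linalg.eigvalsh(Z.T @ H_L @ Z) > 0))  # True: local minimizer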
Some simple approaches for solving (CP)

Equality-constrained problems: direct elimination (a simple approach that may help/work sometimes; it cannot be automated in general); a toy illustration is sketched below.
Method of Lagrange multipliers: using the KKT and second-order conditions to find minimizers (again, this cannot be automated in general).
[See Problem Sheet 4]
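To illustrate direct elimination on a toy problem (hypothetical, not from the slides): for min x₁² + x₂² s.t. x₁ + x₂ = 1, substitute x₂ = 1 − x₁ and minimize the resulting one-variable function φ(x₁) = x₁² + (1 − x₁)², whose stationarity condition 4x₁ − 2 = 0 gives x₁ = 1/2. A sketch:

import numpy as np
from scipy.optimize import minimize_scalar

phi = lambda x1: x1**2 + (1 - x1)**2  # objective after eliminating x2 = 1 - x1
x1 = minimize_scalar(phi).x
print(np.array([x1, 1 - x1]))         # (0.5, 0.5); Lagrange multipliers give the same point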
