Chapter 5.3
and let J be the set of indices j for which g_j(x*) = 0. Then x* is said to be a
regular point of the constraints (33) (E1) if the gradient vectors ∇h_i(x*), ∇g_j(x*),
1 ≤ i ≤ m, j ∈ J, are linearly independent.
We note that, following the definition of active constraints given in
Section 11.1, a point x* is a regular point if the gradients of the active constraints
are linearly independent.
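The regularity test above can be sketched numerically. The following is a minimal illustration, not part of the original notes: the helper names `rank` and `is_regular`, the tolerance values, and the two sample constraint sets are all assumptions chosen for the example. It stacks the gradients of the equality constraints and of the active inequality constraints and checks that the stacked matrix has full row rank.

```python
def rank(rows, tol=1e-10):
    """Rank of a small matrix (list of row vectors) by Gaussian elimination."""
    m = [list(map(float, r)) for r in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        # Partial pivoting: choose the remaining row with the largest entry in column c.
        pivot = max(range(r, len(m)), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) < tol:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def is_regular(grads_h, grads_g, g_values, tol=1e-8):
    """True if the gradients of h and of the active g_j are linearly independent."""
    active = [gr for gr, gv in zip(grads_g, g_values) if abs(gv) <= tol]
    rows = list(grads_h) + active
    return rank(rows) == len(rows)


# Hypothetical case 1: g1 = x1^2 + x2^2 - 1 <= 0, g2 = -x2 <= 0 at x* = (1, 0).
regular_case = is_regular([], [[2.0, 0.0], [0.0, -1.0]], [0.0, 0.0])

# Hypothetical case 2: g1 = x2 - (1 - x1)^3 <= 0, g2 = -x2 <= 0 at x* = (1, 0).
# The active gradients (0, 1) and (0, -1) are parallel, so x* is not regular.
irregular_case = is_regular([], [[0.0, 1.0], [0.0, -1.0]], [0.0, 0.0])
```

In the second case the two active gradients point in opposite directions, which is the typical way regularity fails at a cusp of the feasible set.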
Some notation:
Ω: set of points satisfying h(x) = 0 and g(x) ≤ 0.
Ω_E: set of points satisfying h(x) = 0, g(x) ≤ 0 and
g_i(x) = 0 for all i ∈ J.
Ω_N: set of points satisfying h(x) = 0, g(x) ≤ 0 and
g_i(x) < 0 for at least one i ∈ J.
Ω = Ω_E + Ω_N.
(1) Consider the curves lying entirely in Ω_E that pass through
the point x*. Each such curve can be split into two curves
that end at x* from opposing directions. S_E is the set of these
curves.
(2) S_N is the set of curves in Ω, but not in Ω_E (except at x*), that end at the
point x*.
(3) S is the set of all curves in Ω that end at x*.
S = S_E + S_N.
The set of tangents of the curves in S is the tangent cone T_Ω(x*).
Let x* be a relative minimum point for the problem
minimize f(x)
subject to h(x) = 0, g(x) ≤ 0, (34) (E2)
and suppose x* is a regular point for the constraints. Then there is a vector
λ ∈ E^m and a vector µ ∈ E^p with µ ≥ 0 such that
∇f(x*) + λᵀ∇h(x*) + µᵀ∇g(x*) = 0 (35)
or, in the Jacobian notation of these notes,
∇f(x*) + J_h(x*)λ + J_g(x*)µ = 0 (E3)
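As a concrete check of the stationarity equation, the sketch below evaluates the conditions at a candidate point. It is only an illustration under stated assumptions: the function name `satisfies_kkt`, the tolerance, and the test problem (minimize x1² + x2² subject to g(x) = 1 − x1 − x2 ≤ 0) are not from the source, and the complementarity and feasibility checks use the conditions µᵀg(x*) = 0, h(x*) = 0, g(x*) ≤ 0 that these notes list with the KKT conditions further on.

```python
def satisfies_kkt(grad_f, grads_h, grads_g, lam, mu, h_vals, g_vals, tol=1e-8):
    """Check the KKT conditions for: minimize f subject to h = 0, g <= 0."""
    n = len(grad_f)
    # Stationarity: grad f + sum_i lam_i grad h_i + sum_j mu_j grad g_j = 0.
    residual = [
        grad_f[k]
        + sum(l * gh[k] for l, gh in zip(lam, grads_h))
        + sum(m * gg[k] for m, gg in zip(mu, grads_g))
        for k in range(n)
    ]
    stationary = all(abs(r) <= tol for r in residual)
    dual_feasible = all(m >= -tol for m in mu)
    complementary = abs(sum(m * g for m, g in zip(mu, g_vals))) <= tol
    primal_feasible = (all(abs(h) <= tol for h in h_vals)
                       and all(g <= tol for g in g_vals))
    return stationary and dual_feasible and complementary and primal_feasible


# Assumed example: minimize x1^2 + x2^2 subject to g(x) = 1 - x1 - x2 <= 0.
# Candidate x* = (0.5, 0.5) with mu = 1: grad f = (1, 1), grad g = (-1, -1).
ok = satisfies_kkt([1.0, 1.0], [], [[-1.0, -1.0]], [], [1.0], [], [0.0])

# The feasible point (1, 0) with mu = 2 does not satisfy stationarity.
not_ok = satisfies_kkt([2.0, 0.0], [], [[-1.0, -1.0]], [], [2.0], [], [0.0])
```

The gradients are passed in as plain lists, one per constraint, so that J_h(x*)λ is formed as the sum of λ_i ∇h_i(x*).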
• Consider a curve x(t) ∈ S_N such that x(0) = x*. Let x(t) be such
that b_i(t) = g_i(x(t)) satisfies b_i(t) < 0 when t > 0 for exactly one
i ∈ J and, for the remaining j ∈ J, b_j(t) = g_j(x(t)) satisfies b_j(t) = 0
for all t.
• Since b_i(0) = 0, the above statement implies that b_i′(0) ≤ 0 for one i ∈ J and b_j′(0) = 0
for the remaining j ∈ J. Hence (x′(0))ᵀ∇g_i(x*) ≤ 0 for one i ∈ J and
(x′(0))ᵀ∇g_j(x*) = 0 for the remaining j ∈ J.
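The chain-rule step b′(0) = (x′(0))ᵀ∇g(x*) used in the bullets above can be checked numerically. The sketch below is an assumed example, not taken from the notes: it uses the single constraint g(x) = x1² + x2² − 1 ≤ 0, the point x* = (1, 0), and the straight-line curve x(t) = (1 − t, 0), which enters the interior of the feasible set for t > 0.

```python
def g(x):
    """Assumed constraint: the unit disc, g(x) <= 0."""
    return x[0] ** 2 + x[1] ** 2 - 1.0


def grad_g(x):
    return [2.0 * x[0], 2.0 * x[1]]


def curve(t):
    """Assumed curve ending at x* = (1, 0), moving into the disc for t > 0."""
    return [1.0 - t, 0.0]


def b(t):
    return g(curve(t))


x_star = curve(0.0)
velocity = [-1.0, 0.0]          # x'(0) for the curve above

step = 1e-6
# One-sided difference, since the curve is only considered for t >= 0.
b_prime_fd = (b(step) - b(0.0)) / step
# Chain rule: b'(0) = x'(0)^T grad g(x*).
b_prime_chain = sum(v * gi for v, gi in zip(velocity, grad_g(x_star)))
```

Both values are negative and agree to within the finite-difference error, consistent with b(t) < 0 for t > 0 forcing b′(0) ≤ 0.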
Maximize f(x)
subject to h(x) = 0 and g(x) ≤ 0
KKT conditions:
−∇f(x*) + J_h(x*)λ* + J_g(x*)µ* = 0
µ* ≥ 0
µ*ᵀg(x*) = 0
h(x*) = 0
g(x*) ≤ 0
Equivalent KKT conditions (with λ* and µ* replaced by their negatives):
∇f(x*) + J_h(x*)λ* + J_g(x*)µ* = 0
µ* ≤ 0
µ*ᵀg(x*) = 0
h(x*) = 0
g(x*) ≤ 0
Minimize f(x)
subject to h(x) = 0 and g(x) ≥ 0
KKT conditions:
∇f(x*) + J_h(x*)λ* − J_g(x*)µ* = 0
µ* ≥ 0
µ*ᵀg(x*) = 0
h(x*) = 0
g(x*) ≥ 0
Equivalent KKT conditions (with µ* replaced by its negative):
∇f(x*) + J_h(x*)λ* + J_g(x*)µ* = 0
µ* ≤ 0
µ*ᵀg(x*) = 0
h(x*) = 0
g(x*) ≥ 0
Maximize f(x)
subject to h(x) = 0 and g(x) ≥ 0
KKT conditions:
−∇f(x*) + J_h(x*)λ* − J_g(x*)µ* = 0
µ* ≥ 0
µ*ᵀg(x*) = 0
h(x*) = 0
g(x*) ≥ 0
Equivalent KKT conditions (equation multiplied by −1, with λ* replaced by its negative):
∇f(x*) + J_h(x*)λ* + J_g(x*)µ* = 0
µ* ≥ 0
µ*ᵀg(x*) = 0
h(x*) = 0
g(x*) ≥ 0
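The sign conventions above can be checked on a one-dimensional problem. The sketch below is an illustration under assumptions: the helper `solve_mu` and the test problem (maximize f(x) = −(x − 2)² subject to x ≤ 1, with maximizer x* = 1) are not from the notes. The same constraint is written once as g(x) = x − 1 ≤ 0 and once as g(x) = 1 − x ≥ 0, and the multiplier is solved from each form of the stationarity equation.

```python
def solve_mu(df, dg, f_sign, g_sign):
    """Solve f_sign*df + g_sign*mu*dg = 0 for a scalar multiplier mu."""
    return -f_sign * df / (g_sign * dg)


# Assumed problem: maximize f(x) = -(x - 2)**2 subject to x <= 1, so x* = 1.
x_star = 1.0
df = -2.0 * (x_star - 2.0)      # f'(x*) = 2

# Constraint written as g(x) = x - 1 <= 0, so g'(x*) = 1.
mu_a = solve_mu(df, 1.0, f_sign=-1, g_sign=+1)    # -grad f + J_g mu = 0
mu_b = solve_mu(df, 1.0, f_sign=+1, g_sign=+1)    #  grad f + J_g mu = 0

# Same constraint written as g(x) = 1 - x >= 0, so g'(x*) = -1.
mu_c = solve_mu(df, -1.0, f_sign=-1, g_sign=-1)   # -grad f - J_g mu = 0
mu_d = solve_mu(df, -1.0, f_sign=+1, g_sign=+1)   #  grad f + J_g mu = 0

print(mu_a, mu_b, mu_c, mu_d)
```

For the g ≤ 0 form the two equations give multipliers of opposite sign (µ ≥ 0 and µ ≤ 0), while for the g ≥ 0 form both give µ ≥ 0, matching the maximization cases listed above.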
Example:
Maximize/minimize f(x)
subject to g(x) ≥ 0
KKT conditions for x1 (maximum):
∇f(x1) + ∇g(x1)µ1 = 0
µ1 ≥ 0
µ1ᵀg(x1) = 0
g(x1) ≥ 0
KKT conditions for x2 (minimum):
∇f(x2) + ∇g(x2)µ2 = 0
µ2 ≤ 0
µ2ᵀg(x2) = 0
g(x2) ≥ 0
5.3.3 Minimization with positivity constraints
The above example is a special case of a more general problem.
Definition M_Ω(x*):
For a feasible point x*, the set is defined as
$$
M_\Omega(x^*) = \left\{ d :
\begin{array}{l}
d^T \nabla h_i(x^*) = 0, \quad i = 1, \dots, m \\
d^T \nabla g_j(x^*) \le 0, \quad \text{for all } j \text{ s.t. } g_j(x^*) = 0
\end{array}
\right\}
$$
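Membership of a direction d in M_Ω(x*) is a finite set of linear tests, which the following sketch makes explicit. The function name `in_M_omega`, the tolerance, and the example constraints (the positivity constraints g1 = −x1 ≤ 0, g2 = −x2 ≤ 0) are assumptions made for illustration only.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def in_M_omega(d, grads_h, grads_g, g_vals, tol=1e-8):
    """True if d^T grad h_i = 0 for all i and d^T grad g_j <= 0 for all active j."""
    if any(abs(dot(d, gh)) > tol for gh in grads_h):
        return False
    # Only constraints with g_j(x*) = 0 restrict the direction.
    return all(dot(d, gg) <= tol
               for gg, gv in zip(grads_g, g_vals) if abs(gv) <= tol)


# Assumed constraints: g1 = -x1 <= 0 and g2 = -x2 <= 0.
grads_g = [[-1.0, 0.0], [0.0, -1.0]]

# At x* = (0, 0) both constraints are active: M_Omega is the nonnegative quadrant.
inside = in_M_omega([1.0, 0.0], [], grads_g, [0.0, 0.0])
outside = in_M_omega([-1.0, 1.0], [], grads_g, [0.0, 0.0])

# At x* = (0, 1) only g1 is active, so a direction with d2 < 0 is allowed.
inactive_ignored = in_M_omega([1.0, -1.0], [], grads_g, [0.0, -1.0])
```

Inactive constraints are skipped because a small step in any direction keeps them strictly satisfied.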
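The subset M1(x*):
$$
M_1(x^*) = \left\{ y \in \mathbb{R}^n :
\begin{array}{l}
y^T \nabla h_i(x^*) = 0, \ \forall i = 1, \dots, m; \\
y^T \nabla g_i(x^*) \le 0, \ \forall i \in J(x^*, \mu^*); \\
y^T \nabla g_i(x^*) < 0, \ \text{for at least one } i \in J_p(x^*, \mu^*); \\
y^T \nabla g_i(x^*) = 0, \ \forall i \in J_z(x^*, \mu^*)
\end{array}
\right\}
$$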
For a regular point x*, M1(x*) is the set of tangents of the curves x(t)
satisfying x(0) = x* and
h_i(x(t)) = 0, ∀i = 1,..., m,
g_i(x(t)) ≤ 0, ∀i ∈ J(x*, µ*),
g_i(x(t)) < 0, for at least one i ∈ J_p(x*, µ*) for t > 0,
g_i(x(t)) = 0, ∀i ∈ J_z(x*, µ*).
The subset M2(x*):
$$
M_2(x^*) = \left\{ y \in \mathbb{R}^n :
\begin{array}{l}
y^T \nabla h_i(x^*) = 0, \ \forall i = 1, \dots, m; \\
y^T \nabla g_i(x^*) \le 0, \ \forall i \in J(x^*, \mu^*); \\
y^T \nabla g_i(x^*) < 0, \ \text{for at least one } i \in J_z(x^*, \mu^*); \\
y^T \nabla g_i(x^*) = 0, \ \forall i \in J_p(x^*, \mu^*)
\end{array}
\right\}
$$
For a regular point x*, M2(x*) is the set of tangents of the curves x(t)
satisfying x(0) = x* and
h_i(x(t)) = 0, ∀i = 1,..., m,
g_i(x(t)) ≤ 0, ∀i ∈ J(x*, µ*),
g_i(x(t)) < 0, for at least one i ∈ J_z(x*, µ*) for t > 0,
g_i(x(t)) = 0, ∀i ∈ J_p(x*, µ*).
Second order necessary condition (Theorem 5.3B):
Let x* be a minimum of the function f(x) in ℝ^n subject to the
constraints h(x) = 0 and g(x) ≤ 0, where h(x) and g(x) are m × 1
and p × 1 vector functions on ℝ^n. Also, let x* be a regular
point of the constraints. Then there exist vectors λ* ∈ ℝ^m and
µ* ∈ ℝ^p such that x*, λ*, and µ* satisfy the KKT conditions. Also,
for every vector y in the subset M(x*) + M2(x*), we have
yᵀ L(x*, λ*, µ*) y ≥ 0,
where L(x*, λ*, µ*) denotes the Hessian of the Lagrangian with respect to x
(written L_xx in the proof of Theorem 5.3D).
Second order sufficient condition (Theorem 5.3C):
For a function f(x) and constraints h(x) = 0 and g(x) ≤ 0,
let x* ∈ ℝ^n, λ* ∈ ℝ^m and µ* ∈ ℝ^p be vectors satisfying the
KKT conditions. If yᵀ L(x*, λ*, µ*) y > 0 for every nonzero vector y in the subset
M(x*) + M2(x*), where L(x*, λ*, µ*) denotes the Hessian of the Lagrangian
with respect to x, then x* is a
strict local minimum of the function f(x) subject to the constraints h(x) = 0
and g(x) ≤ 0.
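A small numerical illustration of the sufficient condition follows. It is a sketch under assumptions, not the notes' own example: the problem (minimize f(x) = −x1·x2 subject to h(x) = x1 + x2 − 2 = 0), the point x* = (1, 1) with λ* = 1, and the helper `quad_form` are chosen here for illustration. With only an equality constraint, the directions to test are taken to be those with yᵀ∇h(x*) = 0, which is an assumption about how M(x*) is defined in the notes.

```python
def quad_form(y, A):
    """Return y^T A y for a square matrix A given as a list of rows."""
    Ay = [sum(a * yj for a, yj in zip(row, y)) for row in A]
    return sum(yi * v for yi, v in zip(y, Ay))


# Assumed problem: minimize f = -x1*x2 subject to h = x1 + x2 - 2 = 0.
# At x* = (1, 1): grad f = (-1, -1), grad h = (1, 1), so lambda* = 1.
grad_f = [-1.0, -1.0]
grad_h = [1.0, 1.0]
lam = 1.0
stationarity = [gf + lam * gh for gf, gh in zip(grad_f, grad_h)]

# Hessian of the Lagrangian: F + lambda*H, where h is linear so H = 0.
L_xx = [[0.0, -1.0],
        [-1.0, 0.0]]

tangent = [1.0, -1.0]     # spans {y : y^T grad h = 0}
normal = [1.0, 1.0]       # not a feasible direction

curv_tangent = quad_form(tangent, L_xx)   # positive on the tangent space
curv_normal = quad_form(normal, L_xx)     # negative off the tangent space
```

The Hessian is indefinite on the whole space, yet positive on the tangent directions, which is all the sufficient condition requires. Because the tangent space here is one-dimensional, testing a single spanning vector is enough; in higher dimensions one would test definiteness of the Hessian restricted to a basis of the tangent space.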
Second order conditions are proved using the following theorem
Theorem 5.3D:
Consider the function f(x) in ℝ^n and constraints h(x) = 0 and
g(x) ≤ 0, where h(x) and g(x) are m × 1 and p × 1 vector functions
on ℝ^n. Let there exist vectors λ* ∈ ℝ^m, µ* ∈ ℝ^p, and
x* ∈ ℝ^n such that ∇f(x*) + J_h(x*)λ* + J_g(x*)µ* = 0, µ* ≥ 0,
and µ*ᵀg(x*) = 0. Then for every feasible curve x(t) with x(0) = x* and
x′(0) = d ∈ M(x*) + M2(x*), the function s(t) = f(x(t)) satisfies
s(ε) = s(0) + 0.5 ε² dᵀ L(x*, λ*, µ*) d + o(ε²),
where L(x*, λ*, µ*) denotes the Hessian of the Lagrangian with respect to x.
Proof of Theorem 5.3D:
For the feasible curve, define
$$
c(t) = L(x(t), \lambda^*, \mu^*) = f(x(t)) + \sum_{i=1}^{m} \lambda_i^* h_i(x(t)) + \sum_{i=1}^{p} \mu_i^* g_i(x(t))
$$
Differentiating gives
$$
c'(t) = x'(t)^T \underbrace{\left( \nabla f(x(t)) + \sum_{i=1}^{m} \lambda_i^* \nabla h_i(x(t)) + \sum_{i=1}^{p} \mu_i^* \nabla g_i(x(t)) \right)}_{L_x(x(t), \lambda^*, \mu^*)}
$$
$$
c''(t) = x''(t)^T \underbrace{\left( \nabla f(x(t)) + \sum_{i=1}^{m} \lambda_i^* \nabla h_i(x(t)) + \sum_{i=1}^{p} \mu_i^* \nabla g_i(x(t)) \right)}_{L_x(x(t), \lambda^*, \mu^*)}
+ x'(t)^T \underbrace{\left( F(x(t)) + \sum_{i=1}^{m} \lambda_i^* H_i(x(t)) + \sum_{i=1}^{p} \mu_i^* G_i(x(t)) \right)}_{L_{xx}(x(t), \lambda^*, \mu^*)} x'(t)
$$
Here F, H_i, and G_i denote the Hessians of f, h_i, and g_i, respectively.
Now by definition, since h(x*) = 0 and µ*ᵀg(x*) = 0,
c(0) = L(x*, λ*, µ*) = f(x(0)) = f(x*)
Hence, because L_x(x*, λ*, µ*) = 0 by the stationarity condition,
c′(0) = 0, c′′(0) = x′(0)ᵀ L_xx(x*, λ*, µ*) x′(0)
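Recap:
$$
c(\varepsilon) = L(x(\varepsilon), \lambda^*, \mu^*) = f(x(\varepsilon)) + \sum_{i=1}^{m} \lambda_i^* h_i(x(\varepsilon)) + \sum_{i=1}^{p} \mu_i^* g_i(x(\varepsilon))
$$
s(ε) = f(x(ε)).
For the curves considered, the constraint terms λ*_i h_i(x(ε)) and µ*_i g_i(x(ε)) vanish, so s(ε) = c(ε).
Hence, expanding c to second order about ε = 0 with c′(0) = 0,
s(ε) = f(x*) + 0.5 ε² x′(0)ᵀ L_xx(x*, λ*, µ*) x′(0) + o(ε²)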
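The second-order expansion can be checked numerically on a problem where f itself has zero Hessian, so that all curvature comes from the constraint term of the Lagrangian. The example below is an assumption for illustration, not from the notes: minimize f(x) = x1 + x2 subject to h(x) = x1² + x2² − 2 = 0, with x* = (−1, −1), λ* = 0.5, and the feasible curve obtained by moving along the circle.

```python
import math

R = math.sqrt(2.0)


def x_curve(t):
    """Assumed feasible curve on the circle x1^2 + x2^2 = 2 with x(0) = (-1, -1)."""
    theta = 5.0 * math.pi / 4.0 + t / R
    return (R * math.cos(theta), R * math.sin(theta))


def s(t):
    """Objective f(x) = x1 + x2 evaluated along the curve."""
    x1, x2 = x_curve(t)
    return x1 + x2


# x'(0) for the curve above; it is orthogonal to grad h(x*) = (-2, -2).
d = (math.sqrt(2.0) / 2.0, -math.sqrt(2.0) / 2.0)

# Hessian of the Lagrangian: F + lambda*H = 0 + 0.5 * (2 I) = I.
L_xx = [[1.0, 0.0],
        [0.0, 1.0]]
dLd = sum(d[i] * L_xx[i][j] * d[j] for i in range(2) for j in range(2))


def model(eps):
    """Second-order model s(0) + 0.5 * eps^2 * d^T L_xx d."""
    return s(0.0) + 0.5 * eps ** 2 * dLd


for eps in (0.1, 0.01):
    error = abs(s(eps) - model(eps))
    # The remainder should shrink faster than eps^2.
    print(eps, error, error / eps ** 2)
```

The printed ratio error/ε² decreases as ε shrinks, which is what the o(ε²) remainder means. Using the Hessian of f alone (zero here) would predict no second-order change, which is why the Lagrangian Hessian appears in the expansion.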