Numerical Optimization - Solutions Manual
Numerical Optimization - Solutions Manual
NUMERICAL OPTIMIZATION
by J. Nocedal and S.J. Wright
Second Edition
Solution Manual Prepared by:
Frank Curtis
Long Hei
Gabriel Lopez-Calva
Jorge Nocedal
Stephen J. Wright
1
Contents
1 Introduction 6
2 Fundamentals of Unconstrained Optimization 6
Problem 2.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Problem 2.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Problem 2.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Problem 2.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Problem 2.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Problem 2.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Problem 2.8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Problem 2.9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Problem 2.10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Problem 2.13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Problem 2.14 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Problem 2.15 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Problem 2.16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3 Line Search Methods 14
Problem 3.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Problem 3.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Problem 3.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Problem 3.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Problem 3.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Problem 3.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Problem 3.8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Problem 3.13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4 Trust-Region Methods 20
Problem 4.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Problem 4.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Problem 4.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Problem 4.8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Problem 4.10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
5 Conjugate Gradient Methods 23
Problem 5.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Problem 5.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Problem 5.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2
Problem 5.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Problem 5.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Problem 5.9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Problem 5.10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
6 Quasi-Newton Methods 28
Problem 6.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Problem 6.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
7 Large-Scale Unconstrained Optimization 29
Problem 7.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Problem 7.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Problem 7.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Problem 7.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
8 Calculating Derivatives 31
Problem 8.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Problem 8.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Problem 8.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
9 Derivative-Free Optimization 33
Problem 9.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Problem 9.10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
10 Least-Squares Problems 35
Problem 10.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Problem 10.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Problem 10.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Problem 10.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Problem 10.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
11 Nonlinear Equations 39
Problem 11.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Problem 11.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Problem 11.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Problem 11.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Problem 11.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Problem 11.8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Problem 11.10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3
12 Theory of Constrained Optimization 43
Problem 12.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Problem 12.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Problem 12.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Problem 12.13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Problem 12.14 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Problem 12.16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Problem 12.18 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Problem 12.21 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
13 Linear Programming: The Simplex Method 49
Problem 13.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Problem 13.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Problem 13.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Problem 13.10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
14 Linear Programming: Interior-Point Methods 52
Problem 14.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Problem 14.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
Problem 14.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Problem 14.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Problem 14.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Problem 14.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Problem 14.8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Problem 14.9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Problem 14.12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Problem 14.13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Problem 14.14 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
15 Fundamentals of Algorithms for Nonlinear Constrained Op-
timization 62
Problem 15.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Problem 15.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Problem 15.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Problem 15.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Problem 15.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Problem 15.8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4
16 Quadratic Programming 66
Problem 16.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Problem 16.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
Problem 16.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Problem 16.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Problem 16.15 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Problem 16.21 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
17 Penalty and Augmented Lagrangian Methods 70
Problem 17.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
Problem 17.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Problem 17.9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
18 Sequential Quadratic Programming 72
Problem 18.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Problem 18.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
19 Interior-Point Methods for Nonlinear Programming 74
Problem 19.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Problem 19.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Problem 19.14 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5
1 Introduction
No exercises assigned.
2 Fundamentals of Unconstrained Optimization
Problem 2.1
f
x
1
= 100 2(x
2
x
2
1
)(2x
1
) + 2(1 x
1
)(1)
= 400x
1
(x
2
x
2
1
) 2(1 x
1
)
f
x
2
= 200(x
2
x
2
1
)
=f(x) =
_
400x
1
(x
2
x
2
1
) 2(1 x
1
)
200(x
2
x
2
1
)
_
2
f
x
2
1
= 400[x
1
(2x
1
) + (x
2
x
2
1
)(1)] + 2 = 400(x
2
3x
2
1
) + 2
2
f
x
2
x
1
=
2
f
x
1
x
2
= 400x
1
2
f
x
2
2
= 200
=
2
f(x) =
_
400(x
2
3x
2
1
) + 2 400x
1
400x
1
200
_
1. f(x
) =
_
0
0
_
and x
=
_
1
1
_
is the only solution to f(x) = 0
2.
2
f(x
) =
_
802 400
400 200
_
is positive denite since 802 > 0, and det(
2
f(x
)) =
802(200) 400(400) > 0.
3. f(x) is continuous.
(1), (2), (3) imply that x
=
_
4
3
_
.
This is the only point satisfying the rst order necessary conditions.
2
f(x) =
_
2 0
0 4
_
is not positive denite, since det(
2
f(x)) = 8 < 0.
Therefore, x
2
[f(x)] is also not positive denite. Therefore x
is NOT a maximizer.
Thus x
i=1
a
i
x
i
f
1
(x) =
_
_
f
1
x
1
. . .
f
1
xn
_
_ =
_
_
a
1
. . .
a
n
_
_
= a
2
f
1
(x) =
_
_
2
f
1
x
2
1
2
f
1
x
2
x
1
. . .
.
.
.
.
.
.
.
.
.
_
_
=
_
2
P
i
a
i
x
i
xsxt
_
s = 1 n
t = 1 n
= 0
7
6 5.5 5 4.5 4 3.5 3 2.5 2
1
1.5
2
2.5
3
3.5
4
4.5
5
Figure 1: Contour lines of f(x).
(2)
f
2
(x) = x
T
Ax =
n
i=1
n
j=1
A
ij
x
i
x
j
f
2
(x) =
_
f
2
xs
_
s=1n
=
_
j
A
sj
x
j
+
i
A
is
x
i
s=1n
=
_
2
n
j=1
A
sj
x
j
s=1n
(since A is symmetric)
= 2Ax
2
f
2
(x) =
_
2
f
2
xsxt
_
s = 1 n
t = 1 n
=
_
2
P
i
P
j
A
ij
x
i
x
j
xsxt
_
s = 1 n
t = 1 n
=
_
A
st
+A
ts
s = 1 n
t = 1 n
= 2A
8
Problem 2.4
For any univariate function f(x), we know that the second oder Taylor
expansion is
f(x + x) = f(x) +f
(1)
(x)x +
1
2
f
(2)
(x +tx)x
2
,
and the third order Taylor expansion is
f(x + x) = f(x) +f
(1)
(x)x +
1
2
f
(2)
(x)x
2
+
1
6
f
(3)
(x +tx)x
3
,
where t (0, 1).
For function f
1
(x) = cos (1/x) and any nonzero point x, we know that
f
(1)
1
(x) =
1
x
2
sin
1
x
, f
(2)
1
(x) =
1
x
4
_
cos
1
x
+ 2xsin
1
x
_
.
So the second order Taylor expansion for f
1
(x) is
cos
1
x+x
= cos
1
x
+
_
1
x
2
sin
1
x
_
x
1
2(x+tx)
4
_
cos
1
x+tx
2(x +tx) sin
1
x+tx
_
x
2
,
where t (0, 1). Similarly, for f
2
(x) = cos x, we have
f
(1)
2
(x) = sin x, f
(2)
2
(x) = cos x, f
(3)
2
(x) = sin x.
Thus the third order Taylor expansion for f
2
(x) is
cos (x + x) = cos x (sin x)x
1
2
(cos x)x
2
+
1
6
[sin (x +tx)]x
3
,
where t (0, 1). When x = 1, we have
cos (1 + x) = cos 1 (sin 1)x
1
2
(cos 1)x
2
+
1
6
[sin (1 +tx)]x
3
,
where t (0, 1).
9
Problem 2.5
Using a trig identity we nd that
f(x
k
) =
_
1 +
1
2
k
_
2
(cos
2
k + sin
2
k) =
_
1 +
1
2
k
_
2
,
from which it follows immediately that f(x
k+1
) < f(x
k
).
Let be any point in [0, 2]. We aim to show that the point (cos , sin )
on the unit circle is a limit point of x
k
.
From the hint, we can identify a subsequence
k
1
,
k
2
,
k
3
, . . . such that
lim
j
k
j
= . Consider the subsequence x
k
j
j=1
. We have
lim
j
x
k
j
= lim
j
_
1 +
1
2
k
__
cos k
j
sin k
j
_
= lim
j
_
1 +
1
2
k
_
lim
j
_
cos
k
j
sin
k
j
_
=
_
cos
sin
_
.
Problem 2.6
We need to prove that isolated local min strict local min. Equiv-
alently, we prove the contrapositive: not a strict local min not an
isolated local min.
If x
is not a
strict local min, there is some other point x
A
^ such that f(x
) = f(x
A
).
Hence x
A
is also a local min of f in the neighborhood ^ that is dierent
from x
within which x
is a local min, x
< < 90
cos > 0.
p
k
f
|p
k
||f|
= cos > 0 p
k
f < 0.
f =
_
2(x
1
+x
2
2
)
4x
2
(x
1
+x
2
2
)
_
p
k
f
k
x=
0
@
1
0
1
A
=
_
1
1
_
_
2
0
_
= 2 < 0
which implies that p
k
is a descent direction.
p
k
=
_
1
1
_
, x =
_
1
0
_
f(x
k
+
k
p
k
) = f((1 , )
T
) = ((1 ) +
2
)
2
=
d
d
f(x
k
+
k
p
k
) = 2(1 +
2
)(1 + 2) = 0 only when =
1
2
.
It is seen that
d
2
d
2
f(x
k
+
k
p
k
)
=
1
2
= 6(2
2
2 + 1)
=
1
2
= 3 > 0, so
=
1
2
is indeed a minimizer.
Problem 2.10
Note rst that
x
j
=
n
i=1
S
ji
z
i
+s
j
.
11
By the chain rule we have
z
i
f(z) =
n
j=1
f
x
j
x
j
z
i
=
n
j=1
S
ji
f
x
j
=
_
S
T
f(x)
i
.
For the second derivatives, we apply the chain rule again:
2
z
i
z
k
f(z) =
z
k
n
j=1
S
ji
f(x)
x
j
=
n
j=1
n
l=1
S
ji
2
f(x)
x
j
x
l
x
l
z
k
S
lk
=
_
S
T
2
f(x)S
ki
.
Problem 2.13
x
= 0
|x
k+1
x
|
|x
k
x
|
=
k
k + 1
< 1 and
k
k + 1
1.
For any r (0, 1), k
0
such that k > k
0
,
k
k + 1
> r.
This implies x
k
is not Q-linearly convergent.
Problem 2.14
|x
k+1
x
|
|x
k
x
|
2
=
(0.5)
2
k+1
((0.5)
2
k
)
2
=
(0.5)
2
k+1
(0.5)
2
k+1
= 1 < .
Hence the sequence is Q-quadratic.
Problem 2.15
x
k
=
1
k!
x
= lim
n
x
k
= 0
|x
k+1
x
|
|x
k
x
|
=
k!
(k + 1)!
=
1
k + 1
k
0.
12
This implies x
k
is Q-superlinearly convergent.
|x
k+1
x
|
|x
k
x
|
2
=
k!k!
(k + 1)!
=
k!
k + 1
.
This implies x
k
is not Q-quadratic convergent.
Problem 2.16
For k even, we have
|x
k+1
x
|
|x
k
x
|
=
x
k
/k
x
k
=
1
k
0,
while for k odd we have
|x
k+1
x
|
|x
k
x
|
=
(1/4)
2
k
x
k1
/k
= k
(1/4)
2
k
(1/4)
2
k1
= k(1/4)
2
k1
0,
Hence we have
|x
k+1
x
|
|x
k
x
|
=0,
so the sequence is Q-superlinear. The sequence is not Q-quadratic because
for k even we have
|x
k+1
x
|
|x
k
x
|
2
=
x
k
/k
x
2
k
=
1
k
4
2
k
.
The sequence is however R-quadratic as it is majorized by the sequence
z
k
= (0.5)
2
k
, k = 1, 2, . . . . For even k, we obviously have
x
k
= (0.25)
2
k
< (0.5)
2
k
= z
k
,
while for k odd we have
x
k
< x
k1
= (0.25)
2
k1
= ((0.25)
1/2
)
2
k
= (0.5)
2
k
= z
k
.
A simple argument shows that z
k
is Q-quadratic.
13
3 Line Search Methods
Problem 3.2
Graphical solution
We show that if c
1
is allowed to be greater than c
2
, then we can nd a
function for which no steplengths > 0 satisfy the Wolfe conditions.
Consider the convex function depicted in Figure 2, and let us choose c
1
=
0.99.
sufficient decrease line
slope = -1
slope = -1/2
()
()
Figure 2: Convex function and sucient decrease line
We observe that the sucient decrease line intersects the function only once.
Moreover for all points to the left of the intersection, we have
t
()
1
2
.
Now suppose that we choose c
2
= 0.1 so that the curvature condition requires
t
() 0.1. (1)
Then there are clearly no steplengths satisfying the inequality (1) for which
the sucient decrease condition holds.
14
Problem 3.3
Suppose p is a descent direction and dene
() = f(x +p), 0.
Then any minimizer
of () satises
t
(
) = f(x +
p)
T
p = 0. (2)
A strongly convex quadratic function has the form
f(x) =
1
2
x
T
Qx +b
T
x, Q > 0,
and hence
f(x) = Qx +b. (3)
The one-dimensional minimizer is unique, and by Equation (2) satises
[Q(x +
p) +b]
T
p = 0.
Therefore
(Qx +b)
T
p +
p
T
Qp = 0
which together with Equation (3) gives
=
(Qx +b)
T
p
p
T
Qp
=
f(x)
T
p
p
T
Qp
.
Problem 3.4
Let f(x) =
1
2
x
T
Qx+b
T
x+d, with Q positive denite. Let x
k
be the current
iterate and p
k
a non-zero direction. Let 0 < c <
1
2
.
The one-dimensional minimizer along x
k
+ p
k
is (see the previous ex-
ercise)
k
=
f
T
k
p
k
p
T
k
Qp
k
Direct substitution then yields
f(x
k
) + (1 c)
k
f
T
k
p
k
= f(x
k
)
(f
T
k
p
k
)
2
p
T
k
Qp
k
+c
(f
T
k
p
k
)
2
p
T
k
Qp
k
15
Now, since f
k
= Qx
k
+b, after some algebra we get
f(x
k
+
k
p
k
) = f(x
k
)
(f
T
k
p
k
)
2
p
T
k
Qp
k
+
1
2
(f
T
k
p
k
)
2
p
T
k
Qp
k
,
from which the rst inequality in the Goldstein conditions is evident. For
the second inequality, we reduce similar terms in the previous expression to
get
f(x
k
+
k
p
k
) = f(x
k
)
1
2
(f
T
k
p
k
)
2
p
T
k
Qp
k
,
which is smaller than
f(x
k
) +c
k
f
T
k
p
k
= f(x
k
) c
(f
T
k
p
k
)
2
p
T
k
Qp
k
.
Hence the Goldstein conditions are satised.
Problem 3.5
First we have from (A.7)
|x| = |B
1
Bx| |B
1
| |Bx|,
Therefore
|Bx| |x|/|B
1
|
for any nonsingular matrix B.
For symmetric and positive denite matrix B, we have that the matrices
B
1/2
and B
1/2
exist and that |B
1/2
| = |B|
1/2
and |B
1/2
| = |B
1
|
1/2
.
Thus, we have
cos =
f
T
p
|f| |p|
=
p
T
Bp
|Bp| |p|
p
T
Bp
|B| |p|
2
=
p
T
B
1/2
B
1/2
p
|B| |p|
2
=
|B
1/2
p|
2
|B| |p|
2
|p|
2
|B
1/2
|
2
|B| |p|
2
=
1
|B
1
| |B|
1
M
.
16
We can actually prove the stronger result that cos 1/M
1/2
. Dening
p = B
1/2
p = B
1/2
f, we have
cos =
p
T
Bp
|f| |p|
=
p
T
p
|B
1/2
p| |B
1/2
p|
=
| p|
2
|B
1/2
| | p| |B
1/2
| | p|
=
1
|B
1/2
| |B
1/2
|
1
M
1/2
.
Problem 3.6
If x
0
x
+Qx
b
= Q(x
0
x
) +f(x
)
= (x
0
x
)
for the corresponding eigenvalue . From here, it is easy to get
f
T
0
f
0
=
2
(x
0
x
)
T
(x
0
x
),
f
T
0
Qf
0
=
3
(x
0
x
)
T
(x
0
x
),
f
T
0
Q
1
f
0
= (x
0
x
)
T
(x
0
x
).
Direct substitution in equation (3.28) yields
|x
1
x
|
2
Q
= 0 or x
1
= x
.
Therefore the steepest descent method will nd the solution in one step.
Problem 3.7
We drop subscripts on f(x
k
) for simplicity. We have
x
k+1
= x
k
f,
so that
x
k+1
x
= x
k
x
f,
By the denition of | |
2
Q
, we have
|x
k+1
x
|
2
Q
= (x
k+1
x
)
T
Q(x
k+1
x
)
= (x
k
x
f)
T
Q(x
k
x
f)
= (x
k
x
)
T
Q(x
k
x
) 2f
T
Q(x
k
x
) +
2
f
T
Qf
= |x
k
x
|
2
Q
2f
T
Q(x
k
x
) +
2
f
T
Qf
17
Hence, by substituting f = Q(x
k
x
) and = f
T
f/(f
T
Qf), we
obtain
|x
k+1
x
|
2
Q
= |x
k
x
|
2
Q
2f
T
f +
2
f
T
Qf
= |x
k
x
|
2
Q
2(f
T
f)
2
/(f
T
Qf) + (f
T
f)
2
/(f
T
Qf)
= |x
k
x
|
2
Q
(f
T
f)
2
/(f
T
Qf)
= |x
k
x
|
2
Q
_
1
(f
T
f)
2
(f
T
Qf)|x
k
x
|
2
Q
_
= |x
k
x
|
2
Q
_
1
(f
T
f)
2
(f
T
Qf)(f
T
Q
1
f)
_
,
where we used
|x
k
x
|
2
Q
= f
T
Q
1
f
for the nal equality.
Problem 3.8
We know that there exists an orthogonal matrix P such that
P
T
QP = = diag
1
,
2
, ,
n
.
So
P
T
Q
1
P = (P
T
QP)
1
=
1
.
Let z = P
1
x, then
(x
T
x)
2
(x
T
Qx)(x
T
Q
1
x)
=
(z
T
z)
2
(z
T
z)(z
T
1
z)
=
(
i
z
2
i
)
2
(
i
z
2
i
)(
1
i
z
2
i
)
=
1
P
i
i
z
2
i
P
i
z
2
i
P
i
1
i
z
2
i
P
i
z
2
i
.
Let u
i
= z
2
i
/
i
z
2
i
, then all u
i
satisfy 0 u
i
1 and
i
u
i
= 1. Therefore
(x
T
x)
2
(x
T
Qx)(x
T
Q
1
x)
=
1
(
i
u
i
i
)(
i
u
i
1
i
)
=
(u)
(u)
, (4)
where (u) =
1
P
i
u
i
i
and (u) =
i
u
i
1
i
.
Dene function f() =
1
, and let
=
i
u
i
i
. Note that
[
1
,
n
].
Then
(u) =
1
i
u
i
i
= f(
). (5)
18
Let h() be the linear function tting the data (
1
,
1
1
) and (
n
,
1
n
). We
know that
h() =
1
n
+
1
1
n
1
(
n
).
Because f is convex, we know that f() h() holds for all [
1
,
n
].
Thus
() =
i
u
i
f(
i
)
i
u
i
h(
i
) = h(
i
u
i
i
) = h(
). (6)
Combining (4), (5) and (6), we have
(x
T
x)
2
(x
T
Qx)(x
T
Q
1
x)
=
(u)
(u)
f(
)
h(
)
min
1
n
f()
h()
(since
[
1
,
n
])
= min
1
n
1
1
n
+
n
1
n
=
1
n
min
1
n
1
(
1
+n)
=
1
1
+n
2
(
1
+n
1
+n
2
)
(since the minimum happens at d =
1
+n
2
)
=
4
1
n
(
1
+n)
2
.
This completes the proof of the Kantorovich inequality.
Problem 3.13
Let
q
() = a
2
+b+c. We get a, b and c from the interpolation conditions
q
(0) = (0) c = (0),
t
q
(0) =
t
(0) b =
t
(0),
q
(
0
) = (
0
) a = ((
0
) (0)
t
(0)
0
)/
2
0
.
This gives (3.57). The fact that
0
does not satisfy the sucient decrease
condition implies
0 < (
0
) (0) c
1
t
(0)
0
< (
0
) (0)
t
(0)
0
,
where the second inequality holds because c
1
< 1 and
t
(0) < 0. From here,
clearly, a > 0. Hence,
q
is convex, with minimizer at
1
=
t
(0)
2
0
2 [(
0
) (0)
t
(0)
0
]
.
19
Now, note that
0 < (c
1
1)
t
(0)
0
= (0) +c
1
t
(0)
0
(0)
t
(0)
0
< (
0
) (0)
t
(0)
0
,
where the last inequality follows from the violation of sucient decrease at
0
. Using these relations, we get
1
<
t
(0)
2
0
2(c
1
1)
t
(0)
0
=
0
2(1 c
1
)
.
4 Trust-Region Methods
Problem 4.4
Since liminf |g
k
| = 0, we have by denition of the liminf that v
i
0,
where the scalar nondecreasing sequence v
i
is dened by v
i
= inf
ki
|g
k
|.
In fact, since v
i
is nonnegative and nondecreasing and v
i
0, we must
have v
i
= 0 for all i, that is,
inf
ki
|g
k
| = 0, for all i.
Hence, for any i = 1, 2, . . . , we can identify an index j
i
i such that
|g
j
i
| 1/i, so that
lim
i
|g
j
i
| = 0.
By eliminating repeated entries from j
i
i=1
, we obtain an (innite) subse-
quence o of such that lim
iS
|g
i
| = 0. Moreover, since the iterates x
i
iS
are all conned to the bounded set B, we can choose a further subsequence
o such that
lim
i
S
x
i
= x
,
for some limit point x
)| = 0, so
g(x
) = 0, so we are done.
Problem 4.5
Note rst that the scalar function of that we are trying to minimize is
()
def
= m
k
(p
S
k
) = m
k
(
k
g
k
/|g
k
|) = f
k
k
|g
k
|+
1
2
2
k
g
T
k
B
k
g
k
/|g
k
|
2
,
20
while the condition |p
S
k
|
k
and the denition p
S
k
=
k
g
k
/|g
k
| to-
gether imply that the restriction on the scalar is that [1, 1].
In the trivial case g
k
= 0, the function is a constant, so any value will
serve as the minimizer; the value = 1 given by (4.12) will suce.
Otherwise, if g
T
k
B
k
g
k
= 0, is a linear decreasing function of , so its
minimizer is achieved at the largest allowable value of , which is = 1, as
given in (4.12).
If g
T
k
B
k
g
k
,= 0, has a parabolic shape with critical point
=
k
|g
k
|
2
k
g
T
k
B
k
g
k
/|g
k
|
2
=
|g
k
|
3
k
g
T
k
B
k
g
k
.
If g
T
k
B
k
g
k
0, this value of is negative and is a maximizer. Hence, the
minimizing value of on the interval [1, 1] is at one of the endpoints of
the interval. Clearly (1) < (1), so the solution in this case is = 1, as
in (4.12).
When g
T
k
B
k
g
k
0, the value of above is positive, and is a minimizer
of . If this value exceeds 1, then must be decreasing across the interval
[1, 1], so achieves its minimizer at = 1, as in (4.12). Otherwise, (4.12)
correctly identies the formula above as yielding the minimizer of .
Problem 4.6
Because |g|
2
= g
T
g, it is sucient to show that
(g
T
g)(g
T
g) (g
T
Bg)(g
T
B
1
g). (7)
We know from the positive deniteness of B that g
T
Bg > 0, g
T
B
1
g > 0,
and there exists nonsingular square matrix L such that B = LL
T
, and thus
B
1
= L
T
L
1
. Dene u = L
T
g and v = L
1
g, and we have
u
T
v = (g
T
L)(L
1
g) = g
T
g.
The Cauchy-Schwarz inequality gives
(g
T
g)(g
T
g) = (u
T
v)
2
(u
T
u)(v
T
v) = (g
T
LL
T
g)(g
T
L
T
L
1
g) = (g
T
Bg)(g
T
B
1
g).
(8)
Therefore (7) is proved, indicating
=
|g|
4
(g
T
Bg)(g
T
B
1
g)
1. (9)
21
The equality in (8) holds only when L
T
g and L
1
g are parallel. That is,
when there exists constant ,= 0 such that L
T
g = L
1
g. This clearly
implies that g = LL
T
g = Bg,
1
g = L
T
L
1
g = B
1
g, and hence the
equality in (9) holds only when g, Bg and B
1
g are parallel.
Problem 4.8
On one hand,
2
() =
1
1
|p()|
and (4.39) gives
t
2
() =
d
d
1
|p()|
=
d
d
(|p()|
2
)
1/2
=
1
2
(|p()|
2
)
3/2
d
d
(|p()|
2
)
=
1
2
|p()|
3
d
d
n
j=1
(q
T
j
g)
2
(
j
+)
2
= |p()|
3
n
j=1
(q
T
j
g)
2
(
j
+)
3
where q
j
is the j-th column of Q. This further implies
2
()
t
2
()
=
|p()|
1
|p()|
|p()|
3
n
j=1
(q
T
j
g)
2
(
j
+)
3
=
|p()|
2
|p()|
n
j=1
(q
T
j
g)
2
(
j
+)
3
. (10)
On the other hand, we have from Algorithm 4.3 that q = R
T
p and R
1
R
T
=
(B +I)
1
. Hence (4.38) and the orthonormality of q
1
, q
2
, . . . , q
n
give
|q|
2
= p
T
(R
1
R
T
)p = p
T
(B +I)
1
p = p
T
n
j=1
q
T
j
q
j
j
+
p
=
_
_
n
j=1
q
T
j
g
j
+
q
T
j
_
_
_
_
n
j=1
q
T
j
q
j
j
+
_
_
_
_
n
j=1
q
T
j
g
j
+
q
j
_
_
=
n
j=1
(q
T
j
g)
2
(
j
+)
3
. (11)
Substitute (11) into (10), then we have that
2
()
t
2
()
=
|p|
2
|q|
2
|p|
. (12)
Therefore (4.43) and (12) give (in the l-th iteration of Algorithm 4.3)
(l+1)
=
(l)
+
|p
l
|
2
|q
l
|
2
|p
l
|
=
(l)
+
_
|p
l
|
|q
l
|
_
2
_
|p
l
|
_
.
This is exactly (4.44).
22
Problem 4.10
Since B is symmetric, there exist an orthogonal matrix Q and a diagonal
matrix such that B = QQ
T
, where = diag
1
,
2
, . . . ,
n
and
1
2
. . .
n
are the eigenvalues of B. Now we consider two cases:
(a) If
1
> 0, then all the eigenvalues of B are positive and thus B is
positive denite. In this case B +I is positive denite for = 0.
(b) If
1
0, we choose =
1
+ > 0 where > 0 is any xed
real number. Since
1
is the most negative eigenvalue of B, we know that
i
+ > 0 holds for all i = 1, 2, . . . , n. Note that B+I = Q(+I)Q
T
,
and therefore 0 <
1
+
2
+ . . .
n
+ are the eigenvalues of B+I.
Thus B +I is positive denite for this choice of .
5 Conjugate Gradient Methods
Problem 5.2
Suppose that p
0
, . . . , p
l
are conjugate. Let us express one of them, say p
i
,
as a linear combination of the others:
p
i
=
0
p
0
+ +
l
p
l
(13)
for some coecients
k
(k = 0, 1, . . . , l). Note that the sum does not include
p
i
. Then from conjugacy, we have
0 = p
T
0
Ap
i
=
0
p
T
0
Ap
0
+ +
l
p
T
0
Ap
l
=
0
p
T
0
Ap
0
.
This implies that
0
= 0 since the vectors p
0
, . . . , p
l
are assumed to be
conjugate and A is positive denite. The same argument is used to show
that all the scaler coecients
k
(k = 0, 1, . . . , l) in (13) are zero. Equation
(13) indicates that p
i
= 0, which contradicts the fact that p
i
is a nonzero
vector. The contradiction then shows that vectors p
0
, . . . , p
l
are linearly
independent.
Problem 5.3
Let
g() = (x
k
+p
k
)
=
1
2
2
p
T
k
Ap
k
+(Ax
k
b)
T
p
k
+(x
k
).
23
Matrix A is positive denite, so
k
is the minimizer of g() if g
t
(
k
) = 0.
Hence, we get
g
t
(
k
) =
k
p
T
k
Ap
k
+ (Ax
k
b)
T
p
k
= 0,
or
k
=
(Ax
k
b)
T
p
k
p
T
k
Ap
k
=
r
T
k
p
k
p
T
k
Ap
k
.
Problem 5.4
To see that h() = f(x
0
+
0
p
0
+ +
k1
p
k1
) is a quadratic, note that
0
p
0
+ +
k1
p
k1
= P
where P is the n k matrix whose columns are the n 1 vectors p
i
, i.e.
P =
_
_
[ . . . [
p
0
. . . p
k1
[ . . . [
_
_
and is the k 1 matrix
=
_
0
k1
T
.
Therefore
h() =
1
2
(x
0
+P)
T
A(x
0
+P) +b
T
(x
0
+P)
=
1
2
x
T
0
Ax
0
+x
T
0
AP +
1
2
T
P
T
AP +b
T
x
0
+ (b
T
P)
=
1
2
x
T
0
Ax
0
+b
T
x
0
+ [P
T
A
T
x
0
+P
T
b]
T
+
1
2
T
(P
T
AP)
= C +
b
T
+
1
2
T
A
where
C =
1
2
x
T
0
Ax
0
+b
T
x
0
,
b = P
T
A
T
x
0
+P
T
b and
A = P
T
AP.
If the vectors p
0
p
k1
are linearly independent, then P has full column
rank, which implies that
A = P
T
AP
is positive denite. This shows that h() is a strictly convex quadratic.
24
Problem 5.5
We want to show
span r
0
, r
1
= span r
0
, Ar
0
= span p
0
, p
1
. (14)
From the CG iteration (5.14) and p
0
= r
0
we know
r
1
= Ax
1
b = A(x
0
+
0
p
0
) b = (Ax
0
b)
0
Ar
0
= r
0
0
Ar
0
. (15)
This indicates r
1
span r
0
, Ar
0
and furthermore
span r
0
, r
1
span r
0
, Ar
0
. (16)
Equation (15) also gives
Ar
0
=
1
0
(r
0
r
1
) =
1
0
r
0
0
r
1
.
This shows Ar
0
span r
0
, r
1
and furthermore
span r
0
, r
1
span r
0
, Ar
0
. (17)
We conclude from (16) and (17) that spanr
0
, r
1
= span r
0
, Ar
0
.
Similarly, from (5.14) and p
0
= r
0
, we have
p
1
= r
1
+
1
p
0
=
1
r
0
r
1
or r
1
=
1
p
0
p
1
.
Then span r
0
, r
1
span p
0
, p
1
, and span r
0
, r
1
span p
0
, p
1
. So
span r
0
, r
1
= span p
0
, p
1
. This completes the proof.
Problem 5.6
By the denition of r, we have that
r
k+1
= Ax
k+1
b = A(x
k
+
k
p
k
) b
= A
k
x
k
+
k
Ap
k
b = r
k
+
k
Ap
k
.
Therefore
Ap
k
=
1
k
(r
k+1
r
k
). (18)
Then we have
p
T
k
Ap
k
= p
T
k
(
1
k
(r
k+1
r
k
)) =
1
k
p
T
k
r
k+1
k
p
T
k
r
k
.
25
The expanding subspace minimization property of CG indicates that p
T
k
r
k+1
=
p
T
k1
r
k
= 0, and we know p
k
= r
k
+
k
p
k1
, so
p
T
k
Ap
k
=
1
k
(r
T
k
+
k
p
T
k1
)r
k
=
1
k
r
T
k
r
k
k
p
T
k1
r
k
=
1
k
r
T
k
r
k
. (19)
Equation (18) also gives
r
T
k+1
Ap
k
= r
T
k+1
(
1
k
(r
k+1
r
k
))
=
1
k
r
T
k+1
r
k+1
k
r
T
k+1
r
k
=
1
k
r
T
k+1
r
k+1
k
r
T
k+1
(p
k
+
k
p
k1
)
=
1
r
T
k+1
r
k+1
+
1
r
T
k+1
p
k
k
r
T
k+1
p
k1
=
1
r
T
k+1
r
k+1
.
This equation, together with (19) and (5.14d), gives that
k+1
=
r
T
k+1
Ap
k
p
T
k
Ap
k
=
1
r
T
k+1
r
k+1
1
k
r
T
k
r
k
=
r
T
k+1
r
k+1
r
T
k
r
k
.
Thus (5.24d) is equivalent to (5.14d).
Problem 5.9
Minimize
( x) =
1
2
x
T
(C
T
AC
1
) x(C
T
b)
T
x solve (C
T
AC
1
) x =
C
T
b. Apply CG to the transformed problem:
r
0
=
A x
0
b = (C
T
AC
1
)Cx
0
C
T
b = C
T
(Ax
0
b) = C
T
r
0
.
_
p
0
= r
0
= C
T
r
0
My
0
= r
0
_
= p
0
= C
T
(My
0
) = C
T
C
T
Cy
0
= Cy
0
.
26
=
0
=
r
T
0
r
0
p
T
0
A p
0
=
r
T
0
C
1
C
T
r
0
y
T
0
C
T
C
T
AC
1
Cy
0
=
r
T
0
M
1
r
0
y
T
0
Ay
0
=
r
T
0
y
0
p
T
0
Ay
0
=
0
.
x
1
= x
0
+
0
p
0
Cx
1
= Cx
0
+
r
T
0
y
0
p
T
0
Ay
0
(Cy
0
)
= x
1
= x
0
r
T
0
y
0
p
T
0
Ay
0
y
0
= x
0
+
0
p
0
r
1
= r
0
+
0
A p
0
C
T
r
1
= C
T
r
0
+
r
T
0
y
0
p
T
0
Ay
0
C
T
AC
1
(Cy
0
)
= r
1
= r
0
+
r
T
0
y
0
p
T
0
Ay
0
A(y
0
) = r
0
+
0
Ap
0
1
=
r
T
1
r
1
r
T
0
r
0
=
r
T
1
C
1
C
T
r
1
r
T
0
C
1
C
T
r
0
=
r
T
1
M
1
r
1
r
T
0
M
1
r
0
=
r
T
1
y
1
r
T
0
y
0
=
1
p
1
= r
1
+
1
p
0
Cy
1
= C
T
r
1
+
1
(Cy
0
)
= y
1
= M
1
r
1
+
1
y
0
p
1
= y
1
+
1
p
0
( because p
1
= Cp
1
).
By comparing the formulas above with Algorithm 5.3, we can see that
by applying CG to the problem with the new variables, then transforming
back into original variables, the derived algorithm is the same as Algorithm
5.3 for k = 0. Clearly, the same argument can be used for any k; the key is
to notice the relationships:
_
_
x
k
= Cx
k
p
k
= Cp
k
r
k
= C
T
r
k
_
_
.
Problem 5.10
From the solution of Problem 5.9 it is seen that r
i
= C
T
r
i
and r
j
= C
T
r
j
.
Since the unpreconditioned CG algorithm is applied to the transformed
27
problem, by the orthogonality of the residuals we know that r
T
i
r
j
= 0 for
all i ,= j. Therefore
0 = r
T
i
r
j
= r
T
i
C
1
C
T
r
j
= r
T
i
M
1
r
j
.
Here the last equality holds because M
1
= (C
T
C)
1
= C
1
C
T
.
6 Quasi-Newton Methods
Problem 6.1
(a) A function f(x) is strongly convex if all eigenvalues of
2
f(x) are positive
and bounded away from zero. This implies that there exists > 0 such that
p
T
2
f(x)p |p|
2
for any p. (20)
By Taylors theorem, if x
k+1
= x
k
+
k
p
k
, then
f(x
k+1
) = f(x
k
) +
_
1
0
[
2
f(x
k
+z
k
p
k
)
k
p
k
]dz.
By (20) we have
k
p
T
k
y
k
=
k
p
T
k
[f(x
k+1
f(x
k
)]
=
2
k
_
1
0
_
p
T
k
2
f(x
k
+z
k
p
k
)p
k
dz
|p
k
|
2
2
k
> 0.
The result follows by noting that s
k
=
k
p
k
.
(b) For example, when f(x) =
1
x + 1
, we have g(x) =
1
(x + 1)
2
. Obviously
f(0) = 1, f(1) =
1
2
, g(0) = 1, g(1) =
1
4
.
So
s
T
y = (f(1) f(0)) (g(1) g(0)) =
3
8
< 0
and (6.7) does not hold in this case.
28
Problem 6.2
The second strong Wolfe condition is
f(x
k
+
k
p
k
)
T
p
k
c
2
f(x
k
)
T
p
k
which implies
f(x
k
+
k
p
k
)
T
p
k
c
2
f(x
k
)
T
p
k
= c
2
f(x
k
)
T
p
k
since p
k
is a descent direction. Thus
f(x
k
+
k
p
k
)
T
p
k
f(x
k
)
T
p
k
= (c
2
1)f(x
k
)
T
p
k
> 0
since we have assumed that c
2
< 1. The result follows by multiplying both
sides by
k
and noting s
k
=
k
p
k
, y
k
= f(x
k
+
k
p
k
) f(x
k
).
7 Large-Scale Unconstrained Optimization
Problem 7.2
Since s
k
,= 0, the product
H
k+1
s
k
=
_
I
s
k
y
T
k
y
T
k
s
k
_
s
k
= s
k
y
T
k
s
k
y
T
k
s
k
s
k
= 0
illustrates that
H
k+1
is singular.
29
Problem 7.3
We assume line searches are exact, so f
T
k+1
p
k
= 0. Also, recall s
k
=
k
p
k
.
Therefore,
p
k+1
= H
k+1
f
k+1
=
__
I
s
k
y
T
k
y
T
k
s
k
__
I
y
k
s
T
k
y
T
k
s
k
_
+
s
k
s
T
k
y
T
k
s
k
_
f
k+1
=
__
I
p
k
y
T
k
y
T
k
p
k
__
I
y
k
p
T
k
y
T
k
p
k
_
+
k
p
k
p
T
k
y
T
k
p
k
_
f
k+1
=
_
I
p
k
y
T
k
y
T
k
p
k
_
f
k+1
= f
k+1
+
f
T
k+1
y
k
y
T
k
p
k
p
k
,
as given.
Problem 7.5
For simplicity, we consider (x
3
x
4
) as an element function despite the fact
that it is easily separable. The function can be written as
f(x) =
3
i=1
i
(U
i
x)
where
i
(u
1
, u
2
, u
3
, u
4
) = u
2
u
3
e
u
1
+u
3
u
4
,
(v
1
, v
2
) = (v
1
v
2
)
2
,
(w
1
, w
2
) = w
1
w
2
,
and
U
1
= I,
U
2
=
_
0 1 0 0
0 0 1 0
_
,
U
3
=
_
0 0 1 0
0 0 0 1
_
.
30
Problem 7.6
We nd
Bs =
_
ne
i=1
U
T
i
B
[i]
U
i
_
s
=
ne
i=1
U
T
i
B
[i]
s
[i]
=
ne
i=1
U
T
i
y
[i]
= y,
so the secant equation is indeed satised.
8 Calculating Derivatives
Problem 8.1
Supposing that L
c
is the constant in the central dierence formula, that is,
f
x
i
_
f(x +e
i
) f(x e
i
)
2
_
L
c
2
,
and assuming as in the analysis of the forward dierence formula that
[comp(f(x +e
i
)) f(x +e
i
))[ L
f
u,
[comp(f(x e
i
)) f(x e
i
))[ L
f
u,
the total error in the central dierence formula is bounded by
L
c
2
+
2uL
f
2
.
By dierentiating with respect to , we nd that the minimizer is at
=
_
L
f
u
2L
c
_
1/3
,
so when the ratio L
f
/L
c
is reasonable, the choice = u
1/3
is a good one.
By substituting this value into the error expression above, we nd that both
terms are multiples of u
2/3
, as claimed.
31
1 2
4
3
6
5
Figure 3: Adjacency Graph for Problem 8.6
Problem 8.6
See the adjacency graph in Figure 3.
Four colors are required; the nodes corresponding to these colors are 1,
2, 3, 4, 5, 6.
Problem 8.7
We start with
x
1
=
_
_
1
0
0
_
_
, x
2
=
_
_
0
1
0
_
_
, x
3
=
_
_
0
0
1
_
_
.
32
By applying the chain rule, we obtain
x
4
= x
1
x
2
+x
2
x
1
=
_
_
x
2
x
1
0
_
_
,
x
5
= (cos x
3
)x
3
=
_
_
0
0
cos x
3
_
_
,
x
6
= e
x
4
x
4
= e
x
1
x
2
_
_
x
2
x
1
0
_
_
,
x
7
= x
4
x
5
+x
5
x
4
=
_
_
x
2
sin x
3
x
1
sin x
3
x
1
x
2
cos x
3
_
_
,
x
8
= x
6
+x
7
= e
x
1
x
2
_
_
x
2
x
1
0
_
_
+
_
_
x
2
sin x
3
x
1
sin x
3
x
1
x
2
cos x
3
_
_
,
x
9
=
1
x
3
x
8
x
8
x
2
3
x
3
.
9 Derivative-Free Optimization
Problem 9.3
The interpolation conditions take the form
( s
l
)
T
g = f(y
l
) f(x
k
) l = 1, . . . , q 1, (21)
where
s
l
_
(s
l
)
T
, s
l
i
s
l
j
i<j
,
_
1
2
(s
l
i
)
2
__
T
l = 1, . . . , m1,
and s
l
is dened by (9.13). The model (9.14) is uniquely determined if and
only if the system (21) has a unique solution, or equivalently, if and only if
the set s
l
: l = 1, . . . , q 1 is linearly independent.
Problem 9.10
It suces to show that for any v, we have max
j=1,2,...,n+1
v
T
d
j
(1/4n)|v|
1
.
Consider rst the case of v 0, that is, all components of v are nonnegative.
33
We then have
max
j=1,2,...,n+1
v
T
d
j
v
T
d
n+1
1
2n
e
T
v =
1
2n
|v|
1
.
Otherwise, let i be the index of the most negative component of v. We have
that
|v|
1
=
v
j
<0
v
j
+
v
j
0
v
j
n[v
i
[ +
v
j
0
v
j
.
We consider two cases. In the rst case, suppose that
[v
i
[
1
2n
v
j
0
v
j
.
In this case, we have from the inequality above that
|v|
1
n[v
i
[ + (2n)[v
i
[ = (3n)[v
i
[,
so that
max
dT
k
d
T
v d
T
i
v
= (1 1/2n)[v
i
[ + (1/2n)
j,=i
v
j
(1 1/2n)[v
i
[ (1/2n)
j,=i,v
j
<0
v
j
(1 1/2n)[v
i
[ (1/2n)n[v
i
[
(1/2 1/2n)[v
i
[
(1/4)[v
i
[
(1/12n)|v|
1
,
which is sucient to prove the desired result. We now consider the second
case, for which
[v
i
[ <
1
2n
v
j
0
v
j
.
We have here that
|v|
1
n
1
2n
v
j
0
v
j
+
v
j
0
v
j
3
2
v
j
0
v
j
,
34
so that
max
dT
k
d
T
v d
T
n+1
v
=
1
2n
v
j
0
v
j
+
1
2n
v
j
0
v
j
1
2n
n[v
i
[ +
1
2n
v
j
0
v
j
=
1
2
[v
i
[ +
1
2n
v
j
0
v
j
1
4n
v
j
0
v
j
+
1
2n
v
j
0
v
j
=
1
4n
v
j
0
v
j
1
6n
|v|
1
.
which again suces.
10 Least-Squares Problems
Problem 10.1
Recall:
(i) J has full column rank is equivalent to Jx = 0 x = 0;
(ii) J
T
J is nonsingular is equivalent to J
T
Jx = 0 x = 0;
(iii) J
T
J is positive denite is equivalent to x
T
J
T
Jx 0(x) and
x
T
J
T
Jx = 0 x = 0.
(a) We want to show (i) (ii).
(i) (ii). J
T
Jx = 0 x
T
J
T
Jx = 0 |Jx|
2
2
= 0 Jx = 0
(by (i)) x = 0.
(ii) (i). Jx = 0 J
T
Jx = 0 (by (ii)) x = 0.
(b) We want to show (i) (iii).
(i) (iii). x
T
J
T
Jx = |Jx|
2
2
0(x) is obvious. x
T
J
T
Jx = 0
|Jx|
2
2
= 0 Jx = 0 (by (i)) x = 0.
35
(iii) (i). Jx = 0 x
T
J
T
Jx = |Jx|
2
2
= 0 (by (iii)) x = 0.
Problem 10.3
(a) Let Q be a nn orthogonal matrix and x be any given n-vector. Dene
q
i
(i = 1, 2, , n) to be the i-th column of Q. We know that
q
T
i
q
j
=
_
|q
i
|
2
= 1 (if i = j)
0 (if i ,= j).
(22)
Then
|Qx|
2
= (Qx)
T
(Qx)
= (x
1
q
1
+x
2
q
2
+ +x
n
q
n
)
T
(x
1
q
1
+x
2
q
2
+ +x
n
q
n
)
=
n
i=1
n
j=1
x
i
x
j
q
T
i
q
j
(by (22))
=
n
i=1
x
2
i
= |x|
2
.
(b) If = I, then J
T
J = (Q
1
R)
T
(Q
1
R) = R
T
R. We know that the
Cholesky decomposition is unique if the diagonal elements of the upper
triangular matrix are positive, so
R = R.
Problem 10.4
(a) It is easy to see from (10.19) that
J =
n
i=1
i
u
i
v
T
i
=
i:
i
,=0
i
u
i
v
T
i
.
Since the objective function f(x) dened by (10.13) is convex, it suces to
show that f(x
) = 0, where x
) = J
T
(Jx
y)
= J
T
_
_
i:
i
,=0
i
u
i
v
T
i
_
_
i:
i
,=0
u
T
i
y
i
v
i
+
i:
i
=0
i
v
i
_
_
y
_
_
= J
T
_
_
i:
i
,=0
i
(u
T
i
y)u
i
(v
T
i
v
i
) y
_
_
=
_
_
i:
i
,=0
i
v
i
u
T
i
_
_
_
_
i:
i
,=0
(u
T
i
y)u
i
y
_
_
=
i:
i
,=0
i
(u
T
i
y)v
i
(u
T
i
u
i
)
i:
i
,=0
i
v
i
(u
T
i
y)
=
i:
i
,=0
i
(u
T
i
y)v
i
i:
i
,=0
i
v
i
(u
T
i
y) = 0.
(b) If J is rank-decient, we have
x
i:
i
,=0
u
T
i
y
i
v
i
+
i:
i
=0
i
v
i
.
Then
|x
|
2
2
=
i:
i
,=0
_
u
T
i
y
i
_
2
+
i:
i
=0
2
i
,
which is minimized when
i
= 0 for all i with
i
= 0.
37
Problem 10.5
For the Jacobian, we get the same Lipschitz constant:
|J(x
1
) J(x
2
)|
= max
|u|=1
|(J(x
1
) J(x
2
))u|
= max
|u|=1
_
_
_
_
_
_
_
_
_
_
(r
1
(x
1
) r
1
(x
2
))
T
u
.
.
.
(r
m
(x
1
) r
m
(x
2
))
T
u
_
_
_
_
_
_
_
_
_
_
max
|u|=1
max
j=1,...,m
[(r
j
(x
1
) r
j
(x
2
))
T
u[
max
|u|=1
max
j=1,...,m
|r
j
(x
1
) r
j
(x
2
)||u|[cos(r
j
(x
1
) r
j
(x
2
), u)[
L|x
1
x
2
|.
For the gradient, we get
L = L(L
1
+L
2
), with L
1
= max
xT
|r(x)|
1
and
L
2
= max
xT
m
j=1
|r
j
(x)|:
|f(x
1
) f(x
2
)|
=
_
_
_
_
_
_
m
j=1
r
j
(x
1
)r
j
(x
1
)
m
j=1
r
j
(x
2
)r
j
(x
2
)
_
_
_
_
_
_
=
_
_
_
_
_
_
m
j=1
(r
j
(x
1
) r
j
(x
2
))r
j
(x
1
) +
m
j=1
r
j
(x
2
)(r
j
(x
1
) r
j
(x
2
))
_
_
_
_
_
_
j=1
|r
j
(x
1
) r
j
(x
2
)| [r
j
(x
1
)[ +
m
j=1
|r
j
(x
2
)| [r
j
(x
1
) r
j
(x
2
)[
L|x
1
x
2
|
m
j=1
[r
j
(x
1
)[ +L|x
1
x
2
|
m
j=1
|r
j
(x
2
)|
L|x
1
x
2
|.
38
Problem 10.6
If J = U
1
SV
T
, then (J
T
J +I) = V (S
2
+I)V
T
. From here,
p
LM
= V (S
2
+I)
1
SU
T
1
r
=
n
i=1
2
i
+
(u
T
i
r)v
i
=
i:
i
,=0
2
i
+
(u
T
i
r)v
i
.
Thus,
|p
LM
|
2
=
i:
i
,=0
_
i
2
i
+
(u
T
i
r)v
i
_
2
,
and
lim
0
p
LM
=
i:
i
,=0
u
T
i
r
i
v
i
.
11 Nonlinear Equations
Problem 11.1
Note s
T
s = |s|
2
2
is a scalar, so it sucies to show that |ss
T
| = |s|
2
2
. By
denition,
|ss
T
| = max
|x|
2
=1
|(ss
T
)x|
2
.
Matrix multiplication is associative, so (ss
T
)x = s(s
T
x), and s
T
x is a scalar.
Hence,
max
|x|
2
=1
|s(s
T
x)|
2
= max
|x|
2
=1
[s
T
x[|s|
2
.
Last,
[s
T
x[ = [|s|
2
|x|
2
cos
s,x
[ = |s|
2
[ cos
s,x
[,
which is maximized when [ cos
s,x
[ = 1. Therefore,
max
|x|
2
=1
[s
T
x[ = |s|
2
,
which yields the result.
39
Problem 11.2
Starting at x
0
,= 0, we have r
t
(x
0
) = qx
q1
0
. Hence,
x
1
= x
0
x
q
0
qx
q1
0
=
_
1
1
q
_
x
0
.
A straghtforward induction yields
x
k
=
_
1
1
q
_
k
x
0
,
which certainly converges to 0 as k . Moreover,
x
k+1
x
k
= 1
1
q
,
so the sequence converges Q-linearly to 0, with convergence ratio 1 1/q.
Problem 11.3
For this function, Newtons method has the form:
x
k+1
= x
k
r(x)
r
t
(x)
= x
k
x
5
+x
3
+ 4x
5x
4
+ 3x
2
+ 4
.
Starting at x
0
= 1, we nd
x
1
= x
0
x
5
0
+x
3
0
+ 4x
0
5x
4
0
+ 3x
2
0
+ 4
= 1
4
2
= 1,
x
2
= x
1
x
5
1
+x
3
1
+ 4x
1
5x
4
1
+ 3x
2
1
+ 4
= 1 +
4
2
= 1,
x
3
= 1,
.
.
.
.
.
.
.
.
.
as described.
A trivial root of r(x) is x = 0, i.e.,
r(x) = (x 0)(x
4
x
2
4).
The remaining roots can be found by noticing that f(x) = x
4
x
2
4 is
quadratic in y = x
2
. According to the quadratic equation, we have the roots
y =
1
17
2
= x
2
x =
17
2
.
40
As a result,
r(x) = (x)
0
@
x
s
1
17
2
1
A
0
@
x
s
1 +
17
2
1
A
0
@
x +
s
1
17
2
1
A
0
@
x +
s
1 +
17
2
1
A
.
Problem 11.4
The sum-of-squares merit function is in this case
f(x) =
1
2
(sin(5x) x)
2
.
Moreover, we nd
f
t
(x) = (sin(5x) x) (5 cos(5x) 1) ,
f
tt
(x) = 25 sin(5x) (sin(5x) x) + (5 cos(5x) 1)
2
.
The merit function has local minima at the roots of r, which as previously
mentioned are found at approximately x S = 0.519148, 0, 0.519148.
Furthermore, there may be local minima at points where the Jacobian is
singular, i.e., x such that J(x) = 5 cos(5x) 1 = 0. All together, there are
an innite number of local minima described by
x
S
_
x [ 5 cos(5x) = 1 f
tt
(x) 0
_
.
Problem 11.5
First, if J
T
r = 0, then () = 0 for all .
Suppose J
T
r ,= 0. Let the singular value decomposition of J '
mn
be
J = USV
where U '
mn
and V '
nn
are orthogonal. We nd (let z = S
T
U
T
r):
() = |(J
T
J +I)
1
J
T
r|
= |(V
T
S
T
U
T
USV +V
T
V )
1
V
T
z|
= |(V
T
(S
T
S +I)V )
1
V
T
z|
= |V
T
(S
T
S +I)
1
V V
T
z| (sinceV
1
= V
T
)
= |V
T
(S
T
S +I)
1
z|
= |(S
T
S +I)
1
z| (sinceV
T
is orthogonal)
= |(D())
1
z|
41
where D() is a diagonal matrix having
[D()]
ii
=
_
2
i
+, i = 1, . . . , min(m, n)
, i = min(m, n) + 1, . . . , max(m, n).
Each entry of y() = (D())
1
z is of the form
y
i
() =
z
i
[D()]
ii
.
Therefore, [y
i
(
1
)[ < [y
i
(
2
)[ for
1
>
2
> 0 and i = 1, . . . , n, which implies
(
1
) < (
2
) for
1
>
2
> 0.
Problem 11.8
Notice that
JJ
T
r = 0 r
T
JJ
T
r = 0.
If v = J
T
r, then the above implies
r
T
JJ
T
r = v
T
v = |v|
2
= 0
which must mean v = J
T
r = 0.
Problem 11.10
The homotopy map expands to
H(x, ) =
_
x
2
1
_
+ (1 )(x a)
= x
2
+ (1 )x
1
2
(1 +).
For a given , the quadratic formula yields the following roots for the above:
x =
1
_
(1 )
2
+ 2(1 +)
2
=
1
1 + 3
2
2
.
By choosing the positive root, we nd that the zero path dened by
_
= 0 x = 1/2,
(0, 1] x =
1+
1+3
2
2
,
connects (
1
2
, 0) to (1, 1), so continuation methods should work for this choice
of starting point.
42
12 Theory of Constrained Optimization
Problem 12.4
First, we show that local solutions to problem 12.3 are also global solutions.
Take any local solution to problem 12.3, denoted by x
0
. This means that
there exists a neighborhood N(x
0
) such that f(x
0
) f(x) holds for any
x N(x
0
) . The following proof is based on contradiction.
Suppose x
0
is not a global solution, then we take a global solution x ,
which satises f(x
0
) > f( x). Because is a convex set, there exists [0, 1]
such that x
0
+ (1 ) x N(x
0
) . Then the convexity of f(x) gives
f(x
0
+ (1 ) x) f(x
0
) + (1 )f( x)
< f(x
0
) + (1 )f(x
0
)
= f(x
0
),
which contradicts the fact that x
0
is the minimum point in N(x
0
) . It
follows that x
0
must be a global solution, and that any local solution to
problem 12.3 must also be a global solution.
Now, let us prove that the set of global solutions is convex. Let
S = x [ x is a global solution to problem 12.3,
and consider any x
1
, x
2
S such that x
1
,= x
2
and x = x
1
+ (1 )x
2
,
(0, 1). By the convexity of f(x), we have
f(x
1
+ (1 )x
2
) f(x
1
) + (1 )f(x
2
)
= f(x
1
) + (1 )f(x
1
)
= f(x
1
).
Since x , the above must hold as an equality, or else x
1
would not be a
global solution. Therefore, x S and S is a convex set.
Problem 12.5
Recall
f(x) = |v(x)|
= max [v
i
(x)[, i = 1, . . . , m.
43
Minimizing f is equivalent to minimizing t where [v
i
(x)[ t, i = 1, . . . , m;
i.e., the problem can be reformulated as
min
x
t
s.t. t v
i
(x) 0, i = 1, . . . , m,
t +v
i
(x) 0, i = 1, . . . , m.
Similarly, for f(x) = max v
i
(x), i = 1, . . . , m, the minimization problem
can be reformulated as
min
x
t
s.t. t v
i
(x) 0, i = 1, . . . , m.
Problem 12.7
Given
d =
_
I
c
1
(x)c
T
1
(x)
|c
1
(x)|
2
_
f(x),
we nd
c
T
1
(x)d = c
T
1
(x)
_
I
c
1
(x)c
T
1
(x)
|c
1
(x)|
2
_
f(x)
= c
T
1
(x)f(x) +
(c
T
1
(x)c
1
(x))(c
T
1
(x)f(x))
|c
1
(x)|
2
= 0.
Furthermore,
f
T
(x)d = f
T
(x)
_
I
c
1
(x)c
T
1
(x)
|c
1
(x)|
2
_
f(x)
= f
T
(x)f(x) +
(f
T
(x)c
1
(x))(c
T
1
(x)f(x))
|c
1
(x)|
2
= |f(x)|
2
+
(f
T
(x)c
1
(x))
2
|c
1
(x)|
2
The Holder Inequality yields
[f
T
(x)c
1
(x)[ |f
T
(x)||c
1
(x)|
(f
T
(x)c
1
(x))
2
|f
T
(x)|
2
|c
1
(x)|
2
,
44
and our assumption that (12.10) does not hold implies that the above is
satised as a strict inequality. Thus,
f
T
(x)d = |f(x)|
2
+
(f
T
(x)c
1
(x))
2
|c
1
(x)|
2
< |f(x)|
2
+
|f(x)|
2
|c
1
(x)|
2
|c
1
(x)|
2
= 0.
Problem 12.13
The constraints can be written as
c
1
(x) = 2 (x
1
1)
2
(x
2
1)
2
0,
c
2
(x) = 2 (x
1
1)
2
(x
2
+ 1)
2
0,
c
3
(x) = x
1
0,
so
c
1
(x) =
_
2(x
1
1)
2(x
2
1)
_
, c
2
(x) =
_
2(x
1
1)
2(x
2
+ 1)
_
, c
3
(x) =
_
1
0
_
.
All constraints are active at x
)
is not a linearly independent set and LICQ does not hold. However, for
w = (1, 0), c
i
(x
)
T
w > 0 for all i A(x
x
L(x, ) = 2x a
xx
L(x, ) = 2I.
Notice that the second order sucient condition
xx
L(x, ) = 2I > 0 is
satised at all points.
The KKT conditions
x
L(x
) = 0,
c(x
) = 0,
0 imply
x
2
a
and
= 0 or a
T
x
+ =
|a|
2
2
+ = 0.
There are two cases. First, if 0, then the latter condition implies
= 0, so the solution is (x
) =
_
|a|
2
a,
2
|a|
2
_
Problem 12.16
Eliminating the x
2
variable yields
x
2
=
_
1 x
2
1
There are two cases:
Case 1: Let x
2
=
_
1 x
2
1
. The optimization problem becomes
min
x
1
f(x
1
) = x
1
+
_
1 x
2
1
.
The rst order condition is
f = 1
x
1
_
1 x
2
1
= 0,
which is satised by x
1
= 1/
2, 1/
2).
46
Case 2: Let x
2
=
_
1 x
2
1
. The optimization problem becomes
min
x
1
f(x
1
) = x
1
_
1 x
2
1
.
The rst order condition is
f = 1 +
x
1
_
1 x
2
1
= 0,
which is satised by x
1
= 1/
2, 1/
2).
Each choice of sign leads to a distinct solution. However, only case 2 yields
the optimal solution
x
=
_
2
,
1
2
_
.
Problem 12.18
The problem is
min
x,y
(x 1)
2
+ (y 2)
2
s.t. (x 1)
2
5y = 0.
The Lagrangian is
L(x, y, ) = (x 1)
2
+ (y 2)
2
((x 1)
2
5y)
= (1 )(x 1)
2
+ (y 2)
2
+ 5y,
which implies
x
L(x, y, ) = 2(1 )(x 1)
y
L(x, y, ) = 2(y 2) + 5.
The KKT conditions are
2(1
)(x
1) = 0
2(y
2) + 5
= 0
(x
1)
2
5y
= 0.
47
Solving for x
, y
, and
, we nd x
= 1, y
= 0, and
=
4
5
as the only
real solution. At (x
, y
,y
)
=
_
2(x 1)
5
_
(x
,y
)
=
_
0
5
_
,=
_
0
0
_
,
so LICQ is satised.
Now we show that (x
, y
= 4.
We nd
w F
2
(
) w = (w
1
, w
2
) satises [c(x
, y
)]
T
w = 0
_
0 5
_
w
1
w
2
_
= 0
w
2
= 0,
then for all w = (w
1
, 0) where w
1
,= 0,
w
T
2
L(x
, y
)w =
_
w
1
0
_
2(1
4
5
) 0
0 2
_ _
w
1
0
_
=
2
5
w
2
1
> 0 (for w
1
,= 0).
Thus from the second-order sucient condition, we nd (1, 0) is the optimal
solution.
Finally, we substitute (x 1)
2
= 5y into the objective function and get
the following unconstrained optimization problem:
min
y
5y + (y 2)
2
= y
2
+y + 4.
Notice that y
2
+y +4 = (y +
1
2
)
2
+
15
4
15
4
, so y = 1/2 yields an objective
value of 15/4 < 4. Therefore, optimal solutions to this problem cannot yield
solutions to the original problem.
Problem 12.21
We write the problem in the form:
min
x
1
,x
2
x
1
x
2
s.t. 1 x
2
1
x
2
2
0.
48
The Lagrangian function is
L(x
1
, x
2
, ) = x
1
x
2
(1 x
2
1
x
2
2
).
The KKT conditions are
x
2
(2x
1
) = 0
x
1
(2x
2
) = 0
0
(1 x
2
1
x
2
2
) = 0.
We solve this system to get three KKT points:
(x
1
, x
2
, )
_
(0, 0, 0),
_
2
2
,
2
2
,
1
2
_
,
_
2
2
,
2
2
,
1
2
__
Checking the second order condition at each KKT point, we nd
(x
1
, x
2
)
__
2
2
,
2
2
_
,
_
2
2
,
2
2
__
are the optimal points.
13 Linear Programming: The Simplex Method
Problem 13.1
We rst add slack variables z to the constraint A
2
x +B
2
y b
2
and change
it into
A
2
x +B
2
y +z = b
2
, z 0.
Then we introduce surplus variables s
1
and slack variables s
2
into the two-
sided bound constraint l y u:
y s
1
= l, y +s
2
= u, s
1
0, s
2
0.
Splitting x and y into their nonnegative and nonpositive parts, we have
x = x
+
x
, x
+
= max(x, 0) 0, x
= max(x, 0) 0,
y = y
+
y
, y
+
= max(y, 0) 0, y
= max(y, 0) 0.
49
Therefore the objective function and the constraints can be restated as:
max c
T
x +d
T
y min c
T
(x
+
x
) d
T
(y
+
y
)
A
1
x = b
1
A
1
(x
+
x
) = b
1
A
2
x +B
2
y b
2
A
2
(x
+
x
) +B
2
(y
+
y
) +z = b
2
l y u y
+
y
s
1
= l, y
+
y
+s
2
= u,
with all the variables (x
+
, x
, y
+
, y
, z, s
1
, s
2
) nonnegative. Hence the stan-
dard form of the given linear program is:
minimize
x
+
,x
,y
+
,y
,z,s
1
,s
2
_
_
c
c
d
d
0
0
0
_
_
T
_
_
x
+
x
y
+
y
z
s
1
s
2
_
_
subject to
_
_
A
1
A
1
0 0 0 0 0
A
2
A
2
B
2
B
2
I 0 0
0 0 I I 0 I 0
0 0 I I 0 0 I
_
_
_
_
x
+
x
y
+
y
z
s
1
s
2
_
_
=
_
_
b
1
b
2
l
u
_
_
x
+
, x
, y
+
, y
, z, s
1
, s
2
0.
Problem 13.5
It is sucient to show that the two linear programs have identical KKT
systems. For the rst linear program, let be the vector of Lagrangian
multipliers associated with Ax b and s be the vector of multipliers asso-
ciated with x 0. The Lagrangian function is then
L
1
(x, , s) = c
T
x
T
(Ax b) s
T
x.
The KKT system of this problem is given by
A
T
+s = c
Ax b
x 0
0
s 0
T
(Ax b) = 0
s
T
x = 0.
50
For the second linear program, we know that max b
T
min b
T
. Simi-
larly, let x be the vector of Lagrangian multipliers associated with A
T
c
and y be the vector of multipliers associated with 0. By introducing
the Lagrangian function
L
2
(, x, y) = b
T
x
T
(c A
T
) y
T
,
we have the KKT system of this linear program:
Ax b = y
A
T
c
0
x 0
y 0
x
T
(c A
T
) = 0
y
T
= 0.
Dening s = c A
T
and noting that y = Ax b, we can easily verify that
the two KKT systems are identical, which is the desired argument.
Problem 13.6
Assume that there does exist a basic feasible point x for linear program
(13.1), where m n and the rows of A are linearly dependent. Also as-
sume without loss of generality that B( x) = 1, 2, . . . , m. The matrix
B = [A
i
]
i=1,2,...,m
is nonsingular, where A
i
is the i-th column of A.
On the other hand, since m n and the rows of A are linearly dependent,
there must exist 1 k m such that the k-th row of A can be expressed as a
linear combination of other rows of A. Hence, with the same coecients, the
k-th row of B can also expressed as a linear combination of other rows of B.
This implies that B is singular, which obviously contradicts the argument
that B is nonsingular. Then our assumption that there is a basic feasible
point for (13.1) must be incorrect. This completes the proof.
Problem 13.10
By equating the last row of L
1
U
1
to the last row of P
1
L
1
B
+
P
T
1
, we have
the following linear system of 4 equations and 4 unknowns:
l
52
u
33
= u
23
l
52
u
34
+ l
53
u
44
= u
24
l
52
u
35
+ l
53
u
45
+ l
54
u
55
= u
25
l
52
w
3
+ l
53
w
4
+ l
54
w
5
+ w
2
= w
2
.
51
We can either successively retrieve the values of l
52
, l
53
, l
54
and w
2
from
l
52
= u
23
/u
33
l
53
= (u
24
l
52
u
34
)/u
44
l
54
= (u
25
l
52
u
35
l
53
u
45
)/u
55
w
2
= w
2
l
52
w
3
l
53
w
4
l
54
w
5
,
or calculate these values from the unknown quantities using
l
52
= u
23
/u
33
l
53
= (u
24
u
33
u
23
u
34
)/(u
33
u
44
)
l
54
= (u
25
u
33
u
44
u
23
u
35
u
44
u
24
u
33
u
45
+u
23
u
34
u
45
)/(u
33
u
44
u
55
)
w
2
= w
2
w
3
u
23
u
33
w
4
u
24
u
33
u
23
u
34
u
33
u
44
w
5
u
25
u
33
u
44
u
23
u
35
u
44
u
24
u
33
u
45
+u
23
u
34
u
45
u
33
u
44
u
55
.
14 Linear Programming: Interior-Point Methods
Problem 14.1
The primal problem is
min
x
1
,x
2
x
1
s.t. x
1
+x
2
= 1
(x
1
, x
2
) 0,
so the KKT conditions are
F(x, , s) =
_
_
_
_
_
_
x
1
+x
2
1
+s
1
1
+s
2
x
1
s
1
x
2
s
2
_
_
_
_
_
_
= 0,
with (x
1
, x
2
, s
1
, s
2
) 0. The solution to the KKT conditions is
(x
1
, x
2
, s
1
, s
2
, ) = (0, 1, 1, 0, 0),
but F(x, , s) also has the spurious solution
(x
1
, x
2
, s
1
, s
2
, ) = (1, 0, 0, 1, 1).
52
Problem 14.2
(i) For any (x, , s) N
2
(
1
), we have
Ax = b (23a)
A
T
+s = c (23b)
x > 0 (23c)
s > 0 (23d)
|XSe e|
2
1
. (23e)
Given 0
1
<
2
< 1, equation (23e) implies
|XSe e|
2
1
<
2
. (24)
From equations (23a)(23d),(24), we have (x, , s) N
2
(
2
). Thus
N
2
(
1
) N
2
(
2
) when 0
1
<
2
< 1.
For any (x, , s) N
(
1
), we have
Ax = b (25a)
A
T
+s = c (25b)
x > 0 (25c)
s > 0 (25d)
x
i
s
i
1
, i = 1, 2, . . . , n. (25e)
Given 0 <
2
1
1, equation (25d) implies
x
i
s
i
1
2
. (26)
We have from equations (25a)(25d),(26) that (x, , s) N
(
2
).
This shows that N
(
1
) N
(
2
) when 0 <
2
1
1.
(ii) For any (x, , s) N
2
(), we have
Ax = b (27a)
A
T
+s = c (27b)
x > 0 (27c)
s > 0 (27d)
|XSe e|
2
. (27e)
53
Equation (27e) implies
n
i=1
(x
i
s
i
)
2
2
2
. (28)
Suppose that there exists some k 1, 2, . . . , n satisfying
x
k
s
k
< where 1 . (29)
We have
x
k
s
k
< (1 )
= x
k
s
k
< < 0
= (x
k
s
k
)
2
>
2
2
.
Obviously, this contradicts equation (28), so we must have x
k
s
k
for all k = 1, 2, . . . , n. This conclusion, together with equations (27a)
(27d), gives (x, , s) N
(). Therefore N
2
() N
() when
1 .
Problem 14.3
For ( x,
, s) ^
, s) T
0
, (30)
x
i
s
i
, i = 1, . . . , n. (31)
Therefore, for an arbitrary point (x, , s) T
0
we have (x, , s) ^
()
if and only if condition (31) holds. Notice that
x
i
s
i
x
i
s
i
nx
i
s
i
x
T
s
.
Therefore, the range of such that (x, , s) ^
_
x
1
s
1
x
2
s
2
2
_
2
+
_
x
2
s
2
x
1
s
1
2
_
2
>
_
x
1
s
1
+x
2
s
2
2
_
2
2(x
1
s
1
x
2
s
2
)
2
> (x
1
s
1
+x
2
s
2
)
2
2(x
1
s
1
x
2
s
2
) > x
1
s
1
+x
2
s
2
x
1
s
1
x
2
s
2
>
2 + 1
2 1
5.8284,
which holds, for example, when
x =
_
6
1
_
and s =
_
1
1
_
.
Problem 14.5
For (x, , s) ^
i=1
x
i
s
i
> n
x
T
s
n
> > ,
which is a contradiction. Therefore, x
i
s
i
= for i = 1, . . . , n. Along with
condition (32), this coincides with the central path (.
For (x, , s) ^
2
(0) the following conditions hold:
(x, , s) T
0
(34)
n
i=1
(x
i
s
i
)
2
0. (35)
55
If x
i
s
i
,= for some i = 1, . . . , n, then
n
i=1
(x
i
s
i
)
2
> 0,
which contradicts condition (35). Therefore, x
i
s
i
= for i = 1, . . . , n which,
along with condition (34), coincides with (.
Problem 14.7
Assuming
lim
x
i
s
i
0
= lim
x
i
s
i
0
x
T
s/n ,= 0,
i.e., x
k
s
k
> 0 for some k ,= i, we also have
lim
x
i
s
i
0
x
T
s ,= 0 and lim
x
i
s
i
0
log x
T
s ,= .
Consequently,
lim
x
i
s
i
0
= lim
x
i
s
i
0
_
log x
T
s
n
i=1
log x
i
s
i
_
= lim
x
i
s
i
0
log x
T
s lim
x
i
s
i
0
log x
1
s
1
lim
x
i
s
i
0
log x
n
s
n
= c lim
x
i
s
i
0
log x
i
s
i
= ,
as desired, where c is a nite constant.
Problem 14.8
First, assume the coecient matrix
M =
_
_
0 A
T
I
A 0 0
S 0 X
_
_
is nonsingular. Let
M
1
=
_
0 A
T
I
, M
2
=
_
A 0 0
, M
3
=
_
S 0 X
,
then the nonsingularity of M implies that the rows of M
2
are linearly inde-
pendent. Thus, A has full row rank.
56
Second, assume A has full row rank. If M is singular, then certain rows
of M can be expressed as a linear combination of its other rows. We denote
one of these such rows as row m. Since I, S, X are all diagonal matrices
with positive diagonal elements, we observe that m is neither a row of M
1
nor a row of M
3
. Thus m must be a row of M
2
. Due to the structure of
I, S, and X, m must be expressed as a linear combination of rows of M
2
itself. However, this contradicts our assumption that A has full row rank,
so M must be nonsingular.
Problem 14.9
According to the assumptions, the following equalities hold
Ax = 0 (36)
A
T
+ s = 0. (37)
Multiplying equation (36) on the left by
T
and equation (37) on the left
by x
T
yields
T
Ax = 0 (38)
x
T
A
T
+ x
T
s = 0. (39)
Subtracting equation (38) from (39) yields
x
T
s = 0,
as desired.
Problem 14.12
That AD
2
A
T
is symmetric follows easily from the fact that
_
AD
2
A
T
_
T
=
_
A
T
_
T
_
D
2
_
T
(A)
T
= AD
2
A
T
since D
2
is a diagonal matrix.
Assume that A has full row rank, i.e.,
A
T
y = 0 y = 0.
Let x ,= 0 be any vector in
m
and notice:
x
T
AD
2
A
T
x = x
T
ADDA
T
x
=
_
DA
T
x
_
T
_
DA
T
x
_
= v
T
v
= [[v[[
2
2
,
57
where v = DA
T
x is a vector in
m
. Due to the assumption that A has full
row rank it follows that A
T
x ,= 0, which implies v ,= 0 (since D is diagonal
with all positive diagonal elements). Therefore,
x
T
AD
2
A
T
x = [[v[[
2
2
> 0,
so the coecient matrix AD
2
A
T
is positive denite whenever A has full row
rank.
Now, assume that AD
2
A
T
is positive denite, i.e.,
x
T
AD
2
A
T
x > 0
for all nonzero x
m
. If some row of A could be expressed as a linear
combination of other rows in A, then A
T
y = 0 for some nonzero y
m
.
However, this would imply
y
T
AD
2
A
T
y =
_
y
T
AD
2
_ _
A
T
y
_
= 0,
which contradicts the assumption that AD
2
A
T
is positive denite. There-
fore, A must have full row rank.
Finally, consider replacing D by a diagonal matrix in which exactly m of
the diagonal elements are positive and the remainder are zero. Without loss
of generality, assume that the rst m diagonal elements of m are positive.
A real symmetric matrix M is positive denite if and only if there exists a
real nonsingular matrix Z such that
M = ZZ
T
. (40)
Notice that
C = AD
2
A
T
= (AD)(AD)
T
=
_
BD
t
_ _
BD
t
_
T
,
where B is the submatrix corresponding to the rst m columns of A and D
t
is the m m diagonal submatrix of D with all positive diagonal elements.
Therefore, according to (40), the desired results can be extended in this case
if and only if BD
t
is nonsingular, which is guaranteed if the resulting matrix
B has linearly independent columns.
58
Problem 14.13
A Taylor series approximation to H near the point (x, , s) is of the form:
_
x(),
(), s()
_
=
_
x(0),
(0), s(0)
_
+
_
x
t
(0),
t
(0), s
t
(0)
_
+
1
2
2
_
x
tt
(0),
tt
(0), s
tt
(0)
_
+ ,
where
_
x
(j)
(0),
(j)
(0), s
(j)
(0)
_
is the jth derivative of
_
x(),
(), s()
_
with respect to , evaluated at = 0. These derivatives can be deter-
mined by implicitly dierentiating both sides of the equality given as the
denition of H. First, notice that
_
x
t
(),
t
(), s
t
()
_
solves
_
_
0 A
T
I
A 0 0
S() 0
X()
_
_
_
_
x
t
()
t
()
s
t
()
_
_
=
_
_
r
c
r
b
XSe
_
_
. (41)
After setting = 0 and noticing that
X(0) = X and
S(0) = S, the linear
system in (41) reduces to
_
_
0 A
T
I
A 0 0
S 0 X
_
_
_
_
x
t
(0)
t
(0)
s
t
(0)
_
_
=
_
_
r
c
r
b
XSe
_
_
, (42)
which is exactly the system in (14.8). Therefore,
_
x
t
(0),
t
(0), s
t
(0)
_
=
_
x
a
,
a
, s
a
_
. (43)
Dierentiating (41) with respect to yields
_
_
0 A
T
I
A 0 0
S() 0
X()
_
_
_
_
x
tt
()
tt
()
s
tt
()
_
_
=
_
_
0
0
2
X
t
()
S
t
()e
_
_
. (44)
If we let (x
corr
,
corr
, s
corr
) be the solution to the corrector step, i.e.,
when the right-hand-side of (14.8) is replaced by
_
0, 0, X
a
S
a
e
_
,
then after setting = 0 and noting (43) we can see that
_
x
tt
(0),
tt
(0), s
tt
(0)
_
=
1
2
_
x
corr
,
corr
, s
corr
_
. (45)
59
Finally, dierentiating (44) with respect to yields
_
_
0 A
T
I
A 0 0
S() 0
X()
_
_
_
_
x
ttt
()
ttt
()
s
ttt
()
_
_
=
_
_
0
0
3
_
X
tt
()
S
t
() +
S
tt
()
X
t
()
_
e
_
_. (46)
Setting \alpha = 0 and noting (43) and (45), we find
    \begin{bmatrix} 0 & A^T & I \\ A & 0 & 0 \\ S & 0 & X \end{bmatrix}
    \begin{bmatrix} x'''(0) \\ \lambda'''(0) \\ s'''(0) \end{bmatrix}
    =
    \begin{bmatrix} 0 \\ 0 \\ -6 \left( \Delta X^{corr} \Delta S^{aff} + \Delta S^{corr} \Delta X^{aff} \right) e \end{bmatrix}.          (47)
In total, a Taylor series approximation to H is given by
    (x(\alpha), \lambda(\alpha), s(\alpha)) = (x, \lambda, s)
        + \alpha (\Delta x^{aff}, \Delta\lambda^{aff}, \Delta s^{aff})
        + \alpha^2 (\Delta x^{corr}, \Delta\lambda^{corr}, \Delta s^{corr})
        + \frac{\alpha^3}{3!} (x'''(0), \lambda'''(0), s'''(0)),
where (x'''(0), \lambda'''(0), s'''(0)) solves (47).
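One practical consequence of the expansion is that the affine-scaling step, the corrector step, and the third-order term of (47) all use the same coefficient matrix, so a single factorization can be reused for three right-hand sides. The following NumPy sketch illustrates this under assumed random data (the sizes, the feasible starting point, and the dense solve are choices made only for the illustration):

    import numpy as np

    rng = np.random.default_rng(1)
    m, n = 2, 4
    A = rng.standard_normal((m, n))
    x, s = rng.uniform(0.5, 1.5, n), rng.uniform(0.5, 1.5, n)
    lam = rng.standard_normal(m)
    b, c = A @ x, A.T @ lam + s            # make (x, lam, s) feasible so r_b = r_c = 0
    r_c, r_b = A.T @ lam + s - c, A @ x - b

    # Shared coefficient matrix of (42), (44), and (47).
    K = np.block([[np.zeros((n, n)), A.T, np.eye(n)],
                  [A, np.zeros((m, m)), np.zeros((m, n))],
                  [np.diag(s), np.zeros((n, m)), np.diag(x)]])

    def solve(r1, r2, r3):
        sol = np.linalg.solve(K, np.concatenate([r1, r2, r3]))
        return sol[:n], sol[n:n+m], sol[n+m:]

    zn, zm = np.zeros(n), np.zeros(m)
    dx_aff, dl_aff, ds_aff = solve(-r_c, -r_b, -x * s)                       # system (42)
    dx_cor, dl_cor, ds_cor = solve(zn, zm, -dx_aff * ds_aff)                 # corrector step
    dx3, dl3, ds3 = solve(zn, zm, -6 * (dx_cor * ds_aff + ds_cor * dx_aff))  # system (47)

    alpha = 0.1
    x_new = x + alpha * dx_aff + alpha**2 * dx_cor + alpha**3 / 6 * dx3      # Taylor prediction
    print(x_new)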
Problem 14.14
By introducing Lagrange multipliers \lambda for the equality constraints and s for the nonnegativity constraints, the Lagrangian function for this problem is given by
    L(x, y, \lambda, s) = c^T x + d^T y - \lambda^T (A_1 x + A_2 y - b) - s^T x.
Applying Theorem 12.1, the first-order necessary conditions state that for (x^*, y^*) to be a solution there must exist vectors \lambda and s such that
    A_1^T \lambda + s = c,                                       (48)
    A_2^T \lambda = d,                                           (49)
    A_1 x + A_2 y = b,                                           (50)
    x_i s_i = 0,   i = 1, \dots, n,                              (51)
    (x, s) \ge 0.                                                (52)
These conditions can be written compactly as
    \begin{bmatrix} A_1^T \lambda + s - c \\ A_2^T \lambda - d \\ A_1 x + A_2 y - b \\ XSe \end{bmatrix} = 0,          (53)
    (x, s) \ge 0.                                                (54)
Similar to the standard linear programming case, the central path is described by the system (48)-(52), where (51) is replaced by
    x_i s_i = \tau,   i = 1, \dots, n.
The Newton step equations for \tau = \sigma\mu are
    \begin{bmatrix} 0 & 0 & A_1^T & I \\ 0 & 0 & A_2^T & 0 \\ A_1 & A_2 & 0 & 0 \\ S & 0 & 0 & X \end{bmatrix}
    \begin{bmatrix} \Delta x \\ \Delta y \\ \Delta\lambda \\ \Delta s \end{bmatrix}
    =
    \begin{bmatrix} -r_c \\ -r_d \\ -r_b \\ -XSe + \sigma\mu e \end{bmatrix},          (55)
where
    r_b = A_1 x + A_2 y - b,   r_c = A_1^T \lambda + s - c,   and   r_d = A_2^T \lambda - d.
By eliminating \Delta s from (55), the augmented system is given by
    \begin{bmatrix} 0 & 0 & A_2^T \\ A_1 & A_2 & 0 \\ -D^{-2} & 0 & A_1^T \end{bmatrix}
    \begin{bmatrix} \Delta x \\ \Delta y \\ \Delta\lambda \end{bmatrix}
    =
    \begin{bmatrix} -r_d \\ -r_b \\ -r_c + s - \sigma\mu X^{-1} e \end{bmatrix},          (56)
    \Delta s = -s + \sigma\mu X^{-1} e - D^{-2} \Delta x,                                 (57)
where D = S^{-1/2} X^{1/2}.
We can eliminate \Delta x from (56) by noting that
    -D^{-2} \Delta x + A_1^T \Delta\lambda = -r_c + s - \sigma\mu X^{-1} e
    \quad\Longrightarrow\quad
    \Delta x = D^{2} (r_c - s + \sigma\mu X^{-1} e + A_1^T \Delta\lambda),
which yields the system
    \begin{bmatrix} 0 & A_2^T \\ A_2 & A_1 D^{2} A_1^T \end{bmatrix}
    \begin{bmatrix} \Delta y \\ \Delta\lambda \end{bmatrix}
    =
    \begin{bmatrix} -r_d \\ -r_b - A_1 D^{2} (r_c - s + \sigma\mu X^{-1} e) \end{bmatrix},          (58)
    \Delta x = D^{2} (r_c - s + \sigma\mu X^{-1} e + A_1^T \Delta\lambda),                          (59)
    \Delta s = -s + \sigma\mu X^{-1} e - D^{-2} \Delta x.                                           (60)
Unfortunately, there is no way to reduce this system any further in general: since the free variable y has no associated complementarity condition, \Delta y cannot be eliminated, so there is no way to create a system similar to the normal equations in (14.44).
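A small numerical check that the reduced sequence (58)-(60) reproduces the solution of the full system (55); the problem dimensions and random data below are assumptions made purely for the illustration.

    import numpy as np

    rng = np.random.default_rng(2)
    n, p, m = 4, 2, 3                      # dims of x, y, and the equality constraints (assumed)
    A1, A2 = rng.standard_normal((m, n)), rng.standard_normal((m, p))
    x, s = rng.uniform(0.5, 2.0, n), rng.uniform(0.5, 2.0, n)
    y, lam = rng.standard_normal(p), rng.standard_normal(m)
    c, d, b = rng.standard_normal(n), rng.standard_normal(p), rng.standard_normal(m)
    sigma_mu = 0.1
    e = np.ones(n)

    r_b = A1 @ x + A2 @ y - b
    r_c = A1.T @ lam + s - c
    r_d = A2.T @ lam - d

    # Full system (55).
    K = np.block([
        [np.zeros((n, n)), np.zeros((n, p)), A1.T,             np.eye(n)],
        [np.zeros((p, n)), np.zeros((p, p)), A2.T,             np.zeros((p, n))],
        [A1,               A2,               np.zeros((m, m)), np.zeros((m, n))],
        [np.diag(s),       np.zeros((n, p)), np.zeros((n, m)), np.diag(x)]])
    rhs = np.concatenate([-r_c, -r_d, -r_b, -x * s + sigma_mu * e])
    sol = np.linalg.solve(K, rhs)
    dx_full, dy_full = sol[:n], sol[n:n+p]
    dl_full, ds_full = sol[n+p:n+p+m], sol[n+p+m:]

    # Reduced sequence (58)-(60) with D^2 = X S^{-1}.
    D2 = np.diag(x / s)
    g = r_c - s + sigma_mu / x             # r_c - s + sigma*mu*X^{-1} e
    K2 = np.block([[np.zeros((p, p)), A2.T],
                   [A2, A1 @ D2 @ A1.T]])
    rhs2 = np.concatenate([-r_d, -r_b - A1 @ D2 @ g])
    sol2 = np.linalg.solve(K2, rhs2)
    dy, dl = sol2[:p], sol2[p:]
    dx = D2 @ (g + A1.T @ dl)
    ds = -s + sigma_mu / x - (s / x) * dx

    print(np.allclose(dx, dx_full), np.allclose(dy, dy_full),
          np.allclose(dl, dl_full), np.allclose(ds, ds_full))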
15 Fundamentals of Algorithms for Nonlinear Constrained Optimization
Problem 15.3
(a) The formulation is
        min  x_1 + x_2
        s.t. x_1^2 + x_2^2 = 2,
             0 \le x_1 \le 1,
             0 \le x_2 \le 1.
    This problem has only one feasible point, namely x_1 = x_2 = 1. Thus it has a solution at x_1^* = x_2^* = 1, and the optimal objective value is 2.
(b) The formulation is
        min  x_1 + x_2                        (61a)
        s.t. x_1^2 + x_2^2 \le 1,             (61b)
             x_1 + x_2 = 3.                   (61c)
    Substituting equation (61c) into (61b), we get
        x_1^2 + (3 - x_1)^2 \le 1,   which implies   x_1^2 - 3x_1 + 4 \le 0.
    This inequality has no solution (its discriminant is 9 - 16 < 0); thus the feasible region of the original problem is empty, and the problem has no solution.
(c) The formulation is
        min  x_1 x_2
        s.t. x_1 + x_2 = 2.
    Since the constraint of this problem is linear, we eliminate x_2 from the objective and obtain the unconstrained problem
        min  x_1 (2 - x_1) = -(x_1 - 1)^2 + 1.
    Obviously, as |x_1 - 1| \to +\infty, we have -(x_1 - 1)^2 + 1 \to -\infty. This shows that the original problem is unbounded below, hence it has no solution.
Problem 15.4
The optimization problem is
    min_{x,y}  x^2 + y^2
    s.t.  (x - 1)^3 = y^2.
If we eliminate x by writing it in terms of y, i.e., x = (y^2)^{1/3} + 1, then the above becomes the unconstrained problem
    min  f(y) = (y^{2/3} + 1)^2 + y^2.
Notice that f(y) \ge 1 = f(0), so the optimal solution of the unconstrained problem is y^* = 0, which corresponds to the optimal solution (x^*, y^*) = (1, 0) of the original problem.
Problem 15.5
With Y and Z defined as in the text, we have
    [Y \; Z] = \begin{bmatrix} B^{-1} & -B^{-1}N \\ 0 & I \end{bmatrix}
             = \begin{bmatrix} y_1 & y_2 & \cdots & y_m & z_1 & z_2 & \cdots & z_{n-m} \\ 0 & 0 & \cdots & 0 & e_1 & e_2 & \cdots & e_{n-m} \end{bmatrix}.
In order to see the linear independence of the columns of [Y \; Z], we consider
    k_1 \begin{bmatrix} y_1 \\ 0 \end{bmatrix} + k_2 \begin{bmatrix} y_2 \\ 0 \end{bmatrix} + \cdots + k_m \begin{bmatrix} y_m \\ 0 \end{bmatrix}
    + t_1 \begin{bmatrix} z_1 \\ e_1 \end{bmatrix} + t_2 \begin{bmatrix} z_2 \\ e_2 \end{bmatrix} + \cdots + t_{n-m} \begin{bmatrix} z_{n-m} \\ e_{n-m} \end{bmatrix} = 0.          (62)
The last (n - m) equations of (62) are in fact
    t_1 e_1 + t_2 e_2 + \cdots + t_{n-m} e_{n-m} = 0,
where e_j = (0, \dots, 0, 1, 0, \dots, 0)^T has its 1 in position j. Thus t_1 = t_2 = \cdots = t_{n-m} = 0. This shows that the first m equations of (62) reduce to
    k_1 y_1 + k_2 y_2 + \cdots + k_m y_m = 0.
Since y_1, \dots, y_m are the columns of the nonsingular matrix B^{-1}, they are linearly independent, so k_1 = k_2 = \cdots = k_m = 0. It follows that all coefficients in (62) vanish, which shows that the columns of [Y \; Z] form a linearly independent set and hence a basis of R^n.
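A quick numerical confirmation of this fact (the dimensions and the random B, N are assumptions for the illustration):

    import numpy as np

    rng = np.random.default_rng(3)
    m, n = 3, 7
    B = rng.standard_normal((m, m))          # nonsingular with probability 1
    N = rng.standard_normal((m, n - m))

    Binv = np.linalg.inv(B)
    Y = np.vstack([Binv, np.zeros((n - m, m))])   # columns of the form [y_i; 0]
    Z = np.vstack([-Binv @ N, np.eye(n - m)])     # columns of the form [z_j; e_j]
    YZ = np.hstack([Y, Z])

    print(np.linalg.matrix_rank(YZ) == n)    # True: the n columns form a basis of R^n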
Problem 15.6
Recall that A^T \Pi = Y R. Since \Pi is a permutation matrix, we know \Pi^T = \Pi^{-1}. Thus A = \Pi R^T Y^T. This gives
    A A^T = \Pi R^T Y^T Y R \Pi^T.          (63)
The matrix [Y \; Z] is orthogonal, so Y^T Y = I. Then (63) gives
    A A^T = \Pi R^T R \Pi^T
    \Longrightarrow (A A^T)^{-1} = \Pi R^{-1} R^{-T} \Pi^T
    \Longrightarrow A^T (A A^T)^{-1} = (Y R \Pi^T) \Pi R^{-1} R^{-T} \Pi^T
    \Longrightarrow A^T (A A^T)^{-1} = Y R^{-T} \Pi^T
    \Longrightarrow A^T (A A^T)^{-1} b = Y R^{-T} \Pi^T b.
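The identity can be verified numerically with a column-pivoted QR factorization of A^T; the SciPy routines and the random data below are assumptions made for the illustration, not part of the original solution.

    import numpy as np
    from scipy.linalg import qr, solve_triangular

    rng = np.random.default_rng(4)
    m, n = 3, 6
    A = rng.standard_normal((m, n))          # full row rank with probability 1
    b = rng.standard_normal(m)

    # Column-pivoted QR of A^T: (A^T)[:, piv] = Y @ R, i.e. A^T Pi = Y R.
    Y, R, piv = qr(A.T, mode='economic', pivoting=True)

    # x = Y R^{-T} Pi^T b, where Pi^T b simply permutes b according to piv.
    x = Y @ solve_triangular(R.T, b[piv], lower=True)

    # Compare with the minimum-norm solution A^T (A A^T)^{-1} b.
    x_ref = A.T @ np.linalg.solve(A @ A.T, b)
    print(np.allclose(x, x_ref))             # True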
Problem 15.7
(a) We denote the i-th column of the matrix
        Y = \begin{bmatrix} I \\ (B^{-1}N)^T \end{bmatrix}
    by y_i. Then
        \|y_i\|^2 = 1 + \|(B^{-1}N)^T e_i\|^2 \ge 1,
    so the columns of Y are in general no longer of unit norm. The same argument applies to the matrix
        Z = \begin{bmatrix} -B^{-1}N \\ I \end{bmatrix}.
    Furthermore,
        Y^T Z = \begin{bmatrix} I & B^{-1}N \end{bmatrix} \begin{bmatrix} -B^{-1}N \\ I \end{bmatrix} = -B^{-1}N + B^{-1}N = 0,
        A Z = \begin{bmatrix} B & N \end{bmatrix} \begin{bmatrix} -B^{-1}N \\ I \end{bmatrix} = -B B^{-1} N + N = 0.
    These relations show that the columns of Y and Z together form a linearly independent set and that Y, Z are valid basis matrices.
(b) We have from A = [B \; N] that
        A A^T = \begin{bmatrix} B & N \end{bmatrix} \begin{bmatrix} B^T \\ N^T \end{bmatrix} = B B^T + N N^T.
    Therefore,
        A Y = \begin{bmatrix} B & N \end{bmatrix} \begin{bmatrix} I \\ (B^{-1}N)^T \end{bmatrix}
            = B + N (B^{-1}N)^T = B + N N^T B^{-T} = (B B^T + N N^T) B^{-T} = (A A^T) B^{-T}.
    It follows that
        (A Y)^{-1} = B^T (A A^T)^{-1}
        \Longrightarrow Y (A Y)^{-1} = Y B^T (A A^T)^{-1}
        \Longrightarrow Y (A Y)^{-1} (A A^T) = Y B^T (A A^T)^{-1} (A A^T) = Y B^T
            = \begin{bmatrix} I \\ (B^{-1}N)^T \end{bmatrix} B^T = \begin{bmatrix} B^T \\ N^T B^{-T} B^T \end{bmatrix} = \begin{bmatrix} B^T \\ N^T \end{bmatrix} = A^T.
    This implies Y (A Y)^{-1} = A^T (A A^T)^{-1}. Thus Y (A Y)^{-1} b = A^T (A A^T)^{-1} b, which is the minimum-norm solution of Ax = b.
Problem 15.8
The new problem is:
    min  sin(x_1 + x_2) + x_3^2 + \frac{1}{3}\left( x_4 + x_5^4 + \frac{1}{2} x_6 \right)
    s.t. 8x_1 - 6x_2 + x_3 + 9x_4 + 4x_5 = 6,
         3x_1 + 2x_2 - x_4 + 6x_5 + 4x_6 = -4,
         3x_1 + 2x_3 \ge 1.
If we eliminate variables with (15.11),
    \begin{bmatrix} x_3 \\ x_6 \end{bmatrix}
    = - \begin{bmatrix} 8 & -6 & 9 & 4 \\ \frac{3}{4} & \frac{1}{2} & -\frac{1}{4} & \frac{3}{2} \end{bmatrix}
      \begin{bmatrix} x_1 \\ x_2 \\ x_4 \\ x_5 \end{bmatrix}
    + \begin{bmatrix} 6 \\ -1 \end{bmatrix},
the objective function turns out to be (15.12). Substituting (15.11) into the inequality constraint gives
    1 \le 3x_1 + 2(-8x_1 + 6x_2 - 9x_4 - 4x_5 + 6)
       = -13x_1 + 12x_2 - 18x_4 - 8x_5 + 12,
that is,
    -13x_1 + 12x_2 - 18x_4 - 8x_5 \ge -11,
which is exactly (15.23). Thus the problem reduces to minimizing the function (15.12) subject to (15.23).
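A short numerical check of the elimination and of the reduced inequality (15.23); the random values assigned to the remaining variables are an assumption made only for the illustration.

    import numpy as np

    rng = np.random.default_rng(5)
    x1, x2, x4, x5 = rng.standard_normal(4)

    # Eliminated variables from (15.11).
    x3 = 6 - 8*x1 + 6*x2 - 9*x4 - 4*x5
    x6 = -1 - 0.75*x1 - 0.5*x2 + 0.25*x4 - 1.5*x5

    # The two equality constraints hold identically.
    print(np.isclose(8*x1 - 6*x2 + x3 + 9*x4 + 4*x5, 6))
    print(np.isclose(3*x1 + 2*x2 - x4 + 6*x5 + 4*x6, -4))

    # The original inequality 3*x1 + 2*x3 >= 1 agrees with the reduced form (15.23).
    lhs_original = 3*x1 + 2*x3 - 1
    lhs_reduced = -13*x1 + 12*x2 - 18*x4 - 8*x5 + 11
    print(np.isclose(lhs_original, lhs_reduced))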
16 Quadratic Programming
Problem 16.1
(b) The optimization problem can be written as
        min_x  \frac{1}{2} x^T G x + d^T x
        s.t.   c(x) \ge 0,
    where
        G = \begin{bmatrix} -8 & -2 \\ -2 & -2 \end{bmatrix},
        d = \begin{bmatrix} -2 \\ -3 \end{bmatrix},
        c(x) = \begin{bmatrix} x_1 - x_2 \\ 4 - x_1 - x_2 \\ 3 - x_1 \end{bmatrix}.
    Defining A as the Jacobian of c, i.e., the matrix whose rows are the constraint gradients \nabla c_i(x)^T,
        A = \begin{bmatrix} 1 & -1 \\ -1 & -1 \\ -1 & 0 \end{bmatrix},
    we have the Lagrangian
        L(x, \lambda) = \frac{1}{2} x^T G x + d^T x - \lambda^T c(x)
    and its derivatives with respect to the x variables,
        \nabla_x L(x, \lambda) = G x + d - A^T \lambda   and   \nabla_{xx} L(x, \lambda) = G.
    Consider x = (a, a) \in R^2. It is easily seen that such an x is feasible for a \le 2 and that
        q(x) = -7a^2 - 5a \to -\infty   as   a \to -\infty.
    Therefore, the problem is unbounded below. Moreover, \nabla_{xx} L = G is negative definite, so no point satisfies the second-order necessary conditions and there are no local minimizers.
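A tiny numerical illustration of the unbounded feasible direction identified above (the data are the reconstructed G, d, and constraints; the chosen values of a are arbitrary):

    import numpy as np

    G = np.array([[-8.0, -2.0], [-2.0, -2.0]])
    d = np.array([-2.0, -3.0])

    def q(x):
        return 0.5 * x @ G @ x + d @ x

    def feasible(x):
        x1, x2 = x
        return x1 - x2 >= 0 and 4 - x1 - x2 >= 0 and 3 - x1 >= 0

    for a in [-1.0, -10.0, -100.0, -1000.0]:
        x = np.array([a, a])                 # the direction x = (a, a), feasible for a <= 2
        print(feasible(x), q(x))             # feasible, and q = -7a^2 - 5a decreases without bound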
Problem 16.2
The problem is:
    min_x  \frac{1}{2} (x - x_0)^T (x - x_0)
    s.t.   Ax = b.
The KKT conditions are:
    x^* - x_0 - A^T \lambda^* = 0,               (64)
    A x^* = b.                                   (65)
Multiplying (64) on the left by A yields
    A x^* - A x_0 - A A^T \lambda^* = 0.         (66)
Substituting (65) into (66), we find
    b - A x_0 = A A^T \lambda^*,
which implies
    \lambda^* = (A A^T)^{-1} (b - A x_0).        (67)
Finally, substituting (67) into (64) yields
    x^* = x_0 + A^T (A A^T)^{-1} (b - A x_0).    (68)
Consider the case where A \in R^{1 \times n}. Equation (68) gives
    x^* - x_0 = A^T (A A^T)^{-1} (b - A x_0) = \frac{1}{\|A\|_2^2} A^T (b - A x_0),
so the optimal objective value is
    f^* = \frac{1}{2} (x^* - x_0)^T (x^* - x_0)
        = \frac{1}{2} \left( \frac{1}{\|A\|_2^2} \right)^2 (b - A x_0)^T A A^T (b - A x_0)
        = \frac{1}{2} \frac{1}{\|A\|_2^4} \|A\|_2^2 (b - A x_0)^2
        = \frac{1}{2} \frac{(b - A x_0)^2}{\|A\|_2^2},
and the shortest distance from x_0 to the solution set of Ax = b is
    \sqrt{2 f^*} = \sqrt{\frac{(b - A x_0)^2}{\|A\|_2^2}} = \frac{|b - A x_0|}{\|A\|_2}.
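A quick check of (68) and of the distance formula in the single-constraint case (the random A, b, x_0 are assumptions made for the illustration):

    import numpy as np

    rng = np.random.default_rng(6)
    n = 5
    A = rng.standard_normal((1, n))          # a single linear equality constraint
    b = rng.standard_normal(1)
    x0 = rng.standard_normal(n)

    # Closest point to x0 in {x : Ax = b}, from (68).
    x_star = x0 + A.T @ np.linalg.solve(A @ A.T, b - A @ x0)

    print(np.allclose(A @ x_star, b))        # feasibility
    resid = (b - A @ x0).item()
    print(np.isclose(np.linalg.norm(x_star - x0),
                     abs(resid) / np.linalg.norm(A)))   # distance = |b - A x0| / ||A||_2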
Problem 16.6
First, we will show that the KKT conditions for problem (16.3) are satisfied by a point satisfying (16.4). The Lagrangian function for problem (16.3) is
    L(x, \lambda) = \frac{1}{2} x^T G x + d^T x - \lambda^T (Ax - b),
so the KKT conditions are
    G x + d - A^T \lambda = 0,
    A x = b.
A point (x^*, \lambda^*) satisfying these conditions therefore satisfies
    \begin{bmatrix} G & -A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} x^* \\ \lambda^* \end{bmatrix} = \begin{bmatrix} -d \\ b \end{bmatrix},
which is exactly the system given by (16.4).
Now assume that the reduced Hessian Z^T G Z is positive definite. The second-order conditions for (16.3) are satisfied if
    w^T \nabla_{xx} L(x^*, \lambda^*) w = w^T G w > 0
for all w \in \mathcal{C}(x^*, \lambda^*) with w \neq 0; here the critical cone \mathcal{C}(x^*, \lambda^*) is the null space of A. By definition, any such w can be written as w = Zu for some u \neq 0, so
    w^T G w = u^T Z^T G Z u > 0,
and the second-order conditions are satisfied.
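A sketch of solving an equality-constrained QP through this KKT system; the particular G, A, d, b below are arbitrary illustrative data, not taken from the exercise.

    import numpy as np

    # Illustrative data: G positive definite, A with full row rank.
    G = np.array([[6.0, 2.0, 1.0],
                  [2.0, 5.0, 2.0],
                  [1.0, 2.0, 4.0]])
    d = np.array([-8.0, -3.0, -3.0])
    A = np.array([[1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0]])
    b = np.array([3.0, 0.0])

    n, m = G.shape[0], A.shape[0]
    K = np.block([[G, -A.T],
                  [A, np.zeros((m, m))]])
    rhs = np.concatenate([-d, b])
    sol = np.linalg.solve(K, rhs)
    x_star, lam_star = sol[:n], sol[n:]

    print(x_star)                                        # solution of the QP
    print(np.allclose(A @ x_star, b))                    # feasibility
    print(np.allclose(G @ x_star + d, A.T @ lam_star))   # stationarity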
Problem 16.7
Let x = x^* + \alpha Z u with \alpha \neq 0. We find
    q(x) = q(x^* + \alpha Z u)
         = \frac{1}{2}(x^* + \alpha Z u)^T G (x^* + \alpha Z u) + d^T (x^* + \alpha Z u)
         = \frac{1}{2} x^{*T} G x^* + \alpha x^{*T} G Z u + \frac{1}{2}\alpha^2 u^T Z^T G Z u + d^T x^* + \alpha d^T Z u
         = q(x^*) + \frac{1}{2}\alpha^2 u^T Z^T G Z u + \alpha (x^{*T} G Z u + d^T Z u).
A point (x^*, \lambda^*) satisfying the KKT conditions gives
    0 = G x^* + d - A^T \lambda^*.
Taking the transpose and multiplying on the right by Zu, we find
    0 = x^{*T} G Z u + d^T Z u - \lambda^{*T} A Z u = x^{*T} G Z u + d^T Z u,
since AZ = 0. So in fact
    q(x) = q(x^*) + \frac{1}{2}\alpha^2 u^T Z^T G Z u.
If there exists a u such that u^T Z^T G Z u < 0, then q(x) < q(x^*) for every \alpha \neq 0, so x^* cannot be a local minimizer; it is only a stationary point of the quadratic program.
Problem 16.15
Suppose that there is a vector pair (x^*, \lambda^*) satisfying the KKT conditions, and let p = Zu be any vector in the null space of A. Then for any \alpha,
    A(x^* + \alpha p) = A x^* = b,
so that x^* + \alpha p is feasible, while
    q(x^* + \alpha p) = q(x^*) + \alpha p^T (G x^* + c) + \frac{1}{2}\alpha^2 p^T G p
                      = q(x^*) + \frac{1}{2}\alpha^2 u^T Z^T G Z u
                      \ge q(x^*),
where we have used the KKT condition G x^* + c = A^T \lambda^*, which gives
    p^T (G x^* + c) = u^T Z^T A^T \lambda^* = (A Z u)^T \lambda^* = 0,
together with the assumed positive semidefiniteness of the reduced Hessian Z^T G Z. Since every feasible point can be written as x^* + p with Ap = 0, the point x^* is a global minimizer.
For the quadratic program
    min_x  \frac{1}{2} x^T G x + d^T x
    s.t.   Ax \ge b,   \bar{A} x = \bar{b},
the KKT conditions are
    G x + d - A^T \lambda - \bar{A}^T v = 0,
    Ax - b \ge 0,
    \bar{A} x - \bar{b} = 0,
    [Ax - b]_i \lambda_i = 0,   i = 1, \dots, m,
    \lambda \ge 0,
where m is the number of inequality constraints. Introducing slack variables y = Ax - b yields
    G x + d - A^T \lambda - \bar{A}^T v = 0,
    Ax - y - b = 0,
    \bar{A} x - \bar{b} = 0,
    y_i \lambda_i = 0,   i = 1, \dots, m,
    (y, \lambda) \ge 0,
which can be expressed as
    F(x, y, \lambda, v) =
    \begin{bmatrix} G x + d - A^T \lambda - \bar{A}^T v \\ Ax - y - b \\ \bar{A} x - \bar{b} \\ \Lambda Y e \end{bmatrix} = 0,
    (y, \lambda) \ge 0.
The analog of (16.58) is
    \begin{bmatrix} G & -A^T & -\bar{A}^T & 0 \\ A & 0 & 0 & -I \\ \bar{A} & 0 & 0 & 0 \\ 0 & Y & 0 & \Lambda \end{bmatrix}
    \begin{bmatrix} \Delta x \\ \Delta\lambda \\ \Delta v \\ \Delta y \end{bmatrix}
    =
    \begin{bmatrix} -r_d \\ -r_b \\ -\bar{r}_b \\ -\Lambda Y e + \sigma\mu e \end{bmatrix},
where
    r_d = G x + d - A^T \lambda - \bar{A}^T v,   r_b = Ax - y - b,   and   \bar{r}_b = \bar{A} x - \bar{b}.
17 Penalty and Augmented Lagrangian Methods
Problem 17.1
The following equality-constrained problem
    min_x  x^3   s.t.   x = 0
has a (local) solution at x^* = 0, yet its quadratic penalty function
    Q(x; \mu) = x^3 + \frac{\mu}{2} x^2
is unbounded below for any value of \mu. The same difficulty occurs for the inequality-constrained problem
    min_x  x^3   s.t.   x \ge 0,
which also has a local solution at x^* = 0, but whose quadratic penalty function
    Q(x; \mu) = x^3 + \frac{\mu}{2} \bigl( \max(-x, 0) \bigr)^2
              = \begin{cases} x^3 & \text{if } x \ge 0, \\ x^3 + \frac{\mu}{2} x^2 & \text{if } x < 0, \end{cases}
is unbounded below for any value of \mu.
Problem 17.5
The penalty function and its gradient are
    Q(x; \mu) = -5x_1^2 + x_2^2 + \frac{\mu}{2}(x_1 - 1)^2
and
    \nabla Q(x; \mu) = \begin{bmatrix} (\mu - 10) x_1 - \mu \\ 2 x_2 \end{bmatrix},
respectively. For \mu = 1, the stationary point is (-1/9, 0), and the contours are shown in Figure 4.
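A short check of the stationary point for \mu = 1 (a plain NumPy evaluation; the helper function is introduced only for this illustration):

    import numpy as np

    def grad_Q(x, mu):
        """Gradient of Q(x; mu) = -5*x1^2 + x2^2 + (mu/2)*(x1 - 1)^2."""
        x1, x2 = x
        return np.array([-10.0 * x1 + mu * (x1 - 1.0), 2.0 * x2])

    mu = 1.0
    x_stat = np.array([-mu / (10.0 - mu), 0.0])   # solves (mu - 10)*x1 - mu = 0
    print(x_stat)                                  # [-1/9, 0]
    print(np.allclose(grad_Q(x_stat, mu), 0.0))    # True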
Problem 17.9
For Example 17.1, we know that x^* = (-1, -1) and \lambda^* = -\frac{1}{2}. The goal is to show that \phi_1(x; \sigma) does not have a local minimizer at (-1, -1) unless \sigma \ge |\lambda^*| = \frac{1}{2}.
We have from the definition of the directional derivative that, for any p = (p_1, p_2),
    D(\phi_1(x^*; \sigma); p) = \nabla f(x^*)^T p + \sigma \sum_{i \in \mathcal{E}} |\nabla c_i(x^*)^T p|
        = (p_1 + p_2) + \sigma |{-2}(p_1 + p_2)|
        = \begin{cases} (1 - 2\sigma)(p_1 + p_2) & \text{if } p_1 + p_2 < 0, \\ (1 + 2\sigma)(p_1 + p_2) & \text{if } p_1 + p_2 \ge 0. \end{cases}
[Figure 4: Contours of the quadratic penalty function Q(x; \mu) for \mu = 1, plotted over x_1, x_2 \in [-1.5, 1.5].]
It is easily seen that when \sigma < \frac{1}{2}, we can always choose p_1 + p_2 < 0 so that
    (1 - 2\sigma)(p_1 + p_2) < 0,
in which case p is a descent direction for \phi_1(\cdot; \sigma) at x^*; at a local minimizer, however, D(\phi_1(x^*; \sigma); p) \ge 0 must hold for all p. This shows that \phi_1(x; \sigma) does not have a local minimizer at x^* = (-1, -1) unless \sigma \ge \frac{1}{2}.
18 Sequential Quadratic Programming
Problem 18.4
When \theta_k \neq 1, we have
    \theta_k = \frac{0.8\, s_k^T B_k s_k}{s_k^T B_k s_k - s_k^T y_k},
where s_k^T y_k < 0.2\, s_k^T B_k s_k. Therefore,
    s_k^T r_k = s_k^T (\theta_k y_k + (1 - \theta_k) B_k s_k)
              = \theta_k (s_k^T y_k) + (1 - \theta_k) s_k^T B_k s_k
              = \frac{0.8\, s_k^T B_k s_k}{s_k^T B_k s_k - s_k^T y_k}\, s_k^T y_k
                + \frac{0.2\, s_k^T B_k s_k - s_k^T y_k}{s_k^T B_k s_k - s_k^T y_k}\, s_k^T B_k s_k
              = \frac{s_k^T B_k s_k}{s_k^T B_k s_k - s_k^T y_k}
                \left( 0.8\, s_k^T y_k + 0.2\, s_k^T B_k s_k - s_k^T y_k \right)
              = \frac{s_k^T B_k s_k}{s_k^T B_k s_k - s_k^T y_k}
                \left( 0.2\, s_k^T B_k s_k - 0.2\, s_k^T y_k \right)
              = 0.2\, s_k^T B_k s_k > 0.
This shows that the damped BFGS update satisfies (18.17).
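A small sketch of one damped update, confirming s_k^T r_k = 0.2 s_k^T B_k s_k when the damping is active; the random B_k, s_k, and the deliberately negative-curvature y_k are assumptions made for the illustration.

    import numpy as np

    rng = np.random.default_rng(7)
    n = 4
    M = rng.standard_normal((n, n))
    B = M @ M.T + np.eye(n)                  # a symmetric positive definite B_k
    s = rng.standard_normal(n)
    y = -rng.uniform(0.1, 1.0) * s           # chosen so that s^T y < 0.2 s^T B s (damping active)

    sBs, sy = s @ B @ s, s @ y
    theta = 1.0 if sy >= 0.2 * sBs else 0.8 * sBs / (sBs - sy)
    r = theta * y + (1.0 - theta) * (B @ s)  # damped secant vector

    print(np.isclose(s @ r, 0.2 * sBs))      # True when theta != 1
    print(s @ r > 0)                         # the curvature bound (18.17) holds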
Problem 18.5
We have
    c(x) = x_1^2 + x_2^2 - 1   and   \nabla c(x) = \begin{bmatrix} 2x_1 \\ 2x_2 \end{bmatrix},
so the linearized constraint at x_k is
    0 = c(x_k) + \nabla c(x_k)^T p = x_1^2 + x_2^2 - 1 + 2x_1 p_1 + 2x_2 p_2.
(a) At x_k = (0, 0), the constraint becomes
        0 = -1,
    which is incompatible: the linearization has no solution.
(b) At x_k = (0, 1), the constraint becomes
        0 = 2p_2,
    which has solutions of the form p = (q, 0), q \in R.
(c) At x_k = (0.1, 0.02), the constraint becomes
        0 = -0.9896 + 0.2 p_1 + 0.04 p_2,
    which has solutions of the form p = (4.948, 0) + q(0.2, -1), q \in R.
(d) At x_k = (-0.1, -0.02), the constraint becomes
        0 = -0.9896 - 0.2 p_1 - 0.04 p_2,
    which has solutions of the form p = (-4.948, 0) + q(0.2, -1), q \in R.
19 Interior-Point Methods for Nonlinear Programming
Problem 19.3
Define the vector function
    c(x) = D r(x),
where D is a diagonal scaling matrix with nonzero diagonal entries. The Jacobian corresponding to c(x) is
    A(x) = \begin{bmatrix} \nabla c_1(x)^T \\ \vdots \\ \nabla c_n(x)^T \end{bmatrix}
         = \begin{bmatrix} D_{11} \nabla r_1(x)^T \\ \vdots \\ D_{nn} \nabla r_n(x)^T \end{bmatrix}
         = D J(x).
Therefore, the Newton step p is obtained from the linear system
    D J(x) p = -D r(x),
which is equivalent to
    J(x) p = -r(x),
since D is nonsingular.
Problem 19.4
Eliminating the linear equation yields x_1 = 2 - x_2. Plugging this expression into the second equation shows that the solutions satisfy
    -3x_2^2 + 2x_2 + 1 = 0.                        (69)
Thus, the solutions are
    (x_1, x_2) \in \left\{ (1, 1), \left( \tfrac{7}{3}, -\tfrac{1}{3} \right) \right\}.
Similarly, multiplying the first equation by x_2 yields the system
    \begin{bmatrix} x_1 x_2 + x_2^2 - 2x_2 \\ x_1 x_2 - 2x_2^2 + 1 \end{bmatrix} = 0.
Subtracting the first equation from the second again yields (69), and the solutions remain unchanged.
Newton's method applied to the two systems yields the linear systems
    \begin{bmatrix} 1 & 1 \\ x_2 & x_1 - 4x_2 \end{bmatrix} d
    = - \begin{bmatrix} x_1 + x_2 - 2 \\ x_1 x_2 - 2x_2^2 + 1 \end{bmatrix}
and
    \begin{bmatrix} x_2 & x_1 + 2x_2 - 2 \\ x_2 & x_1 - 4x_2 \end{bmatrix} d
    = - \begin{bmatrix} x_1 x_2 + x_2^2 - 2x_2 \\ x_1 x_2 - 2x_2^2 + 1 \end{bmatrix}.
From the point x = (1, -1), the steps are found to be d = (4/3, 2/3) and d = (1/2, 1/2), respectively.
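The two Newton steps from x = (1, -1) can be reproduced directly; the small helper below is introduced only for this check.

    import numpy as np

    def newton_step(F, J, x):
        """One Newton step d solving J(x) d = -F(x)."""
        return np.linalg.solve(J(x), -F(x))

    # Original system: x1 + x2 - 2 = 0, x1*x2 - 2*x2^2 + 1 = 0.
    F1 = lambda x: np.array([x[0] + x[1] - 2.0, x[0]*x[1] - 2.0*x[1]**2 + 1.0])
    J1 = lambda x: np.array([[1.0, 1.0], [x[1], x[0] - 4.0*x[1]]])

    # System with the first equation multiplied by x2.
    F2 = lambda x: np.array([x[0]*x[1] + x[1]**2 - 2.0*x[1], x[0]*x[1] - 2.0*x[1]**2 + 1.0])
    J2 = lambda x: np.array([[x[1], x[0] + 2.0*x[1] - 2.0], [x[1], x[0] - 4.0*x[1]]])

    x0 = np.array([1.0, -1.0])
    print(newton_step(F1, J1, x0))   # approximately [4/3, 2/3]
    print(newton_step(F2, J2, x0))   # approximately [1/2, 1/2]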
Problem 19.14
For clarity, define
    U = \begin{bmatrix} W \\ 0 \\ 0 \\ 0 \end{bmatrix},
    V = \begin{bmatrix} W M^T \\ 0 \\ 0 \\ 0 \end{bmatrix},
and
    C = \begin{bmatrix} D & A^T \\ A & 0 \end{bmatrix},
where
    D = \begin{bmatrix} \xi I & 0 \\ 0 & \Sigma \end{bmatrix}
    and
    A = \begin{bmatrix} A_E & 0 \\ A_I & -I \end{bmatrix}.
It can easily be shown that
    C^{-1} = \begin{bmatrix}
        D^{-1} - D^{-1} A^T (A D^{-1} A^T)^{-1} A D^{-1} & D^{-1} A^T (A D^{-1} A^T)^{-1} \\
        (A D^{-1} A^T)^{-1} A D^{-1} & -(A D^{-1} A^T)^{-1}
    \end{bmatrix},
so the solution r of the primal-dual system (C + U V^T) r = s can be obtained via the Sherman-Morrison-Woodbury formula as
    r = (C + U V^T)^{-1} s = \left[ C^{-1} - C^{-1} U (I + V^T C^{-1} U)^{-1} V^T C^{-1} \right] s,
which requires only solutions of the system C v = b for various right-hand sides b.
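A generic sketch of the Sherman-Morrison-Woodbury solve, using a dense LU factorization of C as a stand-in for the structured solves with C; the block sizes and random data are assumptions made only for the illustration.

    import numpy as np
    from scipy.linalg import lu_factor, lu_solve

    rng = np.random.default_rng(8)
    N, k = 12, 3                                   # dimension of C and rank of the update
    M0 = rng.standard_normal((N, N))
    C = M0 + N * np.eye(N)                         # a comfortably nonsingular C
    U = rng.standard_normal((N, k))
    V = rng.standard_normal((N, k))
    s = rng.standard_normal(N)

    lu = lu_factor(C)                              # factor C once; reuse it for every solve with C
    Cinv_U = lu_solve(lu, U)                       # C^{-1} U      (k solves)
    Cinv_s = lu_solve(lu, s)                       # C^{-1} s      (1 solve)
    small = np.eye(k) + V.T @ Cinv_U               # I + V^T C^{-1} U   (k x k)
    r = Cinv_s - Cinv_U @ np.linalg.solve(small, V.T @ Cinv_s)

    print(np.allclose((C + U @ V.T) @ r, s))       # True: r solves (C + U V^T) r = s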