
Solutions Chapter 4

SECTION 4.2

4.2.4 www

Problem correction: Assume that Q is symmetric and invertible. (This correction has been made
in the 2nd printing.)

Solution: We have

$$\text{minimize } f(x) = \tfrac{1}{2}\, x' Q x \quad \text{subject to } Ax = b.$$

Since $x^*$ is an optimal solution of this problem with associated Lagrange multiplier $\lambda^*$, we have

$$Ax^* = b \quad \text{and} \quad Qx^* + A'\lambda^* = 0. \tag{1}$$

We also have
$$q_c(\lambda) = \min_x L_c(x, \lambda),$$
where
$$L_c(x, \lambda) = \tfrac{1}{2}\, x' Q x + \lambda'(Ax - b) + \tfrac{c}{2}\|Ax - b\|^2.$$
One way of showing that $q_c(\lambda)$ has the given form is to view $q_c(\lambda)$ as the dual of the penalized problem
$$\text{minimize } \tfrac{1}{2}\, x' Q x + \tfrac{c}{2}\|Ax - b\|^2 \quad \text{subject to } Ax = b,$$
which is a quadratic programming problem. Note that $x^*$ is also a solution of this problem, so that the optimal value of the problem is $f^*$. Furthermore, by expanding the term $\|Ax - b\|^2$, the preceding problem is equivalent to
$$\text{minimize } \tfrac{1}{2}\, x'(Q + cA'A)x - cb'Ax + \tfrac{c}{2}\, b'b \quad \text{subject to } Ax = b.$$

Because $x^*$ is the unique solution of the original problem, $Q$ must be positive definite over the null space of $A$:
$$y' Q y > 0, \quad \forall \ y \neq 0, \ Ay = 0.$$
Then, similar to the proof of Lemma 3.2.1 (pg. 298), it can be seen that there exists some positive scalar $\bar c$ such that $Q + cA'A$ is positive definite for all $c \geq \bar c$, i.e.,
$$Q + cA'A > 0, \quad \forall \ c \geq \bar c. \tag{2}$$
By duality theory, there is no duality gap for the preceding problem [$q_c(\lambda^*) = f^*$], and according to Example 3.4.3 from Section 3.4, the function $q_c(\lambda)$ is quadratic in $\lambda$, so that the second order Taylor expansion is exact for all $\lambda$, i.e.,

$$q_c(\lambda) = f^* + \nabla q_c(\lambda^*)'(\lambda - \lambda^*) + \tfrac{1}{2}(\lambda - \lambda^*)' \nabla^2 q_c(\lambda^*)(\lambda - \lambda^*), \quad \forall \ \lambda \in \Re^m. \tag{3}$$

We now need to calculate $\nabla q_c(\lambda^*)$ and $\nabla^2 q_c(\lambda^*)$. We have
$$\nabla q_c(\lambda) = h\bigl(x(\lambda, c)\bigr),$$
$$\nabla^2 q_c(\lambda) = -\nabla h\bigl(x(\lambda, c)\bigr)' \Bigl[\nabla^2_{xx} L_c\bigl(x(\lambda, c), \lambda\bigr)\Bigr]^{-1} \nabla h\bigl(x(\lambda, c)\bigr),$$
where $h(x) = Ax - b$ and $x(\lambda, c)$ minimizes $L_c(x, \lambda)$. To find $x(\lambda, c)$, we can solve $\nabla_x L_c(x, \lambda) = 0$, which yields
$$Qx + A'\lambda + cA'(Ax - b) = 0 \quad \Longleftrightarrow \quad (Q + cA'A)x = cA'b - A'\lambda,$$
so that
$$x(\lambda, c) = (Q + cA'A)^{-1}(cA'b - A'\lambda), \quad \forall \ c \geq \bar c$$
[$(Q + cA'A)^{-1}$ exists, as implied by Eq. (2)]. Therefore
$$\nabla q_c(\lambda) = h\bigl(x(\lambda, c)\bigr) = A(Q + cA'A)^{-1}(cA'b - A'\lambda) - b, \quad \forall \ c \geq \bar c, \tag{4}$$

from which by using Eq. (1), it can be seen that

$$\nabla q_c(\lambda^*) = 0. \tag{5}$$

Moreover, we have
$$\nabla^2 q_c(\lambda) = -A(Q + cA'A)^{-1}A', \quad \forall \ \lambda \in \Re^m, \tag{6}$$
so that by using the preceding two relations in Eq. (3), we obtain
$$q_c(\lambda) = f^* - \tfrac{1}{2}(\lambda - \lambda^*)' A(Q + cA'A)^{-1} A' (\lambda - \lambda^*), \quad \forall \ \lambda \in \Re^m, \ \forall \ c \geq \bar c.$$
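As a quick numerical sanity check of this closed form (not part of the original solution), the following sketch builds a small random instance with $Q \succ 0$, so that $\bar c = 0$ and any $c > 0$ qualifies, and compares $q_c(\lambda)$ evaluated by direct minimization against the quadratic expression above. NumPy and the instance data are our own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 2
M = rng.standard_normal((n, n))
Q = M @ M.T + np.eye(n)          # symmetric positive definite, so any c > 0 works
A = rng.standard_normal((m, n))  # full row rank with probability 1
b = rng.standard_normal(m)
c = 1.0

# Solve the KKT system [Q A'; A 0][x; lam] = [0; b] for x* and lam* [cf. Eq. (1)].
K = np.block([[Q, A.T], [A, np.zeros((m, m))]])
sol = np.linalg.solve(K, np.concatenate([np.zeros(n), b]))
x_star, lam_star = sol[:n], sol[n:]
f_star = 0.5 * x_star @ Q @ x_star

def qc_direct(lam):
    # q_c(lam) = min_x L_c(x, lam), using the minimizer x(lam, c) derived above.
    x = np.linalg.solve(Q + c * A.T @ A, c * A.T @ b - A.T @ lam)
    r = A @ x - b
    return 0.5 * x @ Q @ x + lam @ r + 0.5 * c * r @ r

def qc_closed(lam):
    # q_c(lam) = f* - (1/2)(lam - lam*)' A (Q + cA'A)^{-1} A' (lam - lam*).
    d = lam - lam_star
    return f_star - 0.5 * d @ (A @ np.linalg.solve(Q + c * A.T @ A, A.T @ d))

lam = rng.standard_normal(m)
print(abs(qc_direct(lam) - qc_closed(lam)))   # ~1e-15: the two expressions agree
```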

(a) We have
$$\lambda^{k+1} = \lambda^k + c^k \nabla q_{c^k}(\lambda^k),$$
so that
$$\lambda^{k+1} - \lambda^* = \lambda^k - \lambda^* + c^k \nabla q_{c^k}(\lambda^k).$$

We now express $\nabla q_{c^k}(\lambda^k)$ in an equivalent form. In what follows, we assume that $c^k \geq \bar c$ for all $k$, so that $\nabla q_{c^k}(\lambda)$ is linear for all $k$ [cf. Eq. (4)]. By using the first order Taylor expansion, we obtain
$$\nabla q_c(\lambda) = \nabla q_c(\lambda^*) + \nabla^2 q_c(\lambda^*)(\lambda - \lambda^*), \quad \forall \ \lambda \in \Re^m,$$
and by using Eqs. (5) and (6), we have
$$\nabla q_c(\lambda) = -A(Q + cA'A)^{-1}A'(\lambda - \lambda^*), \quad \forall \ \lambda \in \Re^m.$$
Therefore
$$\lambda^{k+1} - \lambda^* = \lambda^k - \lambda^* - c^k A(Q + c^k A'A)^{-1}A'(\lambda^k - \lambda^*) = \bigl[I - c^k A(Q + c^k A'A)^{-1}A'\bigr](\lambda^k - \lambda^*),$$
and by applying the results of Section 1.3, we obtain
$$\|\lambda^{k+1} - \lambda^*\| \leq r^k \|\lambda^k - \lambda^*\|,$$

where
$$r^k = \max\bigl\{\,|1 - c^k E_{c^k}|, \ |1 - c^k e_{c^k}|\,\bigr\},$$
and $E_c$ and $e_c$ are the maximum and minimum eigenvalues of $A(Q + cA'A)^{-1}A'$.
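The contraction estimate can be observed directly. Below is a minimal sketch, assuming the same NumPy toy instance as in the previous snippet, with $Q \succ 0$ (so $\bar c = 0$) and a constant penalty $c^k = c$.

```python
import numpy as np

# Same toy instance as in the previous snippet.
rng = np.random.default_rng(0)
n, m, c = 5, 2, 1.0
M = rng.standard_normal((n, n)); Q = M @ M.T + np.eye(n)
A = rng.standard_normal((m, n)); b = rng.standard_normal(m)
K = np.block([[Q, A.T], [A, np.zeros((m, m))]])
lam_star = np.linalg.solve(K, np.concatenate([np.zeros(n), b]))[n:]

# r = max(|1 - c E_c|, |1 - c e_c|) from the eigenvalues of A (Q + cA'A)^{-1} A'.
eigs = np.linalg.eigvalsh(A @ np.linalg.solve(Q + c * A.T @ A, A.T))
r = max(abs(1 - c * eigs.max()), abs(1 - c * eigs.min()))

lam = np.zeros(m)
for k in range(25):
    x = np.linalg.solve(Q + c * A.T @ A, c * A.T @ b - A.T @ lam)  # x(lam^k, c)
    e0 = np.linalg.norm(lam - lam_star)
    lam = lam + c * (A @ x - b)                                    # gradient step on q_c
    e1 = np.linalg.norm(lam - lam_star)
    assert e1 <= r * e0 + 1e-10                                    # geometric decrease
print("rate bound r =", r, " final error =", np.linalg.norm(lam - lam_star))
```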

(b) The matrix identity of Appendix A,
$$(A + CBC')^{-1} = A^{-1} - A^{-1}C(B^{-1} + C'A^{-1}C)^{-1}C'A^{-1},$$
applied to $(Q + c^k A'A)^{-1}$ yields
$$(Q + c^k A'A)^{-1} = Q^{-1} - Q^{-1}A'\Bigl(\frac{1}{c^k}\, I + AQ^{-1}A'\Bigr)^{-1} AQ^{-1},$$
and so
$$A(Q + c^k A'A)^{-1}A' = AQ^{-1}A' - AQ^{-1}A'\Bigl(\frac{1}{c^k}\, I + AQ^{-1}A'\Bigr)^{-1} AQ^{-1}A'.$$
ck
Let $\gamma$ be an eigenvalue of $(AQ^{-1}A')^{-1}$. Using the facts that
$$\lambda = \{\text{eigenvalue of } A\} \ \Longleftrightarrow \ \frac{1}{\lambda} = \{\text{eigenvalue of } A^{-1}\},$$
$$\lambda = \{\text{eigenvalue of } A\} \ \Longleftrightarrow \ \lambda + c = \{\text{eigenvalue of } cI + A\},$$
we can see that
$$\frac{1}{\gamma} - \frac{1}{\gamma}\Bigl(\frac{1}{c} + \frac{1}{\gamma}\Bigr)^{-1}\frac{1}{\gamma} = \frac{1}{c + \gamma}$$
is an eigenvalue of $A(Q + cA'A)^{-1}A'$. Thus
$$r^k = \max_{1 \leq i \leq m}\left|1 - \frac{c^k}{\gamma_i + c^k}\right|.$$
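This eigenvalue identity is easy to confirm numerically. The following sketch (same illustrative instance as above) checks that the eigenvalues of $A(Q + cA'A)^{-1}A'$ are exactly $1/(\gamma_i + c)$.

```python
import numpy as np

# Same toy instance as above.
rng = np.random.default_rng(0)
n, m, c = 5, 2, 1.0
M = rng.standard_normal((n, n)); Q = M @ M.T + np.eye(n)
A = rng.standard_normal((m, n))

gammas = np.linalg.eigvalsh(np.linalg.inv(A @ np.linalg.solve(Q, A.T)))  # eigs of (AQ^{-1}A')^{-1}
lhs = np.sort(np.linalg.eigvalsh(A @ np.linalg.solve(Q + c * A.T @ A, A.T)))
rhs = np.sort(1.0 / (gammas + c))
print(np.allclose(lhs, rhs))   # True: the eigenvalues match 1/(gamma_i + c)
```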

(c) First, for the method to be defined, we need $c^k \geq \bar c$ for all $k$ sufficiently large. Second, for the method to converge, we need $r^k < 1$ for all $k$ sufficiently large. Thus
$$\left|1 - \frac{c}{\gamma_i + c}\right| < 1, \quad \forall \ i,$$
which is equivalent to
$$-2 < -\frac{c}{\gamma_i + c} < 0 \quad \text{or} \quad 0 < \frac{c}{\gamma_i + c} < 2.$$
Since $c > 0$, we must have $\gamma_i + c > 0$. Then solving the above inequality yields the threshold value
$$\hat c = \max\Bigl\{0, \ \max_{1 \leq i \leq m}\{-2\gamma_i\}\Bigr\}.$$
Hence, the overall threshold value is
$$c^* = \max\{\bar c, \hat c\}.$$
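To see the threshold in action, here is a tiny instance of our own construction (not from the text) with an indefinite $Q$ that is positive definite on the nullspace of $A$. Here $AQ^{-1}A' = -1$, so $\gamma = -1$ and $\hat c = 2$: the iteration diverges for $c < 2$ and converges for $c > 2$.

```python
import numpy as np

Q = np.diag([-1.0, 1.0])            # indefinite, but y'Qy > 0 on the nullspace of A
A = np.array([[1.0, 0.0]])
b = np.array([1.0])
lam_star = np.array([1.0])          # from Qx* + A'lam* = 0 with x* = (1, 0)

for c in (1.5, 3.0):                # below and above the threshold c_hat = 2
    lam = np.zeros(1)
    for k in range(30):
        x = np.linalg.solve(Q + c * A.T @ A, c * A.T @ b - A.T @ lam)
        lam = lam + c * (A @ x - b)
    # error blows up (factor 2 per step) for c = 1.5, shrinks (factor 0.5) for c = 3
    print(c, np.linalg.norm(lam - lam_star))
```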

4.2.5 www

Using the results of Exercise 4.2.4, updating the multipliers with
$$\lambda^{k+1} = \lambda^k + \alpha^k (Ax^k - b)$$
implies
$$\|\lambda^{k+1} - \lambda^*\| \leq \max_i \left|1 - \frac{\alpha^k}{\gamma_i + c^k}\right| \, \|\lambda^k - \lambda^*\|.$$
For the method to converge, we need, for $k > \bar k$,
$$\left|1 - \frac{\alpha^k}{\gamma_i + c^k}\right| \leq 1 - \epsilon, \quad \forall \ i,$$
or
$$\epsilon \leq \frac{\alpha^k}{\gamma_i + c^k} \leq 2 - \epsilon \tag{1}$$
for some $\epsilon > 0$. If $Q$ is positive definite and $c^k = c$ for all $k$, we have $\gamma_i > 0$ for all $i$, and if $\delta \leq \alpha^k \leq 2c$, the condition (1) is satisfied for $\epsilon \leq \min\{\delta, 2\gamma_i\}/(c + \gamma_i)$ for all $i$.
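A short sketch of this stepsize rule, again on the illustrative NumPy instance with $Q \succ 0$ and constant $c^k = c$; any fixed $\alpha^k = \alpha \in [\delta, 2c]$ should converge.

```python
import numpy as np

# Same toy instance as in the earlier snippets; x^k minimizes L_c(., lam^k).
rng = np.random.default_rng(0)
n, m, c = 5, 2, 1.0
M = rng.standard_normal((n, n)); Q = M @ M.T + np.eye(n)
A = rng.standard_normal((m, n)); b = rng.standard_normal(m)
K = np.block([[Q, A.T], [A, np.zeros((m, m))]])
lam_star = np.linalg.solve(K, np.concatenate([np.zeros(n), b]))[n:]

alpha = 1.8 * c                     # inside (0, 2c]
lam = np.zeros(m)
for k in range(100):
    x = np.linalg.solve(Q + c * A.T @ A, c * A.T @ b - A.T @ lam)
    lam = lam + alpha * (A @ x - b)            # lam^{k+1} = lam^k + alpha (Ax^k - b)
print(np.linalg.norm(lam - lam_star))          # shrinks geometrically toward 0
```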

4.2.9 www

In the logarithmic barrier method we have
$$x^k = \arg\min_{x \in S}\, \bigl\{f(x) + \epsilon^k B(x)\bigr\},$$
where $S = \{x \in X \mid g_j(x) < 0, \ j = 1, \ldots, r\}$ and $B(x) = -\sum_{j=1}^r \ln\bigl(-g_j(x)\bigr)$. Assuming that $f$ and $g_j$ are continuously differentiable, $x^k$ satisfies
$$\nabla f(x^k) + \epsilon^k \nabla B(x^k) = 0,$$
or equivalently
$$\nabla f(x^k) - \sum_{j=1}^r \frac{\epsilon^k}{g_j(x^k)}\, \nabla g_j(x^k) = 0.$$

Define $\mu_j^k = -\epsilon^k / g_j(x^k)$ for all $j$ and $k$. Then we have
$$\mu_j^k > 0, \quad \forall \ j = 1, \ldots, r, \ \forall \ k, \tag{1}$$
$$\nabla f(x^k) + \sum_{j=1}^r \mu_j^k \nabla g_j(x^k) = 0, \quad \forall \ k. \tag{2}$$

Suppose that $x^*$ is a limit point of the sequence $\{x^k\}$. Let $\{x^k\}_{k \in K}$ be a subsequence of $\{x^k\}$ converging to $x^*$, and let $A(x^*)$ be the index set of active constraints at $x^*$. Furthermore, for any $x$, let $\nabla g_A(x)$ be a matrix with columns $\nabla g_j(x)$ for $j \in A(x^*)$, and $\nabla g_R(x)$ be a matrix with columns $\nabla g_j(x)$ for $j \notin A(x^*)$. Similarly, we partition a vector $\mu$: $\mu_A$ is a vector with coordinates $\mu_j$ for $j \in A(x^*)$, and $\mu_R$ is a vector with coordinates $\mu_j$ for $j \notin A(x^*)$. Then Eq. (2) is equivalent to
$$\nabla f(x^k) + \nabla g_A(x^k)\mu_A^k + \nabla g_R(x^k)\mu_R^k = 0, \quad \forall \ k. \tag{3}$$

If $j \notin A(x^*)$, then $g_j(x^k) < -\delta$ for some positive scalar $\delta$ and for all large enough $k \in K$, which guarantees the boundedness of the sequence $\{-1/g_j(x^k)\}_K$. Since $\epsilon^k \to 0$, we have
$$\lim_{k \to \infty, \ k \in K} \mu_j^k = -\lim_{k \to \infty, \ k \in K} \frac{\epsilon^k}{g_j(x^k)} = 0, \quad \forall \ j \notin A(x^*),$$
i.e., $\{\mu_R^k\}_K \to 0$. Therefore, by continuity of $\nabla g_j$, we have
$$\lim_{k \to \infty, \ k \in K} \nabla g_R(x^k)\mu_R^k = 0. \tag{4}$$

Suppose now that $x^*$ is a regular point, i.e., the gradients $\nabla g_j(x^*)$ for $j \in A(x^*)$ are linearly independent, so that the matrix $\nabla g_A(x^*)' \nabla g_A(x^*)$ is invertible. Then, by continuity of $\nabla g_j$, the matrix $\nabla g_A(x^k)' \nabla g_A(x^k)$ is invertible for all sufficiently large $k \in K$. Premultiplying Eq. (3) by $\bigl(\nabla g_A(x^k)' \nabla g_A(x^k)\bigr)^{-1} \nabla g_A(x^k)'$ gives
$$\mu_A^k = -\bigl(\nabla g_A(x^k)' \nabla g_A(x^k)\bigr)^{-1} \nabla g_A(x^k)' \bigl(\nabla f(x^k) + \nabla g_R(x^k)\mu_R^k\bigr).$$
By letting $k \to \infty$ over $k \in K$, and by using the continuity of $\nabla f$ and $\nabla g_j$ and the relation (4), we obtain
$$\lim_{k \to \infty, \ k \in K} \mu_A^k = -\bigl(\nabla g_A(x^*)' \nabla g_A(x^*)\bigr)^{-1} \nabla g_A(x^*)' \nabla f(x^*).$$

Define $\mu^*$ by $\mu_R^* = 0$ and
$$\mu_A^* = \lim_{k \to \infty, \ k \in K} \mu_A^k,$$
so that by letting $k \to \infty$ with $k \in K$, from Eq. (3) we have
$$\nabla f(x^*) + \nabla g_A(x^*)\mu_A^* + \nabla g_R(x^*)\mu_R^* = \nabla f(x^*) + \nabla g(x^*)\mu^* = 0.$$
In view of Eq. (1), $\mu^*$ must be nonnegative, so that $\mu^*$ is a Lagrange multiplier. Thus, if $x^*$ is a limit point of the sequence $\{x^k\}$, the regularity of $x^*$ is sufficient to ensure the convergence of $\{\mu_j^k\}$ to corresponding Lagrange multipliers.

By Prop. 4.1.1, every limit point of $\{x^k\}$ is a global minimum of the original problem. Hence, for the convergence of $\{\mu_j^k\}$ to corresponding Lagrange multipliers, it is sufficient that every global minimum of the original problem is regular.
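As an illustration, consider the one-dimensional problem minimize $(x-2)^2$ subject to $g(x) = x - 1 \leq 0$ (our own example, not from the text), whose solution is $x^* = 1$ with Lagrange multiplier $\mu^* = 2$. The barrier minimizer is available in closed form, and $\mu^k = -\epsilon^k / g(x^k)$ indeed tends to $\mu^*$.

```python
import math

for eps in (1.0, 0.1, 0.01, 0.001):
    # Barrier FOC: 2(x - 2) + eps/(1 - x) = 0  =>  1 - x = (-1 + sqrt(1 + 2 eps)) / 2
    u = (-1.0 + math.sqrt(1.0 + 2.0 * eps)) / 2.0
    x = 1.0 - u
    mu = eps / u                       # mu^k = -eps / g(x^k) = eps / (1 - x^k)
    print(f"eps={eps:7.3f}  x^k={x:.6f}  mu^k={mu:.6f}")
# x^k -> 1 and mu^k -> 2 as eps -> 0, as the solution asserts.
```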

4.2.11 www

Consider first the case where $f$ is quadratic, $f(x) = \tfrac{1}{2}\, x' Q x$ with $Q$ positive definite and symmetric, and $h$ is linear, $h(x) = Ax - b$, with $A$ having full rank. Following the hint, the iteration $\lambda^{k+1} = \lambda^k + \alpha h(x^k)$ can be viewed as the method of multipliers for the problem
$$\text{minimize } \tfrac{1}{2}\, x' Q x - \tfrac{\alpha}{2}\|Ax - b\|^2 \quad \text{subject to } Ax - b = 0.$$

According to Exercise 4.2.4(c), this method converges if $\alpha > \bar\alpha$, where the threshold value $\bar\alpha$ is
$$\bar\alpha = 0 \quad \text{if } \zeta \geq 0, \tag{1}$$
$$\bar\alpha = -2\zeta \quad \text{if } \zeta < 0, \tag{2}$$
where $\zeta$ is the minimum eigenvalue of the matrix
$$\bigl(A(Q - \alpha A'A)^{-1}A'\bigr)^{-1}.$$

To calculate $\zeta$, we use the matrix identity
$$\alpha A(Q - \alpha A'A)^{-1}A' = (I - \alpha AQ^{-1}A')^{-1} - I$$
of Section A.3 in Appendix A. If $\zeta_1, \ldots, \zeta_m$ are the eigenvalues of $\bigl(A(Q - \alpha A'A)^{-1}A'\bigr)^{-1}$, we have
$$\frac{\alpha}{\zeta_i} = \frac{1}{1 - \alpha \xi_i^{-1}} - 1,$$
where $\xi_i$ are the eigenvalues of $(AQ^{-1}A')^{-1}$. This equation can be written as
$$\frac{\alpha}{\zeta_i} = \frac{\alpha}{\xi_i - \alpha},$$
from which
$$\zeta_i = \xi_i - \alpha.$$

Let $\xi = \min\{\xi_1, \ldots, \xi_m\}$. Then the condition (1) is written as
$$0 < \alpha \leq \xi. \tag{3}$$
The condition (2) is written as
$$\alpha > 2(\alpha - \xi) \quad \text{with } \alpha > \xi,$$
or
$$\xi < \alpha < 2\xi. \tag{4}$$
Convergence is obtained under either condition (3) or (4), so we see that convergence is obtained for
$$0 < \alpha < 2\xi.$$

In the case where f is nonquadratic and/or h is nonlinear, a local version of the above
analysis applies.
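For the quadratic case, the range $0 < \alpha < 2\xi$ can be checked numerically. In this sketch (the same illustrative NumPy setup as before), $x^k$ minimizes the plain Lagrangian $\tfrac{1}{2}x'Qx + \lambda'(Ax - b)$, so $x^k = -Q^{-1}A'\lambda^k$; stepsizes just below and just above $2\xi$ behave as predicted.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 2
M = rng.standard_normal((n, n)); Q = M @ M.T + np.eye(n)
A = rng.standard_normal((m, n)); b = rng.standard_normal(m)
K = np.block([[Q, A.T], [A, np.zeros((m, m))]])
lam_star = np.linalg.solve(K, np.concatenate([np.zeros(n), b]))[n:]

# xi = smallest eigenvalue of (A Q^{-1} A')^{-1}.
xi = np.linalg.eigvalsh(np.linalg.inv(A @ np.linalg.solve(Q, A.T))).min()

def run(alpha, iters=200):
    lam = np.zeros(m)
    for _ in range(iters):
        x = np.linalg.solve(Q, -A.T @ lam)     # x^k = argmin_x L(x, lam^k)
        lam = lam + alpha * (A @ x - b)        # lam^{k+1} = lam^k + alpha h(x^k)
    return np.linalg.norm(lam - lam_star)

print(run(1.9 * xi), run(2.1 * xi))   # converges below 2*xi, diverges above
```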
