
CAS EC 505 Mathematics for Economics

IV. Optimization and Value Functions

Bjorn Persson

Boston University

Fall 2023



Quadratic Forms I

Definition
A quadratic form is a function f : R^n → R that maps a vector x ∈ R^n
into a real number r ∈ R:

f(x) = ∑_{i=1}^{n} ∑_{j=1}^{n} a_{ij} x_i x_j = x^T A x = r ∈ R

Here, A is a square matrix of coefficients

Quadratic forms are non-linear functions

Example
Suppose f : R^2 → R is a quadratic form:

f(x) = a_11 x_1^2 + a_12 x_1 x_2 + a_21 x_2 x_1 + a_22 x_2^2


Quadratic Forms II

In matrix form:

f(x) = [ x_1  x_2 ] [ a_11  a_12 ; a_21  a_22 ] [ x_1 ; x_2 ] = x^T A x

Arrange the coefficients to make A symmetric:

A = [ a_11              (a_12 + a_21)/2 ]
    [ (a_21 + a_12)/2   a_22            ]

Example
The function f : R^2 → R described by f(x) = x_1^2 + x_2^2 is a quadratic
form where:

A = [ 1  0 ]
    [ 0  1 ]
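As a quick numerical illustration of the symmetrization step, here is a small sketch (not part of the original slides; the coefficient matrix M is made up for the example):

```python
import numpy as np

# Any coefficient matrix M and its symmetrized version A = (M + M^T)/2
# generate the same quadratic form, since x^T M x = x^T M^T x for scalars.
M = np.array([[5.0, -4.0],
              [-6.0, -1.0]])      # arbitrary illustrative coefficients
A = 0.5 * (M + M.T)               # symmetric matrix of the same quadratic form

x = np.array([1.3, -0.7])
print(x @ M @ x, x @ A @ x)       # the two values coincide
```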


Quadratic Forms III
Example
Find the matrix of the quadratic form f : R^2 → R described by:

f(x) = 5x_1^2 - 10x_1x_2 - x_2^2

All quadratic forms f(x) can be represented by a symmetric matrix

Just like linear functions f(x) = Ax are completely specified by A, so
are quadratic functions f(x) = x^T A x

Every quadratic form maps x = 0 to zero:

f(0) = 0

Therefore we must consider non-zero vectors in order to distinguish
quadratic forms


Definiteness and Optima I

Definition
Consider an n × n symmetric matrix A and suppose x ≠ 0

A is positive definite if:

x^T A x > 0

and negative definite if:

x^T A x < 0

A is indefinite if:

x^T A x > 0 for some x
x^T A x < 0 for some x

Furthermore, A is positive (negative) semi-definite if:

x^T A x ≥ 0 (x^T A x ≤ 0)
Definiteness and Optima II

A definite matrix is also semi-definite

Example
Let f_1 : R → R be defined by f_1(x) = ax^2 and let f_2, f_3, f_4 : R^2 → R
be defined by:

f_2(x) = x_1^2 + x_2^2
f_3(x) = -x_1^2 - x_2^2
f_4(x) = x_1^2 + 2x_1x_2 + x_2^2

Maximization and minimization

A positive definite quadratic form is minimized at x = 0

A negative definite quadratic form is maximized at x = 0


Checking Definiteness I

Checking definiteness
Recall that the sign of the second-order derivative contains
information about maxima and minima

For functions f : R^n → R, the Hessian matrix consists of the
second-order partial derivatives

The Hessian matrix is square and symmetric

The multivariate analogue of the sign of the second derivative is the
definiteness of the Hessian

How can we determine the definiteness of a matrix?

We need the concept of the leading principal minor of a matrix


Checking Definiteness II

Principal submatrices
A principal submatrix of a square matrix A is the submatrix obtained
by deleting the same set of rows and columns (row i and column i) from A

If A is an n × n matrix and n - k rows and columns are deleted, the
k × k matrix obtained is a kth-order principal submatrix

Example
Let n = 3. What are the leading principal submatrices?

Leading principal submatrix

A leading principal submatrix of order k is obtained when the last
n - k rows and columns are deleted


Checking Definiteness III

Principal minor and leading principal minor

A (leading) principal minor is the determinant of a (leading)
principal submatrix (delete the same row(s) and column(s))

Notation
Let ∆_k denote a kth-order principal minor and let D_k denote the kth-order
leading principal minor

Example
Find the kth-order leading principal minors of A for n = 3


Checking Definiteness IV

Theorem (definiteness)
Leading principal minors determine definiteness:

A is positive definite iff D_k > 0 for k = 1, ..., n

A is negative definite iff (-1)^k D_k > 0 for k = 1, ..., n

Indefiniteness
If some of the leading principal minors are non-zero but both sign patterns
above fail, the matrix is indefinite

Theorem (semidefiniteness)
Principal minors determine semidefiniteness:

A is positive semidefinite iff ∆_k ≥ 0 for all principal minors of order k = 1, ..., n

A is negative semidefinite iff (-1)^k ∆_k ≥ 0 for all principal minors of order k = 1, ..., n
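The sign tests above are easy to mechanize. The following sketch (not from the slides; the function names are illustrative) computes the leading principal minors with NumPy and applies the strict definiteness test to the symmetric matrix of f(x) = 5x_1^2 - 10x_1x_2 - x_2^2:

```python
import numpy as np

def leading_principal_minors(A):
    """Return [D_1, ..., D_n], the determinants of the upper-left k x k blocks."""
    n = A.shape[0]
    return [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]

def classify_definiteness(A):
    """Apply the leading-principal-minor test for strict definiteness."""
    D = leading_principal_minors(A)
    if all(d > 0 for d in D):
        return "positive definite"
    if all((-1) ** (k + 1) * d > 0 for k, d in enumerate(D)):   # (-1)^k D_k > 0 for k = 1,...,n
        return "negative definite"
    return "not definite (check the principal minors for semidefiniteness)"

# Symmetric matrix of f(x) = 5x_1^2 - 10x_1x_2 - x_2^2
A = np.array([[5.0, -5.0],
              [-5.0, -1.0]])
print(leading_principal_minors(A))   # approximately [5.0, -30.0]
print(classify_definiteness(A))      # neither strict sign pattern holds: the form is indefinite
```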


Checking Definiteness V

Example
The quadratic form described by f_1(x) = x_1^2 + x_2^2 is positive definite
Hence, x = 0 is a minimum

Example
The quadratic form f_2(x) = -x_1^2 - x_2^2 is negative definite, so x = 0 is
a maximum

Example
The quadratic form f_3(x) = 5x_1^2 - 10x_1x_2 - x_2^2 is indefinite, so x = 0
is a saddle point

Example
What is the definiteness of the quadratic form f_4(x) = x_2^2?


Constrained Optimization of Quadratic Forms I

Constrained optimization
The definiteness of a quadratic form f(x) = x^T A x tells you whether x = 0
is a maximum or a minimum (or neither)

Example
If x = 0 is a strict global maximum, A is negative definite (example:
f(x) = -x_1^2 - x_2^2)

Example
Recall the indefinite form f(x) = 5x_1^2 - 10x_1x_2 - x_2^2. If the domain
X = R^2 is restricted to the subset X^C ⊂ R^2 defined by all (x_1, x_2)
such that 2x_1 - x_2 = 0, the function f can be rewritten in terms of x_1
only:

g(x_1) = 5x_1^2 - 20x_1^2 - 4x_1^2 = -19x_1^2


Constrained Optimization of Quadratic Forms II
This form is negative definite, with a strict global maximum at x = 0 on
the constrained set X^C = {(x_1, 2x_1) : x_1 ∈ R}

Example
Consider the quadratic form f(x) = ax_1^2 + bx_1x_2 + cx_2^2 constrained
to the set X^C ⊂ R^2 defined by all (x_1, x_2) such that αx_1 + βx_2 = 0

Substituting the constraint x_2 = -(α/β)x_1 into f we obtain:

f(x_1, -(α/β)x_1) = ax_1^2 + b x_1 (-(α/β)x_1) + c ((α/β)x_1)^2
                  = (1/β^2) (aβ^2 - bαβ + cα^2) x_1^2

Note that this restricted form is positive iff aβ^2 - bαβ + cα^2 > 0


Constrained Optimization of Quadratic Forms III

This expression can be written in terms of a determinant:

aβ^2 - bαβ + cα^2 = -det(Ā)

where:

Ā = [ 0   α    β   ]
    [ α   a    b/2 ]
    [ β   b/2  c   ]

The matrix Ā is the coefficient matrix of the quadratic form bordered by the
coefficients of the linear constraint

Hence, the quadratic form is positive definite on the constraint set iff
det(Ā) < 0, and it is negative definite iff det(Ā) > 0

Note the sign reversal: a negative determinant signals positive definiteness
on the constraint set
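As a check of the sign convention stated above, a short SymPy computation (a sketch, not from the slides) verifies that the bordered determinant equals the negative of the restricted coefficient:

```python
import sympy as sp

a, b, c, alpha, beta = sp.symbols("a b c alpha beta")

# Symmetric matrix of the quadratic form, bordered by the constraint coefficients (alpha, beta).
A_bar = sp.Matrix([[0,     alpha, beta ],
                   [alpha, a,     b / 2],
                   [beta,  b / 2, c    ]])

restricted_coeff = a * beta**2 - b * alpha * beta + c * alpha**2
print(sp.simplify(A_bar.det() + restricted_coeff))   # 0, i.e. det(A_bar) = -(a*beta^2 - b*alpha*beta + c*alpha^2)
```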


Unconstrained Optimization I
Optimal points
Let f : X ⊆ R^n → R be a twice continuously differentiable function
Multivariate optimization is a search for optimal vectors x^0 ∈ X ⊆ R^n

Definition
Let f : X ⊆ R^n → R be a function defined on an open set X. A
vector x^0 ∈ X is a global min of f on X if:

f(x^0) ≤ f(x) for all x ∈ X

A vector x^0 is a local min of f if there is an open ball B_ε(x^0) around
x^0 such that:

f(x^0) ≤ f(x) for all x ∈ B_ε(x^0)

Reversing the inequality signs gives the conditions for global and local
maxima
First-Order Necessary Conditions I

FONC
A necessary condition for an interior point x^0 ∈ X to be a maximum
or a minimum of f on X is that x^0 be a stationary point:

∇f(x^0) = 0

Example
Find the stationary points of the function f : R^2 → R defined by:

f(x) = x_1^3 + x_2^3 - 6x_1x_2

In order to classify the stationary points, we can often use the Hessian
matrix
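A sketch (not part of the slides) of how the stationary points in this example can be found symbolically:

```python
import sympy as sp

x1, x2 = sp.symbols("x1 x2", real=True)
f = x1**3 + x2**3 - 6 * x1 * x2

# FONC: set both partial derivatives to zero and solve the system.
gradient = [sp.diff(f, v) for v in (x1, x2)]
stationary = sp.solve(gradient, [x1, x2], dict=True)
print(stationary)   # the real solutions are (0, 0) and (2, 2)
```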


Second-Order Sufficient Conditions I

SOSC
Stationary points can be classified using the Hessian matrix
(of second-order partials)

Let f : X ⊆ R^n → R be as before and suppose that x^0 ∈ X is a
stationary point. Then:

if H(x^0) is positive definite, then x^0 is a (strict) local min
if H(x^0) is negative definite, then x^0 is a (strict) local max
if H(x^0) is indefinite, then x^0 is neither (a saddle point)

Why is this true?


Quadratic Approximations and the Hessian I

Affine approximations
Recall that an affine approximation g : R^n → R of a function
f : R^n → R around x^0 ∈ R^n is:

f(x) ≈ g(x) = a_0 + a_1 x

where a_0 ∈ R and a_1 ∈ R^n

The best affine approximation has a_0 = f(x^0) - ∇f(x^0) x^0
and a_1 = ∇f(x^0):

g(x) = f(x^0) + ∇f(x^0) (x - x^0)


Quadratic Approximations and the Hessian II

Quadratic approximations
A quadratic function Q : R^n → R that approximates f : R^n → R
around x^0 ∈ R^n is of the form:

f(x) ≈ Q(x) = a_0 + a_1 x + x^T A x

where A is an n × n matrix

The best quadratic approximation of f at x^0 should take the same
value and the same derivative as f at x^0

In addition, it should have the same curvature as f around x^0

The curvature term A is given by the Hessian matrix of f


Quadratic Approximations and the Hessian III

In other words, the best quadratic approximation of f around x^0 is:

Q(x) = f(x^0) + ∇f(x^0) (x - x^0) + (1/2) (x - x^0)^T H(x^0) (x - x^0)

This is the second-order Taylor expansion

Example
Find the best quadratic approximation of f : R^2 → R around
x^0 = (1, 1) where:

f(x) = x_1^3 + x_2^3
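As a worked check of this example (not in the original slides): ∇f(x) = (3x_1^2, 3x_2^2), so ∇f(1, 1) = (3, 3), and H(x) = diag(6x_1, 6x_2), so H(1, 1) = diag(6, 6). With f(1, 1) = 2, the approximation is Q(x) = 2 + 3(x_1 - 1) + 3(x_2 - 1) + 3(x_1 - 1)^2 + 3(x_2 - 1)^2.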


The Hessian and Second-Order Conditions I

The definiteness of the Hessian

Recall that if x^0 ∈ R^n is an interior stationary point of f : R^n → R,
then the gradient vanishes there

Therefore, we get:

f(x^0 + ∆x) - f(x^0) ≈ (1/2) (∆x)^T H(x^0) ∆x

If H(x^0) is positive definite, then:

f(x^0 + ∆x) > f(x^0)

for small ∆x ≠ 0. Therefore, x^0 is a (local) minimizer of f


The Hessian and Second-Order Conditions II

Theorem
Suppose that f : X ⊆ R^n → R is twice continuously differentiable and
that x^0 is a critical point of f, so that:

∇f(x^0) = 0

If, in addition:

H(x^0) is positive definite, then x^0 is a (strict) local min
H(x^0) is negative definite, then x^0 is a (strict) local max
H(x^0) is indefinite, then x^0 is a saddle point

Proof: Taylor expand f around x = x^0


The Hessian and Second-Order Conditions III

Note
The definiteness of the Hessian matrix involves the relationship
between all the second-order partials

Example
Suppose f : R^2 → R is defined by:

f(x) = 3x_1x_2 - x_1^2 - x_2^2

The second-order own partials are negative everywhere, but the
unique stationary point is not a maximum

Is the Hessian negative definite?
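A worked check (not in the original slides): the Hessian is H(x) = [ -2  3 ; 3  -2 ] at every x, with D_1 = -2 < 0 but D_2 = 4 - 9 = -5 < 0, so neither sign pattern from the definiteness test holds. H is indefinite, and the unique stationary point (0, 0) is a saddle point, even though f_11 = f_22 = -2 < 0 everywhere.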


The Hessian and Second-Order Conditions IV

Example
Classify the stationary points of f : R^2 → R defined by:

f(x) = x_1^3 + x_2^3 - 6x_1x_2
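A worked sketch (not in the original slides): the stationary points are (0, 0) and (2, 2), and H(x) = [ 6x_1  -6 ; -6  6x_2 ]. At (0, 0), det H = -36 < 0, so H is indefinite and (0, 0) is a saddle point. At (2, 2), D_1 = 12 > 0 and D_2 = 144 - 36 = 108 > 0, so H is positive definite and (2, 2) is a strict local min.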


Second-Order Necessary Conditions I

Semidefinite Hessian matrix

If the Hessian is only semidefinite, the second-order conditions are
necessary, not sufficient

Second-order necessary condition (SONC)

Suppose that x^0 ∈ R^n minimizes (maximizes) f. Then ∇f(x^0) = 0
and H(x^0) is positive (negative) semidefinite

Note that the converse statement is not valid


Global Extrema
When are extrema global?
In order to guarantee global extreme points, we need to impose
restrictions on the curvature of f

The following statements are equivalent:
i) f is convex on X
ii) H(x) is positive semidefinite for all x ∈ X
iii) if ∇f(x^0) = 0, then x^0 is a global min of f on X

Note the analogous case where f is concave and H(x) is everywhere
negative semidefinite

Example
Let f : R^2 → R and consider the program:

min_{x ∈ R^2} f(x) = (x_1 - 1)^2 - x_1x_2 + (x_2 - 1)^2
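A worked sketch for this program (not in the original slides): H(x) = [ 2  -1 ; -1  2 ] for all x, with D_1 = 2 > 0 and D_2 = 3 > 0, so f is convex on R^2. The FONC 2(x_1 - 1) - x_2 = 0 and 2(x_2 - 1) - x_1 = 0 gives x^0 = (2, 2), which is therefore a global minimizer.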


Constrained Optimization I

Nonlinear optimization programs

We will consider special cases of the general problem:

max / min_{x ∈ X} f(x)   where   X = {x ∈ R^n : g(x) = b, h(x) ≤ c}

where f : R^n → R, g : R^n → R^m, and h : R^n → R^p are differentiable,
and where b ∈ R^m and c ∈ R^p

Constraint set
The set X ⊆ R^n is a restriction on the domain defined by the implicit
relationships g(x) = b and h(x) ≤ c


Constrained Optimization II
A geometric approach
Suppose that the problem is:

max_{x ∈ X} f(x)   where   X = {x ∈ R^2 : g(x) = b}

for f, g : R^2 → R. What characterizes the optimal point?

Necessary condition (Lagrange)

Suppose that x^0 is a local optimum of f subject to g(x) = b,
and suppose that ∇g(x^0) ≠ 0

Then there exists a unique scalar λ ∈ R such that:

∇f(x^0) = λ ∇g(x^0)

holds at this optimum


Constrained Optimization III

Constraint qualification
The requirement ∇g(x^0) ≠ 0 is a restriction on the constraint - a
constraint qualification. If x^0 is the optimal point and if ∇g(x^0) = 0
(or if it is undefined), the Lagrange method will not identify it

Hence, we require that x^0 ∈ X^CQ, where:

X^CQ = {x ∈ R^2 : g(x) = b and ∇g(x) ≠ 0}

Example
Solve:

max_{x ∈ X} f(x) = x_1x_2   where   X = {x ∈ R^2 : x_1 + x_2 = 1}
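A worked sketch (not in the original slides): the condition ∇f(x^0) = λ ∇g(x^0) reads (x_2, x_1) = λ(1, 1), so x_1 = x_2 = λ; the constraint then gives x^0 = (1/2, 1/2) with λ = 1/2. Since ∇g = (1, 1) ≠ 0 everywhere, the CQ holds.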


Constrained Optimization IV
Example
Consider the problem:

min_{x ∈ X} f(x) = (x_1 - 1)^2 + x_2^2   where   X = {x ∈ R^2 : x_1^2 + x_2^2 = 4}

Example
Utility maximization:

max_{x ∈ X} u(x)   where   X = {x ∈ R^2_+ : px = w}

At an interior solution, Lagrange's theorem gives the optimal consumption
rule:

∇u(x^0) = λp

or:

u_1(x^0) / u_2(x^0) = MRS = p_1 / p_2
The Lagrange Function I
The Lagrange function
The FONC for the nonlinear program can be rewritten:

∇f(x^0) - λ ∇g(x^0) = 0

This is also the FONC for the function L(x, λ), which is defined as:

L(x, λ) = f(x) - λ [g(x) - b]

This is the Lagrange function, or the Lagrangian

The Lagrange function is a combination of the objective function f
and the constraint function g

A stationary point of L(x, λ) satisfies the FONC:

L_i(x^0, λ) = f_i(x^0) - λ g_i(x^0) = 0 for i = 1, 2
The Lagrange Function II

Example
Set up the Lagrange function for the utility maximization problem:

max_x u(x) = √(x_1x_2)

subject to:

x_1 + 2x_2 = 12

The FONC gives two equations in the three unknowns (x_1, x_2, λ), so we
use the constraint to solve the problem
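A sketch (not from the slides) solving the three-equation system - the two FONC plus the constraint - symbolically:

```python
import sympy as sp

x1, x2, lam = sp.symbols("x1 x2 lambda", positive=True)
L = sp.sqrt(x1 * x2) - lam * (x1 + 2 * x2 - 12)   # Lagrangian for the problem above

# FONC in x1 and x2, plus the budget constraint as the third equation.
eqs = [sp.diff(L, x1), sp.diff(L, x2), x1 + 2 * x2 - 12]
print(sp.solve(eqs, [x1, x2, lam], dict=True))    # x1 = 6, x2 = 3, and the associated lambda
```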


Example When Method Fails

Constraint qualification
If the CQ is not satisfied at the optimal point, the method does not
work

Example
Find the solution to the problem:

max_{x ∈ R^2} 2x_1 + 3x_2   subject to   √x_1 + √x_2 = 5

Is there a solution?

Procedure
Find the stationary points of the Lagrange function and check if there
are any admissible points where the CQ fails


More than one Equality Constraint I

Problem with more than one constraint

The extension to many equality constraints is straightforward

Formulation
Suppose the problem is:

max / min_x f(x)   subject to   g(x) = b

where f : R^n → R and g : R^n → R^m is a vector-valued function:

g^1(x_1, ..., x_n) = b_1
g^2(x_1, ..., x_n) = b_2
...
g^m(x_1, ..., x_n) = b_m


More than one Equality Constraint II

The Lagrange function is:

L(x, λ) = f(x) - λ [g(x) - b]
        = f(x) - ∑_{i=1}^{m} λ_i [g^i(x) - b_i]

Necessary condition
If x^0 ∈ R^n is optimal, and if x^0 ∈ X^CQ, then there exists a unique
vector λ ∈ R^m such that:

∇f(x^0) = λ Jg(x^0)


More than one Equality Constraint III

Here, Jg(x) is the Jacobian matrix whose rows are the constraint gradients:

Jg(x) = [ ∇g^1(x) ]
        [ ∇g^2(x) ]
        [   ...   ]
        [ ∇g^m(x) ]

Constraint qualification
The constraint qualification CQ states that at x^0, the gradients of the
constraint functions must be linearly independent (the Jacobian matrix
Jg(x^0) must have full rank)


More than one Equality Constraint IV

Example
Consider the program:

min_{x ∈ R^3} f(x) = x_1^2 + x_2^2 + x_3^2   subject to   x_1 + 2x_2 + x_3 = 30
                                                          2x_1 - x_2 - 3x_3 = 10

Example
Consider the problem:

min_{x ∈ R^3} f(x) = x_1^2 + x_2^2 + x_3^2   subject to   x_1^2 + x_2^2 = 1
                                                          x_1 + x_2 = √2

Specify the points in the constraint set where the CQ fails (if any)
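For the first program, the objective is quadratic and the constraints are linear, so the Lagrange first-order conditions form a linear system that can be solved directly. A numerical sketch (not from the slides):

```python
import numpy as np

# Solve  min x1^2 + x2^2 + x3^2  s.t.  x1 + 2x2 + x3 = 30,  2x1 - x2 - 3x3 = 10
# via the FONC  2x - Jg^T lambda = 0  together with  Jg x = b.
A = np.array([[1.0, 2.0, 1.0],
              [2.0, -1.0, -3.0]])   # Jacobian of the linear constraints
b = np.array([30.0, 10.0])

# Stack the FONC and the constraints into one linear system in (x, lambda).
KKT = np.block([[2 * np.eye(3), -A.T],
                [A, np.zeros((2, 2))]])
rhs = np.concatenate([np.zeros(3), b])
sol = np.linalg.solve(KKT, rhs)
x, lam = sol[:3], sol[3:]
print("x* =", x, " lambda* =", lam, " constraint check:", A @ x)
```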


Inequality Constraints I

Constraint set
Consider the problem:

max_x f(x)   subject to   h(x) ≤ c

where f, h : R^2 → R

Two cases
At the solution x^0 the constraint is either binding, h(x^0) = c, or slack,
h(x^0) < c

Geometric approach
Suppose that x^0 is optimal, that the constraint binds at x^0, and that
the CQ holds


Inequality Constraints II

By Lagrange's theorem, there is a unique multiplier µ ≥ 0 such that:

∇f(x^0) = µ ∇h(x^0)

The sign of µ matters

Suppose instead that y^0 is optimal, and that the constraint is not
binding at y^0

What must then be true at y^0?

If y^0 is optimal and h(y^0) < c, then a necessary condition for a
maximum is:

∇f(y^0) = 0

independent of the constraint


Inequality Constraints III

Complementary slackness
Both cases can be covered by requiring that:

µ [h(x) - c] = 0

at the candidate solution
This complementary slackness condition ensures that the last term is
eliminated from the Lagrange function at the optimal point,
regardless of whether the constraint is binding or slack there


Karush-Kuhn-Tucker I
Necessary condition for optimality
The problem is:

max_x f(x)   subject to   h(x) ≤ c

where f, h : R^n → R

Theorem (KKT first-order necessary condition)

Suppose that x^0 ∈ R^n maximizes f on the set h(x) ≤ c, and suppose
further that ∇h(x^0) ≠ 0 if h(x^0) = c

Then there exists a unique multiplier µ ≥ 0 such that:

∇f(x^0) = µ ∇h(x^0)
µ [h(x^0) - c] = 0
h(x^0) ≤ c
Karush-Kuhn-Tucker II

Note that if h(x^0) < c, then the constraint qualification is irrelevant

Note the similarities between the Lagrange and KKT necessary
conditions

Example
Suppose the problem is:

max_x f(x) = x_1^2 + x_2^2 + x_2 - 1   subject to   x_1^2 + x_2^2 ≤ 1

Is there a solution?

Will the constraint be binding at the solution?

What about the CQ?
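A worked sketch (not in the original slides): f is continuous on the compact disk x_1^2 + x_2^2 ≤ 1, so a maximum exists. Since f is convex (its Hessian is 2I), the maximum is attained on the boundary, so the constraint binds. On the circle, f(x) = x_1^2 + x_2^2 + x_2 - 1 = x_2, which is maximized at x^0 = (0, 1); the KKT condition (2x_1, 2x_2 + 1) = µ(2x_1, 2x_2) then gives µ = 3/2 ≥ 0, and the CQ holds because ∇h = (2x_1, 2x_2) ≠ 0 on the circle.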


KKT with More than One Constraint I

More inequality constraints

The generalization to problems with n > 2 variables and m > 1
constraints is straightforward

Constraint qualification
The CQ is concerned only with the binding constraints

The constraint qualification CQ states that if x^0 is optimal and if
h_i(x^0) = 0 for i = 1, ..., m and h_j(x^0) < 0 for j = m + 1, ..., p, where
p ≥ m, then the Jacobian of the binding constraints (the first m rows of
Jh(x^0)) must have rank m


Minimization problems I
Minimization
In equality-constrained problems the FONC only identifies stationary
points

Suppose that we seek to minimize f subject to an inequality
constraint:

min_x f(x)   subject to   h(x) ≤ c

where f, h : R^n → R

A geometric approach
Recall that if x^0 solves the maximization problem:

max_x f(x)   subject to   h(x) ≤ c

and h(x^0) = c, the gradients of f and h point in the same direction at x^0
Minimization problems II
If instead x^0 solves the minimization problem:

min_x f(x)   subject to   h(x) ≤ c

and h(x^0) = c, the gradients of f and h point in opposite directions:

∇f(x^0) = µ ∇h(x^0) with µ ≤ 0

Hence, the same FONC applies, but we insist that µ ≤ 0 in a minimization
problem

Non-negative multipliers
A more common approach is to keep the requirement that µ ≥ 0 by
maximizing the negative of f:

min_x f(x) subject to h(x) ≤ c   is equivalent to   max_x [-f(x)] subject to h(x) ≤ c


Minimization problems III

If x^0 is optimal, then there is a unique scalar µ ≥ 0 such that:

-∇f(x^0) = µ ∇h(x^0)

Alternatively, we can flip the constraint and write the problem in the
canonical minimization form:

min_x f(x)   subject to   -h(x) ≥ -c

Then we can apply the same FONC as for the canonical maximization problem:

max_x f(x)   subject to   h(x) ≤ c

Note that the FONC is not sufficient - the points that satisfy the
FONC may be saddle points or local maxima (if the constraint is not
binding)


Minimization problems IV
Example
Consider the problem:

min_{x ∈ R^2} x_1^2 + x_2^2   subject to   x_1 + x_2 ≥ 1

Canonical forms
The canonical forms are:

max_x f(x)   subject to   h(x) ≤ c

and:

min_x f(x)   subject to   h(x) ≥ c

The canonical forms have the same Lagrange function and FONC:

L(x, µ) = f(x) - µ [h(x) - c]
L_i(x^0, µ) = f_i(x^0) - µ h_i(x^0) = 0 for i = 1, ..., n, with µ ≥ 0
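A numerical sketch of the example above (not from the slides), using SciPy; note that SciPy's inequality convention is fun(x) ≥ 0, which matches the canonical minimization form h(x) ≥ c:

```python
import numpy as np
from scipy.optimize import minimize

# min x1^2 + x2^2  subject to  x1 + x2 >= 1, written as x1 + x2 - 1 >= 0.
objective = lambda x: x[0]**2 + x[1]**2
constraint = {"type": "ineq", "fun": lambda x: x[0] + x[1] - 1.0}

res = minimize(objective, x0=np.array([2.0, 0.0]), constraints=[constraint])
print(res.x)    # approximately [0.5, 0.5]
print(res.fun)  # approximately 0.5; the constraint binds at the solution
```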


Second-order Conditions I

Second-order conditions
Recall that the Hessian matrix provides information about the nature
of stationary points

In an unconstrained maximization problem, a sufficient condition for a
point x^0 to be a local maximum is that the gradient of f is zero and
that the Hessian matrix of f is negative definite at x^0:

x^T H(x^0) x < 0 for x ≠ 0

Second-order conditions in constrained problems

In a constrained problem, the same idea applies to the Lagrange
function


Second-order Conditions II

Suppose that the constrained problem is:

max_{x ∈ X} f(x)   where   X = {x ∈ R^n : g(x) = b ∈ R}

The (bordered) Hessian matrix is:

H̄(x, λ) = [ 0         Jg(x)          ]
           [ Jg(x)^T   ∇_x^2 L(x, λ)  ]

In the special case where f, g : R^2 → R the bordered Hessian is:

H̄ = [ 0    g_1            g_2           ]
     [ g_1  f_11 - λg_11   f_12 - λg_12  ]
     [ g_2  f_21 - λg_21   f_22 - λg_22  ]


Second-order Conditions III

Second-order theorem
Suppose that (x^0, λ) is a stationary point of L. Then if:

det H̄(x^0, λ) > 0

the bordered Hessian satisfies the condition for negative definiteness on the
constraint set at (x^0, λ), and that point is a local maximum

If instead:

det H̄(x^0, λ) < 0

the bordered Hessian satisfies the condition for positive definiteness on the
constraint set at (x^0, λ), and that point is a local minimum


Second-order Conditions IV

Example
Consider the example from before:

min_{x ∈ R^2} x_1^2 + x_2^2   subject to   x_1 + x_2 ≥ 5

with stationary point (x^0, λ) = (5/2, 5/2, 5). The determinant of the
bordered Hessian is:

det H̄(5/2, 5/2, 5) = det [ 0  1  1 ]
                          [ 1  2  0 ]  = -4
                          [ 1  0  2 ]

Since det H̄ < 0, the stationary point is a local minimum


Global Extreme Points I

Sufficient conditions for global extreme points

If the bordered Hessian satisfies the condition for negative (positive)
definiteness everywhere, the Lagrange function is concave (convex) in x

Let (x^0, λ) be a stationary point of L(x, λ)

If L(x, λ) is (strictly) concave in x on the constraint set, then x^0
is a (strict) global maximizer

If L(x, λ) is (strictly) convex in x on the constraint set, then x^0
is a (strict) global minimizer


Global Extreme Points II

Note the following:

If f is concave (convex), then -f is convex (concave)
The sum of convex functions is a convex function, and the sum of
concave functions is a concave function
If f is concave and the constraint functions are convex, then the
Lagrangian function is concave
If f is convex and the constraint functions are concave in x, then
L(x, λ) is convex in x


Kuhn-Tucker Formulation I

Non-negativity constraints
In applications, the variables are often restricted to be non-negative

Example
Let x ∈ R^n, f, g : R^n → R and consider the problem:

max_{x ∈ R^n} f(x)   subject to   g(x) = b and x ≥ 0

The associated Lagrange function is:

L(x, λ, µ) = f(x) - λ [g(x) - b] + µ^T x

where µ ∈ R^n is a vector of multipliers


Kuhn-Tucker Formulation II

First-order condition
Rather than explicitly stating the non-negativity constraints, they can
be accounted for in the FONC as follows:

L_i(x^0, λ) = f_i(x^0) - λ g_i(x^0) ≤ 0, with equality if x_i^0 > 0

This is known as the Kuhn-Tucker formulation


Proof of the Lagrange Multiplier Theorem I

Theorem (the Fredholm alternative)

Let a_i for i = 1, ..., m be linear functionals on X ⊆ R^n and let:

S = {x ∈ X : a_i x = 0 for i = 1, ..., m}

Suppose that a_0 is a linear functional such that a_0 x = 0 for all x ∈ S.
Then there exists a vector λ ∈ R^m such that:

a_0 x = ∑_{i=1}^{m} λ_i a_i x

Proof: Note that the orthogonal complement of S is:

S⊥ = {a ∈ R^n : a x = 0 for all x ∈ S}


Proof of the Lagrange Multiplier Theorem II

By definition, each a_i ∈ S⊥. Moreover, S⊥ is a subspace. Wlog
assume that {a_1, ..., a_l} form a maximal linearly independent subset of
{a_1, ..., a_m}. Then S has dimension n - l, which implies that S⊥ has
dimension l. Hence, the l linearly independent vectors a_1, ..., a_l span S⊥

Since a_0 x = 0 for all x ∈ S, we have that a_0 ∈ S⊥ = lin{a_1, ..., a_l}.
Hence, there exist scalars λ_i such that

a_0 = ∑_{i=1}^{l} λ_i a_i

or

a_0 x = ∑_{i=1}^{l} λ_i a_i x


Proof of the Lagrange Multiplier Theorem III

Setting λ_i = 0 for i = l + 1, ..., m, we obtain:

a_0 x = ∑_{i=1}^{m} λ_i a_i x

Corollary
Let A be an m × n matrix and let b ∈ R^m. Exactly one of the
following alternatives is true: either there exists a vector x ∈ R^n
such that (i): Ax = b, or else there exists a vector y ∈ R^m such that
(ii):

yA = 0
yb > 0


Proof of the Lagrange Multiplier Theorem IV

Proof: If both (i) and (ii) hold, then 0 = 0x = yAx = yb > 0, a
contradiction. Let S be the subspace spanned by the columns of A.
Alternative (i) states that b ∈ S. If this is not the case, then by the
strong Separating Hyperplane Theorem there is a nonzero vector y
that strongly separates b from the closed convex set S, that is,
yb > yz for all z ∈ S. Since S is a subspace, this forces yz = 0 for all
z ∈ S, and in particular for each column of A, so yA = 0 and yb > 0,
which is (ii).


Proof of the Lagrange Multiplier Theorem V
Proof of the Lagrange multiplier theorem
Consider the NLP:

max_{x ∈ X} f(x)

where f : R^n → R is C^1 and where:

X = {x ∈ R^n : g(x) = 0}

with g : R^n → R^m and g ∈ C^1

Theorem
Suppose x^0 is a local maximizer in the program above and that
ρ(Jg(x^0)) = m, i.e. the Jacobian of g has full rank at x^0. Then there
exists a unique vector λ ∈ R^m such that:

∇f(x^0) = ∑_{i=1}^{m} λ_i ∇g^i(x^0) = λ Jg(x^0)


Proof of the Lagrange Multiplier Theorem VI

Proof: A necessary condition for x^0 to be a local maximizer is that
Df(x^0) dx = 0 for all perturbations dx in:

S = {dx ∈ R^n : Dg^i(x^0) dx = 0 for i = 1, ..., m}

Note that Df(x^0) and the Dg^i(x^0) are linear functionals, so set Df(x^0) = a_0
and Dg^i(x^0) = a_i. By the Fredholm alternative there exist scalars λ_i
such that:

Df(x^0) = ∑_{i=1}^{m} λ_i Dg^i(x^0)


Parametric Optimization Problems I

Parametric problem
In economic applications, the optimization problem is often
(compactly) written:

max_{x ∈ X(a)} f(x)

where the constraint set X(a) ⊆ R^n depends on a vector a ∈ R^m of
parameters

Choice variables and parameters
The vector x collects the choice variables; the vector a collects the parameters


Parametric Optimization Problems II
Example
The n-good utility maximization problem:

max_{x ∈ X(p,w)} u(x)

where:

X(p, w) = {x ∈ R^n_+ : px ≤ w}

Here, (p, w) ∈ R^{n+1}_{++} are the parameters describing (constraining) the
consumer's environment

General parametric form

A parametric nonlinear maximization program can be stated:

max_{x ∈ X(a)} f(x, a)   st   g(x, a) ≤ b

Note that the solution x(a, b) will depend on the parameters
Value Functions I

Value Functions
Suppose that x(a) solves the following problem:

max_{x ∈ X(a)} f(x, a)

Definition
The value function v : A → R gives the optimal attainable value
of the objective function f as a function of the parameters a:

v(a) = max_{x ∈ X(a)} f(x, a) = f(x(a), a)

The value function v(a) describes how the value of the solution
varies with the parameters


Value Functions II

Some value functions in economics

The indirect utility function:

v(p, w) = max_{x ∈ X(p,w)} u(x) = u(x(p, w))

The consumer's expenditure function:

e(p, u) = min_{x ∈ X(p,u)} p^T x = p^T h(p, u)

The firm's cost function:

c(w, y) = min_{x ∈ X(w,y)} w^T x = w^T x(w, y)


Value Functions III

Example
Suppose f : R^2 → R is f(x, a) = ax^2. Find the value function:

v(a) = max_{-1 ≤ x ≤ 2} f(x, a)

Example
Let f : R^2 → R be f(x, a) = x^4 + a(x^2 - 1). Find the value
function:

v(a) = max_{-1 ≤ x ≤ 1} f(x, a)


Value Functions IV
Example
Let demand be given by f : R^2 → R defined by:

f(x, a) = 1 - (x - a)^2

where a ∈ [0, 1]. A seller chooses an optimal
location x ∈ [0, 1] to maximize profits:

max_{x ∈ [0,1]} π(x, a) = (p - c) f(x, a)

Assume for simplicity that p = 1 and that c = 0

What is the optimal location x(a)?

What is the value of locating optimally?

What if the seller is constrained to locate at the endpoints x ∈ {0, 1}?
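A numerical sketch of this example (not from the slides), comparing the freely chosen location with the endpoint-constrained one:

```python
import numpy as np

def demand(x, a):
    return 1.0 - (x - a) ** 2          # f(x, a) = 1 - (x - a)^2

def value_unconstrained(a, grid=np.linspace(0.0, 1.0, 1001)):
    profits = demand(grid, a)          # with p = 1 and c = 0, profit equals demand
    return grid[np.argmax(profits)], profits.max()

def value_endpoints(a):
    return max(demand(0.0, a), demand(1.0, a))

for a in (0.0, 0.3, 0.5, 1.0):
    x_opt, v = value_unconstrained(a)
    print(a, x_opt, v, value_endpoints(a))
# Locating freely, x(a) = a and the value is 1 for every a; restricted to the
# endpoints, the value drops to max(1 - a^2, 1 - (1 - a)^2), lowest at a = 1/2.
```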
Value Functions V
Value function in the Lagrange formulation
Suppose the problem:

max_{x ∈ X(a)} f(x, a)   st   g(x, a) = b

where f, g : R^n → R, has a unique solution x(a, b)

The value function is then:

v(a, b) = f(x(a, b), a) - λ [g(x(a, b), a) - b] = f(x(a, b), a)

Example
Find the value of the following utility maximization program:

max_{x ∈ X(p,w)} u(x) = √(x_1x_2)


Envelope Theorems I

An envelope theorem
Suppose that we have a parametric maximization problem:

max_{x ∈ X(a)} f(x, a)

where a ∈ R, with solution x(a) and value function:

v(a) = f(x(a), a)

If the value function is differentiable, one version of the envelope
theorem states that:

dv(a)/da = [∂f(x(a), a)/∂x] [dx(a)/da] + ∂f(x(a), a)/∂a = ∂f(x(a), a)/∂a

since ∂f/∂x = 0 at the interior optimum x(a)


Envelope Theorems II

Example
Suppose we want to minimize f : R × A → R given by:

f(x, a) = x^2 + 2ax - 2

How does the value respond to parametric changes?
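A worked sketch, assuming the objective is f(x, a) = x^2 + 2ax - 2 as written above: the minimizer is x(a) = -a, so v(a) = a^2 - 2a^2 - 2 = -a^2 - 2 and dv/da = -2a. The envelope theorem gives the same answer directly: ∂f/∂a = 2x, evaluated at x(a) = -a, equals -2a, with no need to differentiate x(a).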


Constrained Envelope Theorems I

The Lagrangian value function

Recall that the value function associated with the maximization
problem:

max_x f(x)   subject to   g(x) = a ∈ R

is given by the Lagrange function evaluated at the solution x(a):

v(a) = L(a) = f(x(a)) - λ [g(x(a)) - a] = f(x(a))

How does the optimal value of the problem change with a?

Envelope theorem
Differentiating the value function with respect to a gives:

v'(a) = f_x(x(a)) x'(a) - λ [g_x(x(a)) x'(a) - 1] = λ

since the terms in x'(a) cancel by the FONC f_x = λ g_x


Constrained Envelope Theorems II

The indirect effects through the solution x(a) are negligible

Example
Let u : R^2_+ → R be a differentiable utility function. Use the envelope
theorem to derive Roy's identity:

v(p, w) = max_{x ∈ B(p,w)} u(x)

The value function is:

L(p, w) = u(x(p, w)) - λ [p x(p, w) - w]


Constrained Envelope Theorems III

Differentiate L with respect to the parameters p_i and w:

L_{p_i}(p, w) = -λ x_i(p, w) for i = 1, 2
L_w(p, w) = λ

Dividing the two gives Roy's identity:

-L_{p_i}(p, w) / L_w(p, w) = x_i(p, w) for i = 1, 2


Lagrange Multipliers I

Interpretation of the multiplier

Let f : R^n → R and g : R^n → R^k be C^1 and consider:

max_{x ∈ X(b)} f(x)   st   g(x) = b

where b ∈ Θ ⊆ R^k is a vector of constants

The Lagrange function is:

L(x, λ) = f(x) - λ [g(x) - b]

The optimal value of f will in general depend on b:

v(b) = f(x(b)) - λ [g(x(b)) - b] = L(x(b), λ)


Lagrange Multipliers II
By the envelope theorem, an infinitesimal change in b causes a
change in the value function v(b) according to:

∇v(b) = λ

Suppose b_i changes marginally. Then the value changes by:

∂v/∂b_i = ∇_x L(x(b), λ) ∂x/∂b_i + λ_i = λ_i

since ∇_x L = 0 at the optimum

Alternatively, we have:

v(b + db) - v(b) ≈ λ db

Multipliers measure the penalty/reward for (marginal) changes in
the constraints

What if the constraint is slack?
The Value Function and Lagrangian Duality I
Primal and dual programs
Closely associated with any NLP primal problem there is another NLP
known as the dual problem

Under suitable convexity assumptions and constraint qualifications,
the primal and dual problems have the same value

Lagrange dual function

Suppose a primal problem:

max_{x ∈ X} f(x)

where f : R^n → R, g : R^n → R^p, h : R^n → R^m, and where:

X = {x ∈ R^n : g(x) = 0, h(x) ≤ 0}
The Value Function and Lagrangian Duality II
Suppose that the value of the problem is v

The Lagrange function is:

L(x, λ, µ) = f(x) - λg(x) - µh(x)

Consider the value of this problem as a function of the multipliers
(penalty weights):

v(λ, µ) = sup_{x ∈ X} L(x, λ, µ) = sup_{x ∈ X} {f(x) - λg(x) - µh(x)}

The function v(λ, µ) is called the Lagrange dual function and, for µ ≥ 0, it
produces an upper bound on the value of the problem:

v(λ, µ) = sup_{x ∈ X} L(x, λ, µ) ≥ L(x, λ, µ) ≥ f(x) for any x ∈ X (when µ ≥ 0)
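A small worked illustration (not from the slides; here the sup is taken over all x ∈ R, a common convention): take the primal max -(x - 2)^2 subject to x - 1 ≤ 0, with value v = -1 at x = 1. The dual function is v(µ) = sup_x {-(x - 2)^2 - µ(x - 1)} = µ^2/4 - µ, attained at x = 2 - µ/2. Minimizing over µ ≥ 0 gives µ = 2 and w = -1 = v, so strong duality holds in this concave problem.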


The Value Function and Lagrangian Duality III

Lagrange dual problem

The dual problem associated with the NLP above is:

min_{λ,µ} v(λ, µ) = min_{λ,µ} sup_{x ∈ X} {f(x) - λg(x) - µh(x)}

subject to µ ≥ 0

The solution gives the lowest upper bound on the NLP, and its value
is denoted by w

Note that L(x, λ, µ) is affine in (λ, µ), so the function v(λ, µ) is a
pointwise supremum of affine functions and is therefore convex


The Value Function and Lagrangian Duality IV
Weak duality
Weak duality states that the value of the primal problem cannot
exceed the value of the dual:

w ≥ v

This always holds

Strong duality
When w = v, strong duality obtains. Because w is an upper bound
on the optimal value v, if the primal and dual values coincide at some
(x, λ, µ), then that point must be optimal

The converse is false: (x, λ, µ) could be optimal with w > v

Strong duality and convexity
