IV. Optimization
Bjorn Persson
Boston University
Fall 2023
Definition
A quadratic form is a function f : R^n → R that maps a vector x ∈ R^n
into a real number r ∈ R:

f(x) = ∑_{i=1}^n ∑_{j=1}^n a_ij xi xj = x^T A x = r ∈ R
Example
Suppose f : R^2 → R is a quadratic form. In matrix form:

f(x) = [x1 x2] [ a11  a12 ] [x1]  = x^T A x
               [ a21  a22 ] [x2]
Example
The function f : R^2 → R described by f(x) = x1^2 + x2^2 is a quadratic
form where:

A = [ 1  0 ]
    [ 0  1 ]

f(0) = 0 and x^T A x > 0 for all x ≠ 0, so A is positive definite
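A quick numerical check of this example (a minimal sketch using NumPy; the helper name `quadratic_form` is ours, not from the slides):

```python
import numpy as np

def quadratic_form(A, x):
    """Evaluate the quadratic form f(x) = x^T A x."""
    x = np.asarray(x, dtype=float)
    return float(x @ A @ x)

# A = I gives f(x) = x1^2 + x2^2: zero at x = 0 and positive otherwise
A = np.eye(2)
print(quadratic_form(A, [0, 0]))  # 0.0
print(quadratic_form(A, [3, 4]))  # 25.0
```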
Definiteness and Optima II
Example
Let f1 : R → R be defined by f1(x) = ax^2, and let f2, f3, f4 : R^2 → R
be defined by:
f2(x) = x1^2 + x2^2
f3(x) = x1^2 − x2^2
f4(x) = x1^2 + 2 x1 x2 + x2^2
Checking definiteness
Recall that the sign of the second-order derivative contains
information about maxima and minima
Principal submatrices
A principal submatrix of a square matrix A is a submatrix obtained
by deleting the same set of rows and columns (e.g. row i and column i)
from A
Example
Let n = 3. What are the leading principal submatrices?
Notation
Let ∆k denote the k-th order principal minor and let Dk denote the k-th
order leading principal minor
Example
Find the k-th order leading principal minors for A with n = 3
Indefiniteness
If some of the k determinants are non-zero, but the test fails both
sign patterns above, the matrix is indefinite
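The leading-principal-minor test can be sketched in code (our own helper; the tolerance and the labels are implementation choices, not part of the slides):

```python
import numpy as np

def classify_definiteness(A):
    """Classify a symmetric matrix by its leading principal minors D_1,...,D_n.

    D_k > 0 for all k        -> positive definite
    (-1)^k D_k > 0 for all k -> negative definite
    det A != 0, neither sign pattern -> indefinite
    """
    n = A.shape[0]
    D = [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]
    if all(d > 0 for d in D):
        return "positive definite"
    if all((-1) ** k * d > 0 for k, d in enumerate(D, start=1)):
        return "negative definite"
    if abs(D[-1]) > 1e-12:  # A is nonsingular but fails both patterns
        return "indefinite"
    return "inconclusive"   # singular case: check all principal minors instead

print(classify_definiteness(np.eye(2)))                          # positive definite
print(classify_definiteness(np.array([[-1., 0.], [0., -1.]])))   # negative definite
print(classify_definiteness(np.array([[5., -5.], [-5., -1.]])))  # indefinite
```

The last matrix has D1 = 5 > 0 but D2 = −30 < 0, so it fails both sign patterns while being nonsingular.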
Theorem (semidefiniteness)
The following theorem links leading principal minors to definiteness:
Example
The quadratic form f(x) = −x1^2 − x2^2 is negative definite, so x = 0 is
a maximum
Example
The quadratic form f3(x) = 5x1^2 − 10 x1 x2 − x2^2 is indefinite, so x = 0
is a saddle point
Example
What is the definiteness of the quadratic form f4(x) = x2^2?
Constrained optimization
The definiteness of a quadratic form f(x) = x^T A x tells you whether x = 0
is a maximum or a minimum (or neither)
Example
If x = 0 is a strict global maximum, A is negative definite (for example
f(x) = −x1^2 − x2^2)
Example
Recall the indefinite form f(x) = 5x1^2 − 10 x1 x2 − x2^2. If the domain
X = R^2 is restricted to the subset X_C ⊂ R^2 defined by all (x1, x2)
such that 2x1 − x2 = 0, the function f can be rewritten in terms of x1
only:

g(x1) = 5x1^2 − 20x1^2 − 4x1^2 = −19x1^2
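Reading the example's form as f(x) = 5x1^2 − 10 x1 x2 − x2^2 with constraint 2x1 − x2 = 0 (signs reconstructed from the computation above), a quick check that the restricted function is −19 x1^2:

```python
def f(x1, x2):
    # the indefinite quadratic form from the example
    return 5 * x1**2 - 10 * x1 * x2 - x2**2

def g(x1):
    # f restricted to the constraint set, where x2 = 2*x1
    return f(x1, 2 * x1)

# g(x1) = -19 x1^2: negative for every x1 != 0, so on the constraint
# set the form is negative definite and x = 0 is a maximum there
for x1 in (-2.0, -1.0, 0.5, 3.0):
    assert g(x1) == -19 * x1**2
print(g(1.0))  # -19.0
```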
Example
Consider the quadratic form f(x) = a x1^2 + b x1 x2 + c x2^2 constrained
to the set X_C ⊂ R^2 defined by all (x1, x2) such that αx1 + βx2 = 0,
where:

Ā = [ 0    β    α   ]
    [ β    a    b/2 ]
    [ α    b/2  c   ]

The matrix Ā is the coefficient matrix A bordered by the
coefficients of the linear constraint
Definition
Let f : X ⊆ R^n → R be a function defined on an open set X. A
vector x^0 ∈ X is a global min of f on X if:

f(x^0) ≤ f(x) for all x ∈ X

and a local min if:

f(x^0) ≤ f(x) for all x ∈ B_ε(x^0), for some ε > 0

Reversing the inequality signs gives conditions for global and local
maxima
First-Order Necessary Conditions I
FONC
A necessary condition for an interior point x^0 ∈ X to be a maximum
or minimum on X is that f be stationary:

∇f(x^0) = 0
Example
Find the stationary points of the function f : R^2 → R defined by:
To classify the critical points, we can often use the Hessian
matrix
SOSC
Stationary points can be classified using the Hessian matrix
(second-order partials)
Affine approximations
Recall that an affine approximation g : R^n → R of a function
f : R^n → R around x^0 ∈ R^n is:

f(x) ≈ g(x) = a0 + a1 · x

where a0 ∈ R and a1 ∈ R^n. In particular:

g(x) = f(x^0) + ∇f(x^0) · (x − x^0)
Quadratic approximations
A quadratic function Q : R^n → R that approximates f : R^n → R
around x^0 ∈ R^n is of the form:

f(x) ≈ Q(x) = a0 + a1 · x + x^T A x

where A is an n × n matrix. The second-order Taylor expansion is:

Q(x) = f(x^0) + ∇f(x^0) · (x − x^0) + (1/2) (x − x^0)^T H(x^0) (x − x^0)
Example
Find the best quadratic approximation of f : R^2 → R around
x^0 = (1, 1) where:

f(x) = x1^3 + x2^3
Therefore, at a stationary point we get:

f(x^0 + ∆x) − f(x^0) ≈ (1/2) (∆x)^T H(x^0) ∆x

If H(x^0) is positive definite, then:

f(x^0 + ∆x) > f(x^0)
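For the example f(x) = x1^3 + x2^3 around x^0 = (1, 1), the quadratic approximation can be checked numerically (a sketch; the gradient 3x^2 and Hessian diag(6x) are entered by hand):

```python
import numpy as np

def f(x):
    return x[0]**3 + x[1]**3

def Q(x, x0=np.array([1.0, 1.0])):
    """Second-order Taylor approximation of f around x0 = (1, 1)."""
    grad = 3 * x0**2        # gradient of f at x0: (3, 3)
    H = np.diag(6 * x0)     # Hessian of f at x0: diag(6, 6)
    d = x - x0
    return f(x0) + grad @ d + 0.5 * d @ H @ d

x = np.array([1.1, 0.95])
print(f(x), Q(x))  # the two agree up to an O(||x - x0||^3) error
assert abs(f(x) - Q(x)) < 1e-2
```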
Theorem
Suppose that f : X ⊆ R^n → R is twice continuously differentiable and
that x^0 is a critical point of f, so that:

∇f(x^0) = 0

If, in addition, H(x^0) is negative definite, then x^0 is a local maximum;
if H(x^0) is positive definite, then x^0 is a local minimum
Note
The definiteness of the Hessian matrix involves the relationship
between all the second-order partials
Example
Suppose f : R^2 → R is defined by:
Example
Classify the stationary points of f : R^2 → R defined by:
Example
Let f : R^2 → R and consider the program:

min_{x ∈ R^2} f(x) = (x1 − 1)^2 − x1 x2 + (x2 − 1)^2
max / min_{x ∈ X} f(x) where X = { x ∈ R^n : g(x) = b, h(x) ≤ c }

Constraint set
The set X ⊆ R^n is a restriction on the domain defined by the implicit
relationships g(x) = b and h(x) ≤ c
max_{x ∈ X} f(x) where X = { x ∈ R^2 : g(x) = b }

∇f(x^0) = λ ∇g(x^0)
Constraint qualification
The requirement ∇g(x^0) ≠ 0 is a restriction on the constraint, a
constraint qualification. If x^0 is the optimal point and ∇g(x^0) = 0
(or is undefined), the Lagrange method will not identify it

X_CQ = { x ∈ R^2 : g(x) = b and ∇g(x) ≠ 0 }
Example
Solve:

max_{x ∈ X} x1 x2 where X = { x ∈ R^2 : x1 + x2 = 1 }
Example
Utility maximization:

∇f(x^0) − λ ∇g(x^0) = 0

This is also the FONC for the function L(x, λ), which is defined as:

L(x, λ) = f(x) − λ [g(x) − b]

L_i(x^0, λ) = f_i(x^0) − λ g_i(x^0) = 0 for i = 1, 2
The Lagrange Function II
Example
Set up the Lagrange function for the utility maximization problem:

max_x u(x) = √(x1 x2)

subject to:

x1 + 2 x2 = 12

The FONC gives two equations in three unknowns (x1, x2, λ), so we
use the constraint to solve the problem
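The FONC can be solved by hand and verified numerically (a sketch; the candidate solution below follows from dividing the two first-order equations):

```python
import math

# FONC from L(x, lam) = sqrt(x1*x2) - lam*(x1 + 2*x2 - 12):
#   x2 / (2*sqrt(x1*x2)) = lam
#   x1 / (2*sqrt(x1*x2)) = 2*lam
# Dividing them gives x2/x1 = 1/2, i.e. x1 = 2*x2; the constraint
# x1 + 2*x2 = 12 then yields x2 = 3, x1 = 6.
x1, x2 = 6.0, 3.0
u = math.sqrt(x1 * x2)
lam = x2 / (2 * math.sqrt(x1 * x2))

assert x1 + 2 * x2 == 12                                     # feasible
assert abs(x1 / (2 * math.sqrt(x1 * x2)) - 2 * lam) < 1e-12  # second FONC holds
print(round(u, 4), round(lam, 4))  # 4.2426 0.3536
```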
Constraint qualification
If the CQ is not satisfied at the optimal point, the method does not
work
Example
Find the solution to the problem:

max_{x ∈ R^2} 2x1 + 3x2 subject to: √x1 + √x2 = 5

Is there a solution?
Procedure
Find the stationary points of the Lagrange function and check if there
are any admissible points where the CQ fails
Formulation
Suppose the problem is max_x f(x) subject to g(x) = b, with
g : R^n → R^m. The Lagrange function is:

L(x, λ) = f(x) − λ^T [g(x) − b]
        = f(x) − ∑_{i=1}^m λ_i [g_i(x) − b_i]
Necessary condition
If x^0 ∈ R^n is optimal, and if x^0 ∈ X_CQ, then there exists a unique
vector λ ∈ R^m such that:

∇f(x^0) = λ Jg(x^0)
Constraint qualification
The constraint qualification CQ states that at x^0, the gradients of the
constraint functions must be linearly independent (the Jacobian matrix
Jg(x^0) must have full rank)
Example
Consider the program:

min_{x ∈ R^3} x1^2 + x2^2 + x3^2 subject to:
x1 + 2x2 + x3 = 30
2x1 − x2 − 3x3 = 10
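Minimizing x1^2 + x2^2 + x3^2 subject to Ax = b is a least-norm problem with closed-form solution x* = A^T (A A^T)^{-1} b, which a short script can verify (a sketch; the signs of the second constraint are read here as 2x1 − x2 − 3x3 = 10, and only A and b change if the slide intended different signs):

```python
import numpy as np

A = np.array([[1.0, 2.0, 1.0],
              [2.0, -1.0, -3.0]])
b = np.array([30.0, 10.0])

# least-norm solution: x* = A^T (A A^T)^{-1} b
x = A.T @ np.linalg.solve(A @ A.T, b)

print(x)      # the optimum, (10, 10, 0) under this sign reading
print(x @ x)  # the minimized objective value, 200
assert np.allclose(A @ x, b)  # both constraints hold
```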
Example
Consider the problem:
Specify the points in the constraint set where the CQ fails (if any)
Constraint set
Consider the problem:

max_x f(x) subject to h(x) ≤ c

where f, h : R^2 → R

Two cases
At the solution x^0 the constraint is either binding, h(x^0) = c, or slack,
h(x^0) < c
Geometric approach
Suppose that x^0 is optimal, that the constraint binds at x^0, and that
the CQ holds:

∇f(x^0) = μ ∇h(x^0)

Complementary slackness
This can be achieved by requiring that:

μ [h(x) − c] = 0
so that either μ = 0 or the constraint binds. This complementary
slackness condition ensures that the last term is eliminated from the
Lagrange function at the optimal point, regardless of whether the
constraint is binding or slack there
max_x f(x) subject to h(x) ≤ c

where f, h : R^n → R. The conditions at the optimum are:

∇f(x^0) = μ ∇h(x^0)
μ [h(x^0) − c] = 0
h(x^0) ≤ c, μ ≥ 0
Karush-Kuhn-Tucker II
Example
Suppose the problem is (with f, g : R^n → R):
Is there a solution?
Constraint qualification
The CQ is concerned only with binding constraints
A geometric approach
Recall that if x^0 solves the maximization problem:

max_x f(x) subject to h(x) ≤ c

then it also solves:

min_x [−f(x)] subject to h(x) ≤ c

so minimization problems fit the same framework. For a problem of
the form:

min_x f(x) subject to h(x) ≥ c

the geometric condition at a binding constraint is ∇f(x^0) = μ ∇h(x^0).

Non-negative multipliers
A more common approach is to keep the requirement that μ ≥ 0 by
maximizing the negative of f:

max_x [−f(x)] subject to −h(x) ≤ −c

Then we can apply the same FONCs as we do for our original problem:

max_x f(x) subject to h(x) ≤ c
Note that the FONC is not sufficient: the points that satisfy the
FONC may be saddle points or local maxima (if the constraint is not
binding)
Canonical forms
The canonical forms are:

max_x f(x) subject to h(x) ≤ c

and:

min_x f(x) subject to h(x) ≥ c

The canonical forms have the same Lagrange function and FONC:

L(x, μ) = f(x) − μ [h(x) − c]

L_i(x^0, μ^0) = f_i(x^0) − μ^0 h_i(x^0) = 0 for i = 1, .., n, with μ^0 ≥ 0
Second-order conditions
Recall that the Hessian matrix provides information about the nature
of stationary points, e.g. for a local maximum:

x^T H(x^0) x < 0 for x ≠ 0

For the constrained problem:

max_{x ∈ X} f(x) where X = { x ∈ R^n : g(x) = b ∈ R }

the relevant object is the bordered Hessian:

H̄(x, λ) = [ 0         Jg(x)          ]
           [ Jg(x)^T   ∇²_x L(x, λ)   ]
Second-order theorem
Suppose that (x^0, λ) is stationary. Then if:

det H̄(x^0, λ) > 0

the bordered Hessian is negative definite at (x^0, λ), and that point is
a local maximum. If instead:

det H̄(x^0, λ) < 0

the bordered Hessian is positive definite at (x^0, λ), and that point is
a local minimum
Example
Consider the example from before:
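The second-order test can be run numerically. As an illustration (assuming the example meant is the earlier problem max x1 x2 subject to x1 + x2 = 1, which is stationary at x = (1/2, 1/2) with λ = 1/2):

```python
import numpy as np

# L(x, lam) = x1*x2 - lam*(x1 + x2 - 1)
Jg = np.array([1.0, 1.0])    # gradient of the constraint g(x) = x1 + x2
H_L = np.array([[0.0, 1.0],  # Hessian of L in x (constant here)
                [1.0, 0.0]])

# bordered Hessian: constraint gradient borders the Hessian of L
H_bordered = np.block([[np.zeros((1, 1)), Jg[None, :]],
                       [Jg[:, None],      H_L]])
d = np.linalg.det(H_bordered)
print(d)  # determinant is 2 > 0, so (1/2, 1/2) is a local maximum
assert d > 0
```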
Non-negativity constraints
In applications, the variables are often restricted to be non-negative
Example
Let x ∈ R^n, f, g : R^n → R and consider the problem of maximizing
f(x) subject to g(x) = b and x ≥ 0:

L(x, λ, μ) = f(x) − λ [g(x) − b] + μ^T x
First-order condition
Rather than explicitly state the non-negativity constraints, they can
be accounted for in the FONC as follows:
S = { x ∈ X : a_i · x = 0 for i = 1, .., m }

S⊥ = { a ∈ R^n : a · x = 0 for all x ∈ S }

or

a_0 · x = ∑_{i=1}^l λ_i a_i · x
Corollary
Let A be an m × n matrix and let b ∈ R^m. Exactly one of the
following alternatives is true: either there exists a vector x ∈ R^n
such that (i): Ax = b, or else there exists a vector y ∈ R^m such that
(ii):

yA = 0 and yb > 0
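The alternative can be illustrated numerically (a sketch with a hypothetical A, b, and certificate y chosen for this example, not from the slides):

```python
import numpy as np

# Either Ax = b is solvable, or some y satisfies yA = 0 and yb > 0.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])  # rank 1: columns are multiples of (1, 2)
b = np.array([1.0, 1.0])    # not in the column space, since b2 != 2*b1

# Ax = b has no solution: the least-squares fit leaves a residual
x, _, _, _ = np.linalg.lstsq(A, b, rcond=None)
assert not np.allclose(A @ x, b)

# ...so the second alternative must hold, e.g. with y = (2, -1):
y = np.array([2.0, -1.0])
print(y @ A)  # [0. 0.] : y annihilates the rows of A
print(y @ b)  # 1.0 > 0
assert np.allclose(y @ A, 0.0) and y @ b > 0
```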
X = { x ∈ X : g(x) = 0 }

with g : R^n → R^m and g ∈ C^1

Theorem
Suppose x^0 is a local maximizer in the program above and that
rank Jg(x^0) = m. Then there exists a unique vector λ ∈ R^m such
that:

∇f(x^0) = ∑_{i=1}^m λ_i ∇g_i(x^0)
Parametric problem
In economic applications, the optimization problem is often
(compactly) written:

max_{x ∈ X(a)} f(x)

For example, in the consumer problem:

max_{x ∈ X(p,w)} u(x)

where:

X(p, w) = { x ∈ R^n_+ : p · x ≤ w }

Here, (p, w) ∈ R^{n+1}_{++} are the parameters describing (constraining)
the consumer's environment
Value Functions
Suppose that x(a) solves the following problem:
Definition
The value function v : A → R gives the optimal attainable value
of the objective function f as a function of the parameters a.
The value function v(a) describes how the value of the solution
varies with the parameters, e.g. the cost function:

min_{x ∈ X(w,y)} w^T x = c(w, y)
Example
Suppose f : R^2 → R is f(x, a) = ax^2. Find the value function:
Example
Let f : R^2 → R be f(x, a) = x^4 + a(x^2 − 1). Find the value
function:

v(a) = max_{−1 ≤ x ≤ 1} f(x, a)

Example
Find the value function when f(x, a) = 1 − (x − a)^2
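For f(x, a) = 1 − (x − a)^2 the value function can be traced out by brute force (a sketch; the box constraint −1 ≤ x ≤ 1 is borrowed from the previous example as an assumption, and the grid search is only illustrative):

```python
import numpy as np

def v(a, grid=np.linspace(-1.0, 1.0, 200001)):
    """Value function of max_{-1 <= x <= 1} 1 - (x - a)^2, by grid search."""
    return np.max(1.0 - (grid - a) ** 2)

# Interior optimum x = a while |a| <= 1, so v(a) = 1 there;
# for |a| > 1 the constraint binds at x = sign(a), so v(a) = 1 - (|a| - 1)^2.
for a in (-0.5, 0.0, 0.9):
    assert abs(v(a) - 1.0) < 1e-9
assert abs(v(2.0) - 0.0) < 1e-8   # v(2) = 1 - (2 - 1)^2 = 0
print(v(0.3), v(2.0))
```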
Example
Find the value of the following utility maximization program:

max_{x ∈ X(p,w)} u(x) = √(x1 x2)
An envelope theorem
Suppose that we have a parametric maximization problem with
solution x(a) and value function:

v(a) = f(x(a), a)

Example
Suppose we want to minimize f : R × A → R given by:

f(x, a) = x^2 + 2ax − 2
Envelope theorem
Differentiating the value function with respect to a gives:

v'(a) = f_a(x(a), a)

Example
Let u : R^2_+ → R be a differentiable utility function. Use the envelope
theorem to derive Roy's identity:

v(p, w) = max_{x ∈ B(p,w)} u(x)

−L_{p_i}(p, w) / L_w(p, w) = x_i(p, w) for i = 1, 2
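Roy's identity can be checked numerically for a concrete utility (a sketch; the choice u = √(x1 x2) follows the earlier examples, and the closed forms for demand x_i = w/(2 p_i) and indirect utility are derived for that u, not stated on the slides):

```python
import math

def v(p1, p2, w):
    # indirect utility for u = sqrt(x1*x2) with budget p1*x1 + p2*x2 <= w:
    # demands are x_i = w / (2*p_i), so v = w / (2*sqrt(p1*p2))
    return w / (2.0 * math.sqrt(p1 * p2))

p1, p2, w, h = 2.0, 3.0, 60.0, 1e-6

# central finite differences for the partials of v
dv_dp1 = (v(p1 + h, p2, w) - v(p1 - h, p2, w)) / (2 * h)
dv_dw = (v(p1, p2, w + h) - v(p1, p2, w - h)) / (2 * h)

x1_roy = -dv_dp1 / dv_dw   # Roy's identity
x1_demand = w / (2 * p1)   # Marshallian demand
print(x1_roy, x1_demand)   # both are approximately 15
assert abs(x1_roy - x1_demand) < 1e-4
```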
max_{x ∈ X(θ)} f(x) st g(x) = b

L(x, λ) = f(x) − λ [g(x) − b]

v(b) = f(x(b)) − λ [g(x(b)) − b] = L(x(b), λ)

∇_b v(b) = λ

∂v/∂b_i = ∇_x L(x(b)) · ∂x/∂b_i + λ_i = λ_i

since ∇_x L = 0 at the optimum by the FONC. Alternatively, we have:

v(c + dc) − v(c) ≈ λ dc
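The shadow-price interpretation can be checked on a small problem (a sketch with a hypothetical program, max x1 x2 subject to x1 + x2 = b, chosen for illustration; its solution x1 = x2 = b/2 and multiplier λ = b/2 are worked out by hand):

```python
def v(b):
    # max x1*x2 subject to x1 + x2 = b has solution x1 = x2 = b/2
    return (b / 2.0) ** 2

b, db = 1.0, 1e-3
lam = b / 2.0            # multiplier at the optimum (lam = x2 = b/2)

approx = lam * db        # predicted change in the value: lam * db
exact = v(b + db) - v(b)
print(exact, approx)     # nearly equal: dv/db = lam
assert abs(exact - approx) < 1e-5
```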
max_{x ∈ X} f(x)

X = { x ∈ R^n : g(x) = 0, h(x) ≤ 0 }
The Value Function and Lagrangian Duality II
Suppose that the value of the problem is v

L(x, λ, μ) = f(x) − λ^T g(x) − μ^T h(x)

The dual minimizes the maximized Lagrangian subject to μ ≥ 0.
The solution gives the lowest upper bound to the NLP, and its value
is denoted by w:

w ≥ v

Strong duality
When w = v, strong duality obtains. Because w is an upper bound
on the optimal value v, if the two are equal at some (x, λ, μ), then that
point must be optimal
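Weak and strong duality can be seen in a one-dimensional sketch (a hypothetical NLP chosen for illustration, with the inner sup solved by hand):

```python
import numpy as np

# Primal (a max NLP): max 4 - x^2  subject to  h(x) = 1 - x <= 0.
# Optimum: x* = 1, primal value v = 3.
# Dual bound: w(mu) = sup_x [4 - x^2 - mu*(1 - x)] = 4 - mu + mu^2/4,
# where the inner sup is attained at x = mu/2.
def dual(mu):
    return 4.0 - mu + mu**2 / 4.0

v_value = 3.0
mus = np.linspace(0.0, 6.0, 601)
w_value = min(dual(m) for m in mus)

# weak duality: every dual value is an upper bound on v
assert all(dual(m) >= v_value - 1e-12 for m in mus)
print(w_value)  # the bound is tight at mu = 2: w = v = 3, strong duality
assert abs(w_value - v_value) < 1e-12
```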