ORF363 COS323 F14 Lec3

This lecture covers optimization problems, including unconstrained optimization problems like the Fermat-Weber problem and least squares. It introduces basic optimization terminology like decision variables, objective function, constraints, optimal solution, and optimal value. It presents first and second order necessary conditions for optimality, including that the gradient of the objective function must be zero at a local minimum. It also discusses positive semidefinite and positive definite matrices in the context of second order conditions.


Lecture 3, ORF363/COS323, Fall 2014
Instructor: Amir Ali Ahmadi
TAs: Y. Chen, G. Hall, J. Ye

This lecture:
• Optimization problems - basic notation and terminology
• Unconstrained optimization
  ○ The Fermat-Weber problem
  ○ Least squares
• First and second order necessary conditions for optimality
• Second order sufficient condition for optimality
• Solution to least squares

• An optimization problem in general (or abstract) form:

    min  f(x)
    s.t. x ∈ Ω

  Here f is the objective function, the entries of x are the decision variables, and Ω is the constraint set (or feasible set). "min" is short for minimize and "s.t." is short for subject to.

In this class (unless otherwise stated), we have f: R^n → R and Ω ⊆ R^n.

Typically, some description of f and Ω is given as input to us.

Optimal solution x* (also called the "solution" or the "global solution"):

A point of Ω that minimizes f over Ω; i.e., f(x*) ≤ f(x) for all x ∈ Ω.

• May not exist.

• May not be unique.


[Four illustrative plots: (1) x* exists and is unique; (2) x* does not exist because the problem is "unbounded"; (3) x* exists, but is not unique; (4) x* does not exist even though the optimal value is finite.]

Optimal value: f(x*) (if x* exists).

But the optimal value can be well-defined even if x* doesn't exist.


• See the lower right picture above.
• In such a scenario, the term "minimum" is often replaced by "infimum".

• An important case where x* is guaranteed to exist:

  ○ f continuous and Ω compact (i.e., closed and bounded).

• This is known as the Weierstrass theorem.

• What if we want to maximize an objective function instead?

  ○ Just multiply f by a minus sign: maximizing f over Ω is the same as minimizing -f over Ω.

  The optimal solution doesn't change.

  The optimal value only changes sign.

Unconstrained optimization: Ω = R^n.

The decision variables are not constrained at all. The goal is only to minimize the objective function.
Example 1: The Fermat-Weber problem.

[Portraits: Fermat (1607-1665), Weber (1868-1958)]

You have a list of loved ones who live in given locations in the US. You would like to decide where to live so you are as close to them all as possible; say, you want to minimize the sum of distances to each person.

[Figure: a map of scattered points labeled cousin 1, grandma, mom, dad, sister, cousin 3, brother, lover 1, best friend, lover 2, with "you?" marking the location to be chosen.]

Location of person i: x_i ∈ R^2 (given).

Your location: y ∈ R^2 (the decision variable).

The problem: minimize over y the sum of distances Σ_i ||y - x_i||.

• Variant: we are also given weights w_i for each person and minimize Σ_i w_i ||y - x_i||

  (your mom says you should care more about her than lover 1); a small numerical sketch of this weighted version appears after the bullets below.

• Many other applications: e.g., Princeton is deciding on the location of a new gym and wants to minimize distance to dormitories, giving priority to undergrads,…
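
As a small illustration (not part of the original notes): a minimal MATLAB sketch of the weighted Fermat-Weber problem. The locations, the weights, and the use of fminsearch as a generic local search are all assumptions made for this example only.

    % Weighted Fermat-Weber: minimize over y the sum of w_i * ||y - x_i||.
    X = [0 0; 4 1; 2 5; 6 3; 1 7];     % each row is the (made-up) location x_i of person i
    w = [3; 1; 1; 2; 1];               % made-up weights (mom gets weight 3)
    f = @(y) sum(w .* sqrt(sum((X - repmat(y(:)', size(X,1), 1)).^2, 2)));
    y0 = mean(X, 1);                   % start from the centroid of the locations
    ystar = fminsearch(f, y0);         % generic derivative-free local search
    fprintf('best location (%.3f, %.3f), weighted distance %.3f\n', ...
            ystar(1), ystar(2), f(ystar));

Because the objective is a sum of weighted Euclidean norms, it is convex; this is the structure that will later let us conclude that a point found this way is in fact globally optimal.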


• As we'll see later, this optimization problem is "easy" to solve, not because it is unconstrained (as there are many terribly hard unconstrained problems!), but because it has a nice structure (called convexity).
• If at the same time you wanted to be "far" from some subset of
your friends and family, this would have been a very hard problem
to solve!
• Optimization theory is full of instances where a tiny variation in the
problem formulation changes the problem completely from being
very easy to being very hard. It takes a trained eye to detect this.
Hopefully by the end of the class you will develop an appreciation
for this type of phenomenon.
• But we are getting way ahead of ourselves. For one thing, we
haven't even formalized what it means for an optimization problem
to be "easy" or "hard". Let's forget this for now and move on to
another unconstrained optimization problem---one of the most
widely-encountered in science and engineering.

Example 2: Least squares.

[Portraits: Legendre (1752-1833), Gauss (1777-1855)]
Given: an m×n matrix A
       an m×1 vector b

Solve: min over x ∈ R^n of ||Ax - b||^2.

By default, ||.|| always represents the 2-norm; i.e., ||.||2.

In expanded notation, we are solving: min over x of Σ_{i=1..m} ( Σ_{j=1..n} a_ij x_j - b_i )^2.


Some applications of least squares:

Data fitting.

Given: data points (t_1, y_1), …, (t_m, y_m).

Fit a (say, cubic) polynomial p(t) = c_3 t^3 + c_2 t^2 + c_1 t + c_0 to the data, e.g., by minimizing the sum of squared errors Σ_i (p(t_i) - y_i)^2.

• Quick notation exercise: convince yourself that this is a least squares problem.
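
One way to see this (a sketch added for illustration; the data and variable names are made up): each data point contributes one row of a linear system in the unknown polynomial coefficients, and we minimize the sum of squared residuals.

    % Fit p(t) = c(1)*t^3 + c(2)*t^2 + c(3)*t + c(4) to data points (t_i, y_i).
    t = linspace(-1, 2, 30)';                 % made-up sample points
    y = t.^3 - 2*t + 0.1*randn(size(t));      % made-up noisy observations
    A = [t.^3, t.^2, t, ones(size(t))];       % row i is [t_i^3  t_i^2  t_i  1]
    c = A \ y;                                % least squares: minimizes ||A*c - y||
    % polyfit(t, y, 3) returns the same coefficients (as a row, highest power first).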

Overdetermined system of linear equations.

A simple linear predictor for the stock price of a company: the stock price at day t is modeled as a fixed linear combination of the prices on the few preceding days (one 5-day window of data per training example).

We have three months of daily stock price data to train our model (lots of 5-day windows). How do we find the best coefficients for future prediction?

Each window contributes one linear equation in the unknown coefficients; stacking them gives an overdetermined system whose matrix and right-hand side are given from the data (m would be about 90), and the best coefficients in the least squares sense again solve a problem of the form min ||Ax - b||^2.


Optimality conditions

Unconstrained local and global minima.

Consider a function f: R^n → R.

A point x* ∈ R^n is said to be a:

Local minimum: if there exists ε > 0 such that f(x*) ≤ f(x) for all x with ||x - x*|| < ε.

Strict local minimum: if there exists ε > 0 such that f(x*) < f(x) for all x ≠ x* with ||x - x*|| < ε.

Global minimum: if f(x*) ≤ f(x) for all x ∈ R^n.

Strict global minimum: if f(x*) < f(x) for all x ≠ x*.

• Local/global maxima defined analogously.


• A (strict) global minimum is of course also a (strict) local minimum.

[Figure: the graph of a function of one variable with labeled points: a strict global min, a strict local max, a strict local min, a flat stretch of local minima, and a point that is both a local max and a local min. There is no global max in this case; the problem is unbounded above.]


• In general, finding local minima is a less ambitious goal than finding global minima.
• Luckily, there are important problems where we can find global
minima efficiently.
• On the other hand, there are problems where finding even a local
minimum is intractable.
• These statements should become more concrete as the course
progresses.

First and second order conditions for local optimality

Optimality conditions are results that give us some structural information about the properties of optimal solutions. To understand the proofs that follow, make sure you are comfortable with the following notions:
• The gradient vector
• The chain rule
• The Hessian matrix
• Taylor series approximation
See lecture notes of the previous lecture or Sections 5.3-5.6 of [CZ13].

Notation reminder:

The gradient vector ∇f(x): the n×1 vector of partial derivatives ∂f/∂x_i.

The Hessian matrix ∇²f(x): the n×n symmetric matrix of second partial derivatives ∂²f/∂x_i∂x_j.


Notation of [CZ13]:


Theorem. (First Order Necessary Condition for (Local) Optimality)

If x* is an unconstrained local minimizer of a differentiable function f: R^n → R, then we must have:

    ∇f(x*) = 0.

[Portrait: Fermat (1607-1665)]

Proof.


Remarks:
• This condition is necessary but not sufficient for local
optimality.
• Nevertheless, it is useful because any local minimum
must satisfy this condition. So, we can look for local (or
global) minima only among points that make the
gradient of the objective function vanish.
• We will see later that in the presence of an important concept called convexity, this condition is in fact sufficient for local (and global!) optimality.

Second order conditions.

The statements of our second order optimality conditions involve the notions of psd and pd matrices. Let's recap these concepts.

Linear algebra interlude.


(See the last lecture if you need more review.)

Symmetric matrix: A = A^T (A^T denotes the transpose of A).

For example, [1 2; 2 3] is symmetric; [1 2; 3 4] is not.

Theorem. Eigenvalues of a real symmetric matrix are real.

Proof. See, e.g., Theorem 3.2 in Section 3.2 of [CZ13].

A square matrix A is said to be:

• Positive semidefinite (psd) if: x^T A x ≥ 0 for all x ∈ R^n.

• Positive definite (pd) if: x^T A x > 0 for all x ∈ R^n with x ≠ 0.

Notation: A ⪰ 0 for psd, A ≻ 0 for pd.


Recall that when we talk of positive semidefiniteness (or positive definiteness), we assume with no loss of generality that our matrix is symmetric: if A were not symmetric, we could take its "symmetric part" (A + A^T)/2, which satisfies x^T A x = x^T ((A + A^T)/2) x for all x.

Theorem. A matrix is positive semidefinite if and only if all its eigenvalues are nonnegative. A matrix is positive definite if and only if all its eigenvalues are positive.
Proof. See, e.g., Theorem 3.7 in Section 3.4 of [CZ13].

Examples:

MATLAB: eig([2 4;4 5]) returns approximately -0.77 and 7.77; since one eigenvalue is negative, this matrix is neither psd nor pd.

Recall our easy test in dimension 2: a symmetric matrix [a b; b c] is positive definite if and only if a > 0 and ac - b^2 > 0, and positive semidefinite if and only if a ≥ 0, c ≥ 0, and ac - b^2 ≥ 0.

This generalizes to n dimensions using the concepts of principal minors and leading principal minors; see Section 3.4 of [CZ13].
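
A small MATLAB sketch of the eigenvalue test (added here for illustration; the tolerance is an arbitrary choice to absorb rounding error):

    A   = [2 4; 4 5];
    As  = (A + A')/2;              % symmetric part (here A is already symmetric)
    lam = eig(As);                 % real eigenvalues, since As is symmetric
    tol = 1e-10;
    is_psd = all(lam >= -tol);     % psd  <=>  all eigenvalues nonnegative
    is_pd  = all(lam >   tol);     % pd   <=>  all eigenvalues positive
    % For this A, lam is approximately [-0.77; 7.77], so neither test passes.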


Theorem. (Second Order Necessary Condition for (Local) Optimality)

If x* is an unconstrained local minimizer of a twice differentiable function f: R^n → R, then, in addition to ∇f(x*) = 0, we must have:

    ∇²f(x*) ⪰ 0

(i.e., the Hessian at x* is positive semidefinite.)

Proof.

"Little o" notation: see [CZ13], Section 5.6 or our previous lecture.


Theorem. (Second Order Sufficient Condition for (Local) Optimality)

Suppose f: R^n → R is twice differentiable, ∇f(x*) = 0, and ∇²f(x*) ≻ 0 (i.e., the Hessian at x* is positive definite). Then x* is a strict local minimum of f.

Proof.


Remarks.
• ∇f(x*) = 0 (even together with ∇²f(x*) ⪰ 0) is not sufficient for local optimality; e.g., f(x) = x^3 satisfies both at x* = 0 but has no local minimum there.

• ∇²f(x*) ≻ 0 is not necessary for (even strict global) optimality; e.g., f(x) = x^4 has a strict global minimum at x* = 0 even though ∇²f(0) = 0.

Questions to keep at the back of your mind:

• How would we use all these optimality conditions to find local solutions and certify their optimality? (A small numerical sketch follows this list.)
• Is it easy to find points satisfying these conditions? E.g., is it easy to solve ∇f(x) = 0?
• Suppose you have certified that a given point is locally optimal; how would you go about checking whether it is also globally optimal?
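
As a small numerical sketch (added for illustration; the test function below is made up and not from the notes), here is how the two conditions can be checked on f(x) = (x1 - 1)^2 + x1*x2 + x2^2:

    % Gradient of f is [2*(x1 - 1) + x2 ; x1 + 2*x2]; the Hessian is constant.
    grad  = @(x) [2*(x(1) - 1) + x(2); x(1) + 2*x(2)];
    H     = [2 1; 1 2];            % Hessian of f (same at every x)
    xstar = H \ [2; 0];            % solve grad = 0, i.e., H*x = [2; 0]; gives (4/3, -2/3)
    disp(grad(xstar))              % first order condition: (numerically) zero
    disp(eig(H))                   % eigenvalues 1 and 3 > 0, so the Hessian is pd

By the second order sufficient condition, xstar is a strict local minimum; here f is a convex quadratic, so it is in fact the global minimum.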

Exercise. State (and prove) the analogues of our three theorems for
local maxima.

Now that we have a better understanding of the structure of optimal solutions for unconstrained optimization problems, let's revisit our least squares problem…


Least squares, revisited.

Given: an m×n matrix A (assume the columns of A are linearly independent)
       an m×1 vector b

Solve: min over x ∈ R^n of f(x) = ||Ax - b||^2.

Setting the gradient to zero gives ∇f(x) = 2A^T(Ax - b) = 0, i.e., the normal equations A^T A x = A^T b. Because the columns of A are linearly independent, A^T A is positive definite (hence invertible), so the unique point satisfying the first order condition is x* = (A^T A)^{-1} A^T b. Moreover, the Hessian ∇²f(x) = 2A^T A is positive definite everywhere, so by the second order sufficient condition x* is a strict local minimum (and, as convexity will show later, the global minimum).
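
A short MATLAB sketch of this solution (the data below are randomly generated for illustration only):

    % Solve min ||A*x - b||^2 via the normal equations A'*A*x = A'*b.
    m = 20; n = 3;
    A = randn(m, n);               % random data; columns are (almost surely) independent
    b = randn(m, 1);
    x_ne = (A'*A) \ (A'*b);        % x* = (A'*A)^{-1} A'*b
    x_bs = A \ b;                  % backslash also returns the least squares solution
    norm(x_ne - x_bs)              % tiny; the two differ only by rounding error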


Exercise with optimality conditions

Find all the local minima and maxima of the following function:


Notes:
• Optimality conditions are covered in Chapter 6 of [CZ13] in a more general setting where one also has a general constraint set Ω. The unconstrained optimality conditions that we presented here are stated in Chapter 6 as corollaries (called the "interior case"). You are only responsible for what was covered in class.
• Least squares is covered in Section 12.1 of [CZ13]. But again, this is for further reading and my notes should have everything that I expect you to know.

References:
- [CZ13] E.K.P. Chong and S.H. Zak. An Introduction to Optimization. Fourth edition. Wiley, 2013.
- [Bert04] D.P. Bertsekas. Nonlinear Programming. Second edition. Athena Scientific, 2004.
