US - TMC - 05 - Optimization 2022

Theoretical Models for Computing: Optimization

Presenter: Dr. Ha Viet Uyen Synh.


Optimization
Root finding and optimization are related: both involve guessing and searching for a point on a function.

The fundamental difference is:

Root finding is searching for the zeros of a function or functions.
Optimization is finding the minimum or the maximum of a function of several variables.
Mathematical Background
An optimization or mathematical programming problem can generally be stated as:
Find x, which minimizes or maximizes f(x) subject to

d_i(x) ≤ a_i    i = 1, 2, …, m    (*)
e_i(x) = b_i    i = 1, 2, …, p    (*)

where x is an n-dimensional design vector, f(x) is the objective function, d_i(x) are inequality constraints, e_i(x) are equality constraints, and a_i and b_i are constants.
Optimization problems can be classified on the basis of the form of f(x):
If f(x) and the constraints are linear, we have linear programming.
If f(x) is quadratic and the constraints are linear, we have quadratic programming.
If f(x) is not linear or quadratic and/or the constraints are nonlinear, we have nonlinear programming.

When the equations (*) are included, we have a constrained optimization problem; otherwise, it is an unconstrained optimization problem.
PART A

ONE-DIMENSIONAL
UNCONSTRAINED OPTIMIZATION
One-Dimensional Unconstrained
Optimization

In multimodal functions, both local and global optima can occur. In almost all cases, we are interested in finding the absolute highest or lowest value of a function.
How do we distinguish a global optimum from a local one?
By graphing, to gain insight into the behavior of the function.

By using randomly generated starting guesses and picking the largest of the optima as the global one.

By perturbing the starting point to see if the routine returns a better point or the same local minimum.
Golden-Section Search
A unimodal function has a single maximum or minimum in a given interval. For a unimodal function:
- First pick two points [xl, xu] that bracket your extremum.
- Pick an additional third point within this interval to determine whether a maximum has occurred.
- Then pick a fourth point to determine whether the maximum has occurred within the first three or the last three points.

The key to making this approach efficient is choosing the intermediate points wisely, thus minimizing the number of function evaluations by replacing old values with new ones.
The conditions for choosing the interior points are:

l0 = l1 + l2
l1/l0 = l2/l1

• The first condition specifies that the sum of the two sub-lengths l1 and l2 must equal the original interval length.
• The second says that the ratio of the lengths must be equal.

Substituting l0 = l1 + l2 and defining R = l2/l1 gives

l1/(l1 + l2) = l2/l1
1 + R = 1/R, i.e., R² + R − 1 = 0

whose positive root is

R = (−1 + √(1 − 4(−1)))/2 = (√5 − 1)/2 = 0.61803

the Golden Ratio.
The method starts with two initial guesses, xl and xu, that bracket one local extremum of f(x).
Next, two interior points x1 and x2 are chosen according to the golden ratio:

d = (√5 − 1)/2 · (xu − xl)
x1 = xl + d
x2 = xu − d

The function is evaluated at these two interior points.
Two results can occur:

If f(x1) > f(x2), then the domain of x to the left of x2, from xl to x2, can be eliminated because it does not contain the maximum. Then, x2 becomes the new xl for the next round.

If f(x2) > f(x1), then the domain of x to the right of x1, from x1 to xu, would be eliminated. In this case, x1 becomes the new xu for the next round.

The new x1 is determined as before:

x1 = xl + (√5 − 1)/2 · (xu − xl)

The approximate error is

εa = (1 − R) · |(xu − xl)/xopt| · 100%
Example #1
Use the golden-section search to find the maximum of
f(x) = 2 sin x − x²/10
within the interval xl = 0 and xu = 4.

First, the golden ratio is used to create the two interior points:
d = 0.61803(4 − 0) = 2.472
x1 = 0 + 2.472 = 2.472
x2 = 4 − 2.472 = 1.528

The function can be evaluated at the interior points:
f(x2) = f(1.528) = 1.765
f(x1) = f(2.472) = 0.630

Because f(x2) > f(x1), the maximum is in the interval defined by xl, x2, and x1. Thus, for the new interval, the lower bound remains xl = 0, and x1 becomes the upper bound, that is, xu = 2.472.
In addition, the former x2 value becomes the new x1, that is, x1 = 1.528. Further, we do not have to recalculate f(x1) because it was determined on the previous iteration as f(1.528) = 1.765.

All that remains is to compute the new values of d and x2:
d = 0.61803(2.472 − 0) = 1.528
x2 = 2.472 − 1.528 = 0.944

The function evaluation at x2 is f(0.944) = 1.531. Since this value is less than the function value at x1, the maximum is in the interval prescribed by x2, x1, and xu.

The result is converging on the true value of 1.7757 at x = 1.4276.
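The bookkeeping above can be sketched in Python. This is a minimal illustration, not code from the slides; the function and variable names are my own:

```python
import math

def golden_section_max(f, xl, xu, tol=1e-6, max_iter=100):
    """Golden-section search for the maximum of a unimodal f on [xl, xu]."""
    R = (math.sqrt(5) - 1) / 2          # golden ratio, ~0.61803
    x1, x2 = xl + R * (xu - xl), xu - R * (xu - xl)
    f1, f2 = f(x1), f(x2)
    for _ in range(max_iter):
        if f1 > f2:                     # maximum lies in [x2, xu]
            xl, x2, f2 = x2, x1, f1     # reuse old x1 as the new x2
            x1 = xl + R * (xu - xl)
            f1 = f(x1)
        else:                           # maximum lies in [xl, x1]
            xu, x1, f1 = x1, x2, f2     # reuse old x2 as the new x1
            x2 = xu - R * (xu - xl)
            f2 = f(x2)
        if (xu - xl) < tol:
            break
    return (xl + xu) / 2

f = lambda x: 2 * math.sin(x) - x**2 / 10
xopt = golden_section_max(f, 0, 4)
print(xopt, f(xopt))   # converges to x ≈ 1.4276, f ≈ 1.7757
```

Note that each pass evaluates f only once: one old interior point is reused, which is exactly the efficiency the slide emphasizes.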


Parabolic Interpolation
Parabolic interpolation fits a second-order polynomial through three points that bracket the optimum; the vertex of the parabola gives an estimate of the optimum:

x3 = [f(x0)(x1² − x2²) + f(x1)(x2² − x0²) + f(x2)(x0² − x1²)] / [2f(x0)(x1 − x2) + 2f(x1)(x2 − x0) + 2f(x2)(x0 − x1)]

where x0, x1, and x2 are the initial guesses, and x3 is the value of x that corresponds to the maximum value of the parabolic fit to the guesses.
Example #2
Use parabolic interpolation to find the maximum of
f(x) = 2 sin x − x²/10
with the initial guesses x0 = 0, x1 = 1, and x2 = 4.

The function values at the three guesses can be evaluated:
f(x0) = f(0) = 0
f(x1) = f(1) = 1.5829
f(x2) = f(4) = −3.1136

Substituting into the formula gives x3 = 1.5055, which has a function value of f(1.5055) = 1.7691. For the next iteration, the lowest point x0 = 0 is discarded, and the guesses x0 = 1, x1 = 1.5055, and x2 = 4 yield x3 = 1.4903, which has a function value of f(1.4903) = 1.7714.

Thus, within five iterations, the result is converging rapidly on the true value of 1.7757 at x = 1.4276.
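The iteration can be sketched as follows. This is my own illustration, assuming a simple bracketing rule for which point to discard (the slides do not spell one out):

```python
import math

def parabolic_interp_max(f, x0, x1, x2, iters=5):
    """Successive parabolic interpolation for a maximum bracketed by x0 < x1 < x2."""
    for _ in range(iters):
        f0, f1, f2 = f(x0), f(x1), f(x2)
        num = f0 * (x1**2 - x2**2) + f1 * (x2**2 - x0**2) + f2 * (x0**2 - x1**2)
        den = 2 * f0 * (x1 - x2) + 2 * f1 * (x2 - x0) + 2 * f2 * (x0 - x1)
        x3 = num / den                   # vertex of the fitted parabola
        # keep the three points that still bracket the maximum
        if x3 > x1:
            if f(x3) > f1:
                x0, x1 = x1, x3          # maximum lies to the right of x1
            else:
                x2 = x3
        else:
            if f(x3) > f1:
                x2, x1 = x1, x3          # maximum lies to the left of x1
            else:
                x0 = x3
    return x1

f = lambda x: 2 * math.sin(x) - x**2 / 10
xopt = parabolic_interp_max(f, 0, 1, 4)
print(xopt)   # ≈ 1.4276 after five iterations
```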
Newton’s Method
An approach similar to the Newton-Raphson method can be used to find an optimum of f(x) by defining a new function g(x) = f′(x). Because the same optimal value x* satisfies both
f′(x*) = g(x*) = 0
we can use the following iteration as a technique to find the extremum of f(x):

x_{i+1} = x_i − f′(x_i)/f″(x_i)
Example #3
Use Newton’s method to find the maximum of
f(x) = 2 sin x − x²/10
with an initial guess of x0 = 2.5.

The first and second derivatives of the function can be evaluated as
f′(x) = 2 cos x − x/5
f″(x) = −2 sin x − 1/5

Thus, within four iterations, the result converges rapidly on the true value.
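The four iterations can be reproduced with a short script (my own sketch, using the derivatives given above):

```python
import math

def newton_opt(x, iters=4):
    """Newton's method for the extremum of f(x) = 2 sin x - x^2/10."""
    for _ in range(iters):
        f1 = 2 * math.cos(x) - x / 5     # f'(x)
        f2 = -2 * math.sin(x) - 1 / 5    # f''(x)
        x = x - f1 / f2                  # Newton step applied to f'
    return x

x = newton_opt(2.5)
print(x, 2 * math.sin(x) - x**2 / 10)    # ≈ 1.4276, ≈ 1.7757
```

Note that the sign of f″ at the result tells us it is a maximum, since f″(1.4276) < 0.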
PART B

MULTIDIMENSIONAL
UNCONSTRAINED OPTIMIZATION
Multidimensional Unconstrained
Optimization
Techniques to find the minimum and maximum of a function of several variables are described.

These techniques are classified as:

Gradient (descent or ascent) methods, which require derivative evaluation.
Non-gradient, or direct, methods, which do not require derivative evaluation.
DIRECT METHODS: Random Search
Based on repeatedly evaluating the function at randomly selected values of the independent variables.

If a sufficient number of samples are taken, the optimum will eventually be located.

Example: the maximum of the function
f(x, y) = y − x − 2x² − 2xy − y²
can be found using a random number generator.
Advantages
Works even for discontinuous and nondifferentiable functions.
Tends to find the global optimum rather than getting trapped at a local optimum.

Disadvantages
As the number of independent variables grows, the task can become onerous.
Not efficient: it does not account for the behavior of the underlying function.
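A brute-force sketch of the idea, applied to the example function above (the sampling box, sample count, and seed are my own choices, not from the slides):

```python
import random

def random_search(f, xlo, xhi, ylo, yhi, n=100_000, seed=1):
    """Locate a maximum by evaluating f at n uniformly random points in a box."""
    random.seed(seed)
    best = (float("-inf"), None, None)
    for _ in range(n):
        x = xlo + (xhi - xlo) * random.random()
        y = ylo + (yhi - ylo) * random.random()
        fx = f(x, y)
        if fx > best[0]:
            best = (fx, x, y)
    return best

f = lambda x, y: y - x - 2 * x**2 - 2 * x * y - y**2
fbest, xbest, ybest = random_search(f, -2, 2, 1, 3)
print(fbest, xbest, ybest)   # approaches f = 1.25 at (-1, 1.5)
```

The analytical optimum (setting both partials to zero) is f(−1, 1.5) = 1.25, which the sampling approaches but never exploits structure to reach efficiently, illustrating the disadvantage noted above.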
Univariate and Pattern Searches
More efficient than random search, and still does not require derivative evaluation.

The basic strategy is:

Change one variable at a time while the other variables are held constant.
Thus the problem is reduced to a sequence of one-dimensional searches that can be solved by a variety of methods.
The search becomes less efficient as you approach the maximum.

Pattern directions can be used to shoot directly along a ridge towards the maximum.
GRADIENT METHODS
Gradients and Hessians
The Gradient
If f(x, y) is a two-dimensional function, the gradient vector tells us:
What direction is the steepest ascent?
How much will we gain by taking that step?

∇f = (∂f/∂x) i + (∂f/∂y) j    (read "del f")

The gradient gives the directional derivative of f(x, y) at the point x = a and y = b.
• For n dimensions, the gradient is the column vector

∇f(x) = [ ∂f/∂x1(x)  ∂f/∂x2(x)  …  ∂f/∂xn(x) ]ᵀ
Example #4
Employ the gradient to evaluate the steepest-ascent direction for the function
f(x, y) = xy²
at the point (2, 2). Assume that positive x is pointed east and positive y is pointed north.

First, our elevation can be determined as
f(2, 2) = 2(2)² = 8

Next, the partial derivatives can be evaluated:
∂f/∂x = y² = 2² = 4
∂f/∂y = 2xy = 2(2)(2) = 8

which can be used to determine the gradient as
∇f = 4i + 8j

This vector can be sketched on a topographical map of the function. It immediately tells us that the direction we must take is
θ = tan⁻¹(8/4) = 1.107 radians (≈ 63.4° counterclockwise from the x axis)

The slope in this direction, which is the magnitude of ∇f, can be calculated as
‖∇f‖ = √(4² + 8²) = 8.944
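The numbers in this example can be reproduced in a few lines (a sketch of my own; the variable names are not from the slides):

```python
import math

# Gradient of f(x, y) = x*y**2, evaluated analytically at (2, 2)
x, y = 2, 2
gx, gy = y**2, 2 * x * y                  # df/dx = y^2, df/dy = 2xy
slope = math.hypot(gx, gy)                # magnitude of the gradient
theta = math.degrees(math.atan2(gy, gx))  # steepest-ascent heading from east
print(gx, gy, slope, theta)               # 4, 8, ≈8.944, ≈63.43°
```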


The Hessian
For one-dimensional functions, both the first and second derivatives provide valuable information for searching out optima:
The first derivative provides (a) the steepest trajectory of the function and (b) tells us when we have reached an optimum.
The second derivative tells us whether we are at a maximum or a minimum.

For two-dimensional functions, determining whether a maximum or a minimum occurs involves not only the first partial derivatives with respect to x and y but also the second partial derivatives.
Assuming that the partial derivatives are continuous at and near the point being evaluated, compute

|H| = (∂²f/∂x²)(∂²f/∂y²) − (∂²f/∂x∂y)²

If |H| > 0 and ∂²f/∂x² > 0, then f(x, y) has a local minimum.
If |H| > 0 and ∂²f/∂x² < 0, then f(x, y) has a local maximum.
If |H| < 0, then f(x, y) has a saddle point.

The quantity |H| is equal to the determinant of the Hessian, the matrix made up of the second partial derivatives.
Finite-Difference Approximations
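The finite-difference formulas themselves are not reproduced in this text, so here is a minimal sketch (helper name and step size are my own) of centered-difference approximations to the second partials, used with the |H| test above on f(x, y) = 2xy + 2x − x² − 2y² at the point (2, 1):

```python
def hessian_det(f, x, y, h=1e-4):
    """Centered finite-difference second partials and |H| of f at (x, y)."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return fxx, fyy, fxx * fyy - fxy**2

f = lambda x, y: 2 * x * y + 2 * x - x**2 - 2 * y**2
fxx, fyy, detH = hessian_det(f, 2, 1)
print(fxx, fyy, detH)   # ≈ -2, -4, 4  →  |H| > 0 and fxx < 0: a maximum
```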
The Steepest Ascent Method
Start at an initial point (x0, y0) and determine the direction of steepest ascent, that is, the gradient.
Then search along the direction of the gradient, h0, until we find the maximum. The process is then repeated.
The problem has two parts:
Determining the "best direction" and
Determining the "best value" along that search direction.
The steepest ascent method uses the gradient approach as its choice for the "best" direction.
To transform a function of x and y into a function of h along the gradient direction:

x = x0 + (∂f/∂x) h
y = y0 + (∂f/∂y) h

where h is the distance along the h axis.
For example, if x0 = 1, y0 = 2, and

∇f = 3i + 4j

then

x = 1 + 3h
y = 2 + 4h
Example #5
Suppose we have the following two-dimensional function:
f(x, y) = 2xy + 2x − x² − 2y²
Develop a one-dimensional version of this equation along the gradient direction at the point x = −1 and y = 1.

The partial derivatives can be evaluated at (−1, 1):
∂f/∂x = 2y + 2 − 2x = 2(1) + 2 − 2(−1) = 6
∂f/∂y = 2x − 4y = 2(−1) − 4(1) = −6

Therefore, the gradient vector is
∇f = 6i − 6j

The function can be expressed along this axis as
f(x0 + (∂f/∂x)h, y0 + (∂f/∂y)h) = f(−1 + 6h, 1 − 6h)

By combining terms, we develop a one-dimensional function g(h) that maps f(x, y) along the h axis:
g(h) = −180h² + 72h − 7
Example #6
Maximize the following two-dimensional function:
f(x, y) = 2xy + 2x − x² − 2y²
using the initial guess x = −1 and y = 1.

Method #1: Calculating directly

To do this, set the partial derivatives to zero:
∂f/∂x = 2y + 2 − 2x = 0
∂f/∂y = 2x − 4y = 0

This pair of equations can be solved for the optimum, x = 2 and y = 1. The second partial derivatives can also be determined and evaluated at the optimum:
∂²f/∂x² = −2    ∂²f/∂y² = −4    ∂²f/∂x∂y = 2

and the determinant of the Hessian is computed:
|H| = (−2)(−4) − 2² = 4

Therefore, because |H| > 0 and ∂²f/∂x² < 0, the function value f(2, 1) is a maximum.
Method #2: Using the Steepest Ascent Method
Along the gradient direction at (−1, 1), as developed in Example #5,
g(h) = −180h² + 72h − 7
Now, because this is a simple parabola, we can directly locate the maximum (that is, h = h*) by solving
g′(h*) = −360h* + 72 = 0
h* = 0.2
This means that if we travel along the h axis, g(h) reaches a maximum value when h = h* = 0.2.
The (x, y) coordinates corresponding to this point are
x = −1 + 6(0.2) = 0.2
y = 1 − 6(0.2) = −0.2
The second step is merely implemented by repeating the procedure. First, the partial derivatives can be evaluated at the new starting point (0.2, −0.2) to give
∂f/∂x = 2(−0.2) + 2 − 2(0.2) = 1.2
∂f/∂y = 2(0.2) − 4(−0.2) = 1.2

Therefore, the gradient vector is
∇f = 1.2i + 1.2j
The coordinates along this new h axis can now be expressed as
x = 0.2 + 1.2h
y = −0.2 + 1.2h
Substituting these values into the function yields
f(0.2 + 1.2h, −0.2 + 1.2h) = g(h) = −1.44h² + 2.88h + 0.2
The step h* that takes us to the maximum along the search direction can then be directly computed as
g′(h*) = −2.88h* + 2.88 = 0
h* = 1
The (x, y) coordinates corresponding to this new point are
x = 0.2 + 1.2(1) = 1.4
y = −0.2 + 1.2(1) = 1
The approach can be repeated, with the final result converging on the analytical solution, x = 2 and y = 1.
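The two steps worked above can be automated. In this sketch (mine, not from the slides) the line search exploits the fact that f is quadratic with constant Hessian H, so the exact step along g = ∇f has the closed form h* = (g·g)/(g·(−H)g):

```python
def steepest_ascent(x, y, iters=50):
    """Steepest ascent for f(x, y) = 2xy + 2x - x^2 - 2y^2 with exact line search."""
    for _ in range(iters):
        gx = 2 * y + 2 - 2 * x            # df/dx
        gy = 2 * x - 4 * y                # df/dy
        hgx = -2 * gx + 2 * gy            # (H g)_x, with H = [[-2, 2], [2, -4]]
        hgy = 2 * gx - 4 * gy             # (H g)_y
        denom = -(gx * hgx + gy * hgy)    # g.(-H)g, positive away from the optimum
        if denom == 0:
            break                         # zero gradient: already at the optimum
        h = (gx * gx + gy * gy) / denom   # exact step to the line-search maximum
        x, y = x + gx * h, y + gy * h
    return x, y

xopt, yopt = steepest_ascent(-1, 1)
print(xopt, yopt)   # converges to the analytical optimum (2, 1)
```

The first two passes reproduce the hand calculation exactly: (0.2, −0.2) with h* = 0.2, then (1.4, 1) with h* = 1.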
PART C

CONSTRAINED OPTIMIZATION
LINEAR PROGRAMMING
An optimization approach that deals with meeting a desired objective, such as maximizing profit or minimizing cost, in the presence of constraints such as limited resources.
The mathematical functions representing both the objective and the constraints are linear.
Standard Form
A basic linear programming problem consists of two major parts:
The objective function
A set of constraints
For a maximization problem, the objective function is generally expressed as

Maximize Z = c1x1 + c2x2 + … + cnxn

where
cj = payoff of each unit of the jth activity that is undertaken
xj = magnitude of the jth activity
Z = total payoff due to the total number of activities
The constraints can be represented generally as

ai1x1 + ai2x2 + … + ainxn ≤ bi

where aij = amount of the ith resource that is consumed for each unit of the jth activity, and bi = amount of the ith resource that is available.
The second general type of constraint specifies that all activities must have a nonnegative value, xj ≥ 0.
Together, the objective function and the constraints specify the linear programming problem.
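A small hypothetical two-activity problem (the coefficients below are invented for illustration, not from the slides) can be solved by enumerating the corner points of the feasible region, since a linear objective attains its maximum at a vertex:

```python
from itertools import combinations

# maximize Z = 150*x1 + 175*x2
# subject to 7*x1 + 11*x2 <= 77, 10*x1 + 8*x2 <= 80, x1 >= 0, x2 >= 0
# Each constraint is stored as (a1, a2, b) meaning a1*x1 + a2*x2 <= b.
A = [(7, 11, 77), (10, 8, 80), (-1, 0, 0), (0, -1, 0)]
c = (150, 175)

def vertices(constraints):
    """Intersect every pair of constraint boundary lines; keep feasible points."""
    for (a1, a2, b), (d1, d2, e) in combinations(constraints, 2):
        det = a1 * d2 - a2 * d1
        if abs(det) < 1e-12:
            continue                      # parallel boundary lines
        x = (b * d2 - a2 * e) / det
        y = (a1 * e - b * d1) / det
        if all(p * x + q * y <= r + 1e-9 for p, q, r in constraints):
            yield x, y

best = max(vertices(A), key=lambda v: c[0] * v[0] + c[1] * v[1])
print(best, c[0] * best[0] + c[1] * best[1])   # optimum at a single vertex
```

Vertex enumeration only scales to toy problems; the simplex method does this search efficiently, and library solvers (e.g. scipy.optimize.linprog) are the practical choice.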

Possible outcomes that can generally be obtained in a linear programming problem:
1. Unique solution. The maximum of the objective function occurs at a single point.
2. Alternate solutions. The problem has an infinite number of optima corresponding to a line segment.
3. No feasible solution.
4. Unbounded problems. The problem is under-constrained and therefore open-ended.
Any Questions?

 hvusynh@hcmiu.edu.vn
