Optimization Techniques - OT
Prof. Shibdas Dutta,
Associate Professor,
DCG DATA CORE SYSTEMS INDIA PVT LTD
Kolkata
Objective Function, Maxima, Minima and Saddle Points, Convexity and Concavity
Objectives
Study the basic components of an optimization problem.
Statement of an optimization problem
Find X = [x1, x2, ..., xn]ᵀ which minimizes f(X)

subject to the constraints

gi(X) ≤ 0,  i = 1, 2, ..., m
lj(X) = 0,  j = 1, 2, ..., p

where
X is an n-dimensional vector called the design vector,
f(X) is called the objective function, and
gi(X) and lj(X) are the inequality and equality constraints, respectively.

This type of problem is called a constrained optimization problem.
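Stated in code, the same problem structure can be handed to an off-the-shelf solver. The sketch below is an illustration added here, not part of the original lecture: the particular f, g and l are assumptions chosen to make the example runnable, and SciPy's SLSQP solver expects inequality constraints in the form g(X) ≥ 0, so the sign of g is flipped.

# A minimal sketch (not from the slides): expressing the general constrained
# problem min f(X) s.t. g(X) <= 0, l(X) = 0 with scipy.optimize.minimize.
# The particular f, g, l used here are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 1)**2 + (x[1] - 2)**2   # objective f(X)
g = lambda x: x[0] + x[1] - 2                 # inequality constraint g(X) <= 0
l = lambda x: x[0] - x[1]                     # equality constraint l(X) = 0

# SLSQP's convention is fun(x) >= 0, so g is negated.
cons = [{"type": "ineq", "fun": lambda x: -g(x)},
        {"type": "eq",   "fun": l}]

res = minimize(f, x0=np.zeros(2), method="SLSQP", constraints=cons)
print(res.x)   # constrained optimum of the design vector X, approx. [1, 1]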
• When drawn together with the constraint surfaces, as shown in the figure, we can identify the optimum point (the maximum).
Constraints
Even though constraints are not essential, it has been argued that almost all engineering optimization problems are, in practice, constrained.
Consider the optimization problem presented earlier with only inequality constraints gi(X) ≤ 0. The set of values of X that satisfy the equation gi(X) = 0 forms a boundary surface in the design space called a constraint surface. The constraint surface divides the design space into two regions: one with gi(X) < 0 (feasible region) and the other in which gi(X) > 0 (infeasible region). The points lying on the hypersurface satisfy the constraint gi(X) = 0 exactly.
[Figure: constraint surfaces in a two-dimensional design space, showing behavior constraints g1 = 0 and g2 = 0, a side constraint with g3 ≥ 0, the feasible and infeasible regions they separate, and a bound acceptable point.]
Stationary points
For a continuous and differentiable function f(x), a stationary point x* is a point at which the slope vanishes, i.e. f′(x*) = 0:

(a) f′(x*) = 0, f″(x*) > 0 : x* is a point of (local) minimum
(b) f′(x*) = 0, f″(x*) < 0 : x* is a point of (local) maximum
(c) f′(x*) = 0, f″(x*) = 0 : x* may be a point of inflection

[Figure: the three types of stationary points — (a) minimum, (b) maximum, (c) point of inflection.]
• A function is said to have a global or absolute minimum at x = x* if f(x*) ≤ f(x) for all x in the domain over which f(x) is defined.
• A function is said to have a global or absolute maximum at x = x* if f(x*) ≥ f(x) for all x in the domain over which f(x) is defined.
Relative and Global Optimum…
[Figure: relative and global optima of f(x) on the interval [a, b]. Left: A1, A2, A3 are relative maxima, with A2 the global maximum; B1, B2 are relative minima, with B1 the global minimum. Right: a function whose single relative minimum is also the global optimum.]
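As a quick illustration of these definitions (an addition, not from the original slides), the sketch below scans a sampled single-variable function for relative optima and picks out the global minimum by direct comparison; the particular f(x) is an arbitrary assumption.

# A small sketch (assumption): locating relative and global optima of a
# sampled function on [a, b] by comparing each interior sample with its
# neighbours.
import numpy as np

a, b = -3.0, 3.0
x = np.linspace(a, b, 2001)
fx = x**4 - 4*x**2 + x          # illustrative f(x) with two relative minima

rel_min = np.where((fx[1:-1] < fx[:-2]) & (fx[1:-1] < fx[2:]))[0] + 1
rel_max = np.where((fx[1:-1] > fx[:-2]) & (fx[1:-1] > fx[2:]))[0] + 1

print("relative minima near x =", x[rel_min])
print("relative maxima near x =", x[rel_max])
print("global minimum near x =", x[np.argmin(fx)])   # the B1 of the figure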
Functions of a single variable
Consider a continuous function f(x) defined on the interval [a, b]. To find the value of x* such that x* maximizes (or minimizes) f(x) we need to solve a single-variable optimization problem.

• Necessary condition: if f(x) has a relative optimum at x = x*, and if the derivative f′(x) exists at x*, then f′(x*) = 0.
• Sufficient condition: for the same function stated above, let f′(x*) = f″(x*) = ... = f⁽ⁿ⁻¹⁾(x*) = 0, but f⁽ⁿ⁾(x*) ≠ 0. Then f(x*) is (i) a minimum value of f(x) if f⁽ⁿ⁾(x*) > 0 and n is even; (ii) a maximum value if f⁽ⁿ⁾(x*) < 0 and n is even; (iii) neither a maximum nor a minimum if n is odd.

Example 1
Find the optimum value of the function f(x) = x² + 3x − 5 and also state if the function attains a maximum or a minimum.

Solution:
f′(x) = 2x + 3 = 0, or x* = −3/2 for maximum or minimum.
f″(x*) = 2, which is positive.
Hence the point x* = −3/2 is a point of minimum and the function attains a minimum value of −29/4 at this point.
Example 2
Find the optimum value of the function f(x) = (x − 2)⁴ and also state if the function attains a maximum or a minimum.

Solution:
f′(x) = 4(x − 2)³ = 0, or x = x* = 2 for maximum or minimum.
f″(x*) = 12(x − 2)² = 0 at x* = 2
f‴(x*) = 24(x − 2) = 0 at x* = 2
f⁽⁴⁾(x*) = 24 at x* = 2
Since the first non-zero derivative is of order n = 4 (even) and f⁽⁴⁾(x*) = 24 > 0, x* = 2 is a point of minimum and the function attains its minimum value of 0 at this point.
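The higher-order derivative test used in Examples 1 and 2 is mechanical enough to automate. Below is a minimal sketch, assuming sympy is available; the classify helper is a hypothetical name, not part of the lecture.

# A sketch (assumption, not part of the lecture): automating the higher-order
# derivative test of Examples 1 and 2 with sympy.
import sympy as sp

x = sp.symbols('x')

def classify(f, xstar):
    """Classify a stationary point xstar of f as minimum/maximum/inflection."""
    n = 1
    d = sp.diff(f, x)
    while sp.simplify(d.subs(x, xstar)) == 0:   # find first non-zero derivative
        n += 1
        d = sp.diff(d, x)
    val = d.subs(x, xstar)                      # derivative of order n at xstar
    if n % 2 == 1:                              # odd order: neither max nor min
        return 'inflection'
    return 'minimum' if val > 0 else 'maximum'

print(classify(x**2 + 3*x - 5, sp.Rational(-3, 2)))   # Example 1 -> minimum
print(classify((x - 2)**4, 2))                        # Example 2 -> minimum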
A contour plot
[Figure: contour map of the response function f(X) around points of local minima.]
Necessary conditions
From the above contour map, perturbations from points of local minima in any direction result in an increase in the response function f(X), i.e. the slope of the function is zero at such points of local minima. Similarly, at maxima and at points of inflection the slope is zero, so the first derivatives of the function with respect to the variables are zero.

This gives us ∂f/∂x1 = 0 and ∂f/∂x2 = 0 at the stationary points, i.e. the gradient vector of f(X) at X = X* = [x1, x2], defined as follows, must equal zero:

∇f(X*) = [ ∂f/∂x1(X*), ∂f/∂x2(X*) ]ᵀ = 0
Sufficient conditions
Consider the second-order partial derivatives ∂²f/∂x1², ∂²f/∂x2² and ∂²f/∂x1∂x2, arranged in the Hessian matrix evaluated at the stationary point:

H = | ∂²f/∂x1²     ∂²f/∂x1∂x2 |
    | ∂²f/∂x1∂x2   ∂²f/∂x2²   |

• if H is positive definite, the point X = [x1, x2] is a point of local minima.
• if H is negative definite, the point X = [x1, x2] is a point of local maxima.
• if H is neither positive nor negative definite, the point X = [x1, x2] is neither a point of maxima nor minima.
Example
Locate the stationary points of f(X) and classify them as
relative maxima, relative minima or neither based on the
rules discussed in the lecture.
f(X) = 2x1³/3 − 2x1x2 − 5x1 + 2x2² + 4x2 + 5
Solution
Example …
From ∂f/∂x2(X) = 0, −2x1 + 4x2 + 4 = 0, i.e. x1 = 2x2 + 2.

Substituting this in ∂f/∂x1(X) = 2x1² − 2x2 − 5 = 0:

2(2x2 + 2)² − 2x2 − 5 = 0
8x2² + 14x2 + 3 = 0
(2x2 + 3)(4x2 + 1) = 0
x2 = −3/2 or x2 = −1/4

so the stationary points are X1 = [−1, −3/2] and X2 = [3/2, −1/4].

The Hessian of f(X) is

H = | 4x1   −2 |
    | −2     4 |

|λI − H| = | λ − 4x1   2     |
           | 2         λ − 4 |

At X1 = [−1, −3/2]:

|λI − H| = (λ + 4)(λ − 4) − 4 = 0
λ² − 20 = 0
λ1 = +√20, λ2 = −√20

Since one eigenvalue is positive and one negative, X1 is neither a relative maximum nor a relative minimum.
Example …
At X2 = [3/2, −1/4]:

|λI − H| = (λ − 6)(λ − 4) − 4 = 0
λ² − 10λ + 20 = 0
λ1 = 5 + √5, λ2 = 5 − √5

Since both the eigenvalues are positive, X2 is a local minimum.
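The same classification can be checked numerically. A minimal sketch, assuming numpy and sympy: the gradient is solved symbolically and each stationary point is classified by the eigenvalues of the Hessian, reproducing ±√20 at X1 and 5 ± √5 at X2.

# A sketch (assumption): verifying the hand computation above for
# f(X) = 2*x1**3/3 - 2*x1*x2 - 5*x1 + 2*x2**2 + 4*x2 + 5.
import numpy as np
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = 2*x1**3/3 - 2*x1*x2 - 5*x1 + 2*x2**2 + 4*x2 + 5

grad = [sp.diff(f, v) for v in (x1, x2)]
for sol in sp.solve(grad, (x1, x2), dict=True):
    H = np.array(sp.hessian(f, (x1, x2)).subs(sol), dtype=float)
    eig = np.linalg.eigvalsh(H)                 # H is symmetric
    kind = ('minimum' if (eig > 0).all()
            else 'maximum' if (eig < 0).all() else 'saddle')
    print(sol, eig, kind)   # X1 = [-1, -3/2] -> saddle, X2 = [3/2, -1/4] -> minimum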
A function f(x) is convex if f″(x) ≥ 0. It is strictly convex if its slope is continually increasing, i.e. f″(x) > 0 throughout the function.
Properties of convex functions
• Equivalently, f(x) is concave on [a, b] if and only if the function −f(x) is convex on every subinterval of [a, b].
• A function f(x) is convex on [a, b] if and only if, for any two points θ1, θ2 in [a, b] and any t with 0 ≤ t ≤ 1,

f(t θ1 + (1 − t) θ2) ≤ t f(θ1) + (1 − t) f(θ2)
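This defining inequality is easy to spot-check numerically. A small sketch (an illustration added here, not from the slides), testing the inequality at random pairs of points for a convex f(x):

# A quick numeric sketch (assumption): testing
# f(t*th1 + (1-t)*th2) <= t*f(th1) + (1-t)*f(th2) at random point pairs.
import numpy as np

f = lambda x: x**2 + 3*x - 5            # convex: f''(x) = 2 > 0 everywhere

rng = np.random.default_rng(0)
th1, th2 = rng.uniform(-10, 10, (2, 1000))
t = rng.uniform(0, 1, 1000)

lhs = f(t*th1 + (1 - t)*th2)
rhs = t*f(th1) + (1 - t)*f(th2)
print(bool((lhs <= rhs + 1e-12).all()))   # True for a convex function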
Example
Locate the stationary points of f(X) = 2x1³/3 − 2x1x2 − 5x1 + 2x2² + 4x2 + 5 and determine the convexity or concavity of the function at these points.

Solution:
Setting the gradient to zero,

∂f/∂x1(X*) = 2x1² − 2x2 − 5 = 0
∂f/∂x2(X*) = −2x1 + 4x2 + 4 = 0

gives the stationary points X1 = [−1, −3/2] and X2 = [3/2, −1/4], as before.

• At X1 = [−1, −3/2], one eigenvalue of the Hessian is positive and one negative, so X1 is neither a relative maximum nor a relative minimum. Hence at X1 the function is neither convex nor concave.
• At X2 = [3/2, −1/4], both eigenvalues are positive, so X2 is a local minimum.
• The function is convex at this point as both the eigenvalues are positive.
Lagrange Multipliers and
Kuhn-Tucker Conditions
Objectives
To study the optimization of functions of several variables subject to equality constraints using the method of Lagrange multipliers, and subject to inequality constraints using the Kuhn-Tucker conditions.

Defining the Lagrange multiplier λ as

λ = − (∂f/∂x2) / (∂g/∂x2) |(x1*, x2*)   (2)

equation (2) can be written as

∂f/∂x1 + λ ∂g/∂x1 |(x1*, x2*) = 0   (3)

∂f/∂x2 + λ ∂g/∂x2 |(x1*, x2*) = 0   (4)
Solution by method of Lagrange
multipliers…
In addition, the constraint itself must be satisfied at the extreme point:

g(x1, x2) |(x1*, x2*) = 0   (5)

• Hence equations (2) to (5) represent the necessary conditions for the point [x1*, x2*] to be an extreme point.
• These necessary conditions require that at least one of the partial derivatives, ∂g/∂x1 or ∂g/∂x2, of g(x1, x2) be non-zero at an extreme point.
Solution by method of Lagrange
multipliers…
• The conditions given by equations (2) to (5) can also be generated by constructing a function L, known as the Lagrangian function, as

L(x1, x2, λ) = f(x1, x2) + λ g(x1, x2)   (6)

• Treating L as a function of x1, x2 and λ, the necessary conditions for its extremum are given by

∂L/∂x1 (x1, x2, λ) = ∂f/∂x1 (x1, x2) + λ ∂g/∂x1 (x1, x2) = 0

∂L/∂x2 (x1, x2, λ) = ∂f/∂x2 (x1, x2) + λ ∂g/∂x2 (x1, x2) = 0   (7)

∂L/∂λ (x1, x2, λ) = g(x1, x2) = 0
Necessary conditions for a general problem
• For a general problem with n variables and m equality constraints the
problem is defined as shown earlier
• In this case the Lagrangian function, L, will have one Lagrange multiplier λj for each constraint gj(X), as

L(x1, x2, ..., xn, λ1, λ2, ..., λm) = f(X) + λ1 g1(X) + λ2 g2(X) + ... + λm gm(X)   (8)
Necessary conditions for a general problem…
• L is now a function of n + m unknowns, x1, x2, ..., xn, λ1, λ2, ..., λm, and the necessary conditions for its extremum are

∂L/∂xi = ∂f/∂xi + Σj λj ∂gj/∂xi = 0,  i = 1, 2, ..., n (summation over j = 1, 2, ..., m)

∂L/∂λj = gj(X) = 0,  j = 1, 2, ..., m   (9)

which represent n + m equations in the n + m unknowns xi and λj.
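These n + m conditions form a square system that a computer algebra package can solve directly. A minimal sketch, assuming sympy; the objective and constraint below are illustrative assumptions, not from the slides.

# A sketch (assumption): forming the Lagrangian of equation (8) and solving
# the n + m conditions of equation (9) symbolically.
import sympy as sp

x1, x2, lam1 = sp.symbols('x1 x2 lambda1')

f = x1**2 + x2**2                  # illustrative objective
g1 = x1 + x2 - 2                   # illustrative equality constraint g1(X) = 0

L = f + lam1*g1                    # L = f(X) + lambda1*g1(X), as in (8)
eqs = [sp.diff(L, v) for v in (x1, x2, lam1)]     # conditions (9)
print(sp.solve(eqs, (x1, x2, lam1), dict=True))   # [{x1: 1, x2: 1, lambda1: -2}]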
A sufficient condition for f(X) to have a relative minimum (maximum) at X* is that each root z of the determinantal equation

| L − zI   Gᵀ |
| G        0  | = 0   (11)

be positive (negative), where

Lij = ∂²L/∂xi∂xj (X*, λ*),  for i, j = 1, 2, ..., n   (12)

gpq = ∂gp/∂xq (X*),  where p = 1, 2, ..., m and q = 1, 2, ..., n

Here L = [Lij] is the n×n matrix of second derivatives of the Lagrangian and G = [gpq] is the m×n matrix of constraint gradients.

• If equation (11), on solving, yields roots z some of which are positive and others negative, then the point X* is neither a maximum nor a minimum.
Example
Minimize f(X) = 3x1² + 6x1x2 + 5x2² − 7x1 − 5x2
Subject to x1 + x2 = 5
Solution
g1(X) = x1 + x2 − 5 = 0

With n = 2 and m = 1, the Lagrangian (8) becomes

L(x1, x2, λ1) = 3x1² + 6x1x2 + 5x2² − 7x1 − 5x2 + λ1(x1 + x2 − 5)

∂L/∂x1 = 6x1 + 6x2 − 7 + λ1 = 0

which gives x1 + x2 = (7 − λ1)/6. Since the constraint requires x1 + x2 = 5,

5 = (7 − λ1)/6, or λ1 = −23

Example …
∂L/∂x2 = 6x1 + 10x2 − 5 + λ1 = 0

which gives 3x1 + 5x2 = (5 − λ1)/2, i.e.

3(x1 + x2) + 2x2 = (5 − λ1)/2

With x1 + x2 = 5 and λ1 = −23: 15 + 2x2 = 14, so x2 = −1/2 and x1 = 11/2.

Hence X* = [11/2, −1/2]; λ1* = −23
For the sufficiency check of equation (11), the derivatives of the constraint are

g11 = ∂g1/∂x1 |(X*, λ*) = 1
g12 = ∂g1/∂x2 |(X*, λ*) = 1

and the second derivatives of the Lagrangian are L11 = 6, L12 = L21 = 6, L22 = 10, so that (11) becomes

| 6 − z   6       1 |
| 6       10 − z  1 | = 0
| 1       1       0 |

which gives 2z − 4 = 0, or z = 2. Since the single root z = 2 is positive, X* corresponds to a relative minimum of f(X).
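Both the stationarity system (9) and the determinantal check (11) can be verified symbolically. A sketch, assuming sympy; it reproduces X* = [11/2, −1/2], λ1* = −23 and the single positive root z = 2.

# A sketch (assumption): verifying the example above with sympy.
import sympy as sp

x1, x2, lam, z = sp.symbols('x1 x2 lambda z')
f = 3*x1**2 + 6*x1*x2 + 5*x2**2 - 7*x1 - 5*x2
g = x1 + x2 - 5

L = f + lam*g
sol = sp.solve([sp.diff(L, x1), sp.diff(L, x2), g], (x1, x2, lam), dict=True)[0]
print(sol)   # {x1: 11/2, x2: -1/2, lambda: -23}

# Bordered determinantal equation (11): det of [L_ij - z*I, g^T; g, 0]
L11, L12, L22 = (sp.diff(L, a, b).subs(sol) for a, b in
                 [(x1, x1), (x1, x2), (x2, x2)])
g11, g12 = sp.diff(g, x1), sp.diff(g, x2)
det = sp.Matrix([[L11 - z, L12, g11],
                 [L12, L22 - z, g12],
                 [g11, g12, 0]]).det()
print(sp.solve(det, z))   # [2]: the single root is positive -> minimum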
The method of Lagrange multipliers handles equality constraints; the Kuhn-Tucker conditions below extend these necessary conditions to inequality-constrained nonlinear programming problems.
Kuhn-Tucker Conditions: Optimization
Model
Consider the following optimization problem
Minimize f(X)
subject to
gj(X) ≤ 0 for j = 1, 2, ..., m
where X = [x1, x2, ..., xn]
Kuhn-Tucker Conditions
Kuhn-Tucker conditions for X* = [x1*, x2*, ..., xn*] to be a local minimum are

∂f/∂xi + Σj λj ∂gj/∂xi = 0,  i = 1, 2, ..., n (summation over j = 1, 2, ..., m)

λj gj = 0,  j = 1, 2, ..., m

gj ≤ 0,  j = 1, 2, ..., m

λj ≥ 0,  j = 1, 2, ..., m
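A candidate point and its multipliers can be screened against these four conditions mechanically. The checker below is a sketch; the function name and the test problem are assumptions, not from the lecture.

# A sketch (assumption): checking the Kuhn-Tucker conditions at a candidate
# point, with the constraints written as g_j(X) <= 0.
import numpy as np

def kt_satisfied(grad_f, grad_gs, gs, lams, tol=1e-8):
    """grad_f: gradient of f at X*; grad_gs: gradients of each g_j at X*;
    gs: values g_j(X*); lams: candidate multipliers lambda_j."""
    gs, lams = np.asarray(gs, float), np.asarray(lams, float)
    stationarity = grad_f + sum(l*g for l, g in zip(lams, grad_gs))
    return (np.abs(stationarity).max() < tol    # gradient condition
            and np.abs(lams*gs).max() < tol     # complementary slackness
            and (gs <= tol).all()               # feasibility g_j <= 0
            and (lams >= -tol).all())           # lambda_j >= 0

# Illustrative check: f = x1^2 + x2^2, g1 = 1 - x1 <= 0; X* = [1, 0], lambda1 = 2
print(kt_satisfied(np.array([2.0, 0.0]),        # grad f at [1, 0]
                   [np.array([-1.0, 0.0])],     # grad g1
                   [0.0], [2.0]))               # g1(X*), lambda1 -> True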
Kuhn-Tucker Conditions …
Example 1
Minimize f = x1² + 2x2² + 3x3²
subject to
g1 = x1 + x2 + 2x3 − 12 ≤ 0
g2 = x1 + 2x2 + 3x3 − 8 ≤ 0
Example 1…
Kuhn-Tucker Conditions

∂f/∂xi + λ1 ∂g1/∂xi + λ2 ∂g2/∂xi = 0:
2x1 + λ1 + λ2 = 0   (14)
4x2 + λ1 + 2λ2 = 0   (15)
6x3 + 2λ1 + 3λ2 = 0   (16)

λj gj = 0:
λ1(x1 + x2 + 2x3 − 12) = 0   (17)
λ2(x1 + 2x2 + 3x3 − 8) = 0   (18)

gj ≤ 0:
x1 + x2 + 2x3 − 12 ≤ 0   (19)
x1 + 2x2 + 3x3 − 8 ≤ 0   (20)

λj ≥ 0:
λ1 ≥ 0   (21)
λ2 ≥ 0   (22)
Example 1…
From (17), either λ1 = 0 or x1 + x2 + 2x3 − 12 = 0.

Case 1: λ1 = 0
From (14), (15) and (16) we have x1 = x2 = −λ2/2 and x3 = −λ2/2.
Using these in (18) we get λ2(3λ2 + 8) = 0, i.e. λ2 = 0 or λ2 = −8/3.
From (22), λ2 ≥ 0, therefore λ2 = 0.
Therefore, X* = [0, 0, 0], which also satisfies (19) and (20), with f(X*) = 0.
(Case 2, with the first constraint active, requires −25λ1 − 36λ2 = 144 with λ1, λ2 ≥ 0, which has no solution.)
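As a cross-check (not part of the original example), the problem can be handed to a numerical solver; SLSQP expects constraints as fun(x) ≥ 0, so each gj ≤ 0 is negated.

# A sketch (assumption): verifying X* = [0, 0, 0] for Example 1 numerically.
import numpy as np
from scipy.optimize import minimize

f = lambda x: x[0]**2 + 2*x[1]**2 + 3*x[2]**2
cons = [  # SLSQP uses fun(x) >= 0, so each g_j <= 0 is entered as -g_j >= 0
    {"type": "ineq", "fun": lambda x: -(x[0] + x[1] + 2*x[2] - 12)},
    {"type": "ineq", "fun": lambda x: -(x[0] + 2*x[1] + 3*x[2] - 8)},
]
res = minimize(f, x0=np.ones(3), method="SLSQP", constraints=cons)
print(np.round(res.x, 6), round(res.fun, 6))   # -> [0. 0. 0.] and 0.0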
Example 2
Minimize
f = x1² + x2² + 60x1
subject to
g1 = x1 − 80 ≥ 0
g2 = x1 + x2 − 120 ≥ 0
Example 2…
Kuhn-Tucker Conditions
With the constraints written in the gj ≥ 0 form, the conditions ∂f/∂xi − Σj λj ∂gj/∂xi = 0, λj gj = 0, gj ≥ 0 and λj ≥ 0 give:

2x1 + 60 − λ1 − λ2 = 0   (23)
2x2 − λ2 = 0   (24)
λ1(x1 − 80) = 0   (25)
λ2(x1 + x2 − 120) = 0   (26)
x1 − 80 ≥ 0   (27)
x1 + x2 − 120 ≥ 0   (28)
λ1 ≥ 0   (29)
λ2 ≥ 0   (30)
Example 2…
Case 1: λ1 = 0
From (23) and (24) we have x1 = −30 + λ2/2 and x2 = λ2/2.
Using these in (26) we get λ2(λ2 − 150) = 0, so λ2 = 0 or λ2 = 150.
For λ2 = 0, X* = [−30, 0], but this solution set violates (27) and (28).
For λ2 = 150, X* = [45, 75], which violates (27).
Hence Case 1 yields no valid solution.

Case 2: x1 − 80 = 0, i.e. x1 = 80
From (24), λ2 = 2x2, and from (23),

λ1 = 2x1 + 60 − λ2 = 220 − 2x2   (31)

From (26), 2x2(80 + x2 − 120) = 0, so x2 = 0 or x2 = 40.
For x2 = 0, λ1 = 220, but (28) is violated.
For x2 = 40, λ1 = 140 and λ2 = 80, and this solution set satisfies all equations from (27) to (31); hence the desired optimum is
X* = [80, 40]
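A numerical cross-check of this example (an addition, assuming scipy); here the gj ≥ 0 constraints match SLSQP's convention directly.

# A sketch (assumption): verifying X* = [80, 40] for Example 2 numerically.
import numpy as np
from scipy.optimize import minimize

f = lambda x: x[0]**2 + x[1]**2 + 60*x[0]
cons = [{"type": "ineq", "fun": lambda x: x[0] - 80},          # g1 >= 0
        {"type": "ineq", "fun": lambda x: x[0] + x[1] - 120}]  # g2 >= 0
res = minimize(f, x0=np.array([100.0, 100.0]), method="SLSQP", constraints=cons)
print(np.round(res.x, 4))   # -> [80. 40.]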
To study the above with the aid of the gradient vector and the Hessian matrix.

Necessary condition
At a stationary point X* of f(X), the gradient vector

∇f(X*) = [ ∂f/∂x1(X*), ∂f/∂x2(X*), ..., ∂f/∂xn(X*) ]ᵀ

must equal zero.
Sufficient condition
When all eigenvalues of the Hessian are negative for all possible values of X, then X* is a global maximum, and when all eigenvalues are positive for all possible values of X, then X* is a global minimum.
If some of the eigenvalues of the Hessian at X* are positive and some negative, or if some are zero, the stationary point X* is neither a local maximum nor a local minimum.
Example
Analyze the function f(X) = −x1² − x2² − x3² + 2x1x2 + 2x1x3 + 4x1 − 5x3 + 2 and classify the stationary points as maxima, minima and points of inflection.
Solution
∂f/∂x1(X*) = −2x1 + 2x2 + 2x3 + 4 = 0
∂f/∂x2(X*) = −2x2 + 2x1 = 0
∂f/∂x3(X*) = −2x3 + 2x1 − 5 = 0
Example …
Solving these simultaneous equations we get X* = [1/2, 1/2, −2].

The second derivatives are

∂²f/∂x1² = ∂²f/∂x2² = ∂²f/∂x3² = −2

∂²f/∂x1∂x2 = ∂²f/∂x2∂x1 = 2

∂²f/∂x2∂x3 = ∂²f/∂x3∂x2 = 0

∂²f/∂x3∂x1 = ∂²f/∂x1∂x3 = 2
Example …
Hessian of f(X) is
H = [∂²f/∂xi∂xj] = | −2   2   2 |
                   |  2  −2   0 |
                   |  2   0  −2 |

|λI − H| = | λ+2   −2   −2  |
           | −2    λ+2   0  | = 0
           | −2     0   λ+2 |
Example …
(λ + 2)[(λ + 2)² − 4 − 4] = 0
or (λ + 2)(λ² + 4λ − 4) = 0
giving λ1 = −2, λ2 = −2 + 2√2 ≈ 0.83 and λ3 = −2 − 2√2 ≈ −4.83.
Since one eigenvalue is positive and the other two are negative, the stationary point X* = [1/2, 1/2, −2] is neither a maximum nor a minimum: it is a saddle point. (A maximum would require all eigenvalues of the Hessian to be negative.)
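The eigenvalues quoted above can be confirmed in a few lines (a sketch assuming numpy).

# A sketch (assumption): checking the eigenvalues of the Hessian above.
import numpy as np

H = np.array([[-2.0,  2.0,  2.0],
              [ 2.0, -2.0,  0.0],
              [ 2.0,  0.0, -2.0]])
print(np.linalg.eigvalsh(H))   # approx. [-4.828, -2.0, 0.828]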
with the condition that m ≤ n; if m > n the problem becomes overdefined and, in general, there will be no solution. Of the many methods available for handling such problems, the method of constrained variation and the method of Lagrange multipliers are discussed in this lecture.
Solution by Method of Constrained
Variation
• For the optimization problem defined above, consider the specific case with n = 2 and m = 1:

Minimize f(x1, x2) subject to g(x1, x2) = 0

• A necessary condition for f to have a minimum at some point [x1*, x2*] is that the total differential of f(x1, x2) vanish there:

df = (∂f/∂x1) dx1 + (∂f/∂x2) dx2 = 0   (1)

• The variations dx1 and dx2 taken about the point [x1*, x2*] must be admissible variations, i.e. the perturbed point must stay on the constraint:

g(x1* + dx1, x2* + dx2) = 0   (2)

• Assuming dx1 and dx2 are small, the Taylor series expansion of (2) gives us

g(x1* + dx1, x2* + dx2) = g(x1*, x2*) + (∂g/∂x1)|(x1*, x2*) dx1 + (∂g/∂x2)|(x1*, x2*) dx2 = 0   (3)
Method of Constrained Variation…
Since g(x1*, x2*) = 0, (3) reduces to

dg = (∂g/∂x1) dx1 + (∂g/∂x2) dx2 = 0  at [x1*, x2*]   (4)

which must hold for all admissible variations.

Assuming ∂g/∂x2 ≠ 0, (4) can be rewritten as

dx2 = − [(∂g/∂x1) / (∂g/∂x2)]|(x1*, x2*) dx1   (5)
Method of Constrained Variation…
(5) indicates that once the variation along x1 (dx1) is chosen arbitrarily, the variation along x2 (dx2) is decided automatically so that the variation remains admissible. Substituting equation (5) in (1) we have:

df = [ ∂f/∂x1 − (∂g/∂x1)/(∂g/∂x2) · ∂f/∂x2 ]|(x1*, x2*) dx1 = 0   (6)

The expression in brackets is called the constrained variation of f. Equation (6) has to be satisfied for all possible values of dx1, and since dx1 can be chosen arbitrarily, the bracketed term itself must vanish. This gives us the necessary condition to have [x1*, x2*] as an extreme point (maximum or minimum).
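The constrained variation condition (6) can be evaluated symbolically at a candidate point. A sketch, assuming sympy, applied to the earlier Lagrange-multiplier example f = 3x1² + 6x1x2 + 5x2² − 7x1 − 5x2 with g = x1 + x2 − 5:

# A sketch (assumption): evaluating the bracketed term of equation (6) at the
# extreme point [11/2, -1/2] found earlier; it should be exactly zero.
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = 3*x1**2 + 6*x1*x2 + 5*x2**2 - 7*x1 - 5*x2
g = x1 + x2 - 5

cv = sp.diff(f, x1) - (sp.diff(g, x1)/sp.diff(g, x2))*sp.diff(f, x2)
print(cv.subs({x1: sp.Rational(11, 2), x2: sp.Rational(-1, 2)}))   # -> 0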