
EE236C (Spring 2011-12)

2. Quasi-Newton methods

• variable metric methods

• quasi-Newton methods

• BFGS update

• limited-memory quasi-Newton methods

2-1
Newton method for unconstrained minimization

minimize f (x)

f convex, twice continuously differentiable

Newton method

x^+ = x − t ∇²f(x)^{-1} ∇f(x)

• advantages: fast convergence, affine invariance


• disadvantages: requires second derivatives and the solution of a linear equation; can be too expensive for large-scale applications

Quasi-Newton methods 2-2
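
As a concrete illustration, a minimal NumPy sketch of one Newton step with step size t; the quadratic test function and the value t = 1 are illustrative choices, not part of the lecture:

    import numpy as np

    def newton_step(grad, hess, x, t=1.0):
        # x+ = x - t * (∇²f(x))^{-1} ∇f(x); solve the linear system
        # ∇²f(x) dx = -∇f(x) rather than forming the inverse
        g = grad(x)
        H = hess(x)
        dx = np.linalg.solve(H, -g)
        return x + t * dx

    # strictly convex quadratic f(x) = 1/2 x^T A x - b^T x: one step is exact
    A = np.array([[3.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, -1.0])
    x = newton_step(lambda z: A @ z - b, lambda z: A, np.zeros(2))
    print(np.allclose(x, np.linalg.solve(A, b)))   # True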


Variable metric methods

x^+ = x − t H^{-1} ∇f(x)

H ≻ 0 is an approximation of the Hessian at x, chosen to:

• avoid calculation of second derivatives


• simplify computation of search direction

‘variable metric’ interpretation (EE236B, lecture 10, page 11)

∆x = −H^{-1} ∇f(x)

is steepest descent direction at x for quadratic norm

‖z‖_H = (z^T H z)^{1/2}

Quasi-Newton methods 2-3


Quasi-Newton methods

given starting point x^{(0)} ∈ dom f, H_0 ≻ 0


for k = 1, 2, . . ., until a stopping criterion is satisfied
1. compute quasi-Newton direction ∆x = −H_{k-1}^{-1} ∇f(x^{(k-1)})
2. determine step size t (e.g., by backtracking line search)
3. compute x^{(k)} = x^{(k-1)} + t∆x
4. compute H_k

• different methods use different rules for updating H in step 4


• can also propagate H_k^{-1} to simplify calculation of ∆x

Quasi-Newton methods 2-4
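
A minimal Python/NumPy sketch of this generic scheme, with step 4 left as a pluggable update of the inverse approximation; the name update_inv_hessian and the backtracking parameters are illustrative assumptions, not from the slides:

    import numpy as np

    def backtracking(f, x, dx, g, alpha=0.3, beta=0.8):
        # shrink t until the sufficient-decrease condition holds
        t = 1.0
        while f(x + t * dx) > f(x) + alpha * t * (g @ dx):
            t *= beta
        return t

    def quasi_newton(f, grad, x0, update_inv_hessian, tol=1e-8, max_iter=200):
        x = x0.astype(float).copy()
        Hinv = np.eye(len(x))                 # H_0^{-1} = I (any H_0 ≻ 0 works)
        for _ in range(max_iter):
            g = grad(x)
            if np.linalg.norm(g) < tol:       # stopping criterion
                break
            dx = -Hinv @ g                    # 1. quasi-Newton direction
            t = backtracking(f, x, dx, g)     # 2. step size
            x_new = x + t * dx                # 3. update the iterate
            Hinv = update_inv_hessian(Hinv, x_new - x, grad(x_new) - g)   # 4.
            x = x_new
        return x

One concrete choice for update_inv_hessian is the BFGS inverse update sketched after slide 2-5.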


Broyden-Fletcher-Goldfarb-Shanno (BFGS) update

BFGS update

H_k = H_{k-1} + y y^T/(y^T s) − H_{k-1} s s^T H_{k-1}/(s^T H_{k-1} s)

where
s = x^{(k)} − x^{(k-1)},    y = ∇f(x^{(k)}) − ∇f(x^{(k-1)})

inverse update
H_k^{-1} = ( I − s y^T/(y^T s) ) H_{k-1}^{-1} ( I − y s^T/(y^T s) ) + s s^T/(y^T s)

• note that y^T s > 0 for strictly convex f; see page 1-11


• cost of update or inverse update is O(n²) operations

Quasi-Newton methods 2-5
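
A sketch of the inverse update in NumPy, written in the expanded rank-two form so that one update costs O(n²) operations; the skip when y^T s is not safely positive is a common safeguard added here, not part of the slide:

    import numpy as np

    def bfgs_inverse_update(Hinv, s, y):
        # expanding the inverse update gives
        # H_k^{-1} = H^{-1} - (H^{-1} y s^T + s y^T H^{-1})/rho
        #            + (1 + y^T H^{-1} y / rho) s s^T / rho,   rho = y^T s
        rho = y @ s
        if rho <= 1e-12:            # safeguard: skip the update if y^T s <= 0
            return Hinv
        Hy = Hinv @ y
        return (Hinv
                - (np.outer(Hy, s) + np.outer(s, Hy)) / rho
                + (1.0 + (y @ Hy) / rho) * np.outer(s, s) / rho)

This function can be passed as update_inv_hessian to the loop sketched after slide 2-4.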


Positive definiteness

if y^T s > 0, the BFGS update preserves positive definiteness of H_k

proof: from inverse update formula,

v^T H_k^{-1} v = ( v − (s^T v/(s^T y)) y )^T H_{k-1}^{-1} ( v − (s^T v/(s^T y)) y ) + (s^T v)²/(y^T s)

• if H_{k-1} ≻ 0, both terms are nonnegative for all v

• second term is zero only if s^T v = 0; then first term is zero only if v = 0

this ensures that ∆x = −H_k^{-1} ∇f(x^{(k)}) is a descent direction

Quasi-Newton methods 2-6
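
A quick numerical check of the identity used in the proof, on random data (a sketch added here, not from the slides):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5
    M = rng.standard_normal((n, n))
    Hinv_prev = M @ M.T + np.eye(n)            # H_{k-1}^{-1} ≻ 0
    s, y, v = rng.standard_normal((3, n))
    if y @ s <= 0:
        y = -y                                  # ensure y^T s > 0
    rho = y @ s
    V = np.eye(n) - np.outer(s, y) / rho
    Hinv = V @ Hinv_prev @ V.T + np.outer(s, s) / rho   # inverse BFGS update
    w = v - (s @ v) / (s @ y) * y
    lhs = v @ Hinv @ v
    rhs = w @ Hinv_prev @ w + (s @ v) ** 2 / rho
    print(np.isclose(lhs, rhs))                 # True: both sides agree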


Secant condition

BFGS update satisfies the secant condition H_k s = y, i.e.,

H_k (x^{(k)} − x^{(k-1)}) = ∇f(x^{(k)}) − ∇f(x^{(k-1)})

interpretation: define the second-order approximation at x^{(k)}

fquad(z) = f(x^{(k)}) + ∇f(x^{(k)})^T (z − x^{(k)}) + (1/2)(z − x^{(k)})^T H_k (z − x^{(k)})

the secant condition implies that the gradient of fquad agrees with the gradient of f at x^{(k-1)}:

∇fquad(x^{(k-1)}) = ∇f(x^{(k)}) + H_k (x^{(k-1)} − x^{(k)}) = ∇f(x^{(k-1)})

Quasi-Newton methods 2-7
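
The secant condition is easy to verify numerically with the (direct) BFGS update of slide 2-5; the random data below is purely illustrative:

    import numpy as np

    def bfgs_update(H, s, y):
        # H_k = H + y y^T/(y^T s) - H s s^T H/(s^T H s)
        return (H + np.outer(y, y) / (y @ s)
                  - np.outer(H @ s, H @ s) / (s @ (H @ s)))

    rng = np.random.default_rng(1)
    n = 4
    M = rng.standard_normal((n, n))
    H_prev = M @ M.T + np.eye(n)                 # H_{k-1} ≻ 0
    s, y = rng.standard_normal((2, n))
    if y @ s <= 0:
        y = -y                                    # y^T s > 0
    H = bfgs_update(H_prev, s, y)
    print(np.allclose(H @ s, y))                  # secant condition H_k s = y
    print(np.all(np.linalg.eigvalsh(H) > 0))      # H_k remains positive definite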


secant method
for f : R → R, BFGS with unit step size gives the secant method

x^{(k+1)} = x^{(k)} − f′(x^{(k)})/H_k,    H_k = ( f′(x^{(k)}) − f′(x^{(k-1)}) ) / ( x^{(k)} − x^{(k-1)} )

[figure: one secant step, showing x^{(k-1)}, x^{(k)}, x^{(k+1)} together with fquad(z) and f′(z)]

Quasi-Newton methods 2-8
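
For a one-dimensional example, a short sketch of the secant iteration applied to f(x) = x⁴/4 − x, i.e. solving f′(x) = x³ − 1 = 0; the test function and starting points are illustrative:

    def secant_method(fprime, x_prev, x, tol=1e-10, max_iter=50):
        # x+ = x - f'(x)/H  with  H = (f'(x) - f'(x_prev)) / (x - x_prev)
        for _ in range(max_iter):
            H = (fprime(x) - fprime(x_prev)) / (x - x_prev)
            x_prev, x = x, x - fprime(x) / H
            if abs(x - x_prev) < tol:
                break
        return x

    print(secant_method(lambda x: x**3 - 1.0, 0.5, 2.0))   # ≈ 1.0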


Convergence

global result

if f is strongly convex, BFGS with backtracking line search (EE236B, lecture 10-6) converges from any x^{(0)} and any H_0 ≻ 0

local convergence

if f is strongly convex and ∇²f(x) is Lipschitz continuous, local convergence is superlinear: for sufficiently large k,

‖x^{(k+1)} − x⋆‖_2 ≤ c_k ‖x^{(k)} − x⋆‖_2 → 0

where c_k → 0 (cf. quadratic local convergence of Newton's method)

Quasi-Newton methods 2-9


Example
minimize   c^T x − ∑_{i=1}^m log(b_i − a_i^T x)
n = 100, m = 500
[figure: f(x^{(k)}) − f⋆ versus k for Newton (left) and BFGS (right), on a logarithmic scale from 10^2 down to 10^{-12}; the Newton axis runs to about k = 9, the BFGS axis to about k = 140]

cost per Newton iteration: O(n³) plus the cost of computing ∇²f(x)

cost per BFGS iteration: O(n²)

Quasi-Newton methods 2-10
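
A sketch of how such an experiment can be set up with the BFGS inverse update and a domain-aware backtracking search; the random problem data, tolerances, and line-search constants below are illustrative choices, not those used for the plots:

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 100, 500
    A = rng.standard_normal((m, n))
    b = rng.uniform(1.0, 2.0, m)          # b > 0, so x = 0 is strictly feasible
    c = rng.standard_normal(n)

    def f(x):
        r = b - A @ x
        return np.inf if np.any(r <= 0) else c @ x - np.log(r).sum()

    def grad(x):
        return c + A.T @ (1.0 / (b - A @ x))

    x, Hinv = np.zeros(n), np.eye(n)
    for k in range(300):
        g = grad(x)
        if np.linalg.norm(g) < 1e-6:
            break
        dx = -Hinv @ g
        t = 1.0
        while f(x + t * dx) > f(x) + 0.01 * t * (g @ dx):   # f = inf outside dom f
            t *= 0.5
        s = t * dx
        y = grad(x + s) - g
        rho = y @ s                                  # > 0 by strict convexity
        V = np.eye(n) - np.outer(s, y) / rho
        Hinv = V @ Hinv @ V.T + np.outer(s, s) / rho  # BFGS inverse update
        x = x + s
    print(k, np.linalg.norm(grad(x)))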


Square root BFGS update

to improve numerical stability, can propagate H_k in factored form

if H_{k-1} = L_{k-1} L_{k-1}^T, then H_k = L_k L_k^T with

L_k = L_{k-1} ( I + (α ỹ − s̃) s̃^T / (s̃^T s̃) )

where

ỹ = L_{k-1}^{-1} y,    s̃ = L_{k-1}^T s,    α = ( s̃^T s̃ / (y^T s) )^{1/2}

if L_{k-1} is triangular, the cost of reducing L_k to triangular form is O(n²)

Quasi-Newton methods 2-11
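
A numerical check that the factored update reproduces the BFGS update of slide 2-5; the random data and the check itself are additions, not from the slides:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 6
    M = rng.standard_normal((n, n))
    H_prev = M @ M.T + np.eye(n)
    L_prev = np.linalg.cholesky(H_prev)           # H_{k-1} = L_{k-1} L_{k-1}^T
    s, y = rng.standard_normal((2, n))
    if y @ s <= 0:
        y = -y                                     # y^T s > 0

    y_t = np.linalg.solve(L_prev, y)               # ỹ = L_{k-1}^{-1} y
    s_t = L_prev.T @ s                             # s̃ = L_{k-1}^T s
    alpha = np.sqrt((s_t @ s_t) / (y @ s))
    L = L_prev @ (np.eye(n) + np.outer(alpha * y_t - s_t, s_t) / (s_t @ s_t))

    H = (H_prev + np.outer(y, y) / (y @ s)         # ordinary BFGS update
           - np.outer(H_prev @ s, H_prev @ s) / (s @ (H_prev @ s)))
    print(np.allclose(L @ L.T, H))                 # True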


Optimality of BFGS update

X = H_k solves the convex optimization problem

minimize    tr(H_{k-1}^{-1} X) − log det(H_{k-1}^{-1} X) − n
subject to  Xs = y

• cost function is nonnegative, equal to zero only if X = H_{k-1}

• also known as the relative entropy between the densities N(0, X) and N(0, H_{k-1})

the optimality result follows from the KKT conditions: X = H_k satisfies


X^{-1} = H_{k-1}^{-1} − (1/2)(s ν^T + ν s^T),    Xs = y,    X ≻ 0

with

ν = (1/(s^T y)) ( 2 H_{k-1}^{-1} y − ( 1 + y^T H_{k-1}^{-1} y/(y^T s) ) s )

Quasi-Newton methods 2-12


Davidon-Fletcher-Powell (DFP) update

switch H_{k-1} and X in the objective on the previous page

minimize    tr(H_{k-1} X^{-1}) − log det(H_{k-1} X^{-1}) − n
subject to  Xs = y

• minimize relative entropy between N(0, H_{k-1}) and N(0, X)


• problem is convex in X^{-1} (with the constraint written as s = X^{-1} y)
• solution is the ‘dual’ of the BFGS formula

H_k = ( I − y s^T/(s^T y) ) H_{k-1} ( I − s y^T/(s^T y) ) + y y^T/(s^T y)

(known as DFP update)

pre-dates BFGS update, but is less often used

Quasi-Newton methods 2-13
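
The DFP formula transcribes just as directly; a small sketch checking that it, too, satisfies the secant condition (the check and the random data are additions):

    import numpy as np

    def dfp_update(H, s, y):
        # H_k = (I - y s^T/(s^T y)) H (I - s y^T/(s^T y)) + y y^T/(s^T y)
        rho = s @ y
        W = np.eye(len(s)) - np.outer(y, s) / rho
        return W @ H @ W.T + np.outer(y, y) / rho

    rng = np.random.default_rng(3)
    n = 5
    M = rng.standard_normal((n, n))
    H_prev = M @ M.T + np.eye(n)
    s, y = rng.standard_normal((2, n))
    if y @ s <= 0:
        y = -y
    print(np.allclose(dfp_update(H_prev, s, y) @ s, y))   # H_k s = y holds for DFP too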


Limited memory quasi-Newton methods

main disadvantage of quasi-Newton methods is the need to store H_k or H_k^{-1}

limited-memory BFGS (L-BFGS): do not store H_k^{-1} explicitly

• instead we store the m (e.g., m = 30) most recent values of

s_j = x^{(j)} − x^{(j-1)},    y_j = ∇f(x^{(j)}) − ∇f(x^{(j-1)})

• we evaluate ∆x = −H_k^{-1} ∇f(x^{(k)}) recursively, using

H_j^{-1} = ( I − s_j y_j^T/(y_j^T s_j) ) H_{j-1}^{-1} ( I − y_j s_j^T/(y_j^T s_j) ) + s_j s_j^T/(y_j^T s_j)

for j = k, k − 1, . . . , k − m + 1, assuming, for example, H_{k-m}^{-1} = I
• cost per iteration is O(nm); storage is O(nm)

Quasi-Newton methods 2-14
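
The standard O(nm) way to carry out this recursive evaluation without forming any n × n matrices is the two-loop recursion (Nocedal and Wright, chapter 7). A sketch, assuming the pairs (s_j, y_j) are kept in Python lists ordered from oldest to newest:

    import numpy as np

    def lbfgs_direction(g, s_list, y_list):
        # returns r = H_k^{-1} g, built from the stored (s_j, y_j) pairs with the
        # recursion started from H_{k-m}^{-1} = I; the search direction is -r
        q = g.astype(float).copy()
        alphas = []
        for s, y in zip(reversed(s_list), reversed(y_list)):   # j = k, ..., k-m+1
            a = (s @ q) / (y @ s)
            alphas.append(a)
            q -= a * y
        r = q                                                  # apply H_{k-m}^{-1} = I
        for (s, y), a in zip(zip(s_list, y_list), reversed(alphas)):
            beta = (y @ r) / (y @ s)
            r += (a - beta) * s
        return r

In the loop of slide 2-4, step 1 then becomes ∆x = -lbfgs_direction(grad(x), s_list, y_list), with the oldest pair dropped once more than m are stored.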


References

• J. Nocedal and S. J. Wright, Numerical Optimization (2006), chapters 6 and 7

• J. E. Dennis and R. B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations (1983)

Quasi-Newton methods 2-15
