Chapter 9 Lecture Notes

The document discusses the method of steepest descent and its limitations, highlighting that Newton's method, which utilizes both first and second derivatives, can perform better if the initial point is close to the minimizer. It outlines the iterative process of Newton's method, including the use of quadratic approximations and the importance of solving linear equations efficiently. The document also presents convergence properties and potential drawbacks of Newton's method, including issues with Hessian positivity and computational expense.


• The method of steepest descent uses only first derivatives (gradients) in selecting a suitable search direction.

○ This strategy is not always the most effective.


○ If higher derivatives are used, the resulting iterative algorithm may perform better than the steepest descent method.
• Newton’s method (sometimes called the Newton-Raphson method) uses first and second derivatives and indeed does perform better than
the steepest descent method if the initial point is close to the minimizer.
• The idea behind this method is as follows.
• Given a starting point, we construct a quadratic approximation to the objective function that matches the first and second derivative
values at that point.
• We then minimize the approximate (quadratic) function instead of the original objective function.
• We use the minimizer of the approximate function as the starting point in the next step and repeat the procedure iteratively.
• If the objective function is quadratic, then the approximation is exact, and the method yields the true minimizer in one step.
• If, on the other hand, the objective function is not quadratic, then the approximation will provide only an estimate of the position of
the true minimizer.
• [Figure: an illustration of Newton's method]
• We can obtain a quadratic approximation to the twice continuously differentiable objective function f: ℝn → ℝ using the Taylor series expansion of f about the current point x(k), neglecting terms of order three and higher. We obtain

f(x) ≈ q(x) ≜ f(x(k)) + (x – x(k))⊤g(k) + ½(x – x(k))⊤F(x(k))(x – x(k)),

where g(k) = ∇f(x(k)) and F(x(k)) is the Hessian matrix of f at x(k).
• Applying the FONC to q yields

0 = ∇q(x) = g(k) + F(x(k))(x – x(k)).
• If F(x(k)) > 0, then q achieves a minimum at

x(k+1) = x(k) – F(x(k))–1g(k),

which is the Newton's method recursion (see the sketch below).
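A minimal sketch of one such iteration in Python/NumPy (the function name newton_step and the callables grad and hess are illustrative, not from the notes):

```python
import numpy as np

def newton_step(x, grad, hess):
    """One Newton iteration: x(k+1) = x(k) - F(x(k))^{-1} g(k).

    grad(x) returns the gradient g(x) = grad f(x) and hess(x) the Hessian F(x).
    The inverse is never formed explicitly; instead we solve F(x) d = -g(x).
    """
    g = grad(x)
    F = hess(x)
    d = np.linalg.solve(F, -g)   # Newton direction d = -F(x)^{-1} g(x)
    return x + d
```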
Example 1) Considering the starting point x(0) = [3, –1, 0, 1]⊤, use Newton’s method to minimize the Powell function:

f(x1, x2, x3, x4) = (x1 + 10x2)² + 5(x3 – x4)² + (x2 – 2x3)⁴ + 10(x1 – x4)⁴

(a numerical sketch of this example follows below).
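A hedged numerical sketch of Example 1, assuming the Powell function as written above; the gradient and Hessian below were derived by hand from that formula and should be checked against the notes before use:

```python
import numpy as np

def f(x):
    x1, x2, x3, x4 = x
    return (x1 + 10*x2)**2 + 5*(x3 - x4)**2 + (x2 - 2*x3)**4 + 10*(x1 - x4)**4

def grad(x):
    x1, x2, x3, x4 = x
    return np.array([
        2*(x1 + 10*x2) + 40*(x1 - x4)**3,
        20*(x1 + 10*x2) + 4*(x2 - 2*x3)**3,
        10*(x3 - x4) - 8*(x2 - 2*x3)**3,
        -10*(x3 - x4) - 40*(x1 - x4)**3,
    ])

def hess(x):
    x1, x2, x3, x4 = x
    a, b = (x1 - x4)**2, (x2 - 2*x3)**2
    return np.array([
        [2 + 120*a, 20.0,        0.0,        -120*a],
        [20.0,      200 + 12*b,  -24*b,      0.0],
        [0.0,       -24*b,       10 + 48*b,  -10.0],
        [-120*a,    0.0,         -10.0,      10 + 120*a],
    ])

x = np.array([3.0, -1.0, 0.0, 1.0])   # starting point x(0) from the example
for k in range(5):
    x = x + np.linalg.solve(hess(x), -grad(x))   # one Newton step
    print(k + 1, x, f(x))
```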
• Observe that the kth iteration of Newton’s method can be written in two steps as:

○ Step 1: Solve F(x(k))d(k) = –g(k) for d(k).
○ Step 2: Set x(k+1) = x(k) + d(k).
○ Step 1 requires the solution of an n × n system of linear equations.


• Thus, an efficient method for solving systems of linear equations is essential when using Newton’s method.
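Because the Hessian F(x(k)) is symmetric (and positive definite near a strict local minimizer), one common efficient choice for Step 1 is a Cholesky factorization rather than forming F(x(k))–1; a sketch using SciPy (the function name newton_direction is illustrative):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def newton_direction(F, g):
    """Solve F d = -g via a Cholesky factorization of the symmetric
    positive definite Hessian F (the inverse is never formed)."""
    c, lower = cho_factor(F)          # factor F once
    return cho_solve((c, lower), -g)  # reuse the factor to solve for d
```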
• As in the one-variable case, Newton’s method can also be viewed as a technique for iteratively solving the equation

g(x) = 0, where g(x) ≜ ∇f(x).
• In this case F(x) is the Jacobian matrix of g at x; that is, F(x) is the n × n matrix whose (i,j) entry is (∂gi/∂xj)(x), i,j = 1,2,…, n.
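A small self-contained sketch of this root-finding view, applied to an illustrative 2 × 2 system (the system itself is not from the notes):

```python
import numpy as np

def newton_solve(g, jac, x, tol=1e-10, max_iter=50):
    """Iteratively solve g(x) = 0 via x <- x - F(x)^{-1} g(x),
    where jac(x) returns the Jacobian F(x) of g at x."""
    for _ in range(max_iter):
        gx = g(x)
        if np.linalg.norm(gx) < tol:
            break
        x = x + np.linalg.solve(jac(x), -gx)
    return x

# Illustrative system: g1 = x1^2 + x2^2 - 1, g2 = x1 - x2
g = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])
jac = lambda x: np.array([[2*x[0], 2*x[1]], [1.0, -1.0]])
print(newton_solve(g, jac, np.array([1.0, 0.5])))  # converges to about [0.7071, 0.7071]
```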

Analysis of Newton’s Method:


• As in the one-variable case there is no guarantee that Newton’s algorithm heads in the direction of decreasing values of the objective function
if F(x(k)) is not positive definite.
• Moreover, even if F(x(k)) > 0, Newton’s method may not be a descent method; that is, it is possible that f(x(k+1)) ≥ f(x(k)).
○ For example, this may occur if our starting point x(0) is far away from the solution.
• Despite these drawbacks, Newton’s method has superior convergence properties when the starting point is near the solution.
○ The convergence analysis of Newton’s method when f is a quadratic function is straightforward.
• In fact, Newton’s method reaches the point x* such that ∇f(x*) = 0 in just one step starting from any initial point x(0).
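For instance, for the quadratic f(x) = ½x⊤Qx – b⊤x with Q = Q⊤ invertible, we have ∇f(x) = Qx – b and F(x) = Q, so a single Newton step from any x(0) gives (a short worked derivation):

```latex
x^{(1)} = x^{(0)} - F(x^{(0)})^{-1}\nabla f(x^{(0)})
        = x^{(0)} - Q^{-1}\bigl(Q x^{(0)} - b\bigr)
        = Q^{-1} b = x^{*},
\qquad\text{since } \nabla f(x^{*}) = Q x^{*} - b = 0.
```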

• To analyze the convergence of Newton’s method in the general case, let {x(k)} be the Newton’s method sequence for minimizing a function f:
ℝn → ℝ. We show that {x(k)} converges to the minimizer x* with order of convergence at least 2.
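For reference (this definition is standard but not spelled out in the notes), order of convergence at least 2 means that, for some constant c > 0 and all k sufficiently large,

```latex
\|x^{(k+1)} - x^{*}\| \;\le\; c \,\|x^{(k)} - x^{*}\|^{2}.
```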

Theorem 1)
Suppose that f ∈ C3 and x* ∈ ℝn is a point such that ∇f(x*) = 0 and F(x*) is invertible. Then, for all x(0) sufficiently close to x*, Newton’s
method is well-defined for all k and converges to x* with an order of convergence at least 2.

Note: In Theorem 1, we did not state that x* is a local minimizer. For example, if x* is a local maximizer, then provided that f ∈ C3 and F(x*) is invertible, Newton’s method would converge to x* if we start close enough to it.
• As stated in Theorem 1, Newton’s method has superior convergence properties if the starting point is near the solution.
○ However, the method is not guaranteed to converge to the solution if we start far away from it.
• In fact, it may not even be well-defined because the Hessian may be singular.
• In particular, the method may not be a descent method; that is, it is possible that f(x(k+1)) ≥ f(x(k)).
• Fortunately, it is possible to modify the algorithm such that the descent property holds.

Theorem 2)
Let {x(k)} be the sequence generated by Newton’s method for minimizing a given objective function f(x). If the Hessian F(x(k)) > 0 and g(k) = ∇f(x(k)) ≠ 0, then the search direction

d(k) = –F(x(k))–1g(k)

from x(k) to x(k+1) is a descent direction for f in the sense that there exists an ᾱ > 0 such that for all α ∈ (0, ᾱ),

f(x(k) + αd(k)) < f(x(k)).
• Theorem 2 motivates the following modification of Newton’s method:

x(k+1) = x(k) – αkF(x(k))–1g(k), where αk = arg minα≥0 f(x(k) – αF(x(k))–1g(k));

○ that is, at each iteration, we perform a line search in the direction –F(x(k))–1g(k). By Theorem 2 we conclude that the modified Newton’s method has the descent property; that is,

f(x(k+1)) < f(x(k)) whenever g(k) ≠ 0

(see the sketch below).
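A hedged Python sketch of the modified method; the exact minimizing αk is replaced here by a simple halving (backtracking) rule, an assumption that keeps the example short while still accepting only steps that decrease f:

```python
import numpy as np

def modified_newton(f, grad, hess, x, tol=1e-8, max_iter=100):
    """Newton's method with a line search along d = -F(x)^{-1} g(x).

    The step size alpha is found by halving from 1 until f decreases;
    this stands in for the exact arg-min line search in the notes.
    """
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        d = np.linalg.solve(hess(x), -g)      # Newton (descent) direction
        alpha = 1.0
        while f(x + alpha * d) >= f(x) and alpha > 1e-12:
            alpha *= 0.5                      # backtrack until descent
        x = x + alpha * d
    return x
```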
• A drawback of Newton’s method is that evaluation of F(x(k)) for large n can be computationally expensive.
○ Furthermore, we have to solve the set of n linear equations F(x(k))d(k) = –g(k).
○ Another source of potential problems in Newton’s method arises from the Hessian matrix not being positive definite.
• You might also see errors related to this (a non-positive-definite Hessian) when trying to use CPLEX/Gurobi to solve such problems! A simple way to detect it is sketched below.
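One simple way to detect the problem at a given iterate is to attempt a Cholesky factorization of the Hessian, which fails exactly when the (symmetric) matrix is not positive definite; a minimal sketch:

```python
import numpy as np

def is_positive_definite(F):
    """Return True if the symmetric matrix F is positive definite,
    by attempting a Cholesky factorization."""
    try:
        np.linalg.cholesky(F)
        return True
    except np.linalg.LinAlgError:
        return False
```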
