Chapter 9 Lecture Notes
• We can obtain a quadratic approximation to a twice continuously differentiable objective function f: ℝn → ℝ using the Taylor series expansion of f about the current point x(k), neglecting terms of order three and higher. We obtain
q(x) = f(x(k)) + (x – x(k))⊤g(k) + ½(x – x(k))⊤F(x(k))(x – x(k)),
where g(k) = ∇f(x(k)) and F(x(k)) is the Hessian of f at x(k). Setting ∇q(x) = 0 yields the Newton iteration x(k+1) = x(k) – F(x(k))–1g(k).
Example 1) Starting from the point x(0) = [3, –1, 0, 1]⊤, use Newton’s method to minimize the Powell function:
f(x1, x2, x3, x4) = (x1 + 10x2)² + 5(x3 – x4)² + (x2 – 2x3)⁴ + 10(x1 – x4)⁴.
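• Below is a minimal Python sketch of this example (the gradient and Hessian are worked out by hand from the Powell function above; the helper names f, grad, and hess are ours, and the iteration uses the two-step form described next):

```python
import numpy as np

def f(x):
    # Powell function from Example 1
    return ((x[0] + 10*x[1])**2 + 5*(x[2] - x[3])**2
            + (x[1] - 2*x[2])**4 + 10*(x[0] - x[3])**4)

def grad(x):
    # gradient g(x) = grad f(x), differentiated term by term
    return np.array([
        2*(x[0] + 10*x[1]) + 40*(x[0] - x[3])**3,
        20*(x[0] + 10*x[1]) + 4*(x[1] - 2*x[2])**3,
        10*(x[2] - x[3]) - 8*(x[1] - 2*x[2])**3,
        -10*(x[2] - x[3]) - 40*(x[0] - x[3])**3,
    ])

def hess(x):
    # Hessian F(x); a = x2 - 2*x3 and b = x1 - x4 collect the quartic terms
    a, b = x[1] - 2*x[2], x[0] - x[3]
    return np.array([
        [2 + 120*b**2, 20,            0,             -120*b**2],
        [20,           200 + 12*a**2, -24*a**2,      0.0      ],
        [0,            -24*a**2,      10 + 48*a**2,  -10      ],
        [-120*b**2,    0,             -10,           10 + 120*b**2],
    ])

x = np.array([3.0, -1.0, 0.0, 1.0])      # starting point x(0)
for k in range(10):
    g = grad(x)
    d = np.linalg.solve(hess(x), -g)     # step 1: solve F(x(k)) d(k) = -g(k)
    x = x + d                            # step 2: x(k+1) = x(k) + d(k)
    print(k + 1, x, f(x))
```

The iterates approach the minimizer x* = 0. Note that F(x*) is singular for this function (the quartic terms contribute nothing at x* = 0), so the order-2 guarantee of Theorem 1 below does not apply at this particular minimizer.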
• Observe that the kth iteration of Newton’s method can be written in two steps as:
1. Solve F(x(k))d(k) = –g(k) for d(k).
2. Set x(k+1) = x(k) + d(k).
• In this case, F(x) is the Jacobian matrix of g = ∇f at x; that is, F(x) is the n × n matrix whose (i, j)th entry is (∂gi/∂xj)(x), i, j = 1, 2, …, n.
• To analyze the convergence of Newton’s method in the general case, let {x(k)} be the Newton’s method sequence for minimizing a function f:
ℝn → ℝ. We show that {x(k)} converges to the minimizer x* with order of convergence at least 2.
Theorem 1)
Suppose that f ∈ C3 and x* ∈ ℝn is a point such that ∇f(x*) = 0 and F(x*) is invertible. Then, for all x(0) sufficiently close to x*, Newton’s
method is well-defined for all k and converges to x* with an order of convergence at least 2.
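• A quick one-dimensional illustration of the order-2 rate (the test function f(x) = x – ln x is our choice for illustration, not from the notes; it has minimizer x* = 1 with f''(x*) = 1 > 0). For this f, the Newton recursion simplifies to x(k+1) = 2x(k) – x(k)², so the error satisfies e(k+1) = –e(k)² exactly:

```python
# Newton's method on f(x) = x - ln(x): f'(x) = 1 - 1/x, f''(x) = 1/x**2.
x = 0.6                              # x(0) close enough to x* = 1
for k in range(6):
    x = x - (1 - 1/x) / (1 / x**2)   # Newton step; algebraically 2x - x**2
    print(k + 1, abs(x - 1.0))       # the error squares at every iteration
```

The printed errors are 0.16, 0.0256, ≈6.6e–4, ≈4.3e–7, ≈1.8e–13, each the square of the previous one, after which the error sits at the level of machine precision.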
Note: In Theorem 1, we did not state that x* is a local minimizer. For example, if x* is a local maximizer, then provided that f ∈ C3 and F(x*) is invertible, Newton’s method would still converge to x* if we start close enough to it.
• As stated in Theorem 1, Newton’s method has superior convergence properties if the starting point is near the solution.
○ However, the method is not guaranteed to converge to the solution if we start far away from it.
• In fact, it may not even be well-defined because the Hessian may be singular.
• In particular, the method may not be a descent method; that is, it is possible that f(x(k+1)) ≥ f(x(k)).
• Fortunately, it is possible to modify the algorithm such that the descent property holds.
Theorem 2)
Let {x(k)} be the sequence generated by Newton’s method for minimizing a given objective function f(x). If the Hessian F(x(k)) > 0 and
g(k) = ∇f(x(k)) ≠ 0, then the search direction
d(k) = –F(x(k))–1g(k)
from x(k) to x(k+1) is a descent direction for f in the sense that there exists an ᾱ > 0 such that for all α ∈ (0, ᾱ),
f(x(k) + αd(k)) < f(x(k)).
○ Using Theorem 2, we can modify Newton’s method to x(k+1) = x(k) – αkF(x(k))–1g(k), where αk minimizes f(x(k) – αF(x(k))–1g(k)) over α ≥ 0; that is, at each iteration, we perform a line search in the direction –F(x(k))–1g(k). By Theorem 2, we conclude that the modified Newton’s method has the descent property; that is, f(x(k+1)) < f(x(k)) whenever g(k) ≠ 0. A sketch of this modification follows below.
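• Here is a minimal sketch of this modified (damped) Newton method, with a simple backtracking line search standing in for the exact minimization over α (the Armijo constant 1e-4 and the halving factor are conventional choices, not from the notes; termination of the inner loop relies on d(k) being a descent direction, which Theorem 2 guarantees when F(x(k)) > 0):

```python
import numpy as np

def modified_newton(f, grad, hess, x0, tol=1e-8, max_iter=100):
    """Newton's method with a backtracking line search along
    d(k) = -F(x(k))^(-1) g(k), so that f decreases at every iteration."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        d = np.linalg.solve(hess(x), -g)   # Newton direction
        alpha = 1.0
        # backtrack until a sufficient-decrease (Armijo) condition holds;
        # g @ d < 0 by Theorem 2, so this loop terminates
        while f(x + alpha*d) > f(x) + 1e-4 * alpha * (g @ d):
            alpha *= 0.5
        x = x + alpha*d
    return x
```

For instance, modified_newton(f, grad, hess, [3, -1, 0, 1]) reuses the Powell helpers sketched in Example 1.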
• A drawback of Newton’s method is that evaluation of F(x(k)) for large n can be computationally expensive.
○ Furthermore, we have to solve the set of n linear equations F(x(k))d(k) = –g(k).
○ Another source of potential problems in Newton’s method arises from the Hessian matrix not being positive definite; a common remedy is sketched after this list.
• You might see errors related to a non-positive-(semi)definite Hessian while trying to use solvers such as CPLEX or Gurobi on such problems as well!
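• One common remedy for a non-positive-definite F(x(k)) is a Levenberg–Marquardt-style modification: solve (F(x(k)) + μI)d(k) = –g(k) with μ > 0 just large enough that the shifted matrix is positive definite. A minimal sketch (the function name and the doubling schedule for μ are ours):

```python
import numpy as np

def shifted_newton_direction(F, g, mu0=1e-4):
    """Return d solving (F + mu*I)d = -g, increasing mu until
    F + mu*I is positive definite (i.e., Cholesky succeeds)."""
    mu = 0.0
    while True:
        try:
            L = np.linalg.cholesky(F + mu * np.eye(F.shape[0]))
            break                      # F + mu*I is positive definite
        except np.linalg.LinAlgError:
            mu = max(2 * mu, mu0)      # double the shift and retry
    # forward/back substitution with the Cholesky factor (F + mu*I = L L^T)
    y = np.linalg.solve(L, -g)
    return np.linalg.solve(L.T, y)
```

As μ grows large, d(k) approaches a scaled negative-gradient step, so this modification also blends Newton’s method toward gradient descent.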