CRS MFAI 2024 Slides
C R Subramanian
▶ Want to find ~a ∈ R^d which minimises E(~a).
▶ When d = 1, E(a) becomes a continuous function of one variable a.
▶ The minimiser a* and the minimum value E(a*) can be computed in O(n) time (a sketch follows below).
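A minimal sketch of the d = 1 case in Python, assuming for illustration the squared-error objective E(a) = Σ_i (a − x_i)² over given points x_1, …, x_n (the data and the exact form of E are assumptions, not fixed by the slide); its minimiser is the sample mean, found in one O(n) pass:

```python
# Sketch under an assumed objective: E(a) = sum_i (a - x_i)^2.
# Its unique minimizer is a* = mean(x), so a* and E(a*) take O(n) time.

def minimize_E(xs):
    n = len(xs)
    a_star = sum(xs) / n                          # one O(n) pass for the mean
    E_min = sum((a_star - x) ** 2 for x in xs)    # one O(n) pass for E(a*)
    return a_star, E_min

print(minimize_E([1.0, 2.0, 4.0]))   # (2.333..., 4.666...)
```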
Limits and Continuity
▶ Algebra :
▶ f and g are defined over R.
▶ f′(a) and g′(a) exist for a ∈ R.
▶ (f ± g)′(a) = f′(a) ± g′(a).
▶ (f · g)′(a) = f(a) · g′(a) + f′(a) · g(a).
▶ (f/g)′(a) = (g(a) · f′(a) − f(a) · g′(a)) / g(a)², provided g(a) ≠ 0.
▶ Chain Rule :
▶ Suppose Range(f) ⊆ Domain(g) ; f′(a), g′(f(a)) exist.
▶ (g(f))′(a) exists and equals g′(f(a)) · f′(a).
▶ Familiar version : y = f(x), z = g(y), z = g(f(x)) ⇒ dz/dx = (dz/dy) · (dy/dx).
  (These rules are checked numerically in the sketch below.)
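A quick numeric sanity check of the product, quotient, and chain rules (a sketch; the choices f = sin, g = exp and the point a = 0.7 are illustrative, not from the slides), comparing each analytic formula against a central finite difference:

```python
import math

def num_deriv(h_fn, a, eps=1e-6):
    # Central finite-difference approximation of h'(a).
    return (h_fn(a + eps) - h_fn(a - eps)) / (2 * eps)

f, fp = math.sin, math.cos    # f and f'
g, gp = math.exp, math.exp    # g and g'
a = 0.7

# Product rule: (f.g)'(a) = f(a)g'(a) + f'(a)g(a)
print(num_deriv(lambda x: f(x) * g(x), a), f(a) * gp(a) + fp(a) * g(a))

# Quotient rule: (f/g)'(a) = (g(a)f'(a) - f(a)g'(a)) / g(a)^2
print(num_deriv(lambda x: f(x) / g(x), a), (g(a) * fp(a) - f(a) * gp(a)) / g(a) ** 2)

# Chain rule: (g(f))'(a) = g'(f(a)) . f'(a)
print(num_deriv(lambda x: g(f(x)), a), gp(f(a)) * fp(a))
```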
Differentiability
▶ E_1(x) = f″(c)(x − a)²/2 for some c ∈ (a, x).
▶ f(a + h) = f(a) + h · f′(a) + o(h) as h → 0.
▶ f(a + h) ≈ f(a) + h · f′(a) as h → 0 (illustrated in the sketch below).
▶ Differentiability ⇐⇒ local linearizability.
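A small sketch (f = exp at a = 1 is an illustrative choice) showing the o(h) behaviour: the linearization error divided by h tends to 0 as h → 0:

```python
import math

f, fp = math.exp, math.exp
a = 1.0

for h in (1e-1, 1e-2, 1e-3, 1e-4):
    err = f(a + h) - (f(a) + h * fp(a))   # remainder of the linear approximation
    print(f"h={h:.0e}  err={err:.3e}  err/h={err / h:.3e}")  # err/h -> 0, so err = o(h)
```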
Taylor’s Approximation Formula
▶ f : R^n → R^m, n, m ≥ 1.
▶ m = 1 - real-valued or scalar functions/fields.
▶ m > 1 - vector-valued or vector functions/fields.
▶ n = 1 and m > 1 - trajectories (say, of a projectile in 3-space).
▶ ~x ∈ R^d. ||~x||_2 = √(x_1² + … + x_d²) - the L_2-norm of ~x.
▶ ~x, ~y ∈ R^d. d_2(~x, ~y) = ||~x − ~y||_2 - the L_2-distance.
▶ f : R^n → R^m. ~a ∈ R^n, ~l ∈ R^m.
▶ Lt_{~x → ~a} f(~x) = ~l if, ∀ε > 0, ∃δ > 0 satisfying d_2(f(~x), ~l) ≤ ε whenever 0 < d_2(~x, ~a) ≤ δ.
▶ f is continuous at ~a if Lt_{~x → ~a} f(~x) = f(~a).
Scalar and Vector functions
▶ f : O → R, O ⊆ R^n, O is open.
▶ Suppose f′(~x) exists for every ~x ∈ B(~a, r).
▶ Derivative of f′ at ~a, if it exists, is the second-derivative f″(~a).
▶ Our Focus : Second-order partial derivatives - ∂²f/∂x_i∂x_j(~a).
▶ Hessian (denoted by ∇²f(~a)) is the matrix [∂²f/∂x_i∂x_j(~a)]_{i,j}.
▶ Hessian is symmetric if the second-order partial derivatives are continuous (see the finite-difference sketch below).
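A sketch computing the Hessian by central finite differences (the test function f(x, y) = x²y + y³ and the evaluation point are illustrative assumptions); since its second-order partials are continuous, the computed matrix comes out numerically symmetric:

```python
import numpy as np

def hessian_fd(f, a, eps=1e-4):
    # Approximates each entry ∂²f/∂x_i∂x_j(a) via a four-point central-difference stencil.
    n = len(a)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = eps
            ej = np.zeros(n); ej[j] = eps
            H[i, j] = (f(a + ei + ej) - f(a + ei - ej)
                       - f(a - ei + ej) + f(a - ei - ej)) / (4 * eps ** 2)
    return H

f = lambda v: v[0] ** 2 * v[1] + v[1] ** 3     # f(x, y) = x^2 y + y^3
H = hessian_fd(f, np.array([1.0, 2.0]))
print(H)                     # analytically [[2y, 2x], [2x, 6y]] = [[4, 2], [2, 12]]
print(np.allclose(H, H.T))   # True: the mixed partials agree
```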
Taylor’s approximation
▶ f : O → R, O ⊆ R^n, ~a ∈ O.
▶ Second-order partial derivatives are continuous.
▶ f(~a + ~p) = f(~a) + ~p^T · ∇f(~a) + (~p^T · ∇²f(~a) · ~p)/2 + … .
▶ f(~a + ~p) = f(~a) + ~p^T · ∇f(~a) + (~p^T · ∇²f(~η) · ~p)/2 for some ~η ∈ L(~a, ~a + ~p).
▶ f(~a + ~p) = f(~a) + Σ_{i=1}^n p_i · ∂f/∂x_i(~a) + (1/2) Σ_{i,j=1}^n p_i p_j · ∂²f/∂x_i∂x_j(~η).
▶ f(~a + ~p) = f(~a) + Σ_{i=1}^n p_i · ∂f/∂x_i(~a) + o(||~p||) as ~p → ~0.
▶ Linear approximation : f(~a + ~p) ≈ f(~a) + ~p^T · ∇f(~a) (compared against the quadratic model in the sketch below).
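A sketch comparing the linear and quadratic approximations (the test function, the point ~a, and the directions ~p are illustrative choices; ∇²f(~a) stands in for ∇²f(~η)): the quadratic model's error shrinks markedly faster as ~p → ~0:

```python
import numpy as np

# Illustrative test function with hand-computed gradient and Hessian.
f    = lambda v: np.exp(v[0]) + v[0] * v[1] ** 2
grad = lambda v: np.array([np.exp(v[0]) + v[1] ** 2, 2 * v[0] * v[1]])
hess = lambda v: np.array([[np.exp(v[0]), 2 * v[1]],
                           [2 * v[1],     2 * v[0]]])

a = np.array([0.0, 1.0])
for t in (1e-1, 1e-2, 1e-3):
    p = t * np.array([1.0, -1.0])
    lin  = f(a) + p @ grad(a)              # first-order (linear) model
    quad = lin + 0.5 * p @ hess(a) @ p     # second-order (quadratic) model
    print(f"t={t:.0e}  |f-lin|={abs(f(a + p) - lin):.2e}  "
          f"|f-quad|={abs(f(a + p) - quad):.2e}")
```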
Example (from Griva, Nash and Sofer)
▶ f is a scalar function.
▶ ~a is a local minimum for f ⇒ ∇f(~a)^T · ~p ≥ 0 for all ~p.
▶ ~a is a local minimum for f ⇒ ∇f(~a) = ~0.
▶ Necessary but not sufficient.
▶ ~a is a local minimum for f ⇒ ∇²f(~a) is positive semi-definite.
▶ Sufficiency :
▶ ∇f(~a) = ~0 and ∇²f(~a) is positive definite ⇒ ~a is a local minimum (verified numerically in the sketch below).
▶ A symmetric matrix B is positive semi-definite (B ⪰ 0) if x^T B x ≥ 0 for all x ∈ R^n.
▶ A symmetric matrix B is positive definite (B ≻ 0) if x^T B x > 0 for all x ≠ ~0.
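A sketch verifying both conditions for f(x, y) = x² + y² at the origin (the function is an illustrative assumption); for a symmetric B, checking eigenvalue signs is a convenient way to test ⪰ 0 and ≻ 0:

```python
import numpy as np

# f(x, y) = x^2 + y^2: gradient (2x, 2y) vanishes at the origin; Hessian is 2I.
a = np.array([0.0, 0.0])
g = 2 * a                                 # ∇f(a)
H = np.array([[2.0, 0.0],
              [0.0, 2.0]])                # ∇²f(a)

eigs = np.linalg.eigvalsh(H)              # eigenvalues of the symmetric Hessian
print(np.allclose(g, 0))                  # True: first-order condition ∇f(a) = 0
print(bool(np.all(eigs > 0)))             # True: H ≻ 0, so a is a local minimum
```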
Unconstrained Minimization : Newton’s Method
▶ f : R^n → R, a scalar function.
▶ Given oracle access to computing ∇f and ∇²f,
  Goal : To compute a local minimizer ~x* of f.
▶ Newton's Method for Minimizing :
  1. Start with an initial guess ~x.
  2. while ∇f(~x) ≠ ~0 and ∇²f(~x) ≻ 0 do
  3.   ~p ← −(∇²f(~x))^{−1} · ∇f(~x) ; ~x ← ~x + ~p. endwhile
  4. Return ~x.
▶ In practice, one replaces ∇f(~x) ≠ ~0 by ||∇f(~x)|| > ε, for small ε > 0 (a runnable sketch follows).
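A minimal runnable sketch of the loop above (the quadratic test function, tolerance ε, and iteration cap are assumptions; the ∇²f(~x) ≻ 0 test is realized via an attempted Cholesky factorization, which also supplies the linear solve for ~p):

```python
import numpy as np

def newton_minimize(grad, hess, x0, eps=1e-8, max_iter=50):
    # Newton's method: iterate x <- x - (hess(x))^{-1} grad(x)
    # while ||grad(x)|| > eps and hess(x) is positive definite.
    x = x0.astype(float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:
            break
        try:
            L = np.linalg.cholesky(hess(x))   # fails unless hess(x) is PD
        except np.linalg.LinAlgError:
            break
        # Solve hess(x) p = -grad(x) using the Cholesky factor L L^T.
        p = -np.linalg.solve(L.T, np.linalg.solve(L, g))
        x = x + p
    return x

# Assumed example: f(x) = (x0 - 1)^2 + 10 (x1 + 2)^2, minimizer (1, -2).
grad = lambda x: np.array([2 * (x[0] - 1), 20 * (x[1] + 2)])
hess = lambda x: np.array([[2.0, 0.0], [0.0, 20.0]])
print(newton_minimize(grad, hess, np.array([5.0, 5.0])))   # -> [ 1. -2.]
```

For a quadratic with ∇²f ≻ 0, a single Newton step lands exactly on the minimizer.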