
CS 726: Nonlinear Optimization 1

Lecture 3: Differentiability

Michael C. Ferris

Computer Sciences Department


University of Wisconsin-Madison

January 29, 2021



Background Material

• Since we spent part of the lecture on the Background Quiz, I would like you to review the whole of [Wright and Recht(2020), Chapter 2].
• For the definitions of local and global minimizers, see [Wright and Recht(2020), Section 2.1].
• Reading through this additional information, and the Bertsekas material mentioned last time, may be helpful.



Taxonomy of Solutions

2.1 A Taxonomy of Solutions to Optimization Problems

Before we can begin designing algorithms, we must determine what it means to solve an optimization problem. Suppose that f is a function mapping some domain D ⊂ R^n to the real line R. We have the following definitions.
• x* ∈ D is a local minimizer of f if there is a neighborhood N of x* such that f(x) ≥ f(x*) for all x ∈ N ∩ D.
• x* ∈ D is a global minimizer of f if f(x) ≥ f(x*) for all x ∈ D.
• x* ∈ D is a strict local minimizer if it is a local minimizer for some neighborhood N of x*, and in addition f(x) > f(x*) for all x ∈ N with x ≠ x*.
• x* is an isolated local minimizer if there is a neighborhood N of x* such that f(x) ≥ f(x*) for all x ∈ N ∩ D and, in addition, N contains no local minimizers other than x*.
• x* is the unique minimizer if it is the only global minimizer.

For the constrained optimization problem

    min_{x ∈ Ω} f(x),    (2.1)

where Ω ⊂ D ⊂ R^n is a closed set, we modify the terminology slightly to use the word “solution” rather than “minimizer.” That is, we have the analogous definitions of local solution and global solution.
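
To make the taxonomy concrete, here is a minimal numerical sketch (the test function and the grid search are my own illustration, not from the lecture or the text): f(x) = x^4 − 3x² + x has a global minimizer near x ≈ −1.30 and a strict local minimizer near x ≈ 1.13 that is not global.

```python
import numpy as np

# Hypothetical 1-D example: f has a local minimizer that is not global.
f = lambda x: x**4 - 3 * x**2 + x

# Coarse grid search to locate the two basins (illustration only).
xs = np.linspace(-3.0, 3.0, 100_001)
vals = f(xs)

x_glob = xs[np.argmin(vals)]            # global minimizer (approx. -1.30)
print(f"global minimizer ~ {x_glob:.3f}, f = {f(x_glob):.3f}")

# Restricting to x > 0 isolates the other basin; its bottom is a strict
# local (but not global) minimizer of f.
pos = xs > 0
x_loc = xs[pos][np.argmin(vals[pos])]   # local minimizer (approx. 1.13)
print(f"local minimizer  ~ {x_loc:.3f}, f = {f(x_loc):.3f}")
```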



Taylor’s Theorem and Differentiability

Convention
f : R^n → R.
  – Df(x) is a 1 × n row vector.
  – ∇f(x) = [Df(x)]^T (a column vector).
g : R^n → R^m.
  – Dg(x) ∈ R^{m×n}.
  – [Dg(x)]^T = ∇g(x) ∈ R^{n×m}.
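
As a quick sanity check on these shape conventions, here is a small finite-difference sketch (the helper Df and the test function are my own hypothetical choices, not course code):

```python
import numpy as np

def Df(f, x, h=1e-6):
    """Numerical derivative of f : R^n -> R, as a 1 x n row vector."""
    n = x.size
    row = [(f(x + h * e) - f(x - h * e)) / (2 * h) for e in np.eye(n)]
    return np.array(row).reshape(1, n)

f = lambda x: x[0]**2 + 3 * x[0] * x[1]   # f : R^2 -> R
x = np.array([1.0, 2.0])

D = Df(f, x)     # row vector, shape (1, 2)
grad = D.T       # gradient = transpose, a column vector of shape (2, 1)
print(D.shape, grad.shape)   # (1, 2) (2, 1)
print(grad.ravel())          # ~ [8. 3.], matching [2*x0 + 3*x1, 3*x0]
```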



Theorem (First order Taylor)
([Wright and Recht(2020), Theorem 2.1])
Let f ∈ C^1. Then
1. f(x + p) = f(x) + ∇f(x + γp)^T p for some γ ∈ (0, 1];
2. f(x + p) = f(x) + ∇f(x)^T p + o(‖p‖), where lim_{t↓0} o(t)/t = 0;
3. f(x + p) = f(x) + ∫_0^1 ∇f(x + γp)^T p dγ.
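
A short numerical check of item 2 (the test function and direction are my own choices): the remainder f(x + p) − f(x) − ∇f(x)^T p, divided by ‖p‖, should tend to zero as ‖p‖ ↓ 0.

```python
import numpy as np

# f(x) = exp(x1) + sin(x2), with its gradient written out by hand.
f = lambda x: np.exp(x[0]) + np.sin(x[1])
grad = lambda x: np.array([np.exp(x[0]), np.cos(x[1])])

x = np.array([0.5, 1.0])
d = np.array([1.0, -2.0]) / np.sqrt(5.0)   # fixed unit-norm direction

for t in [1e-1, 1e-2, 1e-3, 1e-4]:
    p = t * d                               # so ||p|| = t
    rem = f(x + p) - f(x) - grad(x) @ p
    print(f"||p|| = {t:.0e}, remainder/||p|| = {rem / t: .3e}")
# The ratios shrink roughly linearly in ||p||, consistent with o(||p||).
```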



Theorem (Second order Taylor)
([Wright and Recht(2020), Theorem 2.1])
Let f ∈ C^2. Then
1. f(x + p) = f(x) + ∇f(x)^T p + (1/2) p^T ∇²f(x + γp) p for some γ ∈ (0, 1];
2. f(x + p) = f(x) + ∇f(x)^T p + (1/2) p^T ∇²f(x) p + o(‖p‖²);
3. ∇f(x + p) = ∇f(x) + ∫_0^1 ∇²f(x + γp) p dγ.
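
The same experiment adapted to the second-order expansion (again with a test function of my own choosing): the remainder should now vanish faster than ‖p‖².

```python
import numpy as np

f = lambda x: np.exp(x[0]) + np.sin(x[1])
grad = lambda x: np.array([np.exp(x[0]), np.cos(x[1])])
hess = lambda x: np.diag([np.exp(x[0]), -np.sin(x[1])])  # Hessian by hand

x = np.array([0.5, 1.0])
d = np.array([1.0, -2.0]) / np.sqrt(5.0)

for t in [1e-1, 1e-2, 1e-3]:
    p = t * d
    quad = f(x) + grad(x) @ p + 0.5 * p @ hess(x) @ p  # 2nd-order model
    print(f"||p|| = {t:.0e}, remainder/||p||^2 = {(f(x + p) - quad) / t**2: .3e}")
```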



Theorem (First Order Sufficiency Condition)
Let f : R^n → R̄ (an extended real-valued function). Suppose f is convex and x̄ is a local minimizer of f, i.e., f(x̄) = min_{x ∈ N} f(x) for some neighborhood N of x̄. Then x̄ is a global minimizer of f over R^n.



Proof.
Suppose f is convex and x̄ is a local minimizer of f, and suppose ∃ x̂ which is better, i.e. f(x̂) < f(x̄). Construct an ε-ball around x̄ such that f(x) ≥ f(x̄) ∀x ∈ B_ε(x̄).

[Figure: the segment connecting x̂ and x̄, with x_λ inside the ε-ball]

Now define x_λ = x̄ + λ(x̂ − x̄). Note this is equivalent to λx̂ + (1 − λ)x̄, so x_λ lies on the segment joining x̄ and x̂. If we take λ > 0 sufficiently small, we can ensure x_λ ∈ B_ε(x̄). Therefore f(x_λ) ≥ f(x̄).



Proof (cont.)
But we can also bound f(x_λ) from above. By our definition of x_λ,

    f(x̄) ≤ f(x_λ) = f(λx̂ + (1 − λ)x̄).

Since f is convex,

    f(λx̂ + (1 − λ)x̄) ≤ λf(x̂) + (1 − λ)f(x̄).

By assumption f(x̂) < f(x̄), so

    f(x_λ) ≤ λf(x̂) + (1 − λ)f(x̄) < λf(x̄) + (1 − λ)f(x̄) = f(x̄).

Thus we have found a point in the chosen neighborhood that is strictly better than the local minimizer, a contradiction; x̄ must be a global minimizer.
It is worth noting that this result holds regardless of differentiability; it depends only on the convexity of f.



Uniqueness of Global Minimizers

Theorem (Uniqueness Result)
Let f : R^n → R̄ be strictly convex. Then, if f has a global minimizer, that global minimizer is unique.

Proof.
Let x^1, x^2 be distinct global minimizers. For 0 < λ < 1, because f is strictly convex,

    f(λx^1 + (1 − λ)x^2) < λf(x^1) + (1 − λ)f(x^2).

For both x^1 and x^2 to be global minimizers, f(x^1) = f(x^2), so we may replace f(x^2) with f(x^1) on the right-hand side:

    f(λx^1 + (1 − λ)x^2) < λf(x^1) + (1 − λ)f(x^1) = f(x^1).

This says ∃ a point x between x^1 and x^2 with f(x) < f(x^1), contradicting the assumption that x^1 is a global minimizer. Hence f must have a unique global minimizer.
As we will see throughout this text, a crucial quantity in optimization is the Lipschitz constant L for the gradient of f, which is defined to satisfy

    ‖∇f(x) − ∇f(y)‖ ≤ L‖x − y‖,  for all x, y ∈ dom(f).    (2.7)

We say that a continuously differentiable function f with this property is L-smooth or has L-Lipschitz gradients. We say that f is L_0-Lipschitz if

    |f(x) − f(y)| ≤ L_0‖x − y‖,  for all x, y ∈ dom(f).    (2.8)

From (2.2), we have

    f(y) − f(x) − ∇f(x)^T (y − x) = ∫_0^1 [∇f(x + γ(y − x)) − ∇f(x)]^T (y − x) dγ.

By using (2.7), we have

    [∇f(x + γ(y − x)) − ∇f(x)]^T (y − x) ≤ ‖∇f(x + γ(y − x)) − ∇f(x)‖ ‖y − x‖ ≤ Lγ‖y − x‖².

By substituting this bound into the previous integral, we obtain the following result.

Lemma 2.2  Given an L-smooth function f, we have for any x, y ∈ dom(f) that

    f(y) ≤ f(x) + ∇f(x)^T (y − x) + (L/2)‖y − x‖².    (2.9)

Lemma 2.2 asserts that f can be upper bounded by a quadratic function whose value at x is equal to f(x). When f is twice continuously differentiable, we can characterize the constant L in terms of the eigenvalues of the Hessian ∇²f(x).
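
A concrete quadratic sketch of these facts (the matrix A and sample points are my own choices, not from the text): for f(x) = (1/2) x^T A x with A symmetric positive semidefinite, ∇f(x) = Ax, so the smallest valid L in (2.7) is the largest eigenvalue of A, and the bound (2.9) can be checked directly.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M.T @ M                        # symmetric positive semidefinite

f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x
L = np.linalg.eigvalsh(A)[-1]      # largest eigenvalue = Lipschitz constant

for _ in range(5):
    x, y = rng.standard_normal(4), rng.standard_normal(4)
    upper = f(x) + grad(x) @ (y - x) + 0.5 * L * np.linalg.norm(y - x)**2
    assert f(y) <= upper + 1e-10   # quadratic upper bound of Lemma 2.2
print("quadratic upper bound (2.9) verified on random samples")
```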
Strict and Strong Convexity

Definition (Notation)
From now on, x_1, x_n, etc. will be used to refer to components of vectors, and x^1, x^n, etc. will be used to refer to distinct points.

Definition (Strictly convex)
A function f : R^n → R̄ is strictly convex if for all x, y with x ≠ y and all α ∈ (0, 1),

    f((1 − α)x + αy) < (1 − α)f(x) + αf(y).

Note this definition is identical to that of a convex function, except that the inequality is now strict (hence the endpoints α = 0, 1, where equality always holds, are excluded).
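
For contrast (my own example, not from the slides): f(x) = x^4 is strictly convex on R but not strongly convex, since its curvature vanishes at 0. The sketch below checks the strict inequality and shows the strong-convexity modulus candidate collapsing near 0.

```python
import numpy as np

# f(x) = x^4: strictly convex on R, but NOT strongly convex, because
# f''(x) = 12 x^2 -> 0 as x -> 0, so no rho > 0 works globally.
f = lambda x: x**4
rng = np.random.default_rng(3)

# Strict convexity: midpoint value strictly below the chord midpoint.
for _ in range(3):
    x, y = rng.uniform(-2.0, 2.0, size=2)
    assert f(0.5 * (x + y)) < 0.5 * f(x) + 0.5 * f(y) or np.isclose(x, y)

# Failure of strong convexity: the implied modulus near 0 goes to 0.
for t in [1.0, 0.1, 0.01]:
    x, y = -t, t
    # Largest rho with f(mid) <= avg - (rho/2)*(1/4)*|x - y|^2 at lambda = 1/2:
    rho = 8 * (0.5 * f(x) + 0.5 * f(y) - f(0.0)) / (x - y)**2
    print(f"t = {t:.2f}: best rho on [-t, t] <= {rho:.2e}")
```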



Strong Convexity

Let f : Ω → R and h : Ω → R^n, where Ω ⊂ R^n is an open convex set.

Definition
f is strongly convex (ρ) on Ω if ∃ρ > 0 such that ∀x, y ∈ Ω, λ ∈ [0, 1],

    f((1 − λ)x + λy) ≤ (1 − λ)f(x) + λf(y) − (ρ/2)λ(1 − λ)‖x − y‖².

Definition
h is strongly monotone (ρ) on Ω if ∃ρ > 0 such that ∀x, y ∈ Ω,

    ⟨h(x) − h(y), x − y⟩ ≥ ρ‖x − y‖².



Theorem (Strong convexity)
If f is continuously differentiable on Ω then the following are equivalent:
(a) f is strongly convex (ρ) on Ω;
(b) for all x, y ∈ Ω,
    f(y) ≥ f(x) + ⟨∇f(x), y − x⟩ + (ρ/2)‖x − y‖²;
(c) ∇f is strongly monotone (ρ) on Ω.
If f is twice continuously differentiable on Ω, then
(d) for all x, y, z ∈ Ω, ⟨x − y, ∇²f(z)(x − y)⟩ ≥ ρ‖x − y‖²
is equivalent to the above.
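
A quadratic example makes the equivalences tangible (this concrete check is mine, not from the lecture): for f(x) = (1/2) x^T A x with A symmetric positive definite, ∇²f(z) = A everywhere, so (d) holds with ρ = λ_min(A), and inequality (b) can then be verified at random point pairs.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = M.T @ M + 0.5 * np.eye(4)      # symmetric positive definite

f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x
rho = np.linalg.eigvalsh(A)[0]     # smallest eigenvalue = modulus in (d)

for _ in range(5):
    x, y = rng.standard_normal(4), rng.standard_normal(4)
    lower = f(x) + grad(x) @ (y - x) + 0.5 * rho * np.linalg.norm(x - y)**2
    assert f(y) >= lower - 1e-10   # inequality (b) of the theorem
print(f"rho = {rho:.3f}; lower bound (b) verified on random samples")
```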



Proof.
We show (a) ⟺ (b) ⟺ (c).
(a) ⟹ (b): The hypothesis gives

    [f(x + λ(y − x)) − f(x)]/λ ≤ f(y) − f(x) − (ρ/2)(1 − λ)‖x − y‖²,

so taking the limit as λ ↓ 0,

    ⟨∇f(x), y − x⟩ ≤ f(y) − f(x) − (ρ/2)‖x − y‖².



Proof (cont.)
(b) ⟹ (c): Applying (b) twice gives

    f(y) ≥ f(x) + ⟨∇f(x), y − x⟩ + (ρ/2)‖x − y‖²,
    f(x) ≥ f(y) + ⟨∇f(y), x − y⟩ + (ρ/2)‖x − y‖².

Adding these inequalities gives

    f(y) + f(x) ≥ f(x) + f(y) + ⟨∇f(x) − ∇f(y), y − x⟩ + ρ‖x − y‖²,

from which the result follows after rearranging.



Proof (cont.)
(c) ⟹ (b): The hypothesis gives

    ∫_0^1 ⟨∇f(x + t(y − x)) − ∇f(x), y − x⟩ dt ≥ ∫_0^1 ρt‖x − y‖² dt,

which implies the result, since by the integral form of Taylor's theorem the left-hand side equals f(y) − f(x) − ⟨∇f(x), y − x⟩, while the right-hand side equals (ρ/2)‖x − y‖².



Proof (cont.)
(b) ⟹ (a): Letting y = u and x = (1 − λ)u + λv in (b) gives

    f(u) ≥ f((1 − λ)u + λv) + ⟨∇f((1 − λ)u + λv), λ(u − v)⟩ + (ρ/2)‖λ(u − v)‖².    (1)

Also letting y = v and x = (1 − λ)u + λv in (b) implies

    f(v) ≥ f((1 − λ)u + λv) + ⟨∇f((1 − λ)u + λv), (1 − λ)(v − u)⟩ + (ρ/2)‖(1 − λ)(v − u)‖².    (2)

Adding (1 − λ) times (1) to λ times (2) makes the gradient terms cancel and gives the required result.



Proof (cont.)
To complete the proof, we assume that f is twice continuously differentiable on Ω.
(d) ⟹ (c): This follows from the hypothesis since

    ⟨∇f(x) − ∇f(y), x − y⟩ = ⟨∫_0^1 ∇²f(y + t(x − y))(x − y) dt, x − y⟩ ≥ ρ‖x − y‖².

(c) ⟹ (d): Let x, y, z ∈ Ω. Then z + λ(x − y) ∈ Ω for sufficiently small λ > 0, so

    ⟨x − y, ∇²f(z)(x − y)⟩ = ⟨x − y, ∇f(z + λ(x − y)) − ∇f(z)⟩/λ + o(1)
                           ≥ ρ‖x − y‖² + o(1).

The result follows in the limit as λ ↓ 0.



We can then give the definition of the strong convexity modulus m:

    f((1 − α)x + αy) + (1/2) m α(1 − α)‖x − y‖₂² ≤ (1 − α)f(x) + αf(y).

Strong convexity clearly implies strict convexity.

Lemma
If a function f is strictly convex, i.e.

    f(x + λ(y − x)) < f(x) + λ(f(y) − f(x))  for x ≠ y and λ ∈ (0, 1),

then

    f(y) > f(x) + ∇f(x)^T (y − x).



Proof.
Suppose instead that f(y) = f(x) + ⟨∇f(x), y − x⟩ for some x ≠ y. Let

    φ(t) = f(x + t(y − x)) − f(x) − t⟨∇f(x), y − x⟩.

Then φ is convex, the supposition says φ(1) = φ(0) = 0, and we note that φ′(0) = 0. The gradient inequality gives φ(t) ≥ φ(0) + tφ′(0) = 0, while convexity gives φ(t) ≤ (1 − t)φ(0) + tφ(1) = 0, so φ(t) = 0 for all t ∈ [0, 1]. This contradicts f being strictly convex, since f would be affine on the segment from x to y.
Hence f(y) > f(x) + ⟨∇f(x), y − x⟩ for all x ≠ y.
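
A quick numerical spot-check of the lemma (the strictly convex test function is my own pick): with f(x) = e^x in one dimension, the gap f(y) − f(x) − f′(x)(y − x) is strictly positive whenever x ≠ y.

```python
import numpy as np

f = np.exp                   # strictly convex on R; f' = exp as well
rng = np.random.default_rng(2)

for _ in range(5):
    x, y = rng.uniform(-2.0, 2.0, size=2)
    gap = f(y) - f(x) - f(x) * (y - x)   # f(y) minus first-order model at x
    print(f"x = {x:+.3f}, y = {y:+.3f}, gap = {gap:.3e}")
    assert gap > 0.0         # strict inequality for x != y
```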



References

S. J. Wright and B. Recht. Optimization for Data Analysis. In proof, 2020.

