Differentiation and Derivatives of Inverse Functions
The derivative
Definition. Suppose that f : (a, b) → R and a < c < b. Then f is differentiable at c with derivative f′(c) if
\[
\lim_{h \to 0} \frac{f(c+h) - f(c)}{h} = f'(c).
\]
The domain of f′ is the set of points c ∈ (a, b) for which this limit exists. If the limit exists for every c ∈ (a, b), then we say that f is differentiable on (a, b).
Graphically, this definition says that the derivative of f at c is the slope of the tangent line to y = f(x) at c, which is the limit as h → 0 of the slopes of the lines through (c, f(c)) and (c + h, f(c + h)).
We can also write
\[
f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c},
\]
since if x = c + h, the conditions 0 < |x − c| < δ and 0 < |h| < δ in the definitions of the limits are equivalent. The ratio
\[
\frac{f(x) - f(c)}{x - c}
\]
is undefined (0/0) at x = c, but it doesn't have to be defined in order for the limit as x → c to exist.
Like continuity, differentiability is a local property. That is, the differentiability of a function f at c and the value of the derivative, if it exists, depend only on the values of f in an arbitrarily small neighborhood of c. In particular, if f : A → R where A ⊂ R, then we can define the differentiability of f at any interior point c ∈ A, since there is an open interval (a, b) ⊂ A with c ∈ (a, b).
Example. The function f : R → R defined by f(x) = x² is differentiable on R with derivative f′(x) = 2x, since
\[
\lim_{h \to 0} \frac{(c+h)^2 - c^2}{h} = \lim_{h \to 0} \frac{h(2c+h)}{h} = \lim_{h \to 0} (2c + h) = 2c.
\]
Note that in computing the derivative, we first cancel the common factor of h, which is valid since h ≠ 0 in the definition of the limit, and then set h = 0 to evaluate the limit. This procedure would be inconsistent if we didn't use limits.
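As an informal numerical check (not part of the text), the difference quotient of f(x) = x² can be evaluated for shrinking h; the helper name below is illustrative:

```python
# Difference quotient (f(c + h) - f(c)) / h for f(x) = x**2 at c = 3;
# as h shrinks, the quotient approaches f'(3) = 2*3 = 6.
def diff_quotient(f, c, h):
    """Forward difference quotient of f at c with increment h."""
    return (f(c + h) - f(c)) / h

f = lambda x: x**2
for h in [1e-1, 1e-3, 1e-6]:
    print(diff_quotient(f, 3.0, h))  # approaches 6
```

Algebraically the quotient is exactly 2c + h, so the numerical values differ from 6 by exactly h (up to rounding).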
Example. The function f : R → R defined by
\[
f(x) = \begin{cases} x^2 & \text{if } x > 0, \\ 0 & \text{if } x \le 0, \end{cases}
\]
is differentiable on R. For x > 0, the derivative is f′(x) = 2x as above, and for x < 0, we have f′(x) = 0. At 0, we consider the limit
\[
\lim_{h \to 0} \frac{f(h) - f(0)}{h} = \lim_{h \to 0} \frac{f(h)}{h},
\]
which is 0 since |f(h)/h| ≤ |h|, so f′(0) = 0. In contrast, the function f(x) = |x| is not differentiable at 0, since
\[
\lim_{h \to 0} \frac{|h|}{h}
\]
does not exist. (The right limit is 1 and the left limit is −1.)
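The one-sided quotients can be checked directly (an illustration, not from the text):

```python
# One-sided difference quotients of f(x) = |x| at 0: the right-hand
# quotient is 1 and the left-hand quotient is -1, so the two-sided
# limit, and hence f'(0), does not exist.
f = lambda x: abs(x)
right = (f(1e-8) - f(0)) / 1e-8     # quotient from the right: 1.0
left = (f(-1e-8) - f(0)) / (-1e-8)  # quotient from the left: -1.0
print(right, left)
```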
Example. The function f : R → R defined by f(x) = |x|^{1/2} is differentiable at x ≠ 0 with
\[
f'(x) = \frac{\operatorname{sgn} x}{2|x|^{1/2}}.
\]
If c > 0, then using the difference of two squares to rationalize the numerator, we get
\[
\begin{aligned}
\lim_{h \to 0} \frac{f(c+h) - f(c)}{h}
&= \lim_{h \to 0} \frac{(c+h)^{1/2} - c^{1/2}}{h} \\
&= \lim_{h \to 0} \frac{(c+h) - c}{h\left((c+h)^{1/2} + c^{1/2}\right)} \\
&= \lim_{h \to 0} \frac{1}{(c+h)^{1/2} + c^{1/2}} \\
&= \frac{1}{2c^{1/2}}.
\end{aligned}
\]
If c < 0, we get the analogous result with a negative sign. However, f is not differentiable at 0, since
\[
\lim_{h \to 0^+} \frac{f(h) - f(0)}{h} = \lim_{h \to 0^+} \frac{1}{h^{1/2}}
\]
does not exist.
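As a numerical sketch (my addition, not from the text), the formula f′(c) = sgn c / (2|c|^{1/2}) can be checked at a positive point, and the blow-up of the quotient at 0 observed:

```python
import math

# For f(x) = |x|**(1/2), check the quotient at c = 4 against
# f'(4) = 1 / (2 * sqrt(4)) = 0.25, then watch the quotient at 0,
# which equals 1/sqrt(h), grow without bound as h -> 0+.
f = lambda x: math.sqrt(abs(x))

c = 4.0
quotient = (f(c + 1e-6) - f(c)) / 1e-6
print(quotient)               # close to 0.25

for h in [1e-2, 1e-4, 1e-6]:
    print((f(h) - f(0)) / h)  # = 1/sqrt(h), blows up
```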
Example. The function f : R → R defined by f(x) = x^{1/3} is differentiable at x ≠ 0 with
\[
f'(x) = \frac{1}{3x^{2/3}}.
\]
To prove this result, we use the identity for the difference of cubes,
\[
a^3 - b^3 = (a - b)(a^2 + ab + b^2),
\]
and get for c ≠ 0 that
\[
\begin{aligned}
\lim_{h \to 0} \frac{f(c+h) - f(c)}{h}
&= \lim_{h \to 0} \frac{(c+h)^{1/3} - c^{1/3}}{h} \\
&= \lim_{h \to 0} \frac{(c+h) - c}{h\left((c+h)^{2/3} + (c+h)^{1/3} c^{1/3} + c^{2/3}\right)} \\
&= \lim_{h \to 0} \frac{1}{(c+h)^{2/3} + (c+h)^{1/3} c^{1/3} + c^{2/3}} \\
&= \frac{1}{3c^{2/3}}.
\end{aligned}
\]
However, f is not differentiable at 0, since
\[
\lim_{h \to 0} \frac{f(h) - f(0)}{h} = \lim_{h \to 0} \frac{1}{h^{2/3}}
\]
does not exist.
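The same kind of numerical check applies here (an illustration, not from the text); a sign-safe cube-root helper avoids Python's complex result for fractional powers of negative floats:

```python
import math

# Real cube root via copysign, since (-8.0) ** (1/3) is complex in Python.
def cbrt(x):
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

# Check the quotient at c = 8 against f'(8) = 1 / (3 * 8**(2/3)) = 1/12,
# then observe the quotient at 0, which equals h**(-2/3), blowing up.
c, h = 8.0, 1e-6
quotient = (cbrt(c + h) - cbrt(c)) / h
print(quotient)                 # close to 1/12

print((cbrt(h) - cbrt(0)) / h)  # = h**(-2/3), grows without bound
```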
Example. Define f : R → R by
\[
f(x) = \begin{cases} x^2 \sin(1/x) & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}
\]
Figure 1. A plot of the function y = x2 sin(1/x) and a detail near the origin
with the parabolas y = ±x2 shown in red.
Then f is differentiable on R. (See Figure 1.) It follows from the product and chain rules proved below that f is differentiable at x ≠ 0 with derivative
\[
f'(x) = 2x \sin\frac{1}{x} - \cos\frac{1}{x}.
\]
Moreover, f is differentiable at 0 with f′(0) = 0, since
\[
\lim_{h \to 0} \frac{f(h) - f(0)}{h} = \lim_{h \to 0} h \sin\frac{1}{h} = 0.
\]
In this example, lim_{x→0} f′(x) does not exist, so although f is differentiable on R, its derivative f′ is not continuous at 0.
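Both claims can be illustrated numerically (a sketch, not part of the text): the quotient h·sin(1/h) is squeezed to 0, while the −cos(1/x) term in f′ takes values near ±1 arbitrarily close to 0.

```python
import math

# f(x) = x**2 * sin(1/x) for x != 0, f(0) = 0.
def f(x):
    return x * x * math.sin(1.0 / x) if x != 0 else 0.0

# The quotient f(h)/h = h*sin(1/h) satisfies |h*sin(1/h)| <= |h| -> 0,
# so f'(0) = 0.
for h in [1e-2, 1e-4, 1e-6]:
    print((f(h) - f(0)) / h)

# f'(x) = 2x*sin(1/x) - cos(1/x) oscillates: at x = 1/(20*pi) the cosine
# term is +1 (so f' is near -1), at x = 1/(21*pi) it is -1 (f' near +1).
fprime = lambda x: 2 * x * math.sin(1 / x) - math.cos(1 / x)
print(fprime(1 / (2 * math.pi * 10)), fprime(1 / (math.pi * 21)))
```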
Proof. Suppose, for definiteness, that f′(c) > 0 (otherwise, consider −f). By the continuity of f′, there exists an open interval U = (a, b) containing c on which f′ > 0. It follows from Theorem 8.36 that f is strictly increasing on U. Writing
\[
V = f(U) = (f(a), f(b)),
\]
we see that f|_U : U → V is one-to-one and onto, so f has a local inverse on V, which proves the first part of the theorem.
It remains to prove that the local inverse (f|_U)^{−1}, which we denote by f^{−1} for short, is differentiable. First, since f is differentiable at c, we have
\[
f(c+h) = f(c) + f'(c)h + r(h),
\]
where the remainder r satisfies
\[
\lim_{h \to 0} \frac{r(h)}{h} = 0.
\]
Since f′(c) > 0, there exists δ > 0 such that
\[
|r(h)| \le \frac{1}{2} f'(c)|h| \quad \text{for } |h| < \delta.
\]
It follows from the differentiability of f that, if |h| < δ,
\[
\begin{aligned}
f'(c)|h| &= |f(c+h) - f(c) - r(h)| \\
&\le |f(c+h) - f(c)| + |r(h)| \\
&\le |f(c+h) - f(c)| + \frac{1}{2} f'(c)|h|.
\end{aligned}
\]
Absorbing the term proportional to |h| on the right-hand side of this inequality into the left-hand side and writing
\[
f(c+h) = f(c) + k,
\]
we find that
\[
\frac{1}{2} f'(c)|h| \le |k| \quad \text{for } |h| < \delta.
\]
Choosing δ > 0 small enough that (c − δ, c + δ) ⊂ U, we can express h in terms of k as
\[
h = f^{-1}(f(c) + k) - f^{-1}(f(c)).
\]
Using this expression in the expansion of f evaluated at c + h,
\[
f(c+h) = f(c) + f'(c)h + r(h),
\]
we get that
\[
f(c) + k = f(c) + f'(c)\left( f^{-1}(f(c)+k) - f^{-1}(f(c)) \right) + r(h).
\]
One can show that Theorem 8.51 remains true under the weaker hypothesis that the derivative exists and is nonzero in an open neighborhood of c, but in practice we almost always apply the theorem to continuously differentiable functions.
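The conclusion of the inverse function theorem, that (f^{−1})′(f(c)) = 1/f′(c), can be illustrated numerically (a sketch, not from the text); the example function and the bisection-based inverse below are my own choices:

```python
# Illustration of (f^-1)'(f(c)) = 1/f'(c) for f(x) = x**3 + x, which has
# f'(x) = 3*x**2 + 1 > 0 and is therefore strictly increasing on R.
def f(x):
    return x**3 + x

def f_inverse(y, lo=-10.0, hi=10.0):
    """Invert f on [lo, hi] by bisection; valid since f is increasing."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Difference quotient of f^-1 at f(1) = 2; f'(1) = 4, so the quotient
# should be close to 1/4.
c, k = 1.0, 1e-6
quotient = (f_inverse(f(c) + k) - f_inverse(f(c))) / k
print(quotient)  # close to 0.25
```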
The inverse function theorem generalizes to functions of several variables, f : A ⊂ Rⁿ → Rⁿ, with a suitable generalization of the derivative of f at c as the linear map f′(c) : Rⁿ → Rⁿ that approximates f near c. A different proof of the existence of a local inverse is required in that case, since one cannot use monotonicity arguments.
As an example of the application of the inverse function theorem, we consider a simple problem from bifurcation theory.

Example. Consider the transcendental equation
\[
y = x - k\left(e^x - 1\right),
\]
where k ∈ R is a constant parameter. Suppose that we want to solve for x ∈ R given y ∈ R. If y = 0, then an obvious solution is x = 0. The inverse function theorem applied to the continuously differentiable function f(x; k) = x − k(eˣ − 1) implies that there are neighborhoods U, V of 0 (depending on k) such that the equation has a unique solution x ∈ U for every y ∈ V provided that the derivative f_x(0; k) = 1 − k is nonzero, that is, provided that k ≠ 1.
Figure 2. Graph of y = f (x; k) for the function in Example 8.52: (a) k = 0.5
(green); (b) k = 1 (blue); (c) k = 1.5 (red). When y is sufficiently close to
zero, there is a unique solution for x in some neighborhood of zero unless k = 1.
Figure 3. Solutions for x of the equation x − k(eˣ − 1) = 0, plotted as a function of k.

Alternatively, we can fix a value of y, say y = 0, and ask how the solutions of the corresponding equation for x,
\[
x - k\left(e^x - 1\right) = 0,
\]
depend on the parameter k. Figure 3 plots the solutions for x as a function of k for
0.2 ≤ k ≤ 2. The equation has two different solutions for x unless k = 1. The branch
of nonzero solutions crosses the branch of zero solutions at the point (x, k) = (0, 1), called a bifurcation point. The implicit function theorem, which is a generalization of the inverse function theorem, implies that a necessary condition for a solution (x_0, k_0) of the equation f(x; k) = 0 to be a bifurcation point, meaning that the equation fails to have a unique solution branch x = g(k) in some neighborhood of (x_0, k_0), is that f_x(x_0; k_0) = 0.
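The two solution branches can be computed numerically (a sketch, not from the text; the bracketing intervals below were chosen by inspecting the sign of the function):

```python
import math

# g(x; k) = x - k*(e**x - 1) vanishes at x = 0 for every k, and for
# k != 1 it has a second, nonzero root; at k = 1 the two roots merge,
# consistent with the bifurcation condition f_x(0; 1) = 1 - e**0 = 0.
def g(x, k):
    return x - k * (math.exp(x) - 1.0)

def bisect(a, b, k, n=100):
    """Find a root of g(.; k) in [a, b], assuming a sign change there."""
    for _ in range(n):
        m = (a + b) / 2
        if g(a, k) * g(m, k) <= 0:
            b = m
        else:
            a = m
    return (a + b) / 2

root_small_k = bisect(0.5, 3.0, k=0.5)   # positive nonzero root for k = 0.5
root_large_k = bisect(-2.0, -0.5, k=1.5) # negative nonzero root for k = 1.5
print(root_small_k, root_large_k)
```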