Yan Min-Math 1023+1024
Min Yan
Department of Mathematics
Hong Kong University of Science and Technology
Contents
1 Limit
1.1 Limit of Sequence
1.1.1 Arithmetic Rule
1.1.2 Sandwich Rule
1.1.3 Some Basic Limits
1.1.4 Order Rule
1.1.5 Subsequence
1.2 Rigorous Definition of Sequence Limit
1.2.1 Rigorous Definition
1.2.2 The Art of Estimation
1.2.3 Rigorous Proof of Limits
1.2.4 Rigorous Proof of Limit Properties
1.3 Criterion for Convergence
1.3.1 Monotone Sequence
1.3.2 Application of Monotone Sequence
1.3.3 Cauchy Criterion
1.4 Infinity
1.4.1 Divergence to Infinity
1.4.2 Arithmetic Rule for Infinity
1.4.3 Unbounded Monotone Sequence
1.5 Limit of Function
1.5.1 Properties of Function Limit
1.5.2 Limit of Composition Function
1.5.3 One Sided Limit
1.5.4 Limit of Trigonometric Function
1.6 Rigorous Definition of Function Limit
1.6.1 Rigorous Proof of Basic Limits
1.6.2 Rigorous Proof of Properties of Limit
1.6.3 Relation to Sequence Limit
1.6.4 More Properties of Function Limit
1.7 Continuity
1.7.1 Meaning of Continuity
1.7.2 Intermediate Value Theorem
2 Differentiation
2.1 Linear Approximation
2.1.1 Derivative
2.1.2 Basic Derivative
2.1.3 Constant Approximation
2.1.4 One Sided Derivative
2.2 Property of Derivative
2.2.1 Arithmetic Combination of Linear Approximation
2.2.2 Composition of Linear Approximation
2.2.3 Implicit Linear Approximation
2.3 Application of Linear Approximation
2.3.1 Monotone Property and Extrema
2.3.2 Detect the Monotone Property
2.3.3 Compare Functions
2.3.4 First Derivative Test
2.3.5 Optimization Problem
2.4 Mean Value Theorem
2.4.1 Mean Value Theorem
2.4.2 Criterion for Constant Function
2.4.3 L’Hospital’s Rule
2.5 High Order Approximation
2.5.1 Taylor Expansion
2.5.2 High Order Approximation by Substitution
2.5.3 Combination of High Order Approximations
2.5.4 Implicit High Order Differentiation
2.5.5 Two Theoretical Examples
2.6 Application of High Order Approximation
2.6.1 High Derivative Test
2.6.2 Convex Function
2.6.3 Sketch of Graph
2.7 Numerical Application
2.7.1 Remainder Formula
2.7.2 Newton’s Method
3 Integration
3.1 Area and Definite Integral
3.1.1 Area below Non-negative Function
3.1.2 Definite Integral of Continuous Function
3.1.3 Property of Area and Definite Integral
3.2 Rigorous Definition of Integral
4 Series
4.1 Series of Numbers
4.1.1 Sum of Series
4.1.2 Convergence of Series
4.2 Comparison Test
4.2.1 Integral Test
4.2.2 Comparison Test
Limit
$$x_1, x_2, \dots, x_n, \dots.$$
The n-th term of the sequence is $x_n$, and $n \in \mathbb{Z}$ is the index of the term. In this
course, we will always assume that all the terms $x_n \in \mathbb{R}$ are real numbers. Here are
some examples
$$x_n = n: \quad 1, 2, 3, \dots, n, \dots;$$
$$y_n = 2: \quad 2, 2, 2, \dots, 2, \dots;$$
$$z_n = \frac{1}{n}: \quad 1, \frac{1}{2}, \dots, \frac{1}{n}, \dots;$$
$$u_n = (-1)^n: \quad 1, -1, 1, \dots, (-1)^n, \dots;$$
$$v_n = \sin n: \quad \sin 1, \sin 2, \sin 3, \dots, \sin n, \dots.$$
Note that the index does not have to start from 1. For example, the sequence
$u_n$ actually starts from n = 0 (or any even integer). What we are interested in is only
the behaviour when n is large. Moreover, a sequence does not have to be given by
a formula. For example, the decimal expansions of π give a sequence
$$w_n: \quad 3, 3.1, 3.14, 3.141, 3.1415, \dots.$$
If n is the number of digits after the decimal point, then the sequence wn starts at
n = 0. Alternatively, we can introduce the notation called the integral part of n:
[Figure: plots of the sequences $x_n$, $w_n$, $y_n$, $v_n$, $z_n$, $u_n$ against n.]
Now we look at the trend of the examples above as n gets bigger. We find that
xn gets bigger and can become as big as we want. On the other hand, yn remains
constant, zn gets smaller and can become as small as we want. This means that
yn approaches 2 and zn approaches 0. Moreover, un and vn jump around and do
not approach anything. Finally, wn is equal to π up to the n-th decimal place, and
therefore approaches π.
$$\lim_{n\to\infty} x_n = l.$$
A sequence diverges if it does not approach a specific finite number when n gets
bigger.
The equality in the proposition means that $x_n$ converges if and only if $y_n$ converges.
Moreover, the two limits have equal value when both converge.
Example 1.1.1. The sequence $\frac{1}{\sqrt{n+2}}$ is obtained from $\frac{1}{\sqrt{n}}$ by deleting the first
two terms. By $\lim_{n\to\infty} \frac{1}{\sqrt{n}} = 0$ and Proposition 1.1.2, we get
$\lim_{n\to\infty} \frac{1}{\sqrt{n}} = \lim_{n\to\infty} \frac{1}{\sqrt{n+2}} = 0$.
The example assumes $\lim_{n\to\infty} \frac{1}{\sqrt{n}} = 0$, which is supposed to be intuitively obvious.
Although mathematics is inspired by intuition, a critical feature of mathematics
is rigorous logic. This means that we need to be clear what basic facts are assumed
in any argument. For the moment, we will always assume that we already know
$\lim_{n\to\infty} c = c$ and $\lim_{n\to\infty} \frac{1}{n^p} = 0$ for p > 0. After the two limits are rigorously
established in Examples 1.2.2 and 1.2.3, the conclusions based on the two limits
become solid.
$$\lim_{n\to\infty} (x_n \pm y_n) = l \pm k, \quad \lim_{n\to\infty} c x_n = cl, \quad \lim_{n\to\infty} x_n y_n = kl, \quad \lim_{n\to\infty} \frac{x_n}{y_n} = \frac{l}{k},$$
Exercise 1.1.2. Suppose xn and yn converge. Explain that limn→∞ xn yn = 0 implies either
limn→∞ xn = 0 or limn→∞ yn = 0. Moreover, explain that the conclusion fails if xn and
yn are not assumed to converge.
n2 + a1 n + a0 n2 + c1 n + c0
2 2
n+a n+c
1. − . 3. − .
n+b n+d n+b n+d
2 2
n2 + a n2 + c
n2 + a1 n + a0 n2 + c1 n + c0
2. 2 − 2 . 4. − .
n + b1 n + b0 n + d1 n + d0 n+b n+d
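Example 1.1.3. By 2n − 3 > n for sufficiently big n (in fact, n > 3 is enough), we
have
$$0 < \frac{1}{\sqrt{2n-3}} < \frac{1}{\sqrt{n}}.$$
Then by $\lim_{n\to\infty} 0 = \lim_{n\to\infty} \frac{1}{\sqrt{n}} = 0$ and the sandwich rule, we get
$\lim_{n\to\infty} \frac{1}{\sqrt{2n-3}} = 0$.
On the other hand, for sufficiently big n, we have n + 1 < 2n and $n - 1 > \frac{n}{2}$,
and therefore
$$0 < \frac{\sqrt{n+1}}{n-1} < \frac{\sqrt{2n}}{\frac{n}{2}} = \frac{2\sqrt{2}}{\sqrt{n}}.$$
By $\lim_{n\to\infty} \frac{2\sqrt{2}}{\sqrt{n}} = 2\sqrt{2} \lim_{n\to\infty} \frac{1}{\sqrt{n}} = 0$ (arithmetic rule used) and the sandwich
rule, we get $\lim_{n\to\infty} \frac{\sqrt{n+1}}{n-1} = 0$.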
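The second sandwich in Example 1.1.3 can be checked numerically. The following is only an illustrative sketch, not a proof; the function name `sandwich_terms` is a hypothetical helper introduced here, and the sample indices are arbitrary.

```python
import math


def sandwich_terms(n):
    """Return (lower, middle, upper) for 0 < sqrt(n+1)/(n-1) < 2*sqrt(2)/sqrt(n).

    The estimate uses n + 1 < 2n and n - 1 > n/2, so it is meant for n >= 3.
    """
    middle = math.sqrt(n + 1) / (n - 1)
    upper = 2 * math.sqrt(2) / math.sqrt(n)
    return 0.0, middle, upper


# Both the middle term and the upper bound shrink as n grows.
for n in (3, 10, 100, 10_000, 1_000_000):
    lower, middle, upper = sandwich_terms(n)
    print(n, middle, upper, lower < middle < upper)
```

The printed flags only confirm the inequality at the sampled indices; the sandwich rule is what turns the inequality into a statement about the limit.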
(−1)n 1 cos n
2. . 5. . 8. p √ .
n n + (−1)n n + sin n
√ cos n (−1)n
sin n 6. . 9. p .
3. . n + (−1)n n + (−1)n
n
√ √
2 + (−1)n 3 3 n+2 n + sin n
10. √ . 13. . 16. √ .
3
n2 − 2 cos n 2n + (−1)n 3 n − cos n
√
sin n + (−1)n cos n n sin n + cos n (−1)n (n + 1)
11. √ . 14. . 17. .
n + (−1)n n−1 n2 + (−1)n+1
√
| sin n + cos n| n + sin n (−1)n (n + 10)2 − 1010
12. . 15. . 18. .
n n + cos 2n 10(−1)n n2 − 5
Exercise 1.1.24. Suppose $\lim_{n\to\infty} x_n = 1$. Use the arithmetic rule and the sandwich rule
to prove that, if $x_n \le 1$, then $\lim_{n\to\infty} x_n^p = 1$. Of course we expect the condition $x_n \le 1$
to be unnecessary. See Example 1.1.21.
For the case 0 < a ≤ 1, let $b = \frac{1}{a} \ge 1$. Then by the arithmetic rule,
$$\lim_{n\to\infty} \sqrt[n]{a} = \lim_{n\to\infty} \frac{1}{\sqrt[n]{b}} = \frac{1}{\lim_{n\to\infty} \sqrt[n]{b}} = \frac{1}{1} = 1.$$
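This implies
$$0 \le x_n < \frac{\sqrt{2}}{\sqrt{n-1}}.$$
By $\lim_{n\to\infty} \frac{\sqrt{2}}{\sqrt{n-1}} = 0$ (see Example 1.1.1 or 1.1.3) and the sandwich rule, we get
$\lim_{n\to\infty} x_n = 0$. This further implies
$$\lim_{n\to\infty} \sqrt[n]{n} = \lim_{n\to\infty} x_n + 1 = 1.$$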
Example 1.1.9. The following “n-th root type” limits can be compared with the
limits in Examples 1.1.7 and 1.1.8
$$1 < \sqrt[n]{n+1} < \sqrt[n]{2n} = \sqrt[n]{2}\,\sqrt[n]{n},$$
$$1 < n^{\frac{1}{n+1}} < \sqrt[n]{n},$$
$$1 < (n^2 - n)^{\frac{n}{n^2-1}} < (n^2)^{\frac{n}{n^2/2}} = (\sqrt[n]{n})^4.$$
By Examples 1.1.7, 1.1.8 and the arithmetic rule, the sequences on the right converge
to 1. Then by the sandwich rule, we get
$$\lim_{n\to\infty} \sqrt[n]{n+1} = \lim_{n\to\infty} n^{\frac{1}{n+1}} = \lim_{n\to\infty} (n^2 - n)^{\frac{n}{n^2-1}} = 1.$$
Exercise 1.1.25. Prove that if $a \le x_n \le b$ for some constants a, b > 0 and sufficiently big
n, then $\lim_{n\to\infty} \sqrt[n]{x_n} = 1$.
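1. $n^{\frac{1}{2n}}$.
2. $n^{\frac{2}{n}}$.
3. $n^{\frac{c}{n}}$.
4. $(n+1)^{\frac{c}{n}}$.
5. $(an+b)^{\frac{c}{n}}$.
6. $(an^2+b)^{\frac{c}{n}}$.
7. $(\sqrt{n}+1)^{\frac{c}{n}}$.
8. $(n-2)^{\frac{1}{n+3}}$.
9. $(an+b)^{\frac{c}{n+d}}$.
10. $(an+b)^{\frac{cn}{n^2+dn+e}}$.
11. $(an^2+b)^{\frac{c}{n+d}}$.
12. $(an^2+b)^{\frac{cn+d}{n^2+en+f}}$.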
By $\lim_{n\to\infty} \frac{1}{nb} = 0$ and the sandwich rule, we get $\lim_{n\to\infty} a^n = 0$.
If −1 < a < 0, then 0 < |a| < 1 and $\lim_{n\to\infty} |a^n| = \lim_{n\to\infty} |a|^n = 0$. By Exercise
1.1.11, we get $\lim_{n\to\infty} a^n = 0$.
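$$0 < \frac{4^n}{n!} = \frac{4\cdot 4\cdot 4\cdot 4}{1\cdot 2\cdot 3\cdot 4} \cdot \frac{4}{5} \cdot \frac{4}{6} \cdots \frac{4}{n} \le \frac{4\cdot 4\cdot 4\cdot 4}{1\cdot 2\cdot 3\cdot 4} \cdot \frac{4}{n} = \frac{4^5}{4!} \cdot \frac{1}{n}.$$
By $\lim_{n\to\infty} \frac{4^5}{4!} \cdot \frac{1}{n} = 0$ and the sandwich rule, we get $\lim_{n\to\infty} \frac{4^n}{n!} = 0$.
Exercise 1.1.35 suggests how to show the limit in general.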
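The estimate for $\frac{4^n}{n!}$ can be checked with exact rational arithmetic. This is a minimal sketch; the names `term` and `bound` are hypothetical helpers, and the check covers only finitely many n, so it illustrates the inequality rather than proving it.

```python
from fractions import Fraction
from math import factorial


def term(n):
    """The sequence 4^n / n! as an exact fraction."""
    return Fraction(4**n, factorial(n))


def bound(n):
    """The upper bound (4^5 / 4!) * (1 / n) used in the sandwich argument."""
    return Fraction(4**5, factorial(4)) / n


# The bound is valid for n >= 5, where the factors 4/5, 4/6, ... are below 1.
for n in (5, 10, 20, 40):
    print(n, float(term(n)), float(bound(n)), term(n) <= bound(n))
```

Using `Fraction` avoids floating point rounding, so the comparison at n = 5 (where the two sides coincide) is decided exactly.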
Exercise 1.1.33. Show that $\lim_{n\to\infty} n^2 a^n = 0$ for |a| < 1 in two ways. The first is by using
the ideas from Examples 1.1.11 and 1.1.12. The second is by using $\lim_{n\to\infty} n a^n = 0$ for
|a| < 1.
Exercise 1.1.34. Show that $\lim_{n\to\infty} n^{5.4} a^n = 0$ for |a| < 1. What about $\lim_{n\to\infty} n^{-5.4} a^n$?
What about $\lim_{n\to\infty} n^p a^n$?
Exercise 1.1.35. Show that $\lim_{n\to\infty} \frac{a^n}{n!} = 0$ for a = 5.4 and a = −5.4.
Exercise 1.1.36. Show that $\lim_{n\to\infty} \frac{a^n}{\sqrt{n!}} = 0$ for a = 5.4 and a = −5.4.
Exercise 1.1.37. Show that $\lim_{n\to\infty} \frac{n!\,a^n}{(2n)!} = 0$ for any a.
Exercise 1.1.39. Find the limits. Some convergence depends on a and p. You may try
some special values of a and p first.
1. $n^p a^n$.
2. $\frac{a^n}{n^p}$.
3. $\frac{n^p}{a^n}$.
4. $\frac{1}{n^p a^n}$.
5. $\frac{n^p}{n!}$.
6. $\frac{n^p a^n}{n!}$.
7. $\frac{n^p}{n!\,a^n}$.
8. $\frac{n^p}{\sqrt{n!}}$.
9. $\frac{n^p a^n}{\sqrt{n!}}$.
10. $\frac{n^p a^n}{\sqrt[3]{n!}}$.
11. $\frac{n!\,a^n}{(2n)!}$.
12. $\frac{n!\,n^p a^n}{(2n)!}$.
n2 + 3n + 5n n2 + n3n + 5! n2 + n! + (n − 1)!
1. . 3. . 5. .
n! n! 3n − n! + (n − 1)!
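Exercise 1.1.42. Prove $\frac{n!}{n^n} < \frac{1}{n}$ and $\frac{(n!)^2}{(2n)!} < \frac{1}{n+1}$ for n > 2. Then use this to prove
$\lim_{n\to\infty} \frac{n!}{n^n} = \lim_{n\to\infty} \frac{(n!)^2}{(2n)!} = 0$. What about $\lim_{n\to\infty} \frac{(n!)^k}{(kn)!}$ where k ≥ 2 is an integer?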
By taking $y_n = l$, we get the following special cases of the property for a converging
sequence $x_n$.
1. If xn ≤ l for sufficiently big n, then limn→∞ xn ≤ l.
Exercise 1.1.43. Explain how to get the following special cases of the order rule.
1. If xn ≥ l for sufficiently big n, then limn→∞ xn ≥ l.
Example 1.1.14. By $\lim_{n\to\infty} \frac{2n^2+n}{n^2-n+1} = 2$ and the order rule, we know
$1 < \frac{2n^2+n}{n^2-n+1} < 3$ for sufficiently big n. This implies
$1 < \sqrt[n]{\frac{2n^2+n}{n^2-n+1}} < \sqrt[n]{3}$ for
sufficiently big n. By $\lim_{n\to\infty} \sqrt[n]{3} = 1$ and the sandwich rule, we get
$$\lim_{n\to\infty} \sqrt[n]{\frac{2n^2+n}{n^2-n+1}} = 1.$$
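Example 1.1.15. We showed $\lim_{n\to\infty} \sqrt[n]{3^n - 2^n} = 3$ in Example 1.1.10. Here we use
a different method, with the help of the order rule.
By $\sqrt[n]{3^n - 2^n} = 3\sqrt[n]{1 - \left(\frac{2}{3}\right)^n}$, we only need to find $\lim_{n\to\infty} \sqrt[n]{1 - \left(\frac{2}{3}\right)^n}$. By
Example 1.1.11, we have $\lim_{n\to\infty} \left(1 - \left(\frac{2}{3}\right)^n\right) = 1$. By the order rule, therefore,
we have
$$\frac{1}{2} < 1 - \left(\frac{2}{3}\right)^n < 2$$
for sufficiently big n. This implies that
$$\sqrt[n]{\frac{1}{2}} < \sqrt[n]{1 - \left(\frac{2}{3}\right)^n} < \sqrt[n]{2}$$
for sufficiently big n. Then by Example 1.1.7 and the sandwich rule, we get
$\lim_{n\to\infty} \sqrt[n]{1 - \left(\frac{2}{3}\right)^n} = 1$, and we conclude that
$$\lim_{n\to\infty} \sqrt[n]{3^n - 2^n} = 3 \lim_{n\to\infty} \sqrt[n]{1 - \left(\frac{2}{3}\right)^n} = 3.$$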
Exercise 1.1.44. Prove that if $\lim_{n\to\infty} x_n = l > 0$, then $\lim_{n\to\infty} \sqrt[n]{x_n} = 1$. Moreover, find
a sequence satisfying $\lim_{n\to\infty} x_n = 0$ and $\lim_{n\to\infty} \sqrt[n]{x_n} = 1$. Can we have $x_n$ converging
to 0 and $\sqrt[n]{x_n}$ converging to 0.32?
Exercise 1.1.50. Suppose a, b, c > 0, and p, q, r are polynomials with positive leading coefficients.
Find the limit of $\sqrt[n]{p(n)a^n + q(n)b^n + r(n)c^n}$.
Example 1.1.16. The sequence $x_n = \frac{3^n (n!)^2}{(2n)!}$ satisfies
$$\lim_{n\to\infty} \frac{x_n}{x_{n-1}} = \lim_{n\to\infty} \frac{3n^2}{2n(2n-1)} = \frac{3}{4} = 0.75.$$
By the order rule, we have $\frac{x_n}{x_{n-1}} < 0.8$ for sufficiently big n, say for n > N (in fact,
N = 8 is enough). Then for n > N, we have
$$0 < x_n = \frac{x_n}{x_{n-1}} \cdot \frac{x_{n-1}}{x_{n-2}} \cdots \frac{x_{N+1}}{x_N} \cdot x_N < 0.8^{n-N} x_N = C \cdot 0.8^n, \quad C = 0.8^{-N} x_N.$$
By Example 1.1.11, we have $\lim_{n\to\infty} 0.8^n = 0$. Since C is a constant, by the sandwich
rule, we get $\lim_{n\to\infty} x_n = 0$.
Exercises 1.1.52 and 1.1.53 summarise the idea of the example.
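The claim that N = 8 is enough in Example 1.1.16 can be checked exactly. The sketch below uses hypothetical helper names `x` and `ratio`, and it only tests finitely many indices.

```python
from fractions import Fraction
from math import factorial


def x(n):
    """x_n = 3^n (n!)^2 / (2n)! as an exact fraction."""
    return Fraction(3**n * factorial(n) ** 2, factorial(2 * n))


def ratio(n):
    """x_n / x_{n-1}, which simplifies to 3n^2 / (2n(2n-1))."""
    return x(n) / x(n - 1)


N = 8
# The ratio equals 0.8 exactly at n = 8 and is strictly smaller afterwards.
print(ratio(8), float(ratio(9)), float(ratio(100)))

# Compare x_n with the geometric bound 0.8^(n-N) * x_N for a few n > N.
for n in (9, 20, 50):
    print(n, float(x(n)), float(Fraction(4, 5) ** (n - N) * x(N)))
```

The strict inequality starts at n = 9, which matches the condition n > N with N = 8.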
Exercise 1.1.52. Prove that if $\left|\frac{x_n}{x_{n-1}}\right| \le c$ for a constant c < 1, then $x_n$ converges to 0.
Exercise 1.1.53. Prove that if $\lim_{n\to\infty} \frac{x_n}{x_{n-1}} = l$ and |l| < 1, then $x_n$ converges to 0.
(n!)2 n √ 2
(n!)p n
2. a . 5. n!an . 8. (n!)p an . 11. a .
(3n)! ((2n)!)q
(n!)3 n an
2
an n5 (n!)p n
3. a . 6. . 9. . 12. a .
(3n)! n! (n!)p ((2n)!)q
1.1.5 Subsequence
A subsequence is obtained by choosing infinitely many terms from a sequence. We
denote a subsequence by
$$x_{2k}: \quad x_2, x_4, x_6, x_8, \dots, x_{2k}, \dots,$$
$$x_{2k-1}: \quad x_1, x_3, x_5, x_7, \dots, x_{2k-1}, \dots,$$
$$x_{2^k}: \quad x_2, x_4, x_8, x_{16}, \dots, x_{2^k}, \dots,$$
$$x_{k!}: \quad x_1, x_2, x_6, x_{24}, \dots, x_{k!}, \dots.$$
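Example 1.1.17. Since $\frac{1}{n^2}$ is a subsequence of $\frac{1}{n}$, $\lim_{n\to\infty} \frac{1}{n} = 0$ implies $\lim_{n\to\infty} \frac{1}{n^2} = 0$.
We also know $\lim_{n\to\infty} \frac{1}{\sqrt{n}} = 0$ implies $\lim_{n\to\infty} \frac{1}{n} = 0$ but not vice versa.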
n n
n + (−1)n 3
Example 1.1.18. The sequence is the union of the odd subsequence
n − (−1)n 2
(2k − 1) − 3 2k − 4 2k + 3
= and the even subsequence . Both subsequences con-
(2k − 1) + 2 2k + 1 2k − 2
verge to 1, either by direct computation, or by regarding them also as subsequences
n−4 n+3 n + (−1)n 3
of and , which converge to 1. Then we conclude limn→∞ = 1.
n+1 n−2 n − (−1)n 2
Example 1.1.19. The sequence $(-1)^n$ has one subsequence $(-1)^{2k} = 1$ converging to
1 and another subsequence $(-1)^{2k-1} = -1$ converging to −1. Since the two limits
are different, by Proposition 1.1.6, the sequence $(-1)^n$ diverges.
a
satisfying sin n2k ≤ − cos . Therefore the two subsequences cannot converge to the
2
same limit. As a result, the sequence sin na diverges.
Now for general a that is not an integer multiple of π, we have a = 2N π ± b for
an integer N and b satisfying 0 < b < π. Then we have sin na = ± sin nb. We have
shown that sin nb diverges, so that sin na diverges.
We conclude that sin na converges if and only if a is an integer multiple of π.
√ √ 1 1
1. n! + 1 − n! − 1. 3. ((n + 1)!) n! . 5. (n!) (n+1)! .
1 1 2 −1 2 1
2. (n!) n! . 4. ((n + (−1)n )!) n! . 6. (2n + 3 n ) n2 .
n √ p √ nπ nπ
1. 2(−1) . 6. n n + (−1)n − n . 11. sin cos .
2 3
(−1)n
2. n n . n 1 nπ
7. (2(−1) n + 3n ) n . n sin
n 3 .
3. n(−1) . n (−1) nn 1
12. nπ
8. (2 + 3 )n . n cos +2
(−1)n n + 3 2
4. . nπ
n − (−1)n 2 9. tan . nπ
3 n − sin
13. 3 .
(−1)n n2 nπ nπ
5. . 10. (−1)n sin . n + 2 cos
n3 − 1 3 2
Exercise 1.1.57. Find all a such that the sequence cos na converges.
and similarly $\lim_{k\to\infty} {x'_k}^N = 1$. Then by the sandwich rule, we get $\lim_{k\to\infty} {x'_k}^p = 1$.
Similar proof shows that $\lim_{k\to\infty} {x''_k}^p = 1$. Since the sequence $x_n^p$ is the union of
two subsequences ${x'_k}^p$ and ${x''_k}^p$, by Proposition 1.1.6 again, we get $\lim_{n\to\infty} x_n^p = 1$.
Exercise 1.1.58. Suppose $\lim_{n\to\infty} x_n = 1$ and $y_n$ is bounded. Prove that $\lim_{n\to\infty} x_n^{y_n} = 1$.
Exercise 1.1.59. Suppose $\lim_{n\to\infty} x_n = l > 0$. By applying Example 1.1.21 to the sequence
$\frac{x_n}{l}$, prove that $\lim_{n\to\infty} x_n^p = l^p$.
For another example, $\lim_{n\to\infty} \frac{2^n}{n!} = 0$ means the following implications
$$n > 10 \implies \left|\frac{2^n}{n!} - 0\right| < 0.0003,$$
$$n > 20 \implies \left|\frac{2^n}{n!} - 0\right| < 0.0000000000005,$$
$$\vdots$$
Note that the relation between N (measuring the bigness of n) and ε (measuring
the smallness of $|x_n - l|$) may be different for different limits.
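The two sample implications for $\frac{2^n}{n!}$ can be verified exactly. In this sketch (with a hypothetical helper `term`), it is enough to test the first index after each threshold, because the consecutive ratio is $\frac{2}{n+1}$, so the terms decrease once n ≥ 2.

```python
from fractions import Fraction
from math import factorial


def term(n):
    """2^n / n! as an exact fraction."""
    return Fraction(2**n, factorial(n))


# n > 10 starts at n = 11, and n > 20 starts at n = 21.
print(float(term(11)), term(11) < Fraction(3, 10_000))
print(float(term(21)), term(21) < Fraction(5, 10**13))
```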
The problem with infinitely many implications is that our language is finite.
In practice, we cannot verify all the implications one by one. Even if we have
verified the truth of the first one million implications, there is no guarantee that
the one million and the first implication is true. To mathematically establish the
truth of all implications, we have to formulate one finite statement that includes the
consideration for all N and all ε.
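[Figure: the terms $x_1, x_2, x_3, x_4, \dots$ plotted against n; for n > N, the terms $x_{N+1}, x_{N+2}, x_{N+3}, \dots, x_n, x_{n+1}, \dots$ lie between l − ε and l + ε.]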
Example 1.2.1. For any ε > 0, choose $N = \frac{1}{\epsilon}$. Then
$$n > N \implies \left|\frac{1}{n} - 0\right| = \frac{1}{n} < \frac{1}{N} = \epsilon.$$
This verifies the rigorous definition of $\lim_{n\to\infty} \frac{1}{n} = 0$.
By applying the rigorous definition to ε = 0.1, 0.01, . . . , we recover the infinitely
many implications we wish to achieve. This justifies the rigorous definition of limit.
In fact, the right side is always true, regardless of the left side.
For any ε > 0, choose $N = \frac{1}{\epsilon^{1/p}}$. Then
$$n > N \implies \left|\frac{1}{n^p} - 0\right| = \frac{1}{n^p} < \frac{1}{N^p} = \epsilon.$$
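Example 1.2.4. To rigorously prove $\lim_{n\to\infty} \frac{n^2-1}{n^2+1} = 1$, for any ε > 0, we have
$$n > N = \sqrt{\frac{2}{\epsilon} - 1} \implies \left|\frac{n^2-1}{n^2+1} - 1\right| = \frac{2}{n^2+1} < \frac{2}{N^2+1} = \frac{2}{\left(\frac{2}{\epsilon} - 1\right) + 1} = \epsilon.$$
Therefore the sequence converges to 1.
How did we choose $N = \sqrt{\frac{2}{\epsilon} - 1}$? We want to achieve $\left|\frac{n^2-1}{n^2+1} - 1\right| < \epsilon$. Since
this is equivalent to $\frac{2}{n^2+1} < \epsilon$, which we can solve to get $n > \sqrt{\frac{2}{\epsilon} - 1}$, choosing
$N = \sqrt{\frac{2}{\epsilon} - 1}$ should work.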
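The choice of N in Example 1.2.4 can be tried out for a few sample values of ε. This is a sketch only: `N_for` and `error` are hypothetical names, and the code assumes 0 < ε ≤ 2 so that the square root is real.

```python
import math


def N_for(eps):
    """N = sqrt(2/eps - 1); assumes 0 < eps <= 2."""
    return math.sqrt(2 / eps - 1)


def error(n):
    """|(n^2 - 1)/(n^2 + 1) - 1|, which equals 2/(n^2 + 1)."""
    return abs((n * n - 1) / (n * n + 1) - 1)


for eps in (0.1, 0.01, 0.001):
    N = N_for(eps)
    first_n = math.floor(N) + 1  # smallest integer n with n > N
    print(eps, N, first_n, error(first_n) < eps)
```

Testing sample indices does not replace the proof, but it shows concretely how each ε produces its own N.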
Example 1.2.5. To rigorously prove the limit in Example 1.1.5, we estimate the
difference between the sequence and the expected limit
$$|\sqrt{n+2} - \sqrt{n} - 0| = \frac{(n+2) - n}{\sqrt{n+2} + \sqrt{n}} < \frac{2}{\sqrt{n}}.$$
This shows that for any ε > 0, it is sufficient to have $\frac{2}{\sqrt{n}} < \epsilon$, or $n > \frac{4}{\epsilon^2}$. In other
words, we should choose $N = \frac{4}{\epsilon^2}$.
The discussion above is the analysis of the problem, which you may write on
your scratch paper. The formal rigorous argument you are supposed to present is
the following: For any ε > 0, choose $N = \frac{4}{\epsilon^2}$. Then
$$n > N \implies |\sqrt{n+2} - \sqrt{n} - 0| = \frac{2}{\sqrt{n+2} + \sqrt{n}} < \frac{2}{\sqrt{n}} < \frac{2}{\sqrt{N}} = \epsilon.$$
While the exact solution can be found, the formula for N is rather complicated. For
more complicated examples, it may not even be possible to find the formula for the
exact solution.
We note that finding the exact solution of $|x_n - l| < \epsilon$ is the same as finding
N = N(ε), such that
$$n > N \iff |x_n - l| < \epsilon.$$
However, in order to rigorously prove the limit, only the ⟹ direction is needed. The
weaker goal can often be achieved in a much simpler way.
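Example 1.2.6. Consider the limit in Example 1.1.2. For n > 1, we have
$$\left|\frac{2n^2+n}{n^2-n+1} - 2\right| = \frac{3n-2}{n^2-n+1} < \frac{3n}{n^2-n} = \frac{3}{n-1}.$$
Since $\frac{3}{n-1} < \epsilon$ implies $\left|\frac{2n^2+n}{n^2-n+1} - 2\right| < \epsilon$, and $\frac{3}{n-1} < \epsilon$ is equivalent to
$n > \frac{3}{\epsilon} + 1$, we find that choosing $N = \frac{3}{\epsilon} + 1$ is sufficient
$$n > N = \frac{3}{\epsilon} + 1 \implies \left|\frac{2n^2+n}{n^2-n+1} - 2\right| = \frac{3n-2}{n^2-n+1} < \frac{3}{n-1} < \frac{3}{N-1} = \epsilon.$$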
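The simple choice $N = \frac{3}{\epsilon} + 1$ in Example 1.2.6 can be compared with the first index at which the error actually drops below ε. The helper names `error`, `simple_N` and `first_n_below` are hypothetical, and the search is a brute-force illustration, not part of the proof.

```python
from fractions import Fraction


def error(n):
    """|(2n^2 + n)/(n^2 - n + 1) - 2| as an exact fraction."""
    return abs(Fraction(2 * n * n + n, n * n - n + 1) - 2)


def simple_N(eps):
    """The easy sufficient choice N = 3/eps + 1 from the estimate 3/(n-1)."""
    return 3 / Fraction(eps) + 1


def first_n_below(eps):
    """First index n >= 1 with error(n) < eps, found by direct search."""
    n = 1
    while error(n) >= eps:
        n += 1
    return n


for eps in (Fraction(1, 10), Fraction(1, 100)):
    print(eps, simple_N(eps), first_n_below(eps))
```

For these sample values the simple N is only slightly larger than the index found by search, which illustrates that a rough estimate can still be nearly sharp.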
Exercise 1.2.1. Show that $\left|\frac{n^2-1}{n^2+1} - 1\right| < \frac{2}{n}$ and then rigorously prove $\lim_{n\to\infty} \frac{n^2-1}{n^2+1} = 1$.
The key for the rigorous proof of limits is to find a simple and good enough
estimation. We emphasize that there is no need to find the best estimation. Any
estimation that can fulfill the rigorous definition of limit is good enough.
Everyday life is full of good enough estimations. Mastering the art of such
estimations is very useful for not just learning calculus, but also for making smart
judgement in real life.
Example 1.2.7. If a bottle is 20% bigger in size than another bottle, how much bigger
is it in volume?
The exact formula is the cube of the comparison in size
$$(1 + 0.2)^3 = 1 + 3 \cdot 0.2 + 3 \cdot 0.2^2 + 0.2^3.$$
Since $3 \cdot 0.2 = 0.6$, $3 \cdot 0.2^2 = 0.12$, and $0.2^3$ is much smaller than 0.1, the bottle is a
little more than 72% bigger in volume.
Example 1.2.8. The 2013 GDP per capita is 9,800USD for China and 53,100USD for
the United States, in terms of PPP (purchasing power parity). The percentages of
annual GDP growth for the three years up to 2013 are 9.3, 7.7, 7.7 for China
and 1.8, 2.8, 1.9 for the United States. How many years do we expect China to
take to catch up to the United States?
First we need to estimate how much faster the Chinese GDP is growing compared
to the United States. The comparison for 2013 is
$$\frac{1 + 0.077}{1 + 0.019} \approx 1 + (0.077 - 0.019) = 1 + 0.058.$$
Similarly, we get the (approximate) comparisons 1 + 0.075 and 1 + 0.049 for the
other two years. Among the three comparisons, we may choose a more conservative
1 + 0.05. This means that we assume Chinese GDP per capita grows 5% faster than
the United States for the next many years.
Based on the assumption of 5%, the number of years n for China to catch up to
the United States is obtained exactly by solving
$$(1 + 0.05)^n = 1 + n \cdot 0.05 + \frac{n(n-1)}{2} 0.05^2 + \cdots + 0.05^n = \frac{53,100}{9,800} \approx 5.5.$$
If we use $1 + n \cdot 0.05$ to approximate $(1 + 0.05)^n$, then we get $n \approx \frac{5.5 - 1}{0.05} = 90$.
However, 90 years is too pessimistic because for n = 90, the third term $\frac{n(n-1)}{2} 0.05^2$
is quite sizable, so that $1 + n \cdot 0.05$ is not a good approximation of $(1 + 0.05)^n$.
As an exercise for the art of estimation, we try to avoid using a calculator in
getting a better estimation. By $\frac{53,100}{9,800} \approx 2.3^2$, we may solve
$$n = 2m, \quad (1 + 0.05)^m = 1 + m \cdot 0.05 + \frac{m(m-1)}{2} 0.05^2 + \cdots + 0.05^m \approx 2.3.$$
We get $n \approx 2 \cdot \frac{2.3 - 1}{0.05} = 52$. Since $\frac{m(m-1)}{2} 0.05^2$ is still sizable for m = 26 (but
giving much better approximation than n = 90), the actual n should be somewhat
smaller than 52. We try n = 40 and estimate $(1 + 0.05)^n$ by the first three terms
$$(1 + 0.05)^{40} \approx 1 + 40 \cdot 0.05 + \frac{40 \cdot 39}{2} 0.05^2 \approx 5.$$
So it looks like somewhere between 40 and 45 is a good estimation.
We conclude that, if Chinese GDP per capita growth is 5% (a very optimistic
assumption) faster than the United States in the next 50 years, then China will
catch up to the United States in 40 some years.
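As a cross-check of the hand estimates in Example 1.2.8, the following sketch compares truncated binomial sums with a direct computation under the same 5% assumption. The names `partial_binomial` and `n_exact` are hypothetical. Since every dropped term of the binomial expansion is positive, each truncated sum underestimates $(1 + 0.05)^n$, so estimates of n based on truncation tend to come out on the high side.

```python
import math

ratio = 53_100 / 9_800  # GDP per capita ratio used in the example
g = 0.05                # assumed yearly advantage in growth


def partial_binomial(n, g, terms):
    """Sum of the first `terms` terms of the binomial expansion of (1 + g)^n."""
    return sum(math.comb(n, k) * g**k for k in range(terms))


# Solve (1 + g)^n = ratio directly with logarithms.
n_exact = math.log(ratio) / math.log(1 + g)

print("three-term estimate of 1.05^40:", partial_binomial(40, g, 3))
print("direct value of 1.05^40:", (1 + g) ** 40)
print("solution of 1.05^n = ratio:", n_exact)
```

With a calculator available, the direct solution falls below the hand estimate of 40 some years, which is consistent with the truncated sums being lower bounds.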
Exercise 1.2.2. I wish to paint a wall measuring 3 meters tall and 6 meters wide, give or
take 10% in each direction. If the cost of paint is $13.5 per square meter, how much
should I pay for the paint?
Exercise 1.2.3. In a supermarket, I bought four items at $5.95, $6.35, $15.50, $7.20. The
sales tax is 8%. The final bill is around $38. Is the bill correct?
Exercise 1.2.4. In 1900, Argentina and Canada had the same GDP per capita. In 2000,
the GDP per capita is 9,300USD for Argentina and 24,000USD for Canada. On average,
how much faster is Canadian GDP growing annually compared with Argentina in the 20th
century?
Next we leave real life estimations and try some examples in calculus.
Example 1.2.10. Again we assume x and y are close to 3 and 5. Now we want to
find the percentage of tolerance, such that 2x − 3y is within ±0.2 of −9.
We can certainly use the answer in Example 1.2.9 and find the percentage
$\frac{0.04}{3} \approx 1.33\%$ for x and $\frac{0.04}{5} \approx 0.8\%$ for y. This implies that, if both x and y are within
0.8% of 3 and 5, then 2x − 3y is within ±0.2 of −9.
The better (or more honest) way is to directly solve the problem. Let $\delta_1$ and $\delta_2$
be the percentage of tolerance for x and y. Then $x = 3(1 + \delta_1)$ and $y = 5(1 + \delta_2)$,
and
$$|(2x - 3y) - (-9)| = |2(x - 3) - 3(y - 5)| \le |2 \cdot 3\delta_1 - 3 \cdot 5\delta_2| \le 21\delta, \quad \delta = \max\{|\delta_1|, |\delta_2|\}.$$
To get 21|δ| to be within our target of 0.2, we may take our tolerance
$\delta = 0.9\% < \frac{0.2}{21} \approx 0.0095$.
Example 1.2.11. Assume x and y are close to 3 and 5. We want to find the tolerance
for x and y, such that xy is within ±0.2 of 3 · 5 = 15. This means finding δ > 0,
such that
$$|x - 3| < \delta, \ |y - 5| < \delta \implies |xy - 15| < 0.2.$$
Under the assumptions |x − 3| < δ and |y − 5| < δ, we have
$$|xy - 15| \le |xy - 3y| + |3y - 15| \le |x - 3||y| + 3|y - 5| \le (|y| + 3)\delta.$$
We also note that, if we postulate δ ≤ 1, then |y − 5| < δ implies 4 < y < 6, so that
$$|xy - 15| \le (|y| + 3)\delta \le (6 + 3)\delta = 9\delta.$$
To get 9|δ| to be within our target of 0.2, we may take our tolerance $\delta = 0.02 < \frac{0.2}{9}$.
Since this indeed satisfies δ ≤ 1, we conclude that we can take δ = 0.02.
If the targeted error ±0.2 is changed to some other amount ±ε, then the same
argument shows that we can take the tolerance to be $\delta = \frac{\epsilon}{10}$. Strictly speaking,
since we also use δ ≤ 1 in the argument above, we should take $\delta = \min\{\frac{\epsilon}{10}, 1\}$.
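The tolerance δ = 0.02 from Example 1.2.11 can be spot-checked by random sampling. This sketch uses a hypothetical helper `worst_error`; the sample size and seed are arbitrary, and sampling can only support the estimate, not prove it.

```python
import random


def worst_error(delta, samples=100_000, seed=0):
    """Largest |xy - 15| seen over random x, y within delta of 3 and 5."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        x = 3 + rng.uniform(-delta, delta)
        y = 5 + rng.uniform(-delta, delta)
        worst = max(worst, abs(x * y - 15))
    return worst


delta = 0.02
print(worst_error(delta), 9 * delta, 0.2)
```

The observed worst case stays below the bound 9δ from the estimate, which in turn stays below the target 0.2.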
Exercise 1.2.5. Find a tolerance for x, y, z near −2, 3, 5, such that 5x − 3y + 4z is within
±ε of 1.
Exercise 1.2.6. Find a tolerance for x and y near 2 and 2, such that xy is within ±ε of 4.
Exercise 1.2.8. Find a percentage of tolerance for x near 2, such that $\frac{1}{x}$ is within ±0.1 of
0.5.
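$$n > \frac{2}{\epsilon} \implies 0 < \sqrt{\frac{n+2}{n}} - 1 < \frac{n+2}{n} - 1 = \frac{2}{n} < \epsilon \implies \left|\sqrt{\frac{n+2}{n}} - 1\right| < \epsilon.$$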
√ √
n+2 n+2 sin n
1. . 5. √ . 10. .
n−3 n−3 n
√ √
n+a cos n + a
n−2 6. . 11. .
2. . n+b n + b sin n
n+3
n n √ √
7. − . 12. n + a − n + b.
n+a n+1 n−1
3. . n+a n+c r
n+b 8. − . n+a
n+b n+d 13. .
n+b
2n2 − 3n + 2 1 √ √
4. . 9. √ . 14. 3
n + 1 − 3 n.
3n2 − 4n + 1 an + b
a np + a a sin n + b
1. √ . 2. . 3. .
np +b np + b np + c
Example 1.2.14. The estimation in Example 1.1.7 tells us that $|\sqrt[n]{a} - 1| < \frac{a}{n}$ for
a > 1. This suggests that for any ε > 0, we may choose $N = \frac{a}{\epsilon}$. Then
$$n > N \implies |\sqrt[n]{a} - 1| < \frac{a}{n} < \frac{a}{N} = \epsilon.$$
This rigorously proves that $\lim_{n\to\infty} \sqrt[n]{a} = 1$ in case a ≥ 1.
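$$|n^2 a^n - 0| = n^2 |a|^n = \frac{n^2}{(1+b)^n}
= \frac{n^2}{1 + nb + \frac{n(n-1)}{2!} b^2 + \frac{n(n-1)(n-2)}{3!} b^3 + \cdots + b^n}$$
$$< \frac{n^2}{\frac{n(n-1)(n-2)}{3!} b^3} = \frac{3!\,n}{(n-1)(n-2) b^3} < \frac{3!\,n}{\frac{n}{2} \cdot \frac{n}{2} \cdot b^3} = \frac{3!\,2^2}{n b^3}.$$
The last inequality uses $(n-1)(n-2) > \frac{n}{2} \cdot \frac{n}{2}$, which holds for n > 3.
Since $\frac{3!\,2^2}{n b^3} < \epsilon$ is the same as $n > \frac{3!\,2^2}{b^3 \epsilon}$, we have
$$n > \frac{3!\,2^2}{b^3 \epsilon} \text{ and } n > 3 \implies |n^2 a^n - 0| < \frac{3!\,2^2}{n b^3} < \epsilon.$$
This shows that we may choose $N = \max\left\{\frac{3!\,2^2}{b^3 \epsilon}, 3\right\}$.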
It is clear from the proof that we generally have
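Example 1.2.16. We rigorously prove $\lim_{n\to\infty} \frac{a^n}{n!} = 0$ in Example 1.1.13.
Choose a natural number M satisfying |a| < M. Then for n > M, we have
$$\left|\frac{a^n}{n!}\right| < \frac{M^n}{n!} = \frac{M \cdot M \cdots M}{1 \cdot 2 \cdots M} \cdot \frac{M}{M+1} \cdot \frac{M}{M+2} \cdots \frac{M}{n} \le \frac{M^M}{M!} \cdot \frac{M}{n} = \frac{M^{M+1}}{M!} \cdot \frac{1}{n}.$$
Therefore for any ε > 0, we have
$$n > \max\left\{\frac{M^{M+1}}{M!\,\epsilon}, M\right\} \implies \left|\frac{a^n}{n!} - 0\right| < \frac{M^{M+1}}{M!} \cdot \frac{1}{n} < \frac{M^{M+1}}{M!} \cdot \frac{1}{\frac{M^{M+1}}{M!\,\epsilon}} = \epsilon.$$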
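Example 1.2.17. Suppose $\lim_{n\to\infty} x_n = l > 0$. We prove that $\lim_{n\to\infty} \sqrt{x_n} = \sqrt{l}$.
First we clarify the problem. The limit $\lim_{n\to\infty} x_n = l$ means the implication: for any
ε > 0, there is N, such that
$$n > N \implies |x_n - l| < \epsilon.$$
The limit $\lim_{n\to\infty} \sqrt{x_n} = \sqrt{l}$ means the implication: for any ε > 0, there is N, such that
$$n > N \implies |\sqrt{x_n} - \sqrt{l}| < \epsilon.$$
What we need to argue is that the first implication implies the second implication.
We have
$$|\sqrt{x_n} - \sqrt{l}| = \frac{|(\sqrt{x_n} - \sqrt{l})(\sqrt{x_n} + \sqrt{l})|}{\sqrt{x_n} + \sqrt{l}} = \frac{|x_n - l|}{\sqrt{x_n} + \sqrt{l}} \le \frac{|x_n - l|}{\sqrt{l}}.$$
Therefore for any given ε > 0, the second implication will hold as long as $\frac{|x_n - l|}{\sqrt{l}} < \epsilon$,
or $|x_n - l| < \sqrt{l}\,\epsilon$. The inequality $|x_n - l| < \sqrt{l}\,\epsilon$ can be achieved from the first
implication, provided we apply the first implication to $\sqrt{l}\,\epsilon$ in place of ε.
The analysis above leads to the following formal proof. Let ε > 0. By applying
the definition of $\lim_{n\to\infty} x_n = l$ to $\sqrt{l}\,\epsilon > 0$, there is N, such that
$$n > N \implies |x_n - l| < \sqrt{l}\,\epsilon.$$
Then
$$n > N \implies |x_n - l| < \sqrt{l}\,\epsilon$$
$$\implies |\sqrt{x_n} - \sqrt{l}| = \frac{|(\sqrt{x_n} - \sqrt{l})(\sqrt{x_n} + \sqrt{l})|}{\sqrt{x_n} + \sqrt{l}} = \frac{|x_n - l|}{\sqrt{x_n} + \sqrt{l}} \le \frac{|x_n - l|}{\sqrt{l}} < \epsilon.$$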
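Then
$$n > \max\{N_1, N_2\} \implies |x_n - l| < \frac{\epsilon}{2}, \ |y_n - k| < \frac{\epsilon}{2}$$
$$\implies |(x_n + y_n) - (l + k)| \le |x_n - l| + |y_n - k| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$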
[Figure 1.2.2: rectangles with sides l, x and k, y; one marked piece has area |l||y − k|.]
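Let $\lim_{n\to\infty} x_n = l$ and $\lim_{n\to\infty} y_n = k$. Then for any $\epsilon_1 > 0$, $\epsilon_2 > 0$, there are
$N_1$, $N_2$, such that
$$n > N_1 \implies |x_n - l| < \epsilon_1,$$
$$n > N_2 \implies |y_n - k| < \epsilon_2.$$
Then for $n > N = \max\{N_1, N_2\}$, we have (see Figure 1.2.2)
$$|x_n y_n - lk| = |(x_n - l)y_n + l(y_n - k)|$$
$$\le |x_n - l||y_n| + |l||y_n - k|$$
$$< \epsilon_1(|k| + \epsilon_2) + |l|\epsilon_2,$$
where we use $|y_n - k| < \epsilon_2$ implying $|y_n| < |k| + \epsilon_2$. The proof of $\lim_{n\to\infty} x_n y_n = lk$
will be complete if, for any ε > 0, we can choose $\epsilon_1 > 0$ and $\epsilon_2 > 0$, such that
$$\epsilon_1(|k| + \epsilon_2) + |l|\epsilon_2 \le \epsilon.$$
This can be achieved by choosing $\epsilon_1$, $\epsilon_2$ satisfying
$$\epsilon_2 \le 1, \quad \epsilon_1(|k| + 1) \le \frac{\epsilon}{2}, \quad |l|\epsilon_2 \le \frac{\epsilon}{2}.$$
In other words, if we choose
$$\epsilon_1 = \frac{\epsilon}{2(|k| + 1)}, \quad \epsilon_2 = \min\left\{1, \frac{\epsilon}{2|l|}\right\}$$
at the very beginning of the proof, then we get a rigorous proof of the arithmetic
rule. The formal writing of the proof is left to the reader.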
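The choice of $\epsilon_1$ and $\epsilon_2$ can be checked with exact arithmetic for sample values. In this sketch the helper names `choose_eps` and `total_bound` are hypothetical, and the code assumes l ≠ 0 because the formula for $\epsilon_2$ divides by |l|.

```python
from fractions import Fraction


def choose_eps(eps, l, k):
    """Return (eps1, eps2) as chosen in the text; assumes l != 0."""
    eps1 = eps / (2 * (abs(k) + 1))
    eps2 = min(Fraction(1), eps / (2 * abs(l)))
    return eps1, eps2


def total_bound(eps1, eps2, l, k):
    """The quantity eps1 (|k| + eps2) + |l| eps2 that must not exceed eps."""
    return eps1 * (abs(k) + eps2) + abs(l) * eps2


for eps, l, k in [
    (Fraction(1, 10), Fraction(3), Fraction(5)),
    (Fraction(7), Fraction(-2), Fraction(1, 3)),
    (Fraction(1, 1000), Fraction(1, 100), Fraction(-40)),
]:
    eps1, eps2 = choose_eps(eps, l, k)
    print(eps, eps1, eps2, total_bound(eps1, eps2, l, k) <= eps)
```

The second sample uses a large ε on purpose, so that the minimum in $\epsilon_2$ is attained by 1 rather than by $\frac{\epsilon}{2|l|}$.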
Example 1.2.20. The sandwich rule in Proposition 1.1.4 reflects the intuition that, if
x and z are within ε of 5, then any number y between x and z is also within ε of 5
$$|x - 5| < \epsilon, \ |z - 5| < \epsilon, \ x \le y \le z \implies |y - 5| < \epsilon.$$
Geometrically, this means that if x and z lie inside an interval, say (5 − ε, 5 + ε),
then any number y between x and z also lies in the interval.
Suppose $x_n \le y_n \le z_n$ and $\lim_{n\to\infty} x_n = \lim_{n\to\infty} z_n = l$. For any ε > 0, there are
$N_1$ and $N_2$, such that
$$n > N_1 \implies |x_n - l| < \epsilon,$$
$$n > N_2 \implies |z_n - l| < \epsilon.$$
Then
$$n > N = \max\{N_1, N_2\} \implies |x_n - l| < \epsilon, \ |z_n - l| < \epsilon$$
$$\implies l - \epsilon < x_n, \ z_n < l + \epsilon$$
$$\implies l - \epsilon < x_n \le y_n \le z_n < l + \epsilon$$
$$\iff |y_n - l| < \epsilon.$$
Example 1.2.21. The order rule in Proposition 1.1.5 reflects the intuition that, if x is
very close to 3 and y is very close to 5, then x must be less than y. More specifically,
we know x < y when x and y are within ±1 of 3 and 5. Here 1 is half of the distance
between 3 and 5.
Suppose $x_n \le y_n$, $\lim_{n\to\infty} x_n = l$, $\lim_{n\to\infty} y_n = k$. For any ε > 0, there is N,
such that (you should know from earlier examples how to find this N)
$$l - \epsilon < x_n \le y_n < k + \epsilon.$$
Therefore we proved that l − ε < k + ε for any ε > 0. It is easy to see that the
property is the same as l ≤ k.
Conversely, we assume $\lim_{n\to\infty} x_n = l$, $\lim_{n\to\infty} y_n = k$, and l < k. For any ε > 0,
there is N, such that n > N implies $|x_n - l| < \epsilon$ and $|y_n - k| < \epsilon$. Then
Exercise 1.2.15. Prove that a sequence $x_n$ converges if and only if the subsequences $x_{2n}$
and $x_{2n+1}$ converge to the same limit. This is a special case of Proposition 1.1.6.
Exercise 1.2.16. Suppose $x_n \ge 0$ for sufficiently big n and $\lim_{n\to\infty} x_n = 0$. Prove that
$\lim_{n\to\infty} x_n^p = 0$ for any p > 0.
The theorem basically says that any convergent sequence is bounded. The number
B is a bound for the sequence.
If xn ≤ B for all n, then we say xn is bounded above, and B is an upper bound.
If xn ≥ B for all n, then we say xn is bounded below, and B is a lower bound. A
sequence is bounded if and only if it is bounded above and bounded below.
The sequences $n$, $\frac{n^2 + (-1)^n}{n+1}$ diverge because they are not bounded. On the
other hand, the sequence 1, −1, 1, −1, . . . is bounded but diverges. Therefore the
converse of Theorem 1.3.1 is not true in general.
Exercise 1.3.1. Prove that if xn is bounded for sufficiently big n, i.e., |xn | ≤ B for n ≥ N ,
then xn is still bounded.
Exercise 1.3.2. Suppose $x_n$ is the union of two subsequences $x'_k$ and $x''_k$. Prove that $x_n$ is
bounded if and only if both $x'_k$ and $x''_k$ are bounded.
We will see that the sum is actually $\frac{\pi^2}{6}$.
Example 1.3.2. The number $\sqrt{2 + \sqrt{2 + \sqrt{2 + \cdots}}}$ is the limit of the sequence $x_n$
inductively given by
$$x_1 = \sqrt{2}, \quad x_{n+1} = \sqrt{2 + x_n}.$$
After trying the first couple of terms, we expect the sequence to be increasing. This
can be verified by induction. We have $x_2 = \sqrt{2 + \sqrt{2}} > x_1 = \sqrt{2}$. Moreover, if we
assume $x_n > x_{n-1}$, then
$$x_{n+1} = \sqrt{2 + x_n} > \sqrt{2 + x_{n-1}} = x_n.$$
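The first terms of the sequence in Example 1.3.2 can be computed to see the expected behaviour. This is a numerical sketch with a hypothetical helper `nested_sqrt_terms`; floating point values only suggest the pattern, while the induction argument is what establishes it.

```python
import math


def nested_sqrt_terms(count):
    """Return x_1, ..., x_count for x_1 = sqrt(2), x_{n+1} = sqrt(2 + x_n)."""
    terms = [math.sqrt(2)]
    while len(terms) < count:
        terms.append(math.sqrt(2 + terms[-1]))
    return terms


terms = nested_sqrt_terms(15)
for n, value in enumerate(terms, start=1):
    print(n, value)

# The computed terms increase and stay below 2.
print(all(a < b for a, b in zip(terms, terms[1:])), max(terms) < 2)
```

The values approach 2, which is consistent with solving $l = \sqrt{2 + l}$ for a positive limit l.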
[Figure: the graphs of $y = x$ and $y = f(x)$, with the values $f(x_1)$, $f(x_2)$, $f(x_3)$ and the points $x_1, x_2, x_3, x_4$ approaching $l$.]
Exercise 1.3.4. Suppose a sequence $x_n$ satisfies $x_{n+1} = \sqrt{2 + x_n}$.
Exercise 1.3.5. For the three functions f (x) in Figure 1.3.2, study the convergence of the
sequences xn defined by xn+1 = f (xn ). Your answer depends on the initial value x1 .
Exercise 1.3.6. Suppose a sequence $x_n$ satisfies $x_{n+1} = \frac{1}{2}(x_n^2 + x_n)$. Prove the following statements.
4. If −2 < x1 < −1, then the sequence is decreasing for n ≥ 2 and converges to 0.
Exercise 1.3.7. Determine the convergence of inductively defined sequences. Your answer
may depend on the initial value x1 .
Exercise 1.3.11. The arithmetic and the geometric means of $a, b > 0$ are $\frac{a+b}{2}$ and $\sqrt{ab}$. By repeating the process, we get two sequences defined by
$$x_1 = a, \quad y_1 = b, \quad x_{n+1} = \frac{x_n + y_n}{2}, \quad y_{n+1} = \sqrt{x_n y_n}.$$
Prove that xn ≥ xn+1 ≥ yn+1 ≥ yn for n ≥ 2, and the two sequences converge to the same
limit.
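Before attempting the proof, it may help to watch the two sequences numerically. The following is a minimal sketch, assuming the sample values $a = 4$, $b = 1$ and a hypothetical helper `mean_iteration`; only four steps are taken because the two sequences agree to machine precision soon after.

```python
import math

def mean_iteration(a, b, steps):
    """Iterate x_{n+1} = (x_n + y_n)/2, y_{n+1} = sqrt(x_n * y_n) from x_1 = a, y_1 = b."""
    xs, ys = [a], [b]
    for _ in range(steps):
        x, y = xs[-1], ys[-1]
        xs.append((x + y) / 2)        # arithmetic mean
        ys.append(math.sqrt(x * y))   # geometric mean
    return xs, ys

xs, ys = mean_iteration(4.0, 1.0, 4)
for n, (x, y) in enumerate(zip(xs, ys), start=1):
    print(n, x, y, x - y)
```

In this run the gap $x_n - y_n$ shrinks very quickly, which is consistent with the claim that both sequences share one limit.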
3. Use the relation between yn+2 and yn to prove that l is the upper bound of y2k and
the lower bound of y2k+1 .
4. Prove that the subsequence y2k is increasing and the subsequence y2k+1 is decreasing.
3. Compare the two methods for specific values of a and b (say a = 4, b = 1). Which
way is faster?
4. Can you come up with a similar scheme for numerically computing $\sqrt[3]{a}$? What choice of the weight gives you the fastest method?
First assume $0 < a < 1$. Then the sequence $a^n$ is decreasing and satisfies $0 < a^n < 1$. Therefore the sequence converges to a limit $l$. By the remark in Example 1.1.1, we also have $\lim_{n\to\infty} a^{n-1} = l$. Then by the arithmetic rule, we have
$$l = \lim_{n\to\infty} a^n = a \lim_{n\to\infty} a^{n-1} = al.$$
Since $a \ne 1$, we get $l = 0$.
For the case $-1 < a < 0$, we may consider the even and odd subsequences of $a^n$ and apply Proposition 1.1.6. Another way is to apply the sandwich rule to $-|a|^n \le a^n \le |a|^n$.
Example 1.3.4. We give another argument that the sequence $x_n = \frac{3^n (n!)^2}{(2n)!}$ in Example 1.1.16 converges to 0. By $\lim_{n\to\infty} \frac{x_n}{x_{n-1}} = 0.75 < 1$ and the order rule, we have $\frac{x_n}{x_{n-1}} < 1$ for sufficiently big $n$. Since $x_n$ is always positive, we have $x_n < x_{n-1}$ for sufficiently big $n$. Therefore after finitely many terms, the sequence is decreasing. Moreover, 0 is the lower bound of the sequence, so that the sequence converges.
Let $\lim_{n\to\infty} x_n = l$. Then we also have $\lim_{n\to\infty} x_{n-1} = l$. If $l \ne 0$, then
$$\lim_{n\to\infty} \frac{x_n}{x_{n-1}} = \frac{\lim_{n\to\infty} x_n}{\lim_{n\to\infty} x_{n-1}} = \frac{l}{l} = 1.$$
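The ratio used in Example 1.3.4 can be computed exactly. The following is a minimal sketch using rational arithmetic; the function name `x` and the sampled indices are choices made here for illustration. It shows that the ratio equals 1 at $n = 2$, drops below 1 from $n = 3$ on, and approaches $0.75$.

```python
from fractions import Fraction
from math import factorial

def x(n):
    """Exact value of x_n = 3^n (n!)^2 / (2n)!."""
    return Fraction(3**n * factorial(n) ** 2, factorial(2 * n))

# Ratios x_n / x_{n-1} for n = 2, 3, ..., 199.
ratios = {n: x(n) / x(n - 1) for n in range(2, 200)}

for n in (2, 3, 10, 199):
    print(n, float(ratios[n]), float(x(n)))
```

The sequence is therefore decreasing only after finitely many terms, which is exactly the situation handled by the phrase "for sufficiently big $n$".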
Exercise 1.3.14. Extend Example 1.3.3 to a proof of limn→∞ nan = 0 for |a| < 1.
Exercise 1.3.15. Extend Example 1.3.4 to prove that, if $\lim_{n\to\infty} \frac{|x_n|}{|x_{n-1}|} = l < 1$, then $\lim_{n\to\infty} x_n = 0$.
Example 1.3.5. For the sequence $\left(1 + \frac{1}{n}\right)^n$, we compare two consecutive terms by
A close examination shows that the sequence is increasing. Moreover, by the computation in Example 1.3.1, the first expansion gives
$$\left(1 + \frac{1}{n}\right)^n < 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!} < 1 + 1 + \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{(n-1)n} < 3.$$
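The monotonicity and the bound can be checked exactly for small $n$. The following is a minimal sketch using rational arithmetic; the helper names `compound_term` and `factorial_sum` are hypothetical. Note that for $n = 1$ the first inequality is an equality, so the strict comparison is only meaningful for $n \ge 2$.

```python
from fractions import Fraction
from math import factorial

def compound_term(n):
    """Exact value of (1 + 1/n)^n as a rational number."""
    return (1 + Fraction(1, n)) ** n

def factorial_sum(n):
    """Exact value of 1 + 1/1! + 1/2! + ... + 1/n!."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))

terms = [compound_term(n) for n in range(1, 41)]
for n in (1, 2, 5, 10, 40):
    print(n, float(compound_term(n)), float(factorial_sum(n)))
```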
1. $\left(\frac{n+1}{n}\right)^{n+1}$.
2. $\left(1 - \frac{1}{n}\right)^n$.
3. $\left(1 + \frac{1}{2n}\right)^n$.
4. $\left(\frac{2n+1}{2n-1}\right)^n$.
Exercise 1.3.17. Let $x_n = \left(1 + \frac{1}{n}\right)^{n+1}$.
1. Use induction to prove $(1 + x)^n \ge 1 + nx$ for $x > -1$ and any natural number $n$.
2. Use the first part to prove $\frac{x_{n-1}}{x_n} > 1$. This shows that $x_n$ is decreasing.
3. Prove that $\lim_{n\to\infty} x_n = e$.
4. Prove that $\left(1 - \frac{1}{n}\right)^n$ is increasing and converges to $e^{-1}$.
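$$\left(1 + \frac{1}{n}\right)^n \ge 1 + \frac{1}{1!} + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \cdots + \frac{1}{k!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k}{n}\right).$$
Finally, prove
$$\lim_{n\to\infty} \left(1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!}\right) = e.$$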
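The last limit can be observed numerically. The following is a minimal sketch, assuming `math.e` as the reference value of $e$; the helper name `exp_partial_sum` is hypothetical. The partial sums are computed exactly and only converted to floating point for the comparison.

```python
import math
from fractions import Fraction

def exp_partial_sum(n):
    """Exact value of 1 + 1/1! + 1/2! + ... + 1/n!."""
    total, term = Fraction(0), Fraction(1)
    for k in range(n + 1):
        total += term        # here term equals 1/k!
        term /= (k + 1)      # now term equals 1/(k+1)!
    return total

errors = [abs(float(exp_partial_sum(n)) - math.e) for n in range(1, 19)]
for n, err in enumerate(errors, start=1):
    print(n, err)
```

The error falls below floating point resolution after fewer than twenty terms, much faster than the convergence of $\left(1 + \frac{1}{n}\right)^n$.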
Theorem 1.3.3 (Cauchy Criterion). A sequence $x_n$ converges if and only if for any $\epsilon > 0$, there is $N$, such that
$$m, n > N \implies |x_m - x_n| < \epsilon.$$
Sequences satisfying the property in the theorem are called Cauchy sequences.
The theorem says that a sequence converges if and only if it is a Cauchy sequence.
The necessity is easy to see. If $\lim_{n\to\infty} x_n = l$, then for big $m, n$, both $x_m$ and $x_n$ are very close to $l$ (say within $\frac{\epsilon}{2}$). This implies that $x_m$ and $x_n$ are very close (within $\frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$).
The proof of sufficiency is much more difficult and relies on the following deep
result that touches the essential difference between the real and rational numbers.
Using the theorem, the converse may be proved by the following steps.
Example 1.3.6. In Example 1.1.19, we argued that the sequence $(-1)^n$ diverges because two subsequences converge to different limits. Alternatively, we may apply the Cauchy criterion. For $\epsilon = 1$ and any $N$, we pick any $n > N$ and pick $m = n + 1$. Then $m, n > N$ and $|x_m - x_n| = |(-1)^{n+1} - (-1)^n| = 2 > \epsilon$. This means that the Cauchy criterion fails, and therefore the sequence diverges.
Example 1.3.8. The following is the partial sum of the harmonic series
$$x_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}.$$
For any $n$, we have
$$x_{2n} - x_n = \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n} \ge \frac{1}{2n} + \frac{1}{2n} + \cdots + \frac{1}{2n} = \frac{1}{2}.$$
For $\epsilon = \frac{1}{2}$ and any $N$, we choose a natural number $n > N$ and also choose $m = 2n > N$. Then
$$|x_m - x_n| = x_{2n} - x_n \ge \epsilon.$$
This shows that the sequence fails the Cauchy criterion and diverges.
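The key estimate $x_{2n} - x_n \ge \frac{1}{2}$ can be confirmed exactly for sample values of $n$. The following is a minimal sketch using rational arithmetic; the helper name `harmonic` and the sampled values of $n$ are choices made here.

```python
from fractions import Fraction

def harmonic(n):
    """Exact partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# Gap between the 2n-th and the n-th partial sums.
gaps = {n: harmonic(2 * n) - harmonic(n) for n in (1, 10, 100, 500)}
for n, gap in gaps.items():
    print(n, float(gap))
```

However far out we go, the gap never drops below $\frac{1}{2}$, so no choice of $N$ can satisfy the Cauchy criterion for $\epsilon = \frac{1}{2}$.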
Exercise 1.3.19. If xn is a Cauchy sequence, is |xn | also a Cauchy sequence? What about
the converse?
Exercise 1.3.20. Suppose $|x_{n+1} - x_n| \le \frac{1}{n^2}$. Prove that $x_n$ converges.
Exercise 1.3.21. Suppose cn is bounded and |r| < 1. Prove that the sequence
$$x_n = c_0 + c_1 r + c_2 r^2 + \cdots + c_n r^n$$
converges.
It is easy to see that any number in [0, 1] is the limit of a convergent subsequence
of xn .
Exercise 1.3.23. Construct a sequence such that the limits of convergent subsequences are exactly $\frac{1}{n}$, $n \in \mathbb{N}$ and 0.
Exercise 1.3.24. Construct a sequence such that any number is the limit of some convergent
subsequence.
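3. $1 + \frac{1}{2^2} + \frac{2}{3^2} + \cdots + \frac{n-1}{n^2}$.
4. $\frac{1}{1 \cdot 2} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{(2n-1)2n}$.
5. $\frac{1}{1 \cdot 2} - \frac{1}{3 \cdot 4} + \cdots + (-1)^{n+1} \frac{1}{(2n-1)2n}$.
6. $\frac{2}{1 \cdot 3} + \frac{3}{2 \cdot 4} + \cdots + \frac{n}{(n-1)(n+1)}$.
7. $\frac{1 \cdot 3 \cdot 5 \cdots (2n-1)}{2 \cdot 4 \cdot 6 \cdots (2n)}$.
8. $\frac{1 \cdot 3 \cdot 5 \cdots (2n-1)}{n!}$.
9. $\sqrt{1 + \sqrt{\frac{1}{2} + \sqrt{\frac{1}{3} + \cdots + \sqrt{\frac{1}{n}}}}}$.
10. $\sqrt{1 + \sqrt{2 + \sqrt{3 + \cdots + \sqrt{n}}}}$.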
1.4 Infinity
1.4.1 Divergence to Infinity
A sequence may diverge for various reasons. For example, the sequence n diverges
because it can become arbitrarily big. On the other hand, the bounded sequence
$(-1)^n$ diverges because it has two subsequences with different limits. The first
example may be summarized by the following definition.
In the definition, the infinity means that the absolute value (or the magnitude) of the sequence can become arbitrarily big. If we further take the signs into account, then we get the following definitions.
[Figure: the terms $x_n$ plotted against $n$; every term with $n > N$ lies beyond the bound $B$.]
Example 1.4.2. Example 1.1.11 may be extended to show that $\lim_{n\to\infty} a^n = \infty$ for $|a| > 1$. Specifically, let $|a| = 1 + b$. Then $|a| > 1$ implies $b > 0$, and we have
$$|a^n| = (1 + b)^n = 1 + nb + \frac{n(n-1)}{2} b^2 + \cdots + b^n > nb.$$
For any $B$, we then have
$$n > \frac{B}{b} \implies |a^n| > nb > B.$$
This proves $\lim_{n\to\infty} a^n = \infty$ for $|a| > 1$. If we take the sign into account, this also proves $\lim_{n\to\infty} a^n = +\infty$ for $a > 1$.
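The choice of the threshold $\frac{B}{b}$ can be tried on concrete numbers. The following is a minimal sketch, assuming the sample values $a = -1.5$ and $B = 100$; the helper name `threshold` is hypothetical. It also prints the first index at which $|a^n|$ actually exceeds $B$, to show that the threshold from the proof is valid but far from optimal.

```python
import math

def threshold(a, B):
    """Return B / b, where |a| = 1 + b, as used in the estimate |a^n| > nb."""
    b = abs(a) - 1
    return B / b

a, B = -1.5, 100.0
N = threshold(a, B)
n0 = math.floor(N) + 1   # first integer n with n > N

# Smallest n with |a^n| > B, found by direct search.
first = next(n for n in range(1, n0 + 1) if abs(a**n) > B)
print("threshold from the proof:", N)
print("first index exceeding B:", first)
```

A rigorous proof only needs some valid $N$, not the smallest one, which is why the crude estimate $|a^n| > nb$ suffices.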
Example 1.4.3. Suppose $x_n \ne 0$. We prove that $\lim_{n\to\infty} x_n = 0$ implies $\lim_{n\to\infty} \frac{1}{x_n} = \infty$. Actually the converse is also true and the proof is left to the reader.
If $\lim_{n\to\infty} x_n = 0$, then for any $B > 0$, we apply the definition of the limit to $\frac{1}{B} > 0$ to get $N$, such that
$$n > N \implies |x_n| < \frac{1}{B} \implies \left|\frac{1}{x_n}\right| > B.$$
This proves $\lim_{n\to\infty} \frac{1}{x_n} = \infty$.
Applying what we just proved to the limit in Example 1.1.11, we get another proof of $\lim_{n\to\infty} a^n = \infty$ for $|a| > 1$.
Exercise 1.4.1. Let $x_n \ne 0$. Prove that $\lim_{n\to\infty} x_n = 0$ if and only if $\lim_{n\to\infty} \frac{1}{x_n} = \infty$.
Exercise 1.4.2. Prove that $\lim_{n\to\infty} x_n = +\infty$ if and only if $x_n > 0$ for sufficiently big $n$ and $\lim_{n\to\infty} \frac{1}{x_n} = 0$.
1. $\frac{n^2 - n + 1}{n + 1}$.
2. $\frac{n}{\sqrt{n+1}}$.
3. $\frac{a^n}{n}$, $|a| > 1$.
4. $n^p a^n$, $|a| > 1$.
5. $\frac{n!}{4^n}$.
6. $n! a^n$.
A counterexample for the first equality is xn = n and yn = −n, for which we have
limn→∞ xn = ∞, limn→∞ yn = ∞ and limn→∞ (xn + yn ) = 0. In general, one needs to
use common sense to decide whether certain extended arithmetic rules make sense.
Example 1.4.4. By Example 1.4.1 and the extended arithmetic rule, we have
$$\lim_{n\to\infty} (n^3 - 3n + 1) = \lim_{n\to\infty} n^3 \left(1 - \frac{3}{n^2} + \frac{1}{n^3}\right) = (+\infty) \cdot 1 = +\infty.$$
Exercise 1.4.5. Construct sequences xn and yn , such that both diverge to infinity, but
xn + yn can have any of the following behaviors.
1. limn→∞ (xn + yn ) = ∞.
2. limn→∞ (xn + yn ) = 2.
Exercise 1.4.6. Prove that if $p > 0$, then $\lim_{n\to\infty} x_n = +\infty$ implies $\lim_{n\to\infty} x_n^p = +\infty$. What about the case $p < 0$?
Exercise 1.4.7. Prove the extended sandwich rule: If xn ≤ yn for sufficiently big n, then
limn→∞ xn = +∞ implies limn→∞ yn = +∞.
Exercise 1.4.8. Prove the extended order rule: If limn→∞ xn = l is finite and limn→∞ yn =
+∞, then xn < yn for sufficiently big n.
Exercise 1.4.9. Suppose $\lim_{n\to\infty} x_n = l > 1$. Prove that $\lim_{n\to\infty} x_n^n = +\infty$.
Exercise 1.4.10. Prove that if $\lim_{n\to\infty} \frac{x_n}{x_{n-1}} = l$ and $|l| > 1$, then $x_n$ diverges to infinity.
Exercise 1.4.11. Explain the infinities. Determine the sign of infinity if possible.
n + sin 2n 1 3n − 2n
1. √ . 4. √ √ . 7. .
n − cos n n
n − n 2n n
n! √ √ 3n − 2n
2. n , a + b 6= 0. 5. n( n + 2 − n). 8. .
a + bn n3 + n2
2
1 n
1 (−1)n n2
3. √ . 6. . 9. 1 + .
n
n−1 n−1 n
Example 1.4.6. For $a > 1$, the sequence $a^n$ is increasing. If the sequence converges to a finite limit $l$, then
Exercise 1.4.12. Prove $\lim_{n\to\infty} \frac{a^n}{n^2} = +\infty$ for $a > 1$.
1.5 Limit of Function
$$\lim_{x\to a} f(x) = l,$$
$$x \to a,\ x \ne a \implies f(x) \to l.$$
[Figure: graphs of $y = x^2$, $y = \frac{1}{x}$ and $y = \sin\frac{1}{x}$.]
On the other hand, as $x$ approaches the infinity, we find that $x^2$ gets arbitrarily big, and $\frac{1}{x}$ and $\sin\frac{1}{x}$ approach 0. Therefore $\lim_{x\to\infty} x^2 = \infty$, $\lim_{x\to\infty} \frac{1}{x} = \lim_{x\to\infty} \sin\frac{1}{x} = 0$. Moreover, $\sin x$ swings between $-1$ and $1$, and $\lim_{x\to\infty} \sin x$ diverges.
Proposition 1.5.2 (Arithmetic Rule). Suppose $\lim_{x\to a} f(x) = l$ and $\lim_{x\to a} g(x) = k$. Then
$$\lim_{x\to a} (f(x) + g(x)) = l + k, \quad \lim_{x\to a} cf(x) = cl, \quad \lim_{x\to a} f(x)g(x) = lk, \quad \lim_{x\to a} \frac{f(x)}{g(x)} = \frac{l}{k},$$
where $c$ is a constant and $k \ne 0$ in the last equality.
Proposition 1.5.3 (Sandwich Rule). If f (x) ≤ g(x) ≤ h(x) and limx→a f (x) =
limx→a h(x) = l, then limx→a g(x) = l.
Proposition 1.5.4 (Order Rule). Suppose limx→a f (x) = l and limx→a g(x) = k.
1. If $f(x) \le g(x)$ for $x$ near $a$ and $x \ne a$, then $l \le k$.
2. If $l < k$, then $f(x) < g(x)$ for $x$ near $a$ and $x \ne a$.
Example 1.5.1. For a > 0, we have |x| = x near a. By Proposition 1.5.1, we have
limx→a |x| = limx→a x = a = |a|.
For a < 0, we have |x| = −x near a. By Propositions 1.5.1 and 1.5.2, we have
limx→a |x| = limx→a −x = − limx→a x = −a = |a|.
Combining the two cases with limx→0 |x| = 0, we get limx→a |x| = |a|.
Example 1.5.3. The rational function $\frac{x^3 - 1}{x - 1}$ is not defined at $x = 1$. Yet the function converges at 1
$$\lim_{x\to 1} \frac{x^3 - 1}{x - 1} = \lim_{x\to 1} (x^2 + x + 1) = \lim_{x\to 1} x^2 + \lim_{x\to 1} x + \lim_{x\to 1} 1 = 1^2 + 1 + 1 = 3.$$
More generally, for any polynomial $p(x) = c_n x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0$ and finite $a$, we have
$$\lim_{x\to a} p(x) = p(a).$$
A rational function $r(x) = \frac{p(x)}{q(x)}$ is the quotient of two polynomials $p(x)$ and $q(x)$, and is defined at $a$ if $q(a) \ne 0$. Further by the arithmetic rule, we also have
Example 1.5.6. Similar to Example 1.1.5, the function $\sqrt{|x| + 2} - \sqrt{|x|}$ satisfies
$$0 < \sqrt{|x| + 2} - \sqrt{|x|} = \frac{(\sqrt{|x| + 2} - \sqrt{|x|})(\sqrt{|x| + 2} + \sqrt{|x|})}{\sqrt{|x| + 2} + \sqrt{|x|}} = \frac{2}{\sqrt{|x| + 2} + \sqrt{|x|}} < \frac{2}{\sqrt{|x|}}.$$
By $\lim_{x\to\infty} \frac{2}{\sqrt{|x|}} = 2 \lim_{x\to\infty} \frac{1}{\sqrt{|x|}} = 0$ and the sandwich rule, we get
$$\lim_{x\to\infty} (\sqrt{|x| + 2} - \sqrt{|x|}) = 0.$$
and the sandwich rule, then we get $\lim_{x\to 0} x \sin\frac{1}{x} = 0$.
Exercise 1.5.1. Explain that limx→a f (x) = l if and only if limx→a (f (x) − l) = 0.
Exercise 1.5.2. Use the sandwich rule to prove that limx→a |f (x)| = 0 implies limx→a f (x) =
0.
Then $z$ and $x$ are related by $z = g(f(x))$, the composition of two functions. Suppose both $f$ and $g$ have limits
$$\lim_{x\to a} f(x) = b, \quad \lim_{y\to b} g(y) = c,$$
where $b$ is the value of the first limit as well as the location of the second limit. The problem is whether the composition has limit
$$\lim_{x\to a} g(f(x)) = c.$$
$$x \to a,\ x \ne a \implies y \to b,$$
$$y \to b,\ y \ne b \implies z \to c,$$
$$x \to a,\ x \ne a \implies z \to c.$$
However, the two implications cannot be combined as is, because "$y \to b$" does not imply "$y \to b$, $y \ne b$". There are two ways to save this. The first is to strengthen the first implication to
$$x \to a,\ x \ne a \implies y \to b,\ y \ne b.$$
Here the extra condition is $f(x) \ne b$ for $x$ near $a$ and $x \ne a$. The second is to strengthen the second implication to
$$y \to b \implies z \to c.$$
$$y \to b \implies z \to c = g(b).$$
Proposition 1.5.5 (Composition Rule). Suppose $\lim_{x\to a} f(x) = b$ and $\lim_{y\to b} g(y) = c$. Then we have $\lim_{x\to a} g(f(x)) = c$, provided one of the following extra conditions is satisfied
1. $f(x) \ne b$ for $x$ near $a$ and $x \ne a$.
2. $c = g(b)$.
Later on, this will become the definition of the continuity of g(y) at b. Moreover,
the composition rule in this case means
$$\lim_{x\to a} g(f(x)) = c = g(b) = g\left(\lim_{x\to a} f(x)\right).$$
So the continuity of g(y) is the same as the exchangeability of the function g and
the limit.
The composition rule extends Proposition 1.1.6 because a sequence $x_n$ can be considered as a function
$$x \colon n \mapsto x(n) = x_n.$$
Then a subsequence can be considered as a composition with a function $n_k \colon \mathbb{N} \to \mathbb{N}$ that satisfies $\lim_{k\to\infty} n_k = \infty$.
Example 1.5.9. We have $\lim_{x\to 1} (3x^3 - 2) = 1$ from the arithmetic rule. We also know $\lim_{y\to 1} \sqrt{y} = 1$ from Example 1.5.8. The composition
$$x \mapsto y = f(x) = 3x^3 - 2 \mapsto z = g(y) = \sqrt{y} = g(f(x)) = \sqrt{3x^3 - 2}$$
should give us $\lim_{x\to 1} \sqrt{3x^3 - 2} = 1$.
We need to verify one of the extra conditions in the composition rule. If $x$ is close to 1 and $x < 1$, then we have $x^3 < 1^3 = 1$, so that $3x^3 - 2 < 1$. Similarly, if $x$ is close to 1 and $x > 1$, then $3x^3 - 2 > 1$. Therefore for $x$ close to 1 and $x \ne 1$, we indeed have $3x^3 - 2 \ne 1$. This verifies the first condition.
Although the validity of the first condition already allows us to apply the composition rule, the second condition $\lim_{y\to 1} \sqrt{y} = 1 = \sqrt{1}$ is also valid.
Note that it is rather tempting to write
$$\lim_{x\to 1} \sqrt{3x^3 - 2} = \lim_{3x^3 - 2 \to 1} \sqrt{3x^3 - 2} = \lim_{y\to 1} \sqrt{y}.$$
In other words, the composition rule appears simply as a change of variable. However, one needs to be careful because hidden in the definition of $\lim_{x\to 1}$ is $x \ne 1$. Similarly, the assumption $3x^3 - 2 \ne 1$ is implicit in writing $\lim_{3x^3 - 2 \to 1}$, and the first equality above actually requires you to establish
$$x \to 1,\ x \ne 1 \iff 3x^3 - 2 \to 1,\ 3x^3 - 2 \ne 1.$$
This turns out to be true in our specific example, but might fail for other examples.
Example 1.5.10. Example 1.5.8 can be extended to show that $\lim_{x\to a} \sqrt{x} = \sqrt{a}$ for any $a > 0$ (see Exercise 1.5.5). This actually means that $\sqrt{x}$ is continuous, and therefore we may apply the composition rule to get
$$\begin{aligned}
\lim_{x\to 1} \sqrt{\sqrt{3x^3 - 2} + 7x} &= \sqrt{\lim_{x\to 1} \left(\sqrt{3x^3 - 2} + 7x\right)} \\
&= \sqrt{\lim_{x\to 1} \sqrt{3x^3 - 2} + \lim_{x\to 1} 7x} \\
&= \sqrt{\sqrt{\lim_{x\to 1} (3x^3 - 2)} + \lim_{x\to 1} 7x} \\
&= \sqrt{\sqrt{3 \cdot 1^3 - 2} + 7 \cdot 1} = 2\sqrt{2}.
\end{aligned}$$
Here is the detailed reason. The last equality is by the arithmetic rule. The third equality makes use of the continuity of the function $\sqrt{x}$ and the composition rule to move the limit from outside the square root to inside the square root. The second equality is by the arithmetic rule. Once we know $\lim_{x\to 1} (\sqrt{3x^3 - 2} + 7x)$ converges to a positive number, the first equality then follows from the continuity of $\sqrt{x}$ and the composition rule.
Exercise 1.5.5. Show that $\lim_{x\to a} \sqrt{x} = \sqrt{a}$ for any $a > 0$.
Exercise 1.5.6. Show that $\lim_{x\to a} \sqrt[3]{x} = \sqrt[3]{a}$ for any $a \ne 0$.
Example 1.5.11. A change of variable can often be applied to limits. For example, we have
$$\lim_{x\to a} f(x^2) = \lim_{y\to a^2} f(y), \quad \text{for } a \ne 0.$$
Example 1.5.12. The composition rule fails when neither condition is satisfied, which means that $f(x) = b$ for some $x \ne a$ arbitrarily close to $a$, and $c \ne g(b)$. For a concrete example, consider
$$f(x) = x \sin\frac{1}{x}, \quad g(y) = \begin{cases} y, & \text{if } y \ne 0, \\ A, & \text{if } y = 0. \end{cases}$$
We have $\lim_{x\to 0} f(x) = 0$ (see Example 1.5.7) and $\lim_{y\to 0} g(y) = 0$ (see Example 1.5.2). This means $a = b = c = 0$. However, the composition is
$$g(f(x)) = \begin{cases} x \sin\frac{1}{x}, & \text{if } x \ne (n\pi)^{-1}, \\ A, & \text{if } x = (n\pi)^{-1}, \end{cases}$$
and $\lim_{x\to 0} g(f(x))$ converges if and only if $g(0) = A = 0 = c$.
x → ∞, x > 0 =⇒ f (x) → l.
x → a, x > a =⇒ f (x) → l.
x → a, x < a =⇒ f (x) → l.
All the properties of the usual (two sided) limits still hold for one sided limits.
Moreover, we have the following relation (for a = ∞, a± means ±∞).
Proposition 1.5.6. limx→a f (x) = l if and only if limx→a+ f (x) = limx→a− f (x) = l.
we have limx→0+ f (x) = 1 and limx→0− f (x) = −1. Since the two limits are not
equal, limx→0 f (x) diverges.
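Example 1.5.14. To find $\lim_{x\to\infty} \frac{|x + 4|}{x}$, we consider the limit at $+\infty$ and $-\infty$. For $x > 0$, we have $\frac{|x + 4|}{x} = \frac{x + 4}{x} = 1 + 4\frac{1}{x}$. By the arithmetic rule, we have
$$\lim_{x\to+\infty} \frac{|x + 4|}{x} = \lim_{x\to+\infty} \left(1 + 4\frac{1}{x}\right) = 1 + 4 \lim_{x\to+\infty} \frac{1}{x} = 1.$$
For $x < -4$, we have $\frac{|x + 4|}{x} = -\frac{x + 4}{x} = -1 - 4\frac{1}{x}$. By the arithmetic rule, we have
$$\lim_{x\to-\infty} \frac{|x + 4|}{x} = \lim_{x\to-\infty} \left(-1 - 4\frac{1}{x}\right) = -1 - 4 \lim_{x\to-\infty} \frac{1}{x} = -1.$$
Since the two limits are different, $\lim_{x\to\infty} \frac{|x + 4|}{x}$ diverges.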
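The two one sided behaviors at infinity can be seen by evaluating the function at large positive and large negative inputs. The following is a minimal sketch; the sample points $\pm 10^9$ are an arbitrary choice, and a numerical evaluation only illustrates the limits and does not prove them.

```python
def f(x):
    """The function |x + 4| / x from Example 1.5.14."""
    return abs(x + 4) / x

for x in (1e3, 1e6, 1e9):
    # Values approach 1 on the right and -1 on the left.
    print(x, f(x), f(-x))
```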
Example 1.5.15. If we apply the argument in Example 1.5.6 to $x > 0$, then we get
$$\lim_{x\to+\infty} (\sqrt{x + 2} - \sqrt{x}) = 0.$$
Example 1.5.16. The composition rule can also be applied to one sided limit. For example, we have $\lim_{x\to 0^+} f(x^2) = \lim_{x\to 0^+} f(x)$ by introducing $y = x^2$ and $x = \sqrt{y}$, and the first condition for the composition rule is satisfied in both directions
$$x \to 0,\ x > 0 \iff y \to 0,\ y > 0.$$
We also have $\lim_{x\to 0^-} f(x^2) = \lim_{x\to 0^+} f(x)$ by
$$x = -\sqrt{y} \to 0,\ x < 0 \iff y = x^2 \to 0,\ y > 0.$$
Then we may use Proposition 1.5.6 to conclude that $\lim_{x\to 0} f(x^2) = \lim_{x\to 0^+} f(x)$.
Specifically, we first use $\lim_{x\to 1} x^p = 1$, which we just proved, in the third equality.
Then we use the arithmetic rule to get the second equality. Finally, the first equality
is obtained by a change of variable, which is essentially the composition rule.
Exercise 1.5.12. Suppose $\lim_{x\to a} f(x) = l > 0$. Prove that $\lim_{x\to a} f(x)^p = l^p$. This extends Exercise 1.1.35 to function limit.
lim sin x = 0.
x→0+
Changing x to y = −x (i.e., applying the composition rule) gives the left limit
[Figure: circle of radius 1 centered at $O$, with points $A$, $B$, $C$, $D$ and the lengths $\sin x$, $x$ and $\tan x$.]
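$$\lim_{x\to a} \sin x = \lim_{y\to 0} \sin(a + y) = \sin a \lim_{y\to 0} \cos y + \cos a \lim_{y\to 0} \sin y = (\sin a)1 + (\cos a)0 = \sin a,$$
$$\lim_{x\to a} \cos x = \lim_{y\to 0} \cos(a + y) = \cos a \lim_{y\to 0} \cos y - \sin a \lim_{y\to 0} \sin y = (\cos a)1 - (\sin a)0 = \cos a,$$
$$\lim_{x\to a} \tan x = \frac{\lim_{x\to a} \sin x}{\lim_{x\to a} \cos x} = \frac{\sin a}{\cos a} = \tan a, \quad \text{if } \cos a \ne 0.$$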
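Example 1.5.18. By the arithmetic rule and the composition rule, we have
$$\lim_{x\to\frac{\pi}{2}} \frac{\cos x}{x - \frac{\pi}{2}} = \lim_{y\to 0} \frac{\cos\left(y + \frac{\pi}{2}\right)}{y} = \lim_{y\to 0} \frac{-\sin y}{y} = -1,$$
$$\lim_{x\to\infty} x \sin\frac{1}{x} = \lim_{y\to 0} \frac{1}{y} \sin y = 1,$$
$$\lim_{x\to a} \frac{\sin x - \sin a}{x - a} = \lim_{y\to 0} \frac{\sin(a + y) - \sin a}{y} = \lim_{y\to 0} \left(\frac{\cos y - 1}{y} \sin a + \frac{\sin y}{y} \cos a\right) = \cos a.$$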
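The three limits can be sampled numerically near the limit points. The following is a minimal sketch, assuming step sizes of $10^{-6}$ and $10^{6}$ chosen here; the function names are hypothetical, and floating point evaluation only approximates the limits.

```python
import math

def first_quotient(y):
    """cos(x) / (x - pi/2) evaluated at x = pi/2 + y."""
    x = math.pi / 2 + y
    return math.cos(x) / (x - math.pi / 2)

def second_product(x):
    """x * sin(1/x) for large x."""
    return x * math.sin(1 / x)

def difference_quotient(a, y):
    """(sin(a + y) - sin(a)) / y, which should be close to cos(a) for small y."""
    return (math.sin(a + y) - math.sin(a)) / y

print(first_quotient(1e-6))                       # close to -1
print(second_product(1e6))                        # close to 1
print(difference_quotient(1.0, 1e-6), math.cos(1.0))
```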
Exercise 1.5.14. Study the limit of the sequences sin(sin(sin . . . a))) and cos(cos(cos . . . a))),
where the trigonometric functions are applied n times.
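The meaning of the definition is given by Figure 1.6. The other limits can be similarly defined. For example, $\lim_{x\to a^+} f(x) = l$ means that for any $\epsilon > 0$, there is $\delta > 0$, such that
$$0 < x - a < \delta \implies |f(x) - l| < \epsilon.$$
The limit $\lim_{x\to\infty} f(x) = l$ means that for any $\epsilon > 0$, there is $N$, such that
$$|x| > N \implies |f(x) - l| < \epsilon.$$
Moreover, $\lim_{x\to a^-} f(x) = +\infty$ means that for any $B$, there is $\delta > 0$, such that
$$-\delta < x - a < 0 \implies f(x) > B.$$
[Figure: the graph of $f(x)$ stays between $l - \epsilon$ and $l + \epsilon$ for $x \ne a$ between $a - \delta$ and $a + \delta$.]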
Exercise 1.6.1. Write down the rigorous definitions of limx→a f (x) = −∞, limx→a− f (x) =
l, and limx→+∞ f (x) = −∞.
Example 1.6.2. To prove $\lim_{x\to 1} x^2 = 1$ rigorously means that, for any $\epsilon > 0$, we need to find suitable $\delta > 0$, such that
We have
Therefore
Example 1.6.3. To rigorously prove $\lim_{x\to 1} \frac{1}{x} = 1$, we note
$$\left|\frac{1}{x} - 1\right| = \frac{1}{|x|} |x - 1|.$$
When $x$ is close to 1, we know $|x - 1|$ is very small. We also know that $|x|$ is close to 1, so that $\frac{1}{|x|}$ can be controlled by a specific bound. Combining the two facts, we see that $\frac{1}{|x|} |x - 1|$ can be very small. Of course concrete and specific estimation is needed to get a rigorous proof. If $|x - 1| < \frac{1}{2}$, then $x > \frac{1}{2}$ and $\frac{1}{|x|} < 2$. Therefore if we also have $|x - 1| < \frac{\epsilon}{2}$, then we get $\frac{1}{|x|} |x - 1| < \epsilon$.
The analysis above suggests the following rigorous proof. For any $\epsilon > 0$, choose $\delta = \min\left\{\frac{1}{2}, \frac{\epsilon}{2}\right\}$. Then
$$\begin{aligned}
0 < |x - 1| < \delta &\implies |x - 1| < \frac{1}{2},\ |x - 1| < \frac{\epsilon}{2} \\
&\implies x > 1 - \frac{1}{2} = \frac{1}{2},\ |x - 1| < \frac{\epsilon}{2} \\
&\implies \left|\frac{1}{x} - 1\right| = \frac{1}{|x|} |x - 1| \le 2|x - 1| < \epsilon.
\end{aligned}$$
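The choice $\delta = \min\left\{\frac{1}{2}, \frac{\epsilon}{2}\right\}$ can be spot-checked by sampling points with $0 < |x - 1| < \delta$. The following is a minimal sketch; the helper names `delta_for` and `delta_works` and the sample grid are hypothetical, and passing the check on finitely many points is evidence only, not a substitute for the proof.

```python
def delta_for(eps):
    """The delta chosen in the proof: min{1/2, eps/2}."""
    return min(0.5, eps / 2)

def delta_works(eps, samples=1000):
    """Check |1/x - 1| < eps on a grid of points with 0 < |x - 1| < delta."""
    d = delta_for(eps)
    for i in range(1, samples):
        t = d * i / samples          # 0 < t < delta
        for x in (1 - t, 1 + t):
            if not abs(1 / x - 1) < eps:
                return False
    return True

for eps in (10, 1, 0.1, 1e-3):
    print(eps, delta_for(eps), delta_works(eps))
```

For large $\epsilon$ the bound $\frac{1}{2}$ takes over, which keeps $x$ away from 0 where $\frac{1}{x}$ blows up.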
For any $\epsilon > 0$, choose $\delta = \epsilon^{\frac{1}{p}}$. Then by $p > 0$, we have
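We also prove
$$\lim_{x\to+\infty} x^p = +\infty, \quad \text{for any } p > 0.$$
For any $B > 0$, choose $N = B^{\frac{1}{p}}$. Then by $p > 0$, we have
$$x > N \implies x^p > N^p = B.$$
Changing $x$ to $\frac{1}{x}$, we get the similar conclusions for $p < 0$
$$\lim_{x\to a} x^p = \begin{cases} a^p, & \text{if } p < 0,\ 0 < a < +\infty, \\ +\infty, & \text{if } p < 0,\ a = 0^+, \\ 0, & \text{if } p < 0,\ a = +\infty. \end{cases}$$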
Example 1.6.5. We try to rigorously prove the limit in Example 1.5.5. Example 1.2.6 gives the rigorous proof for the limit of the similar sequence. Instead of copying the proof, we make a slightly different estimation of $|f(x) - l|$
$$\left|\frac{2x^2 + x}{x^2 - x + 1} - 2\right| = \frac{|3x - 2|}{|x^2 - x + 1|} < \frac{4|x|}{\frac{x^2}{3}} = \frac{12}{|x|}.$$
The crucial inequality is based on the intuition that $|3x - 2| < 4|x|$ and $|x^2 - x + 1| > \frac{x^2}{3}$ for sufficiently big $x$. The first inequality is satisfied when $2 < |x|$, and the second is satisfied when $|x| < \frac{x^2}{3}$ and $1 < \frac{x^2}{3}$. It is easy to see that $2 < |x|$, $|x| < \frac{x^2}{3}$ and $1 < \frac{x^2}{3}$ are all satisfied when $|x| > 4$. Therefore $|f(x) - l| < \frac{12}{|x|}$ when $|x| > 4$.
Formally, for any $\epsilon > 0$, choose $N = \max\left\{4, \frac{12}{\epsilon}\right\}$. Then
$$\begin{aligned}
|x| > N &\implies |x| > 4,\ |x| > \frac{12}{\epsilon} \\
&\implies |3x - 2| < 4|x|,\ |x^2 - x + 1| > \frac{x^2}{3},\ \frac{12}{|x|} < \epsilon \\
&\implies \left|\frac{2x^2 + x}{x^2 - x + 1} - 2\right| = \frac{|3x - 2|}{|x^2 - x + 1|} < \frac{4|x|}{\frac{x^2}{3}} = \frac{12}{|x|} < \epsilon.
\end{aligned}$$
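The choice $N = \max\left\{4, \frac{12}{\epsilon}\right\}$ can be spot-checked in the same spirit. The following is a minimal sketch; the helper names `bound_N` and `bound_works` and the sample points are hypothetical, and the check on finitely many points only illustrates the estimate.

```python
def f(x):
    """The function (2x^2 + x) / (x^2 - x + 1) from Example 1.6.5."""
    return (2 * x * x + x) / (x * x - x + 1)

def bound_N(eps):
    """The N chosen in the proof: max{4, 12/eps}."""
    return max(4.0, 12.0 / eps)

def bound_works(eps, samples=200):
    """Check |f(x) - 2| < eps at sample points with |x| > N."""
    N = bound_N(eps)
    for i in range(1, samples + 1):
        t = N * (1 + i / 10)         # t > N
        for x in (t, -t):
            if not abs(f(x) - 2) < eps:
                return False
    return True

for eps in (1, 0.1, 1e-3, 1e-6):
    print(eps, bound_N(eps), bound_works(eps))
```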
Exercise 1.6.2. Extend the proof of limx→a c = c and limx→a x = a to the case a = ±∞.
Exercise 1.6.4. Rigorously prove the limits. For the first three limits, give direct proof
instead of using Example 1.5.17.
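1. $\lim_{x\to a} x^2 = a^2$.
2. $\lim_{x\to a} \sqrt{x} = \sqrt{a}$, $a > 0$.
3. $\lim_{x\to a} \frac{1}{x} = \frac{1}{a}$, $a \ne 0$.
4. $\lim_{x\to\infty} \frac{x}{x + a} = 1$.
5. $\lim_{x\to+\infty} (\sqrt{x + a} - \sqrt{x + b}) = 0$.
6. $\lim_{x\to\infty} \sqrt{\frac{x + a}{x + b}} = 1$.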
Exercise 1.6.5. Suppose $f(x) \ge 0$ for $x$ near $a$ and $\lim_{x\to a} f(x) = 0$. Suppose $g(x) \ge c$ for $x$ near $a$ and constant $c > 0$. Prove that $\lim_{x\to a} f(x)^{g(x)} = 0$. This extends Exercise 1.1.34 to the function limit.
Example 1.6.7. We prove the extended arithmetic rule (+∞)·l = +∞ for l > 0. This
means that, if limx→a f (x) = +∞ and limx→a g(x) = l > 0, then limx→a f (x)g(x) =
+∞.
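For any $B$, there is $\delta_1 > 0$, such that
$$0 < |x - a| < \delta_1 \implies f(x) > \frac{2}{l} B.$$
For $\epsilon = \frac{l}{2} > 0$, there is $\delta_2 > 0$, such that
$$0 < |x - a| < \delta_2 \implies |g(x) - l| < \frac{l}{2} \implies g(x) > \frac{l}{2}.$$
Combining the two implications, we get
$$0 < |x - a| < \delta = \min\{\delta_1, \delta_2\} \implies f(x)g(x) > \frac{2}{l} B \cdot \frac{l}{2} = B.$$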
This completes the proof of (+∞) · l = +∞.
As an application of the extended arithmetic rule, we have
$$\lim_{x\to+\infty} (x^3 - 3x + 1) = \lim_{x\to+\infty} x^3 \lim_{x\to+\infty} \left(1 - \frac{3}{x^2} + \frac{1}{x^3}\right) = (+\infty) \cdot 1 = +\infty.$$
In general, we have
$$\lim_{x\to+\infty} (a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0) = \begin{cases} +\infty, & \text{if } a_n > 0, \\ -\infty, & \text{if } a_n < 0. \end{cases}$$
Example 1.6.9. We prove the second case of the composition rule in Proposition 1.5.5.
In other words, limx→a f (x) = b and limy→b g(y) = g(b) imply limx→a g(f (x)) = g(b).
By $\lim_{y\to b} g(y) = g(b)$, for any $\epsilon > 0$, there is $\mu > 0$, such that
$$0 < |b - y| < \mu \implies |g(y) - g(b)| < \epsilon.$$
Since the right side also holds when $y = b$, we actually have
$$|y - b| < \mu \implies |g(y) - g(b)| < \epsilon. \tag{1.6.1}$$
On the other hand, by $\lim_{x\to a} f(x) = b$, for the $\mu > 0$ just found above, there is $\delta > 0$, such that
$$0 < |x - a| < \delta \implies |f(x) - b| < \mu.$$
Then we get
$$0 < |x - a| < \delta \implies |f(x) - b| < \mu \implies |g(f(x)) - g(b)| < \epsilon.$$
In the second step, we apply the implication (1.6.1) to y = f (x). This completes
the proof that limx→a g(f (x)) = g(b).
72 CHAPTER 1. LIMIT
Example 1.6.10. In Example 1.5.16, we argued that $\lim_{x\to 0} f(x^2) = \lim_{x\to 0^+} f(x)$. Now we give rigorous proof.
Suppose $\lim_{x\to 0} f(x^2) = l$. Then for any $\epsilon > 0$, there is $\delta > 0$, such that
Exercise 1.6.6. Prove the arithmetic rule limx→a f (x)g(x) = limx→a f (x) limx→a g(x) in
Proposition 1.5.2.
Exercise 1.6.9. Prove the first case of the composition rule in Proposition 1.5.5.
Exercise 1.6.10. Prove the extended arithmetic rules $(-\infty) + l = -\infty$ and $\frac{l}{\infty} = 0$ for function limit.
Exercise 1.6.11. For a subset $A$ of $\mathbb{R}$, define $\lim_{x\in A, x\to a} f(x) = l$ if for any $\epsilon > 0$, there is $\delta > 0$, such that
$$x \in A,\ 0 < |x - a| < \delta \implies |f(x) - l| < \epsilon.$$
Suppose $A \cup B$ contains all the points near $a$ and $\ne a$. Prove that $\lim_{x\to a} f(x) = l$ if and only if $\lim_{x\in A, x\to a} f(x) = l = \lim_{x\in B, x\to a} f(x)$.
Exercise 1.6.12. Prove that limx→a f (x) = l implies limx→a max{f (x), l} = l and
limx→a min{f (x), l} = l. Can you state and prove the sequence version of the result?
Exercise 1.6.13. Suppose $f(x) \le 1$ for $x$ near $a$ and $\lim_{x\to a} f(x) = 1$. Suppose $g(x)$ is bounded near $a$. Prove that $\lim_{x\to a} f(x)^{g(x)} = 1$. What about the case $f(x) \ge 1$?
Exercise 1.6.14. Use Exercises 1.6.12, 1.6.13 and the sandwich rule to prove that, if $\lim_{x\to a} f(x) = 1$ and $g(x)$ is bounded near $a$, then $\lim_{x\to a} f(x)^{g(x)} = 1$. This is the function version of Exercise 1.1.34. An alternative method for doing the exercise is by extending Proposition 1.1.6.
$$\lim_{n\to\infty} x_n = a,\ x_n \ne a \text{ (at least for big } n\text{), and } \lim_{x\to a} f(x) = l \implies \lim_{n\to\infty} f(x_n) = l.$$
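Example 1.6.12. From $\lim_{x\to 0} \frac{\sin x}{x} = 1$ and $\lim_{n\to\infty} \frac{1}{n} = \lim_{n\to\infty} \frac{1}{\sqrt{n}} = 0$, we get
$$\lim_{n\to\infty} n \sin\frac{1}{n} = \lim_{n\to\infty} \frac{\sin\frac{1}{n}}{\frac{1}{n}} = 1, \quad \lim_{n\to\infty} \sqrt{n} \sin\frac{1}{\sqrt{n}} = \lim_{n\to\infty} \frac{\sin\frac{1}{\sqrt{n}}}{\frac{1}{\sqrt{n}}} = 1.$$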
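Both sequence limits can be observed numerically. The following is a minimal sketch; the function names `a_n` and `b_n` are hypothetical. The second sequence approaches 1 more slowly because $\frac{1}{\sqrt{n}}$ tends to 0 more slowly than $\frac{1}{n}$.

```python
import math

def a_n(n):
    """n * sin(1/n)."""
    return n * math.sin(1 / n)

def b_n(n):
    """sqrt(n) * sin(1/sqrt(n))."""
    root = math.sqrt(n)
    return root * math.sin(1 / root)

for n in (10, 1000, 10**6):
    print(n, a_n(n), b_n(n))
```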
[Figure: graphs of $y = D(x)$ and $y = xD(x)$, with separate branches over $\mathbb{Q}$ and $\mathbb{R} - \mathbb{Q}$.]
The above example suggests the following extension of the sandwich rule:
The proof is exactly the same as the above example. For any $\epsilon > 0$, there exists $N'$ such that $n > N'$ implies
$$|y_n - l| < \epsilon, \quad |z_n - l| < \epsilon.$$
Then if $x > N = N' + 1$, there exists $n$ satisfying $n \le x \le n + 1$ and $n \ge x - 1 > N'$, such that
$$l - \epsilon < y_n \le f(x) \le z_n < l + \epsilon.$$
In other words, for $x > N$,
$$|f(x) - l| < \epsilon.$$
Exercise 1.6.15. Use the limit in Example 1.2.15 and the idea of Example 1.6.14 to prove that $\lim_{x\to+\infty} x^2 a^x = 0$ for $0 < a < 1$. Then use the sandwich rule to prove that $\lim_{x\to+\infty} x^p a^x = 0$ for any $p$ and $0 < a < 1$.
Example 1.6.15. The limit $\lim_{n\to\infty} \sqrt[n]{n} = 1$ in Example 1.1.8 suggests that
$$\lim_{x\to+\infty} x^{\frac{1}{x}} = 1.$$
Exercise 1.6.19. Use Exercises 1.1.34 and 1.6.18 to prove that $\lim_{n\to\infty} x_n = l > 0$ and $\lim_{n\to\infty} y_n = k$ imply $\lim_{n\to\infty} x_n^{y_n} = l^k$.
Exercise 1.6.21. Use Exercises 1.6.14 and 1.6.20 to prove that $\lim_{x\to a} f(x) = l > 0$ and $\lim_{x\to a} g(x) = k$ imply $\lim_{x\to a} f(x)^{g(x)} = l^k$.
1 n
n+1 n+1
n n+1
1. 1 − . 6. 1 + 2 . 11. .
n n +1 n
n+1 (−1)n n
1 n
n n
2. 1 + . 7. 1 + 2 . 12. .
2n n + (−1)n n + (−1)n
n2 n+2 n
n
2 1 n+1
13. .
3. 1+ . 8. 1+ . n−2
n n
n2
n2
a n n−1 n+(−1)n
4. 1+ . 2 n−1 14. .
n 9. 1+ . n
n
n+1 n n2 +(−1)n n
(−1)n
n n−1 n+1
5. 1+ 2 . 10. 1+ 2 . 15. .
n −1 n −1 n+2
Like the sequence limit, the example of $\sin\frac{1}{x}$ at 0 shows that a bounded function does not necessarily converge. However, Theorem 1.3.2 suggests that a bounded monotone function should converge.
It is strictly increasing if
The concepts of decreasing and strictly decreasing are similar, and a function is
(strictly) monotone if it is either (strictly) increasing or (strictly) decreasing.
Theorem 1.6.6. If f (x) is monotone and bounded on (a, a + δ), then limx→a+ f (x)
converges.
The theorem also holds for the left limit. The theorem can be proved by choosing
a decreasing sequence xn converging to a. Then f (xn ) is a bounded decreasing
sequence. By Theorem 1.3.2, we have limn→∞ f (xn ) = l. Then limx→a+ f (x) = l
can be proved by comparing with the sequence limit, similar to (actually simpler
than) Examples 1.6.15 and 1.6.17.
The Cauchy criterion in Theorem 1.3.3 can also be extended to functions.
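Theorem 1.6.7 (Cauchy Criterion). The limit $\lim_{x\to a} f(x)$ converges if and only if for any $\epsilon > 0$, there is $\delta > 0$, such that
$$0 < |x - a| < \delta,\ 0 < |y - a| < \delta \implies |f(x) - f(y)| < \epsilon.$$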
The criterion also holds for one sided limit. Again the proof starts by choosing a
sequence xn converging to a, then applying the sequence version of Cauchy criterion
(Theorem 1.3.3) to f (xn ), and then comparing limn→∞ f (xn ) and limx→a f (x).
Exercise 1.6.24. If f (x) is monotone and bounded on (a−δ, a)∪(a, a+δ), does limx→a f (x)
converge?
Exercise 1.6.25. Prove that if limx→a f (x) converges, then f (x) satisfies the Cauchy crite-
rion.
1.7 Continuity
A function is continuous at a if its graph is “not broken” at a. For example, the
function in Figure 1.7.1 is continuous at a2 and a5 , and we have limx→a2 f (x) = f (a2 )
and limx→a5 f (x) = f (a5 ). It is not continuous at the other ai for various reasons.
The function diverges at a1 and a7 because it has different left and right limits. The
function converges at a3 but the limit is not f (a3 ). The function diverges to infinity
at a4 . The function diverges at a6 because the left limit diverges.
The left side $\lim_{x\to a} f(x)$ implies that the function should be defined for $x$ near $a$ and $x \ne a$. The right side $f(a)$ implies that the function is also defined at $a$. Therefore the concept of continuity can only be applied to functions that are defined near $a$ and including $a$, which means all $x$ satisfying $|x - a| < \delta$ for some $\delta > 0$. Then by the definition of $\lim_{x\to a} f(x)$, it is easy to see that $f(x)$ is continuous at $a$ if and only if for any $\epsilon > 0$, there is $\delta > 0$, such that
$$|x - a| < \delta \implies |f(x) - f(a)| < \epsilon.$$
[Figure 1.7.1: graph of a function with marked points $a_1, a_2, \dots, a_7$.]
In other words, the continuity of $f$ means that the limit and the evaluation of $f$ can be exchanged. By using Proposition 1.6.2 (another variant of the composition rule), the same remark can be applied to a sequence limit $\lim_{n\to\infty} x_n = b$ instead of the function limit $\lim_{x\to a} g(x) = b$, and we get
$$\lim_{n\to\infty} f(x_n) = f(b) = f\left(\lim_{n\to\infty} x_n\right).$$
Example 1.7.1. The sign function in Example 1.5.13 is continuous everywhere except
at 0. The Dirichlet function D(x) in Example 1.6.13 is not continuous anywhere.
The function xD(x) is continuous at 0 and not continuous at all the other places.
Example 1.7.2. The function $\frac{x^3 - 1}{x - 1}$ in Example 1.5.3 is not defined at $x = 1$, and we cannot talk about its continuity at the point. In order to make the function continuous at 1, we need to assign the value of the function at 1 to be $\lim_{x\to 1} \frac{x^3 - 1}{x - 1} = 3$, and we get a continuous function
$$f(x) = \begin{cases} \frac{x^3 - 1}{x - 1}, & \text{if } x \ne 1, \\ 3, & \text{if } x = 1 \end{cases} = x^2 + x + 1.$$
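Example 1.7.3. By the composition rule and the continuity of $\sqrt{x}$, $a^x$ and $\sin x$, we have
$$\lim_{x\to 2} \sqrt{x^3 + 1} = \sqrt{\lim_{x\to 2} (x^3 + 1)} = 3,$$
$$\lim_{n\to\infty} a^{\frac{n}{n^2 - 1}} = a^{\lim_{n\to\infty} \frac{n}{n^2 - 1}} = 1,$$
$$\lim_{x\to 1^-} \sin\sqrt{1 - x^2} = \sin \lim_{x\to 1^-} \sqrt{1 - x^2} = \sin\sqrt{\lim_{x\to 1^-} (1 - x^2)} = 0.$$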
Example 1.7.4. The continuity of $\sqrt{x}$ implies that, if $\lim_{n\to\infty} x_n = l > 0$, then $\lim_{n\to\infty} \sqrt{x_n} = \sqrt{l}$. This is Example 1.2.17.
The continuity of $a^x$ implies that, if $\lim_{n\to\infty} x_n = l$, then $\lim_{n\to\infty} a^{x_n} = a^l$. This implies the limit $\lim_{n\to\infty} \sqrt[n]{a} = 1$ in Example 1.2.14, and also implies the limits such as $\lim_{n\to\infty} a^{\frac{1}{\sqrt{n}}} = 1$ and $\lim_{n\to\infty} a^{\frac{\sqrt{n} + 1}{\sqrt{n} - \sin n}} = a$.
Exercise 1.7.1. Determine the intervals on which the function is continuous. Is it possible
to extend to a continuous function at more points?
1. $\frac{x^2 - 3x + 2}{x^2 - 1}$.
2. $\frac{x^2 - 1}{x - 1}$.
3. $\mathrm{sign}(x)$.
4. $x \sin\frac{1}{x}$.
5. $x^x$.
6. $\frac{\cos x}{2x - \pi}$.
Exercise 1.7.4. Find two continuous functions $f(x)$ and $g(x)$, such that $\lim_{x\to 0} \frac{1 + f(0)g(x)}{1 + f(x)g(0)}$ converges but the value is not 1.
Theorem 1.7.2 (Intermediate Value Theorem). If f (x) is continuous on [a, b], then
for any number γ between f (a) and f (b), there is c ∈ [a, b] satisfying f (c) = γ.
[Figure: a point $c$ between $a$ and $b$ on the $x$-axis.]
Back to $f(x) = x^3 - 3x + 1$. In Example 1.7.5, we actually already know that $f(x)$
has at least one root on (0, 1). In fact, by f (−b) < 0, f (0) > 0, f (1) < 0, f (b) > 0,
we know f has at least one root on each of the intervals (−b, 0), (0, 1), (1, b). Since
a polynomial of order 3 has at most three roots, we conclude that f (x) has exactly
one root on each of the three intervals.
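The sign changes also give a practical way to locate the three roots, by repeatedly halving an interval on which the function changes sign. The following is a minimal sketch of this bisection idea, assuming $b = 2$ (for which $f(-2) = -1 < 0$ and $f(2) = 3 > 0$); the helper name `bisect_root` and the tolerance are choices made here, not part of the text.

```python
def f(x):
    """The polynomial x^3 - 3x + 1."""
    return x**3 - 3 * x + 1

def bisect_root(func, lo, hi, tol=1e-12):
    """Locate a root of func in [lo, hi], assuming func(lo) and func(hi) have opposite signs."""
    f_lo, f_hi = func(lo), func(hi)
    if f_lo * f_hi >= 0:
        raise ValueError("func must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if f_mid == 0:
            return mid
        if f_lo * f_mid < 0:
            hi = mid                 # sign change is in the left half
        else:
            lo, f_lo = mid, f_mid    # sign change is in the right half
    return (lo + hi) / 2

# One root on each of (-b, 0), (0, 1), (1, b) with the assumed b = 2.
roots = [bisect_root(f, -2.0, 0.0), bisect_root(f, 0.0, 1.0), bisect_root(f, 1.0, 2.0)]
print(roots)
```

Each halving step keeps an interval on which the Intermediate Value Theorem guarantees a root, so the method relies on exactly the continuity argument used above.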
$$\lim_{x\to-\frac{\pi}{2}^+} \tan x = \frac{\lim_{x\to-\frac{\pi}{2}^+} \sin x}{\lim_{x\to-\frac{\pi}{2}^+} \cos x} = \frac{-1}{0^+} = -\infty.$$
Therefore for any number $\gamma$, we can find $a > -\frac{\pi}{2}$ and very close to $-\frac{\pi}{2}$, such that $\tan a < \gamma$. We can also find $b < \frac{\pi}{2}$ and very close to $\frac{\pi}{2}$, such that $\tan b > \gamma$. Then $\tan x$ is continuous on $[a, b]$ and $\tan a < \gamma < \tan b$. This implies that $\gamma = \tan c$ for some $c \in (a, b)$. Therefore any number is the tangent of some angle between $-\frac{\pi}{2}$ and $\frac{\pi}{2}$.
The example shows that, if $f(x)$ is continuous on $(a, b)$ and satisfies $\lim_{x\to a^+} f(x) = -\infty$ and $\lim_{x\to b^-} f(x) = +\infty$, then $f(x)$ can take any number as value on $(a, b)$. Note that the interval $(a, b)$ here does not even have to be bounded.
satisfies f (−1) = −1, f (1) = 2, but does not take any number in (0, 1] as value. The
problem is that the function is not continuous at 0, where a jump in value misses the
numbers in (0, 1]. Therefore the Intermediate Value Theorem cannot be applied.
Exercise 1.7.5. Let f (x) : [0, 1] → [0, 1] be a continuous function. Prove that there exists
at least one c ∈ [0, 1] satisfying f (c) = c.
1. $\dfrac{x^2 - 3x + 2}{x^2 - 1}$.
2. $x^x$.
3. $e^x$.
4. $\sin x$.
5. $\sin\dfrac{1}{x}$.
6. $\begin{cases} 2x, & \text{if } -1 \le x \le 0, \\ x^2 + 3, & \text{if } 0 < x \le 1. \end{cases}$
Exercise 1.7.7. $\lim_{x\to+\infty} \cos(\sqrt{x + 2} + \sqrt{x})$ and $\lim_{x\to+\infty} \sqrt{x}(\sin\sqrt{x + 2} - \sin\sqrt{x})$ diverge.
then $x^2 = y$ also has a unique non-positive solution and gives the inverse
\[
f_2^{-1}(y) = -\sqrt{y} : [0, +\infty) \to (-\infty, 0].
\]
has inverse
\[
f_3^{-1}(y) = \begin{cases} \sqrt{y}, & \text{if } 0 \le y < 1, \\ -\sqrt{y}, & \text{if } y \ge 1 \end{cases} : [0, +\infty) \to (-\infty, -1] \cup [0, 1).
\]
The examples show that, for the concept of inverse function to be unambiguous,
we have to specify the ranges for the variable and the value. In this regard, if two
functions have the same formula but different ranges, then we should really think
of them as different functions.
In general, if a function $f(x)$ is defined for all $x \in D$, then $D$ is the domain of the function, and the set of all the values of $f(x)$ is the range
\[
R = \{f(x) : x \in D\}.
\]
With the domain and range explicitly specified, we express the function as a map
f (x) : D → R.
Now the equation f (x) = y has solution only when y ∈ R. Moreover, we need to
make sure that the solution is unique in order for the inverse to be unambiguous.
This means that the function is one-to-one:
\[
x_1, x_2 \in D,\ x_1 \ne x_2 \implies f(x_1) \ne f(x_2).
\]
Equivalently,
\[
x_1, x_2 \in D,\ f(x_1) = f(x_2) \implies x_1 = x_2.
\]
Example 1.7.11. Consider the function $f(x) = x^5 + 3x^3 + 1$ defined for all $x$. By the remark in Example 1.7.7, together with (see Example 1.6.7)
\[
\lim_{x\to-\infty} f(x) = -\infty, \quad \lim_{x\to+\infty} f(x) = +\infty,
\]
we see that any number can be the value of $f(x)$. This shows that the range of $f$ is $\mathbb{R}$.
Is the function one-to-one? This can be established as follows. If $x_1 \ne x_2$, then either $x_1 < x_2$ or $x_1 > x_2$. In the first case, we have
\[
f(x_1) = x_1^5 + 3x_1^3 + 1 < x_2^5 + 3x_2^3 + 1 = f(x_2).
\]
By switching the roles of $x_1$ and $x_2$, we get $x_1 > x_2$ implying $f(x_1) > f(x_2)$. Either way, we get $x_1 \ne x_2$ implying $f(x_1) \ne f(x_2)$.
We conclude that $f(x) = x^5 + 3x^3 + 1 : \mathbb{R} \to \mathbb{R}$ is invertible.
Example 1.7.12. The sine function can take any value in [−1, 1]. To make sure it is
one-to-one, we specify the domain and range
\[
\sin x : \left[-\frac{\pi}{2}, \frac{\pi}{2}\right] \to [-1, 1].
\]
This is strictly increasing and continuous, and takes any number in [−1, 1] as value.
Therefore we get the inverse sine function
\[
\arcsin y : [-1, 1] \to \left[-\frac{\pi}{2}, \frac{\pi}{2}\right].
\]
The inverse sine function is also strictly increasing and continuous.
[Figure: graphs of $\sin x$ and $\arcsin x$, and of $\tan x$ and $\arctan x$.]
Then by an argument similar to Example 1.7.7, any number in $(0, +\infty)$ is a value of $a^x$. Therefore the exponential function has an inverse
\[
\log_a x : (0, +\infty) \to \mathbb{R},
\]
called the logarithmic function with base $a$. Like the exponential function, the logarithmic function is strictly increasing and continuous. We can also show
\[
\lim_{x\to+\infty} \log_a x = +\infty, \quad \lim_{x\to 0^+} \log_a x = -\infty, \quad \text{for } a > 1
\]
by a method similar to Example 1.7.13.
The logarithm $\log_e x$ based on the special value $e$ is called the natural logarithm. We will denote the natural logarithm simply by $\log x$.
For $0 < a < 1$, the exponential $a^x$ is strictly decreasing and continuous. The corresponding logarithm can be similarly defined and is also strictly decreasing and continuous. Moreover, we have
\[
\lim_{x\to+\infty} \log_a x = -\infty, \quad \lim_{x\to 0^+} \log_a x = +\infty, \quad \text{for } 0 < a < 1.
\]
Exercise 1.7.8. Let f be a strictly increasing function. Show that its inverse is also strictly
increasing.
[Figure: graphs of $e^x$ and $\log x$.]
$x \to a \iff y \to b$.
$x \ne a \iff y \ne b$.
Therefore the composition rule can be applied in both directions, and we have
Here the equality means that the convergence of one side is equivalent to the convergence of the other, and the limits have the same value.
Example 1.7.16. Since $\sin x$ is strictly increasing and continuous near 0, we have
\[
\lim_{y\to 0} \frac{\arcsin y}{y} = \lim_{x\to 0} \frac{x}{\sin x} = 1.
\]
Example 1.7.17. In Examples 1.5.11, 1.5.16, 1.6.10, we find $\lim_{x\to a} f(x^2) = \lim_{x\to a^2} f(x)$ for $a \ne 0$ and $\lim_{x\to 0} f(x^2) = \lim_{x\to 0^+} f(x)$. Here we explain the two equalities from the viewpoint of continuous change of variable.
The function
\[
x^2 : [0, +\infty) \to [0, +\infty)
\]
is strictly increasing and continuous, and is therefore invertible, with strictly increasing and continuous inverse
\[
\sqrt{x} : [0, +\infty) \to [0, +\infty).
\]
The continuous change of variable implies that $\lim_{x\to a} f(x^2) = \lim_{x\to a^2} f(x)$ for $a > 0$ and $\lim_{x\to 0^+} f(x^2) = \lim_{x\to 0^+} f(x)$. Note that the second equality makes use
is strictly decreasing and continuous, and is therefore invertible, with strictly decreasing and continuous inverse. This implies that $\lim_{x\to a} f(x^2) = \lim_{x\to a^2} f(x)$ for $a < 0$ and $\lim_{x\to 0^-} f(x^2) = \lim_{x\to 0^+} f(x)$.
By
\[
\lim_{x\to 0^+} f(x^2) = \lim_{x\to 0^+} f(x) = \lim_{x\to 0^-} f(x^2),
\]
we also conclude that $\lim_{x\to 0} f(x^2) = \lim_{x\to 0^+} f(x)$.
Example 1.7.18. Since the natural logarithm $\log x$ is continuous, we may move the limit from outside the logarithm to inside the logarithm
\[
\lim_{x\to 0} \frac{\log(x + 1)}{x} = \lim_{x\to 0} \log(1 + x)^{\frac{1}{x}} = \log \lim_{x\to 0} (1 + x)^{\frac{1}{x}} = \log e = 1.
\]
1. $\lim_{x\to 0} \dfrac{x}{\log(ax + 1)}$.
2. $\lim_{x\to 1} \dfrac{x^2 - 1}{\log x}$.
3. $\lim_{x\to 0} \dfrac{\log(x^2 + 3x + 2) - \log 2}{x}$.
4. $\lim_{x\to 1} \dfrac{\log x}{\sin \pi x}$.
\[
\lim_{x\to 0} \frac{e^x - 1}{x} = \lim_{y\to 0} \frac{y}{\log(y + 1)} = 1.
\]
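The three limits obtained by change of variable, for $\arcsin y / y$, $\log(1 + x)/x$ and $(e^x - 1)/x$, can be observed numerically. The following is a rough sketch using the standard `math` module; the helper name `ratios` and the sample step sizes are illustrative choices, and a numerical table of course does not replace the proofs.

```python
import math

def ratios(h):
    """Evaluate the three quotients at a small nonzero h."""
    return (
        math.log1p(h) / h,   # log(1 + h) / h
        math.expm1(h) / h,   # (e^h - 1) / h
        math.asin(h) / h,    # arcsin(h) / h
    )

# All three columns should approach 1 as h approaches 0 from either side.
for h in (1e-1, 1e-3, 1e-5, -1e-5):
    print(h, ratios(h))
```

The functions `log1p` and `expm1` are used because they stay accurate for small arguments, where `log(1 + h)` and `exp(h) - 1` would lose digits.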
Example 1.7.20. The continuity of the logarithm implies the continuity of the function $x^x = e^{x \log x}$ on $(0, \infty)$. In general, if $f(x)$ and $g(x)$ are continuous and $f(x) > 0$, then $f(x)^{g(x)} = e^{g(x) \log f(x)}$ is continuous.
The continuity of the exponential and the logarithm can also be used to prove that
\[
\lim_{n\to\infty} a_n = l > 0,\ \lim_{n\to\infty} b_n = k \implies \lim_{n\to\infty} a_n^{b_n} = l^k.
\]
The reason is that the continuity of $\log$ implies $\lim_{n\to\infty} \log a_n = \log l$. Then the arithmetic rule implies $\lim_{n\to\infty} b_n \log a_n = k \log l$. Finally, the continuity of the exponential implies
\[
\lim_{n\to\infty} a_n^{b_n} = \lim_{n\to\infty} e^{b_n \log a_n} = e^{k \log l} = l^k.
\]
In Exercises 1.6.19 and 1.6.21, we took a number of steps to prove the same
property, without using the logarithmic function.
Using change of variable and Examples 1.7.18 and 1.7.19, we get
1 Leonhard Paul Euler, born 1707 in Basel (Switzerland), died 1783 in St. Petersburg (Russia). Euler is one of the greatest mathematicians of all time. He made important discoveries in almost all areas of mathematics. Many theorems, quantities, and equations are named after Euler. He also introduced much of the modern mathematical terminology and notation, including $f(x)$, $e$, $\Sigma$ (for summation), $i$ (for $\sqrt{-1}$), and modern notations for trigonometric functions.
2 Lorenzo Mascheroni, born 1750 in Lombardo-Veneto (now Italy), died 1800 in Paris (France). The Euler-Mascheroni constant first appeared in a paper by Euler in 1735. Euler calculated the constant to 6 decimal places in 1734, and to 16 decimal places in 1736. Mascheroni calculated the constant to 20 decimal places in 1790.
Chapter 2
Differentiation
2.1.1 Derivative
Geometrically, a function may be represented by its graph. The graph of a linear
function is a straight line. Therefore a linear approximation at x0 is a straight line
that “best fits” the graph of the given function near x0 . This is the tangent line of
the function.
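[Figure 2.1.1: the straight line $L_{PQ}$ through the points $P$ and $Q$ on the graph, and the tangent line $L$ at $P$, above $x_0$ and $x$.]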
Specifically, the point $P$ in Figure 2.1.1 is the point $(x_0, f(x_0))$ on the graph of $f(x)$. We pick a nearby point $Q = (x, f(x))$ on the graph, for $x$ near $x_0$. The straight line connecting $P$ and $Q$ is the linear function (the variable in $L_{PQ}$ is $t$ because $x$ is already used for $Q$)
\[
L_{PQ}(t) = f(x_0) + \frac{f(x) - f(x_0)}{x - x_0}(t - x_0).
\]
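To see how the line through $P$ and $Q$ approaches the tangent line as $Q$ moves toward $P$, one can tabulate the slope of $L_{PQ}$ for $x$ closer and closer to $x_0$. The sketch below assumes the sample function $f(x) = x^2$ and $x_0 = 1$, neither of which comes from the text; the helper names are also hypothetical.

```python
def secant_slope(f, x0, x):
    """Slope of the straight line through P = (x0, f(x0)) and Q = (x, f(x))."""
    return (f(x) - f(x0)) / (x - x0)

def secant_line(f, x0, x):
    """Return L_PQ as a function of t."""
    slope = secant_slope(f, x0, x)
    return lambda t: f(x0) + slope * (t - x0)

f = lambda x: x**2   # sample function, chosen only for illustration
x0 = 1.0
slopes = [secant_slope(f, x0, x0 + h) for h in (0.1, 0.01, 0.001)]
print(slopes)        # the slopes approach 2, the slope of the tangent at x0 = 1
```

By construction `secant_line(f, x0, x)` passes through both $P$ and $Q$, so evaluating it at `x0` and at `x` returns `f(x0)` and `f(x)`.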
The notation $f'$ for the derivative is due to Joseph Louis Lagrange. It is simple and convenient, but could become ambiguous when there are several variables related in more complicated ways. Another notation $\frac{df}{dx}$, due to Gottfried Wilhelm Leibniz,
Example 2.1.1. The function $f(x) = 3x - 2$ is already linear. So its linear approximation must be $L(x) = f(x) = 3x - 2$. This reflects the intuition that, if the distance is exactly 7m5cm, then the measure by the ruler in centimeters should be 7m5cm. In particular, the derivative $f'(x) = 3$, or $\frac{d(3x - 2)}{dx} = 3$. In general, we have $(A + Bx)' = B$.
Exercise 2.1.1. Find the linear approximations and then the derivatives.
1. $5x + 3$ at $x_0$.
2. $x^3 - 2x + 1$ at 1.
3. $x^2$ at $x_0$.
4. $x^3$ at $x_0$.
5. $x^n$ at 1.
6. $x^n$ at $x_0$.
Therefore $\frac{1}{x}$ is differentiable at $x_0$, and the linear approximation is $\frac{1}{x_0} - \frac{1}{x_0^2}(x - x_0)$. We express the derivative as
\[
\left(\frac{1}{x}\right)' = -\frac{1}{x^2}.
\]
Therefore $\sqrt{x}$ is differentiable, and the linear approximation is $\sqrt{x_0} + \frac{1}{2\sqrt{x_0}}(x - x_0)$. We express the derivative as
\[
(\sqrt{x})' = \frac{d\sqrt{x}}{dx} = \frac{1}{2\sqrt{x}}.
\]
\[
\left.\frac{d(x^p)}{dx}\right|_{x=1} = \lim_{h\to 0} \frac{(1 + h)^p - 1}{h} = p.
\]
Exercise 2.1.3. Find the derivatives and then the linear approximations.
1. $\sqrt[3]{x}$ at 1.
3. $\cos x^2$ at 0.
5. $\arcsin x$ at 0.
7. $\sin \sin x$ at 0.
Exercise 2.1.5. We have $\log |x| = \log(-x)$ for $x < 0$. Show that the derivative of $\log(-x)$ at $x_0 < 0$ is $\frac{1}{x_0}$. Then interpret your result as
\[
(\log |x|)' = \frac{1}{x}.
\]
Exercise 2.1.7. Suppose $p$ is an odd integer. Then $x^p$ is defined for $x < 0$. Do we still have $(x^p)' = px^{p-1}$ for $x < 0$?
is 1, the constant approximation means that, for any $\epsilon > 0$, there is $\delta > 0$, such that
\[
|x - x_0| < \delta \implies |f(x) - a| \le \epsilon.
\]
This means exactly that f (x) is continuous at x0 , and the approximating constant
is a = f (x0 ). Therefore the fact of linear approximation implying constant approx-
imation means the following.
Example 2.1.11. The absolute value function |x| is continuous everywhere. Yet the
derivative
\[
(|x|)'|_{x=0} = \lim_{x\to 0} \frac{|x| - |0|}{x - 0} = \lim_{x\to 0} \operatorname{sign}(x)
\]
Example 2.1.12. The Dirichlet function D(x) in Example 1.6.13 is not continuous
anywhere and is therefore not differentiable anywhere.
On the other hand, the function xD(x) is continuous at 0. Yet the derivative
\[
(xD(x))'|_{x=0} = \lim_{x\to 0} \frac{xD(x)}{x} = \lim_{x\to 0} D(x)
\]
at 0.
Exercise 2.1.13. Let [x] be the greatest integer ≤ x. Study the differentiability of [x].
The derivative $f'(x_0)$ exists if and only if both $f'_+(x_0)$ and $f'_-(x_0)$ exist and are equal.
\[
(|x|)'|_{\text{at } 0^+} = \lim_{x\to 0^+} \frac{|x|}{x} = 1, \quad (|x|)'|_{\text{at } 0^-} = \lim_{x\to 0^-} \frac{|x|}{x} = -1.
\]
Therefore |x| has left and right derivatives. Since the two one sided derivatives are
different, the function is not differentiable at 0.
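The two one sided derivatives of $|x|$ at 0 can be seen directly from one sided difference quotients. This is a small sketch; the helper name `one_sided_quotients` and the step size are illustrative assumptions.

```python
def one_sided_quotients(f, x0, h):
    """Difference quotients of f at x0 taken from the left and from the right (h > 0)."""
    right = (f(x0 + h) - f(x0)) / h
    left = (f(x0 - h) - f(x0)) / (-h)
    return left, right

left, right = one_sided_quotients(abs, 0.0, 1e-6)
print(left, right)   # the quotients disagree, so |x| has no derivative at 0
```

For a function that is differentiable at the point, such as $x^2$ at 1, the two quotients agree up to a small error.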
1. $|x^2 - 3x + 2|$ at 0, 1, 2.
2. $\sqrt{1 - \cos x}$ at 0.
3. $|\sin x|$ at 0.
4. $|\pi^2 - x^2| \sin x$ at $\pi$.
is differentiable at 1.
Exercise 2.1.18. For $p \ge 0$, $x^p$ is defined on $[0, \delta)$. What is the right derivative of $x^p$ at 0?
Exercise 2.1.19. For some $p$ (see Exercises 2.1.7 and 2.1.18), $x^p$ is defined on $(-\delta, \delta)$. What is the derivative of $x^p$ at 0?
\[
(f + g)'(x) = f'(x) + g'(x), \quad \text{or} \quad \frac{d(f + g)}{dx} = \frac{df}{dx} + \frac{dg}{dx}.
\]
Although the approximation is not linear, the square unit $(x - x_0)^2$ is much smaller than $x - x_0$ when $x$ is close to $x_0$. Therefore $f(x)g(x)$ is differentiable and has linear approximation $ac + (bc + ad)(x - x_0)$, and we get $(fg)'(x_0) = bc + ad$. By $a = f(x_0)$, $b = f'(x_0)$, $c = g(x_0)$, $d = g'(x_0)$, we get the Leibniz rule
\[
(fg)'(x) = f'(x)g(x) + f(x)g'(x), \quad \text{or} \quad \frac{d(fg)}{dx} = \frac{df}{dx}g + f\frac{dg}{dx}.
\]
The explanations above of the derivatives of arithmetic combinations are analogous to the arithmetic properties of limits.
Exercise 2.2.3. Find a polynomial $p(x)$, such that $(p(x)e^x)' = x^2 e^x$. In general, suppose $(p_n(x)e^x)' = x^n e^x$. Find the relation between polynomials $p_n(x)$.
Exercise 2.2.4. Find polynomials $p(x)$ and $q(x)$, such that $(p(x) \sin x + q(x) \cos x)' = x^2 \sin x$. Moreover, find a function with derivative $x^2 \cos x$.
Exercise 2.2.5. Find constants $A$ and $B$, such that $(Ae^x \sin x + Be^x \cos x)' = e^x \sin x$. What about $(Ae^x \sin x + Be^x \cos x)' = e^x \cos x$?
Example 2.2.1. We know $(\log x)' = \frac{1}{x}$ for $x > 0$. For $x < 0$, we have
\[
(\log(-x))' = (\log y)'|_{y=-x} (-x)' = \left.\frac{1}{y}\right|_{y=-x} (-1) = \frac{1}{-x}(-1) = \frac{1}{x}.
\]
Therefore we conclude
\[
(\log |x|)' = \frac{1}{x}, \quad \text{for } x \ne 0.
\]
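Example 2.2.2. In Example 2.1.5, we use the definition to derive $(x^p)' = px^{p-1}$. Alternatively, we may also derive the derivative of $x^p$ at general $x_0 > 0$ from the derivative $(x^p)'|_{x=1} = p$ at a special place.
To move from $x_0$ to 1, we introduce $y = \frac{x}{x_0}$. Then $x^p$ is the composition
\[
x \mapsto y = \frac{x}{x_0} \mapsto z = x^p = x_0^p y^p.
\]
Then $x = x_0$ corresponds to $y = 1$, and we have
\[
\begin{aligned}
\left.\frac{d(x^p)}{dx}\right|_{x=x_0}
&= \left.\frac{dz}{dy}\right|_{y=1} \left.\frac{dy}{dx}\right|_{x=x_0}
= \left.\frac{d(x_0^p y^p)}{dy}\right|_{y=1} \left.\frac{d}{dx}\left(\frac{x}{x_0}\right)\right|_{x=x_0} \\
&= x_0^p \cdot \left.\frac{d(y^p)}{dy}\right|_{y=1} \cdot \frac{1}{x_0}
= x_0^p \cdot p \cdot \frac{1}{x_0} = p x_0^{p-1}.
\end{aligned}
\]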
Exercise 2.2.6. Use the derivative at a special place to find the derivative at other places.
Exercise 2.2.8. A function f (x) is odd if f (−x) = −f (x), and is even if f (−x) = f (x).
What can you say about the derivative of an odd function and the derivative of an even
function?
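Then we may use the Leibniz rule to get the derivative of quotient
\[
\begin{aligned}
\left(\frac{f(x)}{g(x)}\right)'
&= \left(f(x)\frac{1}{g(x)}\right)'
= f'(x)\frac{1}{g(x)} + f(x)\left(\frac{1}{g(x)}\right)' \\
&= f'(x)\frac{1}{g(x)} - f(x)\frac{g'(x)}{g(x)^2}
= \frac{f'(x)g(x) - f(x)g'(x)}{g(x)^2}.
\end{aligned}
\]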
2. $\dfrac{x + 1}{x - 2}$.
4. $\dfrac{x^3 + 1}{x^2 - x + 1}$.
6. $\dfrac{ax + b}{cx + d}$.
8. $\dfrac{1}{(x + a)(x + b)}$.
1. $\dfrac{\log x}{x}$.
2. $\dfrac{\log x}{x^p}$.
3. $\dfrac{x^p}{\log x}$.
4. $\dfrac{e^x}{x \log x}$.
\[
\sinh x = \frac{e^x - e^{-x}}{2}, \quad \cosh x = \frac{e^x + e^{-x}}{2},
\]
and
\[
\tanh x = \frac{\sinh x}{\cosh x}, \quad \coth x = \frac{\cosh x}{\sinh x}, \quad \operatorname{sech} x = \frac{1}{\cosh x}, \quad \operatorname{csch} x = \frac{1}{\sinh x}.
\]
Find their derivatives and express them in hyperbolic trigonometric functions.
2. $(x^2 - 1)e^x$.
3. $e^{(e^x)}$.
4. $e^{\log x}$.
6. $\log(\log x)$.
7. $\log\dfrac{1}{\log x}$.
8. $\log(\log(\log x))$.
10. $\log |\tan x|$.
11. $\log |\sec x - \tan x|$.
12. $\log\dfrac{1 - \sin x}{1 + \sin x}$.
1. $|x^2(x + 2)^3|$.
2. $|\sin^3 x|$.
3. $|x(e^x - 1)|$.
4. $|(x - 1)^2 \log x|$.
Exercise 2.2.22. Find constants $A$ and $B$, such that $(Ae^{ax} \sin bx + Be^{ax} \cos bx)' = e^{ax} \cos bx$. What about $(Ae^{ax} \sin bx + Be^{ax} \cos bx)' = e^{ax} \sin bx$?
Exercise 2.2.24. Compute the derivative of $\log\dfrac{x}{x + 1}$. Then find a function with derivative $\dfrac{1}{(x + a)(x + b)}$. In case $a^2 \ge 4b$, can you find a function with derivative $\dfrac{1}{x^2 + ax + b}$?
Exercise 2.2.25. Compute the derivative of $x\sqrt{x^2 + a} + a \log(x + \sqrt{x^2 + a})$. Then find a function with derivative $\sqrt{x^2 + ax + b}$.
Example 2.2.8. Suppose $u(x)$ and $v(x)$ are differentiable and $u(x) > 0$. Then $u(x)^{v(x)} = e^{v(x) \log u(x)}$, and
\[
(u(x)^{v(x)})' = (e^{v(x) \log u(x)})' = (e^y)'|_{y = v(x) \log u(x)} (v(x) \log u(x))'.
\]
By
\[
(e^y)'|_{y = v(x) \log u(x)} = e^y|_{y = v(x) \log u(x)} = e^{v(x) \log u(x)} = u(x)^{v(x)},
\]
and
\[
(v(x) \log u(x))' = v'(x) \log u(x) + v(x)(\log u)'|_{u = u(x)} u'(x) = v'(x) \log u(x) + \frac{v(x)u'(x)}{u(x)},
\]
we get
\[
\begin{aligned}
(u(x)^{v(x)})' &= u(x)^{v(x)} \left( v'(x) \log u(x) + \frac{v(x)u'(x)}{u(x)} \right) \\
&= u(x)^{v(x)-1} (u(x)v'(x) \log u(x) + u'(x)v(x)).
\end{aligned}
\]
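The formula for the derivative of $u(x)^{v(x)}$ can be compared against a numerical derivative. The sketch below assumes the sample choice $u(x) = v(x) = x$ at $x = 2$, so that the function is $x^x$; the helper names are hypothetical and the central difference is only an approximation.

```python
import math

def power_derivative(u, du, v, dv, x):
    """u(x)^v(x) * (v'(x) log u(x) + v(x) u'(x) / u(x)), valid where u(x) > 0."""
    return u(x) ** v(x) * (dv(x) * math.log(u(x)) + v(x) * du(x) / u(x))

def central_difference(f, x, h=1e-6):
    """Numerical approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Sample choice: u(x) = x and v(x) = x, so the function is x^x.
u, du = (lambda x: x), (lambda x: 1.0)
v, dv = (lambda x: x), (lambda x: 1.0)

x = 2.0
formula = power_derivative(u, du, v, dv, x)            # 2^2 (log 2 + 1)
numeric = central_difference(lambda t: u(t) ** v(t), x)
print(formula, numeric)
```

For this choice the formula reduces to $(x^x)' = x^x(\log x + 1)$.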
Exercise 2.2.29. Let $f(x) = u(x)^{v(x)}$. Then $\log f(x) = v(x) \log u(x)$. By taking the derivative on both sides of the equality, derive the formula for $f'(x)$.
Exercise 2.2.30. Use the idea of Exercise 2.2.29 to compute the derivatives.
Example 2.2.9. The unit circle $x^2 + y^2 = 1$ on the plane is made up of the graphs of two functions $y = \sqrt{1 - x^2}$ and $y = -\sqrt{1 - x^2}$. We may certainly compute the
Therefore
\[
y'(x) = -\frac{\cos t}{\sin t} = -\frac{x}{y}.
\]
In general, the derivative of a function $y = y(x)$ given by a parametrized curve $x = x(t)$, $y = y(t)$ is
\[
y'(x) = \frac{y'(t)}{x'(t)}.
\]
Note that the formula is ambiguous, in that $y'(x) = -\cot t$ and $y'(t) = \cos t$ are not the same functions. The primes in the two functions refer to $\frac{d}{dx}$ and $\frac{d}{dt}$ respectively.
So it is better to keep track of the variables by using Leibniz’s notation. The formula above becomes
\[
\frac{dy}{dx} = \frac{\;\dfrac{dy}{dt}\;}{\dfrac{dx}{dt}}.
\]
This is just another way of expressing the chain rule $\dfrac{dy}{dt} = \dfrac{dy}{dx}\dfrac{dx}{dt}$.
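For the unit circle parametrized by $x = \cos t$, $y = \sin t$, the quotient of the two derivatives in $t$ can be compared with the slope $-\frac{x}{y}$ and with the derivative of the explicit function $y = \sqrt{1 - x^2}$. This is a minimal sketch; the sample parameter $t = \frac{\pi}{3}$ and the function name are assumptions made for illustration.

```python
import math

def dy_dx(t):
    """Slope of the unit circle x = cos t, y = sin t, computed as (dy/dt) / (dx/dt)."""
    dx_dt = -math.sin(t)
    dy_dt = math.cos(t)
    return dy_dt / dx_dt

t = math.pi / 3                    # sample parameter on the upper half circle
x, y = math.cos(t), math.sin(t)
slope = dy_dx(t)                   # equals -cot t
explicit = -x / math.sqrt(1 - x**2)  # derivative of sqrt(1 - x^2) at the same x
print(slope, -x / y, explicit)
```

The quotient is undefined where $\frac{dx}{dt} = 0$, which on the circle happens at $t = 0$ and $t = \pi$, where the tangent line is vertical.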
Exercise 2.2.31. Compute the derivatives of the functions y = y(x) given by curves.
1. $x = \sin^2 t$, $y = \cos^2 t$.
Example 2.2.10. Like the unit circle, the equation $2y - 2x^2 - \sin y + 1 = 0$ is a curve on the plane, made up of the graphs of several functions $y = y(x)$. Although we cannot find an explicit formula for the functions, we can still compute their derivatives.
Taking the derivative of both sides of the equation $2y - 2x^2 - \sin y + 1 = 0$ with respect to $x$ and keeping in mind that $y$ is a function of $x$, we get $2y' - 4x - y' \cos y = 0$. Therefore
\[
y' = \frac{4x}{2 - \cos y}.
\]
The point $P = \left(\sqrt{\frac{\pi}{2}}, \frac{\pi}{2}\right)$ satisfies the equation and lies on the curve. The tangent line of the curve at the point has slope
\[
y'|_P = \frac{4\sqrt{\frac{\pi}{2}}}{2 - \cos\frac{\pi}{2}} = \sqrt{2\pi}.
\]
Therefore the tangent line at $P$ is given by the equation
\[
y - \frac{\pi}{2} = \sqrt{2\pi}\left(x - \sqrt{\frac{\pi}{2}}\right),
\]
or
\[
y = \sqrt{2\pi}\,x - \frac{\pi}{2}.
\]
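The slope obtained by implicit differentiation can be checked numerically: solve the equation for $y$ at two values of $x$ near the point $P$ and form a difference quotient. The sketch below uses Newton's method in $y$ as the solver, which is an assumption of this illustration and not part of the text; the helper names are hypothetical.

```python
import math

def F(x, y):
    """Left side of the equation 2y - 2x^2 - sin y + 1 = 0."""
    return 2 * y - 2 * x**2 - math.sin(y) + 1

def slope(x, y):
    """y' = 4x / (2 - cos y), from implicit differentiation."""
    return 4 * x / (2 - math.cos(y))

def solve_y(x, y_guess, steps=50):
    """Newton's method in y for F(x, y) = 0; dF/dy = 2 - cos y >= 1 is never zero."""
    y = y_guess
    for _ in range(steps):
        y -= F(x, y) / (2 - math.cos(y))
    return y

px, py = math.sqrt(math.pi / 2), math.pi / 2   # the point P
h = 1e-5
numeric = (solve_y(px + h, py) - solve_y(px - h, py)) / (2 * h)
print(F(px, py), slope(px, py), numeric, math.sqrt(2 * math.pi))
```

The printed values show that $P$ lies on the curve and that both the formula and the difference quotient agree with $\sqrt{2\pi}$ up to rounding.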
Exercise 2.2.34. If f (sin x) = x, what can you say about the derivative of f (x)? What if
sin f (x) = x?
Example 2.2.12. In Example 1.7.11, we argued that the function $f(x) = x^5 + 3x^3 + 1$ is invertible. The inverse $g(x)$ satisfies $g(x)^5 + 3g(x)^3 + 1 = x$. Taking the derivative in $x$ on both sides, we get $5g(x)^4 g'(x) + 9g(x)^2 g'(x) = 1$. This implies
\[
g'(x) = \frac{1}{5g(x)^4 + 9g(x)^2}.
\]
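Although the inverse $g$ has no explicit formula, it can be evaluated numerically, and the derivative formula can then be compared with a difference quotient. The following sketch computes $g$ by bisection, using that $f$ is strictly increasing; the search bracket $[-10, 10]$ and the sample point $y = 5$ (where $g(5) = 1$ because $f(1) = 5$) are assumptions of the illustration.

```python
def f(x):
    return x**5 + 3 * x**3 + 1

def g(y, lo=-10.0, hi=10.0, tol=1e-13):
    """Inverse of f by bisection; the bracket is an assumption and requires f(lo) <= y <= f(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def g_prime(y):
    """1 / (5 g(y)^4 + 9 g(y)^2); not defined at y = 1, where g(y) = 0."""
    gy = g(y)
    return 1 / (5 * gy**4 + 9 * gy**2)

y = 5.0     # f(1) = 5, so g(5) = 1 and the formula gives 1/14
h = 1e-5
numeric = (g(y + h) - g(y - h)) / (2 * h)
print(g(y), g_prime(y), numeric)
```

The formula breaks down at $y = 1$, which corresponds to $f'(0) = 0$.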
\[
\frac{dx}{dy} = \frac{1}{\dfrac{dy}{dx}}, \quad \text{or} \quad x'(y) = \frac{1}{y'(x)|_{x=x(y)}} = \frac{1}{y'(x(y))}.
\]
Specifically, for $y = \arcsin x$, we have $x = \sin y$. Then
\[
\frac{d \arcsin x}{dx} = \frac{dy}{dx} = \frac{1}{\dfrac{dx}{dy}} = \frac{1}{(\sin y)'} = \frac{1}{\cos y} = \frac{1}{\sqrt{1 - \sin^2 y}} = \frac{1}{\sqrt{1 - x^2}}.
\]
\[
(\arctan x)' = \frac{1}{1 + x^2}, \quad (\arccos x)' = -\frac{1}{\sqrt{1 - x^2}}, \quad (\operatorname{arcsec} x)' = \frac{1}{x\sqrt{x^2 - 1}}.
\]
Exercise 2.2.40. Compute the derivative of $\dfrac{1}{a} \arctan\dfrac{x}{a}$. Then for the case $a^2 \le 4b$, find a function with derivative $\dfrac{1}{x^2 + ax + b}$. This complements Exercise 2.2.24.
Exercise 2.2.41. Compute the derivative of $\log\dfrac{\sqrt{x + 1} - 1}{\sqrt{x + 1} + 1}$ and $\arctan\sqrt{x}$. Then find a function with derivative $\dfrac{1}{x\sqrt{ax + b}}$.
1 −1 −1
of the functions f and f (f (x)) at 1.
x
f −1
f (x)
Exercise 2.2.43. Explain the formula for the derivative of the inverse function by consid-
ering the inverse of the linear approximation.
Exercise 2.2.44. Find the place on the curve $y = x^2$ where the tangent line is parallel to the straight line $x + y = 1$.
Exercise 2.2.45. Show that the area enclosed by the tangent line on the curve $xy = a^2$ and the coordinate axes is a constant.
Exercise 2.2.46. Let $P$ be a point on the curve $y = x^3$. The tangent at $P$ meets the curve again at $Q$. Prove that the slope of the curve at $Q$ is four times the slope at $P$.
The function has a (global) maximum at x0 if f (x0 ) ≥ f (x) for all x in the
domain
x ∈ domain =⇒ f (x) ≤ f (x0 ).
The concept of (global) minimum can be similarly defined. The maximum and minimum are extrema of the function. A global extreme is also a local extreme.
The local maxima are like the peaks in a mountain, and the global maximum is
like the highest peak.
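[Figure: graph of a function on $[a, b]$, with the global maximum, a local maximum, a local minimum, and the global minimum marked.]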
The following result shows the existence of global extrema in certain cases.
By the same reason, the function is strictly increasing on $[0, +\infty)$. This leads to the local minimum at 0. In fact, by $x^2 \ge 0 = 0^2$ for all $x$, we know $x^2$ has a global minimum at 0. The function has no local maximum and therefore no global maximum on $\mathbb{R}$.
On the other hand, if we restrict $x^2$ to $[-1, 1]$, then $x^2$ has global minimum at 0 and global maxima at $-1$ and 1. If we restrict to $[-1, 2]$, then $x^2$ has global minimum at 0, global maximum at 2, and local (but not global) maximum at $-1$. If we restrict to $(-1, 2)$, then $x^2$ has global minimum at 0, and has no local maximum.
Example 2.3.2. The sine function is strictly increasing on $\left[2n\pi - \frac{\pi}{2}, 2n\pi + \frac{\pi}{2}\right]$ and is strictly decreasing on $\left[2n\pi + \frac{\pi}{2}, 2n\pi + \frac{3\pi}{2}\right]$. This implies that $2n\pi + \frac{\pi}{2}$ are local maxima and $2n\pi - \frac{\pi}{2}$ are local minima. In fact, by $\sin\left(2n\pi - \frac{\pi}{2}\right) = -1 \le \sin x \le 1 = \sin\left(2n\pi + \frac{\pi}{2}\right)$, these local extrema are also global extrema.
Exercise 2.3.1. Determine the monotone property and find the extrema for |x|
Exercise 2.3.2. Determine the monotone property and find the extrema on R.
1. $|x|$.
2. $x^2 + 2x$.
3. $|x^2 + 2x|$.
4. $x^3$.
5. $x^6$.
6. $\dfrac{1}{x}$.
7. $\sqrt{|x|}$.
8. $x + \dfrac{1}{x}$.
9. $\dfrac{1}{x^2 + 1}$.
10. $\cos x$.
11. $\sin^2 x$.
12. $\sin x^2$.
13. $e^x$.
14. $e^{-x}$.
15. $\log x$.
16. $x^x$.
Exercise 2.3.3. How are the extrema of the function related to the extrema of f (x)?
Exercise 2.3.4. How are the extrema of the function related to the extrema of f (x)?
Exercise 2.3.5. Is a local maximum always the place where the function changes from increasing to decreasing? In other words, can you construct a function $f(x)$ with local maximum at 0, but $f(x)$ is not increasing on $(-\delta, 0]$ for any $\delta > 0$?
Exercise 2.3.6. Compare the global extrema on various intervals in Example 2.3.1 with
Theorem 2.3.1.
Example 2.3.3. We have $(x^2)' = 2x < 0$ on $(-\infty, 0)$ and $x^2$ continuous on $(-\infty, 0]$. Therefore $x^2$ is strictly decreasing on $(-\infty, 0]$. By the similar reason, $x^2$ is strictly increasing on $[0, +\infty)$. This implies that 0 is a local minimum. The conclusion is consistent with the observation in Example 2.3.1 obtained by direct inspection.
Example 2.3.4. The function $f(x) = x^3 - 3x + 1$ has derivative $f'(x) = 3(x + 1)(x - 1)$. The sign of the derivative implies that the function is strictly increasing on $(-\infty, -1]$ and $[1, +\infty)$, and is strictly decreasing on $[-1, 1]$. This implies that $-1$ is a local maximum and 1 is a local minimum.
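The sign analysis of the derivative can be mirrored by sampling $f'$ at one point in each interval between its roots. The sample points $-2$, $0$, $2$ below are arbitrary choices, one inside each interval, and the helper names are hypothetical.

```python
def f(x):
    return x**3 - 3 * x + 1

def f_prime(x):
    return 3 * (x + 1) * (x - 1)

def sign(v):
    """Return 1, 0 or -1 according to the sign of v."""
    return (v > 0) - (v < 0)

# One sample point inside each interval determined by the roots -1 and 1 of f'.
samples = {"(-inf, -1)": -2.0, "(-1, 1)": 0.0, "(1, +inf)": 2.0}
signs = {name: sign(f_prime(x)) for name, x in samples.items()}
print(signs)
```

Since $f'$ is continuous and vanishes only at $\pm 1$, its sign cannot change inside any of the three intervals, so one sample per interval suffices.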
Example 2.3.5. The function $f(x) = \sin x - x \cos x$ has derivative $f'(x) = x \sin x$. The sign of the derivative determines the strict monotone property on the interval $[-5, 5]$ as described in the picture. The strict monotone property implies that $-\pi$, 5 are local minima, and $-5$, $\pi$ are local maxima.
Example 2.3.6. The function $f(x) = \sqrt[3]{x^2}(x + 1)$ has derivative $f'(x) = \dfrac{5x + 2}{3\sqrt[3]{x}}$ for $x \ne 0$. Using the sign of the derivative and the continuity, we get the strict monotone property of the function, which implies that $-\frac{2}{5}$ is a local maximum, and 0 is a local minimum.
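[Figure: graph of $\sin x - x \cos x$ on $[-5, 5]$, with local maxima at $-5$ and $\pi$ (values 2.377 and $\pi$) and local minima at $-\pi$ and 5 (values $-\pi$ and $-2.377$).]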
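Example 2.3.7. The function $f(x) = \dfrac{x^3}{x^2 - 1}$ has derivative $f'(x) = \dfrac{x^2(x^2 - 3)}{(x^2 - 1)^2}$ for $x \ne \pm 1$. The sign of the derivative determines the strict monotone property away from $\pm 1$. The strict monotone property implies that $-\sqrt{3}$ is a local maximum, and $\sqrt{3}$ is a local minimum.
\[
\begin{array}{c|ccccccccccc}
x & & -\sqrt{3} & & -1 & & 0 & & 1 & & \sqrt{3} & \\ \hline
f & \nearrow & \max & \searrow & & \searrow & & \searrow & & \searrow & \min & \nearrow \\
f' & + & 0 & - & \text{no} & - & 0 & - & \text{no} & - & 0 & +
\end{array}
\]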
Exercise 2.3.11. Show that $2x + \sin x = c$ has only one solution. Show that $x^4 + x = c$ has at most two solutions.
Exercise 2.3.12. If $f$ is differentiable and has 9 roots on $(a, b)$, how many roots does $f'$ have on $(a, b)$? If $f$ also has second order derivative, how many roots does $f''$ have on $(a, b)$?
Exercise 2.3.13. Find smallest $A > 0$, such that $\log x \le A\sqrt{x}$. Find smallest $B > 0$, such that $\log x \ge -\dfrac{B}{\sqrt{x}}$.
Theorem 2.3.3. Suppose $f(x)$ and $g(x)$ are continuous for $x \ge a$ and differentiable for $x > a$. If $f(a) \ge g(a)$ and $f'(x) \ge g'(x)$ for $x > a$, then $f(x) \ge g(x)$ for $x > a$. If $f(a) \ge g(a)$ and $f'(x) > g'(x)$ for $x > a$, then $f(x) > g(x)$ for $x > a$.
Example 2.3.8. We have $e^x > 1$ for $x > 0$ and $e^x < 1$ for $x < 0$. This is the comparison of $e^x$ with the constant term of the Taylor expansion (or the 0th Taylor expansion) in Example 2.5.5. How do we compare $e^x$ with the first order Taylor expansion $1 + x$?
We have $e^0 = 1 + 0$. For $x > 0$, we have $(e^x)' = e^x > (1 + x)' = 1$. Therefore we get $e^x > 1 + x$ for $x > 0$. On the other hand, for $x < 0$, we have $(e^x)' = e^x < (1 + x)' = 1$.
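4. $\left(1 + \dfrac{1}{x}\right)^x < e < \left(1 + \dfrac{1}{x}\right)^{x+1}$, for $x > 0$.
5. $\arctan x - \arctan y \le 2 \arctan\dfrac{x - y}{2}$, for $x > y > 0$.
6. $\dfrac{x^2}{2(1 + x)} < x - \log(1 + x) < \dfrac{x^2}{2}$, for $x > 0$. What about $-1 < x < 0$?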
2(1 + x) 2
Exercise 2.3.18. For natural number $n$ and $0 < a < 1$, prove that the equation
\[
1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} = ae^x
\]
has only one solution on $(0, +\infty)$.
If $f'(x_0) > 0$, then the linear approximation $L(x) = f(x_0) + f'(x_0)(x - x_0)$ of $f$ near $x_0$ is strictly increasing. This means that $L(x) < L(x_0)$ for $x < x_0$ and $L(x) > L(x_0)$ for $x > x_0$. Since $L$ is very close to $f$ near $x_0$, we expect that $f$ is also “lower” on the left of $x_0$ and “higher” on the right of $x_0$. In particular, this implies that $x_0$ is not a local extreme of $f$. By the similar argument, if $f'(x_0) < 0$, then $x_0$ is also not a local extreme. This is the reason behind the theorem.
Since our reason makes explicit use of both the left and right sides, the criterion
does not work for one sided derivatives. Therefore for a function f defined on an
interval, the candidates for the local extrema must be one of the following cases:
Example 2.3.11. The derivative $(x^2)' = 2x$ vanishes only at 0. Therefore the only candidate for the local extrema of $x^2$ on $\mathbb{R}$ is 0. By $x^2 \ge 0^2$ for all $x$, 0 is a minimum.
If we restrict $x^2$ to the closed interval $[-1, 2]$, then the end points $-1$ and 2 are also candidates for the local extrema. By $x^2 \le (-1)^2$ on $[-1, 0]$ and $x^2 \le 2^2$ on $[-1, 2]$, $-1$ is a local maximum and 2 is a global maximum.
On the other hand, the restriction of $x^2$ on the open interval $(-1, 2)$ has no other candidates for local extrema besides 0. The function has global minimum at 0 and has no local maximum on $(-1, 2)$.
that modifies the square function by reassigning the value at 0. The function is not differentiable at 0 and has nonzero derivative away from 0. Therefore on $[-1, 2]$, the candidates for the local extrema are 0 and the end points $-1$ and 2. The end points are also local maxima, like the unmodified $x^2$. By $x^2 < f(0) = 2$ on $[-1, 1]$, 0 is a local maximum. The modified square function $f(x)$ has no local minimum on $[-1, 2]$.
If we restrict the function to [−2, 2], then ±2 are also possible local extrema. By
comparing the values
Example 2.3.14. By $(x^3)' = 3x^2$, 0 is the only candidate for the local extreme of $x^3$. However, we have $x^3 < 0^3$ for $x < 0$ and $x^3 > 0^3$ for $x > 0$. Therefore 0 is actually not a local extreme.
The example shows that the converse of Theorem 2.3.4 is not true.
Example 2.3.15. The function $f(x) = xe^{-x}$ has derivative $f'(x) = (1 - x)e^{-x}$. The only possible local extreme on $\mathbb{R}$ is at 1. We have $\lim_{x\to-\infty} f(x) = -\infty$ and $\lim_{x\to+\infty} f(x) = 0$. We claim that the limits imply that $f(1) = e^{-1}$ is a global maximum.
Since the limits at both infinities are $< f(1)$, there is $N$, such that $f(x) < f(1)$ for $|x| \ge N$. In particular, we have $f(\pm N) < f(1)$. Then consider the function on $[-N, N]$. On the bounded and closed interval, Theorem 2.3.1 says that the continuous function must reach its maximum, and the candidates for the maximum on $[-N, N]$ are $-N, 1, N$. Since $f(\pm N) < f(1)$, we see that $f(1)$ is the maximum on $[-N, N]$. Combined with $f(x) < f(1)$ for $|x| \ge N$, we conclude that $f(1)$ is the maximum on the whole real line.
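The argument can be illustrated numerically by comparing the values of $f$ on a grid over $[-N, N]$ with the value at the critical point. The cutoff $N = 10$ and the grid spacing $0.01$ are assumed sample values; a grid search only suggests the answer and does not replace the proof.

```python
import math

def f(x):
    return x * math.exp(-x)

def f_prime(x):
    return (1 - x) * math.exp(-x)

N = 10.0                                            # assumed cutoff
grid = [-N + k * (2 * N) / 2000 for k in range(2001)]  # spacing 0.01 on [-N, N]
best = max(grid, key=f)                             # grid point with the largest value
print(best, f(best), math.exp(-1))
```

The grid maximum sits at the critical point 1, and the values at $\pm N$ are smaller than $f(1) = e^{-1}$, in line with the argument above.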
4. $x^2 + bx + c$ on $\mathbb{R}$.
9. $x^x$ on $(0, 1]$.
By $A'(x) = \frac{a}{2} - 2x$, the candidates for the local extrema are $0, \frac{a}{4}, \frac{a}{2}$. The values of $A$ at the three points are $0, \frac{a^2}{16}, 0$. Therefore the maximum is reached when $x = \frac{a}{4}$, which means the rectangle is a square.
Example 2.3.17. The distance from a point $P = (x_0, y_0)$ on the plane to a straight line $ax + by + c = 0$ is the minimum of the distance from $P$ to a point $(x, y)$ on the line. The distance is minimum when the square of the distance
\[
f(x) = (x - x_0)^2 + (y - y_0)^2
\]
is minimum. Here $y$ is regarded as a function of $x$ along the line, so that $y' = -\frac{a}{b}$.
[Figure: the point $P(x_0, y_0)$, the line $ax + by + c = 0$, and a point $(x, y)$ on the line.]
From
\[
f'(x) = 2(x - x_0) + 2(y - y_0)y' = \frac{2}{b}(b(x - x_0) - a(y - y_0)),
\]
we know that $f(x)$ is minimized when
\[
b(x - x_0) - a(y - y_0) = 0.
\]
ax + by + c = 0.
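[Figure: light travels from $A$ at speed $u$ to the line $L$ and then from $L$ to $B$ at speed $v$; the segments $a$ from $A$ to $P$ and $b$ from $B$ to $Q$ are perpendicular to $L$, the segment from $P$ to $Q$ has length $l$, and $x$, $y$ are the angles of the path.]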
Example 2.3.18. Consider light traveling from a point $A$ in one medium to point $B$ in another medium. Fermat’s principle says that the path taken by the light is the path of shortest traveling time.
Let $u$ and $v$ be the speed of light in the respective medium. Let $L$ be the place where two media meet. Draw lines $AP$ and $BQ$ perpendicular to $L$. Let the length of $AP$, $BQ$, $PQ$ be $a$, $b$, $l$. Let $x$ be the angle by which the light from $A$ hits $L$. Let $y$ be the angle by which the light leaves $L$ and reaches $B$.
The angles $x$ and $y$ are related by
\[
a \tan x + b \tan y = l.
\]
This can be considered as an equation that implicitly defines $y$ as a function of $x$. The derivative of $y = y(x)$ can be obtained by implicit differentiation
\[
y'(x) = -\frac{a \sec^2 x}{b \sec^2 y}.
\]
The time it takes for the light to travel from $A$ to $B$ is
\[
T = \frac{a \sec x}{u} + \frac{b \sec y}{v}.
\]
By thinking of $y$ as a function of $x$, the time $T$ becomes a function of $x$. The time will be shortest when
\[
\begin{aligned}
0 = \frac{dT}{dx} &= \frac{a \sec x \tan x}{u} + \frac{b \sec y \tan y}{v} y' \\
&= \frac{a \sec x \tan x}{u} - \frac{b \sec y \tan y}{v} \cdot \frac{a \sec^2 x}{b \sec^2 y} \\
&= a \sec^2 x \left( \frac{\sin x}{u} - \frac{\sin y}{v} \right).
\end{aligned}
\]
This means that the ratio between the sine of the angles $x$ and $y$ is the same as the ratio between the speeds of light
\[
\frac{\sin x}{\sin y} = \frac{u}{v}.
\]
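The relation between the angles can also be observed by minimizing the travel time numerically. The sketch below uses a different parametrization than the text: the unknown is the distance $s$ from $P$ to the point where the light crosses $L$, so that $a \tan x = s$ and $b \tan y = l - s$. The numerical values of $a, b, l, u, v$ and the use of ternary search are assumptions made only for illustration.

```python
import math

def travel_time(s, a, b, l, u, v):
    """Time from A to B when the light crosses L at distance s from P."""
    return math.hypot(a, s) / u + math.hypot(b, l - s) / v

def minimize(func, lo, hi, iters=200):
    """Ternary search for the minimum of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

a, b, l, u, v = 1.0, 2.0, 3.0, 1.0, 0.75   # sample values, chosen arbitrarily
s = minimize(lambda s: travel_time(s, a, b, l, u, v), 0.0, l)
sin_x = s / math.hypot(a, s)               # sine of the angle x at the crossing point
sin_y = (l - s) / math.hypot(b, l - s)     # sine of the angle y
print(sin_x / sin_y, u / v)
```

The two printed numbers agree to several digits, which is the ratio derived above.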
Exercise 2.3.20. A rectangle is inscribed in an isosceles triangle. Show that the biggest
area possible is half of the area of the triangle.
Exercise 2.3.21. Among all the rectangles with area A, which one has the smallest perime-
ter?
Exercise 2.3.22. Among all the rectangles with perimeter L, which one has the biggest
area?
Exercise 2.3.23. A rectangle is inscribed in a circle of radius R. When does the rectangle
have the biggest area?
Exercise 2.3.24. Determine the dimensions of the biggest rectangle inscribed in the ellipse $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = 1$.
Exercise 2.3.25. Find the volume of the biggest right circular cone with a given slant height
l.
Exercise 2.3.26. What is the shortest distance from the point $(2, 1)$ to the parabola $y = 2x^2$?
or
\[
f(a + h) - f(a) = f'(a + \theta h)h, \quad \text{for some } 0 < \theta < 1.
\]
We also note that the conclusion is symmetric in a, b. Therefore there is no need to
insist a < b.
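[Figure 2.4.1: the graph of $f(x)$ and the straight line $L(x)$ connecting $(a, f(a))$ and $(b, f(b))$, with several points $c$ between $a$ and $b$ marked.]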
Geometrically, the Mean Value Theorem means that the straight line $L$ connecting the two ends $(a, f(a))$ and $(b, f(b))$ of the graph of $f$ is parallel to the tangent of the function somewhere. Figure 2.4.1 suggests that $c$ in the Mean Value Theorem is the place where the distance between the graphs of $f$ and $L$ has local extrema. Since such local extrema for the distance $f(x) - L(x)$ always exist by Theorem 2.3.1, we get $(f - L)'(c) = 0$ for some $c$ by Theorem 2.3.4. Therefore
\[
f'(c) = L'(c) = \frac{f(b) - f(a)}{b - a}.
\]
Example 2.4.1. We try to verify the Mean Value Theorem for $f(x) = x^3 - 3x + 1$ on $[-1, 1]$. This means finding $c$, such that
\[
f'(c) = 3(c^2 - 1) = \frac{f(1) - f(-1)}{1 - (-1)} = \frac{-1 - 3}{1 - (-1)} = -2.
\]
We get $c = \pm\dfrac{1}{\sqrt{3}}$.
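The computation in this example is easy to confirm numerically. The following short check evaluates the slope of the line connecting the two ends and the derivative at the two values of $c$; the variable names are illustrative.

```python
import math

def f(x):
    return x**3 - 3 * x + 1

def f_prime(x):
    return 3 * (x**2 - 1)

a, b = -1.0, 1.0
mean_slope = (f(b) - f(a)) / (b - a)            # slope of the line through the two ends
candidates = [-1 / math.sqrt(3), 1 / math.sqrt(3)]
for c in candidates:
    print(c, f_prime(c), mean_slope)
```

Both values of $c$ lie in the open interval $(-1, 1)$, as the Mean Value Theorem requires.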
we conclude that
\[
\frac{x}{1 + x} \le \log(1 + x) \le x.
\]
The inequality already appeared in Example 2.3.9.
Example 2.4.3. For the function |x| on [−1, 1], there is no c ∈ (−1, 1) satisfying
The Mean Value Theorem does not apply because |x| is not differentiable at 0.
Exercise 2.4.1. Is the conclusion of the Mean Value Theorem true? If true, find c. If not,
explain why.
2. 2x on [0, 1].
3. $\dfrac{1}{x}$ on $[1, 2]$.
5. $\sqrt{|x|}$ on $[-1, 1]$.
6. $\cos x$ on $[-a, a]$.
8. $\log |x|$ on $[-1, 1]$.
9. $\arcsin x$ on $[0, 1]$.
Exercise 2.4.2. Suppose $f(1) = 2$ and $f'(x) \le 3$ on $\mathbb{R}$. How large and how small can $f(4)$ be? What happens when the largest or the smallest value is reached? How about $f(-4)$?
Exercise 2.4.4. Find the biggest interval on which $|e^x - e^y| > |x - y|$. What about $|e^x - e^y| < |x - y|$?
Exercise 2.4.5. Suppose $f(x)$ is continuous at $x_0$ and differentiable on $(x_0 - \delta, x_0) \cup (x_0, x_0 + \delta)$. Prove that if $\lim_{x\to x_0} f'(x) = l$ converges, then $f(x)$ is differentiable at $x_0$ and $f'(x_0) = l$.
For the special case $f' = 0$ throughout the interval, the argument gives the following result.
Theorem 2.4.2. If $f'(x) = 0$ for all $x \in (a, b)$, then $f(x)$ is a constant on $(a, b)$.
Theorem 2.4.3. If $f'(x) = g'(x)$ for all $x \in (a, b)$, then there is a constant $C$, such that $f(x) = g(x) + C$ on $(a, b)$.
Example 2.4.4. The function $e^x$ satisfies $f'(x) = f(x)$. Are there any other functions satisfying the equation?
If $f'(x) = f(x)$, then
Example 2.4.6. By
\[
(\arcsin x)' = \frac{1}{\sqrt{1 - x^2}}, \quad (\arccos x)' = -\frac{1}{\sqrt{1 - x^2}},
\]
we have $(\arcsin x + \arccos x)' = 0$, and we have $\arcsin x + \arccos x = C$. The constant can be determined by taking a special value $x = 0$
\[
C = \arcsin 0 + \arccos 0 = 0 + \frac{\pi}{2} = \frac{\pi}{2}.
\]
Therefore we have
\[
\arcsin x + \arccos x = \frac{\pi}{2}.
\]
Exercise 2.4.6. Prove that a differentiable function is linear on an interval if and only if its
derivative is a constant.
Exercise 2.4.7. Find all functions on an interval satisfying the following equations.
1. $\arctan x + \arctan x^{-1} = \dfrac{\pi}{2}$, for $x \ne 0$.
2. $3 \arccos x - \arccos(3x - 4x^3) = \pi$, for $|x| \le \dfrac{1}{2}$.
3. $\arctan\dfrac{x + a}{1 - ax} - \arctan x = \arctan a$, for $ax < 1$.
4. $\arctan\dfrac{x + a}{1 - ax} - \arctan x = \arctan a - \pi$, for $ax > 1$.
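\[
\begin{aligned}
\frac{0}{0} &: \lim_{x\to 1} \frac{x^3 - 1}{x - 1},\ \lim_{x\to 0} \frac{\sin x}{x},\ \lim_{x\to 0} \frac{\log(1 + x)}{x}; \\
\frac{\infty}{\infty} &: \lim_{x\to 0} \frac{\log x}{x^{-1}},\ \lim_{x\to\infty} \frac{\log x}{x},\ \lim_{x\to\infty} \frac{x^2}{e^x}; \\
1^\infty &: \lim_{x\to\infty} \left(1 + \frac{1}{x}\right)^x,\ \lim_{x\to 0} (1 + \sin x)^{\log x}; \\
\infty - \infty &: \lim_{x\to 0} \left(\frac{1}{x} - \frac{1}{e^x - 1}\right).
\end{aligned}
\]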
Theorem 2.4.4 (L’Hospital’s Rule). Suppose $f(x)$ and $g(x)$ are differentiable functions on $(a, b)$, with $g'(x) \ne 0$. Suppose
The theorem computes the limits of the indeterminates of type $\frac{0}{0}$ or $\frac{\infty}{\infty}$. The conclusion is the equality
\[
\lim_{x\to a^+} \frac{f(x)}{g(x)} = \lim_{x\to a^+} \frac{f'(x)}{g'(x)}
\]
whenever the right side converges. It is possible that the left side converges but the right side diverges.
The theorem also has a similar left sided version, and the left and right sided
versions may be combined to give the two sided version. Moreover, l’Hospital’s rule
also allows a or l to be any kind of infinity.
The reason behind l’Hospital’s rule is the following version of the Mean Value Theorem, which can be proved similarly to the Mean Value Theorem.
Theorem 2.4.5 (Cauchy’s Mean Value Theorem). If $f(x)$ and $g(x)$ are continuous on $[a, b]$ and differentiable on $(a, b)$, such that $g'(x) \ne 0$ on $(a, b)$, then there is $c \in (a, b)$, such that
$$\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(c)}{g'(c)}.$$
Consider the parametrized curve $(g(t), f(t))$ for $t \in [a, b]$. The theorem says that the straight line connecting the two ends $(g(a), f(a))$ and $(g(b), f(b))$ of the curve is parallel to the tangent of the curve somewhere. The slope of the tangent is $\frac{f'(c)}{g'(c)}$.
[Figure: the curve from $(g(a), f(a))$ to $(g(b), f(b))$, with a tangent of slope $\frac{f'(c)}{g'(c)}$ parallel to the line connecting the two ends.]
Example 2.4.7. In Example 1.2.15, we proved that $\lim_{x \to +\infty} a^x = 0$ for $0 < a < 1$. Exercise 1.6.15 extended the limit to $\lim_{x \to +\infty} x^2 a^x = 0$. We derive the second limit from the first one by using l’Hospital’s rule.
2.4. MEAN VALUE THEOREM 131
We have $b = \frac{1}{a} > 1$, and $\lim_{x \to +\infty} a^x = 0$ is the same as $\lim_{x \to +\infty} b^x = \infty$. We also have $\lim_{x \to +\infty} x^2 = \infty$. Therefore $\lim_{x \to +\infty} x^2 a^x = \lim_{x \to +\infty} \frac{x^2}{b^x}$ is of type $\frac{\infty}{\infty}$, and we may apply l’Hospital’s rule (twice)
$$\lim_{x \to +\infty} \frac{x^2}{b^x} =_{(3)} \lim_{x \to +\infty} \frac{(x^2)'}{(b^x)'} = \lim_{x \to +\infty} \frac{2x}{b^x \log b} =_{(2)} \lim_{x \to +\infty} \frac{(2x)'}{(b^x \log b)'} = \lim_{x \to +\infty} \frac{2}{b^x (\log b)^2} =_{(1)} 0.$$
Here is the precise reason behind the computation. The equality $=_{(1)}$ is from Example 1.2.15. Then by l’Hospital’s rule, the convergence of the right side of $=_{(2)}$ implies the convergence of the left side of $=_{(2)}$ and the equality $=_{(2)}$ itself. The left side of $=_{(2)}$ is the same as the right side of $=_{(3)}$. By l’Hospital’s rule again, the convergence of the right side of $=_{(3)}$ implies the convergence of the left side of $=_{(3)}$ and the equality $=_{(3)}$.
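Example 2.4.8. Applying l’Hospital’s rule to the limit $\lim_{x \to 0} \frac{\sin x}{x}$ of type $\frac{0}{0}$, we get
$$\lim_{x \to 0} \frac{\sin x}{x} = \lim_{x \to 0} \frac{\cos x}{1} = 1.$$
However, this argument is logically circular because it makes use of the formula $(\sin x)' = \cos x$. A special case of this formula is
$$\lim_{x \to 0} \frac{\sin x}{x} = (\sin x)'|_{x=0} = 1,$$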
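1. $\lim_{x \to 0} \frac{x}{\sin x}$.
2. $\lim_{x \to 0} \frac{\cos x - 1}{x}$.
3. $\lim_{x \to 0} \frac{\cos x - 1}{x^2}$.
4. $\lim_{x \to 1} \frac{x^2 - 1}{x - 1}$.
5. $\lim_{x \to 1} \frac{x^2 - 1}{x^3 - 1}$.
6. $\lim_{x \to 0} \frac{\log(1 + x)}{x}$.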
$$\lim_{x \to 0^+} x \log x = \lim_{x \to 0^+} \frac{\log x}{x^{-1}} = \lim_{x \to 0^+} \frac{x^{-1}}{-x^{-2}} = \lim_{x \to 0^+} (-x) = 0.$$
2.5. HIGH ORDER APPROXIMATION 133
By converting $x$ to $\frac{1}{x}$, we also have
$$\lim_{x \to +\infty} \frac{(\log x)^q}{x^p} = 0, \quad \text{for } p, q > 0.$$
Example 2.4.12. If we apply l’Hospital’s rule to the limit $\lim_{x \to \infty} \frac{x + \sin x}{x}$ of type $\frac{\infty}{\infty}$, then we get
$$\lim_{x \to \infty} \frac{x + \sin x}{x} = \lim_{x \to \infty} (1 + \cos x).$$
We find that the left side converges and the right side diverges. L’Hospital’s rule fails here because the second condition is not satisfied.
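The contrast between the two sides can be seen numerically. This is a small sketch under our own naming (`quotient`, `derivative_quotient`); it only samples the two expressions and is not a proof.

```python
import math

def quotient(x: float) -> float:
    """(x + sin x) / x, the left side of the attempted equality."""
    return (x + math.sin(x)) / x

def derivative_quotient(x: float) -> float:
    """(x + sin x)' / (x)' = 1 + cos x, the right side."""
    return 1 + math.cos(x)

# The left side settles near 1, since |sin x / x| <= 1/x.
for x in (1e2, 1e4, 1e6):
    print(x, quotient(x))

# The right side keeps oscillating between 0 and 2.
for k in (1000, 1001, 1002):
    print(derivative_quotient(2 * k * math.pi), derivative_quotient((2 * k + 1) * math.pi))
```

The oscillation of $1 + \cos x$ between 0 and 2 is why the right side has no limit, while the left side is squeezed to 1.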
$$P(x) = a_0 + a_1(x - x_0) + a_2(x - x_0)^2 + \cdots + a_n(x - x_0)^n,$$
The error $R_n(x) = f(x) - P(x)$ of the approximation is called the remainder. The definition means that
$$f(x) = P(x) + R_n(x) = a_0 + a_1(x - x_0) + a_2(x - x_0)^2 + \cdots + a_n(x - x_0)^n + o((x - x_0)^n),$$
where the “small o” notation means that the remainder term satisfies
$$\lim_{x \to x_0} \frac{f(x) - P(x)}{(x - x_0)^n} = \lim_{x \to x_0} \frac{R_n(x)}{(x - x_0)^n} = 0.$$
The $n$-th order approximation of a function is unique. See Exercise 2.5.7. Moreover, if $m < n$, then the truncation $a_0 + a_1(x - x_0) + a_2(x - x_0)^2 + \cdots + a_m(x - x_0)^m$ is the $m$-th order approximation of $f$ at $x_0$. After all, if we have the 10th order approximation, then we should also have the 5th order approximation.
This means that $\cos x$ is second order differentiable at 0, with quadratic approximation $1 - \frac{1}{2}x^2$.
Exercise 2.5.1. Prove that $\frac{1}{1 - x} = 1 + x + x^2 + \cdots + x^n + \frac{x^{n+1}}{1 - x}$. What does this tell you about the differentiability of $\frac{1}{1 - x}$ at 0?
Exercise 2.5.2. Show that $\frac{1}{1 + x}$ and $\frac{1}{1 + x^2}$ are differentiable of arbitrary order. What are their high order approximations?
Exercise 2.5.4. Use l’Hospital’s rule to compute the limits. Then interpret your results as
high order approximations.
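3. $\lim_{x \to 0} \frac{\sin^2 x - x^2}{x^4}$.
4. $\lim_{x \to 0} \frac{1}{x^5}\left(\sin x - x + \frac{1}{6}x^3\right)$.
7. $\lim_{x \to 0} \frac{1}{x^3}\left(\log(1 + x) - x + \frac{1}{2}x^2\right)$.
8. $\lim_{x \to 1} \frac{2\log x + (x - 1)(x - 3)}{(x - 1)^3}$.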
second order differentiable at 1? Is it possible for the function to be third order differen-
tiable?
Exercise 2.5.7. Suppose $P(x) = a_0 + a_1(x - a) + a_2(x - a)^2$ satisfies $\lim_{x \to a} \frac{P(x)}{(x - a)^2} = 0$. Prove that $a_0 = a_1 = a_2 = 0$. Then explain that the result means the uniqueness of quadratic approximation. Moreover, extend the result to high order approximation.
Exercise 2.5.8. Suppose $P(x)$ is the $n$-th order approximation of $f(x)$. What is the $n$-th order approximation of $f(-x)$? Then use Exercise 2.5.7 to explain that the high order approximation of an even function has no odd power terms. What about the high order approximation of an odd function?
Each application of the l’Hospital’s rule means taking derivative once. Therefore we
get the third order approximation of cos x at 0 by taking derivative three times.
If $f(x)$ is differentiable everywhere on an open interval, then the derivative $f'(x)$ is a function on the open interval. If the derivative function $f'(x)$ is also differentiable, then we get the second order derivative $f''(x) = (f'(x))'$. If the function $f''(x)$ is yet again differentiable, then taking the derivative one more time gives the third order derivative $f'''(x) = (f''(x))'$. The process may continue and we have the $n$-th order derivative $f^{(n)}(x)$. The Leibniz notation for the high order derivative $f^{(n)}(x)$ is $\frac{d^n f}{dx^n}$.
Let $f(x) = \cos x$ and $P(x) = 1 - \frac{1}{2}x^2$. The key to the repeated application of l’Hospital’s rule is that the numerator is always 0 at $x_0 = 0$. This means that
$$f(x_0) = P(x_0), \quad f'(x_0) = P'(x_0), \quad f''(x_0) = P''(x_0), \quad f'''(x_0) = P'''(x_0).$$
Theorem 2.5.2. If f (x) has n-th order derivative at x0 , then f is n-th order differ-
entiable, with n-th order approximation
Example 2.5.3. The high order derivatives of the power function $x^p$ are
$$(x^p)' = px^{p-1},$$
$$(x^p)'' = p(p - 1)x^{p-2},$$
$$\vdots$$
$$(x^p)^{(n)} = p(p - 1) \cdots (p - n + 1)x^{p-n}.$$
More generally, we have
$$((a + bx)^p)^{(n)} = p(p - 1) \cdots (p - n + 1) b^n (a + bx)^{p-n}.$$
For $a = b = 1$, we get the Taylor expansion at 0
$$(1 + x)^p = 1 + px + \frac{p(p - 1)}{2!}x^2 + \cdots + \frac{p(p - 1) \cdots (p - n + 1)}{n!}x^n + o(x^n).$$
For $a = 1$, $b = -1$ and $p = -1$, we get the Taylor expansion at 0
$$\frac{1}{1 - x} = 1 + x + x^2 + \cdots + x^n + o(x^n).$$
You may compare with Exercise 2.5.1.
Example 2.5.4. By $(\log x)' = x^{-1}$ and the derivatives from Example 2.5.3, we have $(\log x)^{(n)} = (-1)^{n-1}\frac{(n - 1)!}{x^n}$. This gives the Taylor expansion at 1
$$\log x = (x - 1) - \frac{1}{2}(x - 1)^2 + \frac{1}{3}(x - 1)^3 - \cdots + (-1)^{n+1}\frac{1}{n}(x - 1)^n + o((x - 1)^n).$$
This can also be expressed as a Taylor expansion at 0
$$\log(1 + x) = x - \frac{1}{2}x^2 + \frac{1}{3}x^3 - \cdots + (-1)^{n+1}\frac{1}{n}x^n + o(x^n).$$
Example 2.5.5. By $(e^x)' = e^x$, it is easy to see that $(e^x)^{(n)} = e^x$ for all $n$. This gives the Taylor expansion at 0
$$e^x = 1 + \frac{1}{1!}x + \frac{1}{2!}x^2 + \cdots + \frac{1}{n!}x^n + o(x^n).$$
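The meaning of the remainder $o(x^n)$ can be illustrated numerically. The sketch below uses our own helper names `exp_taylor` and `remainder_ratio`, and computes $\frac{R_n(x)}{x^n}$ for the expansion of $e^x$ at 0. The ratio should shrink as $x$ approaches 0.

```python
import math

def exp_taylor(x: float, n: int) -> float:
    """n-th order Taylor polynomial 1 + x/1! + ... + x^n/n! of e^x at 0."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= x / k          # term is now x^k / k!
        total += term
    return total

def remainder_ratio(x: float, n: int) -> float:
    """R_n(x) / x^n, which should tend to 0 as x -> 0."""
    return (math.exp(x) - exp_taylor(x, n)) / x**n

for x in (0.5, 0.1, 0.01):
    print(x, remainder_ratio(x, 3))
```

For $n = 3$ the ratio behaves roughly like $\frac{x}{24}$, which is consistent with the next term $\frac{1}{4!}x^4$ of the expansion.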
Example 2.5.6. The high order derivatives of $\sin x$ and $\cos x$ are 4-periodic in the sense that $\sin^{(n+4)} x = \sin^{(n)} x$ and $\cos^{(n+4)} x = \cos^{(n)} x$, and are given by
$$(\sin x)' = \cos x, \quad (\cos x)' = -\sin x,$$
$$(\sin x)'' = -\sin x, \quad (\cos x)'' = -\cos x,$$
$$(\sin x)''' = -\cos x, \quad (\cos x)''' = \sin x,$$
$$(\sin x)'''' = \sin x, \quad (\cos x)'''' = \cos x.$$
Note that we have $o(x^{2n})$ for $\sin x$ at the end, which is more accurate than $o(x^{2n-1})$. The reason is that the $2n$-th term $0 \cdot x^{2n}$ is omitted from the expression, so that the approximation is actually of $2n$-th order. A similar remark applies to $\cos x$.
We also note that the Taylor expansions of $e^x$, $\sin x$, $\cos x$ are related by the equality
$$e^{ix} = \cos x + i \sin x, \quad i = \sqrt{-1}.$$
Exercise 2.5.10. Prove the chain rule for second order derivative
1. $a^x$.
2. $e^{ax+b}$.
3. $\sin(ax + b)$.
4. $\cos(ax + b)$.
5. $\log(ax + b)$.
6. $\log \frac{ax + b}{cx + d}$.
7. $\frac{ax + b}{cx + d}$.
8. $\frac{1}{(ax + b)(cx + d)}$.
Exercise 2.5.13. Use high order derivatives to find high order approximations.
1. $a^x$, $n = 5$, at 0.
2. $a^x$, $n = 5$, at 1.
3. $\sin^2 x$, $n = 6$, at 0.
4. $\sin^2 x$, $n = 6$, at $\pi$.
5. $e^{x^2}$, $n = 6$, at 0.
6. $e^{x^2}$, $n = 6$, at 1.
7. $x^3 e^x$, $n = 5$, at 0.
8. $x^3 e^x$, $n = 5$, at 1.
9. $e^x \sin x$, $n = 5$, at 1.
Exercise 2.5.15. Prove $\left(x^{n-1} e^{\frac{1}{x}}\right)^{(n)} = \frac{(-1)^n}{x^{n+1}} e^{\frac{1}{x}}$.
Exercise 2.5.16. Prove $(e^{ax} \sin(bx + c))^{(n)} = (a^2 + b^2)^{\frac{n}{2}} e^{ax} \sin(bx + c + n\theta)$, where $\sin\theta = \frac{b}{\sqrt{a^2 + b^2}}$. What is the similar formula for $(e^{ax} \cos(bx + c))^{(n)}$?
Exercise 2.5.17. Suppose f (x) has second order derivative near x0 . Prove that
Exercise 2.5.18. Compare $e^x$, $\sin x$, $\cos x$, $\log(1 + x)$ with their Taylor expansions. For example, is $e^x$ bigger than or smaller than $1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!}$?
Example 2.5.7. Substituting $x$ by $\frac{b}{a}x$ in the Taylor expansion of $(1 + x)^p$, we get
$$(a + bx)^p = a^p\left(1 + \frac{b}{a}x\right)^p$$
$$= a^p\left[1 + p\frac{b}{a}x + \frac{p(p - 1)}{2!}\frac{b^2}{a^2}x^2 + \cdots + \frac{p(p - 1) \cdots (p - n + 1)}{n!}\frac{b^n}{a^n}x^n + o\left(\frac{b^n}{a^n}x^n\right)\right]$$
$$= a^p + pa^{p-1}bx + \frac{p(p - 1)}{2!}a^{p-2}b^2x^2 + \cdots + \frac{p(p - 1) \cdots (p - n + 1)}{n!}a^{p-n}b^nx^n + o(x^n).$$
Note that we used $a^p\, o\left(\frac{b^n}{a^n}x^n\right) = o(x^n)$ in the computation. The reason is that $o\left(\frac{b^n}{a^n}x^n\right)$ really means a function $R\left(\frac{b}{a}x\right)$, where $R(x)$ is the remainder of the $n$-th order Taylor expansion of $(1 + x)^p$. Since $\lim_{x \to 0} \frac{R(x)}{x^n} = 0$, we get
$$\lim_{x \to 0} \frac{a^p R\left(\frac{b}{a}x\right)}{x^n} = \lim_{y \to 0} \frac{a^p R(y)}{\frac{a^n}{b^n}y^n} = a^{p-n}b^n \lim_{y \to 0} \frac{R(y)}{y^n} = 0.$$
This means $a^p R\left(\frac{b}{a}x\right) = o(x^n)$.
Further substitution of $a, b, x$ by $x_0, 1, x - x_0$ gives the Taylor expansion of $x^p$ at $x_0$
$$x^p = (x_0 + (x - x_0))^p$$
$$= x_0^p + px_0^{p-1}(x - x_0) + \frac{p(p - 1)}{2!}x_0^{p-2}(x - x_0)^2 + \cdots + \frac{p(p - 1) \cdots (p - n + 1)}{n!}x_0^{p-n}(x - x_0)^n + o((x - x_0)^n).$$
The Taylor expansion can also be obtained from the high order derivative in Example 2.5.3.
Example 2.5.8. The Taylor expansion of $\frac{1}{1 - x}$ at 0 in Example 2.5.3 induces the following approximations
$$\frac{1}{1 + x} = 1 - x + x^2 - \cdots + (-1)^n x^n + o(x^n),$$
$$\frac{1}{1 + x^2} = 1 - x^2 + x^4 - \cdots + (-1)^n x^{2n} + o(x^{2n}).$$
Similar to the Taylor expansions of $\sin x$ and $\cos x$, we expect that the odd power terms vanish in the Taylor expansion of $\frac{1}{1 + x^2}$. Therefore the remainder should be improved to $o(x^{2n+1})$. To get the improved remainder, we consider the $2(n + 1)$-th order Taylor expansion of $\frac{1}{1 + x^2}$
$$\frac{1}{1 + x^2} = 1 - x^2 + x^4 - \cdots + (-1)^n x^{2n} + (-1)^{n+1}x^{2(n+1)} + o(x^{2(n+1)}).$$
This shows that the remainder of the $2n$-th order Taylor expansion is $(-1)^{n+1}x^{2(n+1)} + R(x)$, where $R(x)$ satisfies $\lim_{x \to 0} \frac{R(x)}{x^{2(n+1)}} = 0$. By
$$\lim_{x \to 0} \frac{(-1)^{n+1}x^{2(n+1)} + R(x)}{x^{2n+1}} = \lim_{x \to 0} \left((-1)^{n+1}x + \frac{R(x)}{x^{2(n+1)}}x\right) = 0,$$
we get
$$(-1)^{n+1}x^{2(n+1)} + o(x^{2(n+1)}) = o(x^{2n+1}),$$
$$\frac{1}{1 + x^2} = 1 - x^2 + x^4 - \cdots + (-1)^n x^{2n} + o(x^{2n+1}).$$
Finally, it is easy to see that $\frac{1}{1 + x^2}$ has derivative of any order. From the coefficients in the Taylor expansion, we get
$$\left.\frac{d^n}{dx^n}\right|_{x=0}\left(\frac{1}{1 + x^2}\right) = n!\,a_n = \begin{cases} 0, & \text{if } n = 2k - 1, \\ (-1)^k (2k)!, & \text{if } n = 2k. \end{cases}$$
It is practically impossible to get this by directly computing the high order derivatives (i.e., by repeatedly taking derivatives).
Exercise 2.5.19. Explain and justify the following claims about remainders.
Exercise 2.5.20. Find the Taylor expansion of $\frac{1}{1 - x^3}$ at 0, and the high order derivatives of the function at 0.
Exercise 2.5.21. Use the high order derivatives in Example 2.5.8 to find the Taylor expansion of $\arctan x$ at 0.
Exercise 2.5.22. Find the Taylor expansion of $\frac{1}{\sqrt{1 - x^2}}$ at 0. Find the high order derivatives of the function at 0. Then find the Taylor expansion of $\arcsin x$ at 0.
$$e^{-x} = 1 - \frac{1}{1!}x + \frac{1}{2!}x^2 - \cdots + (-1)^n\frac{1}{n!}x^n + o(x^n),$$
$$e^x = e^{x_0}e^{x - x_0} = e^{x_0} + \frac{1}{1!}e^{x_0}(x - x_0) + \frac{1}{2!}e^{x_0}(x - x_0)^2 + \cdots + \frac{1}{n!}e^{x_0}(x - x_0)^n + o((x - x_0)^n),$$
$$e^{x^2} = 1 + \frac{1}{1!}x^2 + \frac{1}{2!}x^4 + \cdots + \frac{1}{n!}x^{2n} + o(x^{2n+1}).$$
Note that we have the more accurate remainder $o(x^{2n+1})$ for $e^{x^2}$ for the reason similar to Example 2.5.8. Moreover, the Taylor expansion of $e^{x^2}$ gives
$$(e^{x^2})^{(n)}|_{x=0} = \begin{cases} 0, & \text{if } n = 2k - 1, \\ \frac{(2k)!}{k!}, & \text{if } n = 2k. \end{cases}$$
$$x^2e^x = x^2\left(1 + \frac{1}{1!}x + \frac{1}{2!}x^2 + \cdots + \frac{1}{n!}x^n + o(x^n)\right) = x^2 + \frac{1}{1!}x^3 + \frac{1}{2!}x^4 + \cdots + \frac{1}{n!}x^{n+2} + o(x^{n+2}).$$
$$x^2e^x = x^2 + \frac{1}{1!}x^3 + \frac{1}{2!}x^4 + \cdots + \frac{1}{(n - 2)!}x^n + o(x^n).$$
Exercise 2.5.23. Use the basic Taylor expansions to find the high order approximations
and derivatives of functions in Exercise 2.5.11.
Exercise 2.5.24. Use the basic Taylor expansions to find the high order approximations
and derivatives at 0.
1. $\frac{1}{x(x + 1)(x + 2)}$.
2. $\sqrt{1 - x^2}$.
3. $\sqrt{1 + x^3}$.
4. $\log(1 + x^2)$.
5. $\log(1 + 3x + 2x^2)$.
6. $\log \frac{1 + x^2}{1 - x^3}$.
7. $e^{2x}$.
8. $a^{x^2}$.
9. $\sin x \cos 2x$.
10. $\sin x \cos 2x \sin 3x$.
11. $\sin x^2$.
12. $\sin^2 x$.
Exercise 2.5.25. Use the basic Taylor expansions to find high order approximations and
high order derivatives.
1. $x^3 + 5x - 1$ at 1.
2. $x^p$ at $-3$.
3. $\frac{x + 3}{x + 1}$ at 1.
4. $\sqrt{x + 1}$ at 1.
5. $e^{-2x}$ at 4.
6. $\log x$ at 2.
7. $\log(3 - x)$ at 2.
8. $\sin x$ at $\frac{\pi}{2}$.
9. $\sin x$ at $\pi$.
10. $\cos x$ at $\pi$.
11. $\sin 2x$ at $\frac{\pi}{4}$.
12. $\sin^2 x$ at $\pi$.
Exercise 2.5.26. Use the basic Taylor expansions to find high order approximations and
high order derivatives at x0 .
2. $x^2e^x$. 4. $\sin x$. 6. $\sin^2 x$.
$$\lim_{x \to 0} \frac{\sin^2 x - \sin x^2}{x^4} = -\frac{1}{3}.$$
$$f'(0) = f''(0) = f'''(0) = f^{(5)}(0) = 0, \quad f^{(4)}(0) = -8, \quad f^{(6)}(0) = 152.$$
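Example 2.5.12. We may compute the Taylor expansions of $\tan x$ and $\sec x$ from the Taylor expansions of $\sin x$, $\cos x$ and $\frac{1}{1 - x}$
$$\sec x = \frac{1}{\cos x} = \frac{1}{1 - \frac{1}{2}x^2 + \frac{1}{24}x^4 + o(x^5)}$$
$$= 1 + \left(\frac{1}{2}x^2 - \frac{1}{24}x^4 + o(x^5)\right) + \left(\frac{1}{2}x^2 - \frac{1}{24}x^4 + o(x^5)\right)^2 + \left(\frac{1}{2}x^2 - \frac{1}{24}x^4 + o(x^5)\right)^3 + o(x^6)$$
$$= 1 + \frac{1}{2}x^2 - \frac{1}{24}x^4 + \frac{1}{4}x^4 + o(x^5)$$
$$= 1 + \frac{1}{2}x^2 + \frac{5}{24}x^4 + o(x^5),$$
$$\tan x = \sin x \sec x = \left(x - \frac{1}{6}x^3 + \frac{1}{120}x^5 + o(x^6)\right)\left(1 + \frac{1}{2}x^2 + \frac{5}{24}x^4 + o(x^5)\right)$$
$$= x - \frac{1}{6}x^3 + \frac{1}{120}x^5 + \frac{1}{2}x^3 - \frac{1}{12}x^5 + \frac{5}{24}x^5 + o(x^6)$$
$$= x + \frac{1}{3}x^3 + \frac{2}{15}x^5 + o(x^6).$$
The expansions give $(\sec x)^{(4)}_{x=0} = 5$ and $(\tan x)^{(5)}_{x=0} = 16$.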
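The coefficients of $\sec x$ and $\tan x$ can be double-checked with exact rational arithmetic on truncated power series. The following is a sketch under our own conventions: a series is a list of coefficients of $x^0, \ldots, x^5$, and the helpers `mul` and `reciprocal` are hypothetical names introduced here.

```python
from fractions import Fraction
from math import factorial

N = 6  # keep the coefficients of x^0, ..., x^5

def mul(a, b):
    """Product of two truncated series, dropping powers x^6 and higher."""
    out = [Fraction(0)] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N:
                out[i + j] += ai * bj
    return out

def reciprocal(a):
    """Series b with a * b = 1 up to x^5; requires a[0] != 0."""
    b = [Fraction(0)] * N
    b[0] = 1 / a[0]
    for k in range(1, N):
        # the coefficient of x^k in a * b must vanish
        b[k] = -sum(a[i] * b[k - i] for i in range(1, k + 1)) / a[0]
    return b

sin_c = [Fraction(0), Fraction(1), Fraction(0), Fraction(-1, 6), Fraction(0), Fraction(1, 120)]
cos_c = [Fraction(1), Fraction(0), Fraction(-1, 2), Fraction(0), Fraction(1, 24), Fraction(0)]

sec_c = reciprocal(cos_c)
tan_c = mul(sin_c, sec_c)

print(sec_c)                                  # coefficients of sec x
print(tan_c)                                  # coefficients of tan x
print(factorial(4) * sec_c[4], factorial(5) * tan_c[5])
```

The high order derivatives at 0 are recovered from the coefficients by $f^{(n)}(0) = n!\,a_n$, the same relation used in Example 2.5.8.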
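Example 2.5.13. We computed $\lim_{x \to 0}\left(\frac{1}{x} - \frac{1}{e^x - 1}\right)$ by using l’Hospital’s rule in Example 2.4.11. Alternatively, we use the Taylor expansions of $e^x$ and $\frac{1}{1 - x}$
$$\frac{1}{x} - \frac{1}{e^x - 1} = \frac{1}{x} - \frac{1}{x + \frac{x^2}{2} + o(x^2)} = \frac{1}{x}\left(1 - \frac{1}{1 + \frac{x}{2} + o(x)}\right)$$
$$= \frac{1}{x}\left(1 - 1 + \frac{x}{2} + o(x) + o\left(\frac{x}{2} + o(x)\right)\right) = \frac{1}{2} + \frac{o(x)}{x}.$$
This implies that the limit is $\frac{1}{2}$.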
Example 2.5.14. We know $\lim_{x \to \infty}\left(1 + \frac{1}{x}\right)^x = e$ from Example 1.6.17. The next question is what the difference $\left(1 + \frac{1}{x}\right)^x - e$ looks like. As $x$ goes to infinity, does the difference approach 0 like $\frac{1}{x}$?
The question is the same as the behavior of $(1 + x)^{\frac{1}{x}} - e$ near 0. By the Taylor expansions of $\log(1 + x)$ and $e^x$ at 0, we get
$$(1 + x)^{\frac{1}{x}} - e = e^{\frac{1}{x}\log(1 + x)} - e$$
$$= e^{\frac{1}{x}\left(x - \frac{x^2}{2} + o(x^2)\right)} - e$$
$$= e\left(e^{-\frac{x}{2} + o(x)} - 1\right)$$
$$= e\left[-\frac{x}{2} + o(x) + o\left(-\frac{x}{2} + o(x)\right)\right]$$
$$= -\frac{e}{2}x + o(x).$$
Translated back into $x$ approaching infinity, we have
$$\left(1 + \frac{1}{x}\right)^x - e = -\frac{e}{2x} + o\left(\frac{1}{x}\right).$$
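A quick numerical experiment supports the estimate. In this sketch the name `scaled_difference` is ours; it multiplies the difference by $x$, so the values should approach $-\frac{e}{2} \approx -1.359$ if the difference indeed behaves like $-\frac{e}{2x}$.

```python
import math

def scaled_difference(x: float) -> float:
    """x * ((1 + 1/x)^x - e), expected to approach -e/2 as x grows."""
    return x * ((1 + 1 / x) ** x - math.e)

for x in (1e1, 1e2, 1e3, 1e4):
    print(x, scaled_difference(x), -math.e / 2)
```

We stop at moderate values of $x$ on purpose, because for very large $x$ the floating point error in $\left(1 + \frac{1}{x}\right)^x$ is amplified by the factor $x$.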
1 − cos x2 ex − esin x
1. limx→0 . 7. limx→0 .
x3 sin x x3
sin x − tan x cos ax
2. limx→0 . 8. limx→0 log .
x3 cos bx
1 1
x − tan x
3. limx→0 − . 9. limx→0 .
sin2 x x2 x − sin x
1 x
1 1
2 1
4. limx→0 − . 10. limx→∞ x e − − 1 + .
x2 tan2 x x x
x 1
2 1
5. limx→1 − . 11. limx→∞ x log x sin .
x − 1 log x x
1 (x − 1) log x
6. limx→0 (cos x + sin x) x(x+1) . 12. limx→1 .
1 + cos πx
Exercise 2.5.29. Use whatever method you prefer to compute limits, p, q > 0.
q
1. limx→0+ xp e−x . 4. limx→1+ (x − 1)p log x.
q
2. limx→+∞ xp e−x . 5. limx→1+ (x − 1)p (log x)q .
q
3. limx→+∞ xp log x. 6. limx→+∞ xp e−x log x.
Exercise 2.5.30. Use whatever method you prefer to compute limits, p, q > 0.
1. $\lim_{x \to 0^+} \frac{\tan^p x - x^p}{\sin^p x - x^p}$.
2. $\lim_{x \to 0^+} \frac{\sin^p x - \tan^p x}{x^q}$.
3. $\lim_{x \to \frac{\pi}{4}} \frac{\tan x - \cot x}{4x - \pi}$.
4. $\lim_{x \to 0} \frac{a \tan bx - b \tan ax}{a \sin bx - b \sin ax}$.
1
1 1
16. limx→0 e−1 (1 + x) x .
x 2 x
18. limx→0 arccos x .
π
1
1
cos x x
Exercise 2.5.34. In Example 2.4.2, we applied the Mean Value Theorem to get $\log(1 + x) = \frac{x}{1 + \theta x}$ for some $0 < \theta < 1$.
1. Find an explicit formula for $\theta = \theta(x)$.
You may try the same for other functions such as $e^x - 1 = xe^{\theta x}$. What can you say about $\lim_{x \to 0} \theta$ in general?
Exercise 2.5.35. Show that the limits converge but cannot be computed by L’Hospital’s
rule.
1. $\lim_{x \to 0} \frac{x^2 \sin\frac{1}{x}}{\sin x}$.
2. $\lim_{x \to \infty} \frac{x - \sin x}{x + \sin x}$.
You may verify the result by directly computing the second order derivative of $y = \pm\sqrt{1 - x^2}$.
Example 2.5.16. In Example 2.2.10, we computed the derivative of the function $y = y(x)$ implicitly given by the equation $2y - 2x^2 - \sin y + 1 = 0$ and then obtained the linear approximation at $P = \left(\sqrt{\frac{\pi}{2}}, \frac{\pi}{2}\right)$. We can certainly continue finding the formula for the second order derivative of $y(x)$ and then get the quadratic approximation at $P$.
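$$0 = 2\left(\frac{\pi}{2} + a_1\Delta x + a_2\Delta x^2 + o(\Delta x^2)\right) - 2\left(\sqrt{\frac{\pi}{2}} + \Delta x\right)^2 - \sin\left(\frac{\pi}{2} + a_1\Delta x + a_2\Delta x^2 + o(\Delta x^2)\right) + 1$$
$$= 2a_1\Delta x + 2a_2\Delta x^2 - 4\sqrt{\frac{\pi}{2}}\Delta x - 2\Delta x^2 + o(\Delta x^2) - \cos(a_1\Delta x + a_2\Delta x^2 + o(\Delta x^2)) + 1$$
$$= 2a_1\Delta x + 2a_2\Delta x^2 - 2\sqrt{2\pi}\Delta x - 2\Delta x^2 + \frac{1}{2}(a_1\Delta x + a_2\Delta x^2)^2 + o(\Delta x^2).$$
Comparing the coefficients of $\Delta x$ and $\Delta x^2$, we get
$$2a_1 - 2\sqrt{2\pi} = 0, \quad 2a_2 - 2 + \frac{1}{2}a_1^2 = 0.$$
The solution is $a_1 = \sqrt{2\pi}$, $a_2 = \frac{2 - \pi}{2}$, and the quadratic approximation is
$$y(x) = \frac{\pi}{2} + \sqrt{2\pi}\Delta x + \frac{2 - \pi}{2}\Delta x^2 + o(\Delta x^2), \quad \Delta x = x - \sqrt{\frac{\pi}{2}}.$$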
1. $y^2 + 3y^3 + 1 = x$. 4. $\sqrt{x} + \sqrt{y} = \sqrt{a}$.
1. $x = \sin^2 t$, $y = \cos^2 t$.
Exercise 2.5.39. Show that for any n, there is a function that is n-th order differentiable
at 0 but has no second order derivative at 0.
Exercise 2.5.40. The lack of high order derivatives for the function in Example 2.5.17 is
due to discontinuity away from 0. Can you find a function with the following properties?
The next example deals with the following intuition from everyday life. Suppose we try to measure a length by more and more refined rulers. If our readings from the meter ruler, centimeter ruler, millimeter ruler, micrometer ruler, etc., are all 0, then the real length should be 0. Similarly, the Taylor expansion of a function at 0 is the measurement by “$x^n$-rulers”. The following example shows that, even if the readings by all the “$x^n$-rulers” are 0, the function does not have to be 0.
has derivative
$$f'(x) = \begin{cases} \frac{1}{x^2}e^{-\frac{1}{|x|}}, & \text{if } x > 0, \\ -\frac{1}{x^2}e^{-\frac{1}{|x|}}, & \text{if } x < 0, \\ 0, & \text{if } x = 0. \end{cases}$$
2.6. APPLICATION OF HIGH ORDER APPROXIMATION 151
The derivative at $x \ne 0$ is computed by the usual chain rule, and the derivative at 0 is computed directly
$$f'(0) = \lim_{x \to 0} \frac{1}{x}e^{-\frac{1}{|x|}} = \lim_{y \to \infty} \frac{y}{e^{|y|}} = 0.$$
The example can be understood in two ways. The first is that, for some functions, an even more refined ruler is needed in order to measure “beyond all orders”. The second is that the function above is not “measurable by polynomials”. The functions that are measurable by polynomials are called analytic, and the function above is not analytic.
Exercise 2.5.41. Directly show (i.e., without calculating the high order derivatives) that
the function in Example 2.5.18 is differentiable of any order, with 0 as the approximation
of any order.
When $x$ is close to $x_0$, we have $1 + o(1) > 0$ and therefore $f(x) > f(x_0)$ when $c(x - x_0)^n > 0$ and $f(x) < f(x_0)$ when $c(x - x_0)^n < 0$. Specifically, we have the following signs of $c(x - x_0)^n$ for various cases.
• If $n$ is odd and $c > 0$, then $c(x - x_0)^n > 0$ for $x > x_0$ and $c(x - x_0)^n < 0$ for $x < x_0$.
• If $n$ is odd and $c < 0$, then $c(x - x_0)^n < 0$ for $x > x_0$ and $c(x - x_0)^n > 0$ for $x < x_0$.
The sign of $c(x - x_0)^n$ then further determines whether $f(x) < f(x_0)$ or $f(x) > f(x_0)$, and we get the following result.
Theorem 2.6.1. Suppose $f(x)$ has high order approximation $f(x_0) + c(x - x_0)^n$ at $x_0$.
The special case $n = 1$ is Theorem 2.3.4. For the special case $n = 2$, the theorem gives the second derivative test: Suppose $f'(x_0) = 0$ (i.e., the criterion in Theorem 2.3.4 is satisfied), and $f$ has second order derivative at $x_0$.
Example 2.6.1. In Example 2.3.13, we found the candidates $\pm 1$ for the local extrema of $f(x) = x^3 - 3x + 1$. The values of the second order derivative $f''(x) = 6x$ at the two candidates are
$$f''(1) = 6 > 0, \quad f''(-1) = -6 < 0.$$
Therefore 1 is a local minimum and $-1$ is a local maximum.
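The second derivative test for this function can be written out as a short program. The sketch below hard-codes $f(x) = x^3 - 3x + 1$ and its derivatives; the function name `classify` and the returned strings are our own choices.

```python
def f1(x):
    """First order derivative of x^3 - 3x + 1."""
    return 3 * x**2 - 3

def f2(x):
    """Second order derivative of x^3 - 3x + 1."""
    return 6 * x

def classify(x0):
    """Apply the second derivative test at a candidate x0."""
    if f1(x0) != 0:
        return "not a candidate"
    if f2(x0) > 0:
        return "local minimum"
    if f2(x0) < 0:
        return "local maximum"
    return "inconclusive"  # f''(x0) = 0: a higher order approximation is needed

for x0 in (1, -1, 0):
    print(x0, classify(x0))
```

The last branch corresponds to the case where the quadratic term vanishes, and Theorem 2.6.1 has to be applied with a higher $n$.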
Example 2.6.2. Consider the function $y = y(x)$ implicitly defined in Example 2.2.10. By $y'(x) = \frac{4x}{2 - \cos y}$, we find a candidate $x = 0$ for the local extreme of $y(x)$. Then we have
$$y''(x) = \frac{4}{2 - \cos y} + 4x \left.\frac{d}{dy}\left(\frac{1}{2 - \cos y}\right)\right|_{y=y(x)} y'.$$
At the candidate $x = 0$, we already have $y'(0) = 0$. Therefore $y''(0) = \frac{4}{2 - \cos y(0)} > 0$. This shows that $x = 0$ is a local minimum of the implicitly defined function.
Example 2.6.3. The function $f(x) = x^2 - x^3 D(x)$ has no second order derivative at 0, but still has the quadratic approximation $f(x) = x^2 + o(x^2)$. The quadratic approximation tells us that 0 is a local minimum of $f(x)$.
Example 2.6.4. Let $f(x) = \frac{\sin x}{6x - x^3}$ for $x \ne 0$ and $f(0) = \frac{1}{6}$. Then for $x \ne 0$ close to 0, we have
$$f(x) = \frac{1}{6x - x^3}\left(x - \frac{x^3}{6} + \frac{x^5}{120} + o(x^6)\right) = \frac{1}{6 - x^2}\left(1 - \frac{x^2}{6} + \frac{x^4}{120} + o(x^5)\right)$$
$$= \frac{1}{6}\left(1 + \frac{x^2}{6} + \frac{x^4}{36} + o(x^5)\right)\left(1 - \frac{x^2}{6} + \frac{x^4}{120} + o(x^5)\right) = \frac{1}{6} + \frac{x^4}{720} + o(x^5).$$
We note that by $f(0) = \frac{1}{6}$, the 4-th order approximation also holds for $x = 0$. Then by Theorem 2.6.1, we find that $x = 0$ is a local minimum.
Alternatively, we may directly use the idea leading to Theorem 2.6.1. For $x \ne 0$ close to 0, we have
$$f(x) = \frac{1}{6x - x^3}\left(x - \frac{x^3}{6} + \frac{x^5}{120} + o(x^6)\right) = \frac{1}{6} + \frac{x^4(1 + o(x))}{(6 - x^2) \cdot 120}.$$
For small $x \ne 0$, we have $\frac{x^4(1 + o(x))}{(6 - x^2) \cdot 120} > 0$, which further implies $f(x) > \frac{1}{6} = f(0)$. Therefore 0 is a (strict) local minimum.
Exercise 2.6.2. Find the local extrema for the function y = y(x) implicitly given by x3 +
y 3 = 6xy.
Exercise 2.6.3. For $p > 1$, determine whether 0 is a local extreme for the function
$$\begin{cases} x^2 + |x|^p \sin x \sin\frac{1}{x}, & \text{if } x \ne 0, \\ 0, & \text{if } x = 0. \end{cases}$$
Exercise 2.6.5. Let $f(0) = 1$ and let $f(x)$ be given by the following for $x \ne 0$. Determine whether 0 is a local extreme.
1. $\frac{\sin x}{x + ax^2}$.
2. $\frac{\sin x}{x + bx^3}$.
3. $\frac{\sin x}{x + ax^2 + bx^3}$.
[Figure: the line $L_{x,y}$ and the values $L_{x,y}(z)$ and $f(z)$ at a point $z$ between $x$ and $y$.]
By the geometric intuition illustrated in Figure 2.6.2, the following are equivalent convexity conditions, for any $x \le z \le y$,
1. $L_{x,y}(z) \ge f(z)$.
2. $\mathrm{slope}(L_{x,z}) \le \mathrm{slope}(L_{x,y})$.
3. $\mathrm{slope}(L_{z,y}) \ge \mathrm{slope}(L_{x,y})$.
4. $\mathrm{slope}(L_{x,z}) \le \mathrm{slope}(L_{z,y})$.
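[Figure: the lines $L_{x,z}$, $L_{x,y}$, $L_{z,y}$ over the points $x$, $z$, $y$.]
[Figure: the lines $L_{x,z}$ and $L_{z,y}$ over the points $x$, $c$, $z$, $d$, $y$.]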
$$\lambda_1 + \lambda_2 + \cdots + \lambda_n = 1, \quad 0 \le \lambda_1, \lambda_2, \ldots, \lambda_n \le 1,$$
then
By reversing the direction of inequality, we also get Jensen’s inequality for con-
cave functions.
In all the discussions about convexity, we may also consider the strict inequalities.
So we have a concept of strict convexity, and a differentiable function is strictly
convex on an interval if and only if its derivative is strictly increasing. Jensen’s
inequality can also be extended to the strict case.
Example 2.6.5. By $(x^p)'' = p(p - 1)x^{p-2}$, we know $x^p$ is convex on $(0, +\infty)$ when $p \ge 1$ or $p < 0$, and is concave when $0 < p \le 1$.
For p ≥ 1, Jensen’s inequality means that
This means that the p-th power of the average is smaller than the average of the
p-th power.
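We note that $(x^p)'' > 0$ for $p > 1$. Therefore all the inequalities are strict, provided some $x_i > 0$ and $0 < \lambda_i < 1$.
By replacing $p$ with $\frac{p}{q}$ and replacing $x_i$ with $x_i^q$, we get
$$\left(\frac{x_1^q + x_2^q + \cdots + x_n^q}{n}\right)^{\frac{1}{q}} \le \left(\frac{x_1^p + x_2^p + \cdots + x_n^p}{n}\right)^{\frac{1}{p}}, \quad \text{for } p > q > 0.$$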
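The inequality says that the power mean is increasing in the exponent. The following sketch checks this on one sample; the function name `power_mean` and the sample values are our own choices for illustration.

```python
def power_mean(xs, p):
    """((x_1^p + ... + x_n^p) / n)^(1/p) for positive numbers xs and p > 0."""
    return (sum(x**p for x in xs) / len(xs)) ** (1 / p)

xs = [1.0, 2.0, 3.0, 4.0]
for p in (0.5, 1, 2, 3):
    print(p, power_mean(xs, p))
```

The printed values increase with $p$, and $p = 1$ gives the usual average 2.5.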
[Figure: graphs of the power functions $x^2$, $x$, $x^{\frac{1}{2}}$, $x^{-\frac{1}{2}}$, $x^{-1}$, $x^{-2}$.]
Example 2.6.6. By $(\log x)'' = -\frac{1}{x^2} < 0$, the logarithmic function is concave. Then Jensen’s inequality tells us that
Exercise 2.6.8. Let $p, q > 0$ satisfy $\frac{1}{p} + \frac{1}{q} = 1$. Use the concavity of $\log x$ to prove Young’s inequality
$$\frac{1}{p}x^p + \frac{1}{q}y^q \ge xy.$$
λ1 x1 + λ2 x2 + · · · + λn xn = λ1 x1 + (1 − λ1 )(µ2 x2 + · · · + µn xn ).
Exercise 2.6.10. Use the concavity of log x to prove that, for xi > 0, we have
Exercise 2.6.11. Use Exercise 2.6.10 to show that $f(p) = \log(x_1^p + x_2^p + \cdots + x_n^p)^{\frac{1}{p}}$ is decreasing. Then explain
$$(x_1^q + x_2^q + \cdots + x_n^q)^{\frac{1}{q}} \ge (x_1^p + x_2^p + \cdots + x_n^p)^{\frac{1}{p}}, \quad \text{for } p > q > 0.$$
Exercise 2.6.12. Verify the convexity of $x \log x$ and then use the property to prove the inequality $(x + y)^{x+y} \le (2x)^x(2y)^y$. Can you extend the inequality to more variables?
3. Local extrema, which are often (but not restricted to) the places where the function changes between increasing and decreasing.
4. Points of inflection, which are the places where the function changes between convex and concave.
5. Infinity, including the finite places where the function tends to infinity, and
the behavior of the function at the infinity.
then the linear function is an asymptote at +∞. If b = 0, then the line is a horizontal
asymptote. We also have similar asymptote at −∞ (perhaps with different a and
b). Moreover, if limx→x0 f (x) = ∞ at a finite x0 , then the line x = x0 is a vertical
asymptote.
In subsequent examples, we sketch the graphs of functions and try to indicate
the characteristics listed above as much as possible.
Example 2.6.7. In Example 2.3.4, we determined the monotone property and the local extrema of $f(x) = x^3 - 3x + 1$. The second order derivative $f''(x) = 6x$ also tells us that $f(x)$ is concave on $(-\infty, 0]$ and convex on $[0, +\infty)$, which makes 0 into a point of inflection. Moreover, the function has no asymptotes. The function is also symmetric with respect to the point $(0, 1)$ ($f(x) - 1$ is an odd function). Based on this information, we may sketch the graph.
[Figure: graph of $x^3 - 3x + 1$, with the local maximum, the point of inflection, and the local minimum marked.]
function has no asymptote and no symmetry. We also note that $\lim_{x \to 0} \frac{f(x)}{x} = \lim_{x \to 0} \frac{x + 1}{\sqrt[3]{x}} = \infty$. Therefore the tangent of $f(x)$ at 0 is vertical.
x     | (−∞, −2/5) | −2/5   | (−2/5, 0) | 0   | (0, 1/5) | 1/5    | (1/5, +∞)
f     | −∞ ←       | 0.1518 |           | 0   |          | 0.1073 | → +∞
f'    | +          | 0      | −         | no  | +        | +      | +
      | ↗          | max    | ↘         | min | ↗        | ↗      | ↗
f''   | −          | −      | −         | no  | −        | 0      | +
      |            |        |           |     |          | infl   |
Figure 2.6.6: Graph of $\sqrt[3]{x^2}(x + 1)$.
Example 2.6.10. In Example 2.3.7, we determined the monotone property and the local extrema of $f(x) = \frac{x^3}{x^2 - 1}$. The function is not defined at $\pm 1$, and has limits
$$\lim_{x \to 1^\pm} f(x) = \pm\infty, \quad \lim_{x \to -1^\pm} f(x) = \pm\infty.$$
$$\lim_{x \to \infty} (f(x) - x) = 0.$$
Exercise 2.6.13. Use the graph of $x^3 - 3x + 1$ in Example 2.6.7 to sketch the graphs.
Exercise 2.6.14. Use the graph of $xe^{-x}$ in Example 2.6.9 to sketch the graphs.
Figure 2.6.8: Graph of $\frac{x^3}{x^2 - 1}$.
Exercise 2.6.18. Sketch the graph of $\frac{ax + b}{cx + d}$ by using the graph of $\frac{1}{x}$.
Example 2.7.2. Assume some metal balls of radius $r = 10$ are selected to make a ball bearing. If the radius is allowed to have 1% relative error, what is the maximal relative error of the weight?
The weight of the ball is
$$W = \frac{4}{3}\rho\pi r^3,$$
where $\rho$ is the density. The error $\Delta W$ of the weight caused by the error $\Delta r$ of the radius is
$$\Delta W \approx \frac{dW}{dr}\Delta r = 4\rho\pi r^2\Delta r.$$
Therefore the relative error is
$$\frac{\Delta W}{W} \approx 3\frac{\Delta r}{r}.$$
Given the relative error of the radius is no more than 1%, we have $\left|\frac{\Delta r}{r}\right| \le 1\%$, so that the relative error of the weight is $\left|\frac{\Delta W}{W}\right| \le 3\%$.
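Since $W$ is proportional to $r^3$, the exact relative error can be computed and compared with the linear estimate. In the sketch below the function name `relative_weight_error` is ours, and the density and the factor $\frac{4}{3}\pi$ cancel out.

```python
def relative_weight_error(rel_radius_error: float) -> float:
    """Exact relative change of W = (4/3) * rho * pi * r^3 when r changes by the given relative error."""
    return (1 + rel_radius_error) ** 3 - 1

for e in (0.01, -0.01):
    print(e, relative_weight_error(e), 3 * e)  # exact value versus the linear estimate 3e
```

The exact values differ from the estimate $3\frac{\Delta r}{r}$ only in the fourth decimal place, which is the size of the quadratic term dropped by the linear approximation.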
$$y(1) = 0, \quad z(1) = -1, \quad y'(1) = \frac{1 - (-1)}{(-1) - 0} = -2, \quad z'(1) = \frac{1 - 0}{0 - (-1)} = 1.$$
Therefore
$$y(1.01) \approx 0 - 2 \cdot 0.01 = -0.02, \quad z(1.01) \approx -1 + 1 \cdot 0.01 = -0.99,$$
$$y(0.98) \approx 0 - 2 \cdot (-0.02) = 0.04, \quad z(0.98) \approx -1 + 1 \cdot (-0.02) = -1.02.$$
In other words, the points $(1.01, -0.02, -0.99)$ and $(0.98, 0.04, -1.02)$ are near $(1, 0, -1)$ and almost on the circle.
Theorem 2.7.1 (Lagrange Form of the Remainder). If $f(x)$ has $(n + 1)$-st order derivative on $(a, x)$, then the remainder of the $n$-th order Taylor expansion of $f(x)$ at $a$ is
$$R_n(x) = \frac{f^{(n+1)}(c)}{(n + 1)!}(x - a)^{n+1} \quad \text{for some } c \in (a, x).$$
We only illustrate the argument for the case $n = 2$. We know that the remainder satisfies $R_2(a) = R_2'(a) = R_2''(a) = 0$. Therefore by Cauchy’s Mean Value Theorem (Theorem 2.4.5), we have
$$\frac{R_2(x)}{(x - a)^3} = \frac{R_2(x) - R_2(a)}{(x - a)^3 - (a - a)^3} = \frac{R_2'(c_1)}{3(c_1 - a)^2} \quad (a < c_1 < x)$$
$$= \frac{R_2'(c_1) - R_2'(a)}{3[(c_1 - a)^2 - (a - a)^2]} = \frac{R_2''(c_2)}{3 \cdot 2(c_2 - a)} \quad (a < c_2 < c_1)$$
$$= \frac{R_2''(c_2) - R_2''(a)}{3 \cdot 2(c_2 - a)} = \frac{R_2'''(c_3)}{3 \cdot 2 \cdot 1} = \frac{f'''(c_3)}{3!}. \quad (a < c_3 < c_2)$$
In the last step, we use $R_2''' = f'''$ because $f - R_2$ is a quadratic function and has vanishing third order derivative.
2.7. NUMERICAL APPLICATION 165
A slight modification of the proof above actually gives a proof that the Taylor
expansion is high order approximation (Theorem 2.5.2).
Example 2.7.4. The error for the linear approximation in Example 2.7.1 can be estimated by
$$R_1(x) = -\frac{\frac{1}{4c^{3/2}}}{2!}\Delta x^2 = -\frac{1}{8c^{3/2}}\Delta x^2.$$
For both approximate values, we have
$$|R_1| \le \frac{1}{8 \cdot 4^{3/2}}0.05^2 = 0.0000390625 < 4 \times 10^{-5}.$$
Exercise 2.7.3. Find approximate values by using Taylor expansions and estimate the er-
rors.
Exercise 2.7.5. Find the approximate value of tan 1 accurate up to the 10-th digit by using
the Taylor expansions of sin x and cos x.
Exercise 2.7.6. If we use the Taylor expansion to calculate e accurate up to the 100-th
digit, what is the order of the Taylor expansion we should use?
[Figure: Newton’s method, with the linear approximations $L_0, L_1, L_2, L_3$ at $x_0, x_1, x_2, x_3$ approaching the solution.]
Example 2.7.6. By Example 1.7.5, the equation $x^3 - 3x + 1 = 0$ should have a solution on $(0.3, 0.4)$. By Example 1.7.6, the equation should also have a second solution $> 1$ and a third solution $< 0$. More precisely, by $f(-2) = -1$, $f(-1) = 3$, $f(1) = -1$, $f(2) = 3$, the second solution is on $(1, 2)$ and the third solution is on $(-2, -1)$.
Taking $-2$, $0.3$, $2$ as initial estimations, we apply Newton’s method and compute the sequence
$$x_{n+1} = x_n - \frac{x_n^3 - 3x_n + 1}{3(x_n^2 - 1)} = \frac{2}{3}x_n + \frac{2x_n - 1}{3(x_n^2 - 1)}.$$
n   x0 = −2              x0 = 0.3             x0 = 2
1   -1.88888888888889    0.346520146520147    1.66666666666667
2   -1.87945156695157    0.347296117887934    1.54861111111111
3   -1.87938524483667    0.347296355333838    1.53239016186538
4   -1.87938524157182    0.347296355333861    1.53208898939722
5   -1.87938524157182    0.347296355333861    1.53208888623797
6                                             1.53208888623796
7                                             1.53208888623796
Note that the initial estimation cannot be 1 or $-1$, because the derivative vanishes at these points. Moreover, if we start from 0.88, 0.89, 0.90, we get very different sequences that respectively converge to the three solutions. We see that Newton’s method can be very sensitive to the initial estimation, especially when the estimation is close to where the derivative vanishes.
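The table can be reproduced by a few lines of code. The following is a minimal sketch of Newton’s method for $f(x) = x^3 - 3x + 1$; the function name `newton` and the fixed number of steps are our own choices.

```python
def f(x):
    return x**3 - 3 * x + 1

def df(x):
    return 3 * x**2 - 3

def newton(x0, steps):
    """Return [x0, x1, ..., x_steps] with x_{n+1} = x_n - f(x_n) / f'(x_n)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / df(x))  # fails if df(x) == 0, e.g. at x = 1 or x = -1
    return xs

for x0 in (-2.0, 0.3, 2.0):
    print(x0, newton(x0, 7))
```

Changing the initial estimation to values near 1, where the derivative vanishes, is an easy way to observe the sensitivity mentioned in the example.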
Example 2.7.7. We solve the equation sin x + x cos x = 0 by starting with the estima-
tion x0 = 1. After five steps, we find the exact solution should be 0.325639452727856 · · · .
n xn
0 1.000000000000000
1 0.471924667505487
2 0.330968826345873
3 0.325645312076542
4 0.325639452734876
5 0.325639452727856
6 0.325639452727856
Exercise 2.7.7. Apply Newton’s method to solve $x^3 - x - 1 = 0$ with the initial estimations 1, 0.6 and 0.57. What lesson can you draw from the conclusion?
Exercise 2.7.8. Use Newton’s method to find the unique positive root of f (x) = ex − x − 2.
Exercise 2.7.9. Use Newton’s method to find all the solutions of $x^2 - \cos x = 0$.
Exercise 2.7.10. Use Newton’s method to find the approximate values of $\sqrt{4.05}$ and $e^{-1}$ accurate up to the 10-th digit.
Exercise 2.7.11. Use Newton’s method to find all solutions accurate up to the 6-th digit.
Note that one may rewrite the equation into another equivalent form and derive a
simpler recursive relation.
Exercise 2.7.13. What approximate values does the recursive relation $x_{n+1} = 2x_n - ax_n^2$ give you? Explain by Newton’s method.
Exercise 2.7.14. Explain why Newton’s method does not work if we try to solve x3 −3x+1 =
0 by starting at the estimation 1.
Exercise 2.7.15. Newton’s method fails to solve the following equations by starting at any $x_0 \ne 0$. Why?
1. $\sqrt[3]{x} = 0$. 2. $\mathrm{sign}(x)\sqrt{|x|} = 0$.
Chapter 3
Integration
Figure 3.1.1: The region $G_{[a,b]}(f)$ between the function and the x-axis.
Our strategy is the following. For any $x \in [a, b]$, let $A(x)$ be the area of $G_{[a,x]}(f)$, which is the part of the region over $[a, x]$. We will find how the function $A(x)$ changes and recover $A(x)$ from its change. The area we wish to find is then the value $A(b)$. The subsequent argument assumes that $f(x)$ is continuous.
Consider an interval $[x, x + h] \subset [a, b]$, which implicitly assumes $h > 0$. Then the change $A(x + h) - A(x)$ is the area of $G_{[x,x+h]}(f)$. Note that $G_{[x,x+h]}(f)$ is sandwiched between two rectangles
$$[x, x + h] \times [0, m] \subset G_{[x,x+h]}(f) \subset [x, x + h] \times [0, M],$$
171
172 CHAPTER 3. INTEGRATION
where
m = min f, M = max f.
[x,x+h] [x,x+h]
m
G[x,x+h] (f )
a x x+h
b
Example 3.1.1. To find the area of the region below $f(x) = c$ over $[a, b]$, by (3.1.4),
we have $A'(x) = c = (cx)'$. Then by Theorem 2.4.3, we get $A(x) = cx + C$. Further
by $A(a) = 0$, we get $C = -ca$ and $A(x) = c(x - a)$. Therefore the area of the region
is $A(b) = c(b - a)$.
The region $G_{[a,b]}(c)$ is actually a rectangle of base $b - a$ and height $c$. The
computation of the area is consistent with common sense.
The pattern we see from the examples above is that, to find the area below a non-negative
function and over an interval $[a, b]$, we first find a function $F(x)$ satisfying
$f(x) = F'(x)$. Then by Theorem 2.4.3, $A'(x) = F'(x)$ implies $A(x) = F(x) + C$.
Further, by $A(a) = 0$, we get $C = -F(a)$. Therefore $A(x) = F(x) - F(a)$, and the
area we wish to find is
\[ A(b) = F(b) - F(a). \]
Example 3.1.3. To find the area of the region below $x^2$ and over $[0, a]$, we use
$\left(\frac{1}{3}x^3\right)' = x^2$. The area is $\frac{1}{3}a^3 - \frac{1}{3}0^3 = \frac{1}{3}a^3$.
More generally, for any $p \ne -1$ and $0 < a < b$, by $\left(\frac{1}{p+1}x^{p+1}\right)' = x^p$, the area of
the region below $x^p$ and over $[a, b]$ is
\[ \frac{1}{p+1}(b^{p+1} - a^{p+1}). \]
For example, the area of the region below $\sqrt{x}$ and over $[1, 2]$ is
\[ \frac{2}{3}x^{\frac{3}{2}}\Big|_{x=1}^{x=2} = \frac{2}{3}(2^{\frac{3}{2}} - 1^{\frac{3}{2}}) = \frac{2}{3}(2\sqrt{2} - 1). \]
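The value $\frac{2}{3}(2\sqrt{2} - 1)$ can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the helper name `strip_area` and the choice of 100000 strips are assumptions made for illustration. It approximates the area below $\sqrt{x}$ over $[1, 2]$ by adding up thin rectangles and compares the sum with the value given by the antiderivative.

```python
import math

def strip_area(f, a, b, n):
    """Approximate the area below f over [a, b] by n thin rectangles
    whose heights are taken at the left end of each strip."""
    h = (b - a) / n
    return h * sum(f(a + i * h) for i in range(n))

# Value predicted by the antiderivative (2/3) x^(3/2).
exact = 2.0 / 3.0 * (2.0 * math.sqrt(2.0) - 1.0)

# Numerical approximation with many thin strips (n is an arbitrary choice).
approx = strip_area(math.sqrt, 1.0, 2.0, 100000)

print(exact, approx)
```

The two printed numbers should agree to several digits, and the agreement improves as the number of strips grows.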
Exercise 3.1.1. Find the area of the region below the function over the given interval.
Figure 3.1.3: Area below parabola is $\frac{2}{3}(2\sqrt{2} - 1)$.
√
1. xp on [0, 1], p > 0. 3. ex on [0, a]. 5. 1 + x on [1, 2].
1
2. sin x on [0, π2 ]. 4. on [1, a]. 6. log x on [1, a].
x
Exercise 3.1.2. Find the area of the region bounded by $1 - x^2$ and the x-axis.
To justify our claim, let A(x) be the signed area for f (x) over [a, x]. What we
are really concerned with is the change of A(x) where f is negative. So we consider
[x, x + h] ⊂ [a, b], with h > 0 and f < 0 on [x, x + h]. The change A(x + h) − A(x)
is the negative of the positive, “unsigned” area of the region
between f and the x-axis along the interval [x, x + h]. We have the similar inclusion
3.1. AREA AND DEFINITE INTEGRAL 175
The heights of the rectangles are respectively −M and −m, and we get (beware of
the signs)
(−M )h ≤ −(A(x + h) − A(x)) ≤ (−m)h.
Again we get the inequality (3.1.2) and subsequently the limit (3.1.3).
The discussion for the case $h < 0$ is similar, and we conclude that $A'(x) = f(x)$.
Example 3.1.4. By $(x)' = 1$ and $\left(\frac{1}{2}x^2\right)' = x$, we get
\[ \int_a^b dx = b - a, \quad \int_a^b x\,dx = \frac{1}{2}(b^2 - a^2). \]
In general, for any integer $n \ne -1$, we have
\[ \int_a^b x^n\,dx = \frac{1}{n+1}(b^{n+1} - a^{n+1}). \]
However, for $n < 0$, $a$ and $b$ need to have the same sign. The reason is that we
derived the Newton-Leibniz formula under the assumption that the integrand is
continuous. If $a$ and $b$ have different signs, then $0 \in [a, b]$, and $x^n$ is not continuous
on $[a, b]$ for $n < 0$.
On the other hand, for any $p$, $x^p$ is defined for $x > 0$. Then for $p \ne -1$ and $b \ge a > 0$, we
have
\[ \int_a^b x^p\,dx = \frac{1}{p+1}(b^{p+1} - a^{p+1}). \]
We note that $x^p$ is also defined at 0 for $p \ge 0$, and the formula above holds for
$p \ge 0$ and $b \ge a \ge 0$. The reason is that $x^p$ is right continuous at 0, and we derived
the Newton-Leibniz formula by using one-sided derivatives.
Example 3.1.5. By $(e^x)' = e^x$ and $(\log x)' = \frac{1}{x}$, we get
\[ \int_a^b e^x\,dx = e^b - e^a, \quad \int_a^b \frac{dx}{x} = \log b - \log a = \log\frac{b}{a}. \]
Note that the second integral requires $b \ge a > 0$.
Example 3.1.6. By $(\sin x)' = \cos x$ and $(\cos x)' = -\sin x$, we get
\[ \int_a^b \cos x\,dx = \sin b - \sin a, \quad \int_a^b \sin x\,dx = \cos a - \cos b. \]
Of course, we need $|a|, |b| < 1$ in the first equality because the integrand is defined
only on the open interval $(-1, 1)$. In particular, the area of the region below $\frac{1}{1+x^2}$
and over $[0, 1]$ is
\[ \int_0^1 \frac{dx}{1+x^2} = \arctan 1 - \arctan 0 = \frac{\pi}{4}. \]
We also note that
\[ \lim_{\substack{a \to -\infty \\ b \to +\infty}} \int_a^b \frac{dx}{1+x^2} = \lim_{b \to +\infty} \arctan b - \lim_{a \to -\infty} \arctan a = \frac{\pi}{2} - \left(-\frac{\pi}{2}\right) = \pi. \]
So the area of the unbounded region between $\frac{1}{1+x^2}$ and the x-axis is $\pi$.
Exercise 3.1.3. Use the area meaning of definite integral to directly find the value.
1. $\int_{-1}^1 \sqrt{1 - x^2}\,dx$. 2. $\int_0^3 (x - 2)\,dx$. 3. $\int_a^b |x - 1|\,dx$.
Exercise 3.1.7. Compute $\int_a^b \sqrt[3]{x}\,dx$ and $\int_a^b \frac{1}{\sqrt[3]{x}}\,dx$. Explain for what range of $a, b$ the
formulae are valid.
Exercise 3.1.9. What is the area of the unbounded region between $\frac{1}{\sqrt{1 - x^2}}$ and the x-axis,
over the interval $(-1, 1)$?
The equality can be used to calculate the definite integral of “piecewise continuous”
functions.
Example 3.1.8. The definite integral of the function (which is not continuous at 0)
\[ f(x) = \begin{cases} -2x, & \text{if } -1 \le x < 0, \\ e^x, & \text{if } 0 \le x \le 1, \end{cases} \]
on $[-1, 1]$ is
\[ \int_{-1}^1 f(x)dx = \int_{-1}^0 f(x)dx + \int_0^1 f(x)dx = \int_{-1}^0 (-2x)dx + \int_0^1 e^x dx = -x^2\Big|_{-1}^0 + e^x\Big|_0^1 = e. \]
This extends the definite integral to the case the upper limit is smaller than the lower
limit. With such extension, the equality (3.1.5) holds for any combination of a, b, c.
Moreover, the extended definite integral is still computed by the antiderivative as
before.
Another important property of area is positivity. Translated into definite integral,
this means
\[ f \ge 0 \implies \int_a^b f(x)dx \ge 0, \quad \text{for } a < b. \tag{3.1.6} \]
Note that if $a > b$, then $\int_a^b f(x)dx \le 0$. The positivity is further extended to
monotonicity in Example 3.5.5.
If we shift the region under the graph of $f(x)$ over $[a, b]$ by $d$, we get the region under the graph of $f(x - d)$
over $[a + d, b + d]$. Since the area is not changed by shifting, we get
\[ \int_{a+d}^{b+d} f(x - d)dx = \int_a^b f(x)dx. \tag{3.1.7} \]
See Exercise for more examples of properties of area implying properties of definite
integral.
In Section 3.5, we will introduce more properties from the viewpoint of
computation (i.e., Newton-Leibniz formula). Some of these properties cannot be
easily explained by properties of area.
Exercise 3.1.10. Suppose $\int_0^2 f(x)dx = 3$, $\int_5^4 f(x)dx = 2$, $\int_5^0 f(x)dx = 0$. Find $\int_2^4 f(x)dx$.
3. Rectangles have the usual area: $\mu(\langle a, b\rangle \times \langle c, d\rangle) = (b - a)(d - c)$.
3.2. RIGOROUS DEFINITION OF INTEGRAL 181
Here $\langle a, b\rangle$ can mean any of $[a, b]$, $(a, b)$, $(a, b]$, or $[a, b)$. A careful review of the
argument in Section 3.1 shows that nothing beyond the three properties is used.
Suppose a plane region $A \subset \mathbb{R}^2$ is a union of finitely many rectangles, $A = \cup_{i=1}^n I_i$,
such that the intersections between $I_i$ are at most lines. Since lines have zero area
by the third property, we may use the second property to further define $\mu(A) = \sum_{i=1}^n \mu(I_i)$.
We give such a plane region the temporary name “good region”, since
we have a definite idea about the area of a good region. (Strictly speaking, we still
need to argue that $\sum_{i=1}^n \mu(I_i)$ is independent of the decomposition $A = \cup_{i=1}^n I_i$.)
as the lower bound for $\mu(X)$, and the outer area (the minimum should really be the
infimum)
\[ \mu^*(X) = \min\{\mu(B) : B \supset X,\ B \text{ is a good region}\}, \]
as the upper bound for $\mu(X)$. We say that the subset $X$ has area (or is Jordan
measurable) if $\mu_*(X) = \mu^*(X)$, and the common value is the area $\mu(X)$ of $X$. If
$\mu_*(X) \ne \mu^*(X)$, then we say $X$ has no area.
The subset $X$ has area if and only if for any $\epsilon > 0$, there are good regions $A$
and $B$, such that $A \subset X \subset B$ and $\mu(B) - \mu(A) < \epsilon$. In other words, we can
find good inner and outer approximations, such that the difference between the
approximations can be arbitrarily small.
Example 3.2.1. A point can be considered as a reduced rectangle and has area 0. If $X$
consists of finitely many points, then we can take $B = X$ to be the union of all “point
rectangles” in $X$. Since $\mu(B) = 0$, we get $\mu^*(X) = 0$. By $0 \le \mu_*(X) \le \mu^*(X)$, we
also have $\mu_*(X) = 0$. Therefore finitely many points have area 0.
Example 3.2.2. Consider the triangle with vertices $(0, 0)$, $(1, 0)$ and $(1, 1)$. We partition
the interval $[0, 1]$ into $n$ parts of equal length
\[ [0, 1] = \cup_{i=1}^n \left[\frac{i-1}{n}, \frac{i}{n}\right]. \]
By taking sufficiently big $n$, the difference $\mu(B_n) - \mu(A_n) = \frac{1}{n}$ can be arbitrarily
small. Therefore the triangle has area, and the area is given by $\lim_{n\to\infty} \mu(A_n) = \lim_{n\to\infty} \mu(B_n) = \frac{1}{2}$.
This justifies the conclusion of Example 3.1.2 for the case $a = 1$.
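The inner and outer approximations of the triangle can be tabulated exactly. The following Python sketch is an illustration only. It assumes that, over the strip $\left[\frac{i-1}{n}, \frac{i}{n}\right]$, the inner rectangle has height $\frac{i-1}{n}$ and the outer rectangle has height $\frac{i}{n}$, which are the natural choices for the region $0 \le y \le x$; the function name `inner_outer_areas` is hypothetical.

```python
from fractions import Fraction

def inner_outer_areas(n):
    """Exact areas of the inner and outer unions of rectangles for the
    triangle 0 <= y <= x <= 1, using n vertical strips of equal width."""
    width = Fraction(1, n)
    # Inner rectangle on the i-th strip uses the smallest height (i-1)/n.
    inner = sum(width * Fraction(i - 1, n) for i in range(1, n + 1))
    # Outer rectangle on the i-th strip uses the largest height i/n.
    outer = sum(width * Fraction(i, n) for i in range(1, n + 1))
    return inner, outer

for n in (10, 100, 1000):
    inner, outer = inner_outer_areas(n)
    print(n, inner, outer, outer - inner)
```

Under these assumptions the gap `outer - inner` equals $\frac{1}{n}$ exactly, and both areas squeeze towards $\frac{1}{2}$.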
Example 3.2.3. For an example of subsets without area, i.e., satisfying $\mu_*(X) \ne \mu^*(X)$,
let us consider the subset $X = (\mathbb{Q} \cap [0, 1])^2$ of all rational pairs in the unit
square.
Since the only rectangles contained in $X$ are single points, any good region
$A \subset X$ must be finitely many points. Therefore $\mu(A) = 0$ for any good region
$A \subset X$, and $\mu_*(X) = 0$.
On the other hand, if $B$ is a good region containing $X$, then $B$ must almost
contain the whole square $[0, 1]^2$, with the only exception of finitely many horizontal
or vertical (irrational) line segments. Therefore we have $\mu(B) \ge \mu([0, 1]^2) = 1$. This
implies $\mu^*(X) \ge 1$ (show that $\mu^*(X) = 1$!).
Exercise 3.2.1. Use inner and outer approximations to explain that any rectangle has area
given by the multiplication of two sides. This justifies Example 3.1.1.
Exercise 3.2.2. Explain that a (not necessarily horizontal or vertical) straight line segment
has area 0.
Exercise 3.2.3. Explain that the region between $y = x$ and the x-axis over $[0, a]$ has area
$\frac{1}{2}a^2$. This fully justifies the computation in Example 3.1.2.
Exercise 3.2.4. Explain that the subset X = (Q ∩ [0, 1]) × [0, 1] of all vertical rational lines
in the unit square has no area.
Suppose $f \ge 0$ on $[a, b]$. As indicated by Figure 3.2.4, for any inner approximation
of $G_{[a,b]}(f)$, we can always choose “full vertical strips” to get a better approximation
for $G_{[a,b]}(f)$. Here better means closer to the expected value of $\mu(G_{[a,b]}(f))$. The
outer approximations have similar improvements by full vertical strips. Therefore
we only need to consider the approximations by full vertical strips.
The Riemann integrability of $f$ means that $G_{[a,b]}(f)$ has area, which further
means that the difference between inner and outer approximations can be arbitrarily
small. Therefore we get the following criterion for the integrability.
The quantity
\[ \omega_{[x_{i-1},x_i]} f = \max_{[x_{i-1},x_i]} f - \min_{[x_{i-1},x_i]} f \]
measures how much the value of $f$ fluctuates on $[x_{i-1}, x_i]$ and is called the oscillation
of the function on the interval. Since the continuity of a function can imply that such
oscillations are uniformly small, continuous functions are always Riemann integrable.
The criterion also implies that monotone functions are Riemann integrable. On the
other hand, there are functions that are not Riemann integrable.
Example 3.2.4. Consider the Dirichlet function $D(x)$ on $[0, 1]$. We always have
\[ \min_{[x_{i-1},x_i]} D = 0, \quad \max_{[x_{i-1},x_i]} D = 1. \]
Theorem 3.2.3. A bounded subset $X \subset \mathbb{R}^2$ has area if and only if its boundary $\partial X$
has zero area.
We remark that the theory of area can be easily extended to the theory of
volume for subsets in $\mathbb{R}^n$. We may then get the rigorous definition of multivariable
Riemann integrals on subsets of Euclidean spaces, where the subsets should have
volume themselves. The high dimensional versions of Theorems 3.2.2 and 3.2.3 are
still valid.
Further extension of the area theory introduces countably many in place of
finitely many. The result is the modern theory of Lebesgue measure and Lebesgue
integral.
Theorem 3.2.4. Suppose $f$ is Riemann integrable on $[a, b]$. Then for any $\epsilon > 0$,
there is a tagged partition $P_0$ of $[a, b]$, such that for any refinement $P$ of $P_0$, we have
\[ \left| S(P, f) - \int_a^b f(x)dx \right| < \epsilon. \]
For practical purposes (see the next section), we usually take the partition to
be regular, i.e. the subintervals are spaced evenly. However, the definitions of both
Darboux and Riemann sums allow arbitrary partitions, and usually it is not possible
to obtain a refinement which is also regular (e.g. take $P : 1 \le \sqrt{2} \le 2$).
Luckily, we have the following result (which is equivalent to both Theorem 3.2.2
and Theorem 3.2.4), that allows us to use any partition as long as the subinterval
widths are small enough.
3.3. NUMERICAL CALCULATION OF INTEGRAL 187
Theorem 3.2.5. Suppose $f$ is Riemann integrable on $[a, b]$ with $s = \int_a^b f(x)dx$.
Then for any $\epsilon > 0$, there exists $\delta > 0$ such that for any partition $P$ with $\|P\| < \delta$
and any sample points $x_i^*$ for $S(P, f)$, we have
\[ |L(P, f) - s| < \epsilon, \quad |S(P, f) - s| < \epsilon, \quad |U(P, f) - s| < \epsilon, \]
where $\|P\| = \max_{1 \le i \le n} (x_i - x_{i-1})$ is the largest subinterval width (“mesh size”).
Equivalently,
\[ s - \epsilon < L(P, f) \le S(P, f) \le U(P, f) < s + \epsilon. \]
Example 3.3.2. To compute the integral of $f(x) = \frac{1}{1 + x^2}$ on $[0, 1]$, we take $n = 4$.
The partition is
P4 : 0 < 0.25 < 0.5 < 0.75 < 1, h = 0.25.
The values of f (x) at the five partition points are
Exercise 3.3.1. Find Ln and Rn and confirm the value of related integral.
Another choice is the average of the Riemann sums using the left and right points.
\[ T_n = \frac{L_n + R_n}{2} = \frac{h}{2}(f(x_0) + 2f(x_1) + 2f(x_2) + \cdots + 2f(x_{n-1}) + f(x_n)). \]
The two approximation schemes are the midpoint rule and the trapezoidal rule.
Since $M_n$, $L_n$, $R_n$ are Riemann sums, they converge to the integral of $f(x)$. Consequently
$T_n$ also converges.
n = 4, we have
M8 ≈ 0.785721, T8 ≈ 0.784747.
Compared with the actual value $\int_0^1 \frac{dx}{1+x^2} = \frac{\pi}{4} = 0.7853981634\cdots$, the following
are the errors of various schemes.
We observe that the midpoint and trapezoidal rules are much more accurate,
and the error for the midpoint rule is about half of the error for the trapezoidal rule.
Moreover, doubling the number of partition points improves the error by a factor of
4 for the two rules. The following gives an estimation of the errors.
Theorem 3.3.1. Suppose $f''(x)$ is continuous and bounded by $K_2$ on $[a, b]$, then
\[ \left| \int_a^b f(x)dx - M_n \right| \le \frac{K_2(b - a)^3}{24n^2}, \quad \left| \int_a^b f(x)dx - T_n \right| \le \frac{K_2(b - a)^3}{12n^2}. \]
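The two rules and the error bounds of the theorem can be tried out on $\int_0^1 \frac{dx}{1+x^2} = \frac{\pi}{4}$. The following Python sketch is only an illustration: the function names `midpoint` and `trapezoid` are hypothetical, the midpoint rule is taken to sample $f$ at the middle of each subinterval, and the bound $K_2 = 2$ for $|f''|$ on $[0, 1]$ is the value used later in the text.

```python
import math

def f(x):
    return 1.0 / (1.0 + x * x)

def midpoint(f, a, b, n):
    """Midpoint rule M_n: sample f at the middle of each of n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def trapezoid(f, a, b, n):
    """Trapezoidal rule T_n = h/2 (f(x0) + 2 f(x1) + ... + 2 f(x_{n-1}) + f(xn))."""
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))

exact = math.pi / 4
K2 = 2.0  # bound for |f''| on [0, 1], assumed as in the text

errors = {}
for n in (4, 8, 16):
    err_m = abs(midpoint(f, 0.0, 1.0, n) - exact)
    err_t = abs(trapezoid(f, 0.0, 1.0, n) - exact)
    errors[n] = (err_m, err_t)
    # Print each error next to the bound from Theorem 3.3.1.
    print(n, err_m, K2 / (24 * n * n), err_t, K2 / (12 * n * n))
```

In this example each printed error stays below the corresponding bound, and doubling $n$ reduces both errors by a factor of about 4.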
Exercise 3.3.6. For the integral $\int_a^b x^2 dx$, we take any partition of $[a, b]$, in which the
intervals may not have the same length. Estimate the error of the various schemes in
terms of the size $\delta = \max_{i=1}^n (x_i - x_{i-1})$ of the partition.
Exercise 3.3.7. Apply the midpoint and trapezoidal rules to the integral and compare with
the actual value.
1. $\int_1^2 \frac{dx}{x}$, $n = 6, 12$. 2. $\int_0^\pi \sin x\,dx$, $n = 4, 12$.
Exercise 3.3.8. Apply the midpoint and trapezoidal rules to the integral. Moreover, estimate
the number of partition points needed for the approximation to be accurate up to
$10^{-6}$.
1. $\int_0^\pi \cos x^2\,dx$, $n = 5, 10$. 4. $\int_0^1 e^{x^2} dx$, $n = 10$.
2. $\int_0^\pi \frac{\sin x}{x}\,dx$, $n = 5, 10$. 5. $\int_1^2 e^{\frac{1}{x}} dx$, $n = 10$.
3. $\int_0^2 \frac{1}{\sqrt{1 + x^3}}\,dx$, $n = 5, 10$. 6. $\int_1^2 \frac{\log x}{1 + x}\,dx$, $n = 10$.
Exercise 3.3.9. Show that $T_{2n} = \frac{1}{2}(M_n + T_n)$.
Exercise 3.3.10. Prove that if $f$ is a concave positive function, then $T_n \le \int_a^b f(x)dx < M_n$.
satisfying
\[ S_n = \frac{h}{3}(f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + 2f(x_4) + \cdots + 2f(x_{n-2}) + 4f(x_{n-1}) + f(x_n)). \]
This is Simpson’s rule. Observe that $S_n = \frac{1}{3}(2T_n + M_{\frac{n}{2}})$ is the weighted average
of the trapezoidal (with step size $h$) and midpoint (with step size $2h$) rules. As a
result, it also converges to the integral of $f(x)$.
The errors in Simpson’s rule can be estimated by the bound on the fourth order
derivative.
Theorem 3.3.2. Suppose $f^{(4)}(x)$ is continuous and bounded by $K_4$ on $[a, b]$, then
\[ \left| \int_a^b f(x)dx - S_n \right| \le \frac{K_4(b - a)^5}{180n^4}. \]
A consequence of the theorem is that doubling the partition improves the error
by a factor of 16!
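Simpson’s rule is short to program. The following Python sketch (the function names are hypothetical, and the test integrand $\frac{1}{1+x^2}$ on $[0, 1]$ is chosen to match the running example) implements the formula for $S_n$ with an even number $n$ of subintervals, and checks numerically the relation between $S_n$, $T_n$ and $M_{\frac{n}{2}}$ stated above.

```python
def f(x):
    return 1.0 / (1.0 + x * x)

def simpson(f, a, b, n):
    """Simpson's rule S_n with n (even) equal subintervals of width h."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + i * h) for i in range(1, n, 2))  # odd-indexed points
    total += 2 * sum(f(a + i * h) for i in range(2, n, 2))  # even interior points
    return h * total / 3

def trapezoid(f, a, b, n):
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))

def midpoint(f, a, b, n):
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

s4 = simpson(f, 0.0, 1.0, 4)
# Weighted average of T_n (step h) and M_{n/2} (step 2h).
combo = (2 * trapezoid(f, 0.0, 1.0, 4) + midpoint(f, 0.0, 1.0, 2)) / 3
print(s4, combo)
```

The two printed values should coincide up to rounding, which is the identity $S_n = \frac{1}{3}(2T_n + M_{\frac{n}{2}})$ for $n = 4$.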
Example 3.3.6. Applying Simpson’s rule to $\int_0^1 \frac{dx}{1 + x^2}$ for $n = 4$, we use the same
data from Example 3.3.1 to get
\[ S_4 = \frac{0.25}{3} \times (1.000000 + 4 \times 0.941176 + 2 \times 0.800000 + 4 \times 0.640000 + 0.500000) \approx 0.785392. \]
Compared with $\frac{\pi}{4} = 0.7853981634\cdots$, the error $|S_4 - I| \approx 0.000006$ is already much smaller than the errors
of the midpoint and trapezoidal rules for $n = 8$.
How many partition points are needed in order to get the approximate value
accurate up to the 6-th digit? To answer the question, we compute the derivatives
From $f^{(5)}(x)$, the extrema of $f^{(4)}(x)$ on $[0, 1]$ can only be at $0$, $\frac{1}{\sqrt{3}}$ or $1$. By
\[ |f^{(4)}(0)| = 24, \quad \left| f^{(4)}\left(\frac{1}{\sqrt{3}}\right) \right| = \frac{81}{8}, \quad |f^{(4)}(1)| = 3, \]
we may take $K_4 = 24$, and the error bound of Theorem 3.3.2 is small enough when
\[ \frac{24}{180n^4} \le 10^{-6}. \]
Therefore we need $n \ge 19.1$. Since $n$ should be an even integer, this means $n \ge 20$.
We may carry out the similar estimation for the midpoint and trapezoidal rules.
We find $K_2 = |f''(0)| = 2$, so that the estimations become
\[ \frac{2}{24n^2} \le 10^{-6}, \quad \frac{2}{12n^2} \le 10^{-6}. \]
The answers are respectively $n \ge 289$ and $n \ge 409$.
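The three thresholds can be reproduced mechanically. The following Python sketch assumes that each error bound has the shape $\frac{K(b-a)^{p+1}}{c\,n^p}$, as in Theorems 3.3.1 and 3.3.2; the helper name `smallest_n` and its parameters are hypothetical.

```python
import math

def smallest_n(K, c, p, tol, length=1.0, even=False):
    """Smallest integer n with K * length**(p+1) / (c * n**p) <= tol.
    If even is True, round up to the next even integer (Simpson's rule)."""
    n = math.ceil((K * length ** (p + 1) / (c * tol)) ** (1.0 / p))
    if even and n % 2 == 1:
        n += 1
    return n

n_simpson = smallest_n(K=24, c=180, p=4, tol=1e-6, even=True)
n_midpoint = smallest_n(K=2, c=24, p=2, tol=1e-6)
n_trapezoid = smallest_n(K=2, c=12, p=2, tol=1e-6)
print(n_simpson, n_midpoint, n_trapezoid)
```

These are sufficient values of $n$ guaranteed by the bounds; the actual errors may already be small enough for smaller $n$.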
Exercise 3.3.14. Simpson’s 3/8 rule is obtained by using cubic instead of quadratic approx-
imation. Derive the formula for this rule.
Note that the continuity was used critically in our argument for $A'(x) = f(x)$.
Note also that if $f(x)$ is only integrable (hence bounded), then $A(x)$ is still
continuous on $[a, b]$ by the same argument as in (3.1.1).
Example 3.4.1. Let $f(x)$ be a continuous function. To find the derivative of $\int_a^{x^2} f(t)dt$,
we note that the integral is a composition
\[ \int_a^{x^2} f(t)dt = A(x^2), \quad A(x) = \int_a^x f(t)dt. \]
Example 3.4.2. The function $f(x) = \int_0^x e^{t^2} dt$ cannot be expressed as combinations
of the usual elementary functions. Still, we know $f'(x) = e^{x^2}$. We also have
\[ \frac{d}{dx} \int_x^{x^2} e^{t^2} dt = \frac{d}{dx} \left( \int_0^{x^2} e^{t^2} dt - \int_0^x e^{t^2} dt \right) = 2xe^{x^4} - e^{x^2}. \]
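Since the integral has no elementary expression, a numerical check of the derivative formula is reassuring. The following Python sketch is an illustration under assumptions: the integral is evaluated by Simpson’s rule with 200 subintervals, the derivative is estimated by a central difference quotient with step $10^{-4}$, and the sample point $x_0 = 0.8$ is arbitrary. The names `simpson`, `g` and `g_prime_formula` are hypothetical.

```python
import math

def simpson(f, a, b, n=200):
    """Simpson's rule with n (even) subintervals; also works when b < a."""
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + i * h) for i in range(1, n, 2))
    total += 2 * sum(f(a + i * h) for i in range(2, n, 2))
    return h * total / 3

def g(x):
    """Numerical value of the integral of exp(t^2) from x to x^2."""
    return simpson(lambda t: math.exp(t * t), x, x * x)

def g_prime_formula(x):
    """Derivative predicted by the Fundamental Theorem and the chain rule."""
    return 2 * x * math.exp(x ** 4) - math.exp(x ** 2)

x0, step = 0.8, 1e-4
numeric = (g(x0 + step) - g(x0 - step)) / (2 * step)  # central difference
print(numeric, g_prime_formula(x0))
```

The two printed numbers should agree to about six digits.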
we have
\[ A(x) = \int_0^x \operatorname{sign}(x)dx = |x|, \quad A'(x) = \begin{cases} 1, & \text{if } x > 0, \\ \text{does not exist}, & \text{if } x = 0, \\ -1, & \text{if } x < 0. \end{cases} \]
We note that $A(x)$ is not differentiable at 0, exactly the place where the sign function
is not continuous. The example shows that the continuity assumption cannot be
dropped from the Fundamental Theorem.
. . . , [−5π, −4π], [−3π, −2π], [−π, π], [2π, 3π], [4π, 5π], . . . ,
This implies that $\mathrm{Si}(x)$ has local maxima at $\ldots, -6\pi, -4\pi, -2\pi, \pi, 3\pi, 5\pi, \ldots$, and
has local minima at $\ldots, -5\pi, -3\pi, -\pi, 2\pi, 4\pi, 6\pi, \ldots$. Moreover, we can also calculate
the second order derivative
\[ \mathrm{Si}''(x) = \begin{cases} \dfrac{x \cos x - \sin x}{x^2}, & \text{if } x \ne 0, \\ 0, & \text{if } x = 0, \end{cases} \]
Exercise 3.4.4. Study the monotone and convex properties, including the extrema and the
points of inflection.
1. $\int_0^x \frac{dt}{1 + t + t^2}$. 2. $\int_0^x \sin\frac{\pi t^2}{2}\,dt$. 3. $\frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} dt$.
Exercise 3.4.7. Prove that for a positive continuous function f (x) on (0, +∞), the function
Z x 2
tf (t)dt
0
g(x) = Z x
f (t)dt
0
is strictly increasing on (0, +∞).
Exercise 3.4.8. Discuss where $f(x)$ is not continuous and where $\int_0^x f(t)dt$ is not differentiable.
1. $f(x) = \begin{cases} x, & \text{if } x \ne 0, \\ 1, & \text{if } x = 0. \end{cases}$
2. $f(x) = \begin{cases} x, & \text{if } x > 0, \\ 1, & \text{if } x \le 0. \end{cases}$
3. $f(x) = \begin{cases} \left(x^2 \sin\frac{1}{x}\right)', & \text{if } x \ne 0, \\ 0, & \text{if } x = 0. \end{cases}$
The Fundamental Theorem of Calculus says that the signed area gives one antiderivative,
and can be used as $F(x)$ above
\[ \int f(x)dx = \int_a^x f(t)dt + C. \]
3.4. INDEFINITE INTEGRAL 199
Example 3.4.6. By
\[ (x^{p+1})' = (p + 1)x^p, \quad (\log |x|)' = \frac{1}{x}, \quad (e^x)' = e^x, \]
we get
\[ \int x^p dx = \begin{cases} \dfrac{x^{p+1}}{p + 1} + C, & \text{for } p \ne -1, \\ \log |x| + C, & \text{for } p = -1; \end{cases} \qquad \int e^x dx = e^x + C. \]
The antiderivatives of $\tan x$ and $\sec x$ are more complicated, and are given in Example 3.5.27.
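1. $\int \frac{\log |x|}{x}\,dx = \frac{1}{2}(\log |x|)^2 + C$.
2. $\int e^{ax} \cos bx\,dx = \frac{e^{ax}}{a^2 + b^2}(a \cos bx + b \sin bx) + C$.
3. $\int \cos(ax + b)dx = \frac{1}{a} \sin(ax + b) + C$.
4. $\int \sqrt{a^2 - x^2}\,dx = \frac{a^2}{2} \arcsin\frac{x}{a} + \frac{1}{2}x\sqrt{a^2 - x^2} + C$, $a > 0$.
5. $\int \frac{dx}{x^2 - a^2} = \frac{1}{2a} \log\left|\frac{x - a}{x + a}\right| + C$.
6. $\int \frac{dx}{\sqrt{x^2 + a}} = \log\left|x + \sqrt{x^2 + a}\right| + C$.
7. $\int \sqrt{x^2 + a}\,dx = \frac{1}{2}x\sqrt{x^2 + a} + \frac{a}{2} \log\left|x + \sqrt{x^2 + a}\right| + C$.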
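Formulae of this kind can be sanity-checked before attempting a proof, by differentiating the right side numerically. The following Python sketch is an illustration only: the parameter values $a = 2$, $b = 3$, the sample point $x_0 = 0.7$ and the step of the difference quotient are arbitrary assumptions, and a small error at one point is evidence, not a proof. It covers formulae 1, 2, 4, 5, 6 and 7.

```python
import math

a, b = 2.0, 3.0  # sample parameter values, assumed for illustration

# Each entry pairs a claimed antiderivative F with its integrand f.
pairs = {
    "1": (lambda x: 0.5 * math.log(abs(x)) ** 2,
          lambda x: math.log(abs(x)) / x),
    "2": (lambda x: math.exp(a * x) / (a * a + b * b)
                    * (a * math.cos(b * x) + b * math.sin(b * x)),
          lambda x: math.exp(a * x) * math.cos(b * x)),
    "4": (lambda x: a * a / 2 * math.asin(x / a) + 0.5 * x * math.sqrt(a * a - x * x),
          lambda x: math.sqrt(a * a - x * x)),
    "5": (lambda x: math.log(abs((x - a) / (x + a))) / (2 * a),
          lambda x: 1.0 / (x * x - a * a)),
    "6": (lambda x: math.log(abs(x + math.sqrt(x * x + a))),
          lambda x: 1.0 / math.sqrt(x * x + a)),
    "7": (lambda x: 0.5 * x * math.sqrt(x * x + a)
                    + a / 2 * math.log(abs(x + math.sqrt(x * x + a))),
          lambda x: math.sqrt(x * x + a)),
}

def derivative(F, x, h=1e-5):
    """Central difference approximation of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

x0 = 0.7
errors = {key: abs(derivative(F, x0) - f(x0)) for key, (F, f) in pairs.items()}
for key, err in errors.items():
    print(key, err)
```

All printed discrepancies should be tiny compared with the values of the integrands.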
Exercise 3.4.11. If $\int f(x)dx = F(x) + C$, then what is $\int f(ax + b)dx$? Apply your
conclusion to compute the integrals.
1. $\int \log |a + bx|\,dx$. 3. $\int \sec^2(3x - 1)dx$. 5. $\int \frac{dx}{\sqrt{x(1 - x)}}$.
2. $\int \sin(ax + b)dx$. 4. $\int \frac{dx}{\sqrt{1 - (x - 1)^2}}$. 6. $\int \frac{dx}{x^2 + 2x + 2}$.
Exercise 3.4.12. Find the antiderivative of $x(ax^2 + b)^p$. Then compute the integrals.
1. $\int x\sqrt{x^2 + 3}\,dx$. 2. $\int \frac{x\,dx}{x^2 + 1}$. 3. $\int \frac{x\,dx}{\sqrt{4 - x^2}}$.
One should not just mindlessly compute the antiderivative. Sometimes we need
to consider the meaning of antiderivative and question whether the answer makes
sense.
have antiderivative? Does the function have definite integral? What do you learn from
the example?
\[ (F(x) + G(x))' = F'(x) + G'(x) = f(x) + g(x), \quad (cF(x))' = cF'(x) = cf(x), \]
By the Newton-Leibniz formula, we get the linear property for the definite integral
\[ \begin{aligned} \int_a^b (f(x) + g(x))dx &= (F(b) + G(b)) - (F(a) + G(a)) \\ &= (F(b) - F(a)) + (G(b) - G(a)) = \int_a^b f(x)dx + \int_a^b g(x)dx, \\ \int_a^b cf(x)dx &= cF(b) - cF(a) = c(F(b) - F(a)) = c \int_a^b f(x)dx. \end{aligned} \]
\[ = \frac{1}{2}x^2 + \frac{2}{3}x^3 + \frac{1}{4}x^4 + C. \]
This further gives the definite integral
\[ \int_0^1 x(1 + x)^2 dx = \frac{1}{2}(1^2 - 0^2) + \frac{2}{3}(1^3 - 0^3) + \frac{1}{4}(1^4 - 0^4) = \frac{17}{12}. \]
On the other hand, it would be very complicated to compute $\int x(x + 1)^{10} dx$ by
the binomial expansion of $(1 + x)^{10}$. The following is much simpler
\[ \begin{aligned} \int x(x + 1)^{10} dx &= \int ((x + 1) - 1)(x + 1)^{10} dx = \int (x + 1)^{11} dx - \int (x + 1)^{10} dx \\ &= \frac{1}{12}(x + 1)^{12} - \frac{1}{11}(x + 1)^{11} + C = \frac{1}{12 \cdot 11}(11x - 1)(x + 1)^{11} + C. \end{aligned} \]
√
Z Z Z 2
1. x x + 1dx. 5. (x − 1)(x + 1)p dx. x−1
9. dx.
x
Z
x 2
Z Z
2. p
x(ax + b) dx. 6. dx. x−1
(x + 1)10 10. dx.
x2
x2 − x + 1
Z Z Z 2
2 p x−1
3. x (ax + b) dx. 7. dx. 11. dx.
(x + 1)10 x+1
x−1 (x − 1)2
Z Z Z
4
4. (x − 1)(x + 1) 3 dx. 8. √ dx. 12. dx.
x (x + 1)4
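\[ \frac{ax + b}{cx + d} = A + \frac{B}{cx + d}. \]
Then compute the antiderivatives of $\frac{ax + b}{cx + d}$ and $\left(\frac{ax + b}{cx + d}\right)^2$.
to get
\[ \int \frac{dx}{x^2 - a^2} = \frac{1}{2a} \log |x - a| - \frac{1}{2a} \log |x + a| + C = \frac{1}{2a} \log\left|\frac{x - a}{x + a}\right| + C. \]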
As noted in Example 3.1.4, we should not blindly use the Newton-Leibniz formula
in computing the definite integral. For example, we cannot get
\[ \int_0^2 \frac{dx}{x^2 - 1} = \frac{1}{2} \log\left|\frac{2 - 1}{2 + 1}\right| - \frac{1}{2} \log\left|\frac{0 - 1}{0 + 1}\right| = -\frac{1}{2} \log 3, \]
because the interval $[0, 2]$ contains 1, where the integrand approaches infinity.
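The failure can be made visible numerically. On $[0, 1 - \epsilon]$ the integrand is continuous, so the Newton-Leibniz formula does apply there. The following Python sketch (the function name `F` and the sample values of $\epsilon$ are assumptions made for illustration) evaluates $\int_0^{1-\epsilon} \frac{dx}{x^2 - 1}$ by the antiderivative for shrinking $\epsilon$, next to the value that a blind evaluation over $[0, 2]$ would produce.

```python
import math

def F(x, a=1.0):
    """Antiderivative (1/2a) log|(x - a)/(x + a)| of 1/(x^2 - a^2)."""
    return math.log(abs((x - a) / (x + a))) / (2 * a)

# Legitimate use: the integrand is continuous on [0, 1 - eps].
partial = {eps: F(1.0 - eps) - F(0.0) for eps in (1e-1, 1e-3, 1e-6)}
for eps, value in partial.items():
    print(eps, value)

# Blind use over [0, 2], ignoring the singularity at x = 1.
naive = F(2.0) - F(0.0)
print(naive, -0.5 * math.log(3.0))
```

The partial integrals decrease without bound as $\epsilon \to 0$, so the finite number $-\frac{1}{2} \log 3$ cannot be the area-type value of the integral over $[0, 2]$.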
x2 dx
Z
xdx
Z
(2x + 1)dx
Z
1. . 4. . 7. .
x2 − 1 x2 + 3x + 2 x2 + 1
1
x2 dx
Z
dx
Z
(2x + 1)dx
Z
2. . 5. . 8. .
x2 − 1 0 x2 + 3x + 2 (x2 + 1)(x2 + 4)
x2 dx x2 dx
Z Z Z
dx
3. . 6. . 9. .
2
x + 3x + 2 x2 + 3x + 2 (x2 + 1)(x2 + 4)
sin3 xdx.
2
1. cos x sin xdx. 4. 7. sin x cos 2xdx.
0
Z π Z Z π
2. sin2 x cos xdx. 5. cos x sin 2xdx. 8. | sin x cos 2x|dx.
0 0
Z Z Z π
2 2
3. cos xdx. 6. cot xdx. 9. | sin x − cos x|dx.
0
Example 3.5.5. For $f \ge g$ and $a \le b$, by the inequality (3.1.6) and the linearity of
definite integral, we have
\[ \int_a^b f(x)dx - \int_a^b g(x)dx = \int_a^b (f(x) - g(x))dx \ge 0. \]
Therefore we have
\[ f \ge g \implies \int_a^b f(x)dx \ge \int_a^b g(x)dx, \quad \text{for } a < b. \]
The inequality corresponds to Theorem 2.3.3 that uses the derivatives to compare
functions. However, it is more direct to get the inequality by using the non-negativity
of area.
If we apply the inequality to $-|f| \le f \le |f|$, then we get
\[ \int_a^b |f(x)|dx \ge \left| \int_a^b f(x)dx \right|, \quad \text{for } a < b. \]
Example 3.5.6 (Average). The average of a function $f$ on $[a, b]$ is $\frac{1}{b - a} \int_a^b f(x)dx$.
If $m \le f \le M$ on $[a, b]$, then
\[ m(b - a) = \int_a^b m\,dx \le \int_a^b f(x)dx \le \int_a^b M\,dx = M(b - a). \]
This implies that the average of $f$ lies between $m$ and $M$, which is consistent with
our intuition.
so that
√ sin t2 5 sin t2 1
t
|x| < δ =⇒ − t + = − T (t ) ≤ |t|7 .
2
t 6 t t
Therefore
√
2
Z x
6 2 5
Z x
x x sin t t
|x| < δ =⇒ F (x) −
+ =
−t+ dt ≤
|t|7 dt = |x|8 .
2 36 0 t 6 0 8
Example 3.5.8. Suppose $f(x)$ has second order derivative on $[a, b]$. We may take the
linear approximation at the middle point $c = \frac{a + b}{2}$. By the Lagrange form of the
remainder (Theorem 2.7.1), we get
\[ f(x) = f(c) + f'(c)(x - c) + \frac{f''(\bar{x})}{2}(x - c)^2, \]
3.5. PROPERTIES OF INTEGRATION 207
Let $K_2$ be the bound for the second order derivative. In other words, $|f''| \le K_2$ on
$[a, b]$. Then the right side
\[ \left| \int_a^b \frac{f''(\bar{x})}{2}(x - c)^2 dx \right| \le \int_a^b \frac{|f''(\bar{x})|}{2}(x - c)^2 dx \le \frac{K_2}{2} \int_a^b (x - c)^2 dx = \frac{K_2}{24}(b - a)^3. \]
Exercise 3.5.7. Show that the integration of n-th order approximation is (n + 1)-st order
approximation. Specifically, find high order approximation of function at 0.
1. $\int_0^x \frac{\cos t - 1}{t}\,dt$, order 5. 3. $\int_{-x^2}^0 \frac{e^t - 1}{t}\,dt$, order 5.
2. $\int_0^{\sqrt{x}} \frac{\sin t - t}{t^2}\,dt$, order 4. 4. $\int_{-x}^x \frac{\log(1 + t)}{t}\,dt$, order 7.
Exercise 3.5.8. Derive an estimation for $\left| \int_a^b f(x)dx - f(a)(b - a) \right|$ in terms of the bound
$K_1$ of $f'$ on $[a, b]$.
Exercise 3.5.9. Apply the estimation in Example 3.5.8 to each interval of a partition and
derive the error formula for the midpoint rule in Theorem 3.3.1.
Example 3.5.9. The antiderivative of the logarithmic function in Example 3.4.7 may
be derived by using the integration by parts (taking $F(x) = \log |x|$ and $G(x) = x$)
\[ \begin{aligned} \int \log |x|\,dx &= x \log |x| - \int x\,d\log |x| = x \log |x| - \int x(\log |x|)' dx \\ &= x \log |x| - \int dx = x \log |x| - x + C. \end{aligned} \]
Example 3.5.10. The integral in Example 3.5.1 can also be computed by using the
integration by parts
\[ \begin{aligned} \int x(x + 1)^{10} dx &= \frac{1}{11} \int x\,d(x + 1)^{11} && \text{(integrate the $(x + 1)^{10}$ part)} \\ &= \frac{1}{11}x(x + 1)^{11} - \frac{1}{11} \int (x + 1)^{11} dx && \text{(exchange two parts)} \\ &= \frac{1}{11}x(x + 1)^{11} - \frac{1}{12 \cdot 11}(x + 1)^{12} + C. \end{aligned} \]
\[ \begin{aligned} &= -x^2e^{-x} - 2 \int x\,de^{-x} = -x^2e^{-x} - 2xe^{-x} + 2 \int e^{-x} dx \\ &= -(x^2 + 2x + 2)e^{-x} + C. \end{aligned} \]
1. $\int (x^2 - 1)a^x dx$. 2. $\int (x + a^x)^2 dx$. 3. $\int \frac{xe^x dx}{(x + 1)^2}$.
Exercise 3.5.14. Compute $\int_0^1 x^n a^x dx$.
Exercise 3.5.15. For natural numbers $m, n$, show that $\int_0^1 x^m(1 - x)^n dx = \frac{m!\,n!}{(m + n + 1)!}$.
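The idea can be extended to product of $x^n$, $\sin ax$ and $\cos bx$ for various $a$ and $b$. By $\sin x \sin 2x = \frac{1}{2}(\cos x - \cos 3x)$, we have
\[ \begin{aligned} \int x \sin x \sin 2x\,dx &= \frac{1}{2} \int x(\cos x - \cos 3x)dx = \frac{1}{2} \int x\,d\left(\sin x - \frac{1}{3} \sin 3x\right) \\ &= \frac{1}{6}x(3 \sin x - \sin 3x) - \frac{1}{6} \int (3 \sin x - \sin 3x)dx \\ &= \frac{1}{2}x \sin x - \frac{1}{6}x \sin 3x + \frac{1}{2} \cos x - \frac{1}{18} \cos 3x + C. \end{aligned} \]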
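An example of the definite integral is
\[ \begin{aligned} \int_0^{\frac{\pi}{2}} x^2 \sin x\,dx &= -\int_0^{\frac{\pi}{2}} x^2 d\cos x = -\left(\frac{\pi}{2}\right)^2 \cos\frac{\pi}{2} + 0^2 \cos 0 + \int_0^{\frac{\pi}{2}} 2x \cos x\,dx \\ &= 2 \int_0^{\frac{\pi}{2}} x\,d\sin x = 2 \cdot \frac{\pi}{2} \sin\frac{\pi}{2} - 2 \cdot 0 \sin 0 - 2 \int_0^{\frac{\pi}{2}} \sin x\,dx \\ &= \pi + 2 \cos\frac{\pi}{2} - 2 \cos 0 = \pi - 2. \end{aligned} \]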
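A result obtained by two rounds of integration by parts is worth a numerical check. The following Python sketch is an illustration: it evaluates $\int_0^{\frac{\pi}{2}} x^2 \sin x\,dx$ by Simpson’s rule (the helper `simpson` and the choice of 200 subintervals are assumptions) and compares the result with $\pi - 2$.

```python
import math

def simpson(f, a, b, n=200):
    """Simpson's rule with n (even) equal subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + i * h) for i in range(1, n, 2))
    total += 2 * sum(f(a + i * h) for i in range(2, n, 2))
    return h * total / 3

numeric = simpson(lambda x: x * x * math.sin(x), 0.0, math.pi / 2)
closed_form = math.pi - 2
print(numeric, closed_form)
```

The two values should agree to about eight digits or better.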
We have
\[ I_0 = a^{-1} \int \cos bx\,de^{ax} = a^{-1}e^{ax} \cos bx - a^{-1} \int e^{ax} d\cos bx \]
\[ = xe^x \cos x - \frac{1}{2}e^x(\cos x + \sin x) + J_1. \]
Similarly, we have
\[ J_1 = xe^x \sin x - \frac{1}{2}e^x(-\cos x + \sin x) - I_1. \]
Solving the two equations, we get
\[ \begin{aligned} \int xe^x \cos x\,dx &= \frac{1}{2}xe^x(\cos x + \sin x) - \frac{1}{2}e^x \sin x + C, \\ \int xe^x \sin x\,dx &= \frac{1}{2}xe^x(-\cos x + \sin x) + \frac{1}{2}e^x \cos x + C. \end{aligned} \]
\[ \begin{aligned} &= -\cos^{m+1} x \sin^{n-1} x + \int (-m \cos^{m-1} x \sin^n x + (n - 1) \cos^{m+1} x \sin^{n-2} x) \cos x\,dx \\ &= -\cos^{m+1} x \sin^{n-1} x + \int (-m \cos^m x \sin^n x + (n - 1) \cos^m x(1 - \sin^2 x) \sin^{n-2} x)dx \end{aligned} \]
\[ I_{m,n} = -\frac{1}{m + n} \cos^{m+1} x \sin^{n-1} x + \frac{n - 1}{m + n} I_{m,n-2}, \quad m + n \ne 0. \]
The formula reduces the power of sine by 2. If we first integrate a copy of $\cos x$,
then we get another recursive relation that reduces the power of cosine by 2
\[ I_{m,n} = \frac{1}{m + n} \cos^{m-1} x \sin^{n+1} x + \frac{m - 1}{m + n} I_{m-2,n}, \quad m + n \ne 0. \]
On the other hand, we can also express $I_{m,n-2}$ and $I_{m-2,n}$ in terms of $I_{m,n}$. After
substituting $n$ by $n + 2$, we get recursive relations that increase the power by 2
\[ \begin{aligned} I_{m,n} &= \frac{1}{n + 1} \cos^{m+1} x \sin^{n+1} x + \frac{m + n + 2}{n + 1} I_{m,n+2}, \quad n \ne -1, \\ &= -\frac{1}{m + 1} \cos^{m+1} x \sin^{n+1} x + \frac{m + n + 2}{m + 1} I_{m+2,n}, \quad m \ne -1. \end{aligned} \]
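Here is a concrete example of using the recursive relation
\[ \begin{aligned} \int \cos^4 x \sin^6 x\,dx &= I_{4,6} = -\frac{1}{4 + 6} \cos^{4+1} x \sin^{6-1} x + \frac{6 - 1}{4 + 6} I_{4,6-2} \\ &= -\frac{1}{10} \cos^5 x \sin^5 x + \frac{5}{10} I_{4,4} \\ &= -\frac{1}{10} \cos^5 x \sin^5 x + \frac{5}{10} \left( -\frac{1}{8} \cos^5 x \sin^3 x + \frac{3}{8} I_{4,2} \right) \\ &= -\cos^5 x \left( \frac{1}{10} \sin^5 x + \frac{5}{10 \cdot 8} \sin^3 x \right) + \frac{5 \cdot 3}{10 \cdot 8} \left( -\frac{1}{6} \cos^5 x \sin x + \frac{1}{6} I_{4,0} \right) \\ &= -\cos^5 x \left( \frac{1}{10} \sin^5 x + \frac{5}{10 \cdot 8} \sin^3 x + \frac{5 \cdot 3}{10 \cdot 8 \cdot 6} \sin x \right) + \frac{5 \cdot 3 \cdot 1}{10 \cdot 8 \cdot 6} \left( \frac{1}{4} \cos^3 x \sin x + \frac{3}{4} I_{2,0} \right) \\ &= -\cos^5 x \left( \frac{1}{10} \sin^5 x + \frac{5}{10 \cdot 8} \sin^3 x + \frac{5 \cdot 3}{10 \cdot 8 \cdot 6} \sin x \right) \\ &\quad + \frac{5 \cdot 3 \cdot 1}{10 \cdot 8 \cdot 6} \cdot \frac{1}{4} \cos^3 x \sin x + \frac{5 \cdot 3 \cdot 1}{10 \cdot 8 \cdot 6} \cdot \frac{3}{4} \left( \frac{1}{2} \cos x \sin x + \frac{1}{2} I_{0,0} \right) \\ &= -\cos^5 x \left( \frac{1}{10} \sin^5 x + \frac{5}{10 \cdot 8} \sin^3 x + \frac{5 \cdot 3}{10 \cdot 8 \cdot 6} \sin x \right) \\ &\quad + \frac{5 \cdot 3 \cdot 1}{10 \cdot 8 \cdot 6} \left( \frac{1}{4} \cos^3 x + \frac{3 \cdot 1}{4 \cdot 2} \cos x \right) \sin x + \frac{5 \cdot 3 \cdot 1}{10 \cdot 8 \cdot 6} \cdot \frac{3 \cdot 1}{4 \cdot 2}\,x + C. \end{aligned} \]
Here is another example that requires increasing the power
\[ \begin{aligned} \int \frac{\sin^2 x}{\cos^4 x}\,dx &= I_{-4,2} = -\frac{1}{-4 + 1} \cos^{-4+1} x \sin^{2+1} x + \frac{-4 + 2 + 2}{-4 + 1} I_{-4+2,2} \\ &= \frac{1}{3} \frac{\sin^3 x}{\cos^3 x} + C. \end{aligned} \]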
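Applying the recursive relation to the definite integral, we have
\[ \int_0^{\frac{\pi}{2}} \sin^n x\,dx = -\frac{1}{n} \cos x \sin^{n-1} x \Big|_0^{\frac{\pi}{2}} + \frac{n - 1}{n} \int_0^{\frac{\pi}{2}} \sin^{n-2} x\,dx = \frac{n - 1}{n} \int_0^{\frac{\pi}{2}} \sin^{n-2} x\,dx. \]
Then we get
\[ \int_0^{\frac{\pi}{2}} \sin^n x\,dx = \frac{n - 1}{n} \frac{n - 3}{n - 2} \cdots \int_0^{\frac{\pi}{2}} \sin^{0 \text{ or } 1} x\,dx = \begin{cases} \dfrac{(n - 1)!!}{n!!} \dfrac{\pi}{2}, & \text{if $n$ is even}, \\ \dfrac{(n - 1)!!}{n!!}, & \text{if $n$ is odd}. \end{cases} \]
Here $\sin^{0 \text{ or } 1} x$ takes power 0 for even $n$ and takes power 1 for odd $n$. Moreover, we
used the double factorial
\[ n!! = n(n - 2)(n - 4) \cdots = \begin{cases} 2k(2k - 2)(2k - 4) \cdots 4 \cdot 2, & \text{if } n = 2k, \\ (2k + 1)(2k - 1)(2k - 3) \cdots 3 \cdot 1, & \text{if } n = 2k + 1. \end{cases} \]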
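The double factorial formula can be compared with a direct numerical integration. The following Python sketch is an illustration under assumptions: the names `double_factorial`, `wallis` and `simpson` are hypothetical, the convention $0!! = (-1)!! = 1$ is adopted so that the cases $n = 0$ and $n = 1$ are covered, and the integrals are evaluated by Simpson’s rule with 1000 subintervals.

```python
import math

def double_factorial(n):
    """n!! = n (n-2) (n-4) ..., with the convention 0!! = (-1)!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def wallis(n):
    """Closed form for the integral of sin^n x over [0, pi/2]."""
    ratio = double_factorial(n - 1) / double_factorial(n)
    return ratio * math.pi / 2 if n % 2 == 0 else ratio

def simpson(f, a, b, n=1000):
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + i * h) for i in range(1, n, 2))
    total += 2 * sum(f(a + i * h) for i in range(2, n, 2))
    return h * total / 3

gaps = []
for n in range(0, 9):
    numeric = simpson(lambda x: math.sin(x) ** n, 0.0, math.pi / 2)
    gaps.append(abs(numeric - wallis(n)))
    print(n, numeric, wallis(n))
```

For each $n$ the numerical integral and the closed form should agree to many digits.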
Exercise 3.5.18. Find the recursive relations for $\int x^p \cos ax\,dx$ and $\int x^p \sin ax\,dx$. Then
compute the integral.
1. $\int x \cos^2 x\,dx$. 3. $\int x \cos^2 x \sin 2x\,dx$. 5. $\int_0^\pi x^6 \cos x\,dx$.
2. $\int x^3 \cos^2 x\,dx$. 4. $\int x^3 \cos^2 x \sin 2x\,dx$. 6. $\int_0^{\frac{\pi}{2}} x^5 \sin 2x\,dx$.
Exercise 3.5.19. Find the recursive relations for $\int x^p e^{ax} \cos bx\,dx$ and $\int x^p e^{ax} \sin bx\,dx$.
Then compute the integral.
1. $\int x^2 e^{-x} \sin 3x\,dx$. 2. $\int x^2 2^x \cos x\,dx$. 3. $\int x^3 e^x \cos^2 x\,dx$.
Exercise 3.5.21. Show that $\int_0^{\frac{\pi}{2}} \sin^{2m} x \cos^{2n} x\,dx = \frac{\pi (2m)!(2n)!}{2^{2m+2n+1} m!\,n!\,(m + n)!}$ for natural
numbers $m, n$. Can you find $\int_0^{\frac{\pi}{2}} \sin^m x \cos^n x\,dx$?
Exercise 3.5.22. Use $(\tan x)' = \sec^2 x = \tan^2 x + 1$ to derive the recursive formula for
$\int \sec^m x \tan^n x\,dx$ similar to Example 3.5.14 and then find the value of $\int_0^{\frac{\pi}{4}} \tan^{2n} x\,dx$.
We have
\[ \begin{aligned} I_p &= x(ax^2 + bx + c)^p - \int x\,d(ax^2 + bx + c)^p \\ &= x(ax^2 + bx + c)^p - \int px(2ax + b)(ax^2 + bx + c)^{p-1} dx. \end{aligned} \]
\[ \begin{aligned} &- B \int (ax^2 + bx + c)'(ax^2 + bx + c)^{p-1} dx - C \int (ax^2 + bx + c)^{p-1} dx \\ &= x(ax^2 + bx + c)^p - AI_p - \frac{B}{p}(ax^2 + bx + c)^p - CI_{p-1}. \end{aligned} \]
This gives us the recursive relation
\[ I_p = \frac{1}{(2p + 1)2a}(2ax + b)(ax^2 + bx + c)^p - \frac{p(b^2 - 4ac)}{(2p + 1)2a} I_{p-1}, \quad p \ne -\frac{1}{2}. \]
On the other hand, we may also express $I_{p-1}$ in terms of $I_p$. After substituting $p$ by
$p + 1$, we get
\[ I_p = \frac{1}{(p + 1)(b^2 - 4ac)}(2ax + b)(ax^2 + bx + c)^{p+1} - \frac{(2p + 3)2a}{(p + 1)(b^2 - 4ac)} I_{p+1}, \quad p \ne -1. \]
For the special case
\[ I_p = \int (ax^2 + b)^p dx, \quad a, b \ne 0, \]
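For the special cases of $p = -\frac{1}{2}, -1$, $I_p$ is given by Exercise 3.4.10 (and will be
derived in Examples 3.5.30, 3.5.31, 3.5.32)
\[ \begin{aligned} \int \frac{dx}{\sqrt{a^2 - x^2}} &= \arcsin\frac{x}{a} + C, \quad a > 0, \\ \int \frac{dx}{\sqrt{x^2 + a}} &= \log\left|x + \sqrt{x^2 + a}\right| + C, \\ \int \frac{dx}{x^2 - a^2} &= \frac{1}{2a} \log\left|\frac{x - a}{x + a}\right| + C, \\ \int \frac{dx}{x^2 + a^2} &= \frac{1}{a} \arctan\frac{x}{a} + C. \end{aligned} \]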
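Then the recursive relations can be used to compute $I_p$ when $p$ is an integer or a
half integer. For example, we have
\[ \begin{aligned} \int \sqrt{a^2 - x^2}\,dx &= I_{\frac{1}{2}} = \frac{1}{2 \cdot \frac{1}{2} + 1} x(a^2 - x^2)^{\frac{1}{2}} + \frac{2 \cdot \frac{1}{2} a^2}{2 \cdot \frac{1}{2} + 1} I_{\frac{1}{2} - 1} \\ &= \frac{1}{2}x\sqrt{a^2 - x^2} + \frac{a^2}{2} \arcsin\frac{x}{a} + C, \\ \int (a^2 - x^2)^{\frac{3}{2}} dx &= I_{\frac{3}{2}} = \frac{1}{2 \cdot \frac{3}{2} + 1} x(a^2 - x^2)^{\frac{3}{2}} + \frac{2 \cdot \frac{3}{2} a^2}{2 \cdot \frac{3}{2} + 1} I_{\frac{3}{2} - 1} \\ &= \frac{1}{4}x(a^2 - x^2)^{\frac{3}{2}} + \frac{3a^2}{4} \left( \frac{1}{2}x\sqrt{a^2 - x^2} + \frac{a^2}{2} \arcsin\frac{x}{a} \right) + C \\ &= -\frac{1}{8}x(2x^2 - 5a^2)\sqrt{a^2 - x^2} + \frac{3a^4}{8} \arcsin\frac{x}{a} + C, \\ \int \frac{dx}{(x^2 + a^2)^2} &= I_{-2} = -\frac{1}{2(-2 + 1)a^2} x(x^2 + a^2)^{-2+1} + \frac{2(-2) + 3}{2(-2 + 1)a^2} I_{-2+1} \\ &= \frac{x}{2a^2(x^2 + a^2)} + \frac{1}{2a^3} \arctan\frac{x}{a} + C. \end{aligned} \]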
Exercise 3.5.25. Combine the ideas of Exercise 3.4.11 and Example 3.5.15 to compute the
integral.
1. $\int \frac{dx}{(x^2 + 2x + 2)^2}$. 3. $\int \frac{dx}{(x^2 + 2x + 2)^{\frac{3}{2}}}$. 5. $\int \sqrt{x(1 - x)}\,dx$.
2. $\int \frac{dx}{\sqrt{x^2 + 2x + 2}}$. 4. $\int (x^2 + 2x + 2)^{\frac{3}{2}} dx$. 6. $\int \frac{dx}{(x(1 - x))^{\frac{3}{2}}}$.
If we use the differential notation $d\phi(x) = \phi'(x)dx$, then the equality becomes
\[ \int f(\phi(x))d\phi(x) = \int f(y)dy \Big|_{y=\phi(x)}. \]
The right side means computing the antiderivative of the function of $y$ first, and
then substituting $y = \phi(x)$ into the antiderivative. This is the change of variable
formula. By the Newton-Leibniz formula, we further get the change of variable formula
for definite integral
\[ \int_a^b f(\phi(x))\phi'(x)dx = \int_a^b f(\phi(x))d\phi(x) = \int_{\phi(a)}^{\phi(b)} f(y)dy. \]
Example 3.5.16. If $\int f(y)dy = F(y) + C$, then by letting $y = ax + b$, we have
\[ \int f(ax + b)dx = \frac{1}{a} \int f(ax + b)d(ax + b) = \frac{1}{a}F(ax + b) + C. \]
For example,
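The idea is a “mini-integration” of $x\,dx$ that can be expressed more clearly by writing
\[ \int xe^{x^2} dx = \frac{1}{2} \int e^{x^2} d(x^2) = \frac{1}{2}e^{x^2} + C. \]
\[ \int \frac{x\,dx}{x^4 + a^4} = \frac{1}{2} \int \frac{d(x^2)}{(x^2)^2 + a^4} = \frac{1}{2a^2} \arctan\frac{x^2}{a^2} + C. \]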
Example 3.5.18. The integral in Example 3.5.1 was computed in Example 3.5.10
again by using the integration by parts. The integral can also be computed by
change of variable.
Let $y = x + 1$. Then
\[ \int x(x + 1)^{10} dx = \int (y - 1)y^{10} dy = \int (y^{11} - y^{10})dy = \frac{1}{12}y^{12} - \frac{1}{11}y^{11} + C. \]
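\[ \begin{aligned} &= x + 1 - 8 \log |x + 1| - 24(x + 1)^{-1} + 16(x + 1)^{-2} - \frac{16}{3}(x + 1)^{-3} + C \\ &= x - \frac{8(9x^2 + 12x + 5)}{3(x + 1)^3} - 8 \log |x + 1| + C. \end{aligned} \]
Note that the second $C$ is the first $C$ plus 1.
Compare the above with the computation of definite integral
\[ \begin{aligned} \int_0^1 \left( \frac{x - 1}{x + 1} \right)^4 dx &\overset{y = x + 1}{=} \int_1^2 \frac{(y - 2)^4}{y^4}\,dy \\ &= \int_1^2 (1 - 4 \cdot 2y^{-1} + 6 \cdot 2^2 y^{-2} - 4 \cdot 2^3 y^{-3} + 2^4 y^{-4})dy \\ &= \left( y - 8 \log y - 24y^{-1} + 16y^{-2} - \frac{16}{3}y^{-3} \right) \Big|_1^2 \\ &= 1 - 8 \log 2 - 24\left(\frac{1}{2} - 1\right) + 16\left(\frac{1}{4} - 1\right) - \frac{16}{3}\left(\frac{1}{8} - 1\right) \\ &= \frac{17}{3} - 8 \log 2. \end{aligned} \]
Note that the evaluation is done by using the new variable $y$ instead of the old $x$.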
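The change of variable for the definite integral can be checked numerically from both sides. The following Python sketch is an illustration: the helper `simpson`, the number of subintervals, and the variable names are assumptions. It integrates the original integrand over $[0, 1]$ in the variable $x$, the transformed integrand over $[1, 2]$ in the variable $y = x + 1$, and compares both with $\frac{17}{3} - 8 \log 2$.

```python
import math

def simpson(f, a, b, n=1000):
    """Simpson's rule with n (even) equal subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + i * h) for i in range(1, n, 2))
    total += 2 * sum(f(a + i * h) for i in range(2, n, 2))
    return h * total / 3

# Old variable x on [0, 1].
in_x = simpson(lambda x: ((x - 1) / (x + 1)) ** 4, 0.0, 1.0)
# New variable y = x + 1 on [1, 2]; the limits change with the variable.
in_y = simpson(lambda y: (y - 2) ** 4 / y ** 4, 1.0, 2.0)
closed_form = 17.0 / 3.0 - 8.0 * math.log(2.0)
print(in_x, in_y, closed_form)
```

All three numbers should agree, which reflects that the limits $0, 1$ for $x$ correspond to the limits $1, 2$ for $y$.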
Example 3.5.19. The integrals of inverse trigonometric functions can also be computed
by combining integration by parts and change of variable
\[
\begin{aligned}
\int \arcsin x\,dx &= x\arcsin x - \int \frac{x\,dx}{\sqrt{1-x^2}} = x\arcsin x + \frac{1}{2}\int \frac{d(1-x^2)}{\sqrt{1-x^2}} \\
&= x\arcsin x + \sqrt{1-x^2} + C.
\end{aligned}
\]
Alternatively, we may simply introduce the trigonometric function as the new variable.
For example, by $y = \arcsin x$, $x = \sin y$, we have
\[
\begin{aligned}
\int \arcsin x\,dx &= \int y\,d(\sin y) = y\sin y - \int \sin y\,dy \\
&= y\sin y + \cos y + C = x\arcsin x + \sqrt{1-x^2} + C.
\end{aligned}
\]
Note that $\cos y = \sqrt{1-x^2}$ is non-negative because $x \in [-1, 1]$ and $y \in [-\frac{\pi}{2}, \frac{\pi}{2}]$. The
integration by parts used in both computations is essentially the same.
The idea for integrating $\arcsin x$ can also be used to compute $\int x^m(\arcsin x)^n\,dx$
and $\int x^m(\arctan x)^n\,dx$.
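Example 3.5.20. To compute $\int \frac{dx}{\sqrt{e^x+a}}$, we introduce
\[
y = \sqrt{e^x+a}, \quad y^2 = e^x + a, \quad 2y\,dy = e^x\,dx.
\]
Then
\[
\int \frac{dx}{\sqrt{e^x+a}} = \int \frac{1}{y}\cdot\frac{2y\,dy}{e^x} = \int \frac{2\,dy}{y^2-a}.
\]
By Examples 3.4.9 and 3.5.2, we have
\[
\int \frac{dx}{\sqrt{e^x+a}} =
\begin{cases}
\dfrac{1}{\sqrt{a}}\log\left|\dfrac{y-\sqrt{a}}{y+\sqrt{a}}\right| + C = \dfrac{1}{\sqrt{a}}\log\dfrac{\sqrt{e^x+a}-\sqrt{a}}{\sqrt{e^x+a}+\sqrt{a}} + C, & \text{if } a > 0, \\[2ex]
\dfrac{2}{\sqrt{-a}}\arctan\dfrac{y}{\sqrt{-a}} + C = \dfrac{2}{\sqrt{-a}}\arctan\sqrt{-\dfrac{e^x}{a}-1} + C, & \text{if } a < 0.
\end{cases}
\]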
Therefore
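\[
\begin{aligned}
I &= \frac{\pi}{2}\int_0^{\pi} \frac{\sin y}{1+\cos^2 y}\,dy = -\frac{\pi}{2}\int_{\cos 0}^{\cos \pi} \frac{dz}{1+z^2} \\
&= \frac{\pi}{2}\int_{-1}^{1} \frac{dz}{1+z^2} = \frac{\pi}{2}(\arctan 1 - \arctan(-1)) = \frac{\pi^2}{4}.
\end{aligned}
\]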
Note that the computation of the definite integral makes use of the new variable z
only. There is no need to go back to the original variable x.
√ √ √ ex dx
Z Z Z
x x
2. e dx. 4. xe dx. 6. .
1 + ex
7. $\int \frac{dx}{1+e^x}$. 8. $\int \frac{dx}{e^x+e^{-x}}$. 9. $\int \sqrt{e^x+a}\,dx$.
Exercise 3.5.31. Find a recursive relation for $\int (e^x+a)^p\,dx$. Then compute $\int (e^x+a)^{5/2}\,dx$
and $\int \frac{dx}{(e^x+a)^3}$.
Exercise 3.5.35. Explain why we cannot use the change of variable $y = \frac{1}{x}$ to compute the
integral $\int_{-1}^{1} \frac{dx}{1+x^2}$.
Exercise 3.5.36. Suppose $f$ is continuous on an open interval containing $[a, b]$. Find the
derivative $\frac{d}{dt}\int_a^b f(x+t)\,dx$.
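Example 3.5.22. To compute $\int \frac{dx}{1+\sqrt{x-1}}$, we simply let $y = \sqrt{x-1}$. Then $x = y^2 + 1$, $dx = 2y\,dy$, and
\[
\begin{aligned}
\int \frac{dx}{1+\sqrt{x-1}} &= \int \frac{2y\,dy}{1+y} = 2\int \left(1 - \frac{1}{1+y}\right)dy \\
&= 2y - 2\log(1+y) + C = 2\sqrt{x-1} - 2\log(1+\sqrt{x-1}) + C.
\end{aligned}
\]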
3.5. PROPERTIES OF INTEGRATION 223
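Example 3.5.23. By taking $x = y^6$, we get rid of the square root and cube root at
the same time.
\[
\begin{aligned}
\int \frac{dx}{\sqrt{x}+\sqrt[3]{x}} &\overset{x=y^6}{=} \int \frac{6y^5\,dy}{y^3+y^2} = 6\int \frac{y^3\,dy}{y+1} \\
&= 6\int \frac{(y^3+1)-1}{y+1}\,dy = 6\int \left(y^2 - y + 1 - \frac{1}{1+y}\right)dy \\
&= 2y^3 - 3y^2 + 6y - 6\log(1+y) + C \\
&= 2\sqrt{x} - 3\sqrt[3]{x} + 6\sqrt[6]{x} - 6\log(1+\sqrt[6]{x}) + C.
\end{aligned}
\]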
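Example 3.5.24. To compute $\int \frac{dx}{\sqrt{x+1}+\sqrt{x}+1}$, we introduce
\[
y = \sqrt{x+1} + \sqrt{x}.
\]
Then
\[
\frac{1}{y} = \sqrt{x+1} - \sqrt{x}, \quad y + \frac{1}{y} = 2\sqrt{x+1}, \quad y - \frac{1}{y} = 2\sqrt{x},
\]
and
\[
x = \frac{1}{4}\left(y - \frac{1}{y}\right)^2, \quad dx = \frac{1}{2}\left(y - \frac{1}{y^3}\right)dy.
\]
Therefore
\[
\begin{aligned}
\int \frac{dx}{\sqrt{x+1}+\sqrt{x}+1} &= \int \frac{\frac{1}{2}\left(y - \frac{1}{y^3}\right)dy}{y+1} = \frac{1}{2}\int \frac{(y-1)(y+1)(y^2+1)}{(y+1)y^3}\,dy \\
&= \frac{1}{2}\int \left(1 - \frac{1}{y} + \frac{1}{y^2} - \frac{1}{y^3}\right)dy \\
&= \frac{1}{2}\left(y - \log|y| - \frac{1}{y} + \frac{1}{2y^2}\right) + C \\
&= -\frac{1}{2}\log(\sqrt{x+1}+\sqrt{x}) + \sqrt{x} + \frac{1}{4}(\sqrt{x+1}-\sqrt{x})^2 + C.
\end{aligned}
\]
In the last step we used $\frac{1}{2}\left(y - \frac{1}{y}\right) = \sqrt{x}$ and $\frac{1}{y} = \sqrt{x+1} - \sqrt{x}$.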
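\[
y = \sqrt{\frac{x}{1-x}}, \quad x = \frac{y^2}{1+y^2}, \quad dx = \frac{2y}{(1+y^2)^2}\,dy,
\]
and get
\[
\begin{aligned}
\int \sqrt{\frac{x}{1-x}}\,dx &= \int y\,\frac{2y}{(1+y^2)^2}\,dy = 2\int \left(\frac{1}{1+y^2} - \frac{1}{(1+y^2)^2}\right)dy \\
&= -\frac{y}{1+y^2} + \arctan y + C = -\sqrt{x(1-x)} + \arctan\sqrt{\frac{x}{1-x}} + C.
\end{aligned}
\]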
Exercise 3.5.39. Compute the integrals in Example 3.5.24 and 3.5.25 by using change of
variable similar to Example 3.5.26.
Example 3.5.27. We can use $(\cos x)' = -\sin x$, $(\sin x)' = \cos x$ and $\cos^2 x + \sin^2 x = 1$
to calculate the antiderivative of $\cos^m x\sin^n x$, in which either $m$ or $n$ is odd.
\[
\begin{aligned}
\int \sin^3 x\,dx &= -\int \sin^2 x\,d\cos x = \int (\cos^2 x - 1)\,d\cos x \\
&= \frac{1}{3}\cos^3 x - \cos x + C, \\
\int \cos^4 x\sin^5 x\,dx &= -\int \cos^4 x\sin^4 x\,d\cos x = -\int \cos^4 x(1-\cos^2 x)^2\,d\cos x \\
&= -\int (\cos^4 x - 2\cos^6 x + \cos^8 x)\,d\cos x \\
&= -\frac{1}{5}\cos^5 x + \frac{2}{7}\cos^7 x - \frac{1}{9}\cos^9 x + C, \\
\int \tan x\,dx &= \int \frac{\sin x}{\cos x}\,dx = -\int \frac{d\cos x}{\cos x} = -\log|\cos x| + C, \\
\int \sec x\,dx &= \int \frac{dx}{\cos x} = \int \frac{d\sin x}{\cos^2 x} = \int \frac{d\sin x}{1-\sin^2 x} \\
&= \frac{1}{2}\log\frac{1+\sin x}{1-\sin x} + C = \frac{1}{2}\log\frac{(1+\sin x)^2}{1-\sin^2 x} + C = \log\frac{1+\sin x}{|\cos x|} + C \\
&= \log|\sec x + \tan x| + C.
\end{aligned}
\]
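The four antiderivatives of Example 3.5.27 can be spot-checked numerically by differentiating them with a central difference. This is only an illustrative sketch; the names `derivative`, `pairs`, `max_error` and the sample points are assumptions made here.

```python
import math

def derivative(F, x, h=1e-5):
    """Central difference approximation of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

# (claimed antiderivative, integrand) pairs from Example 3.5.27.
pairs = [
    (lambda x: math.cos(x) ** 3 / 3 - math.cos(x),
     lambda x: math.sin(x) ** 3),
    (lambda x: -math.cos(x) ** 5 / 5 + 2 * math.cos(x) ** 7 / 7 - math.cos(x) ** 9 / 9,
     lambda x: math.cos(x) ** 4 * math.sin(x) ** 5),
    (lambda x: -math.log(abs(math.cos(x))),
     lambda x: math.tan(x)),
    (lambda x: math.log(abs(1 / math.cos(x) + math.tan(x))),
     lambda x: 1 / math.cos(x)),
]

def max_error(points=(0.3, 0.7, 1.1)):
    """Largest gap between F'(x) and the integrand over the sample points."""
    return max(abs(derivative(F, x) - f(x)) for F, f in pairs for x in points)

print(max_error())
```

A small printed value (limited only by the finite difference step) supports the formulas; it is of course not a proof.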
Example 3.5.28. Similar to Example 3.5.27, we can also use $(\tan x)' = \sec^2 x$, $(\sec x)' = \sec x\tan x$
and $\sec^2 x = 1 + \tan^2 x$ to calculate the antiderivative of $\sec^m x\tan^n x$.
\[
\begin{aligned}
\int \sec x\tan^3 x\,dx &= \int \tan^2 x\,d\sec x = \int (\sec^2 x - 1)\,d\sec x = \frac{1}{3}\sec^3 x - \sec x + C, \\
\int \sec^4 x\,dx &= \int (\tan^2 x + 1)\,d\tan x = \frac{1}{3}\tan^3 x + \tan x + C, \\
\int \tan^4 x\,dx &= \int (\sec^2 x - 1)^2\,dx = \int (\sec^4 x - 2\sec^2 x + 1)\,dx \\
&= \frac{1}{3}\tan^3 x + \tan x - 2\tan x + x + C \\
&= \frac{1}{3}\tan^3 x - \tan x + x + C.
\end{aligned}
\]
The following is computed in Example 3.5.14 by a more complicated method
\[
\int \frac{\sin^2 x}{\cos^4 x}\,dx = \int \sec^2 x\tan^2 x\,dx = \int \tan^2 x\,d\tan x = \frac{1}{3}\tan^3 x + C.
\]
Example 3.5.29. The method of Example 3.5.28 cannot be directly applied to the
antiderivatives of secn x and tann x for odd n. Instead, the idea of Example 3.5.27
can be used.
Using the integration by parts, we have
\[
\begin{aligned}
\int \sec^3 x\,dx &= \int \sec x\,d\tan x = \sec x\tan x - \int \tan x\,d\sec x \\
&= \sec x\tan x - \int \tan^2 x\sec x\,dx = \sec x\tan x - \int (\sec^2 x - 1)\sec x\,dx \\
&= \sec x\tan x - \int \sec^3 x\,dx + \int \sec x\,dx.
\end{aligned}
\]
2. $\int \tan^3 x\,dx$. 5. $\int \cot^6 x\csc^4 x\,dx$. 8. $\int \tan^5 x\sec^7 x\,dx$.
3. $\int \tan^6 x\sec^4 x\,dx$. 6. $\int \tan^2 x\sec x\,dx$. 9. $\int \tan^3 x\cos^2 x\,dx$.
Z Z Z
5 4 4 6
10. cot x sin xdx. 11. csc x cot xdx. 12. x tan x sec xdx.
In fact, the quadratic function has two real roots, and it is more direct to calculate
the integral by using
\[
ax^2 + bx + c = a(x - x_1)(x - x_2), \quad x_1, x_2 = \frac{-b \pm \sqrt{b^2-4ac}}{2a},
\]
and the idea of Examples 3.5.2 and 3.5.3.
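\[
\int \frac{dx}{\sqrt{ax^2+bx+c}} =
\begin{cases}
\dfrac{1}{\sqrt{a}}\log\left|\dfrac{2ax+b}{2\sqrt{a}} + \sqrt{ax^2+bx+c}\right| + C, & \text{if } a > 0, \\[2ex]
\dfrac{1}{\sqrt{-a}}\arcsin\dfrac{-2ax-b}{\sqrt{b^2-4ac}} + C, & \text{if } a < 0.
\end{cases}
\]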
1
Exercise 3.5.51. Use the change of variable y = x ± to compute the integral.
x
x2 + 1
Z
dx
Z
1. dx. 3. 4
dx.
x4 + 1 x +1
Z 2
Z
x2 − 1 1 1
2. dx. 4. 1+x− ex+ x dx.
x4 + 1 1 x
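\[
\frac{x^2-2x+3}{x(x+1)(x+2)} = \frac{A}{x} + \frac{B}{x+1} + \frac{C}{x+2}.
\]
Therefore
\[
\begin{aligned}
\int \frac{x^2-2x+3}{x(x+1)(x+2)}\,dx &= \int \left(\frac{3}{2x} - \frac{6}{x+1} + \frac{11}{2(x+2)}\right)dx \\
&= \frac{3}{2}\log|x| - 6\log|x+1| + \frac{11}{2}\log|x+2| + C.
\end{aligned}
\]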
2 2
Example 3.6.2. If some real root of the denominator has multiplicity, then we need
more sophisticated postulation. For example, to integrate $\frac{x^2-2x+3}{x(x+1)^3}$, we postulate
\[
\frac{x^2-2x+3}{x(x+1)^3} = \frac{A}{x} + \frac{B}{x+1} + \frac{C}{(x+1)^2} + \frac{D}{(x+1)^3}.
\]
This is the same as
\[
x^2 - 2x + 3 = A(x+1)^3 + Bx(x+1)^2 + Cx(x+1) + Dx.
\]
Taking various values, we get
\[
\begin{aligned}
x = 0 &: \ 3 = A, & x = -1 &: \ 6 = -D, \\
(\text{coefficient of}) \ x^3 &: \ 0 = A + B, & \left.\frac{d}{dx}\right|_{x=-1} &: \ -4 = D - C.
\end{aligned}
\]
Therefore $A = 3$, $B = -3$, $C = -2$, $D = -6$, and ($C$ below means the general
constant, and is different from the coefficient $C = -2$ above)
\[
\begin{aligned}
\int \frac{x^2-2x+3}{x(x+1)^3}\,dx &= \int \left(\frac{3}{x} - \frac{3}{x+1} - \frac{2}{(x+1)^2} - \frac{6}{(x+1)^3}\right)dx \\
&= 3\log\left|\frac{x}{x+1}\right| + \frac{2}{x+1} + \frac{3}{(x+1)^2} + C \\
&= 3\log\left|\frac{x}{x+1}\right| + \frac{2x+5}{(x+1)^2} + C.
\end{aligned}
\]
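The coefficients of Example 3.6.2 can be double-checked with exact rational arithmetic. The sketch below (function names `original` and `decomposed` and the sample points are illustrative assumptions) compares the integrand with the postulated sum at several points. After clearing denominators both sides are polynomials of degree at most 3, so agreement at more than four points already forces the identity.

```python
from fractions import Fraction

def original(x):
    """The integrand (x^2 - 2x + 3) / (x (x + 1)^3)."""
    return (x * x - 2 * x + 3) / (x * (x + 1) ** 3)

def decomposed(x, A=3, B=-3, C=-2, D=-6):
    """The postulated sum A/x + B/(x+1) + C/(x+1)^2 + D/(x+1)^3."""
    return A / x + B / (x + 1) + C / (x + 1) ** 2 + D / (x + 1) ** 3

# Exact rational sample points, avoiding the poles x = 0 and x = -1.
samples = [Fraction(n, 7) for n in (-20, -10, -3, 2, 5, 9, 30)]
agree = all(original(x) == decomposed(x) for x in samples)
print(agree)
```

Using `Fraction` instead of floating point avoids any rounding question in the comparison.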
Example 3.6.3. In Examples 3.5.2, 3.6.1, 3.6.2, the numerator has lower degree than
the denominator. In general, we need to divide polynomials for this to happen.
For example, to integrate $\frac{x^5}{(x+1)^2(x-1)}$, we first divide $x^5$ by $(x+1)^2(x-1) = x^3 + x^2 - x - 1$.
\[
\begin{aligned}
x^5 - x^2(x^3+x^2-x-1) &= -x^4 + x^3 + x^2, \\
(-x^4 + x^3 + x^2) + x(x^3+x^2-x-1) &= 2x^3 - x, \\
(2x^3 - x) - 2(x^3+x^2-x-1) &= -2x^2 + x + 2.
\end{aligned}
\]
The quotient is $x^2 - x + 2$ and the remainder is $-2x^2 + x + 2$.
3.6. INTEGRATION OF RATIONAL FUNCTION 233
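Then
\[
\begin{aligned}
\frac{x^5}{(x+1)^2(x-1)} &= x^2 - x + 2 + \frac{-2x^2+x+2}{(x+1)^2(x-1)} \\
&= x^2 - x + 2 - \frac{9}{4(x+1)} + \frac{1}{2(x+1)^2} + \frac{1}{4(x-1)},
\end{aligned}
\]
and
\[
\int \frac{x^5}{(x+1)^2(x-1)}\,dx = \frac{x^3}{3} - \frac{x^2}{2} + 2x - \frac{9}{4}\log|x+1| - \frac{1}{2(x+1)} + \frac{1}{4}\log|x-1| + C.
\]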
x2 dx x5 dx
Z
dx
Z Z
1. . 4. . 7. .
1+x x2 + x − 2 x(1 + x)(2 + x)
(2 − x)2 dx
Z
dx
Z Z
dx
2. . 5. . 8. .
2
x +x−2 2 − x2 x2 (1
+ x)
x4 dx
Z
dx
Z Z
xdx
3. . 6. . 9. .
2
x +x−2 1 − x2 (x + a)2 (x + b)2
The examples above illustrate how to integrate rational functions of the form
$\dfrac{b_mx^m + b_{m-1}x^{m-1} + \cdots + b_1x + b_0}{(x-a_1)^{n_1}(x-a_2)^{n_2}\cdots(x-a_k)^{n_k}}$. This means exactly that all the roots of the
denominator are real. In general, however, a real polynomial may have complex
roots, and a conjugate pair of complex roots corresponds to a real quadratic factor.
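Example 3.6.4. To integrate $\frac{1}{x^3-1}$, we note that $x^3 - 1 = (x-1)(x^2+x+1)$, where
$x^2 + x + 1$ has a conjugate pair of complex roots. We postulate
\[
\frac{1}{x^3-1} = \frac{A}{x-1} + \frac{Bx+C}{x^2+x+1}.
\]
Clearing the denominators gives $1 = A(x^2+x+1) + (Bx+C)(x-1)$, and we get
\[
x = 0: \ 1 = A - C; \quad x = 1: \ 1 = 3A; \quad (\text{coefficient of}) \ x^2: \ 0 = A + B.
\]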
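Therefore $A = \frac{1}{3}$, $B = -\frac{1}{3}$, $C = -\frac{2}{3}$, and
\[
\begin{aligned}
\int \frac{dx}{x^3-1} &= \frac{1}{3}\int \frac{dx}{x-1} + \frac{1}{3}\int \frac{-x-2}{x^2+x+1}\,dx \\
&= \frac{1}{3}\log|x-1| - \frac{1}{6}\int \frac{d(x^2+x+1)}{x^2+x+1} - \frac{1}{2}\int \frac{dx}{x^2+x+1} \\
&= \frac{1}{3}\log|x-1| - \frac{1}{6}\log(x^2+x+1) - \frac{1}{2}\int \frac{dx}{\left(x+\frac{1}{2}\right)^2 + \left(\frac{\sqrt{3}}{2}\right)^2} \\
&= \frac{1}{6}\log\frac{(x-1)^2}{x^2+x+1} - \frac{1}{2}\cdot\frac{1}{\frac{\sqrt{3}}{2}}\arctan\frac{x+\frac{1}{2}}{\frac{\sqrt{3}}{2}} + C \\
&= \frac{1}{6}\log\frac{(x-1)^2}{x^2+x+1} - \frac{1}{\sqrt{3}}\arctan\frac{2x+1}{\sqrt{3}} + C.
\end{aligned}
\]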
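Example 3.6.5. To integrate $\frac{x^2}{(x^4-1)^2}$, we postulate
\[
\begin{aligned}
\frac{x^2}{(x^4-1)^2} &= \frac{x^2}{(x-1)^2(x+1)^2(x^2+1)^2} \\
&= \frac{A_1}{x-1} + \frac{A_2}{(x-1)^2} + \frac{B_1}{x+1} + \frac{B_2}{(x+1)^2} + \frac{C_1x+D_1}{x^2+1} + \frac{C_2x+D_2}{(x^2+1)^2}.
\end{aligned}
\]
Since changing $x$ to $-x$ does not change the left side, we see that $A_1 = -B_1$,
$A_2 = B_2$, $C_1 = C_2 = 0$, and the equality becomes
\[
\frac{x^2}{(x^4-1)^2} = \frac{2A_1}{x^2-1} + 2A_2\frac{x^2+1}{(x^2-1)^2} + \frac{D_1}{x^2+1} + \frac{D_2}{(x^2+1)^2}.
\]
It is then easy to find $A_1 = -\frac{1}{16}$, $A_2 = \frac{1}{16}$, $D_1 = 0$, $D_2 = -\frac{1}{4}$. Therefore with the
help of Example 3.5.15,
\[
\begin{aligned}
\int \frac{x^2}{(x^4-1)^2}\,dx &= \int \left(-\frac{1}{16(x-1)} + \frac{1}{16(x-1)^2} + \frac{1}{16(x+1)} + \frac{1}{16(x+1)^2} - \frac{1}{4(x^2+1)^2}\right)dx \\
&= \frac{1}{16}\log\left|\frac{x+1}{x-1}\right| - \frac{1}{16(x-1)} - \frac{1}{16(x+1)} - \frac{x}{8(x^2+1)} - \frac{1}{8}\arctan x + C \\
&= -\frac{x^3}{4(x^4-1)} + \frac{1}{16}\log\left|\frac{x+1}{x-1}\right| - \frac{1}{8}\arctan x + C.
\end{aligned}
\]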
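Example 3.6.6. To integrate the rational function $\frac{(x^2+1)^4}{(x^3-1)^2}$, we first notice that the
degree of the numerator is higher. Therefore we divide $(x^2+1)^4 = x^8 + 4x^6 + 6x^4 + 4x^2 + 1$ by $(x^3-1)^2 = x^6 - 2x^3 + 1$.
\[
\begin{aligned}
(x^8 + 4x^6 + 6x^4 + 4x^2 + 1) - x^2(x^6 - 2x^3 + 1) &= 4x^6 + 2x^5 + 6x^4 + 3x^2 + 1, \\
(4x^6 + 2x^5 + 6x^4 + 3x^2 + 1) - 4(x^6 - 2x^3 + 1) &= 2x^5 + 6x^4 + 8x^3 + 3x^2 - 3.
\end{aligned}
\]
The quotient is $x^2 + 4$ and the remainder is $2x^5 + 6x^4 + 8x^3 + 3x^2 - 3$.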
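This can be interpreted as an expression for $2x^5 + 6x^4 + 8x^3 + 3x^2 - 3$, which gives
\[
\begin{aligned}
x = 0 &: \ -3 = -A_1 + A_2 + C_1 + C_2, \\
x = 1 &: \ 16 = 9A_2, \\
x = -1 &: \ -4 = -2A_1 + A_2 + 4(-B_1 + C_1) + 4(-B_2 + C_2), \\
(\text{coefficient of}) \ x^5 &: \ 2 = A_1 + B_1, \\
(\text{coefficient of}) \ x^4 &: \ 6 = A_1 + A_2 - B_1 + C_1, \\
\left.\frac{d}{dx}\right|_{x=1} &: \ 64 = 9A_1 + 18A_2.
\end{aligned}
\]
Solving these equations, we get
\[
A_1 = \frac{32}{9}, \quad A_2 = \frac{16}{9}, \quad B_1 = -\frac{14}{9}, \quad C_1 = -\frac{8}{9}, \quad B_2 = 0, \quad C_2 = -\frac{1}{3}.
\]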
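Then
\[
\begin{aligned}
\int \frac{(x^2+1)^4}{(x^3-1)^2}\,dx &= \frac{1}{3}x^3 + 4x + \int \frac{32}{9(x-1)}\,dx + \int \frac{16}{9(x-1)^2}\,dx \\
&\quad - \int \frac{14x+8}{9(x^2+x+1)}\,dx - \int \frac{1}{3(x^2+x+1)^2}\,dx \\
&= \frac{1}{3}x^3 + 4x + \frac{32}{9}\log|x-1| - \frac{16}{9(x-1)} \\
&\quad - \int \frac{7\,d(x^2+x+1)}{9(x^2+x+1)} - \int \frac{dx}{9(x^2+x+1)} - \int \frac{dx}{3(x^2+x+1)^2} \\
&= \frac{1}{3}x^3 + 4x + \frac{32}{9}\log|x-1| - \frac{16}{9(x-1)} - \frac{7}{9}\log(x^2+x+1) \\
&\quad - \int \frac{dx}{9(x^2+x+1)} - \int \frac{dx}{3(x^2+x+1)^2}.
\end{aligned}
\]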
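By
\[
x^2 + x + 1 = \left(x + \frac{1}{2}\right)^2 + \left(\frac{\sqrt{3}}{2}\right)^2, \quad \frac{x + \frac{1}{2}}{\frac{\sqrt{3}}{2}} = \frac{2x+1}{\sqrt{3}},
\]
and
\[
\begin{aligned}
\int \frac{dx}{(x^2+x+1)^2} &= \frac{x + \frac{1}{2}}{2\left(\frac{\sqrt{3}}{2}\right)^2(x^2+x+1)} + \frac{1}{2\left(\frac{\sqrt{3}}{2}\right)^3}\arctan\frac{2x+1}{\sqrt{3}} + C \\
&= \frac{2x+1}{3(x^2+x+1)} + \frac{4}{3\sqrt{3}}\arctan\frac{2x+1}{\sqrt{3}} + C.
\end{aligned}
\]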
rational function can then be expressed as a sum: For each factor $(x+a)^m$ of the
denominator, we have terms
\[
\frac{A_1}{x+a} + \frac{A_2}{(x+a)^2} + \cdots + \frac{A_m}{(x+a)^m},
\]
and for each factor $(x^2+bx+c)^n$ of the denominator, we have terms
\[
\frac{B_1x+C_1}{x^2+bx+c} + \frac{B_2x+C_2}{(x^2+bx+c)^2} + \cdots + \frac{B_nx+C_n}{(x^2+bx+c)^n}.
\]
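\[
\int \frac{Bx+C}{(x^2+bx+c)^n}\,dx = \frac{B}{2}\int \frac{d(x^2+bx+c)}{(x^2+bx+c)^n} + \left(C - \frac{bB}{2}\right)\int \frac{dx}{(x^2+bx+c)^n}.
\]
Here we used $Bx + C = \frac{B}{2}(2x+b) + C - \frac{bB}{2}$.
The second part can be computed by the recursive relation in Example 3.5.15
\[
\int \frac{dx}{(x^2+bx+c)^n} = \frac{1}{(4c-b^2)(n-1)}\left(\frac{2x+b}{(x^2+bx+c)^{n-1}} + 2(2n-3)\int \frac{dx}{(x^2+bx+c)^{n-1}}\right).
\]
For $n = 1, 2$, we have
\[
\begin{aligned}
\int \frac{dx}{x^2+bx+c} &= \frac{2}{\sqrt{4c-b^2}}\arctan\frac{2x+b}{\sqrt{4c-b^2}} + C, \\
\int \frac{dx}{(x^2+bx+c)^2} &= \frac{2x+b}{(4c-b^2)(x^2+bx+c)} + \frac{4}{(4c-b^2)^{3/2}}\arctan\frac{2x+b}{\sqrt{4c-b^2}} + C.
\end{aligned}
\]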
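The case $n = 2$ can be spot-checked numerically. The sketch below assumes $4c - b^2 > 0$ (the quadratic has no real root); the names `F2` and `check` and the test values of $b$, $c$, $x$ are illustrative choices, not taken from the text.

```python
import math

def F2(x, b, c):
    """Claimed antiderivative of 1/(x^2 + bx + c)^2, assuming 4c - b^2 > 0."""
    d = 4 * c - b * b
    q = x * x + b * x + c
    return (2 * x + b) / (d * q) + 4 / d ** 1.5 * math.atan((2 * x + b) / math.sqrt(d))

def check(b, c, x, h=1e-5):
    """Gap between a central-difference derivative of F2 and the integrand."""
    numeric = (F2(x + h, b, c) - F2(x - h, b, c)) / (2 * h)
    return abs(numeric - 1 / (x * x + b * x + c) ** 2)

# x^2 + x + 1 (used in Example 3.6.6) and x^2 + 4x + 6.
print(check(1.0, 1.0, 0.4), check(4.0, 6.0, -1.3))
```

For $b = c = 1$ the formula reduces to the expression for $\int \frac{dx}{(x^2+x+1)^2}$ used in Example 3.6.6.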
(1 + x)2 dx x2 dx
Z Z
1. . 10. .
1 + x2 (x2 + 4x + 6)2
(2x2 + 3)dx
Z
dx
Z
2. . 11. .
x3 + x2 − 2 x4 −1
x2 dx
Z
dx
Z
3. 3
. 12. .
x −1 x4 − 1
(x + 1)3 dx
Z Z
dx
4. . 13. .
x3 + 1 (x4 − 1)2
Z
dx
Z
dx
5. . 14. .
(x + 1)(x2 + 1) x4 +4
Z
dx
Z
dx
6. . 15. .
x(x + 1)(x2 + x + 1) x4 + x2 + 1
Z Z
dx dx
7. . 16. .
(x + 1)(x + 1)(x3 + 1)
2 x6 +1
Z Z
dx xdx
8. . 17. .
(x + a )(x2 + b2 )
2 2 (x − 1)2 (x2
+ 2x + 2)
− y34 dy
Z Z r Z Z
dx 1 x 1 3dy
√ = 3
dx = 1− 3 y 2 = 3
3
x + x2
3 x x+1 y y −1
− 1 − y13
1 (y − 1)2 √ 2y + 1
= log 2 − 3 arctan √ + C
2 y +y+1 3
1 (y − 1)3 √ 2y + 1
= log 3 − 3 arctan √ + C
2 y −1 3
√ √ √
r
3 3 1 x
= log( x + 1 − x) − 3 arctan √ 2
3 3
+ 1 + C.
2 3 x+1
In general, a function involving $\sqrt[n]{\dfrac{ax+b}{cx+d}}$ can be integrated by introducing
\[
y = \sqrt[n]{\frac{ax+b}{cx+d}}, \quad x = \frac{dy^n - b}{-cy^n + a}, \quad dx = n(ad-bc)\frac{y^{n-1}}{(cy^n-a)^2}\,dy.
\]
√
Z Z Z
dx dx
1. 1+ ex dx. 2. √ √ . 3. √ .
1 + e + 1 − ex
x ax + b
1. $t$ is an integer.
2. $\frac{r+1}{s}$ is an integer.
3. $\frac{r+1}{s} + t$ is an integer.
A theorem by Chebyshev¹ says that these are the only cases in which the antiderivative can
be changed to the integral of a rational function.
¹ Pafnuty Lvovich Chebyshev, born 1821 in Okatovo (Russia), died 1894 in St Petersburg (Russia).
Chebyshev's work touches many fields of mathematics, including analysis, probability, number
theory and mechanics. Chebyshev introduced his famous polynomials in 1854 and later generalized
them to the concept of orthogonal polynomials.
Example 3.6.10. Rational functions of $\sin x$ and $\cos x$ can be integrated by a simpler
substitution if they have an additional property. For example, to integrate the function
$\frac{\sin x}{\sin x + \cos x}$, we introduce
\[
y = \tan x, \quad x = \arctan y, \quad dx = \frac{dy}{y^2+1}.
\]
Here we use the tangent of the full angle $x$, instead of half the angle in Example
3.6.9. Then
\[
\begin{aligned}
\int \frac{\sin x\,dx}{\sin x + \cos x} &= \int \frac{\tan x\,dx}{\tan x + 1} = \int \frac{y\,dy}{(y+1)(y^2+1)} = \frac{1}{4}\log\frac{y^2+1}{(y+1)^2} + \frac{1}{2}\arctan y + C \\
&= \frac{1}{2}x - \frac{1}{2}\log|\sin x + \cos x| + C.
\end{aligned}
\]
The key point here is that the integrand is a rational function $R(\sin x, \cos x)$ that
satisfies $R(-u, -v) = R(u, v)$. In this case, the integrand can always be written as a
rational function of $\tan x$, and the change of variable can be applied.
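The result of Example 3.6.10 and the symmetry $R(-u, -v) = R(u, v)$ can be checked numerically. In this sketch the names `R`, `F`, `gap` and the sample points are illustrative assumptions.

```python
import math

def R(u, v):
    """The rational function R(u, v) = u / (u + v), so the integrand is R(sin x, cos x)."""
    return u / (u + v)

def F(x):
    """Claimed antiderivative x/2 - (1/2) log|sin x + cos x|."""
    return x / 2 - math.log(abs(math.sin(x) + math.cos(x))) / 2

def gap(x, h=1e-5):
    """Gap between a central-difference derivative of F and the integrand at x."""
    numeric = (F(x + h) - F(x - h)) / (2 * h)
    return abs(numeric - R(math.sin(x), math.cos(x)))

# Symmetry that makes the substitution y = tan x work.
symmetric = all(math.isclose(R(-u, -v), R(u, v)) for u, v in [(0.3, 0.9), (-0.5, 0.2), (0.7, -0.1)])
print(symmetric, max(gap(x) for x in (0.2, 1.0, 2.0)))
```

The symmetry holds here because changing the signs of both $u$ and $v$ changes the signs of numerator and denominator together.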
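Example 3.6.11. Note that the rational function $\dfrac{\cos x}{\cos x\sin x + \sin^3 x}$ of $\sin x$ and $\cos x$ is
odd in the $\sin x$ variable. This is comparable to the function $\cos^m x\sin^n x$ for the case
$n$ is odd. We may introduce the same change of variable $y = \cos x$, $dy = -\sin x\,dx$
like the earlier example and get
\[
\begin{aligned}
\int \frac{\cos x\,dx}{\cos x\sin x + \sin^3 x} &= \int \frac{\cos x\sin x\,dx}{\cos x\sin^2 x + (\sin^2 x)^2} = \int \frac{-y\,dy}{y(1-y^2) + (1-y^2)^2} \\
&= \frac{1}{\sqrt{5}}\log\left|\frac{y - \frac{1-\sqrt{5}}{2}}{y - \frac{1+\sqrt{5}}{2}}\right| + \frac{1}{2}\log\left|\frac{y-1}{y+1}\right| + C \\
&= \frac{1}{\sqrt{5}}\log\left|\frac{2\cos x - 1 + \sqrt{5}}{2\cos x - 1 - \sqrt{5}}\right| + \frac{1}{2}\log\frac{1-\cos x}{1+\cos x} + C.
\end{aligned}
\]
Here the integrand in $y$ equals $\dfrac{y}{(1-y^2)(y^2-y-1)}$, and the roots of $y^2 - y - 1$ are $\frac{1\pm\sqrt{5}}{2}$.
Similarly, a rational function of $\sin x$ and $\cos x$ that is odd in the $\cos x$ variable
can be integrated by introducing $y = \sin x$. If $R(-u, -v) = R(u, v)$, then we may
introduce $y = \tan x$ to compute $\int R(\sin x, \cos x)\,dx$.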
1 − r2
Z
dx
Z
1. dx, |r| < 1. 6. .
1 − 2r cos x + r2 2 sin x + sin 2x
sin2 x
Z Z
dx
2. . 7. dx.
a − cos 2x 1 + sin2 x
Z
dx
Z
dx
3. . 8. .
a + tan x sin(x + a) sin(x + b)
Z
dx
Z
dx
4. . 9. .
cos x + tan x (1 + cos2 x)(2 + sin2 x)
Z
(1 + sin x)dx
Z
dx
5. . 10. .
sin x + tan x sin x(1 + cos x)
3.7. IMPROPER INTEGRAL 243
Z Z
dx dx
11. . 14. .
(a + cos x) sin x a2 sin2 x + b2 cos2 x
1 − tan x
Z Z
(sin x + cos x)dx
12. . 15. dx.
sin x(sin x − cos x) 1 + tan x
Z Z √
dx
13. . 16. tan xdx.
(a + cos2 x) sin x
Therefore the improper integral $\int_0^{+\infty} e^{-x}\,dx$ has value 1. Geometrically, this means
that the area of the unbounded region under the graph of the function $e^{-x}$ and over
the interval $[0, +\infty)$ is 1.
Example 3.7.2. The function $\log x$ is unbounded on the bounded interval $(0, 1]$. Since
the integral $\int_0^1 \log x\,dx$ is improper at $0^+$, we consider the integral over $[\epsilon, 1]$ for $\epsilon > 0$
\[
\int_\epsilon^1 \log x\,dx = (x\log x - x)\Big|_{x=\epsilon}^{x=1} = -1 - \epsilon\log\epsilon + \epsilon.
\]
[Figure: graphs of $\log x$ and $e^{-x}$.]
Since the right side converges to $-1$ as $\epsilon \to 0^+$, the improper integral converges and
has value
\[
\int_0^1 \log x\,dx = \lim_{\epsilon\to 0^+}\int_\epsilon^1 \log x\,dx = -1.
\]
The unbounded region measured by this improper integral is actually the same as
the one in Example 3.7.1, up to a rotation.
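To see the limit in Example 3.7.2 concretely, the value $-1 - \epsilon\log\epsilon + \epsilon$ of the integral over $[\epsilon, 1]$ can be tabulated for shrinking $\epsilon$. The function name `truncated_integral` below is an illustrative choice.

```python
import math

def truncated_integral(eps):
    """Integral of log x over [eps, 1], evaluated by the antiderivative x log x - x."""
    return -1 - eps * math.log(eps) + eps

# The term eps * log(eps) tends to 0, so the values approach -1.
for eps in (1e-1, 1e-3, 1e-6, 1e-9):
    print(eps, truncated_integral(eps))
```

The convergence is slow compared with $\epsilon$ itself because of the factor $\log\epsilon$, but the values still approach $-1$.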
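Example 3.7.3. Consider the improper integral $\int_a^{+\infty} \frac{dx}{x^p}$, where $a > 0$. We have
\[
\int_a^b \frac{dx}{x^p} =
\begin{cases}
\dfrac{b^{1-p} - a^{1-p}}{1-p}, & \text{if } p \ne 1, \\[1.5ex]
\log b - \log a, & \text{if } p = 1.
\end{cases}
\]
As $b \to +\infty$, we get
\[
\int_a^{+\infty} \frac{dx}{x^p} =
\begin{cases}
\dfrac{a^{1-p}}{p-1}, & \text{if } p > 1, \\[1.5ex]
\text{diverges}, & \text{if } p \le 1.
\end{cases}
\]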
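Example 3.7.4. The integral $\int_0^1 \frac{dx}{x^p}$ is improper at $0^+$ for $p > 0$. For $\epsilon > 0$, we have
\[
\int_\epsilon^1 \frac{dx}{x^p} =
\begin{cases}
\dfrac{1 - \epsilon^{1-p}}{1-p}, & \text{if } p \ne 1, \\[1.5ex]
-\log\epsilon, & \text{if } p = 1.
\end{cases}
\]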
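As $\epsilon \to 0^+$, we get
\[
\int_0^1 \frac{dx}{x^p} =
\begin{cases}
\dfrac{1}{1-p}, & \text{if } p < 1, \\[1.5ex]
\text{diverges}, & \text{if } p \ge 1.
\end{cases}
\]
By the same argument, for $a < b$, the improper integrals $\int_a^b (x-a)^p\,dx$ and
$\int_a^b (b-x)^p\,dx$ converge if and only if $p > -1$.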
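Example 3.7.5. The integral $\int_{-\infty}^{+\infty} \frac{dx}{x^2+1}$ is improper at $+\infty$ and $-\infty$. The integral
on a bounded interval is
\[
\int_a^b \frac{dx}{x^2+1} = \arctan b - \arctan a.
\]
Then we get
\[
\int_{-\infty}^{+\infty} \frac{dx}{x^2+1} = \lim_{\substack{a\to-\infty \\ b\to+\infty}} \int_a^b \frac{dx}{x^2+1} = \lim_{b\to+\infty}\arctan b - \lim_{a\to-\infty}\arctan a = \frac{\pi}{2} - \left(-\frac{\pi}{2}\right) = \pi.
\]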
However, the computation is wrong since the integrand $\frac{1}{x}$ is not continuous on
$[-1, 1]$. In fact, the integral $\int_{-1}^1 \frac{dx}{x}$ is improper on both sides of 0, and we need
both improper integrals $\int_{-1}^0 \frac{dx}{x}$ and $\int_0^1 \frac{dx}{x}$ to converge and then get
\[
\int_{-1}^1 \frac{dx}{x} = \int_{-1}^0 \frac{dx}{x} + \int_0^1 \frac{dx}{x}.
\]
Since
\[
\int_0^1 \frac{dx}{x} = \lim_{\epsilon\to 0^+}\int_\epsilon^1 \frac{dx}{x} = \lim_{\epsilon\to 0^+}(-\log\epsilon) = +\infty
\]
diverges, the improper integral $\int_0^1 \frac{dx}{x}$ diverges, so that $\int_{-1}^1 \frac{dx}{x}$ also diverges.
Example 3.7.8. To compute the improper integral $\int_{-\infty}^0 xe^x\,dx$, we start with integration
by parts on a bounded interval
\[
\int_b^0 xe^x\,dx = \int_b^0 x\,de^x = -be^b - \int_b^0 e^x\,dx = -be^b - 1 + e^b.
\]
The example shows that the integration by parts can be extended to improper
integrals, simply by taking the limit of the integration by parts formula for the usual
proper integrals.
Example 3.7.9. For $a > 1$, consider the improper integral $\int_a^{+\infty} \frac{dx}{x(\log x)^p}$. We have
\[
\int_a^b \frac{dx}{x(\log x)^p} = \int_a^b \frac{d(\log x)}{(\log x)^p} = \int_{\log a}^{\log b} \frac{dy}{y^p}.
\]
Taking $b \to +\infty$ on both sides, we get
\[
\int_a^{+\infty} \frac{dx}{x(\log x)^p} = \int_{\log a}^{+\infty} \frac{dy}{y^p}.
\]
The equality means that the improper integral on the left converges if and only if
the improper integral on the right converges, and the two values are the same. By
Example 3.7.3, we see that the improper integral $\int_a^{+\infty} \frac{dx}{x(\log x)^p}$ converges if and
only if $p > 1$, and
\[
\int_a^{+\infty} \frac{dx}{x(\log x)^p} = \frac{(\log a)^{1-p}}{p-1}, \quad \text{if } p > 1.
\]
The example shows that the change of variable can also be extended to improper
integrals, simply by taking the limit of the change of variable formula for the usual
proper integrals.
Exercise 3.7.1. Determine the convergence of improper integrals and evaluate the conver-
gent ones.
Z +∞ Z 0 Z 1
dx
1. xp dx. 4. ax dx. 7. √ .
0 −∞ −1 1 − x2
Z 1 Z 1 π
dx dx
Z
2
2. . 5. . 8. tan xdx.
0 x(− log x)p −1 1 − x2 0
Z +∞ Z +∞ Z π
x dx
3. a dx. 6. . 9. sec xdx.
0 2 1 − x2 0
Exercise 3.7.2. Determine the convergence of improper integrals and evaluate the conver-
gent ones.
Z +∞ Z +∞ Z +∞
dx dx log x
1. . 7. √ . 13. dx.
1 x+1 0 x(1 + x) 1 x2
Z +∞ 9 1
xdx
Z Z
dx log x
2. . 8. √ . 14. √ dx.
−∞ x2 + 1 1
3
x−9 0 x
Z +∞ Z +∞ Z +∞
dx x arctan x
3. √ . 9. xex dx. 15. dx.
1
3
x+1 0 0 (1 + x2 )2
+∞ +∞ +∞
x2 dx
Z Z Z
2
4. . 10. xe−x dx. 16. e−ax cos bxdx.
0 x3 + 1 0 0
+∞ +∞
x2 dx √ +∞
Z Z Z
5. . 11. e− x
dx. 17. e−ax sin bxdx.
0 (x3 + 1) 2
0 0
Z +∞ Z 1 Z +∞
dx
6. . 12. x log xdx. 18. e−x | sin x|dx.
0 x(x + 1)(x + 2) 0 0
\[
\lim_{n\to\infty} \frac{1}{n}\left(\log\frac{1}{n} + \log\frac{2}{n} + \cdots + \log\frac{n}{n}\right) = \int_0^1 \log x\,dx.
\]
Note that the left side is a "Riemann sum" for the right side. However, since the integral is
improper, we cannot directly use the fact that the Riemann sum converges to the integral.
Moreover, the limit is the same as $\lim_{n\to\infty} \frac{\sqrt[n]{n!}}{n} = e^{-1}$.
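A numerical experiment makes both limits plausible. The sketch below (function names `riemann_sum` and `root_ratio` are illustrative; `math.lgamma(n + 1)` is used as $\log n!$ to avoid overflow) evaluates the "Riemann sum" and the ratio $\sqrt[n]{n!}/n$ for increasing $n$.

```python
import math

def riemann_sum(n):
    """(1/n) * (log(1/n) + log(2/n) + ... + log(n/n))."""
    return sum(math.log(k / n) for k in range(1, n + 1)) / n

def root_ratio(n):
    """(n!)^(1/n) / n, computed as exp(log(n!) / n) / n."""
    return math.exp(math.lgamma(n + 1) / n) / n

for n in (10, 1000, 100000):
    print(n, riemann_sum(n), root_ratio(n))
print(-1.0, math.exp(-1))
```

The two columns are related by $\sqrt[n]{n!}/n = \exp(\text{Riemann sum})$, which is why the second limit is $e^{-1}$ once the first is $-1$.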
of the function $I(b) = \int_a^b f(x)\,dx$ as $b \to +\infty$. The Cauchy criterion for the convergence
is that, for any $\epsilon > 0$, there is $N$, such that
\[
b, c > N \implies |I(c) - I(b)| = \left|\int_b^c f(x)\,dx\right| < \epsilon.
\]
The Cauchy criterion shows that the convergence of an improper integral depends
only on the behavior of the function near the improper place. Moreover, the Cauchy
criterion also implies the following test for convergence.
Theorem 3.7.1 (Comparison Test). If $|f(x)| \le g(x)$ on $(a, b)$ and the integral $\int_a^b g(x)\,dx$
at $+\infty$. The convergence of $\int_a^b g(x)\,dx$ implies that for any $\epsilon > 0$, there is $N$,
such that
\[
c > b > N \implies \int_b^c g(x)\,dx < \epsilon.
\]
The assumption $|f(x)| \le g(x)$ further implies that for $c > b$,
\[
\left|\int_b^c f(x)\,dx\right| \le \int_b^c |f(x)|\,dx \le \int_b^c g(x)\,dx.
\]
Example 3.7.10. We know $\frac{1}{\sqrt{x^3+1}} < x^{-\frac{3}{2}}$ for $x \ge 1$. Since $\int_1^{+\infty} x^{-\frac{3}{2}}\,dx$ converges,
by the comparison test, we know $\int_1^{+\infty} \frac{dx}{\sqrt{x^3+1}}$ also converges. We note that
$\int_0^{+\infty} \frac{dx}{\sqrt{x^3+1}}$ converges too because only the behavior of the function for big $x$
(i.e., near $+\infty$) is involved.
Example 3.7.11. To determine the convergence of $\int_0^{+\infty} e^{-x^2}\,dx$, we use $0 < e^{-x^2} \le e^{-x}$
for $x \ge 1$. By Example 3.7.1 and the comparison test, we know $\int_1^{+\infty} e^{-x^2}\,dx$
converges. Since $\int_0^1 e^{-x^2}\,dx$ is a proper integral, we know $\int_0^{+\infty} e^{-x^2}\,dx$ also converges.
Example 3.7.12. To determine the convergence of $\int_1^{+\infty} \frac{\log x}{x^p}\,dx$, $p > 0$, we use the
comparison $\frac{\log x}{x^p} \ge \frac{1}{x^p} > 0$ for $x \ge e$. For $p \le 1$, the divergence of $\int_1^{+\infty} \frac{1}{x^p}\,dx$
implies the divergence of $\int_1^{+\infty} \frac{\log x}{x^p}\,dx$.
For $p > 1$, although we also know the convergence of $\int_1^{+\infty} \frac{1}{x^p}\,dx$, the comparison
above cannot be used to conclude the convergence of $\int_1^{+\infty} \frac{\log x}{x^p}\,dx$. Instead, we
choose $q$ satisfying $p > q > 1$. Then by
\[
\lim_{x\to+\infty} \frac{\frac{\log x}{x^p}}{\frac{1}{x^q}} = \lim_{x\to+\infty} \frac{\log x}{x^{p-q}} = 0,
\]
we have $\frac{\log x}{x^p} \le \frac{1}{x^q}$ for sufficiently large $x$. By the convergence of $\int_1^{+\infty} \frac{dx}{x^q}$,
therefore, we know the convergence of $\int_1^{+\infty} \frac{\log x}{x^p}\,dx$.
We conclude that $\int_1^{+\infty} \frac{\log x}{x^p}\,dx$ converges if and only if $p > 1$.
Exercise 3.7.4. Compare the integrals $I = \int_a^{+\infty} f(x)\,dx$ and $J = \int_a^{+\infty} g(x)\,dx$ that
are improper at $+\infty$.
1. Prove that if $\lim_{x\to+\infty} \frac{f(x)}{g(x)}$ converges, $g(x) \ge 0$ for sufficiently large $x$, and $J$
converges, then $I$ also converges.
2. Prove that if $\lim_{x\to+\infty} \frac{f(x)}{g(x)}$ converges to a nonzero number, and $g(x) \ge 0$ for
sufficiently large $x$, then $I$ converges if and only if $J$ converges.
(f + g)2 dx converge.
a
1 1 +∞
xp dx
Z Z Z
dx
4. . 7. √ , q > 0. 10. xp ax dx, a > 0.
−∞ |x|p | log x|q 0 1 − xq 1
Z 1
Z +∞
dx dx Z +∞
5. . 8. , q > 11. xp log(1 + xq )dx.
xp + (log x)q 0 xp (1 − xq )r
2 1
0.
+∞ Z 1 1
xp dx
Z Z
6. . 9. xp ax dx, a > 0. 12. xp log(1 + xq )dx.
0 1 + xq 0 0
By Example 1.3.8, the right side diverges to $+\infty$. Therefore $\int_1^{+\infty} \left|\frac{\sin x}{x}\right|dx$ diverges.
The tests basically replace $\sin x$ and $\frac{1}{x}$ in the example by $f(x)$ and $g(x)$. In case
$f(x)$ is continuous and $g(x)$ is continuously differentiable, we can justify the tests
by repeating the argument in the example. Let $F(x) = \int_a^x f(t)\,dt$. Then $F(a) = 0$,
and
\[
\int_a^b f(x)g(x)\,dx = \int_a^b g(x)\,dF(x) = g(b)F(b) - \int_a^b F(x)g'(x)\,dx.
\]
Under the assumption of the Dirichlet test, we have $\lim_{b\to+\infty} g(b)F(b) = 0$, and
$|F(x)| < M$ for some constant $M$ and all $x \ge a$. Assume the monotonic function
$g(x)$ is increasing. Then $g'(x) \ge 0$, and
\[
|F(x)g'(x)| \le Mg'(x).
\]
Since
\[
\int_a^{+\infty} g'(x)\,dx = \lim_{b\to+\infty} \int_a^b g'(x)\,dx = \lim_{b\to+\infty} (g(b) - g(a)) = -g(a)
\]
Finally, we show some examples of using the integration by parts and change of
variable to compute improper integrals. We note that the convergence needs to be
verified before applying the properties of integration.
Example 3.7.16. In Example 3.7.11, we know the convergence of $\int_0^{+\infty} e^{-x^2}\,dx$. By the
similar idea, especially $\lim_{x\to+\infty} \frac{x^pe^{-x^2}}{e^{-x}} = 0$, we know that $\int_0^{+\infty} x^pe^{-x^2}\,dx$ converges
for any $p \ge 0$.
Let
\[
I_n = \int_0^{+\infty} x^ne^{-x^2}\,dx.
\]
Then we may apply the integration by parts to get
\[
\begin{aligned}
I_n &= -\frac{1}{2}\int_0^{+\infty} x^{n-1}\,de^{-x^2} \\
&= -\frac{1}{2}x^{n-1}e^{-x^2}\Big|_{x=0}^{x=+\infty} + \frac{n-1}{2}\int_0^{+\infty} x^{n-2}e^{-x^2}\,dx = \frac{n-1}{2}I_{n-2}.
\end{aligned}
\]
It is known (by using integration of two variable function, for example) that
\[
I_0 = \int_0^{+\infty} e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}.
\]
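The recursion $I_n = \frac{n-1}{2}I_{n-2}$ can be compared with a direct numerical approximation. This is a sketch under stated assumptions: the starting value $I_1 = \frac{1}{2}$ is not in the text above (it follows from $\int xe^{-x^2}dx = -\frac{1}{2}e^{-x^2} + C$), the recursion is applied for $n \ge 2$, and the names `I`, `I_numeric` and the cutoff `upper=12.0` are illustrative.

```python
import math

def I(n):
    """I_n from the recursion I_n = (n - 1)/2 * I_{n-2}, for integers n >= 0."""
    if n == 0:
        return math.sqrt(math.pi) / 2
    if n == 1:
        return 0.5  # assumption: elementary value of the integral of x e^{-x^2}
    return (n - 1) / 2 * I(n - 2)

def I_numeric(n, upper=12.0, steps=20000):
    """Midpoint-rule approximation on [0, upper]; the tail beyond is negligible."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** n * math.exp(-x * x)
    return total * h

for n in range(7):
    print(n, I(n), I_numeric(n))
```

For example, the recursion gives $I_2 = \frac{1}{2}I_0 = \frac{\sqrt{\pi}}{4}$ and $I_3 = I_1 = \frac{1}{2}$.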
The helix in $\mathbb{R}^3$
\[
x = r\cos\theta, \quad y = r\sin\theta, \quad z = \frac{h}{2\pi}\theta, \quad 0 \le \theta \le 2\pi,
\]
moves along the circle of radius $r$ from the viewpoint of the $(x, y)$-coordinates, and
moves up in the $z$-direction at constant speed, such that each round moves up by
height $h$.
In general, a parametrized curve in R2 is given by
x = x(t), y = y(t), a ≤ t ≤ b.
The initial point of the curve is (x(a), y(a)), and the end point is (x(b), y(b)). To
compute the length of the curve between the two points, we consider the length s(t)
from the initial point $(x(a), y(a))$ to the point $(x(t), y(t))$. We find the rate of change $s'(t)$
and then integrate it to get $s(t)$. The length of the whole curve is $s(b)$.
Similar to the argument for the area of the region G[a,b] (f ), we need to be careful
about the sign. In the subsequent discussion, we pretend everything is positive
(which at least gives you the right derivative), and further argument about the
negative case is omitted. Moreover, we restrict the argument to the case x(t) and y(t)
are nice. In fact, we will assume the two functions are continuously differentiable.
In general, we may break the curve into finitely many continuously differentiable
pieces and add the lengths of the pieces together.
As the parameter t is changed by ∆t, the change ∆s = s(t + ∆t) − s(t) of the
length is the length of the curve segment from (x(t), y(t)) to (x(t + ∆t), y(t + ∆t)).
The curve segment is approximated by the straight line connecting the two points.
Therefore the length of the curve is approximated by the length of the straight line
\[
\Delta s \approx \sqrt{(x(t+\Delta t) - x(t))^2 + (y(t+\Delta t) - y(t))^2} = \sqrt{(\Delta x)^2 + (\Delta y)^2}.
\]
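The approximation of curve segments by straight lines translates directly into a numerical method: split $[a, b]$ into many small steps and add up the chord lengths $\sqrt{(\Delta x)^2 + (\Delta y)^2}$. The following is a minimal sketch; the function name `polyline_length`, the step count and the unit circle test curve are illustrative assumptions.

```python
import math

def polyline_length(x, y, a, b, n=10000):
    """Sum of straight segment lengths over n equal steps of the parameter t."""
    total = 0.0
    px, py = x(a), y(a)
    for i in range(1, n + 1):
        t = a + (b - a) * i / n
        qx, qy = x(t), y(t)
        total += math.hypot(qx - px, qy - py)  # sqrt(dx^2 + dy^2)
        px, py = qx, qy
    return total

# Test curve: the unit circle x = cos t, y = sin t, 0 <= t <= 2*pi.
circle = polyline_length(math.cos, math.sin, 0.0, 2 * math.pi)
print(circle, 2 * math.pi)
```

For the circle the sum is the perimeter of an inscribed regular polygon, so it is slightly smaller than $2\pi$ and approaches it as the number of steps grows.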
3.8. APPLICATION TO GEOMETRY 257
[Figure: a parametrized curve $(x(t), y(t))$ from $t = a$ to $t = b$.]
More generally, an ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ can be parametrized as
\[
x = a\cos\theta, \quad y = b\sin\theta, \quad 0 \le \theta \le 2\pi.
\]
The length of the ellipse is the so-called elliptic integral
\[
\int_0^{2\pi} \sqrt{(-a\sin\theta)^2 + (b\cos\theta)^2}\,d\theta = a\int_0^{2\pi} \sqrt{1 + K\cos^2\theta}\,d\theta, \quad K = \frac{b^2}{a^2} - 1.
\]
The integral cannot be computed as an elementary expression if $a \ne b$.
Example 3.8.3. The astroid $x^{\frac{2}{3}} + y^{\frac{2}{3}} = 1$ can be parametrized as
\[
x = \cos^3 t, \quad y = \sin^3 t, \quad 0 \le t \le 2\pi.
\]
Note that the range $[0, 2\pi]$ for $t$ corresponds to moving around the astroid exactly
once. Therefore the perimeter is
\[
\int_0^{2\pi} \sqrt{(-3\cos^2 t\sin t)^2 + (3\sin^2 t\cos t)^2}\,dt = \int_0^{2\pi} 3|\sin t\cos t|\,dt = 6.
\]
Example 3.8.4. The argument about the length of curves also applies to curves in
$\mathbb{R}^3$ and leads to
\[
\text{length of curve} = \int_a^b \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\,dt.
\]
The result has a simple geometrical interpretation: By cutting along a vertical line,
the cylinder can be “flattened” into a plane. Then the helix becomes the hypotenuse
of a right triangle with horizontal length 2πr and vertical length h.
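The geometric interpretation can be tested numerically: approximating the helix by short straight segments in $\mathbb{R}^3$ should reproduce the hypotenuse $\sqrt{(2\pi r)^2 + h^2}$. In this sketch the function name `helix_length` and the values $r = 2$, $h = 3$ are illustrative assumptions.

```python
import math

def helix_length(r, h, n=10000):
    """Polyline length of x = r cos(t), y = r sin(t), z = h t / (2 pi), 0 <= t <= 2 pi."""
    total = 0.0
    prev = (r, 0.0, 0.0)  # the point at t = 0
    for i in range(1, n + 1):
        t = 2 * math.pi * i / n
        cur = (r * math.cos(t), r * math.sin(t), h * t / (2 * math.pi))
        total += math.dist(prev, cur)  # straight distance between consecutive points
        prev = cur
    return total

r, h = 2.0, 3.0
approx = helix_length(r, h)
hypotenuse = math.sqrt((2 * math.pi * r) ** 2 + h ** 2)
print(approx, hypotenuse)
```

`math.dist` requires Python 3.8 or later; on older versions the three coordinate differences can be combined by hand.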
Example 3.8.5. When a circle rolls along a straight line, the track of one point on
the circle is the cycloid. Let $r$ be the radius of the circle, and assume the point is
at the bottom at the beginning. After rotating angle $t$, the center of the circle is at
$(rt, r)$, and the point is at $(rt, r) + r(-\cos(t - \frac{\pi}{2}), \sin(t - \frac{\pi}{2}))$. Therefore the cycloid
is parameterized by
\[
x = rt - r\sin t, \quad y = r - r\cos t.
\]
As the circle makes one complete rotation, we get one period of the cycloid,
corresponding to $t \in [0, 2\pi]$. The length of this one period is
\[
\int_0^{2\pi} \sqrt{(r - r\cos t)^2 + (r\sin t)^2}\,dt = r\int_0^{2\pi} \sqrt{2(1-\cos t)}\,dt = r\int_0^{2\pi} 2\sin\frac{t}{2}\,dt = 8r.
\]
Exercise 3.8.4. Think of the rolling circle that produces the cycloid as a disk. What is the
track of a point on the disk that is not necessarily on the circle (i.e., the boundary of the
disk)? Find the formula for computing the length of this track.
Exercise 3.8.5. Suppose a line is wrapped around a circle. When the line is unwrapped
from the circle, the track of one point on the line is the involute of the circle. Let r be the
radius of the circle and let t be the unwrapped angle.
2. Find the length of the involute as the line is unwrapped by half of the circle.
Example 3.8.6. The curve y = x2 and the straight line y = x enclose a region over
0 ≤ x ≤ 1. The area of the region is the area below x subtracting the area below
x2 , which is
\[
\int_0^1 x\,dx - \int_0^1 x^2\,dx = \int_0^1 (x - x^2)\,dx = \frac{1}{6}.
\]
[Figure: graphs of $x^2$ and $x$.]
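Example 3.8.7. To compute the area of the region bounded by $y = x^2 - 2x$ and $y = x$,
we denote the (positive) areas of the four indicated regions by $A_1$, $A_2$, $A_3$, $A_4$. Then
\[
\int_0^2 x\,dx = A_1, \quad \int_0^2 (x^2-2x)\,dx = -A_2, \quad \int_2^3 x\,dx = A_3 + A_4, \quad \int_2^3 (x^2-2x)\,dx = A_4.
\]
[Figure: graphs of $x$ and $x^2 - 2x$ over $[0, 3]$, with the regions $A_1$, $A_2$, $A_3$, $A_4$ indicated.]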
In general, we can divide [a, b] into some intervals, such that on each interval, one of
the following happens: f (x) ≥ 0 ≥ g(x), f (x) ≥ g(x) ≥ 0, 0 ≥ f (x) ≥ g(x). Then
an argument similar to Example 3.8.7 shows that the total area is indeed given by
the formula above.
Example 3.8.8. The functions $\sin x$ and $\cos x$ intersect at many places and enclose
many regions. One such region is over the interval $\left[\frac{\pi}{4}, \frac{5\pi}{4}\right]$, on which we have
$\sin x \ge \cos x$. The area of the region is
\[
\int_{\frac{\pi}{4}}^{\frac{5\pi}{4}} (\sin x - \cos x)\,dx = 2\sqrt{2}.
\]
Figure 3.8.8: The area is $\int_a^b (f(x) - g(x))\,dx$.
[Figure: graphs of $\sin x$ and $\cos x$ over $\left[\frac{\pi}{4}, \frac{5\pi}{4}\right]$.]
Example 3.8.9. The region between the parabola $y^2 - x = 1$ and the straight line
$x + y = 1$ is between the functions
\[
f(x) = \begin{cases} \sqrt{x+1}, & \text{if } -1 \le x \le 0, \\ 1 - x, & \text{if } 0 \le x \le 3, \end{cases} \qquad g(x) = -\sqrt{x+1}.
\]
The area is
\[
\int_{-1}^3 f(x)\,dx - \int_{-1}^3 g(x)\,dx = \int_{-1}^0 \sqrt{x+1}\,dx + \int_0^3 (1-x)\,dx - \int_{-1}^3 (-\sqrt{x+1})\,dx = \frac{9}{2}.
\]
Note that the region is obtained by rotating the region in Example 3.8.7. Naturally
the results are the same. The previous example actually suggests another way
of computing the area, by exchanging the roles of $x$ and $y$.
[Figure: the line $x + y = 1$ and the parabola $y^2 - x = 1$, with $x$ from $-1$ to $3$ and $y$ down to $-2$.]
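Exchanging the roles of $x$ and $y$ in Example 3.8.9 can be sketched as follows. The curves meet where $y^2 - 1 = 1 - y$, that is at $y = -2$ and $y = 1$, and for $y$ between these values the line $x = 1 - y$ lies to the right of the parabola $x = y^2 - 1$. The area is then $\int_{-2}^1 ((1-y) - (y^2-1))\,dy$. The function name `antiderivative` below is an illustrative choice, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

def antiderivative(y):
    """Antiderivative 2y - y^2/2 - y^3/3 of (1 - y) - (y^2 - 1) = 2 - y - y^2."""
    y = Fraction(y)
    return 2 * y - y ** 2 / 2 - y ** 3 / 3

# Newton-Leibniz formula between the intersection points y = -2 and y = 1.
area = antiderivative(1) - antiderivative(-2)
print(area)
```

The value agrees with the area $\frac{9}{2}$ computed above with respect to $x$, with a single integral instead of three.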
Exercise 3.8.6. Compute area of the region with the given bounds.
1. $y = \sqrt{x}$, $y$-axis, $y = 1$. 6. $y = e^x$, $y = x^2 - 1$, on $[-1, 1]$.
3. y = log x, y = x, y = 0, y = 1. 8. x = y 2 − 4y, x = 2y − y 2 .
4. y = x2 , y = 2x − x2 . 9. y = 2x − x2 , x + y = 0.
Exercise 3.8.7. Explain that, if $0 \ge f \ge g$ on $[a, b]$, then the area of the region between the
graphs of $f$ and $g$ over $[a, b]$ is $\int_a^b (f(x) - g(x))\,dx$.
Exercise 3.8.8. Explain that the area of the region between the graphs of $f$ and $g$ over
$[a, b]$ is $\int_a^b |f(x) - g(x)|\,dx$, even when we may have $f > g$ some place and $f < g$ some
other place.
Figure 3.8.11: The region is always on the left of the boundary curve.
Consider a simple region in Figure 3.8.2, such that the boundary curve can be
divided into the graphs of two functions y = y1 (x) and y = y2 (x) for x ∈ [α, β].
Suppose the boundary curve has parameterisation φ(t) = (x(t), y(t)), t ∈ [a, b], such
that y1 and y2 correspond respectively to t ∈ [a, c] and t ∈ [c, b]. The direction of
the parameterisation satisfies our assumption.
Figure 3.8.12: Calculate the area by integrating along the boundary curve.
The area of the region is $\int_\alpha^\beta (y_1(x) - y_2(x))\,dx$. We may use the parameterisation
of the boundary curve as the change of variable to get the following formula for the
area
\[
\begin{aligned}
\int_\alpha^\beta (y_1(x) - y_2(x))\,dx &= \int_\alpha^\beta y_1(x)\,dx - \int_\alpha^\beta y_2(x)\,dx \\
&= \int_c^a y(t)\,dx(t) - \int_c^b y(t)\,dx(t) \\
&= -\int_a^b y(t)\,dx(t) = -\int_C y\,dx.
\end{aligned}
\]
Here we have x(a) = x(b) and y(a) = y(b) because the boundary curve is closed.
The positive sign on the right can be explained as follows. The area is supposed to
be the contribution from the right boundary part subtracting the contribution from
the left boundary part. From the picture, we see that the direction of the right part
is upward, the same as the y-direction (the direction of dy), and the direction of the
left part is downward, opposite to the y-direction.
Example 3.8.11. The boundary circle of the unit disk is parameterised by $x(t) = \cos t$,
$y(t) = \sin t$, $t \in [0, 2\pi]$. Since the parameterisation satisfies our assumption, we may
use it to calculate the area of the unit disk
\[
-\int_0^{2\pi} y(t)\,dx(t) = \int_0^{2\pi} \sin^2 t\,dt = \pi.
\]
Example 3.8.12. Consider the region enclosed by the Archimedean spiral $x = t\cos t$,
$y = t\sin t$, $t \in [0, \pi]$, and the $x$-axis. The boundary of the region consists of the
spiral and the interval $[-\pi, 0]$ on the $x$-axis. After checking that the direction of the
boundary satisfies the assumption, we get the area
\[
-\int_0^\pi (t\sin t)(t\cos t)'\,dt - \int_{-\pi}^0 0\,dx = -\int_0^\pi (t\sin t\cos t - t^2\sin^2 t)\,dt = \frac{1}{6}\pi^3.
\]
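The boundary formula $-\int_C y\,dx$ can also be evaluated numerically, which gives an independent check of Examples 3.8.11 and 3.8.12. The sketch below (function name `area_from_boundary` and step count are illustrative assumptions) sums $-y\,\Delta x$ over small parameter steps, taking $y$ at the midpoint of each step.

```python
import math

def area_from_boundary(x, y, a, b, n=20000):
    """Approximate -∫ y dx along the curve (x(t), y(t)), t in [a, b]."""
    total = 0.0
    for i in range(n):
        t0 = a + (b - a) * i / n
        t1 = a + (b - a) * (i + 1) / n
        y_mid = y((t0 + t1) / 2)          # y at the middle of the step
        total -= y_mid * (x(t1) - x(t0))  # contribution -y * dx
    return total

# Unit disk: boundary circle traversed counterclockwise.
disk = area_from_boundary(math.cos, math.sin, 0.0, 2 * math.pi)
# Archimedean spiral part; the segment on the x-axis contributes 0 since y = 0 there.
spiral = area_from_boundary(lambda t: t * math.cos(t), lambda t: t * math.sin(t), 0.0, math.pi)
print(disk, math.pi)
print(spiral, math.pi ** 3 / 6)
```

Reversing the direction of the parameterisation would flip the sign of the result, in line with the orientation assumption on the boundary curve.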
Exercise 3.8.11. Explain that, if the direction of the boundary curve $C$ is opposite to our
assumption, then the area is $\int_C y\,dx$.
Exercise 3.8.12. Compute the areas of the regions enclosed by the curves.
1. Ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$.
2. Astroid $x^{\frac{2}{3}} + y^{\frac{2}{3}} = 1$.
3. $\sqrt{|x|} + \sqrt{|y|} = 1$.
the curve around the x-axis, we let A(t) be the area of the surface obtained by
revolving the [a, t] segment of the curve around the x-axis. Again the subsequent
argument ignores the sign.
As the parameter is changed by ∆t, the change ∆A = A(t+∆t)−A(t) of the area
is the area of surface obtained by revolving the curve segment from (x(t), y(t)) to
(x(t + ∆t), y(t + ∆t)) = (x, y) + (∆x, ∆y). Since the curve segment is approximated
by the straight line connecting the two points, the area ∆A is approximated by the
area of the revolution of the straight line.
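[Figure: the curve segment from $(x, y)$ to $(x + \Delta x, y + \Delta y)$, of length approximately $\sqrt{\Delta x^2 + \Delta y^2}$, revolved around the x-axis; the two boundary circles have lengths $2\pi y$ and $2\pi(y + \Delta y)$.]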
and the outer arc has length 2πy(t + ∆t). Therefore the area of the partial annulus
gives the approximation
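\[
\Delta A \approx \frac{1}{2}\bigl(2\pi y(t) + 2\pi y(t + \Delta t)\bigr)\sqrt{\Delta x^2 + \Delta y^2} = \pi\bigl(y(t) + y(t + \Delta t)\bigr)\sqrt{\Delta x^2 + \Delta y^2}.
\]
Dividing by the change $\Delta t$ of the parameter, we get
\[
\frac{\Delta A}{\Delta t} \approx \pi\bigl(y(t) + y(t + \Delta t)\bigr)\sqrt{\left(\frac{\Delta x}{\Delta t}\right)^2 + \left(\frac{\Delta y}{\Delta t}\right)^2}.
\]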
This leads to
\[
\text{area of surface of revolution} = A(b) = 2\pi\int_a^b y(t)\sqrt{x'(t)^2 + y'(t)^2}\,dt.
\]
We note that $ds = \sqrt{x'(t)^2 + y'(t)^2}\,dt$ is used for computing the length of curve, and
we can write
\[
\text{area of surface of revolution} = 2\pi\int_a^b y(t)\,ds.
\]
Here y is really the distance from the curve to the axis of revolution.
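The formula $2\pi\int y\,ds$ can be tried on two familiar surfaces. This is a minimal sketch, assuming a composite Simpson rule; the helper names are illustrative, and the closed forms $4\pi$ (unit sphere) and $4\pi^2 ab$ (torus) are standard values quoted for comparison rather than results taken from the text.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def surface_area(y, dx, dy, t0, t1):
    """2*pi * integral of y(t) * sqrt(x'(t)^2 + y'(t)^2) dt over [t0, t1]."""
    return 2 * math.pi * simpson(lambda t: y(t) * math.hypot(dx(t), dy(t)), t0, t1)

# Unit sphere: revolve the semicircle (cos t, sin t), t in [0, pi].
sphere = surface_area(math.sin, lambda t: -math.sin(t), math.cos, 0.0, math.pi)

# Torus: revolve the circle (a cos t, a sin t + b), t in [0, 2*pi], with a < b.
a, b = 1.0, 3.0
torus = surface_area(
    lambda t: a * math.sin(t) + b,
    lambda t: -a * math.sin(t),
    lambda t: a * math.cos(t),
    0.0, 2 * math.pi)

print(sphere, 4 * math.pi)
print(torus, 4 * math.pi ** 2 * a * b)
```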
Example 3.8.14. The torus is obtained by revolving a circle on the upper half plane
around the x-axis. Let the radius of the circle be a and let the center of the circle
be (0, b). Then a < b and the circle may be parametrized as
(a cos θ, a sin θ + b)
With the help of Example 3.5.15 and the computation in Example 3.8.2, we have
Z 1 √
1 2 2√ 1 2
Z Z
3 1
2 2
x 1 + 4x dx = 2
x 1 + x dx = (1 + x2 ) 2 − (1 + x2 ) 2 dx
0 8 0 8 0
2 !
3 Z 2
1 1 3 2 · 1
= x(1 + x2 ) 2 + 2
−1 (1 + x2 ) 2 dx
8 2 · 32 + 1 0
2 · 3
2
+ 1 0
√ √ √
1 2 3 2 1 1 1
= 5 −
2 5 + log(2 + 5) =√ − log(2 + 5).
8 5 5 2 5 40
2π π √
So the area is √ − log(2 + 5).
5 20
Example 3.8.15 shows how to adapt the formula for the area of the surface
of revolution to the more general case of any parametrized curve (x(t), y(t)) with
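respect to a straight line $\alpha x + \beta y + \gamma = 0$. Assume the curve is on the “positive
side” of the straight line, in the sense that $\alpha x(t) + \beta y(t) + \gamma \ge 0$.
Then the distance $y(t)$ should be replaced by $\dfrac{\alpha x(t) + \beta y(t) + \gamma}{\sqrt{\alpha^2 + \beta^2}}$. We still have
$ds = \sqrt{x'(t)^2 + y'(t)^2}\,dt$. Therefore we get the general formula
\[
\text{area of surface of revolution} = 2\pi\int_a^b \frac{(\alpha x(t) + \beta y(t) + \gamma)\sqrt{x'(t)^2 + y'(t)^2}}{\sqrt{\alpha^2 + \beta^2}}\,dt.
\]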
Exercise 3.8.13. Find the formula for the area of the surface of revolution of the graph of
a function y = f (x) around the x-axis. What about revolving around the y-axis? What
about revolving around the line x = a?
Exercise 3.8.15. Find the area of the surface obtained by revolving one period of the cycloid
in Example 3.8.5 around the x-axis.
Exercise 3.8.16. Find the area of the surface obtained by revolving the involute of the circle
in Example 3.8.5 around the x-axis.
let V (x) be the volume of the part of solid obtained by revolving G[a,x] (f ) around
the x-axis. Then the change ∆V = V (x + ∆x) − V (x) is the volume of the solid
obtained by revolving G[x,x+∆x] (f ).
Example 3.8.17. The solid torus is obtained by revolving a disk in the upper half
plane around the x-axis. Let the radius of the disk be a and let the center of the disk
be $(0, b)$. Then $a < b$ and the disk is the region between $y_1(x) = b + \sqrt{a^2 - x^2}$ and
$y_2(x) = b - \sqrt{a^2 - x^2}$ over the interval $[-a, a]$. The torus is the solid obtained by
revolving $G_{[-a,a]}(y_1)$ subtracting the solid obtained by revolving $G_{[-a,a]}(y_2)$. Therefore
the volume of the torus is the volume of the first solid subtracting the second
\[
\begin{aligned}
\pi\int_{-a}^{a} y_1(x)^2\,dx - \pi\int_{-a}^{a} y_2(x)^2\,dx &= \pi\int_{-a}^{a} (y_1(x)^2 - y_2(x)^2)\,dx \\
&= \pi\int_{-a}^{a} \left((b + \sqrt{a^2 - x^2})^2 - (b - \sqrt{a^2 - x^2})^2\right)dx \\
&= \pi\int_{-a}^{a} 4b\sqrt{a^2 - x^2}\,dx \\
&= 4\pi b\int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} a^2\cos^2 t\,dt = 2\pi^2 a^2 b.
\end{aligned}
\]
Example 3.8.17 shows that, if f ≥ g ≥ 0 on [a, b], then the volume of the solid of
revolution obtained by revolving the region between f and g over [a, b] around the
x-axis is
\[
\pi\int_a^b (f(x)^2 - g(x)^2)\,dx.
\]
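The torus volume $2\pi^2 a^2 b$ obtained from this formula can be confirmed numerically. A minimal sketch, assuming a composite Simpson rule and the substitution $x = a\sin t$ (the same substitution used in the torus computation) so that the integrand stays smooth at the endpoints; the function names are illustrative.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def torus_volume(a, b, n=2000):
    """pi * integral of (y1^2 - y2^2) dx over [-a, a], computed with x = a sin t."""
    def integrand(t):
        x = a * math.sin(t)
        root = math.sqrt(max(a * a - x * x, 0.0))  # guard against rounding below 0
        y1, y2 = b + root, b - root
        return (y1 ** 2 - y2 ** 2) * a * math.cos(t)  # dx = a cos t dt
    return math.pi * simpson(integrand, -math.pi / 2, math.pi / 2, n)

a, b = 1.0, 3.0
print(torus_volume(a, b), 2 * math.pi ** 2 * a ** 2 * b)
```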
Now we extend the discussion before Example 3.8.11 about calculating the area
of a plane region by integrating along the boundary curve. Suppose the region X
in Figure 3.8.2 lies in the upper half plane. Then similar to the earlier discussion,
So all the earlier discussion about the area can be applied to the volume of the solid
of revolution.
Example 3.8.20. Consider the region enclosed by the Archimedean spiral and the
x-axis in Example 3.8.12. The volume of the solid obtained by revolving the region
around the x-axis is
\[
-\pi\int_0^{\pi} (t\sin t)^2\,d(t\cos t) = -\pi\int_0^{\pi} (t^2\sin^2 t\cos t - t^3\sin^3 t)\,dt = \pi\left(\frac{2}{3}\pi^3 - 4\pi\right).
\]
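As a numerical check of the boundary formula $-\pi\int_C y^2\,dx$, the sketch below evaluates it for the spiral region and, as a sanity test, for the upper half of the unit disk, whose solid of revolution is the unit ball of volume $\frac{4}{3}\pi$ (a standard value, not taken from the text). It assumes a composite Simpson rule; the helper names are illustrative.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def volume_from_boundary(y, dx, t0, t1):
    """-pi * integral of y(t)^2 x'(t) dt; boundary pieces on the x-axis add 0."""
    return -math.pi * simpson(lambda t: y(t) ** 2 * dx(t), t0, t1)

# Upper half disk: boundary arc (cos t, sin t), t in [0, pi].
ball = volume_from_boundary(math.sin, lambda t: -math.sin(t), 0.0, math.pi)

# Archimedean spiral region: x = t cos t, y = t sin t, t in [0, pi].
spiral_volume = volume_from_boundary(
    lambda t: t * math.sin(t),
    lambda t: math.cos(t) - t * math.sin(t),
    0.0, math.pi)

print(ball, 4 * math.pi / 3)
print(spiral_volume, math.pi * (2 * math.pi ** 3 / 3 - 4 * math.pi))
```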
Exercise 3.8.17. Find volume of the solid obtained by revolving the region in Exercise
3.8.12 around the x-axis.
Exercise 3.8.19. Explain that, if the direction of the boundary curve C is opposite to our
assumption, then the volume of the solid of revolution around the x-axis is $\pi\int_C y^2\,dx$.
Next we consider the general case of revolving a region X around a straight line
$L : \alpha x + \beta y + \gamma = 0$. We assume X is on the “positive side” of L in the sense that
the parameterisation $(x(t), y(t))$ of the boundary curve C of X satisfies
$\alpha x(t) + \beta y(t) + \gamma \ge 0$.
Moreover, we still assume that the direction of C satisfies our assumption. Then in
the formula $-\pi\int_C y^2\,dx$, y should be understood as the distance $\dfrac{\alpha x(t) + \beta y(t) + \gamma}{\sqrt{\alpha^2 + \beta^2}}$
from C to L, and dx should be understood as the progression
along the direction $\dfrac{(\beta, -\alpha)}{\sqrt{\alpha^2 + \beta^2}}$ of L. Therefore the volume of the solid of revolution
is
\[
-\pi\int_a^b \frac{(\alpha x(t) + \beta y(t) + \gamma)^2(\beta x'(t) - \alpha y'(t))}{(\sqrt{\alpha^2 + \beta^2})^3}\,dt.
\]
We note that the negative sign is due to the mismatch (See Figure 3.8.4) of the
direction of the boundary curve and the direction of the progression along L. In
general, we may determine the sign by comparing the direction of the parameter
and the direction of progression.
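[Figure: the region X on the positive side of the line $L : \alpha x + \beta y + \gamma = 0$, the normal direction $(\alpha, \beta)$, and the progression along L.]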
For the special case that X is above the horizontal line $y = b$ ($(\alpha, \beta) = (0, 1)$),
the volume of the solid of revolution around the line is $-\pi\int_C (y - b)^2\,dx$. If X is on
the right of the y-axis (i.e., the line $x = 0$, with $(\alpha, \beta) = (1, 0)$), then the volume of
the solid of revolution around the y-axis is
\[
-\pi\int_a^b x(t)^2(-y'(t))\,dt = \pi\int_C x^2\,dy.
\]
The negative sign in front of $y'$ comes from the fact that the progression for the line
$x = 0$ goes downwards, the opposite of the y-direction. If X is on the right of the
vertical line $x = a$, then the volume of the solid of revolution around the vertical
line is $\pi\int_C (x - a)^2\,dy$.
Example 3.8.21. Take the segment $y = x^2$, $x \in [0, 1]$, of the parabola in Example
3.8.2. If we revolve the region X between the parabola and the x-axis around the
x-axis, then the volume of the solid is
\[
\pi\int_{x=0}^{x=1} y^2\,dx = \pi\int_0^1 (x^2)^2\,dx = \frac{1}{5}\pi.
\]
If we revolve X around the y-axis, then the volume of the solid is
\[
\pi\int_{x=0}^{x=1} x^2\,dy = \pi\int_{y=0}^{y=1} y\,dy = \frac{1}{2}\pi.
\]
Let Y be the region between the parabola and the vertical line x = 1. If we
revolve Y around the vertical line x = 1, then the volume of the solid is
\[
\pi\int_{x=0}^{x=1} (1 - x)^2\,dy = \pi\int_0^1 (1 - x)^2\,d(x^2) = \frac{1}{6}\pi.
\]
If we revolve Y around the y-axis instead, then the volume of the solid is
\[
\pi\int_{x=0}^{x=1} (1 - x^2)\,dy = \pi\int_{y=0}^{y=1} (1 - y)\,dy = \frac{1}{2}\pi.
\]
Let Z be the region between the parabola $y = x^2$ and the diagonal $y = x$. If we
revolve Z around the x-axis, then the volume of the solid is
\[
\pi\int_{x=0}^{x=1} (x^2 - (x^2)^2)\,dx = \frac{2}{15}\pi.
\]
If we revolve Z around the line $x = -1$, then the volume of the solid is
\[
\begin{aligned}
\pi\int_{x=0}^{x=1} \left((x + 1)^2\,d(x^2) - (x + 1)^2\,dx\right) &= \pi\int_0^1 (x + 1)^2(2x - 1)\,dx \\
&= \pi\int_1^2 z^2(2z - 3)\,dz = \frac{1}{2}\pi.
\end{aligned}
\]
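The volumes in this example that involve a vertical axis can be cross-checked with cylindrical shells, $2\pi\int (\text{radius})(\text{height})\,dx$, a formula the text derives later in Example 3.8.27. The sketch below is illustrative only: it assumes a composite Simpson rule, and the names `shell_volume` and `washer_volume` are not from the text.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def shell_volume(radius, height, a, b):
    """Cylindrical shells: 2*pi * integral of radius(x) * height(x) dx."""
    return 2 * math.pi * simpson(lambda x: radius(x) * height(x), a, b)

def washer_volume(outer, inner, a, b):
    """Washers around the x-axis: pi * integral of (outer^2 - inner^2) dx."""
    return math.pi * simpson(lambda x: outer(x) ** 2 - inner(x) ** 2, a, b)

# X: below y = x^2 over [0, 1], revolved around the y-axis.
vol_X_y_axis = shell_volume(lambda x: x, lambda x: x ** 2, 0.0, 1.0)
# Y: the same region viewed from the line x = 1, revolved around x = 1.
vol_Y_line_x1 = shell_volume(lambda x: 1 - x, lambda x: x ** 2, 0.0, 1.0)
# Z: between y = x^2 and y = x, revolved around x = -1 and around the x-axis.
vol_Z_line_xm1 = shell_volume(lambda x: x + 1, lambda x: x - x ** 2, 0.0, 1.0)
vol_Z_x_axis = washer_volume(lambda x: x, lambda x: x ** 2, 0.0, 1.0)

print(vol_X_y_axis / math.pi, vol_Y_line_x1 / math.pi,
      vol_Z_line_xm1 / math.pi, vol_Z_x_axis / math.pi)
```

The four printed ratios should be close to $\frac12$, $\frac16$, $\frac12$ and $\frac{2}{15}$ respectively.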
\[
\frac{dx + dx}{\sqrt{2}} = \sqrt{2}\,dx \quad\text{and}\quad \frac{dx + d(x^2)}{\sqrt{2}} = \frac{1 + 2x}{\sqrt{2}}\,dx.
\]
Exercise 3.8.21. Let A ≤ f (x) ≤ B for x ∈ [a, b]. Find the formula for the volume of the
solid of revolution of the region between the graph of function f and y = A around the
line $y = C$, where $C \notin (A, B)$.
Exercise 3.8.22. Find the formula for the volume of the solid obtained by revolving a region
X for which the parameterised boundary has the right direction.
3. X is below y = b, around y = b.
Exercise 3.8.23. Find the volume of the solid obtained by revolving the region between the
curve and the axis of revolution.
1. $y = x^3$, $x \in [0, 2]$, around x-axis.
2. $x^2 = 2py$, $x \in [0, 1]$, around y-axis.
3. $y = e^x$, $x \in [0, 1]$, around x-axis.
4. $y = e^x$, $x \in [0, 1]$, around y-axis.
5. $y = e^x$, $x \in [0, 1]$, around $x = 1$.
6. $y = \tan x$, $x \in \left[0, \frac{\pi}{4}\right]$, around x-axis.
7. $y^2 = \frac{e^x + e^{-x}}{2}$, $x \in [-a, a]$, around x-axis.
8. $y^2 = x^3$, $x \in [0, 1]$, around x-axis.
9. Ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} \le 1$, around x-axis.
10. Astroid $x^{2/3} + y^{2/3} \le 1$, around x-axis.
formula $\pi\int_a^b (f(x)^2 - g(x)^2)\,dx$ for the solid of revolution.
Example 3.8.24. Let R be a region in the plane. Let P be a point not in the plane.
Connecting P to all points in R by straight lines produces the pyramid X with base
R and apex P .
We may put R on the (x, y)-plane in R3 and assume that P = (0, 0, h) lies in the
positive z-axis, where h is the distance from P to the plane. Let A be the area of R.
We decompose the pyramid by the horizontal planes, so that z is the distance. The
section $X_z$ is similar to R, so that the area of $X_z$ is proportional to the square of its
distance $h - z$ to P. We find the area of $X_z$ to be $\left(\frac{h - z}{h}\right)^2 A$, and the volume of
the pyramid is
\[
\int_0^h \left(\frac{h - z}{h}\right)^2 A\,dz = \frac{1}{3}hA.
\]
Example 3.8.25. Let X be the intersection of two round solid cylinders of radius 1 in
orthogonal position. We put the two cylinders in $\mathbb{R}^3$, by assuming the two cylinders
to be $x^2 + y^2 \le 1$ and $x^2 + z^2 \le 1$. Then we decompose the solid by intersecting with
the planes perpendicular to the x-axis. The section $X_x$ is a square of side length
$2\sqrt{1 - x^2}$ and therefore has area $4(1 - x^2)$. The volume of the intersection solid X
is
\[
\int_{-1}^{1} 4(1 - x^2)\,dx = \frac{16}{3}.
\]
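Both section computations (the pyramid and the two intersecting cylinders) integrate a section area along the decomposition direction. A minimal sketch of that idea, assuming a composite Simpson rule; the helper name `volume_by_sections` and the sample values of h and A are illustrative.

```python
def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def volume_by_sections(section_area, a, b):
    """Volume = integral of the section area over [a, b]."""
    return simpson(section_area, a, b)

# Pyramid with base area A and height h: the section at height z has area ((h - z)/h)^2 * A.
h, A = 5.0, 2.0
pyramid = volume_by_sections(lambda z: ((h - z) / h) ** 2 * A, 0.0, h)

# Intersection of the cylinders x^2 + y^2 <= 1 and x^2 + z^2 <= 1:
# the section at x is a square of area 4 (1 - x^2).
bicylinder = volume_by_sections(lambda x: 4 * (1 - x * x), -1.0, 1.0)

print(pyramid, h * A / 3)
print(bicylinder, 16 / 3)
```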
Exercise 3.8.25. Explain the formula in Section 3.8.3 for the area of surface of revolution
by using suitable equidistant decomposition.
1. Ellipsoid $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \le 1$.
2. Solid bounded by $\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1$ and $z = \pm c$.
3. Intersection of the sphere $x^2 + y^2 + z^2 \le 1$ and the cylinder $x^2 + y^2 \le x$.
1. A solid with a disk as the base, and the parallel sections perpendicular with the
base are equilateral triangles.
2. A solid with a disk as the base, and the parallel sections perpendicular with the
base are squares.
3. Cylinder cut by two planes, one perpendicular to the cylinder and the other forming an
angle α with the cylinder. The two planes do not intersect inside the cylinder.
4. Cylinder cut by two planes forming respective angles α and β with the cylinder.
The two planes do not intersect inside the cylinder.
5. A wedge cut out of a cylinder, by two planes forming respective angles α and β with
the cylinder, such that the intersection of two planes is a diameter of the cylinder.
So far we used parallel lines and planes to construct the decomposition. We may
also use equidistant curves and surfaces to construct the decomposition.
For example, for the disk centered at $(1, 0)$ and of radius 1, we have $\varphi = -\arccos\frac{r}{2}$
and $\psi = \arccos\frac{r}{2}$, $r \in [0, 2]$. The area of the disk of radius 1 is (taking
$t = \arccos\frac{r}{2}$, $r = 2\cos t$)
\[
\begin{aligned}
\int_0^2 2r\arccos\frac{r}{2}\,dr &= \int_{\frac{\pi}{2}}^{0} 2t(2\cos t)\,d(2\cos t) = 8\int_0^{\frac{\pi}{2}} t\cos t\sin t\,dt \\
&= 4\int_0^{\frac{\pi}{2}} t\sin 2t\,dt = \int_0^{\pi} u\sin u\,du = \pi.
\end{aligned}
\]
Example 3.8.27. Let X be a region in the right plane (i.e., the right side of y-axis).
Let Y be the solid obtained by revolving X around the y-axis. We may use the
cylinders centered at the y-axis to decompose Y . The decomposition is equidistant,
with x as the distance. Let Xx be the intersection of X with the vertical line
x × R. Then the section Yx is the cylinder obtained by revolving Xx around the
y-axis. The area of the section is $2\pi x\,(\text{length of } X_x)$. Therefore the volume of the
solid of revolution is $2\pi\int_a^b x\,(\text{length of } X_x)\,dx$. In particular, if X is the region
between functions $f(x)$ and $g(x)$, where $f(x) \ge g(x)$ on $[a, b]$, then the volume is
\[
2\pi\int_a^b x(f(x) - g(x))\,dx.
\]
Exercise 3.8.29. Compute the volumes of the solids of revolution in Example 3.8.21 by
using the formula in Example 3.8.27.
Exercise 3.8.30. Compute the volumes of the solids of revolution in Exercise 3.8.23 by
using the formula in Example 3.8.27.
Exercise 3.8.31. Compute the volumes of the solids of revolution in Exercise 3.8.24 by
using the formula in Example 3.8.27.
Exercise 3.8.32. After Example 3.8.21, we presented the formula for computing the volume
of a solid obtained by revolving a region in R2 bounded by a parameterized curve. Can
you derive the similar formula by using the idea from Example 3.8.27?
Exercise 3.8.33. In Section 3.8.4 and Example 3.8.27, we have two ways of computing the
volume of a solid of revolution. For the following simple case, explain that the two ways
give the same result: Let $f(x)$ be an invertible non-negative function on $[0, a]$, such that
$f(a) = 0$ and both $f(x)$ and $f^{-1}(y)$ are continuously differentiable. The solid is obtained
by revolving the region between the graph of f and the two axes.
Example 3.8.28. Finally, we compute the size of high dimensional objects. Let $\alpha_n$
be the volume of the n-dimensional sphere $S^n$ of radius 1. Then
\[
\alpha_0 = 2, \quad \alpha_1 = 2\pi, \quad \alpha_2 = 4\pi.
\]
Thus
\[
\alpha_n = 2\alpha_{n-1}I_{n-1} = 4\alpha_{n-2}I_{n-1}I_{n-2} = 4\alpha_{n-2}\frac{\pi}{2(n - 1)} = \frac{2\pi}{n - 1}\alpha_{n-2}.
\]
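The recursion $\alpha_n = \frac{2\pi}{n-1}\alpha_{n-2}$ with $\alpha_0 = 2$, $\alpha_1 = 2\pi$ is easy to iterate. The sketch below does so and compares the result with the closed form $2\pi^{(n+1)/2}/\Gamma\!\left(\frac{n+1}{2}\right)$, a standard expression quoted here for comparison and not derived in the text; the function names are illustrative.

```python
import math

def sphere_volumes(n_max):
    """alpha_0, ..., alpha_{n_max} from alpha_n = 2*pi/(n - 1) * alpha_{n-2}."""
    alpha = [2.0, 2 * math.pi]
    for n in range(2, n_max + 1):
        alpha.append(2 * math.pi / (n - 1) * alpha[n - 2])
    return alpha[: n_max + 1]

def closed_form(n):
    """Standard closed form for the size of the unit n-sphere in R^(n+1)."""
    return 2 * math.pi ** ((n + 1) / 2) / math.gamma((n + 1) / 2)

alpha = sphere_volumes(10)
for n, value in enumerate(alpha):
    print(n, value, closed_form(n))
```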
1. $x = 1$.
2. $y = -1$.
3. $x + y = 1$.
4. $x = y^2$.
5. $x^2 + y^2 = x$.
6. $xy = 1$.
Exercise 3.9.3. What is the polar equation of the curve obtained by flipping r = f (θ) with
respect to the origin? Then use your conclusion to find the curve r = − cos θ.
Exercise 3.9.4. What is the relation between the curves r = f (θ) and r = −f (θ + π)?
Exercise 3.9.5. What is the polar equation of the curve obtained by rotating r = f (θ) by
angle α? Then use your conclusion to answer the following.
1. What is the curve r = sin θ?
2. Find the polar equation for a general circle passing through the origin.
Example 3.9.2. The Archimedean spiral is r = θ. Note that r < 0 when θ < 0, so
that a flipping with respect to the origin is needed when we draw the part of the
spiral corresponding to θ < 0. The symmetry with respect to the y-axis is due to
the fact that if (r, θ) satisfies r = θ, then (−r, −θ) also satisfies r = θ.
Fermat’s spiral is $r^2 = \theta$. The symmetry with respect to the origin is due
to the fact that if $(r, \theta)$ satisfies $r^2 = \theta$, then $(-r, \theta)$ also satisfies $r^2 = \theta$.
Example 3.9.3. The curve $r = 1 + \cos\theta$ is a cardioid. Its counterclockwise rotation by 90° is
another cardioid $r = 1 + \sin\theta$. More generally, the curve $r = a + \cos\theta$ is a limaçon.
The curve intersects itself when $|a| < 1$ and does not intersect itself when $|a| > 1$.
The symmetry with respect to the x-axis is due to the fact that if $(r, \theta)$ satisfies
$r = a + \cos\theta$, then $(r, -\theta)$ also satisfies the equation.
[Figure: the cardioids $r = 1 + \cos\theta$ and $r = 1 + \sin\theta$.]
If we imagine the rolling circle C as part of a rolling disk D, and fix a point
in D at distance d from the center of C, then the track traced by the point is the
limaçon $r = 1 + 2d\cos\theta$, with the origin of the polar coordinate being a point of
distance d from the center of A.
Example 3.9.4. The curve $r = \cos 2\theta$ is the four-leaved rose, and $r = \cos 3\theta$ is the
three-leaved rose. The circle $r = \cos\theta$ can be considered as the one-leaved rose.
In general, the curve $r = \cos n\theta$ can be described as follows. For θ in the arc
$I = \left[-\frac{\pi}{2n}, \frac{\pi}{2n}\right]$, the value of r goes from 0 to 1 and then back to 0, so that the
corresponding curve is one leaf occupying angle $\frac{\pi}{n}$ of the whole circle. This is the
leaf in $\left[-\frac{\pi}{4}, \frac{\pi}{4}\right]$ for $n = 2$ and in $\left[-\frac{\pi}{6}, \frac{\pi}{6}\right]$ for $n = 3$. For θ in the second arc $I + \frac{\pi}{n}$,
we need to rotate this first leaf by angle $\frac{\pi}{n}$ and then flip it with respect to the
origin (because r becomes negative), which gives a leaf occupying $I + \frac{\pi}{n} + \pi$. This
is the leaf in $\left[\frac{5\pi}{4}, \frac{7\pi}{4}\right]$ for $n = 2$ and in $\left[\frac{7\pi}{6}, \frac{9\pi}{6}\right]$ for $n = 3$. For θ in the third arc
$I + \frac{2\pi}{n}$, we get the leaf obtained by rotating the first leaf by angle $\frac{2\pi}{n}$ (no flipping
needed now because r becomes non-negative again), which gives a leaf occupying
$I + \frac{2\pi}{n}$. This is the leaf in $\left[\frac{3\pi}{4}, \frac{5\pi}{4}\right]$ for $n = 2$ and in $\left[\frac{3\pi}{6}, \frac{5\pi}{6}\right]$ for $n = 3$. Keep
going, we see two distinct patterns depending on the parity of n.
Figure 3.9.6: Four-leaved rose r = cos 2θ and three-leaved rose r = cos 3θ.
More generally, we may consider $r = \cos p\theta$. Again we get the first leaf occupying
$I = \left[-\frac{\pi}{2p}, \frac{\pi}{2p}\right]$, the second leaf occupying $I + \frac{\pi}{p} + \pi$, the third leaf occupying $I + \frac{2\pi}{p}$,
etc. The pattern could be very complicated, depending on whether p is rational or
irrational, and in case p is rational, the parity of the numerator and denominator of
p.
Finally, $r = \sin p\theta$ is obtained by rotating $r = \cos p\theta$ by $\frac{\pi}{2p}$. We also get many
leaved roses by other rotations.
For the area in terms of polar coordinate, assume f ≥ 0 and consider the region
X[α,β] (f ) bounded by r = f (θ), θ ∈ [α, β], and the rays θ = α and θ = β. Using the
idea of Section 3.1.1, let A(θ) be the area of the region X[α,θ] (f ). Then the change
A(θ + h) − A(θ) is the area of X[θ,θ+h] (f ). Since X[θ,θ+h] (f ) is sandwiched between
fans of angle between θ, θ + h and radii m = min[θ,θ+h] f , M = max[θ,θ+h] f , we get
\[
\frac{1}{2}m^2 h \le A(\theta + h) - A(\theta) \le \frac{1}{2}M^2 h.
\]
Here the left and right sides are the known areas of the fans. The inequality is the
same as
\[
\frac{1}{2}m^2 \le \frac{A(\theta + h) - A(\theta)}{h} \le \frac{1}{2}M^2.
\]
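Example 3.9.6. Let $p > \frac{1}{2}$. Then one leaf of the rose $r = \cos p\theta$ is from the angle
$-\frac{\pi}{2p}$ to the angle $\frac{\pi}{2p}$. The length of the leaf is
\[
\begin{aligned}
\int_{-\frac{\pi}{2p}}^{\frac{\pi}{2p}} \sqrt{(\cos p\theta)^2 + ((\cos p\theta)')^2}\,d\theta &= \int_{-\frac{\pi}{2p}}^{\frac{\pi}{2p}} \sqrt{\cos^2 p\theta + p^2\sin^2 p\theta}\,d\theta \\
&= \int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} \sqrt{p^{-2}\cos^2 t + \sin^2 t}\,dt \\
&= \frac{1}{2}\int_0^{2\pi} \sqrt{1 + (p^{-2} - 1)\cos^2 t}\,dt.
\end{aligned}
\]
This is the elliptic integral in Example 3.8.1. Moreover, the area of the leaf is
\[
\frac{1}{2}\int_{-\frac{\pi}{2p}}^{\frac{\pi}{2p}} (\cos p\theta)^2\,d\theta = \frac{\pi}{4p}.
\]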
Example 3.9.7. The cardioid $r = 1 + \cos\theta$ and the circle $r = 3\cos\theta$ intersect at
$\theta = \pm\frac{\pi}{3}$. The area of the region outside the cardioid and inside the circle is
\[
\frac{1}{2}\int_{-\frac{\pi}{3}}^{\frac{\pi}{3}} \left((3\cos\theta)^2 - (1 + \cos\theta)^2\right)d\theta = \pi.
\]
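The two polar areas above use $\frac{1}{2}\int r^2\,d\theta$, applied to an outer and an inner curve when the region lies between two curves. A minimal numerical sketch, assuming a composite Simpson rule; the helper name `polar_area` and the choice $p = 3$ are illustrative.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def polar_area(r_outer, r_inner, alpha, beta):
    """(1/2) * integral of (r_outer^2 - r_inner^2) d(theta) over [alpha, beta]."""
    return 0.5 * simpson(lambda th: r_outer(th) ** 2 - r_inner(th) ** 2, alpha, beta)

# One leaf of the rose r = cos(p * theta), here with p = 3.
p = 3.0
leaf = polar_area(lambda th: math.cos(p * th), lambda th: 0.0,
                  -math.pi / (2 * p), math.pi / (2 * p))

# Region inside the circle r = 3 cos(theta) and outside the cardioid r = 1 + cos(theta).
crescent = polar_area(lambda th: 3 * math.cos(th), lambda th: 1 + math.cos(th),
                      -math.pi / 3, math.pi / 3)

print(leaf, math.pi / (4 * p))
print(crescent, math.pi)
```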
Example 3.9.8. We try to find the volume of the solid of revolution obtained by
revolving the region between the two leaves of the limaçon r = a + cos θ, 0 < a < 1,
around the x-axis. In the cartesian coordinate, the curve is parameterized by
Let θ = α at the origin O. Then the volume we are looking for is the volume of
the solid of revolution from θ = 0 to θ = α, subtracting the volume of the solid of
Exercise 3.9.8. What is the length of the limaçon? What is the area of the region enclosed by
the limaçon? Note that for |c| > 1, we have two parts of the limaçon and two regions.
Exercise 3.9.9. Find length of the part of the cardioid r = 1 + cos θ in the first quadrant.
Moreover, find the area of the region enclosed by this part and the two axes.
Exercise 3.9.10. Find the area of the region enclosed by strophoid r = 2 cos θ − sec θ.
\[
W'(x) = \lim_{\Delta x \to 0} \frac{W(x + \Delta x) - W(x)}{\Delta x} = \lim_{\Delta x \to 0} \frac{\Delta W}{\Delta x} = F(x).
\]
This implies that the work done for the whole trip from a to b is
\[
W(b) = \int_a^b F(x)\,dx.
\]
Example 3.10.1. Suppose one end of a spring is fixed and the other end is attached
to an object. In the natural position, when the spring is neither stretched nor
compressed, no force is exerted on the object. When the position of the object
deviates from the natural position by x, however, Hooke’s law says that the spring
exerts a force $F(x) = -kx$ on the object. Here k is the spring constant, and the
negative sign indicates that the direction of the force is opposite to the direction of
the deviation.
If the object starts at distance a from its natural position, then the work done
by the spring in pulling the object to its natural position is
\[
\int_0^a kx\,dx = \frac{k}{2}a^2.
\]
Here we use the positive sign because the direction of movement is the same as the
direction of the force.
The argument about the work done by a force is quite typical. In general, if a
quantity is additive, then the quantity can be decomposed into small pieces. The
estimation of each small piece tells us the change of the quantity. The whole quantity
is then the integration of the change.
In the subsequent examples, we will only analyze a small piece of an additive
quantity. We will omit the limit part of the argument and directly write down the
corresponding integration.
Example 3.10.2. We want to find the work it takes to pump a bucket of liquid out
of the top of the bucket.
x
H
h ∆x
Suppose the bucket has base radius r, top radius R, and height H. Suppose
the liquid has density ρ and depth h. We decompose the liquid into horizontal sections.
At distance x from the top, the section is a disk of radius $r(x)$ satisfying
\[
\frac{r(x) - r}{R - r} = \frac{H - x}{H}.
\]
The liquid of thickness $\Delta x$ and at distance x from the top has (approximate) weight
$g\rho\pi r(x)^2\Delta x$ (g is the gravitational acceleration). The work it takes to lift this piece
of liquid to the top of the bucket is $\Delta W \approx (g\rho\pi r(x)^2\Delta x)x = \pi g\rho x r(x)^2\Delta x$. Since the
liquid spans from $x = H - h$ to $x = H$, the total work needed is
\[
\begin{aligned}
W &= \pi g\rho\int_{H-h}^{H} x r(x)^2\,dx = \frac{\pi g\rho}{H^2}\int_{H-h}^{H} x\left[(R - r)(H - x) + rH\right]^2 dx \\
&= \pi g\rho H^2R^2\left(a^2 b + \frac{1}{2}a(2 - 3a)b^2 + \frac{1}{3}(1 - a)(1 - 3a)b^3 - \frac{1}{4}(1 - a)^2 b^4\right),
\end{aligned}
\]
where $a = \frac{r}{R}$ and $b = \frac{h}{H}$.
R H
l(x) − l H −x
= .
L−l H
The force exercised on the strip is ∆F ≈ (ρx)l(x)∆x. Since the water spans from
x = 0 to x = H, the total force
Z H Z H
L−l 1
F =ρ xl(x)dx = ρ x L− x dx = ρH 2 (L + 2l).
0 0 H 6
x
H l(x)
∆x
Exercise 3.10.1. A spring has natural length a. If the force F is needed to stretch the
spring to length b, how much work is needed to stretch the spring from the natural length
to the length b?
Exercise 3.10.2. A ball of radius R is full of liquid of density ρ. Due to the gravity, the
liquid leaks out of a hole at the bottom of the ball. How much work is done by the gravity
in draining all the liquid?
Exercise 3.10.3. A circular disk of radius r is fully submerged in liquid of density ρ, such
that the center of the disk is at depth h. What is the force exercised by the liquid on one
side of the plate? Note that the plate may be inclined at some angle.
Exercise 3.10.4. A ball of radius r is fully submerged in liquid of density ρ, such that the
center of the ball is at depth h. What is the force exercised by the liquid on the ball?
Exercise 3.10.5. A cable of mass m and length l has a mass M tied to the lower end. How
much work is done in using the cable to lift the mass M to the top end of the cable?
Exercise 3.10.6. Newton’s law of gravitation says that two bodies with masses m and M
attract each other with a force $F = \frac{gmM}{d^2}$, where d is the distance between the bodies.
Suppose the radius of the earth is R and the mass is M. How much work is needed to
launch a satellite of mass m vertically to a circular orbit of height H? What is the minimal
initial velocity needed for the satellite to escape the earth’s gravity?
Then the system is decomposed into n pieces. The i-th piece can be approximately
considered as a mass $m_i = \rho(x_i^*)(x_i - x_{i-1})$ located at $x_i^*$, for some $x_i^* \in [x_{i-1}, x_i]$.
The whole system is approximated by the system of n pieces, and has approximate
center of mass
\[
\bar{x}_P = \frac{\sum_{i=1}^n \rho(x_i^*)(x_i - x_{i-1})x_i^*}{\sum_{i=1}^n \rho(x_i^*)(x_i - x_{i-1})}.
\]
The numerator is the Riemann sum of the function $x\rho(x)$ and the denominator is the
Riemann sum of the function $\rho(x)$ (see the beginning of Section 3.3.1). Therefore
as the partition gets more and more refined, the limit becomes the center of mass
\[
\bar{x} = \frac{\int_a^b x\rho(x)\,dx}{\int_a^b \rho(x)\,dx}.
\]
Now consider masses distributed along a curve $(x(t), y(t))$, $t \in [a, b]$, with the density
$\rho(t)$ at location t. Take a partition P of $[a, b]$. The curve is approximated by
straight line segments connecting $(x(t_{i-1}), y(t_{i-1}))$ to $(x(t_i), y(t_i))$. The i-th straight
line segment has length $\Delta s_i = \sqrt{(x(t_i) - x(t_{i-1}))^2 + (y(t_i) - y(t_{i-1}))^2}$ and can be
approximately considered as a mass $m_i = \rho(t_i^*)\Delta s_i$ located at $(x(t_i^*), y(t_i^*))$, for some
$t_i^* \in [t_{i-1}, t_i]$. The whole system is approximated by the system of n pieces, and has
approximate center of mass
\[
\bar{x}_P = \frac{\sum_{i=1}^n (\rho(t_i^*)\Delta s_i)x(t_i^*)}{\sum_{i=1}^n \rho(t_i^*)\Delta s_i}, \quad
\bar{y}_P = \frac{\sum_{i=1}^n (\rho(t_i^*)\Delta s_i)y(t_i^*)}{\sum_{i=1}^n \rho(t_i^*)\Delta s_i}.
\]
As the partition gets more and more refined, the limit becomes the center of mass
\[
\bar{x} = \frac{\int_a^b x(t)\rho(t)\,ds}{\int_a^b \rho(t)\,ds}, \quad
\bar{y} = \frac{\int_a^b y(t)\rho(t)\,ds}{\int_a^b \rho(t)\,ds}, \quad
ds = \sqrt{x'(t)^2 + y'(t)^2}\,dt.
\]
Example 3.10.4. For constant density $\rho(x) = \rho$ distributed on the interval, the center
of mass is the middle point
\[
\bar{x} = \frac{\int_a^b x\rho\,dx}{\int_a^b \rho\,dx} = \frac{\frac{1}{2}\rho(b^2 - a^2)}{\rho(b - a)} = \frac{a + b}{2}.
\]
If the density is $\rho(x) = \lambda + \mu x$, which is linearly increasing, then the center of mass
is
\[
\bar{x} = \frac{\int_a^b x(\lambda + \mu x)\,dx}{\int_a^b (\lambda + \mu x)\,dx} = \frac{3\lambda(a + b) + 2\mu(a^2 + ab + b^2)}{3(2\lambda + \mu(a + b))}.
\]
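The closed form for the linear density can be compared with the defining ratio of integrals. A minimal sketch, assuming a composite Simpson rule; the helper names and the sample values of λ, µ, a, b are illustrative.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def center_of_mass(rho, a, b):
    """Ratio (integral of x*rho(x)) / (integral of rho(x)) over [a, b]."""
    return simpson(lambda x: x * rho(x), a, b) / simpson(rho, a, b)

def linear_density_center(lam, mu, a, b):
    """Closed form for rho(x) = lam + mu*x."""
    return (3 * lam * (a + b) + 2 * mu * (a * a + a * b + b * b)) / (3 * (2 * lam + mu * (a + b)))

lam, mu, a, b = 2.0, 0.5, 1.0, 4.0
numeric = center_of_mass(lambda x: lam + mu * x, a, b)
print(numeric, linear_density_center(lam, mu, a, b))
print(center_of_mass(lambda x: 7.0, a, b), (a + b) / 2)  # constant density
```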
Example 3.10.5. Consider the semi-circular curve of radius r and constant density ρ.
We have
x = r cos θ, y = r sin θ, ds = rdθ, 0 ≤ θ ≤ π,
Exercise 3.10.7. Find the center of mass of the parabola $y = x^2$, $x \in [0, 2]$, of constant
density.
Exercise 3.10.8. Find the center of mass of a triangle of constant density and with vertices
at $(-1, 0)$, $(0, \sqrt{15})$ and $(7, 0)$.
Exercise 3.10.9. Let m[a,b] and x̄[a,b] be the mass and the center of mass of a distribution
of masses on [a, b] with the density ρ(x). Let [a, b] = [a, c] ∪ [c, b] and similarly introduce
m[a,c] , m[c,b] , x̄[a,c] , x̄[c,b] . Show that the center of mass has the distribution property
Series
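\[
\sum_{n=1}^{\infty} a_n = a_1 + a_2 + a_3 + \cdots.
\]
\[
\begin{aligned}
\sum_{n=0}^{\infty} r^n &= 1 + r + r^2 + r^3 + \cdots + r^n + \cdots, \\
\sum_{n=1}^{\infty} n &= 1 + 2 + 3 + \cdots + n + \cdots, \\
\sum_{n=1}^{\infty} \frac{1}{n(n + 1)} &= \frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + \frac{1}{3\cdot 4} + \cdots + \frac{1}{n(n + 1)} + \cdots, \\
\sum_{n=1}^{\infty} \frac{1}{n^p} &= 1 + \frac{1}{2^p} + \frac{1}{3^p} + \cdots + \frac{1}{n^p} + \cdots, \\
\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} &= 1 - \frac{1}{2} + \frac{1}{3} - \cdots + \frac{(-1)^{n+1}}{n} + \cdots, \\
\sum_{n=0}^{\infty} \frac{1}{n!} &= 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!} + \cdots.
\end{aligned}
\]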
If the partial sum converges, then the series converges and has sum (or value)
\[
\sum_{n=1}^{\infty} a_n = \lim_{n\to\infty} s_n.
\]
If finitely many terms in a series are modified, added or dropped, then the new
partial sum $s'_n$ and the original partial sum $s_n$ are related by $s'_n = s_{n+n_0} + C$ for
some constants $n_0$ and C. This implies that the convergence of series is not affected,
although the sum may be affected.
The arithmetic properties of the sequence limit imply
\[
\sum (a_n + b_n) = \sum a_n + \sum b_n, \quad \sum ca_n = c\sum a_n.
\]
However, there is no formula for $\sum a_n b_n$ or $\sum \frac{a_n}{b_n}$.
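Therefore $s_n = \dfrac{1 - r^{n+1}}{1 - r}$ and
\[
\sum_{n=0}^{\infty} r^n = \begin{cases} \dfrac{1}{1 - r}, & \text{if } |r| < 1, \\ \text{diverges}, & \text{if } |r| \ge 1. \end{cases}
\]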
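Example 4.1.2. The computation in Example 1.3.1 gives the partial sum of $\sum \frac{1}{n(n + 1)}$.
\[
\frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + \cdots + \frac{1}{n(n + 1)} = \left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \cdots + \left(\frac{1}{n} - \frac{1}{n + 1}\right) = 1 - \frac{1}{n + 1}.
\]
Therefore
\[
\sum_{n=1}^{\infty} \frac{1}{n(n + 1)} = \lim_{n\to\infty}\left(1 - \frac{1}{n + 1}\right) = 1.
\]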
Example 4.1.3. Example 2.7.5 shows that the partial sum of $\sum \frac{1}{n!}$ satisfies $|s_n - e| =
|R_n(1)| \le \frac{e}{(n + 1)!}$, which implies $\sum_{n=0}^{\infty} \frac{1}{n!} = e$. Exercise 1.3.18 gives an alternative
argument. Of course the argument, which uses the Lagrange form of the remainder
(Theorem 2.7.1), can be extended to the series $\sum \frac{x^n}{n!}$. The partial sum satisfies
\[
|s_n - e^x| = |R_n(x)| = \frac{e^c}{(n + 1)!}|x|^{n+1} \le \frac{e^{|x|}}{(n + 1)!}|x|^{n+1}, \quad |c| < |x|.
\]
Since for fixed x, the right side converges to 0 as $n \to \infty$, we conclude that
$\sum_{n=0}^{\infty} \frac{x^n}{n!} = e^x$.
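The remainder bound $|s_n - e^x| \le \frac{e^{|x|}}{(n+1)!}|x|^{n+1}$ can be observed directly. A minimal sketch; the function names are illustrative, and `math.exp` is used as the reference value of $e^x$.

```python
import math

def exp_partial_sum(x, n):
    """s_n = sum of x^k / k! for k = 0, ..., n."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= x / k
        total += term
    return total

def lagrange_bound(x, n):
    """Upper bound e^|x| * |x|^(n+1) / (n+1)! for the remainder |R_n(x)|."""
    return math.exp(abs(x)) * abs(x) ** (n + 1) / math.factorial(n + 1)

x = 2.0
for n in (2, 5, 10, 15):
    error = abs(exp_partial_sum(x, n) - math.exp(x))
    print(n, error, lagrange_bound(x, n))
```

In each printed row the actual error stays below the bound, and both shrink rapidly as n grows.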
Exercise 4.1.1. Suppose the partial sum $s_n = \frac{n}{2n + 1}$. Find the series $\sum a_n$ and its sum.
Exercise 4.1.2. Decimal expressions for rational numbers have repeating patterns. For
example, we have
\[
\begin{aligned}
1.2\overline{34} = 1.2343434\cdots &= 1.2 + \frac{34}{1000} + \frac{34}{100000} + \frac{34}{10000000} + \cdots \\
&= 1.2 + \frac{34}{1000}\sum_{n=0}^{\infty} \frac{1}{100^n} = 1.2 + \frac{34}{1000}\cdot\frac{1}{1 - \frac{1}{100}} = \frac{611}{495}.
\end{aligned}
\]
1. Find rational expressions for 1.23, 1.230, 1.023.
2. Find the decimal based series representing the rational numbers $\frac{5}{12}$, $\frac{43}{35}$.
Exercise 4.1.4. The Sierpinski carpet is obtained from the unit square by successively
deleting “one third squares”. Find the area of the carpet.
Exercise 4.1.5. Two lines L and L′ form an angle θ at P. A boy starts on L at distance a
from P and walks to L′ along the shortest path. After reaching L′, he walks back to L along
the shortest path. Then he walks to L′ again along the shortest path, and keeps walking back and
forth. What is the total length of his trip?
Exercise 4.1.6. Find the area between curves $y = x^n$ and $y = x^{n+1}$ and use this to conclude
that $\sum_{n=1}^{\infty} \frac{1}{n(n + 1)} = 1$.
Exercise 4.1.7. Compute the partial sum and the sum of series.
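1. $\sum_{n=1}^{\infty} nr^n$.
2. $\sum \frac{2^n + (-1)^n 3^{n-1}}{5^{n+1}}$.
3. $\sum_{n=0}^{\infty} \frac{1}{(a + nd)(a + (n + 1)d)}$.
4. $\sum_{n=2}^{\infty} \frac{1}{n(n + 1)(n + 2)}$.
5. $\sum_{n=2}^{\infty} \log\left(1 - \frac{1}{n^2}\right)$.
6. $\sum_{n=1}^{\infty} \frac{n}{(n + 1)!}$.
Exercise 4.1.8. Suppose $x_n > 0$. Compute $\sum_{n=1}^{\infty} \frac{x_n}{(1 + x_1)(1 + x_2)\cdots(1 + x_n)}$.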
Exercise 4.1.10. Use the Lagrange form of the remainder to show that the Taylor series
for $\frac{1}{1 - x}$ converges for $|x| < 1$.
Exercise 4.1.11. Use the Lagrange form of the remainder to show that the Taylor series
for $\cos x$ and $\sin x$ converge for any x
\[
\sum_{n=0}^{\infty} (-1)^n\frac{x^{2n}}{(2n)!} = \cos x, \quad \sum_{n=0}^{\infty} (-1)^n\frac{x^{2n+1}}{(2n + 1)!} = \sin x.
\]
This is a consequence of
By the theorem, the series $\sum 1$, $\sum n$, $\sum (-1)^n$, $\sum \frac{n}{n + 1}$ diverge. By Example
1.1.20, the series $\sum \sin na$ converges if and only if a is an integer multiple of π.
If an ≥ 0 for sufficiently large n, then the partial sum sequence is increasing for
large n, and Theorem 1.3.2 becomes the following.
Theorem 4.1.3. If $a_n \ge 0$, then $\sum a_n$ converges if and only if the partial sums are
bounded.
Example 4.1.4. The terms in the series $\sum \frac{1}{n^p}$ are positive. Therefore the
convergence is equivalent to the boundedness of the partial sum. For $p \ge 2$, we have
$\frac{1}{n^p} < \frac{1}{(n - 1)n}$ and the following bound from Examples 1.3.1 and 4.1.2
\[
\frac{1}{1^p} + \frac{1}{2^p} + \cdots + \frac{1}{n^p} \le 1 + \frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + \cdots + \frac{1}{(n - 1)n} = 2 - \frac{1}{n} < 2.
\]
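Example 4.1.5. The even partial sum of the series $\sum \frac{(-1)^{n+1}}{n}$ is the partial sum of
the series
\[
\sum \left(\frac{1}{2n - 1} - \frac{1}{2n}\right) = \left(1 - \frac{1}{2}\right) + \left(\frac{1}{3} - \frac{1}{4}\right) + \cdots.
\]
The terms of the series above are positive, and the partial sum has upper bound
\[
1 - \frac{1}{2} + \frac{1}{3} - \cdots + \frac{1}{2n - 1} - \frac{1}{2n} = 1 - \left(\frac{1}{2} - \frac{1}{3}\right) - \cdots - \left(\frac{1}{2n - 2} - \frac{1}{2n - 1}\right) - \frac{1}{2n} < 1.
\]
By Theorem 4.1.3, $\sum \left(\frac{1}{2n - 1} - \frac{1}{2n}\right)$ converges. This means that the even partial
sum $s_{2n}$ of $\sum \frac{(-1)^{n+1}}{n}$ converges. By $s_{2n+1} = s_{2n} + \frac{1}{2n + 1}$, the odd partial sum
$s_{2n+1}$ converges to the same limit. Therefore $\sum \frac{(-1)^{n+1}}{n}$ converges.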
n
P 1 P n
Exercise 4.1.12. Show the divergence of √ and .
n
a 2n − 1
1 1 1 1 1 1 1 1
1. 1 + − + + − + ···. 3. 1 + 2
+ 3 + 4 + ···.
2 3 4 5 6 2 3 4
1 2 1 1 2 1 1 1
2. 1 + − + + − + ···. 4. 1 + √ +√ +√ + ···.
2 3 4 5 6 1·2 3·4 5·6
rn for 0 ≤ r < 1
P
Exercise 4.1.14. Use Theorem 4.1.3 to argue about the convergence of
P 1
and .
n!
0 = 0 + 0 + 0 + ···
= (1 − 1) + (1 − 1) + (1 − 1) + · · ·
= 1 − 1 + 1 − 1 + 1 − 1 + ···
= 1 + (−1 + 1) + (−1 + 1) + (−1 + 1) + · · ·
= 1 + 0 + 0 + 0 + · · · = 1.
We used the Cauchy criterion (Theorem 1.3.3) for the divergence of harmonic
series $\sum \frac{1}{n}$. In general, applying the Cauchy criterion to the partial sum shows that
$\sum a_n$ converges if and only if for any $\epsilon > 0$, there is N, such that (since $|s_m - s_n|$ is
symmetric in m and n, we may always assume $n > m$)
Let
\[
x_n = f(1) + f(2) + \cdots + f(n) - \int_1^n f(x)\,dx.
\]
By f decreasing, we get
\[
\begin{aligned}
x_n - x_{n-1} &= f(n) - \int_{n-1}^{n} f(x)\,dx \le 0, \\
x_n - f(n) &= f(1) + \cdots + f(n - 1) - \int_1^n f(x)\,dx \\
&= \sum_{k=1}^{n-1}\left(f(k) - \int_k^{k+1} f(x)\,dx\right) \ge 0.
\end{aligned}
\]
The first inequality implies $x_n$ is decreasing, and the second inequality implies $x_n \ge
f(n) \ge 0$. Therefore $\lim x_n = \gamma$ converges, and the theorem follows. We have
$0 \le \gamma \le x_1 = f(1)$.
Example 4.2.1. For $p > 0$, the function $\frac{1}{x^p}$ is decreasing and converges to 0 as
$x \to +\infty$. By Theorem 4.2.1, therefore, the series $\sum \frac{1}{n^p}$ converges if and only if the
improper integral $\int_1^{+\infty} \frac{dx}{x^p}$ converges. By Example 3.7.3, this happens if and only
if $p > 1$.
Although the harmonic series $\sum \frac{1}{n}$ diverges, Theorem 4.2.1 estimates the partial sum
\[
1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} = \log n + \gamma + \epsilon_n,
\]
where $\epsilon_n$ decreases and converges to 0, and
\[
\gamma = 0.577215664901532860606512090082\cdots
\]
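The estimate can be observed numerically. The following is a minimal sketch, not part of the original notes: the function name `harmonic_minus_log` and the constant `EULER_GAMMA` are illustrative choices, and the constant is the value of $\gamma$ quoted above rounded to double precision. The printed differences play the role of $\epsilon_n$ and should shrink as $n$ grows.

```python
import math

def harmonic_minus_log(n: int) -> float:
    """Return x_n = 1 + 1/2 + ... + 1/n - log(n)."""
    return math.fsum(1.0 / k for k in range(1, n + 1)) - math.log(n)

# Value of gamma quoted in the text, rounded to double precision.
EULER_GAMMA = 0.5772156649015329

for n in (10, 100, 1000, 10000):
    x_n = harmonic_minus_log(n)
    # The last column is the error term epsilon_n = x_n - gamma.
    print(n, x_n, x_n - EULER_GAMMA)
```

The differences are positive and decreasing, which is consistent with $x_n$ decreasing to $\gamma$ in the proof of Theorem 4.2.1.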
Example 4.2.2. For $p > 0$ and $x > e$, the integral test can be applied to the function $\frac{1}{x(\log x)^p}$. We conclude that $\sum \frac{1}{n(\log n)^p}$ converges if and only if the improper integral $\int_a^{+\infty} \frac{dx}{x(\log x)^p}$ converges. By Example 3.7.9, this means $p > 1$.
Example 4.2.3. We will show that $\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$ in Example 4.5.13. In fact, for even $k$, $\sum_{n=1}^{\infty} \frac{1}{n^k}$ can be calculated as a rational multiple of $\pi^k$. However, very little is known about the sum for odd $k$. Still, we may use the idea of Theorem 4.2.1 to estimate the remainder
\[
\int_{n+1}^{+\infty} f(x)dx = \sum_{k=n+1}^{\infty} \int_k^{k+1} f(x)dx \le \sum_{k=n+1}^{\infty} f(k) \le \sum_{k=n}^{\infty} \int_k^{k+1} f(x)dx = \int_n^{+\infty} f(x)dx.
\]
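For example, the 10-th partial sum of $\sum_{n=1}^{\infty} \frac{1}{n^3}$ is
\[
s_{10} = \frac{1}{1^3} + \frac{1}{2^3} + \cdots + \frac{1}{10^3} = 1.197532\cdots.
\]
By
\[
\int_{10}^{\infty} \frac{dx}{x^3} = \frac{1}{2(10)^2} = 0.005, \quad
\int_{11}^{\infty} \frac{dx}{x^3} = \frac{1}{2(11)^2} = 0.004132\cdots,
\]
we get
\[
1.201664 < \sum_{n=1}^{\infty} \frac{1}{n^3} < 1.202532.
\]
If we want to get the approximate value of $\sum_{n=1}^{\infty} \frac{1}{n^3}$ up to the 6-th digit, then we may try to find $n$ satisfying
\[
\int_n^{n+1} \frac{dx}{x^3} = \frac{2n+1}{2n^2(n+1)^2} < \frac{1}{n^3} < 0.000001.
\]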
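The remainder estimate turns into a two-sided bound $s_n + \int_{n+1}^{+\infty} f \le \sum f(k) \le s_n + \int_n^{+\infty} f$. The sketch below evaluates it for $f(x) = \frac{1}{x^3}$; the function name `zeta3_bounds` is an illustrative assumption, and the closed form $\int_m^{+\infty} \frac{dx}{x^3} = \frac{1}{2m^2}$ is the one used in the example.

```python
def zeta3_bounds(n: int) -> tuple[float, float]:
    """Bounds for sum 1/k^3 from the n-th partial sum and the integral remainder."""
    s_n = sum(1.0 / k**3 for k in range(1, n + 1))
    lower = s_n + 1.0 / (2 * (n + 1) ** 2)  # s_n + integral from n+1 to infinity
    upper = s_n + 1.0 / (2 * n**2)          # s_n + integral from n to infinity
    return lower, upper

for n in (10, 100, 1000):
    low, high = zeta3_bounds(n)
    print(n, low, high, high - low)
```

The width of the bracket is exactly $\int_n^{n+1} \frac{dx}{x^3}$, which is why the example looks for $n$ making this integral smaller than the desired accuracy.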
Exercise 4.2.1. Determine the convergence of $\sum \frac{1}{n(\log n)(\log(\log n))^p}$.
4.2. COMPARISON TEST 309
Exercise 4.2.2. Find suitable function $f(n)$, such that the sequence $1 + \frac{1}{\sqrt{2}} + \cdots + \frac{1}{\sqrt{n}} - f(n)$ converges to a limit $\gamma$. Then express the sum of the series $\sum_{n=1}^{\infty} (-1)^n \frac{1}{\sqrt{n}}$ in terms of $\gamma$.
Exercise 4.2.3. Estimate $\sum_{n=1}^{\infty} \frac{1}{n^{3/2}}$ to within 0.01.
The test is completely parallel to the similar test (Theorem 3.7.1) for the convergence of improper integrals, and can be proved similarly by using the Cauchy criterion (Theorem 4.1.4).
For the special case $b_n = |a_n|$, the test says that if $\sum |a_n|$ converges, then $\sum a_n$ converges. In other words, absolute convergence implies convergence. We note that the conclusion of the comparison test is always absolute convergence.
Example 4.2.4. Consider the series $\sum \frac{\log n}{n^p}$. If $p \le 1$, then $\frac{\log n}{n^p} \ge \frac{1}{n}$. By the comparison test, the divergence of $\sum \frac{1}{n}$ implies the divergence of $\sum \frac{\log n}{n^p}$.
If $p > 1$, then choose $q$ satisfying $p > q > 1$. We have
\[
\frac{\log n}{n^p} = \frac{\log n}{n^{p-q}} \cdot \frac{1}{n^q} < \frac{1}{n^q} \quad \text{for large } n.
\]
Here the inequality is due to the fact that $p - q > 0$ implies $\lim \frac{\log n}{n^{p-q}} = 0$. By Example 4.2.1, $\sum \frac{1}{n^q}$ converges. Then by the comparison test, we conclude that $\sum \frac{\log n}{n^p}$ converges.
The key idea of the example above is to compare $a_n = \frac{\log n}{n^p}$ with $b_n = \frac{1}{n^q}$ by using the limit of their quotient. By
\[
\lim \frac{a_n}{b_n} = \lim \frac{\log n}{n^{p-q}} = 0,
\]
we get $\frac{a_n}{b_n} < 1$ for sufficiently large $n$. Since both $a_n$ and $b_n$ are positive, we may apply the comparison test to conclude that the convergence of $\sum b_n$ implies the convergence of $\sum a_n$.
310 CHAPTER 4. SERIES
In general, if $a_n, b_n > 0$ and $\lim \frac{a_n}{b_n} = l$ converges, then by the comparison test, the convergence of $\sum b_n$ implies the convergence of $\sum a_n$. Moreover, if $l \ne 0$, then we also have $\lim \frac{b_n}{a_n} = \frac{1}{l}$, and we conclude that $\sum a_n$ converges if and only if $\sum b_n$ converges.
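Example 4.2.5. For $\sum \frac{n + \sin n}{n^3 + n + 2}$, we make the following comparison
\[
\lim_{n\to\infty} \frac{\dfrac{n + \sin n}{n^3 + n + 2}}{\dfrac{1}{n^2}} = 1.
\]
By the convergence of $\sum \frac{1}{n^2}$, we get the convergence of $\sum \frac{n + \sin n}{n^3 + n + 2}$.
Similarly, by the comparison
\[
\lim_{n\to\infty} \frac{\dfrac{2^n + n^2}{\sqrt{5^{n-1}} - n^4\sqrt{3^n}}}{\left(\dfrac{2}{\sqrt{5}}\right)^n} = \sqrt{5},
\]
and the convergence of $\sum \left(\frac{2}{\sqrt{5}}\right)^n$, the series $\sum \frac{2^n + n^2}{\sqrt{5^{n-1}} - n^4\sqrt{3^n}}$ converges.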
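Example 4.2.6. By Example 2.5.14, we know $\left(1 + \frac{1}{x}\right)^x - e = -\frac{e}{2x} + o\left(\frac{1}{x}\right)$. This implies that for sufficiently large $n$, $\left(1 + \frac{1}{n}\right)^n - e$ is negative and comparable to $\frac{1}{n}$. Since the harmonic series $\sum \frac{1}{n}$ diverges, we conclude that $\sum \left[\left(1 + \frac{1}{n}\right)^n - e\right]$ diverges.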
Example 4.2.7. By Example 3.7.14, we know that $\int_1^{+\infty} \frac{|\sin x|}{x^p}dx$ converges if and only if $p > 1$. By a change of variable, we also know that, for $a \ne 0$, $\int_1^{+\infty} \frac{|\sin ax|}{x^p}dx$ converges if and only if $p > 1$. However, we cannot use the integral test (Theorem 4.2.1) to get the similar conclusion for $\sum \frac{|\sin na|}{n^p}$. The problem is that $\frac{|\sin ax|}{x^p}$ is not a decreasing function.
By $\frac{|\sin na|}{n^p} \le \frac{1}{n^p}$ and the comparison test, we know $\sum \frac{|\sin na|}{n^p}$ converges for $p > 1$. The series also converges if $a$ is a multiple of $\pi$, because all the terms are 0. It remains to consider the case $p \le 1$ and $a$ is not a multiple of $\pi$.
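First assume $0 < a \le \frac{\pi}{2}$. For any natural number $k$, the interval $\left[k\pi + \frac{\pi}{4}, k\pi + \frac{3\pi}{4}\right]$ has length $\frac{\pi}{2}$ and therefore must contain $n_k a$ for some natural number $n_k$. Then $|\sin n_k a| \ge \frac{1}{\sqrt{2}}$, and for $p \le 1$,
\[
\sum_{n=1}^{\infty} \frac{|\sin na|}{n^p} \ge \sum_{n=1}^{\infty} \frac{|\sin na|}{n} \ge \sum_{k=1}^{\infty} \frac{|\sin n_k a|}{n_k} \ge \frac{1}{\sqrt{2}} \sum_{k=1}^{\infty} \frac{1}{n_k}.
\]
By $n_k a \le k\pi + \frac{3\pi}{4}$, we get $\frac{1}{n_k} \ge \frac{4}{4k+3} \cdot \frac{a}{\pi} > \frac{a}{4k}$ for large $k$. Then by $\sum \frac{1}{k} = +\infty$, we get $\sum \frac{1}{n_k} = +\infty$ and $\sum \frac{|\sin na|}{n^p} = +\infty$.
In general, if $a$ is not an integer multiple of $\pi$, then there is $b$, such that $0 < b \le \frac{\pi}{2}$ and either $a + b$ or $a - b$ is an integer multiple of $\pi$. Then we have $|\sin na| = |\sin nb|$, and we still conclude that $\sum \frac{|\sin na|}{n^p}$ diverges for $p \le 1$.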
n2 a n2
r !
1 n+1 P P
8.
P
√ − log . 11. . 14. cos .
1 n
n n n
a+
n
P 1 p
2 15. − log cos .
1 n n
2n
P
9. 1− . P n
n 12. . a
log np sin q .
P
(n + a) (n + b)n+a 16.
n+b
n
2
1 2n−n
P P √
√ √ P | cos na|
10. 1+ . 13. 2na− nb− nc . 17. .
n np
1
n+1 1
xp
Z Z Z
sin x n
sin xn dx.
P P P
2. dx. 4. dx. 6.
n xp 0 1 + x2 0
Exercise 4.2.18. Suppose $a_n$ is a bounded sequence. Show that $\sum \frac{1}{n}(a_n - a_{n+1})$ converges.
Exercise 4.2.19. The decimal representations of positive real numbers are actually the sum
of series. For example,
Theorem 4.2.3 (Root Test). Suppose $|a_n| \le r^n$ for some $r < 1$ and sufficiently large $n$. Then $\sum a_n$ converges.
$r$ for sufficiently large $n$. Then by the root test, we conclude that $\sum (n^5 + 2n + 3)x^n$ converges for $|x| < 1$.
If $|x| \ge 1$, then the term $(n^5 + 2n + 3)x^n$ of the series does not converge to 0. By Theorem 4.1.2, the series diverges for $|x| \ge 1$.
P (log n)n 6.
P q
np xn . P a 2n−n2
1. . 10. 1+ .
n2 n
n
P np P an + b
2. . 7. .
(log n)n cn + d P np
11. .
b n
1 a+
a n2
P
3. . P n
an + bn 8. 1+ .
P n n
4. (a + bn )p . n
a + (−1)n
P p n P a −n2 12.
P
n3 .
5. n x . 9. 1+ . b + (−1)n
n
Next we turn to another way of comparing series. Theorem 2.3.3 compares two functions by comparing their derivatives (i.e., the changes of functions). Similarly, we may compare two sequences $a_n$ and $b_n$ by either comparing the differences $a_{n+1} - a_n$ and $b_{n+1} - b_n$, or the ratios $\frac{a_{n+1}}{a_n}$ and $\frac{b_{n+1}}{b_n}$. The comparison of ratio is especially suitable for the comparison of series.
Suppose $a_n, b_n > 0$, and $\frac{a_{n+1}}{a_n} \le \frac{b_{n+1}}{b_n}$ for $n \ge N$. Then for $c = \frac{a_N}{b_N}$, the two sequences $a_n$ and $cb_n$ are equal at $n = N$, and $\frac{a_{n+1}}{a_n} \le \frac{cb_{n+1}}{cb_n}$ implies that the second sequence has bigger change than the first one, at least for $n \ge N$. This should imply $a_n \le cb_n$ for $n \ge N$. The following is the rigorous argument
\[
a_n = a_N \frac{a_{N+1}}{a_N} \frac{a_{N+2}}{a_{N+1}} \cdots \frac{a_n}{a_{n-1}}
\le cb_N \frac{b_{N+1}}{b_N} \frac{b_{N+2}}{b_{N+1}} \cdots \frac{b_n}{b_{n-1}} = cb_n.
\]
By the comparison test, if $\sum b_n$ converges, then $\sum a_n$ converges.
Theorem 4.2.4 (Ratio Test). Suppose $\left|\frac{a_{n+1}}{a_n}\right| \le \frac{b_{n+1}}{b_n}$ for sufficiently large $n$. If $\sum b_n$ converges, then $\sum a_n$ converges.
We note that the assumption implies that the terms $b_n$ have the same sign for sufficiently large $n$. By changing all $b_n$ to $-b_n$ if necessary, we may assume that $b_n > 0$ for sufficiently large $n$.
Example 4.2.9. The series $\sum \frac{(2n)!}{(n!)^2} x^n$ satisfies
\[
\lim_{n\to\infty} \left|\frac{a_n}{a_{n-1}}\right|
= \lim_{n\to\infty} \left| \frac{\dfrac{(2n)!}{(n!)^2} x^n}{\dfrac{(2n-2)!}{((n-1)!)^2} x^{n-1}} \right|
= \lim_{n\to\infty} \frac{2n(2n-1)}{n^2} |x| = 4|x|.
\]
If $4|x| < 1$, then we fix $r$ satisfying $4|x| < r < 1$. By the order rule, we have
\[
\left|\frac{a_n}{a_{n-1}}\right| < r = \frac{r^n}{r^{n-1}} \quad \text{for large } n.
\]
By comparing with the geometric series $\sum b_n = \sum r^n$, Theorem 4.2.4 implies that $\sum a_n$ converges. If $4|x| > 1$, then we get
\[
\left|\frac{a_n}{a_{n-1}}\right| > 1 \quad \text{for large } n.
\]
Therefore $|a_n|$ is increasing and does not converge to 0. By Theorem 4.1.2, $\sum a_n$ diverges.
We conclude that $\sum \frac{(2n)!}{(n!)^2} x^n$ converges for $|x| < \frac{1}{4}$ and diverges for $|x| > \frac{1}{4}$. For $x = \frac{1}{4}$, we cannot compare with the geometric series $\sum r^n$. Instead, we may try to compare with $\sum \frac{1}{n^p}$. The terms $a_n = \frac{(2n)!}{(n!)^2 4^n} > 0$, and
\[
\frac{a_n}{a_{n-1}} = \frac{2n(2n-1)}{4n^2} = 1 - \frac{1}{2n}, \quad
\frac{\dfrac{1}{n^p}}{\dfrac{1}{(n-1)^p}} = 1 - \frac{p}{n} + o\left(\frac{1}{n}\right).
\]
So we expect $a_n$ to be comparable to $\frac{1}{n^{1/2}}$. Since $\sum \frac{1}{n^{1/2}}$ diverges, we expect $\sum a_n$ diverges. For a rigorous argument, we wish to show that
\[
\frac{a_n}{a_{n-1}} \ge \frac{\dfrac{1}{n^p}}{\dfrac{1}{(n-1)^p}} \quad \text{for some } p \le 1 \text{ and large } n.
\]
Of course this holds for $p = 1 > \frac{1}{2}$. Therefore by the ratio test, we conclude that $\sum \frac{(2n)!}{(n!)^2 4^n}$ diverges.
We will show in Example 4.3.4 that, for $x = -\frac{1}{4}$, the series $\sum (-1)^n \frac{(2n)!}{(n!)^2 4^n}$ converges.
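The heuristic that $a_n = \frac{(2n)!}{(n!)^2 4^n}$ is comparable to $\frac{1}{n^{1/2}}$ can be checked numerically. The sketch below builds $a_n$ from the ratio $\frac{a_k}{a_{k-1}} = 1 - \frac{1}{2k}$ computed in the example. The function name `central_term` is illustrative, and the limiting constant $\frac{1}{\sqrt{\pi}}$ is a standard consequence of Stirling's formula that is not derived in this section.

```python
import math

def central_term(n: int) -> float:
    """a_n = (2n)! / ((n!)^2 4^n), built from a_0 = 1 and a_k / a_{k-1} = 1 - 1/(2k)."""
    a = 1.0
    for k in range(1, n + 1):
        a *= 1.0 - 1.0 / (2 * k)
    return a

for n in (10, 100, 1000, 10000):
    # a_n * sqrt(n) should stabilise, showing a_n behaves like a constant times n^(-1/2).
    print(n, central_term(n), central_term(n) * math.sqrt(n))

print(1 / math.sqrt(math.pi))  # limiting constant from Stirling's formula (assumption)
```

Since $a_n \sqrt{n}$ stays bounded away from 0, the terms are bounded below by a multiple of $\frac{1}{\sqrt{n}}$, in line with the divergence at $x = \frac{1}{4}$, while $a_n \to 0$ is what the alternating case $x = -\frac{1}{4}$ needs.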
Second, when the comparison with the geometric series does not work, we may compare with $\sum \frac{1}{n^p}$. Suppose
\[
\left|\frac{a_{n+1}}{a_n}\right| \le 1 - \frac{p}{n} \quad \text{for some } p > 1 \text{ and large } n. \tag{4.2.1}
\]
\[
\left|\frac{a_{n+1}}{a_n}\right| \le 1 - \frac{q}{n} + o\left(\frac{1}{n}\right) = \frac{\dfrac{1}{n^q}}{\dfrac{1}{(n-1)^q}} \quad \text{for large } n.
\]
By applying the ratio test to $b_n = \frac{1}{n^p}$, we conclude that $\sum a_n$ converges. The use of criterion (4.2.1) for the convergence of series is the Raabe test.
The Raabe test also has the limit version. We note that (4.2.1) is equivalent to
\[
n\left(1 - \left|\frac{a_{n+1}}{a_n}\right|\right) \ge p > 1 \quad \text{for large } n.
\]
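1. $\frac{4}{2} + \frac{4 \cdot 7}{2 \cdot 6} + \frac{4 \cdot 7 \cdot 10}{2 \cdot 6 \cdot 10} + \cdots$.
2. $\frac{2}{4} + \frac{2 \cdot 6}{4 \cdot 7} + \frac{2 \cdot 6 \cdot 10}{4 \cdot 7 \cdot 10} + \cdots$.
3. $\frac{2}{4} + \frac{2 \cdot 5}{4 \cdot 7} + \frac{2 \cdot 5 \cdot 8}{4 \cdot 7 \cdot 10} + \cdots$.
4. $\frac{2}{4 \cdot 7} + \frac{2 \cdot 5}{4 \cdot 7 \cdot 10} + \frac{2 \cdot 5 \cdot 8}{4 \cdot 7 \cdot 10 \cdot 13} + \cdots$.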
Exercise 4.2.23. Determine convergence. There might be some special values of $r$ for which you cannot yet make a conclusion.
1. $\sum \frac{(n!)^2}{(2n)!} r^n$.
2. $\sum \frac{(3n)!}{(n!)^3} r^n$.
3. $\sum \frac{n!(2n)!}{(3n)!} r^n$.
4. $\sum \frac{n^n}{n!} r^n$.
5. $\sum \frac{n!}{n^n} r^n$.
6. $\sum \frac{n!}{(n+1)^n} r^n$.
7. $\sum \frac{n^{n+1}}{(n+1)!} r^n$.
8. $\sum \frac{(2n)!}{n^{2n}} r^n$.
cannot apply the comparison test to the whole series because the conclusion of
comparison test is always absolute convergence. In fact, we applied Theorem 4.1.3
to the even partial sum of the series in Example 4.1.5. The following is an elaboration
of the idea of Example 4.1.5.
The series
\[
\sum_{n=0}^{\infty} (-1)^n a_n = a_0 - a_1 + a_2 - a_3 + \cdots
\]
is called an alternating series. If $a_n$ is decreasing (for all $n$), then the odd partial sum
\[
s_{2n+1} = (a_0 - a_1) + (a_2 - a_3) + \cdots + (a_{2n} - a_{2n+1})
\]
is increasing and has upper bound $a_0$. Therefore $\lim s_{2n+1}$ converges. By $s_{2n} = s_{2n+1} + a_{2n+1}$ and $\lim a_{2n+1} = 0$, we have $\lim s_{2n} = \lim s_{2n+1}$ and therefore the whole partial sum sequence converges.
converges for p > 0. By Example 4.2.1, the series absolutely converges for p > 1,
conditionally converges for 0 < p ≤ 1, and diverges for p ≤ 0.
Example 4.3.2. Consider the series $\sum n^a b^n$. By $\lim_{n\to\infty} \sqrt[n]{|n^a b^n|} = |b|$ and the root test, $\sum n^a b^n$ absolutely converges for $|b| < 1$ and diverges for $|b| > 1$.
If $b = 1$, then the series is $\sum n^a$, which converges if and only if $a < -1$. If $b = -1$, then the series is $\sum (-1)^n n^a$, which by Example 4.3.1 converges if and only if $a < 0$.
In conclusion, the series $\sum n^a b^n$ absolutely converges for either $|b| < 1$, or $a < -1$ and $|b| = 1$, conditionally converges for $-1 \le a < 0$ and $b = -1$, and diverges otherwise.
Example 4.3.3. Consider the alternating series $\sum (-1)^n \frac{n^2 + a}{n^3 + b}$. The corresponding absolute value series is comparable to the harmonic series and therefore diverges.
If we can show that $f(x) = \frac{x^2 + a}{x^3 + b}$ is decreasing, therefore, then the Leibniz test
4.3. CONDITIONAL CONVERGENCE 319
Example 4.3.4. In Example 4.2.9, we determined the convergence of $\sum \frac{(2n)!}{(n!)^2} x^n$ for all $x$ except $x = -\frac{1}{4}$. For $x = -\frac{1}{4}$, the series is alternating, and $|a_n|$ is decreasing by $\left|\frac{a_n}{a_{n-1}}\right| = 1 - \frac{1}{2n} < 1$. If we can show that $a_n$ converges to 0, then we can apply the Leibniz test.
We compare the ratio of $|a_n|$ with the ratio of $\frac{1}{n^p}$
\[
\left|\frac{a_n}{a_{n-1}}\right| = 1 - \frac{1}{2n} \le \frac{\dfrac{1}{n^p}}{\dfrac{1}{(n-1)^p}} = 1 - \frac{p}{n} + o\left(\frac{1}{n}\right).
\]
Exercise 4.3.5. Determine the absolute or conditional convergence for the undecided cases
in Exercise 4.2.23.
Like the convergence of improper integrals, we also have the analogues of the
Dirichlet and Abel tests.
Proposition 4.3.2 (Dirichlet Test). Suppose the partial sum of $\sum a_n$ is bounded. Suppose $b_n$ is monotonic and $\lim_{n\to\infty} b_n = 0$. Then $\sum a_n b_n$ converges.
Proposition 4.3.3 (Abel Test). Suppose $\sum a_n$ converges. Suppose $b_n$ is monotonic and bounded. Then $\sum a_n b_n$ converges.
Example 4.3.5. In Example 4.2.7, we showed that $\sum \frac{|\sin na|}{n^p}$ diverges when $p \le 1$ and $a$ is not an integer multiple of $\pi$. Then Example 3.7.14 suggests that the series should converge conditionally.
By the Dirichlet test, if we can show that the partial sum
Exercise 4.3.6. Derive the Leibniz test and the Abel test from the Dirichlet test.
Exercise 4.3.7. Prove that if $\sum \frac{a_n}{n^p}$ converges, then $\sum \frac{a_n}{n^q}$ converges for any $q > p$.
P 1 (−1)n (−1)n
1. .
P P
3. . 5. .
n + (−1)n n2 (n + (−1)n )p np + (−1)n
n(n−1)
P (−1)n P (−1) 2
2. √ . 4. √ .
( n + (−1)n )p n + (−1)n
√
a log n n an + b n
sin n2 + aπ.
P
1. 2.
P n
(−1) 1 − . 3.
P
log .
n cn + d
Exercise 4.3.11. Let $[x]$ be the biggest integer $\le x$. Determine the convergence of $\sum \frac{(-1)^{[\sqrt{n}]}}{n^p}$ and $\sum \frac{(-1)^{[\log n]}}{n^p}$.
Theorem 4.3.4. The sum of an absolutely convergent series does not depend on the
order. On the other hand, given any conditionally convergent series and any number
s, it is possible to rearrange the order so that the sum of the rearranged series is s.
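Example 4.3.6. We know from Example 4.3.1 that the series $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}$ converges conditionally. The partial sum can be estimated from the partial sum of the harmonic series
\[
\begin{aligned}
& 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots + \frac{1}{2n-1} - \frac{1}{2n} \\
&= \left(1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{2n-1} + \frac{1}{2n}\right) - 2\left(\frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \cdots + \frac{1}{2n}\right) \\
&= (\log 2n + \gamma + \epsilon_{2n}) - (\log n + \gamma + \epsilon_n) = \log 2 + (\epsilon_{2n} - \epsilon_n).
\end{aligned}
\]
This implies
\[
1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots = \log 2.
\]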
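If the terms are rearranged, so that one positive term is followed by two negative terms, then the partial sum is
\[
\begin{aligned}
& 1 - \frac{1}{2} - \frac{1}{4} + \frac{1}{3} - \frac{1}{6} - \frac{1}{8} + \cdots + \frac{1}{2n-1} - \frac{1}{4n-2} - \frac{1}{4n} \\
&= \left(1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{2n-1} + \frac{1}{2n}\right) \\
&\quad - \left(\frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \cdots + \frac{1}{2n}\right) - \left(\frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \cdots + \frac{1}{4n}\right) \\
&= (\log 2n + \gamma + \epsilon_{2n}) - \frac{1}{2}(\log n + \gamma + \epsilon_n) - \frac{1}{2}(\log 2n + \gamma + \epsilon_{2n}) \\
&= \frac{1}{2}\log 2 + \frac{1}{2}(\epsilon_{2n} - \epsilon_n).
\end{aligned}
\]
If two positive terms are followed by one negative term, then the partial sum is
\[
\begin{aligned}
& 1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \cdots + \frac{1}{4n-3} + \frac{1}{4n-1} - \frac{1}{2n} \\
&= \left(1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{4n}\right) \\
&\quad - \left(\frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \cdots + \frac{1}{4n}\right) - \left(\frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \cdots + \frac{1}{2n}\right) \\
&= (\log 4n + \gamma + \epsilon_{4n}) - \frac{1}{2}(\log 2n + \gamma + \epsilon_{2n}) - \frac{1}{2}(\log n + \gamma + \epsilon_n) \\
&= \frac{3}{2}\log 2 + \left(\epsilon_{4n} - \frac{1}{2}\epsilon_{2n} - \frac{1}{2}\epsilon_n\right).
\end{aligned}
\]
We get
\[
\begin{aligned}
1 - \frac{1}{2} - \frac{1}{4} + \frac{1}{3} - \frac{1}{6} - \frac{1}{8} + \cdots &= \frac{1}{2}\log 2, \\
1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \cdots &= \frac{3}{2}\log 2.
\end{aligned}
\]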
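The effect of rearrangement can be seen numerically. The following is a minimal sketch in which the function name `rearranged_sum` and its parameters are illustrative: each block takes $p$ positive terms $1, \frac{1}{3}, \frac{1}{5}, \dots$ followed by $q$ negative terms $-\frac{1}{2}, -\frac{1}{4}, \dots$, in the order in which they appear in the original series.

```python
import math

def rearranged_sum(p: int, q: int, blocks: int) -> float:
    """Partial sum of the rearranged alternating harmonic series.

    Each block has p positive terms 1/(odd) followed by q negative terms 1/(even).
    """
    total = 0.0
    odd, even = 1, 2
    for _ in range(blocks):
        for _ in range(p):
            total += 1.0 / odd
            odd += 2
        for _ in range(q):
            total -= 1.0 / even
            even += 2
    return total

print(rearranged_sum(1, 1, 100000), math.log(2))        # original order
print(rearranged_sum(1, 2, 100000), 0.5 * math.log(2))  # one positive, two negative
print(rearranged_sum(2, 1, 100000), 1.5 * math.log(2))  # two positive, one negative
```

The same terms are summed in all three cases; only the order differs, which is possible because the convergence is conditional.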
Exercise 4.3.12. Rearrange the series $1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$ so that $p$ positive terms are followed by $q$ negative terms and the pattern repeated. Show that the sum of new series is $\log 2 + \frac{1}{2}\log\frac{p}{q}$.
Exercise 4.3.13. Show that $1 - \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} - \frac{1}{\sqrt{4}} + \cdots$ converges, but $1 - \frac{1}{\sqrt{2}} - \frac{1}{\sqrt{4}} + \frac{1}{\sqrt{3}} - \frac{1}{\sqrt{6}} - \frac{1}{\sqrt{8}} + \frac{1}{\sqrt{5}} - \frac{1}{\sqrt{10}} - \frac{1}{\sqrt{12}} + \cdots$ diverges.
Note that the infinite sum $\sum_{i,j=1}^{\infty} a_i b_j$ is a “double series” with two indices $i$ and $j$. There are many ways of arranging this series into a single series. For example, the following is the “diagonal arrangement”
\[
\sum (ab)_k = a_1b_1 + a_1b_2 + a_2b_1 + \cdots + a_1b_{n-1} + a_2b_{n-2} + \cdots + a_{n-1}b_1 + \cdots,
\]
and the following is the “square arrangement”
\[
\sum (ab)_k = a_1b_1 + a_1b_2 + a_2b_2 + a_2b_1 + \cdots + a_1b_n + a_2b_n + \cdots + a_{n-1}b_n + a_nb_n + a_nb_{n-1} + \cdots + a_nb_1 + \cdots.
\]
Under the condition of the theorem, the series is supposed to converge absolutely. Then by Theorem 4.3.4, all arrangements give the same sum.
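Example 4.3.7. We know from Example 4.1.4 that $\sum_{n=0}^{\infty} \frac{x^n}{n!}$ absolutely converges to $e^x$. By Theorem 4.3.5, we have
\[
\begin{aligned}
e^x e^y &= \left(\sum_{n=0}^{\infty} \frac{x^n}{n!}\right)\left(\sum_{n=0}^{\infty} \frac{y^n}{n!}\right)
= \sum_{i,j=0}^{\infty} \frac{x^i y^j}{i!j!}
= \sum_{n=0}^{\infty} \sum_{i+j=n} \frac{x^i y^j}{i!j!} \\
&= \sum_{n=0}^{\infty} \frac{1}{n!} \sum_{i=0}^{n} \frac{n!}{i!(n-i)!} x^i y^{n-i}
= \sum_{n=0}^{\infty} \frac{1}{n!} (x+y)^n = e^{x+y}.
\end{aligned}
\]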
Exercise 4.3.14. If you take the product of a geometric series with itself, what conclusion
can you make?
Exercise 4.3.15. Suppose $\sum \frac{(-1)^n}{\sqrt{n}} = l$. Show that the square arrangement of the product of the series with itself converges to $l^2$. What is the sum of the diagonal arrangement?
The partial sum of the series is the n-th order Taylor expansion Tn (x) of f (x) at x0 .
4.4. POWER SERIES 325
In Example 4.1.3, we used the Lagrange form of the remainder $R_n(x) = f(x) - T_n(x)$ (Theorem 2.7.1) to show that the Taylor series of $e^x$ converges to $e^x$. Exercise 4.1.11 further showed that the Taylor series of $\sin x$ and $\cos x$ converge to the trigonometric functions.
Example 4.4.1. Consider the Taylor series $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} x^n$ of $\log(1 + x)$. We have
\[
\begin{aligned}
\frac{d^n}{dx^n} \log(1 + x) &= \frac{(-1)^{n-1}(n-1)!}{(1+x)^n}, \\
|R_n(x)| &= \frac{1}{(n+1)!} \frac{n!}{|1+c|^{n+1}} |x|^{n+1} = \frac{|x|^{n+1}}{(n+1)|1+c|^{n+1}},
\end{aligned}
\]
where $c$ lies between 0 and $x$. If $-\frac{1}{2} \le x \le 1$, then $|x| \le |1 + c|$, and we get $|R_n(x)| < \frac{1}{n+1}$. Therefore $\lim_{n\to\infty} R_n(x) = 0$, and the Taylor series converges to $\log(1 + x)$.
At $x = -1$, the Taylor series is the negative of the harmonic series and therefore diverges. For $|x| > 1$, the terms of the Taylor series diverge to $\infty$, and therefore the Taylor series also diverges. The remaining case is $-1 < x < -\frac{1}{2}$.
In Exercise 4.4.2, a new form of the remainder is used to show that the Taylor series actually converges to $\log(1 + x)$ for all $-1 < x \le 1$.
\[
\begin{aligned}
|R_n(x)| &= \frac{|p(p-1)\cdots(p-n)|}{(n+1)!} |1+c|^{p-n-1} |x|^{n+1} \\
&= \frac{|p(p-1)\cdots(p-n)|}{(n+1)!} \cdot \frac{|1+c|^{p-1}|x|^{n+1}}{|1+c|^n},
\end{aligned}
\]
where $c$ lies between 0 and $x$. For $-\frac{1}{2} < x \le 1$, we have $|x| \le |1 + c|$ and $|1 + c|^{p-1}$ is bounded. Moreover, by Exercise 4.3.1, for $p > -1$, we have
\[
\lim_{n\to\infty} \frac{|p(p-1)\cdots(p-n)|}{(n+1)!} = 0.
\]
Then we conclude that $\lim_{n\to\infty} R_n(x) = 0$ for $-\frac{1}{2} < x \le 1$ and $p > -1$, and the Taylor series converges to $(1 + x)^p$.
The Taylor series diverges for $|x| > 1$. Using a new form of the remainder, Exercise 4.4.2 shows that the Taylor series actually converges to $(1 + x)^p$ for $|x| < 1$ and any $p$.
Exercise 4.4.1. Use the Lagrange form of the remainder to show that the Cauchy form of the remainder is
\[
R_n(x) = \frac{f^{(n+1)}(c)}{n!} (x - c)^n (x - x_0),
\]
where $c$ lies between $x_0$ and $x$. Use this to show that the Taylor series of $\log(1 + x)$ and $(1 + x)^p$ at $x_0 = 0$ converge to the respective functions for any $|x| < 1$.
may have many possible limits (for various subsequences). Let the upper limit $\overline{\lim} \sqrt[n]{|a_n|}$ be the maximum of all the possible limits. Then the radius is
\[
R = \frac{1}{\overline{\lim}_{n\to\infty} \sqrt[n]{|a_n|}}.
\]
One can verify the formula for Example 4.4.11.
Example 4.4.6. For any $p$, we have $\lim_{n\to\infty} \sqrt[n]{n^p} = 1$. Therefore the radius of convergence for the power series $\sum n^p x^n$ is 1. The example already appeared in Example 4.3.2.
Example 4.4.7. By $\lim_{n\to\infty} \sqrt[n]{2^n + 3^n} = 3$, the series $\sum (2^n + 3^n)x^n$ converges for $|x| < \frac{1}{3}$ and diverges for $|x| > \frac{1}{3}$. The series also diverges for $|x| = \frac{1}{3}$ because the terms do not converge to 0.
We note that the radius of convergence is $\frac{1}{2}$ for $\sum 2^n x^n$ and $\frac{1}{3}$ for $\sum 3^n x^n$. The radius for the sum of the two series is the smaller one.
Example 4.4.8. By $\lim_{n\to\infty} \sqrt[n]{n^n} = +\infty$, the series $\sum n^n x^n$ diverges for all $x \ne 0$. By $\lim_{n\to\infty} \sqrt[n]{\frac{1}{n^n}} = 0$, the series $\sum \frac{(-1)^n}{n^n} x^n$ converges for all $x$.
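The root limits in these examples can be probed numerically. The sketch below evaluates $\sqrt[n]{|a_n|}$ at a single large $n$, which is only a crude indication of the limit and not a proof. The helper name `root_estimate` is illustrative, and logarithms are used so that large coefficients do not overflow.

```python
import math

def root_estimate(coeff, n: int) -> float:
    """Return |a_n|^(1/n) for a_n = coeff(n), computed through logarithms."""
    return math.exp(math.log(abs(coeff(n))) / n)

# Example 4.4.7: the n-th root of 2^n + 3^n is close to 3, suggesting R = 1/3.
print(root_estimate(lambda n: 2**n + 3**n, 200))

# Example 4.4.6 with p = 5: the n-th root of n^5 approaches 1 slowly, suggesting R = 1.
print(root_estimate(lambda n: n**5, 2000))
```

The second value is still visibly above 1 at $n = 2000$, a reminder that $\sqrt[n]{n^p} \to 1$ converges slowly and that a single evaluation only hints at the radius.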
Example 4.4.9. In Example 4.2.9, we use the ratio test to show that the radius of convergence of $\sum \frac{(2n)!}{(n!)^2} x^n$ is $\frac{1}{4}$. The idea can be used to show that the radius of convergence for $\sum a_n x^n$ is $R = \lim_{n\to\infty} \left|\frac{a_n}{a_{n+1}}\right|$, provided that the limit converges.
For example, by
\[
\lim_{n\to\infty} \frac{n^p}{(n+1)^p} = \lim_{n\to\infty} \left(\frac{n}{n+1}\right)^p = 1,
\]
\[
\lim_{n\to\infty} \frac{\dfrac{1}{n!}}{\dfrac{1}{(n+1)!}} = \lim_{n\to\infty} (n + 1) = +\infty.
\]
The radius of convergence is the square root of the radius of convergence of the series
\[
\sum_{n=0}^{\infty} \frac{(-1)^n x^n}{2^{2n}(n!)^2}.
\]
By
\[
\lim_{n\to\infty} \left| \frac{\dfrac{(-1)^n}{2^{2n}(n!)^2}}{\dfrac{(-1)^{n+1}}{2^{2n+2}((n+1)!)^2}} \right| = \lim_{n\to\infty} 4(n+1)^2 = +\infty,
\]
Exercise 4.4.3. Suppose $a_n$ can be divided into two subsequences $a_{n'}$ and $a_{n''}$. Suppose $\lim_{n'\to\infty} \sqrt[n']{|a_{n'}|} = l'$ and $\lim_{n''\to\infty} \sqrt[n'']{|a_{n''}|} = l''$ converge. Prove that $\frac{1}{\max\{l', l''\}}$ is the radius of convergence for $\sum a_n x^n$.
Exercise 4.4.4. Suppose $\lim_{n\to\infty} \left|\frac{a_n}{a_{n+1}}\right| = R$ converges. Prove that $R$ is the radius of convergence for $\sum a_n x^n$.
xn P (3n)! n P (2 + (−1)n )n n
11. x . 16. x .
P
6. .
an + bn n!(2n)! log n
P n+1 n n 2 2
P n + a n n+2 n + 1 n n2
1. x . 3. x . 5.
P
(−1)n x .
n n+b n
n2 n2
P an + b n n−2
n+1 n+1
xn . (−1)n xn .
P P
2. 4. 6. x .
n n cn + d
\[
A(x) = 1 + \frac{x^3}{2 \cdot 3} + \frac{x^6}{2 \cdot 3 \cdot 5 \cdot 6} + \frac{x^9}{2 \cdot 3 \cdot 5 \cdot 6 \cdot 8 \cdot 9} + \cdots.
\]
The product should be the sum of $a_i b_j x^{i+j}$. We get the power series $\sum c_n x^n$ by gathering all the terms with power $x^n$.
The power series can also be differentiated or integrated term by term within
the radius of convergence.
Theorem 4.4.3. Suppose $f(x) = \sum_{n=0}^{\infty} a_n x^n$ for $|x| < R$. Then
\[
f'(x) = \sum_{n=1}^{\infty} (a_n x^n)' = a_1 + 2a_2 x + 3a_3 x^2 + \cdots + na_n x^{n-1} + \cdots,
\]
and
\[
\int_0^x f(t)dt = \sum_{n=0}^{\infty} \int_0^x a_n t^n dt = a_0 x + \frac{a_1}{2} x^2 + \frac{a_2}{3} x^3 + \cdots + \frac{a_n}{n+1} x^{n+1} + \cdots
\]
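Example 4.4.11. Taking the derivative of $\frac{1}{1-x} = \sum_{n=0}^{\infty} x^n$, we get
\[
\begin{aligned}
\frac{1}{(1-x)^2} &= 1 + 2x + 3x^2 + 4x^3 + \cdots + nx^{n-1} + \cdots, \\
\frac{2}{(1-x)^3} &= 2 \cdot 1 + 3 \cdot 2x + 4 \cdot 3x^2 + 5 \cdot 4x^3 + \cdots + n(n-1)x^{n-2} + \cdots.
\end{aligned}
\]
Therefore
\[
\begin{aligned}
1^2 x + 2^2 x^2 + \cdots + n^2 x^n + \cdots
&= \sum_{n=1}^{\infty} n^2 x^n = x \sum_{n=1}^{\infty} nx^{n-1} + x^2 \sum_{n=2}^{\infty} n(n-1)x^{n-2} \\
&= x \frac{1}{(1-x)^2} + x^2 \frac{2}{(1-x)^3} = \frac{x(1+x)}{(1-x)^3}.
\end{aligned}
\]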
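The closed form can be sanity-checked by comparing a long partial sum with $\frac{x(1+x)}{(1-x)^3}$ at points inside the radius of convergence. This is a minimal sketch; the function names `sum_n2_xn` and `n2_closed_form` and the cutoff of 200 terms are illustrative choices.

```python
def sum_n2_xn(x: float, terms: int = 200) -> float:
    """Partial sum of n^2 x^n for n = 1..terms."""
    return sum(n * n * x**n for n in range(1, terms + 1))

def n2_closed_form(x: float) -> float:
    """Closed form x(1 + x) / (1 - x)^3, valid for |x| < 1."""
    return x * (1 + x) / (1 - x) ** 3

for x in (0.5, -0.3, 0.1):
    print(x, sum_n2_xn(x), n2_closed_form(x))
```

For $|x|$ close to 1 the cutoff would have to be raised, because the tail of the series decays only like $n^2|x|^n$.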
\[
\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + (-1)^{n+1} \frac{x^n}{n} + \cdots, \quad \text{for } |x| < 1.
\]
Note that in Example 4.4.1, by estimating the remainder, we were able to prove the equality rigorously only for $-\frac{1}{2} < x < 1$. Here by using termwise integration, we get the equality for all $x$ within the radius of convergence.
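Example 4.4.12. By integrating $\frac{1}{1+x^2} = \sum_{n=0}^{\infty} (-1)^n x^{2n}$, we get the Taylor series of $\arctan x$
\[
\arctan x = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1} x^{2n+1} = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots + \frac{(-1)^n}{2n+1} x^{2n+1} + \cdots \quad \text{for } |x| < 1.
\]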
Exercise 4.4.10. Use the product of power series to verify the identity sin 2x = 2 sin x cos x.
Exercise 4.4.11. Find Taylor series and determine the radius of convergence.
1. $\frac{1}{(x-1)(x-2)}$, at 0.
2. $\sqrt{x}$, at $x = 2$.
3. $\sin x^2$, at 0.
4. $\sin^2 x$, at 0.
5. $\sin x$, at $\frac{\pi}{2}$.
6. $\sin 2x$, at $\frac{\pi}{2}$.
7. $\arcsin x$, at 0.
8. $\arctan x$, at 0.
9. $\int_0^x \frac{\sin t}{t} dt$, at 0.
Exercise 4.4.12. Given the Taylor series $\sum_{n=0}^{\infty} a_n x^n$ of $f(x)$, find the Taylor series of $\frac{f(x)}{1+x}$.
Exercise 4.4.13. Show that the function $f(x) = \sum_{n=0}^{\infty} \frac{x^n}{(n!)^2}$ satisfies $xf'' + f' - f = 0$.
Exercise 4.4.14. Show that the Airy function in Exercise 4.4.8 satisfies $f'' - xf = 0$.
Exercise 4.4.15. Show that the Bessel functions in Example 4.4.10 and Exercise 4.4.7 satisfy
A power series may or may not converge at the radius of convergence (i.e., at
±R). If it converges, then the following gives the value of the sum.
Theorem 4.4.4. Suppose $\sum_{n=0}^{\infty} a_n x^n$ converges for $|x| < R$ and $x = R$. Then $\sum_{n=0}^{\infty} a_n R^n = \lim_{x\to R^-} \sum_{n=0}^{\infty} a_n x^n$.
Example 4.4.13. By Examples 4.4.11 and 4.3.1, we know $\log(1 + x) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} x^n$ converges for $|x| < 1$, and the series converges at $x = 1$. Since $\log(1 + x)$ is continuous at $x = 1$, by Theorem 4.4.4, we get
\[
\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} = \lim_{x\to 1^-} \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} x^n = \lim_{x\to 1^-} \log(1 + x) = \log(1 + 1) = \log 2.
\]
Exercise 4.4.16. Find the sum. Discuss what happens at the radius of convergence.
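1. $\sum_{n=1}^{\infty} n^2 x^n$.
2. $\sum_{n=1}^{\infty} n^3 x^n$.
3. $\sum_{n=2}^{\infty} \frac{(x-1)^n}{n(n-1)}$.
4. $\sum_{n=1}^{\infty} \frac{x^n}{n(n+1)(n+2)}$.
5. $\sum_{n=0}^{\infty} \frac{x^{2n+1}}{2n+1}$.
6. $\sum_{n=0}^{\infty} \frac{x^n}{2n+1}$.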
Exercise 4.4.17. Find the sum. Discuss what happens at the radius of convergence.
P∞ x 2n+1 P∞ x2n P∞ xn
1. (−1)n . 2. (−1)n . 3. n=1 .
n=1
n! n=1
(2n + 1)! 2n (2n − 1)!
Exercise 4.4.18. Find the Taylor series of the function and the radius of convergence. Then
explain why the sum of the Taylor series is the given function.
4.5. FOURIER SERIES 333
1. $\arcsin x$.
2. $\int_0^x \frac{\sin t}{t} dt$.
3. $\arctan x$.
4. $\int_0^x e^{-t^2} dt$.
5. $\log(x + \sqrt{1 + x^2})$.
6. $\int_0^x \frac{\log(1 - t)}{t} dt$.
Note that there is no $b_0$ because $\sin 0x = 0$. The factor $\frac{1}{2}$ for $a_0$ allows us to have a unified formula for $a_n$ later. Moreover, like the Taylor series, we use $\sim$ instead of $=$ to indicate that the equality is yet to be established.
The approximation of a periodic function by (linear combinations of) trigonometric functions is not measured by the values at single points, but rather by the overall approximation in terms of the integral of the difference function. This means that we can only expect the sum of the trigonometric series to be equal to the function “almost everywhere”.
We expect the sum of the trigonometric series to be equal to $f(x)$ as far as integrations are concerned. We also assume that the integration of infinite series can be calculated term by term. Then we get
\[
\begin{aligned}
\int_0^{2\pi} f(x) \cos nx\,dx &= \frac{a_0}{2} \int_0^{2\pi} \cos nx\,dx + \sum_{m=1}^{\infty} a_m \int_0^{2\pi} \cos mx \cos nx\,dx \\
&\quad + \sum_{m=1}^{\infty} b_m \int_0^{2\pi} \sin mx \cos nx\,dx = \pi a_n, \\
\int_0^{2\pi} f(x) \sin nx\,dx &= \frac{a_0}{2} \int_0^{2\pi} \sin nx\,dx + \sum_{m=1}^{\infty} a_m \int_0^{2\pi} \cos mx \sin nx\,dx \\
&\quad + \sum_{m=1}^{\infty} b_m \int_0^{2\pi} \sin mx \sin nx\,dx = \pi b_n.
\end{aligned}
\]
the right side is the Fourier series of $\sin^4 x$. The coefficients give
\[
\int_0^{2\pi} \sin^4 x\,dx = \frac{3\pi}{4}, \quad
\int_0^{2\pi} \sin^4 x \cos 2x\,dx = -\frac{\pi}{2}, \quad
\int_0^{2\pi} \sin^4 x \cos 4x\,dx = \frac{\pi}{8}.
\]
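These three integrals can be confirmed by numerical integration. The following is a minimal sketch: the helper names `integrate_period` and `sin4_times_cos` are illustrative, and the uniform Riemann sum is chosen because it is extremely accurate for smooth periodic integrands such as trigonometric polynomials.

```python
import math

def integrate_period(g, samples: int = 1000) -> float:
    """Uniform Riemann sum of g over [0, 2*pi]."""
    h = 2 * math.pi / samples
    return h * sum(g(k * h) for k in range(samples))

def sin4_times_cos(n: int) -> float:
    """Integral of sin^4(x) cos(nx) over one period, i.e. pi * a_n for sin^4."""
    return integrate_period(lambda x: math.sin(x) ** 4 * math.cos(n * x))

for n, expected in ((0, 3 * math.pi / 4), (2, -math.pi / 2), (4, math.pi / 8)):
    print(n, sin4_times_cos(n), expected)
```

For other values of $n$ the integral is numerically zero, as expected from the fact that only $\cos 0x$, $\cos 2x$ and $\cos 4x$ carry nonzero coefficients here.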
then
\[
f(x) = \begin{cases} 0, & \text{if } 2k\pi \le x < 2k\pi + a, \\ 1, & \text{if } 2k\pi + a \le x < 2(k+1)\pi. \end{cases}
\]
Example 4.5.3. Let $f(x)$ be the even periodic function of period $2\pi$ satisfying
\[
f(x) = \begin{cases} 1, & \text{if } |x| \le a, \\ 0, & \text{if } a < |x| \le \pi. \end{cases}
\]
We have
\[
\begin{aligned}
a_0 &= \frac{1}{\pi} \int_0^{2\pi} f(x)dx = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)dx = \frac{2}{\pi} \int_0^{\pi} f(x)dx = \frac{2}{\pi} \int_0^a 1\,dx = \frac{2a}{\pi}, \\
a_n &= \frac{1}{\pi} \int_0^{2\pi} f(x) \cos nx\,dx = \frac{2}{\pi} \int_0^{\pi} f(x) \cos nx\,dx = \frac{2}{\pi} \int_0^a \cos nx\,dx = \frac{2 \sin na}{n\pi}.
\end{aligned}
\]
The calculation used the fact that $\int_0^p f(x)dx = \int_a^{a+p} f(x)dx$ for any periodic function of period $p$. We may also calculate $b_n$ and find $b_n = 0$. In fact, for an even function, we expect all the odd terms $b_n \sin nx$ to vanish.
Exercise 4.5.1. Suppose $f(x)$ is an even periodic function of period $2\pi$. Prove that
\[
a_n = \frac{2}{\pi} \int_0^{\pi} f(x) \cos nx\,dx, \quad b_n = 0.
\]
Exercise 4.5.2. Suppose $f(x)$ is an odd periodic function of period $2\pi$. Prove that
\[
a_n = 0, \quad b_n = \frac{2}{\pi} \int_0^{\pi} f(x) \sin nx\,dx.
\]
Exercise 4.5.3. Extend $f(x)$ on $\left[0, \frac{\pi}{2}\right]$ to a periodic function of period $2\pi$, such that its Fourier series is of the form $\sum_{n=1}^{\infty} a_n \cos(2n-1)x$? How about $\sum_{n=1}^{\infty} b_n \sin(2n-1)x$?
Exercise 4.5.4. Given the Fourier series of $f(x)$ and $g(x)$, what is the Fourier series of $af(x) + bg(x)$? Use the idea and Example 4.5.2 to find the Fourier series of the periodic function $f(x)$ of period $2\pi$ satisfying
\[
f(x) = \begin{cases} 1, & \text{if } a \le x < b, \\ 0, & \text{if } 0 \le x < a \text{ or } b \le x < 2\pi. \end{cases}
\]
Exercise 4.5.5. Suppose f (x) is a periodic function of period 2π. What is the relation
between the Fourier series of f (x) and f (x + a)? Use the idea and Example 4.5.3 to derive
Example 4.5.2.
A periodic function $f(x)$ of period $p$ may be converted to a periodic function $f\left(\frac{p}{2\pi}x\right)$ of period $2\pi$. Then the Fourier series of $f\left(\frac{p}{2\pi}x\right)$ gives the Fourier series of $f(x)$. Alternatively, the basic periodic functions $\cos nx$ and $\sin nx$ of period $2\pi$ give the basic periodic functions $\cos\frac{2n\pi}{p}x$ and $\sin\frac{2n\pi}{p}x$ of period $p$, and we expect the Fourier series of $f(x)$ to be
\[
f(x) \sim \frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n \cos\frac{2n\pi}{p}x + b_n \sin\frac{2n\pi}{p}x \right).
\]
By an argument similar to the case of period $2\pi$, we get the Fourier coefficients
\[
\begin{aligned}
a_n &= \frac{2}{p} \int_0^p f(x) \cos\frac{2n\pi}{p}x\,dx, \quad n \ge 0, \\
b_n &= \frac{2}{p} \int_0^p f(x) \sin\frac{2n\pi}{p}x\,dx, \quad n > 0.
\end{aligned}
\]
Exercise 4.5.7. Use Example 4.5.2 to derive the Fourier series of the periodic function of period 1 satisfying
\[
f(x) = \begin{cases} 1, & \text{if } 0 \le x < a, \\ 0, & \text{if } a \le x < 1. \end{cases}
\]
Exercise 4.5.8. Derive the formulae for the Fourier coefficients of periodic even or odd
functions of period p, similar to Exercises 4.5.1 and 4.5.2.
Note that we do not care about the value at the integer points because it does not affect the Fourier coefficients, which are
\[
\begin{aligned}
a_0 &= \frac{2}{1} \int_0^1 x\,dx = 1, \\
a_n &= \frac{2}{1} \int_0^1 x \cos 2n\pi x\,dx = \frac{1}{n\pi} \int_0^1 x\,d\sin 2n\pi x = -\frac{1}{n\pi} \int_0^1 \sin 2n\pi x\,dx = 0, \\
b_n &= \frac{2}{1} \int_0^1 x \sin 2n\pi x\,dx = -\frac{1}{n\pi} \int_0^1 x\,d\cos 2n\pi x \\
&= -\frac{1}{n\pi} \left( 1 - \int_0^1 \cos 2n\pi x\,dx \right) = -\frac{1}{n\pi}.
\end{aligned}
\]
We note that the reason for $a_n = 0$ for $n \ne 0$ is that $f(x) - \frac{1}{2}$ is an odd function.
We indicate x ∈ (0, 1) because the function equals x only on the interval. The
function is x − 1 instead of x on the interval (1, 2).
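With $a_0 = 1$, $a_n = 0$ and $b_n = -\frac{1}{n\pi}$, the partial sums of the Fourier series of $x$ on $(0, 1)$ are $\frac{1}{2} - \frac{1}{\pi}\sum_{n=1}^{N} \frac{\sin 2n\pi x}{n}$. The sketch below evaluates them; the function name `fourier_partial_sum` and the sample points are illustrative, and the code only illustrates the convergence, it does not prove it.

```python
import math

def fourier_partial_sum(x: float, terms: int) -> float:
    """S_N(x) = 1/2 - (1/pi) * sum_{n=1}^{N} sin(2 n pi x) / n."""
    tail = sum(math.sin(2 * n * math.pi * x) / n for n in range(1, terms + 1))
    return 0.5 - tail / math.pi

for x in (0.0, 0.25, 0.5, 1.25):
    # At interior points of (0, 1) the values approach x; at x = 1.25 they
    # approach 0.25 = x - 1, because the function is extended periodically.
    print(x, fourier_partial_sum(x, 2000))
```

At the integer points every sine term vanishes and the partial sums equal $\frac{1}{2}$, the average of the one sided limits 0 and 1.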
\[
\begin{aligned}
a_0 &= 2 \cdot \frac{2}{2} \int_0^1 f(x)dx = 2 \int_0^1 x\,dx = 1, \\
a_n &= \int_{-1}^{1} f(x) \cos n\pi x\,dx = 2 \int_0^1 x \cos n\pi x\,dx = \begin{cases} -\dfrac{4}{n^2\pi^2}, & \text{if } n \text{ is odd}, \\ 0, & \text{if } n \text{ is even}. \end{cases}
\end{aligned}
\]
We may also extend the function to an odd periodic function of period 2. This is also the extension of the function $x$ on $(-1, 1)$ to a periodic function of period 2. The Fourier coefficients $a_n = 0$ for the odd function, and
\[
b_n = 2 \int_0^1 x \sin n\pi x\,dx = \frac{(-1)^{n+1} 2}{n\pi}.
\]
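Example 4.5.6. Let $f(x)$ be the periodic function of period 1 extending the function $x^2$ on $(0, 1)$. Then
\[
\begin{aligned}
a_0 &= 2 \int_0^1 x^2 dx = \frac{2}{3}, \\
a_n &= 2 \int_0^1 x^2 \cos 2n\pi x\,dx = \frac{1}{n^2\pi^2}, \\
b_n &= 2 \int_0^1 x^2 \sin 2n\pi x\,dx = -\frac{1}{n\pi}.
\end{aligned}
\]
The Fourier series is
\[
x^2 \sim \frac{1}{3} + \sum_{n=1}^{\infty} \left( \frac{1}{n^2\pi^2} \cos 2n\pi x - \frac{1}{n\pi} \sin 2n\pi x \right), \quad x \in (0, 1).
\]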
Exercise 4.5.9. The periodic function is given on one interval of period length. Find the
Fourier series.
Exercise 4.5.11. Write the formula for the Fourier coefficients of the even periodic (of
period 2p) extension of a function f (x) on (0, p). What about the odd extension?
Exercise 4.5.12. Extend the function on (0, p) to even and odd functions of period 2p and
compute the Fourier series.
Exercise 4.5.13. Given the Fourier series of functions f (x) and g(x) of period p. Find the
Fourier series of the following periodic functions.
Exercise 4.5.14. Use Examples 4.5.4 and 4.5.5 to find the Fourier series.
In this course, $f(x)$ is a real valued function, and $a_n$, $b_n$ are real numbers. Therefore $c_{-n} e^{i(-n)x}$ is the complex conjugate of $c_n e^{inx}$, and the usual Fourier series is given by
\[
a_0 = 2c_0, \quad a_n \cos nx + b_n \sin nx = 2\mathrm{Re}(c_n e^{inx}).
\]
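Example 4.5.7. For the function in Example 4.5.2, the complex Fourier coefficient is
\[
\begin{aligned}
c_0 &= \frac{1}{2\pi} \int_0^a dx = \frac{a}{2\pi}, \\
c_n &= \frac{1}{2\pi} \int_0^a e^{-inx} dx = \frac{1}{-2in\pi} e^{-inx} \Big|_0^a = \frac{i}{2n\pi} (e^{-ina} - 1).
\end{aligned}
\]
The complex Fourier series is
\[
f(x) \sim \frac{a}{2\pi} + \sum_{n \ne 0} \frac{i}{2n\pi} (e^{-ina} - 1) e^{inx}, \quad x \in (0, 2\pi).
\]
By
\[
\begin{aligned}
2\mathrm{Re}\left[ \frac{i}{2n\pi} (e^{-ina} - 1) e^{inx} \right]
&= \mathrm{Re}\left[ \frac{i}{n\pi} ((\cos na - 1) - i \sin na)(\cos nx + i \sin nx) \right] \\
&= \frac{1}{n\pi} (\sin na \cos nx - (\cos na - 1) \sin nx),
\end{aligned}
\]
we recover the Fourier series in terms of trigonometric functions in Example 4.5.2.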
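Example 4.5.8. Consider the periodic function of period $2\pi$ given by $e^x$ on $(0, 2\pi)$. The complex Fourier coefficient is (recall that $e^{2ik\pi} = 1$)
\[
c_n = \frac{1}{2\pi} \int_0^{2\pi} e^x e^{-inx} dx = \frac{1}{2\pi(1 - in)} e^{(1-in)x} \Big|_0^{2\pi} = \frac{e^{2\pi} - 1}{2\pi(1 - in)}.
\]
The complex Fourier series is
\[
e^x \sim \frac{e^{2\pi} - 1}{2\pi} \sum_{n=-\infty}^{\infty} \frac{1}{1 - in} e^{inx}, \quad x \in (0, 2\pi).
\]
By
\[
\mathrm{Re}\left[ \frac{1}{1 - in} e^{inx} \right] = \mathrm{Re}\left[ \frac{1 + in}{1 + n^2} (\cos nx + i \sin nx) \right] = \frac{\cos nx - n \sin nx}{1 + n^2},
\]
we get the Fourier series in terms of trigonometric functions
\[
e^x \sim \frac{e^{2\pi} - 1}{\pi} \left( \frac{1}{2} + \sum_{n=1}^{\infty} \frac{\cos nx - n \sin nx}{1 + n^2} \right), \quad x \in (0, 2\pi).
\]
By changing $x$ to $2\pi x$, we get the Fourier series for the periodic function of period 1 given by $e^{2\pi x}$ on $(0, 1)$
\[
e^{2\pi x} \sim \frac{e^{2\pi} - 1}{\pi} \left( \frac{1}{2} + \sum_{n=1}^{\infty} \frac{\cos 2n\pi x - n \sin 2n\pi x}{1 + n^2} \right), \quad x \in (0, 1).
\]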
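The complex coefficient of $e^x$ can be cross-checked against a direct numerical evaluation of $\frac{1}{2\pi}\int_0^{2\pi} e^x e^{-inx}dx$. This is a minimal sketch using the midpoint rule; the function names `c_n_numeric` and `c_n_formula` and the number of sample points are illustrative choices.

```python
import cmath
import math

def c_n_numeric(n: int, samples: int = 20000) -> complex:
    """Midpoint-rule approximation of (1/(2 pi)) * integral_0^{2 pi} e^x e^{-inx} dx."""
    h = 2 * math.pi / samples
    total = sum(cmath.exp((1 - 1j * n) * (k + 0.5) * h) for k in range(samples))
    return total * h / (2 * math.pi)

def c_n_formula(n: int) -> complex:
    """Closed form (e^{2 pi} - 1) / (2 pi (1 - i n))."""
    return (math.exp(2 * math.pi) - 1) / (2 * math.pi * (1 - 1j * n))

for n in (-2, 0, 1, 3):
    print(n, c_n_numeric(n), c_n_formula(n))
```

The pair $n$, $-n$ gives complex conjugate values, matching the remark that $c_{-n}e^{i(-n)x}$ is the complex conjugate of $c_n e^{inx}$ for real valued functions.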
Example 4.5.9. Consider the function $f(x) = \frac{a \sin x}{1 - 2a \cos x + a^2}$, with $|a| < 1$. We rewrite the function and take the Taylor expansion in terms of $e^{inx} = z^n$, $z = e^{ix}$,
\[
\begin{aligned}
\frac{a \sin x}{1 - 2a \cos x + a^2}
&= \frac{a \dfrac{e^{ix} - e^{-ix}}{2i}}{1 - 2a \dfrac{e^{ix} + e^{-ix}}{2} + a^2}
= \frac{a}{2i} \cdot \frac{e^{ix} - e^{-ix}}{(1 - ae^{ix})(1 - ae^{-ix})} \\
&= \frac{1}{2i} \left( \frac{1}{1 - ae^{ix}} - \frac{1}{1 - ae^{-ix}} \right)
= \frac{1}{2i} \left( \sum_{n=0}^{\infty} (ae^{ix})^n - \sum_{n=0}^{\infty} (ae^{-ix})^n \right) \\
&= \frac{1}{2i} \sum_{n=1}^{\infty} a^n (e^{inx} - e^{-inx}) = \sum_{n=1}^{\infty} a^n \sin nx.
\end{aligned}
\]
Note that we have geometric series because $|ae^{ix}| = |ae^{-ix}| = |a| < 1$. Moreover, the Fourier series is actually equal to the function. The example also shows the connection between the Fourier series and the power series.
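Because the series here is geometric, the equality between the function and its Fourier series is easy to test numerically. The sketch below compares a truncated sum with the closed form; the function names `sine_series` and `sine_closed_form` and the cutoff of 200 terms are illustrative.

```python
import math

def sine_series(a: float, x: float, terms: int = 200) -> float:
    """Partial sum of a^n sin(nx) for n = 1..terms."""
    return sum(a**n * math.sin(n * x) for n in range(1, terms + 1))

def sine_closed_form(a: float, x: float) -> float:
    """a sin x / (1 - 2a cos x + a^2), for |a| < 1."""
    return a * math.sin(x) / (1 - 2 * a * math.cos(x) + a * a)

for a, x in ((0.5, 1.0), (-0.7, 2.5), (0.9, 0.1)):
    print(a, x, sine_series(a, x), sine_closed_form(a, x))
```

The truncation error is at most $\frac{|a|^{N+1}}{1 - |a|}$ for $N$ terms, so values of $|a|$ close to 1 need a larger cutoff.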
Exercise 4.5.15. Find the complex form of the Fourier series of $a^x$ on $(0, 2\pi)$. Then by taking $a = e^{\frac{1}{2\pi}}$, derive the Fourier series of the function $e^x$ on $(0, 1)$.
Exercise 4.5.16. Find the complex form of the Fourier series of $e^{\frac{ix}{2}}$ on $(0, 2\pi)$. Then derive the Fourier series of $\cos x$ and $\sin x$ on $(0, \pi)$ by taking the real and imaginary parts.
Exercise 4.5.17. What is the complex form of the Fourier series for a periodic function of
period p?
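coefficients $A_n$, $B_n$ of $f'(x)$
\[
\begin{aligned}
A_0 &= \frac{2}{p} \int_0^p f'(x)dx = \frac{2}{p} (f(p) - f(0)) = 0, \\
A_n &= \frac{2}{p} \int_0^p f'(x) \cos\frac{2n\pi}{p}x\,dx = \frac{2}{p} \int_0^p \cos\frac{2n\pi}{p}x\,df(x) \\
&= \frac{2}{p} \left( f(p) - f(0) + \frac{2n\pi}{p} \int_0^p f(x) \sin\frac{2n\pi}{p}x\,dx \right) = \frac{2n\pi}{p} b_n, \\
B_n &= \frac{2}{p} \int_0^p f'(x) \sin\frac{2n\pi}{p}x\,dx = \frac{2}{p} \int_0^p \sin\frac{2n\pi}{p}x\,df(x) \\
&= -\frac{2}{p} \cdot \frac{2n\pi}{p} \int_0^p f(x) \cos\frac{2n\pi}{p}x\,dx = -\frac{2n\pi}{p} a_n.
\end{aligned}
\]
This shows that we may differentiate the Fourier series term by term.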
Proposition 4.5.3. Suppose $f(x)$ is a periodic function of period $p$. If the Fourier coefficients of $f(x)$ are $a_n$, $b_n$ and $a_0 = 0$, then the Fourier series of $F(x) = \int_0^x f(t)dt$ is
\[
\int_0^x f(t)dt \sim \frac{A_0}{2} + \frac{p}{2\pi} \sum_{n=1}^{\infty} \frac{1}{n} \left( -b_n \cos\frac{2n\pi}{p}x + a_n \sin\frac{2n\pi}{p}x \right).
\]
Exercise 4.5.20. Derive the Fourier series of $x^3$ on $(0, 1)$ from the Fourier series of $x^2$.
Exercise 4.5.21. Derive the Fourier series of |x| on (−1, 1) from the Fourier series of its
derivative.
Exercise 4.5.22. Suppose $f(x)$ is continuously differentiable on $[0, p]$, with perhaps different $f(0^+)$ and $f(p^-)$. Then we can extend both $f(x)$ and $f'(x)$ to periodic functions of period $p$. If the Fourier coefficients of the extended $f(x)$ are $a_n$, $b_n$, prove that
\[
f'(x) \sim \frac{f(p^-) - f(0^+)}{p} + \frac{2}{p} \sum_{n=1}^{\infty} \left( \left( f(p^-) - f(0^+) + n\pi b_n \right) \cos \frac{2n\pi}{p} x - n\pi a_n \sin \frac{2n\pi}{p} x \right).
\]
Theorem 4.5.4. Suppose $f(x)$ is a periodic function. If $f(x)$ has one sided limits $f(x_0^-)$ and $f(x_0^+)$ at $x_0$, and there is $M$, such that
\[
x < x_0 \text{ and close to } x_0 \implies |f(x) - f(x_0^-)| \le M |x - x_0|,
\]
and
\[
x > x_0 \text{ and close to } x_0 \implies |f(x) - f(x_0^+)| \le M |x - x_0|.
\]
Then the Fourier series of $f(x)$ converges to $\dfrac{f(x_0^+) + f(x_0^-)}{2}$ at $x = x_0$.
The condition means that the value of f (x) lies in two “corners” on the two sides
of x0 .
[Figure: the two corners on the two sides of $x_0$, with vertices at heights $f(x_0^-)$ and $f(x_0^+)$.]
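Example 4.5.11. In Example 4.5.4, we find the Fourier series of $x$ on $(0, 1)$. The function satisfies the condition of Theorem 4.5.4 at $x_0 = 0$. We conclude that
\[
\frac{0 + 1}{2} = \frac{f(0^-) + f(0^+)}{2} = \frac{1}{2} - \frac{1}{\pi} \sum_{n=1}^{\infty} \frac{1}{n} \sin 2n\pi 0.
\]
This is trivially true. We get similar trivial equalities at $x_0 = 1$ and $x_0 = \frac{1}{2}$.
If we take $x_0 = \frac{1}{3}$, then, grouping the terms as $n = 3k + 1$, $3k + 2$, $3k + 3$ with $k \ge 0$,
\[
\begin{aligned}
\frac{1}{3}
&= \frac{1}{2} - \frac{1}{\pi} \sum_{k=0}^{\infty} \left( \frac{1}{3k+1} \sin \frac{2(3k+1)\pi}{3} + \frac{1}{3k+2} \sin \frac{2(3k+2)\pi}{3} + \frac{1}{3k+3} \sin \frac{2(3k+3)\pi}{3} \right) \\
&= \frac{1}{2} - \frac{1}{\pi} \sum_{k=0}^{\infty} \left( \frac{1}{3k+1} \cdot \frac{\sqrt{3}}{2} - \frac{1}{3k+2} \cdot \frac{\sqrt{3}}{2} \right).
\end{aligned}
\]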
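Similarly, by evaluating the Fourier series at $x_0 = \frac{1}{4}$ and $x_0 = \frac{1}{8}$, we get
\[
\begin{aligned}
1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \frac{1}{11} + \frac{1}{13} - \frac{1}{15} + \frac{1}{17} - \frac{1}{19} + \cdots &= \frac{\pi}{4}, \\
1 + \frac{1}{3} - \frac{1}{5} - \frac{1}{7} + \frac{1}{9} + \frac{1}{11} - \frac{1}{13} - \frac{1}{15} + \frac{1}{17} + \frac{1}{19} - \cdots &= \frac{\pi}{2\sqrt{2}}.
\end{aligned}
\]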
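Both sums converge slowly, so a numerical check is easier when consecutive terms are grouped. The sketch below sums the first series in pairs and the second in blocks of four, which makes each grouped term of size about $1/(8k^2)$; the function names and the number of blocks are choices made for this illustration.

```python
import math

def leibniz_sum(blocks=200000):
    """1 - 1/3 + 1/5 - 1/7 + ..., summed in pairs of consecutive terms."""
    return sum(1 / (4 * k + 1) - 1 / (4 * k + 3) for k in range(blocks))

def two_plus_two_minus_sum(blocks=200000):
    """1 + 1/3 - 1/5 - 1/7 + 1/9 + 1/11 - ..., summed in blocks of four terms."""
    return sum(
        1 / (8 * k + 1) + 1 / (8 * k + 3) - 1 / (8 * k + 5) - 1 / (8 * k + 7)
        for k in range(blocks)
    )

if __name__ == "__main__":
    print(leibniz_sum(), math.pi / 4)
    print(two_plus_two_minus_sum(), math.pi / (2 * math.sqrt(2)))
```

With $N$ blocks the truncation error is roughly $1/(8N)$ in both cases.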
Figure 4.5.2: Partial sums of the Fourier series for $x$ on $(0, 1)$.
Example 4.5.12. The periodic function of period 2 given by $|x|$ on $(-1, 1)$ satisfies the condition of Theorem 4.5.4 everywhere. Evaluating the Fourier series in Example 4.5.4 at $x = 0$, we get
\[
0 = \frac{1}{2} - \frac{4}{\pi^2} \sum_{n=0}^{\infty} \frac{1}{(2n+1)^2}.
\]
Therefore
\[
\sum_{n=1}^{\infty} \frac{1}{n^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots = \frac{4}{3} \sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} = \frac{\pi^2}{6}.
\]
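If we evaluate at $x = \frac{1}{4}$, then $\cos \frac{(2n+1)\pi}{4} = \pm \frac{1}{\sqrt{2}}$, with the positive sign exactly when $2n + 1 \equiv \pm 1 \pmod 8$, and we get
\[
\frac{1}{4} = \frac{1}{2} - \frac{4}{\sqrt{2}\pi^2} \left( \frac{1}{1^2} - \frac{1}{3^2} - \frac{1}{5^2} + \frac{1}{7^2} + \frac{1}{9^2} - \frac{1}{11^2} - \frac{1}{13^2} + \frac{1}{15^2} + \cdots \right).
\]
Combined with the sum $\sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} = \frac{\pi^2}{8}$ of the odd terms above, we get
\[
\begin{aligned}
\frac{1}{1^2} + \frac{1}{7^2} + \frac{1}{9^2} + \frac{1}{15^2} + \frac{1}{17^2} + \frac{1}{23^2} + \cdots &= \left( \frac{1}{16} + \frac{1}{16\sqrt{2}} \right) \pi^2, \\
\frac{1}{3^2} + \frac{1}{5^2} + \frac{1}{11^2} + \frac{1}{13^2} + \frac{1}{19^2} + \frac{1}{21^2} + \cdots &= \left( \frac{1}{16} - \frac{1}{16\sqrt{2}} \right) \pi^2.
\end{aligned}
\]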
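The evaluation at $x = \frac{1}{4}$ can be checked numerically. The factor $\cos\frac{(2n+1)\pi}{4}$ is positive exactly when $2n + 1 \equiv \pm 1 \pmod 8$ and negative when $2n + 1 \equiv \pm 3 \pmod 8$, so the sketch below splits $\sum \frac{1}{(2n+1)^2}$ by residue mod 8 and compares the two parts with $\left(\frac{1}{16} \pm \frac{1}{16\sqrt{2}}\right)\pi^2$. The function name and truncation length are choices made for this illustration.

```python
import math

def grouped_odd_square_sums(terms=200000):
    """Split sum of 1/m^2 over odd m by the sign of cos(m*pi/4).

    Returns (sum over m = 1, 7 mod 8, sum over m = 3, 5 mod 8).
    """
    positive = 0.0
    negative = 0.0
    for n in range(terms):
        m = 2 * n + 1
        if m % 8 in (1, 7):  # cos(m*pi/4) = +1/sqrt(2)
            positive += 1 / m ** 2
        else:                # cos(m*pi/4) = -1/sqrt(2)
            negative += 1 / m ** 2
    return positive, negative

if __name__ == "__main__":
    pos, neg = grouped_odd_square_sums()
    print(pos, (1 / 16 + 1 / (16 * math.sqrt(2))) * math.pi ** 2)
    print(neg, (1 / 16 - 1 / (16 * math.sqrt(2))) * math.pi ** 2)
    # Consistency with the Fourier series of |x| at x = 1/4:
    print(0.5 - 4 / (math.sqrt(2) * math.pi ** 2) * (pos - neg))
```

The last printed value should be close to $\frac{1}{4}$, the value of $|x|$ at $x = \frac{1}{4}$.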
Exercise 4.5.23. Use the Fourier series of $x^2$ on $(0, 1)$ to compute $\sum_{n=1}^{\infty} \frac{1}{n^2}$.
Exercise 4.5.24. Use the Fourier series of $x^2$ on $(-1, 1)$ to get the Fourier series of $x^3$ and $x^4$ on $(-1, 1)$. Then evaluate the Fourier series of $x^4$ to get $\sum_{n=1}^{\infty} \frac{1}{n^4}$.
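Therefore
\[
\int_0^{2\pi} \sin^8 x\,dx = 2\pi \left( \frac{3}{8} \right)^2 + \pi \left( \frac{1}{2} \right)^2 + \pi \left( \frac{1}{8} \right)^2 = \frac{35}{64} \pi.
\]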
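The value $\frac{35}{64}\pi$ can be confirmed numerically. The sketch below assumes the standard expansion $\sin^4 x = \frac{3}{8} - \frac{1}{2}\cos 2x + \frac{1}{8}\cos 4x$, whose coefficients appear squared in the formula, and compares the sum of squares with a direct numerical integration; the function names are illustrative.

```python
import math

def integral_sin8(samples=1000):
    """Rectangle rule on [0, 2*pi]; very accurate for trigonometric polynomials."""
    h = 2 * math.pi / samples
    return sum(math.sin(k * h) ** 8 for k in range(samples)) * h

def pythagorean_value():
    """Sum of squared coefficients of sin^4 x = 3/8 - (1/2) cos 2x + (1/8) cos 4x.

    The constant contributes 2*pi per unit square, each cosine contributes pi.
    """
    return 2 * math.pi * (3 / 8) ** 2 + math.pi * (1 / 2) ** 2 + math.pi * (1 / 8) ** 2

if __name__ == "__main__":
    print(integral_sin8(), pythagorean_value(), 35 * math.pi / 64)
```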
The idea (which is essentially the Pythagorean theorem) leads to the following formula.
The identity means that the Fourier series, considered as a conversion between
periodic functions and sequences of numbers, preserves the “Euclidean length”. The
complex form of Parseval’s identity is
\[
\sum_{n=-\infty}^{\infty} |c_n|^2 = \frac{1}{2\pi} \int_0^{2\pi} |f(x)|^2\,dx.
\]
The proof of this identity requires the notion of uniform convergence to justify the
interchange of summation and integration, and is outside the scope of this course.
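Example 4.5.13. Applying Parseval's identity to the Fourier series of $x$ on $(0, 1)$, we get
\[
\frac{1}{2} + \sum_{n=1}^{\infty} \frac{1}{n^2 \pi^2} = 2 \int_0^1 x^2\,dx = \frac{2}{3}.
\]
This is the same as
\[
\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots = \frac{\pi^2}{6}.
\]
Applying the identity to the Fourier series of $x^2$ on $(0, 1)$, we get
\[
\frac{2}{9} + \sum_{n=1}^{\infty} \frac{1}{n^4 \pi^4} + \sum_{n=1}^{\infty} \frac{1}{n^2 \pi^2} = 2 \int_0^1 x^4\,dx = \frac{2}{5}.
\]
Using $\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$, we get
\[
\frac{1}{1^4} + \frac{1}{2^4} + \frac{1}{3^4} + \frac{1}{4^4} + \cdots = \frac{\pi^4}{90}.
\]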
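The two applications of Parseval's identity can be checked with partial sums. The sketch below evaluates the left sides of the two identities as displayed and compares them with $\frac{2}{3}$ and $\frac{2}{5}$; the function names and the truncation length are choices made for this illustration.

```python
import math

def zeta_partial(s, terms=200000):
    """Partial sum of sum_{n>=1} 1/n^s."""
    return sum(1 / n ** s for n in range(1, terms + 1))

def parseval_lhs_x(terms=200000):
    """1/2 + sum 1/(n^2 pi^2), the left side for x on (0, 1)."""
    return 1 / 2 + zeta_partial(2, terms) / math.pi ** 2

def parseval_lhs_x_squared(terms=200000):
    """2/9 + sum 1/(n^4 pi^4) + sum 1/(n^2 pi^2), the left side for x^2 on (0, 1)."""
    return (2 / 9
            + zeta_partial(4, terms) / math.pi ** 4
            + zeta_partial(2, terms) / math.pi ** 2)

if __name__ == "__main__":
    print(parseval_lhs_x(), 2 / 3)
    print(parseval_lhs_x_squared(), 2 / 5)
    print(zeta_partial(4), math.pi ** 4 / 90)
```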
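Example 4.5.14. Applying Parseval's identity to the complex form of the Fourier series of $e^x$ on $(0, 2\pi)$ in Example 4.5.8, we get
\[
\frac{(e^{2\pi} - 1)^2}{4\pi^2} \sum_{n=-\infty}^{\infty} \frac{1}{1 + n^2} = \frac{1}{2\pi} \int_0^{2\pi} e^{2x}\,dx = \frac{e^{4\pi} - 1}{4\pi},
\]
or
\[
\sum_{n=1}^{\infty} \frac{1}{1 + n^2} = \frac{1}{2} + \frac{1}{5} + \frac{1}{10} + \frac{1}{17} + \frac{1}{26} + \cdots = \frac{\pi (e^{2\pi} + 1)}{2 (e^{2\pi} - 1)} - \frac{1}{2}.
\]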
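The closed form for $\sum_{n=1}^{\infty} \frac{1}{1+n^2}$ can be compared with a partial sum. This is a minimal sketch; the function names and the truncation length are choices made for the illustration, and the tail after $N$ terms is about $1/N$.

```python
import math

def partial_sum(terms=200000):
    """Partial sum of sum_{n>=1} 1/(1 + n^2)."""
    return sum(1 / (1 + n * n) for n in range(1, terms + 1))

def closed_form():
    """pi (e^{2 pi} + 1) / (2 (e^{2 pi} - 1)) - 1/2."""
    e = math.exp(2 * math.pi)
    return math.pi * (e + 1) / (2 * (e - 1)) - 1 / 2

if __name__ == "__main__":
    print(partial_sum(), closed_form())
```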
Exercise 4.5.25. Apply Parseval's identity to the Fourier series in Example 4.5.2 to find $\sum_{n=1}^{\infty} \frac{\sin^2 na}{n^2}$ and $\sum_{n=1}^{\infty} \frac{\cos^2 na}{n^2}$.
Exercise 4.5.26. Apply Parseval's identity to the Fourier series of $x^3$ and $x^4$ on $(0, 1)$ to find $\sum_{n=1}^{\infty} \frac{1}{n^6}$ and $\sum_{n=1}^{\infty} \frac{1}{n^8}$.
Exercise 4.5.27. Find $\sum_{n=1}^{\infty} \frac{1}{1 + n^2}$ by evaluating the Fourier series in Example 4.5.8.
Parseval's identity immediately implies the following important properties of the Fourier coefficients:
Theorem 4.5.7 (Isoperimetric Inequality). If a differentiable simple closed curve has perimeter $L$, then the area
\[
A \le \frac{L^2}{4\pi},
\]
where the maximum is achieved when the curve is a circle.
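As an illustration of the inequality (not a proof), the ratio $\frac{L^2}{4\pi A}$ can be computed for ellipses. The sketch below assumes the parametrization $(a\cos t, b\sin t)$ with area $\pi ab$ and computes the perimeter numerically; the function names are chosen for this illustration.

```python
import math

def ellipse_perimeter(a, b, samples=20000):
    """Arc length of (a cos t, b sin t), 0 <= t <= 2*pi, by the rectangle rule in t."""
    h = 2 * math.pi / samples
    return sum(
        math.hypot(a * math.sin(k * h), b * math.cos(k * h)) for k in range(samples)
    ) * h

def isoperimetric_ratio(a, b):
    """L^2 / (4*pi*A) for the ellipse with semi-axes a and b, where A = pi*a*b."""
    length = ellipse_perimeter(a, b)
    return length * length / (4 * math.pi * (math.pi * a * b))

if __name__ == "__main__":
    for a, b in ((1, 1), (2, 1), (5, 1)):
        print(a, b, isoperimetric_ratio(a, b))
```

The ratio equals 1 for the circle $a = b$ and grows as the ellipse becomes more elongated, consistent with the theorem.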
is the parametric equation of our simple closed curve, then the area is given by
\[
A = \int_C x\,dy = \int_0^{2\pi} x(t) y'(t)\,dt
\]
so that it has perimeter $L = 2\pi$. By shifting the picture, we can also assume that $\int_0^{2\pi} x(t)\,dt = \int_0^{2\pi} y(t)\,dt = 0$.
Writing down the Fourier series for $x(t)$ and $x'(t)$ respectively:
\[
\begin{aligned}
x(t) &= \sum_{n=-\infty}^{\infty} c_n e^{int}, \quad c_0 = 0, \\
x'(t) &= \sum_{n=-\infty}^{\infty} i n c_n e^{int}.
\end{aligned}
\]
In the application of Parseval's identity, note that equality holds when we have $c_n = 0$ for $|n| \ge 2$. This means
\[
x(t) = a \cos t + b \sin t.
\]
In the application of $2ab \le a^2 + b^2$, equality holds when $a = b$, i.e. $y'(t) = x(t)$, so that
\[
A \le \pi r^2 = \frac{L^2}{4\pi}.
\]
(Alternatively, the ratio $\frac{L^2}{4\pi A}$ is dimensionless, so it is invariant under rescaling and $1 \le \frac{L^2}{4\pi A}$ holds for all curves.)