Understanding Real Analysis-9781315315072
Second Edition
Paul Zorn
St. Olaf College
Northfield, Minnesota
TEXTBOOKS in MATHEMATICS
Series Editors: Al Boggess and Ken Rosen
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to
publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials
or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material
reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If
any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any
form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming,
and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/)
or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400.
CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been
granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification
and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
Contents
Preface
4 Derivatives
4.1 Defining the Derivative
4.2 Calculating Derivatives
4.3 The Mean Value Theorem
4.4 Sequences and Series of Functions
5 Integrals
5.1 The Riemann Integral: Definition and Examples
5.2 Properties of the Integral
5.3 Integrability
5.4 Some Fundamental Theorems
Index
Preface
New sections on topology and compactness. Two entirely new sections, Sec-
tions 3.5 and 3.6, introduce basic ideas of topology and compactness. Some gen-
eral definitions and principles are discussed, but the setting is almost always the
real line, treated (when helpful) as a metric space. Compactness is defined in
terms of open covers (which pose their own linguistic challenges!) but there is
strong emphasis on closed and bounded sets in R, via the Heine–Borel theorem.
This material is very occasionally alluded to, but not really depended on, in later
sections. In this sense these sections are essentially self-contained, and could be
used for independent study or enrichment projects.
New problems and exercises. This edition includes new problems and exercises
in most of the sections. Many new problems focus on helping students understand
and “unpack” (in the sense discussed above) definitions and theorems. Students
are often asked to prove special cases, explore concrete instances of general re-
sults, and the like. Many problem sets have also been more carefully ordered to
distinguish between odd- and even-numbered exercises; most of the former have
hints or solutions in the back.
Thanks
This book, in both of its editions, owes its existence to a large (uncounted but
surely countable) set of teachers, academic colleagues, St. Olaf College students,
publishing company professionals, friends, advisors, critics, “competitors,” fam-
ily, and others. It is a pleasure to acknowledge some of them by name—and to
claim sole credit for errors that survived their best efforts to help.
Among local colleagues I thank Bruce Hanson, Paul Humke, Loren Larson,
the late Arnold Ostebee, Matt Richey, the late Lynn Steen, Ted Vessey, and many
others with whom I have discussed matters mathematical and pedagogical for
many years. On matters of taste and usage—both linguistic and mathematical—
my friend Barry Cipra is a reliable resource. His advice, when I have taken it,
has always proved correct. These gifted teachers, expositors, and mathematicians
have helped me think about the deep connections and subtle relationships among
teaching, telling, and doing mathematics. They have also demonstrated, by ar-
gument and example, that serious engagement with post-calculus mathematics is
possible, and valuable, for a broad range of students, not just a small elite. This
“big tent” approach to our discipline informs and underpins St. Olaf’s very suc-
cessful undergraduate program, and I have kept it in mind when writing this book.
Academic colleagues elsewhere have taught me a lot, too. These include my
own mathematics teachers at Washington University and the University of Wash-
ington; authors and referees from my work with Mathematics Magazine and other
publications of the Mathematical Association of America; and authors of other
real analysis textbooks from which I have learned, taught, or read. Real analysis
can be viewed usefully from many angles; doing so offers depth and perspective.
The late Klaus Peters and Charlotte Henderson worked with remarkable care,
diligence, and intelligence to make the first edition of this book better and its
production easier. I’m all the more grateful to them because, as a sometime editor
myself, I know how difficult and exacting that work can be.
For encouragement and equally diligent and meticulous work on the second
edition, I thank Karen Simon, Robert Ross, Sunil Nair, and Shashi Kumar, all of
Taylor & Francis.
Last, but hardly least important, are the support and forbearance of my wife,
Janet, in everything that book writing entails for anyone unlucky enough to ob-
Building on calculus basics. Students in the beginning real analysis course that
this book supports may have little or no course experience beyond single-variable
calculus (a year, say) and perhaps some exposure to linear algebra, differential
equations, or multivariate calculus. What can be expected from such students is a
general, though probably informal, sense of the big ideas of calculus—function,
limit, derivative, and integral—and some curiosity about the rigorous theory that
lies behind the techniques. This book’s main strategy is to exploit students’ prior
experience with and curiosity about calculus in working toward deeper under-
standing.
help, especially in earlier sections, before the mathematical and linguistic subtlety
ramp up—together.
Focus on the basics. This book focuses on (what I take to be) basic elements of
the theory; neither compactness nor the Lebesgue integral is covered, for instance.
This is intentional: I’ve willingly traded “coverage” for “simplicity.” (Instructors
who want to go deeper can readily do so; see below for suggestions.) Covering
fewer topics, moreover, leaves room for more narrative discussion and concrete
examples, of which there are many. To put it another way, I try to look closely at
relatively simple things.
Many examples, many solutions. The book contains many worked-out exam-
ples and quite a few detailed solutions to exercises, especially in earlier sections.
I believe that many students learn theory largely “inductively,” from examples,
and that detailed solutions to a substantial number of problems can usefully illus-
trate, for newcomers, the language and conventions of mathematical discourse.
A one-semester course? This book is designed for use in one typical college
semester, but there is more material here than I would cover in that time, except
perhaps with veterans of a proofs course. With a less experienced class I might
omit the coverage of infinite series, for instance, to concentrate more heavily on
integrals. Or one could omit integrals entirely, and go into more depth on series,
sequences, and derivatives.
Supplementary topics. Real analysis offers many possibilities for group or in-
dividual projects on supplementary topics. These can be especially useful with
unusually well-prepared or highly talented students, who can otherwise be bored
in a class of less highly selected students. Here are some ideas; all are readily
researchable in books or online by motivated students:
• The Cantor–Banach–Bernstein theorem: if f : A → B and g : B → A are
injective, then there is a bijection h : A → B.
• Exploring the limit superior and the limit inferior (a “guided discovery” on
this topic appears in the book).
• Exploring Taylor polynomials and remainder theorems (Taylor’s theorem
is mentioned, but only briefly).
• Exploring compactness of subsets of R, the Heine–Borel theorem, and con-
nections with boundedness of functions and convergence of sequences.
• Exploring deeper properties of the Riemann integral. Suppose, for instance,
that a function f is integrable on [0, 1] and f(x) > 0 for all x. Showing
rigorously that $\int_0^1 f > 0$ is harder than it might seem, and it raises good
questions about monotonicity, points of continuity, and integrability.
CHAPTER 1
Preliminaries: Numbers, Sets,
Proofs, and Bounds
Using these symbols we can discuss these sets clearly and efficiently, with sentences like the following:
$$\mathbb{N} \subseteq \mathbb{Z} \subseteq \mathbb{Q} \subseteq \mathbb{R}; \qquad \frac{42}{43} \in \mathbb{Q}; \qquad \sqrt{152399025} \in \mathbb{Z}; \qquad \sqrt{93} \in \mathbb{R} \setminus \mathbb{Q}.$$
(Margin note: The last one says that $\sqrt{93}$ is real but not rational. Notice the funny “setminus” sign.)
What each of these symbol strings says should be clear. Whether each claim is true or false is less obvious, but that’s a matter of mathematics, not of symbolic clarity. As it turns out, all four claims are true; proving the last assertion takes some effort. (Margin note: We’ll see a proof soon; for now the proof is beside the point.)
Set notation is useful and efficient, but only when used correctly. Something is amiss, for instance, with each of the following expressions:
$$\text{(i) } \mathbb{R} \in \mathbb{Q}; \qquad \text{(ii) } 3.1 \supset \mathbb{N}; \qquad \text{(iii) } \sqrt{\mathbb{N}} \subseteq \mathbb{R}.$$
∗ The special font can help avoid confusion with other sets. The choices N and R are made for
obvious reasons; Q and Z remind us of quotient and Zahlen (German for “numbers”), respectively.
The symbol C can denote the set of complex numbers.
A charitable reader might try to make sense of (i)–(iii), but their problems are
ultimately fatal—none really makes sense. (A nicer way to say this is to refer to
errors of syntax.) The problem with (i), for instance, is not the (true!) fact that
real numbers may be irrational, but rather that R is a set of numbers, and hence
not even in the running to be an element of the set of rationals. Expression (ii) is
meaningless for similar reasons, and (iii) is even worse: the expression $\sqrt{\mathbb{N}}$ has
no clear meaning, so the question of containment in R is moot.
both of which are rich stews of integers, fractions, and irrational numbers. We
will not deny or ignore your earlier experience—all of it will be useful—but aim
to sharpen intuition and make assumptions explicit.
As a look ahead consider, for instance, the expression
$$\lim_{x \to \pi} \frac{f(x) - f(\pi)}{x - \pi} = f'(\pi),$$
which describes a certain derivative. Making clear sense of such an expression, as we’ll need to do later in studying derivatives, depends on various properties of real numbers. (Margin note: It’s $f'(\pi)$, as we know from beginning calculus.) Here are some early hints:
$$\frac{f(x) - f(\pi)}{x - \pi}$$
1.1. Numbers 101: The Very Basics
Integers. We’ll build up our description of the real numbers by starting with sim-
pler sets: first Z and then Q. What’s “simpler” depends, of course, on the situa-
tion. To a number theorist, for instance, there is nothing simple about the integers.
In real analysis, however, we’ll need only basic, familiar properties of the integers,
and we’ll usually take these as “known.” We’ll accept without fuss, for instance,
that equations like
make sense, and hold true, for any integers a, b, and c. We also “know” that every
integer a has an additive inverse, −a, but that a multiplicative inverse, 1/a, need
not be an integer. (Margin note: What happens if a = 0?)
The well-ordering property. Here’s another basic fact we’ll just assume: Every nonempty set of positive integers has a least element. This fact is known formally as the well-ordering property of the positive integers. The name may be mysterious but the property itself should be familiar and believable from experience. Among all even positive multiples of 7, for instance, 14 is smallest. We’ll take the well-ordering property as an axiom, not as a theorem to be proved. (Margin notes: The name comes from set theory, not our subject here. In more formal discussion the property is sometimes proved to follow from even simpler axioms.)
Rational numbers. Basic algebraic properties of Q are equally familiar. All four equations above, for example, hold just as well for rational numbers a, b, and c as for integers. Because nonzero rational numbers have (rational) reciprocals, equations and expressions that involve division can now make sense. Here are three examples:
$$\frac{p+q}{r} = \frac{p}{r} + \frac{q}{r}; \qquad \frac{pq}{r} = p\,\frac{q}{r}; \qquad \frac{p+q}{pq} = \frac{1}{q} + \frac{1}{p}.$$
All are true, of course, whenever p, q, and r are rational numbers and all denominators are nonzero.
In later work we’ll freely use such familiar facts, usually without comment.
Right now it is a good exercise to derive some properties of Q, assuming basic
properties of Z.
For us a rational number r is just a ratio of integers:
$$r = \frac{a}{b}, \quad \text{where } a, b \in \mathbb{Z} \text{ and } b \neq 0.$$
Of course, any given rational number has many possible forms:
$$\frac{2}{3} = \frac{14}{21} = \frac{-222}{-333} = \frac{88}{132} = \dots;$$
the first form, in which numerator and denominator have no common factors, is
called reduced.
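The idea of a reduced form can be checked mechanically. The following is a minimal sketch, assuming a hypothetical helper named `reduced_form` (not from the text) that divides numerator and denominator by their greatest common divisor; Python's standard `fractions.Fraction` is used only as an independent cross-check.

```python
from fractions import Fraction
from math import gcd

def reduced_form(a, b):
    """Return the reduced form of a/b as a pair (numerator, denominator > 0).

    Hypothetical helper for illustration; assumes a and b are integers, b != 0.
    """
    if b == 0:
        raise ValueError("denominator must be nonzero")
    g = gcd(a, b)          # math.gcd always returns a nonnegative value
    if b < 0:
        g = -g             # flip signs so the denominator comes out positive
    return (a // g, b // g)

# The four forms of 2/3 listed in the text.
forms = [(2, 3), (14, 21), (-222, -333), (88, 132)]
print([reduced_form(a, b) for a, b in forms])
print(all(Fraction(a, b) == Fraction(2, 3) for a, b in forms))
```

Every pair in `forms` reduces to the same pair (2, 3), which is one way to see that the four expressions name a single rational number.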
SOLUTION. The proof is direct: We start with the hypothesis and derive the conclusions. Because p and q are rational, we can write
$$p = \frac{a}{b} \quad \text{and} \quad q = \frac{c}{d},$$
where a, b, c, d ∈ Z and b ≠ 0 ≠ d. Now we add fractions:
$$p + q = \frac{a}{b} + \frac{c}{d} = \frac{ad + cb}{bd},$$
which exhibits p + q as the desired ratio of integers. (Both numerator and denominator are integers, and the denominator is nonzero, thanks to basic integer arithmetic, which we take as known.)
Also,
$$q + p = \frac{c}{d} + \frac{a}{b} = \frac{cb + ad}{db} = \frac{ad + cb}{bd},$$
where the last equality follows from commutativity of integer operations. The last
expression is just what we calculated earlier for p + q, so we’re done. ♦
$$pq = \frac{ab}{ab} = 1.$$ ♦
Q is a field. The following theorem says, in mathematical parlance, that Q is a field. (Margin note: Nothing in the theorem should surprise you; the point is to collect and formalize key properties.)
Theorem 1.1. The set Q has the following properties:
• Two operations: Addition and multiplication are operations on Q: if p and
q are any rational numbers, then so are p + q and pq.
• Commutativity: Addition and multiplication are commutative operations:
if p and q are any rational numbers, then p + q = q + p and pq = qp.
• Associative operations: Addition and multiplication are associative opera-
tions: if p, q, and r are any rational numbers, then (p + q) + r = p + (q + r)
and (pq)r = p(qr).
• Identities: The rational number zero is an additive identity: 0 + p = p
holds for all p ∈ Q. The rational number one is a multiplicative identity:
1p = p holds for all p ∈ Q.
• Inverses: For every rational number p, the rational number −p is an ad-
ditive inverse: p + (−p) = 0. For every nonzero rational number p, the
rational number 1/p is a multiplicative inverse: p · 1/p = 1.
• Distributivity: Multiplication distributes across addition: For any rationals
p, q, and r, p(q + r) = pq + pr.
All parts of the theorem can be proved in the spirit of the preceding examples,
assuming similar properties of Z. But . . .
SOLUTION. To be a field, a set must satisfy each of the long list of properties
in Theorem 1.1. Proving all this can be daunting. But there’s a happy flip side:
Disproving that Z is a field is easy. It is enough to find just one property that fails
for Z, and for this even a single counterexample suffices. We could just observe,
for instance, that the integer 42 has no multiplicative inverse among the integers,
and leave it at that. ♦
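The contrast between Q and Z can be illustrated numerically. The sketch below spot-checks the properties of Theorem 1.1 on a small, arbitrarily chosen sample of rationals, then confirms that 42 has no multiplicative inverse among the integers. The names `samples`, `check_field_axioms`, and `has_integer_inverse` are illustrative assumptions, and a finite spot-check is evidence only, not a proof.

```python
from fractions import Fraction
from itertools import product

# A small, arbitrary sample of rationals (illustrative values only).
samples = [Fraction(2, 3), Fraction(-5, 7), Fraction(0), Fraction(1), Fraction(42)]

def check_field_axioms(values):
    """Spot-check the properties listed in Theorem 1.1 on a finite sample."""
    for p, q, r in product(values, repeat=3):
        assert isinstance(p + q, Fraction) and isinstance(p * q, Fraction)  # two operations
        assert p + q == q + p and p * q == q * p                            # commutativity
        assert (p + q) + r == p + (q + r) and (p * q) * r == p * (q * r)    # associativity
        assert p * (q + r) == p * q + p * r                                 # distributivity
    for p in values:
        assert 0 + p == p and 1 * p == p    # identities
        assert p + (-p) == 0                # additive inverse
        if p != 0:
            assert p * (1 / p) == 1         # multiplicative inverse
    return True

def has_integer_inverse(n):
    """True when 1/n is itself an integer, i.e. n has an inverse inside Z."""
    return n != 0 and Fraction(1, n).denominator == 1

print(check_field_axioms(samples))
print(has_integer_inverse(42))
```

Only 1 and −1 pass `has_integer_inverse`; any other integer serves as a counterexample just as well as 42.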
Q is not R. The set Q is clearly infinite—it contains all the integers, for one thing.
A bit less obvious, perhaps, is the fact that between any two different rationals lie
infinitely many more rationals. Between 3.14 and 3.15, for instance, lie
and so on. With so many rational numbers around, one might wonder whether
there are any irrationals: numbers that cannot be written as ratios of integers.
Nowadays everybody “knows” the answer: $\sqrt{2}$, $\sqrt{3}$, $\sqrt[3]{2}$, π, e, and many
other favorite numbers are all irrational. But none of these facts is completely
trivial. Indeed, the Pythagoreans (around 500 BCE) saw all numbers as rational.
By some accounts, they drowned the philosopher Hippasus as a heretic after he
demonstrated that a square of side 1 has irrational diagonal. We’ll take that risk
and still give a proof.
Theorem 1.2. $\sqrt{2}$ is irrational.
Proof: The proof is by contradiction: Assuming $\sqrt{2}$ is rational, we’ll derive an absurdity. (Margin note: Like other proofs we’ve given, this one exploits basic properties of integers.)
Assume, then, that $\sqrt{2}$ is rational. We can write $\sqrt{2} = \frac{a}{b}$, for some positive
integers a and b, not both even. (If both were even we could cancel a common
factor of 2.) Next we do some algebra:
$$\sqrt{2} = \frac{a}{b} \implies 2 = \frac{a^2}{b^2} \implies 2b^2 = a^2.$$
Now $2b^2$ is certainly even, and so $a^2$ must be even, too. But then a itself must be
even; otherwise, $a^2$ would be odd. Thus we can write a = 2c for some integer c,
and we have
$$2b^2 = a^2 = 4c^2 \implies b^2 = 2c^2.$$
This implies that b (like a) is even, which contradicts our earlier assumption, and
completes the proof.
The following more general theorem can be proved by similar means. (Margin note: See the exercises for details.) It shows,
too, that the set of irrational numbers, or R \ Q, has infinitely many members.
Theorem 1.3. If the positive integer n is not a perfect square, then $\sqrt{n}$ is irrational. In symbols, $\sqrt{n} \in \mathbb{R} \setminus \mathbb{Q}$.
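Theorems 1.2 and 1.3 say that the equation $a^2 = n b^2$ has no solution in positive integers when n is not a perfect square. The sketch below searches for such solutions with bounded denominators; the function names and the search bounds are assumptions chosen for illustration. A finite search can only fail to find a counterexample, so it supports the theorems but does not replace their proofs.

```python
from math import isqrt

def is_perfect_square(n):
    """True when the nonnegative integer n equals r*r for some integer r."""
    r = isqrt(n)
    return r * r == n

def rational_square_roots(n, max_denominator):
    """Return every pair (a, b) with 1 <= b <= max_denominator and a*a == n*b*b.

    Uses exact integer arithmetic, so there is no floating-point rounding.
    """
    hits = []
    for b in range(1, max_denominator + 1):
        target = n * b * b
        a = isqrt(target)
        if a * a == target:
            hits.append((a, b))
    return hits

print(rational_square_roots(2, 10_000))   # empty: no a/b with b <= 10000 squares to 2
print(rational_square_roots(9, 3))        # 9 is a perfect square, so pairs do exist

non_squares = [n for n in range(2, 30) if not is_perfect_square(n)]
print(all(rational_square_roots(n, 1000) == [] for n in non_squares))
```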
Infinitely many irrationals for the price of one. It took some work to find even
one irrational number. This done, however, it is surprisingly easy to find many,
many more irrationals.
Proof: There is less to this than meets the eye. Everything boils down to the
field properties of Q, listed in Theorem 1.1. For example, suppose (aiming for
a contradiction) that xr is rational. We know from Theorem 1.1 that 1/r is also
rational, and so the product
$$xr \times \frac{1}{r} = x$$
is rational, too. This contradicts our hypothesis, and so xr must be irrational. The
remaining parts are similar, and are left as exercises.
Exercises
1. As in Example 1, page 2, decide whether each of the following claims
makes sense. If a claim makes sense, is it true or false?
(a) If r ∈ Q then 5r ∈ Q.
(b) $\sqrt{8} \subset \mathbb{R} \setminus \mathbb{Q}$.
(c) $\{\sqrt{8}\} \subset \mathbb{R} \setminus \mathbb{Q}$.
(d) If a ∈ Q and b ∈ R \ Q then ab ∈ R \ Q.
(e) If a ∈ Q and b ∈ N then a/b ∈ Q.
(f) If a ∈ Q then there exists n ∈ N such that $na^2 > 100$.
2. Decide whether each of the following claims makes sense. If not, why not?
If so, is the claim true or false? (No proofs needed.)
(a) If a ∈ R \ Q, then $a^2 \in \mathbb{R} \setminus \mathbb{Q}$.
(b) $\mathbb{Q}^2 > 0$.
(c) If a ∈ R \ Q, then there exists n ∈ N such that $na^2 > 100$.
(d) If a ∈ Q then $a + \sqrt{2} \notin \mathbb{Q}$.
(e) If a ∈ Q and a ≠ 0, then $\sqrt{2} \cdot a \notin \mathbb{Q}$.
3. For which integers a is 1/a an integer? For which integers a is 1/a rational?
For which integers a is 1/a = a? No proofs necessary, but be sure to
consider all possibilities.
4. We said in this section that Z is not a field. Which of the several require-
ments in Theorem 1.1 does Z fail? Which does it pass? Give an example to
illustrate each failed requirement.
5. The set of irrationals is not a field because (among many other reasons) the
irrationals contain no multiplicative identity. Give one reason—as briefly
as possible—why each of the following sets is not a field.
6. Let p and q be any two rational numbers. In each part following, decide
whether the given expression must be a rational number, regardless of the
values of p and q. If so, explain why, referring to theorems in this section.
If not, give a counterexample.
(a) $\frac{p+q}{2}$ (the average of p and q)
(b) $\frac{p+q}{p^2 + q^2}$
(c) $\sqrt{p^2 + q^2}$
(d) $\sqrt{p^2 + 2pq + q^2}$
7. Let both x and y denote irrational numbers, and consider the quantities (a)
xy; (b) x + y; (c) x − y; (d) x/y. Give examples (i.e., specific values of x
and y) to show that each quantity can be either rational or irrational.
$$\frac{2b - a}{a - b} = \frac{a'}{b'}.$$
Explain why (i) both a′ and b′ are positive integers; (ii) b′ < b; (iii)
$(a'/b')^2 = 2$, and so deduce that $\sqrt{2} = a/b$ is impossible.
(a) Show that $\mathbb{Z}_2$ is not a field with the usual operations of multiplication
and addition.
(b) Show that $\mathbb{Z}_2$ is a field if we use the two operations (i) ordinary multiplication; (ii) addition “mod 2”: 0 + 0 = 0 and 0 + 1 = 1 + 0 = 1,
but 1 + 1 = 0.
13. The set $M_{2 \times 2}$ of 2 × 2 matrices with real number entries permits matrix
addition and matrix multiplication, so we can ask about the properties men-
tioned in Theorem 1.1. No proofs needed, but give examples to illustrate
any properties that fail.
15. We know from elementary calculus that ln n tends to infinity (“blows up”)
slowly as n tends to infinity. It follows that ln ln ln n also tends to infinity,
but slower still. Use the well-ordering property to explain why there must
be a least positive integer $n_0$ such that $\ln \ln \ln n_0 > 2$. (Notes: Finding $n_0$
exactly would be tricky, since $n_0$ is very, very large. For fun, try to estimate
how many digits $n_0$ has.)
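One way to check an estimate for the last part is to work with logarithms of the threshold instead of the number itself. The sketch below assumes the reformulation that ln ln ln n > 2 holds exactly when n > exp(exp(exp(2))); the variable names are illustrative, and ordinary floating-point arithmetic is accurate enough here because the base-10 logarithm of the threshold is not close to an integer.

```python
import math

# ln ln ln n > 2  <=>  n > exp(exp(exp(2))).
# The threshold itself overflows a float, so track its logarithm instead.
ln_threshold = math.exp(math.exp(2))            # natural log of the threshold
log10_threshold = ln_threshold / math.log(10)   # base-10 log of the threshold

# An integer just above 10**t, with t not an integer, has floor(t) + 1 digits.
digits = math.floor(log10_threshold) + 1

print(round(ln_threshold, 2), round(log10_threshold, 2), digits)
```

Try the estimate by hand first; the computation is meant only as a check.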
16. The well-ordering property (see page 3) says something special about the
set N:
(a) Does Q have the well-ordering property? (In other words, does every
nonempty subset of Q have a least element?) Why or why not?
(b) Does the set R = { 1, 10, 100, 1000, . . . } have the well-ordering
property? Why or why not?
(c) Does the set T = { −3, −2, −1, . . . , 41, 42 } have the well-ordering
property? Why or why not?
(d) Replace the word “least” with “greatest” in the well-ordering property
above. Does the result hold for N? For T ? For Z \ N?
Observe:
1.2. Sets 101: Getting Started
Notations and operations. Standard symbols let us describe set properties and
operations clearly and concisely. Just a few go a long way:
• Subsets and containment: The expression B ⊂ A says (truthfully for the
sets above) that B is a subset of A; the C-like symbol suggests containment.
If B ⊊ A, then B is a proper subset of A.
In a similar spirit, all of these expressions:
A ⊃ B; B ⊆ A; B ⊊ A; I ⊇ D; ∅ ⊆ D
describe various containment relations. (Margin note: Are they all true of the sets above?)
March ∈ A; 2 ∈ D; April ∉ B; {2, 3} ∈ C; {2, 3} ∉ I.
The last two claims might be surprising. Notice that the set {2, 3} is indeed
an element of the peculiar set C, but not of I, which contains only numbers. (Margin note: On the other hand, {2, 3} ⊂ I.)
• New sets from old: Any two sets S and T can be combined in various ways
to form new sets:
union : S ∪ T = {x | x ∈ S or x ∈ T };
intersection : S ∩ T = {x | x ∈ S and x ∈ T };
set difference : S \ T = {x | x ∈ S and x ∉ T };
Cartesian product : S × T = {(x, y) | x ∈ S and y ∈ T }.
Here are some (true) examples from the sets above; note that all quantities
in question are sets:
A ∩ C = {November}; A ∪ B = A; A ∩ B = B;
A \ B = {February, April, June, September, November} ; B \ A = ∅;
A × N = {(m, n) | m is a month and n is a positive integer} .
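The four operations defined above map directly onto Python's built-in `set` type. The sketch below uses two small hypothetical sets of integers (not the sets A, B, C, D, I of the text, whose full definitions are not reproduced here), and it also illustrates the difference between a set being an element of another set and being a subset of it.

```python
from itertools import product

# Hypothetical example sets, chosen only for illustration.
S = {1, 2, 3, 4}
T = {3, 4, 5}

union = S | T                     # {x | x in S or x in T}
intersection = S & T              # {x | x in S and x in T}
difference = S - T                # {x | x in S and x not in T}
cartesian = set(product(S, T))    # all ordered pairs (x, y) with x in S, y in T

print(union, intersection, difference, len(cartesian))

# Membership versus containment: a set can be an element of another set.
# Python requires the immutable frozenset type for sets used as elements.
C = {frozenset({2, 3}), 7}
print(frozenset({2, 3}) in C)     # {2, 3} is an element of C
print({2, 3} <= S)                # {2, 3} is a subset of S
print({2, 3} in S)                # but {2, 3} is not an element of S
```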
Decoding set language. Statements about sets may involve complicated or tricky
symbolic combinations. Making sense of them takes careful reading.
EXAMPLE 2. Let Z, Q, and R denote (as usual) the sets of integers, rationals,
and reals, respectively, and consider the sets
$$S = \{x \in \mathbb{R} \mid x^2 - x - 1 = 0\}; \qquad T = \{x \in \mathbb{R} \mid x^2 - x - 1 > 5\}.$$
The sets S and T illustrate set-builder notation: they are “built” from a larger set
using a selection rule or membership criterion. (The vertical bar | means some-
thing like “such that.”)
What does each of the following assertions mean? Which are true?
SOLUTION. Statement (i) says that S, a set of numbers, is itself an integer;
this is clearly false. Statements (ii) and (iii) make better sense; (ii) says that all
roots of $x^2 - x - 1$ are rational, while (iii) says that these same roots are irrational.
Who’s right? Well, by the quadratic formula, the two roots are $(1 \pm \sqrt{5})/2$, both
of which are real but irrational, so (iii) is true and (ii) is false. Statement (iv) boils
down to the claim that $4^2 - 4 - 1 > 5$, which is clearly true.
We might, by the way, have saved some work by first rewriting S and T in
simpler or different forms. As we’ve seen,
$$S = \left\{ \frac{1 + \sqrt{5}}{2}, \; \frac{1 - \sqrt{5}}{2} \right\}.$$
The quadratic formula also shows that $x^2 - x - 1 = 5$ has the two roots x = −2
and x = 3 (Margin note: Check for yourself.), and so
$$T = \{x \in \mathbb{R} \mid x < -2 \text{ or } x > 3\}.$$ ♦
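The two descriptions of T, and the two roots that make up S, can be compared numerically. The sketch below is only a sanity check under floating-point arithmetic; the helper names `in_T` and `in_T_simplified` and the list of test points are assumptions made for illustration.

```python
import math

def f(x):
    """The quadratic x^2 - x - 1 used to define both S and T."""
    return x * x - x - 1

# The two elements of S, from the quadratic formula.
root_plus = (1 + math.sqrt(5)) / 2
root_minus = (1 - math.sqrt(5)) / 2

def in_T(x):
    """Membership test from the original definition: x^2 - x - 1 > 5."""
    return f(x) > 5

def in_T_simplified(x):
    """Membership test from the rewritten form: x < -2 or x > 3."""
    return x < -2 or x > 3

# Sample points on both sides of, and exactly at, the boundary values -2 and 3.
test_points = [-3, -2.5, -2, 0, 3, 3.5, 4]
print(all(in_T(x) == in_T_simplified(x) for x in test_points))
print(math.isclose(f(root_plus), 0, abs_tol=1e-12),
      math.isclose(f(root_minus), 0, abs_tol=1e-12))
```

The boundary points −2 and 3 are worth including because the inequality defining T is strict, so neither belongs to T.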
Intervals. Intervals in the real line are familiar but useful sets in studying real
analysis. Calculus veterans have seen countless examples; we collect a few as
reminders of the possible variety and of some useful descriptive language.
In particular, closed intervals contain their endpoints, if any, while open in-
tervals do not. All intervals, open or closed, bounded or unbounded, share two
defining properties:
Definition 1.5. A set I ⊂ R is an interval if (i) I contains at least two points;
(ii) if a and b are in I and a < x < b, then x ∈ I, too.
EXAMPLE 3. Let I and J be any two intervals. What possible forms can the
intersection I ∩ J take?
SOLUTION. The intervals I and J might miss each other entirely; then I ∩ J = ∅.
Or I and J might intersect in a single point, as do I = [1, 3] and J = [3, 5]. More
interesting is the fact that only one other possibility exists: If I ∩ J contains at
least two points, then I ∩ J is an interval.
The hard way to prove this is to handle many special cases, depending on
whether each of I and J is open, closed, bounded, unbounded, etc. The easy way
is to use Definition 1.5, in which only (ii) is a live question.
Suppose, then, that both a and b are in I ∩ J, and a < x < b. Since I is an
interval, Definition 1.5 guarantees that x ∈ I. For the same reason, we must have
x ∈ J, too. Thus x ∈ I ∩ J, and we’re done. ♦
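The three possible outcomes (empty, a single point, an interval) are easy to see in code for one special case. The sketch below assumes both intervals are closed and bounded and represents [a, b] by the pair (a, b); the function name `intersect_closed` is hypothetical, and the general argument in the text covers open and unbounded intervals as well.

```python
def intersect_closed(i, j):
    """Intersect closed bounded intervals i = (a, b) and j = (c, d).

    Each pair stands for [a, b] with a < b. Returns None when the
    intersection is empty, otherwise the pair (lo, hi); lo == hi means
    the intersection is a single point rather than an interval.
    """
    lo = max(i[0], j[0])   # larger of the two left endpoints
    hi = min(i[1], j[1])   # smaller of the two right endpoints
    if lo > hi:
        return None
    return (lo, hi)

print(intersect_closed((1, 3), (3, 5)))   # (3, 3): the single point 3
print(intersect_closed((1, 3), (4, 5)))   # None: the intervals miss each other
print(intersect_closed((1, 4), (2, 6)))   # (2, 4): the interval [2, 4]
```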
Open sets: a look ahead. Open intervals are sets like (1, 3) and (−∞, 42) that
don’t contain any endpoints. More generally, any set U ⊂ R is called (topologically) open if U is the union of any collection (empty, finite, or infinite) of open
intervals. A set A ⊆ R is called closed if A = R \ U , where U is open. In other
words, closed sets are complements of open sets. (Margin note: We get serious about basics of topology later. This is just a taste.)
Example 4 shows that this new terminology plays well with what’s already
familiar.
SOLUTION. The sets (1, 3) and R = (−∞, ∞) are open intervals, and therefore
also open in the topological sense. For the other sets, we study complements:
[1, 3] = R \ ((−∞, 1) ∪ (3, ∞)) and {42} = R \ ((−∞, 42) ∪ (42, ∞)).
Both of these complements are unions of open intervals and are therefore open
sets; hence both [1, 3] and {42} are closed sets. (Margin note: It would be troubling if [1, 3] were not closed.) The complement of {42} ∪ [1, 3]
is the union of three open intervals, so {42} ∪ [1, 3] is also closed.
Here comes a twist: As the union of an empty collection of open intervals, the
empty set ∅ is open. As the complement of the open set R, ∅ is also closed. As
the complement of ∅, R is also closed. Thus, R and ∅ turn out to be both open and
closed. ♦
Exercises
1. Consider several sets discussed in this section:
(a) Rewrite S and T in simpler forms. (One is a finite set and the other
an interval.)
(b) Decide whether each of the following statements is true or false, and
explain: S ⊂ N; S ⊂ T ; T ∩ Q ≠ ∅; −2.8 ∈ Q \ T .
(c) Give the simplest possible description of the set $U = \{x \in \mathbb{R} \mid x^2 + x < 0\}$.
(a) A = [1, 3]
(b) A = [1, ∞)
(c) A = (1, 2) ∪ (3, 4)
4. (a) Show that the complement of a closed and bounded interval [a, b] is
the union of two unbounded open intervals.
(b) Give examples to show that the complement of an open interval I can
be empty, one closed interval, or the union of two closed intervals.
(c) Write the complement of Z as a union of intervals.
6. This problem is about De Morgan’s laws; see page 12. Let A and B be any
subsets of R, and consider the two claims
(i) R \ (A ∪ B) = (R \ A) ∪ (R \ B) ;
(ii) R \ (A ∪ B) = (R \ A) ∩ (R \ B) .
(a) One of (i) and (ii) is true and the other false. Identify the false claim
and give specific sets A and B to show it is false.
(b) What happens in the special case that A = B?
(c) Prove the true statement above. (Hint: Show that every element of the
left side is an element of the right side, and vice versa.)
7. This problem is about De Morgan’s laws (page 12) when $A_1$ and $A_2$ are
intervals. Let $A_1 = (1, 3)$ and $A_2 = (2, 5)$. Note that $\mathbb{R} \setminus A_1 = (-\infty, 1] \cup [3, \infty)$
and $\mathbb{R} \setminus A_2 = (-\infty, 2] \cup [5, \infty)$. Write $\mathbb{R} \setminus (A_1 \cup A_2)$ and $\mathbb{R} \setminus (A_1 \cap A_2)$
in interval notation, and check that De Morgan’s laws hold as claimed.
8. This problem explores De Morgan’s laws (page 12) when $A_1 = (0, 1)$ and
$A_2 = (2, \infty)$. As in Problem 7, write all of the sets $\mathbb{R} \setminus A_1$, $\mathbb{R} \setminus A_2$,
$\mathbb{R} \setminus (A_1 \cap A_2)$, and $\mathbb{R} \setminus (A_1 \cup A_2)$ using interval notation, and check that
De Morgan’s laws hold.
10. Give specific examples of intervals I and J for which I ∩ J is (a) open;
(b) closed; (c) half-open; (d) open and unbounded. Is it possible in each
case to choose I and J so that neither I ⊂ J nor J ⊂ I?
(a) Find any two intervals I and J such that I ∪ J is not an interval.
(b) Find disjoint intervals I and J such that I ∪ J is an interval.
(c) Consider open intervals I = (a, b) and J = (c, d) with a < c, b < d,
and 0 ∈ I ∩ J. Show that I ∪ J is an open interval.
12. Suppose I and J are any intervals, and c any number such that c ∈ I ∩ J.
Use Definition 1.5 to show that I ∪ J is an interval.
14. Can any interval have exactly 123456789 points? Why or why not?
16. This problem links to Example 4, page 14; note the meanings there of
“open” and “closed.”
(a) Show that (1, 2) ∪ (3, ∞) is open. What related set is closed?
(b) Let a ∈ R. Show that {a} is closed.
(c) Let a ∈ R. Show that (−∞, a) is open and (−∞, a] is closed.
(d) The interval (0, 1) is obviously open. Show that it’s not closed.
17. This problem links to Example 4, page 14; note the meanings there of
“open” and “closed.”
18. From any set S we can create the new set P (S), called the power set of S,
consisting of all subsets of S. If S = {1, 2}, for instance, then P (S) =
{ ∅, {1}, {2}, {1, 2} }.
(c) Let N10 = {1, 2, . . . , 10} and N11 = {1, 2, . . . , 10, 11}. Explain
why P (N11 ) has twice as many members as P (N10 ). (We’ll prove
this formally in a later section.)
(a) Let S be the set of all three-member subsets of N10 . How many ele-
ments does S have?
(b) Let T be the set of all three-tuples (a, b, c) with a, b, and c in N10 .
How many elements does T have?
(c) Let S10 be the set of all permutations (i.e., orderings) of the elements
of N10 . How many elements does S10 have?
(d) Do any two of the sets N10 , S, T , S10 have nonempty intersection?
20. Let S be a set with n elements. Let 𝒮 be the set of all (n − 1)-element
subsets of S. How many elements does 𝒮 have? Why?
21. Let N100 = {1, 2, . . . , 100}. Let A42 and A58 be the sets of all 42- and
58-member subsets of N100 , respectively. Show that A42 and A58 have the
same number of elements.
22. The xy-plane can be thought of as the Cartesian product R × R (hence the
notation R^2). Sketch each of the following subsets of R × R.
(a) {1, 2, 3} × R
(b) R × {1, 2, 3}
(c) Z × N
(d) {(x, y) | y = x^2}
(e) {(x, y) | x = sin y}
(f) {(x, y) | x^2 + y^2 = −1}
(c) What does the element (2, 3, black) in the set G × {black, white} rep-
resent in this context? What does the set G×{black, white} represent?
(d) A picture in the sense above can be thought of as an element of P (G),
the power set of G. Explain.
SOLUTION. The formula f(x) = x^2 makes the rule perfectly clear: For inputs
−3, √2, and 1.2345, the corresponding outputs are f(−3) = 9, f(√2) = 2, and
f(1.2345) = 1.2345^2. The domain and the codomain, on the other hand, are open
to choice. In a calculus course we might, if pressed, use the natural domain: the
set of all real numbers for which the rule makes sense. For f(x) = x^2, that’s R
itself. (For g(x) = tan(x), the natural domain omits some real numbers.) In a
number theory course, we might use N as domain.
The codomain, too, is open to choice. For a given domain, we need only
assure that the codomain contains all possible outputs. With domain R, for instance,
we could use as codomain for f(x) = x^2 any set that contains all the nonnegative
reals. This might be R itself, the infinite interval [0, ∞), or something stranger,
like (−42, ∞) or C. With domain N, we could reuse N as codomain, or choose
any set of integers that contains all positive perfect squares. ♦
The moral. The preceding example shows that a function is more than a formula.
A function is a 3-part package: a domain A, a codomain B, and a rule (which may
or may not be a symbolic formula) for assigning a unique output b = f (a) in B to
every input a in A. The notation f : A → B emphasizes this three-fold nature. In
20 1. Preliminaries: Numbers, Sets, Proofs, and Bounds
practice the domain or codomain or both are sometimes understood from context,
or even ignored, but they’re always waiting in the wings.
Range and codomain. The range (or image set) of a function f : A → B is the
set of all outputs:
range of f = {f (a) | a ∈ A} .
Note that the range is always a subset (perhaps a proper subset) of the codomain.
For f : R → R given by f(x) = x^2, for instance, the range is the interval [0, ∞),
which omits all the negative numbers.
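The three-part package can be sketched in Python, restricted to a finite domain so that every output can be listed. This is only an illustration: the names `FiniteFunction`, `image`, and `square`, and the sample domain and codomain, are assumptions made for the sketch, not notation from the text.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class FiniteFunction:
    """A function as a 3-part package: domain, codomain, and rule."""
    domain: FrozenSet[int]
    codomain: FrozenSet[int]
    rule: Callable[[int], int]

    def __post_init__(self):
        # Every output must land in the codomain, or this is not a function A -> B.
        for a in self.domain:
            if self.rule(a) not in self.codomain:
                raise ValueError(f"rule({a}) is outside the codomain")

    def image(self) -> FrozenSet[int]:
        """The range (image set): the set of all outputs {f(a) | a in A}."""
        return frozenset(self.rule(a) for a in self.domain)


# Hypothetical example: f(x) = x^2 with domain {-3, ..., 3} and codomain {-9, ..., 9}.
square = FiniteFunction(
    domain=frozenset(range(-3, 4)),
    codomain=frozenset(range(-9, 10)),
    rule=lambda x: x * x,
)

print(sorted(square.image()))            # [0, 1, 4, 9]
print(square.image() < square.codomain)  # True: the range is a proper subset here
```

Changing only the codomain (say, to {0, ..., 9}) produces a different package with the same rule and the same range.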
Seeing Functions
Graphs. Graphs are deservedly popular in elementary calculus. Properties like
smoothness, steepness, rising vs. falling, concavity, and existence of asymptotes
reflect and reveal a lot about functions.
A less familiar fact is that, like functions, graphs can also be described in
the language of sets. The graph of a calculus-style function, say, f(x) = x^2, is
a curve in the xy-plane, made up of points of the form (x, f(x)); in this case,
(x, x^2). But the idea of a graph makes sense for any function:
Observe:
• A graph is a set: The graph of f is a certain set of ordered pairs, and thus
a subset of the Cartesian product A × B.
• But not just any set: For f to be a function with domain A, its graph G
must contain one and only one point (a, b) for each a ∈ A. The graph of
a function f : R → R, for instance, must contain exactly one point of the
form (3, y). In pre-calculus lingo, this is the vertical line test.
• Maybe a curve, maybe not: Graphs of calculus-style functions are often nice
curves. Indeed, a lot of beginning calculus is about connecting geometric
properties of curves to analytic properties of functions. But some functions
have graphs that are nothing like curves. The graph of j in Example 2, for
instance, has points of the form (Carol, February)!
• Other “graphs” out there: Like other useful math words, “graph” has
different meanings in different settings. The graph of an equation, for
example, is the set of points (x, y) for which x and y satisfy the equation. The
graph of x^2 + y^2 = 1 is a circle, and therefore not the graph of any function.
In this book, “graph” will refer only to functions.
• One idea, two views: A function and its graph are so closely linked (either
one completely determines the other) that functions are sometimes defined,
rather than just visualized, as sets of ordered pairs. (From this perspective,
functions are sets in their own right.) We’ll use both of these viewpoints freely.
Other views. For some functions, geometric graphs make little sense, so we
use other descriptive devices—tables, diagrams, etc. To describe the function
BIRTHMONTH in Example 2, for instance, we could use a table:
We could also use a diagram to describe BIRTHMONTH, showing the domain, the
codomain, and arrows connecting inputs p to their corresponding outputs h(p).
(In such a view, the range is the set of “arrowheads,” where incoming arrows
“land.” Draw your own diagram for BIRTHMONTH.)
Yet another, even less formal notation is sometimes useful. To describe the
MONTHNUMBER function, we could just write
The next two examples will help illustrate the meaning of—and how to prove or
disprove—injectivity.
as desired. ♦
1.3. Sets 102: The Idea of a Function 23
Definition 1.9 (Onto functions). A function f : A → B is onto (or surjective) if,
for every b ∈ B, there is some a ∈ A with f(a) = b. In equivalent words: Every
element of the codomain is also in the range. In equivalent symbols:
{f(a) | a ∈ A} = B.
(Using “onto” as an adjective sounds ugly, but everybody does it.)
EXAMPLE 4. The function BIRTHMONTH from Example 2 is not onto, because
no member of P was born in (say) March. (Finding even one such codomain
member is enough.) The function MONTHNUMBER : A → N12 is onto, because
the 12 months range in number from 1 to 12. The “inclusion”
i : Q → R with rule i(x) = x is not onto, because the codomain includes
irrationals, but the range does not.
Is the quadratic function q : R → R given by q(x) = x^2 − 6x surjective?
The short answer is no: A little calculus or algebra shows that q(3) = −9 is the
minimum value, so the range of q is the interval [−9, ∞), a smaller set than the
codomain. We could make q surjective by using [−9, ∞), not R, as the codomain.
♦
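The two checks in Example 4 can be sketched in Python. The helper `is_onto` and the list `MONTHS` are illustrative assumptions (the text does not spell out the domain A of MONTHNUMBER; month names are assumed here). For q, the sketch only spot-checks integer inputs against the completed square q(x) = (x − 3)^2 − 9; it is not a proof.

```python
def is_onto(domain, codomain, rule):
    """True if every element of the (finite) codomain is an output of the rule."""
    return {rule(a) for a in domain} == set(codomain)


# Assumed domain for MONTHNUMBER: the twelve month names.
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
month_number = {name: i for i, name in enumerate(MONTHS, start=1)}

print(is_onto(MONTHS, range(1, 13), month_number.get))  # True


def q(x):
    return x * x - 6 * x


# Completing the square: q(x) = (x - 3)^2 - 9, so q(x) >= -9 for every real x.
samples = range(-10, 17)
print(all(q(x) == (x - 3) ** 2 - 9 for x in samples))  # True
print(min(q(x) for x in samples))                       # -9, attained at x = 3
```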
Bijective functions. A function that is both injective and surjective is called
bijective, or, equivalently, a one-to-one correspondence. Our MONTHNUMBER
function is one bijection. Another is suggested by a few values:
1 ↦ a, 2 ↦ b, 3 ↦ c, ..., 25 ↦ y, 26 ↦ z.
(What are the domain and codomain? What’s a good name for this function?)
Here are two bijections from calculus, this time in a matched pair:
f : (−π/2, π/2) → R, with rule f(x) = tan(x)
g : R → (−π/2, π/2), with rule g(x) = tan^(−1)(x).
You can readily convince yourself, perhaps with graphs, that f and g are indeed
one-to-one and onto. But notice a little surprise: f and g are one-to-one cor-
respondences between an interval of finite length and the entire real line. Such
strange behavior is possible when infinite sets are involved; we will explore infi-
nite sets further in a later section.
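A numerical spot-check of the matched pair f = tan and g = tan^(−1), using Python's standard `math` module. The sample points and tolerances are arbitrary choices for this sketch; floating-point agreement on a few inputs is evidence, not a proof.

```python
import math

xs = [-1.5, -1.0, -0.25, 0.0, 0.5, 1.2, 1.5]  # points in (-pi/2, pi/2)
ys = [-1e6, -10.0, -1.0, 0.0, 2.5, 1e6]       # arbitrary real numbers

# g undoes f on the interval, and f undoes g on the real line.
round_trip_f = all(math.isclose(math.atan(math.tan(x)), x, abs_tol=1e-12) for x in xs)
round_trip_g = all(
    math.isclose(math.tan(math.atan(y)), y, rel_tol=1e-6, abs_tol=1e-9) for y in ys
)

# Even very large inputs land strictly inside the finite-length interval.
in_interval = all(-math.pi / 2 < math.atan(y) < math.pi / 2 for y in ys)

print(round_trip_f, round_trip_g, in_interval)
```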
Recall, especially, that order matters: The notation g ◦ f means that g follows f.
(This makes sense with respect to nested parentheses, but it goes against the
usual grain of reading from left to right.) The two compositions g ◦ f and f ◦ g
are seldom equal, even if both make good sense.†
from Example 2?
and
MONTHNUMBER ◦ BIRTHMONTH : P → N12
make sense (the former is the function we called j in Example 2). For instance,
and
make no sense. ♦
Notes on proofs. We’ll prove (ii) and (iv), leaving (i) and (v) to the exercises.
Statement (iii) just combines (i) and (ii), so there is nothing new to prove.
To prove (ii) we need to show that for any c ∈ C there is some a ∈ A with
g ◦ f (a) = c. We can do this directly. For given c ∈ C, we know (because g is
onto) there exists b ∈ B with g(b) = c. Because f is onto there exists a ∈ A with
f (a) = b, and this a does the job: g ◦ f (a) = g( f (a) ) = g(b) = c, as desired.
To prove (iv) we’ll show that if f is not one-to-one, then g ◦ f cannot be
one-to-one either. (In math-speak, this is an indirect proof; more details on such things
in later sections.) Suppose, then, that a1 ≠ a2, but f(a1) = f(a2). Then we’d
have g(f(a1)) = g(f(a2)), which is just another way of saying that g ◦ f(a1) =
g ◦ f(a2). Thus g ◦ f is not one-to-one, and the proof is done.
Note, finally, that g need not be one-to-one just because g ◦ f is. The calculus
formula y = (e^x + 1)^2 illustrates this. (Let f(x) = e^x + 1 and g(x) = x^2; further
details are left to exercises.)
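The closing remark can be explored numerically. The sketch below exhibits one collision for g, which does settle that g is not one-to-one, and spot-checks that g ◦ f is strictly increasing on a grid of sample points, which is only suggestive. The name `g_after_f` and the grid are assumptions of the sketch.

```python
import math


def f(x):
    return math.exp(x) + 1  # every output is greater than 1


def g(x):
    return x * x


def g_after_f(x):
    return g(f(x))  # the composition g o f: apply f first, then g


# One collision is enough to show g is not one-to-one on R.
print(g(-2) == g(2))  # True

# On a sample grid, g o f is strictly increasing, so no two sampled inputs collide.
xs = [k / 4 for k in range(-20, 21)]
values = [g_after_f(x) for x in xs]
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))
print(strictly_increasing)
```

The reason the composition behaves well is that f only ever feeds g inputs greater than 1, where squaring is increasing.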
Inverse Functions
Another important definition:
Definition 1.11 (Inverse functions). Let f : A → B and g : B → A be func-
tions. We say f and g are inverse functions if both
Recall, first, that “inverse” has several meanings in mathematics. For instance,
the numbers 3 and −3 are additive inverses because 3+(−3) = 0. Similarly, 3 and
1/3 are multiplicative inverses because 3×1/3 = 1, the multiplicative identity. In
a similar spirit, two functions are inverses if composing them (rather than adding
or multiplying) produces identity functions:
Relations
Let A be a nonempty set, and f : A → A any function on A. We can think of f
as its graph Gf = {(a, f(a)) | a ∈ A}. (A function and its graph are
essentially the same thing.) Note that Gf is a subset of A × A, but a
subset with special properties that reflect the fact that f is a function. For instance,
Gf cannot contain two different elements of the form (a0, a1) and (a0, a2).
Definition 1.12 (Relation on a set). A relation R on a set A is any subset of A ×
A. If (x, y) ∈ R, we write x R y, and we say that x is related to y.
This may sound forbiddingly abstract, but familiar (if lightly disguised) examples
are all around us.
EQUALS = {(n, n) | n ∈ N}
corresponds to the equality relation: each integer is related (i.e., equal) only
to itself. In this case, of course, we usually write n = n, not n EQUALS n.
• Order relations: The set
defines the “divides” relation, in which we find 3 DIVIDES 123456, 42 DIVIDES 42,
and 1 DIVIDES n for all n. Number theorists use the handy symbol m | n
rather than m DIVIDES n.
Equality (on any set) is the prototype and simplest (but not the only) example
of an equivalence relation. Consider, for example, the SAMEBLOODTYPE
relation on the set of humans, where x SAMEBLOODTYPE y means (of course) that
x and y have the same blood type. It is easy to see that SAMEBLOODTYPE is
indeed an equivalence relation, which sorts people into four “families” (called
equivalence classes) based on their blood types: A, B, AB, or O. More examples
are in the exercises and later in this book.
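A relation on a finite set is just a set of ordered pairs, so its properties can be checked by brute force. The sketch below builds a small SAMEBLOODTYPE relation on five made-up people (the names and blood types are invented for illustration) and tests the three properties usually required of an equivalence relation: reflexive, symmetric, and transitive.

```python
from itertools import product

# Hypothetical data: five people and their blood types.
blood_type = {"Ann": "A", "Bo": "O", "Cy": "A", "Di": "AB", "Ed": "O"}
people = set(blood_type)

# The relation as a subset of people x people.
same_blood_type = {
    (x, y) for x, y in product(people, repeat=2) if blood_type[x] == blood_type[y]
}


def is_reflexive(R, A):
    return all((a, a) in R for a in A)


def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)


def is_transitive(R):
    return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)


def equivalence_classes(R, A):
    """Group each element with everything related to it."""
    return {frozenset(y for y in A if (x, y) in R) for x in A}


print(is_reflexive(same_blood_type, people))  # True
print(is_symmetric(same_blood_type))          # True
print(is_transitive(same_blood_type))         # True
classes = equivalence_classes(same_blood_type, people)
print(len(classes))  # 3 classes here, since nobody in the sample has type B
```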
Exercises
1. Each part following gives the rule (implicit in the name) for a possible func-
tion. For each rule, find a reasonable domain A and codomain B to create a
function. (Try to make A relatively large and then make B relatively small.)
Is each function one-to-one? Onto?
(a) MOTHER : A → B
(b) FIRSTBORNSON : A → B
(c) EYECOLOR : A → B
(d) BIRTHDAY : A → B
3. Find the natural domain for each of the following calculus-style functions.
(a) f(x) = sin(x + e^x)
(b) g(x) = tan(x)
(c) h(x) = √(x^2 − 1)
4. Find the natural domain for each of the following calculus-style functions.
(a) f(x) = √(1 − e^x)
(b) g(x) = √(x^2 + πx + 1) (give a decimal approximation)
(c) h(x) = ln (ex ) (give a decimal approximation)
5. In each part, find an appropriate quadratic formula q(x) = ax^2 + bx + c for
the given function. (More than one answer may be correct.)
(a) How can you tell from its graph whether f is one-to-one?
(b) How can you tell from its graph whether f is onto?
(c) Find formulas for three different calculus-style functions f : [0, 1] →
[0, 1] that are one-to-one and onto, and such that f (0) = 1.
(d) Define f : [0, 1] → [0, 1] by setting f(x) = x for x ∈ Q and f(x) =
x^2 for x ∉ Q. Is f one-to-one? Onto?
(a) Describe the set G. (Hint: G is a set of ordered pairs; for instance,
(May, 3) ∈ G.)
(b) The WORDLENGTH function is not one-to-one. How does the graph
G reveal this?
(c) The WORDLENGTH function is also not onto. How does the graph G
reveal this?
11. A certain function f has graph G = {(a, 1), (e, 2), (i, 3), (o, 4), (u, 5)}.
(a) What are the domain and the (smallest possible) codomain of f ?
(b) The function f is bijective. How can we tell this from the graph?
(c) Because f is bijective there is an inverse function, f^(−1). What is the
graph of f^(−1)?
(d) Can you think of more descriptive names than f and f^(−1) for these
functions?
(a) Show that if both f and g are one-to-one, then so is g ◦ f . (This is (i)
of Theorem 1.10, page 25.)
(b) Show that if g ◦ f is onto, then so is g. (This is (v) of Theorem 1.10,
page 25.)
(c) Consider the functions f(x) = e^x + 1 and g(x) = x^2, both with
domain R and codomain R. Show that g ◦ f is one-to-one, but g is
not.
17. Consider ℓ : N → Z given by ℓ(n) = n/2 if n is even, and ℓ(n) = (1 − n)/2
if n is odd. Is ℓ one-to-one?
Onto? Does this seem strange given the “sizes” of N and Z?
18. Consider the functions f(x) = x^2 and g(x) = √x, both with [0, ∞) as
domain and as codomain.
19. The ordinary sine and arcsine functions are inverses as long as some care is
taken with domains and codomains. Work out the details—that is, specify
domains and codomains for these functions that make them inverses.
20. Explain why the SAMEBLOODTYPE relation, discussed at the end of this
section, is reflexive, symmetric, and transitive.
21. Consider the relation MOD5 on Z, defined by MOD5 = {(m, n) | 5 divides
m − n}.
The old Chinese sage who supposedly said so may have had other things in mind,
but the advice certainly applies to mathematics. Without clear and unambiguous
language—“right names”—we can’t know exactly what we are talking about, and
therefore we can’t produce really convincing proofs.
The moral is that care with language is just as important in mathematics as it
is in, say, the literary arts. Granted, poets and mathematicians use language very
differently: good poetry may rely on subtle allusions and shaded meanings, but
the best mathematical proofs are always clear, direct, and straightforward, even
when they convey difficult ideas. Both poems and proofs can be praised as ele-
gant, but the judgment depends on different standards. The good news is that we
mathematicians need not aspire to fancy artistry: clarity and directness of expres-
sion are less rarefied arts than practical skills, readily acquired and improved on
the job.
1.4. Proofs 101: Proofs and Proof-Writing 33
Use standard symbols and notations. Using standard notations, of which we’ll
encounter many in this book, helps shorten, unclutter, and clarify mathematical
discussion—but only if notations are used consistently, and with care. For in-
stance, the notations (1, 3), [1, 3], and {1, 3} all have precise but different mean-
ings—one is an open interval, one is a closed interval, and one is a set with just
two members. Straying from these conventions is asking for trouble. It is far from
clear, for instance, what such notations as
[x | x^2 > 2] and Q | (−√2, √2)
really mean. By contrast, the expressions
{x | x^2 > 2} and {x ∈ Q | x^2 < 2}
are clear and unambiguous. (Or will be, once we have
defined all the ingredients.)
Define everything. Everyone expects theorems, proofs, and calculations in math-
ematical writing. Less expected, but equally important, are formal definitions.
Words in everyday language have fluid meanings, with nuances that depend on
context. By contrast, words in mathematics have fixed, precise, agreed-upon
meanings. (There are some exceptions, but not many.) A typical theorem claims
that some mathematical object has some mathematical property. Without clear
definitions of the objects and properties under discussion, proof can’t get started.
EXAMPLE 1. Prove this claim: The set of rational numbers is closed under
addition.
PROOF. Since x and y are rational, we can write x = a/b and y = c/d, where
a, b, c, and d are integers, and b and d are nonzero. Now we use some fraction
arithmetic (see the common denominator?):
x + y = a/b + c/d = ad/(bd) + bc/(bd) = (ad + bc)/(bd),
a rational number, and the proof is done. ♦
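The fraction arithmetic in the proof can be mirrored in a few lines of Python. The helper `add_rationals` is a hypothetical name for this sketch; it uses only integer sums and products, exactly the operations the proof relies on, and the result is compared against the standard library's `fractions.Fraction`.

```python
from fractions import Fraction


def add_rationals(a, b, c, d):
    """Return (numerator, denominator) of a/b + c/d over the common denominator bd."""
    if b == 0 or d == 0:
        raise ZeroDivisionError("denominators must be nonzero")
    # Sums and products of integers are integers; bd is nonzero because b and d are.
    return a * d + b * c, b * d


num, den = add_rationals(1, 2, 2, 3)  # 1/2 + 2/3
print(num, den)                        # 7 6
print(Fraction(num, den) == Fraction(1, 2) + Fraction(2, 3))  # True
```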
Fair game? Few proofs are completely from scratch. Even the simple proof
above relies on some very basic properties of integers: sums and products of in-
tegers are always integers, and the product of nonzero integers is always nonzero.
(In this book we will freely assume and use such basic properties of integer arith-
metic.) Knowing just which assumptions are safe and which need proof can be
tricky—especially when assumptions are “understood” rather than stated explic-
itly. Learning to sort out such matters is part of the craft of proof.
Write in complete “sentences.” The quotes are there because sentences in
mathematics may include not just ordinary words but also symbols,
equations, inequalities, etc. For instance, it is fine to write
One problem with the latter is the lack of “connective tissue”: a reader can’t tell
whether you’re asserting that something implies something else, or just listing
your favorite inequalities. At any cost, be clear.
Would you swim here or not? Is the sign intended for human or for reptile readers?
In practice, many mathematical sentences convey complex ideas, and so nat-
urally have correspondingly complex structures. It is especially important, there-
fore, to write mathematics as clearly and unambiguously as possible, and to help
the reader decipher your meaning.
Write sentences that “scan.” Proofs and solutions must of course be correct,
but they must also be intelligible to the intended reader. An excellent way to
ensure the latter is to read each sentence back to yourself. (Doing this silently
may reduce ridicule from neighbors.) A sentence with proper English grammar
and syntax may be mathematically right or wrong; every mathematician has seen
eloquent proofs that boil down to nonsense. But an ungrammatical sentence is
almost surely wrong or, worse, meaningless to a reader.
Make sense. Ask carefully whether what you write makes sense by the strict
standards of mathematical writing. For instance, the sentence
For all Q, we have x^2 ≥ 0.
is nonsensical (do you see why?) and hence neither true nor false. Your first job
is to write sentences that make sense. Your second job is to write sentences that
are true.
Just don’t do “it.” The harmless-looking pronoun “it” commits countless crimes
in mathematics. Here, for example, is a confusing way to describe an important
connection between a function f and its derivative f ′ :
It’s maximum or minimum when it’s flat, and that happens when it’s
a zero of its derivative.
That’s way too many pronouns, and who knows what each refers to? Just say no:
A function f may have a maximum or minimum value where the
graph of f is flat. This can occur at a value of x for which f ′ (x) = 0.
Make it look easy. A musician planning a recital invests hours of practice and
study, and hits plenty of false notes. The recital itself skips all of this practice and
study, and most of the false notes. In the same way, a finished mathematical proof
should be the polished result, rather than the basic process, of whatever informal
thinking, experiments, and false starts may have happened along the way. It is
sometimes helpful to hint at the investigative phase of proving a result, but it is
important not to confuse such material with the proof itself. Good proofs are
clear, concise, and couched in standard mathematical language.
Don’t say too much—or too little. Respect, but don’t overtax, your reader’s intel-
ligence and willingness to work. Ideas in your proof should be clear and accessi-
ble to your reader—someone with your own level of intelligence and knowledge.
and
Q and not R : It is cloudy but not raining.
Variables. Some statements involve variables: symbols that stand in for unspec-
ified inputs. For such statements we sometimes use names like P (x) to emphasize
the presence of variables. Here are some examples:
The truth or falsity of such statements usually depends on the values of the
variables involved. Here, for instance, P(7) is false, P(−1) is true, Q(Sue) is
probably false, and Q(Ed) might be true. In this case R(x, y) happens to be true for all
real number inputs x and y. (Things might change if we allowed, say, imaginary
number inputs.)
and
Q =⇒ P means x^2 > 9 =⇒ x > 3.
Here, clearly, the implication P =⇒ Q is true: x^2 > 9 does indeed hold
whenever x > 3. But the implication Q =⇒ P is false; try x = −42. An
important moral is that implication is, by default, a one-way street: P =⇒ Q is
no guarantee that Q =⇒ P.
If it happens that both P =⇒ Q and Q =⇒ P, then P and Q always
have the same truth values, and are therefore called equivalent. The statements
P(x) : 2x + 5 = 11 and Q(x) : x = 3 are equivalent, for instance. (Solving
equations, as in linear algebra, is all about searching for equivalent, but
simpler, equations.)
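The one-way street can be seen by testing both implications on a handful of inputs. In this sketch P(x) means x > 3 and Q(x) means x^2 > 9, as in the discussion above; the helper `implies` and the sample list are assumptions of the sketch. Passing on samples does not prove P =⇒ Q, but a single failing sample does disprove Q =⇒ P.

```python
def P(x):
    return x > 3


def Q(x):
    return x * x > 9


def implies(p, q):
    """Truth value of 'p implies q': false only when p is true and q is false."""
    return (not p) or q


samples = [-42, -4, -3, 0, 3, 3.5, 4, 100]

p_implies_q_everywhere = all(implies(P(x), Q(x)) for x in samples)
counterexamples = [x for x in samples if not implies(Q(x), P(x))]

print(p_implies_q_everywhere)  # True
print(counterexamples)         # [-42, -4]
```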
And, or. Given statements P and Q, we can form new statements ( P and Q )
and ( P or Q ). (The parentheses aren’t essential, but they help keep the right
things together.) “And” is used mathematically much as it is in everyday speech:
( P and Q ) is true if, but only if, both P and Q are true. “Or” is a little different:
We take ( P or Q ) to be true when either or both of P and Q is true. This conven-
tion, called the inclusive or, differs a bit from the exclusive or, sometimes written
xor, that’s common in everyday life: A child might be offered either candy or ice
cream, but not both.
not P ,
Negating complicated statements can take some thought. Consider, for instance,
Goldbach’s conjecture:
G : Every even integer greater than 2 is the sum of two primes.
(This famous unsolved problem dates back to the 1740s, in correspondence
between Christian Goldbach and Leonhard Euler.)
How would we negate G? We could, admittedly, write something like
but this is unhelpful. A better approach is to notice that G asserts that every even
integer has a certain property. To negate such a claim is to say that some even
integer does not have this property:
not G: Some even integer greater than 2 is not the sum of two primes.
In other words, to disprove G it is enough to find even one big even number that
is not the sum of two primes. (And achieve fame. It’s been tried . . . .)
In a similar spirit, De Morgan’s laws (we alluded to them in slightly different
form in an earlier section) tell how to negate statements that involve and and or:
Notice, especially, that negating and statements produces or statements, and vice
versa.
As noted earlier, an implication and its converse need not have the same truth
value, and they do not in the present case. An implication and its contrapositive,
on the other hand, always have the same truth value. (Common sense bears this
out; we explore it further in the exercises.) This matters in mathematical
practice, as we will see, because the contrapositive of a statement is sometimes
easier to prove than the original.
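The same four-row truth table settles the claims about converses and contrapositives. In this sketch, `implies` is a hypothetical helper encoding the usual truth table for implication.

```python
from itertools import product


def implies(p, q):
    """'p implies q' is false only when p is true and q is false."""
    return (not p) or q


rows = list(product([True, False], repeat=2))

# Contrapositive: (not Q) implies (not P). Converse: Q implies P.
contrapositive_agrees = all(implies(p, q) == implies(not q, not p) for p, q in rows)
converse_agrees = all(implies(p, q) == implies(q, p) for p, q in rows)

print(contrapositive_agrees)  # True: same truth value in every row
print(converse_agrees)        # False: the rows where P and Q differ disagree
```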
Coda: new-age proofs. Proofs are no less important now than they’ve ever been.
But modern viewpoints and, especially, modern technology have made new kinds
of mathematics possible—and sometimes require new kinds of proof. For exam-
ple, the four-color theorem (four colors suffice to color any planar map of “coun-
tries”) was posed in 1852, but proved only in 1976, with aid from a computer to
check hundreds of special cases. Computer-aided proofs are now common, but
they remain controversial.
Exercises
1. Here are some statements or attempted statements about real numbers. Which
are true? Which are false? Are any nonsensical? Are any negations of each
other?
2. Here are some statements or attempted statements about real numbers. Which
are true? Which are false? Are any nonsensical? Are any negations of each
other?
3. Following are several possibly true but poorly-stated claims from elemen-
tary calculus. Fix each statement by replacing all instances of “it” and “its”
and “it’s” with clearer words or phrases. (An example appears on page 35.)
4. Each of the following sentences has one or more syntax errors. In each
case, make a clear (and true) sentence with as little editing as possible.
(a) R =⇒ not S
(b) R =⇒ C
(c) R =⇒ (not S) and C
(d) C =⇒ not S
(a) P =⇒ R
(b) Q =⇒ R
(c) (P and Q) =⇒ R
10. Converses and contrapositives. In each part, write (as simply as possible)
both the converse and the contrapositive of the given implication. No proofs
needed, but try to label each statement as true or false. In all parts, a, b, a_n,
etc. all stand for real numbers.
(a) If a and b are both rational, then a + b is rational.
(b) If a is irrational then 1/a is irrational, too.
(c) If a and b are both irrational, then ab is irrational.
(d) If a series Σ a_n converges, then lim_{n→∞} a_n = 0.
11. Negations. In each part, write (as simply as possible) the negation of the
given statement.
12. Negations. For each statement following, write the negation as simply as
possible. Then try to decide (some parts may be hard) whether each state-
ment is true or false; no proofs needed.
Claim V is harder to restate in if–then form because its hypotheses, such as the
definition of a prime number, would be tedious to list.
The form of a claim often suggests possible strategies for proving or refuting
it. In the case of an if P then Q claim, for instance, we might attack either the
claim itself or its contrapositive, if not Q then not P. For broad claims like VII
and VIII, each of which covers infinitely many cases, we should expect to work
harder for a proof—or maybe look for even one counterexample as a disproof.
Direct Proof
A direct proof addresses a claim if P then Q in the “obvious” way—it starts with
P and derives Q. We illustrate with Claim I.
EXAMPLE 1. Prove directly (using basic properties of integer arithmetic) that if
a and b are rational numbers, then ab is rational. (Remember, we’re assuming
integer basics.)
Indirect Proof
An indirect proof, like a direct one, addresses a claim of the form if P, then Q. But
there is a twist: instead of showing that P implies Q, we prove the contrapositive
(but equivalent!) statement:
An indirect proof may be the simplest choice when either P or Q is awkward, but
(not P ) or (not Q) is simpler. We illustrate with Claim III.
SOLUTION. A direct argument seems awkward here because the hypothesis says
something negative about a. The contrapositive is simple and straightforward: if
1/a is rational, then a is rational.
This is easily shown. If 1/a is rational, then we can write 1/a = x/y for
some (nonzero) integers x and y. But then we have a = y/x, which shows that a
is rational, as desired. ♦
Proof by Contradiction
Proofs by contradiction are close kin to indirect proofs. In each case we first
assume that the conclusion fails, and then try to deduce a contradiction, either of
the hypothesis or of some other known fact. From the resulting absurdity we infer
that the original conclusion must have been true all along. (The method is also
known as reductio ad absurdum, Latin for “reduction to the absurd.”)
EXAMPLE 4. Prove Claim IV: √5 is irrational.
SOLUTION. We could tweak the proof of Theorem 1.2, page 6, but for variety
we’ll take an approach based on prime factorization: Every positive integer n has
a unique list of prime factors, some of which may be repeated. (For n = 60 the
list is {2, 2, 3, 5}.) The key insight for our proof is about squaring: each prime
factor of n appears twice as often among the prime factors of n^2. (For n^2 = 60^2,
the prime factors are {2, 2, 2, 2, 3, 3, 5, 5}.) In particular, every square integer has
an even number of prime factors. So much said, we’re ready for a crisp proof.
PROOF. Assume, toward contradiction, that √5 is rational. (It is traditional, and
helpful, to label contradiction proofs up front.) Then we can write
√5 = a/b for some positive integers a and b, and so
√5 = a/b =⇒ 5 = a^2/b^2 =⇒ 5b^2 = a^2.
The last equation provides our contradiction. The right side is a square, and
therefore has an even number of prime factors. But the left side has an odd number of
prime factors: an even number coming from b^2 and one more from the 5. This
absurdity completes the proof. ♦
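The parity argument can be watched in action with a small trial-division factorizer. The function `prime_factors` is written for this sketch (it is not from the text) and lists prime factors with repetition; the loop bound of 50 is an arbitrary choice.

```python
def prime_factors(n):
    """Prime factors of a positive integer n, listed with repetition, in increasing order."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors


print(prime_factors(60))       # [2, 2, 3, 5]
print(prime_factors(60 ** 2))  # [2, 2, 2, 2, 3, 3, 5, 5]

# For sample values of b: b^2 has an even number of prime factors, 5b^2 an odd number.
squares_even = all(len(prime_factors(b * b)) % 2 == 0 for b in range(1, 51))
five_b2_odd = all(len(prime_factors(5 * b * b)) % 2 == 1 for b in range(1, 51))
print(squares_even, five_b2_odd)  # True True
```

Since a^2 always has an even count and 5b^2 always has an odd count, the two sides of 5b^2 = a^2 can never match.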
PROOF. Assume, toward contradiction, that there are only finitely many primes,
say p1, p2, p3, . . . , pn. Now consider the number N = p1 p2 p3 · · · pn + 1. By its
construction, N is not divisible by any of the primes p1, p2, p3, . . . , pn. (N is one
more than a multiple of each pi.) Hence either N is itself prime or N has at least
one prime factor not among p1, p2, p3, . . . , pn.
Either way, the list {p1, p2, p3, . . . , pn} could not have been complete. This
contradiction completes the proof. ♦
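The construction in the proof can be tried on a concrete (and of course incomplete) list of primes. The list below and the helper `smallest_prime_factor` are choices made for this sketch. Note that N itself need not be prime; the argument only needs a prime factor outside the list.

```python
from math import prod


def smallest_prime_factor(n):
    """Smallest prime dividing n, for an integer n >= 2."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n is prime


primes = [2, 3, 5, 7, 11, 13]
N = prod(primes) + 1

print(N)                                 # 30031
print([N % p for p in primes])           # [1, 1, 1, 1, 1, 1]
print(smallest_prime_factor(N))          # 59, a prime missing from the list
print(N == 59 * 509)                     # True, so N is not itself prime here
```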
EXAMPLE 6. Prove Claim VI: Every even integer from 4 to 100 is the sum of
two primes.
SOLUTION. Why not just prove Claim VII (every even integer greater than two
is the sum of two primes) and be done with it? That would indeed do the job,
but Claim VII is Goldbach’s conjecture, a famous problem dating to the 1740s.
Considering that Claim VII has already stumped the likes of Euler, we’ll stick
with Claim VI. Handling its 49 special cases is easy, although perhaps tedious
(try some more!):
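The 49 cases of Claim VI can also be delegated to a short program, a proof by exhaustion carried out by machine. The helpers `is_prime` and `goldbach_pair` are written for this sketch; `goldbach_pair` returns the decomposition with the smallest first prime, which is only one of possibly several.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def goldbach_pair(n):
    """Return primes (p, n - p) with p <= n - p, or None if no such pair exists."""
    for p in range(2, n // 2 + 1):
        if is_prime(p) and is_prime(n - p):
            return p, n - p
    return None


pairs = {n: goldbach_pair(n) for n in range(4, 101, 2)}

print(len(pairs))                                        # 49
print(all(pair is not None for pair in pairs.values()))  # True
print(pairs[4], pairs[100])                              # (2, 2) (3, 97)
```

Checking finitely many cases settles Claim VI but says nothing about Claim VII, which covers infinitely many even integers.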
Proof by Induction
Mathematical induction is among every mathematician’s favorite power tools. It
is a simple, structured, and sometimes astonishingly powerful approach to prov-
ing whole families of claims at once. We illustrate the idea first informally, by
example.
It is easy in this case to check P(n) for small n, simply by listing. For P(1), the
two subsets are {1} and ∅. (Every set has ∅ as a subset.) For n = 2 the list is
With n = 3 we get eight subsets, as expected: four from the n = 2 case plus four
more that include a 3 (see the pattern?):
Induction, formally. Example 7 illustrates the usual setting for a proof by induc-
tion: a claim that some proposition P (n) holds for every positive integer n. To
prove such a claim by induction takes two (named) steps:
• The base case: Show that P (1) holds.
48 1. Preliminaries: Numbers, Sets, Proofs, and Bounds
• The inductive step: Show that P (k) implies P (k + 1) for every positive
integer k. (Here P (k) is called the inductive hypothesis.)
We illustrate the idea (and the customary shop talk) by formalizing the proof
that, for all n, the set {1, 2, . . . , n} has $2^n$ subsets.
Proof (of Claim VIII, by induction): The base case n = 1 holds because the set
{1} has only itself and the empty set as subsets.
For the inductive step we assume the inductive hypothesis (that {1, 2, . . . , k}
has $2^k$ subsets) and try to show that {1, 2, . . . , k, k + 1} has $2^{k+1}$ subsets. To
this end, note first that all $2^k$ subsets of {1, 2, . . . , k} are also subsets of {1, 2, . . . ,
k + 1}. Every remaining subset of {1, 2, . . . , k + 1} contains k + 1, and so can be
formed by adding k + 1 to some subset of {1, 2, . . . , k}; there are exactly $2^k$ of
these as well. Thus {1, 2, . . . , k, k + 1}
has $2 \times 2^k = 2^{k+1}$ subsets, as desired, and the proof is complete.
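The inductive step is really a recipe for building the subsets, and it can be run as code. The sketch below is an illustration under one stated assumption: it starts from the empty set (the case n = 0) rather than from the base case n = 1 used in the proof. The name `subsets` is chosen here for convenience.

```python
def subsets(n: int) -> list[frozenset[int]]:
    """All subsets of {1, ..., n}, built the way the inductive step suggests."""
    result = [frozenset()]          # the only subset of the empty set
    for k in range(1, n + 1):
        # Keep every subset of {1, ..., k-1}, and add k to a copy of each one.
        result = result + [s | {k} for s in result]
    return result

for n in range(1, 6):
    print(n, len(subsets(n)))       # the count doubles at every step
```

Each pass through the loop doubles the length of the list, which is the counting argument of the proof in miniature.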
EXAMPLE 8. The identity $1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}$ holds for all positive
integers n.
SOLUTION. We’ll write P(n) for the identity above; for instance, P(10) says
that 1 + 2 + · · · + 10 = 10 · 11/2 = 55. This is easily checked directly, but
we’d rather prove P(n) for all positive integers n, which is just the right job for
mathematical induction.
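Before proving P(n), it can be reassuring to test it. The sketch below (the helper names `rhs` and `P` are assumptions for this illustration) checks the identity for many n and also spot-checks the algebra behind the inductive step, namely that adding k + 1 to the formula for k gives the formula for k + 1. Such checks are evidence, not a proof.

```python
def rhs(n: int) -> int:
    """Right-hand side of the identity: n(n+1)/2 (always an integer)."""
    return n * (n + 1) // 2

def P(n: int) -> bool:
    """Check the identity 1 + 2 + ... + n == n(n+1)/2 for one value of n."""
    return sum(range(1, n + 1)) == rhs(n)

# Direct checks for n = 1, ..., 1000.
print(all(P(n) for n in range(1, 1001)))

# The inductive step in numbers: rhs(k) + (k + 1) equals rhs(k + 1).
print(all(rhs(k) + (k + 1) == rhs(k + 1) for k in range(1, 1001)))
```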
We’ll take this basic fact, like others about the integers, as an axiom rather
than something to be proved.
Exercises
1. In each part of this problem, either prove or disprove the given claim. If you
prove the claim, indicate whether your proof is direct, indirect, by contradiction,
or something else. If you disprove the claim, use a counterexample.
(It’s OK to assume known facts, such as that $\sqrt{2}$ is irrational.) In all cases,
x and y are real numbers.
(a) If $x \in \mathbb{Q}$, then $\sqrt{2} + x \notin \mathbb{Q}$.
(b) If $x \notin \mathbb{Q}$, then $\sqrt{2} + x \notin \mathbb{Q}$.
(c) If x + y is irrational, then at least one of x and y is irrational.
(d) If p is a prime number, then $2^p - 1$ is prime.
(e) For all real numbers x and y, $|x - y| \le x^2 - y^2$.
2. In each part following, either prove or disprove the converse of the given
claim. If you prove the converse, indicate whether your proof is direct,
indirect, by contradiction, or something else. If you disprove the converse,
use a counterexample. (Assume known facts, such as that $\sqrt{2}$ is irrational.)
In all cases, x and y are real numbers.
(a) If $x \in \mathbb{Q}$, then $\sqrt{2} + x \notin \mathbb{Q}$.
(b) If $x \notin \mathbb{Q}$, then $\sqrt{2} + x \notin \mathbb{Q}$.
(c) If x + y is irrational, then at least one of x and y is irrational.
(a) Explain why S has an even number of subsets (including S and ∅).
Hint: Look at results in this section. Or think about complements.
(b) Suppose n is odd. Show that the number of subsets with an even
number of elements is the same as the number of subsets with an odd
number of elements. Hint: This can be done by induction, but it’s
quicker to use complements.
(c) The result in the preceding part actually holds for both odd and even
n. Prove this, perhaps by induction.
(a) Check directly that the equation holds for n = 1 and for n = 10.
(b) Prove by induction that the formula holds for all positive integers n.
(c) Use (don’t reprove) the formula in Example 8 to give another proof
(not involving induction) of the equation above.
8. Show that
$(1 + 2 + 3 + \cdots + n)^2 = 1^3 + 2^3 + 3^3 + \cdots + n^3$
holds for all positive integers n. (Hint: Use the formula for 1 + 2 + · · · + n
in Example 8.)
9. Recall the product rule from elementary calculus: If f1 and f2 have deriva-
tives f1′ and f2′ then (f1 f2 )′ = f1′ f2 + f1 f2′ .
(a) Use the ordinary product rule to show the analogous formula (f1 f2 f3 )′ =
f1′ f2 f3 + f1 f2′ f3 + f1 f2 f3′ .
(b) Let n be any positive integer. Guess a formula for (f1 f2 f3 . . . fn )′
and prove it by induction.
10. (a) Show that $5^n > n!$ for positive integers n < 12.
(b) Show that $5^n < n!$ for positive integers n ≥ 12.
11. (a) Guess a formula (in terms of n) for the sum 1 · 2 + 2 · 3 + 3 · 4 + · · · +
n · (n + 1). Prove your answer by induction.
(b) It is well known, and readily proved by induction, that
$\sum_{j=1}^{n} j^2 = \frac{n(n+1)(2n+1)}{6} \quad\text{and}\quad \sum_{j=1}^{n} j = \frac{n(n+1)}{2}.$
13. Show that the inequality $2^n < n!$ holds for all integers n > 3.
14. Every calculus student knows that if $f(x) = x^n$ for any positive integer n,
then $f'(x) = nx^{n-1}$. Prove this by induction, assuming (i) if f(x) = x,
then f′(x) = 1; and (ii) the product rule. (We’ll define the derivative, state
and prove the product rule, and firm up other ideas later in this book; here
the point is to see induction in a familiar setting.)
15. Another familiar calculus formula (see Problem 14) says that
if $g(x) = \frac{1}{x^n}$, then $g'(x) = -\frac{n}{x^{n+1}}$
for every positive integer n. Prove this by induction, assuming both the
n = 1 case and the product rule.
16. A version of Theorem 1.3 says that if n is a positive integer and $\sqrt{n}$ is
rational, then n is a perfect square. Prove this using the following outline,
the idea of prime factorization, and the fact that a positive integer n is a
perfect square if and only if every prime factor of n appears to an even
power. See also Example 4, page 45.
If $\sqrt{n} = a/b$, where a and b are positive integers, then squaring both sides
gives $nb^2 = a^2$. Now factor each side of this equation as a product of
prime numbers. Because the right side is a square, each prime factor on the
right side appears to an even power, and so the same must be true on the
left. Each prime factor of $b^2$ appears to an even power, and so (do you see
why?) each prime factor of n must also appear to an even power.
17. If your wallet contains two $5 bills and an unlimited supply of $3 bills,
then you can pay out some amounts, like $8 and $300, but not others, like
$2 and $7. Exactly which amounts can you pay out? Guess an answer and
prove it by induction.
18. Claim: In any group of n kittens, if one is orange, then all are orange.
That’s absurd, of course, but what’s wrong with the following “proof” by
induction?
Proof: The claim is trivial if n = 1, so the base case holds. To illustrate the
inductive step, assume the claim holds for n = 42. Suppose we’re given a
group of 43 kittens, including at least one—say, Hans—that’s orange. Any
other kitten—say Fritz—can be put in some 42-member group with Hans,
and so Fritz must also be orange by the inductive hypothesis. Thus all 43
kittens are orange, and the proof is done.
Finite sets: few surprises. “Small” sets offer few surprises. It seems clear, for
instance, that S = {a, b}, with two elements, is “smaller” than T = {a, b, c}, with
three. Similarly, N42 = {1, 2, . . . , 42} is obviously “smaller” than FIFTYSTATES
= {Alabama, Alaska, . . . , Wisconsin, Wyoming}, even though the two sets have
completely different types of elements, while N26 and ENGLISHALPHABET =
{a, b, c, . . . , z} have the same “size.”
“Measuring” sets by counting their elements works well for sets with finitely
many elements. For such sets, moreover, our intuition is usually reliable. If, say,
a set S has 427 elements and T ⊊ S, then T must be “strictly smaller,” with 426
elements or fewer.
1.6. Sets 103: Finite and Infinite Sets; Cardinality 53
Infinite sets: many surprises. Matters are very different for infinite sets. Con-
sider, for instance,
N = {1, 2, 3, 4, . . . } and E = {2, 4, 6, 8, . . . }.
In one way E seems obviously “smaller”—it omits all of the (infinitely many!) odd
numbers in N. On the other hand, the mapping
1 ↦ 2, 2 ↦ 4, 3 ↦ 6, ..., n ↦ 2n, ...
is a one-to-one correspondence between N and E, so maybe the two sets have the
“same size.” Similarly, the interval (0, 1) is “shorter” than the interval (0, 2), but
the same mapping, x ↦ 2x, is a one-to-one correspondence. An important aim
of this section is to develop useful ways of “measuring” infinite sets.
EXAMPLE 1. Show that (0, 1), (−2, 5), and (0, ∞) all have the same cardinality.
Thus, for any given b ∈ (0, 1), the positive number a = b/(1 − b) satisfies
g(a) = b. ♦
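The formula a = b/(1 − b) can be checked with exact arithmetic. The definition of g is not visible in this passage, so the sketch below assumes g(x) = x/(1 + x), the function on (0, ∞) whose inverse is b ↦ b/(1 − b); the name `g_inverse` is likewise chosen here.

```python
from fractions import Fraction

def g(x):
    """Assumed formula g(x) = x / (1 + x), which maps (0, infinity) into (0, 1)."""
    return x / (1 + x)

def g_inverse(b):
    """The preimage named in the text: a = b / (1 - b), for 0 < b < 1."""
    return b / (1 - b)

for b in [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(99, 100)]:
    a = g_inverse(b)
    print(b, a, g(a) == b)    # a is positive, and g sends it back to b
```

Exact fractions are used so that the equality test g(a) == b is not disturbed by rounding.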
• New finite sets from old: If A and B are finite sets, then A ∪ B, A ∩ B,
A \ B, and A × B are all finite.
• Listing and ordering: If A is any finite set, say with 42 elements, then we
can list the elements: A = {a1 , a2 , a3 , . . . , a42 }. If A happens to be a finite
set of real numbers, we can order our list from smallest to largest:
Biggest and smallest members. The last property above has a useful form that
deserves special mention:
[Margin note: We discuss this further in Problem 3, page 50.]
Fact 1.16. Every finite set of real numbers contains a maximum and a minimum
element.
The words “finite” and “contains” both matter. Some infinite sets, such as Z,
are unbounded, and therefore obviously lack maximum and minimum elements.
The case of a bounded infinite set, such as the open interval (1, 3), is a little
subtler. The problem is that 3 and 1, the obvious candidates for biggest and
smallest, are not members of (1, 3).
[Margin note: The closed interval [1, 3] does contain maximum and minimum elements.]
Countably infinite sets. The set N is the “model” countably infinite set, but many
other sets turn out to have the same cardinality as N. The following proposition
says, in effect, that N is the “smallest” infinite set.
Proposition 1.17. Let S be countably infinite, and let T ⊂ S be any nonempty
subset. Then T is either finite or countably infinite.
Notice the possible surprise: Every infinite subset of N—the odd numbers, the
primes, the powers of 2, the powers of 123456789—is in the sense of cardinality
just as “big” as N itself.
The idea of the proof is to list all the members of T :
t1 , t2 , t3 , . . . , t234 , . . . .
New countable sets from old. Proposition 1.17 gives one way of creating count-
able sets: start with a given countable set and take subsets. Propositions 1.18 and
1.19 describe some other operations that leave countability intact. You may be
surprised . . . .
Proposition 1.18. Let A and B be countable sets. The Cartesian product
A × B = {(a, b) | a ∈ A and b ∈ B}
is countable.
Proof: We’ll take both A and B to be countably infinite. In this case, we have
[Margin note: The proof is a little easier if either or both are finite.]
$A = \{a_1, a_2, a_3, \dots\}$ and $B = \{b_1, b_2, b_3, \dots\}$,
[Margin note: The matrix extends infinitely upward and to the right, like the
first quadrant in the xy-plane.]
and we can list all members of A × B in a “doubly-infinite matrix”:
... ... ... ... ...
(a1 , b3 ) (a2 , b3 ) (a3 , b3 ) (a4 , b3 ) ...
(a1 , b2 ) (a2 , b2 ) (a3 , b2 ) (a4 , b2 ) ...
(a1 , b1 ) (a2 , b1 ) (a3 , b1 ) (a4 , b1 ) ...
Our last trick is to “count off” all the entries in this gigantic array without
missing any. One way to do this is to start at lower left and proceed along
“northwest-pointing” diagonals:
[Margin note: Find these entries in the big matrix.]
Do you see the pattern? What matters is that every point (ai , bj ) appears once,
but only once, in our list. These facts imply that the function f : N → A × B
given by n ↦ nth entry in the list is the desired bijection.
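The counting-off procedure can be written as a short generator. This is a sketch of one possible ordering, not necessarily the exact list used in the proof: it assumes each diagonal is traversed from its lower-right end to its upper-left end, and it yields index pairs (i, j) standing for the entries $(a_i, b_j)$. The name `diagonal_pairs` is chosen here.

```python
from itertools import count, islice

def diagonal_pairs():
    """Yield index pairs (i, j), with i, j >= 1, one diagonal i + j = s at a time."""
    for s in count(2):               # s = i + j is constant on each diagonal
        for j in range(1, s):        # move up and to the left: j grows, i shrinks
            yield (s - j, j)

first = list(islice(diagonal_pairs(), 10))
print(first)
```

Every pair (i, j) lies on exactly one diagonal, and each diagonal is finite, so each pair is reached once and only once after finitely many steps.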
is also countable.
Proof (sketch): Again we assume that each Ai is countably infinite, and list its
elements: Ai = {ai,1 , ai,2 , ai,3 , . . . }. As in the preceding proof, we can list all
elements of all the Ai in a big “matrix”:
Just as before, this “matrix” has countably many entries, so we’re almost done.
But a minor subtlety needs attention: the Ai might have some elements in com-
mon, so some elements of the union might appear more than once in the “matrix.”
Luckily, this turns out not to matter, because the distinct elements of the union cor-
respond to some subset of the matrix entries, and we showed in Proposition 1.17
that such subsets are countable.
be the set of fractions a/i, where a ∈ Z. Now each $A_i$ is clearly countable (it is
in one-to-one correspondence with Z) and Q is the union of all the $A_i$.
[Margin note: A given rational, say 3/7, appears not just in $A_7$ but also
in $A_{14}$, $A_{21}$, and so on.]
Uncountable Sets
The preceding propositions imply that many apparently large sets, such as Q, are
in fact no larger in cardinality than N. Might R itself be countable? The answer
is hardly obvious, but it turns out to be no, as the German mathematician Georg
Cantor showed around 1873.
We’ll give a version of Cantor’s ingenious proof below, but first let’s consider
some striking implications of the theorem:
Corollary 1.22. The interval (0, 1) is uncountable. Every nonempty open interval
(a, b) is uncountable. The irrational numbers are uncountable.
All parts of the corollary say that the sets mentioned are, in cardinality, much
larger than the comparatively “sparse” set Q. Throw a dart at random at the real
line, and you’ll almost certainly hit an irrational.
Proofs of the corollaries. In Example 1 we showed that the intervals (0, 1),
(−2, 5), and (0, ∞) all have the same cardinality. Similar methods show that all
nonempty open intervals, including (−∞, ∞) (aka R), have the same cardinality.
[Margin note: We found explicit bijective functions between these intervals.]
If the set P of irrationals were countable, then R = P ∪ Q would be the union of
two countable sets, and hence countable itself.
Cantor’s “diagonal” proof. Cantor uses the idea of infinite decimal expansion
to show that the interval (0, 1) is uncountable. The proof is by contradiction. If
(0, 1) were countable, we could list all of its elements in an infinite sequence:
$x_1, x_2, x_3, \dots$. Now each $x_i$ has an infinite decimal expansion, of the form
$x_i = 0.d_{i1} d_{i2} d_{i3} d_{i4} d_{i5} \dots$, where the $d_{ij}$ are decimal digits, ranging from 0 to
9. Thus, we can write
Here comes the clever part. Cantor uses the diagonal entries
$e_1 \neq d_{11}, \quad e_2 \neq d_{22}, \quad e_3 \neq d_{33}, \quad e_4 \neq d_{44}, \quad \dots$
Each $e_i$ can be any of the nine digits other than $d_{ii}$, so there are plenty of ways to choose the $e_i$,
and therefore countless possible numbers $x_0$. [Margin note: Pun intended.] What matters is that $x_0$ differs from
x1 in the first digit, from x2 in the second digit, from x3 in the third digit, and
so on. Thus x0 is nowhere among the original xi , which contradicts our original
assumption.
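The diagonal construction is easy to imitate on a finite list of digit strings. The sketch below is only an illustration: the five sample strings and the particular rule for choosing each digit (use 5, unless the diagonal digit is 5, in which case use 4) are assumptions made here. Avoiding the digits 0 and 9 sidesteps the fact that some numbers have two decimal expansions.

```python
def diagonal_number(expansions: list[str]) -> str:
    """Given digit strings d_i1 d_i2 ..., return digits e_1 e_2 ... with e_i != d_ii."""
    digits = []
    for i, row in enumerate(expansions):
        d = int(row[i])                          # the diagonal digit d_ii
        digits.append("4" if d == 5 else "5")    # any digit other than d_ii would do
    return "".join(digits)

# Hypothetical first five digits of x_1, ..., x_5.
listed = ["50000", "71828", "41421", "57721", "69314"]
x0 = diagonal_number(listed)
print("0." + x0)

# x0 differs from the i-th listed string in the i-th digit.
print(all(x0[i] != listed[i][i] for i in range(len(listed))))
```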
Exercises
1. Assume (it’s true!) that the union F1 ∪ F2 of any two finite sets is finite.
Prove by induction that F1 ∪ F2 ∪ · · ·∪ Fn is finite for every positive integer
n and finite sets Fi .
2. Find a one-to-one correspondence f : A → B (expressed as a simple
formula) in each part below.
4. Let (a, b) and (c, d) be any two bounded intervals. Find a linear function
f : (a, b) → (c, d) that is one-to-one and onto. (Give an explicit formula
for f in terms of a, b, c, and d.)
5. Let A be a set with 42 elements and B a set with 43 elements, and let
f : A → B and g : B → A be functions. Can f be one-to-one? Onto?
What about g? Give examples of what can happen, and explain what can’t.
How is the pigeonhole principle involved?
6. Let S be a finite set. Use the pigeonhole principle to show that if f : S → S
is one-to-one, then f is also onto.
7. It can be shown that a set S is infinite if and only if there exists a function
f : S → S that is one-to-one but not onto. (The “if” part is the claim of
Problem 6.)
(a) Find a function f : N → N that is one-to-one but not onto. (There are
many possibilities.)
(b) Find a function g : R → R that is one-to-one but not onto. (There are
many possibilities.)
(c) Find a function h : R → R \ {1} that is one-to-one and onto.
(a) For i ∈ N, let $Q_i$ be the set of rationals p/q (written in reduced form),
for which $p^2 + q^2 < i$. How many members does $Q_3$ have? What
about $Q_{10}$? Can $Q_i = Q_{i+1}$ for some i? (Hints: Members of $Q_i$
correspond to some, but not all, of the points inside the circle $p^2 + q^2 = i$
in the pq-plane. For $Q_3$ and $Q_{10}$, just count these points.)
(b) Find an upper bound in terms of i for the number of elements in $Q_i$.
(c) Show that $Q_1 \subseteq Q_2 \subseteq Q_3 \subseteq \dots$ and that $\bigcup_{i=1}^{\infty} Q_i = \mathbb{Q}$.
12. Show that [0, 1] and (0, 1) have the same cardinality by finding a function
f : [0, 1] → (0, 1) that is one-to-one and onto.
Hints: The claim seems reasonable, but finding a good function f is tricky.
Here is one possibility: Set f (0) = 1/2, f (1) = 1/3, f (1/2) = 1/4,
f (1/3) = 1/5, f (1/4) = 1/6, f (1/5) = 1/7, etc. For all other x, set
f (x) = x. Think about it, draw a graph, etc.; show that this function does
the job. (We’ll show later, by the way, that no continuous function f can
have the desired properties.)
13. Show that [0, ∞) and (0, ∞) have the same cardinality.
14. Use results from these problems to explain why all intervals in R have the
same cardinality.
15. A “book” is a finite string of characters from some finite “alphabet,” such
as the 128-character ASCII system. Is the set B of all possible “books”
finite, countably infinite, or uncountable?
17. (This extended “problem” could be the basis of a possible project.) Suppose
we have two sets A and B and a one-to-one function f : A → B. Then it’s
reasonable to say that the cardinality of A is not greater than that of B; in
symbols, |A| ≤ |B|. (Think of |A| as the “size” of A, not as an absolute
value.) For instance, the one-to-one function f : N → Z given by f(n) = n
suggests, correctly, that |N| ≤ |Z|. In words, N is “not greater” than Z.
If there is also a one-to-one function g : B → A, then we can write both
|A| ≤ |B| and |B| ≤ |A|, and it’s natural to expect that A and B have the
same cardinality—i.e., |A| = |B|, or A ∼ B in our preferred notation.
The famous Cantor–Schröder–Bernstein theorem says that this is indeed so.
For any sets A and B and injective functions f : A → B and g : B → A,
we can construct from f and g a bijective function h : A → B.
(a) Find out (from books or online) exactly what the Cantor–Schröder–
Bernstein theorem says, and describe a proof in your own words.
(b) Let A = [0, 1] = B, and consider the functions f : [0, 1] → [0, 1]
given by f (x) = x/2 and g : [0, 1] → [0, 1] given by g(x) = x/2. Il-
lustrate the proof idea in the previous part by constructing from f and
g the advertised bijective function h. (The identity function h(x) = x
EXAMPLE 1. Which real numbers x satisfy (i) |x − 3| < |x + 2|? What about
(ii) |3x − 7| < 5?
The last version makes the meaning clear: x lies within distance 5/3 from 7/3.
Equivalently, x ∈ ( 2/3, 4 ). ♦
1.7. Numbers 102: Absolute Values 63
Absolute Truths
We collect some key properties of absolute values in two theorems. The first is
easy; proofs are omitted or left as exercises.
Theorem 1.23. Let x and y denote arbitrary real numbers; assume k > 0.
The next theorem is important enough to have its own name. We explain the name below.
Theorem 1.24 (Triangle inequality). Let x and y be any real numbers. Then
|x + y| ≤ |x| + |y| .
|x − y| ≥ | |x| − |y| | .
On proofs. The triangle inequality follows from several bits of Theorem 1.23.
From (a), we get
Variations on a theme. The triangle inequalities are often used in forms slightly
different from the “vanilla” versions in the theorem. Following are some exam-
ples; observe occasional uses of Theorem 1.23:
Note especially what (ii) says about distance: the trip from x to y is no longer
than the sum of the distances from x to a and from a to y. This should sound
reasonable—stopping at a enroute shouldn’t shorten the trip from x to y—and it
helps explain the allusion to triangles. Note also (iii), which traps |x − y| between
upper and lower bounds.
In fact, triangle-type inequalities hold in quite general mathematical settings,
some beyond the scope of this book. The basic triangle inequality |x + y| ≤
|x| + |y| holds for complex numbers x and y, for instance, if |x| denotes the length
of x considered as a vector in the plane. Generalizing in another direction, related
facts like
$|x_1 + x_2 + \cdots + x_{100}| \le |x_1| + |x_2| + \cdots + |x_{100}|$
and
$\left| \int_0^1 f(x)\,dx \right| \le \int_0^1 |f(x)|\,dx$
also hold, and are sometimes called triangle inequalities.
For |x − z| there is no upper bound (why?), but the reverse triangle inequality
gives a lower bound:
To estimate |xy| and |x/y| we notice first that 6.99 < |x| < 7.01 and 6.98 <
|y| < 7.02. Thus,
[Margin note: These bounds follow from the triangle inequality; common
sense works, too.]
$|xy| = |x||y| \le 7.01 \cdot 7.02 \approx 49.21 \quad\text{and}\quad \frac{|x|}{|y|} \le \frac{7.01}{6.98} \approx 1.004.$
♦
Exercises
1. It’s well known that | sin(x)| ≤ 1 and | cos(x)| ≤ 1 for all x. Assume this
and other familiar properties of sine and cosine in this problem.
(a) The triangle inequality says that |sin(x) + cos(x)| ≤ 2. Does a number
smaller than 2 actually work in this inequality?
(b) What does the reverse triangle inequality say about |sin(x) − cos(x)|?
Can this inequality be “improved”? Explain.
(c) Use the triangle inequality to show that $\sin(x)^2 + \cos(x)^2 \le 2$. Can
this inequality be “improved”? Explain.
5. Find all real values of x that satisfy the given inequality; express answers
in interval notation.
(a) |x − 3| < |x + 1|
(b) $x^2 - 4 < 0.07$
(c) $x^2 + 2x + 1 < 4$
9. The triangle inequality (TI) for three summands follows from the ordi-
nary TI:
|x1 + x2 + x3 | = |(x1 + x2 ) + x3 | ≤ |x1 + x2 |+|x3 | ≤ |x1 |+|x2 |+|x3 | .
Show by induction that the TI holds for n summands, where n ≥ 2.
10. The triangle inequality $|\vec{x} + \vec{y}| \le |\vec{x}| + |\vec{y}|$ also holds for vectors $\vec{x}$ and $\vec{y}$
in the plane, where $|\vec{x}|$ denotes the length of $\vec{x}$. Draw a picture to illustrate
this fact; make note of the triangle. When does equality hold?
11. This problem explores a connection between the absolute value and the
maximum or minimum of two quantities.
(a) Show that if x and y are real numbers, then
$\max\{x, y\} = \frac{x + y + |x - y|}{2}.$
(b) Find a similar formula for the minimum of x and y.
(c) Let f : I → R and g : I → R be functions defined on an interval
I. Find a formula involving the absolute value for the new function
h : I → R defined by h(x) = max{f (x), g(x)}.
(d) Let f(x) = sin x and $g(x) = e^x$, and let h(x) be defined as in the
preceding part. Use technology to plot f , g, and h together in an
interval that shows clearly what’s happening.
12. Let ε be a positive number, and suppose |x − 7| < ε and |y − 7| < ε. Show
that |x − y| < 2ε.
13. (a) Suppose that |x − 1| < 0.5. Show that x > 0.5.
(b) Suppose that $c \neq 0$ and $|x - c| < \frac{|c|}{2}$. Show that $|x| > \frac{|c|}{2}$.
Discuss, perhaps with a picture, why this is plausible. How are Riemann
sums involved? (No formal proof is possible until we define the integral!)
1.8. Bounds 67
1.8 Bounds
Theorems and proofs in real analysis often turn on boundedness. We’ll prove
later, for instance, that a continuous function defined on a closed and bounded
interval I = [a, b] is itself bounded on I. To make sense of such a claim
requires clear meanings for all its words, which include two somewhat different
instances of the b-word. In this section we start to unpack the idea and language
of boundedness in several settings.
[Margin note: We’ll also need to define “continuous,” of course.]
It is clear, for instance, that all three of the sets {1, 2, 3}, [1, 3], and (1, 3) are
bounded above by 3 and below by 1, while the interval (1, ∞) has the same lower
bound, but no upper bound.
Here are some basic notes on the idea and language of boundedness:
• Many choices: Upper and lower bounds are far from unique. For {1, 2, 3},
[1, 3], and (1, 3), for instance, −42 and 42 also work as lower and upper
bounds. These may seem unlikely choices, but in practice we sometimes
care more about the existence (or absence) of bounds than about particular
numerical values.
• Unboundedness: A set is unbounded if it is not bounded. The definition is
far from surprising, but deciding whether a set is bounded or unbounded
can be challenging.
[Margin note: And similarly for sets unbounded above or unbounded below.]
What do you think about
$1, \quad 1 + \frac{1}{2}, \quad 1 + \frac{1}{2} + \frac{1}{3}, \quad 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4}, \quad \dots$
and
$1, \quad 1 + \frac{1}{2}, \quad 1 + \frac{1}{2} + \frac{1}{4}, \quad 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8}, \quad \dots,$
for example? See the exercises.
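A little computation can help form a guess about the two sets in the last bullet. The sketch below (the helper `partial_sums` is a name chosen here) tabulates partial sums with terms 1/k and with terms $1/2^{k-1}$. Numerical output of this kind is evidence only; it cannot by itself settle whether a set is bounded.

```python
def partial_sums(term, how_many: int) -> list[float]:
    """Return the first how_many partial sums of term(1) + term(2) + ..."""
    total = 0.0
    sums = []
    for k in range(1, how_many + 1):
        total += term(k)
        sums.append(total)
    return sums

harmonic = partial_sums(lambda k: 1 / k, 10_000)           # 1, 1 + 1/2, 1 + 1/2 + 1/3, ...
geometric = partial_sums(lambda k: 1 / 2 ** (k - 1), 50)   # 1, 1 + 1/2, 1 + 1/2 + 1/4, ...

print(harmonic[9], harmonic[99], harmonic[9999])    # after 10, 100, 10000 terms
print(geometric[9], geometric[49])                  # after 10 and 50 terms
```

Compare how quickly each list of partial sums grows, and then try to explain what you see.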
EXAMPLE 1. Let S and T be nonempty sets of real numbers. Prove some basic
properties of bounded sets:
(c) Let |S| = {|s| | s ∈ S}. Then S is bounded if and only if |S| is bounded.
for all s and t in question, and the claims about boundedness follow. ♦
This may all seem, for now, like a lot of fuss over not much. Indeed, there is little
interesting to be said about sups and infs of simple sets, like bounded intervals.
We will see soon, however, that the existence of real sups and infs for more general
bounded sets—a property called completeness—is essential in the theory of real
analysis. Here, for the moment, are some simpler notes on the definition.
For I = (0, 1) and a = 1.003, for example, we can use δ = 0.003 (or any smaller
value of δ). Just as clearly, no positive δ works for I = (0, 1) and a = 0.
EXAMPLE 2. The interval I = (2, 3) is bounded away from π because |π − x| >
0.14 for all x ∈ (2, 3). By contrast, 3 ∉ I, but I is not bounded away from 3,
since I has members within any small distance δ from 3. The set N is bounded
away from π, but Q is not, since for any δ > 0, no matter how small, there are
rational numbers within δ of π. ♦
SOLUTION. The idea is that, since S is finite, one of the $s_i$ must be closest
to a. More precisely, let $d_i = |s_i - a|$ be the distance from $s_i$ to a. Then
$D = \{d_1, d_2, \dots, d_n\}$ is a finite set of positive numbers, and so has a minimum
member, say $d_1$. (Every finite set of numbers has largest and smallest elements;
see Fact 1.16, page 55.) Now it is clear that $\delta = d_1/2$ works in Definition 1.28. ♦
[Margin note: Any positive δ smaller than $d_1$ would work here.]
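The recipe in this solution is short enough to compute. The sketch below assumes that Definition 1.28 (not shown in this passage) asks for some δ > 0 with |s − a| > δ for every s in S; the function name `separation_delta` and the sample set are hypothetical.

```python
def separation_delta(S, a):
    """For a finite set S not containing a, return half the smallest distance to a."""
    distances = [abs(s - a) for s in S]     # the numbers d_i = |s_i - a|
    d_min = min(distances)                  # a finite set has a minimum element
    if d_min == 0:
        raise ValueError("a belongs to S, so S is not bounded away from a")
    return d_min / 2

S = [1, 2, 3, 5, 8]
a = 3.5
delta = separation_delta(S, a)
print(delta, all(abs(s - a) > delta for s in S))
```

The call to `min` is where finiteness matters: for an infinite set the distances need not have a smallest member.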
EXAMPLE 4. Let
the last quantity is an upper bound. (With more work we might find a smaller
upper bound.)
The function h behaves somewhat differently: although bounded on some
infinite intervals, such as [0.00017, ∞) and (−∞, −0.24), h is unbounded on
some finite intervals, like (0, 1) and (−0.24, 0). ♦
Exercises
1. Prove parts (a) and (c) of Example 1, page 68.
2. Show that if S ⊂ R is a finite set, then S contains both sup(S) and inf(S).
(Hint: See Fact 1.16, page 55.)
4. We said in this section that a set S of real numbers is bounded if and only
if there is some M > 0 such that |s| ≤ M for all s ∈ S. Equivalently,
S ⊆ [−M, M ].
(a) Suppose S has lower and upper bounds −5 and 3, respectively. Find
an M that satisfies the condition above.
(b) Suppose S has upper and lower bounds a and b, respectively. Find an
M that satisfies the condition above.
(c) Suppose M = 42 satisfies the condition above. Find upper and lower
bounds for S.
(d) Prove the statement at the beginning of this exercise.
5. Consider the functions $f(x) = e^x$ and g(x) = sin(x) and the set A =
[0, 10].
(a) Explain briefly why f is bounded below but not bounded above on R.
(b) Find a set A ⊆ R (as large as possible) such that f is bounded above
on A by M = 120.
(c) Find a set A ⊆ R (as large as possible) such that f is bounded below
on A by m = −1.
(d) Find a set A ⊆ R (as large as possible) such that f is bounded above
on A by M = 0.
(e) Let A ⊂ R be any set with 1234 points. Explain briefly why f is
bounded above and below on A.
9. For each set S following find—if possible—inf(S) and sup(S). Are any of
these maxima or minima? No proofs needed.
(a) $S = \{ n \in \mathbb{N} \mid n^2 < 10 \}$
(b) S = {p ∈ N | p is prime}
(c) S = {p ∈ N | p is an even prime}
(d) S = {sin(x) + 42 | x ∈ R}
(e) $S = \{ r \in \mathbb{Q} \mid r^2 \le 2 \}$
(f) $S = \{ x \in \mathbb{R} \mid x^2 \le 2 \}$
10. Is each of the following sets bounded? If so, give upper and lower bounds.
If not, why not? (No proofs needed; it’s OK to use ideas from elementary
calculus.)
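(a) $\left\{ 1,\ 1 + \frac{1}{2},\ 1 + \frac{1}{2} + \frac{1}{3},\ 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4},\ \dots \right\}$
(b) $\left\{ 1,\ 1 + \frac{1}{2},\ 1 + \frac{1}{2} + \frac{1}{4},\ 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8},\ \dots \right\}$
(c) $\left\{ \frac{\ln 1}{1},\ \frac{\ln 2}{2},\ \frac{\ln 3}{3},\ \frac{\ln 4}{4},\ \dots \right\}$
(d) $\left\{ \frac{2^1}{1^2},\ \frac{2^2}{2^2},\ \frac{2^3}{3^2},\ \frac{2^4}{4^2},\ \dots \right\}$
(e) {tan(x) | x ∈ R}
(f) $\{ x \in \mathbb{R} \mid x^3 - 5x^2 + 7x - 1234 = 0 \}$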
11. Consider the sets EW of all non-hyphenated English words and WITP of
all words in this problem. Let f be the function with rule
(a) Is f bounded above and/or below on EW? Can you give good upper
and lower bounds? Explain.
(b) Find upper and lower bounds for f on the set WITP.
12. Consider the real-valued function g whose domain is the set GCES of gram-
matically correct English sentences, and has rule
Is g bounded on GCES? Can you give upper and lower bounds? What
bounds apply to g on the subset SITP of sentences in this problem?
13. In each part following, describe the sets S ⊆ R with the given property.
14. Consider the interval I = [−10, 10]. Find good upper and lower bounds for
each of the following functions defined on I. Use the triangle inequality or
ideas from elementary calculus.
(a) $f(x) = \frac{1}{1 + x^2}$.
(b) $g(x) = 3x^2 + 2x - 7$.
(c) $h(x) = \sin^2 x + \cos^2 x + x$.
15. Do the preceding problem, but replace the interval I with the interval J =
[−K, K], where K > 0. (Answers may depend on K, of course.)
16. Let f (x) = Ax + B, where A and B are any real constants, and let I be
the interval [−K, K], where K > 0. Find sharp (i.e., best possible) upper
and lower bounds for f on I. (Hint: Handle the cases A > 0, A = 0, and
A < 0 separately.)
(a) Can S be both (i) unbounded; and (ii) bounded away from 0? Either
give an example or explain why none is possible.
(b) Show that S is bounded away from 0 if and only if the set T = {1/s |
s ∈ S} is bounded.
(c) Show that S is bounded away from 0 if and only if there is an open
interval I = (a, b) with 0 ∈ I and S ∩ I = ∅.
18. (This problem alludes to Definition 1.28, page 69.) In each of the following
cases either give an example or say why none can exist.
(a) A set of positive numbers that is bounded away from one but not from
zero.
(b) A number a such that Q is bounded away from a.
(c) A set S that is bounded away from a = 42 but not bounded away
from any other integer.
19. This problem is about the intervals I = [0, 1] and J = (2, 3).
Like other succinct mathematical statements, this one can use some unpacking:
1.9. Numbers 103: Completeness 75
• What about infs? For brevity, the completeness axiom mentions only sups.
But the story is similar for infs: Every nonempty set that is bounded below
has an infimum. The inf version, moreover, follows easily from the sup
version. See the exercises.
• The real advantage: The completeness axiom guarantees that every bounded
set of rationals has a supremum, which may or may not be a rational number.
By contrast, the supremum of a bounded set of reals must be real. In
this sense R is complete, while Q is incomplete.
[Margin note: Rational numbers are also real.]
• In or out? We know already that the sup and the inf of a given set may
or may not lie within the set. It is pretty obvious, for instance, that the
half-closed interval (1, 2] contains its supremum but not its infimum. For
more complicated sets, things may be less clear. Consider, for example,
the set $S = \{ r \in \mathbb{Q} \mid r^2 \le 152399024 \}$. A little thought shows that S
is bounded, so the completeness axiom guarantees that a sup and an inf
exist. We might also guess (correctly!) that $\sup(S) = \sqrt{152399024}$ and
$\inf(S) = -\sqrt{152399024}$, but do these numbers lie inside or outside S?
This is far from clear at a glance, and the completeness axiom is no help
at all. To decide, we’d need to discover whether $\sqrt{152399024}$ is rational,
and this takes a little effort.
[Margin note: The question boils down to whether 152399024 is a perfect
square. It isn’t . . . quite.]
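The "little effort" can be delegated to exact integer arithmetic. The sketch below uses Python's `math.isqrt`, which returns the floor of the square root of a nonnegative integer without any floating-point rounding; the variable names are chosen here for illustration.

```python
from math import isqrt

n = 152399024
r = isqrt(n)                         # largest integer whose square is at most n
print(r, r * r, (r + 1) ** 2)        # the two perfect squares that bracket n
print(r * r == n)                    # is n itself a perfect square?
```

The output shows that n falls just one short of a perfect square. Combined with the fact that a positive integer with a rational square root must be a perfect square, this settles whether sup(S) and inf(S) belong to S.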
[Figure: number line with labels a, 2a, 3a, b, na and tick values 0, 0.01, 5003.47.]
Nested Intervals
A collection of intervals I1 , I2 , I3 , . . . is called nested if
I1 ⊇ I2 ⊇ I3 ⊇ I4 ⊇ . . . .
Observe that, although each interval In contains infinitely many points, the inter-
section of the full collection is empty.
The story is different for closed intervals. The nested collection
has exactly one point of intersection—the number 0. The following theorem de-
scribes the situation in general; completeness is the key.
Theorem 1.31 (The nested intervals theorem). Consider a nested infinite
collection
$I_1 \supseteq I_2 \supseteq I_3 \supseteq I_4 \supseteq \cdots$
of closed and bounded intervals. Then the intersection
$I_1 \cap I_2 \cap I_3 \cap I_4 \cap \cdots$
contains at least one point. If the intervals’ lengths shrink to zero, then the
intersection is a single point.
Proof (sketch): We sketch the main idea, leaving some details to exercises. If we
write $I_n = [a_n, b_n]$ for all positive integers n, then the nesting condition means
$a_1 \le a_2 \le a_3 \le \cdots \le b_3 \le b_2 \le b_1.$
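To see the endpoint inequalities in a concrete case, here is a small illustrative sketch (not from the text). It builds nested closed intervals by repeated halving around $\sqrt{2}$, starting from the assumed interval [1, 2]; the function name `nested_intervals` is hypothetical, and exact rational arithmetic is used so the comparisons are reliable.

```python
from fractions import Fraction

def nested_intervals(steps):
    """Return a list of closed intervals (a, b), each containing sqrt(2),
    each obtained by keeping one half of the previous interval."""
    a, b = Fraction(1), Fraction(2)        # I_1 = [1, 2], since 1^2 <= 2 <= 2^2
    intervals = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        if mid * mid <= 2:
            a = mid                        # sqrt(2) lies in the right half
        else:
            b = mid                        # sqrt(2) lies in the left half
        intervals.append((a, b))
    return intervals

for a, b in nested_intervals(5):
    print(float(a), float(b))
```

The left endpoints never decrease, the right endpoints never increase, and the lengths halve at each step, which is the situation in which the theorem promises a single point of intersection.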
(a) The interval (x, y) contains at least one rational and one irrational number.
(b) The interval (x, y) contains infinitely many rationals and infinitely many
irrationals.
Proof: Claim (b) looks much stronger than (a), but showing that (a) implies (b)
is surprisingly easy. Let’s suppose, toward contradiction, that (a) holds but (x, y)
contains only finitely many rationals $r_1, r_2, \ldots, r_n$. Then one of these, say $r_n$,
is largest. [Margin note: Recall: Every nonempty finite set of reals has a
maximum element.] But then the interval $(r_n, y)$ must contain no rationals,
which contradicts (a). Exactly the same proof applies to irrational numbers,
so we conclude that (a) implies (b).
To prove (a) we’ll use Theorem 1.30. For convenience, we’ll handle here only
the special case 0 ≤ x < y, and mop up remaining cases as exercises. If we
set $\epsilon = y - x$, then the squeezing-in part of Theorem 1.30 says that for some
integer n the rational number $r = 1/n$ lies in the interval $(0, \epsilon)$, as does the
irrational number $p = r/\sqrt{2}$. Note that $0 < p < r < y - x$.
To finish the proof, we show that at least one of r, 2r, 3r, 4r, . . . (all are
rational) and at least one of p, 2p, 3p, 4p, . . . (all are irrational) lie inside (x, y).
This is intuitively reasonable—for both sequences, the “jumps” are too small to
miss (x, y) entirely. To put it formally, say for p, 2p, 3p, 4p, . . . , consider the
78 1. Preliminaries: Numbers, Sets, Proofs, and Bounds
and so $x < n_0 p < y$, as desired. The proof that $x < m_0 r < y$ for some positive
integer $m_0$ is almost identical.
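The rational half of this argument is constructive, and a short sketch can make the "jumps too small to miss" idea concrete. The code below is an illustration under stated assumptions, not the author's procedure: the name `rational_between` is hypothetical, the inputs are taken to be exact rationals with 0 ≤ x < y, and the multiple of r is found by choosing the smallest one that exceeds x.

```python
from fractions import Fraction
import math

def rational_between(x, y):
    """For rationals 0 <= x < y, return a multiple m*r of r = 1/n lying in (x, y)."""
    eps = y - x
    n = math.floor(1 / eps) + 1        # guarantees r = 1/n < eps
    r = Fraction(1, n)
    m = math.floor(x / r) + 1          # smallest positive integer m with m*r > x
    return m * r                       # m*r <= x + r < x + eps = y

x, y = Fraction(1, 3), Fraction(1, 2)
q = rational_between(x, y)
print(q, x < q < y)
```

Because (m − 1)r ≤ x and r < y − x, the chosen multiple cannot jump past y.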
Exercises
1. The list $1/2, 1/3, 1/4, \ldots, 1/n, \ldots$ suggests an explicit “recipe” for an endless
collection of rational numbers between 0 and 1.
(a) Give a similar recipe for an endless list of rationals between 0 and $1/507$.
1
(b) Give a similar recipe for an endless list of rationals between 2 and 13 .
(c) Let a and b be any two rationals, with a < b. Give a recipe (involving
a and b) for an endless list of rationals between a and b.
3. Use the unboundedness of N (the first part of Theorem 1.30) to prove the
squeezing-in principle (the second part of Theorem 1.30).
5. (a) Show with a counterexample that the following statement (which re-
sembles the Archimedean principle) is false: If a and b are real num-
bers with a < b, then there exists a positive integer n such that na > b.
(b) Prove (using the Archimedean principle) or disprove (with a coun-
terexample) the following statement: If a and b are nonzero real num-
bers with a < b, then there exists an integer n such that na > b.
(c) Prove without using the Archimedean principle: If a and b are positive
numbers with a < b, then there exists a real number n such that na >
b.
6. Part (a) of Theorem 1.32 says that every interval (x, y) contains at least one
rational and one irrational. The proof given there assumed that 0 ≤ x < y.
(a) Show that the same result holds if x < 0 < y. (Hint: Apply the result
already proved to the new interval (X, Y ) = (0, y).)
(b) Show that the result still holds if x < y ≤ 0. (Hint: Look at the new
interval (X, Y ) = (−y, −x).)
(a) Is the italicized statement true or false if R is replaced (in both places)
by Z? Explain.
(b) Is the italicized statement true or false if R is replaced (in both places)
by Q? Explain.
8. Use the completeness axiom for sups to prove the analogous result for infs:
If $S \subset \mathbb{R}$, $S \ne \emptyset$, and S is bounded below, then there is $\alpha \in \mathbb{R}$ such
that $\alpha = \inf(S)$. (Hint: Given a nonempty set S, look at the new set −S
defined by $-S = \{-s \mid s \in S\}$. Apply the completeness axiom to −S and
interpret the result.)
(a) Each In is open, the In are nested, and the intersection is the single
point 3.
(b) Each In is closed, no two In are equal, and the intersection is the
interval [−3, 42].
(c) Each In is open, no two In are equal, and the intersection is the inter-
val [−3, 42].
10. Show that every closed interval [a, b] is the intersection of a nested I1 ⊃
I2 ⊃ . . . of open intervals.
12. This problem is about some details in the proof of Theorem 1.31, page 76;
see the notation there.
13. We said in this section that completeness of R guarantees that 2 has a real
square root.
(a) Let $S = \{x \mid x^2 < 2\}$. Explain why S has a least upper bound; call
it β. We’ll show that $\beta^2 = 2$.
(b) Suppose toward contradiction that $\beta^2 < 2$. Now choose any h > 0
such that (i) 0 < h < 1 and (ii) $h \le (2 - \beta^2)/(2\beta + 1)$. Show that
$(\beta + h)^2 < 2$. Hint: Use the fact that $h^2 < h$.
(c) The preceding calculation leads to a contradiction. Identify it, and
conclude $\beta^2 < 2$ is impossible.
(d) Prove by contradiction that $\beta^2 > 2$ is also impossible. (Hint: Set
$k = (\beta^2 - 2)/(2\beta)$ and consider $\beta - k$.)
14. Imitate Problem 13 to show that every positive number a has a unique pos-
itive square root.
15. Use the result of Problem 14 to show that every positive number a has
unique positive roots $\sqrt{a}, \sqrt[4]{a}, \sqrt[8]{a}, \sqrt[16]{a}$, etc.
16. In the spirit of Problem 14, it’s true that every positive number a has a
unique positive cube root. We can show this by defining $S = \{x \mid x^3 < a\}$,
setting $\beta = \sup(S)$, and proving that $\beta^3 = a$. Complete details below.
(a) Suppose toward contradiction that $\beta^3 < a$. Choose h with 0 < h < 1
and $h < (a - \beta^3)/(3\beta^2 + 3\beta + 1)$. (Why is this possible?) Show that
$(\beta + h)^3 < a$, a contradiction.
(b) Show that $\beta^3 > a$ also leads to a contradiction. To do so, choose an
appropriate positive value of k and show that $(\beta - k)^3 > a$.
Here is a slightly subtler fact: Every infinite string of decimal digits corre-
sponds to a unique real number β. Use the completeness axiom to explain
why. (Hint: For convenience, consider only strings of the form 0.d1 d2 d3 d4 d5 . . . ,
where each di is a decimal digit. Use these digits to construct a bounded
set with supremum β.)
CHAPTER 2
Sequences and Series
Using the definition: notes on proofs. Our concise definition can lead, ideally,
to equally concise proofs. But such polished products can seem both impressive
and mysterious, like a Ferrari with the hood up. To raise the hood a bit, we’ll
often precede formal proofs with informal discussion—but keep the two separate.
We start with some positive and negative examples.
2.1. Sequences and Convergence 85
EXAMPLE 1. Prove that the sequence $\{a_n\}$ with $a_n = 1/n$ converges to zero. In
symbols, $\lim_{n \to \infty} 1/n = 0$.
Pre-proof discussion. The fact that $a_n \to 0$ seems obvious; how could it be
otherwise? To invoke the definition, we work with the desired inequality,
$|a_n - L| < \epsilon$. Here we have
$|a_n - L| = \left|\frac{1}{n}\right| = \frac{1}{n}$, and $\frac{1}{n} < \epsilon \iff n > \frac{1}{\epsilon}$.
(The first calculation works because, but only because, both n and $\epsilon$ are
positive.) Now we’re getting somewhere: $N = 1/\epsilon$ does what the definition
requires, and we’re ready for a slick and seamless proof. [Margin note:
Clearing out pesky absolute value bars is a big help.]
Proof: Let $\epsilon > 0$ be given. Set $N = 1/\epsilon$. This N “works” because if n > N,
then
$|a_n - L| = \left|\frac{1}{n}\right| = \frac{1}{n} < \frac{1}{N} = \epsilon,$
which is what the definition requires.
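A numerical spot check cannot replace the proof, but it can confirm that the choice N = 1/ε behaves as claimed for particular values. The sketch below is an illustration only; the helper name `works` and the number of terms tested are arbitrary choices, and exact fractions are used to avoid floating-point borderline cases.

```python
from fractions import Fraction
import math

def works(eps, how_many=1000):
    """Check |1/n - 0| < eps for the first `how_many` integers n beyond N = 1/eps."""
    N = 1 / eps
    first = math.floor(N) + 1          # smallest integer n with n > N
    return all(abs(Fraction(1, n) - 0) < eps
               for n in range(first, first + how_many))

for eps in (Fraction(1, 10), Fraction(1, 1000), Fraction(3, 7)):
    print(eps, works(eps))
```

Only finitely many terms are tested, so this is evidence rather than proof; the inequality chain in the proof is what covers every n > N.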
and
$\frac{13}{n+5} < \epsilon \iff \frac{13}{\epsilon} < n + 5 \iff \frac{13}{\epsilon} - 5 < n.$
So $N = \frac{13}{\epsilon} - 5$ does the job, and we’re ready for another brief proof.
Points in a plane. Terms of {an } can also be viewed as points (n, an ) on the
graph of the function a : N → R. Figure 2.2 offers glimpses of two sequences,
{1/n} and {sin n}. One converges to zero (as we’ve proved!); the other looks
unlikely to converge to anything. [Margin note: Proving this is another matter.]
Such graphs can reveal a lot, but never everything: a sequence has infinitely
many terms, and a picture shows only a tiny sample.
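[Figure: a number line marking L − ε, L, and L + ε.]
[Figure 2.2: (a) A look at the sequence {1/n}. (b) A look at the sequence {sin n}. Horizontal axes show n from 10 to 50.]
[Figure 2.3: sequence points plotted against n (from 0 to 50), with horizontal lines at L − ε, L, and L + ε.]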
Sequence graphs can also show how ǫ and N interact. The idea is to express
the key definition in graphical language: for any ǫ > 0, no matter how small,
there is some N on the (horizontal) n-axis so that all graph points to the right of
N lie inside an “ǫ-band” around the horizontal line y = L, as Figure 2.3 suggests.
Properties of Sequences
Sequences, like other real-valued functions, may or may not have various behav-
ioral properties. Here are some typical definitions:
The sequence {1/n}, for example, is strictly decreasing and bounded (above and
below), while the Fibonacci sequence is increasing and unbounded (above).
The following two theorems link these properties to convergence and diver-
gence. Neither statement is surprising, but the formal proofs are nice exercises—
and left as exercises—in using the definition. Some pre-proof discussion follows
each result.
Theorem 2.3. If a sequence {an } is (i) monotone, and (ii) bounded, then {an }
converges.
Theorem 2.4. If a sequence {an } converges, then {an } is bounded both above
and below.
Discussion. But for the annoying denominator, $\{z_n\}$ resembles $\{1/n\}$, which
converges to zero (see Example 1). To show that $z_n \to 0$, too, for given $\epsilon > 0$,
we need to find N for which
$|z_n - 0| = \left|\frac{1}{n + \sqrt{n+1} + 5}\right| = \frac{1}{n + \sqrt{n+1} + 5} < \epsilon$
whenever n > N. This looks clumsy, but a simple inequality brings radical
improvements:
$|z_n - 0| = \frac{1}{n + \sqrt{n+1} + 5} < \frac{1}{n}.$
Now it is easy to see that
$\frac{1}{n} < \epsilon \iff n > \frac{1}{\epsilon},$
which means that $N = 1/\epsilon$ “works.” We assemble the parts in the concise proof.
Proof. Let $\epsilon > 0$ be given. Set $N = 1/\epsilon$. If n > N then [Margin note: Check
each step; note the key inequality.]
$|z_n - 0| = \left|\frac{1}{n + \sqrt{n+1} + 5}\right| = \frac{1}{n + \sqrt{n+1} + 5} < \frac{1}{n} < \frac{1}{N} = \epsilon.$
Discussion. This time the key quantity $|w_n - L|$ takes the form
$|w_n - L| = \left|\frac{4n}{2n - 85} - 2\right| = \left|\frac{170}{2n - 85}\right|;$
[Margin note: Check the final calculation.]
we want
$|w_n - L| = \left|\frac{170}{2n - 85}\right| < \epsilon$
to hold for large n. It would be nice to drop the absolute value; alas, the
denominator, 2n − 85, may be negative. The good news is that this happens only
for n ≤ 42. If n > 42, then we can indeed drop the absolute value in good
conscience, and solve our inequality without undue fuss: [Margin note: But
check all algebra carefully.]
$\left|\frac{170}{2n - 85}\right| = \frac{170}{2n - 85} < \epsilon \iff \frac{170}{2\epsilon} + \frac{85}{2} < n.$
The last inequality is what we wanted: a value of n beyond which $|w_n - L| < \epsilon$.
Here comes the proof.
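As a sanity check on the algebra in the discussion, the threshold $170/(2\epsilon) + 85/2$ can be tested for specific values of ε. This is a hedged sketch, not the book's proof: the function names are hypothetical, ε is assumed to be an exact rational, and only finitely many terms are examined.

```python
from fractions import Fraction
import math

def w(n):
    return Fraction(4 * n, 2 * n - 85)     # the sequence from the discussion; limit L = 2

def threshold(eps):
    # Candidate N from the discussion; note it always exceeds 42.
    return Fraction(170) / (2 * eps) + Fraction(85, 2)

def works(eps, how_many=2000):
    first = math.floor(threshold(eps)) + 1  # smallest integer n with n > N
    return all(abs(w(n) - 2) < eps for n in range(first, first + how_many))

eps = Fraction(1, 10)
print(float(threshold(eps)), works(eps))
print(abs(w(892) - 2) < eps)   # n = 892 lies just below the threshold for this eps
```

For ε = 1/10 the threshold is 892.5, and the last line shows that the term just below it already fails the inequality, so for this ε the threshold cannot be lowered.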
Exercises
1. If a sequence {xn } converges, then for any given ǫ > 0 there is some N
that “works” in the sense of the definition. For the sequence {1/n} the
following table shows values of N associated with values of ǫ:
Make a similar table (same values of ǫ) for each sequence following. Try
to choose N as small as possible.
(a) $\left\{ \frac{1}{n^2} \right\}$
(b) $\left\{ \frac{1}{\sqrt{n}} \right\}$
(c) $\left\{ \frac{1}{1 + \ln n} \right\}$
2. This problem is about ǫ–N tables for convergent sequences, like those in
Problem 1, with ǫ = 1, 0.1, 0.01, 0.001 in the first row.
(a) Explain why all entries in the second row can be the same.
(b) Consider the sequence {xn } with xn = 0 for all n. What goes in the
second row?
(c) Consider the sequence {yn } with yn = 1/d(n), where d(n) is the
number of base-ten digits in n. (For example, y123456 = 1/6.) What
goes in the second row?
(a) Show that there is some N such that xn ∈ (4.9, 5.1) for all n > N .
(b) Show that there is some N such that xn > 4.999 for all n > N .
(c) Show that the set {xn | xn > 6} is finite.
9. Prove that each of the following sequences converges. (Example 2, page 85,
is similar.)
(a) $\{a_n\}$, with $a_n = \frac{2n}{3n + 5}$.
(b) $\{b_n\}$, with $b_n = \frac{2n + 300000}{3n + 5}$.
(c) $\{c_n\}$, with $c_n = \frac{2n}{3n^2 + 5}$.
10. Guess and then prove a limit for each sequence following. (Examples 4 and 5
may be helpful.)
(a) $\{a_n\}$, with $a_n = \frac{2n}{3n - 5}$.
(b) $\{b_n\}$, with $b_n = \frac{2n}{3n + \sin n + 5}$.
(c) $\{c_n\}$, with $c_n = (-1)^n \frac{2n}{3n^2 + 5}$.
11. Suppose that {an } converges to 1. Prove (use ǫ and N , not theorems) that
{17an} converges to 17.
12. Suppose {xn } is monotone decreasing and inf{xn } = 17. Show that xn →
17.
13. Convert the informal discussion after Theorem 2.3, page 88, into a concise
proof that a bounded, decreasing sequence converges.
14. Convert the informal discussion after Theorem 2.4, page 88, into a concise
proof.
16. From any given sequence $\{x_n\}$ we can form the related sequence $\{y_n\} =
\{5x_n + 2\}$. Use the definition of convergence to show that if $\{x_n\}$ converges
to 42, then $\{y_n\}$ converges to ____. (First fill in the blank.)
18. Show that every real number β is the limit of an increasing sequence of
rational numbers. Hint: One approach uses infinite decimal expansion; see
Exercise 18, page 80.
19. Some sequences {xn } have ǫ–N tables (in the sense of Exercise 1) of the
following form:
21. Consider the sequence {xn } given by xn = 1/n. Decide whether each of
the following sentences is true; explain answers briefly.
22. Like Problem 21, but with the sequence $\{x_n\}$ given by $x_n = \frac{(-1)^{n+1}}{n}$.
(a) The sequence {xn } is bounded above by 2.
(b) The sequence {xn } is bounded below.
(c) If n > 42, then |xn | < .01.
(d) There is some integer N such that |xn | < .0001 for all n such that
n > N.
(e) There is some integer N such that |xn − xn+1 | < .0001 for all n such
that n > N .
(f) There is some integer N such that |xn | > 0.001 for all n such that
n > N.
(g) For every positive number ǫ there is some integer N such that |xn | < ǫ
for all n such that n > N .
23. Like Problem 21, but with the sequence $\{x_n\}$ given by $x_n = \frac{\sin(n)}{n}$.
(a) The sequence {xn } is bounded above by 2.
(b) The sequence {xn } is bounded below.
(c) If n > 42, then |xn | < .01.
(d) There is some integer N such that |xn | < .0001 for all n such that
n > N.
(e) There is some integer N such that |xn − xn+1 | < .0001 for all n such
that n > N .
(f) There is some integer N such that |xn | > 0.001 for all n such that
n > N.
(g) For every positive number ǫ there is some integer N such that |xn | < ǫ
for all n such that n > N .
24. Suppose that {xn } is unbounded above. Use the definition of convergence
to show that {xn } does not converge to L = 1000.
25. For each sequence, guess a limit L (e.g., try large values of n), if you think
one exists. Then complete an associated ǫ–N table like the one below.
(There are many possible values for N .)
(a) If $a_n = \frac{1}{\sqrt{n}}$, then L = ____.
(b) If $a_n = \frac{1}{n^2}$, then L = ____.
(c) If $a_n = \frac{\sin n}{n^2}$, then L = ____.
(d) If $a_n = \frac{3n}{n + 2}$, then L = ____.
(e) If $a_n = \min\{n, 42\}$, then L = ____.
26. (Do Problem 25 first.) Write a brief sentence or two in each part.
(a) Look at your rightmost N -entry in any table above. Could you cor-
rectly use this same entry in every N -position in that table?
(b) Suppose you’ve found “good” N -entries for a table like these. Would
your table still be OK if you double every N -entry but leave the ǫ-
entries alone? What if you double the ǫ-entries and leave the N -
entries alone?
as desired.
(a) Show by induction that $f_n > 1.5^n$ for all n ≥ 11. (It is easy to check
with technology that the inequality is false for smaller n.)
(b) It can also be shown by induction (as in the preceding problem) that
$f_n > 1.6^n$ for all “sufficiently large” n. Use technology to decide
which n have this property.
28. Consider the sequence $\{g_n\}$ defined by $g_1 = 1$ and $g_{n+1} = 1 + \frac{1}{g_n}$.
(a) Write out the first few terms of the sequence to see the Fibonacci
numbers pop up.
(b) Prove by induction that gn ≤ 2 for all n.
(c) Observe (with technology) that $\{g_n\}$ appears to hop back and forth
across the number $\phi = (1 + \sqrt{5})/2 \approx 1.618$ (this is the famous golden
ratio). Prove this by showing (algebraically; no induction needed) that
(i) if $g_n > \phi$ then $g_{n+1} < \phi$; and (ii) if $g_n < \phi$ then $g_{n+1} > \phi$.
29. Following are several meaningful (if possibly clumsy) sentences about a
sequence {an }. Describe in your own words what each sentence means
about {an }. Do any of these sentences imply any others?
$\frac{3n + 2}{n + 5} = \frac{3 + 2/n}{1 + 5/n} = \frac{3 \cdot 1 + 2 \cdot \frac{1}{n}}{1 + 5 \cdot \frac{1}{n}},$
which shows the original sequence as a combination of the much simpler
sequences {1} and {1/n}, which converge to 1 and 0, respectively. Now
Theorem 2.5 implies that
$\frac{3 \cdot 1 + 2 \cdot \frac{1}{n}}{1 + 5 \cdot \frac{1}{n}} \to \frac{3 \cdot 1 + 2 \cdot 0}{1 + 5 \cdot 0} = 3,$
2.2. Working with Sequences 97
as expected.
We just assumed, of course, that {1} and {1/n} do indeed converge to one
and zero. These claims need proof, too, but the proofs are easy—and need to be
done just once. ♦
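The algebraic rewriting in this example can be checked mechanically. The sketch below is illustrative only (the function names `a` and `rewritten` are hypothetical): it confirms with exact fractions that the two forms of the nth term agree, and then prints a few terms to show them approaching 3.

```python
from fractions import Fraction

def a(n):
    return Fraction(3 * n + 2, n + 5)            # original form of the nth term

def rewritten(n):
    inv = Fraction(1, n)                         # the simpler sequence {1/n}
    return (3 * 1 + 2 * inv) / (1 + 5 * inv)     # combination of {1} and {1/n}

# The two expressions agree term by term (checked here for n = 1, ..., 200).
print(all(a(n) == rewritten(n) for n in range(1, 201)))

for n in (10, 1000, 10 ** 6):
    print(n, float(a(n)))
```

The printed values suggest the limit; it is Theorem 2.5 that actually justifies passing to the limit in each piece.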
EXAMPLE 2. Let $\{a_n\}$ and $\{b_n\}$ be divergent sequences, and let $\{c_n\}$ be a
convergent sequence. Can the sum $\{a_n + b_n\}$ converge? Can it diverge? What
about $\{a_n + c_n\}$?
SOLUTION. Theorem 2.5 says nothing about sums of divergent sequences. But
easy examples show that $\{a_n + b_n\}$ can either converge or diverge. With
$a_n = n$ and $b_n = \pi - n$, for example, we get $a_n + b_n = \pi$ for all n, so $\{a_n + b_n\}$
converges. [Margin note: Find your own example where $\{a_n + b_n\}$ diverges.]
Theorem 2.5 does help with $\{a_n + c_n\}$; let’s call it $\{d_n\}$ for the moment. If
$\{d_n\}$ were convergent, then the difference $\{d_n - c_n\} = \{a_n\}$ would converge,
too, contradicting our assumption. Thus, $\{a_n + c_n\}$ diverges. ♦
which do hold for large n, by hypothesis. The trick is to parlay these simpler
inequalities into proofs of their more complicated cousins. The triangle inequality
will come in very handy; watch for it in the following proof. [Margin note:
Watch for a technical trick, too.]
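Proof (for sums): Let $\epsilon > 0$; we need to choose N so that $|(a_n + b_n) - (a + b)| < \epsilon$
whenever n > N. Set $\epsilon' = \epsilon/2$; note $\epsilon' > 0$. Because $a_n \to a$, there is a number
$N_1$ that “works” for $\epsilon'$:
$|a_n - a| < \epsilon'$ whenever $n > N_1$.
Similarly, because $b_n \to b$, there is a number $N_2$ with
$|b_n - b| < \epsilon'$ whenever $n > N_2$.
Now let N be the larger of $N_1$ and $N_2$. If n > N, then both $n > N_1$ and $n > N_2$,
and so
$|(a_n + b_n) - (a + b)| = |(a_n - a) + (b_n - b)|$
$\le |a_n - a| + |b_n - b| < \epsilon' + \epsilon' = \epsilon,$
where we used the triangle inequality at the line break. This shows that N works
in the desired sense, and completes the proof.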
Working with products. To prove our claim about products we will use the
corresponding properties for sums (just proved) and for constant multiples (left
as an exercise). The proof starts with a standard analyst’s trick, subtracting and
adding the same quantity: [Margin note: We’ll see this again.]
$a_n b_n = a_n b_n - a_n b + a_n b = a_n (b_n - b) + a_n b.$
The result for constant multiples shows that an b → ab; we’ll be done if we can
also show that an (bn − b) → 0. The key insight, as we will see in the formal
proof, is that the {an } are bounded.
Proof (for products): First we write an bn = an (bn − b) + an b. By the result for
sums, it’s enough to show that
an b → ab and an (bn − b) → 0.
That $a_n b \to ab$ follows from the property of constant multiples that we assumed.
[Margin note: And left as an exercise.]
To show $a_n (b_n - b) \to 0$, observe first that the convergent sequence $\{a_n\}$ is
bounded (Theorem 2.4, page 88, says so); in other words, there exists M > 0
with $|a_n| \le M$ for all n. Now let $\epsilon > 0$ be given; note that $\epsilon/M > 0$, too. Since
$b_n \to b$, there exists N such that $|b_n - b| < \epsilon/M$ when n > N. This N works
for $\{a_n (b_n - b)\}$, because if n > N, then
$|a_n (b_n - b) - 0| = |a_n| \, |b_n - b| \le M \, |b_n - b| < M \cdot \frac{\epsilon}{M} = \epsilon,$
as desired.
Squeezing. The algebraic methods of Theorem 2.5 don’t immediately help with
the sequence {sin n/n}, which has a trigonometric ingredient. Still, we expect
the sequence to converge to zero because the numerator remains tamely bounded,
while the denominator “blows up.” More precisely, we know
$-\frac{1}{n} \le \frac{\sin n}{n} \le \frac{1}{n}$
for all n. Both the left- and right-hand sequences converge to zero, and we expect
them to “squeeze” the middle sequence to the same limit. That intuition is correct:
Theorem 2.6 (The squeeze principle). Let {an }, {bn }, and {cn } be sequences
such that an ≤ bn ≤ cn for all n. If an → L and cn → L, then bn → L, too.
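Proof: For given $\epsilon > 0$, we need to find N such that n > N implies $|b_n - L| < \epsilon$,
or, equivalently, $L - \epsilon < b_n < L + \epsilon$.
Because $a_n \to L$ we can choose $N_1$ so
$L - \epsilon < a_n < L + \epsilon$ whenever $n > N_1$.
Similarly, there is $N_2$ so
$L - \epsilon < c_n < L + \epsilon$ whenever $n > N_2$.
Now let $N = \max\{N_1, N_2\}$. This N works for $\{b_n\}$, since if n > N then
$L - \epsilon < a_n \le b_n \le c_n < L + \epsilon,$
as desired.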
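The motivating case {sin n / n} gives a concrete picture of the squeeze. The following sketch is an illustration under stated assumptions (floating-point arithmetic, a finite range of n, and the hypothetical helper name `squeezed`); it checks the two-sided bound and then the ε-condition with N = 1/ε, the value that works for both outer sequences.

```python
import math

def squeezed(n):
    """Check -1/n <= sin(n)/n <= 1/n for a single n."""
    lower, middle, upper = -1 / n, math.sin(n) / n, 1 / n
    return lower <= middle <= upper

# The squeezing inequality, checked for n = 1, ..., 10000.
print(all(squeezed(n) for n in range(1, 10001)))

# With eps = 0.01, N = 100 works for both {-1/n} and {1/n},
# so the same N should work for the middle sequence.
eps, N = 0.01, 100
print(all(abs(math.sin(n) / n - 0) < eps for n in range(N + 1, 5000)))
```

The check mirrors the proof: one N that serves both outer sequences automatically serves the sequence trapped between them.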
(a) an → 0 ⇐⇒ |an | → 0
(b) If an → L, then |an | → |L|.
(c) If |an | → 0 and {bn } is bounded, then an bn → 0.
Proof: Claim (a), left as an exercise, is a basic application of the definition.
Claims (b) and (c) follow from (a) by judicious squeezing. For (b), notice first
that $a_n - L \to 0$ by hypothesis, and so $|a_n - L| \to 0$, according to (a). The
reverse triangle inequality now gives our squeezing inequality: [Margin note:
Here one sequence is constant.]
$0 \le \big| \, |a_n| - |L| \, \big| \le |a_n - L|.$
Since the right-hand sequence tends to zero, so must the middle one, as we aimed
to show.
For (c), we first use boundedness of {bn } to choose B > 0 with |bn | ≤ B for
all n. This leads to another squeezing inequality:
0 ≤ |an bn | ≤ B |an | .
Again, the left- and right-hand sequences tend to zero (the constant multiple B
does no harm) and therefore squeeze the middle sequence to the same limit.
Sequences to order. Sequences with specified properties can often be made to
order, as the following two (similar) samples suggest. We’ll call them lemmas
because the sequences involved here are often used in proofs of other results.
(Lemmas are junior-grade theorems, normally used to prove something else.)
[Margin note: The German Hilfssatz, or “helping sentence,” is more descriptive.]
Lemma 2.8. Let L be any number. There exist sequences $\{r_n\}$ and $\{p_n\}$, both
converging to L, with $r_n \in \mathbb{Q}$ and $p_n \notin \mathbb{Q}$ for all n. If desired, $\{r_n\}$ and $\{p_n\}$
can be chosen to be either strictly increasing or strictly decreasing.
Proof: If L is rational, then the sequences defined by
$r_n = L + \frac{1}{n}$ and $p_n = L + \frac{\sqrt{2}}{n}$
are strictly decreasing and converge to L. Other cases are similar, and left as
exercises.
Lemma 2.9. Let C ⊆ R be any nonempty bounded set, with α = inf(C) and
β = sup(C). There exist sequences {an } and {bn } contained in C with an → α
and bn → β.
Proof: Let’s find $\{b_n\}$. [Margin note: The $\{a_n\}$ case is similar.] For each $n \in \mathbb{N}$
we know $\beta - 1/n$ is not an upper bound for C, so there is some $b_n \in C$ with
$\beta - \frac{1}{n} \le b_n \le \beta$, and so $b_n \to \beta$
by the “squeezing” principle, Theorem 2.6. Further details are in an exercise.
Divergence to Infinity
Sequences like {n} and {−n3 } clearly have no finite limit, but they misbehave
less seriously than, say, the zig-zag sequence {1, −2, 3, −4, . . . }. The following
definition formalizes these ideas.
Definition 2.10. A sequence $\{x_n\}$ diverges to infinity if for every M > 0 there
exists a number N such that
$x_n > M$ whenever $n > N$.
Observe:
• Divergence to −∞: A similar definition holds for divergence to −∞: For
given M > 0 there must exist N with xn < −M whenever n > N .
• Big M , small ǫ: For ordinary convergence we focus mainly on small posi-
tive values of ǫ; here the challenging values of M are large.
• Convergence or divergence? Some authors refer to convergence to ∞; we
prefer to reserve “convergence” for finite limits.
• Unboundedness and divergence to infinity: An unbounded sequence need
not diverge to ±∞, as {1, −2, 3, −4, . . . } illustrates. An unbounded mono-
tone sequence, on the other hand, must diverge to ±∞; see Example 5.
• Zero or infinity? A sequence {xn } of positive terms diverges to ∞ if and
only if {1/xn } converges to 0. (Proofs follow directly from the definitions;
see the exercises.) A similar result holds for negative sequences.
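The "zero or infinity" observation can be illustrated with one sample sequence. In the sketch below, the choice $x_n = n^2$ and the helper names are assumptions made for illustration; for each M the code picks an N that works for divergence to infinity and checks that the same N makes $1/x_n$ smaller than 1/M.

```python
from fractions import Fraction
import math

def x(n):
    return n * n                 # a sample sequence of positive terms

def N_for(M):
    # For x_n = n^2, every integer n > isqrt(M) satisfies n^2 > M.
    return math.isqrt(M)

def check(M, how_many=500):
    N = N_for(M)
    tail = range(N + 1, N + 1 + how_many)
    large = all(x(n) > M for n in tail)                               # x_n > M
    small = all(Fraction(1, x(n)) < Fraction(1, M) for n in tail)     # 1/x_n < 1/M
    return large and small

print([check(M) for M in (1, 10, 1000, 10 ** 6)])
```

Large M on one side corresponds to small ε = 1/M on the other, which is the heart of the equivalence.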
Exercises
1. Prove the following part of Theorem 2.5, page 96: If $a_n \to a$ and c is any
constant, then $c a_n \to ca$. (Hint: The result is trivial if c = 0, so assume
c ≠ 0.)
2. Another part of Theorem 2.5, page 96, says that $a_n / b_n \to a/b$ if b ≠ 0 and
$b_n \ne 0$ for all n. To complete the proof sketched in this section, we need to
show that $1/b_n \to 1/b$.
6. Let C be the set of irrational numbers between 0 and 1. Describe the se-
quences {an } and {bn } mentioned in Lemma 2.9, page 100.
9. Consider the sequence {sn } in Example 3, page 97. Show by induction that
{sn } is strictly decreasing; i.e., show sn+1 < sn for all n ≥ 1.
10. Consider the sequence $\{s_n\}$ given by $s_n = \frac{1}{1} + \frac{1}{\sqrt{2}} + \cdots + \frac{1}{\sqrt{n}}$. We
showed in Example 4, page 101, that $\{s_n\}$ diverges to infinity. Give another
comparison proof, this time with the sequence $\{\sqrt{n}\}$.
11. Consider again the sequence $\{h_n\}$ in Example 4, page 101. Observe that
$h_{2n} = h_n + \frac{1}{n+1} + \cdots + \frac{1}{2n} \ge h_n + n \cdot \frac{1}{2n} = h_n + \frac{1}{2}$. Use this idea to
prove (without calculus) that $\{h_n\}$ diverges to infinity.
(a) Prove that if xn → ∞, then −xn → −∞. (The converse is also true,
and the proof is almost identical.)
(b) Show that xn → ∞ if and only if 1/xn → 0.
(c) State and prove a similar claim for sequences of negative terms. (Use
the first two parts.)
14. Suppose $a_n > 0$ for all n and $a_n \to 3$. Show that $\sqrt{a_n} \to \sqrt{3}$. (Hint:
Show first by algebra that $|\sqrt{a_n} - \sqrt{3}| < |a_n - 3|$.)
15. Use theorems from this section (not the definitions of convergence) to dis-
cuss each of the following limits. It is OK to assume such basic facts as
1/n → 0, but say what you’re assuming.
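(a) $\left\{ \frac{2n + 3 \sin n}{5n + \cos n} \right\}$
(b) $\left\{ \sqrt{n^2 + 1} \right\}$
(c) $\left\{ \sqrt{n^2 + n} - n \right\}$
(d) $\left\{ \frac{n^2 + \arctan n}{n + 2} \right\}$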
16. For any sequence {an } we can form a new sequence {bn } by the rule bn =
max{ |a1 | , |a2 | , . . . , |an |}. Show that bn converges if and only if {an } is
bounded.
17. Given any two sequences {an } and {bn }, we can construct a new sequence
{cn } with the rule cn = max{ an , bn }.
(a) Give an example (as simple as possible; there are many possibilities)
to show that {cn } may converge even if both {an } and {bn } diverge.
(b) Suppose {an } and {bn } converge to a and b respectively. Show that
{cn } converges, too. (Guess the limit first.)
19. It’s well known that the golden ratio, ϕ, satisfies ϕ ≈ 1.61803. Let $x_1 = 1$
and $x_{n+1} = 1 + \frac{1}{x_n}$.
(a) Find (first as rational numbers and then as decimals) $x_1$ through $x_8$.
What patterns do you see? Is this sequence monotone?
(b) Assume (it’s true) that $\{x_n\}$ converges to some number L. Explain
why $L = 1 + \frac{1}{L}$. Then use the equation to find an exact formula for
L, involving a square root.
20. As every calculator knows, $\sqrt{3} \approx 1.732051$. Consider the sequence defined
by $x_1 = 2$ and $x_{n+1} = \frac{x_n + 3/x_n}{2}$ for n ≥ 1.
(a) Find $x_2$, $x_3$, $x_4$ by hand, as rational numbers. Then find decimal
approximate values.
(b) Show that if $x_n > 0$, then $x_{n+1} \ge \sqrt{3}$. (It’s OK to use a little calculus
if you like.) Conclude that $x_n \ge \sqrt{3}$ for all n.
(c) Explain why $x_{n+1} < x_n$ for all n.
(d) We know now that $x_n \to L$ for some number L. Which theorem(s)
say this?
(e) We also know now that $L = \frac{L + 3/L}{2}$. Which theorems say this?
(f) What’s the exact value of L? Why?
21. Suppose $\{x_n\}$ is defined by $x_1 = 3$ and $x_{n+1} = \frac{x_n^2 + 4}{5}$.
2.3 Subsequences
Basic Ideas
Any given sequence x1 , x2 , x3 , x4 , . . . has many, many subsequences. Here are
three examples:
x1 , x3 , x5 , x7 , . . . x1 , x4 , x9 , x16 , . . . x123 , x124 , x125 , . . .
Subsequences are formed by choosing—in order—any infinite subset of the orig-
inal sequence {xn }. The last example above could be called an upper tail sub-
sequence, formed simply by skipping over an initial string. Order matters, and
repetitions aren’t allowed, so
x2 , x1 , x4 , x3 , x6 , x5 , . . . and x1 , x1 , x2 , x2 , x3 , x3 , . . .
are not considered subsequences of the “parent” {xn }.
Subsequences may or may not behave much like their parents. If, say,
x1 , x2 , x3 , x4 , · · · = 1, 2, 1, 2, . . . ,
then {xn } has no single limit, and therefore diverges. But the subsequences
x1 , x3 , x5 , · · · = 1, 1, 1, . . . and x2 , x4 , x6 , · · · = 2, 2, 2, . . .
obviously converge to 1 and 2, respectively. Still other subsequences, such as
x3 , x6 , x9 , x12 , . . . and x1 , x10 , x100 , x1000 , . . .
may or may not converge. By contrast, the sequence {yn } with
y1 , y2 , y3 , y4 , · · · = 1.0, 0.1, 0.01, 0.001, . . .
converges to zero—and so does every one of its subsequences, including
$y_1, y_3, y_5, \ldots, y_{83}, \ldots$ and $y_1, y_{10}, y_{100}, \ldots, y_{10^{41}}, \ldots$.
Stranger examples are possible.
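The subsequences of the alternating parent sequence 1, 2, 1, 2, . . . discussed above can be listed explicitly. This is a small illustrative sketch; the variable names are hypothetical, and only the first eight terms of each subsequence are shown.

```python
def x(n):
    """Parent sequence 1, 2, 1, 2, ... with indices starting at n = 1."""
    return 1 if n % 2 == 1 else 2

odd_terms      = [x(2 * k - 1) for k in range(1, 9)]   # x1, x3, x5, ...
even_terms     = [x(2 * k)     for k in range(1, 9)]   # x2, x4, x6, ...
multiples_of_3 = [x(3 * k)     for k in range(1, 9)]   # x3, x6, x9, ...
powers_of_10   = [x(10 ** k)   for k in range(0, 8)]   # x1, x10, x100, ...

print(odd_terms)
print(even_terms)
print(multiples_of_3)
print(powers_of_10)
```

The third list keeps alternating, while the fourth is constant after its first term, which illustrates why such subsequences "may or may not converge."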
SOLUTION. Yes and yes. Lacking rigorous definitions, we’ll argue (very!)
informally. [Margin note: Coming soon.]
To find a subsequence of positive integers, imagine walking from left to right
along the rational sequence, underlining each term that happens to be a positive
integer. The result might look like this:
The process never stops because the supply of positive integers is infinite, so we
have our desired subsequence:
r2 , r3 , r24 , r1235 . . . .
To find a subsequence that converges to zero, recall first that for every positive
ǫ, no matter how small, the interval (−ǫ, ǫ) contains infinitely many rationals. To
find our desired subsequence, therefore, we walk again from left to right along the
original sequence. At some position, say a9 , we find a9 ∈ (−1, 1). Continuing
our rightward walk, we find, say,
and so on. We never need to stop because, at any stage, we’ve left behind only
finitely many of the infinite family of residents of (−ǫ, ǫ). ♦
Similar reasoning shows that any listing {rn } of Q has subsequences consist-
ing entirely of prime numbers or entirely of fractions with denominator 424242.
It is less obvious—but true—that {rn } has an increasing subsequence with limit
π, a decreasing subsequence with limit e, a decreasing subsequence of integers
diverging to −∞, and countless other subsequences of interest. We sort out such
possibilities in this section.
Formalities. The idea of subsequences is simple enough but the notation takes
some getting used to. To create a subsequence from a given sequence {an } means
to choose a strictly increasing sequence of subscripts
Observe:
• A useful inequality: The definition implies a simple inequality that is use-
ful in proofs: nk ≥ k for all k. A formal proof might involve induction
or the pigeonhole principle, but the idea is mainly common sense—the kth
term of a subsequence can’t appear before the kth term of the parent se-
quence.
• Another subsequence: The subsequence indices n1 , n2 , n3 , n4 , . . . for
{ank } form still another subsequence—a strictly increasing subsequence
of {1, 2, 3, . . . }.
a1 , a3 , a5 , a7 , . . . and a2 , a4 , a6 , a8 , . . .
in functional language and notation. What can be said about {an } if both subse-
quences converge to three?
for all k. In ordinary sequence notation, we can write {ank } = {a2k−1 }; notice
that k, not n, is now the index variable. For the “even” subsequence similar
reasoning gives {ank } = {a2k }.
If both subsequences $\{a_{2k-1}\}$ and $\{a_{2k}\}$ converge to three, then it is reasonable
to expect the parent sequence $\{a_n\}$ to do so as well. To prove this, let $\epsilon > 0$
be given. [Margin note: A bit laboriously.] By hypothesis, there exist numbers
$K_1$ and $K_2$ such that
Now it turns out that N = max {2K1 , 2K2 } “works” for ǫ in the parent sequence.
To see why, suppose n > N . If n = 2k happens to be even, then n = 2k >
2K2 , and so k > K2 , which implies that
Thus we have |an − 3| < ǫ for all n > N , and the proof is complete. ♦
Properties of Subsequences
Every infinite sequence $\{x_n\}$ has countless subsequences. [Margin note:
Uncountably many, in fact.] What can be said about such an enormous set? The
answer is, “quite a lot”: subsequences often “inherit” their parents’ basic
properties.
(b) If {xn } diverges to ±∞, then every subsequence {xnk } diverges to ±∞,
too.
(c) If {xn } has subsequences converging to different limits, then {xn } di-
verges.
About proofs. Key to both proofs is the fact, mentioned above, that, for any sub-
sequence, nk ≥ k for all k. To prove (a), let ǫ > 0 be given. By hypothesis there
exists N that works for {xn } in the sense that |xn − L| < ǫ whenever n > N .
The key fact above implies that the same N works for {xnk }; if k > N then
nk ≥ k > N , and so |xnk − L| < ǫ, as desired.
The proof of (b) is similar, and (c) follows immediately from (a). Further
details are left to exercises.
Proposition 2.13 is easy to state, but it’s a bit tricky to prove anything about all
sequences, whether convergent or divergent, bounded or unbounded, tame or wild.
[Margin note: “Nontrivial,” in math-speak.] Here,
It is hard to see much pattern there, but Proposition 2.13 guarantees that some-
where in there is an increasing subsequence, a decreasing subsequence, or maybe
both.
We’ll sneak up on a proof of Proposition 2.13 through two technical lemmas
about special cases.
Lemma 2.14. Every unbounded sequence {xn } has a monotone subsequence
that diverges to ±∞.
Proof (sketch): For a sequence $\{x_n\}$ unbounded above, we’ll find an increasing
subsequence $\{x_{n_k}\}$ with $x_{n_k} > k$ for all $k$. Since $\{x_n\}$ is
unbounded above, we can choose $n_1$ so $x_{n_1} > 1$. Now the “upper tail”
\[ x_{n_1+1},\ x_{n_1+2},\ x_{n_1+3},\ \dots \]
is also unbounded above (do you see why?), and so we can choose $n_2$ with
$n_2 > n_1$ and $x_{n_2} > \max\{2, x_{n_1}\}$. Then we choose $n_3 > n_2$ with
$x_{n_3} > \max\{3, x_{n_2}\}$. Continuing this process produces the desired
subsequence. A similar construction produces a strictly decreasing subsequence
if $\{x_n\}$ is unbounded below.
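The construction in this proof sketch is concrete enough to run. Here is a minimal Python sketch, assuming the sequence is supplied as a function `x(n)` and really is unbounded above (otherwise the inner loop never ends). The function name and the sample sequence $x_n = (-1)^n n$ are illustrative choices, not part of the text.

```python
def increasing_unbounded_subsequence(x, how_many):
    """Greedy choice of indices n_1 < n_2 < ... with
    x(n_k) > max(k, x(n_{k-1})), as in the proof sketch of Lemma 2.14.
    Assumes x is unbounded above; otherwise the while-loop never stops."""
    indices = []
    previous_value = float("-inf")
    n = 0
    for k in range(1, how_many + 1):
        threshold = max(k, previous_value)
        n += 1                      # look only at the upper tail
        while x(n) <= threshold:
            n += 1
        indices.append(n)
        previous_value = x(n)
    return indices


def x(n):
    # Hypothetical example: x_n = (-1)^n * n, unbounded above and below.
    return n if n % 2 == 0 else -n


idx = increasing_unbounded_subsequence(x, 5)
values = [x(n) for n in idx]
print(idx, values)
```

For this sample sequence the chosen terms are the even-indexed ones, which increase without bound.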
Lemma 2.15. Let $\{x_n\}$ be a bounded sequence, with infimum $\alpha$ and supremum $\beta$.
If $\alpha \notin \{x_n\}$, then there is a decreasing subsequence $\{x_{n_k}\}$ with $x_{n_k} \to \alpha$. If
$\beta \notin \{x_n\}$, then there is an increasing subsequence $\{x_{n_k}\}$ with $x_{n_k} \to \beta$.
Proof (sketch): To save a little labor we’ll use Lemma 2.14 and some fancy
algebra. In the case $\beta \notin \{x_n\}$, we define a new sequence $\{y_n\}$ by the rule
\[ y_n = \frac{1}{\beta - x_n}. \]
Now $\{y_n\}$ is unbounded above (do you see why?), and so Lemma 2.14 guarantees
that there is an increasing subsequence $\{y_{n_k}\}$ with $y_{n_k} \to \infty$. But this implies that
\[ \frac{1}{y_{n_k}} = \beta - x_{n_k} \to 0. \]
This implies, in turn, that $\{x_{n_k}\}$ is increasing, with limit $\beta$. A similar
argument applies if $\alpha \notin \{x_n\}$.
At last we can prove Proposition 2.13. Watch for the completeness axiom to
pop up, and for Lemma 2.15 to provide the key technical insight.
2.3. Subsequences 111
Proof (of Proposition 2.13): Let {xn } be our sequence. If {xn } is unbounded,
we’re done, by Lemma 2.14. So we’ll assume that
(i) {xn } is bounded, and (ii) {xn } has no increasing subsequence,
and use these assumptions to construct a decreasing subsequence.
To get started, let β1 = sup{x1 , x2 , x3 , . . . }. (Assumption (i) and the com-
pleteness axiom guarantee that β1 exists.) The key observation is that β1 is in the
set {x1 , x2 , x3 , . . . }. Otherwise, says Lemma 2.15, some increasing subsequence
would converge to β1 . Thus, we can find n1 with xn1 = β1 ; this is the first term
of our desired subsequence.
To continue, we set
Again, β2 exists by the completeness axiom, and Lemma 2.15, applied this time
to the upper tail sequence
A big theorem. It is now easy to prove a famous theorem (we’ve done all the hard
work), which we’ll use repeatedly, starting in the very next section.
Exercises
1. Give an example in each part as indicated; no proofs needed.
(a) State the converse and the contrapositive of the BWT. Disprove the
false one.
(b) Consider the sequences $\{y_n\}$ and $\{z_n\}$ defined by $y_n = \sin n$ and
$z_n = \frac{\sin n}{n}$. What can be said about their subsequences? Is the BWT
helpful? Is it needed?
(c) Show (assuming the BWT) that every sequence {xn } has either (i) a
convergent subsequence, or (ii) a subsequence that diverges to ±∞.
Can both (i) and (ii) occur?
(a) Explain why the set of nonnegative rationals can also be written as a
sequence, say, {pn }.
(b) Give a reason why {pn } cannot be monotone.
(c) Although {pn } itself is not monotone, it must have a monotone sub-
sequence of positive integers. Explain why.
(a) Show that $x_n \to x_0$ if and only if for every $\epsilon > 0$ the set
$\{n \mid x_n \notin (x_0 - \epsilon, x_0 + \epsilon)\}$ is finite.
(b) Show that some subsequence {xnk } converges to x0 if and only if for
every ǫ > 0 the set {n | xn ∈ (x0 − ǫ, x0 + ǫ)} is infinite.
5. Consider the subsequence x4242 , x4243 , x4244 , . . . formed from a parent se-
quence x1 , x2 , x3 , . . . . If the subsequence converges to L, then so does the
parent sequence.
(a) Find the subsequences {x2k } and {x2k−1 }. What are their limits?
(No proofs needed.) What does the answer imply about convergence
of {xn }?
(b) Let {xnk } be any subsequence of {xn }. Show that if {xnk } con-
verges, then it must converge either to 1 or to −1.
10. Show that if {xn } diverges and x0 is any number, then there exists ǫ > 0
and a subsequence {xnk } such that |xnk − x0 | ≥ ǫ for all k. (In other
words, some subsequence {xnk } is bounded away from x0 .)
11. This problem is about Theorem 2.12, page 109.
12. Let {xn } be a sequence and x0 a number. Show that xn → x0 if and only
if xnk → x0 for every monotone subsequence {xnk }.
13. The sequence $\{x_n\}$ given by $x_n = 1/n$ converges, relatively slowly, to
$L = 0$. Find a subsequence $\{x_{n_k}\}$ that converges rapidly to 0 in the sense
that $|x_{n_k} - L| < 1/10^k$ for all $k$.
14. Show that for any sequence $\{x_n\}$ with $x_n \to L$, there’s a subsequence
$\{x_{n_k}\}$ for which $|x_{n_k} - L| < 1/10^k$ for all $k$.
15. Given any two sequences {an } and {bn }, we can construct a new sequence
{zn } by “zipping” {an } and {bn } together: {zn } is the new sequence
a1 , b1 , a2 , b2 , . . . , a42 , b42 , . . . . Show that {zn } converges to L if and
only if both {an } and {bn } converge to L.
Successive sn appear to jump back and forth across smaller and smaller intervals.
Does {sn } approach some limit?
In this particular case the answer is classical. Our sequence {sn } is derived
from the famous Leibniz series
\[ 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \frac{1}{11} + \frac{1}{13} - \cdots, \]
which has been known for at least 300 years to converge to π/4 ≈ 0.7854. More
generally, {sn } illustrates the notion of a Cauchy sequence, named for the French
mathematician Augustin–Louis Cauchy (1789–1857), a key figure in the devel-
opment of real analysis.
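A quick numerical look at the Leibniz series supports the claim. The sketch below is illustrative only: the function name is invented here, and floating-point sums are evidence, not proof. It accumulates partial sums and compares the last one with $\pi/4$.

```python
import math


def leibniz_partial_sums(n_terms):
    """Partial sums of 1 - 1/3 + 1/5 - 1/7 + ... (first n_terms terms)."""
    sums, total = [], 0.0
    for k in range(n_terms):
        total += (-1) ** k / (2 * k + 1)
        sums.append(total)
    return sums


s = leibniz_partial_sums(2000)
print(s[:4])                        # early sums jump back and forth
print(s[-1], math.pi / 4)           # late sums sit close to pi/4
```

Successive sums land alternately above and below $\pi/4$, and the distance to $\pi/4$ after $n$ terms is less than the next term, $1/(2n+1)$.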
Cauchy Basics
The formal definition describes precisely what we mean by “close to each other
for large n.”
Definition 2.17. A sequence {xn } is Cauchy if, for every ǫ > 0, there exists N
such that
|xn − xm | < ǫ whenever n > m > N .
Observe:
• Similar, but different: The definition resembles that for convergence, but
with a key difference: no limit L is mentioned.
• Not just consecutive terms: The definition requires that all terms with large
index be close to each other. This applies not only to consecutive terms, like
x12345 and x12346 , but also to any two terms with large index, like x12345
and x98765 .
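To get a feel for the definition, one can measure how far apart the terms with index between $N$ and some cutoff $M$ can be. The following is a rough sketch under stated assumptions: `tail_spread` is a name made up for illustration, and because it inspects only finitely many terms it can suggest, but never prove, that a sequence is or is not Cauchy.

```python
import math


def tail_spread(x, N, M):
    """Largest |x(n) - x(m)| over N < m < n <= M.
    For real terms this equals max - min over that index range."""
    values = [x(n) for n in range(N + 1, M + 1)]
    return max(values) - min(values)


# Terms of {1/n} beyond index 1000 stay within 0.001 of each other.
print(tail_spread(lambda n: 1 / n, 1000, 100_000))

# Terms of {sqrt(n)} keep spreading out as the cutoff M grows.
print(tail_spread(math.sqrt, 1000, 100_000))
```

The first spread stays below $1/1001$ no matter how large $M$ is; the second grows without bound as $M$ increases.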
2.4. Cauchy Sequences 115
EXAMPLE 1. As we know, the sequence $\{1/n\}$ converges to zero, while $\{\sqrt{n}\}$
diverges to infinity. Is either sequence Cauchy?
S OLUTION . The key idea, on which a formal proof can be based, is the se-
quence’s back-and-forth behavior:
combined with the fact that the distance between successive terms tends to zero.
If, say, $\epsilon = 0.001$, we could set $N = 1000$, and observe that if $n > m > 1000$,
then
\[ s_{1000} < s_n,\ s_m \le s_{1001} = s_{1000} + \frac{1}{1001}, \]
which implies that $s_n$ and $s_m$ lie within $1/1001$ of each other. ♦
Morals from the examples. The examples suggest, correctly, a close connec-
tion between sequence convergence and the Cauchy property. Indeed, the two
properties turn out to be equivalent. As we’ll see, it is easy to show that every
convergent sequence is also Cauchy. The fact that every Cauchy sequence con-
verges is deeper, and proving it takes a little more work. The Bolzano–Weierstrass
theorem—another convergence guarantee—will be useful.
The basic idea is that if both xn and xm are eventually close to a limit L, then
they must also be close to each other. The proof makes this precise.
Proof: Suppose $x_n \to L$. For given $\epsilon > 0$, we can choose $N$ such that
$|x_n - L| < \epsilon/2$ whenever $n > N$. (Why is it OK to use $\epsilon/2$, not $\epsilon$?)
This $N$ works in the Cauchy sequence definition, because if $n > m > N$, then
\[ |x_n - x_m| \le |x_n - L| + |x_m - L| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon; \]
we used the triangle inequality in the first step.
Proof: For ǫ = 1, choose N as in the Cauchy definition. Now fix any integer
m0 with m0 > N . If n > m0 , then we have |xn − xm0 | < 1, or, equivalently,
xm0 − 1 < xn < xm0 + 1. In particular, the set {xm0 , xm0 +1 , xm0 +2 , . . . } is
bounded (above by xm0 + 1 and below by xm0 − 1). Since the remaining terms
{x1 , x2 , . . . xm0 −1 } form a finite set, the entire sequence {xn } is bounded.
Proof: Proposition 2.18 is the “only if” part. To prove the “if” part, suppose that
{xn } is Cauchy. Proposition 2.19 implies that {xn } is bounded; by the Bolzano–
Weierstrass theorem, {xn } has a subsequence {xnk } that converges to some limit,
say, L. To complete the proof, we’ll show that the full sequence {xn } also con-
verges to L.
Let ǫ > 0 be given. Since {xn } is Cauchy, we can choose N1 such that
\[ |x_n - L| \le |x_n - x_N| + |x_N - L| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon, \]
as desired.
Now for given $\epsilon > 0$, we can choose $N$ with $1/2^{N-1} < \epsilon$. This $N$ works in the
Cauchy definition, because if $n > m \ge N$, then, as just shown,
\[ |a_n - a_m| < \frac{1}{2^{m-1}} \le \frac{1}{2^{N-1}} < \epsilon, \]
as desired. ♦
Exercises
1. Show that each of the following sequences either is or is not Cauchy. (Use
the definition, not theorems.)
(a) $\{x_n\}$, where $x_n = \frac{(-1)^n}{n}$.
(b) $\{y_n\}$, where $y_n = \frac{(-1)^n}{1234}$.
(c) $\{z_n\}$, where $z_n = \frac{n}{n+1}$.
(d) $\{w_n\}$, where $w_n = \frac{\sin n}{n^2+1}$.
2. Let {xn } and {yn } be Cauchy sequences. Use Definition 2.17 to show that
{xn + yn } is Cauchy, too.
4. Let {xn } be a Cauchy sequence, with xn ∈ Z for all n. Show that {xn }
is “eventually constant”—i.e., there exists N > 0 such that xn = xm
whenever n > m > N .
(a) The set Q of rational numbers is not complete. Show this by finding a
Cauchy sequence {rn } of rational numbers that has no rational limit.
(b) Is the set Z of integers complete? (Hint: See Problem 4.)
12. In each part, either give an example of the given type or say briefly (per-
haps by citing an appropriate theorem) why no such example can exist. To
describe a sequence {an }, either give a formula for an or just write enough
terms to make the pattern clear.
(a) a sequence {an } that is not monotone but diverges to ∞
(b) a divergent sequence $\{a_n\}$ such that $\left\{ \frac{a_n}{1717} \right\}$ converges
(c) two divergent sequences {an } and {bn } such that {an + bn } con-
verges to 17
(d) two convergent sequences $\{a_n\}$ and $\{b_n\}$ such that $\{a_n/b_n\}$
diverges
(e) a sequence with no convergent subsequence
(f) a Cauchy sequence with an unbounded subsequence
As usual for infinite processes, some obvious questions arise. Is there some limit
(or “infinite sum”) to which the series converges? How do we decide? If a series
has a limit, how do we find it?
A first observation is that series are close kin to sequences, about which we
already know a lot. Indeed, every series $\sum a_k$ corresponds in a natural way to
a certain sequence $\{A_n\}$ (we’ll define it in a moment) whose properties tell us
“everything” about the series.
The flaw in this happy scenario is that, in practice, getting one’s hands directly on
the sequence {An } can be difficult. Much of what follows can be thought of as
strategy for getting around this problem.
Enough generalities.
Definition 2.21. For a given series $\sum_{k=1}^{\infty} a_k = a_1 + a_2 + a_3 + \dots$, the sequence
of partial sums is given by
\[ A_1 = a_1, \quad A_2 = a_1 + a_2, \quad \dots, \quad A_n = a_1 + a_2 + \cdots + a_n, \quad \dots. \]
In general, $A_n = \sum_{k=1}^{n} a_k$.
2.5. Series 101: Basic Ideas 121
E XAMPLE 1. Describe the partial sum sequences for the two series illustrated
above. Do they have limits?
comes from the geometric family, in which successive terms differ by a constant
multiple. (Much more about geometric series soon.) Here we have
\[ A_1 = 1, \quad A_2 = 1 + \frac{1}{2} = \frac{3}{2}, \quad A_3 = 1 + \frac{1}{2} + \frac{1}{4} = \frac{7}{4}, \quad A_4 = A_3 + \frac{1}{8} = \frac{15}{8}, \quad \dots. \]
Now it is easy to guess, and to prove by induction, that
\[ A_n = \frac{2^n - 1}{2^{n-1}} = 2 - \frac{1}{2^{n-1}}. \]
Clearly, $A_n \to 2$, so it makes sense to write $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots = 2$. We’ll
sanctify this below with a formal definition.
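The guessed formula is easy to spot-check with exact rational arithmetic. The sketch below uses Python's `fractions` module; the function name is an illustrative choice, and a finite check is of course no substitute for the induction proof.

```python
from fractions import Fraction


def geometric_partial_sum(n):
    """A_n = 1 + 1/2 + 1/4 + ... + 1/2^(n-1), computed exactly."""
    return sum(Fraction(1, 2 ** (k - 1)) for k in range(1, n + 1))


first_four = [geometric_partial_sum(n) for n in range(1, 5)]
print(first_four)                   # the values A_1, ..., A_4

# Compare with the closed form 2 - 1/2^(n-1) for a range of n.
matches = all(
    geometric_partial_sum(n) == 2 - Fraction(1, 2 ** (n - 1))
    for n in range(1, 31)
)
print(matches)
```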
The series
\[ \sum_{k=1}^{\infty} h_k = \sum_{k=1}^{\infty} \frac{1}{k} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \dots \]
is the famous harmonic series. (More soon about this, too.) Here are the first
few partial sums $H_n$, also known as harmonic numbers:
\[ H_1 = 1, \quad H_2 = 1 + \frac{1}{2} = \frac{3}{2}, \quad H_3 = 1 + \frac{1}{2} + \frac{1}{3} = \frac{11}{6}, \quad H_4 = H_3 + \frac{1}{4} = \frac{25}{12}. \]
No pattern seems obvious, so let’s try some numerical experiments (thanks,
technology; values are rounded):
\[ H_{10} = 2.929, \quad H_{433} = 6.649, \quad H_{7881} = 9.549, \quad H_{12345} = 9.998. \]
The $H_n$ are clearly increasing, but slowly. And mysteries remain: is the sequence
$\{H_n\}$ bounded, and therefore convergent (by Theorem 2.3, page 88), or do the
harmonic numbers diverge to infinity?
It turns out—as you probably know, and as we will review in exercises—that
{Hn } diverges to infinity. This has been known for at least 650 years, thanks to
the French philosopher and bishop Nicole Oresme. The moral for now is that this
partial sum sequence has no simple formula, and so requires some work-around
effort to analyze. This state of affairs is typical—most series don’t have partial
sum sequences with nice, “closed form” expressions. Exceptions, like the first
series above, are precious. ♦
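The numerical experiments mentioned in this example are easy to repeat. Here is a minimal sketch (the function name is illustrative, and the sums are ordinary floating-point values, so the last digits are approximate):

```python
def harmonic_number(n):
    """H_n = 1 + 1/2 + ... + 1/n, as a floating-point sum."""
    return sum(1.0 / k for k in range(1, n + 1))


for n in (10, 433, 7881, 12345):
    print(n, round(harmonic_number(n), 3))
```

The printed values grow, but very slowly: moving from about $6.6$ to about $9.5$ takes thousands of additional terms.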
Notes on notation. Sigma notation packs a lot of information into a small
suitcase, and so may need careful unpacking. For instance, all of the following
expressions mean exactly the same thing (convince yourself by writing out some
terms):
\[ \sum_{k=1}^{\infty} \frac{1}{k}; \qquad \sum_{j=1}^{\infty} \frac{1}{j}; \qquad \sum_{j=1}^{103} \frac{1}{j} + \sum_{j=104}^{\infty} \frac{1}{j}; \qquad \sum_{j=0}^{\infty} \frac{1}{j+1}; \qquad \sum_{z=13}^{\infty} \frac{1}{z-12}. \]
With due care taken, such cosmetic differences won’t cause trouble. For
typographical economy, moreover, we will write just $\sum a_k$, not $\sum_{k=1}^{\infty} a_k$, when
confusion seems unlikely.
As another example of notational alternatives, observe that the equation
\[ \sum_{k=1}^{\infty} a_k = \lim_{n \to \infty} \sum_{k=1}^{n} a_k = A \]
is a highly compressed form of the definition of convergence of a series $\sum a_k$ to
the sum $A$. There is no shame, and often added clarity, in expanding things a
bit (efficiency isn’t everything), so we might write, instead,
\[ a_1 + a_2 + a_3 + \cdots = \lim_{n \to \infty} (a_1 + a_2 + a_3 + \cdots + a_n) = A. \]
EXAMPLE 2. Does the series $\sum_{k=1}^{\infty} 1/(k^2 + k) = \frac{1}{2} + \frac{1}{6} + \dots$ converge? If so,
to what limit?
SOLUTION. By a happy coincidence, the partial sums follow a nice pattern (try
some by hand):
\[ S_1 = \frac{1}{2}, \quad S_2 = \frac{2}{3}, \quad S_3 = \frac{3}{4}, \quad \dots, \quad S_{100} = \frac{100}{101}, \quad \dots \]
Now it is easy to guess a formula for $S_n$, and to see that $S_n \to 1$. (See the
exercises for details.) ♦
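For readers who prefer not to add fractions by hand, exact arithmetic shows the same pattern. This is a small sketch with an illustrative function name; it checks the guess $S_n = n/(n+1)$ for finitely many $n$ only, leaving the proof to the exercises.

```python
from fractions import Fraction


def S(n):
    """Partial sum 1/(1^2+1) + 1/(2^2+2) + ... + 1/(n^2+n), exactly."""
    return sum(Fraction(1, k * k + k) for k in range(1, n + 1))


print([S(n) for n in (1, 2, 3, 100)])

# The pattern S_n = n/(n+1) holds for every n tried here.
pattern_holds = all(S(n) == Fraction(n, n + 1) for n in range(1, 201))
print(pattern_holds)
```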
$\sum a_k$: one for which $a_k \ge 0$ for all $k$. In this case the partial sum sequence
{A1 , A2 , A3 , . . . } is increasing:
a1 ≤ a1 + a2 ≤ a1 + a2 + a3 ≤ a1 + a2 + a3 + a4 ≤ . . . .
By Theorem 2.3, page 88, there are only two possibilities: Either {An } converges,
or it diverges to infinity. If we can somehow rule out either alternative, the other
must apply.
• Constant multiples: $\sum c\,a_n$ converges to $cA$; equivalently,
\[ \sum c\,a_n = c \sum a_n. \]
Proof: We will leave the proof for sums as an exercise. For constant multiples
we want to show, in effect, that the distributive law applies to (convergent) infinite
sums. To do this, we need to show that the sequence of partial sums
converges to cA. This follows from the distributive law for finite sums, and from
what we already know about sequences. By the distributive law, we can rewrite
the preceding sequence as
or, equivalently, as
\[ cA_1,\ cA_2,\ cA_3,\ \dots,\ cA_n,\ \dots, \]
where $\{A_n\}$ is the sequence of partial sums of $\sum a_n$. But we know by hypothesis
that $A_n \to A$, and the constant multiple rule for sequences implies that $cA_n \to cA$,
as desired.
SOLUTION. It diverges. The given series is the sum of two simpler series (one
convergent and one divergent) discussed in Example 1. In fact, every such
sum diverges. To see why, suppose that $\sum a_n$ converges and $\sum b_n$ diverges. If
$\sum (a_n + b_n)$ were convergent, then, by Theorem 2.23, the series
\[ \sum b_n = \sum \big( (a_n + b_n) - a_n \big) \]
as claimed.
where a and r are any fixed numbers; notice the common ratio r of each summand
to its predecessor. The first series discussed above, for instance, is geometric, with
ratio r = 1/2:
\[ \sum_{k=1}^{\infty} \frac{1}{2^{k-1}} = 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots = \sum_{k=1}^{\infty} 0.5^{k-1}. \]
The geometric form may be slightly hidden in the numbers, but it is easy to check
that successive terms have common ratio $r = 1.1$, and so the standard geometric
template $\sum_{k=1}^{\infty} 3 \times 1.1^{k-1}$ fits the pattern.
Geometric series satisfy a useful algebraic equation (it’s readily proved by
induction):
\[ a + ar + ar^2 + ar^3 + \cdots + ar^{n-1} = a\,\frac{1 - r^n}{1 - r} \qquad (*) \]
holds if $r \ne 1$, $n$ is a positive integer, and $a$ is any number. This pretty formula
tells us, almost instantly, whether any geometric series converges and, if so, to
what limit.
Proposition 2.25. Let $a$ and $r$ be any real constants, with $a \ne 0$.
(a) If $|r| < 1$, then $\sum_{k=1}^{\infty} a\,r^{k-1}$ converges to $\frac{a}{1-r}$.
(b) If $|r| \ge 1$, then $\sum_{k=1}^{\infty} a\,r^{k-1}$ diverges.
We can infer, for instance, that $\sum_{k=1}^{\infty} 0.5^{k-1}$ converges to 2, as we proved in
Example 1. By contrast, $\sum_{k=1}^{\infty} 3 \times 1.1^{k-1}$ diverges, as the numerical evidence
suggests, and as the nth term test also implies.
Proof (sketch): Convergence and divergence of series are all about partial sums,
and Equation (∗) above tells almost the whole story. If $r \ne \pm 1$ we have
\[ \sum_{k=1}^{n} a\,r^{k-1} = a\,\frac{1 - r^n}{1 - r}. \]
What happens to the right side as $n \to \infty$? The answer depends entirely on the
$r^n$ in the numerator, and that depends, in turn, on the value of $r$. If $|r| < 1$, then
$r^n \to 0$, and so
\[ a\,\frac{1 - r^n}{1 - r} \to a\,\frac{1}{1 - r}, \]
as claimed. If $|r| > 1$, then $\{a r^n\}$ is unbounded, and so the series (badly!) fails
the nth term test, and so diverges.
The remaining cases ($r = \pm 1$) are much simpler, and left to the exercises.
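EXAMPLE 4. Proposition 2.25 and Theorem 2.23 justify calculations like the
following (watch the tricky index change from $k$ to $j$):
\[ \sum_{k=0}^{\infty} \frac{3^k + 4^k}{5^k} = \sum_{k=0}^{\infty} \frac{3^k}{5^k} + \sum_{k=0}^{\infty} \frac{4^k}{5^k} = \sum_{j=1}^{\infty} \left(\frac{3}{5}\right)^{j-1} + \sum_{j=1}^{\infty} \left(\frac{4}{5}\right)^{j-1} = \frac{1}{1 - 3/5} + \frac{1}{1 - 4/5} = \frac{15}{2} = 7.5. \]
Similarly,
\[ \pi + e + \frac{1}{3} - \frac{1}{6} + \frac{1}{12} - \frac{1}{24} + \cdots = \pi + e + \frac{1}{3}\left(1 - \frac{1}{2} + \frac{1}{4} - \frac{1}{8} + \dots\right) = \pi + e + \frac{1}{3} \sum_{k=1}^{\infty} (-1/2)^{k-1} = \pi + e + \frac{1}{3} \cdot \frac{1}{1 + 1/2} \approx 6.082. \]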
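Both calculations in Example 4 can be sanity-checked by summing many terms numerically. The sketch below is illustrative only: the helper name `partial_sum` and the choice of 200 terms are assumptions made here, and agreement of floating-point sums with the closed forms is evidence rather than proof.

```python
import math


def partial_sum(term, first, last):
    """Sum term(k) for k = first, first + 1, ..., last."""
    return sum(term(k) for k in range(first, last + 1))


# First series: sum over k >= 0 of (3^k + 4^k) / 5^k, truncated at k = 200.
first_series = partial_sum(lambda k: (3**k + 4**k) / 5**k, 0, 200)

# Second series: pi + e + (1/3) * sum over k >= 1 of (-1/2)^(k-1).
second_series = math.pi + math.e + partial_sum(
    lambda k: (1 / 3) * (-0.5) ** (k - 1), 1, 200
)

print(first_series, second_series)
```

The truncated sums agree with $15/2$ and with $\pi + e + 2/9$ to many decimal places, because the neglected geometric tails are tiny.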
Comparing series for size in this simple way is valid—and useful. Details are in
the following theorem and proof.
Theorem 2.26 (Comparison test). Let $\sum a_k$ and $\sum b_k$ be series, with
$0 \le a_k \le b_k$ for all $k$.
(a) If $\sum b_k$ converges to a number $B$, then $\sum a_k$ converges to a number $A$,
with $A \le B$.
(b) If $\sum a_k$ diverges, then $\sum b_k$ diverges, too.
Proof: Let $\{A_n\}$ and $\{B_n\}$ be the partial sum sequences for $\sum a_k$ and $\sum b_k$.
Because $\{A_n\}$ and $\{B_n\}$ are increasing sequences, each converges if, but only
if, it is bounded above. By hypothesis,
\[ A_n = a_1 + a_2 + a_3 + \cdots + a_n \le b_1 + b_2 + b_3 + \cdots + b_n = B_n \]
for all $n$. If $\{B_n\}$ happens to be bounded above, then certainly $\{A_n\}$ is also
bounded above, and we have
\[ A = \sup\{A_n\} \le \sup\{B_n\} = B. \]
In this case $A_n \to A$ and $B_n \to B$, as (a) claims. Part (b) follows from part (a).
E XAMPLE 5. Does each of the following series converge or diverge? What can
be said about limits?
∞ ∞ ∞
X 2k − 1 X 1 X sin k
(a) (b) (c)
3k 2k − 1 2k
k=1 k=1 k=1
The resulting series diverges (why?), and is comparable to series (b), which
therefore also diverges.
For series (c), comparison with $\sum_{k=1}^{\infty} 1/2^k$ is tempting, but some terms of (c)
may be negative, and so Theorem 2.26 doesn’t apply. Still, numerical evidence
suggests convergence; here are some (approximate) partial sums $S_n$:
The case for convergence looks strong, but we don’t (quite yet) have a proof. ♦
Absolute Convergence
Series (c) from Example 5 stumped us. The next theorem comes just in time. (It’s
a triangle inequality, but for infinitely many summands.)
Theorem 2.27. If $\sum |a_k|$ converges then $\sum a_k$ converges too, and
\[ \left| \sum_{k=1}^{\infty} a_k \right| \le \sum_{k=1}^{\infty} |a_k|. \]
Let’s polish off series (c) from Example 5 before proving this. Clearly,
\[ \frac{|\sin k|}{2^k} < \frac{1}{2^k} \]
for all $k$, and so Theorem 2.26 guarantees that the series $\sum |\sin k| / 2^k$ of
left-hand terms converges, to some positive limit less than one. Theorem 2.27, in
turn, says that series (c) converges too, to a limit somewhere in the interval
$(-1, 1)$. (We don’t know exactly where.)
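A numerical experiment shows both series at work. In this minimal sketch the helper name and the choice of 60 terms are illustrative assumptions; the code accumulates partial sums of $\sum \sin k / 2^k$ and of $\sum |\sin k| / 2^k$ side by side.

```python
import math


def partial_sums(term, n_terms):
    """Running partial sums of term(1) + term(2) + ... + term(n_terms)."""
    sums, total = [], 0.0
    for k in range(1, n_terms + 1):
        total += term(k)
        sums.append(total)
    return sums


signed = partial_sums(lambda k: math.sin(k) / 2**k, 60)
absolute = partial_sums(lambda k: abs(math.sin(k)) / 2**k, 60)

print(signed[4], signed[19], signed[59])      # S_5, S_20, S_60
print(absolute[59])                           # stays below 1
```

The signed sums settle down quickly, and they never exceed the absolute sums in size, just as Theorem 2.27 predicts for the limits.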
The proof, informally. Suppose $\sum |a_n|$ converges. To show that $\sum a_n$ converges,
too, we need to prove that the partial sum sequence $\{A_n\}$ converges. Since we
have no specific limit in mind, we will show that $\{A_n\}$ is a Cauchy sequence, and
then invoke Theorem 2.20, page 116. (We worked hard to prove the theorem; here
comes the payoff.)
Observe first that, because $\sum |a_n|$ converges, its partial sum sequence $\{C_n\}$, with
\[ C_n = |a_1| + |a_2| + \cdots + |a_n|, \]
is convergent, and therefore Cauchy. We will use this fact and the triangle
inequality to show that $\{A_n\}$ is Cauchy, too.
Let $\epsilon > 0$ be given. Because $\{C_n\}$ is Cauchy we can choose $N$ such that
$|C_n - C_m| < \epsilon$ whenever $n > m > N$. The same $N$ turns out to work for $\{A_n\}$,
because if $n > m > N$, then (watch for the finite triangle inequality)
\begin{align*}
|A_n - A_m| &= |(a_1 + \cdots + a_m + a_{m+1} + \cdots + a_n) - (a_1 + \cdots + a_m)| \\
&= |a_{m+1} + a_{m+2} + \cdots + a_n| \\
&\le |a_{m+1}| + |a_{m+2}| + \cdots + |a_n| \\
&= C_n - C_m = |C_n - C_m| < \epsilon.
\end{align*}
\[ \sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \dots \]
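SOLUTION. Taking absolute values of all terms gives the ordinary harmonic
series
\[ 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \dots, \]
which we know to diverge. To see why the alternating version converges, consider
the partial sums $\{S_n\}$:
\[ 1, \quad 1 - \frac{1}{2}, \quad 1 - \frac{1}{2} + \frac{1}{3}, \quad 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4}, \quad 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5}, \quad \dots. \]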
Here’s a (rounded) numerical view of S1 , . . . , S10 :
1.00, 0.500, 0.833, 0.583, 0.783, 0.617, 0.760, 0.635, 0.746, 0.646, . . . .
Notice the pattern: Successive partial sums alternately increase and decrease, but
with smaller and smaller gaps. In particular, for any N , every term Sn with n > N
lies between SN and SN +1 . If n > 1000, for example, then
The same holds for two terms. If n > m > 1000, then both Sn and Sm lie
between 0.69265 and 0.69365, and so lie within 0.001 of each other. Using this
idea we can prove that {Sn } is Cauchy, and hence convergent. We leave details
to the exercises.
So what is the limit? The striking but far-from-obvious exact answer turns
out to be ln 2 ≈ 0.69315. A rigorous proof is beyond our scope, but numerical
evidence shows we are in the ballpark. ♦
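The numerical evidence is easy to reproduce. The following sketch (with an illustrative function name, and using floating-point arithmetic) lists the first ten partial sums and then looks at $S_{1000}$ and $S_{1001}$, which trap $\ln 2$ between them.

```python
import math


def alternating_harmonic_partial_sums(n_terms):
    """Partial sums S_1, ..., S_n of 1 - 1/2 + 1/3 - 1/4 + ..."""
    sums, total = [], 0.0
    for k in range(1, n_terms + 1):
        total += (-1) ** (k + 1) / k
        sums.append(total)
    return sums


S = alternating_harmonic_partial_sums(1001)
first_ten = [round(s, 3) for s in S[:10]]
print(first_ten)
print(S[999], math.log(2), S[1000])   # S_1000 < ln 2 < S_1001
```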
Exercises
1. For each series following, find a formula or simple rule for the partial sums
$S_n$. Then decide whether the series converges or diverges, and to what
limit.
(a) $\sum_{k=1}^{\infty} 0 = 0 + 0 + 0 + \dots$
(b) $\sum_{k=1}^{\infty} 42 = 42 + 42 + 42 + \dots$
(c) $\sum_{k=1}^{\infty} (-1)^k = -1 + 1 - 1 + \dots$
(d) $\sum_{k=1}^{\infty} k = 1 + 2 + 3 + \dots$
(e) $\sum_{k=1}^{\infty} 0.99^{k-1}$
(f) $\sum_{k=1}^{\infty} \frac{1}{k^2 + k}$ (Hint: First guess a formula for $S_n$ and prove it by
induction.)
2. For a certain series $\sum a_n$, we know that $A_n = a_1 + a_2 + \cdots + a_n = 4 + 1/n$
for all $n \in \mathbb{N}$.
(a) Does the series $\sum a_n$ converge? If so, to what limit? How do you
know?
(b) Does the sequence $\{a_n\}$ converge? If so, to what limit? How do you
know?
(c) Does the series $\sum A_n$ converge? If so, to what limit? How do you
know?
(d) Note that $a_1 = A_1 = 5$. Find $a_2$ and $a_3$.
(e) Find a formula in terms of $n$ for $a_n$.
3. This problem is about partial sums Hn of the harmonic series (see Exam-
ple 1, page 121).
7. Prove the claim about sums and differences in Theorem 2.23, page 123.
(Note that the claim amounts to saying that addition is commutative for
convergent infinite sums; we already know that addition is commutative for
finite sums.)
8. This problem ties up some loose ends concerning Proposition 2.25, page 125,
which addresses convergence and divergence of $\sum_{k=1}^{\infty} a\,r^{k-1}$.
(a) How does the series look if r = 1? What if r = −1? Prove (using
partial sums) that the series diverges in both of these cases.
(b) In the proof of Proposition 2.25 we used several reasonable-seeming
facts about convergence and divergence of the sequence {rn }, de-
pending on the value of r. Prove them, as follows:
(i) If $r$ and $L$ are any real numbers and $r^n \to L$, then $r^{n+1} \to L$,
too.
(ii) If $r$ and $L$ are any real numbers and $r^n \to L$, then $r^{n+1} =
r\,r^n \to rL$.
(iii) If $r \ne 1$ and $r^n \to L$, then $L = 0$. (Hint: Use the two preceding
parts.)
(iv) If $0 \le r < 1$, then $r^n \to 0$. (Hint: Use a theorem about
monotone sequences.)
(v) If $-1 < r \le 0$, then $r^n \to 0$. (Hint: Squeeze.)
(vi) If $|r| \ge 1$ and $r \ne 1$, then $\{r^n\}$ diverges. (Hint: It is enough to explain
why $r^n \to 0$ is impossible.)
(a) Claim: For any positive integer N , every partial sum Sn with n ≥ N
is between (or equal to one of) SN and SN +1 . Prove this for N =
1000. (A general proof is similar.)
(b) Show that, for any ǫ > 0, we can find N such that |SN − SN +1 | < ǫ.
(c) Use the preceding parts to show that {Sn } is Cauchy.
where $p$ is any fixed number. With $p = 1$ we have the harmonic series, which
we’ve shown to diverge. With $p = 2$ we get
\[ \sum_{k=1}^{\infty} \frac{1}{k^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \frac{1}{5^2} + \dots, \]
which converges, as we’ll prove in two different ways. The general story on
convergence and divergence is simple:
Proposition 2.28. The p-series $\sum_{k=1}^{\infty} \frac{1}{k^p}$ converges if $p > 1$ and diverges if $p \le 1$.
2.6. Series 102: Testing for Convergence and Estimating Limits 135
Limit comparison test. The ordinary comparison test (Theorem 2.26, page 127)
is simple and powerful, but can be hard to apply. The limit comparison test can
simplify the work.
Theorem 2.29 (Limit comparison test). Consider two series $\sum a_k$ and $\sum b_k$ with
positive terms, and suppose that $\{a_k / b_k\}$ converges to a finite limit $L$.
(a) If $\sum b_k$ converges, then $\sum a_k$ converges, too.
(b) If $L \ne 0$, then either both $\sum a_k$ and $\sum b_k$ converge or both diverge.
EXAMPLE 1. We showed in Example 2, page 122, that the series $\sum_{k=1}^{\infty} b_k =
\sum_{k=1}^{\infty} 1/(k^2 + k)$ converges, with sum 1. Use limit comparison to deduce that
some other series converge.
SOLUTION. We have already shown that the p-series $\sum a_k = \sum 1/k^2$ converges,
but the proof was laborious. Limit comparison is easier. Here is the key
limit:
\[ \lim_{k \to \infty} \frac{a_k}{b_k} = \lim_{k \to \infty} \frac{1/k^2}{1/(k^2 + k)} = \lim_{k \to \infty} \frac{k^2 + k}{k^2} = \lim_{k \to \infty} \left( 1 + \frac{1}{k} \right) = 1, \]
which implies (again) that $\sum 1/k^2$ converges.
Similar arguments apply to uglier series. If, say,
\[ \sum_{k=1}^{\infty} a_k = \sum_{k=1}^{\infty} \frac{k^2 + 7k - 3 \sin k}{2k^4 - k}, \]
We’ll prove Theorem 2.29 using two auxiliary facts, each of interest in its own
right.
Proof (sketch): The trick is to compare the partial sum sequences {An } and {Bn },
which turn out to resemble each other. Here is the idea for N = 42; a general
proof is almost identical. If ak = bk for k > 42, and we know n > 42, then
An = a1 + · · · + a42 + a43 + · · · + an
= a1 + · · · + a42 + b43 + · · · + bn
= (a1 + · · · + a42 ) − (b1 + · · · + b42 ) + (b1 + · · · + b42 ) + b43 + · · · + bn
= A42 − B42 + Bn .
Thus, An = A42 − B42 + Bn for n ≥ 42, which implies that the sequences
{An } and {Bn } either both converge or both diverge. If, in fact, Bn → B, then
An → A42 − B42 + B.
Proposition 2.31 (Bounded comparison test). Consider two series $\sum a_k$ and
$\sum b_k$ with positive terms, and suppose that the sequence $\{a_k / b_k\}$ is bounded
above.
(a) If $\sum b_k$ converges, then $\sum a_k$ converges, too.
(b) If $\sum a_k$ diverges, then $\sum b_k$ diverges, too.
The limit comparison test is a special case of Proposition 2.31, so the proof is
easy.
Proof (of Theorem 2.29): Because $\{a_k / b_k\}$ converges, it is also bounded above,
and so Proposition 2.31 implies part (a).
Now suppose, as in (b), that $L \ne 0$. In this case $\{b_k / a_k\}$ converges to $1/L$,
and we can apply (a) to conclude that if $\sum a_k$ converges, $\sum b_k$ must do so, too.
(Sorting out the a’s and b’s is a little confusing . . . but it all works out.)
This establishes the claim in (b) about convergence; the statement about
divergence is equivalent.
The ratio test. For a geometric series, the ratio of successive terms is a constant r,
and convergence depends on the magnitude of r. The ratio test generalizes this
principle to a large class of series that, although not geometric, behave “in the
limit” something like geometric series. The theorem makes these ideas precise.
Theorem 2.32 (Ratio test). Consider a series $\sum a_k$ with positive terms, and
suppose that the sequence $\{a_{k+1} / a_k\}$ converges to a finite limit $L$.
(a) If $L < 1$ then $\sum a_k$ converges.
(b) If $L > 1$ then $\sum a_k$ diverges.
(c) If $L = 1$ the test is inconclusive; $\sum a_k$ may converge or diverge.
SOLUTION. The ratio test says that the first series converges. The ratio in
question collapses nicely:
\[ \frac{a_{k+1}}{a_k} = \frac{1/(k+1)!}{1/k!} = \frac{k!}{(k+1)!} = \frac{1 \cdot 2 \cdots k}{1 \cdot 2 \cdots k \cdot (k+1)} = \frac{1}{k+1}. \]
The last quantity converges to zero as $k \to \infty$, so the ratio test guarantees
convergence.
The limit-of-ratios calculation is just as easy for the harmonic series:
\[ \frac{a_{k+1}}{a_k} = \frac{1/(k+1)}{1/k} = \frac{k}{k+1} \to 1 \quad \text{as } k \to \infty. \]
Because the limit is one, however, the ratio test says nothing. The wasted work
is annoying, but only mildly so: we knew already from other arguments that the
harmonic series diverges. (The ratio test wasn’t the right tool for this job.) ♦
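The two ratio calculations can be mirrored with exact arithmetic. This is a small sketch with illustrative names, and it only inspects finitely many ratios; the limits themselves come from the algebra in the solution.

```python
from fractions import Fraction
from math import factorial


def ratio(term, k):
    """The ratio a_(k+1) / a_k of successive terms."""
    return term(k + 1) / term(k)


def factorial_term(k):
    return Fraction(1, factorial(k))      # a_k = 1/k!


def harmonic_term(k):
    return Fraction(1, k)                 # a_k = 1/k


for k in (1, 10, 100, 1000):
    print(k, ratio(factorial_term, k), float(ratio(harmonic_term, k)))
```

The first column of ratios shrinks toward zero, while the second creeps up toward one, which is exactly the inconclusive case.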
Proof (of the ratio test): We prove (a) and leave (b) and (c) to the exercises. Given
$L < 1$, as in (a), choose any real number $r$ with $L < r < 1$. (Plenty of choices
exist.) Then the geometric series $1 + r + r^2 + \dots$ converges, and we’ll use
“bounded comparison” (Proposition 2.31) to show that $\sum a_k$ converges, too.
Doing so requires a basic observation about limits and an algebraic trick. The
observation is that, because $a_{k+1}/a_k \to L$ and $L < r$, there must be a number $N$
such that
\[ \frac{a_{k+1}}{a_k} < r \quad \text{whenever } k > N. \]
To complete the proof, using Proposition 2.31, we’ll show that the sequence
$\{a_k / r^k\}$ is bounded. To do this it is enough to assume $k > N$. (The remaining
terms, with $k \le N$, form a finite set.) In this case,
Now the last quantity is constant (independent of k), and so the sequence is
bounded, as desired.
Not-necessarily-positive series. Most of our convergence tests require positive
series, and so don’t apply directly to series like
\[ \sum_{k=1}^{\infty} \frac{\sin k}{2^k} = \frac{\sin 1}{2} + \frac{\sin 2}{4} + \frac{\sin 3}{8} + \frac{\sin 4}{16} + \dots \]
and
\[ \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{\sqrt{k}} = \frac{1}{\sqrt{1}} - \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} - \frac{1}{\sqrt{4}} + \frac{1}{\sqrt{5}} - \dots, \]
which have both positive and negative terms. What can we do?
We’ve already seen one work-around: test $\sum |a_k|$, not $\sum a_k$, for convergence.
Theorem 2.27, page 128, guarantees that if the former converges, so does the
latter. Indeed, we already used this method (Example 5, page 127) to show that
the first series converges.
This approach fails for the second series, however, because
\[ \sum_{k=1}^{\infty} \left| \frac{(-1)^{k-1}}{\sqrt{k}} \right| = \frac{1}{\sqrt{1}} + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \dots, \]
EXAMPLE 3. Finish off the series $\sum_{k=1}^{\infty} (-1)^{k-1} / \sqrt{k}$.
SOLUTION. Conditions (i) and (ii) of the theorem hold for our series:
\[ \frac{1}{\sqrt{1}} > \frac{1}{\sqrt{2}} > \frac{1}{\sqrt{3}} > \cdots > \frac{1}{\sqrt{k}} \to 0, \]
as needed. Thus our series converges to some limit $S$, which lies between any two
successive partial sums. For instance, we have
and
\[ S_{1002} \approx 0.58911 < S < 0.62068 \approx S_{1003}. \] ♦
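The quoted bounds are straightforward to recompute. Here is a minimal sketch, assuming ordinary floating-point arithmetic is accurate enough for five decimal places (the function name is an illustrative choice):

```python
import math


def S(n):
    """Partial sum of (-1)^(k-1) / sqrt(k) for k = 1, ..., n."""
    return sum((-1) ** (k - 1) / math.sqrt(k) for k in range(1, n + 1))


lower, upper = S(1002), S(1003)
print(round(lower, 5), round(upper, 5))
print(upper - lower, 1 / math.sqrt(1003))   # the gap is the 1003rd term
```

The gap between the two bounds is exactly the size of the next term, so using later pairs of partial sums narrows the interval, though slowly.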
Proof (of Theorem 2.33): The proof depends on a pleasing pattern among the par-
tial sums:
S2 ≤ S4 ≤ S6 ≤ S8 ≤ · · · ≤ S7 ≤ S5 ≤ S3 ≤ S1 .
All of these inequalities hold because the terms alternate in sign but decrease in
size. The reasons are straightforward but pretty, and best thought through for
oneself. (Start small. See, e.g., why $S_5 \ge S_8$.)
In particular, the odd-index subsequence $S_1, S_3, S_5, \dots$ is decreasing and
bounded below (by any even-index term, such as $S_8$), while the even-index
subsequence $S_2, S_4, S_6, \dots$ is increasing and bounded above. Hence both of these
subsequences converge. They tend to the same limit, moreover, because
and the right side tends to zero as n tends to infinity. This implies, finally, that
{Sn } itself converges, as desired, and that the limit, S, satisfies the main inequal-
ity:
S2 ≤ S4 ≤ S6 ≤ S8 ≤ · · · ≤ S ≤ · · · ≤ S7 ≤ S5 ≤ S3 ≤ S1 .
Estimating Limits
We have now acquired and sharpened several tools for detecting whether a lot of
series converge or diverge. That’s nice, but what about the limits themselves? The
bad news is that exact limits may be hard or impossible to find. (The family of
geometric series is a rare but important exception.) The good news is that the same
tools, cleverly employed, can also help us estimate limits with good precision.
E XAMPLE 4. We used the ratio test in Example 2 to show that the series
∞
X 1 1 1 1 1 1
=1+ + + + + + ...
k! 1 2 6 24 120
k=0
contribute, at most?
The comparison test idea gives a quick (if slightly crude) answer. Since
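\[ \frac{1}{11!} < \frac{1}{2^{10}}, \quad \frac{1}{12!} < \frac{1}{2^{11}}, \quad \frac{1}{13!} < \frac{1}{2^{12}}, \ \ldots, \]
we can compare with a geometric series, for which we do know a limit:
\[ \frac{1}{11!} + \frac{1}{12!} + \frac{1}{13!} + \cdots < \frac{1}{2^{10}} + \frac{1}{2^{11}} + \frac{1}{2^{12}} + \cdots = \frac{1}{2^{9}} \approx 0.002. \]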
This tells us that S10 ≈ 2.71828 underestimates the true limit by less than 0.002,
so we can have confidence in (at least) the first three digits. ♦
Bounding the tail: lessons from Example 4. For any convergent series $\sum a_k$
and any partial sum $S_n$ we can write
\[ \sum_{k=1}^{\infty} a_k = (a_1 + a_2 + \cdots + a_n) + (a_{n+1} + a_{n+2} + \cdots) = S_n + R_n, \]
where $R_n = \sum_{k=n+1}^{\infty} a_k$ is the nth upper tail, or remainder.
The idea illustrated in Example 4 is to bound the tail: If we can somehow
show that |Rn | is small, then the limit S = Sn + Rn must be close to the finite
sum Sn , which we can compute without any worries over convergence. Upper
tails are often bounded by comparison to suitable geometric series, for which
(atypically) we can find exact limits.
We should confess, finally, that the exact limit of the series in Example 4 is
well known to be e ≈ 2.71828182846. A rigorous proof requires methods we
haven’t developed; our point here is that reasonable accuracy is available using
basic arguments.
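The tail bound from Example 4 can be compared with the actual tail, now that the limit e has been named. This is a small sketch under that assumption; it uses Python's built-in value `math.e` as the limit, which the text quotes but does not prove.

```python
import math

# Partial sum S_10 = 1/0! + 1/1! + ... + 1/10!
s10 = sum(1 / math.factorial(k) for k in range(11))

# Geometric bound on the tail: 1/2^10 + 1/2^11 + ... = 1/2^9
tail_bound = 1 / 2 ** 9

# The true tail, using the known limit e (quoted in the text, not proved there)
true_tail = math.e - s10

print(f"S_10       = {s10:.5f}")
print(f"tail bound = {tail_bound:.5f}")
print(f"true tail  = {true_tail:.2e}")
```

The comparison suggests how crude the geometric bound is: it is safe, but the true tail is far smaller. A sharper comparison series would give a tighter estimate at the cost of a little more algebra.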
142 2. Sequences and Series
Exercises
1. Use methods of this section to prove as efficiently as possible that each
series converges or diverges. (Similar series appeared in exercises for the
preceding section; now we know more techniques.)
∞
X 1
(a)
2k −1
k=1
∞
X k
(b)
2k 2 − 1
k=1
∞
X k
(c)
3k
k=1
∞
X k2 + 2
(d)
3k 2 − 2
k=1
2. Use methods of this section, where possible, to decide whether each of the
following series converges or diverges. Do any converge conditionally?
(Similar series appeared in exercises for the preceding section. With new
techniques we can handle some more efficiently.)
∞
X (−1)k
(a)
3k 2 + 1
k=1
∞
X k2 + k
(b)
k4
k=1
∞
X sin(k)
(c)
3k − 1
k=1
∞
X 1
(d)
2 + sin(k)
k=1
3. Decide whether each of the following series converges or diverges; use the
ratio test.
∞
X 2k
(a)
3k 2
k=1
∞
X k2
(b)
1.3k
k=1
∞
X 2k + k
(c)
3k
k=1
5. For which values of x does each of the following series converge absolutely?
(Series like this are called power series.)
(a) $\sum_{k=0}^{\infty} x^k$
(b) $\sum_{k=1}^{\infty} k x^{k-1}$
(c) $\sum_{k=1}^{\infty} \frac{x^k}{k}$
(d) $\sum_{k=0}^{\infty} k! \, x^k$
11. Let {Sn } be the partial sum sequence for the series $\sum_{k=1}^{\infty} 1/\sqrt{k}$. Show that
$S_n \ge \sqrt{n}$ for all n; conclude that the series diverges.
12. In our proof sketch for Proposition 2.28, page 134, we showed that $\sum 1/k^p$
converges if p = 1.3. Show in a similar way that $\sum 1/k^p$ converges for
every p > 1. (Hint: Writing p = 1 + s simplifies the algebra slightly; then
s plays the role of 0.3 in the given proof.)
13. Consider the series $\sum_{k=1}^{\infty} 1/(2^{k-1} + 1)$, which converges to some limit S.
(a) Draw a picture to illustrate this inequality. (Leaf through any standard
calculus text, if necessary.)
(b) Estimate $\sum_{k=1}^{\infty} \frac{1}{k^3}$ with error less than 0.001. (Technology will be
helpful.)
16. Does each of the following series converge absolutely, converge condition-
ally, or diverge? Prove your answers. (It’s OK to assume basic facts about
geometric series, p-series, etc.)
(a) $\sum_{k=1}^{\infty} \left( \frac{1}{k} - \frac{1}{k+3} \right)$
∞
X cos k
(b)
2k + 2k + 2
k=1
∞
X 81
(c) (−1)k
347k
k=1
Basic problems.
1. For each of the following sequences {xn }, find the corresponding sequence
{vn }; then find lim sup xn .
2. Why must all the vi exist in the sup-sequence? Why must {vn } have a
limit? (Cite appropriate theorems.)
3. How are lim xn and lim sup xn related to each other for a bounded se-
quence {xn }? Can one exist but not the other? Can both exist but be
different?
2.7. Lim sup and lim inf: A Guided Discovery 147
The lim inf. The lim inf is closely analogous to the lim sup.
1. Carefully state an appropriate definition for the lim inf of a bounded se-
quence. Then find lim inf xn for each of the example sequences in the first
problem.
2. Find a sequence {xn } with lim sup xn = 5 and lim inf xn = −2.
3. Prove that if lim sup xn = 5 and lim inf xn = 5, then lim xn = 5, too.
4. It seems reasonable that lim inf xn ≤ lim sup xn for any bounded sequence
{xn }. Prove this. What happens if the two are equal?
Algebra with the lim inf and the lim sup. The ordinary limit has nice algebraic
properties: for example, lim (xn + yn ) = lim xn + lim yn (assuming that both
limits on the right side exist). Do lim inf and lim sup have similar properties?
1. Find the lim sup and the lim inf of the sequence −1, −2, −3, −4, . . . . Find
a sequence {xn } with lim sup xn = ∞ and lim inf xn = −∞.
2. Show that every sequence (bounded or not) has a lim sup and a lim inf.
3. If lim sup xn = +∞, must lim xn = +∞, too? If lim sup xn = −∞,
must lim xn = −∞, too?
Lim sups, lim infs, and subsequences. The lim sup and the lim inf of a se-
quence {xn } are closely connected to subsequences of {xn }, as the following
proposition suggests.
Proposition 2.34. Let {xn } be a sequence.
(a) If {xn } is unbounded above (so lim sup xn = ∞), then some subsequence
{xnk } diverges to infinity.
(b) If {xn } is bounded above, with lim sup xn = L, then some subsequence
{xnk } converges to L.
1. Show that if lim sup xn = 5, then there is a subsequence {xnk } that con-
verges to 5. Could some other subsequence converge to 6? To 4? Explain
your answers.
2. Show that if {xn } has a subsequence {xnk } that converges to 5, then
More on the “sup-sequence.” The following problems explore further the cor-
respondence between the original sequence {xn } and the “sup-sequence” {vn }.
We can ask, for instance, about “inverting” this correspondence.
1. For each of the following sequences {vn }, find (if possible) two different
sequences {xn } that give the same sup-sequence {vn }.
(a) {vn } = {2, 2, 3/2, 3/2, 4/3, 4/3, . . . }
(b) {vn } = {2, 2, 2, 2, 2, 2, . . . }
(c) {vn } = {2, 1.1, 1.01, 1.001, 1.0001, . . .}
2. For which sequences {vn } can there be more than one associated sequence
{xn }? State and prove a claim.
CHAPTER 3
Limits and Continuity
These views are not incorrect, but they are too vague to be useful for building
theory or proving theorems. What exactly does “approaches” mean? How close
to 3 must x be for x ≈ 3 to hold? We need a precise definition.
Definition 3.1 (Limit of a function). Let f be a function whose domain includes
an open interval I containing a, except perhaps for x = a. Let L be a number.
We write limx→a f (x) = L if, for every ε > 0, there exists δ > 0 such that
|f (x) − L| < ε whenever 0 < |x − a| < δ.
150 3. Limits and Continuity
E XAMPLE 1. Let f (x) = (3x2 − 3)/(x − 1). Use Definition 3.1 to show that
limx→1 f (x) = 6.
S OLUTION . Note first that all is well with domains: f (1) is undefined, but f (x)
makes good sense for all other x. Next, for x ≠ 1 we have
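[Figure 3.1: two ε–δ windows centered at (a, L), panels (a) and (b), each with vertical-axis marks L − ε, L, and L + ε.]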
SOLUTION. With f (x) = x for the first limit and f (x) = k for the second, verifying
Definition 3.1 is an easy exercise. ♦
Picturing limits. Function limits can be understood graphically. For any given ǫ
and δ, we can look at the tiny rectangular “window” in the xy-plane, centered at
(a, L), in which L − ǫ < y < L + ǫ and a − δ < x < a + δ. This window has
“half-height” ǫ and “half-width” δ; two possibilities are shown in Figure 3.1.
From this viewpoint, the ǫ–δ condition is satisfied if the graph of f stays
inside this rectangle all the way from left to right, as in Figure 3.1(a). If the graph
“escapes” at top or bottom, as in Figure 3.1(b), the chosen δ does not work for the
given ǫ. The limit is L if, for any choice of ǫ (the half-height), there exists some
good half-width δ.
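The window picture suggests a simple numerical experiment. The sketch below is our own illustration, not the book's method; the helper name `delta_works` and the sample count are assumptions. It samples the graph inside a window and reports whether any sampled point escapes at top or bottom. Sampling can show that a given δ fails, but it cannot prove that a δ works; only an argument like the one in Definition 3.1 does that.

```python
def delta_works(f, a, L, eps, delta, samples=1000):
    """Sample-based check of the epsilon-delta window.

    Returns True if |f(x) - L| < eps at every sampled x with
    0 < |x - a| < delta. Sampling can refute a delta but cannot
    prove that it works; only a proof does that.
    """
    for j in range(1, samples + 1):
        offset = delta * j / (samples + 1)   # strictly between 0 and delta
        for x in (a - offset, a + offset):   # both sides of a, never a itself
            if abs(f(x) - L) >= eps:
                return False                 # the graph escapes the window
    return True

# The function from Example 1: f(x) = (3x^2 - 3)/(x - 1), with limit 6 at a = 1.
def f(x):
    return (3 * x**2 - 3) / (x - 1)

print(delta_works(f, 1, 6, eps=0.3, delta=0.1))
print(delta_works(f, 1, 6, eps=0.3, delta=0.2))
```

For this f we have |f(x) − 6| = 3|x − 1| when x ≠ 1, so a half-width of 0.1 fits a half-height of 0.3 while a half-width of 0.2 does not.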
is called the characteristic function of the set Q. (It is useful in illustrating “bad”
behavior; we’ll see it again. Can you picture the graph?) Does limx→0 f (x) exist?
S OLUTION . The (unsurprising) answer is no. To prove this formally, suppose
toward contradiction that L is a limit, and set ǫ = 0.001. By our assumption,
there must be some δ > 0 with
But (as we’ve shown) every interval (−δ, δ) contains both rational and irrational
numbers. If s is such a rational and t an irrational, then both
E XAMPLE 4. Let f be a function for which limx→a f (x) > 0. Must f (x) > 0
also hold for x near a? What can be said about another continuous function g if
limx→a g(x) = 0?
\[ |f(x) - L| < \frac{L}{2} \quad \text{whenever } 0 < |x - a| < \delta. \]
Rewriting these inequalities gives
\[ f(x) > \frac{L}{2} > 0 \quad \text{whenever } x \in (a - \delta, a + \delta) \text{ and } x \ne a, \]
as desired.
Concerning g, the answer is “not much.” Near x = a the function g could be
positive (if, say, g(x) = (x − a)2 ) or negative (if, say, g(x) = −(x − a)2 ). Or
perhaps g(x) = x − a, which changes sign at x = a. ♦
Lemma 3.2. Let f (x) be defined on I \ {a}, where I is an open interval contain-
ing a, and let L be a number. The following are equivalent:
Proof: To show that (i) implies (ii), let’s assume that (ii) fails and construct a
sequence {xn } that violates (i). (Let’s prove the contrapositive, in other words.)
3.1. Limits of Functions 153
For (ii) to fail means that there is some positive ǫ, say ǫ0 , for which no δ > 0
works. In particular, δ = 1 fails, so there must be some x1 in I \ {a} such that
0 < |x1 − a| < 1 but |f (x1 ) − L| ≥ ǫ0 .
Because δ = 1/2 also fails, there is some x2 with
0 < |x2 − a| < 1/2 but |f (x2 ) − L| ≥ ǫ0 .
Continuing in this way, we construct a sequence {xn } of points in I \ {a} such
that, for each n,
\[ 0 < |x_n - a| < \frac{1}{n} \quad \text{but} \quad |f(x_n) - L| \ge \epsilon_0. \]
In particular, {f (xn )} does not converge to L. On the other hand, {xn } converges
to a; the squeeze principle assures this because
\[ a - \frac{1}{n} < x_n < a + \frac{1}{n} \]
for all n. Thus our sequence {xn } violates (i), as desired, and we’ve shown that
(i) =⇒ (ii).
Showing (ii) =⇒ (i) is easier; we’ll leave it as an exercise.
Lemma 3.2 lets us exploit earlier work with sequences to find, without much
effort, a lot of function limits that would be tedious or difficult to handle from
scratch.
EXAMPLE 5. Let $f(x) = (3x^2 + 5x + 2)/(2x - 7)$. Use Lemma 3.2 to prove
that $\lim_{x \to 5} f(x) = \frac{102}{3}$.
S OLUTION . Proving such a thing straight from the ǫ–δ definition could get ugly.
With Lemma 3.2, it is easy.
To get started, let {xn } be any sequence with xn → 5. By Lemma 3.2, it is
enough to show that
\[ \lim_{n\to\infty} f(x_n) = \lim_{n\to\infty} \frac{3x_n^2 + 5x_n + 2}{2x_n - 7} = \frac{102}{3}. \]
This, in turn, follows from different parts of Theorem 2.5, page 96, which allows
algebraic combinations like the following for limits of sequences. (No worries here
about zero denominators.)
\[ \lim_{n\to\infty} \frac{3x_n^2 + 5x_n + 2}{2x_n - 7} = \frac{3(\lim x_n)^2 + 5(\lim x_n) + 2}{2(\lim x_n) - 7} = \frac{3 \cdot 5^2 + 5 \cdot 5 + 2}{2 \cdot 5 - 7} = \frac{102}{3}, \]
as claimed. ♦
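The sequence viewpoint of Lemma 3.2 is easy to try out numerically. In the sketch below the two test sequences are hypothetical choices of ours; any sequence converging to 5 through points other than 5 would serve. The printed values should settle near 102/3 = 34, which illustrates the claim but of course does not prove it.

```python
def f(x):
    return (3 * x**2 + 5 * x + 2) / (2 * x - 7)

# Two hypothetical test sequences converging to 5 (neither ever equals 5).
from_above = [5 + 1 / n for n in (10, 100, 1000, 10**6)]
alternating = [5 + (-1) ** n / n**2 for n in (10, 100, 1000, 10**6)]

for xs in (from_above, alternating):
    print([round(f(x), 6) for x in xs])   # values should settle near 102/3 = 34
```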
New function limits from old. The following theorem does for functions what
Theorem 2.5, page 96, does for sequences.
Theorem 3.3 (Algebra with limits). Let f (x) and g(x) be defined for all inputs x
in I \ {a}, where I is any open interval containing a, and suppose that
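\[ \lim_{x\to a} f(x) = L \quad \text{and} \quad \lim_{x\to a} g(x) = M. \]
Then
\[ \lim_{x\to a} \bigl(f(x) \pm g(x)\bigr) = \lim_{x\to a} f(x) \pm \lim_{x\to a} g(x) = L \pm M; \]
\[ \lim_{x\to a} \bigl(f(x) \cdot g(x)\bigr) = \lim_{x\to a} f(x) \cdot \lim_{x\to a} g(x) = L \cdot M; \]
\[ \lim_{x\to a} \frac{f(x)}{g(x)} = \frac{\lim_{x\to a} f(x)}{\lim_{x\to a} g(x)} \quad \text{if } M \ne 0. \]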
Observe:
• Existence: Theorem 3.3 is more than a recipe for calculating new limits
from old; it is also a guarantee that various limits (those on the left of each
equation) actually exist. In the case of quotients, for existence, one might
reasonably worry about small or vanishing denominators; Theorem 3.3 assures
us that all is well if M ≠ 0.
• Something to combine: The how-to-combine-limits rules in Theorem 3.3
aren’t much good until we have some already-proved basic limits to combine.
The good news is that a few very, very basic limits, such as
\[ \lim_{x\to a} x = a \quad \text{and} \quad \lim_{x\to a} k = k \]
go a long way. (We addressed these in Example 2.) Applying Theorem 3.3, repeatedly
as necessary, lets us calculate limits like these without too much effort:
\[ \lim_{x\to 5} \frac{3x^2 + 5x + 2}{2x - 7} = \frac{3 \cdot 5^2 + 5 \cdot 5 + 2}{2 \cdot 5 - 7} = 34; \]
x+2 5+2
317x2 − 5 x+3 317 · 52 − 5 5+3
lim = ≈ 0.5117.
x→5 42.075 − x5 42.075 − 55
(We calculated the first limit, a bit differently, in Example 5.)
Proof: All parts of Theorem 3.3 follow easily when we combine the analogous
results for sequences (Theorem 2.5, page 96) with Lemma 3.2, above. Concerning
products, for instance, we consider any sequence {xn } in I \ {a} with xn → a. By hypothesis,
the new sequences {f (xn )} and {g(xn )} converge to L and M , respectively.
By Theorem 2.5, the product sequence {f (xn )g(xn )} converges to LM , and
Lemma 3.2 assures us that limx→a f (x)g(x) = LM , too. The remaining parts
are similar.
Squeezing functions. A squeeze principle works for function limits. We’ve al-
ready proved the sequence version (Theorem 2.6, page 99):
Proposition 3.4 (The squeeze principle). Let f (x), g(x), and h(x) be defined
for all inputs x in I \ {a}, where I is an open interval containing a, and suppose
that
f (x) ≤ g(x) ≤ h(x) for all x ∈ I \ {a}.
If limx→a f (x) = L and limx→a h(x) = L, then limx→a g(x) = L, too.
The proof, like that of Theorem 3.3, amounts to combining the sequence version
with Lemma 3.2. We’ll leave that to the exercises.
The limit-squeezing idea is simple. The tricky bit in practice is to find helpful
squeezing inequalities.
EXAMPLE 6. Squeeze something to find $\lim_{x\to 0} x \sin(1/x)$ and $\lim_{x\to 0} \frac{\sin x}{x}$.
(Plotting these functions might be useful.)
SOLUTION. The fact that |x sin(1/x)| ≤ |x| for all x ≠ 0 suggests a simple
squeezing inequality:
\[ -|x| \le x \sin\frac{1}{x} \le |x|. \]
Clearly, ±|x| → 0 as x → 0, so the middle quantity tends to zero, too.
Finding a good squeezing inequality for the second limit takes more effort.
Here is one possibility:
\[ \cos x \le \frac{\sin x}{x} \le \frac{1}{\cos x} \quad \text{if } x \in (-1, 1) \text{ and } x \ne 0. \]
This does the job: Since cos x → 1 as x → 0, the left- and right-hand functions
tend to 1, and hence so does the middle function. The squeezing inequality needs
proof too, of course; we’ll leave that to the exercises to avoid distraction. ♦
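Before proving a squeezing inequality it can be reassuring to test it on sample points. The following sketch is our own sanity check, with an arbitrary sample grid; it tests both inequalities from Example 6 at nonzero points of (−1, 1) and prints a few of the squeezed values near 0. A passing check is evidence, not proof.

```python
import math

# Sample nonzero points in (-1, 1).
xs = [s * j / 1000 for j in range(1, 1000) for s in (1, -1)]

first_ok = all(-abs(x) <= x * math.sin(1 / x) <= abs(x) for x in xs)
second_ok = all(math.cos(x) <= math.sin(x) / x <= 1 / math.cos(x) for x in xs)
print(first_ok, second_ok)

# The squeezed values near 0.
for x in (0.1, 0.01, 0.001):
    print(x, x * math.sin(1 / x), math.sin(x) / x)
```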
Following are several formal definitions; we leave some others, all in the same
spirit, as exercises. As with Definition 3.1, each part here involves some technical
assumption about domains, needed to ensure that the key inequalities make sense.
(Familiar functions from calculus seldom cause trouble on this score.)
Definition 3.5 (Variant limits). Let f be a function and let a and L be real num-
bers.
• Right-hand limit: Let f (x) be defined for all inputs x in some open interval
I = (a, b). We say limx→a+ f (x) = L if, for every ε > 0, there exists
δ > 0 such that |f (x) − L| < ε whenever x ∈ I and a < x < a + δ.
• Left-hand infinite limit: Let f (x) be defined for all inputs x in some open
interval I = (b, a). We say limx→a− f (x) = −∞ if, for every M > 0,
there exists δ > 0 such that f (x) < −M whenever x ∈ I and a − δ < x < a.
• Limit at infinity: Let f (x) be defined for all inputs x in some open interval
I = (b, ∞). We say limx→∞ f (x) = L if, for every ε > 0, there exists
N > 0 such that |f (x) − L| < ε whenever x ∈ I and x > N.
• Infinite (two-sided) limit: Let f (x) be defined for all inputs x in an open
interval I containing a, except perhaps at x = a. We say limx→a f (x) = ∞
if, for every M > 0, there exists δ > 0 such that f (x) > M whenever x ∈ I and 0 < |x − a| < δ.
• Infinite limit at infinity: Let f (x) be defined for all inputs x in some open
interval I = (b, ∞). We say limx→∞ f (x) = ∞ if, for every M > 0, there
exists N > 0 such that f (x) > M whenever x ∈ I and x > N.
S OLUTION . Note first that the functions in all three limits are fine as regards
domains. The first is defined for x > 0, the second for x > −5, and the third for
x ≥ 0.
Limits (a) and (b) have close analogues for sequences, namely $\lim_{n\to\infty} \frac{1}{n} = 0$ and
$\lim_{n\to\infty} (3n + 2)/(n + 5) = 3$, and the proofs for functions are almost identical
to those for sequences. For (b), for instance, we proved the sequence version in
Example 2, page 85, and the function version is almost the same. First, we see
that for any positive ǫ,
\[ \left| \frac{3x + 2}{x + 5} - 3 \right| = \left| \frac{3x + 2 - 3(x + 5)}{x + 5} \right| = \frac{13}{x + 5}. \]
(It is OK to drop the absolute value because x + 5 > 0 certainly holds for the
large positive x in which we’re interested.) Now
\[ \frac{13}{x + 5} < \epsilon \iff \frac{13}{\epsilon} - 5 < x, \]
and so for any given ε > 0 we can set M = 13/ε. This M works in the appropriate
definition, because if x > M , then (skipping some details from just above)
\[ \left| \frac{3x + 2}{x + 5} - 3 \right| = \frac{13}{x + 5} < \frac{13}{M} = \epsilon, \]
as desired. Limit (a) is left to you.
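The choice M = 13/ε in part (b) can be spot-checked numerically. The sketch below is an illustration of ours; the helper name `check_M` and the sample points are assumptions. It samples inputs beyond M and confirms that each lands within ε of 3.

```python
def f(x):
    return (3 * x + 2) / (x + 5)

def check_M(eps, samples=1000):
    """Sample x > M = 13/eps and confirm |f(x) - 3| < eps at each sample."""
    M = 13 / eps
    xs = [M * (1 + j / 10) for j in range(1, samples + 1)]   # all exceed M
    return all(abs(f(x) - 3) < eps for x in xs)

for eps in (1.0, 0.1, 0.001):
    print(eps, 13 / eps, check_M(eps))
```

As the algebra above shows, any x > 13/ε − 5 would already do; taking M = 13/ε simply trades a slightly larger threshold for a tidier formula.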
For (c), only positive x matter, and so
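\[ \left| \frac{2x}{1 + \sqrt{x}} - 0 \right| = \frac{2x}{1 + \sqrt{x}} < 2x < \epsilon \iff x < \frac{\epsilon}{2}. \]
Thus, for given ε > 0 the value δ = ε/2 works: if 0 < x < δ, then
\[ \left| \frac{2x}{1 + \sqrt{x}} - 0 \right| < 2x < 2\delta = \epsilon, \]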
All in the limit family. All of these limit variants are close kin to each other. Limits
at infinity, for instance, are one-sided in the sense that ∞ is approachable only
from below, and −∞ only from above. The following proposition (similar results
apply to limits that involve −∞) makes some of this kinship explicit:
Observe, especially, what the proposition says about existence: if the limits on
either side of “if and only if” exist, then so must the limits on the other side.
We will leave formal proofs to the exercises, but illustrate the idea of (i) with an
example.
SOLUTION. Let ε > 0 be given; we need a δ > 0 that works for ε. Because of
g’s peculiar two-sided nature, we’ll first find positive numbers δleft and δright that
work for ε on the left and on the right of zero, respectively. (Here “right” and “left”
mean x > 0 and x < 0, respectively.) Matters are simplest on the left. There g(x) = x/2, and so
\[ |g(x) - 0| = \left| \frac{x}{2} \right| < \epsilon \iff |x| < 2\epsilon, \]
which means that δleft = 2ǫ works (and also that limx→0− g(x) = 0). On the
right, we have g(x) = 2x sin(1/x), and so
\[ |g(x) - 0| = |2x \sin(1/x)| \le |2x| < \epsilon \iff |x| < \frac{\epsilon}{2}, \]
which means that δright = ǫ/2 works (and that limx→0+ g(x) = 0).
Combining these results means that |g(x) − 0| < ε holds for all nonzero x in
the asymmetric interval (−δleft , δright ) = (−2ε, ε/2).
Obviously, |g(x) − 0| < ε also holds for nonzero x in the (smaller) symmetric
interval (−δright , δright ). This means that δ = δright = ε/2 works for ε in the desired
limit. We’re done. ♦
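The need to take the smaller of the two one-sided values can be seen numerically. The sketch below is our own illustration, with g coded from the two formulas used in the solution; the helper name `stays_in_window` and the sample count are assumptions, and sampling only refutes a δ, never proves one.

```python
import math

def g(x):
    """The two-sided function from the solution: x/2 on the left, 2x sin(1/x) on the right."""
    if x < 0:
        return x / 2
    if x > 0:
        return 2 * x * math.sin(1 / x)
    raise ValueError("g(0) plays no role in the limit")

def stays_in_window(eps, delta, samples=2000):
    offsets = [delta * j / (samples + 1) for j in range(1, samples + 1)]
    return all(abs(g(s * t)) < eps for t in offsets for s in (1, -1))

eps = 0.01
print(stays_in_window(eps, eps / 2))   # delta = eps/2, the smaller of the two
print(stays_in_window(eps, 2 * eps))   # delta = 2*eps, only safe on the left
```

With δ = 2ε the right-hand piece 2x sin(1/x) can exceed ε in size, which is why the smaller value ε/2 is the one that serves both sides.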
1 3x + 2 p
lim √ ; lim ; lim x2 − 1000x; lim x2 + 2x − x.
x→∞ x + x+3 x→∞ x + 5 x→∞ x→∞
S OLUTION . The first limit yields to squeezing. For all x > 0, we have
1 1
0< √ < ;
x+ x+3 x
√
since limx→∞ x1 = 0, we must have limx→∞ 1/(x + x + 3) = 0, too.
We found the second limit from scratch in Example 7. With algebra we can
reduce the limit to something simpler (also handled in Example 7):
The limit limx→∞ x2 − 1000x involves the difference of two quantities, each
tending to infinity. Which tendency “wins”? It is easy to guess that x2 over-
whelms 1000x for large x, and so, presumably limx→∞ x2 − 1000x = ∞. Alge-
bra helps confirm this. For any given M > 0, we have
so we can use N = max{M, 1001} in the appropriate definition. (Check details for yourself.)
The function $\sqrt{x^2 + 2x} - x$ is another difference of two quantities that diverge
to infinity. Plotting the function or plugging in large values of x suggests that the
limit is one. (Try it.) We can show this algebraically, with a little effort. The trick is to
multiply and divide by the conjugate expression:
\[ \sqrt{x^2 + 2x} - x = \frac{\left(\sqrt{x^2 + 2x} - x\right)\left(\sqrt{x^2 + 2x} + x\right)}{\sqrt{x^2 + 2x} + x} = \frac{2x}{\sqrt{x^2 + 2x} + x} = \frac{2}{\sqrt{1 + 2/x} + 1}. \]
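Plugging in large values of x, as suggested above, takes only a few lines. In this sketch the function names `direct` and `conjugate_form` are our own. Both compute the same function for x > 0; the conjugate form avoids subtracting two nearly equal numbers, so in floating-point arithmetic it may keep more correct digits for very large x.

```python
import math

def direct(x):
    return math.sqrt(x**2 + 2 * x) - x      # subtracts two nearly equal numbers

def conjugate_form(x):
    return 2 / (math.sqrt(1 + 2 / x) + 1)   # same function for x > 0, no cancellation

for x in (10.0, 1e3, 1e6, 1e9, 1e12):
    print(f"{x:.0e}  direct={direct(x):.10f}  conjugate={conjugate_form(x):.10f}")
```

Both columns should approach 1 as x grows, in line with the algebra: the final expression tends to 2/(1 + 1) = 1 because 2/x tends to 0.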
Exercises
1. We said in Example 2 that if a and k are any constants, then
3. Guess a value for each of the following limits; prove your answers using
Definition 3.1, page 149.
(a) $\lim_{x\to 1} (2x + 3)$
(b) $\lim_{x\to -1} \frac{x^2 - 1}{x + 1}$
4. Guess a value for each of the following limits; prove your answers using
Definition 3.1, page 149.
(a) $\lim_{x\to 1} \frac{x^2 + x - 2}{x - 1}$
(b) $\lim_{x\to 0} x \, \frac{2 + \sin x}{3 - \cos x}$
5. For a function f and an input x = a it may (or may not) happen that
limx→a f (x) = f (a). If f (x) = 3x + 5, for instance, it is easy to show that
limx→42 f (x) = 3 · 42 + 5 = f (42).
In each case following, decide whether this happens. If so, prove it; if not,
say why not. (Theorem 3.3, page 154, may be useful.)
7. Prove Proposition 3.4. (One way is to use Theorem 2.6, page 99, and
Lemma 3.2.)
8. Show that (ii) implies (i) in Lemma 3.2.
9. We defined several variations on the basic limit theme in Definition 3.5.
Here we explore two more members of the family.
(a) Assume (it is easy to show, but don’t bother) that limx→1 f (x) = −5.
For ǫ = 0.01, find a positive δ that works, and illustrate your answer
by sketching the graph of f in a well-chosen window of half-height ǫ
and half-width δ.
(b) It is a fact that limx→100 f (x) = 9400. For ǫ = 0.01, find a positive δ
that works, and illustrate your answer by sketching the graph of f in
a well-chosen window of half-height ǫ and half-width δ.
(c) It is true that limx→10 f (x) = 40. Set ǫ = 0.01. Does δ = 0.001
work in this case? Sketch the graph of f in an appropriate window to
illustrate your answer.
12. Suppose that f (x) = 0 if x ≠ 0 and f (0) = 42. Explain carefully why
limx→a f (x) = 0 for all real numbers a.
13. Suppose that g(x) = 0 if x ∉ Z and g(x) = 42 if x ∈ Z. Explain carefully
why limx→a g(x) = 0 for all real numbers a.
14. Suppose that h(x) = 42 if x = 1, 1/2, 1/3, . . . and h(x) = 0 otherwise.
What can be said about limx→a h(x)? Why?
15. Let S be a finite set of real numbers. Suppose that j(s) = 42 if s ∈ S
and j(x) = 0 for all other real x. Show that limx→a j(x) = 0 for all real
numbers a.
16. Explain why the function g in Example 8, page 158, satisfies the squeez-
ing inequality −|2x| ≤ g(x) ≤ |2x|. Use this to give another proof that
limx→0 g(x) = 0.
17. In Example 6, we used the squeezing inequality
\[ \cos\theta \le \frac{\sin\theta}{\theta} \le \frac{1}{\cos\theta}. \]
(We said x, not θ.) Show as follows that this holds for nonzero θ in
(−π/2, π/2).
(a) Explain why it is good enough to show this for θ ∈ (0, π/2).
(b) Give a geometric proof of the (equivalent) inequality sin θ cos θ ≤
θ ≤ tan θ. (Hint: Draw the angle θ into the first quadrant of the
unit circle in the usual way. Then find right triangles whose areas
represent the left- and right-hand quantities above. What area does θ
represent?)
Otherwise, f is discontinuous at x = a.
Observe:
• In terms of limits: All those ǫ’s and δ’s suggest a lurking limit. Sure enough,
the definition boils down to this:
xn → a =⇒ f (xn ) → f (a).
This fact should sound reasonable in light of the “no surprises” property
just discussed. A formal proof can be based on Lemma 3.2, page 152.
EXAMPLE 1. Where is the function $f(x) = \frac{x^2 - 4x - 5}{x^2 - 25}$ continuous? Can any
discontinuities be “fixed”?
\[ \lim_{x\to a} f(x) = \lim_{x\to a} \frac{x^2 - 4x - 5}{x^2 - 25} = \frac{a^2 - 4a - 5}{a^2 - 25} = f(a). \]
Thanks to algebraic properties of limits (summarized in Theorem 3.3, page 154) the
answer is yes, except perhaps at a = ±5, where the denominator vanishes. So f is
continuous at all a ≠ ±5.
What if a = ±5? The short answer is that f is discontinuous at both points,
because both f (5) and f (−5) are undefined. On the other hand, factoring gives
\[ f(x) = \frac{(x - 5)(x + 1)}{(x - 5)(x + 5)} = \frac{x + 1}{x + 5} \]
for x ≠ ±5, from which it is easy to see (or prove, but we won’t bother) that
\[ \lim_{x\to 5} f(x) = \frac{6}{10}, \quad \text{but} \quad \lim_{x\to -5} f(x) \text{ does not exist.} \]
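The different behavior at the two bad points shows up clearly in a table of values. The following sketch is our own illustration with arbitrarily chosen step sizes; it evaluates the original (unsimplified) formula on both sides of 5 and of −5.

```python
def f(x):
    return (x**2 - 4 * x - 5) / (x**2 - 25)

# Near x = 5 the values settle down: a candidate value for "fixing" f at 5.
for h in (1e-1, 1e-3, 1e-5):
    print(f(5 - h), f(5 + h))

# Near x = -5 they blow up with opposite signs, so no value of f(-5) can help.
for h in (1e-1, 1e-3, 1e-5):
    print(f(-5 - h), f(-5 + h))
```

The first table suggests the value 6/10 at x = 5; the second suggests that nothing similar is possible at x = −5.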
SOLUTION. As its graph suggests (sketch your own), the sign function is discontinuous at a = 0,
because limx→0 sign(x) does not exist. But the sign function is continuous at all
a ≠ 0, because (as the graph also suggests) limx→a sign(x) = ±1 = sign(a).
Formal proofs are straightforward; see the exercises.
3.2. Continuous Functions 165
The function g is stranger, and its graph is much weirder than Figure 3.3
suggests. For example, neither “line” contains any unbroken segment, but each
“line” contains infinitely many points, so densely packed that between any two
points lie infinitely many more.
Bizarre as g seems, it is not hard to show that g is continuous at a = 0 but
discontinuous elsewhere. To show continuity at a = 0, we need to check that
limx→0 g(x) = g(0) = 0. This follows from the squeezing inequality
Since the left- and right-hand quantities clearly tend to zero, so must g(x).
For a ≠ 0, on the other hand, limx→a g(x) does not exist. One way to prove
this is to consider two sequences {rn } and {pn }, each converging to a, with
rn ∈ Q and pn ∉ Q for all n. (Such sequences do exist.) Then we have
The idea for continuity at a right endpoint is essentially the same. (We leave it to you.) In practice,
endpoint continuity comes up less often than the standard version—after all, an
interval has only two endpoints, but infinitely many points in between.
EXAMPLE 3. Check that the function $f(x) = \sqrt{x}$ is continuous at x = 0.
SOLUTION. We only need to show that $\lim_{x\to 0^+} \sqrt{x} = \sqrt{0} = 0$. Doing so is
straightforward. For any ε > 0, the choice δ = ε² works:
\[ \text{if } 0 < x < \delta, \text{ then } 0 < \sqrt{x} < \sqrt{\delta} = \epsilon, \]
as desired. ♦
Continuity of c and i at any input a amounts to nothing more than that limx→a c =
c and limx→a x = a; both claims are very easy to show. Proving continuity of
the remaining functions rigorously takes more effort—starting with clear defini-
tions of these functions. We omit details here, but see the exercises for further
discussion.
f ± g, f g, and f /g
Proof (s): Proving Proposition 3.11 boils down to verifying algebraic properties
of limits. We’ve done that already; see Theorem 3.3 (page 154) and its proof.
We’ll prove Proposition 3.12 in good ε–δ style. (The proof involves two δ’s,
chosen one after the other.) For simplicity we’ll assume
that f and g are defined on open intervals about b and a, respectively. To this end,
let ε > 0 be given. Since f is continuous at b, there exists δ1 > 0 such that
To finish the proof, note that this δ2 works for the original ǫ > 0:
About proofs. The general point is that elementary functions are combinations,
of the types treated in Propositions 3.11 and 3.12, of the continuous functions
discussed in Proposition 3.10. For instance, writing the polynomial p(x) = 3x2 +
πx − 7 in the form
3x2 + πx − 7 = 3 · x · x + π · x − 7
shows explicitly that p is built by multiplication and addition from constant func-
tions and the function i(x), both of which are continuous according to Proposi-
tion 3.10.
Continuity of all six basic trigonometric functions on their domains follows
from continuity of the sine function and such identities as
\[ \cos x = \sin(x + \pi/2); \quad \tan x = \frac{\sin x}{\cos x}; \quad \sec x = \frac{1}{\cos x}. \]
In a similar way, the functions
3x2 + 5x − 7 cos x
r(x) = and g(x) = arctan ln
x2 − 1 3x + 1
Exercises
1. It’s no surprise that the identity function f (x) = x is continuous at every
domain point x = a. Show this carefully using Definition 3.7, page 162.
3. This problem explores the ǫ–δ definition of continuity for the function
f (x) = 345x + 678.
(a) Let a = 3 in the notation of Definition 3.7, page 162, and set ǫ =
0.001. Find a value of δ, as large as possible, that works with this ǫ.
(b) Let a = 3, as above, and fix any ǫ > 0. Find a value of δ, as large
as possible, that works with this ǫ. Conclude that f is continuous at
x = 3.
(c) Let a be any number and ǫ > 0 any positive number. Find a value
of δ that works in this situation. Conclude that f is continuous at all
points a ∈ R.
10. Suppose {xn } is a sequence such that xn → 0. Explain using results of this
section why cos(xn ) sin(xn ) → 0.
11. Suppose {xn } is a sequence such that xn → 0. Explain using results of this
section why cos(sin(xn )) → 1.
12. We said in Example 2, page 164, that the sign function is discontinuous at
a = 0 but continuous everywhere else. Prove this in two parts:
(a) limx→0 sign(x) does not exist;
(b) limx→a sign(x) = sign(a) if a ≠ 0.
13. Suppose f : R → R is continuous at x = 0 and f (1/n) > 0 for all n.
Show that f (0) ≥ 0.
14. Suppose f (3) = 5 and f is continuous at 3. Show that there exists δ > 0
such that 4.999 < f (x) < 5.001 for x ∈ (3 − δ, 3 + δ).
15. Suppose f (3) > 5 and f is continuous at 3. Show that there exists δ > 0
such that f (x) > 5 for x ∈ (3 − δ, 3 + δ).
16. Let f be the function whose graph consists of the two line segments joining
(0, 0), (1, 1), and (2, −1). Show that f is continuous at x = 0 and at x = 1.
(In fact, f is continuous on all of [0, 2].)
17. A function f has a removable discontinuity at a point x = a if (i) f is dis-
continuous at a; but (ii) we can define (or redefine) f (a) so that f becomes
continuous at x = a. In each part, decide whether the function has a re-
movable discontinuity at the given point. If so, explain how to remove it. If
not, why not?
(a) $f(x) = \frac{1}{x}$; a = 0.
(b) $f(x) = \frac{x^2 - 4}{x + 2}$; a = −2.
(c) $f(x) = x \sin\frac{1}{x}$; a = 0.
18. If f : I → R and g : I → R are any functions defined on I, then
we can form new functions max{f, g} and min{f, g} in the “obvious”
way: max{f, g}(x) = max{f (x), g(x)} for all x ∈ I, and similarly for
min{f, g}.
(a) Sketch graphs of max{f, g} and min{f, g} if f (x) = sin x and g(x) =
cos x on I = [−2π, 2π].
(b) Show that
\[ \max\{f(x), g(x)\} = \frac{|f(x) - g(x)| + f(x) + g(x)}{2}. \]
Find a similar “formula” for min{f (x), g(x)}.
(c) Show that if f and g are continuous on I, then so are max{f, g} and
min{f, g}. (Hint: Use (b) and the fact that a(x) = |x| is continuous.)
f (x) = |x| is continuous at every domain point a ∈ R.
Sticky inequalities. Our first such property says, roughly, that if a continuous
function satisfies a strict inequality at a point, then the same inequality holds
(“persists”) for inputs near that point. (We made up the fancy name.)
Proof: For simplicity, we handle the case where c is not an endpoint of I. The
case where c is an endpoint is similar.
Supposing, then, that f (c) < K, we set ǫ = K − f (c) > 0, and choose δ > 0
according to the definition of continuity of f at c. This δ satisfies our present
claim, because if |x − c| < δ, then
In particular, we have
as desired.
Proof: We will show that any function f with the given properties is bounded
above; the proof for lower bounds is similar. Suppose, toward contradiction, that
f is unbounded above. Because 1 is not an upper bound, there must be some
x1 ∈ [a, b] with f (x1 ) > 1. Because two is also not an upper bound, there is
x2 ∈ [a, b] with f (x2 ) > 2. Continuing this process, we can construct a sequence
Now the sequence {xn } is bounded (below by a and above by b) and so, by the Bolzano–Weierstrass theorem,
has a subsequence {xnk } that converges to some limit x0 ∈ [a, b].
Here comes our contradiction. Because f is continuous at x0 and xnk → x0 ,
we must have f (xnk ) → f (x0 ), too. This is impossible: For all k we have
f (xnk ) > nk ≥ k.
This means that the sequence {f (xnk )} is unbounded, and hence divergent.
Proposition 3.15 applies, by the way, to the three functions f , g, and h discussed
above—as long as we restrict domains to closed intervals, such as [0.002, 0.9998],
on which all three functions are continuous. (On this interval, −500 < h(x) <
500, for instance.)
Theorem 3.16 (Intermediate value theorem (IVT)). Let f be continuous on [a, b],
with f (a) ≠ f (b); let v be any number between f (a) and f (b) (i.e., f (a) < v <
f (b) or f (b) < v < f (a)). Then there exists c in (a, b) with f (c) = v.
Proof: We discuss the case f (a) < v < f (b); the argument for the case f (b) <
v < f (a) is similar.
Consider the set S = {x ∈ [a, b] | f (x) < v}. Because S is nonempty (a ∈ S, for instance) and
bounded above (by b, for instance), it has a least upper bound, say c, somewhere in [a, b]. We’ll
show that c has all the properties claimed in the theorem.
Our first claim is that f (c) ≤ v. This is trivial if c ∈ S. If c ∉ S, then there is
a sequence {sn } of members of S with sn → c. Because f is continuous at c we
must have f (sn ) → f (c), and since f (sn ) < v for all n it follows that f (c) ≤ v.
We see, too, that c ≠ b, since we know f (b) > v.
To complete the proof, we show that f (c) < v is impossible; the PoPI (that’s
Proposition 3.14, page 172) is the
key. Indeed, if f (c) < v holds, then the PoPI says that f (x) < v must also hold
for x in some small interval (c − δ, c + δ). This is absurd—we chose c so that
f (x) ≥ v for all x > c. Thus f (c) = v, as desired.
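The IVT guarantees that a suitable c exists but does not say how to find it. One common numerical approach is bisection, sketched below. This is a different technique from the least-upper-bound argument in the proof above, and the function name `bisect`, the tolerance, and the test function are all hypothetical choices of ours.

```python
def bisect(f, a, b, v, tol=1e-10):
    """Approximate some c in (a, b) with f(c) = v, assuming f is continuous
    on [a, b] and v lies strictly between f(a) and f(b)."""
    lo, hi = a, b
    increasing = f(a) < v < f(b)
    if not increasing and not (f(b) < v < f(a)):
        raise ValueError("v must lie strictly between f(a) and f(b)")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # Keep the half-interval on which f still straddles v.
        if (f(mid) < v) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical example: f(x) = x^3 - x on [1, 2], with f(1) = 0 < 2 < 6 = f(2).
c = bisect(lambda x: x**3 - x, 1, 2, 2)
print(c, c**3 - c)
```

Each step halves the interval while keeping the value v trapped between the function values at its ends, so continuity is exactly what makes the method trustworthy. For a discontinuous function such as the sign function, the interval would still shrink, but the point it closes in on need not satisfy f(c) = v.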
3.3. Why Continuity Matters: Value Theorems 175
Theorem 3.17 (Extreme value theorem (EVT)). If f is continuous on [a, b], then
f assumes both a maximum and a minimum value on [a, b]. That is, there exist
xmin and xmax in [a, b] such that
f (xmin ) ≤ f (x) ≤ f (xmax ) for all x ∈ [a, b].
Notice especially the important words “assumes,” “maximum,” and “minimum”:
The theorem guarantees that f achieves, not just approaches, biggest and smallest
values on [a, b].
Proof: Consider the output set (we defined “range” in Section 1.4)
R = range(f ) = {f (x) | x ∈ [a, b]} .
Proposition 3.15 says that R is a bounded set of real numbers. Since R is also
nonempty, the completeness axiom guarantees the existence of an infimum α and
a supremum β. To finish the proof, we need only show that α and β are members
of the range. With help—again—from Bolzano and Weierstrass we’ll handle the
case for β. (The case for α is almost identical.)
Recall first that, since β = sup(R), there is a sequence {yn } contained in
R with yn → β. (If β ∈ R, there is no harm done—we can just take yn = β
for all n.) Now for each yn there is at least one xn in [a, b] with f (xn ) = yn .
Choosing any one of these for each n produces a new sequence {xn } such that,
for all n,
a ≤ xn ≤ b and f (xn ) = yn .
The sequence {xn } itself need not converge, but the Bolzano–Weierstrass theo-
rem guarantees that some subsequence {xnk } converges, to a limit we’ll call xmax ;
clearly, xmax ∈ [a, b]. Since f is continuous at xmax and xnk → xmax , we must
have
\[ f(x_{\max}) = \lim_{k\to\infty} f(x_{n_k}) = \lim_{k\to\infty} y_{n_k} = \lim_{n\to\infty} y_n = \beta, \]
as claimed.
Bad values. Both the IVT and the EVT may fail, of course, if important hy-
potheses are violated. For example, the function f (x) = 1/x is continuous on the
open interval I = (0, 1), but assumes neither a maximum nor a minimum value
(for different reasons) on I. And the sign function (see Example 2, page 164), defined but discontinuous
on the closed interval [−10, 10], assumes only one intermediate value (which one?) between
f (−10) = −1 and f (10) = 1. (On the other hand, the sign function does assume
maximum and minimum values.)
Odd-degree polynomials have real roots. A number r is a root (or a zero) of a
function f if f (r) = 0. (“Root” seems less ambiguous; we’ll use it.) Thus, p(x) = x^2 − 1 has roots ±1, q(x) = x^2 + 1 has no
real roots, and
has roots 1, 2, 3, 4, and 5. Deciding whether an arbitrary function has any roots, let
alone finding them, can be challenging, but the IVT offers some encouragement
for a large class of polynomials.
of odd degree (an ≠ 0 and n is odd) has at least one real root.
and so p(x) must assume both positive and negative values. If, say, p(x) = x^3 −
7x^2 + 5x + 3, then p(0) = 3 while p(−1) = −10, so p must have a root in the
interval (−1, 0). (Try plotting p.) We leave a more formal proof to the exercises.
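The IVT argument can be turned into a computation. The following is a minimal Python sketch, not part of the text's own development: it applies repeated bisection to the cubic p above, keeping at each step the half-interval on which p still changes sign. The function name bisect_root and the tolerance are illustrative choices.

```python
def p(x):
    """The cubic from the text: p(x) = x^3 - 7x^2 + 5x + 3."""
    return x**3 - 7 * x**2 + 5 * x + 3


def bisect_root(f, lo, hi, tol=1e-12):
    """Approximate a root of a continuous f on [lo, hi].

    Assumes f(lo) and f(hi) have opposite signs (or one of them is zero),
    so the IVT guarantees a root somewhere in [lo, hi].
    """
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_mid == 0:
            return mid
        # Keep the half on which the sign change persists.
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2


root = bisect_root(p, -1.0, 0.0)
print(root, p(root))
```

Bisection only locates a root whose existence the IVT already guarantees; it proves nothing on its own.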
EXAMPLE 1. Show that if f : [0, 1] → [0, 1] is continuous on [0, 1], then f has
a fixed point.
SOLUTION. Note a crucial hypothesis about the codomain: 0 ≤ f (x) ≤ 1 for
all x ∈ [0, 1]. (See it in the notation?) This means, geometrically, that the graph of f stays inside the
square window [0, 1] × [0, 1] in the xy-plane. Our problem amounts to showing
that this graph touches or crosses the line y = x at least once. (Sketch for yourself.) This may seem
obvious, but we want proof, and the IVT will help.
The trick is to look at the new function g(x) = f (x) − x; note that g is also
continuous on [0, 1]. Note also that
(i) g(0) = f (0) ≥ 0; (ii) g(1) = f (1) − 1 ≤ 0.
If either g(0) = 0 or g(1) = 0, we’re done already, so we may as well assume
that both (i) and (ii) are strict inequalities. In this case, the IVT guarantees that
g(x) = 0 holds for some x ∈ (0, 1), as we aimed to show. ♦
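As a numerical companion to this argument, here is a small Python sketch that hunts for a fixed point by bisecting g(x) = f(x) − x on [0, 1]. The choice f = cos is an illustrative assumption (cos maps [0, 1] into [0, 1]); the helper name fixed_point is hypothetical.

```python
import math


def fixed_point(f, tol=1e-12):
    """Approximate a fixed point of a continuous f: [0, 1] -> [0, 1].

    Bisects g(x) = f(x) - x, using g(0) >= 0 and g(1) <= 0 as in the solution.
    """
    def g(x):
        return f(x) - x

    lo, hi = 0.0, 1.0
    if g(lo) == 0:
        return lo
    if g(hi) == 0:
        return hi
    # From here on g(lo) > 0 > g(hi): the strict case of (i) and (ii).
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


x_star = fixed_point(math.cos)
print(x_star, math.cos(x_star))
```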
With due care taken for domains, the inverse relationship means that f (a) = b if
and only if g(b) = a. For the last pair, for instance, we have f (1.2) = tan 1.2 ≈
2.57215, while g(2.57215) = arctan(2.57215) ≈ 1.2.
It is natural to hope in this situation that if either of f and g is continuous,
then so is its “partner.” (Knowing this might be practically useful—sometimes
it is easier to show directly that one rather than the other is continuous.) The
following proposition addresses this situation.
Using the IVT. The claim should seem plausible; we’ll leave its proof as an exercise. Assuming
the claim, we will handle the case where f is strictly increasing; the case for
decreasing f is similar.
Let b ∈ J be given. Then g(b) = a for some a ∈ I; note that also f (a) = b.
To see that g is continuous at b, let ε > 0 be given. Since I is an open interval
and a ∈ I, we may as well assume that the small closed interval [a − ε, a + ε] is
contained in I. (If necessary, we can always take ε smaller.) Because f is increasing, we know that f (a − ε) < f (a) = b <
f (a + ε). To complete the proof, we now choose any δ > 0 so that
To see that this δ works for the original ε, consider any y in J with
Applying the strictly increasing function g to all parts of this inequality, we get
Exercises
1. Assume in both parts of this problem that f : [0, 2] → R is continuous on
[0, 2].
(a) Describe (by graph or formula) such a function f where f (0) is the
minimum value and f (1) is the maximum value of f on [0, 2].
(b) Suppose that f attains a maximum value at x = 1. Show that f
is not one-to-one.
3. This problem revisits Proposition 3.19, page 176. Throughout, let p(x) =
x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0 , where n is odd.
(a) What, if anything, does the EVT guarantee about maximum and min-
imum values of a continuous function f : R → R defined on all of R?
(b) No odd-degree polynomial achieves a maximum or a minimum value
on R. Explain why. (See an earlier problem.)
(c) Every even-degree polynomial q(x) achieves either a maximum or a
minimum value on R. Explain why. (Hint: This is just a little harder.
Note that either (i) q(x) → ∞ as x → ±∞, or (ii) q(x) → −∞
as x → ±∞. Show that q achieves a minimum in case (i) and a
maximum in case (ii).)
180 3. Limits and Continuity
(a) Does p assume a minimum on [0, 3]? If so, what is it? What does the
EVT say?
(b) Does q assume a minimum on (0, 3]? If so, what is it? What does the
EVT say?
(c) We said, but didn’t prove, in Example 2 that the maximum value of p
on [0, 3] is p(1) = 4. Prove it now, in two steps:
(i) Factor p(a) − 4 (one factor is a − 1).
(ii) Use your factorization to explain why p(a) − 4 ≤ 0 when a ∈
[0, 3].
9. In each part following, either give an example or explain why none can
exist.
11. Let f : [a, b] → R be continuous on [a, b]. Show that if f attains either its
maximum or its minimum value at an interior point c, then f is not one-to-
one.
12. Let I be an interval (open or closed) and suppose that f : I → R is contin-
uous and one-to-one. Show that f is strictly monotone.
13. Suppose that f : R → R is continuous and one-to-one, and that f (1) >
f (0). Show that f (2) > f (1).
14. The Principle of Persistent Inequalities (Proposition 3.14, page 172) describes
a nice property of continuous functions. Here we investigate two
“one-sided” versions of the PoPI.
A function f satisfies the PoPILTV (the “less-than” version of the PoPI) at
x = c if, for every number K with f (c) < K, there exists δ > 0 such that
f (x) < K whenever x ∈ (c − δ, c + δ). The PoPIGTV (the “greater-than”
version) is the same, except that the condition f (x) < K is replaced by
f (x) > K.
(a) Suppose f is continuous at x = c. The ordinary PoPI says that f
satisfies the PoPILTV. Show that f also satisfies the PoPIGTV.
(b) Suppose g(x) = 1 if x > 0 and g(x) = 0 otherwise. Show that g
satisfies the PoPIGTV but not the PoPILTV at x = 0.
(c) Suppose h(x) = 1 if x = 0 and h(x) = 0 otherwise. Which of the
PoPIGTV and the PoPILTV does h satisfy at x = 0?
(d) Show that a function f : R → R is continuous at x = c if and only if
f satisfies both the PoPIGTV and the PoPILTV at x = c.
15. Let f : R → R be a continuous function such that f (x) is a rational number
for every real number input x. Show that f must be a constant function.
16. Consider a function f : R → R and a domain point c ∈ R. We say that f is
locally bounded at x = c if there is some δ > 0 and some M > 0 such that
|f (x)| < M whenever x ∈ (c − δ, c + δ). (In other words, f is bounded on
some open interval centered at c.)
(a) Show that f (x) = x^2 is locally bounded at x = 42. (Find specific
values of δ and M ; there are many possibilities.)
(b) Show that if f is continuous at c, then f is locally bounded at c.
(c) Find a function f : R → R that is locally bounded but not continuous
at x = 0. Explain why your example works.
(d) Find a function f : R → R that is not locally bounded at x = 0.
Explain why your example works. (Be sure your example is defined
for all inputs x.)
What’s new here? What’s wrong with ordinary continuity, over which we’ve
worked hard? We’ll address these good questions right after some basic examples.
SOLUTION. Yes, and yes, and the second yes follows immediately from the
first. If any f is uniformly continuous on a set S, then f is automatically uniformly
continuous on any smaller set S′ ⊂ S.
Informally speaking, the first “yes” applies because f is “steeper” at x = 42
than anywhere else on I = [−3, 42]. Therefore, any δ that “works” for a given ε
at x = 42 will also work elsewhere in I, where the graph is less steep.
Before starting the formal proof, note that for any x and y we have
|f (x) − f (y)| = |x^2 − y^2| = |x − y| |x + y| .
Now consider the last factor, |x + y|: For x and y in I = [−3, 42] we have
|x + y| ≤ |x| + |y| ≤ 42 + 42 = 84.
Putting the pieces together produces a key inequality: |f (x) − f (y)| ≤ 84 |x − y|
for all x and y in I. This done, we’re ready for our (very short!) formal proof.
Let ε > 0 be given. Set δ = ε/84. This δ works, since if x, y ∈ I and
|x − y| < δ, then
|f (x) − f (y)| = |x − y| |x + y| ≤ 84 |x − y| < 84δ = ε,
as desired. ♦
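The key inequality and the choice δ = ε/84 can be spot-checked numerically. The Python sketch below is a sanity check on random samples, not a proof; the sample size, the seed, and the value ε = 0.5 are arbitrary illustrative choices.

```python
import random


def f(x):
    return x * x


random.seed(0)
LO, HI = -3.0, 42.0        # the interval I = [-3, 42]
eps = 0.5                  # an arbitrary sample epsilon
delta = eps / 84           # the delta chosen in the proof

violations = 0             # counts failures of |f(x) - f(y)| <= 84 |x - y|
worst_gap = 0.0            # largest observed |f(x) - f(y)| with |x - y| <= delta
for _ in range(100_000):
    x = random.uniform(LO, HI)
    # Pick y within delta of x, clipped so that y stays in I.
    y = min(HI, max(LO, x + random.uniform(-delta, delta)))
    gap = abs(f(x) - f(y))
    if gap > 84 * abs(x - y) + 1e-9:   # small slack for rounding error
        violations += 1
    worst_gap = max(worst_gap, gap)

print(violations, worst_gap, eps)
```

The same δ is used at every sampled x, which is exactly what "uniform" asks for.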
We will exploit this view later, when we study integrals. This basic idea—that
f “preserves closeness”—translates naturally to Cauchy sequences, which are all
about closeness.
Proof: Let ε > 0 be given. By hypothesis there is some δ > 0 that works in the
sense of uniform continuity of f on I. Because {xn } is a Cauchy sequence, we
can choose N so that (yes, we mean δ, not ε)
as required.
EXAMPLE 3. Let f (x) = 1/x. Use Proposition 3.23 to show that f is not
uniformly continuous on I = (0, ∞). Is f uniformly continuous on [1, ∞)?
SOLUTION. We could use the definition to show directly that f is not uniformly
continuous (see the exercises), but using Proposition 3.23 is shorter. We can sim-
ply observe that the sequence {1/n} is contained in I and Cauchy, while the
output sequence {f (xn )} = {n} is surely not Cauchy.
On the interval [1, ∞), by contrast, f is uniformly continuous. To see this,
note that for x, y ∈ [1, ∞) we have
\[ |f(x) - f(y)| = \left|\frac{1}{x} - \frac{1}{y}\right| = \frac{|y - x|}{xy} \le |y - x|. \]
This implies that, for any ε > 0, the choice δ = ε works in the definition of
uniform continuity. ♦
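The contrast between the two intervals shows up clearly in exact arithmetic. In this Python sketch (an illustration, using the standard fractions module), consecutive terms of {1/n} draw together while their images under f stay exactly 1 apart; the sample values of n are arbitrary.

```python
from fractions import Fraction


def f(x):
    return 1 / x


# x_n = 1/n is Cauchy in (0, oo): consecutive terms get arbitrarily close,
# yet their images f(x_n) = n stay a full unit apart.
for n in (10, 100, 1000):
    x, y = Fraction(1, n), Fraction(1, n + 1)
    input_gap = abs(x - y)           # equals 1 / (n (n + 1))
    output_gap = abs(f(x) - f(y))    # equals |n - (n + 1)| = 1
    print(n, float(input_gap), output_gap)

# On [1, oo), by contrast, |f(x) - f(y)| <= |x - y|; one sample pair:
u, v = Fraction(3, 2), Fraction(7, 4)
print(abs(f(u) - f(v)) <= abs(u - v))
```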
Exercises
1. Use Definition 3.21 (not theorems) in each part.
(a) f (x) = 5 is uniformly continuous on R.
(b) g(x) = 2x + 7 is uniformly continuous on R.
2. Use Definition 3.21 in each part.
(a) Assume in this part that, for all x and y, |f (x) − f (y)| ≤ |x − y|.
Show that, for any ε > 0, the value δ = ε works in Definition 3.21.
(b) Show that, indeed, |f (x) − f (y)| ≤ |x − y| for all x and y. (This part
involves “only” algebra; no ε or δ is needed.)
(b) l(x) = 1/x is not uniformly continuous on (0, 1). Hint: Let ε = 1
and show that no suitable δ can be chosen.
5. Theorem 3.22 implies that f (x) = 5/x + 3 is uniformly continuous on the
interval [1/4, 10]. Give an ε–δ proof of this fact.
(a) Show that the function g(x) = −137f (x) is uniformly continuous on
I.
(b) Show that the function h(x) = |f (x)| is uniformly continuous on I.
8. Let I and J be any intervals, with I ⊆ J, and let f be a function. Show that
if f is uniformly continuous on J, then f is uniformly continuous on I, too.
(Note: The claim is pretty obvious; the point is to give a definition-based
proof.)
9. Use Definition 3.21 to show in two steps that f (x) = √x is uniformly
continuous on [0, ∞).
(a) Show that if 0 ≤ y ≤ x, then √x − √y ≤ √(x − y).
(b) Use the preceding part to finish the problem.
16. In each part, either use Definition 3.21 or cite appropriate theorems or
propositions.
18. Claim: Suppose f is uniformly continuous on the half-open interval (0, 1].
Then we can “extend f ” (i.e., define f (0)) to be uniformly continuous on
the closed interval [0, 1].
Prove this in two steps:
(a) Set f (0) = limn→∞ f (1/n). Explain why this makes sense—i.e.,
explain why the limit exists. (Hint: The sequence 1, 1/2, 1/3, . . . is
Cauchy; apply Proposition 3.23, page 185.)
(b) (This is a little harder.) To show that f is uniformly continuous on
all of [0, 1], let ε > 0. Set ε′ = ε/2, and choose δ > 0 that “works”
for ε′ in the definition of uniform continuity of f on (0, 1]. Show
that this δ works for ε in the definition of uniform continuity of f
on [0, 1]. (Hint: It is enough to show that |f (x) − f (0)| < ε when
|x − 0| < δ. To do this, choose any n0 such that 0 < 1/n0 < x
and |f (1/n0 ) − f (0)| < ε/2. (Why is this possible?) Then use the
triangle inequality.)
No coffee—but plenty of points and sets. Coffee cups and bagels aren’t seen
in real analysis. As rough analogues, however, consider two open intervals, say
I = (0, 1) and J = (−13, 42). In obvious ways I can be stretched (“dilated”)
and slid (“translated”), without tearing or cutting, to coincide with J, and vice
versa. In this sense I and J are topologically “the same”. The closed interval
K = [0, 1], by contrast, is genuinely “different” from I and J: It contains left and
right endpoints that can’t be created or destroyed by stretching and sliding.
We’ve already seen that closed and open intervals like [0, 1] and (0, 1) have
quite different properties in real analysis. The extreme value theorem (Theorem 3.17, page 175), for instance,
says that every continuous function assumes maximum and minimum values
on [0, 1]. No such guarantee holds on the interval (0, 1); a function continuous
there need not even be bounded. (Consider f (x) = 1/x, for instance.) In this sense the sets [0, 1] and (0, 1) are
topologically different.
Individual points can also have differing topological properties with respect
to sets. The points x1 = 0.5 and x2 = 0.6 are both interior to the set [0, 1], and
in this sense “the same”. The point x3 = 1, by contrast, is on the boundary of the
set [0, 1]: every open interval containing x3, say (0.999, 1.001), contains points both inside and
outside [0, 1].
Open and closed sets. Open and closed sets are the most important objects in
topology. Among open sets, the most familiar are open intervals, like (0, 1). We’ve
used open intervals freely throughout this book, and they’re the basis (a word used more
formally in a theoretical development of topology) for two more
general definitions:
Definition 3.25. A set U ⊆ R is called open if U is the union of any collection of
open intervals. A set A ⊆ R is called closed if its complement R \ A is open.
We can think of the interval Nε (x) as “elbow room.” For each element x,
U holds not just x itself but also tiny left and right “ε-elbows” on either
side of x. We see, for instance, that S = (0, 1] is not open: 1 ∈ S but
no “right ε-elbow” of 1 fits into S. As another example, we see that the
set R \ Z is open: every non-integer, say x = 2.93, has some integer-free
ε-neighborhood, say (2.92, 2.94) (here ε = 0.01). We see, too, that the set Z is closed.
• Not open, not closed: Many familiar subsets of R are neither open nor closed.
The sets Q and R \ Q are good examples. We’ve seen that every open in-
terval contains both rationals and irrationals. Equivalently, neither Q nor
R \ Q contains any intervals; clearly, neither can be open.
Some examples will expand on these definitions and suggest new definitions,
some of which we pursue in exercises.
SOLUTION. Every finite set is closed. To see why, suppose without loss of generality
that s1 < s2 < · · · < sn . (If not, we can reorder and rename the elements.) Now we can write
R \ S = (−∞, s1 ) ∪ (s1 , s2 ) ∪ (s2 , s3 ) ∪ · · · ∪ (sn , ∞),
which shows that R \ S is open.
No nonempty finite or even countably infinite set is open: Open sets are unions
of open intervals, and intervals are uncountable sets. (See Section 1.6 for more on
countability.) ♦
We see in the next example that a countably infinite set may or may not be
closed.
EXAMPLE 2. Let
\[ S = \left\{1, \tfrac12, \tfrac13, \tfrac14, \dots\right\} \quad\text{and}\quad T = \left\{0, 1, \tfrac12, \tfrac13, \tfrac14, \dots\right\}. \]
Is either S or T open or closed?
EXAMPLE 3. Let U1 and U2 be open sets, and A1 and A2 closed sets. What can
be said about unions and intersections?
SOLUTION. We’ll see that both U1 ∪ U2 and U1 ∩ U2 are open; it follows that
both A1 ∪ A2 and A1 ∩ A2 are closed. (Do you see why? De Morgan can help.)
To see why U1 ∪ U2 is open recall that U1 and U2 are themselves unions of
intervals. Thus U1 ∪ U2 is itself a union—the union of all the open intervals that
make up either U1 or U2 .
To see why U1 ∩ U2 is open we use ε-neighborhoods. Suppose x ∈ U1 ∩ U2 .
Since x ∈ U1 there is some ε1 > 0 with x ∈ Nε1 (x) ⊆ U1 . Similarly, there is
ε2 > 0 with x ∈ Nε2 (x) ⊆ U2 . Then, for ε = min{ε1 , ε2 } we have Nε (x) ⊆
U1 ∩ U2 . ♦
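The "take the smaller ε" step is easy to see in a concrete case. The Python sketch below assumes, for illustration only, that U1 = (0, 3) and U2 = (2, 5) are single open intervals and that x = 9/4; the helper elbow_room is a hypothetical name.

```python
from fractions import Fraction


def elbow_room(x, interval):
    """Largest eps with N_eps(x) inside the open interval (a, b); assumes a < x < b."""
    a, b = interval
    return min(x - a, b - x)


U1 = (Fraction(0), Fraction(3))
U2 = (Fraction(2), Fraction(5))
x = Fraction(9, 4)                    # a point of U1 ∩ U2

eps1 = elbow_room(x, U1)              # room available inside U1
eps2 = elbow_room(x, U2)              # room available inside U2
eps = min(eps1, eps2)                 # works for both sets at once
neighborhood = (x - eps, x + eps)
print(eps1, eps2, eps, neighborhood)
```

With infinitely many open sets the corresponding ε's might have no positive minimum, which is why Proposition 3.26 restricts intersections to finite collections.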
Example 3 points to more general results:
Proposition 3.26. Unions and intersections of open and closed sets behave topo-
logically as follows:
(a) The union of any collection of open sets is open. The intersection of any finite
collection of open sets is open.
(b) The intersection of any collection of closed sets is closed. The union of any
finite collection of closed sets is closed.
3.5. Topology of the Real Numbers 193
Proof: The first assertion in (a) can be proved exactly as for unions of two open
sets. We proved the second part in the preceding example. Part (a) implies (b) via
De Morgan’s laws.
Topology and sequences. Topology words and ideas are useful in studying se-
quences.
EXAMPLE 4. Let {an } be any sequence that converges to a, and U any open set
with a ∈ U . Then an ∈ U for all but finitely many n.
Proof: If not, then a ∈ R \ A, an open set. Hence, for some ε > 0 we have
a ∈ Nε (a) ⊆ R \ A. This means that Nε (a) contains none of the an , which
contradicts the claim in Example 4.
Limits and limit points. A point a is defined to be a limit point of a set A if every
open neighborhood Nε (a) contains at least one element of A other than a.
(a) Let A = (0, 1]. All points of A are limit points. The point a = 0 is also a
limit point of A, even though 0 ∉ A. No other point in R is a limit point. For
the point a = 1.001, for example, the neighborhood N0.0001 (a) misses the
set A entirely.
(b) Let S = {1, 1/2, 1/3, 1/4, . . .}, as in Example 2. No point of S is a limit point:
every point of S has a (perhaps tiny) ε-neighborhood that contains no other
point of S. The point 0, although not in S, is a limit point of S, since every
set Nε (0) contains infinitely many points of S.
(c) For the set Q of rational numbers, every real number is a limit point.
SOLUTION. We did half the work just above. Unpacking the notation a bit gives
Of the last two sets on the right, the first is open by Example 6; the second is open
by essentially the same argument. As the intersection of two open sets, f −1 (U )
is open, too. ♦
(c) If U = (−42, 3) then f −1 (U ) = f −1 ([0, 3)) = (−√3, √3).
(d) If U = {1, 2, 3}, then f −1 (U ) = {±1, ±√2, ±√3}.
♦
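Inverse images of open intervals can be computed mechanically. The Python sketch below assumes that the function in question is f(x) = x^2, as parts (c) and (d) suggest; the function name preimage_of_interval is an illustrative choice.

```python
import math


def preimage_of_interval(a, b):
    """Return f^{-1}((a, b)) for f(x) = x^2 as a list of disjoint open intervals."""
    if a >= b or b <= 0:
        return []                                   # no x has x^2 in (a, b)
    outer = math.sqrt(b)
    if a < 0:
        return [(-outer, outer)]                    # exactly the x with x^2 < b
    inner = math.sqrt(a)
    return [(-outer, -inner), (inner, outer)]       # the x with a < x^2 < b


print(preimage_of_interval(-42, 3))   # one interval, as in part (c)
print(preimage_of_interval(1, 4))     # two intervals
print(preimage_of_interval(-5, -1))   # empty
```

In every case the result is a union of open intervals (possibly empty), so it is open.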
The pattern suggested above (inverse images of open sets are open, and similarly
for closed sets) is no accident. Proposition 3.29 gives a brief but general
description of continuity in topological language. (Proposition 3.29 can be used to
define continuity.)
Proposition 3.29. A function f : R → R is continuous on R if and only if f −1 (U )
is open whenever U ⊆ R is open.
Proof: For the “if” part, fix a ∈ R. For any ε > 0, we can let U be
the open interval (f (a) − ε, f (a) + ε). Then a ∈ f −1 (U ), an open set by our
hypothesis, and so there is δ > 0 such that (a − δ, a + δ) ⊆ f −1 (U ). Unraveling
these notations shows that this is just another way of saying that |f (x) − f (a)| <
ε when |x − a| < δ. This is precisely the ε–δ condition for continuity of f at
x = a, so the “if” part is proved.
Proving the “only if” part is similar. Here’s a sketch: Suppose U ⊆ R is open.
If f −1 (U ) = ∅, we’re done already. If not, and a ∈ f −1 (U ), then f (a) ∈ U , and
since U is open we can choose ε > 0 with Nε ( f (a) ) ⊆ U . By continuity of f at
x = a we can choose δ > 0 with Nδ (a) ⊆ f −1 (U ). Hence f −1 (U ) is open as
claimed. (Further details are left as an exercise.)
Open intervals are good enough. The condition in Proposition 3.29 can be re-
laxed a little: f is continuous if and only if f −1 (I) is open for every open interval
I = (a, b). This is because, roughly speaking, inverse images “play well” with
unions: the inverse image of a union is the corresponding union of inverse images. (See the exercises for more on
this.)
R is a “metric space”. A metric space is a set X equipped with a reasonable
notion of distance, known more formally as a metric. In real analysis the absolute
value plays this role: For any numbers x and a, |x − a| is the distance from x to
a, and the inequality |x − a| < ε means that x is “within distance ε” of a.
The following abstract definition makes more general sense.
Definition 3.30. Let X be any set and d : X × X → R a function. We say d is a
metric on X if the following conditions hold for all x, y, and z in X:
(i) d(x, y) ≥ 0, and d(x, y) = 0 only when x = y (aka, positivity)
(ii) d(x, y) = d(y, x) (aka, symmetry)
(iii) d(x, z) ≤ d(x, y) + d(y, z) (aka, triangle inequality)
All parts of the definition are easy to check in our most familiar case, X = R and
d(x, y) = |x − y|; the metric triangle inequality, in particular, is a long-familiar
property of the absolute value.
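The three conditions of Definition 3.30 can be spot-checked on a finite sample of points. The Python sketch below is only a sanity check, not a proof; the function squared is a hypothetical non-example, added here to show how the triangle inequality can fail.

```python
import itertools


def usual(x, y):
    """The familiar metric on R."""
    return abs(x - y)


def squared(x, y):
    """A hypothetical non-example: squared distance."""
    return (x - y) ** 2


def looks_like_metric(d, points):
    """Check conditions (i)-(iii) of Definition 3.30 on every triple from a finite sample."""
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, y) < 0 or (d(x, y) == 0) != (x == y):
            return False            # positivity fails
        if d(x, y) != d(y, x):
            return False            # symmetry fails
        if d(x, z) > d(x, y) + d(y, z):
            return False            # triangle inequality fails
    return True


sample = [-3, -1, 0, 1, 2, 5]       # integers keep the arithmetic exact
print(looks_like_metric(usual, sample))
print(looks_like_metric(squared, sample))
```

For squared, the triple x = 0, y = 1, z = 2 gives d(x, z) = 4 but d(x, y) + d(y, z) = 2.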
as the open ball of radius ε about x = a. When d(x, y) = |x − y|, this “ball”
is really just an interval of radius ε, the set we’ve called an ε-neighborhood, and
denoted Nε (a). If X is a higher-dimensional set, like R^2 or R^3 , with appropriate
metrics, a “ball” might resemble a disk or a child’s notion of a ball.
Building a metric space. Given any set X and any metric d, we can create
a useful topology on X, called a metric space topology, by defining open sets
to be unions of open balls. Indeed, we did exactly that earlier in this section: real
open intervals are actually open balls, and so our earlier definition is in fact the
metric definition. Metric spaces are in some sense “nice,” and this happy state is
sometimes summarized by saying that R with its “usual” topology is a metric
space. We explore this idea further in exercises.
Exercises
1. We said in Example 3 that because the union of two open sets is open, “it
follows” that the intersection of two closed sets is closed. Use De Morgan’s
laws to give details.
(a) Suppose U1 , U2 , U3 , . . . are all open sets. Explain why the infinite
union U1 ∪ U2 ∪ U3 ∪ . . . is open.
(b) Suppose A1 , A2 , A3 , . . . are all closed sets. Use De Morgan’s laws
(they hold for infinite unions and intersections, too) to explain why
the infinite intersection A1 ∩ A2 ∩ A3 ∩ . . . is closed.
4. We said in Example 3 that because the union of two open sets is open, “it
follows” that the intersection of two closed sets is closed. Use De Morgan’s
laws to give details.
5. We said in this section that a set U ⊆ R is open if and only if for each
p ∈ U there is some ε > 0 such that Nε (p) ⊆ U .
(a) Let U = (0, ∞); we know U is open. For p = 0.0042 find a specific
value of ε for which Nε (p) ⊆ U .
(b) Let p be any element of (0, ∞). Find a value of ε (depending on p, of
course) for which Nε (p) ⊆ U .
(c) Prove the statement above for any open set U ⊆ R. (Hint: For the
“only if” part note that if p ∈ U then p ∈ (a, b) ⊆ U for some open
interval (a, b).)
6. Let S ⊆ R be any set. A point p ∈ R is called a boundary point of S if
every ε-neighborhood Nε (p) intersects both S and R \ S.
7. Here’s the converse of the claim in Example 4, page 193: Let {an } be a
sequence, and a a number. Suppose that for any open set U with a ∈ U ,
an ∈ U for all but finitely many n. Then {an } converges to a.
8. (a) Problem 6 illustrates that an open set U may have boundary points.
Show that U contains no boundary points.
(b) Suppose p is a boundary point of a closed set A ⊆ R. Show that
p ∈ A.
9. Let S ⊆ R be any set. A point p ∈ S is an interior point of S if Nε (p) ⊂ S
for some ε > 0.
11. In the notation of Proposition 3.29, page 195, show the following.
(a) For any set S ⊆ R, f −1 ( R \ S ) = R \ f −1 ( S ).
(b) f : R → R is continuous on R if and only if f −1 (A) is closed when-
ever A ⊆ R is closed.
12. Repeat Problem 10, but use the (continuous) function f (x) = |x|. Hint:
For (d), use cases, depending on whether 0 ∈ (a, b).
13. A set D ⊆ R is dense in R if D has nonempty intersection with every open
interval (a, b) ⊂ R.
17. We said in this section that the “usual” topology on R can be thought of as
given by the metric d(x, y) = |x − y|. This means, in particular, that every
open set U ⊆ R is a union of one or more open balls Bε (a). For example,
U = (3, 7) = B2 (5). In that spirit, write each set following as a union
(perhaps infinite) of open balls.
3.6. Compactness 199
(a) (0, 1)
(b) R
(c) R \ Z
(d) (0, ∞)
18. In the spirit of Definition 3.30, page 195, define for any real numbers x and y,
\[ d(x, y) = \begin{cases} 1 & \text{if } x \ne y, \\ 0 & \text{if } x = y. \end{cases} \]
19. Define, for any real numbers x and y, d(x, y) = min {|x − y|, 1}.
3.6 Compactness
Compactness is a “nice” topological property enjoyed by certain subsets of R—
e.g., finite sets, closed and bounded intervals, and convergent sequences—but not
others. We’ve seen, for instance, that a function defined and continuous on a
closed and bounded interval [a, b] assumes maximum and minimum values, while
a function defined and continuous on an open interval need not even be bounded.
Formal definitions follow soon; here’s a preview.
Almost finite. Compact sets need not be finite. (Finite sets are compact, though.) We’ll see, for instance, that the
unit interval [0, 1], an uncountably infinite set, is compact. On the other hand,
[0, 1] is both bounded and closed, and these properties turn out to be almost as
good as finiteness for some purposes.
(b) If F is finite and nonempty, then F contains maximum and minimum elements.
(We’ve discussed this before; see, e.g., Problem 3, page 50.) If S is compact and
nonempty, then S contains maximum and minimum elements.
(c) Every finite set F is bounded. Every compact set S is bounded.
(d) Every finite set F is closed. Every compact set S is closed.
(e) Let f : R → R be a continuous function. If F ⊂ R is finite and nonempty,
then f achieves maximum and minimum values on F . If S ⊂ R is compact
and nonempty, then f achieves maximum and minimum values on S.
Definitions. The “open cover” definition we’ll soon give for compactness looks
awkward at first glance. Indeed, other possible definitions of compactness (we’ll
see one soon) turn out to be equivalent for subsets of R. But the open cover
definition makes sense in any topological space, and turns out to be useful in and
beyond real analysis.
First, an auxiliary definition and some basic examples:
Definition 3.31. Let S ⊆ R be any set of real numbers. An open cover of S is
any collection U of open sets that “covers” S—i.e., for each s ∈ S there is some
U ∈ U with s ∈ U . A subcover is any subcollection U′ ⊆ U that still covers all
of S; U′ is “proper” if U′ ⊊ U.
EXAMPLE 3. Let S be any set of real numbers, and fix ε > 0. The collection
U = {Nε (s) | s ∈ S} completely covers S. (Each point of S gets its own ε-neighborhood.)
Whether any proper subcover exists depends on S, and perhaps on ε. If, say,
S = Z and ε = 0.1, then none of the Nε (s) can be omitted. If S = R, on the
other hand, the proper (but still infinite) subcover U′ = {Nε (q) | q ∈ Q} works.
If S = [0, 1000] and ε = 0.1, then we can find a finite subcover (it has about 10,000 members), like this one:
The same idea works for any bounded set S ⊂ R and any fixed ε > 0. ♦
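For S = [0, 1000] and ε = 0.1, one possible finite subcover (an assumption on our part; other choices work equally well) uses the neighborhoods centered at 0, 0.1, 0.2, . . . , 1000. The Python sketch below builds these centers with exact fractions and spot-checks coverage on random points of S.

```python
from fractions import Fraction
import random

EPS = Fraction(1, 10)
# Assumed centers: 0, 0.1, 0.2, ..., 1000 (10,001 neighborhoods in all).
centers = [Fraction(k, 10) for k in range(10_001)]


def covered(s):
    """True if s lies in N_EPS(c), where c is a center nearest to s."""
    k = round(s * 10)                 # index of a nearest center
    k = min(max(k, 0), 10_000)        # stay within the list of centers
    return abs(s - centers[k]) < EPS


random.seed(1)
samples = [Fraction(random.randint(0, 10**9), 10**6) for _ in range(10_000)]
all_covered = all(covered(s) for s in samples)
print(len(centers), all_covered)
```

Every point of S lies within 0.05 of some center, so it sits comfortably inside that center's neighborhood.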
We’re ready for that awkward definition, and some basic examples.
EXAMPLE 4. The empty set ∅ is compact. So is every finite set F ⊂ R. The full
set R is not compact. (Proofs are left as exercises.) ♦
from Example 2 cover (0, 1), but no finite subcollection suffices: If we stop with
(1/N, 2), then the set (0, 1/N ] is left uncovered. ♦
EXAMPLE 6. Let {an } be any convergent sequence of real numbers, with limit
a. Then the set S = {a, a1 , a2 , a3 , . . . } is compact. The set S′ = {a1 , a2 , a3 , . . . }
need not be compact.
SOLUTION. Let U be any open cover of S. Since a ∈ S there is some open set
Ua in U with a ∈ Ua . As we showed in Example 4, page 193, Ua contains all but
(at most) finitely many of the an . We can cover these outliers, if any, with finitely
many members of U, say U1 , U2 , . . . UN . Thus U ′ = {Ua , U1 , U2 , . . . , UN }
works as a finite subcover. We leave the claim about S ′ as an exercise. ♦
Proof: To prove (i) we check that R \ S is open: For any p ∈ R \ S we’ll find
a neighborhood of p that’s disjoint from S. To wit, for each ε > 0 we define
Uε = {x ∈ R | |x − p| > ε}. The sets Uε form an open cover U of S. By
compactness, some finite collection Uε1 , Uε2 , Uε3 , . . . , UεN covers S. If we take
ε0 to be the smallest of these ε’s (there is a smallest one since our set of ε’s is finite), then we see that S ⊆ Uε0 . Thus, the open
interval (p − ε0 , p + ε0 ) is disjoint from S, as desired.
Proving (ii) is easier. The nested collection
A big theorem. Theorem 3.35 below is a famous result, dating from around 1900.
It fully describes the compact subsets of R in down-to-earth language. (More general
versions of Theorem 3.35 hold in some topological spaces other than R.) We need
just one more technical fact.
Lemma 3.34. Let S ⊂ R be compact and A a closed subset of S. Then A is
compact.
Proof: Let U be any open cover of A. Tossing one more open set, R \ A, into U
produces an open cover V of S. By compactness of S, some finite subcover V ′
covers S. Ejecting R \ A from V ′ gives the desired finite subcover of U.
The Bolzano–Weierstrass theorem (Theorem 2.16, page 111) asserts that every closed and bounded interval
A = [a, b] is sequentially compact. In fact, A need not be an interval:
Proof: Let A be closed and bounded, and {an } a sequence with an ∈ A for
all n. The sequence {an } is bounded, so Bolzano–Weierstrass says there’s a
convergent subsequence {ank }; call its limit a. Since A is closed, Fact 3.27,
page 193, guarantees that a ∈ A, as desired.
Using Compactness
Compact sets have, as claimed earlier, several pleasant properties that “play well”
with continuous functions. Following are three (now) simple but important exam-
ples.
Proof: Let’s show that f (K) is sequentially compact. (Another proof is outlined in
exercises.) To this end, let {yn } be
any sequence contained in f (K); we seek a convergent subsequence with limit in
f (K).
Since {yn } ⊆ f (K), there are x1 , x2 , x3 , . . . , all in K, with f (xn ) = yn
for all n. By compactness of K the sequence {xn } has a convergent subsequence
{xnk }, with limit x0 ∈ K. Setting ynk = f (xnk ) for all k, and y0 = f (x0 )
produces the sought-after subsequence of {yn }. Continuity of f (see page 163) guarantees that
as desired.
Exercises
1. Use Definition 3.31 to prove the claims in Example 4:
(a) Show that if {an } is given by an = 1/n, then S ′ is not compact. (Find
a suitable open cover.)
(b) Suppose {an } is the sequence 1, 0, 1/2, 0, 1/3, 0, . . . . Is the set {an }
compact? Why or why not?
(c) Give an example of a divergent sequence {an } for which the set {an }
is compact.
3.6. Compactness 205
Thus, c′ (3) = 0. Since the value x = 3 played no important role in the calcula-
tion, we’d guess (correctly) that c′ (a) = 0 for all a.
Finding ℓ′ (3) is easy, too:
\[ \ell'(3) = \lim_{x\to 3} \frac{\ell(x) - \ell(3)}{x - 3} = \lim_{x\to 3} \frac{2x + 7 - 13}{x - 3} = \lim_{x\to 3} \frac{2x - 6}{x - 3} = 2. \]
208 4. Derivatives
Thus, ℓ′(3) = 2, and (almost) the same calculation shows that ℓ′(a) = 2 for all
inputs a. (See the exercises.)
For q, we need another limit calculation (watch for a factoring trick):
\[ q'(3) = \lim_{x\to 3} \frac{q(x) - q(3)}{x - 3} = \lim_{x\to 3} \frac{x^2 - 9}{x - 3} = \lim_{x\to 3} (x + 3) = 6. \]
Thus, q ′ (3) = 6. This time the calculation looks a bit different away from x = 3,
so we defer (just briefly) finding q ′ (x) for other inputs.
The derivative a′(3), if it exists, is the value of
\[ \lim_{x\to 3} \frac{a(x) - a(3)}{x - 3} = \lim_{x\to 3} \frac{|x - 3|}{x - 3} = \lim_{x\to 3} \begin{cases} 1 & \text{if } x > 3, \\ -1 & \text{if } x < 3. \end{cases} \]
From the last form, we see that a′(3) does not exist. (Left- and right-hand limits do
exist at x = 3, but they are unequal.) ♦
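The limit calculations above can also be watched numerically. The following is a minimal sketch, not part of the original example: the helper name `difference_quotient` is hypothetical, and the formulas ℓ(x) = 2x + 7, q(x) = x², and a(x) = |x − 3| are read off from the difference quotients displayed above.

```python
def difference_quotient(func, a, x):
    """Slope of the secant line through (a, func(a)) and (x, func(x))."""
    return (func(x) - func(a)) / (x - a)

# Formulas assumed from the displayed difference quotients.
def ell(x):
    return 2 * x + 7

def q(x):
    return x ** 2

def a_fn(x):
    return abs(x - 3)

# Approach x = 3 from the right (h > 0) and from the left (h < 0).
for h in (0.1, 0.001, -0.001, -0.1):
    x = 3 + h
    print(h,
          difference_quotient(ell, 3, x),
          difference_quotient(q, 3, x),
          difference_quotient(a_fn, 3, x))
```

The first two columns settle near 2 and 6, while the third jumps between 1 and −1 according to the sign of h, which is the numerical face of the unequal one-sided limits.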
The results of Example 1 won't surprise any calculus veteran. Let's see what
happens with some stranger functions, one of them an old friend. (Sketch these
functions to see the idea; no technology needed.)
differentiable at x = 0?
SOLUTION. The short answers are no and yes. The limits in question are
\[ \lim_{x\to 0} \frac{f(x)}{x} \quad\text{and}\quad \lim_{x\to 0} \frac{g(x)}{x}. \]
The first limit clearly fails to exist, so f is not differentiable at x = 0. (Look what
happens along a rational sequence tending to zero, and see the exercises.) For the
second limit, we just note that
\[ \frac{g(x)}{x} = \begin{cases} x & \text{if } x \ge 0, \\ 0 & \text{if } x < 0, \end{cases} \]
and so
\[ g'(0) = \lim_{x\to 0} \frac{g(x)}{x} = 0. \]
♦
[Figure 4.1: two panels, (a) and (b), showing the graph of f at two scales, roughly −4 ≤ x ≤ 4 and −0.1 ≤ x ≤ 0.1; plot not reproduced.]
which we've already shown (by squeezing; see Example 6, page 155) to be zero.
Thus f′(0) = 0, and so the corresponding linear approximation, also visible in
Figure 4.1(b), is just the zero function ℓ(x) = 0.
The calculation was easy, but the situation is still undeniably strange: The graph
of f crosses its tangent line infinitely often in every open interval (−δ, δ) around
a = 0. In the interval (−.001, .001), for instance, we have f (1/(kπ)) = 0 for all
integers k with |kπ| > 1000. ♦
Interpreting absolute values as distances, we see that the distance between outputs
f (x) and f (a) is about |m| times the distance between inputs x and a.
If f(x) = x², for instance, we have f′(3) = 6, and f maps the small input
interval (2.95, 3.05) to the output interval (8.7025, 9.3025), which is six times as
long.
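A quick arithmetic check of the magnification claim, offered as an illustrative sketch (the variable names are ours, not the text's). Since f(x) = x² is increasing near x = 3, the image of the input interval is the interval between the images of its endpoints.

```python
def f(x):
    return x ** 2

lo, hi = 2.95, 3.05              # input interval around a = 3
out_lo, out_hi = f(lo), f(hi)    # f is increasing here, so these are the output endpoints

# Ratio of output length to input length; f'(3) = 6 predicts a value near 6.
ratio = (out_hi - out_lo) / (hi - lo)
print(out_lo, out_hi, ratio)
```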
Derivatives as Functions
For a given function f the derivative f′(a) depends on a. If, say, f(x) = x², then,
as we have seen, f′(a) = 2a holds for every input a, and we might simply write
f′(x) = 2x. It is natural, in other words, to think of f′ as a function in its own
right, derived in a special way from f. (Hence the term derivative function.)
To avoid "symbol-creep" it is convenient to use the same input symbol, often
x, for both f and f′. Thus we might write something like
\[ f'(x) = \lim_{t\to x} \frac{f(t) - f(x)}{t - x} \]
when we want to think of both f and f ′ as functions of x. The difference with
earlier versions of the derivative limit is entirely notational—no new mathematics
is involved.
EXAMPLE 4. If f(x) = √x, what's f′(x)? How are the domains of f and f′
related to each other?
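SOLUTION. Let's calculate. For x > 0, we have (see the factoring trick?)
\[ f'(x) = \lim_{t\to x} \frac{f(t) - f(x)}{t - x} = \lim_{t\to x} \frac{\sqrt{t} - \sqrt{x}}{t - x} = \lim_{t\to x} \frac{\sqrt{t} - \sqrt{x}}{(\sqrt{t} + \sqrt{x})(\sqrt{t} - \sqrt{x})} = \lim_{t\to x} \frac{1}{\sqrt{t} + \sqrt{x}} = \frac{1}{\sqrt{x} + \sqrt{x}} = \frac{1}{2\sqrt{x}}, \]
as we learned back in elementary calculus.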
No derivative f ′ (0) exists in this case. One problem is that f is not defined
on any open interval containing 0, as the definition requires. Thus f ′ has domain
(0, ∞)—a proper subset of [0, ∞), the domain of f . ♦
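For readers who like to see numbers, here is a small sketch comparing the difference quotient of √x with the formula 1/(2√x). The helper name `sqrt_quotient` and the sample point x = 4 are illustrative choices, not taken from the text. The second loop looks at the one-sided quotient at 0, which grows without bound: a second obstacle to f′(0) beyond the domain issue mentioned above.

```python
import math

def sqrt_quotient(x, t):
    """Difference quotient (sqrt(t) - sqrt(x)) / (t - x) for t != x."""
    return (math.sqrt(t) - math.sqrt(x)) / (t - x)

x = 4.0
for h in (1e-1, 1e-3, 1e-5):
    # The quotient should approach 1 / (2 * sqrt(4)) = 0.25 as h shrinks.
    print(h, sqrt_quotient(x, x + h), 1 / (2 * math.sqrt(x)))

for h in (1e-2, 1e-4, 1e-6):
    # At x = 0 only t > 0 is available, and the quotient equals 1 / sqrt(h).
    print(h, sqrt_quotient(0.0, h))
```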
Higher Derivatives
The function f ′ may be differentiable in its own right, to produce a new function
f ′′ , the second derivative of f . Repeating the process produces higher-order
derivatives f′′′, f^(4), f^(5), and so on. If f(x) = x², for example, then we have
f′(x) = 2x, f′′(x) = 2, and f^(k)(x) = 0 for all k ≥ 3.
Using second derivatives we can go a step further. The quadratic (aka second-
order) approximation to f at x = a is a quadratic function that fits f even more
closely than does ℓa :
f (a) = qa (a) and f ′ (a) = qa′ (a) and f ′′ (a) = qa′′ (a).
If, say, f (x) = cos x and a = 0, then we have (as you know from calculus)
[Figure 4.2: graph of f(x) = cos x together with its approximations ℓ0 and q0, for x between −π and π; plot not reproduced.]
(We will give a general formula for qa in the next section.) Given the extra match-
ing derivative, we expect q0 to approximate f even better near x = 0 than does
ℓ0 . Figure 4.2 doesn’t disappoint.
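The comparison can also be made numerically. The sketch below assumes the standard calculus formulas ℓ0(x) = 1 and q0(x) = 1 − x²/2 for the linear and quadratic approximations to cos x at a = 0; the function names are ours.

```python
import math

def ell0(x):
    """Linear approximation to cos at 0 (assumed formula): constant 1."""
    return 1.0

def q0(x):
    """Quadratic approximation to cos at 0 (assumed formula): 1 - x^2/2."""
    return 1.0 - x ** 2 / 2

for x in (0.5, 0.1, 0.01):
    err_lin = abs(math.cos(x) - ell0(x))
    err_quad = abs(math.cos(x) - q0(x))
    # The quadratic error should be much smaller than the linear one.
    print(x, err_lin, err_quad)
```

Near x = 0 the linear error shrinks roughly like x²/2, while the quadratic error shrinks roughly like x⁴/24.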
About the proof. To illustrate the idea, suppose f (3) = 42 and f ′ (3) = 7. To
show continuity of f at x = 3, we need to show that limx→3 f (x) = 42, or,
equivalently, that limx→3 ( f (x) − 42 ) = 0. The calculation involves a little
trick:
\[ \lim_{x\to 3} \bigl(f(x) - 42\bigr) = \lim_{x\to 3} \frac{f(x) - 42}{x - 3}\,(x - 3) = \lim_{x\to 3} \frac{f(x) - 42}{x - 3} \cdot \lim_{x\to 3} (x - 3) = 7 \cdot 0 = 0. \]
Digging deeper. The following lemma delves a little further into good-behavior
implications of differentiability.
Proof: All parts follow from closer looks at the defining limit:
\[ f'(a) = \lim_{x\to a} \frac{f(x) - f(a)}{x - a}. \]
To prove (i), suppose f′(a) ≠ 0 and set ε = |f′(a)| > 0. Because the preceding
limit exists, we can choose δ > 0 so that
\[ \left| \frac{f(x) - f(a)}{x - a} - f'(a) \right| < \epsilon = |f'(a)| \]
whenever 0 < |x − a| < δ. In particular, we must have f(x) ≠ f(a) for all such x;
otherwise the inequality above fails. (See for yourself.)
Claim (ii) essentially just restates the existence of the key limit. Because
f′(a) = 0, we have
\[ 0 = \lim_{x\to a} \frac{f(x) - f(a)}{x - a} = \lim_{x\to a} \left| \frac{f(x) - f(a)}{x - a} \right|. \]
By definition of the limit, for given ε > 0 there exists δ > 0 such that, whenever
0 < |x − a| < δ, we have
\[ \left| \frac{f(x) - f(a)}{x - a} \right| < \epsilon, \quad\text{or, equivalently,}\quad |f(x) - f(a)| < \epsilon\,|x - a|, \]
as claimed.
We'll sketch a proof of (iii) assuming that f has a local minimum at x = a;
the proof for a local maximum is similar. (Polishing the proof is an exercise.)
In this case there is an open interval I = (a − δ, a + δ) with f(a) ≤ f(x) for
x ∈ I. In particular,
\[ \frac{f(x) - f(a)}{x - a} \ge 0 \ \text{ if } x > a \quad\text{and}\quad \frac{f(x) - f(a)}{x - a} \le 0 \ \text{ if } x < a. \]
4.1. Defining the Derivative 215
[Figure 4.3: plot comparing sin x with x near x = 0; not reproduced.]
Figure 4.3 illustrates graphically what this means—especially about how closely
x approximates sin x near x = 0. The function g shows, finally, that the converse
to claim (c) is false: Even though g ′ (0) = 0, g attains neither a maximum nor a
minimum at 0. ♦
Exercises
1. Suppose f ′ (x) exists for all x. Every calculus student knows that if g(x) =
f (x) + 3, then g ′ (a) = f ′ (a) for all a. Show this using Definition 4.1.
2. Consider a linear function L(x) = Ax + B. Show using Definition 4.1 that
L′ (x) exists for all x, and that L′ is a constant function.
3. Consider the functions f(x) = x² and g(x) = √x.
5. Consider the function f(x) = 1/(x² + 5). Compare answers here to what
you recall from elementary calculus.
6. (a) Use Definition 4.1 to show that if f(x) = 1/x and c ≠ 0, then f′(c) =
−1/c².
(b) Prove by induction: (1/x^n)′ = −n/x^(n+1) for all integers n ≥ 1.
(Note: It is OK to use the product rule; we’ll prove it in the next
section.)
(a) Show that if f (x) ≥ f (0) for x > 0, then f ′ (0) ≥ 0. (Hint: Consider
limx→0+ (f (x) − f (0))/(x − 0).)
(b) Show that if f (x) ≥ f (0) for x < 0, then f ′ (0) ≤ 0. (Hint: Consider
limx→0− (f (x) − f (0))/(x − 0).)
(c) What do the preceding parts say about the derivative at a minimum
point of a function? (We’ll further explore this famous connection
soon.)
9. Consider continuous functions f : R → R and g : R → R, with f (0) =
0 = f ′ (0). Show that f g is differentiable at x = 0, and that (f g)′ (0) = 0.
(Use Definition 4.1; the product rule does not apply.)
10. Consider continuous functions f : R → R and g : R → R such that
(i) f (0) = 0; (ii) f ′ (0) = 2; and (iii) g(0) = 3. Show that f g is differen-
tiable at x = 0; find (f g)′ (0).
11. Suppose f (x) is differentiable at x = 2, with f ′ (2) = 3. Let g(x) =
4f (x) + 5. Show that g is also differentiable at x = 2, with g ′ (2) = 12.
Interpret the result geometrically.
12. Suppose f (x) is differentiable at x = 2, with f ′ (2) = 3. Let g(x) =
f (x + 4) + 5. We claim that g(x) is differentiable at x = −2, and that
g ′ (−2) = 3.
(a) Give an informal geometric argument for the claim. (Hint: How are
graphs of f and g related?)
(b) Calculate g ′ (−2) using Definition 4.1. (Hint: In the difference quo-
tient for g ′ (−2), write x = y − 4; note that x → −2 is equivalent to
y → 2.)
15. Let f : R → R be any function such that |f(x)| ≤ x² for all x. Show that
f ′ (0) = 0.
16. Throughout this problem let f : (−1, 1) → R be any bounded (not neces-
sarily continuous) function.
18. Use Definition 4.1 to show that g(x) = (x² cos x)/(3 + cos x) is differentiable
at x = 0, with g′(0) = 0. (We don't know the quotient rule yet.)
20. Expand the following proof sketch for Claim (iv) of Lemma 2. Set ε =
f′(a) > 0. Since lim_{x→a} (f(x) − f(a))/(x − a) exists, we can choose δ > 0 so that
|(f(x) − f(a))/(x − a) − f′(a)| < ε = f′(a) when 0 < |x − a| < δ. This δ does
what's needed.
21. If f(x) = sin(100x), then (just assume this) f′(x) = 100 cos(100x).
(a) What does Claim (iv) of Lemma 2, page 164, say about this f when
a = 0?
(b) Find a value of δ that works in this case.
4.2. Calculating Derivatives 219
\[ \bigl( x^2 \sin(3x) + 4 \ln x \bigr)' = 3x^2 \cos(3x) + 2x \sin(3x) + \frac{4}{x}. \]
Our interest here is less in applying the rules—you’ve suffered enough—than
in stating them precisely and proving them rigorously. After all, derivatives are
limits, so algebraic properties of limits—like those in Theorem 3.3—will be key.
f ± g, Cf + Dg, fg, and f/g
Notes on the theorem. Proofs follow or are left to the exercises. First, some
informal observations:
• Useful formulas—and more: The theorem not only justifies some well-
loved techniques from elementary calculus but also guarantees that, under
the given hypotheses, all indicated derivatives exist. This existence follows,
as we will see, from corresponding properties of limits.
• Sums and constant multiples: Both the sum rule and the constant multiple
rule for derivatives are (easy) special cases of the result for linear combina-
tions. Setting C = D = 1, for instance, gives one of these old favorites.
• Linear combinations and linear transformations: The rule for linear com-
binations can be phrased succinctly in the language of linear algebra: Dif-
ferentiation is a linear transformation from one vector space to another.
(Vectors, in this case, are functions.)
• Approximate thinking: One view of the theorem concerns linear approximation.
By hypothesis, both f and g have linear approximations ℓf,a and
ℓg,a at x = a:
\[ \ell_{f,a}(x) = f(a) + f'(a)(x - a) \quad\text{and}\quad \ell_{g,a}(x) = g(a) + g'(a)(x - a). \]
The theorem tells, in effect, how to combine ℓf,a and ℓg,a to create new
linear approximations to functions like f + g, fg, and f/g. For example,
the linear combination rule says that for x near a we have (following minor
re-arrangement)
\[ 3f(x) + 2g(x) \approx 3\ell_{f,a}(x) + 2\ell_{g,a}(x) = \bigl(3f'(a) + 2g'(a)\bigr)(x - a) + 3f(a) + 2g(a), \]
which should seem reasonable. Still more succinctly, we have ℓ3f+2g,a =
3ℓf,a + 2ℓg,a. The situation for products and quotients is a little more
complicated; see the exercises.
Proof: All parts follow from manipulating the limits that define the derivatives in
question. We treat one part in detail and leave the rest to the exercises.
For products, the theorem claims that
\[ \lim_{x\to a} \frac{f(x)g(x) - f(a)g(a)}{x - a} = f(a)g'(a) + f'(a)g(a); \]
implicit, of course, is the claim that the limit exists. To see why, we manipulate
the difference quotient limit, starting with a clever trick:
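\begin{align*}
\lim_{x\to a} \frac{f(x)g(x) - f(a)g(a)}{x - a}
&= \lim_{x\to a} \frac{f(x)g(x) - f(a)g(x) + f(a)g(x) - f(a)g(a)}{x - a} && \text{(add and subtract the same thing)} \\
&= \lim_{x\to a} \frac{f(x) - f(a)}{x - a}\, g(x) + \lim_{x\to a} f(a)\, \frac{g(x) - g(a)}{x - a} && \text{(algebra with limits)} \\
&= \lim_{x\to a} g(x) \cdot \lim_{x\to a} \frac{f(x) - f(a)}{x - a} + \lim_{x\to a} f(a) \cdot \lim_{x\to a} \frac{g(x) - g(a)}{x - a} && \text{(more limit algebra)} \\
&= g(a) f'(a) + f(a) g'(a).
\end{align*}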
Note especially the last step, in which we evaluated four limits, of which only one
(the third) is completely trivial. The second and fourth limits define f ′ (a) and
g ′ (a), which we’ve assumed to exist, and the first, limx→a g(x) = g(a), holds
because g is continuous at x = a. (Why? See Theorem 4.3, page 213.)
\[ p(x) = 3x^7 - 5x + 2 \quad\text{and}\quad q(x) = \frac{3x^7 - 5x + 2}{x^3 - x}, \]
is easy for calculus veterans. How is Theorem 4.5 involved?
SOLUTION. Theorem 4.5 guarantees, first, that the derivative functions p′(x)
and q′(x) exist. To prove that p(x), or any polynomial function, is differentiable
for all x, we only need to observe (or prove . . . it's easy) that f(x) = x is
differentiable, with f′(x) = 1 for all x. Now Theorem 4.5 implies that power
functions like x², x³, and so on
are all differentiable in their own right, and that the familiar derivative formulas
hold for all positive integer powers:
\[ (x^2)' = 2x, \quad (x^3)' = 3x^2, \quad \dots, \quad (x^{42})' = 42x^{41}, \quad \dots. \]
EXAMPLE 2. What does Theorem 4.5 say about derivatives of f(x) = (x² + 1)^15
and g(x) = sin x / e^x?
SOLUTION. The product rule in Theorem 4.5 can help with f(x) if (but only
if) we're willing to multiply out the 15th power. That is tedious for a human but
no big deal for, say, Mathematica:
\[ (x^2 + 1)^{15} = x^{30} + 15x^{28} + 105x^{26} + 455x^{24} + 1365x^{22} + 3003x^{20} + \cdots + 15x^2 + 1. \]
The result is easy, if laborious, to differentiate term by term. A wiser plan involves
the chain rule; see below.
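As a sketch of the "multiply out, then differentiate term by term" plan, here is the same computation in Python rather than Mathematica. The coefficients come from the binomial theorem; the names `coeffs`, `poly_derivative_at`, and `chain_rule_at` are illustrative only. The comparison uses the chain-rule answer 15(x² + 1)^14 · 2x.

```python
from math import comb

# Binomial theorem: (x^2 + 1)^15 = sum over k of C(15, k) * x^(2k).
# Map each exponent 2k to its coefficient C(15, k).
coeffs = {2 * k: comb(15, k) for k in range(16)}

def poly_derivative_at(x):
    """Differentiate the expanded polynomial term by term and evaluate at x."""
    return sum(e * c * x ** (e - 1) for e, c in coeffs.items() if e > 0)

def chain_rule_at(x):
    """Chain-rule value of the derivative: 15 (x^2 + 1)^14 * 2x."""
    return 15 * (x ** 2 + 1) ** 14 * 2 * x

print(coeffs[30], coeffs[28], coeffs[26], coeffs[24], coeffs[22], coeffs[20])
# With an integer input both computations are exact integer arithmetic.
print(poly_derivative_at(2) == chain_rule_at(2))
```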
Differentiating g requires the quotient rule, of course, but first we need deriva-
tives of the numerator and the denominator. Although familiar from elementary
calculus, the formulas
\[ (\sin x)' = \cos x \quad\text{and}\quad (e^x)' = e^x \]
are far from obvious, and we’ll defend them below. Assuming them for the mo-
ment, the rest is easy. Since numerator and denominator are differentiable, and the
denominator never vanishes, the quotient function is differentiable, with derivative
\[ \left( \frac{\sin x}{e^x} \right)' = \frac{e^x \cos x - e^x \sin x}{e^x \cdot e^x} = \frac{\cos x - \sin x}{e^x}. \]
♦
Composition and the chain rule. Theorem 4.5 says nothing about functions like
h(x) = sin(x²) and k(x) = sin(sin(sin(e^x))), built by composition of differentiable
functions. Theorem 4.6 assures us that such composites are indeed differentiable,
and describes how the respective derivatives are combined. (As always, due care
for domains is necessary.)
Theorem 4.6 (The chain rule). Suppose that g is differentiable at a and f is
differentiable at b = g(a). Then f ◦ g is differentiable at a, and
\[ (f \circ g)'(a) = f'(g(a)) \cdot g'(a). \]
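EXAMPLE 3. The functions f(x) = sin x and g(x) = x² are differentiable for
all x. Use the chain rule to differentiate
\[ f \circ g(x) = \sin(x^2) \quad\text{and}\quad f \circ f \circ g(x) = \sin\bigl(\sin(x^2)\bigr). \]
SOLUTION. The chain rule applies directly to f ◦ g (we could substitute x for a,
but why bother?):
\[ (f \circ g)'(a) = f'(g(a)) \cdot g'(a) = \cos(a^2) \cdot 2a. \]
Applying the chain rule again, this time to f ◦ (f ◦ g), gives
\[ (f \circ f \circ g)'(a) = f'\bigl(f(g(a))\bigr) \cdot (f \circ g)'(a) = f'\bigl(f(g(a))\bigr) \cdot f'(g(a)) \cdot g'(a); \]
we used the earlier calculation in the last step. Substituting the formulas for f and
g gives the final answer:
\[ (f \circ f \circ g)'(a) = \cos\bigl(\sin(a^2)\bigr) \cdot \cos(a^2) \cdot 2a. \]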
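A numerical sanity check of the final formula, as a sketch only: the symmetric difference quotient below is a standard approximation device, not the definition used in the text, and the sample point a = 0.7 is an arbitrary choice.

```python
import math

def h(x):
    """The composite f(f(g(x))) = sin(sin(x^2))."""
    return math.sin(math.sin(x ** 2))

def h_prime_formula(a):
    """Chain-rule formula: cos(sin(a^2)) * cos(a^2) * 2a."""
    return math.cos(math.sin(a ** 2)) * math.cos(a ** 2) * 2 * a

def central_difference(func, a, step=1e-6):
    """Symmetric difference quotient, a numerical stand-in for the derivative."""
    return (func(a + step) - func(a - step)) / (2 * step)

a = 0.7
print(central_difference(h, a), h_prime_formula(a))
```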
Here, before the proof, are some comments on the chain rule and why it is
plausible.
• Why multiply? The chain rule says that the derivative of a composition is
a certain product of derivatives. Thinking of derivatives as magnification
factors suggests why multiplication is the right thing to do. (We discussed the
magnification view in the preceding section.) Suppose, say,
that g′(a) = 2 and f′(b) = 3. Then "lens" g and "lens" f magnify distances
by factors of 2 and 3, respectively, and so we expect six-fold magnification
on inputs sent first through g and then through f. Microscopes use the same
principle.
• It works for linear functions: We can just calculate, without fancy proofs,
that the chain rule holds for linear functions f and g. If f(x) = Ax + B
and g(x) = Cx + D, then f′(x) = A and g′(x) = C for all x, and
\[ f \circ g(x) = A(Cx + D) + B = ACx + (AD + B), \]
and so
\[ (f \circ g)'(x) = AC, \]
as expected.
• It works for nonlinear functions, too: Differentiable functions are closely
approximated by linear functions, and so it is reasonable to hope that dif-
ferentiable functions might satisfy the same chain rule. This turns out to be
true; the formal proof explains why.
\[ |g(x)| < \frac{\epsilon}{M}\,|x| \le \epsilon\,|x| < \epsilon\delta < \delta \le \delta_2, \]
and so
\[ |f(g(x))| < M\,|g(x)| < M\,\frac{\epsilon}{M}\,|x| = \epsilon\,|x|. \]
This proves the chain rule.
We’ve already done most of the work. Theorems 4.5 and 4.6 guarantee that a
built-up function like
\[ f(x) = \sin\left( \frac{\cos x + e^x}{\ln(2 + x^2)} \right) \]
(How? Why?)
SOLUTION. All the remaining derivative formulas follow from the one formula
(sin x)′ = cos x and from algebraic properties of sines and cosines. For example,
we have
\[ (\cos x)' = \left( \sin\left( x + \frac{\pi}{2} \right) \right)' = \cos\left( x + \frac{\pi}{2} \right) = -\sin x; \]
notice the chain rule implicit in the second equality, and two formulas relating
sines and cosines. Here is another, this time thanks to the quotient rule:
\[ (\tan x)' = \left( \frac{\sin x}{\cos x} \right)' = \frac{\cos^2 x + \sin^2 x}{\cos^2 x} = \sec^2 x. \]
To prove that ( sin x )′ = cos x, we’ll use the well-known addition formula
for sines:
sin(x + h) = sin(x) cos(h) + cos(x) sin(h).
This will come in handy as we wrestle with the difference quotient:
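\begin{align*}
(\sin(x))' &= \lim_{h\to 0} \frac{\sin(x + h) - \sin(x)}{h} \\
&= \lim_{h\to 0} \frac{\sin(x)\cos(h) + \cos(x)\sin(h) - \sin(x)}{h} \\
&= \sin(x) \lim_{h\to 0} \frac{\cos(h) - 1}{h} + \cos(x) \lim_{h\to 0} \frac{\sin(h)}{h} \\
&= \sin(x) \cdot 0 + \cos(x) \cdot 1 = \cos(x),
\end{align*}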
as desired. ♦
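The last step of the calculation relies on two limits, (cos h − 1)/h → 0 and sin h / h → 1 as h → 0. The sketch below only illustrates them numerically and proves nothing; the function names are ours.

```python
import math

def sin_ratio(h):
    """The quotient sin(h) / h, for h != 0."""
    return math.sin(h) / h

def cos_ratio(h):
    """The quotient (cos(h) - 1) / h, for h != 0."""
    return (math.cos(h) - 1) / h

for h in (0.1, 0.01, 0.001, -0.001):
    # First column should approach 1, second column should approach 0.
    print(h, sin_ratio(h), cos_ratio(h))
```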
EXAMPLE 5. (Exponential derivatives). The exponential derivative (e^x)′ = e^x
follows from the single limit
\[ \lim_{h\to 0} \frac{e^h - 1}{h} = 1. \]
(How? Why?)
\[ (e^x)' = \lim_{h\to 0} \frac{e^{x+h} - e^x}{h} = \lim_{h\to 0} \frac{e^x e^h - e^x}{h} = e^x \lim_{h\to 0} \frac{e^h - 1}{h} = e^x, \]
as expected. ♦
[Figure 4.4: graphs of f(x) and its inverse g(x), with a and f(a) marked on the axes; plot not reproduced.]
Derivatives of inverse functions. One loose end remains untied concerning
differentiability of elementary functions: handling inverses of differentiable
functions. (Logarithmic and exponential functions are inverses, for instance, as are
f(x) = x³ and g(x) = ∛x.) It turns out that inverses of differentiable functions
are indeed differentiable, provided the usual care is taken with domains of definition
and to avoid division by zero. Here is the key result:
\[ g'(f(a)) = \frac{1}{f'(a)}. \]
\[ g(f(x)) = x \implies g'(f(x)) \cdot f'(x) = 1 \implies g'(f(a)) = \frac{1}{f'(a)} \]
whenever f′(a) ≠ 0.
The harder part turns out to be showing what we assumed above: that g ′ (f (a))
exists. Figure 4.4 makes that assumption look reasonable, but proving it rigor-
ously takes us a bit off our main course. We omit the detour.
\[ g'(f(a)) = \frac{1}{f'(a)}, \quad\text{or}\quad g'(e^a) = \frac{1}{e^a}. \]
Writing b = e^a gives the familiar formula g′(b) = (ln b)′ = 1/b. ♦
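Here is a small numerical illustration of the inverse-function formula for f(x) = e^x and g(x) = ln x. It is a sketch under our own naming (`inverse_derivative` is hypothetical), and the symmetric difference quotient is only an approximation to g′(b).

```python
import math

def inverse_derivative(f_prime, a):
    """Value 1 / f'(a), the predicted derivative of the inverse at f(a)."""
    return 1.0 / f_prime(a)

a = 1.5
b = math.exp(a)                               # b = f(a) = e^a

predicted = inverse_derivative(math.exp, a)   # f'(a) = e^a, so this is 1 / b
step = 1e-6
numeric = (math.log(b + step) - math.log(b - step)) / (2 * step)

print(predicted, 1 / b, numeric)
```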
Exercises
1. We said in this section that if f(x) = x^n for any positive integer n, then
f′(x) = nx^(n−1). For one inductive proof (using the product rule) see Problem 14,
page 51.
Another approach is to work directly with the definition:
\[ f'(a) = \lim_{x\to a} \frac{f(x) - f(a)}{x - a} = \lim_{x\to a} \frac{x^n - a^n}{x - a}. \]
4. Prove the claim about linear combinations in Theorem 4.5, page 219.
5. This problem outlines a proof—assuming both the product rule and the
chain rule, which we proved separately—of the quotient rule part of The-
orem 4.5, page 219. Assume throughout that f and g are differentiable at
x = a and that g(a) ≠ 0.
(a) Assume (it is easy to show) that the function k(x) = 1/x is differentiable
for all x ≠ 0, with k′(x) = −1/x². Use this and the chain rule
to find h′(a), where h(x) = 1/g(x).
(b) Apply the product rule to f (x)/g(x) = f (x) · 1/g(x) to deduce the
quotient rule as stated in Theorem 4.5.
6. Use Theorem 4.5 and the fact that (sin x)′ = cos x to find derivatives of the
other five standard trigonometric functions.
8. Consider the function q(x) = (3x^7 − 5x + 2)/(x³ − x). The quotient rule guarantees
that q is differentiable except where the denominator is zero.
(a) Use technology to plot q; notice the two vertical asymptotes. How are
they related to derivatives?
(b) The number one is a root of the denominator, but the graph of q has
no vertical asymptote there. Why not?
(c) The value q(1) is undefined in the formula above. What value of q(1)
makes q differentiable at x = 1? Why?
The graphical view. The MVT equation says something about slopes on the
graph of f : At some point c between a and b, the tangent line is parallel to the
secant line joining (a, f (a)) and (b, f (b)). Figure 4.5 illustrates this. The (linear)
secant line function, labeled L(x) in the picture, will play a role in the proof.
4.3. The Mean Value Theorem 231
The horizontal case: Rolle’s theorem. If we add to the other hypotheses the
requirement that f (a) = f (b), then the conclusion takes a slightly simpler form:
In graphical terms, the new hypothesis says that the graph has the same height
at x = a and at x = b; the conclusion says that the graph has a horizontal tangent
somewhere in between. Figure 4.6 shows two such horizontal tangents. (More than
one possible c is fine with Rolle.)
Car talk. If f(t) is the position of a car at time t, then (f(b) − f(a))/(b − a)
is the car's average velocity over the time interval [a, b], and f′(t) is the car's
instantaneous velocity at time t. (The units might be miles and hours.) At some
intermediate time c, says the MVT (and common sense), these two velocities must
be equal. Rolle's version sounds even more intuitive: if a car starts and ends at the
same position, then it must be stopped at some instant in between.
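[Figure 4.5: graph of f(x) and the secant line L(x) through (a, f(a)) and (b, f(b)), with a point c between a and b; plot not reproduced.]
[Figure 4.6: graph of f(x) on [0, b] with horizontal tangents at c1 and c2; plot not reproduced.]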
Proof: Let s and t be any two points in I. By the MVT, there is a point c between
s and t for which
\[ f'(c) = \frac{f(s) - f(t)}{s - t}. \]
Since the left side is zero, so is the right, which means f (s) = f (t), as desired.
• Step 2: Use Step 1—and another famous “value theorem”—to prove Rolle’s
theorem.
About Step 1. Our Step 1 claim should be familiar, for at least two reasons.
First, it lurks behind most of those classic maximum–minimum problems of
elementary calculus. (Find the largest rectangular pigpen . . . one side is a river . . . .)
Second, we've already proved it (and a bit more), as part (c) of
Lemma 4.4, page 214. Let's move on.
About Step 2. We assume that g : [a, b] → R is continuous on [a, b] and
differentiable on (a, b), and that g(a) = g(b); we need to show that g′(c) = 0 for some
c ∈ (a, b). (We use g, not f, to avoid confusion in Step 3.)
To apply Step 1, we invoke the EVT (the extreme value theorem; see page 175):
Because g is continuous on [a, b], it assumes both maximum and minimum values
somewhere on that closed interval. If both the maximum and the minimum are
assumed at endpoints, then the maximum and minimum values are equal, and so g
is constant. In this case, g′(c) = 0 for all c ∈ (a, b), and we're done already. The
alternative is that at least one of the maximum and the minimum is attained
somewhere in (a, b). In this case Step 1 applies, and again we're done.
About Step 3. Rolle’s theorem is more than a junior-grade version of the MVT.
With a little ingenuity, the latter can be deduced from the former. The trick (a common one in analysis) is
to apply Rolle’s theorem not to the function f given in the MVT, but to another
function g, cleverly contrived from f . We sketch the argument, leaving details to
the exercises.
1. Let L(x) be the linear function whose graph joins the points (a, f (a)) and
(b, f (b)); see Figure 4.5. Notice some properties of L:
(i) L(a) = f(a); (ii) L(b) = f(b); (iii) L′(x) = (f(b) − f(a))/(b − a) for all x.
2. If we set g(x) = f (x) − L(x), then g(a) = 0 = g(b). By Rolle’s theorem,
there exists c ∈ (a, b) with g ′ (c) = 0. This c satisfies the conclusion of the
MVT, and the proof is complete.
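To make the construction concrete, here is a sketch that carries it out for one hypothetical example, f(x) = x³ on [0, 2], chosen by us and not taken from the text. It locates a point c with g′(c) = 0 by bisection, which works here because g′ is continuous and changes sign on the interval; bisection is our numerical device, not part of the proof.

```python
def f(x):
    return x ** 3

def f_prime(x):
    return 3 * x ** 2

a, b = 0.0, 2.0
secant_slope = (f(b) - f(a)) / (b - a)   # this is L'(x) for every x

def g_prime(x):
    """Derivative of g(x) = f(x) - L(x)."""
    return f_prime(x) - secant_slope

# Bisection: g'(a) < 0 < g'(b) in this example, so a sign change is trapped.
lo, hi = a, b
for _ in range(60):
    mid = (lo + hi) / 2
    if g_prime(mid) < 0:
        lo = mid
    else:
        hi = mid
c = (lo + hi) / 2

print(c, f_prime(c), secant_slope)
```

In this example the exact answer is c = √(4/3), since 3c² = 4.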
\[ \frac{f(t) - f(x)}{t - x} \ge 0 \]
for all t ∈ I with t > x. Thus f ′ (x) is the limit of a nonnegative expression, and
so must itself be nonnegative.
This completes the proof of (i); we leave remaining parts to the exercises.
If s and t are in I and f(s) < v < f(t), then f(c) = v for some c
between s and t.
It was widely believed in the nineteenth century that functions with the interme-
diate value property must be continuous.
This is false. In the late 1800s, Jean Gaston Darboux proved (essentially) the
following theorem (a proof, based on the mean value theorem, is outlined in the
exercises):
Taylor's theorem: the MVT generalized. The mean value theorem equation can
be rewritten to say that, under the appropriate hypotheses, we have
\[ f(b) = f(a) + f'(c)(b - a) \]
for some number c between a and b. This equation can be thought of as saying that
f(b) approximates f(a) with error no larger than the last term. Taylor's theorem,
dating from around 1700 and named for the English mathematician Brook Taylor,
offers a more powerful version of such an approximation. Here is a special case:
Theorem 4.14 (Taylor's theorem, n = 3). Suppose f, f′, f′′, and f′′′ all exist
and are continuous on [a, b]. Then, for some input c in [a, b], we have
\[ f(b) = f(a) + f'(a)(b - a) + \frac{f''(a)}{2}(b - a)^2 + \frac{f'''(c)}{6}(b - a)^3. \]
A proof is outlined in the exercises. Here is an application.
EXAMPLE 2. What does Taylor's theorem say about f(x) = cos x for a = 0
and b = 1?
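One way to explore the question numerically is sketched below; it is our illustration, not the text's solution. For f(x) = cos x we have f(0) = 1, f′(0) = 0, f′′(0) = −1, and f′′′(x) = sin x, so Theorem 4.14 reads cos 1 = 1 − 1/2 + sin(c)/6 for some c in [0, 1]. The code solves this equation for c and confirms that c lands in the interval.

```python
import math

# Taylor terms at a = 0 with b = 1: f(0) + f'(0)*1 + f''(0)/2 * 1^2.
taylor_part = 1.0 + 0.0 * 1 + (-1.0) / 2 * 1 ** 2

# Whatever is left over must equal f'''(c)/6 = sin(c)/6 for some c in [0, 1].
remainder = math.cos(1.0) - taylor_part
c = math.asin(6 * remainder)

print(taylor_part, remainder, c)
```

Since 0 ≤ sin c ≤ sin 1 on [0, 1], the theorem also pins cos 1 between 1/2 and 1/2 + sin(1)/6.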
Exercises
1. Let f : R → R be a differentiable function, and suppose that f has n real
roots. Use Rolle’s theorem to show that f ′ has at least n − 1 real roots.
Give an example to show that f ′ may have more than n − 1 real roots.
2. (a) Use algebra to show that p(x) = x3 + x has exactly one (real) root.
(b) Use Rolle’s theorem to show (again) that p(x) = x3 + x has exactly
one root. (Hint: If there are two roots, Rolle’s theorem leads to a
contradiction.)
(a) Explain why f (x) must have at least one root, regardless of the values
of a and b.
(b) Give examples (i.e., specific values of a and b) to show that f (x) can
have one, two, or three roots. No formal proofs needed, but say briefly
why your examples work.
(c) Prove that f can have no more than three roots. (Hints: (i) Take it as
given that a quadratic polynomial can have at most two roots; (ii) use
Rolle’s theorem.)
(d) Show that if a > 0, then f has exactly one root.
(a) Show that there exist x1 and x2 with a < x1 < x2 < c and f′(x1) =
f′(x2) = 0.
(b) Show that there exists x0 with a < x0 < c and f ′′ (x0 ) = 0.
(c) State a generalization of these hypotheses that guarantees there exists
x0 with f (42) (x0 ) = 0.
(a) Show that if f ′ (x) > M for all x in an interval I, then f (b) − f (a) >
M (b − a) for all inputs a and b in I with b > a.
(b) Suppose f (0) = 0, f ′ (0) = 0 and f ′′ (x) > 2 for all x > 0. Show
that f (x) > x2 for all x > 0.
10. In each part following, decide whether a value of c can be found as de-
scribed in the MVT. If so, find one. If not, say which of the MVT hypothe-
ses is not satisfied.
11. Suppose f (t) is the eastward velocity of a car at time t, with all quanti-
ties measured in appropriate units. What does Rolle’s theorem say in this
setting? (Hint: f ′ (t) measures eastward acceleration.)
12. One hypothesis of the MVT is that f : [a, b] → R be continuous on all of
[a, b]. Give an example (as simple as possible) to show that this hypothesis
is necessary—i.e., find a function f : [a, b] → R, differentiable throughout
(a, b), for which the MVT fails.
13. Suppose f and g are functions continuous on [a, b] and differentiable on
(a, b).
(a) Use results in this section to show that if f ′ (x) = g ′ (x) for all x ∈
(a, b), then f (x) = g(x)+C for some constant C. (This is sometimes
called an identity theorem for differentiable functions.)
(b) Suppose that f ′ (x) = g ′ (x) + 5 for all x ∈ (a, b). What can be said
about f (x) and g(x)? Why?
14. Antiderivative tables in elementary calculus books are full of formulas like
∫ cos x dx = sin x + C. Forget the C on a test and lose a point.
(a) Why exactly is the C there? What does this have to do with the MVT?
(b) A very fussy professor might find fault with the formula ∫ dx/x =
ln |x| + C. Explain. (Hint: Consider domains.)
15. Give a detailed proof of Step 3 in the proof of the MVT. In particular, give
an explicit formula for L(x) and use it to verify the other claims.
16. This problem is about various parts of Proposition 4.12, page 233.
(a) State (don’t prove) versions of (i) and (ii) that involve the word “de-
creasing.”
(b) Prove (ii) carefully; mimic the proof of (i).
(c) Prove (iii).
17. Let f : R → R be differentiable for all x, with f ′ (x) < 1 for all x and
f (0) = 0.
18. Let f : R → R be differentiable for all x. Suppose that f ′ (x) > 1 for all x.
Show that the graph of y = f (x) can intersect the graph of y = x no more
than once.
19. Let f : R → R be differentiable for all x. Suppose that f ′ (x) < 1 for all x
and that f (0) = 0. Show that f (x) < x for all x > 0. Must f (x) < x hold
also for x < 0? Why or why not?
20. A function f : R → R is called a contraction if the inequality
|f (x) − f (y)| ≤ |x − y|
holds for all real numbers x and y. (As the name suggests, a contraction
“shrinks distances.”)
(b) Now suppose that a < s < t < b, and v is any number such that
f ′ (s) < v < f ′ (t). Consider the new function g(x) = f (x) − vx.
Then g satisfies the conditions of the previous part. Apply the result
of (a) to complete the proof.
22. In the situation of Example 1, find a (two-part) formula for f ′ , and show
that f ′ is not continuous at x = 0.
23. In the special case a = 0 and b = 1, Taylor's theorem (Theorem 4.14) says
that
\[ f(1) = f(0) + f'(0) + \frac{f''(0)}{2} + \frac{f'''(c)}{6} \]
for some c ∈ (0, 1). Prove this in the following steps.
(a) Consider the function p2(t) = f(0) + f′(0)t + (f′′(0)/2)t². Show that
f(0) = p2(0), f′(0) = p2′(0), and f′′(0) = p2′′(0).
(b) Consider the new function g(t) = f(t) − p2(t) + (p2(1) − f(1))t³.
Show that g(0) = 0 = g(1); conclude that there is some t1 in (0, 1)
with g ′ (t1 ) = 0.
(c) Show that g ′ (0) = 0 = g ′ (t1 ), so there exists t2 in (0, t1 ) with
g ′′ (t2 ) = 0.
(d) Show that g ′′ (0) = 0 = g ′′ (t2 ), so there exists c in (0, t2 ) with
g ′′′ (c) = 0. Conclude that Taylor’s theorem holds in the case at hand.
[Figure 4.7 and a companion figure: plots for Examples 1 and 2, with horizontal axes running from 0 to 1 and from −4 to 4; not reproduced.]
What the examples show. The examples illustrate that a sequence of functions
may behave quite differently on one domain than on another. On the domain
[0, 1/2], for example, the functions {fn} in Example 1 converge to the zero function.
On the domain [0, 1], too, these functions appear to converge, but to a limit
function that is discontinuous at x = 1. On the domain [2, 3], by contrast, the same
functions become larger and larger with n, and so have no sensible limit function.
(Not shown in Figure 4.7, but easily visualized.) The functions {gn} in Example 1,
on the other hand, appear to converge to g(x) = x on every domain interval.
Two Definitions
Let f1 , f2 , f3 , . . . be a sequence of real-valued functions, all defined on a fixed
interval I. To sort out similarities and differences in the behaviors illustrated in
Examples 1 and 2, we’ll define two senses in which {fn } might converge to a
limit function f , also defined on I.
• Both sequences converge pointwise: Both {fn } and {gn } converge point-
wise to their limits, at least on the domain intervals shown. For {fn } we
have
\[ \lim_{n\to\infty} f_n(x) = \lim_{n\to\infty} x^n = 0 = f(x) \]
and the inequality holds for all x in [−5, 5], as the definition requires.
The sequence {fn } does not converge uniformly to f on [0, 1]. The prob-
lem, roughly speaking, is that the limit function f “jumps” abruptly at
x = 1, while all the fn increase smoothly. We explore this idea more
carefully in the next example.
In particular, we must have either f_{N+1}(x) < 0.1 or f_{N+1}(x) > 0.9 for all
x ∈ [0, 1]. This is clearly absurd: the continuous function f_{N+1}(x) = x^(N+1)
must assume every value between zero and one. We've shown that uniform
convergence fails. ♦
EXAMPLE 4. Show that the same sequence {fn } does converge uniformly on
[0, 0.99], with limit f (x) = 0. (This is the same limit function
as before, but chopped off at x = 0.99.)
SOLUTION. Let ε > 0 be given. Choose a number N such that $0.99^N < \varepsilon$.
(Why can this be done? How big is N if ε = 0.001?) Any
such N works, because for all n > N and all x ∈ [0, 0.99], we have
\[ |f_n(x) - f(x)| = x^n \le 0.99^n < 0.99^N < \varepsilon, \]
as desired. ♦
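The choice of N in Example 4 can be explored numerically. The following is a minimal sketch, not part of the text's argument; the helper name `smallest_N` is hypothetical. It solves r^N < ε by logarithms and then corrects for floating-point rounding.

```python
import math

def smallest_N(r, eps):
    """Return the smallest positive integer N with r**N < eps (0 < r < 1, eps > 0)."""
    # Solving r**N < eps for N gives N > log(eps) / log(r); start from that estimate.
    N = max(1, math.ceil(math.log(eps) / math.log(r)))
    # Correct for possible floating-point rounding in either direction.
    while r ** N >= eps:
        N += 1
    while N > 1 and r ** (N - 1) < eps:
        N -= 1
    return N

N = smallest_N(0.99, 0.001)
print(N, 0.99 ** N, 0.99 ** (N - 1))
```

Because x^n ≤ 0.99^n on [0, 0.99], the same N works for every x in the interval at once, which is exactly what uniform convergence requires.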
Proposition 4.17. Let I be an interval, and let {fn } and {gn } be sequences of
real-valued functions, all defined on I.
• Uniform convergence is stronger: If fn → f uniformly, then fn → f point-
wise, too. The converse is false.
• Algebraic combinations: Suppose fn → f and gn → g pointwise on I,
where f and g are functions. Then
fn ± gn → f ± g and fn gn → f g,
SOLUTION. The answers are respectively yes and no. If we fix any x in I,
then {fn (x)} is a sequence of nonnegative numbers, and so its limit, f (x), is
also nonnegative, as claimed. As for continuity, we saw in Example 1 that the
continuous (and differentiable, for that matter) functions $f_n(x) = x^n$ converge
pointwise on [0, 1] to a discontinuous limit: the function f defined by
f (x) = 0 if x ≠ 1 and f (1) = 1. ♦
The proof idea. The formal proof is clever and slightly technical, but the basic
idea is straightforward. Continuity of f at x = a means, roughly, that if x ≈ a,
then f (x) ≈ f (a). This holds in the present case because we know (i) fn (x) ≈
f (x) for large n and for all x in I; (ii) fn (x) ≈ fn (a) when x ≈ a because fn is
continuous at x = a. Putting these conditions together gives, for x ≈ a,
\[ f(x) \approx f_n(x) \approx f_n(a) \approx f(a). \]
We make this three-step approximation precise in the formal proof. (Watch for three ε/3’s.)
Putting these inequalities together shows that our chosen δ does what we need it
to do: if |x − a| < δ, then, by a three-part triangle inequality,
\[ |f(x) - f(a)| \le |f(x) - f_N(x)| + |f_N(x) - f_N(a)| + |f_N(a) - f(a)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon, \]
as desired.
EXAMPLE 6. For each positive integer n, define $f_n(x) = |x|^{1+1/n}$. Figure 4.9
shows several of the fn ; the sequence appears to approach the absolute value
function. What is happening with derivatives?
• fn → f uniformly on [−1, 1]. This follows from the fact, readily shown,
that
\[ |f(x) - f_n(x)| < \frac{1}{n} \]
for all x ∈ [−1, 1]. Again, see the exercises.
The point is that, although all the fn are differentiable everywhere, their limit f is
not differentiable at the origin. Thus, differentiability—unlike continuity—is not
preserved by uniform limits. ♦
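A quick numerical look at Example 6 may help. This is only a sketch on a finite grid (the grid size and the helper name `max_gap` are assumptions, not from the text); it compares the largest sampled gap between |x| and |x|^(1+1/n) on [−1, 1] with the bound 1/n.

```python
def f(x):
    return abs(x)

def f_n(x, n):
    return abs(x) ** (1 + 1 / n)

def max_gap(n, samples=2001):
    """Largest sampled value of |f(x) - f_n(x)| on an evenly spaced grid in [-1, 1]."""
    xs = [-1 + 2 * i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - f_n(x, n)) for x in xs)

for n in (1, 2, 5, 10, 100):
    print(n, max_gap(n), 1 / n)
```

A grid check is evidence, not a proof; the inequality itself is left to the exercises.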
Series of Functions
We saw in Section 2.5 the close relationship between series and sequences
(see Definition 2.21, page 120): Any given series
\[ \sum_{k=1}^{\infty} a_k = a_1 + a_2 + a_3 + \dots \]
has the partial sums
\[ A_1 = a_1, \quad A_2 = a_1 + a_2, \quad \dots, \quad A_n = a_1 + a_2 + \dots + a_n, \quad \dots. \]
Convergence or divergence of the series $\sum a_k$ now reduces to the same questions
for the sequence {An }. Per Definition 2.22, page 122, the series $\sum a_k$ converges
to the sum A if and only if the sequence {An } of partial sums converges to A.
Exactly the same approach works for series of functions.
\[ f_1(x) = g_1(x); \quad f_2(x) = g_1(x) + g_2(x); \quad f_3(x) = g_1(x) + g_2(x) + g_3(x); \quad \dots \]
The function series $\sum_{n=1}^{\infty} g_n$ converges (pointwise or uniformly) to the sum function
g on I if and only if the function sequence f1 , f2 , f3 , . . . converges (pointwise
or uniformly) to g on I.
Power series. Power series are function series in which the summands are
“monomials,” like $3x^7$ and $x^5/120$, and the partial sums are polynomials: familiar,
convenient, and well-behaved functions in their own right. Power series arise
naturally and usefully in theory and applications. (You’ve probably met these in
calculus courses.) Taylor and Maclaurin series are
crucial examples; we explore them briefly in the next section.
The topic of function series in general, and power series in particular, is vast.
We take just a quick glance to suggest the topic’s contours. The next example
concerns the simplest and arguably most useful power series; note the close
connection to geometric series already studied.
4.4. Sequences and Series of Functions 247
Does the power series converge to some function g on some interval I? Is the
convergence pointwise? Uniform?
the series diverges for |x| ≥ 1. In the language of function series: the geometric
power series converges pointwise for x ∈ I = (−1, 1) to the function g(x) =
1/(1 − x).
Convergence in this case is not uniform on all of I = (−1, 1), but it is
uniform if we restrict x to smaller closed intervals, such as [−0.8, 0.8]. To see why,
let ε > 0 be given. Because $\sum 0.8^n$ converges, we can choose N such that
$\sum_{k=N+1}^{\infty} 0.8^k < \varepsilon$. This is another way of saying that
$f_N(0.8) = \sum_{k=0}^{N} 0.8^k$ differs from the sum $\sum_{k=0}^{\infty} 0.8^k$
(which is g(0.8)) by less than ε. Similar reasoning
applies to any x ∈ [−0.8, 0.8]: If n > N ,
\[ |g(x) - f_n(x)| = \left| \sum_{k=n+1}^{\infty} x^k \right| \le \sum_{k=N+1}^{\infty} 0.8^k < \varepsilon, \]
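The tail estimate for the geometric power series can be checked on a grid. The sketch below is illustrative only: the helper names and grid size are assumptions, and the closed form r^(n+1)/(1 − r) for the tail is the standard geometric series formula.

```python
def partial_sum(x, n):
    """f_n(x) = 1 + x + x**2 + ... + x**n."""
    return sum(x ** k for k in range(n + 1))

def g(x):
    return 1 / (1 - x)

def tail_bound(r, n):
    """Sum of r**k over k > n, in closed form: r**(n+1) / (1 - r)."""
    return r ** (n + 1) / (1 - r)

def sampled_sup_error(r, n, samples=801):
    """Largest sampled value of |g(x) - f_n(x)| on an even grid in [-r, r]."""
    xs = [-r + 2 * r * i / (samples - 1) for i in range(samples)]
    return max(abs(g(x) - partial_sum(x, n)) for x in xs)

for n in (5, 20, 50):
    print(n, sampled_sup_error(0.8, n), tail_bound(0.8, n))

# Close to the endpoint 1 the error for a fixed n is large, which is why
# convergence cannot be uniform on all of (-1, 1).
print(sampled_sup_error(0.999, 20))
```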
More on power series. We saw above and will see again below that the limits
of convergent function sequences and series may be functions that are less well-
behaved (e.g., not differentiable) than the terms or summands that approach those
limits.
Power series turn out to be especially well-behaved in this and other respects.
Following, without proofs, are some special properties of power series not
necessarily shared by arbitrary function series. (Proofs are within our reach, but
complicated and off our main path.)
Proposition 4.20. Consider a power series $\sum a_n x^n$, and the function f defined
by
\[ f(x) = a_0 + a_1 x + a_2 x^2 + \dots = \sum_{n=0}^{\infty} a_n x^n \]
(b) If a power series function f (x) has radius of convergence R, and r < R, then
the series converges uniformly to f (x) for x ∈ [−r, r].
SOLUTION. Apply the ratio test (Theorem 2.32, page 138; see also Example 2 for a
similar calculation): Since
\[ \lim_{k\to\infty} \frac{|x|^{k+1}/(k+1)!}{|x|^k/k!} = \lim_{k\to\infty} \frac{|x|}{k+1} = 0 \]
\begin{align*}
f_0(x) &= \cos(\pi x) \\
f_1(x) &= \cos(\pi x) + a\cos(b\pi x) \\
f_2(x) &= \cos(\pi x) + a\cos(b\pi x) + a^2\cos(b^2\pi x) \\
&\;\;\vdots \\
f_n(x) &= \sum_{k=0}^{n} a^k \cos(b^k \pi x).
\end{align*}
Here a is a constant with 0 < a < 1, small enough to assure that the fn converge
uniformly; b is a positive integer, large enough to create rapid oscillation in the
cosines. We omit computational details, but here are the main points:
• The fn converge uniformly on R to some function f .
Figure 4.10 shows the first few fn , with a = 0.5 and b = 13.
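The partial sums above are easy to evaluate by machine. The following sketch uses the values a = 0.5 and b = 13 mentioned for Figure 4.10; the function names are hypothetical, and the bound a^(n+1)/(1 − a) is the usual geometric tail estimate (each cosine has absolute value at most 1), stated here as an assumption about how uniform convergence would be verified.

```python
import math

def f_n(x, n, a=0.5, b=13):
    """Partial sum: sum over k = 0..n of a**k * cos(b**k * pi * x)."""
    return sum(a ** k * math.cos(b ** k * math.pi * x) for k in range(n + 1))

def tail_bound(n, a=0.5):
    """Upper bound for |f_m(x) - f_n(x)| whenever m > n, valid for every real x."""
    return a ** (n + 1) / (1 - a)

xs = [-1 + i / 200 for i in range(401)]
gap = max(abs(f_n(x, 8) - f_n(x, 3)) for x in xs)
print(gap, tail_bound(3))
```

The bound does not depend on x, which is the hallmark of uniform convergence; it says nothing about differentiability of the limit.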
Exercises
1. In each part, show that the given sequence {fn } converges pointwise to the
given limit f on the given interval I. Is the convergence also uniform? Why
or why not?
(a) $f_n(x) = \frac{1}{n}$ for all n; f (x) = 0; I = R.
(b) $f_n(x) = \frac{x}{n}$ for all n; f (x) = 0; I = R.
(c) $f_n(x) = \frac{\sin x}{n}$ for all n; f (x) = 0; I = R.
2. Consider the sequence {fn } defined by
\[ f_n(x) = \begin{cases} n & \text{if } x \ge n, \\ 0 & \text{if } x < n. \end{cases} \]
(a) Show that {fn } converges pointwise on R to the zero function f (x) =
0. Is the convergence also uniform?
(b) Show that {fn } converges uniformly on I = [−1000, 1000] to the
zero function f (x) = 0.
for each x ∈ I, the sequence {fn (x)} is Cauchy. (Such a sequence is said
to satisfy a Cauchy criterion.)
6. In the situation and notation of Proposition 4.17, page 243, show that if
fn → f and gn → g, both pointwise on I, then fn + gn → f + g pointwise
on I. Show also that the same result holds if “pointwise” is replaced with
“uniformly.”
8. This problem is about Example 6, page 245; consider the functions fn
defined there.
(a) Let n be any positive integer. Show that $f_n(x) = |x|^{1+1/n}$ is
differentiable at x = 0, with $f_n'(0) = 0$. Hint: Look at both left- and
right-hand limits of the difference quotient for $f_n'(0)$.
(b) Let n be any positive integer. It’s a bit tricky, but we can show using
elementary calculus that $|f(x) - f_n(x)| < \frac{1}{n}$ for all x ∈ [−1, 1]. Try
it.
(c) Use the inequality in the preceding part to show that fn → f uni-
formly on [−1, 1].
12. In Example 8, page 248, we observed that the power series f (x) converges
uniformly to $e^x$ on bounded and closed intervals, like I = [−1, 1]. If ε = 1,
for instance, then N = 1 satisfies the conditions of Definition 4.16. (Use
technology to plot $e^x$ and $f_2(x) = 1 + x + x^2/2$ for −1 ≤ x ≤ 1 to see for
yourself.)
(a) Let I = [−1, 1] and ǫ = 0.01. Use plotting technology to find a value
of N that works for this I and ǫ.
(b) Let I = [−2, 2] and ǫ = 1. Use plotting technology to find a value of
N that works for this I and ǫ.
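For readers without plotting software, a grid computation can stand in for the plot. This is a sketch only: `f_n` and `sampled_sup_error` are hypothetical helper names, the grid is finite, and the code deliberately leaves the choice of N in parts (a) and (b) to the reader.

```python
import math

def f_n(x, n):
    """Partial sum 1 + x + x**2/2! + ... + x**n/n! of the exponential series."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

def sampled_sup_error(n, r, samples=2001):
    """Largest sampled value of |e**x - f_n(x)| on an even grid in [-r, r]."""
    xs = [-r + 2 * r * i / (samples - 1) for i in range(samples)]
    return max(abs(math.exp(x) - f_n(x, n)) for x in xs)

# The case mentioned in the problem statement: f_2 on [-1, 1] with epsilon = 1.
print(sampled_sup_error(2, 1.0))
```

Calling `sampled_sup_error(n, r)` for increasing n, and comparing with ε, is one way to search for a suitable N.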
13. In the spirit of Example 8, page 248, consider the power series
\[ f(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots = \sum_{k=1}^{\infty} (-1)^{k+1} \frac{x^{2k-1}}{(2k-1)!}. \]
(a) Use the ratio test to show that the series f (x) converges (in absolute
value) for all x.
(b) Differentiate f term by term—twice—to see that f ′′ = −f ; notice
also f (0) = 0 and f ′ (0) = 1. What famous calculus function has
these properties?
(c) Differentiate f term by term to find a power series for f ′ . What fa-
mous function is this? How do you know?
(d) Use technology to plot some partial sums of f and f ′ for −π ≤ x ≤
π. Do the results look familiar?
14. This problem is about the power series $g(x) = x + x^2/2 + x^3/3 + x^4/4 + \dots$.
(a) Use the ratio test to show that g(x) converges for −1 < x < 1.
(b) Explain why g ′ (x) = 1/(1 − x) for −1 < x < 1. Note that g(0) = 0,
too.
(c) Explain why g(x) = − ln(1 − x) for −1 < x < 1.
(d) The 5th partial sum of g(x) is $f_5(x) = x + x^2/2 + x^3/3 + x^4/4 + x^5/5$,
and $f_5(1/2) = 661/960$. Interpret this as an estimate to the natural
logarithm of something.
mean value theorem. Here is a slightly modified form of the version from Sec-
tion 4.3:
Theorem 4.21 (Taylor’s theorem, n = 3). Suppose f , f ′ , f ′′ , and f ′′′ all exist
and are continuous on [a, b]. Then, for some input c in [a, b], we have
\[ f(b) - \left( f(a) + f'(a)(b-a) + \frac{f''(a)}{2}(b-a)^2 \right) = \frac{f'''(c)}{6}(b-a)^3. \]
As we observed earlier, Taylor’s formula says that if the right-hand quantity
is small, then so is the error committed in the approximation
\[ f(b) \approx f(a) + f'(a)(b-a) + \frac{f''(a)}{2}(b-a)^2. \]
See Example 2, page 235, for more on this perspective.
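To make Theorem 4.21 concrete, here is a small sketch for one hypothetical choice: f(x) = e^x with a = 0 and b = 1 (these values are assumptions chosen for illustration, not taken from the text). Since every derivative of e^x is e^x, the equation in the theorem can be solved explicitly for c.

```python
import math

def taylor_c_for_exp(a, b):
    """Solve f'''(c)/6 * (b - a)**3 = f(b) - P_2(b) for c, where f = exp."""
    h = b - a
    # Quadratic Taylor polynomial of exp based at a, evaluated at b.
    p2 = math.exp(a) * (1 + h + h ** 2 / 2)
    error = math.exp(b) - p2
    # f'''(c) = exp(c), so exp(c) = 6 * error / h**3.
    return math.log(6 * error / h ** 3), error

c, error = taylor_c_for_exp(0.0, 1.0)
print(c, error)
```

The computed c lies in [0, 1], as the theorem promises; the theorem itself only asserts that such a c exists and gives no formula for it in general.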
As the examples show, the function f and the polynomial Pn have the same
value and first n derivatives at x = a. In this sense Pn could reasonably be
described as the “best polynomial approximation” to f at x = a. The version
of Taylor’s theorem above says something about how closely P2 (b) approximates
f (b) when calculations are based at x = a.
\[ |f(x) - P_n(x)| \le K\,\frac{|x|^{n+1}}{(n+1)!} \le K\,\frac{|b|^{n+1}}{(n+1)!} \]
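The inequality can be tested on a sample function. The sketch below assumes that K is a bound on |f^(n+1)| over [−b, b] and that P_n is based at 0; it uses f(x) = sin x (an illustrative choice, not from the text), for which K = 1 works for every n. The helper names are hypothetical.

```python
import math

def taylor_sin(x, n):
    """Maclaurin polynomial P_n of sin: the terms of degree at most n."""
    return sum((-1) ** j * x ** (2 * j + 1) / math.factorial(2 * j + 1)
               for j in range((n + 1) // 2))

def remainder_bound(x, n, K=1.0):
    """K * |x|**(n+1) / (n+1)!, the right-hand side of the inequality."""
    return K * abs(x) ** (n + 1) / math.factorial(n + 1)

b = 2.0
xs = [-b + 2 * b * i / 400 for i in range(401)]
for n in range(8):
    worst = max(abs(math.sin(x) - taylor_sin(x, n)) - remainder_bound(x, n) for x in xs)
    print(n, worst)  # nonpositive (up to rounding) when the inequality holds
```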
1. What does the theorem say for $f(x) = e^x$, n = 4, and b = 2? Use plotting
technology to check that the claimed inequality does indeed hold.
2. What does the theorem say for $f(x) = x^4$, n = 4, and I = [−2, 2]? What
if $f(x) = x^5$?
1. Let g(x) = f (x) − Pn (x) for x ∈ [−b, b]. Show that
$0 = g(0) = g'(0) = g''(0) = \dots = g^{(n)}(0)$, and $g^{(n+1)}(x) = f^{(n+1)}(x)$.
2. Suppose h and k are functions on [−b, b] with (i) h(0) = k(0), and (ii)
h′ (x) ≤ k ′ (x) for 0 ≤ x ≤ b. Then h(x) ≤ k(x) for 0 ≤ x ≤ b. Hint: The
mean value theorem or a corollary may be useful.
3. Our hypothesis is that
\[ -K\,\frac{x^2}{2} \le g^{(n-1)}(x) \le K\,\frac{x^2}{2} \quad \text{for } x \in [0, b]. \]
Repeating this process a total of n + 1 times gives
\[ -K\,\frac{x^{n+1}}{(n+1)!} \le g(x) \le K\,\frac{x^{n+1}}{(n+1)!} \]
Taylor series. If f has derivatives of all orders at x = a, then we can write its
Taylor series:
\[ S(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2}(x-a)^2 + \dots = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x-a)^k. \]
are familiar from elementary calculus. In this chapter we interpret the integral, de-
fine it rigorously, and explore some of its properties—including the fundamental
theorem of calculus, which justifies calculations like the one above.
look a little different, but all four mean the same thing. The first two forms use dif-
ferent variable names, but these choices are arbitrary—all four expressions have
the same numerical value. For simplicity and economy, we’ll often drop the vari-
able name entirely, and use the last form. We’ll use the other forms when we want
to emphasize a variable name or, as in the third form, when no specific function
name is given.
Riemann’s and other integrals. In this book (and in elementary calculus) “inte-
gral” means “Riemann integral,” after the German mathematician G. F. B. Rie-
mann (1826–1866), who first defined integrability rigorously. Riemann’s mathe-
matical accomplishments, despite his short life, ranged across the discipline. One
of his conjectures in number theory, now known as the Riemann hypothesis, re-
mains after 150 years among mathematics’ most important unsolved problems.
And not for lack of trying: A $1 million prize for its solution, offered by the Clay
Mathematics Institute, lies unclaimed.
258 5. Integrals
The phrase “Riemann integral” honors a person, but it also distinguishes one
particular approach to integration from several others, each with its own features
and (depending on one’s viewpoint) bugs. Among important alternatives to Rie-
mann’s integral are the Lebesgue and the Henstock integrals, developed around
1900 and 1950, respectively. These integrals “agree with” the Riemann integral
for all the standard functions of calculus, but they are more general in the sense
that they handle larger classes of functions. More advanced courses in analysis
treat such integrals carefully, but we will stick with Riemann’s version.
Areas and integrals. The most familiar view of integrals from elementary
calculus involves area: If f (x) ≥ 0 for x ∈ [a, b], then $\int_a^b f(x)\,dx$ measures the area
above the x-axis, below the curve y = f (x), and between the vertical lines x = a
and x = b. If f (x) < 0 for some inputs, then area below the x-axis is involved,
and counts as negative. (Draw your own pictures to illustrate the possibilities.)
Thinking of integrals as areas will often be helpful for us, too. This view
suggests, correctly, that
\[ \int_1^7 3\,dx = 18, \quad \int_0^3 (1 + x)\,dx = \frac{15}{2}, \quad \text{and} \quad \int_0^{3.14} \cos(x)\,dx \approx 0, \]
assuming, as we do for now, that all three integrals exist. (Is the last integral slightly
positive or slightly negative? Why?) But extra care will be
needed for other functions, whose graphs may be ragged or broken, not smooth.
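The three area claims can be sanity-checked with a crude numerical approximation. The sketch below uses midpoint sums; the function name `midpoint_sum` and the number of subintervals are assumptions made for illustration, and a numerical approximation is of course no substitute for the definition that follows.

```python
import math

def midpoint_sum(f, a, b, n=10_000):
    """Approximate the area under f on [a, b] using n equal-width midpoint rectangles."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

print(midpoint_sum(lambda x: 3.0, 1, 7))      # compare with 18
print(midpoint_sum(lambda x: 1 + x, 0, 3))    # compare with 15/2
print(midpoint_sum(math.cos, 0, 3.14))        # compare with 0
```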
really mean? For the first integral, the integral-as-area view is enough: 9 is the
only reasonable answer. (Draw your own pictures for the first two integrals.) For
the second integral, the area view suggests 2 is possible, but is it exact? Why not
2.034? Or 1.957? For the third integral, the negative
result and elementary calculus intuition suggest that f (x) is in some sense more
negative than positive for x in the interval [a, b]. But in what sense is the answer
exactly −7?
Resolving all these questions asks a lot of the integral, and so it is no surprise
that the recipe is complicated. Here, first, we assemble the ingredients:
• Integrand and interval: The definition requires an integrand—a real-valued
function f : [a, b] → R defined for all inputs x in a closed and bounded
interval [a, b]. Note, in particular, that although integral equations like
\[ \int_1^{\infty} \frac{1}{x^2}\,dx = 1 \quad \text{and} \quad \int_0^1 \frac{1}{\sqrt{x}}\,dx = 2 \]
5.1. The Riemann Integral: Definition and Examples 259
are sometimes seen in elementary calculus, the integrals in question are not
of the type considered here. (That’s why we call them “improper.”)
[Figure: the interval [a, b], showing partition points x1 , . . . , x4 and sampling points s1 , . . . , s5 .]
EXAMPLE 1. Explore some Riemann sums for $\int_0^2 \sin(x)\,dx$. (We don’t know
“officially” yet that the integral exists. But it does.)
SOLUTION. The integral involves the integrand f (x) = sin x and the interval
[0, 2]. The simplest and least interesting partition of [0, 2] involves no chopping:
P = {x0 , x1 } = {0, 2} has just one subinterval, and ‖P‖ = 2. Here we need
just one sampling point, say, s1 = 1.3657 (chosen at random by Mathematica), and
the corresponding Riemann sum is simply
\[ f(s_1)\,\Delta x_1 = \sin(1.3657) \cdot 2 \approx 1.9581. \]
With so little work or thought invested, we can’t expect much return from the
answer.
Let’s work (just) a little harder. With
With the same partition, but new samples S = {0.8, 0.8, 1.7, 1.7}, the Riemann
sum becomes
With right endpoint samples S = {0.01, 0.02, . . . , 2.0} from the same partition
the result is not much different; Mathematica gives about 1.421. These numbers,
and all the extra work, deserve more credibility. (Rightly so. The exact value, as
we will be able to prove soon, is 1 − cos 2 ≈ 1.41615.) ♦
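The Riemann sums in Example 1 are simple to reproduce. The following is a minimal sketch in Python rather than Mathematica; the function name `riemann_sum` is hypothetical, and the fine partition {0, 0.01, . . . , 2.0} is inferred from the right-endpoint samples quoted above.

```python
import math

def riemann_sum(f, partition, samples):
    """Sum of f(s_i) * (x_i - x_{i-1}), with one sample s_i per subinterval."""
    if len(samples) != len(partition) - 1:
        raise ValueError("need exactly one sample per subinterval")
    total = 0.0
    for i, s in enumerate(samples, start=1):
        left, right = partition[i - 1], partition[i]
        if not left <= s <= right:
            raise ValueError("sample lies outside its subinterval")
        total += f(s) * (right - left)
    return total

# One subinterval, one sample.
one_piece = riemann_sum(math.sin, [0.0, 2.0], [1.3657])

# 200 subintervals of width 0.01, sampled at right endpoints.
fine = [i / 100 for i in range(201)]
right_endpoints = riemann_sum(math.sin, fine, fine[1:])

exact = 1 - math.cos(2)
print(one_piece, right_endpoints, exact)
```

The helper rejects samples that fall outside their subintervals, which mirrors the requirement that each s_i lie in [x_{i−1}, x_i].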
We are ready at last for the formal definition.
EXAMPLE 2. Let f (x) = 3 for all x and [a, b] = [1, 7]. Thinking of area
suggests that $\int_1^7 f = 18$. Does the definition agree? Can the idea be generalized?
which is exactly the desired answer. Thus, for a given ǫ > 0 we can choose any
positive δ, say δ = 42, and the definition is satisfied. ♦
There’s nothing special, of course, about the data in the preceding example.
Proposition 5.2. If f (x) = k is any constant function and [a, b] any interval, then
f is integrable on [a, b], and
\[ \int_a^b k\,dx = k \cdot (b - a). \]
Proof: The idea is simple: since all Riemann sums for $I = \int_a^b f$ are nonnegative,
I itself must be nonnegative, too. The formal proof takes some care.
Let ε > 0 be given. Choose δ > 0 that works for this ε in the sense of
Definition 5.1. If we choose any Riemann sum RS (f, P, S) with ‖P‖ < δ, then
we must have |RS (f, P, S) − I| < ε. Because f (x) ≥ 0 for all x ∈ [a, b], it
is clear that RS (f, P, S) ≥ 0, too, and so we must have I ≥ −ε. Since this
inequality holds for all positive ε, we must have I ≥ 0.
Stranger integrands. In Proposition 5.3 we assumed that the integral exists. De-
ciding which integrals exist takes some work, and we start slowly. The following
integrand is discontinuous, but only at one point. Does it matter?
EXAMPLE 3. Let
\[ f(x) = \begin{cases} 1 & \text{if } x = 1, \\ 0 & \text{if } x \ne 1. \end{cases} \]
Does $\int_0^2 f$ exist?
SOLUTION. Thinking about area suggests $\int_0^2 f = 0$; the graph of f seems too
skinny to bound appreciable area above the x-axis. (Draw your own.)
Proving this is not difficult. Let ε > 0 be given, set δ = ε/2, and let P =
{x0 , x1 , . . . , xn } be any partition with ‖P‖ < δ. We need to show that if S =
{s1 , s2 , . . . , sn } is any set of sampling points for P, then the associated Riemann
sum satisfies
\[ -\varepsilon < f(s_1)\Delta x_1 + \dots + f(s_n)\Delta x_n < \varepsilon. \]
Since all of the f (si )∆xi are nonnegative, it is enough to prove the right-hand
inequality. (The left-hand inequality is obvious.)
Now f (si ) = 0 unless si = 1, in which case f (si ) = 1. Thus, each summand
f (si )∆xi is either 0 or ∆xi . Because the “offender point” x = 1 can lie in at
most two subintervals, say $[x_{i-1}, x_i]$ and $[x_i, x_{i+1}]$, our Riemann sum can have at
most two nonzero summands. (Very likely, all summands are zero.) Adding
everything up gives
\[ \sum_{i=1}^{n} f(s_i)\Delta x_i = f(s_i)\Delta x_i + f(s_{i+1})\Delta x_{i+1} \le \Delta x_i + \Delta x_{i+1} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]
as desired. ♦
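The estimate in Example 3 can be illustrated with worst-case samples. In this sketch (helper names, the random seed, and the partition sizes are all assumptions), the sample point is taken to be 1 in every subinterval that contains 1, which makes the Riemann sum as large as possible; the result is then compared with twice the norm of the partition.

```python
import random

def f(x):
    return 1.0 if x == 1 else 0.0

def random_partition(a, b, n, rng):
    """A partition of [a, b] with n subintervals and random interior points."""
    interior = sorted(rng.uniform(a, b) for _ in range(n - 1))
    return [a] + interior + [b]

def norm(partition):
    return max(r - l for l, r in zip(partition, partition[1:]))

def worst_case_sum(partition):
    """Riemann sum for f when the sample is 1 in every subinterval containing 1."""
    total = 0.0
    for left, right in zip(partition, partition[1:]):
        s = 1.0 if left <= 1.0 <= right else left
        total += f(s) * (right - left)
    return total

rng = random.Random(0)
for n in (10, 100, 1000):
    P = random_partition(0.0, 2.0, n, rng)
    print(n, worst_case_sum(P), 2 * norm(P))
```

Even in the worst case the sum is at most 2‖P‖, so choosing ‖P‖ < ε/2 forces the sum below ε.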
S OLUTION . No. The problem, roughly speaking, is that for every partition P =
{x0 , x1 , x2 , . . . , xn } of [0, 1], no matter how “fine,” there are both rational and
irrational numbers inside every subinterval [xi−1 , xi ]. If all of the sample points
si are chosen to be rational, then g(si ) = 1 for all i, and the corresponding
Riemann sum works out to
\[ RS(g, P, S) = \sum_{i=1}^{n} g(s_i)\Delta x_i = \sum_{i=1}^{n} 1 \cdot \Delta x_i = 1. \]
This situation—widely varying Riemann sums even for “fine” partitions—is in-
compatible with Definition 5.1. ♦
We’ll tackle the left-hand inequality and leave the right-hand one as an exercise.
Because P is a partition of [0, 4], we can choose among the partition points
the first xi with xi ≥ 1 and the last xi with xi ≤ 3. If, say, these points are x42
· · · < x41 < 1 ≤ x42 < x43 < · · · < x236 < x237 ≤ 3 < x238 < . . . .
This implies that 1 ≤ si ≤ 3 for i = 43, 44, . . . , 237, and so f (si ) = 5 for all
these i. Moreover, because kPk < δ, we know that
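\begin{align*}
&= 5\,\Delta x_{43} + \dots + 5\,\Delta x_{237} \\
&= 5\,(x_{237} - x_{42}) \qquad \text{(the $\Delta x_i$ “collapse”)} \\
&> 5\,(2 - 2\delta) = 10 - 10\delta = 10 - \varepsilon,
\end{align*}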
as we aimed to show. ♦
Lessons from the examples. The examples illustrate one pleasant and one less
pleasant property of integrals. The good news is that the Riemann integral is rel-
atively forgiving of minor misbehavior, such as occasional discontinuities, in an
integrand. Less convenient is the fact that proofs and calculations using the
definition, even for quite simple integrands, can be messy and technical (as we just
saw in Example 5). The moral
is that we’d like to have both simpler tests for integrability and more efficient
methods of calculating integrals. We’ll find some in the following sections.
About the proof. We sketch the proof of a special case, leaving generalities
to the exercises. Suppose, say, that $\int_0^3 f = 42$. Now let ε = 1. Choose
δ > 0 that works for this ε in the sense described in Definition 5.1, and let
P = {x0 , x1 , x2 , . . . , xn } be any partition of [0, 3] with ‖P‖ < δ. If we can
show that f is bounded on each of the n subintervals $[x_{i-1}, x_i]$, we can conclude
that f is bounded on all of [0, 3].
To see why f is bounded on, say, [x6 , x7 ], consider the expression
\[ S(t) = f(x_1)\Delta x_1 + \dots + f(x_6)\Delta x_6 + f(t)\Delta x_7 + f(x_8)\Delta x_8 + \dots + f(x_n)\Delta x_n. \]
(Here t is the only variable; all the xi are constants.)
Here is the key idea: For each t ∈ [x6 , x7 ], S(t) is a Riemann sum, based on the
partition P, for the integral $\int_0^3 f$, and so Definition 5.1 guarantees that
\[ |S(t) - 42| < 1, \]
which means that S(t) is bounded for t ∈ [x6 , x7 ]. This implies, as desired, that
f (t) is bounded for t ∈ [x6 , x7 ].
Exercises
1. Let $f(x) = x^2$ and consider the integral $I = \int_0^3 f$. (We know from
elementary calculus, and we’ll prove soon, that I = 9.)
(a) Using the partition P = {0, 1, 2, 3}, find a sample set S for which
RS(f, P, S) = 8.
(b) Using the partition P = {0, 1, 2, 3}, find a sample set S for which
RS(f, P, S) = 9.
3. Suppose that f is integrable on [0, 2] and that f (x) ≥ 3 for all x ∈ [0, 2].
Show that $\int_0^2 f \ge 6$. (Mimic the proof of Proposition 5.3, page 262.)
5. Let f (x) = 0 for x ≠ 0 and f (0) = 42. We explore the proof that f is
integrable on [0, 1] and $\int_0^1 f = 0$.
8. What is the converse of Proposition 5.3? Is it true? If so, why? If not, give
a counterexample.
9. Prove that the value of I in Definition 5.1 is unique. (In other words, if both
I1 and I2 satisfy the definition, then I1 = I2 .)
10. Suppose that $I = \int_a^b f$ exists. Show that the following “Cauchy
condition” holds: for all ε > 0 there exists δ > 0 such that if P1 and P2 are
two partitions of [a, b] with ‖P1‖ < δ and ‖P2‖ < δ, and S1 and S2 are
corresponding sample sets, then |RS(f, P1 , S1 ) − RS(f, P2 , S2 )| < ε.
11. Suppose that f is continuous on [a, b]. We will show later that f is also
integrable on [a, b]; here are some steps in that direction.
(a) Explain why the following condition holds: for any given ε > 0, there
exists some δ > 0 such that if s and t are in [a, b] and |s − t| < δ,
then |f (s) − f (t)| < ε.
(b) Let ε and δ be as in the preceding part, let P = {x0 , x1 , x2 , . . . , xn }
be any partition with ‖P‖ < δ, and let S = {s1 , s2 , . . . , sn } and
T = {t1 , t2 , . . . , tn } be two different sample sets for P. Show that
Integration is linear. Before proving the theorem we observe that we can apply
it repeatedly to handle more complicated linear combinations. If, say, f , g, and
h are all integrable on [2, 5], then the following equation makes good sense—and
it’s true:
\[ \int_2^5 \left( 3f - 7g + \frac{\pi}{4}h \right) = 3\int_2^5 f - 7\int_2^5 g + \frac{\pi}{4}\int_2^5 h. \]
A fancier way to state these ideas uses the language of linear algebra. The set
V of integrable functions on [a, b] is a vector space, and integration on [a, b] is a
linear transformation from V to R (another vector space!).
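Proving Theorem 5.5. We’ll sketch the proof for sums and leave the rest to the
exercises. For brevity we write $I_f$ for $\int_a^b f$ and $I_g$ for $\int_a^b g$.
Let ε > 0 be given. Since $I_f$ and $I_g$ exist, there are positive numbers $\delta_f$ and
$\delta_g$ such that, for any sampling points s1 , . . . , sn ,
\begin{align*}
\|P\| < \delta_f &\implies \left| \sum_{i=1}^{n} f(s_i)\Delta x_i - I_f \right| < \frac{\varepsilon}{2}; \\
\|P\| < \delta_g &\implies \left| \sum_{i=1}^{n} g(s_i)\Delta x_i - I_g \right| < \frac{\varepsilon}{2}.
\end{align*}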
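Now let $\delta = \min\{\delta_f, \delta_g\}$; we’ll show that this δ “works” in the sense of
Definition 5.1 for the sum function f + g. To do so, consider any partition P =
{x0 , x1 , x2 , . . . , xn } with ‖P‖ < δ, and let S = {s1 , s2 , . . . , sn } be any
corresponding choice of sampling points. Now we calculate (watch for the triangle inequality):
\begin{align*}
\left| \sum_{i=1}^{n} \bigl( f(s_i) + g(s_i) \bigr) \Delta x_i - (I_f + I_g) \right|
&= \left| \left( \sum_{i=1}^{n} f(s_i)\Delta x_i - I_f \right) + \left( \sum_{i=1}^{n} g(s_i)\Delta x_i - I_g \right) \right| \\
&\le \left| \sum_{i=1}^{n} f(s_i)\Delta x_i - I_f \right| + \left| \sum_{i=1}^{n} g(s_i)\Delta x_i - I_g \right| \\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{align*}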
This shows what we claimed: f + g is integrable on [a, b], with integral If + Ig .
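The algebra behind the proof is visible already at the level of Riemann sums: for a fixed partition and fixed samples, the Riemann sum of a linear combination is the same linear combination of the Riemann sums. The sketch below checks this numerically; the integrands sin and exp, the interval [2, 5], the seed, and the helper name are all illustrative assumptions.

```python
import math
import random

def riemann_sum(f, partition, samples):
    """Sum of f(s_i) * (x_i - x_{i-1}) over the subintervals of the partition."""
    return sum(f(s) * (right - left)
               for s, left, right in zip(samples, partition, partition[1:]))

rng = random.Random(0)
cuts = sorted(rng.uniform(2, 5) for _ in range(49))
partition = [2.0] + cuts + [5.0]
samples = [rng.uniform(l, r) for l, r in zip(partition, partition[1:])]

f, g = math.sin, math.exp
lhs = riemann_sum(lambda x: 3 * f(x) - 7 * g(x), partition, samples)
rhs = 3 * riemann_sum(f, partition, samples) - 7 * riemann_sum(g, partition, samples)
print(lhs, rhs)
```

The two numbers agree up to floating-point rounding; the ε and δ bookkeeping in the proof is what carries this identity over to the integrals themselves.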
Bigger integrands, bigger integrals. Theorem 5.5 has a simple and natural corol-
lary; we leave the proof to the exercises.
Corollary 5.6. Let f and g be integrable functions on [a, b].
If f (x) ≤ g(x) for all x ∈ [a, b], then $\int_a^b f \le \int_a^b g$.
Combining ideas from Example 3, page 262, and Theorem 5.5 produces some
possible surprises.
EXAMPLE 1. Suppose
SOLUTION. Yes, and $\int_0^{1000} f = 0$. Example 3, page 262, slightly modified,
shows that functions like $f_{17} : [0, 1000] \to R$, given by
\[ f_{17}(x) = \begin{cases} 17 & \text{if } x = 17, \\ 0 & \text{otherwise} \end{cases} \]
are all integrable on [0, 1000], and that $\int_0^{1000} f_{17} = 0$. Our given f satisfies
\[ f = f_1 + f_2 + \dots + f_{1000}, \]
5.2. Properties of the Integral 271
as desired. ♦
EXAMPLE 2. Suppose f is integrable on [a, b], with $\int_a^b f = 42$, and suppose
g(x) = f (x) for all but finitely many x in [a, b]. What can be said about $\int_a^b g$?
SOLUTION. In this case, $\int_a^b g = 42$, too. To see why, consider the difference
function g − f . Since g(x) − f (x) = 0 for all but finitely many x, the method of
Example 1 shows that $\int_a^b (g - f) = 0$. By Theorem 5.5,
\[ \int_a^b g = \int_a^b \bigl( f + (g - f) \bigr) = \int_a^b f + \int_a^b (g - f) = 42 + 0 = 42. \]
Here is another way to express the result: If $\int_a^b f = I$, then altering f (x) at
finitely many points in [a, b] leaves the integral unchanged. ♦
Joinery
Another way to build new integrals from old is to stick “pieces” of various func-
tions together. Interestingly, the pieces need not fit together continuously. In this
sense, the integral is more forgiving than the derivative, which requires smoother
joinery.
Justifying such calculations rigorously takes a little effort. The main idea is in the
next lemma. ♦
Lemma 5.7. Suppose that f : [a, b] → R is integrable, with $\int_a^b f = I$. Assume
c > b, so [a, b] ⊂ [a, c], and consider the “extended” function $\bar f : [a, c] \to R$
defined by
\[ \bar f(x) = \begin{cases} f(x) & \text{if } x \in [a, b], \\ 0 & \text{if } x > b. \end{cases} \]
Then $\bar f$ is integrable on [a, c], and
\[ \int_a^c \bar f = \int_a^b f = I. \]
The live issue, by the way, is integrability of $\bar f$. Once we know that, it is no
surprise that extending f in the way described doesn’t change the value of the
integral I. (Draw your own picture of a typical f and $\bar f$.) The proof, like others
that involve the definition of the integral, is a
bit messy, but it is worth sticking with as an example of its type. The main idea is
that Riemann sums for $\int_a^c \bar f$ can’t differ by much from Riemann sums for $\int_a^b f$.
Proof: Let ε > 0 be given. We need to find δ > 0 so that if P is any partition of
[a, c] with ‖P‖ < δ, and S is any choice of sampling points for P, then
\[ \left| \sum_{i=1}^{n} \bar f(s_i)\Delta x_i - I \right| < \varepsilon. \]
Let’s use what we know about the original integral $\int_a^b f$ to produce such a δ.
First, since $\int_a^b f$ exists, we can choose $\delta_f > 0$ (the ε/2 will come in handy later) so that
\[ \left| \sum_{i=1}^{m} f(s_i)\Delta x_i - I \right| < \frac{\varepsilon}{2} \]
We need to show that S lies within ε of I. To do so, let $x_{i_0}$ be the largest member
of P that does not exceed b; that is, $x_{i_0} \le b < x_{i_0+1}$. Now for $i > i_0 + 1$ we
know $\bar f(s_i) = 0$, while for $i \le i_0$ we have $\bar f(s_i) = f(s_i)$. With these facts in
mind, we can rewrite S:
\[ S = f(s_1)\Delta x_1 + f(s_2)\Delta x_2 + \dots + f(s_{i_0})\Delta x_{i_0} + \bar f(s_{i_0+1})\Delta x_{i_0+1}. \]
This form reveals that S is almost a Riemann sum for $\int_a^b f$. Indeed, the similar
sum
\[ S' = f(s_1)\Delta x_1 + f(s_2)\Delta x_2 + \dots + f(s_{i_0})\Delta x_{i_0} + f(b)\,(b - x_{i_0}) \]
is a Riemann sum for $\int_a^b f$, corresponding to the new partition
$P' = \{a, x_1, x_2, \dots, x_{i_0}, b\}$; note that ‖P′‖ < δ. (If $x_{i_0} = b$ then the last
summand in S′ vanishes.)
To finish the proof, we observe that, by our hypothesis,
\[ |S' - I| < \frac{\varepsilon}{2}. \tag{$**$} \]
Also, S and S′ differ by (at most) two terms, each of which is small:
\begin{align*}
|S - S'| &= \left| \bar f(s_{i_0+1})\Delta x_{i_0+1} - f(b)\,(b - x_{i_0}) \right| \\
&\le \left| \bar f(s_{i_0+1})\Delta x_{i_0+1} \right| + \left| f(b)\,(b - x_{i_0}) \right| \\
&< M\delta + M\delta \le 2M\,\frac{\varepsilon}{4M} = \frac{\varepsilon}{2}.
\end{align*}
Combining this with inequality (∗∗) gives
\[ |S - I| \le |S - S'| + |S' - I| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]
as desired.
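A small experiment illustrates the lemma. Everything specific here is an assumption made for the sketch: f(x) = x² on [a, b] = [0, 1] (so I = 1/3 and M = 1), c = 2, random partitions and samples, and the hypothetical helper names. For this particular f a short estimate gives the bound 3‖P‖ on the distance between a Riemann sum of the extended function and I.

```python
import random

def f(x):
    return x * x

def f_bar(x, b=1.0):
    """Extension of f from [0, b] to [0, c]: equal to f on [0, b], zero beyond b."""
    return f(x) if x <= b else 0.0

def random_riemann_sum(n, rng, a=0.0, c=2.0):
    """Riemann sum of f_bar on [a, c] for a random partition with random samples."""
    cuts = sorted(rng.uniform(a, c) for _ in range(n - 1))
    partition = [a] + cuts + [c]
    pieces = list(zip(partition, partition[1:]))
    total = sum(f_bar(rng.uniform(l, r)) * (r - l) for l, r in pieces)
    return total, max(r - l for l, r in pieces)

rng = random.Random(1)
for n in (10, 100, 5000):
    total, mesh = random_riemann_sum(n, rng)
    print(n, total, abs(total - 1 / 3), 3 * mesh)
```

Only the one subinterval straddling b behaves differently from a Riemann sum for f on [a, b], which is the point of the proof.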
Note that f (x) = f1 (x) + f2 (x) for all x ∈ [a, b]. Now Lemma 5.7 says that both
f1 and f2 are integrable on [a, b], and that
\[ \int_a^b f_1 = \int_a^c f \quad \text{and} \quad \int_a^b f_2 = \int_c^b f. \]
as claimed.
P = {x0 = a, x1 , x2 , . . . xn = b}
Adding all n of these results gives the desired Riemann sum (watch the summands collapse):
\begin{align*}
\sum_{i=1}^{n} f(s_i)\Delta x_i &= \sum_{i=1}^{n} \bigl( F(x_i) - F(x_{i-1}) \bigr) \\
&= F(x_1) - F(x_0) + F(x_2) - F(x_1) + \dots + F(x_n) - F(x_{n-1}) \\
&= F(x_n) - F(x_0) = F(b) - F(a).
\end{align*}
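The collapsing sum can be watched in action for one concrete pair. The sketch assumes f(x) = x² with antiderivative F(x) = x³/3 on [1, 5], and it assumes that each sample s_i is chosen so that f(s_i)Δx_i = F(x_i) − F(x_{i−1}), as the mean value theorem permits; for this f that point has an explicit formula. The names and the random partition are illustrative.

```python
import math
import random

def F(x):
    return x ** 3 / 3

def f(x):
    return x ** 2

def mvt_sample(left, right):
    """The point s in [left, right] with f(s) * (right - left) = F(right) - F(left)."""
    # (right**3 - left**3)/3 = (right - left) * (right**2 + right*left + left**2)/3
    return math.sqrt((right * right + right * left + left * left) / 3)

rng = random.Random(2)
cuts = sorted(rng.uniform(1, 5) for _ in range(19))
partition = [1.0] + cuts + [5.0]

total = sum(f(mvt_sample(l, r)) * (r - l) for l, r in zip(partition, partition[1:]))
print(total, F(5) - F(1))
```

With these special samples the Riemann sum equals F(b) − F(a) for every partition, however coarse.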
Good news—and some cautions. Theorem 5.9 should look familiar. It allows
many standard integral calculations of elementary calculus, like this one:
\[ \int_1^5 x^2\,dx = \left. \frac{x^3}{3} \right|_1^5 = \frac{5^3}{3} - \frac{1^3}{3} = \frac{124}{3}. \]
Avoiding all that fuss over partitions, norms, and Riemann sums seems—and is—
a big advantage in calculating a lot of integrals. But some sticky questions remain:
• Can we find an antiderivative? In the preceding calculation, with $f(x) = x^2$,
it was easy to find (or just to know) that $F(x) = x^3/3$ is a suitable
antiderivative. For other functions it can be much harder, or even impossible,
to find antiderivative formulas. For instance, neither of the
harmless-looking functions
\[ f(x) = \cos(x^2) \quad \text{and} \quad g(x) = \cos(x)\ln(x) \]
has an “elementary antiderivative,” that is, a function built from standard function
“elements”: polynomials, trigonometric functions, logarithms, etc. (Ask
Mathematica or Maple to antidifferentiate these functions; notice the strange
ingredients in the answer.) Without
a suitable antiderivative F , Theorem 5.9 is useless for calculation.
• Is f integrable? We assumed in the theorem, and used crucially in the
proof, that $\int_a^b f$ exists. Elementary calculus courses often skirt the question
of integrability, perhaps forgivably both because the matter is subtle
and because the basic functions of elementary calculus turn out to be integrable
on intervals within their domains.
Theorem 5.9 dodges the tough questions above by simply assuming, as hy-
potheses, that all is well. Doing so simplifies the proof, but it weakens the theo-
rem. In the next section, we’ll grapple more seriously with the question of which
functions are integrable. A key result (which applies to all basic functions of ele-
mentary calculus) is that every function f continuous on a closed interval [a, b] is
also integrable there.
EXAMPLE 5. Show that $\int_0^1 2x\,dx = 1$; use Theorem 5.9 and Definition 5.1.
SOLUTION. The easy part is finding a value for the integral. Since the integrand
f (x) = 2x has antiderivative $F(x) = x^2$, and F (1) − F (0) = 1, Theorem 5.9
says that one is the only possible value. (We could have thought about areas, too.)
The tricky bit is showing that the integral exists. We have, for now, only
Definition 5.1 to work with, so we start as usual with a given positive ε. Let
δ = ε; we’ll use some clever algebra and a nice collapsing sum to show that this
δ “works.”
Let P = {x0 , x1 , x2 , . . . , xn } be any partition of [0, 1] with ‖P‖ < δ. We
claim that, for any samples S = {s1 , s2 , . . . , sn } drawn from P,
\[ 1 - \varepsilon < \sum_{i=1}^{n} f(s_i)\Delta x_i = \sum_{i=1}^{n} 2 s_i \Delta x_i < 1 + \varepsilon. \]
We will prove just the second inequality, leaving the first as an (easy) exercise.
The key observation is that, since $s_i \le x_i$ for all i,
\[ \sum_{i=1}^{n} 2 s_i \Delta x_i \le \sum_{i=1}^{n} 2 x_i \Delta x_i. \]
It suffices, therefore, to show that the right-hand sum above can’t exceed 1 + ε.
For this we use an algebraic trick. If we write
\[ 2x_i = (x_i + x_{i-1}) + (x_i - x_{i-1}) = x_i + x_{i-1} + \Delta x_i \]
for each i, then substitution and a little algebra give
\begin{align*}
\sum_{i=1}^{n} 2 x_i \Delta x_i &= \sum_{i=1}^{n} ( x_i + x_{i-1} + \Delta x_i )\, \Delta x_i \\
&= \sum_{i=1}^{n} \bigl( (x_i + x_{i-1})\, \Delta x_i + \Delta x_i^2 \bigr) \\
&= \sum_{i=1}^{n} \bigl( x_i^2 - x_{i-1}^2 \bigr) + \sum_{i=1}^{n} \Delta x_i^2 = S_1 + S_2.
\end{align*}
Exercises
1. This problem is about Corollary 5.6, page 270.
(a) Consider the function $h(x) = g(x) - f(x)$ on $[a, b]$. Why must $h$ be
integrable? Why is $\int_a^b h \ge 0$?
(b) Prove Corollary 5.6.
4. Use the result of Problem 2 to find upper and lower bounds on each of the
following integrals; assume (it’s true!) that they exist.
(a) $\int_{100}^{200} \left( 5 + \frac{\sin(x^2)}{100} \right) dx$
(b) $\int_{100}^{200} \left( 5 + \frac{\sin^2 x}{100} \right) dx$
(c) $\int_{100}^{200} \frac{1}{100 + \sin^2 x}\,dx$
5. It is a fact (we will prove it later, but just assume it here) that if f is contin-
uous on [a, b] then f is also integrable on [a, b].
(a) Suppose that $f$ is continuous on $[a, b]$ and that $f(x) > 0$ for all
$x \in [a, b]$. Show that $\int_a^b f > 0$, too. (Hint: Use the extreme value theorem
and Corollary 5.6, page 270.)
(b) Suppose $f$ is continuous on $[a, b]$ and that $\int_a^b f = 0$. Show that $f(c) = 0$
for some $c \in [a, b]$. (This fact is sometimes known as the mean value
theorem for integrals.)
(c) Give an example to show that the result in (b) need not hold if f is
discontinuous on [a, b].
6. Suppose that $f$ is integrable on $[a, b]$. Show (in the spirit of the proof of
Theorem 5.5, page 269) that $3f$ is also integrable and that $\int_a^b 3f = 3 \int_a^b f$.
7. Suppose that both $\int_a^b f$ and $\int_a^b |f|$ exist. Prove the “triangle-like” inequality
$$\left| \int_a^b f \right| \le \int_a^b |f|.$$
10. We showed in and near Problem 12, page 268, that $f(x) = x$ is integrable
on $[a, b]$, and that $\int_a^b x\,dx = (b^2 - a^2)/2$.
(a) Use the formula above and Theorem 5.5 to find $\int_1^5 (3x + 7)\,dx$.
(b) A function $g$ has the W-shaped graph formed by connecting the dots
at (0, 2), (1, 0), (2, 1), (3, 0), and (4, 2) in the xy-plane. Explain why
the integral $\int_0^4 g$ exists and find its value.
11. Assume (it’s true) that the integral in each part following exists. Use
Theorem 5.9 to find the value.
(a) $\int_0^1 x^{42}\,dx$
(b) $\int_0^\pi \sin x\,dx$
(c) $\int_a^b \left( C + Dx + Ex^2 + Fx^3 \right) dx$
(d) $\int_0^1 x e^{x^2}\,dx$
5.3 Integrability
How can we decide whether a given function f : [a, b] → R is integrable on
[a, b]? The question is obviously important: a useful integral should apply to
many functions. The question is also difficult: deciding which functions satisfy a
complicated definition naturally takes some work.
So far we’ve seen only piecemeal results:
• A discontinuous function f may or may not be integrable. (See Example 3,
page 262, and Example 4, page 263.)
• We’ve also said (but not proved) that every continuous function is integrable.
(See Example 4, page 276.)
In this section we approach these matters rigorously, and identify some important
classes of integrable functions. Our main tool, detailed in Theorem 5.14,
is the box-sum criterion, a useful and practical test for integrability.
Lemma 5.10. Let $f : [a, b] \to \mathbb{R}$ be a bounded function. Suppose that for every
$\epsilon > 0$ there exists $\delta > 0$ such that
$$|RS(f, P_1, S_1) - RS(f, P_2, S_2)| < \epsilon$$
whenever $P_1$ and $P_2$ are partitions of $[a, b]$ such that both $\|P_1\| < \delta$ and
$\|P_2\| < \delta$, and samples $S_1$ and $S_2$ come from $P_1$ and $P_2$. Then the integral
$\int_a^b f$ exists.
Proof: The idea is to concoct a certain Cauchy sequence $\{I_n\}$ of numbers, whose
limit $I$ will turn out to be the desired integral. (Every Cauchy sequence has one.)
To get started, for each $n \in \mathbb{N}$ we use the hypothesis to choose a positive
number $\delta_n$ such that
$$|RS(f, P_1, S_1) - RS(f, P_2, S_2)| < \frac{1}{n}$$
for all Riemann sums based on partitions $P_1$ and $P_2$ with both $\|P_1\| < \delta_n$ and
$\|P_2\| < \delta_n$. For technical reasons we choose the $\delta_n$ to be decreasing:
$\delta_1 \ge \delta_2 \ge \delta_3 \ge \dots$. (Convince yourself this is possible.)
Next, for each $n$ we choose any particular partition $P_n$ with $\|P_n\| < \delta_n$
(a regular partition works fine) and any particular set $S_n$ of samples from $P_n$.
The associated Riemann sum $RS(f, P_n, S_n)$ is then a number; let’s call it $I_n$ for
short. This process produces a numerical sequence $\{I_n\}$, which is readily shown
to be Cauchy, and hence converges to some limit $I$. (The proof that $\{I_n\}$ is Cauchy
uses the fact that the $\delta_n$ decrease; see the exercises.)
Last, we use Definition 5.1, page 261, to show that $I$ is the sought-after integral.
For given $\epsilon > 0$, we first choose any positive integer $N$ for which both
$$|I_N - I| < \frac{\epsilon}{2} \quad\text{and}\quad \frac{1}{N} < \frac{\epsilon}{2},$$
and we set $\delta = \delta_N$ as chosen above. This $\delta$ does what Definition 5.1 asks. If $P$
is any partition with $\|P\| < \delta_N$, $S$ is any set of samples, and $RS(f, P, S)$ is the
corresponding Riemann sum, then the triangle inequality and our choices give
$$|RS(f, P, S) - I| \le |RS(f, P, S) - I_N| + |I_N - I| < \frac{1}{N} + \frac{\epsilon}{2} < \epsilon.$$
Not quite there. Lemma 5.10 will prove useful, but it has a serious practical
drawback: showing that the hypothesized inequality holds for all suitable partitions
$P$ and all corresponding sample sets $S$ seems difficult. (There are infinitely many
$P$ for each $\delta$, and infinitely many $S$ for each $P$.) The box-sum criterion, which
turns out to be equivalent to the hypothesis of Lemma 5.10, will prove
much easier to work with.
[Figure 5.4: three panels, each over the partition a, x1, x2, x3, x4, b; panel (c) shows a box sum.]
Upper and lower sums. For any given partition P of an interval [a, b], there
are infinitely many ways of choosing samples S compatible with P, and hence
infinitely many possible Riemann sums associated to P. To get upper and lower
bounds on the values of all these Riemann sums, it is helpful to consider upper
sums and lower sums.
If the integrand f is continuous on [a, b], as in Figure 5.4, then upper and lower
sums are ordinary Riemann sums, with sampling points chosen to maximize or
minimize f over each separate subinterval. (A continuous function has maximum
and minimum values on each subinterval, by the extreme value theorem.) In this
case, therefore, upper and lower sums are simply the largest and smallest possible
exist for each i, and so the following definitions make sense. (The completeness
axiom guarantees this.)
Definition 5.11. Let $f : [a, b] \to \mathbb{R}$ be a bounded function, and
$P = \{x_0, x_1, \dots, x_n\}$ a partition of $[a, b]$. With $m_i$ and $M_i$ as above, we define
upper and lower sums as follows:
$$US(f, P) = \sum_{i=1}^{n} M_i\,\Delta x_i = M_1 \Delta x_1 + M_2 \Delta x_2 + \dots + M_n \Delta x_n;$$
$$LS(f, P) = \sum_{i=1}^{n} m_i\,\Delta x_i = m_1 \Delta x_1 + m_2 \Delta x_2 + \dots + m_n \Delta x_n.$$
Upper and lower sums may not be Riemann sums (they are Riemann sums if $f$
happens to be continuous), but they have useful connections to Riemann sums and
to the integrals they approximate. We collect two such facts in the following
proposition, using RS, US, and LS to denote Riemann, upper, and lower sums.
(Upper and lower sums don’t depend on samples, and so their notations don’t
involve an $S$.)
Proposition 5.12. Let $f : [a, b] \to \mathbb{R}$ be a bounded function, and $P$ any partition
of $[a, b]$.
(i) If $S$ is any set of samples from $P$, then
$$LS(f, P) \le RS(f, P, S) \le US(f, P).$$
(ii) If $\int_a^b f = I$ exists, then
$$LS(f, P) \le I \le US(f, P).$$
About proofs. The inequality in (i) follows immediately from the definitions of
upper and lower sums. Part (ii) is slightly subtler; see the exercises for both parts.
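To make Definition 5.11 concrete, here is a small sketch that computes upper and lower sums exactly for an increasing integrand, where $M_i = f(x_i)$ and $m_i = f(x_{i-1})$ can be read off at the endpoints of each subinterval. The choice $f(x) = x^2$ on $[0, 1]$ with a regular ten-piece partition, and the function names, are assumptions made for illustration; the value 1/3 for the integral is the familiar calculus result.

```python
from fractions import Fraction

def regular_partition(a, b, n):
    """Regular partition of [a, b] into n subintervals, as exact fractions."""
    a, b = Fraction(a), Fraction(b)
    return [a + (b - a) * i / n for i in range(n + 1)]

def upper_lower_sums_increasing(f, points):
    """Upper and lower sums for an INCREASING f.

    For increasing f the sup on [x_{i-1}, x_i] is f(x_i) and the inf is f(x_{i-1});
    this shortcut is not valid for general bounded functions.
    """
    pairs = list(zip(points, points[1:]))
    upper = sum(f(right) * (right - left) for left, right in pairs)
    lower = sum(f(left) * (right - left) for left, right in pairs)
    return upper, lower

square = lambda x: x * x
upper, lower = upper_lower_sums_increasing(square, regular_partition(0, 1, 10))
print(lower, upper)  # the two sums trap the integral 1/3
```

The gap between the two sums here is exactly 1/10, which shrinks as the partition is refined.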
Clamping down: box sums. Proposition 5.12 suggests that it is a good thing for
integrability when upper and lower sums are close together. Box sums measure
this closeness.
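Definition 5.13 (The box sum). Let $f : [a, b] \to \mathbb{R}$ be a bounded function and
$P = \{x_0, x_1, \dots, x_n\}$ a partition of $[a, b]$. The associated box sum is the difference
$$US(f, P) - LS(f, P) = \sum_{i=1}^{n} (M_i - m_i)\,\Delta x_i.$$
[Figure 5.5: box sums for (a) a smooth integrand and (b) a discontinuous integrand.]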
Box sums have a nice geometric interpretation, as seen already in Figure 5.4(c).
Figure 5.5(a) shows another box sum for the same integrand as before, but here
with a finer partition. Figure 5.5(b) shows another ten-element box sum, this time
for a discontinuous integrand.
Here is the key point: For both integrands in Figure 5.5, the total shaded
area—the box sum—can be made as small as we wish by requiring that the par-
tition have small norm. This seems clear enough for the smooth integrand in
Figure 5.5(a): For a partition with tiny norm, the shaded area becomes smaller
and smaller, approaching a skinny “tube” around the graph of f .
Less obvious, but equally important, is the fact that the total shaded area can
also be made small for the discontinuous integrand in Figure 5.5(b). Figure 5.6
suggests why. Any partition P may generate one or two tall box sum elements,
but if P has small norm, then all boxes are narrow and their areas don’t amount
to much. We formalize these ideas in a theorem.
Theorem 5.14 (The box-sum criterion). A function f : [a, b] → R is integrable
on [a, b] if and only if for each ǫ > 0 there exists some partition P of [a, b] with
box sum less than ǫ.
Many uses. We outline the rather technical proof at the end of this section; it
involves some meticulous bookkeeping. First we illustrate the theorem’s uses and
advantages—including the fact that for a given ǫ we need only one suitable par-
tition to prove integrability. Observe also that the box-sum criterion works both
ways: it detects both integrability and non-integrability. Example 1 illustrates
both of these uses.
We have shown by other methods that f is integrable on [0, 1] but g is not. What
do box sums say?
which “isolates” the jump discontinuity at x = 0.5 in the skinny interval [0.499,
0.501]. The box sum (only 0.002 in this case) can be made even smaller by
narrowing the middle interval even further. Thus f is indeed integrable (with
integral 0.5).
For g the conclusion is negative: for any partition P of [0, 1], every box
element has height one, and so the box sum is one. (Draw your own picture.) ♦
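The arithmetic behind the "isolating" partition can be sketched in a few lines. The definition of f is not reproduced above, so the code assumes a step function with f(x) = 0 for x < 0.5 and f(x) = 1 for x ≥ 0.5, which is consistent with a jump at 0.5 and an integral of 0.5; that choice, and the function names, are assumptions for illustration only.

```python
from fractions import Fraction

def step_sup_inf(left, right, jump=Fraction(1, 2)):
    """Sup and inf on [left, right] of the ASSUMED step function
    f(x) = 0 for x < jump and f(x) = 1 for x >= jump."""
    sup = 1 if right >= jump else 0
    inf = 0 if left < jump else 1
    return sup, inf

def box_sum(points):
    """Box sum: sum of (M_i - m_i) * (width of i-th subinterval)."""
    total = Fraction(0)
    for left, right in zip(points, points[1:]):
        sup, inf = step_sup_inf(left, right)
        total += (sup - inf) * (right - left)
    return total

isolating = [Fraction(0), Fraction(499, 1000), Fraction(501, 1000), Fraction(1)]
narrower = [Fraction(0), Fraction(4999, 10000), Fraction(5001, 10000), Fraction(1)]
print(box_sum(isolating), box_sum(narrower))
```

Only the middle subinterval contributes, so the box sum equals its width, and narrowing that subinterval shrinks the box sum accordingly.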
The box-sum criterion is exactly what we need to prove some familiar and
important—but otherwise elusive—results.
Theorem 5.15. If f is continuous on [a, b], then f is integrable, too.
The proof idea. Continuity means that f cannot rise or fall very much over a
small domain interval. Thus, for a partition with small norm, all box sum elements
must be short. Since the total width of boxes is only (b − a), their total area must
also be small.
Proof: Let $\epsilon > 0$ be given. Because $f$ is continuous on $[a, b]$, it is also
uniformly continuous there, according to Theorem 3.22, page 184. This means we
can choose $\delta > 0$ such that
$$|f(s) - f(t)| < \frac{\epsilon}{b - a} \quad\text{whenever } |s - t| < \delta \text{ and } s, t \in [a, b].$$
Now let $P$ be any partition with $\|P\| < \delta$. (A regular partition will do.) Since $f$
is continuous on each subinterval $[x_{i-1}, x_i]$, it attains maximum and minimum
values there, so there exist $s_i$ and $t_i$ in $[x_{i-1}, x_i]$ with
$$f(s_i) = M_i \quad\text{and}\quad f(t_i) = m_i.$$
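Our uniform continuity condition gives $M_i - m_i < \epsilon/(b - a)$ for all $i$, and so
$$\text{box sum} = \sum_{i=1}^{n} (M_i - m_i)\,\Delta x_i < \sum_{i=1}^{n} \frac{\epsilon}{b - a}\,\Delta x_i = \frac{\epsilon}{b - a} \sum_{i=1}^{n} \Delta x_i = \epsilon,$$
as desired.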
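The recipe in this proof can be tried numerically. The sketch below assumes an integrand satisfying $|f(s) - f(t)| \le |s - t|$ (true for the cosine, by the mean value theorem), so that any $\delta \le \epsilon/(b - a)$ meets the uniform continuity requirement. The sup and inf on each subinterval are estimated by sampling, so the computed box sum is an approximation from below, not an exact value; all function names are illustrative assumptions.

```python
import math

def approx_box_sum(f, a, b, n, samples_per_box=50):
    """Approximate box sum of f over the regular partition of [a, b] with n pieces.

    M_i and m_i are estimated from finitely many samples, so the result
    slightly underestimates the true box sum.
    """
    width = (b - a) / n
    total = 0.0
    for i in range(n):
        left = a + i * width
        values = [f(left + width * k / samples_per_box)
                  for k in range(samples_per_box + 1)]
        total += (max(values) - min(values)) * width
    return total

def subintervals_needed(a, b, eps):
    """Number of regular pieces whose width is below delta = eps / (b - a).

    Assumes |f(s) - f(t)| <= |s - t| on [a, b].
    """
    delta = eps / (b - a)
    return math.ceil((b - a) / delta) + 1

n = subintervals_needed(0.0, 3.0, 0.01)
print(n, approx_box_sum(math.cos, 0.0, 3.0, n))
```

The partition found this way is far from the coarsest one that works; the proof only needs some partition with small box sum, not an efficient one.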
More on integrability. We can use the box-sum criterion to prove other familiar,
reasonable-seeming properties of the integral. Proposition 5.16 gives two sam-
ples.
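Proposition 5.16. Suppose that $f : [a, b] \to \mathbb{R}$ is integrable.
(i) If $a < c < b$, then $\int_a^c f$ and $\int_c^b f$ exist, and
$$\int_a^b f = \int_a^c f + \int_c^b f.$$
(ii) The function $|f|$ is also integrable on $[a, b]$, and
$$\left| \int_a^b f \right| \le \int_a^b |f|.$$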
Proof (sketch): The main proof challenge for both (i) and (ii) turns out to be inte-
grability: We need to show that if f is integrable on [a, b], then it is also integrable
on the smaller intervals [a, c] and on [c, b], and that |f | is also integrable on [a, b].
Once all the integrals in question are known to exist, the rest is easy. The equa-
tion in (i) is essentially Theorem 5.8, page 274. See Problem 7, page 279, for the
inequality in (ii).
Proving our integrability claims directly from the definition of integrability
would be difficult. (We don’t even have candidates for the values of the integrals.)
With the box-sum criterion, the proof is routine.
Let $\epsilon > 0$ be given. By Theorem 5.14, applied to the integral $\int_a^b f$, there exists
a partition $P$ of $[a, b]$ with box sum less than $\epsilon$.
[Figure 5.7: box sums for (a) f and (b) |f| over the same partition a, x1, x2, x3, x4, b.]
Let’s show first that $\int_a^c f$ and $\int_c^b f$ exist. We may as well assume that $c \in P$;
if not, we can add $c$ to $P$ without increasing the box sum. (Sketch the situation to
convince yourself.) Thus, $P$ has the form
$$P = \{a = x_0, x_1, \dots, x_m = c, x_{m+1}, \dots, x_n = b\},$$
and therefore
$$P_1 = \{a = x_0, x_1, \dots, x_m = c\} \quad\text{and}\quad P_2 = \{c = x_m, x_{m+1}, \dots, x_n = b\}$$
are, respectively, partitions of $[a, c]$ and $[c, b]$. Because $P_1$ and $P_2$ are subsets of
$P$, and all box summands are nonnegative, the box sums for $P_1$ and $P_2$ are no
larger than that for $P$, and hence less than $\epsilon$. By Theorem 5.14, $f$ is indeed
integrable on each smaller interval.
To show that $|f|$ is also integrable on $[a, b]$, we compare box sums for $f$ and
$|f|$. Figure 5.7 illustrates the nice answer: For any partition $P$ of $[a, b]$,
$$\text{box sum for } |f| \le \text{box sum for } f.$$
In Figure 5.7(a), box-sum elements that “straddle” the x-axis become smaller for
$|f|$, as shown in Figure 5.7(b). Thus any partition for $f$ with box sum less than $\epsilon$
works for $|f|$, too.
The proof idea. Figure 5.8 illustrates the proof idea for an increasing integrand,
with $m$ and $M$ as lower and upper bounds. (The pictured integrand has one
discontinuity, but a monotone function can have many.)
If $P$ is a regular partition with norm $\delta$, then the box sum elements can be
“stacked” vertically as shown, to give total area at most $(M - m)\delta$. By choosing
$\delta$ small, we can keep the box sum as small as we like.
Figure 5.8. Stack the boxes: why every monotone function is integrable.
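The stacking picture corresponds to a collapsing sum: for an increasing $f$ and a regular partition with norm $\delta$, the box sum is $\sum (f(x_i) - f(x_{i-1}))\,\delta = (f(b) - f(a))\,\delta$. The sketch below checks this identity exactly for two illustrative integrands, a smooth one and a discontinuous staircase; both integrands and the function name are assumptions chosen for the demonstration.

```python
import math
from fractions import Fraction

def box_sum_increasing(f, a, b, n):
    """Box sum of an INCREASING f over the regular n-piece partition of [a, b].

    For increasing f, M_i = f(x_i) and m_i = f(x_{i-1}), even if f has jumps.
    Returns the box sum together with the partition norm delta.
    """
    a, b = Fraction(a), Fraction(b)
    delta = (b - a) / n
    points = [a + delta * i for i in range(n + 1)]
    total = sum((f(right) - f(left)) * (right - left)
                for left, right in zip(points, points[1:]))
    return total, delta

cube = lambda x: x ** 3                  # smooth and increasing
stair = lambda x: math.floor(3 * x)      # increasing, with many jumps

cube_total, cube_delta = box_sum_increasing(cube, 0, 2, 8)
stair_total, stair_delta = box_sum_increasing(stair, 0, 2, 8)
print(cube_total, stair_total)
```

In both cases the box sum equals the total rise of the function times the norm, so halving the norm halves the box sum.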
E XAMPLE 2. What does Proposition 5.17 say about integrability of some fa-
vorite functions?
all exist, because all integrands are monotone on [0, 100]. With help from Theo-
rem 5.8, page 274, we can conclude that all of
$$\int_0^{100} \cos x\,dx, \quad \int_0^{100} \ln(\cos x + 2)\,dx, \quad \int_0^{100} e^{\cos x}\,dx$$
also exist. Even though the integrands are not monotone on [0, 100], in each case
we can break [0, 100] up into finitely many smaller subintervals, on each of which
the given integrand is monotone. ♦
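The splitting described in Example 2 can be made explicit. The cosine is monotone between consecutive multiples of $\pi$, so those multiples serve as breakpoints; the same breakpoints work for $\ln(\cos x + 2)$ and $e^{\cos x}$, since composing with an increasing function preserves monotonicity. The sketch below lists the breakpoints in $(0, 100)$ and spot-checks monotonicity by sampling, which is evidence rather than proof; the function names are illustrative assumptions.

```python
import math

def monotone_breakpoints(upper):
    """Multiples of pi strictly inside (0, upper)."""
    count = math.floor(upper / math.pi)
    points = [k * math.pi for k in range(1, count + 1)]
    return [p for p in points if p < upper]

def is_monotone(f, left, right, samples=200):
    """Sampled check that f is increasing or decreasing on [left, right]."""
    values = [f(left + (right - left) * k / samples) for k in range(samples + 1)]
    increasing = all(u <= v for u, v in zip(values, values[1:]))
    decreasing = all(u >= v for u, v in zip(values, values[1:]))
    return increasing or decreasing

breaks = monotone_breakpoints(100)
endpoints = [0.0] + breaks + [100.0]
print(len(breaks), "breakpoints,", len(endpoints) - 1, "monotone pieces")
```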
Why the box-sum criterion is necessary. Suppose $f$ is integrable on $[a, b]$; let
$I = \int_a^b f$. For any given $\epsilon > 0$, we’ll find a partition $P$ of $[a, b]$ with box sum less
than $\epsilon$.
Why the box-sum criterion is sufficient. The plan of the proof is to use the box-
sum condition to obtain the “Cauchy criterion” of Lemma 5.10, page 281, which
in turn implies integrability. We show, in fact, that if for given ǫ > 0 we can find
some partition P0 with small box sum, then there exists some (very small!) δ > 0
such that every partition P with kPk < δ must also have small box sum, and this
does the trick.
One picky technical lemma will prove useful. We leave its straightforward
proof as an exercise. (Draw a simple picture to get started.)
In words: Adding one point to $P$ decreases upper sums and increases lower
sums, but by no more than $(M - m)\delta$.
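The verbal description of the lemma can be checked on a small case. This sketch takes $f(x) = x^2$ on $[0, 1]$ (so $m = 0$ and $M = 1$ are bounds), adds one point to a partition, and compares the change in upper and lower sums with $(M - m)\delta$, where $\delta$ is the norm of the original partition. The integrand, the partition, and the function names are assumptions chosen for illustration, and the endpoint shortcut for sup and inf is valid only because this $f$ is increasing.

```python
from fractions import Fraction

def upper_sum(f, points):
    """Upper sum for an increasing f: sup on each piece is the right-endpoint value."""
    return sum(f(right) * (right - left) for left, right in zip(points, points[1:]))

def lower_sum(f, points):
    """Lower sum for an increasing f: inf on each piece is the left-endpoint value."""
    return sum(f(left) * (right - left) for left, right in zip(points, points[1:]))

def norm(points):
    """Width of the widest subinterval."""
    return max(right - left for left, right in zip(points, points[1:]))

square = lambda x: x * x
P = [Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)]
P_plus = sorted(P + [Fraction(3, 4)])      # P with one extra point

drop = upper_sum(square, P) - upper_sum(square, P_plus)
rise = lower_sum(square, P_plus) - lower_sum(square, P)
bound = (1 - 0) * norm(P)                  # (M - m) * delta
print(drop, rise, bound)
```

Both changes are nonnegative and well below the bound here; the bound is tight only when the function's full range is realized on the subinterval being split.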
Now we can prove that the box-sum condition implies integrability. For given
$\epsilon > 0$, we first choose any particular partition
$$P_0 = \{w_0, w_1, \dots, w_N\}$$
of $[a, b]$ with $N + 1$ partition points and box sum less than $\epsilon/2$; that is,
$$US(f, P_0) - LS(f, P_0) < \frac{\epsilon}{2}.$$
Next we set
$$\delta = \frac{\epsilon}{4(N - 1)(M - m)},$$
where $M$ and $m$ are, respectively, upper and lower bounds for $f$ on $[a, b]$. This $\delta$
will turn out to work in Lemma 5.10. (Unlikely as that may seem in advance.)
Now let $P$ be any partition of $[a, b]$ with $\|P\| < \delta$. Consider the new partition
$P'$ formed from $P$ by adding in the $N - 1$ partition points $w_1, w_2, \dots, w_{N-1}$.
(Since $w_0 = a$ and $w_N = b$, adding them to $P$ has no effect.)
Now Lemma 5.18(i) implies that
US (f, P ′ ) ≤ US (f, P0 ) ,
which has length less than ǫ. In particular, every Riemann sum RS(f, P, S) is
between LS (f, P) and US (f, P), and therefore lies in the same interval.
Finally, we observe that if both $P_1$ and $P_2$ are partitions of $[a, b]$ with both
$\|P_1\| < \delta$ and $\|P_2\| < \delta$, and if samples $S_1$ and $S_2$ come from $P_1$ and $P_2$,
respectively, then both $RS(f, P_1, S_1)$ and $RS(f, P_2, S_2)$ lie in the same interval
of length less than $\epsilon$, and are thus within $\epsilon$ of each other. Thus the hypothesis of
Lemma 5.10 is satisfied, and the integral $\int_a^b f$ exists.
Exercises
1. Suppose $f$ is continuous on $\mathbb{R}$. Explain why the functions defined by $f^2(x)$,
$\sin(f(x))$, $f(\sin x)$, and $\ln(\sin f(x) + 2)$ are all integrable on every
interval $[a, b]$.
2. Consider the integral $\int_0^{10} \sin x\,dx$. Find a partition $P$ of $[0, 10]$ that gives
box sum less than 0.01. (Hint: Use the fact that the inequality
$|\sin x - \sin y| \le |x - y|$ holds for all $x, y \in \mathbb{R}$.)
4. Use box sums to show carefully that some function of your choice is not
integrable on [0, 1].
5. Show using box sums and Theorem 5.14 that the function $f(x) = x^2$ is
integrable on [0, 1].
6. Show using box sums and Theorem 5.14 that the function $f(x) = x^3$ is
integrable on [0, 1].
7. Suppose that f is integrable on [0, 2]. Show using box sums and Theo-
rem 5.14 that f is integrable on [0, 1], too.
8. We said in Example 2, page 288, that $\int_0^{100} \cos x\,dx$ can be shown to exist
using Proposition 5.17, page 287, and Theorem 5.8, page 274. Give the
details.
10. Show that the sequence {In } defined in the proof of Lemma 5.10 is Cauchy.
Note: This is (i) of Proposition 5.12; it says that for any partition P, the cor-
responding upper and lower sums are upper and lower bounds, respectively,
for the set of all Riemann sums associated with P.
12. Give a “picture proof” of Lemma 5.18, page 289. (Hint: Adding one more
point to a partition changes only one box sum element. Convince yourself
that this change is no more than the claimed amount.)
13. Show that if $\int_a^b f = I$ exists, and $P$ is any partition of $[a, b]$, then
$$LS(f, P) \le I \le US(f, P).$$
Note: This is (ii) of Proposition 5.12; it says that, for any partition P, the
corresponding upper and lower sums “trap” the exact value of I from above
and below.
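(Hint: To show that $US(f, P)$ overestimates $I$, consider the “stair-step”
function $f_{\text{big}}$ defined by
$$f_{\text{big}}(x) = \begin{cases}
M_1 & \text{if } a \le x < x_1 \\
M_2 & \text{if } x_1 \le x < x_2 \\
\;\vdots & \;\vdots \\
M_n & \text{if } x_{n-1} \le x \le b
\end{cases}$$
Why is $f_{\text{big}}$ integrable? What is its integral? How does $\int_a^b f_{\text{big}}$ compare to
$\int_a^b f$?)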
Proof: We’ve already done all the hard work. By Theorem 5.15, the integral
$\int_a^b f$ exists, and Theorem 5.9 guarantees that the integral’s value is indeed
$F(b) - F(a)$.
Having worked very hard to build the machine, we now only need to turn the
crank. ♦
EXAMPLE 2. Find $\int_0^\pi \sin(x^2)\,dx$.
as desired.
Nice to know, but . . . . Theorem 5.20 assures us that every continuous function
$f$, no matter how ill-behaved (and some are very ill-behaved), has an
antiderivative $F$. (Being an antiderivative, $F$ is automatically differentiable and
hence also continuous.) This is nice to know in the abstract, but not helpful for
calculating integrals like the one in Example 2.
Indeed, many useful and harmless-looking calculus functions, including
$$\cos(x^2), \quad \exp(x^2), \quad\text{and}\quad \frac{\sin x}{x},$$
turn out not to have elementary antiderivatives, and are said not to be “integrable
in closed form.” (Elementary functions are nice combinations of the familiar
calculus-style functions.)
Average values, and another mean value theorem. Integration has a natural
connection to averaging.
Definition 5.21. If $f$ is integrable on $[a, b]$, then the average value of $f$ on $[a, b]$ is
given by
$$\text{average value} = \frac{\int_a^b f}{b - a}.$$
Embedded in the proof of Theorem 5.20 is another theorem (and its proof) of
independent interest.
Theorem 5.22 (Mean value theorem for integrals). If $f$ is continuous on $[a, b]$,
then there exists $c$ in $(a, b)$ for which
$$\int_a^b f = f(c) \cdot (b - a).$$
Exercises
1. Show that if $\int_a^b f = 0$ and $f$ is continuous on $[a, b]$, then $f(c) = 0$ for some
$c \in (a, b)$. Give an example to show that the conclusion need not hold if $f$
is not continuous.
2. Let $f$ and $g$ be continuous on $[a, b]$ and suppose $\int_a^b f = \int_a^b g$. Show that
$f(c) = g(c)$ for some $c$ in $[a, b]$.
3. Suppose f has average value 3 on [a, b]. What is the average value of 5f +7
on [a, b]? Why?
4. In each part, find $I = \int_0^2 f$, the average value of $f$ on $[0, 2]$, and a value
of $c$ at which the average value is achieved (in the sense of Theorem 5.22).
(a) $f(x) = \sqrt{x}$.
(b) $f(x) = x$.
(c) $f(x) = x^2$.
(d) $f(x) = x^{42}$.
6. Let $h : [a, b] \to \mathbb{R}$ be continuous on $[a, b]$. Suppose that $h(x) \ge 0$ for all
$x \in [a, b]$ and that $\int_a^b h = 0$. Use Theorem 5.20 to show that $h(x) = 0$ for
all $x \in [a, b]$. (Hint: Use properties of the function $H(x) = \int_a^x h$ to derive
the result.)
7. Let f be continuous on [a, b]. Suppose that the average value and the maxi-
mum value of f on [a, b] are equal. Show that f is constant. Must the same
result hold if f is not continuous? Why?
11. Find an interval [a, b] such that the average value of f (x) = x2 occurs at
the midpoint.
12. We used the mean value theorem for integrals to prove Theorem 5.20. Use
Theorem 5.20 and the mean value theorem for derivatives to prove the mean
value theorem for integrals. (Assume that f is continuous, of course.)
13. (This problem refers to material in Section 4.4.) Consider the functions hn
defined by hn (x) = n if x ∈ (0, 1/n) and hn (x) = 0 otherwise. Let h be
the constant function h(x) = 0.
(a) Show that the sequence $\{h_n\}$ converges pointwise to $h$ on $[0, 1]$. Does
$\int_0^1 h_n$ converge to $\int_0^1 h$? Explain.
(b) Show that the sequence $\{h_n\}$ converges uniformly to $h$ on $[0.1, 1]$.
Does $\int_{0.1}^1 h_n$ converge to $\int_{0.1}^1 h$? Explain.
14. (This problem refers to material in Section 4.4.) Consider the functions
$f_n(x) = x^n$ for $n = 1, 2, 3, \dots$ and the limit function $f$ given by $f(x) = 0$
if $x \in [0, 1)$ and $f(1) = 1$.
(a) We showed in Section 4.4 that the sequence $\{f_n\}$ converges pointwise
to $f$ on $[0, 1]$. Does $\int_0^1 f_n$ converge to $\int_0^1 f$? Explain.
(b) The sequence $\{f_n\}$ converges uniformly to $f$ on $[0, 0.9]$. Does $\int_0^{0.9} f_n$
converge to $\int_0^{0.9} f$? Explain.
15. (This problem refers to material in Section 4.4.) Let {fn } be a sequence of
continuous functions on [0, 1], and suppose {fn } converges uniformly on
[0, 1] to a function f .
Selected Solutions
19. (a) There are 10 × 9 × 8 = 720 ways to choose 3 different elements in order.
There are six ways to reorder each such choice, so the answer is that S has
720/6 = 120 elements.
(b) T has 10 × 10 × 10 = 1000 elements.
(c) S10 has 10! = 10 × 9 × 8 × · · · × 1 = 3628800 elements.
(d) The sets N10 , S, T , S10 have no elements in common.
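The counts in Solution 19 are easy to confirm with the standard library. This is only an arithmetic check of the numbers quoted above; it says nothing about how the sets themselves were defined in the exercise, and the variable names are illustrative.

```python
import math

ordered_triples = math.perm(10, 3)      # 10 * 9 * 8 ordered choices of 3 distinct elements
unordered_triples = math.comb(10, 3)    # the same choices with order ignored
all_triples = 10 ** 3                   # ordered triples with repetition allowed
permutations_of_ten = math.factorial(10)

print(ordered_triples, unordered_triples, all_triples, permutations_of_ten)
```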
21. We have S ∈ A42 if and only if N100 \S ∈ A58 , so there is a one-to-one correspon-
dence between A42 and A58 , which therefore have the same number of elements.
23. (a) The picture is a diagonal stripe from upper left to lower right.
(b) The black squares can be described by the set {(x, y) | x + y is even}.
(c) The element (2, 3, black) corresponds to a black square at position (2, 3). The
set G × {black, white} represents all possible ways to choose a square and
color it.
(d) A picture is, in effect, a subset of squares to be colored black. Thus P (G)
corresponds to the full set of possible pictures.
(a) Let A be the set of all humans who have ever lived, and B the set of all women
who have ever lived. The function is not injective, because siblings have the
same mother. The function is not surjective, either, since some women are
not mothers.
(b) Let A be the set of all mothers of sons, and B the set of all male humans.
Then F IRST B ORN S ON : A → B is one-to-one but not onto.
(c) Let A be the set of all humans and B the set of all colors. Then E YE C OLOR :
A → B is neither one-to-one nor onto, since several people have blue eyes,
and nobody has silver eyes.
(d) Let A be the set of all US citizens and B = {January 1, January 2, . . . , December 31}.
Then B IRTHDAY : A → B is onto (every day is someone’s birthday) but not
one-to-one (several people have the same birthday).
$$1 + 2 + 3 + \dots + n + (n + 1) = \frac{n(n + 1)}{2} + (n + 1) = (n + 1)\left( \frac{n}{2} + 1 \right) = \frac{(n + 1)(n + 2)}{2}.$$
This shows $P(n + 1)$, and completes the proof.
(c) Calculate with sums:
$$\sum_{k=1}^{n} (2k - 1) = 2 \sum_{k=1}^{n} k - \sum_{k=1}^{n} 1 = 2 \cdot \frac{n(n + 1)}{2} - n = n^2.$$
7. Prove this by induction. The base case n = 1 is clear. For the inductive step,
suppose that (1 + x)k ≥ 1 + kx for a particular k. Multiplying both sides by
(1 + x) and doing a little algebra shows that (1 + x)k+1 ≥ 1 + (k + 1)x, as
desired.
9. (a) Rewrite f1 f2 f3 as (f1 f2 )f3 and use the ordinary product rule twice.
(b) The formula is (f1 f2 f3 · · · fn )′ = f1′ f2 f3 · · · fn + f1 f2′ f3 · · · fn + f1 f2 f3′
· · · fn + · · · + f1 f2 f3 · · · fn′ . We’ve proved the first few cases. For the in-
ductive step, we write f1 f2 f3 · · · fn+1 = (f1 f2 f3 · · · fn )fn+1 and apply the
ordinary product rule.
11. (a) Checking early cases suggests the answer n(n + 1)(n + 2)/3; this is readily
proved by induction.
(b) Notice that
$$\sum_{j=1}^{n} j(j + 1) = \sum_{j=1}^{n} j^2 + \sum_{j=1}^{n} j = \frac{n(n + 1)(2n + 1)}{6} + \frac{n(n + 1)}{2}.$$
13. Note first that P (n) is obviously true for the base case n = 4; just check both sides.
To complete the inductive proof, we assume P (n) (for n ≥ 4) and show P (n + 1).
By P (n), we have 2n < n!. We know also that 2 < (n + 1). Multiplying these
inequalities gives
(c) x ∈ (−3, 1)
7. (a) Subtracting 4 from all parts of 4.96 < y < 5.04 gives 0.96 < y − 4 < 1.04;
note also that y − 4 = |y − 4|.
(b) Note that 2.93 < x < 3.07 and 4.96 < y < 5.04. Thus |y − x| < 5.04 −
2.93 = 2.11 = K.
(c) Since 2.93 < x < 3.07 and 4.96 < y < 5.04, we have |y − x| > 4.96 −
3.07 = 1.89 = L.
9. The base case (n = 2) is the ordinary TI. For the inductive step, let’s assume that
the TI holds for n summands. To show it holds for n + 1 summands we calculate
11. (a) Look separately at the cases x ≥ y (so |x − y| = x − y) and x < y (so
|x − y| = y − x).
(b) $\min\{x, y\} = \frac{x + y - |x - y|}{2}$.
(c) We can use $h(x) = \frac{f(x) + g(x) - |f(x) - g(x)|}{2}$.
(d) A good interval is something like [−20, 2].
13. (a) |x − 1| < 0.5 implies 1 − x < 0.5, or x > 0.5.
(b) Use the triangle inequality: $|c| = |c - x + x| \le |c - x| + |x| < \frac{|c|}{2} + |x|$.
This implies $|c| < \frac{|c|}{2} + |x|$, or $|x| > \frac{|c|}{2}$.
1.8 Bounds
1. Following are sketches.
(a) Say m ≤ t ≤ M for all t ∈ T . If s ∈ S, then s ∈ T , too, so m ≤ s ≤ M .
Thus S is bounded.
(c) If |S| is bounded, then |s| < K for all s ∈ S and some K > 0. But then
−K < s < K for all s ∈ S, so S is bounded. The converse is similar.
3. (a) I = [1, 3) works
(b) I = (−∞, 3) works
(c) I = (1, 3) works
(d) No; if max(S) exists then max(S) = sup(S).
5. (a) We have 1 = e0 ≤ f (x) = ex ≤ e10 and −1 ≤ g(x) = sin(x) ≤ 1 for
x ∈ A. We know from calculus that (i) f (x) = ex is bounded below by 0 but
is unbounded above on R, and (ii) g(x) = sin(x) is bounded below by −1
and above by 1 on all of R.
(b) For f ◦ g: When x ∈ [0, 10] we have 0.37 ≈ e−1 ≤ esin(x) ≤ e1 ≈ 2.72.
For g ◦ f : When x ∈ [0, 10] we have −1 ≤ sin(ex ) ≤ 1.
7. Since A is bounded we know that −M ≤ x ≤ M for some M > 0 and all x ∈ A.
But then also −3M + 5 ≤ 3x + 5 = f (x) ≤ 3M + 5 for all x ∈ A, as desired.
9. (a) inf(S) = min(S) = 2; sup(S) = max(S) = 3.
(b) inf(S) = min(S) = 2; sup(S) does not exist, as there are infinitely many
primes.
(c) One possibility is to let h = b − a, and use the list a + h/2, a + h/3, a +
h/4, . . . , a + h/n, . . . . “Successive averaging” is another possibility.
3. For given ǫ > 0, consider the positive number 1/ǫ. Since N is unbounded, there
exists an integer n with n > 1/ǫ. But then 0 < 1/n < ǫ, as desired.
5. (a) The statement is false if a = 0 and b = 1.
(b) The statement is true. If 0 < a < b then the Archimedean principle
applies. If a < b < 0 and n = −1, then na > b. The case a < 0 < b is left
to you.
(c) Clearly b + 1 > b, so we can choose n ∈ R with na = b + 1; i.e., n =
(b + 1)/a. Nothing Archimedean needed.
7. (a) The italicized statement is true; this is essentially the well-ordering principle.
(b) The italicized statement is now false. The set S of rationals less than π has
supremum π, but $\pi \notin \mathbb{Q}$.
9. (a) In = (3 − 1/n, 3 + 1/n) works.
(b) In = [−3 − 1/n, 42 + 1/n] works.
(c) In = (−3 − 1/n, 42 + 1/n) works.
11. The hint explains the problem: If (a, b) is contained in all the Ii , then [a, b] is
contained in all these intervals, too.
13. (a) The set S is certainly bounded above; by completeness there’s a least upper
bound, β.
(b) Note that (β + h)2 = β 2 + 2βh + h2 < β 2 + 2βh + h = β 2 + (2β + 1)h.
Now use property (ii) of h.
(c) The preceding calculation shows β + h—a number larger than β—has square
less than 2. This contradicts the supremum property of β.
(d) Note that (β − k)2 = β 2 − 2βk + k2 < β 2 − 2βk. Substitute for k and
simplify.
15. The point is simply that $\sqrt[4]{a} = \sqrt{\sqrt{a}}$, $\sqrt[8]{a} = \sqrt{\sqrt[4]{a}}$, and so on.
17. We have, for instance, $\sqrt[6]{a} = \sqrt{\sqrt[3]{a}}$; other parts are similar.
and
$$\frac{10}{9n + 15} < \epsilon \iff \frac{10/\epsilon - 15}{9} < n.$$
Thus, for given $\epsilon > 0$, the value $N = \frac{10/\epsilon - 15}{9}$ works in the definition. A
formal proof resembles that in Example 2.
(b) The proof is like that for (a), except that now we have
$$|b_n - L| = \frac{899900}{9n + 15} < \epsilon \iff \frac{899900/\epsilon - 15}{9} < n.$$
Thus $N = \frac{899900/\epsilon - 15}{9}$ works for any given $\epsilon > 0$.
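(c) We’ll show $c_n \to 0$. Observe first that
$$|c_n - L| = \frac{2n}{3n^2 + 5} < \frac{2n}{3n^2} = \frac{2}{3n},$$
and
$$\frac{2}{3n} < \epsilon \iff \frac{2}{3\epsilon} < n.$$
Here’s the formal proof: Let $\epsilon > 0$ be given; set $N = \frac{2}{3\epsilon}$. This $N$ works,
since if $n > N$ then
$$|c_n - L| = \frac{2n}{3n^2 + 5} < \frac{2}{3n} < \frac{2}{3N} = \epsilon.$$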
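11. Here’s a sketch. Let $\epsilon > 0$ be given, and set $\epsilon' = \epsilon/17$. Since $\epsilon' > 0$ we can choose
$N$ so that $n > N$ implies $|a_n - 1| < \epsilon'$. This same $N$ works in the definition of
convergence of $\{17a_n\}$ to 17, since $|17a_n - 17| = 17|a_n - 1| < 17\epsilon' = \epsilon$.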
13. Suppose $\{a_n\}$ is decreasing. Since $\{a_n\}$ is bounded, this set has an infimum; call
it $L$. To see $L$ is also the limit, let $\epsilon > 0$ be given. Because $L$ is the infimum,
$L + \epsilon$ is not a lower bound for $\{a_n\}$, and so there is some $a_N$ with $L + \epsilon > a_N$.
This $N$ “works”: If $n > N$, then $a_n \le a_N$ (because $\{a_n\}$ is decreasing) and so
$L \le a_n \le a_N < L + \epsilon$, as desired.
15. Suppose xn → 5. To show yn → 0, let ǫ > 0 be given. Since xn → 5 we can
choose N so that n > N =⇒ |xn − 5| < ǫ. This is just another way of saying
that |yn | < ǫ, and so yn → 0 as desired. The converse is almost identical.
17. (a) For any given ǫ > 0 we have |1/n − 0| ≥ ǫ only when n ≤ 1/ǫ. There are
only finitely many such n.
(b) Let ǫ > 0 be given, and consider the set F = {n | |xn − L| ≥ ǫ}. By
hypothesis, F is finite. If F is nonempty, then F has a largest element, say
N ; this N works in the definition of convergence. If F is empty, then N = 0
works.
(c) A sequence {xn } does not converge to zero if, for some ǫ > 0, we have
|xn | ≥ ǫ for infinitely many n.
(d) A sequence {xn } does not converge to zero if, for some ǫ > 0, there is no N
such that |xn | < ǫ whenever n > N .
19. Every constant sequence has such a table.
21. (a) true
(b) true; 0 is a lower bound
(c) false; n = 43 is a counterexample
(d) true; N = 10000 works
(e) true; N = 1 works, for instance
(f) false
(g) true; for a given ǫ we can choose N = 1/ǫ (or the next integer if 1/ǫ isn’t an
integer)
23. (a) true
(b) true
(c) false; use technology to find a counterexample, like n = 45
(d) true; N = 10000 works
(e) true; N = 20000 works . . . use the triangle inequality
(f) false; see (d)
(g) true; N = 1/ǫ (or the next integer) works
27. (a) We can check explicitly that $1.5^{11} \approx 86.5 < 89 = f_{11}$, and $1.5^{12} \approx
129.7 < 144 = f_{12}$. For the inductive step, note that
$$f_{k+1} > 1.5^{k-1} + 1.5^{k} = 1.5^{k-1} \cdot 2.5 > 1.5^{k-1} \cdot 1.5^2 = 1.5^{k+1},$$
as desired. (Note that the proof worked because $1 + 1.5 > 1.5^2$. A similar
result can be shown to hold if 1.5 is replaced by any number $b$ with $1 + b > b^2$.)
(b) Using technology one can check that n ≥ 72 works.
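The base cases in part (a) can be confirmed "using technology" in a few lines. The sketch assumes the indexing $f_1 = f_2 = 1$, which is consistent with the values $f_{11} = 89$ and $f_{12} = 144$ quoted above; the function name is illustrative. Checking finitely many cases supports the inequality $f_n > 1.5^n$ but does not replace the inductive proof.

```python
def fibonacci(n):
    """n-th Fibonacci number, assuming the convention f_1 = f_2 = 1."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a

print(fibonacci(11), 1.5 ** 11)
print(fibonacci(12), 1.5 ** 12)
print(all(fibonacci(n) > 1.5 ** n for n in range(11, 200)))
```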
29. Implications are as follows:
(a) The statement says that $a_n \to \pi$.
(b) The statement says that for some $N$, $a_n = \pi$ for $n > N$.
(c) The statement says that $a_n = \pi$ for $n > 1$.
In particular, (c) $\implies$ (b) $\implies$ (a).
Now for any given ǫ > 0, we can choose N so that |an − a| < ǫ/|c| for n > N .
(Why can we do this?) This N “works” for the given ǫ and the original sequence
{can }. (Assemble the pieces into an efficient proof.)
3. Both equivalences follow directly from the $\epsilon$–$N$ definition. The second also
follows from (a) of Proposition 2.7, page 99.
5. Imitate the (partial) proof given for Lemma 2.9.
7. (a) We can define $\{b_n\}$ by $b_n = 1 - \frac{1}{n}$.
(b) We can define {bn } by bn = 1 for all n.
(c) Because $\beta - \frac{1}{n}$ is not an upper bound for $C$, there must exist $b_n$ as desired.
(Note the strict inequality.)
(d) Apply the squeeze principle to $\beta - \frac{1}{n} < b_n \le \beta$.
9. The n = 1 (base) case is easy: s2 = 1.5 < 2 = s1 . The inductive step is to show
that if sn − sn−1 < 0, then sn+1 − sn < 0, too. Doing so involves careful but
straightforward algebra.
11. It is clear that $\{h_n\}$ is increasing, and the inequality $h_{2n} \ge h_n + \frac{1}{2}$ implies that $\{h_n\}$ is unbounded. Since $h_1 = 1$, the inequality means that $h_2 \ge 1.5$, $h_4 \ge 2.0$, $h_8 \ge 2.5$, $h_{2^{1000}} > 501$, etc.
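The doubling inequality can be tested in exact arithmetic. The sketch below assumes that $h_n$ denotes the harmonic partial sum $1 + \frac{1}{2} + \cdots + \frac{1}{n}$ (an assumption consistent with $h_1 = 1$, since the problem statement is not reproduced here). It checks $h_{2n} \ge h_n + \frac{1}{2}$ for small n and the resulting bound $h_{2^k} \ge 1 + \frac{k}{2}$.

```python
from fractions import Fraction


def h(n):
    """Harmonic partial sum 1 + 1/2 + ... + 1/n, computed exactly."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


# The doubling inequality h_{2n} >= h_n + 1/2 for small n.
doubling_ok = all(h(2 * n) >= h(n) + Fraction(1, 2) for n in range(1, 60))

# Iterating it from h_1 = 1 gives h_{2^k} >= 1 + k/2.
powers_ok = all(h(2 ** k) >= 1 + Fraction(k, 2) for k in range(0, 9))
print(doubling_ok, powers_ok)
```

Exact fractions are used so that the boundary case n = 1, where equality holds, is not disturbed by rounding.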
13. (a) Let M > 0 be given. Since xn → ∞, we can choose N so xn > M
whenever n > N . But then also −xn < −M when n > N . Thus −xn →
−∞.
(b) Suppose $x_n \to \infty$. Let $\epsilon > 0$ be given; we need to find N so n > N implies $\left|\frac{1}{x_n}\right| = \frac{1}{x_n} < \epsilon$. Well, if we set $M = 1/\epsilon$ then (since $x_n \to \infty$) we can choose N so n > N implies $x_n > M = 1/\epsilon$. But this implies $1/x_n < \epsilon$, as desired.
(c) Suppose $x_n < 0$ for all n. Then $x_n \to -\infty$ if and only if $1/x_n \to 0$. This follows from parts (a) and (b): $x_n \to -\infty \iff -x_n \to \infty \iff -\frac{1}{x_n} \to 0 \iff \frac{1}{x_n} \to 0$.
15. (a) The limit is 2/5.
(b) The sequence diverges to ∞.
(c) The sequence converges to 1/2.
(d) The sequence diverges to ∞. One strategy is to observe that $\frac{n^2 + \arctan n}{n+2} > \frac{n^2 - 4}{n+2} = n - 2$.
17. (a) One possibility is to let $a_n = (-1)^n$ and $b_n = (-1)^{n+1}$ for all n.
(b) We’ll show that $c_n \to \max\{a, b\}$.
One approach is to use the curious fact (which appears in a later section) that for any numbers $a_n$ and $b_n$,
$c_n = \max\{a_n, b_n\} = \frac{a_n + b_n + |a_n - b_n|}{2}$.
Invoking Theorem 2.5 (on algebra with convergent sequences) and Proposition 2.7.b (on how absolute values play nicely with limits) we see that
$c_n = \frac{a_n + b_n + |a_n - b_n|}{2}$ converges to $c = \frac{a + b + |a - b|}{2} = \max\{a, b\}$,
as desired.
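The identity for the maximum used in the first approach is easy to test on sample inputs. The following sketch (the function name is illustrative) compares the formula with Python's built-in `max` on random integer pairs; integers are used so that the comparison is exact.

```python
import random


def max_via_abs(x, y):
    """Compute max{x, y} through the identity (x + y + |x - y|) / 2."""
    return (x + y + abs(x - y)) / 2


random.seed(0)  # fixed seed so the check is reproducible
pairs = [(random.randint(-50, 50), random.randint(-50, 50)) for _ in range(1000)]
identity_ok = all(max_via_abs(x, y) == max(x, y) for x, y in pairs)
print(identity_ok)
```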
Here is another approach. We assume WLOG that a ≤ b and show that
cn → b.
Suppose first that a = b. Then for given ǫ > 0 we can choose Na and Nb
such that n > Na implies |an − b| < ǫ, and n > Nb implies |bn − b| < ǫ.
2.3 Subsequences
1. (a) The sequence 1, 2, 3, 1, 2, 3, . . . works.
(b) The sequence 1, −1, 1/2, −1/2, 1/3, −1/3, . . . works.
(c) The sequence 0, 1, 0, 2, 0, 3, . . . works.
(d) The sequence 1, 1, 1/2, 2, 1/3, 3, . . . works.
(e) The sequence 1, 1, 2, 1, 2, 3, 1, 2, 3, 4, . . . works.
3. (a) Given a sequence that lists the rationals, we can just form the subsequence of
nonnegative rationals.
(b) Look at any two successive terms, say p1 and p2 . Between these two rational
numbers lie other rationals. If the sequence were monotone, these other
rationals would lie between p1 and p2 in the sequence.
(c) We can choose n1 so that pn1 = 1. Then we choose n2 with n2 > n1 and
pn2 ≥ 2. Continuing this process completes the proof; details are left to the
reader.
5. In general it’s possible for a subsequence to behave differently from its parent.
The point here is that this particular kind of subsequence—the “tail” of a given
sequence—behaves in a more special way.
Here’s the idea: Let ǫ > 0 be given, and suppose that some N1 , say N1 = 1000,
works for the subsequence. Note that the 1001st term of the subsequence is x5242 ,
and we know that from then on, all terms of the subsequence are within ǫ of L. But
this is just another way of saying that all terms past x5242 in the original sequence
are within ǫ of L; that is, N = 5242 works for the given ǫ in the parent sequence.
In general, if N1 works for the subsequence, then N = 4242 + N1 works in the
parent sequence. Clean this up a little to get an impeccable proof.
7. Statements (a), (c), (e), and (f) are all equivalent to each other. Statements (b) and
(d) are also equivalent.
9. (a) The sequence 0, 3, 0, 3, 0, 3, . . . is one example.
(b) The statement xn → 3 means that, for every ǫ > 0, all but finitely many
xn are within ǫ of 3. Negating this condition means that, for some ǫ > 0,
infinitely many xn are at least ǫ away from 3. These {xn } can be taken in
order to give the desired subsequence.
11. (a) For M > 0, choose N such that xn > M whenever n > N . The same N
works for the subsequence {xnk }, since if k > N , then nk ≥ k > N , and
so xnk > M , as desired.
(b) The contrapositive of Theorem 2.12(a) says that if some subsequence {xnk }
does not converge to L, then the original sequence {xn } doesn’t converge to
L either. This implies Theorem 2.12(c), because if L is any number, then at
least one subsequence fails to converge to L, and so L can’t be the limit.
13. The sequence $\{x_{11}, x_{101}, x_{1001}, \dots\}$ does the job. For this sequence we have $n_k = 10^k + 1$.
15. Notice first that $z_n = b_{n/2}$ for even n and $z_n = a_{(n+1)/2}$ for odd n. In particular, both sequences $\{a_n\}$ and $\{b_n\}$ are subsequences of $\{z_n\}$, so if $z_n \to L$ we must have $a_n \to L$ and $b_n \to L$, too.
For the converse, we suppose that both $a_n \to L$ and $b_n \to L$. Let $\epsilon > 0$ be given. We can choose $N_1$ and $N_2$ such that $n > N_1 \implies |a_n - L| < \epsilon$, and $n > N_2 \implies |b_n - L| < \epsilon$. Now let $N = 2 \max\{N_1, N_2\}$. This N “works” for $\{z_n\}$, because if n > N and n is even, then $n/2 > N/2 \ge N_2$, so $|z_n - L| = |b_{n/2} - L| < \epsilon$. Similarly, if n > N and n is odd, then $(n+1)/2 > N/2 \ge N_1$, so $|z_n - L| = |a_{(n+1)/2} - L| < \epsilon$.
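The index bookkeeping in this argument can be illustrated with a small experiment. In the sketch below the sequences `a` and `b` are hypothetical examples (both with limit 1), chosen only for illustration; `z` interleaves them as in the solution, and the choice N = 2 max{N1, N2} is tested on a finite stretch of the tail.

```python
def a(n):
    """Hypothetical example sequence with limit 1."""
    return 1 + 1 / n


def b(n):
    """Hypothetical example sequence with limit 1."""
    return 1 - 1 / n ** 2


def z(n):
    """Interleaved sequence a_1, b_1, a_2, b_2, ... (1-indexed)."""
    return a((n + 1) // 2) if n % 2 == 1 else b(n // 2)


L, eps = 1.0, 0.01
N1 = 100  # |a_n - 1| = 1/n < 0.01 whenever n > 100
N2 = 10   # |b_n - 1| = 1/n^2 < 0.01 whenever n > 10
N = 2 * max(N1, N2)

# Every tested term of z beyond index N lies within eps of L.
tail_ok = all(abs(z(n) - L) < eps for n in range(N + 1, N + 5001))
print(N, tail_ok)
```

A finite check cannot prove convergence, but it shows how the factor 2 compensates for each parent sequence occupying only every other position of the interleaved sequence.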
(b) The sequence is not Cauchy. If we choose, say ǫ = 0.001, then no N works,
since if N is any positive number, no matter how big, then with m = N + 1
and n = N + 2 we have |yn − ym | = 2/1234 > 0.001.
(c) The sequence is Cauchy. Note that if n > m, we have $|z_n - z_m| = \left|\frac{n}{n+1} - \frac{m}{m+1}\right| = \frac{n-m}{(n+1)(m+1)} < \frac{n-m}{nm} < \frac{n}{nm} = \frac{1}{m}$. It follows
(e) The series is geometric, with r = 0.99, so $S_n = \frac{1 - r^n}{1 - r} = 100 - 100 \cdot 0.99^n$, and $S_n \to 100$.
(f) One shows by induction that $S_n = \frac{n}{n+1}$. Thus $S_n \to 1$.
3. (a) Since $H_{2^n} > \frac{n}{2}$ for all n and the right-hand sequence diverges to infinity, we must have $H_{2^n} \to \infty$, too. Since $\{H_n\}$ has a divergent subsequence, $\{H_n\}$ must diverge, too.
(b) Suppose toward contradiction that $H_n \to H$ for some finite number H. Since $\{H_{2n}\}$ is a subsequence, we’d have $H_{2n} \to H$, too. But we also have $H_{2n} > \frac{1}{2} + H_n$; taking limits of both sides gives $H \ge \frac{1}{2} + H$, which is absurd.
5. (a) The series converges absolutely by comparison to $\sum_{k=1}^{\infty} \frac{1}{k^2}$.
(b) The series converges absolutely by comparison to $\sum_{k=1}^{\infty} \frac{1}{k^2}$.
(c) The series converges absolutely by comparison to the geometric series $\sum_{k=1}^{\infty} \frac{1}{3^k}$.
(d) The series diverges by the nth term test—all terms exceed 1/3.
7. Let $\sum a_k$ and $\sum b_k$ be series, and let $\sum c_k$ be the “sum series,” defined by $c_k = a_k + b_k$ for all k. Let $A_n$, $B_n$, and $C_n$ denote the partial sums for these series.
The key point is that—thanks to the commutative law for addition of finitely many
numbers—Cn = An + Bn for all n. It follows that if An → A and Bn → B, then
we must also have Cn → A + B, which is what we wanted to prove.
9. (a) It’s enough to show (i) if n > 1000 then $S_n \ge S_{1000}$; (ii) if n > 1000 then $S_n \le S_{1001}$. Claim (i) amounts to observing that every string of the form $\frac{1}{1001} - \frac{1}{1002} + \frac{1}{1003} - \cdots \pm \frac{1}{1000+n}$ adds up to a positive result; group summands in pairs to see why. Claim (ii) holds because every string of the form $-\frac{1}{1002} + \frac{1}{1003} - \frac{1}{1004} + \cdots \pm \frac{1}{1000+n}$ has negative sum; again, group summands in pairs to see why.
(b) For given $\epsilon > 0$ we can take any integer $N \ge 1/\epsilon$. Then $|S_N - S_{N+1}| = \frac{1}{N+1} < \epsilon$.
(c) For given $\epsilon > 0$ choose N as in the preceding part. Then for n > m > N we have both $S_n$ and $S_m$ between $S_N$ and $S_{N+1}$, and thus within $\epsilon$ of each other.
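The trapping of later partial sums between two consecutive ones can be observed numerically. The sketch below assumes that $S_n$ is the n-th partial sum of the alternating harmonic series $1 - \frac{1}{2} + \frac{1}{3} - \cdots$ (an assumption suggested by the strings of terms in part (a)); the cutoff `M` is an arbitrary illustrative choice.

```python
from itertools import accumulate

M = 3000  # how far to test; arbitrary illustrative cutoff
terms = [(-1) ** (k + 1) / k for k in range(1, M + 1)]
S = [0.0] + list(accumulate(terms))  # S[n] is the n-th partial sum

# Claims (i) and (ii): every later partial sum lies between S_1000 and S_1001.
trapped = all(S[1000] <= S[n] <= S[1001] for n in range(1001, M + 1))

# The width of the trapping interval is the single term 1/1001.
gap = S[1001] - S[1000]
print(trapped, gap)
```

Floating-point arithmetic suffices here because consecutive partial sums in this range differ by roughly 1e-3, with pairwise cancellations near 1e-6, both far larger than the rounding error.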
11. (a) Let $A_n$ and $B_n$ be the partial sums for $\sum a_k$ and $\sum b_k$. The hypothesis boils down to the fact that for n > 42 we have $B_n = A_n - A_{42} + B_{42}$. Thus, the sequences $\{A_n\}$ and $\{B_n\}$ differ (for large n) by an additive constant, and so both converge or both diverge.
(b) We know (see the preceding part) that $B_n = A_n - A_{42} + B_{42}$. Because $b_k - a_k = k^2$, we must have $B_{42} - A_{42} = \sum_{k=1}^{42} k^2 = 25585$. Thus, $B_n = A_n + 25585$, and since $A_n \to 100$ we must have $B_n \to 25685$.
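The constant 25585 is the sum of the first 42 squares, which can be confirmed directly and against the closed form $n(n+1)(2n+1)/6$. The variable names in this short sketch are illustrative.

```python
# Sum of k^2 for k = 1, ..., 42, computed term by term.
diff_42 = sum(k ** 2 for k in range(1, 43))

# Closed form n(n+1)(2n+1)/6 with n = 42, so 2n + 1 = 85.
closed_form = 42 * 43 * 85 // 6

# Shifting the limit of A_n (100) by the constant difference.
limit_B = 100 + diff_42
print(diff_42, closed_form, limit_B)
```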
13. A similar problem appears in Section 1.5.
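11. Notice that the sum $S_n = \frac{1}{\sqrt{1}} + \frac{1}{\sqrt{2}} + \cdots + \frac{1}{\sqrt{n}} > \frac{1}{\sqrt{n}} + \frac{1}{\sqrt{n}} + \cdots + \frac{1}{\sqrt{n}} = \frac{n}{\sqrt{n}} = \sqrt{n}$. Since $\sqrt{n} \to \infty$, we must have $S_n \to \infty$, too.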
13. (a) The series converges by comparison to $\sum_{k=1}^{\infty} \frac{1}{2^{k-1}}$, which converges to 2.
(b) We have $R_{10} = \sum_{k=11}^{\infty} \frac{1}{2^{k-1} + 1} < \sum_{k=11}^{\infty} \frac{1}{2^{k-1}} = \frac{1}{2^9}$.
(c) We have $S = S_{10} + R_{10} < 1.26255 + \frac{1}{2^9} \approx 1.26450$. This means that S lies somewhere in the interval [1.26255, 1.26450].
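The numerical values in parts (b) and (c) can be reproduced in a few lines. The sketch below takes the series to be $\sum 1/(2^{k-1} + 1)$, as the remainder formula in part (b) indicates; the 50-term partial sum is only a stand-in for the true sum S, used to see that it falls inside the stated interval.

```python
def term(k):
    """k-th term 1 / (2^(k-1) + 1) of the series."""
    return 1 / (2 ** (k - 1) + 1)


S10 = sum(term(k) for k in range(1, 11))  # tenth partial sum
tail_bound = 1 / 2 ** 9                   # geometric bound on R_10
lower, upper = S10, S10 + tail_bound

# A much longer partial sum, standing in for the true sum S.
partial_50 = sum(term(k) for k in range(1, 51))
print(round(lower, 5), round(upper, 5), lower < partial_50 < upper)
```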
15. (a) See your favorite calculus text.
(b) We need to choose n so $R_n < 0.001$. By the given inequality, this holds if n is large enough so that $\int_n^{\infty} \frac{dx}{x^3} < 0.001$. A calculus calculation shows that $\int_n^{\infty} \frac{dx}{x^3} = \frac{1}{2n^2}$, and the last quantity is less than 0.001 if n ≥ 23. This means that $S_{23} \approx 1.2012$ is within 0.001 of the true answer.
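The threshold n = 23 can be found by a direct search on the integral bound. The sketch below assumes the series in question is $\sum 1/k^3$ (an assumption suggested by the integrand $1/x^3$; the problem statement is not reproduced here).

```python
def tail_bound(n):
    """Integral bound 1 / (2 n^2) on the remainder R_n."""
    return 1 / (2 * n ** 2)


# Smallest n for which the bound drops below the tolerance 0.001.
n = 1
while tail_bound(n) >= 0.001:
    n += 1

# Partial sum at that index, assuming the series is sum 1/k^3.
S_n = sum(1 / k ** 3 for k in range(1, n + 1))
print(n, S_n)
```

The search stops at the first index where $2n^2 > 1000$, which is why n = 22 is not quite enough.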
to any sequence $\{x_n\}$ of the given type, Lemma 3.2 implies that $\lim_{x \to a} g(x) = L$, as desired.
9. (a) We say $\lim_{x \to -\infty} f(x) = L$ if for all $\epsilon > 0$ there exists M < 0 such that $|f(x) - L| < \epsilon$ whenever x < M.
(b) We say $\lim_{x \to 0^-} g(x) = L$ if for all $\epsilon > 0$ there exists $\delta > 0$ such that $|g(x) - L| < \epsilon$ whenever $-\delta < x < 0$.
(c) Here’s the idea of a proof that $\lim_{x \to -\infty} f(x) = L$ implies that $\lim_{x \to 0^-} f(1/x) = L$. The converse is proved similarly.
Let $\epsilon > 0$ be given. We want to show that $|f(1/x) - L| < \epsilon$ for x in some interval $(-\delta, 0)$. By hypothesis, there is some M < 0 so that $|f(t) - L| < \epsilon$ when t < M. Writing t = 1/x, this means that $|f(1/x) - L| < \epsilon$ when 1/x < M < 0, or, equivalently, when 0 > x > 1/M. (The inequality algebra is a bit tricky because both M and x are negative.) This implies that we can take $-\delta = 1/M$, or $\delta = -1/M$.
11. (a) We can use δ ≈ 0.0024 (or less). Note δ ≈ ǫ/4.
(b) We can use δ ≈ 0.00005 (or less). Note δ ≈ ǫ/200.
(c) With ǫ = 0.01, the value δ = 0.001 does not quite work at a = 10. (The
value δ = 0.0005 does work.)
13. First suppose $a \in \mathbb{Z}$, say a = 0. Let $\epsilon > 0$ be given. Set $\delta = 1$. This $\delta$ works, since if $0 < |x - 0| < \delta$, then $x \notin \mathbb{Z}$, and so $|f(x) - 0| = 0 < \epsilon$, as desired.
If $a \notin \mathbb{Z}$, then we can take $\delta$ to be the (positive!) distance from a to the nearest integer.
15. The key idea is that, because S is a finite set, there is a smallest distance, say d, between any two points in S. Therefore, to show that $\lim_{x \to a} j(x) = 0$ when $a \in S$, we can use $\delta = d$, which works for any $\epsilon > 0$.
To show that $\lim_{x \to a} j(x) = 0$ when $a \notin S$, we can choose any $\epsilon > 0$ and let $\delta$ be the smallest distance from a to any point of S. Such a $\delta$ exists because S is finite.
17. (a) The point is that all three quantities in the claimed inequality are even func-
tions of θ—i.e., they have the same value for θ and for −θ.
(b) Draw the right triangles indicated, and note that θ is the area of the pie-shaped
wedge with angle θ at the origin.
13. By the continuity hypothesis, f (0) = limn→∞ f (1/n). Since all terms of the
sequence f (1/n) are positive, the limit can’t be negative. (One can argue this more
formally using general properties of sequences. Or see Problem 3, page 91.)
15. Note that limx→3 f (x) = f (3). Let ǫ = f (3) − 5. If we choose any δ > 0 that
works for this ǫ, we’re done.
17. (a) The discontinuity of f(x) = 1/x at a = 0 is not removable; $\lim_{x \to 0} f(x)$ does not exist.
(b) The discontinuity is removed if we set $f(-2) = -4 = \lim_{x \to -2} f(x)$.
19. We need to show that f (c) = 0 if c is any irrational number. Recall (why?) that
there is a sequence {rn } of rationals such that rn → c; note that f (rn ) = 0 for
all n. Continuity of f at c requires that $f(r_n) \to f(c)$. In other words, $f(c) = \lim_{n \to \infty} f(r_n) = \lim_{n \to \infty} 0 = 0$.
13. Suppose toward contradiction f (2) ≤ f (1). If f (2) = f (1) or f (2) = f (0), then
f is not one-to-one, a contradiction. Thus we have either f (0) < f (2) < f (1) or
f (2) < f (0) < f (1). In the former case, by the intermediate value theorem there
exists c with 0 < c < 1 and f (c) = f (2), which again contradicts the one-to-one
property. The remaining case, f (2) < f (0) < f (1) is similar.
Alternatively, just note that Problem 12 says that f is strictly monotone, and there-
fore strictly increasing.
15. If f is not constant then the range of f is an interval. But every interval contains
both rational and irrational numbers.
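13. (a) Let $\epsilon > 0$ be given. Choose $\delta_1$ and $\delta_2$ such that, for $x, y \in I$,
$|x - y| < \delta_1 \implies |f(x) - f(y)| < \frac{\epsilon}{2}$;
$|x - y| < \delta_2 \implies |g(x) - g(y)| < \frac{\epsilon}{2}$.
Now show that $\delta = \min(\delta_1, \delta_2)$ works for f + g.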
(b) One of many possibilities is f (x) = x = g(x) on I = R.
15. (a) For given ǫ > 0 the number δ = ǫ/K works in the definition.
(b) If A = 0 then L is constant, and K = 0 works. Otherwise, for given ǫ > 0
the number δ = ǫ/|A| works.
(c) The ratio in question is bounded by K.
(d) For $g(x) = \sqrt{x}$, the ratio $\frac{g(x) - g(0)}{x - 0} = \frac{\sqrt{x}}{x} = \frac{1}{\sqrt{x}}$ blows up as $x \to 0^+$. By the preceding part, g is not Lipschitz continuous on [0, 1].
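The unbounded difference quotient in part (d) is easy to see numerically. In the sketch below the sample points are an arbitrary illustrative choice; the ratio grows like $1/\sqrt{x}$ as x shrinks toward 0, so no single constant K can bound it.

```python
import math


def ratio(x):
    """Difference quotient (g(x) - g(0)) / (x - 0) for g(x) = sqrt(x)."""
    return (math.sqrt(x) - math.sqrt(0.0)) / (x - 0.0)


# Sample points 1e-2, 1e-4, ..., 1e-12 approaching 0 from the right.
samples = [10.0 ** (-2 * j) for j in range(1, 7)]
ratios = [ratio(x) for x in samples]

for x, r in zip(samples, ratios):
    print(f"x = {x:.0e}   ratio = {r:.3e}")
```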
17. (a) The function f must be constant (and hence uniformly continuous) on R.
(b) The function f need only be bounded (and not necessarily uniformly contin-
uous) on R.
(c) The function f must be constant (and hence uniformly continuous) on R.
(d) For some δ, values of f can’t rise or fall by more than 1 on intervals
of length δ. Such a function could have jumps, and so need not even be
continuous.
3.6 Compactness
1. (a) Each xi is in some Ui , so it takes at most n such Ui to cover F .
(b) We can cover R with open sets Ui = (−i, i) for i ∈ N. A finite subcollection
covers only a bounded subset of R.
(c) It takes zero open sets to cover ∅; that’s certainly finite!
3. (a) Cover each term 1/k with a small open interval that contains no other term of
the sequence.
(b) The set {an } is compact in this case, because it contains its own limit.
(c) Many sequences are possible. One is 0, 1, 0, 1, . . . .
5. Find a linear (therefore certainly continuous) function f : R → R that maps [0, 1]
onto [a, b].
7. The Ui are open because they are inverse images of the open intervals (−i, i). The
Ui cover all of R, and so all of K. Compactness of K means that we need only
finitely many Ui to cover K. This amounts to boundedness of f on K.
9. Proposition 3.41 says that the image of a compact set is compact.
5. (a) The chain rule says that if h(x) = 1/g(x), then $h'(a) = -g'(a)/g(a)^2$.
(b) Applying the product rule to $f(x)/g(x) = f(x) \cdot h(x)$ gives the derivative $f'(a) \cdot h(a) + f(a) h'(a) = f'(a)/g(a) - f(a) g'(a)/g(a)^2$, which is (after a bit of algebra) seen to be the desired expression.
7. All parts are straightforward—but perhaps messy—calculations.
9. Clearly, $g(a) = f(a)^2 = 0$. Also the chain rule gives $g'(a) = 2 f(a) f'(a) = 0$.
11. (a) Notice first that f is continuous at x = 17, by Theorem 4.3, page 213. We
also know, or can easily prove, that h(x) = |x| is continuous everywhere.
Since g(x) is the composition h(f (x)), we know that g(x) must also be con-
tinuous.
(b) If f (x) = x − 17, then g(x) = |x − 17|, which is clearly not differentiable
at x = 17.
(c) Note that $g(x)^2$ is really the same thing as $f(x)^2$, which is the product of two differentiable functions, and therefore differentiable.
(d) We know f has at least one root. If a > 0 then $f'(x) = 3x^2 + a$, so $f'(x) > 0$ for all x. Thus $f'$ has no roots, and so f can have at most one root.
5. (a) Rolle’s theorem gives x1 between a and c and x2 between c and b.
(b) Rolle’s theorem applied to f ′ on the interval [x1 , x2 ] gives the desired x0 .
(c) If $f : [a, c] \to \mathbb{R}$ has continuous derivatives $f', f'', \dots, f^{(42)}$ on [a, c], and there are inputs $x_1, x_2, \dots, x_{43}$ with $f(x_1) = f(x_2) = \cdots = f(x_{43})$, then the desired $x_0$ exists.
7. Rolle’s theorem implies that f ′ has a root in each interval (0, a), (a, 2a), (2a, 3a),
. . . . Since f ′ is also periodic with period a, (f ′ )′ = f ′′ also has roots in each of
these intervals, and similarly for higher-order derivatives.
9. (a) Every number c in (1, 7) works.
(b) The only solution is c = 4.
(c) In this case the MVT equation $f'(c) = \frac{f(b) - f(a)}{b - a}$ reduces to $100c^{99} = 1$, which gives $c = \frac{1}{100^{1/99}} \approx 0.955$.
(d) In this case the equation $f'(c) = \frac{f(b) - f(a)}{b - a}$ reduces to $6c + 5 = 3a + 3b + 5$, which gives c = (a + b)/2, the midpoint of (a, b).
(e) If $q(x) = Ax^2 + Bx + C$ and [a, b] is any interval, then the mean value equation holds if and only if c = (a + b)/2. To see why, note that in this case the equation $f'(c) = \frac{f(b) - f(a)}{b - a}$ reduces to $B + 2Ac = B + A(a + b)$.
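Parts (c) and (e) can both be checked with a short computation. The sketch below solves $100c^{99} = 1$ directly, and then tests the midpoint claim for one quadratic; the coefficients A, B, C and the interval [a, b] are hypothetical values chosen only for illustration.

```python
# Part (c): the solution of 100 c^99 = 1.
c_power = 100 ** (-1 / 99)


def q(x, A=2.0, B=-3.0, C=1.0):
    """A sample quadratic A x^2 + B x + C (coefficients are illustrative)."""
    return A * x ** 2 + B * x + C


def q_prime(x, A=2.0, B=-3.0):
    """Derivative 2 A x + B of the sample quadratic."""
    return 2 * A * x + B


# Part (e): the secant slope on [a, b] equals the derivative at the midpoint.
a, b = -1.0, 4.0
slope = (q(b) - q(a)) / (b - a)
mid = (a + b) / 2
print(c_power, slope, q_prime(mid))
```

For a quadratic the derivative is linear, so its value at the midpoint is the average of its values at the endpoints, which is another way to see why the midpoint always works.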
11. If a car has the same velocity at times a and b, then there must be some intermediate
time c at which the acceleration is zero.
13. (a) Consider h(x) = f (x) − g(x). By hypothesis, h′ (x) = 0 for all x ∈ (a, b).
By Proposition 4.11, h is constant, as desired.
(b) Let h(x) = f (x) − g(x) − 5x. Then h′ (x) = 0 for all x ∈ (a, b), and so
h(x) = C for some constant C. It follows that f (x) = g(x) + 5x + C.
15. If we write $m = \frac{f(b) - f(a)}{b - a}$, then $L(x) = m(x - a) + f(a)$, and the rest follows.
17. (a) This should be straightforward.
(b) The proof of (i) can be essentially rerun with strict inequalities.
(c) If f were not monotone, then we’d have f (a) = f (b) for some a and b in I.
Rolle’s theorem now says there is c ∈ I with f ′ (c) = 0, which contradicts
the hypothesis.
19. Suppose $f(b) \ge b$ for some b > 0. Then the MVT implies that $f'(c) = \frac{f(b) - f(0)}{b - 0} = \frac{f(b)}{b} \ge 1$ for some c between 0 and b; this contradicts our assumption.
21. (a) The function f is continuous, and so must attain a minimum on the interval
[s, t] by the extreme value theorem. The minimum can’t be at either endpoint
because f ′ (s) < 0 and f ′ (t) > 0.
(b) The function g(x) = f (x) − vx satisfies the conditions in (a), so there is
some c between s and t with 0 = g ′ (c) = f ′ (c) − v. Thus f ′ (c) = v as
desired.
23. (a) All parts follow by plugging t = 0 into f and p2 .
(b) That g(0) = 0 = g(1) is an easy calculation. The t1 in question exists by
Rolle’s theorem applied to g.
(c) That g ′ (0) = 0 = g ′ (t1 ) is an easy calculation. The t2 in question exists by
Rolle’s theorem applied to g ′ .
(d) As in the previous parts, the claimed t3 exists by Rolle’s theorem applied to
g ′′ .
(b) Here is the key identity: $(x + 1/n)^2 - x^2 = \frac{2x}{n} + \frac{1}{n^2}$. The right-hand side is large for large |x|.
11. Let ǫ > 0 be given. Since h is uniformly continuous on R we can choose δ > 0
such that |h(s) − h(t)| < ǫ whenever |s − t| < δ. Because fn → f uniformly on
I we can choose N such that |fn (x) − f (x)| < δ holds for all x ∈ I whenever
n > N . By our choice of δ, this guarantees that |h (fn (x)) − h (f (x))| < ǫ for all
x ∈ I, as desired.
13. (a) The ratio calculation is slightly messy, but it works.
(b) We get f ′′ = −f . This combined with f (0) = 0 and f ′ (0) = 1 says that
f (x) = sin(x).
(c) Differentiate f term by term to find a power series for f ′ . What famous
function is this? How do you know?
(d) The graphs should resemble the sine and cosine functions, respectively.
13. All the calculations work the same as in Problem 12, except that we need δ =
ǫ/(b − a).
as claimed.
(b) Part (a) implies that f (c) ≤ 0 must hold for some c ∈ [a, b]. For similar
reasons, f (d) ≥ 0 must hold for some d ∈ [a, b]. If either f (c) = 0 or
f (d) = 0, we’re done; otherwise, the intermediate value theorem says that
f (e) = 0 for some e between c and d.
(c) Let $f(x) = 1$ for $x \ge 0$ and $f(x) = -1$ for $x < 0$. Then $\int_{-1}^{1} f = 0$, but $f(c) \ne 0$ for all c.
7. The claim amounts to saying that $-\int_a^b |f| \le \int_a^b f \le \int_a^b |f|$. This follows from Corollary 5.6, and from the fact that $-|f(x)| \le f(x) \le |f(x)|$ for all $x \in [a, b]$.
9. (a) The function f is integrable on every interval [a, b] because it is built up from
integrable functions as allowed by Theorem 5.8, page 274.
(b) A look at the graph shows $\int_{-1}^{1} f = -1$.
(c) A look at the graph shows $\int_{-n}^{n} f = -n$.
11. All parts are basic elementary calculus calculations.
13. Suppose f (x) ≤ g(x) holds except at some finite list x1 , x2 , . . . xn of points in
[a, b]. If we redefine f at these points (we could set f (xi ) = g(xi ), for instance,
without changing the integral) then f (x) ≤ g(x) holds throughout [a, b], and so
Corollary 5.6 gives the desired result.
5.3 Integrability
1. All of the concocted functions are continuous.
3. Here is one possibility: Set h = ǫ/100 and use P = {0, h, 1−h, 1+h, . . . , 10−
h, 10}.
7. Given any box sum for $\int_0^2 f$ with value less than $\epsilon$, add (if necessary) one more point at x = 1 to get another box sum with value less than $\epsilon$. This same box sum works for $\int_0^1 f$; just ignore boxes to the right of x = 1.
11. Let mi and Mi be the inf and sup of f on [xi−1 , xi ]. For each i, it is clear that
mi ≤ f (si ) ≤ Mi , and hence that mi ∆xi ≤ f (si )∆xi ≤ Mi ∆xi . Summing
over i gives the desired result.
13. The function $f_{\text{big}}$ is constant on each partition subinterval, and is therefore integrable on [a, b] by Theorem 5.8, page 274. By construction, moreover, the integral $\int_a^b f_{\text{big}}$ has the same value as the upper sum $US(f, P)$.
It is also clear that $f_{\text{big}}(x) \ge f(x)$ for all $x \in [a, b]$, and so (by Corollary 5.6, page 270) we have
$US(f, P) = \int_a^b f_{\text{big}} \ge \int_a^b f = I$,