MATH 13 Book 2024
A B
Contents
Preface: What is Math 13 and who is it for?
Preface: What is Math 13 and who is it for?
Math 13 was created by the late Howard Tucker as a traditional discrete mathematics course (the
number was a deliberate joke). It eventually evolved to become the transition class in UCI’s under-
graduate program, introducing students to abstraction and proof, and serving as the key pre-requisite
for upper-division courses such as abstract algebra, analysis, linear algebra & number theory.
The typical student is simultaneously working through lower-division calculus and linear algebra.
Knowledge of such material is unnecessary and students are encouraged to take Math 13 early both
to ease the transition from algorithmic to abstract mathematics, and so that a proof-mentality may be
brought to other lower-division classes.
This text evolved from course notes dating back to 2008. Math 13 is something of a hydra due to the
niche it occupies in UCI’s program: part proof-writing, part discrete mathematics, and part introduc-
tion to specific upper-division topics. Logic is covered only at a basic level while set theory is spread
throughout the text, the intent being for dry ‘grammar’ topics to be absorbed through engagement
with more accessible and fun ideas. By the end of the course, interested students should be prepared
for a formal study of logic and set theory at the upper-division level.
Learning Outcomes
1. Developing the skills necessary to read and practice abstract mathematics.
2. Understanding the concept of proof and becoming acquainted with multiple proof techniques.
3. Learning what sort of questions mathematicians ask and what excites them.
Number Theory & Abstract Algebra How can we perform arithmetic with remainders? Can you
figure out on which day of the week you were born?
Geometry and Topology How can we visualize objects such as the Möbius strip? How can we use
sequences of sets to produce objects (fractals) that appear similar at all scales?
To Infinity and Beyond! Why are some infinities greater than others?
Useful Texts The following texts are recommended if you want more exercises and material. The
first two are available free online; the rest were previous textbooks for Math 13.
1 Introduction: What is a Proof?
The essential concept in higher-level mathematics is that of proof. A basic dictionary entry might
cover two meanings:
1. A test or trial of an assertion.
2. An argument that establishes the validity (truth) of an assertion.
In science and wider culture, the first meaning predominates: a defendant was proved guilty in court;
a skin cream is clinically proven to make you look younger; an experiment proves that the gravitational
constant is 9.81 m s⁻². A common mistake is to treat such a proved assertion as unambiguously true.
Distinct juries might disagree as to whether a defendant is guilty; indeed for many crimes the truth
is uncertain, hence the more nuanced legal expression proved beyond reasonable doubt.
In mathematics we use the second meaning: a proof establishes the incontrovertible truth of some
assertion. To see what we mean, consider a simple claim (mathematicians use the word theorem).
Theorem 1.1. The sum of any two even integers is even.
Hopefully you believe this statement. But how do we prove it? We can test the assertion by consider-
ing examples (4 + 6 = 10 is even, (−8) + 30 = 22 is even, etc.), but we cannot expect to verify all pairs
this way. For a mathematical proof, we somehow need to test all possible examples simultaneously.
To do so, it is essential to have a clear idea of what is meant by an even integer.
Definition 1.2. An integer is even if it may be written in the form 2k where k is an integer.
Proof. Let x and y be even. Then x = 2k and y = 2l for some integers k and l. But then
    x + y = 2k + 2l = 2(k + l)    (∗)
is even. □
The box indicates the end of the argument. Traditionally the letters Q.E.D. were used, an acronym
for the Latin quod erat demonstrandum (which is what was to be demonstrated).
Consider how the proof depends crucially on the definition.
• The theorem did not mention any variables, though these were essential to the proof. The vari-
ables k and l come for free once you write the definition of evenness! This is very common; simple
proofs are often little more than rearranged definitions.
• According to the definition, 2k and 2l together represent all possible pairs of even integers. It is
essential that k and l be different symbols: Is it clear why? What would you be proving if k = l?
• The calculation (∗) is the easy bit; without the surrounding sentences and the direct reference
to the definition of evenness, the calculation means nothing.
Notice the sleight of hand: a mathematical proof establishes truth only by reference to one or more
definitions.¹
¹ Strictly speaking, the definition and theorem also depend on the meanings of integer and sum, though to rigorously
define either would take us too far afield. In any context, some concepts will be considered too basic to merit definition.
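The contrast between testing and proving can be made concrete on a computer. The following is a minimal sketch in Python, not part of the course materials: the names is_even and sum_of_evens_is_even and the bound 50 are our own illustrative choices. It checks the claim about sums of even integers for every pair in a finite range. A passing check is a test in the first sense of proof only; it says nothing about pairs outside the range.

```python
def is_even(n: int) -> bool:
    """Definition 1.2: n is even if n = 2k for some integer k."""
    return n % 2 == 0


def sum_of_evens_is_even(bound: int) -> bool:
    """Test x + y for every pair of even integers with |x|, |y| <= bound."""
    evens = [n for n in range(-bound, bound + 1) if is_even(n)]
    return all(is_even(x + y) for x in evens for y in evens)


# A finite test: evidence, not a proof.
print(sum_of_evens_is_even(50))
```

However large the bound, infinitely many pairs remain unchecked, which is why the argument with the symbols k and l is needed.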
Theorems & Conjectures
Theorems are true mathematical statements that we can prove. Some are important enough to merit
names (Pythagorean theorem, fundamental theorem of calculus, rank–nullity theorem, etc.), but most
are simple statements such as Theorem 1.1.
In practice we are often confronted with conjectures: statements we suspect to be true, but which we
don’t (yet) know how to prove. Much of the fun and creativity of mathematics lies in formulating
and attempting to prove (or disprove) conjectures.
A conjecture is the mathematician’s analogue of the scientist’s hypothesis: a statement one would
like to be true. The difference in approach takes us right back to the dual meaning of proof. The
scientist tests their hypothesis using the scientific method, conducting experiments which attempt
(and hopefully fail!) to show that the hypothesis is incorrect. The mathematician tries to prove the
validity of a conjecture by relying on logic. The job of a mathematical researcher is to formulate
conjectures, prove them, and publish the resulting theorems. Attempting to formulate your own
conjectures is an essential part of learning mathematics; many will likely be false, but you’ll learn
much by figuring out why!
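Here are two conjectures to give us a taste of this process.
Conjecture 1.3. If n is an odd integer, then n² − 1 is a multiple of 8.
Conjecture 1.4. For every positive integer n, the integer n² + n + 41 is prime.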
How can we decide if these conjectures are true or false? To get a feel for things, we start by computing
examples for several small integers n. (In practice, this is likely what led to the formulation of
the conjectures in the first place!)
n         1   3   5   7   9   11   13
n² − 1    0   8  24  48  80  120  168

n              1   2   3   4   5   6   7
n² + n + 41   43  47  53  61  71  83  97
Since 0, 8, 24, 48, 80, 120 and 168 are all multiples of 8, and 43, 47, 53, 61, 71, 83 and 97 are all prime,
both conjectures appear to be true. Would you bet $100 that this is indeed the case? Is n² − 1 a multiple
of 8 for every odd integer n? Is n² + n + 41 prime for every positive integer n? Establishing whether
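each conjecture is true or false requires one of the following:
• A proof: an argument showing that the claim holds in every case.
• A counterexample: a single instance for which the claim fails.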
Let us start with Conjecture 1.3. If n is an odd integer, then, by definition, we may write n = 2k + 1
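for some integer k. Now compute the object of interest:
    n² − 1 = (2k + 1)² − 1 = 4k² + 4k = 4k(k + 1)
The factor 4 is already present, so n² − 1 is a multiple of 8 precisely when k(k + 1) is even. Is k(k + 1)
always even? We return to testing some small values of k:
k           −2  −1   0   1   2   3   4
k(k + 1)     2   0   0   2   6  12  20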
Once again, the claim seems to be true for small values of k, but is it true for all k? Again, the only
way is to prove or disprove it. Observe that k (k + 1) is the product of two consecutive integers. This is
great, because for any two consecutive integers one is even and the other odd; their product must be
even. Conjecture 1.3 is indeed a theorem!
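The scratch work for Conjecture 1.3 is easy to automate. The sketch below (Python; the helper name and the ranges are our own assumptions) tests odd integers from −99 to 99 and also tests the key observation that a product of two consecutive integers is even. As before, this is experimentation that supports the conjecture, not a proof of it.

```python
def n_squared_minus_one(n: int) -> int:
    """The object of interest in Conjecture 1.3."""
    return n * n - 1


# Odd integers -99, -97, ..., 97, 99
odd_values = range(-99, 100, 2)
all_multiples_of_8 = all(n_squared_minus_one(n) % 8 == 0 for n in odd_values)

# k(k + 1) is a product of consecutive integers; check that it is even
consecutive_products_even = all((k * (k + 1)) % 2 == 0 for k in range(-50, 51))

print(all_multiples_of_8, consecutive_products_even)
```

Both checks pass on these ranges; the argument about consecutive integers is what extends the conclusion to every odd integer.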
So far, our approach has been investigative. Scratch work is an essential part of the process, but we
shouldn’t expect a reader to have to fight their way through such. We therefore offer a formal proof.
This is the final result of our deliberations; investigate, spot a pattern, conjecture, prove, and finally
present our work in as clean and convincing a manner as we can.
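Theorem 1.5. If n is an odd integer, then n² − 1 is a multiple of 8.
Proof. Let n be an odd integer. By definition, we may write n = 2k + 1 for some integer k. Then
    n² − 1 = (2k + 1)² − 1 = 4k² + 4k = 4k(k + 1)
Since k and k + 1 are consecutive integers, one of them is even, so k(k + 1) is even. Writing
k(k + 1) = 2m for some integer m, we see that n² − 1 = 8m is a multiple of 8. □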
All that work, just for five lines of clean argument! But wasn’t it fun?
When constructing elementary proofs it is common to be unsure over how much detail to include.
We relied on the definition of oddness, but we also used the fact that a product is even whenever
either factor is even; does this need a proof? Since the purpose of a proof is to convince the reader,
the appropriateness of an argument will depend on context and your audience: if you are trying to
convince a middle-school student, maybe you should justify this step more fully, though the cost
would be a longer argument whose totality is harder to grasp. A proof that works perfectly in all
situations is unlikely to exist! A good rule is to imagine writing for another mathematician at your
own level—if a fellow student believes your argument, that’s a good sign of both its validity and
appropriateness.
We now consider Conjecture 1.4. The question is whether n² + n + 41 is prime for every positive
integer n. When n ≤ 7 the answer is yes, but examples do not make a proof! To investigate further,
return to the definition of prime (footnote 2): is there a positive integer n for which n² + n + 41 can
be factored as a product of two integers, both at least 2? A straightforward answer is staring us in
the face! When n = 41 such a factorization certainly exists:
    n² + n + 41 = 41² + 41 + 41 = 41(41 + 1 + 1) = 41 · 43
We call n = 41 a counterexample; there is at least one integer n for which n² + n + 41 is not prime.
Conjecture 1.4, being a claim about all integers n, is therefore false—it has been disproved.
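A computer search can also hunt for counterexamples to Conjecture 1.4. The following Python sketch is illustrative only: the functions is_prime (simple trial division) and first_counterexample, and the search limit 100, are our own choices.

```python
def is_prime(m: int) -> bool:
    """Trial division: m is prime if m >= 2 and no d with 2 <= d <= sqrt(m) divides m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def first_counterexample(limit: int):
    """Return the smallest n in 1..limit with n^2 + n + 41 not prime, or None."""
    for n in range(1, limit + 1):
        if not is_prime(n * n + n + 41):
            return n
    return None


n = first_counterexample(100)
print(n, n * n + n + 41)   # prints: 40 1681
```

The search reports n = 40, since 40² + 40 + 41 = 1681 = 41². The choice n = 41 in the text is not the smallest counterexample, but it has the advantage of being visible without any computation. Either one suffices to disprove the conjecture.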
Planning and Writing Proofs
Your main responsibility in this course is the construction of proofs. Their sheer variety means that,
unlike in elementary calculus, you cannot simply practice tens of similar problems until the process
becomes automatic. So how do you learn to write proofs?
The first step is to read other arguments. Don’t just accept them, make sure you believe them: check
the calculations, verify claims, and rewrite the argument in your own words adding any clarifications
you deem necessary or helpful.
As you dissect others’ work, a daunting question often arises: how did they ever come up with this? As
our work on Theorem 1.5 shows, the source of a proof is often less magical than it appears; usually
the author experimented until they found something that worked. The experimentation is hidden
in the final proof, whose purpose is to be as clean and convincing as possible. A proof is akin to a
concert performance after hours of private practice; wrong notes don’t belong in the Carnegie Hall!
In order to bridge the gap, we recommend splitting the proof-writing process into several steps.
Interpret Make sense of the statement. What is it saying? Can you rephrase in a clearer manner?
What are you assuming? A key part of this step is identifying the logical structure of the state-
ment. We’ll discuss this at length in the next chapter.
Brainstorm Convince yourself that the statement is true. First, look up the relevant definitions. Next,
think of some instances where the conditions of the statement are met. Try out some exam-
ples, and ask yourself what makes the claim work in those instances. Examples can help build
intuition about why a claim is true and sometimes suggest a proof strategy. Review other theo-
rems that use these definitions. Do you know any theorems that relate your assumptions to the
conclusion? Have you seen a proof of a similar statement before?
Sketch Build the skeleton of your proof. Think again about what you are assuming and what you
are trying to prove. It is often straightforward to write down reasonable first and last
steps (the bread slices of a proof-sandwich). Try to connect these with informal arguments. If you
get stuck, try a different approach.
This step is often the longest in the proof-writing process. It is also where you will be doing
most of your calculations. You can be as messy as you like because no-one ever has to see it! Once
you’ve learned a variety of proof methods, this is a good stage at which to experiment with
different approaches.
Prove Once you have a suitable sketch, it’s time to prove the statement to the world. Translate your
sketch into a linear story. Carefully word your explanations and avoid shorthand, though well-
understood mathematical symbols such as =⇒ are encouraged. The result should be a clear,
formal proof such as you’d find in a mathematics textbook. Although you are providing a
mathematical argument, your proof should read like prose and be written in complete sentences.
Review Finally, review your proof. Assume the reader is meeting the problem for the first time and
has not seen your sketch. Read your proof with skepticism; consider its readability and flow.
Get rid of unnecessary claims and revise the wording if necessary. Read your proof out loud.
If you’re adding extra words that aren’t written down, include them in the proof. Share your
work with others. Do they understand it without any additional input from you?
Conjectures: True or False?
Higher-level mathematics is all about the links between proofs, definitions, theorems and conjec-
tures. We prove theorems (and solve homework problems) because they make us use, and aid our
understanding of, definitions. We state definitions to help us formulate conjectures and prove theo-
rems. One does not know mathematics, one does it. Mathematics is a practice; an art as much as it is a
science.
With this in mind, do your best to prove or disprove the following conjectures. Don’t worry if some
terms or notations are unfamiliar: ask! Everything will be covered formally soon enough. At the end
of the course, revisit these problems to realize how much your proof skills have improved.
11. If r is a rational number, then there is a non-zero integer n such that rn is an integer.
13. For all real numbers x, there exists a real number y for which x < y.
14. There is a real number x such that, for all real numbers y, we have x < y.
15. The sets A = {n ∈ N : n² < 25} and B = {n² : n ∈ N and n < 5} are equal. Here N denotes
the set of natural numbers.
2 Logic and the Language of Proofs
2.1 Propositions
To read and construct proofs, we need to develop the language of logic. This is to mathematics what
grammar is to English.
For a proposition to make sense, readers must agree on the meaning of each concept it references. In
the real world, arguments about propositions are often disagreements over definitions. For instance,
the question of whether God exists is meaningless unless we agree on which conception (Shiva,
Yahweh, Allah, Zeus, all/any of them?) is being discussed! This also illustrates that the truth status
of a proposition need not be known when stated, a particularly common situation in mathematics.
Definition 2.3. Let P and Q be propositions. The following truth tables define three new propositions:
• The conjunction P ∧ Q is read “P and Q.”
• The disjunction P ∨ Q is read “P or Q.”
• The negation ¬P is read “not P.”

P  Q  P ∧ Q  P ∨ Q
T  T    T      T
T  F    F      T
F  T    F      T
F  F    F      F

P  ¬P
T   F
F   T
A tautology is a logical expression that is always true (truth table has a column of T’s), regardless of
its component propositions. A contradiction is a logical expression that is always false.
The letters T/F stand for true/false. For instance, the second line of the first table says that if P is true
and Q is false, then the proposition “P and Q” is false; similarly “P or Q” is true.
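Truth tables are mechanical enough to generate by computer. As a minimal sketch (Python; the helper truth_table is our own hypothetical name), Python's built-in and, or and not on the values True/False behave like ∧, ∨ and ¬, and itertools.product lists the rows in the same order as the tables above.

```python
from itertools import product


def truth_table(expr, n_vars):
    """List (inputs, value) for every assignment of True/False to the variables."""
    return [(values, expr(*values)) for values in product([True, False], repeat=n_vars)]


conjunction = truth_table(lambda p, q: p and q, 2)   # P ∧ Q
disjunction = truth_table(lambda p, q: p or q, 2)    # P ∨ Q
negation = truth_table(lambda p: not p, 1)           # ¬P

for (p, q), value in conjunction:
    print(p, q, value)
```

The second printed row reads True False False, matching the second line of the first table.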
Examples 2.4. 1. By choosing explicit propositions, we may compare the logical and/or/not with
their plain English meanings. Suppose P and Q are the propositions
English! Note also that the logical or is inclusive (first line of the truth table): with a logical or,
“I like purple or chartreuse,” means you might like both.
2. We continue the previous example by adding a third proposition R: “It is 9am.” What logical
expression might be represented by the following sentence?
P  Q  R  |  Q ∨ R  |  P ∧ (Q ∨ R)  |  P ∧ Q  |  (P ∧ Q) ∨ R
T  T  T  |    T    |       T       |    T    |       T
T  T  F  |    T    |       T       |    T    |       T
T  F  T  |    T    |       T       |    F    |       T
T  F  F  |    F    |       F       |    F    |       F
F  T  T  |    T    |       F       |    F    |       T
F  T  F  |    T    |       F       |    F    |       F
F  F  T  |    T    |       F       |    F    |       T
F  F  F  |    F    |       F       |    F    |       F
The moral here is that English is terrible for logic! Clear identification of propositions is essen-
tial if you want to avoid ambiguous sentences such as the above.
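To see exactly where the two readings P ∧ (Q ∨ R) and (P ∧ Q) ∨ R disagree, one can list the rows of the truth table on which they differ. This is a small illustrative Python sketch; the function names left and right are our own.

```python
from itertools import product


def left(p, q, r):
    return p and (q or r)      # P ∧ (Q ∨ R)


def right(p, q, r):
    return (p and q) or r      # (P ∧ Q) ∨ R


differing_rows = [
    (p, q, r)
    for p, q, r in product([True, False], repeat=3)
    if left(p, q, r) != right(p, q, r)
]
print(differing_rows)   # [(False, True, True), (False, False, True)]
```

The two expressions disagree exactly when P is false and R is true, which agrees with the fifth and seventh rows of the table.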
Connective propositions can be read and written in many different ways: for instance,
P =⇒ Q                                           P ⇐⇒ Q
P implies Q             P therefore Q            P if and only if Q
If P, then Q            Q follows from P         P iff Q
P only if Q             Q if P                   P and Q are (logically) equivalent
P is sufficient for Q   Q is necessary for P     P is necessary and sufficient for Q
Examples 2.6. 1. Here are five English sentences expressing the same conditional P =⇒ Q:
• If you are born in Rome, then you are Italian.
• You are Italian if you are born in Rome.
• You are born in Rome only if you are Italian.
• Being born in Rome is sufficient for being Italian.
• Being Italian is necessary for being born in Rome.
Are you comfortable with what the propositions P and Q are in this situation?
2. (P ∧ (P =⇒ Q)) =⇒ Q is a tautology.
P  Q  |  P =⇒ Q  |  P ∧ (P =⇒ Q)  |  (P ∧ (P =⇒ Q)) =⇒ Q
T  T  |     T     |       T         |          T
T  F  |     F     |       F         |          T
F  T  |     T     |       F         |          T
F  F  |     T     |       F         |          T
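Checking for a tautology amounts to checking that the final column of the truth table contains only T. The Python sketch below is our own illustration (the names implies, is_tautology and modus_ponens are assumptions, not notation from the text); it encodes the conditional by the rule that P =⇒ Q is false only when P is true and Q is false, as in the third column of the table above.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """P =⇒ Q is false only when P is true and Q is false."""
    return (not p) or q


def is_tautology(expr, n_vars: int) -> bool:
    """True if expr evaluates to True on every row of its truth table."""
    return all(expr(*values) for values in product([True, False], repeat=n_vars))


def modus_ponens(p: bool, q: bool) -> bool:
    return implies(p and implies(p, q), q)   # (P ∧ (P =⇒ Q)) =⇒ Q


print(is_tautology(modus_ponens, 2))   # True
print(is_tautology(implies, 2))        # False: P =⇒ Q fails when P is T and Q is F
```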
Identifying the hypothesis and conclusion is essential if you want to understand a theorem!
    P =⇒ P₂ =⇒ · · · =⇒ Pₙ =⇒ Q
We’ll revisit these ideas in Section 2.3, and repeatedly throughout the course.
While the biconditional should be easy to remember, it is harder to make sense of the conditional
connective. Short of simply memorizing the truth table, here are two examples that might help.
Examples 2.7. 1. Suppose your professor says, “If the class earns a B average on the midterm, then
I’ll bring doughnuts.” The only situation in which the teacher will have lied is if the class earns
a B average but she fails to provide doughnuts.
This argument is perfectly correct: the implication P =⇒ Q is true. It (rightly!) makes us uncom-
fortable because the hypothesis is false.
If we instead add 1 to each side of 7 = 3, we’d obtain an example where F =⇒ F is true.
The Converse and Contrapositive
It is vital to understand the distinction between these, particularly because of a crucial result.
Proof. Compute the truth table and observe that its third and sixth columns are identical:
P  Q  |  P =⇒ Q  |  ¬Q  |  ¬P  |  ¬Q =⇒ ¬P
T  T  |     T     |  F   |  F   |      T
T  F  |     F     |  T   |  F   |      F
F  T  |     T     |  F   |  T   |      T
F  F  |     T     |  T   |  T   |      T
By contrast, the converse is not logically equivalent to the original: the converse of a true implication
could be either true or false (though see Exercise 11 for a special situation).
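The difference between the contrapositive and the converse can be checked row by row. In this illustrative Python sketch (the helper implies and the variable names are our own), the contrapositive agrees with the original implication on all four rows, while the converse disagrees on two of them.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """P =⇒ Q is false only when P is true and Q is false."""
    return (not p) or q


rows = list(product([True, False], repeat=2))

# Contrapositive: ¬Q =⇒ ¬P
contrapositive_matches = all(implies(p, q) == implies(not q, not p) for p, q in rows)

# Converse: Q =⇒ P
converse_mismatches = [(p, q) for p, q in rows if implies(p, q) != implies(q, p)]

print(contrapositive_matches)   # True
print(converse_mismatches)      # [(True, False), (False, True)]
```

The implication and its converse differ exactly when P and Q have different truth values.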
Since a peach is certainly a fruit, P =⇒ Q is true: “If Claudia has a peach, then she has a fruit.”
The converse of P =⇒ Q is the sentence
This is palpably false: Claudia might have an apple! However, in accordance with Theorem 2.9, the
contrapositive is true:
¬ Q =⇒ ¬ P: “If Claudia does not have a fruit, then she does not have a peach.”
Example 2.12. Consider the implication
As in Example 2.7, it might help to think about what it means for the original statement to be false.
• The converse Q =⇒ P
• The contrapositive of the converse ¬ P =⇒ ¬ Q
Our final results in basic logic also involve negations; they are named for Augustus de Morgan, a
British logician of the 19th century.
1. ¬( P ∧ Q) is logically equivalent to ¬ P ∨ ¬ Q
2. ¬( P ∨ Q) is logically equivalent to ¬ P ∧ ¬ Q
Proof. For the first law, compute the truth table: the fourth and seventh columns are identical.
P  Q  |  P ∧ Q  |  ¬(P ∧ Q)  |  ¬P  |  ¬Q  |  ¬P ∨ ¬Q
T  T  |    T    |     F      |  F   |  F   |     F
T  F  |    F    |     T      |  F   |  T   |     T
F  T  |    F    |     T      |  T   |  F   |     T
F  F  |    F    |     T      |  T   |  T   |     T
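Both of de Morgan's laws can be confirmed by exhausting the four possible truth assignments. The following is a short Python sketch of that check; the variable names are our own.

```python
from itertools import product

rows = list(product([True, False], repeat=2))

# First law: ¬(P ∧ Q) is equivalent to ¬P ∨ ¬Q
first_law = all((not (p and q)) == ((not p) or (not q)) for p, q in rows)

# Second law: ¬(P ∨ Q) is equivalent to ¬P ∧ ¬Q
second_law = all((not (p or q)) == ((not p) and (not q)) for p, q in rows)

print(first_law, second_law)   # True True
```

Because a proposition built from P and Q has only four rows in its truth table, this exhaustive check is a genuine proof rather than a mere test.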
Example 2.14. Consider the sentence
    I rode the subway and I had coffee.
To negate this, we might write
    I didn’t ride the subway or I didn’t have coffee.

Subway  Coffee  Subway and Coffee
  T       T            T
  T       F            F
  F       T            F
  F       F            F
This reads awkwardly because the negation encompasses three distinct possibilities. Note how the
logical (inclusive) use of or includes the last row of the truth table: the possibility that one neither
rode the subway nor had coffee. As with Example 2.4, this is another advert for the use of logic over
English.
Aside: Algebraic Logic We can use truth tables to establish other laws of basic logic, e.g.:
To make things more algebraic, we’ve replaced “is logically equivalent to” with a biconditional.³
Armed with these laws, one can often manipulate logical expressions without the laborious creation
of truth tables.
You are welcome to try memorizing these laws, though there is typically no need: De Morgan’s laws,
together with your intuitive understanding of and, or and not mean you’ll likely perform correct
manipulations regardless.
Exercises 2.1. A reading quiz and several questions with linked video solutions can be found online.
1. Express each statement in the form, “If . . . , then . . . ” There are many possible correct answers.
(a) You must eat your dinner if you want to grow.
(b) Being a multiple of 12 is a sufficient condition for a number to be even.
(c) It is necessary for you to pass your exams in order for you to obtain a degree.
(d) A triangle is equilateral only if all its sides have the same length.
2. Suppose “x is an even integer” and “y is an irrational number” are true statements, and that
“z ≥ 3” is a false statement. Which of the following are true?
(Hint: Label each statement and think about each using connectives)
(a) If x is an even integer, then z ≥ 3.
(b) If z ≥ 3, then y is an irrational number.
(c) If z ≥ 3 or x is an even integer, then y is an irrational number.
(d) If y is an irrational number and x is an even integer, then z ≥ 3.
3. Orange County is considering two competing transport plans: widening the 405 freeway and
constructing light rail down its median. A local politician is asked, “Would you like to see the
405 widened or would you like to see light rail?” The politician wants to sound positive, but to
avoid being tied to one project. What is their response?
(Hint: Think about how the word ‘OR’ is used in logic)
4. Consider the proposition: “If the integer m is greater than 3, then 2m is not prime.”
(a) Rewrite the proposition using the word ‘necessary.’
(b) Rewrite the proposition using the word ‘sufficient.’
(c) Write the negation, converse and contrapositive of the proposition.
5. Suppose the following sentence is true: “If Amy likes art, then no-one likes history.” What, if
anything, can we conclude if we discover that someone likes history?
³ Stating the laws in this fashion is to assert that each expression is a tautology (Definition 2.3). For instance, to claim
6. Construct the truth tables for the propositions P ∨ ( Q ∧ R) and ( P ∨ Q) ∧ R. Are they the same?
7. Use truth tables to establish the following laws of logic:
(a) Double negation: ¬(¬ P) ⇐⇒ P
(b) Idempotent law: P ∧ P ⇐⇒ P
(c) Absorption law: P ∧ ( P ∨ Q) ⇐⇒ P
(d) Distributive law: ( P ∧ Q) ∨ R ⇐⇒ ( P ∨ R) ∧ ( Q ∨ R)
8. (a) Decide whether ( P ∧ ¬ P) =⇒ Q is a tautology, a contradiction, or neither.
(b) Explain why ¬ P ∨ ¬ Q is logically equivalent to P =⇒ ( P ∧ ¬ Q).
(c) Prove: ((P ∧ ¬Q) =⇒ F) ⇐⇒ (P =⇒ Q) is a tautology. Here F represents a contradiction.
9. (a) Prove that the expressions (P =⇒ Q) ∧ (Q =⇒ P) and P ⇐⇒ Q are logically equivalent.
(b) Prove that ((P =⇒ Q) ∧ (Q =⇒ R)) =⇒ (P =⇒ R) is a tautology.
Why do these make intuitive sense?
10. Use logical algebra (see the Aside on Algebraic Logic) to show that (P ∨ Q) ∧ ¬P ∧ ¬Q is a contradiction.
11. Compare the truth tables for P =⇒ Q and its converse. Do there exist propositions P, Q for
which both P =⇒ Q and its converse are false? Explain.
12. A friend insists that the negation of “Mark and Mary have the same height,” is “Mark or Mary
do not have the same height.” What is the correct negation? Where did your friend go wrong?
13. Suppose that the following statements are true:
(a) Every octagon is magical.
(b) If a polygon is not a rectangle, then it is not a square.
(c) A polygon is a square, if it is magical.
Is it true that “Octagons are rectangles”? Explain your answer.
(Hint: try rewriting each of the statements as an implication)
14. The connective ↓ (the Quine dagger, NOR) is defined by a truth table.

P  Q  P ↓ Q
T  T    F
T  F    F
F  T    F
F  F    T

(a) Prove that P ↓ Q is logically equivalent to ¬(P ∨ Q).
(b) Find a logical expression built using only P and the connective ↓
which is logically equivalent to ¬P.
(c) Find an expression built using only P, Q and ↓ which is logically equivalent to P ∧ Q.
15. (Just for fun) Augustus de Morgan satisfied his own problem:
I turn(ed) x years of age in the year x².
(a) Given that de Morgan died in 1871, and that he wasn’t the beneficiary of some miraculous
anti-aging treatment, find the year in which he was born.
(b) Suppose you have an acquaintance who satisfies the same problem. When were they born
and how old will they turn this year?
2.2 Propositional Functions & Quantifiers
Mathematical propositions are typically more complicated than those seen in Section 2.1. In particular,
they often involve variables: for instance, “x is an integer greater than 5.”
Definition 2.15. A propositional function is a family of propositions depending on one or more vari-
ables. The collection of permitted variables is the domain.
If P is a propositional function depending on a single variable x, then for each object a in the domain
P( a) is a proposition. Typically P( x ) is true for some x and false for others.
Example 2.16. Suppose P(x) is the propositional function “x² > 4” with domain the real numbers.
Plainly P(1) is false (“1² > 4” is nonsense), while P(6) is true (“6² > 4”).
Propositional functions are often quantified. English contains various quantifiers (all, some, many, few,
several, etc.), but in mathematics we are primarily concerned with just two.
Definition 2.17. The universal quantifier ∀ is read “for all.” The existential quantifier ∃ is read “there
exists.” Given a propositional function P( x ), we may define two new quantified propositions:
When quantifying propositions it is common to describe the domain by including a descriptor after
the quantifier (bounding the quantifier—see the example below).
As with connectives, there are multiple ways to express quantified propositions both mathemati-
cally and in English. Symbolic quantifiers involve a trade-off so consider your audience: compact
statements can improve clarity but are harder to read for the uninitiated.
Example (2.16 cont.). To gain some practice with bounded quantifiers, we introduce the notation
x ∈ R which simply means that x is a real number.⁴
• “∀ x ∈ R, x² > 4” might be read, “The square of every real number is greater than 4.”
The quantified proposition is false since 1² > 4 is false: we call x = 1 a counter-example.
• “∃ x ∈ R, x² > 4” might be read, “There is a real number whose square is greater than 4.”
The quantified proposition is true since 6² > 4 (is true): we call x = 6 an example.
⁴ This notation should be familiar. Don’t worry if not, for it will be properly discussed in chapter 4.
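On a finite collection of values, the quantifiers ∀ and ∃ correspond to Python's built-in all and any. The sketch below is illustrative only: the list sample is a hypothetical finite selection of real numbers, not the full domain R. A finite sample can supply a counter-example to a universal statement or an example for an existential one, but it cannot prove a universal statement about all real numbers.

```python
def P(x: float) -> bool:
    """The propositional function of Example 2.16: x^2 > 4."""
    return x ** 2 > 4


sample = [-3, -1, 0, 1, 2.5, 6]     # hypothetical finite sample of real numbers

counterexamples = [x for x in sample if not P(x)]   # witnesses against "for all"
examples = [x for x in sample if P(x)]              # witnesses for "there exists"

print(all(P(x) for x in sample), counterexamples)   # False [-1, 0, 1]
print(any(P(x) for x in sample), examples)          # True [-3, 2.5, 6]
```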
Universal Quantifiers and Connectives: Hidden Quantifiers
Universally quantified propositions are interchangeable with implications. To see this, suppose a
propositional function Q( x ) is given and let P( x ) be “x lies in the domain of Q.” Then
∀ x, Q( x ) is logically equivalent to P( x ) =⇒ Q( x )
Examples 2.19. 1. The universal statement, “Every cat is neurotic,” may instead be written,
The universal quantifier is explicit in only one of the sentences! For even more variety, the
third sentence could be viewed as a universal statement about all integers; including the hidden
quantifier in this case results in
    ∀n ∈ Z, n odd =⇒ n² odd.
We’ve already seen that disproving a universal statement requires only that we supply a counter-
example. While such might require some effort to find, often the resulting argument is very simple.
By contrast, proving a universal statement is the same as proving a conditional connective, typically
a more involved activity. We therefore largely postpone this to the next section. Regardless, a simple
proof of the above oddness claim should be easy to follow.
Proof of Example 2.19.3. If an integer n is odd, then it may be written in the form n = 2k + 1 for some
integer k. But then
    n² = (2k + 1)² = 4k² + 4k + 1 = 2(2k² + 2k) + 1
is odd. □
Similarly, proving an existential statement (by providing an example) is typically more straightfor-
ward than disproving such. To help understand this duality, we need to see how to negate quantified
propositions.
⁵ By contrast, the existential quantifier is never hidden: it is always explicit either symbolically (∃) or as a phrase in
English (there is, there exists, some, at least one, etc.).
Negating Quantified Propositions
To negate a proposition is to consider what it means for the proposition to be false. We already
understand what this means for a universal proposition:
“∀ x, P( x )” is false if and only if there exists a counter-example.
Otherwise said, the negation of a universal statement is existentially quantified:
The negation of “P( x ) is always true” is “P( x ) is sometimes false.”
Replacing P(x) with ¬P(x) results in the related observation:
The negation of “P( x ) is always false” is “P( x ) is sometimes true.”
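In summary:
Theorem 2.20. For any propositional function P(x),
    ¬(∀ x, P(x)) ⇐⇒ ∃ x, ¬P(x)    and    ¬(∃ x, P(x)) ⇐⇒ ∀ x, ¬P(x)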
Examples 2.21. 1. “Everyone owns a bicycle,” has negation “Someone does not own a bicycle.” It is
somewhat ugly, but we could write this semi-symbolically:
    ¬(∀ people x, x owns a bicycle) ⇐⇒ ∃ a person x such that x does not own a bicycle
2. The quantified proposition⁶ “∃ x > 0, sin x = 4,” has the form ∃ x, P(x). Its negation is therefore
∀ x, ¬P(x): explicitly,
    ∀ x > 0, sin x ≠ 4
Since sine satisfies −1 ≤ sin x ≤ 1, the original proposition is false and its negation true.
3. Take special care negating connectives: a negated hidden quantifier ∀ x becomes explicit.
    ¬(P(x) =⇒ Q(x)) is logically equivalent to ∃ x, P(x) ∧ ¬Q(x)    (Theorem 2.11)
(a) (Example 2.19.3) The negation of “n odd =⇒ n² odd,” is the (false) claim
    ∃ n ∈ Z, n odd and n² even
(b) (Example 2.19.2) The negation of the false claim “x ∈ R =⇒ x² > 4” is the true assertion
    ∃ x ∈ R, x² ≤ 4
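The rules for negating quantifiers can be sanity-checked on a finite domain, where ∀ and ∃ correspond to Python's all and any. This is a minimal sketch under our own assumptions: the domain −10, ..., 10 and the names P and witnesses are illustrative, and a finite check does not establish the rules in general.

```python
domain = range(-10, 11)   # hypothetical finite domain: -10, ..., 10


def P(x: int) -> bool:
    return x * x > 4


# not (for all x, P(x))  <=>  there exists x with not P(x)
assert (not all(P(x) for x in domain)) == any(not P(x) for x in domain)

# not (there exists x, P(x))  <=>  for all x, not P(x)
assert (not any(P(x) for x in domain)) == all(not P(x) for x in domain)

# Negation of "n odd =⇒ n^2 odd": look for an odd n whose square is even
witnesses = [n for n in domain if n % 2 == 1 and (n * n) % 2 == 0]
print(witnesses)   # []
```

The empty list is consistent with the negated claim being false: no odd n in this range has an even square.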
Multiple Quantifiers
A propositional function can have several variables, each of which may be quantified.
might be read, “Given any positive number, there is another such that their product is four.”
Hopefully you believe that this is true! Here is a simple argument which comes from viewing
(∗) as an implication, “x > 0 =⇒ ∃y > 0 such that xy = 4.”
Being clear about domains is critical. Suppose we modify the original proposition:
Our proof now fails! The new statement (†) is false: indeed x = 0 provides a counter-example.
Alternatively, we could negate (†): following Theorem 2.20, we switch the symbols ∀ ↔ ∃ and
negate the final proposition,⁷
    ¬(∀ x ∈ R, ∃y ∈ R, xy = 4) ⇐⇒ ∃ x ∈ R, ∀y ∈ R, xy ≠ 4    (¬(†))
Our disproof of (†) is really a proof of the negation: we provided the example x = 0, thus demon-
strating the truth of a ∃-statement. Since the negation is true, the original (†) is false.
2. Order of quantifiers matters! The meaning of a sentence—and its truth state—can change if we
alter the order of quantification.
(a) ∀ x ∈ R, ∃y ∈ R, x² < y
(b) ∃y ∈ R, ∀ x ∈ R, x² < y
Don’t worry if these arguments seem difficult at the moment; much more practice is coming.
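Nested quantifiers translate into nested all/any calls, which makes the effect of swapping their order easy to observe on a finite domain. In the Python sketch below, both the domain −3, ..., 3 and the predicate Q(x, y): “x + y = 0” are our own illustrative choices rather than examples from the text.

```python
domain = range(-3, 4)   # hypothetical finite domain: -3, ..., 3


def Q(x: int, y: int) -> bool:
    return x + y == 0    # illustrative predicate, not one from the text


# For all x, there exists y with x + y = 0 (take y = -x)
forall_exists = all(any(Q(x, y) for y in domain) for x in domain)

# There exists y such that, for all x, x + y = 0 (no single y works for every x)
exists_forall = any(all(Q(x, y) for x in domain) for y in domain)

print(forall_exists, exists_forall)   # True False
```

In the first statement y is allowed to depend on x; in the second a single y must work for every x simultaneously.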
⁷ For an abstract justification of this heuristic, consider a propositional function P(x): “∃y, Q(x, y),” then
    ¬(∀ x, ∃y, Q(x, y)) ⇐⇒ ¬(∀ x, P(x)) ⇐⇒ ∃ x, ¬P(x) ⇐⇒ ∃ x, ¬(∃y, Q(x, y)) ⇐⇒ ∃ x, ∀y, ¬Q(x, y)
We finish with two harder examples you might have encountered elsewhere. For this course, you do
not have to know what these statements mean, though you do have to be able to negate them.
∀ a, b, c ∈ R, ax + by + cz = 0 =⇒ a = b = c = 0
Since this is a conditional proposition, the expression ∀ a, b, c ∈ R would likely be hidden. The
negation of this statement, what it means for x, y, z to be linearly dependent, is
    ∃ a, b, c ∈ R, ax + by + cz = 0 and a, b, c are not all zero
The original statement contained a hidden quantifier ∀ x which became explicit upon negation.
Exercises 2.2. A self-test quiz and several worked questions can be found online.
1. Rewrite each sentence using quantifiers. Then write the negation (use words and quantifiers).
(a) All mathematics exams are hard.
(b) No football players are from San Diego.
(c) There is an odd number that is a perfect square.
2. Let P be the proposition: “Every positive integer is divisible by thirteen.”
(a) Write P using quantifiers.
(b) What is the negation of P?
(c) Is P true or false? Prove your assertion.
3. A friend claims that the sentence “x^2 > 0 =⇒ x > 0” has negation “x^2 > 0 and x ≤ 0.” Why
is this incorrect? What is the correct negation?
4. Consider the quantified statement
∀ x, y, z ∈ R, (x − 3)^2 + (y − 2)^2 + (z − 7)^2 > 0 (∗)
(a) Express (∗) in words.
(b) Is (∗) true or false? Explain.
(c) Express the negation of (∗) in symbols, and then in words.
(d) Is the negation of (∗) true or false? Explain.
5. Suppose P, Q, R are propositional functions. Compute the negations of the following:
(a) ∀ x, ∃y, P( x ) ∧ Q(y) (b) ∀ x, ∃y, ∀z, R( x, y, z)
6. Revisit Example 2.22.2. Decide whether each of the following is true or false:
(a) ∃ x ∈ R, ∀y ∈ R, x^2 < y (b) ∀y ∈ R, ∃ x ∈ R, x^2 < y
7. The following are statements about positive real numbers x, y. Which is true? Explain.
(a) ∀ x, ∃y such that xy < y^2 (b) ∃ x such that ∀y, xy < y^2
8. Which of the following statements are true? Explain.
(a) ∃ a married person x such that ∀ married people y, x is married to y.
(b) ∀ married people x, ∃ a married person y such that x is married to y.
9. Prove or disprove:
(a) For every two points A and B in the plane, there exists a circle on which both A and B lie.
(b) There exists a circle in the plane on which lie any two points A and B.
10. Consider the following proposition (you do not have to know what is meant by a field).
All non-zero elements x in a field F have an inverse: some y ∈ F for which xy = 1.
(a) State what it means for a sequence ( xn ) not to diverge to ∞. Beware of the hidden quantifier!
(b) State what it means for a sequence ( xn ) not to converge to L.
(c) State what it means for a sequence ( xn ) not to converge at all.
(d) (Challenge: non-examinable) Use the definitions to prove that the sequence defined by
xn = n diverges to ∞, and that the sequence defined by yn = 1/n converges to zero.
2.3 Methods of Proof
The previous sections covered some of the language of foundational logic. While one can study this
more deeply, we focus on putting it to work in the service of mathematics. The real work begins now.
A mathematical theorem is8 a justified assertion of the truth of an implication P =⇒ Q. A proof is
any logical argument justifying the theorem. The first step in analyzing or strategizing a proof is to
identify the hypothesis P and conclusion Q.
There are four standard methods of proof; in practice longer arguments combine several of these.
Direct Assume the hypothesis P and deduce the conclusion Q.9 This structure should be intuitive,
though it may help to revisit the truth table in Definition 2.5 and the tautology of Example 2.6.2.
Contrapositive Directly prove the contrapositive ¬ Q =⇒ ¬ P (logically equivalent to P =⇒ Q by
Theorem 2.9).
Contradiction Assume P ∧ ¬ Q and deduce a contradiction (directly prove ( P ∧ ¬ Q) ⇒ F). Theorem
2.11 and Exercise 2.1.8c show that P =⇒ Q is true. If P( x ) =⇒ Q( x ) has a hidden universal
quantifier, the negation means we start by assuming ∃ x, P( x ) ∧ ¬ Q( x ) (Exercise 2.2.12).
Induction This has a completely different flavor; we will consider it in Chapter 5.
Each method has advantages and disadvantages: direct proofs typically have the simplest logical
flow; contrapositive/contradiction approaches are useful when the negations ¬ P, ¬ Q are easier to
work with than P, Q themselves. All methods are equally valid, and, as we’ll see shortly, one can
often prove a simple theorem using each of the first three approaches!
As you work through this section, pay special attention to the logical structure—to encourage this,
the mathematical level is very low. Refer to the previous sections if the logical terminology feels
unfamiliar. Now is also a good time to re-read Planning and Writing Proofs (page 4).
Direct Proofs
We begin by generalizing Example 2.19.3.
To make sense of this, we first need to identify the logical structure by writing the theorem in terms
of propositions and connectives. One way is to view the Theorem in the form P =⇒ Q:
• P( x, y) is “x and y are both odd.” This is our assumption, the hypothesis.
• Q( x, y) is “The product xy is odd.” This is what we wish to demonstrate, the conclusion.
• Both propositional functions are statements about integers. The Theorem is universal (“any
pair”), and so contains a (hidden) quantifier ∀ x, y ∈ Z.
We also need a clear understanding of the meaning of all necessary terms. To keep things simple,
we’ll treat integer and product as understood and be explicit only as to the meaning of oddness.
8 It might be awkward to fit a theorem into this format but it can always be done. Often all that is stated is the conclusion
Q, in which case P would be the assertion “All mathematics we already know/assume to be true.”
9 To assume a proposition is to suppose its truth. To suppose P is false, we “assume/suppose ¬ P.”
A direct proof can be viewed as a proof sandwich whose bread slices are the hypothesis and conclusion
(P and Q): write these down as a first step. Next define any useful terms in the hypothesis. All
that remains is to perform a simple calculation!
Observe how we wrote xy in the form 2(integer)+1 so as to make the conclusion absolutely clear.
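The calculation behind this observation can be spot-checked by machine. The sketch below assumes the usual labelling x = 2k + 1 and y = 2l + 1 for two odd integers (the names k, l and the helper function are introduced here for illustration) and confirms that xy = 2(2kl + k + l) + 1 on a small range. A finite check is not a proof; it only confirms that the algebra was done correctly.

```python
def odd_product_witness(k, l):
    """Return the integer m for which (2k + 1)(2l + 1) = 2m + 1."""
    return 2 * k * l + k + l

# Spot-check the identity for a range of odd integers x = 2k + 1, y = 2l + 1.
for k in range(-10, 11):
    for l in range(-10, 11):
        x, y = 2 * k + 1, 2 * l + 1
        assert x * y == 2 * odd_product_witness(k, l) + 1
```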
Insufficient Generality Before leaving this example, it is worth highlighting the most common
mistake seen in such arguments.
This is an example of the theorem. Since the theorem is universal, a single example does not constitute
a proof (recall, however, that an example proves an existential statement: Definition 2.18).
This only verifies the special case where both odd integers are equal: it proves x odd =⇒ x^2 odd.
There is nothing wrong with trying out examples or sketching incomplete thoughts—indeed both
are encouraged!—but you need to be aware of when your argument isn’t sufficiently general.
For another simple direct proof, consider the sum of two consecutive integers.
The theorem is again a universal claim (“any”) of the form P =⇒ Q about two integers x, y:
• P( x, y) is “x, y are consecutive integers.”
• Q( x, y) is “x + y is odd.”
The trick is to observe that, being consecutive, we may write one integer in terms of the other. The
proof sandwich is still visible, though it would be hard to write down the last sentence without
already having settled on the trick, which is essentially the definition of “consecutive integers.”
Proof. Suppose we are given two consecutive integers. Label the smaller of these x and the other
x + 1. Their sum is then
x + ( x + 1) = 2x + 1
which is odd.
Proof by Contrapositive
Here is another straightforward result about odd and even integers.
Theorem 2.26. If the sum of two integers is odd, then they have opposite parity.
Parity means evenness or oddness: the conclusion is that one of the integers is even and the other odd.
Naı̈vely attempting a direct proof produces an immediate difficulty:
We want to conclude something about x and y separately, but the direct approach lumps them together
in the same algebraic expression.
A contrapositive approach (¬ Q =⇒ ¬ P) suggests itself as a remedy, since the new hypothesis ¬ Q,
by treating x and y separately, gives us twice as much to start with.10
Proof. Suppose x and y have the same parity. There are two cases. (state hypothesis ¬ Q)
Again observe the proof sandwich and how the argument depends on little more than the definitions
of even and odd.
When presenting a lengthier contrapositive argument, consider orienting the reader by starting with
the phrase, “We prove the contrapositive.” For simple proofs (like the above) this is unnecessary,
since the logical structure should be clear without such assistance. It is also unnecessary to define
and spell out the propositions P and Q or include any of the bracketed commentary. However, feel
free to continue this practice if you think it aids your explanation, or if you are nervous about your
proof skills.11
10 Warning! The contrapositive is still a universal statement: ∀ x, y ∈ Z, ¬ Q ( x, y ) =⇒ ¬ P ( x, y ). We are not negating the
For another example of a contrapositive argument, we extend the first result of this section.
Theorem 2.27. The product of two integers is odd if and only if both integers are odd.
Proof. (⇒) We prove the contrapositive. Let x, y be integers, at least one of which is even. Suppose,
without loss of generality, that x = 2k is even. Then xy = 2ky is also even.
Note de Morgan’s law: ¬( x odd and y odd) is equivalent to “x even or y even” (“at least one”).
The common phrase without loss of generality, often abbreviated WLOG, saves us from performing
a second, almost identical, argument assuming y = 2l is even. WLOG is stated when one makes a
choice which does not materially affect the argument.
Proof by Contradiction
Here is a simple result considered in several ways.12
Remember to write contradiction at the end so the reader knows what you’ve done!
A nice side-effect of this approach is that it suggests an alternative direct proof.
Direct Proof. For any integer x, (†) says that 3x + 5 and 5x + 2, in summing to an odd
number, have opposite parity.
The last argument in fact proves that 3x + 5 is even if and only if 5x + 2 is odd; the converse of the
original claim comes for free! Revisiting (∗), you should believe that all the arrows are reversible.
12 From now on we’ll reserve ‘Theorem’ for results that are worth remembering in their own right.
Such variety is one of the things that makes proving theorems fun! While the choice of proof method
is largely a matter of personal taste, remember your audience. Our final direct argument is very slick
but risks confusing an elementary reader rather than empowering them.13
Three Proofs of the Same Result We finish this section with three proofs of the same result. All are
based on the same factorization of a polynomial
x^3 + 4x^2 − 2x − 20 = (x − 2)(x^2 + 6x + 10) = (x − 2)[(x + 3)^2 + 1]
and the well-known fact that ab = 0 ⇐⇒ a = 0 or b = 0 (see Exercise 14). Since the mathematics is
so simple, pay attention to and compare the logical structures—which do you prefer?
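Since all three proofs rest on the factorization, it is worth confirming it independently. The following sketch multiplies polynomials stored as coefficient lists (lowest degree first); the helper poly_mul and the variable names are assumptions made for this illustration.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result

cubic = [-20, -2, 4, 1]      # x^3 + 4x^2 - 2x - 20
linear = [-2, 1]             # x - 2
quadratic = [10, 6, 1]       # x^2 + 6x + 10

# (x + 3)^2 + 1, built by squaring x + 3 and adding 1 to the constant term
completed = poly_mul([3, 1], [3, 1])
completed[0] += 1

assert poly_mul(linear, quadratic) == cubic
assert completed == quadratic
```

The second assertion is the reason the quadratic factor has no real roots: (x + 3)^2 + 1 is at least 1 for every real x.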
Exercises 2.3. A self-test quiz and several worked questions can be found online.
(a) There is an even integer which can be expressed as the sum of three even integers.
(b) Every even integer can be expressed as the sum of three even integers.
(c) There is an odd integer which can be expressed as the sum of two odd integers.
(d) Every odd integer can be expressed as the sum of three odd integers.
2. For any given integers a, b, c, if a is even and b is odd, prove that 7a − ab + 12c + b^2 + 4 is odd.
a tome of perfect proofs. As with all matters spiritual, one person’s Book is likely very different to another’s. . .
5. Consider the following proposition, where x is assumed to be a real number.
x^3 − 3x^2 − 2x + 6 = 0 =⇒ x = 3
(a) Is the proposition true or false? Justify your answer. Is its converse true?
(b) Repeat part (a) for the proposition x^3 − 3x^2 − 2x + 6 = 0 =⇒ x ≠ 3.
6. Below is the proof of a result. What result is being proved?
Proof. Assume, without loss of generality, that x = 2a and y = 2b are both even. Then
8. Consider the following proof of the fact that (for m an integer) if m^2 is even, then m is even. Can
you re-write the proof so that it doesn’t use contradiction?
Proof. Suppose that m^2 is even and m is odd. Write m = 2k + 1 for some integer k. Then
is odd. Contradiction.
9. Here is a ‘proof’ that every real number x equals zero. Find the mistake.
x = y =⇒ x^2 = xy =⇒ x^2 − y^2 = xy − y^2
=⇒ (x − y)(x + y) = (x − y)y
=⇒ x + y = y
=⇒ x = 0
2.4 Further Proofs & Strategies
The arguments in this section are slightly trickier and more representative of typical mathematics.
Some of these results are indeed quite famous and worth knowing in their own right. We will also
introduce lemmas and corollaries which are used to break up the presentation of complex results.
For a more involved example of a universal result, here is a famous inequality relating the arithmetic
and geometric means of two numbers.
Concentrate on the first since it is simpler. The hypothesis (x, y ≥ 0) doesn’t give us much to work
with, so it seems sensible to play with the inequality and try to eliminate the ugly square-root:
(x + y)/2 ≥ √(xy) =⇒ (x + y)^2 ≥ 4xy =⇒ x^2 − 2xy + y^2 ≥ 0 =⇒ (x − y)^2 ≥ 0
Now we have something believable! The question is whether we can reverse the arrows. Only the
first should give you any pause; it is here that we use the non-negativity of x, y.
Proof. Suppose x, y ≥ 0. Multiply out a trivial inequality:
The scratch work really helped us figure out how and where to apply the hypothesis. Notice also
how the second result came almost for free! Result 1 only needed the ⇒ direction in the proof, but
the second result used the fact that all arrows are biconditionals.
For variety, here is a contradiction proof incorporating the same calculations in a different order.
Contradiction Proof. Let x, y ≥ 0 and suppose that (x + y)/2 < √(xy). Since x + y ≥ 0, the second inequality
holds if and only if (x + y)^2 < 4xy. Now multiply out and rearrange:
The AM–GM inequality in fact holds for any finite collection of non-negative numbers x1 , . . . , xn :
(x1 + x2 + · · · + xn)/n ≥ (x1 x2 · · · xn)^(1/n)
with equality if and only if all xi are equal. Proving this is a lot harder (see Exercise 15).
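Before attempting a proof of the general inequality, a numerical experiment can build confidence in the claim. The sketch below tests random non-negative inputs; the sample sizes, value range and tolerance are arbitrary choices made for illustration, and passing such a test is evidence rather than proof.

```python
import math
import random

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    """n-th root of the product of n non-negative numbers."""
    return math.prod(xs) ** (1 / len(xs))

random.seed(0)  # fixed seed so the experiment is repeatable
for _ in range(1000):
    n = random.randint(1, 6)
    xs = [random.uniform(0, 10) for _ in range(n)]
    # The small tolerance absorbs floating-point rounding error.
    assert arithmetic_mean(xs) >= geometric_mean(xs) - 1e-9
```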
Viewed in this way as an implication, any of our proof strategies might be applicable to a non-
existence proof. Contradiction and contrapositive arguments are particularly common however,
since the right hand side (¬ Q) is already a negative statement.
Example 2.32. We prove that the equation x^17 + 12x^3 + 13x + 3 = 0 has no positive solutions.
Before seeing a proof, consider several ways in which this claim could be presented.
Non-existence (¬(∃ x, Q(x))) There are no x > 0 for which x^17 + 12x^3 + 13x + 3 = 0.
Universal (∀ x, ¬Q(x)) For all x > 0, we have x^17 + 12x^3 + 13x + 3 ≠ 0.
Direct (P ⇒ ¬Q) If x > 0, then x^17 + 12x^3 + 13x + 3 ≠ 0.
Contrapositive (Q ⇒ ¬P) If x^17 + 12x^3 + 13x + 3 = 0, then x ≤ 0.
Contradiction (P ∧ Q false) x > 0 and x^17 + 12x^3 + 13x + 3 = 0 is impossible.
We present two similar arguments based on the direct and contradiction structures.
Direct proof. Suppose that x > 0. Then x^17 + 12x^3 + 13x + 3 > 0 since all terms are positive.
We conclude that x^17 + 12x^3 + 13x + 3 ≠ 0.
Contradiction proof. Assume that x > 0 satisfies x^17 + 12x^3 + 13x + 3 = 0. Since all terms
on the left hand side are positive, we have a contradiction.
Both arguments were very easy, with all difficulty coming from understanding why they work: precisely
the above discussion! Learning/practicing how to recognize/translate a claim into an actionable
format is the essential skill here, both for the presenter and the reader.
Lemma: A theorem whose importance you want to downplay or which will later be used
to help prove a more significant result.
Corollary: A theorem which follows quickly from a previous result, either as a special case
or by modifying the proof in a straightforward way.
Presentation style varies: some authors and journals reserve theorem for only the most important
results, with everything else presented as a lemma or corollary; others never use these terms or just
call everything a proposition! Regardless, lemmas and corollaries are useful to have in your toolkit if
readability is your goal.
Here is a very simple result in preparation for a much more important upcoming theorem.
You should be able to prove this yourself, since the lemma is just a special case of Theorem 2.27. If
you are completely unsure how to start, revisit that result and the rest of Section 2.3.
Irrational Numbers
Since their definition is inherently negative, irrational numbers provide good examples of
non-existence/contradiction arguments. They are also interesting in their own right.
Definition 2.34. A real number x is said to be rational if it may be written in the form x = m/n for some
integers m, n. A real number is irrational if no such integers exist.
You likely know of a few irrational numbers (√2, π, e), but how do we prove that a given number is
irrational? Our next result is very famous, with versions dating back at least to Aristotle (c. 340 BCE).
Theorem 2.35. √2 is irrational.
We must disprove the existence claim ∃m, n ∈ Z, √2 = m/n. As before, consider several restatements:
Non-existence There are no integers m, n for which √2 = m/n.
Universal For all integers m, n, we have √2 ≠ m/n.
Direct If m, n ∈ Z, then √2 ≠ m/n.
Contrapositive If √2 = m/n, then m, n are not both integers.
Contradiction m, n ∈ Z and √2 = m/n is impossible.
Are the drawbacks of a direct or contrapositive approach obvious? We prove by contradiction. To
improve readability, we outsource a repeated step to the (⇒) direction of Lemma 2.33.
Proof. Suppose m, n ∈ Z and that √2 = m/n. Without loss of generality, assume m, n have no common
factors. Cross-multiply and square:
Just as we can simplify 4/6 = 2/3, the no common factors assumption is without loss of generality: it costs
nothing once we suppose √2 = m/n is rational. This last is what we contradicted! A (wrong) belief that
the no common factors assumption was contradicted means the calculation continues forever!
m^2 = 2n^2 =⇒ n^2 = 2k^2 =⇒ k^2 = 2l^2 =⇒ · · ·
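Theorem 2.35 amounts to the claim that m^2 = 2n^2 has no solutions in positive integers. A finite search can never prove this, but it does illustrate what the theorem predicts. The sketch below is an illustration only: the search bound 10000 and the helper is_square are assumptions made here.

```python
import math

def is_square(k):
    """True when the non-negative integer k is a perfect square."""
    r = math.isqrt(k)
    return r * r == k

# Look for denominators n for which 2n^2 is a perfect square m^2.
solutions = [n for n in range(1, 10001) if is_square(2 * n * n)]
print(solutions)  # the theorem predicts an empty list
```

Integer arithmetic is used throughout, so there is no risk of floating-point error producing a false "solution."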
The irrationality of various surds (√3, ∛2, etc.) can be proved similarly (π and e are much harder).
We may also apply the theorem to demonstrate the irrationality of many other numbers.
Example 2.36. Suppose √2 − 5√3 = x were rational: ∃m, n ∈ Z such that x = m/n. Then
75 = (5√3)^2 = (√2 − x)^2 = 2 + x^2 − 2√2 x =⇒ √2 = (x^2 − 73)/(2x) = (m^2 − 73n^2)/(2mn)
Otherwise said, √2 is rational: contradiction.
Non-constructive Existence Proofs
Every existence proof we’ve thus far seen has been constructive: we’ve exhibited/constructed an
explicit example x for which Q(x) is true. Sometimes this asks too much. Indeed it is often far
easier to show the existence of something without explicitly stating what it is. We present two famous
examples of this situation.
The proof is very sneaky: it does not provide an explicit example and does not answer the begged
question of whether (√2)^√2 is rational or irrational! In fact this number is irrational, though
demonstrating such is massively harder.15
We finish with a particularly famous example of a non-constructive existence proof. This argument
dates back to Euclid’s Elements (300 BCE), the most influential textbook in mathematical history. As
ever, we need a solid definition before trying to prove anything.
Definition 2.38. An integer ≥ 2 is prime if the only positive integers it is divisible by are itself and 1.
The first few primes are 2, 3, 5, 7, 11, 13, 17, 19, . . . It follows, though it is not completely obvious, that
every integer ≥ 2 is either prime or a product of primes (composite). In particular, every integer ≥ 2
is divisible by at least one prime. We now state Euclid’s result, and prove it by contradiction.
Theorem 2.39 (Elements, Book IX, Prop. 20). There are infinitely many prime numbers.
Proof. Assume there are exactly n primes p1, . . . , pn and define the integer
Π := p1 · · · pn + 1
Certainly Π is divisible by some prime pi (in our list by assumption!), as is the product p1 · · · pn. But
then the difference
1 = Π − p1 · · · pn
15 If you’re interested, look up the Gelfond–Schneider Theorem (1934), Hilbert’s Seventh Problem, and what they say
about algebraic and transcendental numbers. Such ideas are far beyond the level of this text.
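The construction in the proof of Theorem 2.39 can be explored numerically. The sketch below takes the first six primes as a hypothetical "complete list," forms the product plus one, and finds its smallest prime factor by trial division. The helper name and the choice of list are assumptions made for illustration.

```python
def smallest_prime_factor(n):
    """Smallest prime factor of an integer n >= 2, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

primes = [2, 3, 5, 7, 11, 13]   # pretend this list is complete
product = 1
for p in primes:
    product *= p
euclid_number = product + 1     # 30031

q = smallest_prime_factor(euclid_number)
# Every listed prime leaves remainder 1, so none of them divides the number.
assert all(euclid_number % p == 1 for p in primes)
# Its smallest prime factor is therefore a prime missing from the list.
assert q not in primes
print(euclid_number, q)         # 30031 59
```

Note that 30031 = 59 · 509 is not itself prime: the argument only guarantees a prime factor outside the list, which is why the proof is non-constructive in spirit.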
Exercises 2.4. A self-test quiz and worked questions can be found online.
1. Prove or disprove:
(a) There exist integers m and n such that 2m − 3n = 15.
(b) There exist integers m and n such that 6m − 3n = 11.
2. Prove or disprove: There exists a line L in the plane such that, for all points A, B in the plane,
we have that A, B lie on L.
3. Prove: For every positive integer n, the integer n^2 + n + 3 is odd and greater than or equal to 5.
4. Let p be an odd integer. Prove that the equation x^2 − x − p = 0 has no integer solutions.
5. (Example 2.30, cont.) Prove or disprove: ∃y < 0 such that ∀ x ≥ 0, x^3 < y^2.
6. Prove or disprove: √2 − √7 is rational.
7. Prove or disprove the following conjectures about real numbers x, y.
(a) If 3x + 5y is irrational, then at least one of x and y is irrational.
(Be careful! This isn’t a logical ‘and’: what happens when you negate?)
(b) If x and y are rational, then 3x + 4xy + 2y is rational.
(c) If x and y are irrational, then 3x + 4xy + 2y is irrational.
8. Prove by contradiction: if x and y are positive real numbers, then √(x + y) ≠ √x + √y.
How would you change things to make a contrapositive argument?
9. Prove that between any two distinct rational numbers there exists another rational number.
10. Consider the proposition:
For any non-zero rational r and any irrational number t, the number rt is irrational.
(a) Translate this statement into logic using quantifiers and propositional functions.
(b) Prove the statement.
13. The real numbers satisfy the Archimedean property:
For any x, y > 0, there exists a positive integer n such that nx > y.
(a) Use the Archimedean property to show that there are no positive real numbers which are
less than 1/n for all positive integers n.
(b) Consider the following ‘proof’ of the fact that every real number is less than some positive
integer:
Proof. Consider a real number x. For example, x = 19.7. Then x < 20 and 20 is a positive
integer.
(a) What is the largest possible value of xyz and when does it occur?
(b) (Hard) Prove that (1 − x )(1 − y)(1 − z) ≥ 8xyz.
(a) When n = 3, try mimicking our earlier approach by cubing the desired inequality. Why
does this seem unwise?
(b) Prove that x ≤ e^(x−1) for all real numbers x, with equality if and only if x = 1.
(Hint: Consider f(x) = e^(x−1) − x and apply a derivative test from calculus)
(c) Let µ = (x1 + x2 + · · · + xn)/n be the arithmetic mean. Apply part (b) to each expression x = xi/µ to
conclude that x1 · · · xn ≤ µ^n and hence complete the proof.
3 Divisibility and the Euclidean Algorithm
This chapter introduces congruence, which generalizes the idea of integer parity (evenness/oddness).
This is of fundamental importance to the sub-disciplines of number theory and abstract algebra,
providing some of the most straightforward examples of groups and rings. We will cover the basics
in this section, returning in Chapter 7 for more formal observations.
Definition 3.1. Let m, n be integers. The proposition n | m is read “n divides m,” and means
∃k ∈ Z such that m = kn
We could also say that “n is a divisor of m,” that “m is divisible by n,” or that “m is a multiple of n.”
Examples 3.2. 1. We write 4 | 20 since 20 = 4 × 5. The same equation also says that 5 | 20.
2. The proposition 9 ∤ 7 is read “9 does not divide 7.” It is shorthand for ¬(9 | 7).
When integers do not divide, there is a remainder left over. Your study of remainders likely goes back
to elementary school when you first learned division: for instance,16
Theorem 3.3 (Division Algorithm). Suppose m, n ∈ Z with n positive. Then there exist unique
integers q, r (the quotient and remainder) for which
m = qn + r and 0 ≤ r < n
An algorithm is typically a computational process: if m > 0 one could view this as the repeated
subtraction of n from m until the result r = m − qn satisfies 0 ≤ r < n. A rigorous proof requires
foundational ideas related to induction to which we will return in Chapter 5. For our current
purposes, we just need to know that remainders exist. Indeed our next step is to find a way to compare
remainders without explicitly invoking the division algorithm.
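The repeated-subtraction view of the division algorithm can be written out directly. This is a minimal sketch rather than an efficient implementation; the function name divide and the extra loop that handles negative m by repeated addition are assumptions added here for completeness.

```python
def divide(m, n):
    """Return (q, r) with m = q*n + r and 0 <= r < n, for an integer m and n > 0."""
    if n <= 0:
        raise ValueError("n must be positive")
    q, r = 0, m
    while r >= n:     # remainder too large: subtract n once more
        r -= n
        q += 1
    while r < 0:      # negative m: add n until the remainder is non-negative
        r += n
        q -= 1
    return q, r

print(divide(33, 5))   # (6, 3): 33 = 6*5 + 3
print(divide(-7, 3))   # (-3, 2): -7 = (-3)*3 + 2
```

For positive n, Python's built-in divmod(m, n) returns the same pair, since it also chooses a remainder in the range 0 ≤ r < n.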
16 The common meaning of divide is to apportion a quantity equally. Thus to divide 33 apples between 5 people, each
person gets 6 apples and 3 are left over. In grade school mathematics, fractions come much later.
Definition 3.5. Let a, b and n be integers with n positive. The proposition
a ≡ b (mod n)
means that a and b have the same remainder on division by n. The integer n is called the modulus.
1. We write 7 ≡ 13 (mod 3), since 7 = 2 · 3 + 1 and 13 = 4 · 3 + 1 have the same remainder (r = 1).
2. We write 6 ≢ 17 (mod 3), since 6 = 2 · 3 + 0 and 17 = 5 · 3 + 2 have different remainders.
Calculating using the division algorithm is tedious. Our next result is crucial in that it permits the
direct comparison of remainders. This can be treated as an equivalent definition of congruence.
Proof. The second biconditional is nothing more than an application of Definition 3.1:
n | (b − a) ⇐⇒ ∃k ∈ Z such that b − a = kn
⇐⇒ b = a + kn for some integer k
Before presenting direct arguments for each direction of the first biconditional, it is helpful to
introduce notation from the division algorithm:
a = q1 n + r1 b = q2 n + r2 0 ≤ r1 , r2 < n
=⇒ b − a = (q2 − q1 )n + (r2 − r1 ) (∗)
(⇒) If a ≡ b (mod n), then a, b have the same remainder r1 = r2 . But then (∗) says that n | (b − a).
(⇐) Assume that n | (b − a) so that b − a = kn for some integer k. By (∗), we see that
r2 − r1 = (b − a) − (q2 − q1)n = (k − q2 + q1)n
is a multiple of n. Since 0 ≤ r1, r2 < n, we also have
−n < r2 − r1 < n
The only possibility is r2 − r1 = 0. Otherwise said, a, b have the same remainder: a ≡ b (mod n).
If you’re having trouble with the last step, think about an example! Suppose n = 26 and write
x = r2 − r1 . Hopefully you believe that x = 0 is the only integer satisfying the two conditions,
x is divisible by 26 and − 26 < x < 26
Since the result is abstract, it is good to recap the relationship between congruence and divisibility.
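The equivalence between "same remainder" and "n divides b − a" can also be confirmed by exhaustive checking on a small range. The sketch below is an illustration under stated assumptions: the function names and the ranges tested are choices made here, and the check supplements the proof rather than replacing it.

```python
def same_remainder(a, b, n):
    """a ≡ b (mod n) in the sense of the definition: equal remainders."""
    # For n > 0, Python's % returns the remainder r with 0 <= r < n.
    return a % n == b % n

def divides(n, m):
    """n | m: m is an integer multiple of n."""
    return m % n == 0

for n in range(1, 13):
    for a in range(-30, 31):
        for b in range(-30, 31):
            assert same_remainder(a, b, n) == divides(n, b - a)
```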
Examples 3.8. 1. We describe all integers x which are congruent to 7 on division by 11:
2. To get more of a feel for the notation, consider the following conjectures:
Conjecture (a) is true. If a ≡ 8 (mod 6), then a = 6k + 8 for some integer k, from which
a = 6k + 8 = 3(2k + 2) + 2 =⇒ a ≡ 2 (mod 3)
Modular Arithmetic
Remainders have a natural arithmetic similar to that of the real numbers. We use the same symbols,
with even the congruence symbol ≡ looking a bit like an equals sign!17 Modular arithmetic has many
applications, particularly to data security, with cell-phones and computers performing countless such
calculations daily. Here are the basic rules, generalizing most of what we saw in Section 2.3.
Theorem 3.9. Suppose a, b, c, d are integers and that n is some modulus. Then,
2. The usual associative, commutative and distributive laws of arithmetic hold for congruences:
The theorem says that the operations ‘take the remainder’ and ‘add/subtract/multiply’ can be
performed in any order or combination: the result will be the same.
Warning! Division does not work so nicely in modular arithmetic (see Example 3.13).
Example 3.10. Find the remainder when 29 + 14 is divided by 6. We do this in two ways:
(a) First find the sum 43, then compute its remainder: 43 ≡ 1 (mod 6) since 6 | (43 − 1).
(b) Alternatively, we could find the remainder of each component and then add:
29 + 14 ≡ 5 + 2 ≡ 7 ≡ 1 (mod 6)
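Both routes in this example are easy to mirror in code. In the sketch below the two function names are introduced for illustration; Python's % operator plays the role of "take the remainder" for a positive modulus.

```python
def add_then_reduce(a, b, n):
    """Method (a): add first, then take the remainder."""
    return (a + b) % n

def reduce_then_add(a, b, n):
    """Method (b): take remainders first, add, then reduce once more."""
    return ((a % n) + (b % n)) % n

print(add_then_reduce(29, 14, 6), reduce_then_add(29, 14, 6))  # 1 1
```

The final reduction in method (b) is needed because the sum of two remainders (here 5 + 2 = 7) can exceed the modulus.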
17 This is no accident. In Chapter 7 we’ll see that congruence is an important example of an equivalence relation: a general-
ized notion of equality. Indeed, two integers are congruent if and only if something about them is equal: their remainders!
Proof. 1. We prove the multiplication rule. Suppose that a ≡ c and b ≡ d. By Theorem 3.7, we
have c = a + kn and d = b + ln for some integers k, l. Now compute:
2. The associative, commutative and distributive laws hold because x = y =⇒ x ≡ y (mod n),
regardless of n (equal numbers have the same remainder!).
The ability to take remainders before adding and multiplying is very powerful, allowing us rapidly
to perform some surprising calculations.
This is more promising, for we can use it to simplify the original expression:
37^423 ≡ (−3) · (−3) · · · (−3) [423 times] ≡ ((−3)^2)^211 (−3) ≡ (−1)^211 (−3) ≡ 3 (mod 10)
7^9 + 14^3 ≡ 1^9 + 2^3 ≡ 1 + 8 ≡ 9 ≡ 3 (mod 6)
It would have been madness to compute 7^9 + 14^3 = 40356351 before finding the remainder!
• Replace each integer by something small with the same remainder: 37 ≡ −3 (mod 10) is more
helpful than 37 ≡ 7 (mod 10), since powers of −3 are much easier to work with.
• The base of an exponential can be reduced, but not the exponent: 17^23 ≡ 3^23 (mod 7) is correct,
but 3^23 ≢ 3^2 (mod 7). Exponentiation is merely shorthand for repeated multiplication.
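These hand calculations can be checked with Python's built-in three-argument pow, which reduces modulo n as it multiplies and so never forms the enormous intermediate power. The modulus 6 in the third line is an assumption read off from the calculation 7 ≡ 1, 14 ≡ 2 and 9 ≡ 3.

```python
# Units digit of 37^423: reduce modulo 10 throughout.
print(pow(37, 423, 10))          # 3
# Replacing the base 37 by -3 gives the same remainder.
print(pow(-3, 423, 10))          # 3
# Remainder of 7^9 + 14^3 on division by 6 (assumed modulus).
print((7**9 + 14**3) % 6)        # 3
# Reducing the base is legitimate: 17 ≡ 3 (mod 7).
print(pow(17, 23, 7), pow(3, 23, 7))   # 5 5
# Reducing the exponent modulo 7 is not.
print(pow(3, 23, 7), pow(3, 2, 7))     # 5 2
```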
18 Using logarithms, a pocket calculator will tell you that 37^423 ≈ 2.2 × 10^663 has 664 digits! This is no help since what
we want is the units digit, not its largest few significant figures.
Application: On what day were you born?
While we all know our date of birth, most of us do not know on which day of the week we were born.
You can answer this question quite easily (perhaps in your head!) using modular arithmetic.
• Since 365 ≡ 1 (mod 7), a standard year advances the calendar one weekday.
• Each leap year19 advances the calendar an additional day.
Can you figure out the weekday of today’s date in your year of birth? Thinking about the length of each
month modulo 7, you should also be able to find your birthday.
Example 3.12. Paul Revere was born January 1st, 1735, in Boston. Given that January 1st, 2024 was a
Monday, find the weekday of Revere’s birth.
The dates differ by 289 years, in which time there have been 288/4 − 2 = 70 leap years (not 1800 and
1900). The calendar has therefore advanced 289 + 70 ≡ 2 (mod 7) weekdays: Revere was born on a Saturday.
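The count in this example can be reproduced step by step and compared with a library answer. The sketch below is a cross-check under stated assumptions: the helper count_leap_years and the weekday list are introduced here, and Python's datetime module extends the Gregorian calendar backwards to all dates, matching the convention used in the example.

```python
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def count_leap_years(start_year, end_year):
    """Number of Gregorian leap years y with start_year <= y < end_year."""
    return sum(1 for y in range(start_year, end_year)
               if y % 4 == 0 and (y % 100 != 0 or y % 400 == 0))

years = 2024 - 1735                      # 289
leaps = count_leap_years(1735, 2024)     # 70
shift = (years + leaps) % 7              # weekdays advanced, modulo 7

# Step back from Monday (index 0) by the shift.
by_hand = WEEKDAYS[(0 - shift) % 7]
by_library = WEEKDAYS[date(1735, 1, 1).weekday()]
print(by_hand, by_library)               # Saturday Saturday
```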
Examples 3.13. 1. Even when there is a common factor, dividing both sides is perilous. For instance
We also divided the modulus! If we hadn’t done so, the result would be false: 21 ≢ 6 (mod 10).
2. Congruence equations are harder to solve than standard equations. For instance, we cannot
attack 2x ≡ 7 (mod 9) by division: x ≡ 7/2 is meaningless since 7/2 is not an integer!
It won’t always work, but in this case a sneaky multiplication by 5 solves the problem:
2x ≡ 7 =⇒ 10x ≡ 35 =⇒ x ≡ 8 (mod 9)
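Because there are only finitely many remainders modulo n, a congruence equation can always be solved by trying each one. The sketch below does exactly that; the function name is an assumption made for illustration, and exhaustive search is practical only for small moduli.

```python
def solve_linear_congruence(a, b, n):
    """All x in 0..n-1 with a*x ≡ b (mod n), found by exhaustive search."""
    return [x for x in range(n) if (a * x - b) % n == 0]

print(solve_linear_congruence(2, 7, 9))   # [8]
print(solve_linear_congruence(2, 1, 4))   # []: 2x is never odd
```

The second call shows that a congruence equation may have no solutions at all, another way in which it differs from an ordinary linear equation.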
Exercises 3.1. A reading quiz and several questions with linked video solutions can be found online.
4. Use the division algorithm to prove that if p is an odd prime, then p ≡ 1 or p ≡ 3 (mod 4).
19 In the Gregorian calendar (the de facto worldwide standard, introduced in 1582), leap years occur in centuries
divisible by 400 and every non-century divisible by 4: for instance, 2000 was a leap year but 1900 was not.
5. Prove the first part of Theorem 3.9: if a ≡ c and b ≡ d, then a + b ≡ c + d (mod n).
6. Find a positive integer n and integers a, b such that a^2 ≡ b^2 (mod n) but a ≢ b (mod n).
7. Check explicitly that 3^23 ≢ 3^2 (mod 7). (Hint: 3^3 = 27 . . .)
8. Compute the following remainders—use a calculator to help!
(a) 12^9 + 19^24 on division by 10.
(b) 30^10 on division by 13.
(c) 17^251 · 23^12 − 19^41 on division by 5. (Hint: 17 ≡ 2 and 2^2 ≡ −1 (mod 5))
(d) (Hard) 12^10 + 2^36 · 18^12 on division by 141. (Hint: what nice number is close to 141?)
9. Prove that 3 | (4^n − 1) for all positive integers n.
10. Let n be an integer. Prove that exactly one of n, n + 2 and n + 4 is divisible by 3.
11. (a) Let n be a positive integer. Prove that n is congruent to the sum of its digits modulo 9.
(Hint: e.g. 345 = 3 · 10^2 + · · · )
(b) Is the integer 123456789 divisible by 9?
12. Describe all integers x which satisfy the congruence equations:
(a) 3x ≡ 2 (mod 8) (b) 7x ≡ 28 (mod 42).
13. Abraham Lincoln was born February 12th , 1809. On what day of the week was this?
(Start by looking up the day for February 12th this year)
14. Let n be an integer.
(a) Prove: n^2 ≡ 0 or 1 (mod 3). (Hint: prove by cases)
(b) Prove: n^2 ≡ 0 or 1 (mod 4).
(c) Find all possible remainders of n^2 on division by 7.
(d) Find all possible remainders of n^3 on division by 7.
15. Use some part(s) of Exercise 14 to prove the following.
(a) √(4m + 6) is not an integer, for any integer m ≥ −1.
(b) Any number which is simultaneously a square and a cube must be of the form 7k or 7k + 1
for some integer k.
16. Let n be an integer ≥ 2 and consider numbers of the form 11⋯11 (the digit 1 written n times)
3.2 Greatest Common Divisors and the Euclidean Algorithm
A basic goal of number theorists is to find integer solutions to equations. For instance:
Are there any integer points on the line with equation 9x − 21y = 6? That is, does the
equation 9x − 21y = 6 have any solutions, where x, y are both integers?
You might start by sketching the line (graph paper will help). What do you observe? If there are
integer points, do they seem to follow any pattern?
In this section we will see how to solve all such linear Diophantine equations.20 The method introduces
a famous procedure dating at least to Euclid’s Elements (c. 300 BCE), and an important concept.
Definition 3.14. Let a and b be integers, not both zero. Their greatest common divisor gcd( a, b) is the
largest (positive) divisor of both a and b. We say that a and b are relatively prime if gcd( a, b) = 1.
Example 3.15. The positive divisors of a = 60 and b = 90 are listed in the table. The greatest common
divisor gcd(60, 90) = 30 is plainly the largest number common to both rows.
a 1 2 3 4 5 6 10 12 15 20 30 60
b 1 2 3 5 6 9 10 15 18 30 45 90
For large integers, listing divisors is very inefficient. This is where Euclid rides to the rescue.
Theorem 3.16 (Euclidean Algorithm). Let a > b be positive integers. By repeatedly applying the
division algorithm, we construct a decreasing sequence of integers b = r_0 > r_1 > r_2 > · · ·
a = q_1 b + r_1, b = q_2 r_1 + r_2, r_1 = q_3 r_2 + r_3, . . .
The algorithm eventually terminates with a remainder of zero: some r_{t+1} = 0. The greatest common
divisor is the last non-zero remainder: gcd(a, b) = r_t.
Exercise 8 provides a proof. If either/both of a, b are negative, simply ignore the signs and compute
normally: for instance gcd(−4, 34) = gcd(34, 4) = 2.
Example 3.17. We compute gcd(1260, 750) using the Euclidean algorithm. Note how each line is a
single instance of the division algorithm a = qb + r and how remainders move diagonally ↙ at each
step. For this first example, we also summarize the data in a table.
1260 = 1 × 750 + 510
750 = 1 × 510 + 240
510 = 2 × 240 + 30
240 = 8 × 30 + 0

   a    q    b    r
1260    1  750  510
 750    1  510  240
 510    2  240   30
 240    8   30    0
The Euclidean algorithm says that gcd(1260, 750) = 30, the final non-zero remainder.
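For readers who like to experiment, here is a minimal Python sketch of the algorithm. The function names `euclid_steps` and `gcd_euclid` are our own choices; the first reproduces the rows (a, q, b, r) of the table, the second returns the last non-zero remainder.

```python
def euclid_steps(a, b):
    """Yield the rows (a, q, b, r) of the Euclidean algorithm for a > b > 0."""
    while b != 0:
        q, r = divmod(a, b)      # one instance of the division algorithm a = q*b + r
        yield (a, q, b, r)
        a, b = b, r              # remainders move diagonally to the next row


def gcd_euclid(a, b):
    """Return gcd(a, b) as the last non-zero remainder (signs are ignored)."""
    a, b = abs(a), abs(b)
    while b != 0:
        a, b = b, a % b
    return a


for row in euclid_steps(1260, 750):
    print("%d = %d x %d + %d" % row)
print(gcd_euclid(1260, 750))
```

Only the pair (b, r) is carried from one step to the next, which is why the method scales to large integers where listing divisors is hopeless.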
20 Equations with integer coefficients and solutions honor Diophantus of Alexandria (3rd C. CE).
To apply the Euclidean algorithm to our motivational problem, we need to run it backwards. Start
with the penultimate line of the algorithm and move upwards, substituting remainders one at a time.
The result is an expression of the form gcd( a, b) = ax + by for some integers x, y.
Example (3.17, cont). We find integers x, y such that 1260x + 750y = 30 = gcd(1260, 750) .
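Start by expressing 30 using the third line of the algorithm and work upwards:
30 = 510 − 2 × 240
= 510 − 2(750 − 510) = 3 × 510 − 2 × 750
= 3(1260 − 750) − 2 × 750 = 3 × 1260 − 5 × 750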
Rearranging, we see that x = 3 and y = −5 satisfy the equation 1260x + 750y = 30. To simplify,
divide everything by 30 to see that (3, −5) is an integer point on the line 42x + 25y = 1.
To comfortably apply the algorithm like this will require significant practice; we’ll see another exam-
ple momentarily. The reversed algorithm works in general, yielding a very powerful result.
Corollary 3.18 (Bézout’s Identity). Given integers a, b, not both zero, ∃ x, y ∈ Z such that
gcd( a, b) = ax + by
If either of a, b is negative, apply the algorithm to |a|, |b| and correct the signs afterwards.
Example 3.19. We express 3 = gcd(123, 78) = 123x + 78y. First run the Euclidean algorithm:
123 = 1 × 78 + 45
78 = 1 × 45 + 33
45 = 1 × 33 + 12
33 = 2 × 12 + 9
12 = 1 × 9 + 3
9 = 3 × 3 + 0
Now start with the penultimate line and substitute remainders, working upwards:
3 = 12 − 9
= 12 − (33 − 2 × 12) = 3 × 12 − 33
= 3(45 − 33) − 33 = 3 × 45 − 4 × 33
= 3 × 45 − 4(78 − 45) = 7 × 45 − 4 × 78
= 7(123 − 78) − 4 × 78
= 7 × 123 − 11 × 78
Thus (x, y) = (7, −11).
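The back-substitution can be automated. The recursive Python sketch below is one common way to organize the reversed algorithm (the name `bezout` is our own); it assumes non-negative inputs, so signs should be corrected afterwards as described above.

```python
def bezout(a, b):
    """Return (g, x, y) with g = gcd(a, b) = a*x + b*y, for integers a, b >= 0."""
    if b == 0:
        return a, 1, 0               # gcd(a, 0) = a = a*1 + 0*0
    q, r = divmod(a, b)              # a = q*b + r
    g, x, y = bezout(b, r)           # g = b*x + r*y
    return g, y, x - q * y           # substitute r = a - q*b and regroup


print(bezout(123, 78))      # (3, 7, -11)
print(bezout(1260, 750))    # (30, 3, -5)
```

Other pairs (x, y) also satisfy the identity; this routine merely returns the pair produced by substituting remainders one line at a time.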
Corollary 3.20. Suppose a, b, c are integers for which gcd( a, b) = 1 and a | bc. Then a | c.
Example 3.21. The claim “a | c and b | c =⇒ ab | c” is, in general, false (e.g., a = b = c = 2).
However: Suppose c = 2m = 3n for some integers m, n. Since gcd(2, 3) = 1 and 2 | (3n), the
Corollary says that 2 | n. Thus n = 2k for some k ∈ Z and c = 6k. We conclude:
2 | c and 3 | c =⇒ 6 | c
While we could have done this using Theorem 2.27, Bézout proves the general claim whenever
gcd( a, b) = 1: e.g., 19 | c and 21 | c =⇒ 399 | c (see the video problems if you’re unsure why).
Linear Diophantine & Congruence Equations
We return, as promised, to the motivating problem of finding integer points on straight lines.
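Example (3.17, mk. III). We saw that (x_0, y_0) = (3, −5) solves 1260x + 750y = 30 (equivalently
42x + 25y = 1). Suppose (x, y) is any other integer solution and observe that
42x + 25y = 1 = 42x_0 + 25y_0 ⟹ 42(x − x_0) = −25(y − y_0)
In particular 42 | 25(y − y_0). Since gcd(42, 25) = 1, Corollary 3.20 says 42 | (y − y_0). Write y − y_0 = 42t for some t ∈ Z, then
42(x − x_0) = −25 · 42t ⟹ x = x_0 − 25t = 3 − 25t and y = y_0 + 42t = −5 + 42t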
This method works for all solvable linear Diophantine equations; here is the general result.
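Theorem 3.22. Let a, b, c be integers where a, b are not both zero, and let d = gcd(a, b).
1. The equation ax + by = c has an integer solution (x, y) if and only if d | c.
2. If (x_0, y_0) is one integer solution, then all integer solutions are given by
x = x_0 − (b/d)t, y = y_0 + (a/d)t, where t ∈ Z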
Example (3.19, cont). We saw that (7, −11) solves 123x + 78y = 3 = gcd(123, 78).
2. The equation 123x − 78y = 6 has integer solutions since 3 | 6. Indeed ( x0 , y0 ) = (14, 22) is such
(modify signs and scale (7, −11) by 2). All integer solutions are then given by
x = 14 + (78/3)t = 14 + 26t, y = 22 + (123/3)t = 22 + 41t, where t ∈ Z
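The whole procedure (test divisibility by d, scale a Bézout pair, then add multiples of b/d and a/d) fits in a few lines of Python. This is a sketch under our own naming (`bezout`, `solve_linear_diophantine`), not a definitive implementation; it returns one solution together with the step added for each increase of t.

```python
def bezout(a, b):
    """Return (g, x, y) with g = gcd(a, b) = a*x + b*y, for integers a, b >= 0."""
    if b == 0:
        return a, 1, 0
    q, r = divmod(a, b)
    g, x, y = bezout(b, r)
    return g, y, x - q * y


def solve_linear_diophantine(a, b, c):
    """Describe all integer solutions of a*x + b*y = c (a, b not both zero).

    Returns None if no solution exists; otherwise ((x0, y0), (dx, dy)),
    meaning x = x0 + dx*t and y = y0 + dy*t for t in Z.
    """
    d, x, y = bezout(abs(a), abs(b))
    if c % d != 0:
        return None                  # d does not divide c: no integer points
    if a < 0:
        x = -x                       # correct the signs afterwards
    if b < 0:
        y = -y
    scale = c // d
    return (x * scale, y * scale), (-b // d, a // d)


print(solve_linear_diophantine(123, -78, 6))   # ((14, 22), (26, 41))
print(solve_linear_diophantine(42, 25, 1))     # ((3, -5), (-25, 42))
```

The second component is exactly the pair (−b/d, a/d) from the formula for the general solution, so the output for 123x − 78y = 6 reads x = 14 + 26t, y = 22 + 41t.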
We began to consider linear congruence equations in Example 3.13. The linear Diophantine method
also applies in this context. To see why, observe that
ax ≡ c (mod n) ⇐⇒ ∃y ∈ Z such that ax = ny + c
The Theorem tells us when we can solve the right hand side, and supplies a method for doing so. To
solve ax ≡ c (mod n), simply take the x-part!
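As a rough illustration, the Python sketch below solves ax ≡ c (mod n) by dividing through by d = gcd(a, n) and then inverting modulo n/d. The function name `solve_congruence` is our own, and the call `pow(a, -1, m)` for modular inverses assumes Python 3.8 or later.

```python
from math import gcd


def solve_congruence(a, c, n):
    """Solve a*x ≡ c (mod n).

    Returns None if there is no solution; otherwise (x0, m), meaning the
    solutions are exactly the integers x ≡ x0 (mod m), where m = n / gcd(a, n).
    """
    d = gcd(a, n)
    if c % d != 0:
        return None                       # a*x = n*y + c has no integer solutions
    a_red, c_red, m = a // d, c // d, n // d
    inverse = pow(a_red, -1, m)           # exists because gcd(a_red, m) = 1
    return (inverse * c_red) % m, m


print(solve_congruence(123, 6, 78))   # (14, 26): x = 14 + 26t
print(solve_congruence(4, 6, 20))     # None, since gcd(4, 20) = 4 does not divide 6
print(solve_congruence(7, 11, 23))    # (18, 23)
```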
Examples 3.23. 1. To solve the congruence equation 123x ≡ 6 (mod 78) is to solve the Diophantine
equation 123x = 78y + 6. By the last example, x = 14 + 26t. Otherwise said, x ≡ 14 (mod 26).
2. The congruence equation 4x ≡ 6 (mod 20) has no solutions. If it did, then 4x = 20y + 6 would
have integer solutions, but it does not since gcd(4, 20) = 4 does not divide 6.
3. Here is a slightly different approach for 7x ≡ 11 (mod 23). By applying the algorithm,
Exercises 3.2. A reading quiz and several questions with linked video solutions can be found
online. Some of the later questions are particularly tricky, and are included to give you a taste of
upper-division number theory and abstract algebra.
2. For each part of Exercise 1, find integers x, y which satisfy Bézout’s identity gcd( a, b) = ax + by.
3. Describe all the integer points on the line 9x − 21y = 6 using Theorem 3.22.
4. (a) Use Theorem 3.22 to show that there are no integer points on the line 4x + 6y = 1.
(b) Give an elementary proof (without using the Theorem) of part (a).
5. Find all integer points ( x, y) on the following lines, or show that none exist.
6. (a) Show that there exists no integer x such that 3x ≡ 5 (mod 6).
(b) Find all solutions x to the congruence equation 12x ≡ 1 (mod 17).
7. Five people each take the same number of candies from a jar. Then a group of seven people
does the same: in so doing they empty the jar. If the jar originally contained 93 candies, can
you be sure how much candy each person took?
(a) Suppose a > b > r_1 > r_2 > · · · is a sequence of non-negative integers. Why must there be
only finitely many terms? This shows that the algorithm terminates with some r_{t+1} = 0.
(b) Suppose that a, b, q, r are integers satisfying a = qb + r. Prove that gcd(a, b) = gcd(b, r).
(You cannot use Bézout’s identity to do this, since Bézout is a corollary of the algorithm!)
(c) Argue that gcd(a, b) = r_t.
9. Suppose m ≠ 0. What is gcd(m, 0)? Why? Why is Bézout’s identity trivial in this situation?
10. Use Bézout’s identity to prove that if k | a and k | b, then k | gcd( a, b).
21 In future studies, you’ll refer to 10 as the multiplicative inverse of 7 in the ring Z_23, and write 7^{−1} = 10 (see Exercise 18).
11. Prove: gcd(m, n) = 1 ⇐⇒ ∃ x, y ∈ Z such that mx + ny = 1.
(Hint: One direction follows from Bézout’s identity, but the other. . . )
14. (Hard) Show that if a is relatively prime to both b and c then it is relatively prime to bc.
(Hint: suppose d | a and d | bc and try to prove that d = ±1)
(a) Use the rule to find all solutions to the congruence equation 22x ≡ 66 (mod 77).
(b) We prove the rule. Let d = gcd(c, n):
i. Explain why gcd(c/d, n/d) = 1.
ii. If ac ≡ bc (mod n), prove that (n/d) | (a − b).
16. (a) Prove part 1 of Theorem 3.22 using Bézout’s identity.
(b) Prove part 2 by mimicking the method in Example 3.17, (mk. III).
17. (Hard) Apply the discussion of the Euclidean algorithm and linear equations to the following.
18. The set of remainders Z_n = {0, 1, 2, . . . , n − 1} is called a ring when equipped with addition
and multiplication modulo n. We say that b ∈ Z_n is an inverse of a ∈ Z_n if
ab ≡ 1 (mod n)
4 Sets and Functions
Sets are the fundamental building blocks of mathematics, supplying the language used to describe
mathematical objects and to group objects according to shared characteristics. While our primary
focus is learning to understand and employ set notation, the mathematical discipline of set theory is
far more ambitious: set theorists define all basic mathematical objects—numbers, addition, functions,
etc.—purely in terms of sets!22 We will only scratch the surface of set theory; indeed long before one
can accept the benefit of such an approach, it is necessary to develop a significant level of familiarity
with sets and their basic operations.
As in the definition, it is typical to use upper-case letters (A, B, C, . . .) for abstract sets and lower-case
letters for elements/members.
Venn diagrams are useful for visualizing abstract sets. A set is represented by a region in the plane,
with elements depicted by dots. The diagram in the definition represents a set A comprising at least
four elements a1 , a2 , a3 and x. The element y does not lie in A.
Example 4.2. Let A be the set of (names of) US states. Then Michigan ∈ A and Saskatchewan ∉ A.
1. Sets are equal, written A = B, when they have precisely the same elements.
The following observations are merely translations of the definition—do they make sense to you?
Roster & Set-Builder Notation
Roster notation is ideal for describing small sets: simply list the elements in any order between curly
brackets { , }.
Example 4.4. A = {3, 1/2} is the set containing the numbers 3 and 1/2. For instance 3 ∈ A, but 7 ∉ A.
Since order doesn’t matter, we could also write A = {1/2, 3}. Now let B = {3}. Plainly,
• A ⊈ B since 1/2 ∈ A and 1/2 ∉ B (∃x ∈ A for which x ∉ B).
Set-builder notation describes the elements of a set in terms of some common property. Suppose U is
some (already understood) set and P( x ) a propositional function with domain U , then
A := {x ∈ U : P(x)} (“A is the set of x in U such that P(x)”)
defines a set A as the subset of U whose elements x satisfy the property P( x ). A vertical separator |
can be used instead of a colon: in some contexts the choice is essential for clarity.
Examples 4.5. 1. We continue Example 4.4. Note that 2x^2 − 7x + 3 = (2x − 1)(x − 3) and recall that
R represents the set of real numbers and Z the set of integers. In set-builder notation, our sets
may be written
A = {x ∈ R : 2x^2 − 7x + 3 = 0}, B = {x ∈ Z : 2x^2 − 7x + 3 = 0}
2. Let X = {2, 4, 6} and Y = {1, 2, 5, 6}. There are many options for how to write these in
set-builder notation. For instance:
X = {n ∈ Z : (1/2)n ∈ {1, 2, 3}}, Y = {n ∈ Z | 1 ≤ n ≤ 6 and n ≠ 3, 4}
We now practice the opposite skill by converting five sets from set-builder to roster notation.
S1 = {x ∈ X : x is divisible by 4} = {4}     S2 = {y ∈ Y : y is odd} = {1, 5}
S3 = {x ∈ X | x ∈ Y} = {2, 6}     S4 = {x ∈ X : x ∉ Y} = {4}
S5 = {y ∈ Y | y is odd and y − 1 ∈ X} = {5}
Can you find alternative descriptions in set-builder notation for the sets S1 , . . . , S5 above? Take
your time getting used to this notation, since translating between various descriptions of a set
is essential to reading mathematics.
3. We use the set C = {0, 1, 2, 3, . . . , 24} to describe D = {n ∈ Z : n^2 − 3 ∈ C} in roster notation.
Start by expanding the criterion for membership in D:
n^2 − 3 ∈ C ⟺ n^2 ∈ {3, 4, 5, . . . , 25, 26, 27}
4. To express E = {0, 2, 6, 12, . . .} in set-builder notation, we might spot a pattern and decide that
E = {n ∈ Z : n = m(m + 1) for some integer m ≥ 0}
In the first case the next term in the sequence is 4 · 5 = 20, whereas in the second it is 20 + 128 =
148. For larger sets, the clarity afforded by set-builder notation is essential!
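Set-builder notation translates almost word for word into a Python set comprehension, which makes a convenient way to check roster descriptions. The sketch below is only an illustration: the finite search window used for D is our own assumption (justified in the comment), since a computer cannot loop over all of Z.

```python
X = {2, 4, 6}
Y = {1, 2, 5, 6}

S1 = {x for x in X if x % 4 == 0}                   # x in X : x is divisible by 4
S2 = {y for y in Y if y % 2 == 1}                   # y in Y : y is odd
S3 = {x for x in X if x in Y}                       # x in X : x in Y
S4 = {x for x in X if x not in Y}                   # x in X : x not in Y
S5 = {y for y in Y if y % 2 == 1 and y - 1 in X}    # y in Y : y odd and y - 1 in X
print(S1, S2, S3, S4, S5)

C = set(range(25))                                  # {0, 1, ..., 24}
# n^2 - 3 <= 24 forces |n| <= 5, so searching -6..6 cannot miss any element of D
D = {n for n in range(-6, 7) if n * n - 3 in C}
print(sorted(D))

E_start = [m * (m + 1) for m in range(5)]           # first five elements of E
print(E_start)
```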
3. Intervals are commonly encountered subsets of the real numbers. For instance:
• [1, π] = {x ∈ R | 1 ≤ x ≤ π} is a closed interval.
• [−4, 7.21) = {x ∈ R | −4 ≤ x < 7.21} is a half-open interval.
• (−∞, √2) = {x ∈ R | x < √2} is an infinite (open) interval.
23 We assume the reader is comfortable with the real line where number corresponds to length. A rigorous development
of R is a matter for an upper-division analysis course.
24 Be careful with the second version—the colon is the such that separator while | denotes the property “4 divides x.”
In view of the natural subset relationships N ⊊ Z ⊊ Q ⊊ R ⊊ C, we consider a simple result.
Proof. Think back to the criteria following Definition 4.3. Suppose A ⊆ B and B ⊆ C. Then
x ∈ A ⟹ x ∈ B (since A ⊆ B)
⟹ x ∈ C (since B ⊆ C)
We conclude that A ⊆ C.
Compare this to Exercise 2.1.9b: if we rewrite each subset relation as an implication, the proof struc-
ture becomes ( x ∈ A ⇒ x ∈ B) ∧ ( x ∈ B ⇒ x ∈ C ) =⇒ ( x ∈ A ⇒ x ∈ C ). This is typical of basic set
results: a translation often reduces the problem to one of the standard rules of logic.
Definition 4.8. A finite set A contains a finite number of elements: this number is its cardinality |A|.
A set with infinitely many elements is said to be an infinite set.
The symbol ∅ denotes the empty set: a set containing no elements (cardinality zero, |∅| = 0).
Examples 4.9. 1. If A = {1, 3, π, √2, 103}, then |A| = 5.
2. Let B = {4, {1, 2}, {3}}. The elements of B are 4, {1, 2} and {3}, therefore |B| = 3. It doesn’t
matter that the element {1, 2} ∈ B is also a set!
3. Recall some basic trigonometry:
{x ∈ [0, 4π] : cos x = 1/2} = {π/3, 5π/3, 7π/3, 11π/3}
has cardinality 4.
4. There are many, many representations of the empty set in set-builder notation: for example
∅ = {x ∈ R : x^2 = −1} = {x ∈ N : x^2 + 3x + 2 = 0} = {n ∈ N : n < 0}
In general, if X is any set and P(x) is false for all x ∈ X, then²⁵ ∅ = {x ∈ X : P(x)}.
Cardinality is very simple for subsets of finite sets: if B is finite, so is any subset, and we have
A ⊆ B =⇒ | A| ≤ | B|
(Is it obvious why the converse is false?!) For infinite sets, cardinality is more subtle; we’ll return to
this matter and uncover some of its bizarre and fun consequences in Chapter 8.
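These ideas can be explored with Python's built-in sets, where `len` plays the role of cardinality. One caveat, which is a feature of Python rather than of mathematics: a set that is itself an element of another set must be written as a `frozenset`.

```python
B = {4, frozenset({1, 2}), frozenset({3})}
print(len(B))                       # 3: the elements are 4, {1, 2} and {3}
print(frozenset({1, 2}) in B)       # True: the set {1, 2} is an element of B
print(1 in B)                       # False: the number 1 is not an element of B

empty = {n for n in range(100) if n < 0}    # a property that is false for every n
print(len(empty), empty == set())

small, big = {1}, {2, 3}
print(len(small) <= len(big))       # True, even though...
print(small <= big)                 # ...small is not a subset of big (<= tests inclusion)
```

The last two lines give one counter-example to the converse of A ⊆ B ⟹ |A| ≤ |B|.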
25 The existence of the empty set is sometimes considered an axiom: an assumption made without proof. Provided one
accepts that set-builder notation always defines a set (itself an axiom!) and that at least one set X exists, the empty set may
be defined as in the example; a suitable property P(x) might be something like “x ∉ {x}.”
We finish with a couple of simple results regarding the empty set.
1. If | A| = 0, then A = ∅. The empty set is the unique set with cardinality zero.
2. ∅ ⊆ A and A ⊆ A
Proof. Consider the claim ∅ ⊆ A. By the observations following Definition 4.3, this means
x ∈ ∅ =⇒ x ∈ A
This is true (for any set A!) since there are no elements x satisfying the hypothesis.26
1. Suppose A has cardinality zero. Repeating and combining with the above observation, we see
that ∅ ⊆ A and A ⊆ ∅. We conclude that A = ∅.
2. We already know that ∅ ⊆ A. For the second part, simply observe that x ∈ A =⇒ x ∈ A.
Exercises 4.1. A reading quiz and several questions with linked video solutions can be found online.
1. Describe the following sets in roster notation: that is, list their elements.
(a) {x ∈ N : x^2 ≤ 3x} (b) {n ∈ {0, 1, 2, 3, . . . , 19} : n + 3 ≡ 5 (mod 4)}
(c) {n ∈ {−2, −1, 0, 1, . . . , 23} : 4 | n^2} (d) {x ∈ (1/2)Z : 0 ≤ x ≤ 4 and 4x^2 ∈ 2Z + 1}
(e) {y ∈ R : y = x^2 for some x ∈ R with x^2 − 3x + 2 = 0}
2. Describe the following sets in set-builder notation (look for a pattern).
(a) {. . . , −3, 0, 3, 6, 9, . . .} (b) {−3, 1, 5, 9, 13, . . .}
(c) {1, 1/3, 1/7, 1/15, 1/31, . . .}
3. Each of the following sets of real numbers is a single interval. Determine the interval.
(a) {x ∈ R : x > 3 and x ≤ 17} (b) {x ∈ R : x ≰ 3 or x ≤ 17}
(c) {x^2 ∈ R : x ≠ 0} (d) {x ∈ R^− : x^2 ≥ 16 and x^3 ≤ 27}
4. Is the set { x ∈ Z : −1 ≤ x < 43} finite or infinite? If finite, what is its cardinality?
5. What is the cardinality of the set {∅, {∅}, {∅, {∅}}}? What are its elements?
6. Let A = ∅, B = {A}, C = {A} and D = {A, {0}, {0, 1}}.
Answer the following true or false:
(a) 0 ∈ A (b) A ∈ B (c) A ∈ C (d) B ∈ C (e) A ∈ D
(f) B ∈ D (g) 0 ∈ D (h) {0} ∈ D (i) {1} ∈ D
7. List all the proper subsets of {1, 2, 3}.
26 If P( x ) is always false, then (∀ x ) P( x ) =⇒ Q( x ) is true (Definition 2.5). This is called a vacuous (empty) theorem.
8. Let A, B, C, D be the following sets:
A = {−4, 1, 2, 4, 10}     B = {m ∈ Z : |m| ≤ 12} (absolute value of m)
C = {n ∈ Z : n^2 ≡ 1 (mod 3)}     D = {t ∈ Z : t^2 + 3 ∈ [4, 20)}
Of the 12 subset relations A ⊆ B, A ⊆ C, . . . , D ⊆ C, which are true and which false?
9. Let A = {1, 2, {1, 2}, {3}} and B = {1, 2}. Answer the following true or false:
(a) B ∈ A (b) B ⊆ A (c) 3 ∈ A (d) {3} ⊆ A
(e) {3} ∈ A (f) ∅ ⊆ A (g) ∅ ∈ A
10. Let A = {0, 2, 4, 6, 8, 10}. Write the set B = { X ⊆ A : | X | = 2} in roster notation.
11. (a) Suppose A ⊆ B ⊆ C ⊆ A. Show that A = B = C.
(b) Is it possible for sets A, B, C to satisfy A ⊊ B ⊆ C ⊆ A? Why/why not?
12. Let A = {1, 2, 3, 4}, and let B = {{x, y} : x, y ∈ A}.
(a) Describe B in roster notation (what happens when x = y?).
(b) Find the cardinalities of the following sets:
n o n o
C= x, {y} : x, y ∈ A and D = x, {y} : x, y ∈ A
4.2 Unions, Intersections and Complements
In this section we construct new sets from old, modeled on the logical and, or, and not (Definition 2.3).
In the Venn diagram, the outer box represents the universal set U .
4. The complement of A relative to B is the set of elements in B which do not lie in A:
B \ A = B ∩ A^C = {x ∈ B : x ∉ A} (x ∈ B \ A ⟺ x ∈ B and x ∉ A)
[Venn diagram: two overlapping sets A and B, with the regions A \ B, A ∩ B and B \ A labelled.]
Observe the notational similarities with logic: ∪ looks a bit like ∨ (OR); ∩ like ∧ (AND). The second
Venn diagram suggests the identities
A = ( A \ B) ∪ ( A ∩ B) and B = ( B \ A) ∪ ( A ∩ B)
While these indeed hold, note that a Venn diagram isn’t a proof: set identities must rigorously be
proved in the style of the upcoming theorems.
Examples 4.12. Reading set notation is one of the most basic requirements of abstract mathematics.
Make sure you understand why the following examples are correct before moving on!
1. Let U = {1, 2, 3, 4, 5}, A = {1, 2, 3}, and B = {2, 3, 4}. Then
A^C = {4, 5}     B^C = {1, 5}     B \ A = {4}     A \ B = {1}
A ∪ B = {1, 2, 3, 4}     A ∩ B = {2, 3}     A ∩ B^C = {1}     A^C ∪ B^C = {1, 4, 5}
27 The elements x must live somewhere! Without a universal set, should, say, {7}C be the set of integers except 7, real
numbers except 7, etc.? Often U is naturally assumed: e.g., R in calculus. The universal set is not needed for parts 1, 2 & 4:
that the union is a set is typically an axiom, while intersections and relative complements are subsets of pre-existing sets.
2. Using interval notation, let U = [−4, 5], A = [−3, 2], and B = [−4, 1). Then
A^C = [−4, −3) ∪ (2, 5], B^C = [1, 5], A ∩ B = [−3, 1), A ∪ B = [−4, 2]
While you should believe these from the picture, they also make for good logic practice. E.g.,
x ∈ A^C ⟺ x ∉ A ⟺ ¬(x ∈ A) ⟺ ¬(−3 ≤ x and x ≤ 2)
⟺ x < −3 or x > 2 (de Morgan’s law, Theorem 2.13)
⟺ x ∈ [−4, −3) ∪ (2, 5] (remember that −4 ≤ x ≤ 5, since x ∈ U)
The argument illustrates the basic strategy for set computations & proofs: convert claims to
propositions (Definition 4.11 parentheses) and apply basic logic (Theorem 2.13, page 11, etc.).
Alternatively, you can prove each direction separately: this would be to show that each of the
sets A^C, [−4, −3) ∪ (2, 5] is a subset of the other (page 43, (∗)).
3. Let A = (−∞, 3) and B = [−2, ∞) in interval notation. Then A ∪ B = R and A ∩ B = [−2, 3).
We show the first: for variety, this time we observe that each side is a subset of the other.
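For finite sets, the computations of Example 4.12.1 can be reproduced with Python's set operators: `|` is union, `&` is intersection and `-` is relative complement. Python has no notion of a universal set, so the helper `complement` below (our own name) takes U explicitly.

```python
U = {1, 2, 3, 4, 5}
A = {1, 2, 3}
B = {2, 3, 4}


def complement(S, universe=U):
    """Complement of S relative to the universal set."""
    return universe - S


print(complement(A), complement(B))      # A^C and B^C
print(B - A, A - B)                      # B \ A and A \ B
print(A | B, A & B)                      # union and intersection
print(A & complement(B))                 # A ∩ B^C, the same set as A \ B
print(complement(A) | complement(B))     # A^C ∪ B^C
```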
1. ∅ ∪ A = A and ∅ ∩ A = ∅ 2. A ∩ B ⊆ A ⊆ A ∪ B
3. A ∪ B = B ∪ A and A ∩ B = B ∩ A 4. A ∪ A = A ∩ A = A
5. A ∪ ( B ∪ C ) = ( A ∪ B) ∪ C and A ∩ ( B ∩ C ) = ( A ∩ B) ∩ C
6. A ⊆ B =⇒ A ∪ C ⊆ B ∪ C and A ∩ C ⊆ B ∩ C
If you don’t believe a result, visualize it with a Venn diagram. We prove part 2 and half of 6.
6. (first half) Suppose A ⊆ B. We wish to prove that x ∈ A ∪ C ⟹ x ∈ B ∪ C. However,
x ∈ A ∪ C =⇒ x ∈ A or x ∈ C (definition of union)
=⇒ x ∈ B or x ∈ C (since A ⊆ B)
=⇒ x ∈ B ∪ C
Our next batch of rules describe how complements interact with other set operations: parts 1 and 2
are de Morgan’s laws for sets; unsurprisingly, their proofs depend on the corresponding laws of logic.
1. (A ∩ B)^C = A^C ∪ B^C (pictured)
2. (A ∪ B)^C = A^C ∩ B^C
3. (A^C)^C = A
4. A ⊆ B ⟺ B^C ⊆ A^C
Proof. We prove only part 1. As before, the natural approach is to restate the result using propositions.
x ∈ (A ∩ B)^C ⟺ ¬(x ∈ A ∩ B) ⟺ ¬(x ∈ A and x ∈ B)
⟺ ¬(x ∈ A) or ¬(x ∈ B) (de Morgan’s first law of logic)
⟺ x ∈ A^C or x ∈ B^C
⟺ x ∈ A^C ∪ B^C
1. A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
2. A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
The Venn diagram illustrates the second result: think about adding the colored regions.
x ∈ A ∩ (B ∪ C) ⟺ x ∈ A and x ∈ B ∪ C
⟺ x ∈ A and (x ∈ B or x ∈ C)
⟺ (x ∈ A and x ∈ B) or (x ∈ A and x ∈ C) (distributive law, page 11)
⟺ x ∈ A ∩ B or x ∈ A ∩ C
⟺ x ∈ (A ∩ B) ∪ (A ∩ C)
Remember, if you prefer, you can prove these equalities in two stages: S = T ⇐⇒ S ⊆ T and S ⊇ T.
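Like a Venn diagram, a computer check is not a proof, but it is a quick way to test whether a conjectured identity is plausible. The Python sketch below verifies de Morgan's laws and the distributive laws for every choice of subsets of one small universal set; the universe {0, 1, 2, 3} and the helper names are arbitrary choices of ours.

```python
from itertools import combinations

U = frozenset(range(4))          # a small universal set, chosen arbitrarily


def subsets(S):
    """Return a list of all subsets of the finite set S."""
    items = list(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def comp(S):
    """Complement relative to U."""
    return U - S


P = subsets(U)

de_morgan = all(comp(A & B) == comp(A) | comp(B) and
                comp(A | B) == comp(A) & comp(B)
                for A in P for B in P)

distributive = all(A & (B | C) == (A & B) | (A & C) and
                   A | (B & C) == (A | B) & (A | C)
                   for A in P for B in P for C in P)

print(len(P), de_morgan, distributive)
```

A result of `False` would produce a genuine counter-example; a result of `True` only says that no counter-example lives inside this particular universe.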
Exercises 4.2. A reading quiz and several questions with linked video solutions can be found online.
1. Describe each set as straightforwardly as you can: e.g.,
{x ∈ R : x^2 < 9 and x^3 < 8} = (−3, 3) ∩ (−∞, 2) = (−3, 2)
(a) {x ∈ R : x^2 ≠ x} (b) {x ∈ R : x^3 − 2x^2 − 3x ≤ 0 or x^2 = 4}
(c) {y ∈ R : ∃x ∈ R with y = x^2 and x ≠ 1} (d) {z ∈ Z : z^2 is even and z^3 is odd}
(e) {y ∈ 3Z + 2 : y^2 ≡ 1 (mod 3)}
2. Let A = {1, 3, 5, 7, 9}, B = {1, 4, 7, 10} and U = {1, 2, . . . , 10}. What are the following sets?
(a) A ∩ B (b) A ∪ B (c) B \ A (d) A^C
(e) (A \ B)^C (f) A^C ∩ B^C (g) (A ∪ B) \ (A ∩ B)
3. In Example 4.12.2, use logic to formally justify the assertions B^C = [1, 5], A ∩ B = [−3, 1), and
A ∪ B = [−4, 2]. If you prefer, use the ‘subset of each side’ approach of Example 4.12.3.
4. Give formal proofs of the following parts of Theorems 4.13, 4.14 and 4.15.
(a) ∅ ∩ A = ∅ (b) A ∩ (B ∩ C) = (A ∩ B) ∩ C
(c) (A^C)^C = A (d) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
(e) A ⊆ B ⟺ B^C ⊆ A^C
5. By showing that each side is a subset of the other, give a formal proof of the set identity
A = ( A \ B) ∪ ( A ∩ B)
Now repeat your argument using only results from set algebra (Theorems 4.14 and 4.15).
6. Prove the identity A ∪ B = A ⇐⇒ B ⊆ A for any sets A, B.
7. Prove the identities for any sets A, B, C:
(a) (A ∩ B ∩ C)^C = A^C ∪ B^C ∪ C^C (b) (A ∪ B) \ (A ∩ B) = (A \ B) ∪ (B \ A)
8. Prove or disprove the following conjectures (Hint: revisit Section 2.4).
(a) ∃x ∈ R \ Q such that x^2 ∈ Q (b) ∀x ∈ R \ Q we have x^2 ∈ Q
9. Let A ⊆ R, and let x ∈ R. We say that x is far away from the set A if and only if:
∃d > 0 such that A ∩ [ x − d, x ] = ∅
If this does not happen, we say that x is close to A.
(a) Draw a picture of a set A and elements x, y such that x is far away from and y is close to A.
(b) State the meaning of “x is close to A” (negate “x is far away from A”).
(c) Let A = {1, 2, 3}.
i. Show that x = 4 is far away from A using the definition.
ii. Show that x = 1 is close to A.
(d) For general A ⊆ R, show that if x ∈ A, then x is close to A.
(e) Let A = ( a, b) be a bounded interval. Is the end-point a far away from A? What about b?
4.3 Introduction to Functions
Sets become a lot more useful and interesting once you start transforming their elements! This is
accomplished using functions. In this section we introduce some basic concepts and notation, much
of which should be familiar. A formal definition will be given in Chapter 7, but for the present a
naïve notion will suffice.
The rule defining a function can also be described using arrow notation f : a ↦ b.
Examples 4.17. 1. It is common to graph functions whose codomain is a subset of the real numbers:
the domain and range are found by projecting the graph onto the two axes.
For instance if f : [−3, 2) → R is the square function
f : x ↦ x^2 (equivalently f(x) = x^2)
then dom(f) = [−3, 2) and range(f) = [0, 9]. We could also calculate other images/pre-images, for example,
f([−1, 2)) = {x^2 : −1 ≤ x < 2} = [0, 4)
f^{−1}((−10, 2]) = {x ∈ [−3, 2) : −10 < x^2 ≤ 2} = [−√2, √2]
[Figure: graph of f(x) = x^2 on [−3, 2), with the domain marked on the x-axis and the range on the vertical axis.]
3. Let A = {0, 1, 2, . . . , 7} be the set of remainders modulo 8 and define two functions f, g : A → A:
f(n) = 3n (mod 8)     g(n) = 6n (mod 8)

   n   0  1  2  3  4  5  6  7
f(n)   0  3  6  1  4  7  2  5
g(n)   0  6  4  2  0  6  4  2

This time the table completely describes the functions. Observe that
range(f) = A     f({1, 5}) = {3, 7}     f^{−1}({1, 2, 3, 4}) = {1, 3, 4, 6}
range(g) = {0, 2, 4, 6}     g({1, 5}) = {6}     g^{−1}({1, 2, 3, 4}) = {2, 3, 6, 7}
f : A → B : a ↦ {a, a + 1 (mod 5)}
where we take the remainder modulo 5. You should be able to convince yourself that
range(f) = {{0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 0}}
f({1, 4}) = {f(1), f(4)} = {{1, 2}, {4, 0}} and f^{−1}({{2, 4}, {4, 0}}) = {4}
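For functions on finite sets, images and pre-images can be computed directly from their definitions. The Python sketch below does this for the two functions modulo 8 from Example 4.17.3; the helper names `image` and `preimage` are our own.

```python
A = range(8)                      # the remainders 0, 1, ..., 7 modulo 8


def f(n):
    return (3 * n) % 8


def g(n):
    return (6 * n) % 8


def image(func, X):
    """The set of outputs func(x) for x in X."""
    return {func(x) for x in X}


def preimage(func, domain, Y):
    """The set of inputs in the domain whose output lies in Y."""
    return {x for x in domain if func(x) in Y}


print(image(f, A), image(g, A))                    # the two ranges
print(image(f, {1, 5}), image(g, {1, 5}))
print(preimage(f, A, {1, 2, 3, 4}))
print(preimage(g, A, {1, 2, 3, 4}))
```

Note that `preimage` needs the domain as an argument: the pre-image is defined by searching the inputs, whether or not the function has an inverse.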
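1. Injective (1–1, an injection) if distinct inputs produce distinct outputs. Otherwise said (we state
the contrapositive),
∀a_1, a_2 ∈ A, f(a_1) = f(a_2) ⟹ a_1 = a_2
2. Surjective (onto, a surjection) if every element of B is an output of f. Otherwise said,
∀b ∈ B, ∃a ∈ A such that f(a) = b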
This merely expresses B ⊆ range( f ); the reverse inclusion range( f ) ⊆ B holds for any function.
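On a finite domain, both properties can be tested by searching for a counter-example. In this Python sketch (helper names are our own) we reuse the functions f(n) = 3n and g(n) = 6n modulo 8 from Example 4.17.3: a collision witnesses a failure of injectivity, and a missed element of the codomain witnesses a failure of surjectivity.

```python
A = range(8)


def f(n):
    return (3 * n) % 8


def g(n):
    return (6 * n) % 8


def find_collision(func, domain):
    """Return two distinct inputs with the same output, or None if func is injective."""
    seen = {}
    for a in domain:
        b = func(a)
        if b in seen:
            return seen[b], a
        seen[b] = a
    return None


def missed_outputs(func, domain, codomain):
    """Return the elements of the codomain that are never hit (empty if surjective)."""
    return set(codomain) - {func(a) for a in domain}


print(find_collision(f, A), missed_outputs(f, A, A))   # no counter-examples for f
print(find_collision(g, A), missed_outputs(g, A, A))   # counter-examples for g
```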
These are universal statements, so counter-examples are enough to demonstrate the negations:
Examples (4.17, cont.). We briefly revisit our previous examples.
1. Let f : [−3, 2) → R : x ↦ x^2.
You should have seen the approach of the next example in other classes.
The sign of the square-root was chosen so that x ∈ dom( f ) = (−∞, 2).
Composition of Functions
We consider how injectivity and surjectivity interact with composition of functions.
(g ∘ f)(x) = 1/(x^2 − 1) and (f ∘ g)(x) = 1/(x − 1)^2
Even though ±1 are legitimate inputs for f , dom( g ◦ f ) = R \ {±1} is implied so as to prevent
division by zero.
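Composition is easy to experiment with in code. The sketch below assumes f(x) = x^2 and g(x) = 1/(x − 1), which is consistent with the two formulas displayed above but is our own reading of the example; exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def f(x):
    return x * x                    # assumed: f(x) = x^2


def g(x):
    return Fraction(1) / (x - 1)    # assumed: g(x) = 1/(x - 1)


def compose(outer, inner):
    """Return the function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


g_after_f = compose(g, f)           # x -> 1/(x^2 - 1)
f_after_g = compose(f, g)           # x -> 1/(x - 1)^2

print(g_after_f(2), f_after_g(2))   # different values: the order of composition matters
try:
    g_after_f(-1)                   # f(-1) = 1 is not a legal input for g
except ZeroDivisionError:
    print("-1 is not in dom(g ∘ f)")
```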
Somewhat surprisingly, the converse of this theorem is false. If a composition is injective or surjective,
only one of the original functions is required also to be.
Example 4.24. Here is a formulaic version of the picture in the theorem. Make sure you’re comfort-
able with the definitions and draw pictures or graphs to help make sense of what’s going on.
Proof. This time we leave part 1 for the Exercises. Let c ∈ C and assume g ◦ f is surjective. But then
∃a ∈ A such that c = (g ∘ f)(a) = g(f(a))
Otherwise said, ∃b(= f ( a)) ∈ B for which c = g(b): that is, g is surjective.
Theorem 4.25. Let A and B be finite sets. The following are equivalent:
1. | A| ≤ | B| 2. ∃ f : A → B injective 3. ∃ g : B → A surjective
Moreover, | A| = | B| ⇐⇒ ∃ f : A → B bijective.
The theorem asserts that any one of the three numbered statements is true if and only if all are.
It might appear that six arguments are required but, by proving in a circle (1) ⟹ (2) ⟹ (3) ⟹ (1),
we only need three: for instance (1) ⟹ (3) holds because (1) ⟹ (2) and (2) ⟹ (3).
The proof is very abstract, but if you focus on the picture it should make sense.
A = {a_1, a_2, . . . , a_m}     B = {b_1, b_2, . . . , b_n}
[Figure: {a_1, a_2, . . . , a_m} drawn above the first m elements of {b_1, b_2, . . . , b_m, b_{m+1}, . . . , b_n}, with f mapping each a_k down to b_k and g mapping each b_k up to a_k.]
(1) ⟹ (2): If m ≤ n, define f : A → B by f(a_k) = b_k as in the picture. This is injective since the b_1, . . . , b_m are distinct.
(2) ⟹ (3): Suppose f : A → B is injective. Without loss of generality, label the elements of B such that
b_k = f(a_k) for 1 ≤ k ≤ m. Define the surjective function g : B → A as in the picture:²⁸
g(b_k) := a_k if k ≤ m, and g(b_k) := a_1 if k > m
(3) ⟹ (1): Suppose g : B → A is surjective. Without loss of generality, label the elements of B such
that a_k = g(b_k) for 1 ≤ k ≤ m. Since the b_k must be distinct, we see that n ≥ m.
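The first two constructions in the proof are concrete enough to code. The Python sketch below represents functions on finite sets as dictionaries; the function names and sample sets are our own, and we assume A is non-empty so that the default value a_1 exists.

```python
def build_injection(A, B):
    """(1) => (2): given lists with len(A) <= len(B), send a_k to b_k."""
    assert len(A) <= len(B)
    return {a: b for a, b in zip(A, B)}


def build_surjection(f, A, B):
    """(2) => (3): undo f on its range and send every other element of B to a_1."""
    inverse = {b: a for a, b in f.items()}           # well defined since f is injective
    return {b: inverse.get(b, A[0]) for b in B}      # assumes A is non-empty


A = ["a1", "a2", "a3"]
B = [10, 20, 30, 40, 50]

f = build_injection(A, B)
g = build_surjection(f, A, B)
print(f)
print(g)
```

Here `len(set(f.values())) == len(A)` confirms that f is injective, and `set(g.values()) == set(A)` confirms that g is surjective.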
Exercises 4.3. A reading quiz and several questions with linked video solutions can be found online.
f(x) = x^2 + 6x + 1, g(x) = 2x + 3
9. Let f : R → R^+ be the function defined by f(x) = e^x. Explain why the following “proof” that
f is surjective is incorrect. Then, give a correct proof.
15. Following Theorem 4.22, the composition of bijective functions f , g is itself bijective. Give a
brief explanation as to why (g ∘ f)^{−1} = f^{−1} ∘ g^{−1}.
16. Let f : A → B and X ⊆ A. Fill in the blanks to complete a proof of the following facts:
(a) X ⊆ f^{−1}(f(X)). (b) If f is injective, then X = f^{−1}(f(X)).
Proof. (a) x ∈ X ⟹ f(x) ∈ ____. Let y = f(x), then x ∈ ____ ⊆ f^{−1}(f(X)).
(b) a ∈ f^{−1}(f(X)) ⟹ ____, whence ∃x ∈ X with f(a) = f(x). By injectivity, ____,
whence a ∈ X. We conclude that ____. Combine with part (a) for the result.
5 Mathematical Induction and Well-ordering
In Section 2.3 we discussed three methods of proof: direct, contrapositive, and contradiction. The
fourth standard method, induction, has a very different flavor. Before discussing this formally, we
consider some contexts in which induction arguments often arise.
Further experimentation will hopefully convince you that r_3 = 7, at which point you might be ready
to hypothesize a general formula—if not, experiment more!
To make progress we need to think abstractly. If we have a stack of n + 1 disks, then to move the
largest disk all others must be stacked on a single peg. Moving n + 1 disks to another peg is therefore a
three-step process:
1. Move the smallest n disks to another peg (r_n moves);
2. Move the largest disk (one move);
3. Move the remaining disks on top of the largest (r_n moves).
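The three-step process is itself a recursive program. The Python sketch below (peg names "A", "B", "C" and the function name `hanoi` are our own) lists the moves for a stack of n disks by moving the n − 1 smaller disks aside, moving the largest disk, and restacking; counting the moves gives data against which a conjectured formula can be tested.

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Return the list of moves (from_peg, to_peg) that transfers n disks."""
    if n == 0:
        return []
    return (hanoi(n - 1, source, spare, target)     # 1. move the smaller disks aside
            + [(source, target)]                    # 2. move the largest disk
            + hanoi(n - 1, spare, target, source))  # 3. restack the smaller disks


print(hanoi(2))
counts = [len(hanoi(n)) for n in range(1, 11)]
print(counts)
```

Each count is twice the previous one plus one, mirroring the two recursive calls and the single move of the largest disk.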
We are now in a position to prove our conjecture.
Proof. Certainly the formula r_n = 2^n − 1 holds when n = 1 disk (one disk requires r_1 = 1 move).
Now suppose that n disks require r_n = 2^n − 1 moves, where n ∈ N is some fixed number. Then n + 1
disks require
r_{n+1} = 2r_n + 1 = 2(2^n − 1) + 1 = 2^{n+1} − 1 (∗)
moves. Since n was arbitrary, we see that we’ve proved an infinite family of implications:
r_1 = 2^1 − 1 ⟹ r_2 = 2^2 − 1 ⟹ r_3 = 2^3 − 1 ⟹ · · ·
Since the first proposition (r_1 = 2^1 − 1) holds, we conclude that all do: r_n = 2^n − 1 for all n ∈ N.
To answer the original question, 10 disks require r_10 = 2^10 − 1 = 1023 moves; at one move per second
this would take 17 minutes, 3 seconds.
Proof by Induction
The above argument is an example of a proof by induction. We invoke this method when we want
to prove a sequence of propositions P(1), P(2), P(3), . . . , one for each natural number. The abstract
structure of an induction proof consists of two separate arguments:
1. Base case: Prove P(1).
2. Induction step: Prove P(n) =⇒ P(n + 1) (for each n ∈ N). During this phase, P(n) is termed
the induction hypothesis.
The result is an infinite chain of implications:
P(1) =⇒ P(2) =⇒ P(3) =⇒ P(4) =⇒ P(5) =⇒ · · ·
Since P(1) is true (base case), all remaining propositions P(2), P(3), P(4), . . . are also true.
Re-read the Tower of Hanoi proof; can you separate the base case and the induction step? Since this
is an introduction, our presentation was informal. A few modifications should be made to produce a
formal argument.
• Set-up the proof by stating, “We prove by induction.” It might also be helpful to spell out the
propositions P(n) and to tell the reader what variable (n) controls the induction.
• Label the base case and induction step to aid the reader.
• After the induction step is complete, state your conclusion. In the above we would replace
everything after (∗) with, “By mathematical induction, r_n = 2^n − 1 for all n ∈ N.”
Here is a straightforward and famous result, where we write the proof in our new language.
Theorem 5.2. The sum of the first n positive integers is ∑_{k=1}^{n} k = (1/2)n(n + 1).
You should be familiar with summation notation ∑_{k=1}^{n} k = 1 + 2 + 3 + · · · + n from calculus: if not, ask.
Proof. We prove by induction. For each n ∈ N, let P(n) be the proposition $\sum_{k=1}^{n} k = \frac{1}{2}n(n+1)$.
Base case (n = 1): $\sum_{k=1}^{1} k = 1 = \frac{1}{2}\cdot 1\cdot(1+1)$ says that P(1) is true.
Induction Step: Fix n ∈ N and assume P(n) is true for this n . We compute the sum of the first n + 1
positive integers and use the induction hypothesis P(n) to simplify:
\begin{align*}
\sum_{k=1}^{n+1} k &= \sum_{k=1}^{n} k + (n+1) = \tfrac{1}{2}n(n+1) + (n+1) && \text{(induction hypothesis)} \\
&= \bigl(1 + \tfrac{1}{2}n\bigr)(n+1) = \tfrac{1}{2}(n+2)(n+1) \\
&= \tfrac{1}{2}(n+1)\bigl[(n+1)+1\bigr]
\end{align*}
Therefore P(n + 1) is true.
By mathematical induction, we conclude that P(n) is true for all n ∈ N. Otherwise said,
\[ \forall n \in \mathbb{N}, \quad \sum_{k=1}^{n} k = \tfrac{1}{2}n(n+1) \]
Note how we grouped $\frac{1}{2}(n+1)\bigl[(n+1)+1\bigr]$ so that it is obviously the right hand side of P(n + 1).
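A quick numerical sanity check can catch algebra slips before you write up a proof. The Python sketch below is only an illustration (the names `sum_first` and `closed_form` are ours): it compares the term-by-term sum with the formula of Theorem 5.2, and separately checks the algebra used in the induction step. Passing such checks for finitely many n is evidence, not proof.

```python
def sum_first(n):
    """Add 1 + 2 + ... + n term by term."""
    total = 0
    for k in range(1, n + 1):
        total += k
    return total


def closed_form(n):
    """Right-hand side of Theorem 5.2; n(n+1) is always even, so // is exact."""
    return n * (n + 1) // 2


# The first thousand instances of P(n), including the base case n = 1.
assert all(sum_first(n) == closed_form(n) for n in range(1, 1001))

# Algebra of the induction step: adding (n + 1) to the formula for n
# produces the formula for n + 1.
assert all(closed_form(n) + (n + 1) == closed_form(n + 1) for n in range(1, 1001))
```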
We present several more examples in a similar vein, though done a little faster. As is typical, we don’t
explicitly introduce the notation P(n), though you should feel free to continue doing so if you find it
helpful. Aim to lay out your formal arguments in a similar style.
\[ 2 + 5 + 8 + \cdots + (3n-1) = \tfrac{1}{2}n(3n+1) \qquad (\dagger) \]
Base case (n = 1): The proposition (†) is trivially true: $2 = \frac{1}{2}\cdot 1\cdot(3\cdot 1+1)$.
Induction Step: Fix n ∈ N and assume (†) holds for this value of n. Then
\begin{align*}
2 + 5 + \cdots + (3n-1) + [3(n+1)-1] &= \tfrac{1}{2}n(3n+1) + 3n + 2 \\
&= \tfrac{1}{2}(3n^2 + 7n + 4) = \tfrac{1}{2}(n+1)(3n+4) \\
&= \tfrac{1}{2}(n+1)\bigl[3(n+1)+1\bigr]
\end{align*}
which is the required proposition for n + 1.
For brevity we labelled the desired proposition (what we might call P(n)) by (†) so it could be
referenced. The structure is similar to Theorem 5.2: since the goal is to evaluate a sum, the induction
step is little more than adding the same thing (3n + 2) to both sides of the induction hypothesis. In
fact, the example could have been proved directly as a corollary of Theorem 5.2—can you see how?
Our next two examples are a little harder, requiring more creativity to invoke the induction hypoth-
esis. Both can alternatively be proved directly using modular arithmetic (Chapter 3).
is divisible by 13.
By mathematical induction, $17^n - 4^n$ is divisible by 13 for all n ∈ N.
from which
\[ (n+1)\bigl[(n+1)+1\bigr]\bigl[2(n+1)+1\bigr] = 6\bigl[(n+1)^2 + k\bigr] \]
is divisible by 6.
Scratch work is your friend! Unless things are very simple, start with some scratch work for the
hard part: the induction step. Explicitly state the propositions P(n) and P(n + 1) and try to manipulate
one into the other. Here are the relevant propositions for Example 5.4.1:
P(n): ∃k ∈ Z such that $17^n - 4^n = 13k$
P(n + 1): ∃l ∈ Z such that $17^{n+1} - 4^{n+1} = 13l$
Since $17^n$ is common to both, it is natural to try multiplying both sides of the equation in P(n) by 17;
if you re-read the example, you’ll see that this is essentially the induction step! For Example 5.4.2,
you might try multiplying out the cubic expressions
and comparing coefficients. Since the leading term in both is $n^3$, the difference is quadratic and
therefore much easier to think about. . .
Remember that scratch work isn’t a proof; while it might make perfect sense to you, it isn’t a proof
unless a reader can follow it without assistance. Once you think you understand the induction step,
lay out the entire proof cleanly: set-up, base case, induction step, conclusion. As an example of what
happens when you don’t, here is a typical attempt at Example 5.3 by someone new to induction:
\begin{align*}
P(n+1) = 2 + 5 + \cdots + (3n-1) + [3(n+1)-1] &= \tfrac{1}{2}(n+1)\bigl[3(n+1)+1\bigr] \\
\tfrac{1}{2}n(3n+1) + (3n+2) &= \tfrac{1}{2}(n+1)(3n+4) \\
3n^2 + n + 6n + 4 &= 3n^2 + 7n + 4
\end{align*}
Is this a good argument? While there are many issues,²⁹ the work isn’t without merit: the required
calculation is present (left side of 1st line = left side of 2nd ). While helpful as scratch work, a substan-
tial re-write is needed to make this convincing to a reader.
We finish this section with a trickier example of this thinking at work.
Example 5.5. An L-shaped tromino is an arrangement of three squares in an “L” shape. We claim:
If any single square is removed from a $2^n \times 2^n$ square grid, then
the remaining grid may be tiled by L-shaped trominos.
The claim has the form ∀n ∈ N, P(n), but note that P(n) is itself universal.
The picture shows one of the sixteen possible examples when n = 2. To get
an idea of how to structure the induction step, think how you might use
2 × 2 grids to analyse a 4 × 4 grid: the picture shows how!
• Place a single tromino (orange in the picture) so that one of its squares lies in each remaining
quadrant. What’s left of each quadrant is a $2^1 \times 2^1$ grid with one missing square: again tilable.
This scratch work is really an argument P(1) =⇒ P(2)! It remains only to formalize this intuition
into a general proof. We proceed by induction on n.
Base case (n = 1): If a single square is removed from a 2 × 2 grid, the three remaining squares form
a single L-shaped tromino.
Induction step: Fix n ∈ N and assume that after removing any square from any $2^n \times 2^n$ grid, the
remainder is tilable. Now take any $2^{n+1} \times 2^{n+1}$ grid and remove a square.
• By the induction hypothesis, the $2^n \times 2^n$ quadrant containing the removed square is tilable.
• Place a single tromino in the center so that one of its squares lies in each remaining
quadrant. What’s left of each quadrant is a $2^n \times 2^n$ grid with one missing square, each of which
is tilable by the induction hypothesis.
By induction, we conclude that every $2^n \times 2^n$ grid is tilable by trominos after any square is removed.
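The induction step is constructive, so it translates directly into a recursive procedure. The following Python sketch is our own illustration of that idea; the function name `tile`, the (row, column) coordinates and the integer labels are assumptions made for the example. Each call places one central tromino, then recurses into the four quadrants, treating the covered (or removed) cell of each quadrant as its missing square.

```python
def tile(n, hole):
    """Tile a 2^n x 2^n grid with the cell `hole` = (row, col) removed.

    Returns a grid of integers: 0 marks the removed cell, and each tromino
    is labelled by a distinct positive integer covering exactly three cells.
    """
    side = 2 ** n
    grid = [[0] * side for _ in range(side)]
    counter = [0]  # mutable counter used to label the trominos

    def solve(top, left, size, hr, hc):
        # (top, left) is the corner of the current sub-grid; (hr, hc) is
        # the one cell of this sub-grid that must not be covered.
        if size == 1:
            return
        counter[0] += 1
        label = counter[0]
        half = size // 2
        for dr in (0, 1):
            for dc in (0, 1):
                qt, ql = top + dr * half, left + dc * half
                if qt <= hr < qt + half and ql <= hc < ql + half:
                    # Quadrant containing the missing cell: recurse directly.
                    solve(qt, ql, half, hr, hc)
                else:
                    # Cover this quadrant's cell nearest the center with the
                    # central tromino, then treat that cell as missing.
                    cr = top + half - 1 + dr
                    cc = left + half - 1 + dc
                    grid[cr][cc] = label
                    solve(qt, ql, half, cr, cc)

    solve(0, 0, side, hole[0], hole[1])
    return grid


# A 4 x 4 grid with one cell removed needs (16 - 1) / 3 = 5 trominos.
example = tile(2, (0, 1))
assert len({v for row in example for v in row if v != 0}) == 5
```

The recursion depth is n, exactly matching the chain P(1) ⟹ P(2) ⟹ · · · ⟹ P(n) used in the proof.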
29 • There is no set-up, base case or conclusion, and the word induction is missing. The argument also needs some English.
• P(n) has not been defined. If you don’t define it, don’t write it.
• P(n + 1) is a proposition: it cannot equal a number! Replacing “P(n + 1) =” with “P(n + 1) ⇐⇒” would correct this.
• There are no conditional connectives to indicate the logical flow. Moreover, read top to bottom, the argument is
essentially P(n) ∧ P(n + 1) =⇒ T, rather than the correct induction step P(n) =⇒ P(n + 1).
Exercises 5.1. A reading quiz and several questions with linked video solutions can be found online.
1. Suppose you move one disk on the Tower of Hanoi per second.
(a) One of the oldest versions of the problem has monks transferring a tower of 64 disks.
Roughly how many years would this take?
(b) In a realistic human lifetime, how large a tower could be moved?
2. Imagine you cut a large piece of paper in half and stack the two pieces on top of each
other. You then repeat the process, cutting all sheets in half and making a single taller stack.
If a single sheet of paper has thickness 0.1 mm, how many times would you have to repeat the
cut-and-stack process until the stack of paper reached the sun (≈ 150 million kilometers away)?
Prove that you are correct.
3. A room contains n people. Everybody wants to shake everyone else’s hand (but not their own).
(a) Suppose n people require $h_n$ handshakes. If person n + 1 enters the room, how many
additional handshakes are required? Obtain a recurrence relation for $h_{n+1}$ in terms of $h_n$.
(b) Hypothesize a general formula for $h_n$, and prove it by induction.
9. Prove by induction that, for all n ∈ N,
\[ 1 \cdot 2 + 2 \cdot 3 + 3 \cdot 4 + \cdots + n(n+1) = \tfrac{1}{3}n(n+1)(n+2) \]
10. (a) Show, by induction, that for all n ∈ N, the number 4 divides the integer $11^n - 7^n$.
(b) More generally, prove by induction that $(a - b) \mid (a^n - b^n)$ for any a, b, n ∈ N.
11. (a) Find a formula for the sum of the first n odd natural numbers. Prove your assertion.
(b) Use Theorem 5.2 to give an alternative direct proof of your formula.
12. Find the error in the following “proof” of the statement, “All cats have the same color fur.”
Proof. Let P(n) be the proposition, “Any set of n cats have the same color fur.” We prove by
induction on n.
Base case (n = 1): Any cat has the same color fur as itself.
Induction step: Fix n ∈ N and assume P(n). Take any set S = {C1 , C2 , . . . , Cn+1 } of n + 1 cats.
The set S \ {C1 } has n cats; by the induction hypothesis all have the same color fur. Again
by the induction hypothesis, all cats in S \ {C2 } have the same color fur. Combining these
observations, we see that all cats in S have the same color fur. Since S was arbitrary, we
see that P(n + 1) holds.
13. Use induction, the product rule, and the fact that $\frac{d}{dx} x = 1$ to prove the power law from calculus:
\[ \forall n \in \mathbb{N}, \quad \frac{d}{dx} x^n = n x^{n-1} \]
15. Consider the following scratch work. Determine what result is being proved, then convert the
scratch work into a formal proof of that result.
\begin{align*}
(1+x)^{n+1} &= (1+x)^n (1+x) \ge (1+nx)(1+x) \\
&= 1 + x + nx + nx^2 = 1 + (n+1)x + nx^2 \\
&\ge 1 + (n+1)x
\end{align*}
5.2 Well-ordering and the Principle of Mathematical Induction
In this section we think more carefully about the logic behind induction, and tie it to a fundamental
property of the natural numbers.
Definition 5.6. A non-empty set of real numbers A is well-ordered if every non-empty subset of A
contains a minimum element.
To test if a set A is well-ordered, we need to check all of its non-empty subsets. The definition could
be written equivalently as follows (in the second line we expand what is meant by a minimum):
• If B ⊆ A and B ̸= ∅, then min B exists.
• ( B ⊆ A) ∧ ( B ̸= ∅) =⇒ ∃b ∈ B, ∀ x ∈ B, b ≤ x
To show that A is not well-ordered, we need only exhibit a non-empty subset B with no minimum.
Examples 5.7. 1. A = {4, −7, π, 19, ln 2} is a well-ordered set. There are 31(!) non-empty subsets of
A, each of which has a minimum element.
Can you justify this fact without listing the subsets? It might be easier to think about why any
finite set A = { a1 , . . . , an } ⊆ R is well-ordered. . .
2. The interval A = [3, 10) is not well-ordered. Indeed B = (3, 4) is a non-empty subset with no
minimal element. While you should believe this, let’s prove it anyway!
We need to prove that ∀b ∈ B, ∃x ∈ B with x < b. Given any b ∈ (3, 4), observe that $x := \frac{b+3}{2}$
satisfies $3 < x < b$, so that x ∈ B and x < b.
3. The integers Z are not well-ordered. For instance, Z is a non-empty subset of itself and there is
no minimal integer.
You might suspect (wrongly!) that every well-ordered set is finite. That the natural numbers form a
well-ordered infinite set is, for us, an axiom,³⁰ a foundational claim forming part of our basic
conception of the natural numbers.
Also known as the least natural number principle, well-ordering is applied widely throughout mathe-
matics. In fact we’ve already done so in this text! Consider the set of positive remainders generated
by the Euclidean algorithm (Theorem 3.16) when applied to natural numbers a > b:
\[ \{\ldots, r_2, r_1, b, a\} \subseteq \mathbb{N} \]
Well-ordering guarantees that this set has a minimal element $r_t$ (which turns out to be gcd(a, b)); this
is essentially the argument for Exercise 3.2.8(a).
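To see the well-ordering argument in action, one can list the numbers produced by the Euclidean algorithm and observe that they strictly decrease until the last positive remainder is reached. The Python sketch below is only an illustration; the function name `euclid_chain` and the sample inputs 1071 and 462 are our own choices.

```python
import math


def euclid_chain(a, b):
    """For natural numbers a > b, return [a, b, r1, r2, ..., rt]: the inputs
    followed by the positive remainders of the Euclidean algorithm."""
    chain = [a, b]
    # Keep dividing the previous term by the latest one until the
    # remainder would be zero.
    while chain[-2] % chain[-1] != 0:
        chain.append(chain[-2] % chain[-1])
    return chain


chain = euclid_chain(1071, 462)
print(chain)  # [1071, 462, 147, 21]

# The chain is strictly decreasing, so its minimum is its last entry,
# and that minimum is the greatest common divisor.
assert all(x > y for x, y in zip(chain, chain[1:]))
assert min(chain) == chain[-1] == math.gcd(1071, 462)
```

Because every entry is a natural number and the entries strictly decrease, the loop must stop: this is well-ordering guaranteeing termination.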
30 There are many ways to define the natural numbers. Typically well-ordering is either an axiom (essentially part of the
Armed with the well-ordering principle, we can justify the method of proof by induction.
Theorem 5.9 (Principle of Mathematical Induction). For each n ∈ N, let P(n) be a proposition.
Additionally make the two standard assumptions:
Before attempting a proof, consider how the theorem could be written as a pure implication:
\[ \Bigl[ P(1) \wedge \bigl( \forall n \in \mathbb{N},\ P(n) \implies P(n+1) \bigr) \Bigr] \implies \forall n \in \mathbb{N},\ P(n) \]
This helps us select a proof strategy: a direct approach seems hard since the conclusion is universal; a
contrapositive approach requires an ugly negation of the hypothesis; a proof by contradiction seems
most sensible since negation of the conclusion is straightforward.
Proof. We argue by contradiction. Assume the base case, the induction step, and that ∃n ∈ N for
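which P(n) is false. The set of natural numbers
\[ S := \{ k \in \mathbb{N} : P(k) \text{ is false} \} \]
is therefore non-empty and so, by well-ordering, has a minimum element s := min S. Observe: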
• s ∈ S =⇒ P(s) is false.
• The base case tells us that s ̸= 1. Thus s ≥ 2 and s − 1 ∈ N.
• s − 1 < min S =⇒ P(s − 1) is true.
• The induction step (P(s − 1) =⇒ P(s)) tells us that P(s) is true.
Now we have the proof, it is straightforward to extend the principle of induction. For any integer m
(positive, negative or zero), the set
Z≥m = {n ∈ Z : n ≥ m} = {m, m + 1, m + 2, m + 3, . . .}
is also well-ordered. By changing the base case to P(m) and replacing N with Z≥m , we immediately
obtain the proof of a more general principle of induction.
Corollary 5.10 (Induction with base case m). Fix an integer m. For each integer n ≥ m, let P(n) be
a proposition. Suppose:
The intuitive concept is exactly as before, just with a different base case!
P(m) =⇒ P(m + 1) =⇒ P(m + 2) =⇒ P(m + 3) =⇒ · · ·
Examples 5.11. 1. For all integers n ≥ 2, we prove that³¹
\[ \sum_{k=2}^{n} \frac{1}{k(k-1)} = 1 - \frac{1}{n} \qquad (\ast) \]
Base case (n = 2): When n = 2, (∗) reads $\sum_{k=2}^{2} \frac{1}{k(k-1)} = \frac{1}{2} = 1 - \frac{1}{2}$.
Induction step: Assume that (∗) is true for some fixed n ≥ 2. Then
\begin{align*}
\sum_{k=2}^{n+1} \frac{1}{k(k-1)} &= \sum_{k=2}^{n} \frac{1}{k(k-1)} + \frac{1}{(n+1)n} = 1 - \frac{1}{n} + \frac{1}{n(n+1)} && \text{(induction hypothesis)} \\
&= 1 - \frac{(n+1) - 1}{n(n+1)} = 1 - \frac{1}{n+1}
\end{align*}
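Identities like (∗) are easy to test with exact rational arithmetic before attempting a proof. The Python sketch below is an illustration only (the name `partial_sum` is ours); it uses the standard library's `fractions.Fraction` so that no rounding error can hide a discrepancy.

```python
from fractions import Fraction


def partial_sum(n):
    """Compute 1/(2*1) + 1/(3*2) + ... + 1/(n(n-1)) exactly, for n >= 2."""
    return sum(Fraction(1, k * (k - 1)) for k in range(2, n + 1))


# The base case n = 2 and the next couple of hundred instances of (*).
assert all(partial_sum(n) == 1 - Fraction(1, n) for n in range(2, 200))
```

Note that the base case here really is n = 2: the sum is empty when n = 1, which is why the induction starts at m = 2.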
The proof of the induction step thus hinges on being able to show that $3n^3 \ge (n+1)^3$. There
are many ways to convince yourself of this, for instance
\[ 3n^3 \ge (n+1)^3 \iff 3 \ge \Bigl(\frac{n+1}{n}\Bigr)^3 = \Bigl(1 + \frac{1}{n}\Bigr)^3 \qquad (\dagger) \]
The right side decreases as n increases; since n ≥ 4, the right side is at most $\bigl(\frac{5}{4}\bigr)^3 = \frac{125}{64} < 2$,
whence (†) holds for all n ≥ 4.
We now prove the original claim by induction.
Example 5.12. We prove an extended version of de Morgan’s law for sets (Theorem 4.14(a)): for any
collection of sets A1 , . . . , An where n ≥ 2, we have
\[ (A_1 \cap \cdots \cap A_n)^C = A_1^C \cup \cdots \cup A_n^C \qquad (\ddagger) \]
Induction step: Fix n ∈ N≥2 and suppose (‡) holds for all collections of n sets. Given a collection of
n + 1 sets, we see that
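\begin{align*}
(A_1 \cap \cdots \cap A_n \cap A_{n+1})^C &= \bigl( (A_1 \cap \cdots \cap A_n) \cap A_{n+1} \bigr)^C \\
&= (A_1 \cap \cdots \cap A_n)^C \cup A_{n+1}^C && \text{(de Morgan again!)} \\
&= A_1^C \cup \cdots \cup A_n^C \cup A_{n+1}^C && \text{(induction hypothesis)}
\end{align*}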
We could have approached the argument as a standard induction with base case n = 1. Instead
we deliberately chose n = 2, both to avoid confusion (the n = 1 case $A_1^C = A_1^C$ isn’t helpful or
interesting) and to highlight the importance of de Morgan’s law for two sets to the entire argument.
If S is non-empty, let s = min S, and let t ∈ N be such that $s^2 = 2t^2$. Plainly t < s. Since $s^2$ is
even, s is also even, and we can write s = 2k. But then
Aside: Well-ordering more generally Definition 5.6 is a weak version of a much deeper concept.
Informally, to well-order a set means to list its elements in some order so that every non-empty subset
has an initial element with respect to that order.
For instance, the set of negative integers Z− = {. . . , −4, −3, −2, −1} is not well-ordered with respect
to the standard ordering of the integers, but is well-ordered with respect to the reverse ordering
The principle of mathematical induction is easily modified to accommodate theorems of the form
∀n ∈ Z− , P(n): the base case is P(−1) and the induction step justifies the chain
All the infinite well-ordered sets we’ve thus far seen have “looked like” the natural numbers, how-
ever more esoteric examples exist. For instance, the following well-ordered set looks like two copies
of the natural numbers, one following the other:
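\[ A = \Bigl\{ 0, \tfrac{1}{2}, \tfrac{2}{3}, \tfrac{3}{4}, \tfrac{4}{5}, \ldots, 1, \tfrac{3}{2}, \tfrac{5}{3}, \tfrac{7}{4}, \tfrac{9}{5}, \ldots \Bigr\} = \Bigl\{ 1 - \tfrac{1}{n} : n \in \mathbb{N} \Bigr\} \cup \Bigl\{ 2 - \tfrac{1}{n} : n \in \mathbb{N} \Bigr\} \]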
Every non-empty subset of A really does have a minimum! It is possible to modify induction to
apply to propositions indexed by well-ordered sets like this, though an extra step is required to deal
with limit elements (like 1 ∈ A) with no immediate predecessor. If your further studies include set
theory, you’ll likely spend much time considering well-orders and their associated ordinals.
Exercises 5.2. A reading quiz and several questions with linked video solutions can be found online.
(a) If the statement is written in the form ∀n ∈ N≥2 , P(n), what is the proposition P(n)?
(b) Prove the result by induction.
5. Prove the geometric series formula: if r ≠ 1 and n ∈ N0, then $\sum_{k=0}^{n} r^k = \frac{1 - r^{n+1}}{1 - r}$
6. For all integers n ≥ 3, prove that $\sum_{k=3}^{n} \frac{1}{k(k-2)} = \frac{3}{4} - \frac{2n-1}{2n(n-1)}$
7. Prove: for any n ∈ N, $\sum_{i=1}^{n} \frac{1}{i^2} < 2$
(Hint: prove the stronger fact that $\sum_{i=1}^{n} \frac{1}{i^2} < 2 - \frac{1}{n}$ for all n ≥ 2)
8. The set A3 = {1, 2, 3} satisfies the property that the sum of its elements (1 + 2 + 3 = 6) is
divisible by every element of A3 .
(a) Use induction to prove that for any n ≥ 3, there is a set An of n natural numbers such that
the sum of its elements is divisible by every element of An .
(b) Prove by contradiction that no set of two natural numbers satisfies this property.
9. Suppose that $x^2 + 4y^2 = 3z^2$ has a solution (x, y, z) where all three are positive integers.
(a) By considering remainders modulo 3, prove that 3 | z. Thus create a new solution ( X, Y, Z )
in positive integers, where Z < z.
(b) Use the method of minimal counter-example to prove that $x^2 + 4y^2 = 3z^2$ has no solutions
where x, y, z ∈ N.
10. We use the fact that N0 is well-ordered to prove the division algorithm (Theorem 3.3).
If m ∈ Z and n ∈ N, then ∃ unique q, r ∈ Z such that m = qn + r and 0 ≤ r < n.
Given m, n, define
\[ S = \mathbb{N}_0 \cap (m + n\mathbb{Z}) = \{ k \in \mathbb{N}_0 : k = m - qn \text{ for some } q \in \mathbb{Z} \} \]
(a) (Existence) Show that S is a non-empty subset of N0 . By well-ordering, define r := min S.
Prove that 0 ≤ r < n.
(b) (Uniqueness) Suppose two pairs of integers (q1 , r1 ) and (q2 , r2 ) satisfy m = qi n + ri and
0 ≤ r1 , r2 < n. Prove that r1 = r2 .
11. (Hard) We consider a version of Peano’s axioms for the natural numbers.
i. (Initial element) 1 ∈ N
ii. (Successor function) f (n) = n + 1 is a function f : N → N
iii. (No predecessor of the initial element) 1 ̸∈ range( f )
iv. (Unique predecessor/order) f is injective: m + 1 = n + 1 =⇒ m = n
v. (Induction) Any subset A ⊆ N with the following properties equals N:
1∈A and ∀ a ∈ A, a + 1 ∈ A
(a) Replace N with Z in each axiom. Which are true and which false?
(b) Let $T = \{ (m, n) : m, n \in \mathbb{N} \}$ be the set of all ordered pairs of natural numbers.
i. Let f : T → T be the function f (m, n) = (m + 1, n). Letting the pair (1, 1) play the role
of 1, and f the successor function, decide which of Peano’s axioms are satisfied by T.
ii. Repeat the question for the same initial element and
\[ f : T \to T : (m, n) \mapsto \begin{cases} (m-1,\, n+1) & \text{if } m \ge 2 \\ (m+n,\, 1) & \text{if } m = 1 \end{cases} \]
(c) Prove that range( f ) = N \ {1}: every element except 1 is the successor of something.
(Hint: let A = {1} ∪ range( f ) in the induction axiom)
(d) Prove that N, as defined by Peano, is well-ordered (with respect to x < x + 1, etc.).
5.3 Strong Induction
The principle of mathematical induction (Theorem 5.9) is often known as weak induction. Strong
induction differs primarily in that the induction step can assume more than one previous proposition.
Theorem 5.14 (Principle of Strong Induction). Let l ≥ m be fixed integers and suppose P(n) is a
proposition, one for each n ∈ Z≥m . Suppose:
Exercise 6 provides a proof by showing that strong and weak induction are equivalent. We instead
concentrate on a few examples. The additional difficulty of strong induction comes from determining
how many base cases are required and in phrasing the induction hypothesis: in practice one rarely
needs to employ all the propositions P(m), . . . , P(n).
While the Fibonacci sequence seems to be increasing, it also appears to be less than doubling at each
step, suggesting the claim
\[ \forall n \in \mathbb{N}, \quad f_n < 2^n \]
We prove this using strong induction. Two base cases are suggested since the sequence is defined by
two initial conditions ($f_1 = f_2 = 1$): in the language of the Theorem, m = 1 and l = 2. Moreover, the
fact that each term from $f_3$ onwards is the sum of its two predecessors suggests that the induction step
requires only the explicit use of two propositions.
Induction step: Fix n ≥ 2 and suppose³² that $f_{n-1} < 2^{n-1}$ and $f_n < 2^n$. Then
\[ f_{n+1} = f_n + f_{n-1} < 2^n + 2^{n-1} < 2^n + 2^n = 2^{n+1} \]
The Fibonacci numbers satisfy many identities which can often be established by induction (see, for
instance, Exercises 3 & 4).
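Before proving a conjecture about a recursively defined sequence, it is worth generating some terms. The Python sketch below is illustrative only (the name `fib` is ours); it computes Fibonacci numbers from the two initial conditions and the recurrence, then tests the bound for the first few hundred terms.

```python
def fib(n):
    """Return the n-th Fibonacci number, where f_1 = f_2 = 1 and
    f_{n+1} = f_n + f_{n-1} for n >= 2."""
    a, b = 1, 1  # a = f_1, b = f_2
    for _ in range(n - 1):
        a, b = b, a + b  # slide the window (f_k, f_{k+1}) forward by one
    return a


print([fib(n) for n in range(1, 11)])  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

# Evidence for the claim f_n < 2^n (the proof is the induction argument).
assert all(fib(n) < 2**n for n in range(1, 301))
```

Notice that the code itself needs two starting values before the loop can run, just as the proof needs two base cases before the induction step can be invoked.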
32 To follow Theorem 5.14 precisely, we should assume that $f_k < 2^k$ for all k ≤ n. Do so if you like, though our phrasing
is more typical. Since we only make explicit use of two cases in the induction step, it is clearer to state these concretely
rather than introducing the new variable k.
It is instructive to consider why we really needed strong induction to prove our Fibonacci example.
Here are two broken attempts to prove the claim by weak induction.
What is the problem? The induction hypothesis assumes $f_n < 2^n$, but nothing about $f_{n-1}$: we are
stuck! Let’s correct this flaw by making the induction hypothesis as in the correct proof.
Induction step: Fix n ≥ 2 and suppose that $f_{n-1} < 2^{n-1}$ and $f_n < 2^n$. Then
\[ f_{n+1} = f_n + f_{n-1} < 2^n + 2^{n-1} < 2^n + 2^n = 2^{n+1} \]
Where is the problem now? Consider the first instance, n = 2, in which the induction step is invoked:
\[ f_3 = f_2 + f_1 < 2^2 + 2^1 \]
We haven’t proved enough base cases to get us started: the single base case establishes $f_1 < 2^1$, but
not $f_2 < 2^2$. The induction step correctly establishes the chain of implications
\[ P(1) \wedge P(2) \implies P(3), \quad P(2) \wedge P(3) \implies P(4), \quad P(3) \wedge P(4) \implies P(5), \quad \ldots \]
but the process only gets started if we prove both base cases P(1) and P(2).
The moral here is to try the induction step as scratch work. Your attempt should tell you if you need
strong induction and how many base cases are required.
33 The induction step requires n ≥ 2: since $f_{n-1} = f_0$ doesn’t exist, $f_{n+1} = f_n + f_{n-1}$ is meaningless when n = 1.
For another sequential induction example in the same vein, see Exercise 5.3 where three base cases
are required and the induction step explicitly uses three propositions.
To see strong induction in all its glory, with the induction step making use of all previous proposi-
tions, we prove the existence part of the Fundamental Theorem of Arithmetic, which states that all
integers ≥ 2 can be (uniquely) expressed as a product of primes: e.g., $3564 = 2^2 \times 3^4 \times 11$.
This provides the missing piece in our discussion of Euclid’s Theorem (2.39) on the existence of
infinitely many primes. First recall Definition 2.38: p ∈ N≥2 is prime if and only if its only positive
divisors are itself and 1. A non-prime q ∈ N≥2 is said to be composite: ∃ a, b ∈ N≥2 such that q = ab.
Base case (n = 2): The only positive divisors of 2 are itself and 1, hence 2 is prime.
Induction step: Fix n ≥ 2 and suppose that every natural number k satisfying 2 ≤ k ≤ n is either prime
or a product of primes. There are two possibilities:
By induction we see that all natural numbers n ≥ 2 are either prime, or a product of primes.
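The strong induction idea can be read as a recursive algorithm: a composite number splits as a product ab of two strictly smaller factors, each of which has already been "handled". The Python sketch below illustrates this under our own naming (`prime_factors` is not from the text); it searches for a divisor by trial division, which is a simple choice for illustration rather than an efficient one.

```python
import math


def prime_factors(n):
    """Return a list of primes whose product is n, for an integer n >= 2."""
    # Look for a factorization n = a * b with 2 <= a <= b < n.
    for a in range(2, math.isqrt(n) + 1):
        if n % a == 0:
            # n is composite: both factors are smaller than n, so (as in the
            # strong induction hypothesis) each is a product of primes.
            return prime_factors(a) + prime_factors(n // a)
    # No such divisor exists, so n is prime.
    return [n]


print(prime_factors(3564))  # [2, 2, 3, 3, 3, 3, 11]
assert math.prod(prime_factors(3564)) == 3564
```

The recursion may call itself on any smaller integer, not just n − 1, which is exactly why the proof needs all the previous propositions rather than only the most recent one.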
Exercises 5.3. A reading quiz and several questions with linked video solutions can be found online.
3. Let $f_n$ be the nth Fibonacci number (Example 5.15). Prove the following by induction ∀n ∈ N:
(a) $\sum_{k=1}^{n} f_k^2 = f_n f_{n+1}$    (b) $f_n \ge \bigl(\frac{3}{2}\bigr)^{n-2}$
4. Extending Exercise 3(b), prove Binet’s formula for the nth Fibonacci number:
\[ f_n = \frac{1}{\sqrt{5}} \bigl( \phi^n - \hat{\phi}^n \bigr) \quad \text{where} \quad \phi = \tfrac{1}{2}(1 + \sqrt{5}) \ \text{ and } \ \hat{\phi} = \tfrac{1}{2}(1 - \sqrt{5}) \]
(ϕ is the famous golden ratio: ϕ, ϕ̂ are the solutions to the quadratic equation $x^2 = x + 1$)
\[ n = 2^{r_1} + 2^{r_2} + \cdots + 2^{r_\ell} \]
6. Prove the principle of strong induction (Theorem 5.14) by applying weak induction to a new
family of propositions Q(n) via:
(a) If the Theorem is written in the form ∀n ∈ N≥2 , P(n), what is the proposition P(n)?
(b) Explicitly carry out the induction step for the three situations n + 1 = 9, n + 1 = 106 and
n + 1 = 45. How many different ways can you perform the calculation for n + 1 = 45?
(c) Explain why it is only necessary in the induction step to assume that all integers k satisfying
$2 \le k \le \frac{n+1}{2}$ are prime or products of primes.
Let p be prime, let n ∈ N, and let a1 , . . . , an be natural numbers such that p | a1 a2 · · · an . Prove
by induction that,
(Hint: n = 1 isn’t really part of the induction, but you can treat it as a base case)
9. The Fundamental Theorem of Arithmetic states that every integer n ≥ 2 can be written as a product
of prime factors in a unique way (up to reordering of the prime factors). In other words,
Part i. is Theorem 5.17. Using Exercise 8, or otherwise, supply a proof of part ii.
34 Strictly, this is definition of prime, whereas Definition 2.38 defines a subtly different concept: irreducibility. Within the
6 Set Theory, Part II
In this chapter we return to set theory and consider several more-advanced constructions.
Definition 6.1. The Cartesian product of sets A and B is the set of ordered pairs
\[ A \times B = \{ (a, b) : a \in A \text{ and } b \in B \} \]
Examples 6.2. 1. The Cartesian product of the real line R with itself is the usual xy-plane.
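As you’ve seen in other classes, rather than writing R × R which
is unwieldy, we denote this set
\[ \mathbb{R}^2 = \{ (x, y) : x, y \in \mathbb{R} \} \]
[Figure: the xy-plane with the point (2, 3) marked.]
More generally, the set of n-tuples of real numbers is³⁵
\[ \mathbb{R}^n = \{ (x_1, x_2, \ldots, x_n) : x_1, x_2, \ldots, x_n \in \mathbb{R} \} = \underbrace{\mathbb{R} \times \mathbb{R} \times \cdots \times \mathbb{R}}_{n \text{ times}} \]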
The order of terms in an ordered pair really matters: the Cartesian product with the roles re-
versed is
\[ B \times A = \{ (\alpha, 1), (\beta, 1), (\alpha, 2), (\beta, 2), (\alpha, 3), (\beta, 3) \} \]
The Cartesian product Mains × Sides is the set of all possible meals consisting of one main and
one side. It should be obvious that there are 4 × 3 = 12 possible meal choices.
The last two examples illustrate one of the simplest properties of Cartesian products, in a result which
indeed justifies the very use of the word product!
35 Strictly this should be defined inductively, e.g., $\mathbb{R}^3 := \mathbb{R}^2 \times \mathbb{R} = \bigl\{ \bigl( (x, y), z \bigr) : x, y, z \in \mathbb{R} \bigr\}$, but this is very tedious!
Theorem 6.3. If A and B are finite sets, then | A × B| = | A| · | B|.
Proof. Label the elements of each set and list the elements of A × B lexicographically. If | A| = m and
| B| = n, then
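\[
A \times B = \left\{ \begin{array}{ccccc}
(a_1, b_1), & (a_1, b_2), & (a_1, b_3), & \cdots & (a_1, b_n), \\
(a_2, b_1), & (a_2, b_2), & (a_2, b_3), & \cdots & (a_2, b_n), \\
\vdots & \vdots & \vdots & & \vdots \\
(a_m, b_1), & (a_m, b_2), & (a_m, b_3), & \cdots & (a_m, b_n)
\end{array} \right\}
\]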
Every element of A × B is listed exactly once. There are m rows and n columns, so | A × B| = mn.
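For small finite sets the Cartesian product can be computed directly, which gives a concrete check of the counting argument. The Python sketch below is an illustration only: the helper name `cartesian` and the sample sets are ours, and it relies on the standard library function `itertools.product`, which lists pairs in the same row-by-row order used in the proof.

```python
from itertools import product


def cartesian(A, B):
    """Return A x B as a set of ordered pairs (Python tuples)."""
    return set(product(A, B))


A = {1, 2, 3}
B = {"alpha", "beta"}

assert len(cartesian(A, B)) == len(A) * len(B)  # |A x B| = |A| |B|
assert cartesian(A, B) != cartesian(B, A)       # order within a pair matters
assert len(cartesian(A, set())) == 0            # a product with the empty set is empty
```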
Set Identities These may be established as we’ve done previously (Section 4): convert everything
into propositions regarding elements of sets and use basic logic. If you’re feeling more confident, you
might also be able to invoke previously established rules of set algebra.
Examples 6.4. 1. Consider the complement of a Cartesian product A × B. If you had to guess an
expression for ( A × B)C , you might mistakenly think it is AC × BC . Let us think more carefully:
( x, y) ∈ ( A × B) ∪ (C × D ) =⇒ ( x, y) ∈ A × B or ( x, y) ∈ C × D
=⇒ ( x ∈ A and y ∈ B) or ( x ∈ C and y ∈ D )
=⇒ ( x ∈ A or x ∈ C ) and (y ∈ B or y ∈ D )
=⇒ x ∈ A ∪ C and y ∈ B ∪ D
=⇒ ( x, y) ∈ ( A ∪ C ) × ( B ∪ D )
Exercises 6.1. A reading quiz and several questions with linked video solutions can be found online.
1. (a) Suppose that A = {1, 2} and B = {3, 4, 5}. State the set A × B in roster notation.
(b) Sketch both A × B and B × A using dots in R2 . What do you observe about your pictures?
(c) If A, B, C are any sets, we may define
\[ A \times B \times C = \{ (a, b, c) : a \in A,\ b \in B,\ c \in C \} \]
x ∈ A, x ̸∈ A, x ∈ B, x ̸∈ B, y ∈ C, y ̸∈ C, y ∈ D, y ̸∈ D
(Be careful: In this problem B = (0, 4) is an interval (a subset of R), not a point in R2 !)
5. Draw a picture, similar to that in Example 6.4.2, which illustrates the fact that
\[ (A \times B)^C = (A^C \times B^C) \cup (A^C \times B) \cup (A \times B^C) \]
( A × B) ∩ ( B × A) = C × C
7. Prove that A ∩ B = ∅ ⇐⇒ ( A × B) ∩ ( B × A) = ∅.
(Hint: try the previous question first)
(a) A × ( B ∪ C ) = ( A × B) ∪ ( A × C )
(b) A × ( B ∩ C ) = ( A × B) ∩ ( A × C )
(c) A × ( B \ C ) = ( A × B) \ ( A × C )
9. (a) Give an explicit example of sets A, B, C, D such that
( A × B) ∪ (C × D ) ̸= ( A ∪ C ) × ( B ∪ D )
( A ∪ C ) × ( B ∪ D ) = ( A × B) ∪ ( A × D ) ∪ (C × B) ∪ (C × D )
| A1 × · · · × A n | = | A1 | · · · | A n |
11. (a) Suppose | A| = 3, and | B| = 4. What are the minimum and maximum values for the
cardinalities |( A × B) ∩ ( B × A)| and |( A × B) ∪ ( B × A)|?
(b) (Hard) More generally, suppose | A| = m, | B| = n and | A ∩ B| = c. What can you say
about the cardinalities |( A × B) ∩ ( B × A)| and |( A × B) ∪ ( B × A)|?
(a) If A = B = R and X = [1, 3], Y = (2, 4], then X × Y ⊆ A × B. Compute the images
π1 ( X × Y ) and π2 ( X × Y ).
(b) Let Z be any set and suppose there are functions ρ1 : Z → A and ρ2 : Z → B. Show that
there is a unique function h : Z → A × B such that ρ1 = π1 ◦ h and ρ2 = π2 ◦ h.
(a) The regularity axiom of set theory says there is no set a for which a ∈ a. Use this to prove
that the cardinality of $\{ a, \{a, b\} \}$ is two.
(b) Prove that (a, b) = (c, d) ⟹ (a = c and b = d) or (a = {c, d} and c = {a, b})
(c) In the second case, prove that there exists a set S such that a ∈ S ∈ a. The axiom of
regularity also says that this is illegal. Conclude that ( a, b) = (c, d) ⇐⇒ a = c and b = d.
6.2 Power Sets
Thus far we have used the operations of subset, complement, union, intersection and Cartesian prod-
uct to build new sets from old. There is essentially only one further method available.
Definition 6.5. Let A be a set. Its power set P ( A) is the set of all subsets of A:
\[ \mathcal{P}(A) = \{ B : B \subseteq A \} \]
Otherwise said: B ∈ P ( A) ⇐⇒ B ⊆ A.
0-element subsets: ∅
1-element subsets: {1}, {3}, {7}
2-element subsets: {1, 3}, {1, 7}, {3, 7}
3-element subsets: {1, 3, 7}
Gathering these together yields the power set:
\[ \mathcal{P}(A) = \bigl\{ \emptyset, \{1\}, \{3\}, \{7\}, \{1, 3\}, \{1, 7\}, \{3, 7\}, \{1, 3, 7\} \bigr\} \]
The power set therefore has eight elements. Be absolutely certain you understand the difference
between ∈ and ⊆. Here are eight propositions; which are true and which false?³⁶
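0-element subsets: ∅
1-element subsets: $\{1\}$, $\bigl\{ \{ \{2\}, 3 \} \bigr\}$
2-element subsets: $\bigl\{ 1, \{ \{2\}, 3 \} \bigr\}$
Remember that to make a subset out of a single element you surround the element with braces:
\[ 1 \in B \implies \{1\} \subseteq B \implies \{1\} \in \mathcal{P}(B) \]
\[ \{ \{2\}, 3 \} \in B \implies \bigl\{ \{ \{2\}, 3 \} \bigr\} \subseteq B \implies \bigl\{ \{ \{2\}, 3 \} \bigr\} \in \mathcal{P}(B) \]
Using different-sized braces is essential here! The power set of B has four elements:
\[ \mathcal{P}(B) = \Bigl\{ \emptyset, \ \{1\}, \ \bigl\{ \{ \{2\}, 3 \} \bigr\}, \ \bigl\{ 1, \{ \{2\}, 3 \} \bigr\} \Bigr\} \]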
Theorem 6.7. If A ⊆ B, then P ( A) ⊆ P ( B).
You’ve seen this pattern before: we are looking at the first few lines of Pascal’s Triangle! It should be
no surprise that if | A| = 4, then |P ( A)| = 1 + 4 + 6 + 4 + 1 = 16. The progression 1, 2, 4, 8, 16, . . . in
the final column suggests a general result.
Conjuring a proof may seem daunting given how little we know about A; only its cardinality. By
introducing a variable n for the cardinality and rephrasing the theorem
\[ \forall n \in \mathbb{N}_0, \quad |A| = n \implies |\mathcal{P}(A)| = 2^n \]
induction seems like a sensible approach. But what might the induction step look like? The basic idea
is to view a set with n + 1 elements as the disjoint union of a set with n elements and a single-element
set. It is instructive to see an example of the strategy before writing the proof.
Example 6.9. Let B = {1, 2, 3}. Delete 3 ∈ B to create a smaller set
A = B \ {3} = {1, 2}
In the table, the subsets Y ⊆ B are split into two groups depending on whether
3 ∈ Y. Each subset Y ⊆ B either has the form X or X ∪ {3} where X ⊆ A.
        Subsets of B
    X           X ∪ {3}
    ∅           {3}
    {1}         {1, 3}
    {2}         {2, 3}
    {1, 2}      {1, 2, 3}
Plainly B has twice as many subsets as A: two for each subset X ⊆ A.
This method of pairing is exactly mirrored in the induction step of our formal proof.
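The pairing in the table can be turned into a recursive construction of the power set. The Python sketch below is our own illustration (the name `power_set` and the use of `frozenset` to represent subsets are assumptions for the example): it removes one element b, builds the subsets X of what remains, and then returns each X together with X ∪ {b}.

```python
def power_set(elements):
    """Return a list of all subsets (as frozensets) of the given collection
    of distinct elements, using the pairing X <-> X ∪ {b}."""
    elements = list(elements)
    if not elements:
        return [frozenset()]  # the empty set has exactly one subset
    b = elements[-1]
    smaller = power_set(elements[:-1])  # subsets X of A = B \ {b}
    # Each subset of B is either some X or some X ∪ {b}: twice as many.
    return smaller + [X | {b} for X in smaller]


subsets = power_set([1, 2, 3])
assert len(subsets) == 2**3
assert len(set(subsets)) == len(subsets)  # no subset is listed twice
```

Each recursive call doubles the length of the list, which is the counting at the heart of the induction step.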
Proof. We prove by induction on the cardinality of A. For each n ∈ N0 , consider the proposition
Base case (n = 0): If n = 0, then A = ∅ (Lemma 4.10). But then P(A) = {∅} ⟹ |P(A)| = 1 = $2^0$.
Induction step: Fix n ∈ N0 and assume (∗) is true for this n. Let B be any set with n + 1 elements.
Choose an element b ∈ B and define A = B \ {b}. The subsets of B may be separated into two
types:
Example 6.10. You might erroneously expect the sets P ( A × B) and P ( A) × P ( B) to be the same.
Here is a simple counter-example to convince you otherwise!
Let A = { a} and B = {b, c}. Think about cardinalities:
\[ |\mathcal{P}(A \times B)| = 2^{|A \times B|} = 2^{|A||B|} = 2^2 = 4 \]
\[ |\mathcal{P}(A) \times \mathcal{P}(B)| = |\mathcal{P}(A)| \, |\mathcal{P}(B)| = 2^{|A|} 2^{|B|} = 2^{|A| + |B|} = 2^3 = 8 \]
Since the cardinalities are different, the sets cannot be equal: P ( A × B) ̸= P ( A) × P ( B). But what
about subset? Might the smaller set be a subset of the larger? Again the answer is no, as can be seen
by computing the sets explicitly.
A × B = {(a, b), (a, c)} =⇒ P(A × B) = {∅, {(a, b)}, {(a, c)}, {(a, b), (a, c)}}
The elements of P ( A × B) are sets of ordered pairs. By contrast, the elements of P ( A) × P ( B) are
ordered pairs of sets:
P(A) × P(B) = {∅, {a}} × {∅, {b}, {c}, {b, c}}
= {(∅, ∅), (∅, {b}), (∅, {c}), (∅, {b, c}), ({a}, ∅), ({a}, {b}), ({a}, {c}), ({a}, {b, c})}
The elements of the two sets have completely different types, so there is no way that one could be a
subset of the other!
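For readers who like to experiment, the two sets in Example 6.10 can be built explicitly. This is a small Python sketch under our own modelling assumptions: sets become frozensets, ordered pairs become tuples, and the helper power_set is a hypothetical name rather than anything defined in the text.

```python
from itertools import combinations, product


def power_set(s):
    """All subsets of a finite set, each stored as a frozenset."""
    items = list(s)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}


A = {"a"}
B = {"b", "c"}

sets_of_pairs = power_set(product(A, B))                   # P(A × B)
pairs_of_sets = set(product(power_set(A), power_set(B)))   # P(A) × P(B)

print(len(sets_of_pairs), len(pairs_of_sets))    # 4 8
print(sets_of_pairs.isdisjoint(pairs_of_sets))   # True
```

The elements of the first set are frozensets while those of the second are tuples, so no element can belong to both.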
Exercises 6.2. A reading quiz and several questions with linked video solutions can be found online.
3. Determine whether the following statements are true or false. Justify your answers.
4. Here are three incorrect proofs of Theorem 6.7: A ⊆ B =⇒ P ( A) ⊆ P ( B). Why does each fail?
6. (a) For any set A, show there is an injection ι : A → P ( A). (Explicitly construct a map, and
show that it is one-to-one.)
(b) Is there any set A such that A ∩ P(A) ≠ ∅?
7. If you’ve studied combinatorics, you’ll know that the binomial coefficient (n choose r) = n! / (r!(n−r)!)
denotes the number of distinct ways one can choose r objects from a set of n objects.
6.3 Indexed Collections of Sets: Union and Intersection Revisited
In Definition 4.11 we defined the union of two sets A ∪ B = { x : x ∈ A or x ∈ B}, which inductively
extends to any finite union of sets
A1 ∪ · · · ∪ An = { x : x ∈ Ak for some k }
In this section we consider a more general definition and compute unions and intersections of (potentially)
infinite collections of sets.
Definition 6.11. Given a set of sets { An } (each An is a set!), we form its union and intersection:
⋃ An = {x : x ∈ An for some n}      x ∈ ⋃ An ⇐⇒ ∃n such that x ∈ An
⋂ An = {x : x ∈ An for all n}       x ∈ ⋂ An ⇐⇒ ∀n we have x ∈ An
For all examples in this section, the sets An are indexed: each n lies in some indexing set, typically
N, Z or R. It is typical to decorate the union/intersection symbols to indicate this: e.g., if n ∈ N we
might use the notation ⋃_{n∈N} An or ⋃_{n=1}^∞ An.
Example 6.12. Here is a simple (finite) example to get us used to the notation. Let
A proof is almost immediate from the definition: can you supply it?
Nested Collections
When a collection of sets is indexed by the natural numbers N in such a way that successive sets
satisfy a subset relation, we describe the collection as nested, for instance
A1 ⊇ A2 ⊇ A3 ⊇ · · ·
Since An ⊆ A1 for all n, we see that ⋃_{n=1}^∞ An = A1:
x ∈ ⋃_{n=1}^∞ An ⇐⇒ ∃n ∈ N, x ∈ An ⇐⇒ x ∈ A1
Computing the intersection in such a situation typically requires much more care. . .
Example 6.14. Consider the nested collection {An : n ∈ N} of half-open intervals An = [0, 1/n):
m ≤ n =⇒ 1/n ≤ 1/m =⇒ An ⊆ Am =⇒ A1 ⊇ A2 ⊇ A3 ⊇ · · ·
The union is therefore ⋃_{n=1}^∞ [0, 1/n) = A1 = [0, 1).
Before considering the full intersection, we first compute all finite intersections. The nesting condition
says that a finite intersection is simply the smallest of the listed sets: for any constant m ∈ N,
⋂_{n=1}^m [0, 1/n) = A1 ∩ · · · ∩ Am = Am = [0, 1/m)
To find the infinite intersection, it is very tempting to take limits:
⋂_{n=1}^∞ [0, 1/n) =(?) lim_{m→∞} ⋂_{n=1}^m [0, 1/n) =(?) [0, lim_{m→∞} 1/m) = [0, 0)
This is mathematical garbage! Nothing you know about limits justifies either questionable equality.
Moreover, the ‘answer’ [0, 0) could only mean the empty set, which is incorrect: we claim that
⋂_{n=1}^∞ [0, 1/n) = {0}
If the last part of the argument seems difficult, try an example! If x = 0.13, observe that
0.1 < 0.13 =⇒ x ∉ A10 =⇒ x ∉ ⋂_{n=1}^∞ An
[Figure: number line marking 0, 1/10, x and the interval A5]
By modifying the endpoints of the sets An we obtain slightly different results:
⋂_{n=1}^∞ (0, 1/n) = ∅      ⋂_{n=1}^∞ (0, 1/n] = ∅      ⋂_{n=1}^∞ [0, 1/n] = {0}
The moral is that you cannot naïvely apply limits to sequences of sets. If thinking about limits helps
your intuition, great, but you can’t trust it blindly!
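Membership in the sets An = [0, 1/n) of Example 6.14 is easy to test with exact arithmetic. The sketch below is a hedged illustration only: the function names and the search bound limit are our own assumptions, and a finite search can never prove that a point lies in every An; it merely shows how each positive x is eventually excluded while 0 is not.

```python
from fractions import Fraction


def in_A(x, n):
    """Membership test for the half-open interval A_n = [0, 1/n)."""
    return 0 <= x < Fraction(1, n)


def first_excluding_index(x, limit=1000):
    """Smallest n <= limit with x not in A_n, or None if no such n is found."""
    for n in range(1, limit + 1):
        if not in_A(x, n):
            return n
    return None


print(first_excluding_index(Fraction(13, 100)))  # 8, since 1/8 <= 0.13 < 1/7
print(first_excluding_index(Fraction(0)))        # None: 0 lies in every A_n tested
```

For x = 0.13 the search stops at n = 8, consistent with the observation that x ∉ A10.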
37 The existence of N ≥ 1/x should be intuitive; it is in fact guaranteed by the Archimedean property (Exercise 2.4.13).
An indexed collection can also be nested the other way round, in which case the intersection is
straightforward (though unions need more work)
A1 ⊆ A2 ⊆ A3 ⊆ · · · =⇒ ⋂_{n=1}^∞ An = A1
Examples 6.15. Here are a few more simple examples of computing unions and intersections of
indexed collections; some are nested, some not.
1. Let An = [n, n + 1) ⊆ R, for each n ∈ Z. For example A3 = [3, 4) and A−17 = [−17, −16).
Every real number lies in precisely one such set (the sets An are pairwise disjoint), whence
⋃_{n∈Z} An = R and ⋂_{n∈Z} An = ∅
To prove the former, note that x ∈ [n, n + 1) where n = ⌊x⌋ is the greatest integer less than or
equal to x: i.e., ∀x ∈ R, we have x ∈ A_⌊x⌋, whence R ⊆ ⋃_{n∈Z} An (the reversed subset inclusion
is trivial since each An ⊆ R).
2. For each n ∈ N, let An = [−n, n] be a closed interval. This is a nested collection
A1 ⊆ A2 ⊆ A3 ⊆ · · · =⇒ ⋂_{n=1}^∞ An = A1 = [−1, 1]
3. For each n ∈ N, let An = {x ∈ R : |x^2 − 1| < 1/n}. Before computing the union and intersection
of these sets, it is helpful to write each set as a pair of intervals. Note that
|x^2 − 1| < 1/n ⇐⇒ −1/n < x^2 − 1 < 1/n ⇐⇒ √(1 − 1/n) < |x| < √(1 + 1/n)
The sets are nested: A1 ⊇ A2 ⊇ A3 ⊇ A4 ⊇ · · · , where
An = (−√(1 + 1/n), −√(1 − 1/n)) ∪ (√(1 − 1/n), √(1 + 1/n))
=⇒ ⋃_{n=1}^∞ An = A1 = (−√2, 0) ∪ (0, √2)
[Figure: the sets A1, . . . , A5 drawn on a number line from −√2 to √2]
For the intersection,
∀n ∈ N, x ∈ An ⇐⇒ ∀n ∈ N, |x^2 − 1| < 1/n ⇐⇒ x^2 − 1 = 0
It follows that ⋂_{n=1}^∞ An = {1, −1}.
4. Let A0 = {0, 1}, An = {1} if n ≥ 1 is odd, and An = {2} if n ≥ 2 is even. Then,
⋂_{n=0}^∞ ⋃_{k=n}^∞ Ak = (⋃_{k=0}^∞ Ak) ∩ (⋃_{k=1}^∞ Ak) ∩ · · · = {0, 1, 2} ∩ {1, 2} ∩ {1, 2} ∩ · · · = {1, 2}
Think about why x ∈ ⋂_{n=0}^∞ ⋃_{k=n}^∞ Ak ⇐⇒ x lies in infinitely many of the sets An.
Unions: Don’t Confuse Sets and Elements
When working with large unions, it is easy to confuse two sets:
It is important to understand the difference! Sometimes the indexed collection itself is the object of
interest, other times we may want to use its union or intersection to define something new.
Example 6.16 (Projective space). Let Am be the line³⁸ through the origin in R^2 with gradient
m ∈ R ∪ {∞}. Since every point in R^2 lies on such a line, and distinct lines intersect at the
origin, we see that
⋃ Am = R^2 and ⋂ Am = {(0, 0)}
[Figure: several of the lines Am through the origin, including A0, A1 and A∞]
The indexed collection is known as projective space
P(R^2) = {Am : m ∈ R ∪ {∞}}
and is interesting in its own right. Each element of projective space is a line, making P(R^2) a very
different set to R^2. This example also shows that indexing sets don’t have to be simple sets of integers!
Example 6.17 (Finite Decimals). For each n ∈ N, let An be the set of decimals of length n,
An = {0.a1a2 . . . an : where each ai ∈ {0, 1, . . . , 9}}
For example 0.134 ∈ A3. Since 0.134 = 0.1340, we also have 0.134 ∈ A4. This is a nested collection
A1 ⊆ A2 ⊆ A3 ⊆ A4 ⊆ · · · =⇒ ⋂_{n∈N} An = A1 = {0, 0.1, . . . , 0.9}
Now consider unions. If m ∈ N, then
⋃_{n=1}^m An = Am = {x ∈ [0, 1) : x has a decimal representation of length ≤ m}
You might guess that the infinite union would be the entire³⁹ interval [0, 1], but this is incorrect:
x ∈ ⋃_{n∈N} An ⇐⇒ ∃n ∈ N such that x ∈ An
⇐⇒ x is a decimal with some finite length n
The infinite union ⋃ An is precisely the set of x ∈ [0, 1) which have a finite decimal representation!
This is far from the entire interval: many rational numbers are excluded (e.g., 1/3 = 0.3333 · · · ∉ ⋃ An),
and the union contains no irrational numbers.
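Whether a given rational number belongs to the union in Example 6.17 can be decided exactly. The sketch below relies on a standard fact not proved in this section (a fraction in lowest terms has a finite decimal expansion precisely when its denominator has no prime factors other than 2 and 5); the function names are hypothetical.

```python
from fractions import Fraction


def has_finite_decimal(x):
    """True if the denominator of x (in lowest terms) has only 2 and 5 as prime factors."""
    d = Fraction(x).denominator
    for p in (2, 5):
        while d % p == 0:
            d //= p
    return d == 1


def smallest_index(x):
    """Smallest n with x in A_n, or None if x lies in no A_n."""
    x = Fraction(x)
    if not (0 <= x < 1 and has_finite_decimal(x)):
        return None
    n = 1
    # x has a decimal of length n exactly when 10**n * x is an integer.
    while (x * 10**n).denominator != 1:
        n += 1
    return n


print(smallest_index(Fraction(134, 1000)))  # 3
print(smallest_index(Fraction(1, 3)))       # None
```

Since the collection is nested, x ∈ An for every n at or beyond the index returned.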
38 The symbol ∞ is used to indicate the vertical line A∞ with ‘infinite gradient.’
39 We would include 1 = 0.9999 · · ·
Optional! We finish this section with a bit of fun, using an infinite intersection to create a fractal set.
Example 6.18 (Cantor’s middle-third set). Starting with the interval C0 = [0, 1], construct a sequence
of sets Cn by repeatedly removing the open middle third of every interval: Cn+1 is what remains of Cn.
C0 = [0, 1]
C1 = [0, 1/3] ∪ [2/3, 1]
C2 = [0, 1/9] ∪ [2/9, 1/3] ∪ [2/3, 7/9] ∪ [8/9, 1]
. . .
The sequence is drawn up to C9, though you’ll have to zoom in a long way to see the detail!
Cantor’s middle-third set is defined to be the infinite intersection C := ⋂_{n=0}^∞ Cn.
Cantor’s set has several interesting properties which we state non-rigorously.
where len(Cn ) is the sum of the lengths of all intervals in Cn . Since C ⊆ Cn for all n, we conclude
that len(C) = 0: Cantor’s set contains no intervals of positive length.
Infinite Cardinality Cantor’s set contains the endpoints of every interval removed at any stage of
its construction. In particular, 1/3^n ∈ C for all n ∈ N0, whence C is an infinite set.⁴⁰
C = f (C) ∪ g(C)
Cantor’s set consists of two shrunken copies of itself, a classic property of fractals.
We’ll analyze Cantor’s set a little and consider a related construction in Exercises 14 & 15. Other
fractal sets with a similar construction include the Sierpiński carpet and the von Koch snowflake.
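The first few stages of the construction in Example 6.18 are easy to generate exactly. The following Python sketch is an illustration only (function names are our own): it stores each closed interval as a pair of fractions and keeps the outer thirds at every step.

```python
from fractions import Fraction


def remove_middle_thirds(intervals):
    """Replace each closed interval [a, b] by its two outer thirds."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))
        result.append((b - third, b))
    return result


def cantor_stage(n):
    """List the intervals of C_n as pairs of exact fractions."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = remove_middle_thirds(intervals)
    return intervals


for n in range(4):
    stage = cantor_stage(n)
    total = sum(b - a for a, b in stage)
    print(n, len(stage), total)   # 2**n intervals of total length (2/3)**n
```

The total length shrinks by a factor of 2/3 at each stage, which is why the lengths len(Cn) tend to zero.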
Exercises 6.3. A reading quiz and several questions with linked video solutions can be found online.
1. Let I = {1, 3, 4}. Determine ⋃_{r∈I} Sr and ⋂_{r∈I} Sr, where each Sr = [r − 1, r + 3] is an interval.
2. For each integer n, consider the set Bn = {n} × R.
(a) Draw a picture (in the Cartesian plane) of ⋃_{n=2}^4 Bn = B2 ∪ B3 ∪ B4.
(b) Draw a picture of the set C = [1, 5] × {−2, 2}.
(Careful! [1, 5] is an interval, while {−2, 2} is a set containing two points)
(c) Compute (⋃_{n=2}^4 Bn) ∩ C.
(d) Compute ⋃_{n=2}^4 (Bn ∩ C). Compare with your answer to part (c).
40 In fact it is more than merely infinite, it is uncountably so, as we’ll discuss in Chapter 8. The bizarre contrast between
this and the zero measure property was part of the reason Cantor introduced his set.
3. Give an example of four subsets A, B, C, D of {1, 2, 3, 4} such that all intersections of two subsets
are different.
4. For each collection, define an interval An such that the given collection is {An : n ∈ N}.
Then find both the union and intersection of the collection.
(a) [1, 2 + 1), [1, 2 + 1/2), [1, 2 + 1/3), . . .
(b) (−1, 2), (−3/2, 4), (−5/3, 6), (−7/4, 8), . . .
(c) (1/4, 1), (1/8, 1/2), (1/16, 1/4), (1/32, 1/8), (1/64, 1/16), . . .
5. For each real number x, let Ax = {3, −2} ∪ {y ∈ R : y > x}. Find ⋃_{x∈R} Ax and ⋂_{x∈R} Ax.
6. Use Definition 6.11 to prove that A1 ⊆ A2 ⊆ A3 ⊆ · · · =⇒ ⋂_{n∈N} An = A1
10. Let {An : n ∈ I} and {Bn : n ∈ I} be indexed families of sets. Give explicit examples for which
the following hold:
(a) (⋃_{n∈I} An) ∩ (⋃_{n∈I} Bn) ≠ ⋃_{n∈I} (An ∩ Bn)
(b) (⋂_{n∈I} An) ∪ (⋂_{n∈I} Bn) ≠ ⋂_{n∈I} (An ∪ Bn)
11. (De Morgan’s laws) Let {An : n ∈ I} be an indexed family of sets and B a set. Prove
(a) B \ (⋃_{n∈I} An) = ⋂_{n∈I} (B \ An)
(b) B \ (⋂_{n∈I} An) = ⋃_{n∈I} (B \ An)
12. Suppose we are working in a universal set U (so every set is considered a subset of U). Give an
explanation for why it makes sense to define ⋂_{n∈I} An = U when I = ∅.
13. (Hard) Let An = {m/n ∈ Q : 0 < m < n, m ∈ N}, for each n ∈ N.
We write x = [0.a1a2a3 · · ·]_3. For example [0.12]_3 = 1/3 + 2/3^2 = 5/9.
(a) Verify that [0.02101]_3 = 64/243, [0.22222 · · ·]_3 = 1 and [0.020202020 · · ·]_3 = 1/4.
(Hint: You’ll need the geometric series formula ∑_{n=1}^∞ r^n = r/(1 − r) for the latter two)
(b) Let Cn be the nth set in the construction of Cantor’s middle-third set C (Example 6.18).
Prove by induction that Cn is the set of all x ∈ [0, 1] with a ternary representation whose
first n digits are only 0 or 2.
(Hints: Use Cn+1 = f (Cn ) ∪ g(Cn ); What does division by 3 do to a ternary representation?)
(c) Argue that 1/4 ∈ C, but that it is not an endpoint of any of the deleted middle-thirds
removed during the construction of C.
15. (Hard) We construct a modified Cantor set and fractal curve. Starting with F0 = [0, 1], repeat-
edly delete the third quarter segment of each interval to obtain a sequence of sets F0 , F1 , F2 , . . .:
7 Relations and Partitions
The mathematics of sets is rather basic until one has a notion of how to relate elements of sets to each
other. We are already familiar with several examples of this, for instance:
1. The usual order of numbers (e.g., 3 < 7) is a way of relating/comparing two elements of R.
2. A function f : A → B relates elements in a set A with those in B.
In this chapter we discuss a general framework based on the Cartesian product (Section 6.1).
Definition 7.1. Let A and B be sets. A (binary) relation R from A to B is a set of ordered pairs
R ⊆ A×B
A relation on A is a relation from A to itself. The domain and range of R are the sets
dom(R) = {a ∈ A : ∃b ∈ B, (a, b) ∈ R}      range(R) = {b ∈ B : ∃a ∈ A, (a, b) ∈ R}
With abstract relations, there are only a small number of things we can do.
To find the elements of R−1 , you simply switch the components of each ordered pair in R.
We say that R is symmetric if R = R−1 (requires A = B): i.e., ( x, y) ∈ R ⇐⇒ (y, x ) ∈ R.
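For finite relations these notions translate directly into code. In the Python sketch below a relation is modelled as a set of tuples; the helper names and the sample relation R are made up purely for illustration.

```python
def domain(R):
    """First entries of the ordered pairs in R."""
    return {a for (a, b) in R}


def range_of(R):
    """Second entries of the ordered pairs in R."""
    return {b for (a, b) in R}


def inverse(R):
    """Switch the components of each ordered pair."""
    return {(b, a) for (a, b) in R}


def is_symmetric(R):
    return R == inverse(R)


R = {(1, 2), (1, 3), (2, 2)}             # a made-up relation on {1, 2, 3}
print(domain(R), range_of(R))            # {1, 2} {2, 3}
print(is_symmetric(R))                   # False
print(is_symmetric(R | inverse(R)))      # True
```

Taking the union of a relation with its inverse always produces a symmetric relation, since inverting the union merely swaps the two pieces.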
Proof. Here are two of the arguments. Try the others yourself.
4. Suppose R ⊆ S . Then,
Therefore R−1 ⊆ S −1 . Replacing R, S with R−1 , S −1 and applying part 1, we also see that
Exercises 7.1. A reading quiz and practice question can be found online.
1. Let R be the relation on {0, 1, 2} defined by
0R0 0R1 2R1
(a) Write R as a set of ordered pairs. What are its domain and range?
(b) What is the inverse of R?
2. (a) Let R be the relation on R defined by a R b ⇐⇒ | a − b| = 1. Is this relation symmetric?
(b) Let S be the relation on R defined by
a S b ⇐⇒ ∃x ∈ Q \ {0} such that a = x^2 b
Is this relation symmetric?
3. Draw pictures of the following relations on the set of real numbers R.
(a) R = {(x, y) : y ≤ 2 and y ≥ x and y ≥ 2 − x}
(b) S = {(x, y) : (x − 4)^2 + (y − 1)^2 ≤ 9}
State the domain and range and draw the inverse of each relation.
4. A relation is defined on N by a R b ⇔ a/b ∈ N. Let c, d ∈ N. Under what conditions can we
write c R−1 d?
5. Let R ⊆ {1, 2, 3, 4} × {1, 2, 3, 4} be the relation
R = {(1, 3), (1, 4), (2, 2), (2, 4), (3, 1), (3, 2), (4, 4)}
(a) Find dom(R), range(R) and R−1 .
(b) Compute the relations R ∪ R−1 and R ∩ R−1 , and check that they are symmetric.
6. For the relation R = {(x, y) : x ≤ y} on N, what is R−1, and what is the intersection R ∩ R−1?
7. Let R ⊆ Z × Z be the relation m R n iff m | n. Compute R ∩ R−1 .
8. Let A be a set with | A| = 4. What is the maximum number of elements that a relation R on A
can contain such that R ∩ R−1 = ∅?
9. Give formal proofs of the remaining parts of Theorem 7.4.
10. Let R and S be two symmetric relations on a set A.
(a) Show R ∩ S is symmetric.
(b) Does R ∪ S have to be symmetric? Give a proof or counterexample.
11. Let R be a relation on a set A and define S = R ∪ R−1 . Prove that S is the smallest symmetric
relation on A containing R in the following sense: if
𝒯 = {T ⊆ A × A : T symmetric and R ⊆ T}
then
S = ⋂_{T∈𝒯} T
(S is known as the symmetric closure of R)
7.2 Functions revisited
In Section 4.3, we naïvely defined a function f : A → B as a rule associating to each element a ∈ A an
element f ( a) ∈ B. But what do we mean by a rule? We address this issue by turning Example 7.2.7
on its head: a function f : A → B is precisely its graph.
Example 7.5. The function f : [0, 4] → [0, 2] : x ↦ √x corresponds to
the relation
{(x, √x) : x ∈ [0, 4]} ⊆ [0, 4] × [0, 2]
[Figure: graph of the relation, showing a typical point (x, √x)]
The difficulty is that we cannot use the notation f ( a) until we know that we have a function. . .
A relation f ⊆ A × B is a function if and only if every a ∈ A is the first entry of exactly one ordered
pair ( a, b) ∈ f . This is simply an abstraction of the vertical line test from calculus.
Example 7.7. Let A = [0, 4] and B = [0, 5]. Two relations f , R ⊆ A × B are drawn below.
The first relation defines a function f : A → B since every a ∈ A corresponds to exactly one b ∈ B:
the vertical line through any a ∈ A intersects the graph in exactly one point ( a, b).
The second relation R does not define a function. In fact it fails both parts of the definition:
[Figure: graphs of the two relations. Left: dom(f) = A. Right: dom(R) ≠ A, with one a paired to two values b1, b2]
Injectivity (Definition 4.18) can be rephrased in this new context: f : A → B injective means
(∀a1, a2 ∈ A) (a1, b), (a2, b) ∈ f =⇒ a1 = a2
Example 7.8. Let A = {1, 2, 3} and B = {p, q, r}. The relation
f = {(1, r), (2, p), (3, r)}
The example highlights one advantage of the relational approach: the inverse of a function f : A → B
always exists; it is the inverse relation f −1 ⊆ B × A. The question is whether f −1 is also a function. From
the example and our discussions in Section 4.3, you should strongly suspect the answer.
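The relation in Example 7.8 is small enough to test by machine. The sketch below is a hedged illustration: relations are modelled as sets of tuples, the relation passed in is assumed to be a subset of the stated product, and the function names are our own.

```python
def inverse(R):
    """Switch the components of each ordered pair."""
    return {(b, a) for (a, b) in R}


def is_function(rel, domain_set):
    """True if every element of domain_set is the first entry of exactly one pair in rel.

    Assumes rel is a subset of domain_set × (some codomain).
    """
    return all(
        len({second for (first, second) in rel if first == a}) == 1
        for a in domain_set
    )


A = {1, 2, 3}
B = {"p", "q", "r"}
f = {(1, "r"), (2, "p"), (3, "r")}

print(is_function(f, A))            # True
print(is_function(inverse(f), B))   # False: q has no partner and r has two
```

The inverse relation fails for two separate reasons, one tied to f not being surjective and the other to f not being injective.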
Theorem 7.9. Let f : A → B be a function and consider its inverse relation f −1 ⊆ B × A. Then
Proof. Recalling Definition 7.6, we see that f −1 is a function if and only if the two conditions hold:
1. dom( f −1 ) = B
2. (b, a1 ), (b, a2 ) ∈ f −1 =⇒ a1 = a2
Exercises 7.2. A reading quiz and practice questions can be found online.
7.3 Equivalence Relations & Partitions
What do we mean when we say that two objects are equal? Mathematicians use the word flexibly,
often in reference to non-identical objects which share some common property.43 You’ve been doing
this for years, indeed our very first result (Theorem 1.1: even + even = even) has exactly this flavor.
To help develop a more flexible notion of equality, consider three important concepts.
Reflexivity: ∀ x ∈ X, x = x
Symmetry: (∀ x, y ∈ X ) x = y =⇒ y = x
Transitivity: (∀ x, y, z ∈ X ) x = y and y = z =⇒ x = z
We turn this simple example on its head to create an abstract, generalized notion of equality.
Example 7.10 says that ‘equals’ is indeed an equivalence relation on any set X. Things would be very
boring if this were the only example. . .
As we’ll see shortly, an equivalence relation algebraically characterizes what it means for elements to
share a common property: here x ∼ y if and only if they have the same parity.
43 This goes back at least to Euclid, who used equal to refer to congruent triangles. To Euclid, congruent triangles were
typically hidden. The symbol ∼ (‘tilde,’ or ‘twiddles’) is commonly used for an abstract equivalence relation. It is the same
symbol used to denote similar triangles: congruence and similarity are both equivalence relations on the set of triangles!
It would also be somewhat dull if every relation were an equivalence relation. In fact most are not.
Examples 7.13. 1. Consider the relation ≤ on the natural numbers N. We check each condition:
Reflexivity: True. ∀x ∈ N, x ≤ x.
Symmetry: False. For example, 2 ≤ 3 but 3 ≰ 2.
Transitivity: True. ∀x, y, z ∈ N, if x ≤ y and y ≤ z, then x ≤ z.
Plainly ≤ is not an equivalence relation on N.
The usefulness of equivalence relations comes when we group together all related elements.
Definition 7.14. Given an equivalence relation ∼ on X, the equivalence class of x ∈ X is the set
[ x ] = {y ∈ X : y ∼ x } (y ∈ [ x ] ⇐⇒ y ∼ x)
2. (Example 7.12) For the relation x ∼ y ⇐⇒ x − y is even, there are two equivalence classes:
[0] = {. . . , −4, −2, 0, 2, 4, . . .} = 2Z = {even integers}
[1] = {. . . , −3, −1, 1, 3, 5, . . .} = 1 + 2Z = {odd integers}
and the quotient has two elements: Z/∼ = {[0], [1]}.
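On a finite set, equivalence classes can be collected by comparing each element with one representative of every class found so far. The Python sketch below is an illustration under stated assumptions: the function name quotient is hypothetical, the predicate passed in is assumed to be a genuine equivalence relation, and a finite window of integers stands in for Z.

```python
def quotient(X, related):
    """Group the elements of a finite set X into equivalence classes.

    Assumes `related` is an equivalence relation on X, so that testing
    against a single representative of each class is enough.
    """
    classes = []
    for x in X:
        for cls in classes:
            representative = next(iter(cls))
            if related(x, representative):
                cls.add(x)
                break
        else:
            classes.append({x})   # x starts a new class
    return classes


def same_parity(x, y):
    return (x - y) % 2 == 0


for cls in quotient(range(-4, 6), same_parity):
    print(sorted(cls))
# [-4, -2, 0, 2, 4]
# [-3, -1, 1, 3, 5]
```

If the predicate is not transitive or not symmetric, the output depends on which representative happens to be chosen, which is one way to see why all three properties matter.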
Note that any even number is a representative of the first class in the even/odd example: for instance
4 ∈ [0] since 4 − 0 = 4 is even.
Indeed [4] = [0], which leads us to a simple piece of bookkeeping. . .
Observe how the proof uses all three defining properties of an equivalence relation.
(⇒) Suppose x ∼ y. We prove that [ x ] = [y] by showing that each side is a subset of the other.
(⊆) Let z ∈ [ x ]. By definition, z ∼ x. By transitivity,
z ∼ x and x ∼ y =⇒ z ∼ y =⇒ z ∈ [y]
(⊇) By symmetry, we also have y ∼ x. Repeating the previous argument yields [y] ⊆ [ x ].
It should be clear that this is an equivalence relation (try saying out loud what reflexivity, symme-
try & transitivity mean here!). There is one equivalence class for each letter-grade awarded, each
class being the set of all students who obtain a given grade. If we label the equivalence classes
A+ , A, A− , B+ , . . . , F, where, say, B = {students obtaining a B-grade}, then the quotient
X/∼ = {A+, A, A−, B+, . . . , F}
is essentially the set of possible letter-grades! Suppose Laura & Jorge both achieve a B+ ; both are
representatives of this equivalence class and the following propositions are all true:
x ∼ y ⇐⇒ f ( x ) = f (y)
Examples 7.19. 1. The function
x ∼ y ⇐⇒ x^2 ≡ y^2 (mod 5)
(x, y) ∼ (v, w) ⇐⇒ x^2 + y^2 = v^2 + w^2
As stated in the Theorem, there is one equivalence class for
each element r^2 ∈ range(f) = R₀⁺: if x^2 + y^2 = r^2, then
[(x, y)] = {(v, w) ∈ R^2 : v^2 + w^2 = r^2}
Partitions When you cut a cake, each crumb ends up in exactly one slice. To partition a set is to do
precisely this: split into disjoint subsets so that each element lies in exactly one subset.
Partitions are intimately related to equivalence relations, as the next example illustrates.
Run through the checklist: ∼ is reflexive, symmetric & transitive; it is an equivalence relation! More-
over, its equivalence classes are precisely the partitioning subsets A1 , A2 and A3 :
This tight relationship between partitions and equivalence relations is completely general: equiva-
lence relations provide a straightforward algebraic method of working with partitions.
x ∼ y ⇐⇒ ∃ An such that x, y ∈ An
Its equivalence classes are the distinct subsets An: otherwise said, X/∼ = {An}.
Before reading the proof, look back at every example of an equivalence relation in this section and
convince yourself that the equivalence classes really do partition the ‘big set.’ In both parts of the
proof look for where the assumptions are used—reflexive, symmetric, transitive in part 1; the two
partition conditions in part 2—and think about why they are needed.
version of the vertical line test for the canonical map (Theorem 7.18)!
Exercises 7.3. A reading quiz and practice questions can be found online.
3. For each equivalence relation on R2 , identify the equivalence classes and draw several of them.
4. Let X = {1, 2, 3, 4, 5, 6}. The distinct equivalence classes resulting from an equivalence relation
∼ on X are {1, 4, 5}, {2, 6}, and {3}. What is ∼? Give your answer as a subset of X × X.
5. ⊆ is a relation on any set of sets. Is ⊆ reflexive, symmetric, transitive? Prove your assertions.
6. Suppose X is a set with at least two elements. Which of the properties reflexive, symmetric,
transitive are satisfied by the relation ≠?
8. Let A = {2^m : m ∈ Z}. A relation ∼ is defined on the set Q+ of positive rational numbers by
a ∼ b ⇐⇒ ab^{−1} ∈ A
Show that ∼ is an equivalence relation and describe the elements in the equivalence class [3].
9. A relation is defined on the set X = {a + b√2 : a, b ∈ Q, a + b√2 ≠ 0} by x ∼ y ⇐⇒ x/y ∈ Q.
Show that ∼ is an equivalence relation and determine the distinct equivalence classes.
10. For the purposes of this question, a real number x is small if |x| ≤ 1. Let R be the relation on
the set of real numbers defined by x R y ⇐⇒ x − y is small.
Prove or disprove: R is an equivalence relation on R.
11. Find the equivalence classes for the relation ∼ in Example 7.19.1.
12. For each relation R on Z, decide whether it is reflexive, symmetric, or transitive, and whether
it is an equivalence relation.
13. For Example 7.13.3, compute the ‘classes’ [ x ] = {y ∈ X : x Ry}. What do you observe?
14. Let X = {1, 2, 3}. Define the relation S = {(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (3, 1), (3, 3)} on X.
(a) Which of the properties reflexive, symmetric, transitive are satisfied by S?
(b) Let An = {x ∈ X : x S n}. Show that {A1, A2, A3} do not partition X.
(c) Repeat parts (a) and (b) for the relation T on X, where
T = {(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 3)}
(Warning! Some of the sets A1 , A2 , A3 might be the same in these examples)
15. (Example 7.19.2) Prove directly that the circles Ar = {(x, y) : x^2 + y^2 = r^2} partition R^2: i.e.,
R^2 = ⋃_{r≥0} Ar and Ar1 ∩ Ar2 ≠ ∅ =⇒ Ar1 = Ar2
16. Determine whether each collection {An : n ∈ R} partitions R^2. Sketch several of the sets An.
(a) An = {(x, y) ∈ R^2 : y = 2x + n} (b) An = {(x, y) ∈ R^2 : y = (x − n)^2}
(c) An = {(x, y) ∈ R^2 : xy = n} (d) An = {(x, y) ∈ R^2 : y^4 − y^2 = x − n}
17. Let X be the set of all humans. If x ∈ X, define the set
A x = {people who had the same breakfast or lunch as x }
(a) Does the collection { A x : x ∈ X } partition X? Explain your answer.
(b) Is your answer different if the or in the definition of A x is changed to and?
18. A relation R is antisymmetric if ( x, y) ∈ R ∧ (y, x ) ∈ R =⇒ x = y. Give examples of
relations R on X = {1, 2, 3} having the stated property.
(a) R is both symmetric and antisymmetric.
(b) R is neither symmetric nor antisymmetric.
(c) R is transitive but R ∪ R−1 is not transitive.
19. Let R be a relation on X and define its reflexive closure S = R ∪ {(x, x) : x ∈ X}. Prove that S
is reflexive and that, if T is any reflexive relation for which R ⊆ T, then S ⊆ T.
20. (a) Prove Theorem 7.18.
(b) If f has domain X, explain why {f^{−1}({b}) : b ∈ range(f)} forms a partition of X.
21. Define a relation ∼ on R^2 \ {(0, 0)} by
(x, y) ∼ (v, w) ⇐⇒ ∃λ ≠ 0 such that (λx, λy) = (v, w)
(a) Prove that ∼ is an equivalence relation.
(b) What does this relation have to do with projective space P(R2 ) (Example 6.16)?
22. (If you’ve studied linear algebra) We say that square matrices A, B are similar if there exists a
matrix M such that B = MAM−1 .
(a) Prove that similarity is an equivalence relation on the set of n × n matrices.
(b) What is the equivalence class of the identity matrix?
(c) Show that
( −11  15 )       ( 4   10 )
(  −5   9 )  and  ( 0   −6 )
are similar.
(Hint: diagonalize the matrices!)
7.4 Well-definition, Rings and Congruence
We return to our discussion of congruence (Section 3.1) in the context of equivalence relations and
partitions. The important observation is that congruence modulo n is an equivalence relation on Z, each
equivalence class being the set of all integers sharing a remainder modulo n.
The theorem is merely a generalization of Example 7.12 and Exercise 7.3.1. You should prove this
yourself. Each equivalence class consists precisely of the integers congruent to a given integer a modulo n:
[a] = {x ∈ Z : x ≡ a (mod n)}
= {x ∈ Z : x and a have the same remainder when divided by n}
= {x ∈ Z : x − a is divisible by n}
In this language, we can restate what it means for two equivalence classes to be equal.
If the meaning of any of the above is unclear, re-read the previous section! The equivalence classes of
∼n partition the integers into exactly n subsets, one for each remainder. The quotient set is therefore
Z/∼n = {[0], [1], . . . , [n − 1]}
We use this set to define an extremely important object.
Definition 7.25. Define operations +n and ·n on the quotient set Z/∼n as follows:
[x] +n [y] := [x + y],      [x] ·n [y] := [x · y]
The ring Zn is the set Z/∼n together with the operations +n and ·n.
There is a potential difficulty: each equivalence class is large (e.g., [3] = {. . . , −5, 3, 11, 19, . . .}), so we
have lots of choice regarding representatives. For instance, since [3] = [11] and [6] = [22] we should
be able to observe that
Is this true? If not, then the operation +8 would not be particularly useful. Thankfully this is not a
problem: according to the definition of +8 , things turn out exactly as we’d like,
Consider things more abstractly. Given equivalence classes X and Y, here is the process for comput-
ing the equivalence class X +n Y:
1. Choose representatives x ∈ X and y ∈ Y so that X = [ x ] and Y = [y].
This is nothing more than a rehash of Theorem 3.9: compare it with what follows until you are
comfortable we are doing the same thing! Observe that
[x] = {z ∈ Z : z ≡ x (mod n)} = {x + kn : k ∈ Z}
Otherwise said, all representatives (all possible choices in step 1) of the equivalence class [ x ] have the
form x + kn for some k ∈ Z. Thinking similarly for y, the Theorem requires only that we prove
Compare our equivalence class notation with that from Section 3.1. For instance
In this language, (∗) would be written −3 + 12 = 1 in Z8 . It is critical to appreciate that −3, 12 and 1
denote sets (equivalence classes) and not integers. Until you are 100% certain of this, you should stick
to either of the notations in (∗).
It is something of a mathematical joke to write 1 + 1 = 0 (in Z2 ). Even in this context, 1 + 1 = 2;
the seeming paradox is that 2 = 0 in Z2 . This is really a very simple claim about sets: that the sum
of any two odd numbers is even! Indeed general modular arithmetic (in Zn ) is little more than an
abstraction/generalization of this simple fact.
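The independence of +n from the choice of representatives can be observed numerically. The sketch below is an illustration only, not a proof: the function add_mod is our own name, each class is reported by its representative between 0 and n − 1, and only finitely many representatives are tried.

```python
def add_mod(x, y, n):
    """Representative in 0..n-1 of the class [x] +_n [y] = [x + y]."""
    return (x + y) % n


n = 8
print(add_mod(3, 6, n), add_mod(11, 22, n))   # 1 1
print(add_mod(-3, 12, n))                     # 1

# Replace 3 and 6 by other representatives 3 + k*n and 6 + l*n:
# every choice lands in the same class.
results = {add_mod(3 + k * n, 6 + l * n, n)
           for k in range(-5, 6) for l in range(-5, 6)}
print(results)                                # {1}
```

Python's % operator returns a non-negative remainder for a positive modulus, so negative representatives such as −3 are handled correctly.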
Functions and Partitions Our construction of Zn is a special case of a more general situation.49
We say that f is well-defined if this construction defines a legitimate function. More formally:
[ x ] = [y] =⇒ f ([ x ]) = f ([y])
Examples 7.29. 1. Consider Z3 = Z/∼, that is X = Z where x ∼ y ⇐⇒ x ≡ y (mod 3). The rule
f ([ x ]) = 2x + 1 does not produce a well-defined function f : Z3 → Z since, for instance
since 6x ≡ 0 (mod 3). If this seems abstract, note that f is easily
stated in tabular form since there are only three possible inputs!
[x]       [0]  [1]  [2]
f([x])    [1]  [0]  [2]
3. This last example is something of an advert for the advanced notation in the previous aside.
We ask for which integers k the rule f k ( x ) = kx produces a well-defined function f k : Z4 → Z6 .
If this is confusing, rewrite using either of the notations
f k (x) = kx (mod 6) or f k ([x]_4) = [kx]_6
Start with a special case to get a feel for things. A table
of values for f 1 (x) shows an immediate problem!
x         0  1  2  3  4  5  6  · · ·
f 1 (x)   0  1  2  3  4  5  0  · · ·
In Z4 , we have 0 = 4, however f 1 (0) = 0 and f 1 (4) = 4 which are not equal in Z6 ! It follows that
f 1 is not well-defined: it isn’t a function (and never was!).
Now proceed systematically.
It follows that f k is well-defined ( f k ( x ) = f k (y) ∈ Z6 ) if and only if 6 | 4kn for all n ∈ Z. This is
the case if and only if 6 | 4k. Otherwise said,
f k is well-defined ⇐⇒ 6 | 4k ⇐⇒ 3 | 2k ⇐⇒ 3 | k
Since kx ∈ Z6, equivalent values of k modulo 6 won’t
change f k: there are only two well-defined functions
f 0 (x) = 0 and f 3 (x) = 3x.
x         0  1  2  3  4  5  6  · · ·
f 0 (x)   0  0  0  0  0  0  0  · · ·
f 3 (x)   0  3  0  3  0  3  0  · · ·
You’ll have much more practice with problems like these when you study group theory.
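The conclusion of the last example can be double-checked by brute force. In this Python sketch (function and parameter names are hypothetical) each class in Z4 is sampled through several representatives x + 4j, and the rule is accepted only if all of them give the same class in Z6.

```python
def is_well_defined(k, m=4, n=6, window=3):
    """Check that [x]_m -> [k*x]_n gives one value per class, on sampled representatives."""
    for x in range(m):
        # Representatives x + j*m of the class [x]_m, for a few values of j.
        values = {(k * (x + j * m)) % n for j in range(-window, window + 1)}
        if len(values) != 1:
            return False
    return True


print([k for k in range(6) if is_well_defined(k)])   # [0, 3]
```

Comparing j = 0 with j = 1 already detects any failure, since the two values differ by 4k modulo 6; the wider window is only there for reassurance.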
49 Theorem 7.27 verifies the well-definition of two functions +n, ·n : Z/∼n × Z/∼n → Z/∼n
Just for Fun: Geometric Applications (optional!)
We consider how equivalence relations permit the easy construction of certain geometric objects and
of functions on such.
( a, b) ∼ (c, d) ⇐⇒ a − c ∈ Z and b = d
The equivalence classes are horizontal strings of points with the same y co-ordinate:
[(a, b)] = {(a + n, b) : n ∈ Z}
The set of equivalence classes R^2/∼ may be visualized as a cylinder by imagining rolling the plane
into a tube of circumference 1 so that all points in a given equivalence class coincide.⁵⁰
[Figure: the plane wrapped around into a cylinder]
Now consider the rule f([(x, y)]) = y sin(2πx).
2. As a natural extension, see if you can visualize why the equivalence classes of the relation
( a, b) ∼ (c, d) ⇐⇒ a − c ∈ Z and b − d ∈ Z
is well-defined and may be thought of as having domain the torus. The pictures below illustrate
f, where the colors correspond.
f([(x, y)]) = sin(2πx) cos(2πy)
[Figure: color plot of f over 0 ≤ x, y ≤ 1, with color scale from −1.0 to +1.0]
50 Alternatively,
imagine piercing a roll of toilet paper and unrolling it: the single puncture becomes a row of (almost!)
equally spaced holes. Unfortunately for the analogy, toilet paper has purposeful thickness!
Equivalence relations and partitions are regularly used in this manner to describe objects in geometry
and topology. Here is a final famous example written using the language of partitions.
• If P ∈ X does not lie on the left or right edges, place it in a subset {P} by itself.
• Otherwise, pair a point Q on one edge with the point Q̂ on the other edge identified in the opposite direction, forming the subset {Q, Q̂}.
These subsets clearly partition the rectangle X and thus describe an equivalence relation ∼ on X. The
quotient set X ∼ may be visualized via the classic construction of a Möbius strip: give the rectangle
a half-twist and glue the two edges so that both points in each equivalance class coincide. This
construction allows us to analyse the strip while working on the rectangle: if you walk off the right
side of the rectangle (at Q̂) you simply end up at the corresponding point (Q) on the left side!
[Figure: the rectangle X with points P, Q and Q̂ marked; after a half-twist and glue it becomes a Möbius strip]
Exercises 7.4. A reading quiz and practice questions can be found online.
1. Let X be the set of students in a (large) math class and define an equivalence relation on X via
9. (a) Suppose f : X/∼ → A is well-defined. State what it means for f to be injective. What do
you notice?
(b) Prove that f : Z_7 → Z_35 : [x]_7 ↦ [15x]_35 is a well-defined, injective function.
(c) Repeat part (b) for g : Z_100 → Z_300 : [x]_100 ↦ [9x]_300.
(Hint: You may find it useful that 9 · (−11) ≡ 1 (mod 100))
10. Define a partition of the sphere S² = {(x, y, z) : x² + y² + z² = 1} into subsets consisting of
pairs of antipodal points: {(x, y, z), (−x, −y, −z)}.
Let ∼ be the equivalence relation whose equivalence classes are the above subsets.
(a) f : S²/∼ → R : [(x, y, z)] ↦ xyz is not well-defined. Explain why.
(b) Prove that f : S²/∼ → R³ : [(x, y, z)] ↦ (yz, xz, xy) is a well-defined function.
The image of this function is Steiner’s Roman Surface; look it up!
11. Consider the relation ( a, b) ∼ (c, d) ⇐⇒ ad = bc on Z × N.
(a) Prove that ∼ is an equivalence relation.
(b) List several elements of the equivalence class [(2, 3)]. Repeat for [(−3, 7)]. What does ∼
have to do with the set of rational numbers Q?
(c) Define operations + and × on (Z × N)/∼ by
[(a, b)] + [(c, d)] = [(ad + bc, bd)]        [(a, b)] × [(c, d)] = [(ac, bd)]
Prove that + and × are well-defined (do this without using division!).
(d) (Hard) Prove that f([(x, y)]) = x/y is a well-defined bijection f : (Z × N)/∼ → Q, and that f
transforms + and × into the usual addition and multiplication on Q: that is,
f([(a, b)] + [(c, d)]) = f([(a, b)]) + f([(c, d)]), etc.
(This is essentially a formal definition of the ring of rational numbers!)
12. Define ∼ on R by x ∼ y if and only if x − y ∈ Z.
(a) Show ∼ is an equivalence relation on R.
(b) Find an example of a surjective function F : R → [0, 1) such that x ∼ y =⇒ F ( x ) = F (y).
(c) Use part (b) to find a well-defined function f : R/∼ → [0, 1).
13. Suppose ∼ is an equivalence relation on X and let γ( x ) = [ x ]. Prove the following:
(a) If f : X/∼ → A is well-defined, then F = f ◦ γ : X → A satisfies x ∼ y =⇒ F(x) = F(y).
(b) If F : X → A satisfies x ∼ y =⇒ F(x) = F(y), then there is a unique function f : X/∼ → A
satisfying F = f ◦ γ.
14. (Just for fun!) Prove that you can cut a Möbius strip round the middle yet
still end up with a single loop.
Look up the description of a Klein bottle: how is the relationship between
it and the Möbius strip similar to the torus/cylinder relationship?
8 Cardinality and Infinite Sets
During the late 1800s, Georg Cantor assaulted the foundations of mathematics with his investigations
of certain infinite sets, in particular his middle-third set (Example 6.18). Cantor’s ideas about
infinity met significant resistance: some mathematicians and philosophers considered his approach
unnatural; he even inflamed religious scholars who thought his ideas an affront to the divine!
Despite initial skepticism, Cantor’s notion of cardinality is now universally accepted. By unearthing
several contradictions inherent in contemporary (naïve) set theory, mathematicians became convinced
that a rigorous axiomatic approach was necessary. Developing an axiomatic foundation to
mathematics became a core goal of early 20th century mathematics.
While the elements of A and B are completely different, the sets themselves may be compared using
cardinality: we write | A| ≤ | B| to indicate that B has at least as many elements as A. In Theorem
4.25, we saw that this mandates the existence of an injective function f : A → B: plainly
f (fish) = α, f (dog) = β
Definition 8.2. The cardinality of a set A is denoted | A|. We compare cardinalities as follows:
• | A| ≤ | B| ⇐⇒ ∃ f : A → B injective.
• | A| = | B| ⇐⇒ ∃ g : A → B bijective.
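For finite sets, both bullet points can be tested by exhaustive search. The following sketch is only an illustration of the definition: the function names injections, card_leq and card_eq are hypothetical, and the sample sets (in particular the third element γ of B) are assumptions made for the example.

```python
from itertools import permutations

def injections(A, B):
    """Yield every injective function f : A -> B as a dict (finite sets only)."""
    A, B = list(A), list(B)
    # Each ordered choice of len(A) distinct elements of B is one injection.
    for image in permutations(B, len(A)):
        yield dict(zip(A, image))

def card_leq(A, B):
    """|A| <= |B|: some injective f : A -> B exists."""
    return next(injections(A, B), None) is not None

def card_eq(A, B):
    """|A| = |B|: some injective f : A -> B is also surjective."""
    return any(set(f.values()) == set(B) for f in injections(A, B))

A = ["fish", "dog"]          # sample sets, chosen for illustration
B = ["α", "β", "γ"]

print(card_leq(A, B), card_leq(B, A), card_eq(A, B))
```

Such a search is hopeless for infinite sets, which is exactly why the definition is phrased in terms of the existence of functions rather than counting.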
“Cardinality” is an abstract property whereby sets can be compared, rather than a value attaching to
a given set.51 Regardless, the Lemma proves that cardinality partitions any collection of sets: every
set has a cardinality, and no set has more than one cardinality. It is moreover natural to identify the
cardinalities of finite sets with the cardinal numbers 0, 1, 2, 3, 4, . . .
51 We could use the Lemma to define | A| := [ A] as the equivalence class of A, though this likely feels confusing!
Countably Infinite Sets
To make progress, it is helpful to introduce a symbol for the cardinality of the simplest infinite set.
Definition 8.4. We say that A is a countably infinite52 set and write | A| = ℵ0 (aleph-nought or aleph-
null), if its cardinality equals that of the set of natural numbers N:
| A| = ℵ0 ⇐⇒ ∃ g : N → A bijective
In the next section we’ll see why a new symbol is necessary; why ∞ doesn’t suffice.
2. Let 2N = {2, 4, 6, 8, 10, . . .} be the set of positive even integers. The function
h : N → 2N : n ↦ 2n
These examples demonstrate a perplexing property of infinite sets. For instance, 2N is a proper subset
of N, yet it is in bijective correspondence with N! It feels like we want to say two contradictory things:
• N has the ‘same number of elements’ as 2N (bijectivity of h).
• N has ‘twice the number of elements’ as 2N (‘half’ of N is even, and ‘half’ odd).
The source of our discomfort is that number of elements is meaningless for infinite sets. However, if we
replace this phrase with cardinality, then both statements are true!54 The existence of a proper subset
with the same cardinality can indeed be used as a definition of infinite set (Exercise 13).
Rather than hunting for an explicit bijection, the simplest approach to these problems is often to list
the elements of a set A so that it ‘looks like’ the natural numbers, with an initial element a1 followed
by all others in sequence:
For instance, we could have approached our examples by listing elements in tabular form:
N n 1 2 3 4 5 6 7 8 9 10 · · ·
N≥2 g(n) 2 3 4 5 6 7 8 9 10 11 · · ·
2N h(n) 2 4 6 8 10 12 14 16 18 20 · · ·
The required bijections are immediately visible with little need to state them explicitly. We apply this
technique to two important examples of countably infinite sets.
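If it helps, the tabular listing can be generated mechanically. This is a minimal sketch which assumes, reading off the table, that g(n) = n + 1 lists N≥2 and h(n) = 2n lists 2N; the dictionary named table is merely a convenient container for the three rows.

```python
def g(n):
    """Listing of N≥2: the n-th entry is n + 1 (read off from the table)."""
    return n + 1

def h(n):
    """Listing of 2N: the n-th entry is 2n."""
    return 2 * n

ns = list(range(1, 11))
table = {
    "n": ns,
    "g(n)": [g(n) for n in ns],
    "h(n)": [h(n) for n in ns],
}

# Print one row per listing, mirroring the layout of the table
for label, row in table.items():
    print(f"{label:>5}", *row)
```

Each row contains no repeats, and it is clear how to continue it so that every element of the target set eventually appears: this is precisely what bijectivity requires.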
52 Some authors say denumerable and use countable for sets with | A| ≤ ℵ0 . Aleph is the first letter of the Hebrew alphabet.
53 This is a version of Hilbert’s Grand Hotel Problem. Imagine a hotel with an infinite number of rooms: Room 1, Room 2,
Room 3, Room 4, etc. Even if the hotel is full, by moving everyone one room down the hall, space can always be found for
an additional guest. The second example is another version, where infinitely many new guests may be accommodated!
54 We won’t pursue cardinal arithmetic in any detail, but it is completely legitimate to write 2ℵ0 = ℵ0 for the second
example. The first example would be 1 + ℵ0 = ℵ0 .
List the set of integers in a non-standard order, alternating positive and negative terms:
Z = {z1 , z2 , z3 , z4 , . . .} = {0, 1, −1, 2, −2, 3, −3, 4, −4, . . .}
The function g : N → Z : n ↦ z_n is bijective, and we conclude:
You might feel that our argument was too quick! Is it really obvious what g is? Bijectivity is the
observation that every integer appears exactly once in our non-standard ordering. If you really want,
you can construct an explicit formula for g from a tabular representation
n      1  2  3  4  5  6  7  8  9  ···
g(n)   0  1  −1  2  −2  3  −3  4  −4  ···
=⇒ g(n) = n/2 if n is even, and g(n) = −(n − 1)/2 if n is odd
and prove bijectivity using the formula. In practice, it is not worth the cost to be this explicit.
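For the curious, here is a short sketch of the explicit formula together with a candidate inverse. The name g_inv and its formula are our own additions (an assumption, not part of the text); checking that the two functions undo each other on a finite range is evidence of bijectivity rather than a proof.

```python
def g(n):
    """n-th term (n = 1, 2, 3, ...) of the listing 0, 1, -1, 2, -2, ..."""
    if n % 2 == 0:
        return n // 2
    return -((n - 1) // 2)

def g_inv(z):
    """Candidate inverse: the position of the integer z in the listing."""
    return 2 * z if z > 0 else 1 - 2 * z

first_nine = [g(n) for n in range(1, 10)]
print(first_nine)

# Round trips on a finite range
print(all(g_inv(g(n)) == n for n in range(1, 1000)))
print(all(g(g_inv(z)) == z for z in range(-500, 501)))
```

A full proof would verify g_inv(g(n)) = n and g(g_inv(z)) = z for all n ∈ N and z ∈ Z by splitting into the even/odd and positive/non-positive cases.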
As you build up examples, you no longer have to compare directly to the natural numbers. Since
composition of bijective functions is bijective (Theorem 4.22/transitivity in Lemma 8.3),
B is countably infinite ⇐⇒ ∃ A countably infinite and ∃ g : A → B bijective
For instance, the set of even integers 2Z is countably infinite via the bijection g : Z → 2Z : z ↦ 2z.
We use this approach to help prove the first of Cantor’s truly counter-intuitive revelations.
Any sensible person should feel that there are far, far more rational numbers than integers, yet the
sets have the same cardinality. Bizarre!
Proof. For each pair of natural numbers a, b, place the fraction a/b in the bth row, ath column of an
infinite square. List the positive rational numbers by tracing diagonals in a snake-like manner and
deleting any number that has already been traced (2/2 = 1/1, 4/6 = 2/3, etc.).
1/1  2/1  3/1  4/1  5/1  6/1  7/1  ···
1/2  2/2  3/2  4/2  5/2  6/2  7/2  ···
1/3  2/3  3/3  4/3  5/3  6/3  7/3  ···
1/4  2/4  3/4  4/4  5/4  6/4  7/4  ···
1/5  2/5  3/5  4/5  5/5  6/5  7/5  ···
1/6  2/6  3/6  4/6  5/6  6/6  7/6  ···
 ⋮    ⋮    ⋮    ⋮    ⋮    ⋮    ⋮
Since each a/b appears in diagonal a + b − 1 and repeats are deleted, every positive rational number
appears exactly once. We therefore obtain an enumeration
Q+ = {1/1, 2/1, 1/2, 1/3, 3/1, 4/1, 3/2, 2/3, 1/4, 1/5, . . .}
and conclude that Q+ is countably infinite. Plainly this corresponds to a bijection g : N → Q+ . To
finish the proof, extend g by defining the bijective function
h : Z → Q : n ↦
    g(n)      if n > 0
    0         if n = 0
    −g(−n)    if n < 0
which identifies the negative rationals with Z− . By Theorem 8.6, we deduce that |Q| = |Z| = ℵ0 .
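The snake-like listing is easy to mechanize. The sketch below is one possible reading of the proof: it assumes that even-numbered diagonals are traced starting from the top row and odd-numbered diagonals starting from the first column, which is the convention that reproduces the first ten terms of the enumeration above. The generator name positive_rationals is our own.

```python
from fractions import Fraction
from itertools import count, islice

def positive_rationals():
    """Yield the positive rationals by snaking along the diagonals of the
    infinite square, skipping any value that has already appeared."""
    seen = set()
    for d in count(1):                 # diagonal d holds a/b with a + b - 1 = d
        numerators = range(1, d + 1)
        if d % 2 == 0:                 # even diagonals start in the top row
            numerators = reversed(numerators)
        for a in numerators:
            b = d + 1 - a
            q = Fraction(a, b)         # Fraction reduces to lowest terms
            if q not in seen:
                seen.add(q)
                yield q

first_ten = list(islice(positive_rationals(), 10))
print(", ".join(f"{q.numerator}/{q.denominator}" for q in first_ten))
```

Because repeats such as 2/2 are skipped, the position of a given rational in the list is awkward to compute by formula, which is why the proof is content with describing the listing rather than writing g explicitly.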
Other countably infinite sets appear to be even larger than Q! For example:
• Cartesian products such as Q × Q.
• The algebraic numbers A = {x ∈ C : p(x) = 0 for some non-zero polynomial p with integer coefficients}.
Every rational number is algebraic (a/b is a root of p(x) = bx − a), but so are many irrationals
(√2 is a root of p(x) = x² − 2). Non-algebraic numbers (e.g., π and e) are termed transcendental.
Arguments for these examples are left to the exercises.
Otherwise said, every infinite set has cardinality at least as large as the natural numbers: ℵ0 may
therefore be considered the least infinite cardinal.
Proof. (⇒) Express the set in roster notation A = { a1 , . . . , an }. We must prove two things:55
Since A has n elements, at least two of the values g(1), . . . , g(n + 1) must be equal. There-
fore g is not injective and consequently not bijective.
We’ll address the existence of infinite sets with cardinality larger than ℵ0 in the next section.
Exercises 8.1. A reading quiz and practice questions can be found online.
1. Refresh your proof skills by proving explicitly that the following functions are bijections:
4. Show that the set of all triples of the form (n², 5, n + 2) with n ∈ 3Z is countably infinite by
providing an explicit bijection with a known countably infinite set.
5. State a bijection g : (0, 1) → (4, 6) which shows that these intervals have the same cardinality.
55 The n = 0 case (A = ∅) works, though it feels strange: f = ∅ is a suitable injective (!) function f : ∅ → N, and there
are no functions g ⊆ N → ∅. It will help to think about the formal definition of function (Section 7.2)!
6. Prove Lemma 8.3 (revisit Theorem 4.22 on the composition of bijective functions.)
7. Prove that A ⊆ B =⇒ | A| ≤ | B| (show there exists an injective function f : A → B).
8. (a) Prove that N × N is countably infinite by modifying the proof of Theorem 8.7.
(b) Combine part (a) with Theorem 8.7 to prove that Q × Q is countably infinite.
(c) Suppose An is countably infinite for each n ∈ N and list elements as follows:
A1 = { a11 , a12 , a13 , a14 , . . .}
A2 = { a21 , a22 , a23 , a24 , . . .}
A3 = { a31 , a32 , a33 , a34 , . . .}, etc.
Prove that the union ⋃ A_n (taken over all n ∈ N) is countably infinite (a countable union of
countable sets is countable).
8.2 Uncountable Sets
Since Q seems so large, you might imagine that no sets could have strictly larger cardinality. But we
haven’t yet thought about the real numbers. . .
Definition 8.9. A set A is uncountable if ℵ0 < | A|. Otherwise said, there exists an injection f : N → A
but no bijection g : N → A.
We denote the cardinality of the interval (0, 1) by c for continuum. The theorem may therefore be
written ℵ0 < c. We prove this by showing that ℵ0 ≤ c and ℵ0 ̸= c.
Proof. (ℵ0 ≤ c) The function f : N → (0, 1) : n ↦ 1/(n + 1) is plainly injective:
f(n) = f(m) =⇒ 1/(n + 1) = 1/(m + 1) =⇒ n = m
(ℵ0 ̸= c) Suppose, for contradiction, that g : N → (0, 1) is a bijection. Express the sequence of values
g(1), g(2), g(3), . . . as decimals:56
Since x disagrees with g(n) at the nth decimal place, we see that x ̸= g(n): that is, x is not in the
above list. However x ∈ (0, 1) and g is surjective, so x must be in the list: contradiction.
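The construction of x can be imitated on a finite list of digit strings. The sketch below assumes the rule x_n = 2 if the nth digit of g(n) is 1, and x_n = 1 otherwise, which is one choice consistent with restricting to the digits 1 and 2; the sample rows are arbitrary and the name diagonal_digits is hypothetical.

```python
def diagonal_digits(listed):
    """Row n of `listed` holds the leading decimal digits of the (n+1)-th
    listed number. Return digits of a number that differs from row n in
    its own n-th place (0-indexed), using only the digits 1 and 2."""
    return "".join(
        "2" if row[n] == "1" else "1" for n, row in enumerate(listed)
    )

# Seven made-up rows of seven digits each, standing in for g(1), ..., g(7)
sample = [
    "1415926",
    "7182818",
    "3333333",
    "1234567",
    "5000000",
    "1428571",
    "0101010",
]

x_digits = diagonal_digits(sample)
print("0." + x_digits)
```

However long the list, the same recipe produces a decimal that disagrees with every row on the diagonal, which is the heart of the contradiction.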
The second part of the proof is known as Cantor’s diagonal argument, since we compare the constructed
decimal x with the diagonal of an infinite square of integers. Since the interval (0, 1) is uncountable,
and (0, 1) ⊆ R, it is immediate that the real numbers are also uncountable. Using only the ideas
developed so far (a combination of Exercises 8.1.5 and 14), we could prove directly that every interval
of finite length has cardinality c. It is easier, however, to delay this momentarily. . .
Even more amazingly, Cantor’s middle-third set (Example 6.18) also has cardinality c, despite seem-
ing vanishingly small! The details, and more, are in Exercise 12.
56 A number x ∈ (0, 1) has two decimal representations if and only if one of them terminates and the other ultimately
becomes an infinite sequence of 9’s: e.g., 0.135 = 0.1349999 . . .. For this proof, we choose the terminating decimal whenever
it exists. We restrict to xn = 1, 2 later in the proof to keep away from these double representations.
Non-explicit Comparison of Cardinalities
The following result is very useful for comparing cardinalities.
This seems like it should be obvious, but pause for a moment: it is not a result about numbers! The
theorem should be understood in the context of Definition 8.2, in which language it becomes:
The proof is beautiful, though a little long to reproduce here; if you’re interested, check out any ele-
mentary text on set theory. Its usefulness is that it allows us to equate cardinalities without explicitly
constructing bijective functions; injective functions are typically easier to conjure!
Example 8.12. The following functions are both injective (only—they are not bijective!):
Proof. Let A be an interval (could be infinite length) and choose a subinterval ( a, b) ⊆ A. The follow-
ing functions are injective:
Cantor’s Paradoxical Theorem
For a final punchline, we generalize Theorem 6.8 which, for finite sets A, asserted that |P(A)| = 2^|A|
is strictly larger than |A| itself. We now have the technology to attack this for infinite sets.
The main implication is that there is no largest cardinality! We can always construct a set with strictly
larger cardinality just by taking the power set. For example, |P (R)| > |R| = c. Want an even larger
cardinality? Try P(P(R)), or P(P(P(R)))! This process may be continued indefinitely.
Proof. If A = ∅, the result is trivial. Otherwise, first observe that f : a ↦ {a} defines an injective
function f : A → P ( A), whence | A| ≤ |P ( A)|.
To complete the argument we must show that no bijective function g : A → P ( A) can exist. Suppose,
for a contradiction, that g : A → P ( A) is bijective, and consider the set
X = {a ∈ A : a ∉ g(a)}
Take stock for a moment and think about X. Since g(a) is a subset of A, the condition a ∉ g(a) is
legitimate, whence X is a genuine subset of A (X ∈ P ( A)). A simple example will hopefully help.
Since 1 ∈ g(1), 2 ∉ g(2), and 3 ∉ g(3), we see that X = {2, 3}. Since our goal is to prove that no
bijection A → P(A) can exist, it is important to note that this g is not bijective; indeed g isn’t surjective,
since X ∉ range(g) = {{1, 2}, {1, 3}, ∅}. This last observation is what finishes the proof. . .
Proof Continued. By assumption, g is surjective. Thus X ∈ range( g). Otherwise said, X = g( a) for
some a ∈ A. We ask whether this element a lies in the set X:
a ∈ X ⇐⇒ a ∉ g(a)    (definition of X)
      ⇐⇒ a ∉ X       (since X = g(a))
The conclusion a ∈ X ⇐⇒ a ∉ X is plainly a contradiction! No bijection g : A → P(A) can exist,
and so |A| ⪇ |P(A)|.
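For a small finite set, the argument can be checked exhaustively. The sketch below builds X for every function g : A → P(A) with A = {1, 2, 3} and confirms that X is never in the range of g. The particular g shown first is a hypothetical choice consistent with the values quoted in the discussion above; the helper names power_set and diagonal_set are our own.

```python
from itertools import combinations, product

def power_set(A):
    """All subsets of a finite set A, as frozensets."""
    A = sorted(A)
    return [frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)]

def diagonal_set(A, g):
    """X = {a in A : a not in g(a)} for a function g given as a dict."""
    return frozenset(a for a in A if a not in g[a])

A = {1, 2, 3}
subsets = power_set(A)

# One hypothetical g with 1 in g(1), 2 not in g(2), 3 not in g(3)
g = {1: frozenset({1, 2}), 2: frozenset({1, 3}), 3: frozenset()}
X = diagonal_set(A, g)
print(sorted(X), X in g.values())

# Exhaustive check over all 8**3 = 512 functions g : A -> P(A)
always_missed = all(
    diagonal_set(A, dict(zip(sorted(A), images))) not in images
    for images in product(subsets, repeat=len(A))
)
print(always_missed)
```

The exhaustive check only covers one finite set, whereas the proof above works verbatim for every set A, finite or infinite.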
Cantor’s theorem played a key role in pushing set theory towards axiomatization, in part because of
a simple paradox. If a ‘set’ is merely a collection of objects, we may consider the ‘set of all sets’ S. Its
power set P (S) is a set of sets, which must be a subset of S. Plainly |P (S)| ≤ |S|; but this contradicts
Cantor’s theorem!
The remedy is a rigorous definition of ‘set.’ Axiomatic set theory describes a small number of legitimate
ways to build sets, of which we’ve seen several in these notes: e.g., union, power set, set-builder
notation. In particular, the ‘set of all sets’ cannot be legitimately constructed.57
57 The critical condition for preventing Cantor’s paradox is that set-builder notation { x ∈ A : P( x )} can only produce a
subset of an already existing set A. The ‘set of all sets’ would have the form { x : P( x )} where x is unrestricted.
Some Final Thoughts on the Limits of Proof
During this course we’ve learned some of the basic methods and concepts used by mathematicians.
In particular, we’ve learned how to use proofs to demonstrate the truth of statements about mathe-
matical objects. As we finish, it makes sense to reflect on the limits of our methods.
By the early 20th century, the discovery of various paradoxes and contradictions (such as Cantor’s)
caused a foundational crisis in mathematics. If a concept as basic as set is self-contradictory, how
are we to have faith in any mathematical conclusion?! The response to this crisis was an effort to
formulate a list of reasonable axioms from which all mathematics could be derived using basic logical
reasoning. Such an axiomatic foundation would ideally satisfy two conditions:
• Consistency: No contradiction can be derived from the axioms.
• Completeness: All true mathematical statements could be derived from the axioms.
Any hope for such a foundation was crushed in 1931, when Kurt Gödel published his famous Incom-
pleteness Theorems, showing that no such axiomatic system could exist. Very roughly, Gödel showed
that in any consistent axiomatic system strong enough to produce some basic arithmetic, there are
undecidable statements, neither deducible nor refutable from the axioms. Perhaps even worse, no
such system can prove its own consistency.
While the strongest aims of early 20th century axiomatics cannot be accomplished, contemporary research
was able to provide a foundation that most modern mathematicians deem adequate. The most pop-
ular approach is to base all of mathematics on set theory—as your studies progress, you’ll see that
many of the objects you study can be formalized as sets together with functions and relations between
them. We’ve started this work already: Chapter 7 says that functions and relations are themselves
sets! Numbers like 0, 1, 2, 12/19 or even π = 3.14 . . . can be thought of as sets if one so desires.
In turn, set theory is often axiomatized using the ZFC axioms (short for Zermelo–Fraenkel set theory
with the Axiom of Choice).
While the ZFC system remains subject to Gödel’s limitations,58 it has proven able to formalize most
of the mathematics actually used by current mathematicians, and has not (thus far!) produced any
inconsistencies. While there is plenty of fun to be had exploring set theory, its history and its quirks,
most modern mathematicians feel little need to dwell on the foundational issues of last century!
Exercises 8.2. A reading quiz and practice questions can be found online.
2. Find explicit bijections (thus showing that the given intervals have the same cardinality):
(a) f : [2, 3) → [1, 5) (b) g : [2, 3) → (1, 5] (c) h : (−3, 2) → R (d) j : R → (1, ∞)
(Hint: The proof of Corollary 8.13 should provide some inspiration—be creative)
3. Let B = [3, 5) ∪ (6, 10). Use the Cantor–Schröder–Bernstein Theorem to prove that | B| = c.
(Hint: State injective functions f : (0, 1) → B and g : B → (0, 1))
58 Perhaps the most famous undecidable statement in ZFC is relevant to our recent discussion: the continuum hypothesis is
the claim that no set has cardinality strictly between ℵ0 and c; that intervals are the simplest (‘smallest’) uncountable sets.
4. (a) Prove that f : N × N → N defined by f(m, n) = 2^m 3^n is injective.
(b) Use part (a) and the Cantor–Schröder–Bernstein Theorem to conclude that |N × N| = ℵ0 .
(c) Extend your argument to prove that, for any natural number k, |N × · · · × N| = ℵ0 (k factors).
(d) Use part (b) to give an alternative proof that |Q+ | = ℵ0 .
7. Give an example of an uncountable set I and an indexed collection { An : n ∈ I } for which all
the following conditions hold:
8. The proof of Cantor’s Theorem makes use of a construction similar to Russell’s paradox. Let X
be the ‘set’ of all sets which are not members of themselves:
X = {A : A ∉ A}
9. In Exercise 6.3.14, we saw that Cantor’s middle-third set C is the set of all numbers in [0, 1] pos-
sessing a ternary expansion consisting only of 0’s and 2’s. By modifying the proof of Theorem
8.10, argue that C is uncountable.
(Exercise 12 establishes the stronger result |C| = c)
The remaining questions are more of a challenge: if these seem interesting, consider taking a
set theory course!
10. Express a real number x ∈ (0, 1) as a decimal x = 0.x1 x2 x3 x4 . . . where we choose the terminat-
ing decimal whenever there is a choice (footnote 56). Prove that
11. If A and B are non-empty sets we let A^B denote the set of all functions f : B → A.
12. Let x ∈ [0, 1]. A binary expansion of x is a sequence (bn ) of zeros and ones such that
x = ∑_{n=1}^{∞} b_n / 2^n
The binary expansion of x ∈ [0, 1] is (almost) unique;59 if there is a choice, take the terminating
expansion. Define a function f : [0, 1] → P (N) (the set of subsets of N) by
59 Binary expansions are unique unless x has a terminating expansion, in which case there is a second representation
with an infinite string of 1’s: e.g., [0.011111 · · · ]_2 = [0.1]_2 . This discussion is beloved of computer scientists who, following
Exercise 11, might view P(N) ∼ {0, 1}^N as the set of binary sequences/strings (x_1, x_2, x_3, . . .), where each x_j ∈ {0, 1}.