Math 208 V4 (2P) 2024
Department of Mathematics
The University of North Dakota
Copyright © 2024 UND Math Department
http://arts-sciences.und.edu/math/
Copyright © 2005, 2006, 2007, 2008, 2009, 2014, 2015, 2016, 2017, 2023 University of North Dakota
Mathematics Department
Permission is granted to copy, distribute and/or modify this document under the terms of the
GNU Free Documentation License, Version 1.2 or any later version published by the Free Software
Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy
of the license is included in the section entitled “GNU Free Documentation License”.
Introduction
Discrete math has become increasingly important in recent years, for a number of reasons (The Art of Problem Solving, http://www.artofproblemsolving.com/articles/discrete-math):
Discrete math shows up on most middle and high school math contests.
Prominent math competitions such as MATHCOUNTS (at the middle school level) and
the American Mathematics Competitions (at the high school level) feature discrete math
questions as a significant portion of their contests. On harder high school contests, such
as the AIME, the quantity of discrete math is even larger. Students that do not have
a discrete math background will be at a significant disadvantage in these contests. In
fact, one prominent MATHCOUNTS coach tells us that he spends nearly 50% of his
preparation time with his students covering counting and probability topics, because of
their importance in MATHCOUNTS contests.
Chapter 1
Logical Connectives and Compound Propositions

The basic objects in logic are propositions. A proposition is a statement which is either true (T) or false (F), but not both. For example, in the language of mathematics, p : 3 + 3 = 6 is a true proposition while q : 2 + 3 = 6 is a false proposition. What do you want for lunch? is a question, not a proposition. Likewise Get lost! is a command, not a proposition. The sentence There are exactly 10^87 + 3 stars in the universe is a proposition, despite the fact that no one knows its truth value. Here are two, more subtle, examples:
(1) He is more than three feet tall is not a proposition since, until we are told to whom he refers,
the statement cannot be assigned a truth value. The mathematical sentence x + 3 = 7 is
not a proposition for the same reason. In general, sentences containing variables are not
propositions unless some information is supplied about the variables. More about that later
however.
(2) This sentence is false is not a proposition. It seems to be both true and false. In fact, if it is T, then it says it is F, and if it is F, then it says it is T. It can be dangerous to use sentences
that refer to themselves. Of course, using a knife can also be dangerous, but we do use knives
safely when we are careful. Likewise, using self-referential sentences can be done safely if care
is taken.
Table 1.1: Negation

p   ¬p
T    F
F    T
Sometimes a little common sense is required. For example It is raining is a proposition, but its
truth value is not constant, and may be arguable. That is, someone might say It is not raining, it
is just drizzling, or Do you mean on Venus? Feel free to ignore this sort of quibbling.
Simple propositions, such as It is raining, and The streets are wet, can be combined to create
more complicated propositions such as It is raining and the streets are not wet. These sorts of
involved propositions are called compound propositions. Compound propositions are built up
from simple propositions using a number of connectives to join or modify the simple propositions.
In the last example, the connectives are and which joins the two clauses, and not, which modifies
the second clause.
It is important to keep in mind that since a compound proposition is, after all, a proposition, it
must be classifiable as either true or false. That is, it must be possible to assign a truth value to
any compound proposition. There are mutually agreed upon rules to allow the determination of
exactly when a compound proposition is true and when it is false. Luckily, these rules jibe nicely
with common sense (with one small exception), so they are easy to remember and understand.
The simplest logical connective is negation. In normal English sentences, this connective is indi-
cated by appropriately inserting not in the statement, by preceding the statement with it is not
the case that, or for mathematical statements, by using a slanted slash. For example, if p is the
proposition 2 + 3 = 4, then the negation of p is denoted by the symbol ¬p and it is the proposition
2 + 3 ̸= 4. In this case, p is false and ¬p is true. If p is It is raining, then ¬p is It is not raining
or even the stilted sounding It is not the case that it is raining. The negation of a proposition p
is the proposition whose truth value is the opposite of p in all cases. The behavior of ¬p can be
exhibited in a truth table. In each row of the truth table 1.1 we list a possible truth value of p
and the corresponding truth value of ¬p.
The connective that corresponds to the word and is called conjunction. The conjunction of p with
q is denoted by p ∧ q and read as p and q. The conjunction of p with q is declared to be true exactly
when both of p, q are true. It is false otherwise. This behavior is exhibited in the truth table 1.2.
Table 1.2: Conjunction

p   q   p ∧ q
T   T     T
T   F     F
F   T     F
F   F     F
Table 1.3: Disjunction and exclusive-or

p   q   p ∨ q   p ⊕ q
T   T     T       F
T   F     T       T
F   T     T       T
F   F     F       F
Four rows are required in this table since when p is true, q may be either true or false and when p
is false it is possible for q to be either true or false. Since a truth value must be assigned to p ∧ q
in every possible case, one row in the truth table is needed for each of the four possibilities.
The logical connective disjunction corresponds to the word or of ordinary language. The disjunc-
tion of p with q is denoted by p ∨ q, and read as p or q. The disjunction p ∨ q is true if at least one
of p, q is true.
Disjunction is also called inclusive-or, since it includes the possibility that both component state-
ments are true. In everyday language, there is a second use of or with a different meaning. For
example, in the proposition Your ticket wins a prize if its serial number contains a 3 or a 5, the
or would normally be interpreted in the inclusive sense (tickets that have both a 3 and 5 are still
winners), but in the proposition With dinner you get mashed potatoes or french fries, the or is being
used in the exclusive-or sense.
The exclusive-or is also called the disjoint disjunction of p with q and is denoted by p ⊕ q. Read
that as p xor q if it is necessary to say it in words. The value of p ⊕ q is true if exactly one of p, q is
true. The exclusion of both being true is the difference between inclusive-or and exclusive-or. The
truth table shown officially defines these two connectives. In a mathematical setting one usually
assumes the inclusive-or is intended unless the exclusive sense is explicitly indicated.
The next two logical connectives correspond to the ordinary language phrases If · · · , then · · · and
the (rarely used in real life but common in mathematics) · · · if and only if · · · .
Table 1.4: Implication

p   q   p → q
T   T     T
T   F     F
F   T     T
F   F     T
In mathematical discussions, ordinary English words are used in ways that usually correspond to the
way we use words in normal conversation. The connectives not, and, or mean pretty much what
would be expected. But the implication, denoted p → q and read as If p, then q can be a little
mysterious at first. This is partly because when the If p, then q construction is used in everyday
speech, there is an implied connection between the proposition p (called the hypothesis) and the
proposition q (called the conclusion). For example, in the statement If I study, then I will pass
the test, there is an assumed connection between studying and passing the test. However, in logic,
the connective is going to be used to join any two propositions, with no relation necessary between
the hypothesis and conclusion. What truth value should be assigned to such bizarre sentences as
If I study, then the moon is 238,000 miles from earth?
Is it true or false? Or maybe it is neither one? Well, that last option isn’t too pleasant because
that sentence is supposed to be a proposition, and to be a proposition it has to have truth value
either T or F. So it is going to have to be classified as one or the other. In everyday conversation, it isn’t likely to matter much whether it is classified as true or false in the case described. But an important part of mathematics is knowing when propositions are true and
when they are false. The official choices are given in the truth table for p → q. We can make sense
of this with an example.
Example 1.1. First consider the statement which Bill’s dad makes to Bill: If you get an A in
math, then I will buy you a new car. If Bill gets an A and his dad buys him a car, then dad’s
statement is true, and everyone is happy (that is the first row in the table). In the second row,
Bill gets an A, and his dad doesn’t come through. Then Bill’s going to be rightfully upset since his
father lied to him (dad made a false statement). In the last row of the table he can’t complain if he
doesn’t get an A, and his dad doesn’t buy him the car (so again dad made a true statement). Most
people feel comfortable with those three rows. In the third row of the table, Bill doesn’t get an A,
and his dad buys him a car anyhow. This is the funny case. It seems that calling dad a liar in this
case would be a little harsh on the old man. So it is declared that dad told the truth. Remember it
this way: an implication is true unless the hypothesis is true and the conclusion is false.
Table 1.5: Biconditional

p   q   p ⇐⇒ q
T   T      T
T   F      F
F   T      F
F   F      T
The biconditional is the logical connective corresponding to the phrase · · · if and only if · · · . It
is denoted by p ⇐⇒ q, (read p if and only if q), and often more tersely written as p iff q. The
biconditional is true when the two component propositions have the same truth value, and it is
false when their truth values are different. Examine the truth table to see how this works.
The connectives described above combine at most two simple propositions. More complicated
propositions can be formed by joining compound propositions with those connectives. For exam-
ple, p ∧ (¬q), (p ∨ q) → (q ∧ (¬r)), and (p → q) ⇐⇒ ((¬p) ∨ q) are compound propositions, where
parentheses have been used, just as in ordinary algebra, to avoid ambiguity. Such extended com-
pound propositions really are propositions. That is, if the truth value of each component is known,
it is possible to determine the truth value of the entire proposition. The necessary computations
can be exhibited in a truth table.
Example 1.2. Suppose that p, q and r are propositions. To construct a truth table for (p ∧ q) → r,
first notice that eight rows will be needed in the table to account for all the possible combinations
of truth values of the simple component statements p, q and r. This is so since there are, as noted
above, four rows needed to account for the choices for p and q, so there will be those four rows paired
with r having truth value T , and four more with r having truth value F , for a total of 4 + 4 = 8. In
general, if there are n simple propositions in a compound statement, the truth table for the compound
statement will have 2^n rows. The truth table for (p ∧ q) → r is given in Table 1.6, with an auxiliary
column for p ∧ q to serve as an aid for filling in the last column.
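Such a table is also easy to generate mechanically. The following is a minimal Python sketch (our own illustration, not part of the text) that enumerates all 2^3 = 8 rows for (p ∧ q) → r, printing the auxiliary column as well:

    from itertools import product

    def implies(a, b):
        # Material implication: false only when a is true and b is false.
        return (not a) or b

    # One row for each of the 2^3 = 8 combinations of truth values for p, q, r.
    print("p      q      r      p and q   (p and q) -> r")
    for p, q, r in product([True, False], repeat=3):
        aux = p and q                  # auxiliary column, as in Table 1.6
        print(f"{p!s:6} {q!s:6} {r!s:6} {aux!s:9} {implies(aux, r)!s}")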
Be careful about how propositions are grouped. For example, if truth tables for p ∧ (q → r) and
(p ∧ q) → r are constructed, they turn out not to be the same in every row. Specifically if p is false,
then p ∧ q is false, and (p ∧ q) → r is true, whereas when p is false, p ∧ (q → r) is false. So writing
p ∧ q → r is ambiguous.
Here are a few examples of translating between propositions expressed in ordinary language and
propositions expressed in the language of logic.
Example 1.3. Let c be the proposition It is cold, let s be It is snowing, and let h be I’m staying home.
Table 1.6: Truth table for (p ∧ q) → r

p   q   r   p ∧ q   (p ∧ q) → r
T   T   T     T          T
T   T   F     T          F
T   F   T     F          T
T   F   F     F          T
F   T   T     F          T
F   T   F     F          T
F   F   T     F          T
F   F   F     F          T
Then (c ∧ s) → h is the proposition If it is cold and snowing, then I’m staying home. While
(c ∨ s) → h is If it is either cold or snowing, then I’m staying home. Messier is ¬(h → c) which
could be expressed as It is not the case that if I stay home, then it is cold, which is a little too
convoluted for our minds to grasp quickly. Translating in the other direction, the proposition It is
snowing and it is either cold or I’m staying home would be symbolized as s ∧ (c ∨ h). Notice the
parentheses are needed in this last proposition since (s ∧ c) ∨ h does not capture the meaning of the
ordinary language sentence, and s ∧ c ∨ h is ambiguous.
There is a connection between logical connectives and certain operations on bit strings. There
are two binary digits (or bits): 0 and 1. A bit string of length n is any sequence of n
bits. For example, 0010 is a bit string of length four. Computers use bit strings to encode and
manipulate information. Some bit string operations are really just disguised truth tables. Here is
the connection: Since a bit can be one of two values, bits can be used to represent truth values.
Let T correspond to 1, and F to 0. Then given two bits, logical connectives can be used to produce
a new bit. For example ¬1 = 0, and 1 ∨ 1 = 1. This can be extended to strings of bits of the
same length by combining the corresponding bits of the two strings. For example, 01011 ∧ 11010 =
(0 ∧ 1)(1 ∧ 1)(0 ∧ 0)(1 ∧ 1)(1 ∧ 0) = 01010.
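These bit string operations are simple to carry out in code. A short Python sketch (the helper names below are our own):

    def bitwise(op, a, b):
        # Apply a two-place connective position by position to equal-length bit strings.
        assert len(a) == len(b)
        return "".join(str(op(int(x), int(y))) for x, y in zip(a, b))

    AND = lambda x, y: x & y
    OR = lambda x, y: x | y
    XOR = lambda x, y: x ^ y

    print(bitwise(AND, "01011", "11010"))   # 01010, matching the example above
    print(bitwise(OR, "01011", "11010"))    # 11011
    print(bitwise(XOR, "01011", "11010"))   # 10001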
Exercises
c) Pistachio is the best ice cream flavor. d) All unicorns have four legs.
a) p ⊕ ¬q b) ¬(q → p) c) q ∧ ¬p
d) ¬q ∨ p e) p → (¬q ∧ r)
Exercise 1.3. Perform the indicated bit string operations. The bit strings are given in groups of
four bits each for ease of reading.
Exercise 1.4. Let s be the proposition It is snowing and f be the proposition It is below freez-
ing. Convert the following English sentences into statements using the symbols s, f and logical
connectives.
Exercise 1.5. Let j be the proposition Jordan played and w be the proposition The Wizards won.
Write the following propositions as English sentences.
a) ¬j ∧ w b) j → ¬w c) w ∨ j
d) w → ¬j
Exercise 1.6. Let c be the proposition Sam plays chess, let b be Sam has the black pieces, and let
w be Sam wins.
Problems
a) Today is Tuesday.
b) Why are you whining?
c) The Vikings are the worst team in professional sports.
d) This sentence has five words.
e) There is a black hole at the center of every galaxy.
a) ¬q −→ ¬p.
b) p −→ (q ∧ r).
(You will need eight rows for this one.)
Problem 1.3. Perform the indicated bit string operations. The bit strings are given in groups of
four bits each for ease of reading.
Problem 1.4. Let s be the proposition It is snowing and f be the proposition It is below freez-
ing. Convert the following English sentences into statements using the symbols s, f and logical
connectives.
Chapter 2
Logical Equivalence
It is clear that the propositions It is sunny and it is warm and It is warm and it is sunny mean the
same thing. More generally, for any propositions p, q, we see that p ∧ q and q ∧ p have the same
meaning. To say it a little differently, for any choice of truth values for p and q, the propositions
p ∧ q and q ∧ p have the same truth value. One more time: p ∧ q and q ∧ p have identical truth
tables.
Two propositions with identical truth tables are called logically equivalent. The expression p ≡ q
means p, q are logically equivalent.
Some logical equivalences are not as transparent as the example above. With a little thought it
should be clear that I am not taking math or I am not taking physics means the same as It’s not
the case that I am taking math and physics. In symbols, (¬m) ∨ (¬p) means the same as ¬(m ∧ p).
Example 2.1 (De Morgan). Prove that ¬(p ∧ q) ≡ (¬p ∨ ¬q) using a truth table. We construct
the truth table 2.1 in the order of precedence: ¬ before ∧ or ∨, but the expression in parentheses has
highest precedence. We construct the table using additional columns for compound parts of the two
expressions.
It may be a little harder to believe (p → q) ≡ (¬p ∨ q), but checking a truth table shows they are
in fact equivalent. Saying If it is Monday, then I am tired is identical to saying It isn’t Monday or
I am tired. You should construct a truth table to demonstrate their equivalence.
Table 2.1: De Morgan’s law ¬(p ∧ q) ≡ ¬p ∨ ¬q

p   q   ¬(p ∧ q)   ¬p ∨ ¬q
T   T      F          F
T   F      T          T
F   T      T          T
F   F      T          T
Table 2.2 contains the most often used logical equivalences. These are well worth learning by sight
and by name.
Table 2.2: Fundamental logical equivalences

Equivalence                                                            Name
¬(¬p) ≡ p                                                              Double Negation
p ∧ T ≡ p,   p ∨ F ≡ p                                                 Identity laws
p ∨ T ≡ T,   p ∧ F ≡ F                                                 Domination laws
p ∨ p ≡ p,   p ∧ p ≡ p                                                 Idempotent laws
p ∨ q ≡ q ∨ p,   p ∧ q ≡ q ∧ p                                         Commutative laws
(p ∨ q) ∨ r ≡ p ∨ (q ∨ r),   (p ∧ q) ∧ r ≡ p ∧ (q ∧ r)                 Associative laws
p ∨ (q ∧ r) ≡ (p ∨ q) ∧ (p ∨ r),   p ∧ (q ∨ r) ≡ (p ∧ q) ∨ (p ∧ r)     Distributive laws
¬(p ∧ q) ≡ ¬p ∨ ¬q,   ¬(p ∨ q) ≡ ¬p ∧ ¬q                               De Morgan’s laws
p ∨ ¬p ≡ T                                                             Law of Excluded Middle
p ∧ ¬p ≡ F                                                             Law of Contradiction
p → q ≡ ¬p ∨ q                                                         Disjunctive form
p → q ≡ ¬q → ¬p                                                        Implication ≡ Contrapositive
¬p → ¬q ≡ q → p                                                        Inverse ≡ Converse
There are three propositions related to the basic If . . . , then . . . implication: p → q. First
¬q → ¬p is called the contrapositive of the implication. The converse of the implication is the
proposition q → p. Finally, the inverse of the implication is ¬p → ¬q. Using a truth table, it is
easy to check that an implication and its contrapositive are logically equivalent, as are the converse
and the inverse. A common slip is to think the implication and its converse are logically equivalent.
Checking a truth table shows that isn’t so. The implication If an integer ends with a 2, then it is
even is T , but its converse, If an integer is even, then it ends with a 2, is certainly F .
Five basic connectives have been given: ¬, ∧, ∨, →, ⇐⇒ , but that is really just for convenience.
It is possible to eliminate some of them using logical equivalences. For example, p ⇐⇒ q ≡ (p →
q) ∧ (q → p) so there really is no need to explicitly use the biconditional. Likewise, p → q ≡ ¬p ∨ q,
so the use of the implication can also be avoided. Finally, p ∧ q ≡ ¬(¬p ∨ ¬q) so that there really is
no need ever to use the connective ∧. Every proposition made up of the five basic connectives can
be rewritten using only ¬ and ∨ (probably with a great loss of clarity however).
The most often used standardization, or normalization, of logical propositions is the disjunctive
normal form (DNF), using only ¬ (negation), ∧ (conjunction), and ∨ (disjunction). A proposi-
tional form is considered to be in DNF if and only if it is a disjunction of one or more conjunctions of
one or more literals (a literal is a letter or a letter preceded by the negation symbol). For example,
the following are all in disjunctive normal form:
• p∧q
• p
• (a ∧ q) ∨ r
• (p ∧ ¬q ∧ ¬r) ∨ (¬s ∧ t ∧ u)
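One reason this form matters is that any truth table can be read off directly as a DNF proposition: take one conjunction for each row in which the proposition is true. Here is a small Python sketch of that reading (our own helper, with ~, &, and | standing in for ¬, ∧, and ∨):

    from itertools import product

    def dnf(f, names):
        # Build a DNF formula (as a string) for the truth function f:
        # one conjunction per row of the truth table where f is true.
        terms = []
        for values in product([True, False], repeat=len(names)):
            if f(*values):
                literals = [n if v else "~" + n for n, v in zip(names, values)]
                terms.append("(" + " & ".join(literals) + ")")
        return " | ".join(terms) if terms else "F"

    # Example: the exclusive-or of p and q.
    print(dnf(lambda p, q: p != q, ["p", "q"]))   # (p & ~q) | (~p & q)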
It is always possible to verify a logical equivalence via a truth table. But it is also possible to verify
equivalences by stringing together previously known equivalences. We provide two examples of this
process.
Example 2.2. Show ¬(p∨(¬p∧q)) ≡ ¬p∧¬q. The plan is to start with the expression ¬(p∨(¬p∧q)),
work through a sequence of equivalences ending up with ¬p ∧ ¬q. It’s pretty much like proving
identities in algebra or trigonometry.
Proof.
Proof.
Exercises
Exercise 2.1. Use truth tables to verify each of the following equivalences:
a) (p ∨ q) ∨ r ≡ p ∨ (q ∨ r) b) ¬p ∧ (p ∨ q) ≡ ¬(q → p)
c) p ∨ (q ∧ r) ≡ (p ∨ q) ∧ (p ∨ r)
Exercise 2.2. Show that the statements are not logically equivalent.
a) p ∧ (q → r) ̸≡ (p ∧ q) → r
b) p → q ̸≡ q → p
c) p → q ̸≡ ¬p → ¬q
Exercise 2.3. Use truth tables to show that the following are tautologies.
a) [p ∧ (p → q)] → q
b) [(p → q) ∧ (q → r)] → (p → r)
Exercise 2.4. Consider the implication If it is Saturday, then I will mow the lawn.
b) [(p ∧ q) ∨ r] → [p ∧ (q ∨ r)]
Exercise 2.7. Give a proof of (p ∧ ¬r) → ¬q ≡ p → (q → r) using the Fundamental Logical
Equivalences, following the pattern of examples 2.2 and 2.3.
Problems
Problem 2.5. Consider the four answers for exercise 2.4. Which of the four are logically equivalent to
the implication? Which of the four are not logically equivalent to the implication, (but are logically
equivalent to each other)?
Problem 2.7. Use truth tables to show that the following are tautologies.
a) (p ∧ q) → p
Chapter 3
Predicates and Quantifiers

The sentence x^2 − 2 = 0 is not a proposition. It cannot be assigned a truth value unless some more information is supplied about the variable x. Such a statement is called a predicate or a propositional function.
Instead of using a single letter to denote a predicate, a symbol such as S(x) will be used to indicate
the dependence of the sentence on a variable. Here are two more examples of predicates.
With a given predicate, there is an associated set of objects which can be used in place of the
variables. For example, in the predicate S(x) : x^2 − 2 = 0, it is understood that the x can be
replaced by a number. Replacing x by, say, the word blue does not yield a meaningful sentence.
For the predicate A(c) above, c can be replaced by, say, makes of cars (or maybe types of nails!).
For B(x, y), the x can be replaced by any human male, and the y by any human. The collection
of possible replacements for a variable in a predicate is called the domain of discourse for that
variable. Usually the domain of discourse is left for the reader to guess, but if the domain of
discourse is something other than an obvious choice, the writer will mention the domain to be used.
A predicate is not a proposition, but it can be converted into a proposition. There are three ways
to modify a predicate to change it into a proposition. Let’s use S(x) : x^2 − 2 = 0 as an example.
The first way to change S(x) to make it into a proposition is to assign a specific value from
the variable’s domain of discourse to the variable. For example, setting x = 3 gives the (false) proposition S(3) : 3^2 − 2 = 0. On the other hand, setting x = √2 gives the (true) proposition S(√2) : (√2)^2 − 2 = 0. The process of setting a variable equal to a specific object in its domain of
discourse is called instantiation. Looking at the two-place predicate B(x, y) : x is the brother of
y, we can instantiate both variables to get the (true - look up the Osmonds) proposition B(Donny,
Marie) : Donny is the brother of Marie. Notice that the sentence B(Donny, y) : Donny is the
brother of y has not been converted into a proposition since it cannot be assigned a truth value
without some information about y. But it has been converted from a two-place predicate to a
one-place predicate.
A second way to convert a predicate to a proposition is to precede the predicate with the phrase
There is an x such that. For example, There is an x such that S(x) would become There is an x
such that x^2 − 2 = 0. This proposition is true if there is at least one choice of x in its domain of
discourse for which the predicate becomes a true statement. The phrase There is an x such that is
denoted in symbols by ∃x, so the proposition above would be written as ∃x S(x) or ∃x (x^2 − 2 = 0).
When trying to determine the truth value of the proposition ∃x P (x), it is important to keep the
domain of discourse for the variable in mind. For example, if the domain for x in ∃x (x^2 − 2 = 0) is
all integers, the proposition is false. But if its domain is all real numbers, the proposition is true.
The phrase There is an x such that (or, in symbols, ∃x) is called existential quantification. In
English it can also be read as There exists x or For some x.
The third and final way to convert a predicate into a proposition is by universal quantification.
The phrase For all x is also rendered in English as For each x or For every x. The universal
quantification of a predicate, P (x), is obtained by preceding the predicate with the phrase For
all x, producing the proposition For all x, P (x), or, in symbols, ∀x P (x). This proposition is true
provided the predicate becomes a true proposition for every object in the variable’s domain of
discourse. Again, it is important to know the domain of discourse for the variable since the domain
will have an effect on the truth value of the quantified proposition in general.
For multi-placed predicates, these three conversions can be mixed and matched. For example, using
the obvious domains for the predicate B(x, y) : x is the brother of y here are some conversions into
propositions:
(1) B(Donny, Marie) has both variables instantiated. The proposition is true.
(2) ∃y B(Donny, y) is also a true proposition. It says Donny is somebody’s brother. The first
variable was instantiated, the second was existentially quantified.
(3) ∀y B(Donny, y) says everyone has Donny for a brother, and that is false.
(4) ∀x ∃y B(x, y) says every male is somebody’s brother, and that is false.
(5) ∃y ∀x B(x, y) says there is a person for whom every male is a brother, and that is false.
(6) ∀x B(x, x) says every male is his own brother, and that is false.
Translation between ordinary language and symbolic language can get a little tricky when quantified
statements are involved. Here are a few more examples.
Example 3.1. Let P (x) be the predicate x owns a Porsche, and let S(x) be the predicate x
speeds. The domain of discourse for the variable in each predicate will be the collection of all
drivers. The proposition ∃xP (x) says Someone owns a Porsche. It could also be translated as
There is a person x such that x owns a Porsche, but that sounds too stilted for ordinary
conversation. A smooth translation is better. The proposition ∀x(P (x) → S(x)) says All Porsche
owners speed.
Translating in the other direction, the proposition No speeder owns a Porsche could be expressed
as ∀x(S(x) → ¬P (x)).
Example 3.2. Here’s a more complicated example: translate the proposition Al knows only Bill
into symbolic form. Let’s use K(x, y) for the predicate x knows y. The translation would be
K(Al, Bill) ∧ ∀x (K(Al, x) → (x = Bill)).
Example 3.3. For one last example, let’s translate The sum of two even integers is even
into symbolic form. Let E(x) be the predicate x is even. As with many statements in ordinary
language, the proposition is phrased in a shorthand code that the reader is expected to unravel. As
given, the statement doesn’t seem to have any quantifiers, but they are implied. Before converting
it to symbolic form, it might help to expand it to its more long winded version: For every choice
of two integers, if they are both even, then their sum is even. Expressed this way, the
translation to symbolic form is: ∀x ∀y ((E(x) ∧ E(y)) → E(x + y)).
Notice that if the domain of discourse consists of finitely many entries a1, ..., an, then ∀x p(x) ≡ p(a1) ∧ p(a2) ∧ ... ∧ p(an). So the quantifier ∀ can be expressed in terms of the logical connective ∧. The existential quantifier and ∨ are similarly linked: ∃x p(x) ≡ p(a1) ∨ p(a2) ∨ ... ∨ p(an).
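Over a finite domain this correspondence is exactly what Python’s built-in all and any compute: all is the iterated conjunction and any the iterated disjunction. A brief sketch (the domain and predicate here are our own examples):

    # A finite domain of discourse: the integers from -5 to 5.
    domain = range(-5, 6)

    def S(x):
        return x * x - 2 == 0          # the predicate S(x): x^2 - 2 = 0

    # Exists x S(x): the disjunction S(a1) or S(a2) or ... over the domain.
    print(any(S(x) for x in domain))   # False: no integer squares to 2

    # For all x (x >= 0 -> x + 1 > 0): an iterated conjunction over the domain.
    print(all((not (x >= 0)) or (x + 1 > 0) for x in domain))   # True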
From the associative and commutative laws of logic we see that we can rearrange any system of
propositions which are linked only by ∧’s or linked only by ∨’s. For instance, consider the previous
examples 3.1 – 3.3 with finite domains of discourse. Consequently any more generally quantified
proposition of the form ∀x∀y p(x, y) is logically equivalent to ∀y∀x p(x, y). Similarly for statements
which contain only existential quantifiers. But the distributive laws come into play when ∧’s and
∨’s are mixed. So care must be taken with predicates which contain both existential and universal
quantifiers, as the following example shows.
Example 3.4. Let p(x, y) : x + y = 0 and let the domain of discourse be all real numbers for both
x and y. The proposition ∀y ∃x p(x, y) is true, since, for any given y, by setting (instantiating)
x = −y we convert x + y = 0 to the true statement (−y) + y = 0. In fact (∀y ∈ R)[(−y) + y = 0]
is a tautology. However the proposition ∃x ∀y p(x, y) is false. If we set (instantiate) y = 1, then
x + y = 0 implies that x = −1. When we set y = 0, we get x = 0. Since 0 ̸= −1 there is no x
which will work for all y, since it would have to work for the specific values of y = 0 and y = 1.
To form the negation of quantified statements, we apply De Morgan’s laws. This can be seen in the case of a finite domain of discourse a1, ..., an as follows:

¬∀x p(x) ≡ ¬(p(a1) ∧ p(a2) ∧ ... ∧ p(an)) ≡ ¬p(a1) ∨ ¬p(a2) ∨ ... ∨ ¬p(an) ≡ ∃x ¬p(x)
¬∃x p(x) ≡ ¬(p(a1) ∨ p(a2) ∨ ... ∨ p(an)) ≡ ¬p(a1) ∧ ¬p(a2) ∧ ... ∧ ¬p(an) ≡ ∀x ¬p(x)

So a negation is moved across a quantifier by switching ∀ and ∃ and negating the predicate.
Exercises
Exercise 3.1. Let p(x) : 2x ≥ 4, for integers x. Determine the truth values of the following
propositions.
a) p(2) b) p(−3)
c) ∀x ((x ≤ 10) → p(x)) d) ∃x ¬p(x)
Exercise 3.2. Let p(x, y) be x has read y, where the domain of discourse for x is all students in
this class, and the domain of discourse for y is all novels. Express the following propositions in
English.
Exercise 3.3. Let F (x, y) be the statement x can fool y, where the domain of discourse for both x
and y is all people. Use quantifiers to express each of the following statements.
Exercise 3.4. Negate each of the statements from exercise 3.2 in English.
Exercise 3.5. Negate each statement from exercise 3.3 in logical symbols. Of course, the easy
answer would be to simply put ¬ in front of each statement. But use the principle given at the end
of this chapter to move the negation across the quantifiers.
Exercise 3.6. Express symbolically: The product of an even integer and an odd integer is even.
Problems
Problem 3.1. Let h be Ben is healthy, let w be Ben is wealthy, and let s be Ben is wise.
Express the following in English:
a) h ∧ w
b) w ∨ s
c) h → (w ∧ s)
d) (h → w) ∧ s
Problem 3.2. Let S(x, y) be the predicate x has seen y where the domain of discourse for x is
all students in this class and the domain of discourse for y is all movies. Express the following in
logical symbols using quantifiers.
Problem 3.4. Negate the propositions in 3.2 in symbols. Note: An easy way to do this is to simply
write ¬ in front of the answers in 3.2. Don’t do that! Give the negation with no quantifiers coming
after a negation symbol.
Problem 3.5. Negate the propositions in 3.2 in English. Note: An easy way to do this is to simply
write It is not the case that .... in front of each proposition. Don’t do that! Give the negation as a
reasonably natural English sentence.
Chapter 4
Rules of Inference
The heart of mathematics is proof. In this chapter, we give a careful description of what exactly
constitutes a proof in the realm of propositional logic. Throughout the course various methods of
proof will be demonstrated, including the particularly important style of proof called induction.
It’s important to keep in mind that all proofs, no matter what the subject matter might be, are
based on the notion of a valid argument as described in this chapter, so the ideas presented here
are fundamental to all of mathematics.
Imagine trying carefully to define what a proof is, and it quickly becomes clear just how difficult a
task that is. So it shouldn’t come as a surprise that the description takes on a somewhat technical
looking aspect. But don’t let all the symbols and abstract-looking notation be misleading. All these
rules really boil down to plain old common sense when looked at correctly.
The usual form of a theorem in mathematics is: If a is true and b is true and c is true, etc., then s
is true. The a, b, c, · · · are called the hypotheses, and the statement s is called the conclusion.
For example, a mathematical theorem might be: if m is an even integer and n is an odd integer,
then mn is an even integer. Here the hypotheses are m is an even integer and n is an odd integer,
and the conclusion is mn is an even integer.
We begin by concerning ourselves with proofs from the realm of propositional logic rather than the
sort of theorem mentioned above. We will be interested in arguments in which the form of the
argument is the item of interest rather than the content of the statements in the argument.
For example, consider the simple argument: (1) My car is either red or blue and (2) My car is not
red, and so (3) My car is blue. Here the hypotheses are (1) and (2), and the conclusion is (3). It
should be clear that this is a valid argument. That means that if you agree that (1) and (2) are
true, then you must accept that (3) is true as well.
Definition 4.1. An argument is called valid provided that if you agree that all the hypotheses
are true, then you must accept the truth of the conclusion.
Now the content of that argument (in other words, the stuff about me and cars and colors) really has nothing to do with the validity of the argument. It is the form of the argument that makes
it valid. The form of this argument is (1) p ∨ q and (2) ¬p, therefore (3) q. Any argument that has
this form is valid, whether it talks about cars and colors or any other notions. For example, here
is another argument of the very same form: (1) I either read the book or just looked at the pictures
and (2) I didn’t read the book, therefore (3) I just looked at the pictures.
Some arguments involve quantifiers. For instance, consider the classic example of a logical argument:
(1) All men are mortal and (2) Socrates is a man, and so (3) Socrates is mortal. Here the hypotheses
are the statements (1) and (2), and the conclusion is statement (3). If we let M (x) be x is a man and
D(x) be x is mortal (with domain for x being everything!), then this argument could be symbolized
as shown.
∀x(M (x) → D(x))
M (Socrates)
∴ D(Socrates)
The general form of a proof that a logical argument is valid consists in assuming all the hypotheses
have truth value T , and showing, by applying valid rules of logic, that the conclusion must also
have truth value T .
Just what are the valid rules of logic that can be used in the course of the proof? They are called
the Rules of Inference, and there are seven of them listed in table 4.1. Each rule of inference arises
from a tautology, and actually there is no end to the rules of inference, since each new tautology
can be used to provide a new rule of inference. But, in real life, people rely on only a few basic
rules of inference, and the list provided in the table is plenty for all normal purposes.
It is important not to merely look on these rules as marks on the page, but rather to understand what
each one says in words. For example, Modus Ponens corresponds to the common sense rule: if we are
told p is true, and also If p is true, then so is q, then we would leap to the reasonable conclusion that
q is true. That is all Modus Ponens says. Similarly, for the rule of proof of Disjunctive Syllogism:
knowing Either p or q is true, and p is not true, we would immediately conclude q is true. That’s
the rule we applied in the car example above. Translate the remaining six rules of inference into
such common sense statements. Some may sound a little awkward, but they ought to all elicit an
of course that’s right feeling once understood. Without such an understanding, the rules seem like
a jumble of mystical symbols, and building logical arguments will be pretty difficult.
What exactly goes into a logical argument? Suppose we want to prove (or show valid) an argument
of the form If a and b and c are true, then so is s. One way that will always do the trick is to
construct a truth table as in examples earlier in the course. We check the rows in the table where
all the hypotheses are true, and make sure the conclusion is also true in those rows. That would
complete the proof. In fact that is exactly the method used to justify the seven rules of inference
given in the table. But building truth tables is certainly tedious business, and it certainly doesn’t
seem too much like the way we learned to do proofs in geometry, for example. An alternative is
the construction of a logical argument which begins by assuming the hypotheses are all true and
applies the basic rules of inferences from the table until the desired conclusion is shown to be true.
Here is an example of such a proof. Let’s show that the argument displayed in figure 4.1 is valid.
p
p→q
s∨r
r → ¬q
∴s∨t
Each statement in the proof must be justified either (1) as one of the hypotheses, (2) as a consequence of previous steps and some rule of inference from the table, or (3) as logically equivalent to an earlier statement.
Finally the last statement in the proof will be the desired conclusion. Of course, we could prove
the argument valid by constructing a 32 row truth table instead! Well, actually we wouldn’t need
all 32 rows, but it would be pretty tedious in any case.
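That tedious check is, however, easy to hand to a machine. Here is a short Python sketch (our own, not part of the text) that runs through all 32 truth assignments and confirms that whenever the four hypotheses displayed above are all true, the conclusion s ∨ t is true as well:

    from itertools import product

    def implies(a, b):
        return (not a) or b

    valid = True
    for p, q, r, s, t in product([True, False], repeat=5):
        hypotheses = [p, implies(p, q), s or r, implies(r, not q)]
        if all(hypotheses) and not (s or t):
            valid = False              # a row like this would refute the argument
    print(valid)                       # True: the argument form is valid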
Such proofs can be viewed as games in which the hypotheses serve as the starting position in a
game, the goal is to reach the conclusion as the final position in the game, and the rules of inference
(and logical equivalences) specify the legal moves. Following this outline, we can be sure every step
in the proof is a true statement, and, in particular, the desired conclusion is true, as we hoped to
show.
One step more complicated than the last example are arguments that are presented in words rather
than symbols. In such a case, it is necessary to first convert from a verbal argument to a symbolic
argument, and then check the argument to see if it is valid. For example, consider the argument:
Tom is a cat. If Tom is a cat, then Tom likes fish. Either Tweety is a bird or Fido is a dog. If Fido
is a dog, then Tom does not like fish. So, either Tweety is a bird or I’m a monkey’s uncle. Just
reading this argument, it is difficult to decide if it is valid or not. It’s just a little too confusing to
process. But it is valid, and in fact it is the very same argument as given above. Let p be Tom is
a cat, let q be Tom likes fish, let s be Tweety is a bird, let r be Fido is a dog, and let t be I’m a
monkey’s uncle. Expressing the statements in the argument in terms of p, q, r, s, t produces exactly
the symbolic argument proved above.
Some logical arguments have a convincing ring to them but are nevertheless invalid. The classic
example is an argument of the form If it is snowing, then it is winter. It is winter. So it must
be snowing. A moment’s thought is all that is needed to be convinced the conclusion does not
follow from the two hypotheses. Indeed, there are many winter days when it does not snow. The
error being made is called the fallacy of affirming the conclusion. In symbols, the argument
is claiming that [(p → q) ∧ q] → p is a tautology, but in fact, checking a truth table shows that it
is not a tautology. Fallacies arise when statements that are not tautologies are treated as if they
were tautologies.
Logical arguments involving propositions using quantifiers require a few more rules of inference.
As before, these rules really amount to no more than a formal way to express common sense. For
instance, if the proposition ∀ x P (x) is true, then certainly for every object c in the universe of
discourse, P (c) is true. After all, if the statement P (x) is true for every possible choice of x, then,
in particular, it is true when x = c. The other three rules of inference for quantified statements are
just as obvious. All four quantification rules appear in table 4.3.
Example 4.2. Let’s analyze the following (fictitious, but obviously valid) argument to see how
these rules of inference are used. All books written by Sartre are hard to understand. Sartre wrote
a book about kites. So, there is a book about kites that is hard to understand. Let’s use the following
predicates to symbolize the argument:
The domain for x in each case is all books. In symbolic form, the argument and a proof are
Exercises
Exercise 4.2. Prove the following argument is valid. All Porsche owners are speeders. No owners
of sedans buy premium fuel. Car owners that do not buy premium fuel never speed. So Porsche
owners do not own sedans. Use all car owners as the domain of discourse.
¬p → (r ∧ ¬s)
t→s
u → ¬p
¬w
u∨w
∴ ¬t ∨ w
Problems
Problem 4.1. Show that p → q and ¬p, ∴ ¬q is not a valid rule of inference. It is called the
Fallacy of denying the hypothesis.
¬p ∧ q
r→p
¬r → s
s→t
∴t
p∨q
q→r
(p ∧ s) → t
¬r
¬q → (p ∧ s)
∴t
(¬p ∨ q) → r
s ∨ ¬q
¬t
p→t
(¬p ∧ r) → ¬s
∴ ¬q
Problem 4.5. Express the following argument in symbolic form and prove the argument is valid.
If Ralph doesn’t do his homework or he doesn’t feel sick, then he will go to the party and he will
stay up late. If he goes to the party, he will eat too much. He didn’t eat too much. So Ralph did
his homework.
Problem 4.6. In problem 4.5, show that you can logically deduce that Ralph felt sick.
Problem 4.7. In problem 4.5, can you logically deduce that Ralph stayed up late?
∃x(A(x) ∧ ¬B(x))
∀x(A(x) → C(x))
∴ ∃x(C(x) ∧ ¬B(x))
Chapter 5
Sets: Basic Definitions
A set is a collection of objects. Often, but not always, sets are denoted by capital letters such as
A, B, · · · and the objects that make up a set, called its elements, are denoted by lowercase letters.
Write x ∈ A to mean that the object x is an element of A. If the object x is not an element of A,
write x ̸∈ A.
Two sets A and B are equal, written A = B provided A and B comprise exactly the same elements.
Another way to say the same thing: A = B provided ∀x (x ∈ A ⇐⇒ x ∈ B).
There are a number of ways to specify a given set. We consider two of them.
One way to describe a set is to list its elements. This is called the roster method. Braces are
used to signify when the list begins and where it ends, and commas are used to separate elements.
For instance, A = {1, 2, 3, 4, 5} is the set of positive whole numbers between 1 and 5 inclusive.
It is important to note that the order in which elements are listed is immaterial. For example,
{1, 2} = {2, 1} since x ∈ {1, 2} and x ∈ {2, 1} are both true for x = 1 and x = 2 and false for
all other choices of x. Thus x ∈ {1, 2} and x ∈ {2, 1} always have the same truth value, and that
means ∀x (x ∈ {1, 2} ⇐⇒ x ∈ {2, 1}) is true. According to the definition of equality given above,
it follows that {1, 2} = {2, 1}. The same sort of reasoning shows that repetitions in the list of
elements of a set can be ignored. For example {1, 2, 3, 2, 4, 1, 2, 3, 2} = {1, 2, 3, 4}. There is no point
in listing an element of a set more than once.
The roster method has certain drawbacks. For example we probably don’t want to list all of the
elements in the set of positive integers between 1 and 99 inclusive. One option is to use an ellipsis.
The idea is that we list elements until a pattern is established, and then replace the missing elements
with . . . (which is the ellipsis). So {1, 2, 3, 4, . . . , 99} would describe our set.
The use of an ellipsis has one pitfall. It is hoped that whoever is reading the list will be able to
guess the proper pattern and apply it to fill in the gap.
Another method to specify a set is via the use of set-builder notation. A set can be described
in set-builder notation as A = {x|p(x)}. Here we read A is the set of all objects x for which the
predicate p(x) is true. So {1, 2, 3, 4, . . . , 99} becomes {x|x is a whole number and 1 ≤ x ≤ 99}.
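As an aside, Python’s set literals and set comprehensions mirror the roster and set-builder notations fairly closely; a small sketch of the analogy (our own, not part of the text):

    # Roster method: list the elements explicitly.
    A = {1, 2, 3, 4, 5}

    # Set-builder style: { x | x is a whole number and 1 <= x <= 99 }.
    B = {x for x in range(1, 100)}

    # Order and repetition of listed elements do not matter, just as for sets.
    print({1, 2} == {2, 1})                              # True
    print({1, 2, 3, 2, 4, 1, 2, 3, 2} == {1, 2, 3, 4})   # True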
Certain sets occur often enough that we have special notation for them; in particular, N denotes the set of natural numbers, Z the set of integers, and R the set of real numbers.
In addition to the above sets, there is a set with no elements, written as ∅ (also written using the
roster style as { }), and called the empty set. This set can be described using set builder style in
many different ways. For example, {x ∈ R | x^2 = −2} = ∅. In fact, if P (x) is any predicate which is
always false, then {x | P (x)} = ∅. There are two easy slips to make involving the empty set. First,
don’t write ∅ = 0 (the idea being that both ∅ and 0 represent nothing). That is not correct since ∅
is a set, and 0 is a number, and it’s not fair to compare two different types of objects. The other
error is thinking ∅ = {∅}. This cannot be correct since the right-hand set has an element, but the
left-hand set does not.
At the other extreme from the empty set is the universal set, denoted U. The universal set
consists of all objects under consideration in any particular discussion. For example, if the topic du
jour is basic arithmetic then the universal set would be the set of all integers. Usually the universal
set is left for the reader to guess. If the choice of the universal set is not an obvious one, it will be
pointed out explicitly.
The set A is a subset of the set B, written as A ⊆ B, in case ∀x(x ∈ A → x ∈ B) is true. In plain
English, A ⊆ B if every element of A also is an element of B. For example, {1, 2, 3} ⊆ {1, 2, 3, 4, 5}.
On the other hand, {0, 1, 2, 3} ̸⊆ {1, 2, 3, 4, 5} since 0 is an element of the left-hand set but not of
the right-hand set. The meaning of A ̸⊆ B can be expressed in symbols using De Morgan’s law:

A ̸⊆ B ⇐⇒ ¬∀x (x ∈ A → x ∈ B)
       ⇐⇒ ∃x ¬(x ∈ A → x ∈ B)
       ⇐⇒ ∃x ¬(x ̸∈ A ∨ x ∈ B)
       ⇐⇒ ∃x (x ∈ A ∧ x ̸∈ B)

and the last line says A ̸⊆ B if there is at least one element of A that is not an element of B.
The empty set is a subset of every set. To check that, suppose A is any set, and let’s check to make
sure ∀x(x ∈ ∅ → x ∈ A) is true. But it is since for any x, the hypothesis of x ∈ ∅ −→ x ∈ A is F ,
and so the implication is T. So ∅ ⊆ A. Another way to say the same thing is to notice that to
claim ∅ ̸⊆ A is the same as claiming there is at least one element of ∅ that is not an element of A,
but that is ridiculous, since ∅ has no elements at all.
To say that A = B is the same as saying every element of A is also an element of B and every
element of B is also an element of A. In other words, A = B ⇐⇒ (A ⊆ B ∧ B ⊆ A), and this
indicates the method by which the common task of showing two sets are equal is carried out: to
show two sets are equal, show that each is a subset of the other.
A set is finite if the number of distinct elements in the set is a non-negative integer. In this case
we call the number of distinct elements in the set its cardinality and denote this natural number
by |A|. For example, |{1, 3, 5}| = 3 and |∅| = 0, |{∅}| = 1, and |{∅, {a, b, c}, {X, Y }}| = 3. A set,
such as Z, which is not finite, is infinite.
Given a set A the power set of A, denoted P(A), is the set of all subsets of A. For example if
A = {1, 2}, then P(A) = {∅, {1}, {2}, {1, 2}}. It is not hard to see that if |A| = n, then |P(A)| = 2^n.
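The claim |P(A)| = 2^n is easy to check experimentally. Here is a minimal Python sketch (our own helper, not part of the text) that builds the power set of a finite set:

    from itertools import combinations

    def power_set(A):
        # All subsets of A, each as a frozenset so it can be an element of a set.
        A = list(A)
        return {frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)}

    A = {1, 2}
    P = power_set(A)
    print(P)                        # the four subsets of {1, 2}, in arbitrary order
    print(len(P) == 2 ** len(A))    # True: |P(A)| = 2^n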
Exercises
Exercise 5.3. Determine the cardinality of the sets in exercises 5.1 and 5.2.
Exercise 5.4. Is the proposition Every element of the empty set has three toes true or false?
Explain your answer!
Exercise 5.7. True or False: The set of even integers is a subset of the set of integers that are
multiples of four.
Problems
a) If n ∈ N, then n + 1 ∈ N.
b) If n ∈ N, then n − 1 ∈ N.
Chapter 6
Set Operations
Set operations can be visualized using Venn diagrams. A circle (or other closed curve) is drawn
to represent a set. The points inside the circle are used to stand for the elements of the set. To
represent the set operation of intersection, two such circles are drawn with an overlap to indicate
the two sets may share some elements. In the Venn diagram, figure 6.1, the shaded area represents
the intersection of A and B.
[Figure 6.1: Venn diagram of the intersection A ∩ B]
The Venn diagram, figure 6.2, represents the union of A and B.
[Figure 6.2: Venn diagram of the union A ∪ B]
When U is a universal set, we denote U − A by A̅ and call it the complement of A. The Venn diagram for A̅ is in figure 6.5. If U = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, then the complement of {0, 1, 2, 3, 4} is {5, 6, 7, 8, 9}. The universal set matters here. If U = {x ∈ N | x ≤ 100}, then the complement of {0, 1, 2, 3, 4} is {5, 6, 7, 8, ..., 100}.
There is a close connection between many set operations and the logical connectives of Chapter 1.
The intersection operation is related to conjunction, union is related to disjunction, and comple-
mentation is related to negation. It is not surprising then that the various laws of logic, such as the
associative, commutative, and distributive laws carry over to analogous laws for the set operations.
Table 6.1 exhibits some of these properties of these set operations.
These can be verified by using membership tables which are the analogs of truth tables used to
verify the logical equivalence of propositions. For a set A either an element under consideration is
in A or it is not. These binary possibilities are kept track of using 1 if x ∈ A and 0 if x ̸∈ A, and
then performing related bit string operations.
[Table 6.1: identities for the set operations (columns: Identity, Name)]
The meaning of the first row of the table is that if x ∈ A and x ∈ B, then x ̸∈ (A ∩ B)̅, as indicated by the 0 in the first row, fourth column, and also x ̸∈ A̅ ∪ B̅, as indicated by the 0 in the first row, last column. Since the columns for (A ∩ B)̅ and A̅ ∪ B̅ are identical, it follows that (A ∩ B)̅ = A̅ ∪ B̅, as promised.
Just as compound propositions can be analyzed using truth tables, more complicated combinations
of sets can be handled using membership tables. For example, using a membership table, it is
A   B   A ∩ B   (A ∩ B)̅   A̅   B̅   A̅ ∪ B̅
1   1     1         0      0    0      0
1   0     0         1      0    1      1
0   1     0         1      1    0      1
0   0     0         1      1    1      1
easy to verify that (A ∪ (B ∩ C))̅ = A̅ ∩ (B̅ ∪ C̅). But, just as with propositions, it is usually more enlightening to verify such equalities by applying the few basic laws of set theory listed above:

(A ∪ (B ∩ C))̅ = A̅ ∩ (B ∩ C)̅ = A̅ ∩ (B̅ ∪ C̅).
There is a correspondence between set operations of finite sets and bit string operations. Let
U = {u1, u2, ..., un} be a finite universal set with distinct elements listed in a specific order. Notice the universal set is ordered. We may write it as an n-tuple: U = (u1, u2, ..., un). For a set A
under consideration, we have A ⊆ U. By the law of excluded middle, for each uj ∈ U, either
uj ∈ A or uj ̸∈ A. We define a binary string of length n, called the characteristic vector of A,
denoted χ(A), by setting the jth bit of χ(A) to be 1 if uj ∈ A and 0 if uj ̸∈ A. For example if
U = {1, 2, 3, 4, 5, 6, 7, 8, 9}, and A = {1, 3, 4, 5, 8}, then χ(A) = 101110010.
An interesting side-effect is that, for example, χ(A ∩ B) = χ(A) ∧ χ(B), χ(A ∪ B) = χ(A) ∨ χ(B), and χ(A̅) = ¬χ(A). As a function, we say that χ maps intersection to conjunction, union to disjunction, and complementation to negation.
Since every proposition can be expressed using ∧, ∨ and ¬, if we represent sets by their characteristic
vectors, we can get a machine to perform set operations as logical operations on bit strings. This
is the method programmers use to manipulate sets in computer memory.
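A minimal Python sketch of this representation (the helper names are ours, not part of the text): sets are stored as characteristic vectors relative to an ordered universal set, and the set operations become the bitwise logic of Chapter 1.

    U = [1, 2, 3, 4, 5, 6, 7, 8, 9]          # the ordered universal set

    def chi(A):
        # Characteristic vector of A: one bit per element of U, in order.
        return "".join("1" if u in A else "0" for u in U)

    def unchi(bits):
        # Recover the set from its characteristic vector.
        return {u for u, b in zip(U, bits) if b == "1"}

    A = {1, 3, 4, 5, 8}
    B = {2, 3, 5, 7}
    print(chi(A))                            # 101110010, as in the example above

    # Intersection corresponds to bitwise conjunction of the two vectors.
    inter = "".join(str(int(a) & int(b)) for a, b in zip(chi(A), chi(B)))
    print(unchi(inter) == (A & B))           # True: chi maps intersection to conjunction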
The order in which elements of a set are listed does not matter. But there are times when order
is important. For example, in a horse race, knowing the order in which the horses cross the finish
line is more interesting than simply knowing which horses were in the race. There is a familiar way,
introduced in algebra, of indicating order is important: ordered pairs. Ordered pairs of numbers
are used to specify points in the Euclidean plane when graphing functions. For instance, when
graphing y = 2x + 1, setting x = 3 gives y = 7, and so the ordered pair (3, 7) will indicate one of
the points on the graph.
In this course, ordered pairs of any sorts of objects, not just numbers, will be of interest. An
ordered pair is a collection of two objects (which might both be the same) with one specified as
first (the first coordinate) and the other as second (the second coordinate). The ordered pair with a
specified as first and b as second is written (as usual) (a, b). The most important feature of ordered
pairs is that (a, b) = (c, d) if and only if a = c and b = d. In words, two ordered pairs are equal
provided they match in both coordinates. So (1, 2) ̸= (2, 1).
More generally, an ordered n-tuple (a1, a2, ..., an) is the ordered collection with a1 as its first coordinate, a2 as its second coordinate, and so on. Two ordered n-tuples are equal provided they
match in every coordinate.
The last operation to be considered for combining sets is the Cartesian product of two sets
A and B. It is defined by A × B = {(a, b)|a ∈ A ∧ b ∈ B }. In other words, A × B comprises all
ordered pairs that can be formed taking the first coordinate from A and the second coordinate from
B. For example if A = {1, 2}, and B = {α, β}, then A × B = {(1, α), (2, α), (1, β), (2, β)}. Notice
that in this case A × B ̸= B × A since, for example, (1, α) ∈ A × B, but (1, α) ̸∈ B × A.
A special case occurs when A = B. In this case we denote the Cartesian product of A with itself by A^2. The familiar example R × R = R^2 is called the Euclidean plane or the Cartesian plane.
More generally, given sets A1, ..., An the Cartesian product of these sets is written as A1 × A2 × ... × An = {(a1, a2, ..., an) | ai ∈ Ai, 1 ≤ i ≤ n}. Also A^n denotes the Cartesian product of A with itself n times.

In order to avoid the use of an ellipsis, we also denote the Cartesian product of A1, ..., An by ∏_{k=1}^{n} Ak. The variable k is called the index of the product. Most often the index is a whole number. Unless we are told otherwise, we start with k = 1 and increment k by 1 successively until we reach n. So if we are given A1, A2, A3, A4, and A5, then ∏_{k=1}^{5} Ak = A1 × A2 × A3 × A4 × A5.
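Python’s itertools.product computes exactly this n-fold Cartesian product. A brief sketch using the sets from the example above, with α and β written as strings (the names are ours):

    from itertools import product

    A = {1, 2}
    B = {"alpha", "beta"}

    # A x B: all ordered pairs with first coordinate from A and second from B.
    print(set(product(A, B)))
    # {(1, 'alpha'), (1, 'beta'), (2, 'alpha'), (2, 'beta')}, in some order

    # The n-fold product A1 x ... x An is product(A1, ..., An); for A^n use repeat.
    print(len(set(product(A, repeat=3))))    # 8 = |A|^3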
Exercises
a) A ∩ B b) A ∪ B c) A − B
d) B − A
Exercise 6.6. Let A = {1, 2, 3, 4}, B = {a, b, c}, C = {α, β}, and D = {7, 8, 9}. Write out the
following Cartesian products.
a) A × B b) B × A c) C × B × D
Exercise 6.9. Let A = {1, 2, 3} × {1, 2, 3, 4}. List the elements in the set B = {(s, t) ∈ A | s ≥ t}.
Problems
a) A ∩ B b) A ∪ B
c) A − B d) B − A
Problem 6.3. Let A = {1, 2, 3} × {1, 2, 3, 4}. List the elements of the set B = {(s, t) ∈ A | s < t }.
Chapter 7
Styles of Proof
Earlier, we practiced proving the validity of logical arguments, both with and without quantifiers.
The technique introduced there is one of the main tools for constructing proofs in a more general
setting. In this chapter, various common styles of proof in mathematics are described. Recognizing
these styles of proof will make both reading and constructing proofs a little less onerous. The
example proofs in this chapter will use some familiar facts about integers, which we will prove in a
later chapter.
As mentioned before, the typical form of the statement of a theorem is: if a and b and c and · · · ,
then d. The propositions a,b,c, · · · are called the hypotheses, and the proposition d is called the
conclusion. The goal of the proof is to show that (a ∧ b ∧ c ∧ · · · ) → d is a true proposition. In the
case of propositional logic, the only thing that matters is the form of a logical argument, not the
particular propositions that are involved. That means the proof can always be given in the form of
a truth table. In areas outside of propositional logic that is no longer possible. Now the content of
the propositions must be considered. In other words, what the words mean, and not merely how
they are strung together, becomes important.
Suppose we want to prove an implication Theorem: If p, then q. In other words, we want to show
p → q is true. There are two possibilities: Either p is false, in which case p → q is automatically
true, or p is true. In this second case, we need to show that q is true as well to conclude p → q is
true. In other words, to show p → q is true, we can begin by assuming p is true, and then give an
argument that q must be true as well. The outline of such a proof will look like:
Proof.
Step 1) Reason 1
Step 2) Reason 2
   ⋮           ⋮
Step l) Reason l   □
The symbol □ in the last line informs the reader that the proof is finished. Every step in the proof
must be a true proposition, and since the goal is to conclude q is true, the proposition q will be the
last step in the proof. There are only four acceptable reasons that can be invoked to justify a
step in a proof. Each step can be: (1) a hypothesis (and so assumed to be true), (2) an application
of a definition, (3) a known fact proved previously, and so known to be true, or (4) a consequence of
applying a rule of inference or a logical equivalence to earlier steps in the proof. The only difference
between these sorts of formal proofs and the proofs of logical arguments we practiced earlier is the
inclusion of definitions as a justification of a step.
Before giving a few examples, there is one more point to consider. Most theorems in mathematics
involve variables in some way, along with either universal or existential quantifiers. But, in the case
of universal quantifiers, tradition dictates that the mention of the quantifier is often suppressed,
and left for the reader to fill in. For example consider: Theorem: If n is an even integer, then
n² is an even integer. The statement is really shorthand for Theorem: For every n ∈ Z, if n is even, then n² is even. If we let E(n) be the predicate n is even with universe of discourse Z, the theorem becomes Theorem: ∀n(E(n) → E(n²)). The truth of such a universally quantified statement can be established with an application of the rule of universal generalization. In other words, we prove that for an arbitrary n ∈ Z, the proposition E(n) → E(n²) is true.
Proof.
Step 1) n is an even integer            Reason: hypothesis
Step 2) n = 2k for some integer k       Reason: definition of even
Step 3) n² = (2k)² = 4k² = 2(2k²)       Reason: algebra, squaring both sides of Step 2
Step 4) 2k² is an integer               Reason: integers are closed under multiplication
Step 5) n² is even                      Reason: definition of even, using Steps 3 and 4    □
Usually proofs are not presented in the dry step-wise style of the last example. Instead, a more
narrative style is used. So the above proof could go as follows:
Proof. Suppose n is an even integer. That means n = 2k for some integer k. Squaring both sides gives n² = (2k)² = 4k² = 2(2k²), which shows n² is even.

All the ingredients of the step-wise proof are present in the narrative form, but this second form is a little more reader friendly. For example, we can include a few comments, such as squaring both sides gives, to help the reader figure out what is happening.
The method of proof given above is called direct proof. The characteristic feature of a direct
proof is that in the course of the proof, the hypotheses appear as steps, and the last step in the
proof is the conclusion of the theorem.
Theorem: If m and n are odd integers, then m + n is an even integer.

Proof. Suppose m and n are odd integers. That means m = 2j + 1 for some integer j, and n = 2k + 1 for some integer k. Adding gives m + n = (2j + 1) + (2k + 1) = 2j + 2k + 2 = 2(j + k + 1), and so we see m + n is even.
There are situations where a direct proof is not very convenient for one reason or another. There
are several other styles of proof, each based on some logical equivalence.
Suppose once more that we are asked to prove an implication:

Theorem 7.4. p → q

Since p → q ≡ ¬q → ¬p, we may replace Theorem 7.4 with the equivalent Theorem: ¬q → ¬p. In other words, we replace the requested implication with its contrapositive, and prove that instead. This method of proof is called indirect proof. Here's an example.

Theorem: If m² is an even integer, then m is an even integer.
Proof. Suppose m is not even. Then m is odd. So m = 2k + 1 for some integer k. Squaring both sides of that equation gives m² = (2k + 1)² = 4k² + 4k + 1 = 2(2k² + 2k) + 1, which shows m² is not even.
Notice that we gave a direct proof of the equivalent theorem: If m is not an even integer, then m2
is not an even integer.
Another alternative to a direct proof is proof by contradiction. In this method the plan is
to replace the requested Theorem: r (where r can be any simple or compound proposition) with
Theorem: ¬r → F, where F is any proposition known to be false. The reason proof by contradiction
is a valid form of proof is that ¬r → F ≡ r, so that showing ¬r → F is true is identical to showing
r is true. Proofs by contradiction can be a bit more difficult to discover than direct or indirect
proofs. The reason is that in those two types of proof, we know exactly what the last line of our
proof will be. We know where we want to get to. But in a proof by contradiction, we only know
that we want to end up with some (any) proposition known to be false. Typically, when writing a
proof by contradiction, we experiment, trying various logical arguments, hoping to stumble across
some false proposition, and so conclude the proof. For example, consider the following.
Theorem 7.7. √2 is irrational.

Proof. Suppose that √2 is rational. Then there exist integers m and n with n ≠ 0, so that √2 = m/n, with m/n in lowest terms. Squaring both sides gives 2 = m²/n². Thus m² = 2n², and so m² is even. Therefore m is even. So m = 2k for some integer k. Substituting 2k for m in m² = 2n² shows (2k)² = 4k² = 2n², which means that n² = 2k². Therefore n² is even, which means n is even. Now since both m and n are even, they have 2 as a common factor. Therefore m/n is in lowest terms and it is not in lowest terms. →←
The symbol →← (two arrows crashing into each other head on) denotes that we have reached a
fallacy (F), a statement known to be false. It usually marks the end of a proof by contradiction.
In the next example, we will prove a proposition of the form p → q by contradiction. The theorem
is about real numbers x and y.
Theorem 7.9. If 0 < x < y, then √x < √y.
Think of the statement of the theorem in the form p → q. The plan is to replace the requested theorem with

Theorem 7.10. ¬(p → q) → F.

But ¬(p → q) ≡ ¬(¬p ∨ q) ≡ p ∧ ¬q. So we will actually prove (p ∧ ¬q) → F. In other words, we will prove (directly)

Theorem 7.11. If 0 < x < y and √x ≥ √y, then (some fallacy).
Proof. Suppose 0 < x < y and √x ≥ √y. Since √x > 0, √x · √x ≥ √x · √y, which is the same as x ≥ √(xy). Also, since √y > 0, √y · √x ≥ √y · √y, which is the same as √(xy) ≥ y. Putting x ≥ √(xy) and √(xy) ≥ y together, we conclude x ≥ y. Thus x < y and x ≥ y. →←
The only other common style of proof is proof by cases. Let’s first look at the justification for
this proof technique. Suppose we are asked to prove Theorem: p → q. We dream up some propositions, r and s, and replace the requested theorem with three theorems:
(1) p → (r ∨ s),
(2) r → q, and
(3) s → q.
The propositions r, s we dream up are called the cases. There can be any number of cases. If we
dream up three cases, then we would have four theorems to prove, and so on. The hope is that
the proofs of these replacement theorems will be much easier than a proof of the original theorem.
This is the divide and conquer approach to a proof.
The reason this replacement is legitimate is that [(p → (r ∨ s)) ∧ (r → q) ∧ (s → q)] → (p → q) is a tautology, which you should be able to verify. Proof by cases, as with proof by contradiction,
is generally a little trickier than direct and indirect proofs. In a proof by contradiction, we are not
sure exactly what we are shooting for. We just hope some contradiction will pop up. For a proof
by cases, we have to dream up the cases to use, and it can be difficult at times to dream up good
cases.
Theorem: If n is an integer, then |n| ≥ n.

Proof. Suppose n is an integer. There are two cases: Either (1): n > 0, or (2): n ≤ 0. (This has the form p → (r ∨ s) of replacement theorem (1) above.)
Case 1: We need to show If n > 0, then |n| ≥ n. (We will do this with a direct proof.) Suppose n > 0.
Then |n| = n. Thus |n| ≥ n is true.
Case 2: We need to show If n ≤ 0, then |n| ≥ n. (We will again use a direct proof.) Suppose n ≤ 0.
Now 0 ≤ |n|. Thus, n ≤ |n|.
So, in any case, n ≤ |n| is true, and that proves the theorem.
A proof of a statement of the form ∃xP (x) is called an existence proof. The proof may be
constructive, meaning that the proof provides a specific example of, or at least an explicit recipe
for finding, an x so that P (x) is true; or the proof may be non-constructive, meaning that it
establishes the existence of x without giving a method of actually producing an example of an x
for which P (x) is true.
To give examples of each type of existence proof, let’s use a familiar fact (which will be proved
a little later in the course): There are infinitely many primes. Recall that a prime is an integer
greater than 1 whose only positive divisors are 1 and itself. The next two theorems are contrived,
but they demonstrate the ideas of constructive and nonconstructive proofs.
Theorem. There is a prime with more than two digits.

Proof. Checking shows that 101 has no positive divisors besides 1 and itself. Also, 101 has more
than two digits. So we have produced an example of a prime with more than two digits.
That is a constructive proof of the theorem. Now, here is a non-constructive proof of a similar
theorem.
Theorem 7.16. There is a prime with more than one billion digits.
Proof. Since there are infinitely many primes, they cannot all have one billion or fewer digits. So
there must be some primes with more than one billion digits.
Finally, suppose we are asked to prove a theorem of the form ∀x P (x), and for one reason or
another we come to believe the proposition is not true. The proposition can be shown to be false
by exhibiting a specific element from the domain of x for which P (x) is false. Such an example is
called a counterexample to the theorem. Let’s look at a specific instance of the counterexample
technique.
Consider the claim: For every positive integer n, the number n² − n + 41 is prime.

Counterexample 7.18. To disprove the claim, we explicitly specify a positive integer n such that n² − n + 41 is not prime. In fact, when n = 41, the expression is not a prime since clearly 41² − 41 + 41 = 41² is divisible by 41. So, n = 41 is a counterexample to the proposition.
An interesting fact about this example is that n = 41 is the smallest counterexample. For n = 1, 2, ..., 40, it turns out that n² − n + 41 is a prime! This example shows the danger of checking a theorem of the form ∀x P(x) for a few (or a few billion!) values of x, finding P(x) true for those cases, and concluding it is true for every possible value of x.
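The claim about n² − n + 41 is easy to verify by brute force. Here is a short Python check, written purely for illustration, confirming that the expression is prime for n = 1, ..., 40 but not for n = 41.

def is_prime(m):
    # Trial division; plenty fast for numbers this small.
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

values = [n * n - n + 41 for n in range(1, 42)]      # n = 1, ..., 41
print(all(is_prime(v) for v in values[:40]))         # True: prime for n = 1..40
print(values[40], is_prime(values[40]))              # 1681 False, since 1681 = 41^2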
For the purpose of these exercises and problems, feel free to use familiar facts and definitions about
integers. For example: Recall, an integer n is even if n = 2k for some integer k. And, an integer n
is odd if n = 2k + 1 for some integer k.
Exercises
Exercise 7.1. Give a direct proof that the sum of two even integers is even.
Exercise 7.2. Give an indirect proof that if the square of the integer n is odd, then n is odd.
Exercise 7.3. Give a proof by contradiction that the sum of a rational number and an irrational
number is irrational.
Exercise 7.5. Give a counterexample to the proposition Every positive integer that ends with a 7
is a prime.
Problems
Problem 7.1. Give a direct proof that the sum of an even integer and an odd integer is odd.
Hint: Start by letting m be an even integer and letting n be an odd integer. That means m = 2k
for some integer k and n = 2j + 1 for some integer j. You are interested in m + n, so add them up
and see what you get. Why is the thing you get an odd integer (think about the definition of odd)?
Problem 7.2. Give a direct proof that the sum of two odd integers is even.
Problem 7.3. Give an indirect proof that if n3 is even, then n is even. Hint: Study the solution
of a similar statement in the sample exercises for this lesson.
Problem 7.4. Give a proof by contradiction that if n is an integer and 3n + 2 is odd, then n is odd.
Hint: This is the problem in this set that gives the most grief. Study the section in the notes where the mechanics of proving a statement of the form If P, then Q by contradiction is discussed. Be sure you understand why the first line of the proof should be something like Suppose 3n + 2 is odd and n is even.
Problem 7.5. Give an example of a predicate P (n) about positive integers n, such that P (n) is
true for every positive integer from 1 to one billion, but which is nevertheless not true for all
positive integers. (Hint: there is a really simple choice possible for the predicate P (n).)
Problem 7.6. The maximum of two numbers a and b is a provided a ≥ b. Notation: max(a, b) = a. The minimum of a and b is a provided a ≤ b. Notation: min(a, b) = a. Examples: max(2, 3) = 3, max(5, 0) = 5, min(2, 3) = 2, min(5, 0) = 0, max(4, 4) = min(4, 4) = 4.
Give a proof by cases that for any numbers s, t,
min(s, t) + max(s, t) = s + t.
Problem 7.7. Give a proof by cases that for integers m, n, we have |mn| = |m||n|. Hint: Consider
four cases: (1) m ≥ 0 and n ≥ 0, (2) m ≥ 0 and n < 0, (3) m < 0 and n ≥ 0, and (4) m < 0 and
n < 0.
Chapter 8
Relations
Two-place predicates, such as B(x, y) : x is the brother of y, play a central role in mathematics.
Such predicates can be used to describe many basic concepts. As examples, consider the predicates
given verbally:
(1) G(x, y) : x is greater than or equal to y which compares the magnitudes of two values.
(2) P (x, y) : x has the same parity as y which compares the parity of two integers.
(3) S(x, y) : x has square equal to y which relates a value to its square.
Two-place predicates are called relations, probably because of examples such as the brother of
given above. To be a little more complete about it, if P (x, y) is a two-place predicate, and the
domain of discourse for x is the set A, and the domain of discourse for y is the set B, then P is
called a relation from A to B. When working with relations, some new vocabulary is used. The
set A (the domain of discourse for the first variable) is called the domain of the relation, and the
set B (the domain of discourse for the second variable) is called the codomain of the relation.
There are several different ways to specify a relation. One way is to give a verbal description as in
the examples above. As one more example of a verbal description of a relation, consider
E(x, y) : The word x ends with the letter y. Here the domain will be words in English, and the
codomain will be the twenty-six letters of the alphabet. We say the ordered pair (cat, t) satisfies
the relation E, but that (dog, w) does not.
When dealing with abstract relations, a verbal description is not always convenient. An alternate
method is to tell what the domain and codomain are to be, and then simply list the ordered pairs
which will satisfy the relation. For example, if A = {1, 2, 3, 4} and B = {a, b, c, d}, then one of
many possible relations from A to B would be {(1, b), (2, c), (4, c)}. If we name this relation R, we
will write R = {(1, b), (2, c), (4, c)}. It would be tough to think of a natural verbal description of R.
When thinking of a relation, R, as a set of ordered pairs, it is common to write aRb in place of
(a, b) ∈ R. For example, using the relation G defined above, we can convey the fact that the pair
(3, 2) satisfies the relation by writing any one of the following: (1) G(3, 2) is true, (2) (3, 2) ∈ G, or
(3) 3G2. The third choice is the preferred one when discussing relations abstractly.
Sometimes the ordered pair representation of a relation can be a bit cumbersome compared to
the verbal description. Think about the ordered pair form of the relation E given above: E =
{ (cat, t), (dog, g), (antidisestablishmentarianism, m), · · · }.
Another way to represent a relation is with a graph. Here, a graph is a diagram made up of dots,
called vertices, some of which are joined by lines, called edges. To draw a graph of a relation R
from A to B, make a column of dots, one for each element of A, and label the dots with the names
of those elements. Then, to the right of A’s column make a column of dots for the elements of B.
Then connect the vertex labelled a ∈ A to a vertex b ∈ B with an edge provided (a, b) ∈ R. The
diagram is called the bipartite graph representation of R.
Example 8.1. Let A = {1, 2, 3, 4 }, B = {a, b, c, d }, and let R = {(1, a), (2, b), (3, c), (3, d), (4, d)}.
Then the bipartite graph which represents R is given in figure 8.1.
The choices made about the ordering and the placement of the vertices for the elements of A and B
may make a difference in the appearance of the graph, but all such graphs are considered equivalent.
Also, edges can be curved lines. All that matters is that such diagrams convey graphically the same
information as R given as a set of ordered pairs.
It is common to have the domain and the codomain of a relation be the same set. If R is a relation
from A to A, then we will say R is a relation on A. In this case there is a shorthand way of
representing the relation by using a digraph. The word digraph is shorthand for directed graph
meaning the edges have a direction indicated by an arrowhead. Each element of A is used to label
[Figure 8.1: the bipartite graph representing R, with a column of vertices 1, 2, 3, 4 for A and a column a, b, c, d for B.]
[Figure: a digraph on the vertices 1, 2, 3, 4.]
a single point. An arrow connects the vertex labelled s to the one labelled t provided (s, t) ∈ R.
An edge of the form (s, s) is called a loop.
Again it is true that a different placement of the vertices may yield a different-looking, but equiva-
lent, digraph.
The last method for representing a relation is by using a 0-1 matrix. This method is particularly
handy for encoding a relation in computer memory. An m × n matrix is a rectangular array with
m rows and n columns. Matrices are usually denoted by capital English letters. The entries of a
matrix, usually denoted by lowercase English letters, are indexed by row and column. Either ai,j
or aij stands for the entry in a matrix in the ith row and jth column. A 0-1 matrix is one all of
whose entries are 0 or 1. Given two finite sets A and B with m and n elements respectively, we may
use the elements of A (in some fixed order) to index the rows of an m × n 0-1 matrix, and use the
elements of B to index the columns. So for a relation R from A to B, there is a matrix of R, MR
with respect to the orderings of A and B which represents R. The entry of MR in the row labelled
by a and column labelled by b is 1 if aRb and 0 otherwise. This is exactly like using characteristic
vectors to represent subsets of A × B, except that the vectors are cut into n chunks of size m.
Example 8.3. Let A = {1, 2, 3, 4} and B = {a, b, c, d} as before, and consider the relation R = {(1, a), (1, b), (2, c), (4, c), (4, a)}. Then a 0-1 matrix which represents R using the natural orderings of A and B is

MR =
1 1 0 0
0 0 1 0
0 0 0 0
1 0 1 0

Note: This matrix may change appearance if A or B is listed in a different order.
Since relations can be thought of as sets of ordered pairs, it makes sense to ask if one relation is
a subset of another. Also, set operations such as union and intersection can be carried out with
relations.
These notions can be expressed in terms of the matrices that represent the relations. Bit-wise
operations on 0-1 matrices are defined in the obvious way. Then MR∪S = MR ∨ MS , and MR∩S =
MR ∧ MS . Also, for two 0-1 matrices of the same size M ≤ N means that wherever N has a 0
entry, the corresponding entry in M is also 0. Then R ⊆ S means the same as MR ≤ MS .
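For small finite relations these matrix operations are easy to carry out by machine. The Python sketch below is for illustration only: it builds the 0-1 matrix of the relation R of Example 8.3, then forms the matrix of a union entry by entry. The second relation S here is invented just for the example.

A = [1, 2, 3, 4]            # ordering of the domain
B = ['a', 'b', 'c', 'd']    # ordering of the codomain

def matrix_of(relation, rows, cols):
    # The entry in row r, column c is 1 exactly when (r, c) is in the relation.
    return [[1 if (r, c) in relation else 0 for c in cols] for r in rows]

R = {(1, 'a'), (1, 'b'), (2, 'c'), (4, 'c'), (4, 'a')}   # the relation of Example 8.3
S = {(1, 'd'), (2, 'c'), (3, 'b')}                       # a second relation, made up here

MR = matrix_of(R, A, B)
MS = matrix_of(S, A, B)

# The matrix of R union S is the entry-wise "or" of MR and MS.
M_union = [[MR[i][j] | MS[i][j] for j in range(len(B))] for i in range(len(A))]

for row in MR:
    print(row)       # reproduces the matrix displayed above
print()
for row in M_union:
    print(row)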
There are two further operations on relations that will be important. First, if R is a relation from A to B, then by reversing all the ordered pairs in R, we get a new
relation, denoted R−1 , called the inverse of R. In other words, R−1 is the relation from B to A
given by R−1 = {(b, a)|(a, b) ∈ R}. A bipartite graph for R−1 can be obtained from a bipartite
graph for R simply by interchanging the two columns of vertices with their attached edges (or, by
rotating the diagram 180◦ ).
If the matrix for R is MR , then the matrix for R−1 is produced by taking the columns of MR−1
to be the rows of MR . A matrix obtained by changing the rows of M into columns is called the
transpose of M , and written as M T . So, in symbols, if M is a matrix for R, then M T is a matrix
for R−1 .
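Both of these observations amount to one-line computations; a brief illustrative sketch, reusing the relation and matrix of Example 8.3:

R = {(1, 'a'), (1, 'b'), (2, 'c'), (4, 'c'), (4, 'a')}

# The inverse relation: reverse every ordered pair.
R_inv = {(b, a) for (a, b) in R}
print(R_inv)

# The transpose of a 0-1 matrix: rows become columns.
M = [[1, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [1, 0, 1, 0]]
M_T = [list(col) for col in zip(*M)]
for row in M_T:
    print(row)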
[Figure 8.3: the bipartite graphs of S (from A = {1, 2, 3, 4} to B = {α, β}) and R (from B to C = {a, b, c}), together with the resulting composition R ◦ S.]
The second operation with relations concerns the situation when S is a relation from A to B and
R is a relation from B to C. In such a case, we can form the composition of S by R which is
denoted R ◦ S. The composition is defined as R ◦ S = {(a, c) ∈ A × C | there is some b ∈ B with (a, b) ∈ S and (b, c) ∈ R}.
Example 8.4. Let A = {1, 2, 3, 4}, B = {α, β} and C = {a, b, c}. Further let S = {(1, α), (1, β),
(2, α), (3, β), (4, α)} and R = {(α, a), (α, c), (β, b)}.
Since (1, α) ∈ S and (α, a) ∈ R, it follows that (1, a) ∈ R ◦ S. Likewise, since (2, α) ∈ S and
(α, c) ∈ R, it follows that (2, c) ∈ R ◦ S. Continuing in that fashion shows that
R ◦ S = {(1, a), (1, b), (1, c), (2, a), (2, c), (3, b), (4, a), (4, c)}.
The composition can also be determined by using the bipartite graphs of the relations. Make a
column of vertices for A labelled 1, 2, 3, 4, then to the right a column of points for B labelled α, β,
then again to the right a column of points for C labelled a, b, c. Draw in the edges as usual for R
and S. (See figure 8.3.) Then a pair (x, y) will be in R ◦ S provided there is a two edge path from
x to y.
In terms of 0-1 matrices, if MS is the m × k matrix of S with respect to the given orderings of A and B, and if MR is the k × n matrix of R with respect to the given orderings of B and C, then whenever the i, l entry of MS and the l, j entry of MR are both 1, we have (ai, cj) ∈ R ◦ S.
This example motivates the definition of the Boolean product of MS and MR as the corresponding matrix MR◦S of the composition. More rigorously, when M is an m × k 0-1 matrix and N is a k × n 0-1 matrix, M ⊙ N is the m × n 0-1 matrix whose i, j entry is (mi,1 ∧ n1,j) ∨ (mi,2 ∧ n2,j) ∨ · · · ∨ (mi,k ∧ nk,j).
This looks worse than it is. It achieves the desired result. The Boolean product is computed the
same way as the ordinary matrix product where multiplication and addition have been replaced
with and and or, respectively.
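The Boolean product is also straightforward to program. The following Python sketch is an illustration using the data of Example 8.4 (with 'alpha' and 'beta' standing in for α and β): it computes MS ⊙ MR and recovers the matrix of R ◦ S.

def boolean_product(M, N):
    # (M ⊙ N)[i][j] is the "or" over l of (M[i][l] and N[l][j]).
    m, k, n = len(M), len(N), len(N[0])
    return [[int(any(M[i][l] and N[l][j] for l in range(k)))
             for j in range(n)] for i in range(m)]

# The data of Example 8.4.
A = [1, 2, 3, 4]
B = ['alpha', 'beta']
C = ['a', 'b', 'c']
S = {(1, 'alpha'), (1, 'beta'), (2, 'alpha'), (3, 'beta'), (4, 'alpha')}
R = {('alpha', 'a'), ('alpha', 'c'), ('beta', 'b')}

MS = [[1 if (x, y) in S else 0 for y in B] for x in A]
MR = [[1 if (y, z) in R else 0 for z in C] for y in B]

for row in boolean_product(MS, MR):   # the matrix of R ◦ S
    print(row)
# Rows are indexed by 1, 2, 3, 4 and columns by a, b, c; compare with
# R ◦ S = {(1,a),(1,b),(1,c),(2,a),(2,c),(3,b),(4,a),(4,c)} found above.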
Exercises
Exercise 8.2. The matrix of a relation S from {1, 2, 3, 4, 5} to {a, b, c, d} with respect to the given
orderings is displayed below. Represent S as a bipartite graph, and as a set of ordered pairs.
MS =
1 1 0 0
0 0 1 1
1 0 0 1
1 0 1 0
0 1 1 0
Exercise 8.3. Find the composition R ◦ S (where R and S are defined in exercises 8.1 and 8.2) as
a set of ordered pairs. Use the Boolean product to find MR◦S with respect to the natural orderings.
N.B. The natural ordering for R is not the ordering used in exercise 8.1.
R1 = {(1, 2), (1, 3), (1, 5), (2, 1), (2, 2), (2, 4), (3, 3), (3, 4),
(4, 1), (4, 5), (5, 5), (6, 6)} and
R2 = {(1, 2), (1, 6), (2, 1), (2, 2), (2, 3), (2, 5), (3, 1), (3, 3), (3, 6),
(4, 2), (4, 3), (4, 4), (5, 1), (5, 5), (5, 6), (6, 2), (6, 3), (6, 6)}.
(b) With respect to the given ordering of B find the matrix of each relation in part (a)
Problems
Problem 8.1. Let A = {a, b, c, d} and R = {(a, a), (a, c), (b, d), (c, a), (c, c), (d, b)} be a relation on
A. Draw a digraph which represents R. Draw the bipartite graph which represents R.
Problem 8.2. Let A = {a, b, c, d} and R = {(a, a), (a, c), (b, d), (c, a), (c, c), (d, b)} be a relation on
A. What is the inverse of R?
Problem 8.3. Find the composition, R ◦ S, where S = { (1, a), (4, a), (5, b), (2, c), (5, c), (3, d)} and
R = {(a, x), (a, y), (b, x), (c, z), (d, z)}, as a set of ordered pairs.
Problem 8.4. Let R1 = {(1, 2), (1, 3), (1, 5), (2, 1), (6, 6)} and
R2 = {(1, 2), (1, 6), (3, 6), (4, 2), (5, 6), (6, 2), (6, 3)}. Find R1 ∪ R2 and R1 ∩ R2 .
Problem 8.5. Let L be the relation less than on the set of integers. Examples: 3 L 7 and −8 L 0 are true, but 5 L 2 and 6 L 6 are false. How would you describe the relation L−1 ?
Problem 8.6. True or False: For any relation R, (R−1 )−1 = R. Explain your answer.
Problem 8.7. Are there relations R for which R = R−1 ? If not, explain why it is not possible. If
so, give an example of such a relation.
Problem 8.8. Let R be a relation on a set A, and let R−1 be its inverse. Prove that if (a, b) ∈
R ◦ R−1 , then (b, a) ∈ R ◦ R−1 .
Problem 8.9. Let A and B be two sets. Explain why the empty set, ∅, is a relation from A to B.
Problem 8.10. Let S be a relation from A to B, and let R be a relation from B to C. Prove
(R ◦ S)−1 = S −1 ◦ R−1 .
Chapter 9
Properties of Relations
There are several conditions that can be imposed on a relation R on a set A that make it useful.
These requirements distinguish those relations which are interesting for some reason from the garden
variety junk, which is, let’s face it, what most relations are.
A relation R on a set A is reflexive provided (a, a) ∈ R for every a ∈ A; in other words, every element of A is related to itself. It is easy to spot a reflexive relation from its digraph: there is a loop at every vertex. Also, a
reflexive relation can be spotted quickly from its matrix. First, let’s agree that when the matrix
of a relation on a set A is written down, the same ordering of the elements of A is used for both
the row and column designators. For a reflexive relation, the entries on the main diagonal of its
matrix will all be 1’s. The main diagonal of a square matrix runs from the upper left corner to the
lower right corner.
The flip side of the coin from reflexive is irreflexive. A relation R on A is irreflexive in case (a, a) ∉ R for all a ∈ A. In other words, no element of A is related to itself. The brother of relation is irreflexive.
The digraph of an irreflexive relation contains no loops, and its matrix has all 0’s on the main
diagonal.
Actually, that discussion was a little careless. To see why, consider the relation S given by the rule x S y provided x² ≥ y.
Is this relation reflexive? The answer is: we can’t tell. The answer depends on the domain of the
relation, and we haven’t been told what that is to be. For example, if the domain is the set N of
natural numbers, then the relation is reflexive, since n2 ≥ n for all n ∈ N. However, if the domain
is the set R of all real numbers, the relation is not reflexive. In fact, a counterexample to the claim that S is reflexive on R is the number 1/2, since (1/2)² = 1/4, and 1/4 < 1/2, so 1/2 is not related to 1/2. The lesson to be
learned from this example is that the question of whether a relation is reflexive cannot be answered
until the domain has been specified. The same is true for the irreflexive condition and the other
conditions defined below. Always be sure you know the domain before trying to determine which
properties a relation satisfies.
A relation R on A is symmetric provided (a, b) ∈ R → (b, a) ∈ R. Another way to say the same
thing: R is symmetric provided R = R−1 . In words, R is symmetric provided that whenever a
is related to b, then b is related to a. Any digraph representing a symmetric relation R will have
a return edge for every non-loop. Think of this as saying the graph has no one-way streets. The
matrix M of a symmetric relation satisfies M = M T . In this case M is symmetric about its main
diagonal in the usual geometric sense of symmetry. The B(x, y) : x is the brother of y relation
mentioned before is not symmetric if the domain is taken to be all people since, for example,
Donny B Marie is true, but Marie B Donny is false. On the other hand, if we take the domain to be all (human)
males, then B is symmetric.
A relation R on A is antisymmetric provided that whenever (a, b) ∈ R and (b, a) ∈ R, then a = b. In other words, two different elements are never related to each other in both directions. For example, the relation ≤ on the integers is antisymmetric, since a ≤ b and b ≤ a force a = b.
A relation R on A is transitive if whenever (a, b) ∈ R and (b, c) ∈ R, then (a, c) ∈ R. This can also
be expressed by saying R◦R ⊆ R. In a digraph for a transitive relation whenever we have a directed
path of length two from a to c through b, we must also have a direct link from a to c. This means
that any digraph of a transitive relation has lots of triangles. This includes degenerate triangles
where a, b and c are not distinct. A matrix M of a transitive relation satisfies M ⊙ M ≤ M . The
relation ≤ on N is transitive, since from k ≤ m and m ≤ n, we can conclude k ≤ n.
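When a relation on a small finite set is stored as a set of ordered pairs, each of these properties can be tested mechanically, straight from the definitions. Here is an illustrative Python sketch, tried out on the relation ≤ restricted to the small domain {1, 2, 3}.

def properties(R, domain):
    return {
        'reflexive':     all((a, a) in R for a in domain),
        'irreflexive':   all((a, a) not in R for a in domain),
        'symmetric':     all((b, a) in R for (a, b) in R),
        'antisymmetric': all(a == b for (a, b) in R if (b, a) in R),
        'transitive':    all((a, d) in R
                             for (a, b) in R for (c, d) in R if b == c),
    }

A = {1, 2, 3}
R = {(a, b) for a in A for b in A if a <= b}   # "less than or equal to" on A
print(properties(R, A))
# Expect: reflexive, antisymmetric and transitive, but neither irreflexive
# nor symmetric.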
Example 9.1.
Define a relation, N on the set of all living people by the rule a N b if and only if a, b live within
one mile of each other. This relation is reflexive since every person lives within a mile of himself.
It is not irreflexive since I live within a mile of myself. It is symmetric since if a lives within a
mile of b, then b lives within a mile of a. It is not antisymmetric since Mr. and Mrs. Smith live
within a mile of each other, but they are not the same person. It is not transitive: to see why, think
of the following situation (which surely exists somewhere in the world!): there is a straight road of
length 1.5 miles. Say Al lives at one end of the road, Cal lives at the other end, and Sal lives half
way between Al and Cal. Then Al N Sal and Sal N Cal, but not Al N Cal.
Example 9.2. Let A = R and define aRb iff a ≤ b, then R is a reflexive, transitive, antisymmetric
relation. Because of this example, any relation on a set that is reflexive, antisymmetric, and transitive
is called an ordering relation. The subset relation on any collection of sets is another ordering
relation.
Example 9.3. Let A = R and define aRb iff a < b. Then R is irreflexive and transitive.
Example 9.4. The relation on A = {1, 2, 3, 4, 5, 6} given by
R = {(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (1, 3), (3, 1), (1, 5), (5, 1), (2, 4), (4, 2), (2, 6), (6, 2), (3, 5), (5, 3), (4, 6), (6, 4)}
is reflexive, symmetric, and transitive. In artificial examples such as this one, it can be a tedious chore checking that the relation is transitive.
Example 9.5. If A = {1, 2, 3, 4} and R = {(1, 1), (1, 2), (2, 3), (1, 3), (3, 4), (2, 4), (4, 1)} then R is
not reflexive, not irreflexive, not symmetric, and not transitive but it is antisymmetric.
Exercises
Exercise 9.1. Define a relation on {1, 2, 3} which is both symmetric and antisymmetric.
Exercise 9.2. Let R = {(1, 2), (2, 1), (2, 3), (3, 2), (3, 4), (4, 3)} be a relation on the set {1, 2, 3, 4}.
For each of the five properties of a relation defined in this chapter (reflexive, irreflexive, symmetric,
antisymmetric, and transitive) either show R satisfies the property, or explain why it does not.
Exercise 9.3. Each matrix below specifies a relation R on {1, 2, 3, 4, 5, 6} with respect to the given
ordering 1, 2, 3, 4, 5, 6.
For each of the five properties of a relation defined in this chapter (reflexive, irreflexive, symmetric,
antisymmetric, and transitive) either show R satisfies the property, or explain why it does not.
a)
1 1 1 0 0 0
1 1 1 0 0 0
1 1 1 0 0 0
0 0 0 1 1 1
0 0 0 1 1 1
0 0 0 1 1 1

b)
1 1 1 1 1 1
0 0 0 0 0 1
0 0 0 0 0 1
0 0 0 0 0 1
0 0 0 0 0 1
0 0 0 0 0 1

c)
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
0 0 1 1 0 0
0 1 0 0 1 0
1 0 0 0 0 0

d)
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
0 0 1 1 0 0
0 1 0 0 1 0
1 0 0 0 0 1
Exercise 9.4. Define the relation C(A, B) : |A| ≤ |B|, where the domains for A and B are all
subsets of Z.
For each of the five properties of a relation defined in this chapter (reflexive, irreflexive, symmetric,
antisymmetric, and transitive) either show C satisfies the property, or explain why it does not.
Exercise 9.6. Define the relation M (A, B) : |A ∩ B| = 1 (or, in plain English, A and B have
exactly one element in common), where the domains for A and B are all subsets of Z. A few
examples:
• {5, 10} M {1, 2, 3, 4, 5, 6} is true since the sets {5, 10} and {1, 2, 3, 4, 5, 6} have exactly one
element in common (namely 5).
• {1, 2, 3} M {6, 7, 8, 9} is false since {1, 2, 3} and {6, 7, 8, 9} have no elements in common.
• {1, 2, 3, 4} M {2, 4, 6, 8} is false since {1, 2, 3, 4} and {2, 4, 6, 8} have more than one element
in common.
For each of the five properties of a relation defined in this chapter (reflexive, irreflexive, symmetric,
antisymmetric, and transitive) either show M satisfies the property, or explain why it does not.
Problems
Problem 9.1. Let R be the relation {(1, 1)} on the set A = {1, 2}. For each of the five properties of
a relation defined in this chapter (reflexive, irreflexive, symmetric, antisymmetric, and transitive)
either show R satisfies the property, or explain why it does not.
Problem 9.2. Let S be the relation {(1, 1), (1, 2), (1, 3), (2, 3)} on the set A = {1, 2, 3}. For each
of the five properties of a relation defined in this chapter (reflexive, irreflexive, symmetric, antisym-
metric, and transitive) either show S satisfies the property, or explain why it does not.
Problem 9.3. Let A be the relation on the set Z of all integers defined by s A t if and only if |s| ≤ |t|.
For each of the five properties of a relation defined in this chapter (reflexive, irreflexive, symmetric,
antisymmetric, and transitive) either show A satisfies the property, or explain why it does not.
Problem 9.4. Let D be the relation on the natural numbers defined by the rule mDn if and only
if m does not equal n. Examples: 5D7 is true and 4D4 is false. For each of the five properties of
a relation defined in this chapter (reflexive, irreflexive, symmetric, antisymmetric, and transitive)
either show D satisfies the property, or explain why it does not.
Problem 9.5. Let R be the relation {(1, 2), (2, 3), (3, 4)} on the set A = {1, 2, 3, 4}. The relation R is not transitive on A. What is the fewest number of ordered pairs that need to be added to R so it becomes a transitive relation on A?
Problem 9.6. Give a counterexample to the claim that a relation R on a set A that is both
symmetric and transitive must be reflexive. Hint: There is a very simple example!
Problem 9.7. Define the relation M (A, B) : A ∩ B = ∅, where the domains for A and B are all
subsets of Z.
For each of the five properties of a relation defined in this chapter (reflexive, irreflexive, symmetric,
antisymmetric, and transitive) either show M satisfies the property, or explain why it does not.
Problem 9.8.
(a) Let A = {1}, and consider the empty relation, ∅, on A. For each of the five properties
of a relation defined in this chapter (reflexive, irreflexive, symmetric, antisymmetric, and
transitive) either show ∅ satisfies the property, or explain why it does not.
Chapter 10
Equivalence Relations
Relations capture the essence of many different mathematical concepts. In this chapter, we will
show how to put the idea of are the same kind in terms of a special type of relation.
Before considering the formal concept of same kind let’s look at a few simple examples. Consider
the question, posed about an ordinary deck of 52 cards: How many different kinds of cards are
there? One possible answer is: There are 52 kinds of cards, since all the cards are different. But
another possible answer in certain circumstances is: There are four kinds of cards (namely clubs,
diamonds, hearts, and spades). Another possible answer is: There are two kinds of cards, red and
black. Still another answer is: There are 13 kinds of cards: aces, twos, threes, · · · , jacks, queens,
and kings. Another answer, for the purpose of many card games is: There are ten kinds of cards,
aces, twos, threes, up to nines, while tens, jacks, queens, and kings are all considered to be the
same value (usually called 10). You can certainly think of many other ways to split the deck into
a number of different kinds.
Whenever the idea of same kind is used, some properties of the objects being considered are deemed
important and others are ignored. For instance, when we think of the deck of cards as made up of the 13 different ranks, ace through king, we are agreeing that the suit of the card is irrelevant. So the jack of hearts and the jack of clubs are taken to be the same for whatever purposes we have in
mind.
The mathematical term for same kind is equivalent. There are three basic properties always present when objects are regarded as equivalent: every object is equivalent to itself; if a first object is equivalent to a second, then the second is equivalent to the first; and if a first object is equivalent to a second, and the second is equivalent to a third, then the first is equivalent to the third.
To put the idea of equivalence in the context of a relation, suppose we have a set A of objects, and
a rule for deciding when two objects in A are the same kind (equivalent) for some purpose. Then
we can define a relation E on the set A by the rule that the pair (s, t) of elements of A is in the
relation E if and only if s and t are the same kind. For example, consider again the deck of cards,
with two cards considered to be the same if they have the same rank. Then a few of the pairs in
the relation E would be (ace hearts, ace spades), (three diamonds, three clubs), (three clubs, three
diamonds), (three diamonds, three diamonds), (king diamonds, king clubs), and so on.
Using the terminology of the previous chapter, this relation E, and in fact any relation that corre-
sponds to a notion of equivalence, will be reflexive, symmetric, and transitive. For that reason, any
reflexive, symmetric, transitive relation on a set A is called an equivalence relation on A.
Suppose E is an equivalence relation on a set A and that x is one particular element of A. The
equivalence class of x is the set of all the things in A that are equivalent to x. The symbol used for
the equivalence class of x is [x], so the definition can be written in symbols as [x] = {y ∈ A|y E x}.
For instance, think once more about the deck of cards with the equivalence relation having the same
rank. The equivalence class of the two of spades would be the set [2♠] = {2♣, 2♢, 2♡, 2♠}. That
would also be the equivalence class of the two of diamonds. On the other hand, if the equivalence
relation we are using for the deck is having the same suit, then the equivalence class of the two of
spades would be
[2♠] = {A♠, 2♠, 3♠, 4♠, 5♠, 6♠, 7♠, 8♠, 9♠, 10♠, J♠, Q♠, K♠}.
The most important fact about the collection of different equivalence classes for an equivalence
relation on a set A is that they split the set A into separate pieces. In fancier words, they partition
the set A. For example, the equivalence relation of having the same rank splits a deck of cards into
13 different equivalence classes. In a sense, when using this equivalence relation, there are only 13
different objects, four of each kind.
Example 10.2. Let A be the set of logical propositions and define R on A by pRq iff p ≡ q.
Example 10.3. Let A be the set of people in the world and define R on A by aRb iff a and b are
the same age in years.
Example 10.4. Let A = {1, 2, 3, 4, 5, 6} and R be the relation on A with the matrix from part a) of exercise 9.3.
Example 10.5. Define P on Z by a P b if and only if a and b are both even, or both odd. We say
a and b have the same parity.
For the equivalence relation has the same rank on a set of cards in a 52 card deck, there are 13
different equivalence classes. One of the classes contains all the aces, another contains all the 2’s,
and so on.
Example 10.6. For the equivalence relation from example 10.5, the equivalence class of 2 is the
set of all even integers.
In this example, there are two different equivalence classes, the one comprising all the even integers,
and the other comprising all the odd integers. As far as parity is concerned, −1232215 and 171717
are the same.
Suppose E is an equivalence relation on A. The most important fact about equivalence classes is
that every element of A belongs to exactly one equivalence class. Let’s prove that.
Theorem 10.7. Let E be an equivalence relation on a set A, and let a ∈ A. Then there is exactly
one equivalence class to which a belongs.
Proof. Since E is reflexive, a E a, so a ∈ [a], and a belongs to at least one equivalence class. Now suppose a also belongs to the class [b] of some element b ∈ A. So we know three things: a ∈ [a], a ∈ [b], and E is an equivalence relation. Using those three pieces of information, we need to show the two sets [a] and [b] are equal. Now, to
show two sets are equal, we show they have the same elements. In other words, we want to prove (1) [a] ⊆ [b], and (2) [b] ⊆ [a]. Let's prove (2).
Suppose c ∈ [b]. Then, according to the definition of [b], c E b. The goal is to end up with c ∈ [a].
Now, we know a ∈ [b], and that means a E b. Since E is symmetric and a E b, it follows that b E a.
Now we have c E b and b E a. Since E is transitive, we can conclude c E a, which means c ∈ [a] as
we hoped to show. That proves (2).
For homework, you will complete the proof of this theorem by doing part (1).
So to express the meaning of theorem 10.7 above in different words: The different equivalence
classes of an equivalence relation on a set partition the set into nonempty disjoint pieces. More
briefly: the equivalence classes of E partition A.
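For a finite set, the equivalence classes can be computed directly from the definition [x] = {y ∈ A | y E x}. The Python sketch below is illustrative only: it uses the parity relation of Example 10.5, restricted here to the small set {1, 2, 3, 4, 5, 6}.

def equivalence_classes(E, A):
    # [x] = { y in A : (y, x) in E }; collect the distinct classes.
    classes = []
    for x in A:
        cls = frozenset(y for y in A if (y, x) in E)
        if cls not in classes:
            classes.append(cls)
    return classes

A = {1, 2, 3, 4, 5, 6}
E = {(a, b) for a in A for b in A if a % 2 == b % 2}   # same parity

for cls in equivalence_classes(E, A):
    print(sorted(cls))
# Two classes: the odd elements and the even elements of A.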
The fact that an equivalence relation partitions the underlying set is reflected in the digraph of an
equivalence relation. If we pick an equivalence class [a] of an equivalence relation E on a finite set
A and we pick b ∈ [a], then b E c for all c ∈ [a]. This is true since b ∈ [a] means b E a, and c ∈ [a] means c E a, so by symmetry a E c, and then transitivity fills in b E c. So in any digraph for E every vertex of [a] is connected to every other
vertex in [a] (including itself) by a directed edge. Also no vertex in [a] is connected to any vertex
in A − [a]. So the digraph of E consists of separate components, one for each distinct equivalence
class, where each component contains every possible directed edge.
In terms of a matrix representation of an equivalence relation E on a finite set A of size n, let the
distinct equivalence classes have size k1, k2, ..., kr, where k1 + k2 + ... + kr = n. Next list the elements of A as a1,1, ..., ak1,1, a1,2, ..., ak2,2, ..., a1,r, ..., akr,r, where the ith equivalence class is {a1,i, ..., aki,i}.
Then the matrix for E with respect to this ordering has the block-diagonal form

Jk1   0    ···    0
 0   Jk2   ···    0
 ⋮     ⋮     ⋱     ⋮
 0    0    ···   Jkr

where Jm denotes the m × m matrix all of whose entries are 1. Conversely, if the digraph of a relation can be
drawn to take the above form, or if it has a matrix representation of the above form, then it is an
equivalence relation and therefore reflexive, symmetric, and transitive.
Exercises
Exercise 10.1. Let A = {0, 1, 2}. Let R = {(0, 0), (1, 1), (2, 2), (0, 1), (1, 0)}. Is R an equivalence
relation on A? If it is, what are the equivalence classes?
Exercise 10.2. Let A = {0, 1, 2, 3}. Let R = {(0, 0), (1, 1), (2, 2), (0, 1), (1, 0)}. Is R an equivalence
relation on A? If it is, what are the equivalence classes?
Exercise 10.3. Let A = {0, 1, 2}. Let R = {(0, 0), (1, 1), (2, 2), (0, 1)}. Is R an equivalence relation
on A? If it is, what are the equivalence classes?
Exercise 10.4. Let A = {0, 1, 2}. Let R = {(0, 0), (1, 1), (2, 2), (0, 1), (1, 0), (1, 2), (2, 1)}. Is R an
equivalence relation on A? If it is, what are the equivalence classes?
Exercise 10.5. True or False: The relation R = {(1, 1), (2, 2)} on A = {1, 2} is both symmetric
and antisymmetric.
Exercise 10.6. The relation S is defined on the set Z of all integers by the rule m S n if and only
if m2 = n2 . Is S an equivalence relation on Z? If it is, what are the equivalence classes of S?
Exercise 10.7. Let L be the collection of all straight lines in the plane. Four examples of elements
in L: x + y = 0, 2x-y = 5, x = 7, y = 0. A relation C on L is defined by the rule l1 C l2 provided the
lines l1 and l2 have at least one point in common. (The letter C should remind us of cross, and,
loosely speaking, two lines are related if they cross each other. We will have to agree that a line
crosses itself.) Is C an equivalence relation on L? If it is, what are the equivalence classes of C?
Exercise 10.8. Let R be a relation on a non-empty set A that is both symmetric and transitive, and suppose that for each a ∈ A we have aRb for at least one b ∈ A. Prove that R is reflexive, and hence an equivalence relation.
Exercise 10.9. Let E be an equivalence relation on a set A, and let a, b ∈ A. Prove that either
[a] ∩ [b] = ∅ or else [a] = [b].
Exercise 10.10. Let A = {1, 2, 3, 4, 5, 6, 7, 8}. Form a partition of A using {1, 2, 4}, {3, 5, 7}, and
{6, 8}. These are the equivalence classes for an equivalence relation, E, on A.
Exercise 10.11. Let A = {a, b, c, d, e, f, g, h}. Determine if each matrix represents an equivalence
relation on A. If the matrix represents an equivalence relation find the equivalence classes. The
natural ordering of the elements, a, b, c, d, e, f, g, h, is used to define the matrices.
(a)
1 0 1 0 1 0 1 0
0 1 0 1 0 1 0 0
1 0 1 0 1 0 1 0
0 1 0 1 0 1 0 0
1 0 1 0 1 0 1 0
0 1 0 1 0 1 0 0
1 0 1 0 1 0 1 0
0 0 0 0 0 0 0 1

(b)
1 1 0 0 0 0 1 1
1 1 0 0 0 0 1 1
0 0 1 1 0 0 0 0
0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0
0 0 0 0 1 1 0 0
1 1 0 0 0 0 1 1
1 1 0 0 0 0 1 1
Exercise 10.12. Complete the proof of theorem 10.7 on page 67 by proving part (1).
Problems
Problem 10.1. Let A be the set of people alive on earth. For each relation defined below, determine
if it is an equivalence relation on A. If it is, describe the equivalence classes. If it is not, determine
which properties of an equivalence relation fail.
Problem 10.2. Let L be the collection of all straight lines in the plane. Four examples of elements
in L: x + y = 0, 2x-y = 5, x = 7, y = 0. A relation P on L is defined by the rule l1 P l2 provided
the lines l1 and l2 are parallel. Is P an equivalence relation on L? If it is, what are the equivalence
classes of P ?
Problem 10.3. Consider the relation S(x, y) : x is a sibling of y on the set, A, of people alive
on earth. Is S reflexive? Is S symmetric? Is S transitive? (To be precise, siblings will mean two
different people with the same two parents. Don’t consider half-siblings for this problem.)
Problem 10.4. The relation R = {(a, a), (a, b)} is not an equivalence relation on the set A =
{a, b, c}. What is the fewest number of ordered pairs that need to be added to R so the result is an
equivalence relation on A?
Problem 10.5. Let A be the set of all ordered pairs of positive integers. Some members of A are
(3, 6), (7, 7), (11, 4), (1, 2981). A relation R on A is defined by the rule (a, b)R(c, d) if and only if
ad = bc. For example (3, 5)R(6, 10) is true since (3)(10) = (5)(6).
Problem 10.6. Let A = {1, 2, 3, 4, 5, 6}. Form a partition of A using {1, 2}, {3, 4, 5}, and {6}.
These are the equivalence classes for an equivalence relation, E, on A. Draw the digraph of E.
Problem 10.7. Let A = {1, 2, 3}. The relation E = {(1, 1), (2, 2), (3, 3), (2, 3), (3, 2)} is an equiv-
alence relation on A. The relation F = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1)} is another equivalence
relation on A. Compute the composition F ◦ E. Is F ◦ E and equivalence relation on A?
Chapter 11
Functions and Their Properties
In algebra, functions are thought of as formulas such as f (x) = x2 where x is any real number. This
formula gives a rule that describes how to determine one number if we are handed some number x.
So, for example, if we are handed x = 2, the function f determines the value 4, and if we are handed 0, f determines the value 0. There is one condition that, by mutual agreement, such
a function rule must obey to earn the title function: the rule must always determine exactly one
value for each (reasonable) value it is handed. Of course, for the example above, x = blue isn’t a
reasonable choice for x, so f doesn’t determine a value associated with blue. The domain of this
function is all real numbers.
Instead of thinking of a function as a formula, we could think of a function as any rule which
determines exactly one value for every element of a set A. For example, suppose W is the set of
all words in English, and consider the rule, I, which associates with each word, w, the first letter
of w. Then I(cat) = c, I(dog) = d, I(a) = a, and so on. Notice that for each word w, I always
determines exactly one value, so it meets the requirement of a function mentioned above. Notice
that for the same set of all English words, the rule T (w) is the third letter of the word w is not a
function since, for example, T (be) has no value.
Here is the semi-formal definition of a function: A function from the set A to the set B is any
rule which describes how to determine exactly one element of B for each element of A. The set A
is called the domain of f , and the set B is called the codomain of f . The notation f : A → B
means f is a function from A to B.
Table 11.1:
x    f(x)
1    a
2    a
3    c
4    b
5    d
6    e
There are cases where it is not convenient to describe a function with words or formulas. In such
cases, it is often possible to simply make a table listing the members of the domain along with the
associated member of the codomain.
Example 11.1. Let A = {1, 2, 3, 4, 5, 6}, B = {a, b, c, d, e} and let f : A → B be specified by table
11.1
It is hard to imagine a verbal description that would act like f , but the table says it all. It is
traditional to write such tables in a more compact form as
f = { (1, a), (2, a), (3, c), (4, b), (5, d), (6, e) }.
The last result in example 11.1 looks like a relation, and that leads to the modern definition of a
function:
Definition 11.2. A function, f , with domain A and codomain B is a relation from A to B (hence
f ⊆ A × B) such that each element of A is the first coordinate of exactly one ordered pair in f .
That completes the evolution of the concept of function from formula, through rule, to set of ordered
pairs. When dealing with functions, it is traditional to write b = f (a) instead of (a, b) ∈ f .
In algebra and calculus, the functions of interest have a domain and a codomain consisting of sets
of real numbers, A, B ⊆ R. The graph of f is the set of ordered pairs in the Cartesian plane of the
form (x, f (x)). Normally in this case, the output of the function f is determined by some formula.
For example, f (x) = x2 .
We can spot a function in this case by the vertical line test. A relation from a subset A of R to
another subset of R is a function if every vertical line of the form x = a, where a ∈ A intersects the
graph of f exactly once.
[Figure 11.1: Graph of y = x².]
In discrete mathematics, most functions of interest have finite sets as domain and codomain, or perhaps a domain or codomain consisting of integers. Such domains and codomains are said to be discrete.
When f : A → B is a function and both A and B are finite, then since f is a relation, we can
represent f either as a 0 − 1 matrix or a bipartite graph. If M is a 0 − 1 matrix which represents
a function, then since every element of A occurs as the first entry in exactly one ordered pair in f ,
it must be that every row of M has exactly one 1 in it. So it is easy to distinguish which relations
are functions, and which are not from the matrix for the relation. This is the discrete analog of the
vertical line test, (but notice that rows are horizontal).
Example 11.3. Again, let’s consider the function defined, as in example 11.1, by f is from A =
{1, 2, 3, 4, 5, 6} to B = {a, b, c, d, e} given by the relation f = { (1, a), (2, a), (3, c), (4, b), (5, d), (6, e) }.
If we take the given orderings of A and B, then the 0-1 matrix representing the function f appears
in figure 11.2.
Notice that in matrix form the number of 1’s in a column coincides with the number of occurrences
of the column label as output of the function. So the sum of all entries in a given column equals the
number of times the element labeling that column is an output of the function.
1 0 0 0 0
1 0 0 0 0
0 0 1 0 0
0 1 0 0 0
0 0 0 1 0
0 0 0 0 1
In the case of a function whose domain is a subset of R, the number of times that the horizontal line y = b intersects the graph of f is the number of inputs from A for which the function value is b. Notice that the roles are reversed: in the finite case we read vertical information (the columns of the matrix), while for a function on R we read horizontal information (horizontal lines). In either case, these criteria will help us determine which of several special properties a function either has or lacks.
A function f : A → B is called one-to-one, or injective, provided different elements of A are always sent to different elements of B; that is, whenever f(a1) = f(a2), then a1 = a2. The one-to-one property is very easy to spot from either the matrix or the bipartite graph of a function. When f : A → B is one-to-one, and |A| = m and |B| = n for some m, n ∈ N − {0}, then when f is represented by a 0-1 matrix M, there can be no more than one 1 in any column. So the column sums of any 0-1 matrix representing a one-to-one function are all less than or equal to 1.
Since every row sum of M is 1 and there are m rows, we must have m ≤ n. The bipartite graph
of a one-to-one function can be recognized by the feature that no vertex of the codomain has more
than one edge leading to it.
We say that a function f : A → B is onto, or surjective, if every element of B equals f (a) for
some a ∈ A. Consequently any matrix representing an onto function has each column sum at least
one, and thus m ≥ n. In terms of bipartite graphs, for an onto function, every element of the
codomain has at least one edge leading to it.
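These row and column criteria turn into very short code when a function between finite sets is stored as a set of ordered pairs. The following Python sketch is for illustration only, reusing the function of Example 11.1; it checks the function condition and then the one-to-one and onto properties.

def is_function(f, A, B):
    # Every element of A occurs as a first coordinate exactly once,
    # and every output lies in the codomain B.
    return ({a for (a, b) in f} == set(A)
            and len(f) == len(A)
            and all(b in B for (a, b) in f))

def is_one_to_one(f):
    outputs = [b for (a, b) in f]
    return len(outputs) == len(set(outputs))   # no repeated outputs

def is_onto(f, B):
    return {b for (a, b) in f} == set(B)       # every element of B is an output

A = {1, 2, 3, 4, 5, 6}
B = {'a', 'b', 'c', 'd', 'e'}
f = {(1, 'a'), (2, 'a'), (3, 'c'), (4, 'b'), (5, 'd'), (6, 'e')}   # Example 11.1

print(is_function(f, A, B))   # True
print(is_one_to_one(f))       # False: 1 and 2 are both sent to 'a'
print(is_onto(f, B))          # True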
As an example, consider again the function L from all English words to the set of letters of the
alphabet defined by the rule L(w) is the last letter of the word w. This function is not one-to-one
since, for example, L(cat) = L(mutt), so two different members of the domain of L are associated
with the same member (namely t) of the codomain. However, L is onto. We could prove that by
making a list of twenty-six words, one ending with a, one ending with b, · · · , one ending with z.
(Only the letters j and q might take more than a moment’s thought.)
A function f : A → B which is both one-to-one and onto is called bijective. In the matrix of a
bijection, every column has exactly one 1 and every row has exactly one 1. So the number of rows
must equal the number of columns. In other words, if there is a bijection f : A → B, where A is a
finite set, then A and B have the same number of elements. In such a case we will say the sets have
the same cardinality or that they are equinumerous, and write that as |A| = |B|. The general
definition (whether A and B are finite or not) is:
Definition 11.5. A and B are equinumerous provided there exists a bijection from A to B.
Notice that for finite sets with the same number of elements, A, B, any one-to-one function must
be onto and vice versa. This is not true for infinite sets. For example the function f : Z → Z by
f(m) = 2m is one-to-one, but not onto, since f(n) = 1 is impossible for any n. On the other hand, the function g : Z → Z given by the rule g(n) is the smallest integer that is greater than or equal to n/2 is onto, but not one-to-one. As examples, g(6) = 3 and g(−5) = −2. This function is onto since clearly g(2n) = n for any integer n, so every element of the codomain is the output of at least one input. But g(1) = g(2) = 1, so g is not one-to-one.
Theorem: If g : A → B and f : B → C are functions, then the composition f ◦ g is a function from A to C.

Proof. We need to show that for each a ∈ A there is exactly one c ∈ C such that (a, c) ∈ f ◦ g. So suppose a ∈ A. Since g : A → B, there is some b ∈ B with (a, b) ∈ g. Since f : B → C, there is a
c ∈ C such that (b, c) ∈ f . So, by the definition of composition, (a, c) ∈ f ◦ g. That proves there is
at least one c ∈ C with (a, c) ∈ f ◦ g. To complete the proof, we need to show that there is only
one element of C that f ◦ g pairs up with a. So, suppose that (a, c) and (a, d) are both in f ◦ g.
We need to show c = d. Since (a, c) and (a, d) are both in f ◦ g, there must be elements s, t ∈ B
such that (a, s) ∈ g and (s, c) ∈ f , and also (a, t) ∈ g and (t, d) ∈ f . Now, since g is a function, and
both (a, s) and (a, t) are in g, we can conclude s = t. So when we write (t, d) ∈ f , we might as well
write (s, d) ∈ f . So we know (s, c) and (s, d) are both in f . As f is a function, we can conclude
c = d.
If g : A → B and f : B → C, and (a, b) ∈ g and (b, c) ∈ f , then (a, c) ∈ f ◦ g. Another way to write
that is g(a) = b and f(b) = c. So c = f(b) = f(g(a)). That last expression looks like the familiar
formula for the composition of functions found in algebra texts: (f ◦ g)(x) = f (g(x)).
When f : A → B is a function, we can form the relation f −1 from B to A. But f −1 might not be a
function. For example, suppose f : {a, b} → {1, 2} is f = {(a, 1), (b, 1)}. Then f −1 = {(1, a), (1, b)},
definitely not a function.
If in fact f −1 is a function, then for all a ∈ A with b = f (a), we have f −1 (b) = a so (f −1 ◦ f )(a) =
f −1 (f (a)) = f −1 (b) = a, ∀a ∈ A. Similarly (f ◦ f −1 )(b) = b, ∀b ∈ B. In this case we say f is
invertible. Another way to say the same thing: the inverse of a function f : A → B is a function
g : B → A which undoes the operation of f . As a particular example, consider the function
f : Z → Z given by the formula f (n) = n + 3. In words, f is the add 3 function. The operation
which undoes the effect of f is clearly the subtract 3 function. That is, f −1 (n) = n − 3.
For any set, S, define 1S : S → S by 1S (x) = x for every x ∈ S. In other words, 1S = {(x, x) | x ∈ S}.
The function 1S is called the identity function on S. So the computations above show f −1 ◦f = 1A
and f ◦ f −1 = 1B .
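A quick numerical sketch of the add 3 example, checked for illustration on a small sample of integers:

def f(n):           # the "add 3" function on the integers
    return n + 3

def f_inverse(n):   # its inverse, the "subtract 3" function
    return n - 3

sample = range(-5, 6)
print(all(f_inverse(f(n)) == n for n in sample))   # True: f^{-1} ∘ f acts as the identity here
print(all(f(f_inverse(n)) == n for n in sample))   # True: f ∘ f^{-1} acts as the identity here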
Now suppose that f is bijective, and let b ∈ B. Since f is onto, we have some a ∈ A with f (a) = b.
If e ∈ A with f (e) = b, then e = a since f is one-to-one. Thus b is the first entry in exactly one
ordered pair in the inverse relation f −1 . Whence, f −1 is a function.
Do not make the error of confusing inverses and reciprocals when dealing with functions. The reciprocal of f : Z → Z given by the formula f(n) = n + 3 is 1/f(n) = 1/(n + 3), which is not the inverse function for f. For example f(0) = 3, but the reciprocal of f does not convert 3 back into 0; instead the reciprocal associates 1/6 with 3. In fact, there are other problems with the reciprocal: it doesn't even make sense when n = −3 since that would give a division by 0, which is undefined.
So, be very careful when working with functions not to confuse the words reciprocal and inverse.
They are entirely different things.
The characteristic vector (see section 6) of a set may be used to define a special 0-1 function
representing the given set.
Example 11.8. Let U be a finite universal set with n elements ordered u1 , ..., un . Let Bn denote
all binary strings of length n. The characteristic function χ : P(U) → Bn , which takes a subset A
to its characteristic vector, is bijective. Thus there is no danger of miscalculation. We can either
manipulate subsets of U using set operations and then represent the result as a binary vector or we
can represent the subsets as binary vectors and manipulate the vectors with appropriate bit string
operations. We’ll get exactly the same answer either way.
The process in example 11.8 allows us therefore to translate any set theory problem with finite sets
into the world of 0’s and 1’s. This is the essence of computer science.
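As an illustration of example 11.8, the short Python sketch below (with a made-up five-element universe) converts subsets to characteristic vectors and confirms that the union of two sets matches the bitwise "or" of their vectors.

U = ['a', 'b', 'c', 'd', 'e']      # an ordered universal set u1, ..., un

def chi(A):
    # The characteristic vector of A as a string of 0's and 1's.
    return ''.join('1' if u in A else '0' for u in U)

A = {'a', 'c'}
B = {'c', 'd'}
print(chi(A), chi(B))              # 10100 00110

# Union of the sets versus bitwise "or" of their characteristic vectors:
or_of_vectors = ''.join('1' if x == '1' or y == '1' else '0'
                        for x, y in zip(chi(A), chi(B)))
print(chi(A | B) == or_of_vectors)   # True: the same answer either way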
Exercises
Exercise 11.1. Recall that R is the set of all real numbers. In each case, give an example of a
function f : R → R with the indicated properties, or explain why no such function exists.
Exercise 11.2. Let A = {1, 2, 3, 4, 5} and B = {a, b, c, d, e, f }. In each case, give an example of a
function f : A → B with the indicated properties, or explain why no such function exists.
(a) f : A → B, f is one-to-one.
(b) g : B → A, g is one-to-one.
(c) f : A → B, f is onto.
(d) g : B → A, g is onto.
Problems
Problem 11.1. Let A = {1, 2, 3, 4, 5, 6}. In each case, give an example of a function f : A → A
with the indicated properties, or explain why no such function exists.
Chapter 12
Special Functions
Certain functions arise frequently in discrete mathematics. Here is a catalog of some important
ones.
To begin with, the floor function is a function from R to Z which assigns to each real number
x, the largest integer which is less than or equal to x. We denote the floor function by ⌊x⌋. So
⌊x⌋ = n means n ∈ Z and n ≤ x < n + 1. For example, ⌊4.2⌋ = 4, and ⌊7⌋ = 7. Notice that for any
integer n, ⌊n⌋ = n. Be a little careful with negatives: ⌊π⌋ = 3, but ⌊−π⌋ = −4. A dual function is
denoted ⌈x⌉, where ⌈x⌉ = n means n ∈ Z and n ≥ x > n − 1. This is the ceiling function. For
example, ⌈4.2⌉ = 5 and ⌈−4.2⌉ = −4.
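For those who want to check such values by machine, the floor and ceiling functions are available in Python's standard math module; the short sketch below just reproduces the computations above.

import math

# Floor: largest integer <= x.  Ceiling: smallest integer >= x.
print(math.floor(4.2), math.floor(7), math.floor(math.pi), math.floor(-math.pi))
# prints: 4 7 3 -4
print(math.ceil(4.2), math.ceil(-4.2))
# prints: 5 -4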
The graph (in the college algebra sense!) of the floor function appears in figure 12.1.
The fractional part of a number x ≥ 0 is denoted frac(x) and equals x − ⌊x⌋. For numbers x ≥ 0,
the fractional part of x is just what would be expected: the stuff following the decimal point.
The definition above is the Mathematica and Wolfram/Alpha definition of the fractional part. The
Graham definition extends the domain: frac(x) = x − ⌊x⌋, for all x.
For example, frac(5.2) = 5.2 − ⌊5.2⌋ = 5.2 − 5 = 0.2. When x is negative, its fractional part is taken here to be frac(x) = x − ⌈x⌉.
For example, frac(−5.2) = −5.2 − ⌈−5.2⌉ = −5.2 − (−5) = −0.2. In plain English, to determine
the fractional part of a number x, take the stuff after the decimal point and keep the sign of the
number. The graph of the fractional part function is shown in figure 12.2.
The integral part of x is denoted by [x], or, sometimes, by int(x). In words, the integral part of x is
found by discarding everything following the decimal (at least if we agree not to end decimals with
an infinite string of 9’s such as 2.9999 · · · ). The graph of the integral part function is displayed in
figure 12.3.
The power functions are familiar from college algebra. They are functions of the form f (x) = x2 ,
f (x) = x3 , f (x) = x4 , and so on. By extension, f (x) = xa , where a is any constant greater than or
equal to 1 will be called a power function.
For any set X, the unit power function 1X (x) = x for all x ∈ X is called the identity function.
Exchanging the roles of the variable and the constant in the power functions leads to a whole class
of interesting functions, those of the form f : R → R, where f (x) = ax , and 0 < a. Such a function
f is called the base a exponential function. The function is not very interesting when a = 1.
Also if 0 < b < 1, then the function g(x) = b^x = 1/f(x), where f(x) = a^x, and a = 1/b > 1. So we
may focus on a > 1. In fact the most important values for a are 2, e and 10.
(Figure: the graphs of y = 2^x and y = log_2 x.)
The number e ≈ 2.718281828459... is called the natural base, but that story belongs to calculus.
Base 2 is the usual base for computer science. Engineers are most interested in base 10, while
mathematicians often use the natural exponential function, ex .
By graphing the function f : R → (0, ∞) defined by y = f (x) = ex we can see that it is bijective.
We denote the inverse function f −1 (x) by ln x and call it the natural log function. Since these
are inverse functions we have ln(e^x) = x for every real number x, and e^(ln x) = x for every x > 0.
The basic facts needed for manipulating exponential and logarithmic functions are the laws of
exponents.
Theorem 12.1 (Laws of Exponents). For a, b, c ∈ R, a^(b+c) = a^b · a^c, a^(bc) = (a^b)^c, and a^c b^c = (ab)^c.
Theorem 12.2 (Laws of Logarithms). For a, b, c > 0, log_a(bc) = log_a b + log_a c, and log_a(b^c) = c · log_a b.
Proof. We rely on the fact that all exponential and logarithmic functions are one-to-one. Hence,
we have that
a^(log_a bc) = bc = a^(log_a b) · a^(log_a c) = a^(log_a b + log_a c),
implies
log_a bc = log_a b + log_a c.
Similarly,
a^(log_a b^c) = b^c = (a^(log_a b))^c = a^(c log_a b).
Notice that log_a(1/b) = log_a(b^(−1)) = − log_a b.
Calculators typically have buttons for logs base e and base 10. If loga b is needed for a base different
from e and 10, it can be computed in a roundabout way. Suppose we need to find c = loga b. In
other words, we need the number c such that ac = b. Taking the ln of both sides of that equation
we get
a^c = b
ln(a^c) = ln b
c ln a = ln b
c = ln b / ln a
Example 12.4. For example, we see that log_2 100 = ln 100 / ln 2 ≈ 6.643856.
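This change of base computation is easy to verify with a computer. In Python, for instance, math.log(b) is the natural logarithm, so the quotient below reproduces log_2 100 (math.log also accepts an optional base as a second argument).

import math

# Change of base: log_a(b) = ln(b) / ln(a).
print(math.log(100) / math.log(2))   # about 6.643856...
print(math.log(100, 2))              # same value, using the optional base argument
print(2 ** 6.643856)                 # roughly 100, as a sanity check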
Exercises
Exercise 12.1. In words, ⌊x⌋ is the largest integer less than or equal to x. Complete the sen-
tence: In words, ⌈x⌉ is the smallest . . . . . . .
Exercise 12.6. Let f(x) = 4x^5 and let g(x) = 2^x. For values of x ≥ 1 it appears that the graph of
g is lower than the graph of f. Does g ever catch up with f, or does f always stay ahead of g?
Exercise 12.7. The x^y button on your calculator is broken. Show how you can approximate 2^√2.
Problems
Problem 12.1. Write (5/2) ln 5 − 4 ln 3 as a single logarithm.
Problem 12.2. Draw the college algebra style graph of f(x) = e^(x+3) − 1.
Problem 12.3. Let f(x) = 2x^3 and g(x) = 3^x. Notice that f(1) < g(1) and f(2) > g(2). Does g
ever catch up with f again, or does f always stay ahead of g?
Problem 12.4. Write log 1 + log 2 + log 3 + log 4 + log 5 + log 6 as a single logarithm.
Problem 12.5. Write ∑_{k=1}^{n} log n as a single logarithm.
Chapter 13
Sequences and Summation
A sequence is a list of numbers in a specific order. For example, the positive integers 1, 2, 3, · · · is
a sequence, as is the list 4, 3, 3, 5, 4, 4, 3, 5, 5, 4 of the number of letters in the English words of
the ten digits in order zero, one, · · · , nine. Actually, the first is an example of an infinite sequence,
the second is a finite sequence. The first sequence goes on forever; there is no last number. The
second sequence eventually comes to a stop. In fact the second sequence has only ten items. A
term of a sequence is one of the numbers that appears in the sequence. The first term is the first
number in the list, the second term is the second number in the list, and so on.
A more general way to think of a sequence is as a function from some subset of Z having a least
member (in most cases either { 0, 1, 2, · · · } or { 1, 2, · · · }) with codomain some arbitrary set.
Computer science texts use the former and elementary math application texts use the latter. Math-
ematicians use any such well-ordered domain set.
In most mathematics courses the codomain will be a set of numbers, but that isn’t necessary.
For example, consider the finite sequence of initial letters of the words in the previous paragraph:
a, s, i, a, l, o, n, · · · , a, s, o. If the letter L is used to denote the function that forms this sequence,
then L(1) = a, L(2) = s, and so on.
The examples of sequences given so far were described in words, but there are other ways to tell
what objects appear in the sequence. One way is with a formula. For example, let s(n) = n^2,
for n = 1, 2, 3, · · · . As the values 1, 2, 3 and so on are plugged into s(n) in succession, the infinite
sequence 1, 4, 9, 16, 25, 36, · · · is built up. It is traditional to write sn (or tn , etc) instead of s(n)
when describing the terms of a sequence, so the formula above would usually be seen as sn = n2 .
Read that as s sub n equals n2 . When written this way, the n in the sn is called a subscript or
index. The subscript of s173 is 173.
Example 13.1. What is the 50th term of the sequence defined by the formula s_j = (j + 1)/(j + 2), where
j = 1, 2, 3, . . .? We see that s_50 = 51/52.
Example 13.2. What is the 50th term of the sequence defined by the formula t_k = (k + 1)/(k + 2), where
k = 0, 1, 2, 3, . . .? Since the indices start at 0, the 50th term will be t_49: t_49 = 50/51.
A sequence can also be specified by listing an initial portion of the sequence, trusting the reader to
successfully perform the mind-reading trick of guessing how the sequence is to continue based on the
pattern suggested by those initial terms. For example, consider the sequence 7, 10, 13, 16, 19, 22, · · · .
The symbol · · · means and so on. In other words, you should be able to figure out the way the
sequence will continue. This method of specifying a sequence is dangerous of course. For instance,
the number of terms sufficient for one person to spot the pattern might not be enough for another
person. Also, maybe there are several different obvious ways to continue the pattern.
Example 13.3. What is the next term in the sequence 1, 3, 5, 7 · · · ? One possible answer is 9, since
it looks like we are listing the positive odd integers in increasing order. But another possible answer
is 8: maybe we are listing each positive integer with an e in its name. You can probably think of
other ways to continue the sequence.
In fact, for any finite list of initial terms, there are always infinitely many more or less natural ways
to continue the sequence. A reason can always be provided for absolutely any number to be the
next in the sequence. However, there will typically be only one or two obvious simple choices for
continuing a sequence after five or six terms.
The simple pattern suggested by the initial terms 7, 10, 13, 16, 19, 22, · · · is that the sequence begins
with a 7, and each term is produced by adding 3 to the previous term. This is an important type of
sequence. The general form is s1 = a (a is just some specific number), and, from the second term
on, each new term is produced by adding d to the previous term (where d is some fixed number).
In the last example, a = 7 and d = 3. A sequence of this form is called an arithmetic sequence.
The number d is called the common difference, which makes sense since d is the difference of
any two consecutive terms of the sequence. It is possible to write down a formula for sn in this
case. After all, to compute sn we start with the number a, and begin adding d’s to it. Adding
one d gives s2 = a + d, adding two d’s gives s3 = a + 2d, and so on. For sn we will add n − 1 d’s
to the a, and so we see sn = a + (n − 1)d. In the numerical example above, the 5th term of the
sequence ought to be s5 = 7 + 4 · 3 = 19, and sure enough it is. The 407th term of the sequence is
s407 = 7 + 406 · 3 = 1225.
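The formula s_n = a + (n − 1)d is easy to check by machine. The Python sketch below (the function name is invented for the illustration) builds terms of the sequence 7, 10, 13, · · · by repeatedly adding d and compares them with the formula.

a, d = 7, 3                      # initial term and common difference

def nth_term(n):
    """The nth term of the arithmetic sequence a, a+d, a+2d, ..."""
    return a + (n - 1) * d

# Build the first 407 terms by repeated addition and compare with the formula.
term = a
for n in range(1, 408):
    assert term == nth_term(n)
    term += d

print(nth_term(5), nth_term(407))   # prints: 19 1225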
Example 13.4. The 1st term of an arithmetic sequence is 11 and the 8th term is 81. What is
a formula for the nth term?
We know a1 = 11 and a8 = 81. Since a8 = a1 + 7d, where d is the common difference, we get the
equation 81 = 11 + 7d. So d = 10. We can now write down a formula for the terms of this sequence:
an = 11 + (n − 1)10 = 1 + 10n. Checking, we see this formula does give the required values for a1
and a8 .
For an arithmetic sequence we added the same quantity to get from one term of the sequence to
the next. If instead of adding we multiply each term by the same thing to produce the next term
the result is called a geometric sequence.
Example 13.5. Let s1 = 2, and suppose we multiply by 3 to get from one term to the next. The
sequence we build now looks like 2, 6, 18, 54, 162, · · · , each term being 3 times as large as the previous
term.
In general, if s1 = a, and, for n ≥ 1, each new term is r times the preceding term, then the formula
for the nth term of the sequence is sn = arn−1 , which is reasoned out just as for the formula for the
arithmetic sequence above. The quantity r in the geometric sequence is called the common ratio
since it is the ratio of any term in the sequence to its predecessor (assuming r ̸= 0 at any rate).
A sequence of numbers is an ordered list of numbers. A summation (or just sum) is a sequence
of numbers added up. A sum with n terms (that is, with n numbers added up) will be denoted by
Sn typically. Thus if we were dealing with sequence 1, 3, 5, 7, · · · , 2n − 1, · · · , then S3 = 1 + 3 + 5,
and Sn = 1 + 3 + 5 + · · · + (2n − 1). For the arithmetic sequence a, a + d, a + 2d, a + 3d, · · · , we see
Sn = a + (a + d) + (a + 2d) + · · · + (a + (n − 1)d).
It gets a little awkward writing out such extended sums and so a compact way to indicate a sum,
called summation notation, is introduced. For the sum of the first 3 odd positive integers above
we would write ∑_{j=1}^{3} (2j − 1). The Greek letter sigma (Σ) is supposed to be reminiscent of the word
summation. The j is called the index of summation and the number on the bottom of the Σ
specifies the starting value of j while the number above the Σ gives the ending value of j. The
idea is that we replace j in turn by 1, 2 and 3, in each case computing the value of the expression
following the Σ, and then add up the terms produced. In this example, when j = 1, 2j − 1 = 1,
when j = 2, 2j − 1 = 3 and finally, when j = 3, 2j − 1 = 5. We’ve reached the stopping value, so
we have ∑_{j=1}^{3} (2j − 1) = 1 + 3 + 5 = 9.
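Summation notation translates directly into a loop, or, in Python, a call to sum over a range of index values; the one-line sketch below reproduces the sum just computed. Note that range(1, 4) supplies the index values 1, 2, 3.

# The sum of (2j - 1) as j runs from 1 to 3.
total = sum(2 * j - 1 for j in range(1, 4))
print(total)   # prints 9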
Notice that the index of summation takes only integer values. If it starts at 6, then next it is
replaced by 7, and so on. If it starts at −11, then next it is replaced by −10, and then by −9, and
so on.
The symbol used for the index of summation does not have to be j. Other traditional choices for
the index of summation are i, k, m and n. So for example,
∑_{j=0}^{4} (j^2 + 2) = 2 + 3 + 6 + 11 + 18 = ∑_{i=0}^{4} (i^2 + 2) = ∑_{m=0}^{4} (m^2 + 2)
and so on. Even though a different index letter is used, the formulas produce the same sequence of
numbers to be added up in each case, so the sums are the same.
Also, the starting and ending points for the index can be changed without changing the value
of the sum provided care is taken to change the formula appropriately. Notice that
∑_{k=1}^{3} (3k − 1) = ∑_{k=0}^{2} (3k + 2)
since
∑_{k=1}^{3} (3k − 1) = 2 + 5 + 8
and
∑_{k=0}^{2} (3k + 2) = 2 + 5 + 8.
As another example, ∑_{m=−1}^{5} 2^m = 2^(−1) + 2^0 + 2^1 + 2^2 + 2^3 + 2^4 + 2^5 = 127/2.
There are two important formulas for finding sums that are worth remembering. The first is the
sum of the first n terms of an arithmetic sequence.
Sn = a + (a + d) + (a + 2d) + · · · + (a + (n − 1)d).
Here is a clever trick that can be used to find a simple formula for the quantity Sn : the list of
numbers is added up twice, once from left to right, the second time from right to left. When the
terms are paired up, it is clear the sum is 2Sn = n[a + (a + (n − 1)d)]. A diagram will make the
idea clearer:
  S_n =        a          +    (a + d)        + · · · + (a + (n − 1)d)
  S_n = (a + (n − 1)d)    + (a + (n − 2)d)    + · · · +        a
 2S_n = (2a + (n − 1)d)   + (2a + (n − 1)d)   + · · · + (2a + (n − 1)d)
The bottom row contains n terms, each equal to 2a + (n − 1)d, and so 2S_n = n[2a + (n − 1)d].
Dividing by 2 gives the important formula, for n = 1, 2, 3, . . .,
S_n = n · (2a + (n − 1)d)/2 = n · (a + (a + (n − 1)d))/2.    (13.1)
An easy way to remember the formula is to think of the quantity in the parentheses as the average
of the first and last terms to be added, and the coefficient, n, as the number of terms to be added.
Example 13.8. The sum of the first 20 terms of the arithmetic sequence 5, 9, 13, · · · is found to be
S_20 = 20 · (5 + 81)/2 = 860.
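Formula (13.1) can also be tested against a brute force sum. The Python sketch below (the helper names are invented for the illustration) checks the S_20 = 860 value from Example 13.8 and many other cases.

def arithmetic_sum(a, d, n):
    """Sum of the first n terms of a, a+d, a+2d, ... using formula (13.1)."""
    return n * (2 * a + (n - 1) * d) // 2

def brute_force_sum(a, d, n):
    """The same sum computed by actually adding up the n terms."""
    return sum(a + k * d for k in range(n))

print(arithmetic_sum(5, 4, 20))                 # prints 860, as in Example 13.8
for n in range(1, 50):
    assert arithmetic_sum(5, 4, n) == brute_force_sum(5, 4, n)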
For a geometric sequence, a little algebra produces a formula for the sum of the first n terms of the
sequence. The resulting formula for S_n = a + ar + ar^2 + · · · + ar^(n−1) is
S_n = (a − ar^n)/(1 − r) = a · (1 − r^n)/(1 − r), if r ≠ 1.
Example 13.9. The sum of the first ten terms of the geometric sequence 2, 2/3, 2/9, · · · would be
S_10 = (2 − 2(1/3)^10)/(1 − 1/3) = 2 · (1 − (1/3)^10)/(1 − 1/3) = 2 · (1 − (1/3)^10)/(2/3) = 3(1 − 1/3^10) = 3 − 1/3^9.
Notice that the numerator in this case is the difference of the first term we have to add in and the
term immediately following the last term we have to add in.
Here is the algebra that shows the geometric sum formula is correct. Write S_n = a + ar + ar^2 + · · · + ar^(n−1). Multiplying both sides by r gives rS_n = ar + ar^2 + · · · + ar^n. Subtracting the second equation from the first, S_n − rS_n = a − ar^n, so (1 − r)S_n = a − ar^n, and therefore
S_n = (a − ar^n)/(1 − r) = a · (1 − r^n)/(1 − r), if r ≠ 1.    (13.2)
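Formula (13.2) is just as easy to test numerically. The sketch below, again in Python, compares the formula with a term by term sum, using exact fractions so that a ratio like 1/3 causes no rounding trouble.

from fractions import Fraction

def geometric_sum(a, r, n):
    """Sum of the first n terms a + ar + ... + ar**(n-1), formula (13.2), r != 1."""
    return a * (1 - r**n) / (1 - r)

def brute_force_sum(a, r, n):
    return sum(a * r**k for k in range(n))

a, r = Fraction(2), Fraction(1, 3)
print(geometric_sum(a, r, 10))              # 59048/19683, which equals 3 - 1/3**9
assert geometric_sum(a, r, 10) == brute_force_sum(a, r, 10)
assert geometric_sum(a, r, 10) == 3 - Fraction(1, 3**9)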
Exercises
Exercise 13.1. Guess the next term in the sequence 1, 2, 4, 5, 7, 8, · · · . What’s another possible
answer?
Exercise 13.2. What is the 100th term of the arithmetic sequence with initial term 2 and common
difference 6?
Exercise 13.3. The 10th term of an arithmetic sequence is −4 and the 16th term is 47. What is
the 11th term?
Exercise 13.4. What is the 5th term of the geometric sequence with initial term 6 and common
ratio 2?
Exercise 13.5. The first two terms of a geometric sequence are g1 = 5 and g2 = −11. What is g5 ?
Exercise 13.6. Which sequences are both a geometric sequence and also an arithmetic sequence?
Exercise 13.7. Evaluate ∑_{j=1}^{4} (j^2 + 1).
Exercise 13.8. Evaluate ∑_{k=−2}^{4} (2k − 3).
Exercise 13.9. What is the sum of the first 100 terms of the arithmetic sequence with initial term
2 and common difference 6?
Exercise 13.10. What is the sum of the first five terms of the geometric sequence with initial term
6 and common ratio 2?
Exercise 13.11. Evaluate ∑_{i=0}^{4} (−2/3)^i.
Exercise 13.12. Express in summation notation: 1/2 + 1/4 + 1/6 + · · · + 1/(2n), the sum of the reciprocals
of the first n even positive integers.
Problems
Problem 13.1. Guess the next term in the sequence 1, 3, 5, 7, 8, 9 · · · . What’s another possible
answer?
Problem 13.4. What is the 20th term of the arithmetic sequence with initial term 4 and common
difference 5?
Problem 13.5. The 8th term of an arithmetic sequence is 20 and the 12th term is 40. What is the
25th term?
Problem 13.6. What is the 7th term of the geometric sequence with initial term 3 and common
ratio 4?
Problem 13.7. Two terms of a geometric sequence are g3 = 2 and g5 = 72. There are two possible
values for g4 . What are those two values?
Problem 13.8. A geometric sequence has initial term 3, and common ratio 7. Determine the
smallest value of n so that the nth term of the sequence is more than one million.
Problem 13.9. Evaluate ∑_{j=1}^{4} (j + 1)^2.
Problem 13.10. Evaluate ∑_{k=−2}^{4} (2k + 3).
Problem 13.11. What is the sum of the first 100 terms of the arithmetic sequence with initial
term 2 and common difference 6?
Problem 13.12. What is the sum of the first four terms of the geometric sequence with initial
term 3 and common ratio −2?
Problem 13.13. What is the sum of the first four thousand terms of the geometric sequence with
initial term 3 and common ratio −1?
Problem 13.14. Evaluate ∑_{i=0}^{4} (2/3)^i.
Problem 13.15. Express in summation notation: 1/1 + 1/3 + 1/5 + · · · + 1/(2n − 1), the sum of the
reciprocals of the first n odd positive integers.
Chapter 14
Recursively Defined Sequences
Example 14.1. For example, suppose b1 = 1, and for n > 1, bn = 2bn−1 . Then the 1st term of the
sequence will be b1 = 1 of course. To determine b2 , we apply the rule b2 = 2b2−1 = 2b1 = 2 · 1 = 2.
Next, applying the rule again, b3 = 2b3−1 = 2b2 = 2 · 2 = 4. Next b4 = 2b3 = 8. Continuing in this
fashion, we can form as many terms of the sequence as we wish: 1, 2, 4, 8, 16, 32, · · · . In this case,
it is easy to guess a formula for the terms of the sequence: b_n = 2^(n−1).
In general, to define a sequence recursively, (1) we first give one or more initial terms (this informa-
tion is called the initial condition(s) for the sequence), and then (2) we give a rule for forming
new terms from previous terms (this rule is called the recursive formula).
Example 14.2. Suppose a_1 = 0, and, for n > 1, a_n = 2a_(n−1) + 1. Then the terms of the sequence are
0, 2 · 0 + 1 = 1, 2 · 1 + 1 = 3, 2 · 3 + 1 = 7, 2 · 7 + 1 = 15, · · ·
In words, we can describe this sequence by saying the initial term is 0 and each new term is one
more than twice the previous term. Again, it is easy to guess a formula that produces the terms of
this sequence: a_n = 2^(n−1) − 1.
Such a formula for the terms of a sequence is called a closed form formula to distinguish it from
a recursive formula.
There is one big advantage to knowing a closed form formula for a sequence. In example 14.2 above,
the closed form formula for the sequence tells us immediately that a_101 = 2^100 − 1, but using the
recursive formula to calculate a101 means we have to calculate in turn a1 , a2 , · · · a100 , making 100
computations. The closed form formula allows us to jump directly to the term we are interested in.
The recursive formula forces us to compute 99 additional terms we don’t care about in order to get
to the one we want. With such a major drawback why even introduce recursively defined sequences
at all? The answer is that there are many naturally occurring sequences that have simple recursive
definitions but have no reasonable closed form formula, or even no closed form formula at all in
terms of familiar operations. In such cases, a recursive definition is better than nothing.
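The difference between the two kinds of formulas is easy to see on a computer. The Python sketch below computes a_101 for the sequence of example 14.2 both ways: the recursive route must grind through every earlier term, while the closed form is a single expression. The function names are invented for the illustration.

def a_recursive(n):
    """Compute a_n from the recursive definition a_1 = 0, a_n = 2*a_(n-1) + 1."""
    term = 0                      # a_1
    for _ in range(n - 1):        # apply the recursive rule n - 1 times
        term = 2 * term + 1
    return term

def a_closed(n):
    """The guessed closed form a_n = 2**(n-1) - 1."""
    return 2 ** (n - 1) - 1

print(a_recursive(101) == a_closed(101))   # True
print(a_closed(101))                       # 2**100 - 1, a 31-digit number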
There are methods for determining closed form formulas for some special types of recursively defined
sequences. Such techniques are studied later in chapter 35. For now we are only interested in
understanding recursive definitions, and determining some closed form formulas by the method of
pattern recognition (aka guessing).
The most famous recursively defined sequence is due to Fibonacci. There are two initial conditions:
f0 = 0 and f1 = 1. The index starts at zero, by tradition. The recursive rule is, for n ≥ 2,
fn = fn−1 + fn−2 . In words, each new term is the sum of the two terms that precede it. So, the
Fibonacci sequence begins 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, · · · .
There is a closed form formula for the Fibonacci Sequence, but it is not at all easy to guess:
f_n = (1/√5) ((1 + √5)/2)^n − (1/√5) ((1 − √5)/2)^n
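The closed form can be compared with the recursive definition numerically. In the Python sketch below the closed form is evaluated in floating point and rounded to the nearest integer; for modest values of n it agrees exactly with the recursively computed Fibonacci numbers.

import math

def fib_recursive(n):
    """f_0 = 0, f_1 = 1, and f_n = f_(n-1) + f_(n-2), computed iteratively."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_binet(n):
    """The closed form, evaluated in floating point and rounded."""
    sqrt5 = math.sqrt(5)
    phi, psi = (1 + sqrt5) / 2, (1 - sqrt5) / 2
    return round((phi**n - psi**n) / sqrt5)

print([fib_recursive(n) for n in range(10)])   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
assert all(fib_recursive(n) == fib_binet(n) for n in range(40))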
For a positive integer n, the symbol n! is read n factorial and it is defined to be the product of all
the positive integers from 1 to n. For example, 5! = 1 · 2 · 3 · 4 · 5 = 120. In order to make many
formulas work out nicely, the value of 0! is defined to be 1.
A recursive formula can be given for n!. The initial term is 0! = 1, and the recursive rule is, for
n ≥ 1, n! = n[(n − 1)!]. Hence, the first few factorial values are:
1! = 1[0!] = 1 · 1 = 1,
2! = 2[1!] = 2 · 1 = 2,
3! = 3[2!] = 3 · 2 = 6,
4! = 4[3!] = 4 · 6 = 24,
..
.
n! = 1 · 2 · 3 · 4 · · · n, for n > 0.
The sequence of factorials grows very quickly. Here are the first few terms:
1, 2, 6, 24, 120, 720, 5040, 40320, 362880, 3628800, 39916800, 479001600, 6227020800, · · ·
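The recursive definition of the factorial translates almost word for word into a recursive function. Here is a small Python sketch; the assertion checks the values listed above.

def factorial(n):
    """0! = 1, and n! = n * (n-1)! for n >= 1."""
    if n == 0:
        return 1
    return n * factorial(n - 1)

assert [factorial(n) for n in range(6)] == [1, 1, 2, 6, 24, 120]
print(factorial(13))   # 6227020800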
Consider the terms of an arithmetic sequence with initial term a and common difference d: a, a + d, a + 2d, a + 3d, · · · .
These terms may clearly be found by adding d to the current term to get the next. That is, the
arithmetic sequence may be defined recursively as (1) a1 = a, and (2) for n ≥ 2, an = an−1 + d.
Exercises
Exercise 14.1. List the first five terms of the sequence defined recursively by a1 = 3, and, for
n ≥ 2, an = an−1 (2 + an−1 ).
Exercise 14.2. List the first seven terms of the sequence defined recursively by a0 = 1, a1 = 1,
and, for n ≥ 2, an = 1 + an−1 an−2 .
Exercise 14.3. List the first ten terms of the sequence defined recursively by a0 = 1, and, for
n ≥ 1, a_n = 1 + a_⌊n/2⌋.
Exercise 14.4. List the first ten terms of the sequence defined recursively by a0 = 1, and for n ≥ 1,
an = 2n − an−1 − 1, and guess a closed form formula for an .
Exercise 14.5. There is an easy recursive rule for building the terms of this sequence. Guess the next term.
Exercise 14.6. Let d be a fixed real number. For a positive integer n, the symbol nd means the sum
of n d’s. Give a recursive definition of nd analogous to the definition of n! given in this chapter.
Problems
Problem 14.1. List the first five terms of the sequence defined recursively by a1 = 2, and, for
n ≥ 2, a_n = (a_(n−1))^2 − 1.
Problem 14.2. List the first five terms of the sequence defined recursively by a1 = 2, and, for
n ≥ 2, a_n = 3a_(n−1) + 2. Guess a closed form formula for the sequence. Hint: This is a lot like
example 14.2.
Problem 14.3. List the first five terms of the sequence with initial terms u0 = 2 and u1 = 5, and,
for n ≥ 2, un = 5un−1 − 6un−2 . Guess a closed form formula for the sequence. Hint: The terms
are simple combinations of powers of 2 and powers of 3.
Problem 14.4. Let r be a fixed real number different from 0. For a positive integer n, the symbol
rn means the product of n r’s. For convenience, r0 is defined to be 1. Give a recursive definition
of rn analogous to the definition of n! given in this chapter.
Problem 14.5. Give a recursive definition of the geometric sequence with initial term 3 and com-
mon ratio 2.
Problem 14.6. Generalize problem 5: give a recursive definition of the geometric sequence with
initial term a and common ratio r.
Chapter 15
Recursively Defined Sets
Two different ways of defining a set have been discussed. We can describe a set by the roster
method, listing all the elements that are to be members of the set, or we can describe a set using
set-builder notation by giving a predicate that the elements of the set are to satisfy. Here we
consider defining sets in another natural way: recursion.
Recursive definitions can also be used to build sets of objects. The spirit is the same as for recursively
defined sequences: give some initial conditions and a rule for building new objects from ones already
known.
Example 15.1. For instance, here is a way to recursively define the set of positive even integers,
E. First the initial condition: 2 ∈ E. Next the recursive portion of the definition: If x ∈ E, then
x + 2 ∈ E. Here is what we can deduce using these two rules. First of course, we see 2 ∈ E
since that is the given initial condition. Next, since we know 2 ∈ E, the recursive portion of the
definition, with x being played by 2, says 2 + 2 ∈ E, so that now we know 4 ∈ E. Since 4 ∈ E, the
recursive portion of the definition, with x now being played by 4, says 4 + 2 ∈ E, so that now we
know 6 ∈ E. Continuing in this way, it gets easy to believe that E really is the set of positive even
integers.
Actually, there is a little more to do with example 15.1. The claim is that E consists of exactly all
the positive even integers. In other words, we also need to make sure that no other things appear
in E besides the positive even integers. Could 312211 somehow have slithered into the set E? To
verify that such a thing does not happen, we need one more fact about recursively defined sets.
The only elements that appear in a set defined recursively are those that make it on the basis of
either the initial condition or the recursive portion of the definition. No elements of the set appear,
as if by magic, from nowhere.
In this case, it is easy to see that no odd integers sneak into the set. For if so, there would be a
smallest odd integer in the set and the only way it could be elected to the set is if the integer two
less than it were in the set. But that would mean a yet smaller odd integer would be in the set, a
contradiction. We won’t go into that sort of detail for the following examples in general. We’ll just
consider the topic at the intuitive level only.
Example 15.2. Give a recursive definition of the set, S, of all non-negative integer powers of 2.
Initial condition: 1 ∈ S. Recursive rule: If x ∈ S, then 2x ∈ S. Applying the initial condition and
then the recursive rule repeatedly gives the elements 1, 2, 4, 8, 16, 32,
and so on, and that looks like the set of nonnegative powers of 2.
Example 15.3. Describe the set, S, of integers defined recursively by the initial conditions 1 ∈ S and 2 ∈ S, and the recursive rule: if x ∈ S, then x + 3 ∈ S.
The plan is to use the initial conditions and the recursive rule to build elements of S until we can
guess a description of the integers in S. From the initial conditions we know 1 ∈ S and 2 ∈ S.
Applying the recursive rule to each of those we get 4, 5 ∈ S, and using the recursive rule on those
gives 7, 8 ∈ S, and so on.
So we get S = {1, 2, 4, 5, 7, 8, 10, 11, · · · } and it’s apparent that S consists of the positive integers
that are not multiples of 3.
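A recursively defined set can be explored by machine: start with the initial elements and keep applying the recursive rule until no new elements appear below some chosen bound. The Python sketch below (the bound 30 and the helper names are arbitrary choices for the illustration) does this for the set just described, with initial elements 1 and 2 and the rule x ↦ x + 3, and confirms that up to 30 it contains exactly the positive integers that are not multiples of 3.

def generate(initial, rule, bound):
    """All elements <= bound of the set defined by the initial elements and the
    recursive rule (a function producing a new element from an old one)."""
    S = set(initial)
    frontier = list(initial)
    while frontier:
        x = frontier.pop()
        y = rule(x)
        if y <= bound and y not in S:
            S.add(y)
            frontier.append(y)
    return S

S = generate({1, 2}, lambda x: x + 3, 30)
print(sorted(S))
assert S == {n for n in range(1, 31) if n % 3 != 0}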
Recursively defined sets appear in certain computer science courses where they are used to de-
scribe sets of strings. To form a string, we begin with an alphabet which is a set of symbols,
traditionally denoted by Σ. For example Σ = {a, b, c} is an alphabet of three symbols, and
Σ = {!, @, #, $, %, &, X, 5} is an alphabet of eight symbols. A string over the alphabet Σ is
any finite sequence of symbols from the alphabet. For example aaba is a string of length four over
the alphabet Σ = {a, b, c}, and !!5X$$5@@ is a length nine string over Σ = {!, @, #, $, %, &, X, 5}.
There is a special string over any alphabet denoted by λ called the empty string. It contains no
symbols, and has length 0.
Example 15.4. A set, S, of strings over the alphabet Σ = {a, b} is given recursively by (1) λ ∈ S,
and (2) If x ∈ S, then axb ∈ S. Describe the strings in S.
The notation axb means write down the string a followed by the string x followed by the string b.
So if x = aaba then axb = aaabab. Let’s experiment with the recursive rule a bit, and then guess
a description for the strings in S. Starting with the initial condition we see λ ∈ S. Applying the
recursive rule to λ gives aλb = ab ∈ S. Applying the recursive rule to ab gives aabb ∈ S, and
applying the recursive rule to aabb shows aaabbb ∈ S. It’s easy to guess the nature of the strings in
S: Any finite string of a’s followed by the same number of b’s.
Example 15.5. Give a recursive definition of the set S of strings over Σ = {a, b, c} which do not
contain adjacent a’s. For example ccabbbabba is acceptable, but abcbaabaca is not.
For the initial conditions we will use (1) λ ∈ S, and a ∈ S. If we have a string with no adjacent a’s,
we can extend it by adding b or c to either end. But we’ll need to be careful when adding more a’s.
For the recursive rule we will use (2) if x ∈ S, then bx, xb, cx, xc ∈ S and abx, xba, acx, xca ∈ S.
Notice how the string a had to be put into S in the initial conditions since the recursive rule won’t
allow us to form that string from λ.
Here is a different answer to the same question. It’s a little harder to dream up, but the rules are
much cleaner. The idea is that if we take two strings with no adjacent a’s, we can put them together
and be sure to get a new string with no adjacent a’s provided we stick either b or c between them.
So, we can define the set recursively by (1) λ ∈ S and a ∈ S, and (2) if x, y ∈ S, then xby, xcy ∈ S.
Example 15.6. Give a recursive definition of the set S of strings over Σ = {a, b} which contain
more a’s than b’s.
The idea is that we can build longer strings from smaller ones by (1) sticking two such strings
together, or (2) sticking two such strings together along with a b before the first one, between the
two strings, or after the last one. That leads to the following recursive definition: (1) a ∈ S and
(2) if x, y ∈ S then xy, bxy, xby, xyb ∈ S. That looks a little weird since in the recursive rule we
added b, but since x and y each have more a’s than b’s, the two together will have at least two more
a’s than b’s, so it’s safe to add b in the recursive rule.
Starting with the initial condition, and then applying the recursive rule repeatedly, we form the
following elements of S:
Example 15.7. A set, S, of strings over the alphabet Σ = {a, b} is defined recursively by the rules
(1) a ∈ S, and (2) if x ∈ S, then xbx ∈ S. Describe the strings in S.
Building a few elements using the rule gives a, aba, abababa, abababababababa, · · · . It looks like S is the set of strings beginning with a followed by a certain number of ba’s. If we look
at the number of ba’s in each string, we can see a pattern: 0, 1, 3, 7, 15, 31, · · · , which we recognize
as being the numbers that are one less than the powers of 2 (1, 2, 4, 8, 16, 32, · · · ). So
it appears S is the set of strings consisting of a followed by 2^n − 1 copies of ba for some integer
n ≥ 0.
Exercises
Exercise 15.1. The set S is described recursively by (1) 1 ∈ S, and (2) if n ∈ S, then n + 1 ∈ S.
To what familiar set is S equal?
Exercise 15.2. Give a recursive definition of the set of positive integers that end with the digits
17.
Exercise 15.3. Give a recursive definition of the set of positive integers that are not multiples of
4.
Exercise 15.4. Describe the strings in the set S of strings over the alphabet Σ = {a, b, c} defined
recursively by (1) λ ∈ S and (2) if x ∈ S, then axbc ∈ S.
Exercise 15.5. Describe the strings in the set S of strings over the alphabet Σ = {a, b, c} defined
recursively by (1) c ∈ S and (2) if x ∈ S then ax ∈ S and bx ∈ S and xc ∈ S.
Exercise 15.6. A palindrome is a string that reads the same in both directions. For example, a
classic palindrome with length 21 is: A man, a plan, a canal: panama. For another example, aabaa
is a palindrome of length five and babccbab is a palindrome of length eight. The empty string is also
a palindrome. Give a recursive definition of the set of palindromes over the alphabet Σ = {a, b, c}.
Problems
Problem 15.1. A set S of integers is defined recursively by the rules: (1) 1 ∈ S, and (2) If n ∈ S,
then 2n + 1 ∈ S.
{3^n − 3 | n a positive integer} = {0, 6, 24, 78, 240, 726, 2184, . . .}.
Problem 15.4. A set, S, of strings over the alphabet Σ = {a, b, c} is defined recursively by (1)
a ∈ S and (2) if x ∈ S then bxc ∈ S. List all the strings in S of length seven or less.
Problem 15.5. A set, S, of positive integers is defined recursively by the rule:
(1) 1 ∈ S, and (2) If n ∈ S, then 2n − 1 ∈ S. List all the elements in the set S.
Problem 15.6. Give a recursive definition of the set of positive integers that end with the digit 1.
Problem 15.7. Give a recursive definition of the set of strings over the alphabet Σ = {a, b, c} of
the form aaa · · · abccc · · · c. More carefully: zero or more a’s followed by a single b followed by the
same number of c’s as a’s.
Problem 15.8. Describe the strings in the set S of strings over the alphabet Σ = {a, b, c} defined
recursively by (1) a ∈ S and (2) if x ∈ S then ax ∈ S and xb ∈ S and xc ∈ S.
Hint: Your description should be a sentence that provides an easy test to check if a given string is
in the set or not. An example of such a description is: S consists of all strings of a’s, b’s, and c’s,
with more a’s than b’s. That isn’t a correct description since abb is in S and doesn’t have more a’s
than b’s, and also baac isn’t in S, but does have more a’s than b’s. So that attempted description is
really terrible. One way to do this problem is to use the rules to build a bunch of strings in S until
a suitable description becomes obvious. Alternatively, just thinking about the recursive rules might
be sufficient for you to see a simple description of the strings in S.
Problem 15.9. A set S of ordered pairs of integers is defined recursively by (1) (1, 1) ∈ S, and
(2) if (m, n) ∈ S, then (m + 2, n) ∈ S, and (m, n + 2) ∈ S, and (m + 1, n + 1) ∈ S. Give a simple
description of the ordered pairs in S.
Chapter 16
Mathematical Induction
As mentioned earlier, to show that a proposition of the form ∀ x P (x) is true, it is necessary to
check that P (c) is true for every possible choice of c in the domain of discourse. If that domain is
not too big, it is feasible to check the truth of each P (c) one by one. For instance, consider the
proposition For every page in these notes, the letter e appears at least once on the page. To express
the proposition in symbolic form we would let the domain of discourse be the set of pages in these
notes, and we would let the predicate E be has an occurrence of the letter e, so the proposition
becomes ∀ p E(p). The truth value of this proposition can be determined by the tedious but feasible
task of checking every page of the notes for an e. If a single page is found with no e’s, that
page would constitute a counterexample to the proposition, and the proposition would be false.
Otherwise it is true.
When the domain of discourse is a finite set, it is, in principle, always possible to check the truth
of a proposition of the form ∀ x P (x) by checking the members of the domain of discourse one by
one. But that option is no longer available if the domain of discourse is an infinite set since no
matter how quickly the checks are made there is no practical way to complete the checks in a finite
amount of time.
For example, consider the proposition For every natural number n, n^5 − n ends with a 0. Here
the domain of discourse is the set N = { 0, 1, 2, 3, · · · }. The truth of the proposition could be
established by checking:
0^5 − 0 = 0        1^5 − 1 = 0        2^5 − 2 = 30         3^5 − 3 = 240
4^5 − 4 = 1020     5^5 − 5 = 3120     6^5 − 6 = 7770       7^5 − 7 = 16800
8^5 − 8 = 32760    9^5 − 9 = 59040    10^5 − 10 = 99990    11^5 − 11 = 161040
(and so on forever.)
Checking these facts one by one is obviously a hopeless task, and, of course, just checking a few
of them (or even a few billion of them) will never suffice to prove they are all true. And it is not
sufficient to check a few and say that the facts are all clear. That’s not a proof, it’s only a suspicion.
So verifying the truth of ∀ n (n^5 − n ends with a 0) for domain of discourse N seems tough.
In general, proving a universally quantified statement when the domain of discourse is an infinite
set is a tough nut to crack. But, in the special case when the domain of discourse is the set
N = { 0, 1, 2, 3, · · · }, there is a technique called mathematical induction that comes to the
rescue.
The method of proof by induction provides a way of checking that all the statements in the list are
true without actually verifying them one at a time. The process is carried out in two steps. First
(the basis step) we check that the first statement in the list is correct. Next (the inductive step),
we show that if any statement in the list is known to be correct, then the one following must also
be correct. Putting these two facts together, it ought to appear reasonable that all the statements
in the list are correct. In a way, it’s pretty amazing: we learn infinitely many statements are true
just by checking two facts. It’s like killing infinitely many birds with two stones.
So, suppose a list of statements, p(0), p(1), p(2), · · · , p(k), p(k + 1) · · · is presented and we want to
show they are all true. The plan is to show two facts: (1) the first statement in the list is true, and (2) whenever one statement in the list is true, so is the statement that follows it.
The well ordering property of the positive integers provides the justification for proof by induc-
tion. This property asserts that every non-empty subset of the natural numbers contains a smallest
number. In fact, given any nonempty set of natural numbers, we can determine the smallest number
in the set by the process of checking to see, in turn, if 0 is in the set, and, if the answer is no,
checking for 1, then for 2, and so on. Since the set is nonempty, eventually the answer will be yes,
that number is in the set, and in that way, the smallest natural number in the set will have been
found. Now let’s look at the proof that induction is a valid form of proof. The statement of the
theorem is a little more general than described above. Instead of beginning with a statement p(0),
we allow the list to begin with a statement p(k) for some integer k (almost always, k = 0 or k = 1
in practice). This does not have any effect of the concept of induction. In all cases, we have a list
of statements, and we show the first statement is true, and then we show that if any statement is
true, so is the next one. The particular name for the starting point of the list doesn’t really matter.
It only matters that there is a starting point.
The two facts to be established are (1) p(k) is true, and (2) for every n ≥ k, if p(n) is true then p(n + 1) is true.
Suppose that (1) and (2) are true, but that it is not the case that p(n) is true for all n ≥ k. Let
S = {n | n ≥ k and p(n) is false}, so that S ≠ ∅. Since S is a non-empty set of integers ≥ k it has a
least element, say t. So t is the smallest integer ≥ k for which p(t) is false. In the ever colorful
jargon of mathematics, t is usually called the minimal criminal. Since p(k) is true by (1), t ≠ k, so
t − 1 ≥ k, and, because t is the least element of S, p(t − 1) is true. But then (2) says p(t) is true,
a contradiction. So S must be empty after all, and p(n) is true for all n ≥ k.
Many people find proofs by induction a little bit black-magical at first, but just keep the goals in
mind (namely check [1] the first statement in the list is true, and [2] that if any statement in the
list is true, so is the one that follows it) and the process won’t seem so confusing.
A handy way of viewing mathematical induction is to compare proving the sequence p(k)∧p(k +1)∧
p(k + 2) ∧ ... ∧ p(m) ∧ ... to knocking down a set of dominos set on edge and numbered consecutively
k, k + 1, ..... If we want to knock all of the dominos down, which are numbered k and greater, then
we must knock the kth domino down, and ensure that the spacing of the dominos is such that every
domino will knock down its successor. If either the spacing is off (∃m ≥ k with p(m) not implying
p(m + 1)), or if we fail to knock down the kth domino (we do not demonstrate that p(k) is true),
then there may be dominoes left standing.
When checking the inductive step, p(n) → p(n + 1), the statement p(n) is called the inductive
hypothesis.
To discover how to prove the inductive step most people start by explicitly listing several of the
first instances of the inductive hypothesis p(n). Then, look for how to make, in a general way, an
argument from one, or more, instances to the next instance of the hypothesis. Once an argument
is discovered that allows us to advance from the truth of previous one, or more, instances, that
argument, in general form, becomes the pattern for the proof on the inductive hypothesis. Let’s
examine an example.
Example 16.2. Let’s prove that, for each positive integer n, the sum of the first n positive integers
is n(n + 1)/2. Here is the list of statements we want to verify:
p(1) : 1 = 1(1 + 1)/2                           (To get p(2) add 2 to both sides.)
p(2) : 1 + 2 = 2(2 + 1)/2                       (To get p(3) add 3 to both sides.)
p(3) : 1 + 2 + 3 = 3(3 + 1)/2                   (You will need to simplify each step.)
...
p(n) : 1 + 2 + · · · + n = n(n + 1)/2
p(n + 1) : 1 + 2 + · · · + (n + 1) = (n + 1)(n + 2)/2
...
Once you figure out the general form of the argument that takes us from one instance of p(·) to the
next, you have the form of the inductive argument.
Basis: p(1) says 1 = 1(1 + 1)/2, which is clearly true.
Inductive Step: Suppose p(k) is true for some integer k ≥ 1. To be as precise as possible we should
suppose 1 + 2 + · · · + k = k(k + 1)/2 is true for some integer k ≥ 1.
(We need to show p(k + 1) is true. In other words, we need to verify 1 + 2 + · · · + (k + 1) = (k + 1)(k + 2)/2.
But we don’t write this down since it would be assuming the conclusion.)
Here are the computations which provide justification of the inductive step:
1 + 2 + · · · + (k + 1) = 1 + 2 + · · · + k + (k + 1)
                        = k(k + 1)/2 + (k + 1)              (using the inductive hypothesis)
                        = k(k + 1)/2 + 2(k + 1)/2
                        = (k(k + 1) + 2(k + 1))/2
                        = (k + 1)(k + 2)/2
as we needed to show. So we conclude all the statements in the list are true.
Notice that in the previous proof we used the following strategy to prove equality: start on one side
of the equation, p(n + 1) in this case, and work until we obtain the other side. We did this through
a series of algebraic manipulations and using the induction hypothesis along the way. This will be
our general strategy when writing induction proofs.
The next example reproves the useful formula for the sum of the terms in a geometric sequence.
Recall that to form a geometric sequence, fix a real number r ̸= 1, and list the integer powers of r
starting with r^0 = 1: 1, r, r^2, r^3, · · · , r^n, · · · . The formula given in the next example shows the result
of adding 1 + r + r^2 + · · · + r^n. You may be familiar with the extension from calculus which allows us
to sum a + a·r + a·r^2 + · · · + a·r^n. For a finite sum the previous sum is simply a(1 + r + r^2 + · · · + r^n).
So the formula derived is sufficient for our purposes.
Example 16.3. For all n ≥ 0, we have ∑_{k=0}^{n} r^k = (r^(n+1) − 1)/(r − 1), (if r ≠ 1).
Proof. We assume r ≠ 1.
Basis: When n = 0 we have ∑_{k=0}^{0} r^k = r^0 = 1. We also have (r^(0+1) − 1)/(r − 1) = (r − 1)/(r − 1) = 1.
Inductive Step: Now suppose that ∑_{k=0}^{m} r^k = (r^(m+1) − 1)/(r − 1) is true for some m ≥ 0. Then, we see that
∑_{k=0}^{m+1} r^k = (∑_{k=0}^{m} r^k) + r^(m+1)                      (by the recursive definition of a sum)
                  = (r^(m+1) − 1)/(r − 1) + r^(m+1)                  (by the induction hypothesis)
                  = (r^(m+1) − 1)/(r − 1) + (r^(m+2) − r^(m+1))/(r − 1)
                  = (r^(m+1) − 1 + r^(m+2) − r^(m+1))/(r − 1)
                  = (r^(m+2) − 1)/(r − 1).
Example 16.4. Prove that for every integer n ≥ 2, 2^n > n + 1.
Proof. Basis: When n = 2, the inequality to check is 2^2 > 2 + 1, and that is correct.
Inductive Step: Now suppose that 2^n > n + 1 for some integer n ≥ 2. Then 2^(n+1) = 2 · 2^n >
2(n + 1) = 2n + 2 > n + 2, as we needed to show.
Example 16.5. Of historical interest is the fact that one can show that using only 5¢ stamps and
9¢ stamps, any postage amount 32¢ or greater can be formed.
To re-phrase: Any integer n ≥ 32 is a linear combination of 5 and 9 with natural number coefficients.
That is: If n is an integer and n ≥ 32, then n = 5k + 9l for some k, l ∈ N.
Basis: 32 = 5 · 1 + 9 · 3, so the claim holds for n = 32.
Inductive Step: Now suppose we can write n = 5k + 9l for some integer n ≥ 32, where k, l ∈ N. We
need to show we can write n + 1 as a natural number linear combination of 5 and 9. Since n ≥ 32
we must have either (1) k ≥ 7, or (2) l ≥ 1. If not, then n ≤ 5 · 6 + 0 · 9 = 30, a contradiction (→←).
In case (1), n + 1 = n − 35 + 36 = 5(k − 7) + 9(l + 4), and in case (2), n + 1 = n + 10 − 9 = 5(k + 2) + 9(l − 1).
So, in either case, if we can write n as a linear combination of 5 and 9 with natural number
coefficients, then we can also write n + 1 in such a fashion.
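The stamp claim can also be checked directly, at least for a range of values. The Python sketch below (a brute force search, not a proof) finds natural number coefficients k and l with n = 5k + 9l for every n from 32 up to 499.

def stamp_decomposition(n):
    """Return (k, l) with n = 5*k + 9*l and k, l >= 0, or None if none exists."""
    for l in range(n // 9 + 1):
        if (n - 9 * l) % 5 == 0:
            return ((n - 9 * l) // 5, l)
    return None

for n in range(32, 500):
    k, l = stamp_decomposition(n)
    assert n == 5 * k + 9 * l

print(stamp_decomposition(32), stamp_decomposition(33))   # (1, 3) and (3, 2)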
Example 16.6. Let’s now look at an example of an induction proof with a geometric flavor. Suppose
we have a 4 × 5 chess board:
Each domino covers exactly two squares on the board. A perfect cover of the board consists of a
placement of dominoes on the board so that each domino covers two squares on the board (dominoes
can be either vertically or horizontally orientated), no dominoes overlap, no dominoes extend beyond
the edge of the board, and all the squares on the board are covered by a domino. It’s easy to see that
the 4 × 5 board above has a perfect cover. More generally, it is not hard to prove:
Theorem 16.7. An m × n board has a perfect cover with 1 × 2 dominoes if and only if at least one
of m and n is even.
There is a second version of mathematical induction. Anything that can be proved with this second
version can be proved with the method described above, and vice versa, but this second version is
often easier to use. The change occurs in the induction assumption made in the inductive step of
the proof. The inductive step of the method described above (p(n) → p(n + 1) for all n ≥ k) is
replaced with [p(k) ∧ p(k + 1) ∧ · · · ∧ p(n)] → p(n + 1) for all n ≥ k. The effect is that we now have
a lot more hypotheses to help us derive p(n + 1). In more detail, the second form of mathematical
induction says: if (1) p(k) is true, and (2) for every n ≥ k, the truth of all of p(k), p(k + 1), . . . , p(n)
implies the truth of p(n + 1), then p(n) is true for every n ≥ k.
This principle is shown to be valid in the same way the first form of induction was justified. The
utility lies in dealing with cases where we want to use inductive reasoning, but cannot deduce the
(n + 1)st case from the nth case directly. Let’s do a few examples of proofs using this second
form of induction. One more comment before doing the examples. In many induction proofs, it is
convenient to check several initial cases in the basis step to avoid having to include special cases in
the inductive step. The examples below illustrate this idea.
Example 16.9. Show that any integer n ≥ 32 can be written in the form n = 5 · k + 9 · l for some
k, l ∈ N.
Proof.
Basis: We can certainly write
32 = (1)5 + (3)9
33 = (3)5 + (2)9
34 = (5)5 + (1)9
35 = (7)5 + (0)9
36 = (0)5 + (4)9
Inductive Step: Suppose for some integer m with m ≥ 36 we can write j = 5k + 9l, where k, l ∈ N, for
all integers 32 ≤ j ≤ m. Then since 32 ≤ m − 4, by the inductive hypothesis we can write m − 4 = 5k + 9l
for some natural numbers k, l. Thus m + 1 = 5(k + 1) + 9l, where k + 1, l ∈ N.
In that example, the basis step was a little messier than our first solution to the problem, but to
make up for that, the inductive step required much less cleverness.
Example 16.10. Induction can be used to verify a guessed closed from formula for a recursively
defined sequence. Consider the sequence defined recursively by the initial conditions a0 = 2, a1 = 5
and the recursive rule, for n ≥ 2, an = 5an−1 − 6an−2 . The first few terms of this sequence are
2, 5, 13, 35, 97, · · · . A little experimentation leads to the guess a_n = 2^n + 3^n. Let’s verify that guess
using induction. For the basis of the induction we check our guess gives the correct value of a_n for
n = 0 and n = 1. That’s easy. For the inductive step, let’s suppose our guess is correct for all indices
less than n, where n ≥ 2. Then, we have
a_n = 5a_(n−1) − 6a_(n−2) = 5(2^(n−1) + 3^(n−1)) − 6(2^(n−2) + 3^(n−2)) = (10 − 6)·2^(n−2) + (15 − 6)·3^(n−2) = 2^n + 3^n,
which is what the guess predicts. So, by the second form of induction, a_n = 2^n + 3^n for all n ≥ 0.
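This kind of guess-and-check is also a natural thing to do by machine. The Python sketch below generates terms from the recursive definition and compares each with the closed form 2^n + 3^n; the function name is invented for the illustration.

def u_terms(count):
    """Terms of u_0 = 2, u_1 = 5, u_n = 5*u_(n-1) - 6*u_(n-2)."""
    terms = [2, 5]
    while len(terms) < count:
        terms.append(5 * terms[-1] - 6 * terms[-2])
    return terms

terms = u_terms(20)
print(terms[:5])                                   # [2, 5, 13, 35, 97]
assert all(t == 2**n + 3**n for n, t in enumerate(terms))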
Example 16.11. In the game of Nim, two players are presented with a pile of matches. The
players take turns removing one, two, or three matches at a time. The player forced to take the last
match is the loser. For example, if the pile initially contains 8 matches, then the first player can, with
correct play, be sure to win. Here’s how: player 1: take 3 matches leaving 5; player 2’s options will
leave 4, 3, or 2 matches, and so player 1 can reduce the pile to 1 match on her turn, thus winning
the game. Notice that if player 1 takes only 1 or 2 matches on her first turn, she is bound to lose
to good play since player 2 can then reduce the pile to 5 matches.
Let’s prove that if the number of matches in the pile is 1 more than a multiple of 4, the second
player can force a win; otherwise, the first player can force a win.
Proof. For the basis, we note that obviously the second player wins if there is 1 match in the pile,
and for 2, 3, or 4 matches the first player wins by taking 1, 2, or 3 matches in each case, leaving 1
match.
For the inductive step, suppose the statement we are to prove is correct for the number of matches
anywhere from 1 up to k for some k ≥ 4. Now consider a pile of k + 1 matches.
case 1: If k + 1 is 1 more than a multiple of 4, then when player 1 takes her matches, the pile will
not contain 1 more than a multiple of 4 matches, and so the next player can force a win by the
inductive assumption. So player 2 can force a win.
case 2: If k + 1 is not 1 more than a multiple of 4, then player 1 can select matches to make it 1
more than a multiple of 4, and so the next player is bound to lose (with best play) by the inductive
assumption. So player 1 can force a win.
So, to win at Nim, when it is your turn, make sure you leave 1 more than a multiple of 4 matches
in the pile (which is easy to do unless your opponent knows the secret as well, in which case you
can just count the number of matches in the pile to see who will win, and skip playing the game
altogether!).
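The winning strategy is simple enough to put into a few lines of code. The Python sketch below returns a winning number of matches to take from a pile of size n, or reports that the position is already lost against best play.

def nim_move(n):
    """Given n >= 1 matches, return a winning number of matches to take (1, 2 or 3),
    or None if every move loses against best play (n is 1 more than a multiple of 4)."""
    take = (n - 1) % 4
    return take if take in (1, 2, 3) else None

print(nim_move(8))    # 3, as in the discussion above
print(nim_move(9))    # None: 9 = 2*4 + 1 is a losing position for the player to move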
Exercises
1 · 3 + 2 · 4 + 3 · 5 + · · · + n(n + 2) = n(n + 1)(2n + 7)/6.
1 · 2^1 + 2 · 2^2 + 3 · 2^3 + · · · + n · 2^n = (n − 1)2^(n+1) + 2.
Exercise 16.4. Prove by induction: For every integer n > 4, we have 2^n > n^2.
Exercise 16.6. A pizza is cut into pieces (maybe some pretty oddly shaped) by making some integer
n ≥ 0 number of straight line cuts. Prove: The maximum number of pieces is (n^2 + n + 2)/2.
Exercise 16.7. A sequence is defined recursively by a0 = 0, and, for n ≥ 1, an = 5an−1 + 1. Use
induction to prove the closed form formula for an is
a_n = (5^n − 1)/4.
Problems
1 · 2 + 2 · 3 + 3 · 4 + · · · + n(n + 1) = n(n + 1)(n + 2)/3.
Problem 16.3. Show that any integer n ≥ 8 can be written as a linear combination of the integers
3 and 5 using nonnegative integers as coefficients. That is if n ≥ 8, there exist nonnegative integers
kn , ln so that n = 3 · kn + 5 · ln . Do this twice, using both styles of induction.
Problem 16.5. Prove by induction: For every integer n ≥ 1, the number n^5 − n is divisible by 5.
Problem 16.6. Prove by induction: For the Fibonacci sequence, for all n ≥ 0,
f_0^2 + f_1^2 + f_2^2 + · · · + f_n^2 = f_n · f_(n+1).
Problem 16.7. Prove by induction: For the Fibonacci sequence, for all n ≥ 1,
f_(n−1) · f_(n+1) = f_n^2 + (−1)^n.
as we needed to show.
Now, obviously there is something wrong with this proof by induction since, for example, 1 + 2 + 2^2 =
7, but 2^(2+1) = 2^3 = 8. Where does the proof go bad?
Chapter 17
Algorithms
An algorithm is a recipe to solve a problem. For example, here is an algorithm that solves the
problem of finding the distance traveled by a car given the time it has traveled, t, and its average
speed, s: multiply t and s.
Over time, the requirements of what exactly constitutes an algorithm have matured. A really precise
definition would be filled with all sorts of technical jargon, but the ideas are commonsensible enough
that an informal description will suffice for our purposes. So, suppose we have in mind a certain
class of problems (such as determine the distance traveled given time traveled and average speed).
The properties of an algorithm to solve examples of that class of problems are:
(3) Definiteness: The instructions that make up the algorithm are precisely described. They
are not open to interpretation.
(5) Generality: The algorithm produces correct output for any set of input values.
The algorithm for finding distance traveled given time traveled and average speed obviously meets
all five requirements of an algorithm. Notice that, in this example, we have assumed the user of the
algorithm understands what it means to multiply two numbers. If we cannot make that assumption,
then we would need to add a number of additional steps to the algorithm to solve the problem of
multiplying two numbers together. Of course, that would make the algorithm significantly longer.
When describing algorithms, we’ll assume the user knows the usual algorithms for solving common
problems such as addition, subtraction, multiplication, and division of numbers, and knows how to
determine if one number is larger than another, and so on.
Just as important as an example of what an algorithm is, is an example of what is not an algorithm.
For example, we might describe the method by which most people used to look up a number in
a phone book. You would open the book to some page and look to see if the listing you’re looking for is on
that page or not. If it is, you find the number using the fact that the listings are alphabetized,
and you’re done. If the number you’re looking for is not on the page, you use the fact that the
listings are alphabetized to either flip back several pages, or forward several pages. This page is
checked to see if the listing is on it. If it is not we repeat the process. One problem in this case is
that this description is not definite. The phrase flip back several pages is too vague, it violates the
definiteness requirement. Another problem is that someone could flip back and forth between two
pages and never find the number, and so violate the finiteness requirement. So this method is not
an algorithm.
Continuing the example from above of looking up an item indexed by a sorted list, one algorithm
for completing this is to look at the first entry in the list. If it’s the item you’re looking for you’re
done. Else move to the next entry. It’s either the item you’re looking for or you move to the next
entry. This is an example of a linear search algorithm. It’s not too bad for finding an early
entry in the list, but awful for later entries in the list.
Another algorithm to complete that task of looking up an item indexed by a sorted list is the
binary search algorithm. We first consider the middle entry. If it’s the item we’re looking for,
we’re done. Else we know the item is in the first half of the list, or the last half of the list since the
entries are ordered. We then pick the middle entry of the appropriate half, and repeat the halving
process on that half, until eventually the item is located. There are a few details to fix up to make
this a genuine algorithm. For example, what is the middle entry if there are an even number of
items listed? Also, what happens if the item we are looking for isn’t in the list? But it is clear with
a little effort we can add a few lines to the instructions to make this process into an algorithm.
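Both searches are easy to state precisely once the loose ends are tied up. The Python sketch below is one way to do it (returning the position of the item in a sorted list, or None when the item is absent); it is meant as an illustration, not the only correct formulation.

def linear_search(items, target):
    """Scan the list from the front; return the index of target, or None."""
    for i, item in enumerate(items):
        if item == target:
            return i
    return None

def binary_search(items, target):
    """Repeatedly halve the sorted list; return the index of target, or None."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2          # one way to choose a "middle" entry
        if items[mid] == target:
            return mid
        elif items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None

names = ["Able", "Baker", "Chen", "Diaz", "Evans", "Fox"]
print(linear_search(names, "Diaz"), binary_search(names, "Diaz"))   # 3 3
print(binary_search(names, "Zyz"))                                  # None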
Example 17.1. Here is an algorithm for determining ⌊m/n⌋ for positive integers m, n.
Here is the sequence of steps this algorithm would carry out with input m = 23 and n = 7:
[initial status. m = 23, n = 7, k = (undefined)]
instr 1: r = 3.2
instr 2: Output 3
stop!
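Here is one possible algorithm for computing ⌊m/n⌋, written as a Python sketch (not necessarily the same instructions as the algorithm traced above): repeatedly subtract n and count how many subtractions were possible. It satisfies the definiteness and finiteness requirements, and it avoids any appeal to real number division.

def floor_div(m, n):
    """Compute floor(m / n) for integers m >= 0 and n >= 1 by repeated subtraction."""
    k = 0
    while m >= n:
        m -= n
        k += 1
    return k

print(floor_div(23, 7))   # 3
print(floor_div(6, 7))    # 0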
Problems
Problem 17.1. Consider the following algorithm: The input will be two integers, m ≥ 0, and
n ≥ 1.
Describe in words what this algorithm does. In other words, what problem does this algorithm solve?
Problem 17.2. Consider the following algorithm: The input will be any integer n, greater than 1.
(a) List the steps the algorithm follows for the input n = 12.
(b) Describe in words what this algorithm does. In other words, what problem does this algorithm
solve?
Problem 17.3. Design an algorithm that takes any positive integer n and returns half of n if it is
even and half of n + 1 if n is odd. (Such an algorithm is needed quite often in computer science.)
Problem 17.4. Consider the following algorithm. The input will be a function f together with its
finite domain, D = {d1 , d2 , · · · , dn }.
(a) List the steps the algorithm follows for the input f : {a, b, c, d, e} → {+, ∗, &, $, #, @} given
by f (a) = ∗, f (b) = $, f (c) = +, f (d) = $, and f (e) = @.
(b) Describe in words what this algorithm does. In other words, what problem does this algorithm
solve?
Problem 17.5. Design an algorithm that will convert the ordered triple (a, b, c) to the ordered triple
(b, c, a). For example, if the input is (7, X, ∗), the output will be (X, ∗, 7).
Problem 17.6. Design an algorithm whose input is a finite list of positive integers and whose
output is the sum of the even integers in the list. If there are no even integers in the list, the output
should be 0.
Problem 17.7. A palindrome is a string of letters that reads the same in each direction. For
example, refer and redder are palindromes of length five and six respectively. Design an algorithm
that will take a string as input and output yes if the string is a palindrome, and no if it is not.
Chapter 18
Algorithm Efficiency
There are many different algorithms for solving any particular class of problems. In the last chapter,
we considered two algorithms for solving the problem of looking up a phone number given a person’s
name.
Algorithm L: Look at the first entry in the book. If it’s the number you’re looking for you’re
done. Else move to the next entry. It’s either the number you’re looking for or you move to the
next entry, and so on. (The linear search algorithm)
Algorithm B: The second algorithm took advantage of the arrangement of a phone book in
alphabetical order. We open the phone book to the middle entry. If it’s the number we’re looking
for, we’re done. Otherwise we know the number is listed in the first half of the book, or the last
half of the book. We then pick the middle entry of the appropriate half, and repeat the process.
After a number of repetitions, we will either be at the name we want, or learn the name isn’t in
the book. (The binary search algorithm)
The question arises, which algorithm is better? The question is pretty vague. Let’s assume that
better means uses fewer steps. Now if there are only one or two names in the phone book, it doesn’t
matter which algorithm we use, the look-up always takes one or two steps. But what if the phone
book contains 10000 names? In this case, it is hard to say which algorithm is better: looking up
Adam Aaronson will likely only take one step by the linear search algorithm, but binary search will
take 14 steps or so. But for Zebulon Zyzniewski, the linear search will take 10000 steps, while the
binary search will still take only about 14 steps. Two lessons come out of this example:
(1) Small cases of the problem can be misleading when judging the quality of an algorithm, and
(2) It’s unlikely that one algorithm will always be more efficient than another.
The common approach to compare the efficiency of two algorithms takes those two lessons into
account by agreeing to the following protocol:
(1) only compare the algorithms when the size, n, of the problem they are applied to is huge. In
the phone book example, don’t worry about phone books of 100 names or even 10000 names.
Worry instead about phone books with n names where n gets arbitrarily large.
(2) to compare two algorithms, first, for each algorithm, find the maximum number of steps ever
needed when applied to a problem of size n. For a phone book of size n the linear search
algorithm will require n steps in the worst possible case of the name not being in the book.
On the other hand, the halving process of the binary search algorithm means that it will never
take more than about log2 n steps to locate a name (or discover the name is missing) in the
phone book. This information is expressed compactly by saying the linear search algorithm
has worst case scenario efficiency wL (n) = n while the binary search algorithm has worst
case scenario efficiency wB (n) = log2 n.
(3) we declare that algorithm #1 is more efficient than algorithm #2 provided, for all problems
of huge sizes n, w1 (n) < w2 (n), where w1 and w2 are the worst case scenario efficiencies for
each algorithm.
Notice that for huge n, wB (n) < wL (n). In fact, there is no real contest. For example, when
n = 1048576 = 2^20, we get wB (n) = 20 while wL (n) = 1048576, and things only get better for wB
as n gets larger.
In summary, to compare two algorithms designed to solve the same class of problems we:
(1) Determine a number n that indicates the size of a problem. For example, if the algorithm
manipulates a list of numbers, n could be the length of the list. If the algorithm is designed
to raise a number to a power, the size could be the power n.
(2) Decide what will be called a step when applying the algorithms. In the phone book example,
we took a step to mean a comparison. When raising a number to a power, a step might consist
of performing a multiplication. A step is usually taken to be the most time consuming action
in the algorithm, and other actions are ignored. Also, when determining the function w,
don't get hung up worrying about minuscule details. Don't spend time trying to determine if
w(n) = 2n + 7 or w(n) = 2n + 67. For huge values of n, the +7 and +67 become unimportant.
In such a case, w(n) = 2n has all the interesting information. Don’t sweat the small stuff.
(3) Determine the worst case scenario functions for the two algorithms, and compare them. The
smaller of the two (assuming they are not essentially the same) is declared the more efficient
algorithm.
Example 18.1. Let’s do a worst case scenario computation for the following algorithm designed to
determine the largest number in a list of n numbers. It would be natural to use the number of items
in the list, n, to represent the size of a problem. And let's use comparisons as steps. We are
going to make two comparisons for each of the items in the list in every case (every case is a
worst case for this algorithm!). So we give this algorithm an efficiency w(n) = 2n. Notice that we
actually only need comparisons for the last n − 1 items in the list, and the exact number of times
the comparisons in instructions (3) and (4) are carried out might take a few minutes to figure out.
But it’s clear that both are carried out about n times, and since we are only interested in huge n’s,
being off by a few (or a few billion) isn’t really going to matter at all.
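The numbered listing for this algorithm is not reproduced above, but the analysis can be checked against a sketch of a typical find-the-largest routine. In the version below the two comparisons made on each pass (the role played by instructions (3) and (4)) are counted explicitly; this is an illustration of the kind of algorithm being discussed, not a transcription of the book's listing.

    def largest(numbers):
        comparisons = 0
        largest_so_far = numbers[0]
        i = 1
        while True:
            comparisons += 1                 # comparison 1: have we run off the end of the list?
            if i >= len(numbers):
                break
            comparisons += 1                 # comparison 2: is the current item a new maximum?
            if numbers[i] > largest_so_far:
                largest_so_far = numbers[i]
            i += 1
        return largest_so_far, comparisons

    print(largest([3, 17, 5, 11]))   # (17, 7): about 2n comparisons for a list of n items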
Problems
Problem 18.1. For the algorithm presented in problem 17.2 from the last chapter:
(a) Select a value to represent the size of an instance of the problem the algorithm is designed to
solve.
Chapter 19
The Growth of Functions
Now that we have an idea of how to determine the efficiency of an algorithm by computing its
worst case scenario function, w(n), we need to be able to decide when one algorithm is better than
another. For example, suppose we have two algorithms to solve a certain problem, the first with
w1(n) = 10000n^2, and the second with w2(n) = 2^n. Which algorithm would be the better choice
to implement based on these functions? To find out, let’s assume that our computer can carry out
one billion steps per second, and estimate how long each algorithm will take to solve a worst case
problem for various values of n.
w(n)        n = 10          n = 20      n = 50      n = 100
10000n^2    .001 sec        .004 sec    .025 sec    .1 sec
2^n         .000001 sec     .001 sec    13 days     4 × 10^11 centuries
So, it looks like the selection of the algorithm depends on the size of the problems we expect to
run into. Up to size 20 or so, it doesn’t look like the choice makes a lot of difference, but for larger
values of n, the 10000n^2 algorithm is the only practical choice.
It is worth noting that the values of the efficiency functions for small values of n can be deceiving.
It is also worth noting that, from a practical point of view, simply designing an algorithm to solve
a problem without analyzing its efficiency can be a pointless exercise.
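Entries in tables like the one above are easy to reproduce. Here is a small Python sketch, assuming (as in the text) one billion steps per second:

    RATE = 10**9                      # steps per second

    def seconds(steps):
        return steps / RATE

    for n in [10, 20, 50, 100]:
        print(n, seconds(10000 * n**2), seconds(2**n))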
There are a few types of efficiency functions that crop up often in the analysis of algorithms. In
order of decreasing efficiency for large n they are: log2 n, √n, n, n^2, n^3, 2^n, n!.
Assuming one billion steps per second, here is how these efficiency functions compare for various
choices of n.
w(n)      n = 10            n = 20            n = 40            n = 60
log2 n    .000000003 sec    .000000004 sec    .000000005 sec    .000000006 sec
√n        .000000003 sec    .000000004 sec    .000000006 sec    .000000008 sec
n         .00000001 sec     .00000002 sec     .00000004 sec     .00000006 sec
n^2       .0000001 sec      .0000004 sec      .0000016 sec      .0000036 sec
n^3       .000001 sec       .000008 sec       .000064 sec       .00022 sec
2^n       .000001 sec       .001 sec          18.3 minutes      36.5 years
n!        .0036 sec         77 years          2.6 × 10^29 centuries    2.6 × 10^63 centuries
Even though the values in the first five rows of the table look reasonably close together, that is
a false impression fostered by the small values of n. For example, when n = 1000000, those five
entries would be as in table 19.3.
w(n)      n = 1000000
log2 n    .00000002 sec
√n        .000001 sec
n         .001 sec
n^2       17 minutes
n^3       31.7 years
And, for even larger values of n, the √n algorithm will require billions more years than the log2 n
algorithm.
There is a traditional method of estimating the efficiency of an algorithm. As in the examples above,
one part of the plan is to ignore tiny contributions to the efficiency function. In other words, we
won't write expressions such as w(n) = n^2 + 3, since the term 3 is insignificant for the large values
of n we are interested in. As far as behavior for large values of n is concerned, the functions n^2
and n^2 + 3 are indistinguishable. A second part of the plan is to not distinguish between functions
if one is always, say, 10 times the other. In other words, as far as analyzing efficiency, the functions
n^2 and 10n^2 are indistinguishable. And there is nothing special about 10 in those remarks. These
ideas lead us to the notion of the order of growth with respect to n, O(g(n)), in the next definition.
Definition 19.1. The function w(n) is O(g(n)) provided there is a number k > 0 such that
w(n) ≤ kg(n) for all n (or at least for all large values of n). The symbol O(g(n)) is read in English
as big-oh of g(n).
O(g(n)) actually represents the set of functions dominated by g(n). So, it would be proper to write
w(n) = n^3 + 2n^2 + 10n + 4 ∈ O(n^3). Moreover, we could write O(n^2) ⊂ O(n^3) since the functions
dominated by n^2 are among those dominated by n^3. Loosely speaking, finding the O estimate for
a function selects the most influential, or dominant, term (for large values of the variable) in the
function, and suppresses any constant factor for that term.
In each of the following examples, we find a big-oh estimate for the given expression.
Example 19.2. We have that n^4 − 3n^3 + 2n^2 − 6n + 14 is O(n^4), since for large n the first term
dominates the others.
Example 19.3. For large n, we have the inequalities
n^3 log2 n + n^2 − 3 ≤ 2n^3 log2 n   and   n^2 + 2n + 8 ≤ 2n^2.
Hence, (n^3 log2 n + n^2 − 3)(n^2 + 2n + 8) ≤ 4n^5 log2 n, so the product is O(n^5 log2 n). Alternatively, we have that n^3 log2 n and
n^2 dominate their respective factors. Thus, again, the product is O(n^5 log2 n).
Problems
Problem 19.1. You have been hired for a certain job that can be completed in less than two months,
and offered two modes of payment. Method 1: You get $1,000,000,000 a day for as long as the job
takes. Method 2: You get $1 the first day, $2 the second day, $4 the third day, $8 the fourth day,
and so on, your payment doubling each day, for as long as the job lasts. Which method of payment
do you choose?
Problem 19.2. Suppose an algorithm has efficiency function w(n) = n log2 n. Compute the worst
case time required for the algorithm to solve problems of sizes n = 10, 20, 40, 60 assuming the
operations are carried out at the rate of one billion per second. Where does this function fit in the
table on the second page of this chapter?
Chapter 20
The Integers
Number theory is concerned with the integers and their properties. In this chapter the rules of
the arithmetic of integers are reviewed. The surprising fact is that all the dozens of rules and tricks
you know for working with integers (and for doing algebra, which is just arithmetic with symbols)
are consequences of just a few basic facts.
The set of integers, {· · · , −2, −1, 0, 1, 2, · · · }, is denoted by the symbol Z. The two familiar arith-
metic operations for the integers, addition and multiplication, obey several basic rules. First, notice
that addition and multiplication are binary operations. In other words, these two operations com-
bine a pair of integers to produce a value. It is not possible to add (or multiply) three numbers at
a time. We can figure out the sum of three numbers, but it takes two steps: we select two of the
numbers, and add them up, and then add the third to the preliminary total. Never are more than
two numbers added together at any time. A list of the seven fundamental facts about addition and
multiplication of integers follows.
(1) The integers are closed with respect to addition and multiplication.
That means that when two integers are added or multiplied, the result is another integer. In
symbols, we have
∀a, b ∈ Z, ab ∈ Z and a + b ∈ Z.
(2) Addition and multiplication of integers are commutative operations. In symbols,
∀a, b ∈ Z, a + b = b + a and ab = ba.
(3) Addition and multiplication of integers are associative operations. In other words, when we
compute the sum (or product) of three integers, it does not matter whether we combine the
first two and then add the third to the total, or add the first to the total of the last two. The
final total will be the same in either case. Expressed in symbols, we have
∀a, b, c ∈ Z, (a + b) + c = a + (b + c) and (ab)c = a(bc).
(4) There is an additive identity denoted by 0. It has the property that when it is added to
any number the result is that number right back again. In symbols, we see that
0 + a = a = a + 0 for all a ∈ Z.
(5) Every integer a has an additive inverse, denoted −a, with the property that
a + (−a) = 0 = (−a) + a for all a ∈ Z.
(6) There is a multiplicative identity denoted by 1. It has the property that
1 · a = a = a · 1 for all a ∈ Z.
(7) Multiplication distributes over addition:
∀a, b, c ∈ Z, a(b + c) = ab + ac.
The preceding facts tell all there is to know about arithmetic. Every other fact can be proved from
these. For example, here is a proof of the cancellation law for addition (if a + c = b + c, then a = b) using the facts listed above.
Proof. Suppose a + c = b + c. Add −c to both sides of that equation (applying fact 5 above) to
get (a + c) + (−c) = (b + c) + (−c). Using the associative rule, that equation can be rewritten as
a + (c + (−c)) = b + (c + (−c)), and that becomes a + 0 = b + 0. By property 4 above, that means
a = b.
As another example, here is a proof that a0 = 0 for every integer a.
Proof. Here are the steps in the proof. You supply the justifications for the steps.
a0 = a(0 + 0)
a0 = a0 + a0
a0 + (−(a0)) = (a0 + a0) + (−(a0))
a0 + (−(a0)) = a0 + (a0 + (−(a0)))
0 = a0 + 0
0 = a0
Your justification for each step should be stated as using one, or more, of the fundamental facts as
applied to the specific circumstance in each line.
The integers also have an order relation, a is less than or equal to b: a ≤ b. This relation satisfies
three fundamental order properties: ≤ is a reflexive, antisymmetric, and transitive relation on Z.
The notation b ≥ a means the same as a ≤ b. Also a < b (and b > a) are shorthand ways to say
a ≤ b and a ̸= b.
The trichotomy law holds: for a ∈ Z exactly one of a > 0, a = 0, or a < 0 is true.
The Well Ordering Principle for Z: The set of positive integers is well-ordered: every nonempty
subset of positive integers has a least element.
Exercises
Exercise 20.2. Prove that if ab = 0, then a = 0 or b = 0. Hint: Try an indirect proof with four
cases. Case 1: Show that if a > 0 and b > 0, then ab ̸= 0. Case 2: Show that if a > 0 and b < 0,
then ab ̸= 0. There are two more similar cases. (This fact is called the zero property.)
Exercise 20.3. Prove the cancellation law for multiplication: For integers a, b, c, with c ̸= 0, if
ac = bc, then a = b. (Hint: Use exercise 20.2)
Problems
Chapter 21
The Divides Relation and Primes
Given integers a and b we say that a divides b and write a|b provided there is an integer c with
b = ac. So a divides b means a divides into b evenly. When a divides b we also say that a is
a factor of b, or that a is a divisor of b, or that b is a multiple of a. For example 3|12 since
12 = 3 · 4. Keep in mind that divides is a relation. When you see a|b you should think: is that true
or false? Don't write things like 3|12 = 4! If a does not divide b, write a ∤ b. For example, it is true
that 3 ∤ 13 since 3 does not divide into 13 evenly.
Theorem 21.1. For all integers a, b, c, m, n:
(1) a|0
(2) ±1|a
(5) a| − a
(4) Proof. Suppose a|b and b|c. That means there are integers s, t so that as = b and bt = c.
Substituting as for b in the second equation gives (as)t = c, which is the same as a(st) = c.
That shows a|c. □
(9) Proof. Suppose a|b and a|c. That means there are integers s, t such that as = b and at = c.
Multiply the first equation by m and the second by n to get a(sm) = mb and a(tn) = nc. Now
add those two equations: a(sm) + a(tn) = mb + nc. Factoring out the a on the left shows
a(sm + tn) = mb + nc, and so we see a|(mb + nc). □
The prime integers play a central role in number theory. A positive integer larger than 1 is said to
be prime if its only positive divisors are 1 and itself.
The first few primes are 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79.
A positive integer larger than 1 which is not prime is composite. So a composite number n has
a positive divisor a which is neither 1 nor n. By part (6) of the theorem above, 1 < a < n.
Proof. Suppose n is a composite integer. That means n = ab where 1 < a, b < n. Not both a and
b are greater than √n, for if so n = ab > √n · √n = (√n)^2 = n, and that is a contradiction.
So, if we haven't found a divisor of n by the time we reach √n, then n must be a prime.
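This observation is the basis of the trial-division primality test; here is a minimal Python sketch:

    def is_prime(n):
        # A composite n must have a divisor d with 1 < d <= sqrt(n),
        # so if no such d is found, n is prime.
        if n < 2:
            return False
        d = 2
        while d * d <= n:            # same as d <= sqrt(n), without floating point
            if n % d == 0:
                return False
            d += 1
        return True

    print([p for p in range(2, 80) if is_prime(p)])   # the list of primes shown above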
Proof. Let n > 1 be given. The set, D, of all integers greater than 1 that divide n is nonempty since
n itself is certainly in that set. Let m be the smallest integer in that set. Then m must be a prime
since if k is an integer with 1 < k < m and k|m, then k|n, and so k ∈ D. That is a contradiction
since m is the smallest element of D. Thus m is a prime divisor of n.
Proof. Suppose that there were only finitely many primes. List them all: 2, 3, 5, 7, · · · , p. Form the
number N = 1 + 2 · 3 · 5 · 7 · · · p. According to the last theorem, there must be a prime that divides
N , say q. Certainly q also divides 2 · 3 · 5 · 7 · · · p since that is the product of all the primes, so q is
one of its factors. Hence q divides N − 2 · 3 · 5 · 7 · · · p. But that’s crazy since N − 2 · 3 · 5 · 7 · · · p = 1.
We have reached a contradiction, and so we can conclude there are infinitely many primes.
Theorem 21.5 (The Division Algorithm for Integers). If a, d ∈ Z, with d > 0, there exist unique
integers q and r, with a = qd + r, and 0 ≤ r < d.
Proof. Let S = {a − nd|n ∈ Z, and a − nd ≥ 0}. Then S ̸= ∅, since a − (−|a|)d ∈ S for sure. Thus,
by the Well Ordering Principle, S has a least element, call it r. Say r = a − qd. Then we have
a = qd + r, and 0 ≤ r. If r ≥ d, then a = (q + 1)d + (r − d), with 0 ≤ r − d contradicting the
minimality of r.
The quantities q and r in the division algorithm are called the quotient and remainder when a
is divided by d.
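In Python, for instance, the quotient and remainder of the division algorithm are produced by the built-in divmod function (equivalently, the // and % operators); for d > 0 Python's conventions agree with the theorem, even when a is negative:

    a, d = -17, 5
    q, r = divmod(a, d)
    print(q, r, a == q * d + r)     # -4 3 True, and 0 <= r < d as promised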
Exercises
Exercise 21.1. Determine the quotient and remainder when 107653 is divided by 22869.
Exercise 21.7. Show that none of the 1000 consecutive integers 1001! + 2 to 1001! + 1001 are
primes.
Problems
Problem 21.1. For positive integers, a and b, if the quotient when a is divided by b is q, what are
the possible quotients when a + 1 is divided by b?
Problem 21.2. For positive integers, a and b, if the quotient when a is divided by b is q, what are
the possible quotients when 2a is divided by b?
Problem 21.4. Determine all the integers that 0 divides. (Hint: Think about the definition of the
divides relation. The correct answer is probably not what you expect.)
Problem 21.5. Determine if 3599 is a prime. (Hint: This is easy since 3599 = 3600 − 1.)
Problem 21.7. Prove property 10 of Theorem 21.1: For integers a, b, c, if a|b, then a|bc.
Chapter 22
GCD’s and the Euclidean Algorithm
The greatest common divisor of a and b, not both 0, is the largest integer which divides both a
and b. For example, the greatest common divisor of 21 and 35 is 7. We write gcd(a, b), as shorthand
for the greatest common divisor of a and b. So gcd(35, 21) = 7.
There are several ways to find the gcd of two integers, a and b (not both 0).
First, we could simply list all the positive divisors of a and b and pick the largest number that
appears in both lists. Notice that 1 will appear in both lists. For the example above the positive
divisors of 35 are 1, 5, 7, and 35. For 21 the positive divisors are 1, 3, 7, and 21. The largest
number appearing in both lists is 7, so gcd(35, 21) = 7.
Another way to say the same thing: If we let Da denote the set of positive divisors of a, then
gcd(a, b) = the largest number in Da ∩ Db .
The reason gcd(0, 0) is not defined is that every positive integer divides 0, and so there is no largest
integer that divides 0. From now on, when we use the symbol gcd(a, b), we will tacitly assume a and
b are not both 0. The integers a and b can be negative. For example if a = −34 and b = 14, then
the set of positive divisors of −34 is {1, 2, 17, 34} and the set of positive divisors of 14 is {1, 2, 7, 14}.
The set of positive common divisors of 14 and −34 is the set {1, 2, 17, 34} ∩ {1, 2, 7, 14} = {1, 2}.
Obviously then gcd(a, b) = gcd(−a, b) since a and −a have the same set of positive divisors. So
when computing the gcd(a, b) we may as well replace a and b by their absolute values if one or both
happen to be negative.
Here are a few simple facts about gcd's:
(1) gcd(a, 0) = |a| (for a ̸= 0).
(2) gcd(a, 1) = 1.
(3) gcd(a, b) = gcd(b, a). (The order a and b are given is not important,
but it is traditional to list them with a ≥ b.)
If gcd(a, b) = 1, we say that a and b are relatively prime. When a and b are relatively prime,
they have no common prime divisor. For example 12 and 35 are relatively prime.
It's pretty clear that computing gcd(a, b) by listing all the positive divisors of a and all the positive
divisors of b, and selecting the largest integer that appears in both lists, is not very efficient. There
is a better way of computing gcd(a, b).
Theorem 22.1. If a and b are integers (not both 0) and a = sb + t for integers s and t, then
gcd(a, b) = gcd(b, t).
Proof. To prove the theorem, we will show that the list of positive integers that divide both a and
b is identical to the list of positive integers that divide both b and t = a − sb. So, suppose d|a and
d|b. Then d|(a − sb) so d|t. Hence d divides both b and t. On the other hand, suppose d|b and d|t.
Then d|(sb + t), so that d|a. Hence d divides both a and b. It follows that gcd(a, b) = gcd(b, t).
Euclid is given the credit for discovering this fact, and its use for computing gcd’s is called the
Euclidean algorithm in his honor. The idea is to use the theorem repeatedly until a pair of
numbers is reached for which the gcd is obvious. Here is an example of the Euclidean algorithm in
action.
Example 22.2. Since 14 = 1·10+4, gcd(14, 10) = gcd(10, 4). In turn 10 = 2·4+2 so gcd(10, 4) =
gcd(4, 2). Since 4 = 2 · 2, gcd(4, 2) = gcd(2, 0) = 2. So gcd(10, 14) = 2.
The same example, presented a little more compactly, and without explicitly writing out the divisions,
looks like
gcd(14, 10) = gcd(10, 4) = gcd(4, 2) = gcd(2, 0) = 2
At each step, the old second number moves into the first spot, and the remainder when the old first
number is divided by the old second becomes the new second number. The process is repeated until the second number
is a 0 (which must happen eventually since the second number never will be negative, and it goes
down by at least 1 with each repetition of the process). The gcd is then the number in the first spot
when the second spot is 0 in the last step of the algorithm.
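The process just described translates directly into a short program. A Python sketch:

    def gcd(a, b):
        # Replace (a, b) by (b, remainder of a divided by b) until the second number is 0;
        # the first number is then the gcd.
        a, b = abs(a), abs(b)
        while b != 0:
            a, b = b, a % b
        return a

    print(gcd(14, 10), gcd(317, 118))   # 2 1, matching the examples in this chapter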
Example 22.3. Find the greatest common divisor of 540 and 252. We may present the computa-
tions compactly, without writing out the divisions. We have
gcd(540, 252) = gcd(252, 36) = gcd(36, 0) = 36.
Using the Euclidean algorithm to find gcd’s is extremely efficient. Using a calculator with a ten
digit display, you can find the gcd of two ten digit integers in a matter of a few minutes at most
using the Euclidean algorithm. On the other hand, doing the same problem by first finding the
positive divisors of the two ten digit integers would be a tedious project lasting several days. Some
modern cryptographic systems rely on the computation of the gcd’s of integers of hundreds of digits.
Finding the positive divisors of such large integers, even with a computer, is, at present, a hopeless
task. But a computer implementation of the Euclidean algorithm will produce the gcd of integers
of hundreds of digits in the blink of an eye.
In general, for integers a > b > 0, the Euclidean algorithm amounts to the following sequence of divisions:
a = q1 · b + r1,              0 < r1 < b
b = q2 · r1 + r2,             0 < r2 < r1
r1 = q3 · r2 + r3,            0 < r3 < r2
    ⋮
rk = qk+2 · rk+1 + rk+2,      0 < rk+2 < rk+1
    ⋮
rn−2 = qn · rn−1 + rn,        0 < rn < rn−1
rn−1 = qn+1 · rn + 0
The sequence of integer remainders b > r1 > ... > rk > ... ≥ 0 must eventually reach 0. Let’s say
rn ̸= 0, but rn+1 = 0, so that rn−1 = qn+1 · rn . That is, in the sequence of remainders, rn is the
last non-zero term. Then, just as in the examples above we see that the gcd of a and b is the last
nonzero remainder:
gcd(a, b) = gcd(b, r1) = gcd(r1, r2) = · · · = gcd(rn−1, rn) = gcd(rn, 0) = rn.
Let’s find gcd(317, 118) using this version of the Euclidean algorithm. Here are the steps:
317 = 2 · 118 + 81
118 = 1 · 81 + 37
81 = 2 · 37 + 7
37 = 5 · 7 + 2
7=3·2+1
2=2·1+0
Since the last non-zero remainder is 1, we conclude that gcd(317, 118) = 1. So, in the terminology
introduced above, we would say that 317 and 118 are relatively prime.
Exercises
Exercise 22.1. Use the Euclidean algorithm to compute gcd(a, b) in each case.
a) a = 233, b = 89 b) a = 1001, b = 13 c) a = 2457, b = 1458 d) a = 567, b = 349
Exercise 22.3. Write a step-by-step algorithm that implements the Euclidean algorithm for finding
gcd’s.
Problems
Problem 22.1. Use the Euclidean algorithm to compute gcd(a, b) in each case.
Problem 22.3. If p is a prime, and n is any integer, what are the possible values of gcd(p, n)?
Problem 22.4. Prove or give a counterexample: If p and q are distinct primes, then gcd(2p, 2q) =
2.
Chapter 23
GCD’s Reprised
The greatest common divisor of two integers a and b, not both zero, is defined to be the largest
integer gcd(a, b) that divides them both. But there is another way to describe the greatest common
divisor. First, a little vocabulary: recall that a linear combination of a and b is any expression
of the form as + bt where s, t are integers. For example, 4 · 5 + 10 · 2 = 40 is a linear combination
of 4 and 10. Here are some more linear combinations of 4 and 10:
4 · (−2) + 10 · 1 = 2,   4 · 3 + 10 · (−1) = 2,   4 · 1 + 10 · 1 = 14,   4 · (−5) + 10 · 2 = 0.
If we make a list of all possible linear combinations of 4 and 10, an unexpected pattern appears:
· · · , −6, −4, −2, 0, 2, 4, 6, · · · . Since 4 and 10 are both even, we are sure to see only even integers
in the list of linear combinations, but the surprise is that every even number is in the list. Now
here’s the connection with gcd’s: The gcd of 4 and 10 is 2, and the list of all linear combinations is
exactly all multiples of 2. Let’s prove that was no accident.
Theorem 23.1. Let a, b be two integers (not both zero). Then the smallest positive number in the
list of the linear combinations of a and b is gcd(a, b). In other words, the gcd(a, b) is the smallest
positive integer that can be written as a linear combination of a and b.
Proof. Let L = { as + bt | s, t are integers and as + bt > 0 }. Since a, b are not both 0, we see this
set is nonempty. As a nonempty set of positive integers, it must have a least element, say m. Since
m ∈ L, m is a linear combination of a and b. Say m = as0 +bt0 . We need to show m = gcd(a, b) = d.
As noted above, since d|a and d|b, it must be that d|(as0 + bt0 ), so d|m. That implies d ≤ m.
We complete the proof by showing m is a common divisor of a and b. The plan is to divide a by
m and show the remainder must be 0. So write a = qm + r with 0 ≤ r < m. Solving for r we
get 0 ≤ r = a − qm = a − q(as0 + bt0 ) = a(1 − qs0 ) + b(−qt0 ) < m. That shows r is a linear
combination of a and b that is less than m. Since m is the smallest positive linear combination of
a and b, the only option for r is r = 0. Thus a = qm, and so m|a. In the same way, m|b. Since m
is a common divisor of a and b, it follows that m ≤ d. Since the reverse inequality is also true, we
conclude m = d.
Theorem 23.2. Let a, b be two integers (not both zero). Then the list of all the linear combinations
of a and b consists of all the multiples of gcd(a, b).
Proof. Since gcd(a, b) = d certainly divides any linear combination of a and b, only multiples of
d stand a chance to be in the list. Now we need to show that if n is a multiple of d, then n
will appear in the list for sure. According to the last theorem, we can find integers s0 , t0 so that
d = as0 + bt0 . Now since n is a multiple of d, we can write n = de. Multiplying both sides of
d = as0 + bt0 by e gives a(s0 e) + b(t0 e) = de = n, and that shows n does appear in the list of linear
combinations of a and b.
So, without doing any computations, we can be sure that the set of all linear combinations of 15
and 6 will be all multiples of 3.
In practice, finding integers s and t so that as + bt = d = gcd(a, b) is carried out by using the
Euclidean algorithm applied to a and b and then back-solving.
Example 23.3. Let a = 35 and b = 55. Then the Euclidean algorithm gives
55 = 35 · 1 + 20
35 = 20 · 1 + 15
20 = 15 · 1 + 5
15 = 5 · 3 + 0
The last nonzero remainder is 5, so gcd(35, 55) = 5. Now back-solve. The third equation gives
5 = 1 · 20 + (−1) · 15. The second equation gives 15 = 1 · 35 + (−1) · 20, and we
can substitute this into the previous expression for 5 as a linear combination of 20 and 15 to get
5 = 1 · 20 + (−1) · 15 = 1 · 20 + (−1) · (1 · 35 + (−1) · 20). Which can be simplified by collecting 35’s and
20’s to write 5 = 2 · 20 + (−1) · 35. Now we can use the top equation to write 20 = 1 · 55 + (−1) · 35
and substitute this into the expression giving 5 as a linear combination of 35 and 20. We get
5 = 2 · (1 · 55 + (−1) · 35) + (−1) · 35. This simplifies to 5 = 2 · 55 + (−3) · 35.
This sort of computation gets a little tedious, keeping track of equations and coefficients. Moreover,
the back-substitution method isn’t very pleasant from a programming perspective since all the
equations in the Euclidean algorithm need to be saved before solving for the coefficients in a linear
combination for the gcd(a, b). A streamlined version called the continued fraction method uses
forward-substitution to allow us to compute gcd(a, b) as a linear combination.
For the example above we first build a four-row table with 4 + 2 = 6 columns. We begin by entering
a and b in the top row left hand spaces. We put *’s below them in the second row (these entries
are not used). Then we fill out the last two rows of the first two columns as shown.
55   35
 *    *
 0    1
 1    0
We complete the first row using the remainders from the Euclidean Algorithm.
55 = 35 · 1 + 20
35 = 20 · 1 + 15
20 = 15 · 1 + 5
15 = 5 · 3 + 0
55   35   20   15    5    0
 *    *
 0    1
 1    0
We complete the second row using the quotients from the Euclidean Algorithm.
55 = 35 · 1 + 20
35 = 20 · 1 + 15
20 = 15 · 1 + 5
15 = 5 · 3 + 0
55   35   20   15    5    0
 *    *    1    1    1    3
 0    1
 1    0
Finally, the remaining entries in each of the bottom two rows are filled in one after the other from
left to right by taking the quotient entry in a column times the row entry one column back and
then adding the row entry two columns back. For example, to compute the c in the following table
use c = q · j + k.
 *    *              q
 0    1    k    j    c
 1    0
Using that rule to complete the table for 55 and 35 we obtain the result
55   35   20   15    5    0
 *    *    1    1    1    3
 0    1    1    2    3   11
 1    0    1    1    2    7
We don’t actually need the bottom two entries in the last column. But observe that 7 · 55 = 11 · 35.
This will always be true! So we have a nice way to check that we have correctly filled out the table.
Anyway, we now find integers s and t so that gcd(a, b) = s · a + t · b. Obviously one of s and t
will be positive and the other negative, but which one? The answer lies in counting the number
of back-substitutions we would have made. Start with a + at the 1 in the first column of the
bottom row, and step along, alternating + and − from column to column. This gives the correct sign
for s, whose absolute value is the second to last entry in the bottom row. The sign of t is the opposite, and the
absolute value of t is the entry in the 3rd row and second to last column.
In general to use the continued fraction table method to write gcd(a, b) = sa + tb we build a table
with 4 rows and n + 2 columns, where the Euclidean Algorithm applied to a and b requires n steps
and rn = 0. In practice, as noticed above, we can omit the last column.
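The bookkeeping in the table (or in back-substitution) can also be carried out as a single forward pass. Here is a Python sketch of the Extended Euclidean Algorithm, returning gcd(a, b) together with integers s and t such that gcd(a, b) = s·a + t·b:

    def extended_gcd(a, b):
        # Invariant: r0 == a*s0 + b*t0 and r1 == a*s1 + b*t1 at every step.
        r0, r1 = a, b
        s0, s1 = 1, 0
        t0, t1 = 0, 1
        while r1 != 0:
            q = r0 // r1
            r0, r1 = r1, r0 - q * r1
            s0, s1 = s1, s0 - q * s1
            t0, t1 = t1, t0 - q * t1
        return r0, s0, t0              # gcd(a, b), s, t

    print(extended_gcd(55, 35))        # (5, 2, -3): 5 = 2*55 + (-3)*35, as found above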
Exercises
Exercise 23.1. Determine gcd(13447, 7667) and write it as a linear combination of 13447 and
7667. Try both the method of back-substitution and the Extended Euclidean Algorithm to determine
a suitable linear combination.
Exercise 23.2. What can you conclude about gcd(a, b) if there are integers s, t with as + bt = 1?
Exercise 23.3. What can you conclude about gcd(a, b) if there are integers s, t with as + bt = 19?
Exercise 23.4. What can you conclude about gcd(a, b) if there are integers s, t with as + bt = 18?
Problems
Problem 23.1. Determine gcd(41559, 39417) and write it as a linear combination of 41559 and
39417. Try both the method of back-substitution and the Extended Euclidean Algorithm to determine
a suitable linear combination.
Problem 23.2. What can you conclude about gcd(a, b) if there are integers s, t with as + bt = 12?
Problem 23.3. What is the smallest positive integer that can be written as a linear combination
of 2191 and 1351?
Problem 23.4. Definition: The least common multiple of the positive integers a and b is the
smallest positive integer that is divisible by both a and b.
Example: the least common multiple of 24 and 18 is 72. Write that as lcm(24, 18) = 72.
You might recall the notion of least common multiple from the time you learned how to add fractions.
The idea was that to add two fractions, a/b and c/d, first write the two as equivalent fractions with the
same denominator. For example, to add 2/3 and 5/4, write them as 8/12 and 15/12, then add to get 23/12. The
least common denominator when adding a/b and c/d is the least common multiple of the denominators,
lcm(b, d).
Chapter 24
The Fundamental Theorem of Arithmetic
The Fundamental Theorem of Arithmetic states the familiar fact that every positive integer greater
than 1 can be written in exactly one way as a product of primes. For example, the prime factor-
ization of 60 is 2^2 · 3 · 5, and the prime factorization of 625 is 5^4. The factorization of 60 can be
written in several different ways: 60 = 2 · 2 · 3 · 5 = 5 · 2 · 3 · 2, and so on. The order in which the
factors are written does not matter. The factorization of 60 into primes will always have two 2's,
one 3, and one 5. One more example: the factorization of 17 consists of the single factor 17. In the
standard form of the factorization of an integer greater than 1, the primes are written in order of
size, and exponents are used for primes that are repeated in the factorization. So, for example, the
standard factorization of 60 is 60 = 2^2 · 3 · 5.
Before proving the Fundamental Theorem of Arithmetic, we will need to assemble a few facts.
Theorem 24.1 (Euclid’s Lemma). If n|ab and n and a are relatively prime, then n|b.
Proof. Suppose n|ab and that gcd(n, a) = 1. We can find integers s, t such that ns + at = 1.
Multiply both sides of that equation by b to get nsb + abt = b. Since n divides both terms on the
left side of that equation, it divides their sum, which is b.
One consequence of this theorem is that if a prime divides a product of some integers, then it must
divide one of the factors. That is so since if a prime does not divide an integer, then it is relatively
prime to that integer. That is useful enough to state as a theorem.
Theorem 24.2. If p is a prime and p|ab, then p|a or p|b.
Theorem 24.3 (Fundamental Theorem of Arithmetic). If n > 1 is an integer, then there exist
prime numbers p1 ≤ p2 ≤ ... ≤ pr such that n = p1 p2 · · · pr and there is only one such prime
factorization of n.
Proof. There are two things to prove: (1) every n > 1 can be written in at least one way as a
product of primes (in increasing order) and (2) there cannot be two different such expressions equal
to n.
We will prove these by induction. For the basis, we see that 2 can be written as a product of primes
(namely 2 = 2) and, since 2 is the smallest prime, this is the only way to write 2 as a product of
primes.
For the inductive step, suppose every integer from 2 to k can be written uniquely as a product of
primes. Now consider the number k + 1. We consider two cases:
(1) If k + 1 is prime, then k + 1 is already written as a product of primes, and that is clearly the
only such expression.
(2) If k + 1 is composite, then k + 1 = ab where a and b are numbers between 2 and k. By the
inductive hypothesis, a and b each have prime factorizations, and multiplying them together gives a
prime factorization of k + 1. If k + 1 had two different prime factorizations, then matching and
cancelling a common prime from the two (using Theorem 24.2) would produce two different prime
factorizations of a number between 2 and k. That contradicts the inductive assumption. We conclude that the
prime factorization of k + 1 is unique.
We can apply the Fundamental Theorem of Arithmetic to the problem of counting the number
of positive divisors of an integer greater than 1. For example, consider the integer 12 = 2^2 · 3. It
follows from the Fundamental Theorem that the positive divisors of 12 must look like 2^a · 3^b where
a = 0, 1, 2 and b = 0, 1. So there are six positive divisors of 12:
2^0 · 3^0 = 1    2^1 · 3^0 = 2    2^2 · 3^0 = 4    2^0 · 3^1 = 3    2^1 · 3^1 = 6    2^2 · 3^1 = 12
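The same counting idea is easy to automate. Here is a Python sketch that lists the positive divisors of a number from its prime factorization, supplied (an assumption about the input format, not anything from the text) as a dictionary of prime: exponent pairs:

    from itertools import product

    def divisors_from_factorization(factorization):
        # Each divisor corresponds to one choice of exponent between 0 and e
        # for every prime p with exponent e, so there are (e1+1)(e2+1)... divisors.
        primes = list(factorization)
        divisors = []
        for exponents in product(*(range(e + 1) for e in factorization.values())):
            d = 1
            for p, e in zip(primes, exponents):
                d *= p ** e
            divisors.append(d)
        return sorted(divisors)

    print(divisors_from_factorization({2: 2, 3: 1}))   # [1, 2, 3, 4, 6, 12], the six divisors of 12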
Exercises
Problems
Suppose a = p1^e1 · p2^e2 · · · pn^en and b = p1^f1 · p2^f2 · · · pn^fn, where p1 < p2 < · · · < pn are primes.
Note that the same list of primes is used for both factorizations, so we will need to allow exponents
to be 0 or more. For example, for a = 12 and b = 15 we will write 12 = 2^2 · 3^1 · 5^0 and
15 = 2^0 · 3^1 · 5^1.
Prove: gcd(a, b) = p1^min(e1,f1) · p2^min(e2,f2) · p3^min(e3,f3) · · · pn^min(en,fn).
Example: gcd(12, 15) = 2^min(2,0) · 3^min(1,1) · 5^min(0,1) = 2^0 · 3^1 · 5^0 = 3.
Chapter 25
Linear Diophantine Equations
Al buys some books at $25 each, and some magazines at $3 each. If he spent a total of $88, how
many books and how many magazines did Al buy? At first glance, it does not seem we are given
enough information to solve this problem. Letting x be the number of books Al bought, and y
the number of magazines, then the equation we need to solve is 25x + 3y = 88. Thinking back to
college algebra days, we recognize 25x + 3y = 88 as the equation of a straight line in the plane,
and any point along the line will give a solution to the equation. For example, x = 0 and y = 88/3 is
one solution. But, in the context of this problem, that solution makes no sense because Al cannot
buy a fraction of a magazine. We need a solution in which x and y are both integers. In fact, we
need even a little more care than that. The solution x = −2 and y = 46 is also unacceptable since
Al cannot buy a negative number of books. So we really need solutions in which x and y are both
nonnegative integers. The problem can be solved by brute force: If x = 0, y is not an integer. If
x = 1, then y = 21, so that is one possibility. If x = 2, y is not an integer. If x = 3, y is not an
integer. And, if x is 4 or more, then y would have to be negative. So, it turns out there is only one
possible solution: Al bought one book, and 21 magazines.
http://www.merriam-webster.com/audio.php?file=diopha01&word=Diophantineequation
In this chapter we will learn how to easily find the solutions to all linear Diophantine equations:
ax + by = c where a, b, c are given integers. To show some of the subtleties of such problems, here
are two more examples:
(1) Al buys some books at $24 each, and some magazines at $3 each. If he spent a total of $875,
how many books and how many magazines did Al buy? For this question we need to solve
the Diophantine equation 24x + 3y = 875. In this case there are no possible solutions. For
any integers x and y, the left-hand side will be a multiple of 3 and so cannot be equal to 875
which is not a multiple of 3.
(2) Al buys some books at $26 each, and some magazines at $3 each. If he spent a total of $157,
how many books and how many magazines did Al buy? Setting up the equation as before,
we need to solve the Diophantine equation 26x + 3y = 157. A little trial and error, testing
x = 0, 1, 2, 3, and so on shows there are two possible answers this time: (x, y) ∈ {(2, 35), (5, 9)}.
Determining all the solutions to ax + by = c is closely connected with the idea of gcd’s. One
connection is theorem 23.2. Here is how solutions of ax + by = c are related.
Theorem 25.1. ax + by = c has a solution in the integers if and only if gcd(a, b) divides c.
So, for example, 9x + 6y = 211 has no solutions (in the integers) while 9x + 6y = 213 does
have solutions. To find a solution to the last equation, apply the Extended Euclidean Algorithm
method to write the gcd(9, 6) as a linear combination of 9 and 6 (actually, this one is easy to do
by sight): 9 · 1 + 6 · (−1) = 3, then multiply both sides by 213/gcd(9, 6) = 213/3 = 71 to get
(71)9 + (−71)6 = 213. That shows x = 71, y = −71 is a solution to 9x + 6y = 213.
But that is only one possible solution. When a linear Diophantine equation has one solution it will
have infinitely many. In the example above, another solution will be x = 49 and y = −38. Checking
shows that (49)9 + (−38)6 = 213.
There is a simple recipe for all solutions, once one particular solution has been found.
Theorem 25.2. Let d = gcd(a, b). Suppose x = s and y = t is one solution to ax + by = c. Then
all solutions are given by
x = s + k(b/d)   and   y = t − k(a/d),   where k is any integer.
Proof. It is easy to check that all the displayed x, y pairs are solutions simply by plugging in:
a(s + k(b/d)) + b(t − k(a/d)) = as + abk/d + bt − abk/d = as + bt = c.
Checking that the displayed formulas for x and y give all possible solutions is trickier. Let's assume
a ̸= 0. Now suppose x = u and y = v is a solution. That means au + bv = c = as + bt. It follows
that a(u − s) = b(t − v). Divide both sides of that equation by d to get
(a/d)(u − s) = (b/d)(t − v).
That equation shows (a/d) | (b/d)(t − v). Since a/d and b/d are relatively prime, we conclude that (a/d) | (t − v).
Let's say k(a/d) = t − v. Rearrange that equation to get
v = t − k(a/d).
Next, replacing t − v in the equation (a/d)(u − s) = (b/d)(t − v) with k(a/d) gives
(a/d)(u − s) = (b/d)(t − v) = k(b/d)(a/d).
Since a/d ̸= 0, we can cancel that factor. So, we have
u − s = k(b/d),   so that   u = s + k(b/d).
Example 25.3. Let's find all the solutions to 221x + 91y = 39.
Using the Extended Euclidean Algorithm method, we learn that gcd(221, 91) = 13 and since 13|39,
the equation will have infinitely many solutions. The Extended Euclidean Algorithm table provides
a linear combination of 221 and 91 equal to 13: 221(−2) + 91(5) = 13. Multiply both sides by 3 and
we get 221(−6) + 91(15) = 39. So one particular solution to 221x + 91y = 39 is x = −6, y = 15.
According to the theorem above, all solutions are given by
x = −6 + k(91/13) = −6 + 7k   and   y = 15 − k(221/13) = 15 − 17k,
where k is any integer.
Example 25.4. Armand buys some books for $25 each and some cd’s for $12 each. If he spent a
total of $331, how many books and how many cd’s did he buy?
Let x = the number of books, and y = the number of cd’s. We need to solve 25x + 12y = 331.
The gcd of 25 and 12 is 1, and there is an obvious linear combination of 25 and 12 which equals
1: 25(1) + 12(−2) = 1. Multiplying both sides by 331 gives 25(331) + 12(−662) = 331. So one
particular solution to 25x + 12y = 331 is x = 331 and y = −662. Of course, that won’t do for an
answer to the given problem since we want x, y ≥ 0. To find the suitable choices for x and y, let’s
look at all the possible solutions to 25x + 12y = 331. We have that
x = 331 + 12k   and   y = −662 − 25k,   where k is any integer.
For x ≥ 0 we need k ≥ −331/12 ≈ −27.6, and for y ≥ 0 we need k ≤ −662/25 ≈ −26.5.
The only option for k is k = −27, and so we see Armand bought x = 331 + 12(−27) = 7 books and
y = −662 − 25(−27) = 13 cd’s.
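Searching the family of solutions for the nonnegative ones is easy to mechanize. Here is a Python sketch, assuming a, b > 0 and that one particular solution (s, t) of ax + by = c is already known (for example from the Extended Euclidean Algorithm); the function name is ours:

    from math import gcd, ceil, floor

    def nonnegative_solutions(a, b, c, s, t):
        # All solutions are x = s + k*(b/d), y = t - k*(a/d) for integer k (Theorem 25.2).
        # Requiring x >= 0 and y >= 0 pins k to a finite range.  Assumes a, b > 0 and
        # that (s, t) really does satisfy a*s + b*t == c.
        d = gcd(a, b)
        b_d, a_d = b // d, a // d
        k_min = ceil(-s / b_d)         # from s + k*b_d >= 0
        k_max = floor(t / a_d)         # from t - k*a_d >= 0
        return [(s + k * b_d, t - k * a_d) for k in range(k_min, k_max + 1)]

    print(nonnegative_solutions(25, 12, 331, 331, -662))   # [(7, 13)]: Armand's 7 books and 13 cd's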
Exercises
Exercise 25.5. Sal sold some ceramic vases for $59 each, and a number of bowls for $37 each. If
she took in a total of $4270, how many of each item did she sell?
Problems
Problem 25.3. Beth stocked her video store with a number of video game machines at $79 each,
and a number of video games at $41 each. If she spent a total of $6358, how many of each item did
she purchase?
Problem 25.4. If all you have are dimes and quarters, in how many ways can you pay a $7
bill?
(For example, one way would be 10 dimes and 24 quarters.)
Problem 25.5. How many integer solutions are there to the equation 11x + 7y = 137 if the value
of x has to be at least −15 and not more than 20.
Problem 25.6. Determine all integer solutions to 5x − 7y = 99. (Watch that minus sign!)
Chapter 26
Modular Arithmetic
Karl Friedrich Gauss made the important discovery of modular arithmetic. Modular arithmetic
is also called clock arithmetic, and we are actually used to doing modular arithmetic all the time
(pun intended). For example, consider the question If it is 7 o’clock now, what time will it be in
8 hours?. Of course the answer is 3 o’clock, and we found the answer by adding 7 + 8 = 15, and
then subtracting 12 to get 15 − 12 = 3. Actually, we are so accustomed to that sort of calculation,
we probably just immediately blurt out the answer without stopping to think how we figured it
out. But trying a less familiar version of the same sort of problem makes it plain exactly what we
needed to do to answer such questions: If it is 7 o’clock now, what time will it be in 811 hours?
To find out, we add 7 + 811 = 818, then divide that by 12, getting 818 = (68)(12) + 2, and so we
conclude it will be 2 o’clock. The general rule is: to find the time h hours after t o’clock, add h + t,
divide by 12 and take the remainder.
There is nothing special about the number 12 in the above discussion. We can imagine a clock with
any integer number of hours (greater than 1) on the clock. For example, consider a clock with 5
hours. What time will it be 61 hours after 2 o'clock? Since 61 + 2 = 63 = (12)(5) + 3, the answer
is 3 o’clock.
In the general case, if we have a clock with m hours, then the time h hours after t o’clock will be
the remainder when t + h is divided by m. So, the reason it is 2 o’clock 811 hours after 7 o’clock
is that
811 + 7 ≡ 2 (mod 12)
This can all be expressed in more mathematical sounding language. The key is obviously the notion
of remainder. That leads to the following definition:
Definition 26.1. Given an integer m > 1, we say that two integers a and b are congruent modulo
m, and write a ≡ b (mod m), in case a and b leave the same remainder when divided by m.
Theorem 26.2. For any integer m > 1, congruence modulo m is an equivalence relation on Z.
Proof. The relation is clearly reflexive since every number leaves the same remainder as itself when
divided by m. Next, if a and b leave the same remainder when divided by m, so do b and a, so
the relation is symmetric. Finally, if a and b leave the same remainder, and b and c leave the same
remainder, then a and c leave the same remainder, and so the relation is transitive.
Theorem 26.3. a ≡ b (mod m) if and only if m|(a − b).
Proof. Suppose a ≡ b (mod m). That means a and b leave the same remainder, say r, when divided
by m. So we can write a = jm + r and b = km + r. Subtracting the second equation from the first
gives a − b = (jm + r) − (km + r) = jm − km = (j − k)m, and that shows m|(a − b).
For the converse, suppose m|(a − b). Divide a, b by m to get quotients and remainders: a = jm + r
and b = km + s, where 0 ≤ r, s < m. We need to show that r = s. Subtracting the second equation
from the first gives a − b = m(j − k) + (r − s). Since m divides a − b and m divides m(j − k), we
can conclude m divides (a − b) − m(j − k) = r − s. Now since 0 ≤ r, s < m, the quantity r − s must
be one of the numbers m − 1, m − 2, · · · , 2, 1, 0, −1, −2, · · · − (m − 1). The only number in that list
that m divides is 0, and so r − s = 0. That is, r = s, as we wanted to show.
The equivalence class of an integer a with respect to congruence modulo m will be denoted by [a],
or [a]m in case we are employing more than one number m as a modulus. In other words, [a] is
the set of all integers that leave the same remainder as a when divided by m. Or, another way
to say the same thing, [a] comprises all integers b such that b − a is a multiple of m. That means
b − a = km, or b = a + km.
That last version is often the easiest way to think about the integers that appear in [a]: start with
a and add and subtract any number of m’s. For example, the equivalence class of 7 modulo 11
would be
[7] = {· · · , −15, −4, 7, 18, 29, 40, · · · }.
We know that the distinct equivalence classes partition Z. Since dividing an integer by m leaves one
of 0, 1, 2, · · · , m − 1 as a remainder, we can conclude that there are exactly m equivalence classes
modulo m. In particular, [0], [1], [2], [3], ...[m − 1] is a list of all the different equivalence classes
modulo m. It is traditional when working with modular arithmetic to drop the [ ] symbols denoting
the equivalence classes, and simply write the representatives. So we would say, modulo m, there
are m numbers: 0, 1, 2, 3, · · · , m − 1. But keep in mind that each of those numbers really represents
a set, and we can replace any number in that list with another equivalent to it modulo m. For
example, we can replace the 0 by m. The list 1, 2, 3 · · · , m − 1, m still consists of all the distinct
values modulo m.
One reason the relation of congruence modulo m is useful is that addition and multiplication of
numbers modulo m acts in many ways just like arithmetic with ordinary integers.
Theorem 26.4. If a ≡ c (mod m), and b ≡ d (mod m), then a + b ≡ c + d (mod m) and ab ≡ cd
(mod m).
Proof. Suppose a ≡ c (mod m) and b ≡ d (mod m). Then there exist integers k and l with
a = c + km and b = d + lm. So a + b = c + km + d + lm = (c + d) + (k + l)m. This can be rewritten
as (a + b) − (c + d) = (k + l)m, where k + l ∈ Z. So a + b ≡ c + d (mod m). The other part is done
similarly.
Example 26.5. What is the remainder when 1103 + 112 is divided by 11? We can answer this
problem in two different ways. We could add 1103 and 112, and then divide by 11. Or, we could
determine the remainders when each of 1103 and 112 is divided by 11, then add those remainders
before dividing by 11. The last theorem promises us the two answers will be the same. In fact
1103 + 112 = 1215 = (110)(11) + 5 so that 1103 + 112 ≡ 5 (mod 11). On the other hand 1103 =
(100)(11) + 3 and 112 = (10)(11) + 2, so that 1103 + 112 ≡ 3 + 2 ≡ 5 (mod 11).
Example 26.6. A little more impressive is the same sort of problem with operation of multi-
plication: what is the remainder when (1103)(112) is divided by 11? The calculation looks like
(1103)(112) ≡ (3)(2) ≡ 6 (mod 11).
Example 26.7. For a really awe-inspiring example, let's find the remainder when 1103^112 is divided
by 11. In other words, we want to find x = 0, 1, 2, · · ·, 10 so that 1103^112 ≡ x (mod 11).
Now 1103^112 is a pretty big number (in fact, since log 1103^112 = 112 log 1103 = 340.7 · · ·, the number
has 341 digits). In order to solve this problem, let's start by thinking small: let's compute 1103^n,
for n = 1, 2, 3, · · ·:
1103 ≡ 3 (mod 11)
1103^2 ≡ 3^2 ≡ 9 (mod 11)
1103^3 ≡ 3 · 9 ≡ 27 ≡ 5 (mod 11)
1103^4 ≡ 3 · 5 ≡ 15 ≡ 4 (mod 11)
1103^5 ≡ 3 · 4 ≡ 12 ≡ 1 (mod 11)
Now that last equation is very interesting. It says that whenever we see 1103^5 we may just as well
write 1 if we are working modulo 11. And now we see there is an easy way to determine 1103^112
modulo 11:
1103^112 ≡ 1103^(5·22+2) ≡ (1103^5)^22 · 1103^2 ≡ 1^22 · 9 ≡ 9 (mod 11)
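Python's built-in three-argument pow carries out exactly this kind of modular exponentiation (reducing modulo 11 along the way), so the answer is easy to check:

    print(pow(1103, 112, 11))    # 9, agreeing with the computation above
    print(1103**112 % 11)        # also 9, but this one really builds the 341-digit number first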
The sort of computation in example 26.7 appears to be just a curiosity, but in fact the last sort of
example forms the basis of one version of public key cryptography. Computations of exactly that
type (but with much larger integers) are made whenever you log into a secure Internet site. It’s
reasonable to say that e-commerce owes its existence to the last theorem.
While modular arithmetic in many ways behaves like ordinary arithmetic, there are some differences
to watch for. One important difference is the familiar rule of cancellation: in ordinary arithmetic,
if ab = ac and a ̸= 0, then b = c. This rule fails in modular arithmetic. For example, 3 ̸≡ 0 (mod 6)
and (3)(5) ≡ (3)(7) (mod 6), but 5 ̸≡ 7 (mod 6).
Solving congruence equations is a popular sport. Just as with regular arithmetic with integers, if
we want to solve a + x ≡ b (mod m), we can simply set x ≡ b − a (mod m). So, for example,
solving 55 + x ≡ 11 (mod 6) we would get x ≡ 11 − 55 ≡ −44 ≡ 4 (mod 6).
Equations involving multiplication, such as ax ≡ b (mod m), are much more interesting. If the
modulus m is small, equations of this sort can be solved by trial-and-error: simply try all possible
choices for x. For example, testing x = 0, 1, 2, 3, 4, 5, 6 in the equation 4x ≡ 5 (mod 7), we see
x ≡ 3 (mod 7) is the only solution. The equation 4x ≡ 5 (mod 8) has no solutions at all. And the
equation 2x ≡ 4 (mod 6) has x ≡ 2, 5 (mod 6) for solutions.
Trial-and-error is not a suitable approach for large values of m. There is a method that will produce
all solutions to ax ≡ b (mod m). It turns out that such equations are really just linear Diophantine
equations in disguise, and that is the key to the proof of the following theorem.
Theorem 26.8. The congruence ax ≡ b (mod m) can be solved for x if and only if d = gcd(a, m)
divides b.
Proof. Solving ax ≡ b (mod m) is the same as finding x so that m|(ax − b) and that’s the same as
finding x and y so that ax−b = my. Rewriting that last equation in the form ax+(−m)y = b, we can
see solving ax ≡ b (mod m) is the same as solving the linear Diophantine equation ax + (−m)y = b.
We know that equation has a solution if and only if gcd(a, m)|b, so that proves the theorem.
This is why 4x ≡ 5 (mod 7) has a solution: gcd(4, 7) = 1 and 1|5. And why 4x ≡ 5 (mod 8)
has no solutions: gcd(4, 8) = 4 and 4 ∤ 5. The theorem also shows that 2x ≡ 4 (mod 6) has
a solution since gcd(2, 6) = 2 and 2|4. But why does this last equation have two solutions? The
answer to that is also provided by the results concerning linear Diophantine equations.
Let gcd(a, m) = d. The solutions to ax ≡ b (mod m) are the same as the solutions for x to
ax + (−m)y = b. Supposing that last equation has a solution with x = s, then we know all possible
choices of x are given by x = s + k(m/d). So if x = s is one solution to ax ≡ b (mod m), then all
solutions are given by x = s + k(m/d), where k is any integer. In other words, all solutions are given
by x ≡ s (mod m/d), and so there are d solutions modulo m.
Example 26.9. Let's find all the solutions to 2x ≡ 4 (mod 6). Since x = 2 is obviously one
solution, we see all solutions are given by x = 2 + k(6/2) = 2 + 3k, where k is any integer. When
k = 0, 1 we get x = 2, 5, and other values of k repeat these two modulo 6. Looking at the solutions
written as x = 2 + k(6/2) = 2 + 3k, we see that another way to express the solutions is x ≡ 2 (mod 3).
Here is a larger example: let's find all the solutions to 42x ≡ 35 (mod 91).
First we see that gcd(91, 42) = 7 and, since 7|35, the equation will have a solution. In fact, since
gcd(42, 91) = 7, there are going to be seven solutions modulo 91. All we need is to find one particular
solution, then the others will all be easy to determine. Again using the continued fraction method
(or just playing with 42 and 91 a little bit) we discover (42)(−2) + (91)(1) = 7 = gcd(42, 91).
Multiplying by 5 gives (42)(−10) + (91)(5) = 35. Thus x = −10 is one solution to 42x ≡ 35
(mod 91). It follows that all solutions are given by x ≡ −10 (mod 91/gcd(42, 91)). That's the same
as x ≡ −10 (mod 13), or, even more neatly, x ≡ 3 (mod 13). In other words, the solutions are
3, 16, 29, 42, 55, 68, 81 modulo 91.
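The whole recipe fits in a few lines of Python. In the sketch below, d is found with math.gcd and one particular solution is found by a direct search (fine for modest moduli; for huge moduli the Extended Euclidean Algorithm would be used instead); the function name is ours:

    from math import gcd

    def solve_congruence(a, b, m):
        # Solve a*x ≡ b (mod m): solvable exactly when d = gcd(a, m) divides b,
        # and then there are d solutions modulo m, spaced m//d apart.
        d = gcd(a, m)
        if b % d != 0:
            return []
        s = next(x for x in range(m) if (a * x - b) % m == 0)   # one particular solution
        return [(s + k * (m // d)) % m for k in range(d)]

    print(solve_congruence(42, 35, 91))   # [3, 16, 29, 42, 55, 68, 81], as above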
Exercises
Exercise 26.1.
(a) On a military (24-hour) clock, what time is it 3122 hours after 16 hundred hours?
Exercise 26.3. In a listing of the five equivalence classes modulo 5, four of the values are 1211,
218, −100, and −3333. What are the possible choices for the fifth value?
Suppose d = gcd(a, m) and that x = s is one solution to ax ≡ b (mod m).
(a) Show that if ax ≡ b (mod m), then there is an integer r such that x = s + r(m/d).
(b) Show that if 0 ≤ r1 < r2 < d, then the numbers x1 = s + r1(m/d) and
x2 = s + r2(m/d) are not congruent modulo m.
Problems
Problem 26.1. Suppose we have a 52 card deck with the cards in order ace, 2, 3, . . . , queen,king for
clubs, then diamonds, then hearts, then spades from top to bottom. A step consists to taking the
top card and moving it to the bottom of the deck. We start with the ace of clubs as the top card.
After two steps, the top card is the 3 of clubs. What is the top card after 735 steps?
Problem 26.2. The marks on a combination lock are numbered 0 to 39. If the lock is at mark
19, and the dial is turned one mark clockwise, it will be at mark 18. If the lock is at mark 19 and
turned 137 marks clockwise, at what mark will it be?
Problem 26.4. Arrange the numbers −39, −27, −8, 11, 37, 68, 91
so they are in the order 0, 1, 2, 3, 4, 5, 6 modulo 7.
Problem 26.14. There is exactly one n between 0 and 19548 such that n ≡ 22 (mod 173) and
n ≡ 80 (mod 113). Determine that n.
Chapter 27
Integers in Other Bases
The usual way of writing integers is in terms of groups of ones (units), and groups of tens, and
groups of tens of tens (hundreds), and so on. Thus 237 stands for 7 units plus 3 tens and 2 hundreds,
or 2(10^2) + 3(10) + 7. This is the familiar decimal notation for numbers (deci = ten). But there is
really nothing special about the number ten here, and it could be replaced by any integer bigger
than one. That is, we could use say 7 the way 10 was used above to describe a number. Thus we
would specify how many units, how many 7's and 7^2's and 7^3's, and so on, are needed to make up
the number. When a number is expressed in this fashion with b in place of the 10, the result is
called the base-b expansion (or radix-b expansion) of the integer.
For example, the decimal integer 132 is made up of two 7^2's, four 7's and finally six units. Thus we
express the base ten number 132 as 246 in base 7, or as 246₇, the little 7 indicating the base. For
small numbers, with a couple of minutes practice, conversion from base 10 (decimal) to other bases,
and back again can be carried out mentally. For larger numbers, mental arithmetic will prove a
little awkward. Luckily there is a handy algorithm to do the conversion automatically.
For base 10 integers, we use the decimal digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. In general, for base b, the
digits will be 0, 1, 2, 3, · · ·, b − 1. So, for example, base 7 numbers use the digits 0, 1, 2, 3, 4, 5, 6.
Conversion from the base b expansion of a number to its decimal version is a snap. For example,
246₇ = 2 · 7^2 + 4 · 7 + 6 = 2 · 49 + 28 + 6 = 132.
That sort of computation is so easy because we have been practicing base 10 arithmetic for so many
years. If we were as good at arithmetic in some base b, then conversion from base 10 to base b
would be just as simple. But, lacking that comfort with base b arithmetic, we need to describe the
conversion algorithm from decimal to base b a little more formally. Here’s the idea.
Suppose we have a decimal number n that we want to convert to some base b. Let’s say the base b
expansion is d_k d_{k−1} · · · d_2 d_1 d_0, with each base b digit between 0 and b − 1. That means

n = d_k · b^k + d_{k−1} · b^{k−1} + · · · + d_2 · b^2 + d_1 · b + d_0.

Now divide n by b. The quotient is q = d_k · b^{k−1} + d_{k−1} · b^{k−2} + · · · + d_2 · b + d_1, and the remainder is the base b digit d_0 of n. So we have found the units digit in the base b expansion of n. If we repeat that process on the quotient q, the result is

q = (d_k · b^{k−2} + d_{k−1} · b^{k−3} + · · · + d_2) · b + d_1,

so the next base b digit, d_1, appears as the remainder. Continuing in this fashion, the base b
expansion is produced one digit at a time.
Briefly, to convert a positive decimal integer n to its base b representation, divide n by b, to find the
quotient and the remainder. That remainder will be the needed units digit. Then divide the quotient
by b again, to get a new quotient and a new remainder. That remainder gives the next base b digit.
Then divide the new quotient by b again, and so on. In this way the base b digits are produced one after the other.
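The repeated-division procedure just described is easy to express as a short program. Here is a minimal Python sketch (the function name to_base is ours); it collects the remainders and then reads them off in reverse order:

    def to_base(n, b):
        """Return the base-b digits of the non-negative integer n, most significant first."""
        digits = []
        while n > 0:
            n, r = divmod(n, b)   # r is the next remainder, i.e. the next base-b digit
            digits.append(r)
        return digits[::-1] or [0]

    print(to_base(132, 7))     # [2, 4, 6]            -> 246 in base 7
    print(to_base(14567, 5))   # [4, 3, 1, 2, 3, 2]   -> 431232 in base 5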
Example 27.1. To convert 14567 from decimal to base 5, the steps are:
14567 = 2913 · 5 + 2
2913 = 582 · 5 + 3
582 = 116 · 5 + 2
116 = 23 · 5 + 1
23 = 4 · 5 + 3
4 = 0 · 5 + 4.
Reading the remainders from bottom to top, 14567 = 431232 in base 5.

Now consider converting a number given in another base, say n = 3355 in base 7, to base 5. The least confusing way to do such a problem would be to convert n from base 7 to base 10, and
then convert the base 10 expression for n to base 5. This method allows us to do all our work in
base 10 where we are comfortable. The computations start with:
n = 3 · 7^3 + 3 · 7^2 + 5 · 7 + 5 = 1216.
Then, we calculate:
1216 = 243 · 5 + 1
243 = 48 · 5 + 3
48 = 9 · 5 + 3
9=1·5+4
1 = 0 · 5 + 1.
Reading the remainders from bottom to top, n = 14331 in base 5. An alternative method, not for the faint of heart, is to convert directly from base 7 to base 5, skipping
the middle man, base 10. In this method, we simply divide n by 5, take the remainder, getting the
units digit, then divide the quotient by 5 to get the next digit, and so on, just as described above.
The rub is that the arithmetic must all be done in base 7, and we don’t know the base 7 times table
very well. For example, in base 7, 3 · 5 = 21 is correct since three 5’s add up to two 7’s plus one
more.
The computation would now look like this (all the little 7's indicating base 7 are suppressed for readability):

3355 = 465 · 5 + 1
465 = 66 · 5 + 3
66 = 12 · 5 + 3
12 = 1 · 5 + 4
1 = 0 · 5 + 1.

Reading the remainders from bottom to top gives n = 14331 in base 5 once again.
Particularly important in computer science applications of discrete mathematics are the bases 2
(called binary), 8 (called octal) and 16 (called hexadecimal, or simply hex). Thus the decimal
number 75 would be 1001011_2 (binary), 113_8 (octal) and 4B_16 (hex). Note that for hex numbers,
symbols will be needed to represent hex digits for 10, 11, 12, 13, 14 and 15. The letters A, B, C, D, E
and F are traditionally used for these digits.
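The binary, octal, and hex forms of 75 can be double-checked with Python's built-in base conversions (a quick sketch; nothing here is specific to this text):

    n = 75
    print(format(n, 'b'), format(n, 'o'), format(n, 'X'))    # 1001011 113 4B
    print(int('1001011', 2), int('113', 8), int('4B', 16))   # 75 75 75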
Exercises
Exercise 27.2. Convert the decimal integer 11714 to bases 2, 6, and 16. Remember to use
A, B, · · · , F to represent base 16 digits from 10 to 15, if needed.
Exercise 27.3. Complete the base 7 multiplication table started below. (All entries are to be written in base 7; for example, 3 · 3 = 12 and 4 · 4 = 22 in base 7.)

 ×   1   2   3   4   5   6
 1   1   2   3   4   5   6
 2   2   4
 3          12
 4              22
Exercise 27.4. Make base 6 addition and multiplication tables similar to the base 7 multiplication
table of exercise 27.3.
Exercise 27.5. (For those with a sweet tooth for punishment!) Use the Euclidean algorithm
to compute gcd(5122_7, 1312_7) without converting the numbers to base 10.
Problems
Problem 27.2. Convert the decimal integer 3177 to bases 2, 6, and 16. Use A, B, · · · , F to
represent base 16 digits from 10 to 15 as needed.
Problem 27.3. Make base 8 addition and multiplication tables similar to the base 7 multiplication
table of exercise 27.3.
Problem 27.4. Using the table in problem 27.3, add 126_8 + 457_8.
Problem 27.5. Using the table in problem 27.3, multiply (126_8)(457_8).
Chapter 28

The Two Fundamental Counting Principles
The next few chapters will deal with the topic of combinatorics: the art of counting. By counting
we mean determining the number of different ways of arranging objects in certain patterns or the
number of ways of carrying out a sequence of tasks. For example, suppose we want to count the
number of ways of making a bit string of length two. Such a problem is small enough that the
possible arrangements can be counted by brute force. In other words, we can simply make a list
of all the possibilities: 00, 01, 10, 11. So the answer is four. If the problem were to determine the
number of bit strings of length fifty, the brute force method loses a lot of its appeal. For problems
where brute force counting is not a reasonable alternative, there are a few principles we can apply
to aid in the counting. In fact, there are just two basic principles on which all counting ultimately
rests.
Throughout this chapter, all sets mentioned will be finite sets, and if A is a set, |A| will denote the
number of elements in A.
The sum rule says that if the sets A and B are disjoint, then
|A ∪ B| = |A| + |B|.
Example 28.1. For example, if A = {a, b, c} and B = {j, k, l, m, n}, then |A| = 3, |B| = 5, and,
sure enough,
|A ∪ B| = |{a, b, c, j, k, l, m, n}| = 8 = 3 + 5.
Care must be used when applying the sum principle that the sets are disjoint. If A = {a, b, c} and
B = {b, c, d}, then |A ∪ B| = 4, and not 6.
Example 28.2. As another example of the sum principle, if we have a collection of 3 dogs and 5
cats, then we can select one of the animals in 8 ways.
The sum principle is often expressed in different language: If we can do task 1 in m ways and task
2 in n ways, and the tasks are independent (meaning that both tasks cannot be done at the same
time), then there are m + n ways to do one of the two tasks. The independence of the tasks is the
analog of the disjointness of the sets in the set version of the sum rule.
A serious type of error is trying to use the sum rule for tasks that are not independent. For instance,
suppose we want to know in how many different ways we can select either a deuce or a six from an
ordinary deck of 52 cards. We could let the first task be the process of selecting a deuce from the
deck. That task can be done in 4 ways since there are 4 deuces in the deck. For the second task,
we will take the operation of selecting a six from the deck. Again, there are 4 ways to accomplish
that task. Now these tasks are independent since we cannot simultaneously pick a deuce and a six
from the deck. So, according to the sum rule, there are 4 + 4 = 8 ways of selecting one card from
a deck, and having that card be either a deuce or a six.
Now consider the similar sounding question: In how many ways can we select either a deuce or a
diamond from a deck of 52 cards? We could let the first task again be the operation of selecting
a deuce from the deck, with 4 ways to carry out that task. And we could let the second task be
the operation of selecting a diamond from the deck, with 13 ways to accomplish that. But in this
case, the answer to the question is not 4 + 13 = 17, since these tasks are not independent. It is
possible to select a card that is both a deuce and a diamond. So the sum rule cannot be used.
What is the correct answer? Well, there are 13 diamonds, and there are 3 deuces besides the two
of diamonds, and so there are actually 16 cards in the deck that are either a deuce or a diamond.
That means there are 16 ways to select a card from a deck and have it turn out to be either a deuce
or a diamond.
The sum rule can be extended to the case of more than two sets (or more than two tasks): If
A1 , A2 , A3 , · · · , An is a collection of pairwise disjoint sets, then |A1 ∪ A2 ∪ A3 ∪ · · · ∪ An | = |A1 | +
|A2 | + |A3 | + · · · + |An |. Or, in terms of tasks: If task 1 can be done in k1 ways, and task 2 in
k2 , and task 3 in k3 ways, and so on, until task n can be done in kn ways, and if the tasks are all
independent, then we can do one task in k1 + k2 + k3 + · · · + kn ways.
Example 28.3. For example, if we own three cars, two bikes, a motorcycle, four pairs of roller
skates, and two scooters, then we can select one of these modes of transportation in 3+2+1+4+2 =
12 ways.
The sum rule is related to the logical connective or. That is reasonable since the sum rule counts
the number of elements in the set A ∪ B = { x | x ∈ A or x ∈ B }. In terms of tasks, the sum rule
counts the number of ways to do either task 1 or task 2. Generally speaking, when the word or
occurs in a counting problem, the sum rule is the tool to use.
The logical connective and is related to the second fundamental counting principle: the product
rule. The product rule says:
|A × B| = |A| · |B|.
An explanation of this is that A × B consists of all ordered pairs (a, b) where a ∈ A and b ∈ B.
There are |A| choices for a and then |B| choices for b.
In terms of tasks, the product rule says that if task 1 can be done in m ways and task 2 can be
done in n ways after task 1 has been done, then there are mn ways to do both tasks, the first then
the second. Here the relation with the logical connective and is also obvious. We need to do task 1
and task 2. Generally speaking, the appearance of and in a counting problem suggests the product
rule will come into play.
As with the sum rule, the product rule can be used for situations with more than two sets or more
than two tasks. In terms of sets, the product rule reads |A1 × A2 × · · · × An| = |A1| · |A2| · · · |An|. In
terms of tasks, it reads, if task 1 can be done in k1 ways, and for each of those ways, task 2 can be
done in k2 ways, and for each of those ways, task 3 can be done in k3 ways, and so on, until for each
of those ways, task n can be done in kn ways, then we can do task 1 followed by task 2 followed by
task 3, etc, followed by task n in k1 k2 k3 · · · kn ways. That sounds worse than it really is.
Example 28.4. How many bit strings are there of length five?
Solution. We can think of task 1 as filling in the first (right hand) position, task 2 as filling in the
second position, and so on. We can argue that we have two ways to do task 1, and then two ways
to do task 2, and then two ways to do task 3, and then two ways to do task 4, and then two ways
to do task 5. So, by the product rule, there are 2 · 2 · 2 · 2 · 2 = 2^5 = 32 ways to do all five tasks, and
so there are 32 bit strings of length five.
The same reasoning shows that, in general, there are 2^n bit strings of length n, when n ≥ 0.
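For small lengths the product-rule count can be confirmed by simply listing the bit strings. A short Python sketch (purely illustrative):

    from itertools import product

    for n in range(6):
        strings = list(product('01', repeat=n))   # all bit strings of length n
        assert len(strings) == 2 ** n
    print(len(list(product('01', repeat=5))))      # 32, as in the example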
Example 28.5. Suppose we are buying a car with five choices for the exterior color and three
choices for the interior color. Then there is a total of 3 · 5 = 15 possible color combinations that
we can choose from. The first task is to select an exterior color, and there are 5 ways to do that.
The second task is to select an interior color, and there are 3 ways to do that. So the product rule
says there are 15 ways total to do both tasks. Notice that there is no requirement of independence
of tasks when using the product rule. However, also notice that the number of ways of doing the
second task must be the same no matter what choice is made for doing the first task.
Example 28.6. For another, slightly more complicated, example of the product rule in action,
suppose we wanted to make a two-digit number using the digits 1, 2, 3, 4, 5, 6, 7, 8, and 9. How
many different such two-digit numbers could we form? Let’s make the first task filling in the left
digit, and the second task filling in the right digit. There are 9 ways to do the first task. And, no
matter how we do the first task, there are 9 ways to do the second task as well. So, by the product
rule, there are 9 · 9 = 81 possible such two-digit numbers.
Example 28.7. Now, let’s change the problem in example 28.6 a little bit. Suppose we wanted
two-digit numbers made up of those same nine digits, but we do not want to use a digit more than
once in any of the numbers. In other words, 37 and 91 are OK, but we do not want to count 44 as
a possibility. We can still make the first task filling in the left digit, and the second task filling in
the right digit. And, as before, there are 9 ways to do the first task. But now, once the first task
has been done, there are only 8 ways to do the second task, since the digit used in the first task is
no longer available for doing the second task. For instance, if the digit 3 was selected in the first
task, then for the second task, we will have to choose from the eight digits 1, 2, 4, 5, 6, 7, 8, and 9.
So, according to the product rule, there are 9 · 8 = 72 ways of building such a number.
No matter in what way the first task was done, there are always 8 ways to do the second task. What if you chose to pick the second digit first?
Example 28.8. Just for fun, here is another way to see the answer in example 28.7 is 72. We saw
above that there are 81 ways to make a two-digit number when we allow repeated digits. But there
are 9 two digit numbers that do have repeated digits (namely 11, 22, · · · , 99). That means there
must be 81 − 9 = 72 two-digit numbers without repeated digits.
The trick we used in example 28.8 looks like a new counting principle, but it is really the sum rule
being applied in a tricky way. Here’s the idea. Call the set of all the two-digit numbers (not using
0) T , call the set with no repeated digits N , and call the set with repeated digits R. By the sum
rule, |T | = |N | + |R|, so |N | = |T | − |R|. This is a very common trick.
Generally, suppose we are interested in counting some arrangements, let’s call them the Good
arrangements. But it is not easy for some reason to count the Good arrangements directly. So,
instead, we count the Total number of arrangements, and subtract the number of Bad arrangements: Good = Total − Bad.
Example 28.9. By a word of length five, we will mean any string of five letters from the 26 letter
alphabet. How many words contain at least one vowel? The vowels are a, e, i, o, u.
By the product rule, there is a total of 26^5 possible words of length five. The bad words are made up of only the 21 non-vowels. So, by the sum rule, the number of good words is 26^5 − 21^5.
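The Good = Total − Bad computation can be sanity-checked by brute force on a shorter word length before trusting the length-five formula; the choice of length three below is ours, purely to keep the enumeration small. A Python sketch:

    from itertools import product
    from string import ascii_lowercase

    vowels = set('aeiou')
    # Brute force: words of length 3 containing at least one vowel.
    count = sum(1 for w in product(ascii_lowercase, repeat=3) if vowels & set(w))
    assert count == 26**3 - 21**3   # Good = Total - Bad for length three
    print(26**5 - 21**5)            # the answer for length five: 7797275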
As in example 28.9, most interesting counting problems involve a combination of both the sum and
product rules.
Example 28.10. Suppose we wanted to count the number of different possible bit strings of length
five that start with either three 0’s or with two 1’s. Recall that a bit string is a list of 0’s and 1’s,
and the length of the bit string is the total number of 0’s and 1’s in the list. So, here are some bit
strings that satisfy the stated conditions: 00001, 11111, 11011, and 00010. On the other hand, the
bit strings 00110 and 10101 do not meet the required condition.
To do this problem, let’s first count the number of good bit strings that start with three 0’s. In this
case, we can think of the construction of such a bit string as doing five tasks, one after the other,
filling in the leftmost bit, then the next one, then the third, the next, and finally the last bit. There
is only one way to do the first three tasks, since we need to fill in 0’s in the first three positions.
But there are two ways to do the last two tasks, and so, according to the product rule there are
1 · 1 · 1 · 2 · 2 = 4 bit strings of length five starting with three 0’s. Using the same reasoning, there
are 1 · 1 · 2 · 2 · 2 = 8 bit strings of length five starting with two 1’s. Now, a bit string cannot both
start with three 0’s and also with two 1’s, (in other words, starting with three 0’s and starting with
two 1’s are independent). And so, according to the sum rule, there will be a total of 4 + 8 = 12 bit
strings of length five starting with either three 0’s or two 1’s.
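Because these strings are so short, the answer 12 is also easy to confirm directly (a sketch):

    from itertools import product

    strings = [''.join(s) for s in product('01', repeat=5)]
    good = [s for s in strings if s.startswith('000') or s.startswith('11')]
    print(len(good))   # 12, matching 4 + 8 from the sum rule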
Example 28.11. How many words of six letters (repeats OK) contain exactly one vowel?
Solution. Let's break the construction of a good word down into a number of tasks: first choose which of the six positions holds the vowel (6 ways), then choose which vowel goes there (5 ways), then fill each of the remaining five positions with any of the 21 consonants (21^5 ways). By the product rule, there are 6 · 5 · 21^5 such words.
Example 28.12. Count the number of strings on license plates which either consist of three capital
English letters, followed by three digits, or consist of two digits followed by four capital English
letters.
Solution. Let A be the set of strings which consist of three capital English letters followed by three
digits, and B be the set of strings which consist of two digits followed by four capital English letters.
By the product rule |A| = 26^3 · 10^3 since there are 26 capital English letters and 10 digits. Also by the product rule |B| = 10^2 · 26^4. Since A ∩ B = ∅, by the sum rule the answer is 26^3 · 10^3 + 10^2 · 26^4.
In the previous examples we might continue on with the arithmetic. For instance, in the last one, example 28.12, using the distributive law on our answer to factor out common terms, we see |A ∪ B| = 10^2 · 26^3 (10 + 26) is an equivalent answer. This, in turn, simplifies to |A ∪ B| = 10^2 · 26^3 · 36, and that gives

|A ∪ B| = 100 · 17576 · 36 = 63,273,600.

Of all of these answers the most valuable is probably 26^3 · 10^3 + 10^2 · 26^4, since the form of the
answer is indicative of the manner of solution. We can readily observe that the sum rule
was applied to two disjoint subcases. For each subcase the product rule was applied to compute
the intermediate answer. As a general rule, answers to counting problems should be left in this
uncomputed form.
Exercises
Exercise 28.1. To meet the science requirement a student must take one of the following courses:
a choice of 5 biology courses, 4 physics courses, or 6 chemistry courses. In how many ways can the
one course be selected?
Exercise 28.2. Using the data of exercise 28.1, a student has decided to take one biology, one physics,
and one chemistry course. How many different such selections are possible?
Exercise 28.3. A serial code is formed in one of three ways: (1) two letters followed by two digits,
or (2) three letters followed by one digit, or (3) four letters. How many different codes are there?
(Unless otherwise indicated, letters will mean upper case letters chosen from the usual 26-letter
alphabet and digits are selected from {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}.)
Exercise 28.4. How many words of length six are there if letters may be repeated? (Examples:
BBBXBB, ABATBC are OK).
Exercise 28.5. How many words of length six are there if letters may not be repeated? (Examples:
BBBBXB, ABATJC are bad but ABXHYR is OK).
(a) How many ways can a student complete the test if every question must be answered?
(b) How many ways can a student complete the test if questions can be left unanswered?
Exercise 28.7. How many binary strings of length less than or equal to nine are there?
Exercise 28.10. How many nine-letter words contain at least two A’s?
Problems
Problem 28.1. My piggy bank contains 20 pennies, 4 nickels, 7 dimes, and 2 quarters. In how
many ways can I select one coin?
Problem 28.2. My piggy bank contains 20 pennies, 4 nickels, 7 dimes, and 2 quarters. In how
many ways can I select four coins, one of each value?
Problem 28.3. A multiple choice test contains 10 questions. There are four possible answers for
each question.
(a) How many ways can a student complete the test if every question must be answered?
(b) How many ways can a student complete the test if questions can be left unanswered?
Problem 28.4. Computer ID’s are length seven strings made up of any combination of seven
different letters and digits. How many different ID’s are there?
Problem 28.5. Computer ID’s are length seven strings made up of any combination of seven
letters and digits, with repeats allowed. How many different ID’s are there?
Problem 28.6. A code word is either a sequence of three letters followed by two digits or two letters
followed by three digits. (Unless otherwise indicated, letters will mean upper case letters chosen
from the usual 26-letter alphabet and digits are selected from {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}.) How many
different code words are possible?
Problem 28.7. Code words consist of five letters followed by five digits. How many code words
contain at least one X?
Problem 28.8. Code words consist of five letters followed by five digits. How many code words
contain exactly one X?
Problem 28.9. Code words consist of five letters followed by five digits. How many code words
contain exactly two X’s?
Problem 28.10. How many bit strings of length ten begin and end with 1’s?
Problem 28.11. How many bit strings of length at least two but no more than ten begin and end
with 1’s?
Chapter 29

Permutations and Combinations
By a permutation of a set of objects we mean a listing of the objects of the set in a specific order.
For example, there are six possible permutations of the set A = {a, b, c}. They are

abc, acb, bac, bca, cab, cba.

The product rule explains why there are six permutations of A: there are 3 choices for the first
letter, once that choice has been made there are 2 choices for the second letter, and finally that
leaves 1 choice for the last letter. So the total number of permutations is 3 · 2 · 1 = 6.
A set with n elements is called an n-set. We have just shown that a 3-set has 6 permutations. The
same reasoning shows that an n-set has n · (n − 1) · (n − 2) · · · 2 · 1 = n! permutations. So the total
number of different ways to arrange a deck of cards is 52!, a number with 68 digits:
80658175170943878571660636856403766975289505440883277824000000000000.
Instead of forming a permutation of all the elements of an n-set, we might consider the problem of
first selecting some of the elements of the set, say r of them, and then forming a permutation of
just those r elements. In that case we say we have formed an r-permutation of the n-set. All the
possible 2-permutations of the 4-set A = {a, b, c, d} are
ab, ac, ad, ba, bc, bd, ca, cb, cd, da, db, dc
The number of r-permutations of an n-set is denoted by P (n, r). Counting the list above, P (4, 2) = 12.
The product rule provides a simple formula for P (n, r). There are n choices for the first element,
and once that choice has been made, there are n − 1 choices for the second element, then n − 2 for
the third, and so on, until finally, there are n − (r − 1) = n − r + 1 choices for the rth element. So
P (n, r) = n(n − 1) · · · (n − r + 1). That expression can be written more neatly as follows:

P (n, r) = n! / (n − r)!.
Example 29.1. As an example, the number of ways of selecting a president, vice-president, secre-
tary, and treasurer from a group of 20 people is P (20, 4) (assuming no person can hold more than
one office). If you want the actual numerical value, it is 20!/(20 − 4)! = 20 · 19 · 18 · 17 = 116280, but the best way to write the answer in most cases would be just P (20, 4) = 20!/16!, and skip the numerical computations.
Example 29.2. How many one-to-one functions are there from a 5-set to a 7-set?
While this question doesn’t sound on the surface like a problem of permutations, it really is. Suppose
the 5-set is A = {1, 2, 3, 4, 5} and the 7-set is B = {a, b, c, d, e, f, g}. One example of a one-to-one
function from A to B would be f (1) = a, f (2) = c, f (3) = g, f (4) = b, f (5) = d. But, if we agree
to think of the elements of A listed in their natural order, we could specify that function more
briefly as acgbd. In other words, each one-to-one function specifies a 5-permutation of B, and,
conversely, each 5-permutation of B specifies a one-to-one function. So the number of one-to-one
functions from a 5-set to a 7-set is equal to the number of 5-permutations of a 7-set, and that is
P (7, 5) = 2520.
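Python's math module provides perm for exactly this quantity, and enumerating the 5-permutations of a 7-set gives the same count. A small sketch:

    from math import perm
    from itertools import permutations

    print(perm(7, 5))                              # 2520
    # Each 5-permutation of a 7-set corresponds to one one-to-one function.
    print(len(list(permutations('abcdefg', 5))))   # 2520 again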
The same reasoning shows there are P (n, r) one-to-one functions from an r-set to an n-set.
Example 29.3. Here are a few easily seen values of P (n, r):
(1) P (n, n) = n!
(2) P (n, 1) = n
(3) P (n, 0) = 1
When forming permutations, the order in which the elements are listed is important. But there are
many cases when we are interested only in which elements are selected and we do not care about
the order. For example, when playing poker, a hand consists of five cards dealt from a standard
52-card deck. The order in which the cards arrive in a hand does not matter, only the final selection
of the five cards is important. When order is not important, the selection is called a combination
rather than a permutation. More carefully, an r-combination from an n-set is an r-subset of the
n-set. In other words an r-combination of an n-set is an unordered selection of r distinct elements
from the n-set.
For example, the 2-combinations of the 5-set {a, b, c, d, e} are

{a, b}, {a, c}, {a, d}, {a, e}, {b, c},
{b, d}, {b, e}, {c, d}, {c, e}, {d, e}.
Notation. The number of r-combinations from an n-set is denoted by C(n, r) or, sometimes, by a stacked symbol with n written over r.
Example 29.5. Here are a few easily seen values of C(n, r):
(1) C(n, n) = 1
(2) C(n, 1) = n
(3) C(n, 0) = 1, since the only set with 0 elements is the empty set.
(4) C(n, r) = 0 if r > n since a subset cannot contain more objects than its superset.
There is a compact formula for C(n, r) which can be derived using the product rule in a sort of back-
handed way. An r-permutation of an n-set can be built using a sequence of two tasks. First, select
r elements of the n-set. There are C(n, r) ways to do that task. Next, arrange those r elements
in some specific order. There are r! ways to do that task. So, according to the product rule, the
number of r-permutations of an n-set will be C(n, r) · r!. However, we know that the number of r-permutations of an n-set is P (n, r). So we may conclude that P (n, r) = C(n, r) · r!, or, rearranging that, we see

C(n, r) = P (n, r) / r! = n! / (r!(n − r)!).
Example 29.6. Suppose we have a club with 20 members. If we want to select a committee of 4
members, then there are
C(20, 4) = 20! / (4!(20 − 4)!) = (20 · 19 · 18 · 17) / (4 · 3 · 2 · 1) = 4845
ways to do this since the order of people on the committee doesn’t matter.
Compare this answer with example 29.1 where we counted the number of possible selections for
president, vice-president, secretary, and treasurer from the group of 20. The difference between the
two cases is that the earlier example is a question about permutations (order matters), whereas
this example is a question about combinations (order does not matter).
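Numerically, the two counts differ by exactly the factor 4!, which is the content of P(n, r) = C(n, r) · r!. A Python sketch:

    from math import comb, perm, factorial

    print(perm(20, 4))   # 116280: ordered selections (slates of officers)
    print(comb(20, 4))   # 4845: unordered selections (committees)
    assert perm(20, 4) == comb(20, 4) * factorial(4)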
Exercises
Exercise 29.1. In how many ways can the 26 volumes (labeled A through Z) of the Encyclopedia
of Pseudo-Science be placed on a shelf ?
Exercise 29.2. In how many ways can those same 26 volumes be placed on a shelf if superstitions
demand the volumes labeled with vowels must be adjacent? In how many ways can they be placed
on the shelf obeying the conflicting superstition that volumes labeled with vowels cannot touch each
other?
Exercise 29.3. For those same 26 volumes, how many ways can they be placed in a two shelf
bookcase if volumes A-M go on the top shelf and N -Z go on the bottom shelf ?
Exercise 29.4. In how many ways can seven men and four women sit in a row if the men must
sit together?
Exercise 29.5. 20 players are to be divided into two 10-man teams. In how many ways can that
be done?
Exercise 29.6. A lottery ticket consists of five different integers selected from 1 to 99. How many
different lottery tickets are possible? How many tickets would you need to buy to have a one-in-a-
million chance of winning by matching all five randomly selected numbers?
Exercise 29.7. A committee of size six is selected from a group of nine deans and thirteen profes-
sors.
(b) How many committees are possible if there must be exactly two deans on the committee?
(c) How many committees are possible if professors must outnumber deans on the committee?
Problems
Problem 29.1. In how many ways can the ten digits be written in a row?
Problem 29.2. In how many ways can the ten digits be written in a row if the odd digits have to
be adjacent?
Problem 29.3. In how many ways can the ten digits be written in a row if the even and odd digits
have to alternate?
Problem 29.4. How many bit strings of length ten have exactly four 0’s?
Problem 29.5. How many bit strings of length ten have at most four 0’s?
Problem 29.6. How many length twenty strings of a’s, b’s, and c’s have ten a’s, six b’s, and four
c’s?
Problem 29.7. How many bit strings of length ten have more 0’s than 1’s?
Problem 29.8. In how many ways can a subset of two numbers from 1 to 100 (inclusive) be selected
if the selected numbers cannot be consecutive?
Chapter 30

The Binomial Theorem and Pascal's Triangle
The quantity C(n, k) is also written as a stacked symbol with n over k, and is called a binomial coefficient. It gives the number of k-subsets of an n-set, or, equivalently, it gives the number of ways of selecting k distinct items from n items.
Facts involving the binomial coefficients can be proved algebraically, using the formula C(n, k) = n!/(k!(n − k)!). But often the same facts can be proved much more neatly by recognizing that C(n, k) gives the number of k-subsets of an n-set. This second sort of proof is called a combinatorial proof. Here is an example of each type of proof.

Theorem 30.1 (Pascal's Identity). If n and k are integers with 1 ≤ k ≤ n, then

C(n + 1, k) = C(n, k − 1) + C(n, k).
Proof. (a combinatorial proof) Let S be a set with n + 1 elements. Select one particular element a ∈ S. There are two ways to produce a subset of S of size k. We can include a in the subset, and toss in k − 1 of the remaining n elements of S. There are C(n, k − 1) ways to do that. Or, we can avoid a, and choose all k elements from the other n elements of S. There are C(n, k) ways to do that. So, according to the sum rule, there is a total of C(n, k − 1) + C(n, k) subsets of size k of S. But we know there are C(n + 1, k) subsets of size k of S. So it must be that C(n + 1, k) = C(n, k − 1) + C(n, k).
The idea of a combinatorial proof is to ask a counting problem that can be answered in two different ways, and then conclude the two answers must be equal. In the proof above, we asked how many k-subsets there are of an (n + 1)-set. We provided an argument to show two answers were correct: C(n + 1, k) and C(n, k − 1) + C(n, k), and so we could conclude the two answers must be equal:

C(n + 1, k) = C(n, k − 1) + C(n, k).
As in the combinatorial proof of Pascal’s Identity 30.1, such arguments can be much less work, far
less tedious, and much more illuminating, than algebraic proofs. Unfortunately, they can also be
much more difficult to discover since it is necessary to dream up a good counting problem that will
have as answers the two expressions we are trying to show are equal, and there is no algorithm for
coming up with such a suitable counting problem.
Example 30.2. Give a combinatorial proof of C(n, k) = C(n, n − k).
Solution. To provide a combinatorial proof, we ask: how many ways are there to grab k elements of an n-set? One answer of course is C(n, k). But here is a second way to view the problem. We can select k elements of an n-set by deciding on the n − k elements not to pick. Since there are C(n, n − k) ways to select the n − k not to pick, there must be C(n, n − k) ways to select k elements of an n-set. Since the two answers must be equal we conclude that C(n, k) = C(n, n − k). □
Example 30.3. Give a combinatorial proof of Vandermonde's Identity:

C(n + m, k) = C(n, 0)C(m, k) + C(n, 1)C(m, k − 1) + C(n, 2)C(m, k − 2) + · · · + C(n, k)C(m, 0).

Solution. Suppose we have a set consisting of n elements of one kind, the a's, and m elements of a second kind, the b's. One way to count the k-subsets of this (n + m)-set is directly: there are C(n + m, k) of them. Here is a second way. We can select 0 of the a's and k of the b's; there are C(n, 0)C(m, k) ways to do that. Or we select 1 of the a's and k − 1 of the b's; there are C(n, 1)C(m, k − 1) ways to do that. Or we select 2 of the a's and k − 2 of the b's; there are C(n, 2)C(m, k − 2) ways to do that. And so on, until we reach the option of selecting k of the a's and 0 of the b's; there are C(n, k)C(m, 0) ways to do that.
By the sum rule, it follows that another way to count the number of k-subsets is

C(n, 0)C(m, k) + C(n, 1)C(m, k − 1) + C(n, 2)C(m, k − 2) + · · · + C(n, k)C(m, 0).

Since the two counts must agree, the identity follows. □
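Identities proved combinatorially can still be spot-checked numerically. Here is a Python sketch testing Vandermonde's Identity over a small range of values (the ranges are arbitrary):

    from math import comb

    for n in range(8):
        for m in range(8):
            for k in range(n + m + 1):
                lhs = comb(n + m, k)
                rhs = sum(comb(n, j) * comb(m, k - j) for j in range(k + 1))
                assert lhs == rhs   # math.comb(a, b) is 0 when b > a
    print("Vandermonde's Identity holds on the sampled values.")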
Row 0:  C(0,0)
Row 1:  C(1,0)  C(1,1)
Row 2:  C(2,0)  C(2,1)  C(2,2)
Row 3:  C(3,0)  C(3,1)  C(3,2)  C(3,3)
Row 4:  C(4,0)  C(4,1)  C(4,2)  C(4,3)  C(4,4)
Row 5:  C(5,0)  C(5,1)  C(5,2)  C(5,3)  C(5,4)  C(5,5)
...

Figure 30.1: Pascal's Triangle
The binomial coefficients are so named because they appear when a binomial x + y is raised to an integer power n ≥ 0. To appreciate the connection, let's look at a table of the binomial coefficients C(n, k). The table is arranged in rows starting with row n = 0, and within each row, the entries are arranged from left to right for k = 0, 1, 2, · · · , n. The result, called Pascal's Triangle, is shown in figure 30.1.
Filling in the numerical values for the binomial coefficients gives the table shown in figure 30.2.
The numbers in the first, second, and third rows of Pascal’s triangle probably seem familiar. In
fact, we see that
Row 0: 1
Row 1: 1 1
Row 2: 1 2 1
Row 3: 1 3 3 1
Row 4: 1 4 6 4 1
Row 5: 1 5 10 10 5 1
...
Figure 30.2: Pascal’s Triangle (numeric)
(x + y)^0 = 1
(x + y)^1 = x + y = 1·x + 1·y
(x + y)^2 = x^2 + 2xy + y^2 = 1·x^2 + 2·xy + 1·y^2
(x + y)^3 = x^3 + 3x^2 y + 3xy^2 + y^3 = 1·x^3 + 3·x^2 y + 3·xy^2 + 1·y^3
The coefficients in these binomial expansions are exactly the entries in the corresponding rows of
Pascal's Triangle. This even works for the 0th row: (x + y)^0 = 1.
The fact that the coefficients in the expansion of the binomial (x + y)n (where n ≥ 0 is an integer)
can be read off from the nth row of Pascal’s Triangle is called the Binomial Theorem. We will give
two proofs of this theorem, one by induction, and the other a combinatorial proof.
The Binomial Theorem. For any integer n ≥ 0 and any real numbers x and y,

(x + y)^n = Σ_{k=0}^{n} C(n, k) x^k y^(n−k).
Proof. (by induction on n) When n = 0 the result is clear. So suppose that for some n ≥ 0 we have (x + y)^n = Σ_{k=0}^{n} C(n, k) x^k y^(n−k) for any x, y ∈ R. Then (x + y)^(n+1) = (x + y)(x + y)^n. Expanding the right-hand side using the induction hypothesis and collecting the coefficient of x^k y^(n+1−k), which is C(n, k − 1) + C(n, k) = C(n + 1, k) by Pascal's Identity, gives the required formula for n + 1.
Proof. When the binomial (x + y)^n = (x + y)(x + y)(x + y) · · · (x + y) is expanded, the terms are produced by selecting either the x or the y from each of the n factors x + y appearing on the right side of the equation. The number of ways of selecting exactly k x's from the n available factors is C(n, k), and so that will be the coefficient of the term x^k y^(n−k) in the expansion.
By the Binomial Theorem, Σ_{k=0}^{n} C(n, k) · 1^k · 1^(n−k) = (1 + 1)^n = 2^n. In other words, the total number of subsets of an n-set, of all sizes, is 2^n.
Exercises
Problems
Chapter 31

Inclusion-Exclusion Counting
The sum rule says that if A and B are disjoint sets, then |A ∪ B| = |A| + |B|. If the sets are
not disjoint, then this formula over counts the number of elements in the union of A and B. For
example, if A = {a, b, c} and B = {c, d, e}, then |A ∪ B| = |{a, b, c, d, e}| = 5, while |A| + |B| = 3 + 3 = 6.
The correct way to count the number of elements in |A ∪ B| when A and B might not be disjoint
is via the inclusion-exclusion formula. To derive this formula, notice that A ∪ B = (A − B) ∪ B,
and that the sets A − B and B are disjoint. So we can apply the sum rule to conclude
|A ∪ B| = |(A − B) ∪ B| = |A − B| + |B|.
Next, notice that A = (A − B) ∪ (A ∩ B), and the two sets on the right are disjoint. So, using the
sum rule, we get
|A| = |(A − B) ∪ (A ∩ B)| = |A − B| + |A ∩ B|,
so |A − B| = |A| − |A ∩ B|. Combining the two equations gives the inclusion-exclusion formula

|A ∪ B| = |A| + |B| − |A ∩ B|.

Figure 31.1: A ∪ B = (A − B) ∪ B
In words, to count the number of items in the union of two sets, include one for everything in the
first set, and include one for everything in the second set, then exclude one for each element in the
overlap of the two sets (since those elements will have been counted twice).
Example 31.1. How many students are there in a discrete math class if 15 students are computer
science majors, 7 are math majors, and 3 are double majors in math and computer science?
Solution. Let C denote the subset of computer science majors in the class, and M denote the math
majors. Then |C| = 15, |M | = 7 and |C ∩ M | = 3. So by the principle of inclusion-exclusion there
are |C| + |M | − |C ∩ M | = 15 + 7 − 3 = 19 students in the class.
Example 31.2. How many integers between 1 and 1000 are divisible by either 7 or 11?
Solution. Let S denote the set of integers between 1 and 1000 divisible by 7, and E denote the set
of integers between 1 and 1000 divisible by 11. We need to count the number of integers in S ∪ E.
By the principle of inclusion-exclusion, we have
|S ∪ E| = |S| + |E| − |S ∩ E| = ⌊1000/7⌋ + ⌊1000/11⌋ − ⌊1000/77⌋ = 142 + 90 − 12 = 220.
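Both the floor-quotient computation and a direct enumeration take one line each in Python, and both give 220 (a sketch):

    by_formula = 1000 // 7 + 1000 // 11 - 1000 // 77
    by_counting = sum(1 for n in range(1, 1001) if n % 7 == 0 or n % 11 == 0)
    print(by_formula, by_counting)   # 220 220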
The inclusion-exclusion principle can be extended to the problem of counting the number of elements
in the union of three sets. The trick is to think of the union of three sets as the union of two sets.
It goes as follows:
|A ∪ B ∪ C| = |(A ∪ B) ∪ C|
= |A ∪ B|+|C|−|(A ∪ B) ∩ C|
= |A|+|B|+|C|−|A ∩ B|−|(A ∪ B) ∩ C|
= |A|+|B|+|C|−|A ∩ B|−|(A ∩ C) ∪ (B ∩ C)|
= |A|+|B|+|C|−|A ∩ B|− (|A ∩ C|+|B ∩ C|−|(A ∩ C) ∩ (B ∩ C)|)
= |A|+|B|+|C|−|A ∩ B|−|A ∩ C|−|B ∩ C|+|A ∩ B ∩ C|.
This might more appropriately be named the inclusion-exclusion-inclusion formula, but nobody
calls it that. In words, the formula says that to count the number of elements in the union of three
sets, first, include everything in each set, then exclude everything in the overlap of each pair of sets,
and finally, re-include everything in the overlap of all three sets.
Example 31.3. How many integers between 1 and 1000 are divisible by at least one of 7, 9, and
11?
Solution. Let S denote the set of integers between 1 and 1000 divisible by 7, let N denote the
set of integers between 1 and 1000 divisible by 9, and E denote the set of integers between 1 and
1000 divisible by 11. We need to count the number of integers in S ∪ N ∪ E. By the principle of
inclusion-exclusion,
|S ∪ N ∪ E| = |S| + |N | + |E| − |S ∩ N | − |S ∩ E| − |N ∩ E| + |S ∩ N ∩ E|
= ⌊1000/7⌋ + ⌊1000/9⌋ + ⌊1000/11⌋ − ⌊1000/63⌋ − ⌊1000/77⌋ − ⌊1000/99⌋ + ⌊1000/693⌋
= 142 + 111 + 90 − 15 − 12 − 10 + 1 = 307.
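The same direct check works for the three-set computation (a sketch):

    count = sum(1 for n in range(1, 1001)
                if n % 7 == 0 or n % 9 == 0 or n % 11 == 0)
    print(count)   # 307, matching the inclusion-exclusion computation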
There are similar inclusion-exclusion formulas for the union of four, five, six, · · · sets. The formulas
can be proved by induction with the inductive step using the trick we used above to go from two
sets to three. However, there is a much neater way to prove the formula based on the Binomial
Theorem.
Theorem 31.4. Given finite sets A1, A2, ..., An,

|A1 ∪ A2 ∪ · · · ∪ An| = Σ_{k=1}^{n} |Ak| − Σ_{1≤j<k≤n} |Aj ∩ Ak| + · · · + (−1)^(n−1) |A1 ∩ A2 ∩ · · · ∩ An|.
Proof. Suppose x ∈ A1 ∪ A2 ∪ · · · ∪ An. We need to show that x is counted exactly once by the right-hand side of the promised formula. Say x ∈ Ai for exactly p of the sets Ai, where 1 ≤ p ≤ n.
The key to the proof is being able to count the number of intersections in each summation on the right-hand side of the offered formula that contain x, since we will account for x once for each such term. The number of such terms in the first sum is C(p, 1) = p, the number in the second sum is C(p, 2), and, in general, the number of terms in the j-th sum that contain x will be C(p, j), provided j ≤ p. If j > p, then x will not be in any of the intersections of j of the sets, and so will not contribute any more to the right side of the formula.
So the total number of times x is accounted for on the right hand side is
C(p, 1) − C(p, 2) + C(p, 3) − · · · + (−1)^(p−1) C(p, p)
= 1 − [ C(p, 0) − C(p, 1) + C(p, 2) − · · · + (−1)^p C(p, p) ]
= 1 − (1 − 1)^p = 1.
Just as we hoped.
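The key step, that the alternating sum collapses to 1, can be verified for small p with a few lines of Python (a sketch):

    from math import comb

    for p in range(1, 12):
        total = sum((-1) ** (j - 1) * comb(p, j) for j in range(1, p + 1))
        assert total == 1   # C(p,1) - C(p,2) + ... + (-1)^(p-1) C(p,p) = 1
    print("Each element of the union is counted exactly once.")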
Example 31.5. How many students are in a calculus class if 14 are math majors, 22 are com-
puter science majors, 15 are engineering majors, and 13 are chemistry majors, if 5 students are
double majoring in math and computer science, 3 students are double majoring in chemistry and
engineering, 10 are double majoring in computer science and engineering, 4 are double majoring in
chemistry and computer science, none are double majoring in math and engineering and none are
double majoring in math and chemistry, and no student has more than two majors?
Solution. Let A1 denote the math majors, A2 denote the computer science majors, A3 denote the
engineering majors, and A4 the chemistry majors. Then the information given is
|A1| = 14, |A2| = 22, |A3| = 15, |A4| = 13,
|A1 ∩ A2| = 5, |A2 ∩ A3| = 10, |A2 ∩ A4| = 4, |A3 ∩ A4| = 3, |A1 ∩ A3| = 0, |A1 ∩ A4| = 0,

and, since no student has more than two majors, every intersection of three or more of the sets is empty; in particular

|A1 ∩ A2 ∩ A3 ∩ A4| = 0.

By inclusion-exclusion, the number of students in the class is

14 + 22 + 15 + 13 − 5 − 10 − 4 − 3 = 42.
Example 31.6. How many ternary strings (using 0’s, 1’s and 2’s) of length 8 either start with a
1, end with two 0’s or have 4th and 5th positions 12, respectively?
Solution. Let A1 denote the set of ternary strings of length 8 which start with a 1, A2 denote
the set of ternary strings of length 8 which end with two 0’s, and A3 denote the set of ternary
strings of length 8 which have 4th and 5th positions 12. By inclusion-exclusion, the answer is
3^7 + 3^6 + 3^6 − 3^5 − 3^5 − 3^4 + 3^3.
The inclusion-exclusion formula is often used along with the Good=Total-Bad trick.
Example 31.7. How many integers between 1 and 1000 are divisible by none of 7, 9, and 11?
Solution. There are 1000 numbers between 1 and 1000 (assuming 1 and 1000 are included). As
counted before, there are 307 of those that are divisible by at least one of 7, 9, and 11. That means
there are 1000 − 307 = 693 that are divisible by none of 7, 9, or 11.
Exercises
Exercise 31.1. At a certain college no student is allowed more than two majors. How many
students are in the college if there are 70 math majors, 160 chemistry majors, 230 biology majors,
56 geology majors, 24 physics majors, 35 anthropology majors, 12 double math-physics majors, 10
double math-chemistry majors, 4 double biology-math majors, 53 double biology-chemistry majors,
5 double biology-anthropology majors, and no other double majors?
Exercise 31.2. How many bit strings of length 15 start with the string 1111, end with the string
1000 or have 4th through 7th bits 1010?
Exercise 31.3. How many positive integers between 1000 and 9999 inclusive are divisible by any
of 4, 10 or 25 (careful!)?
Exercise 31.4. How many permutations of the digits 1, 2, 3, 4, 5, have at least one digit in its own
spot? In other words, a 1 in the first spot, or a 2 in the second, etc. For example, 35241 is OK
since it has a 4 in the fourth spot, and 14235 is OK, since it has a 1 in the first spot (and also a 5
in the fifth spot). But 31452 is no good. Hint: Let A1 be the set of permutations that have 1 in the
first spot, let A2 be the set of permutations that have 2 in the second spot, and so on.
Exercise 31.5. How many permutations of the digits 1, 2, 3, 4, 5 have no digit in its own spot?
Problems
Problem 31.1. The membership of a language club consists of seven people who speak only English,
eight speak only French, five speak only Spanish, seven speak only English and Spanish, two speak
only French and Spanish, there are none who speak only English and French, and there are four
who speak all three languages. How many members are in the club?
Problem 31.2. How many integers between 1 and 10000 (inclusive) are divisible by at least one
of 9, 10, or 11?
Problem 31.3. Suppose p, q are two different primes. How many integers between 1 and the
product pq are relatively prime to pq (or, same thing, how many are divisible by neither of p and
q)? (The correct answer will factor neatly.)
Problem 31.4. Suppose p, q, r are three different primes. How many integers between 1 and the
product pqr are relatively prime to pqr (or, same thing, how many are divisible by none of p, q and
r)? (The correct answer will factor neatly.)
Problem 31.5. Suppose p, q, r, s are four different primes. How many integers between 1 and the
product pqrs are relatively prime to pqrs (or, same thing, how many are divisible by none of p, q,
r and s)? (The correct answer will factor neatly.)
Problem 31.6. Based on the results of the previous three problems, can you guess the neat formula
for five, six, seven, and so on, different primes?
Problem 31.7. Of the words of length ten using the alphabet Σ = {a, b, c}, how many either begin
abc or end cba or have cccccc as the middle six letters?
Problem 31.8. There are 6! permutations of the numbers 1, 2, 3, 4, 5, 6. In some of these there
is a run of three (or more) consecutive numbers that increase (left to right) such as 514632 which
has the increasing run 146. Others do not have any increasing runs of length three such as (a
cheap example) 654321 and (not quite as cheap) 615243. How many of the 6! permutations contain
no increasing runs of length three (or more)? (Hint: runs of length three can start with the first,
second, third, or fourth spot in the permutation.)
Chapter 32

The Pigeonhole Principle
The pigeonhole principle, like the sum and product rules, is another one of those absolutely obvious
counting facts. The statement is simple: If n + 1 objects are divided into n piles (some piles can
be empty), then at least one pile must have two or more objects in it. Or, more colorfully, if n + 1
pigeons land in n pigeonholes, then at least one pigeonhole has two or more pigeons. What could
be more obvious? The pigeonhole principle is used to show that no matter how a certain task is
carried out, some specific result must always happen.
As a simple example, suppose we have a drawer containing ten identical black socks and ten identical
white socks. How many socks do we need to select to be sure we have a matching pair? The answer
is three. Think of the pigeonholes as the colors black and white, and as each sock is selected put it
in the pigeonhole of its color. After we have placed the third sock, one of the two pigeonholes must
have at least two socks in it, and we will have a matching pair. Of course, we may have been lucky
and had a pair after picking the second sock, but the pigeonhole principle guarantees that with the
third sock we will have a pair.
As another example, suppose license plates are made consisting of four digits followed by two
letters. Are there enough license plates for a state with seven million cars? No, since there are only
10^4 · 26^2 = 6,760,000 possible license plates, and so, by the pigeonhole principle, at least two of the
seven million plates assigned would have to be the same.
A slightly fancier version of the pigeonhole principle says that if N objects are distributed in k piles, then there must be at least one pile with ⌈N/k⌉ objects in it.
That formula looks impressive, but actually is easy to understand. For example, if there are 52
people in a room, we can be absolutely certain that there are at least eight born on the same day
of the week. Think of it this way: with 49 people, it would be possible to have seven born on each
of the seven days of the week. But when the 50th one is reached, it must boost one day up to an
eighth person. That is really about all there is to it. The general proof of the fancy pigeonhole
principle uses this same sort of reasoning. It is a proof by contradiction, and goes as follows:
Theorem 32.1 (Pigeonhole Principle). If N objects are distributed in k piles, then there must be at least one pile with ⌈N/k⌉ objects in it.
Proof. Suppose we have N objects distributed in k piles, and suppose that every pile has fewer than ⌈N/k⌉ objects in it. That means that the piles each contain ⌈N/k⌉ − 1 or fewer objects. We will use the fact that ⌈N/k⌉ < N/k + 1 to complete the proof. The total number of objects will then be at most

k(⌈N/k⌉ − 1) < k((N/k + 1) − 1) = N.

That is a contradiction since we know there is a total of N objects in the k piles.
Even though the pigeonhole principle sounds very simple, clever applications of it can produce
totally unexpected results.
Example 32.2. Five misanthropes move to a perfectly square deserted island that measures two kilometers on a side. Of course, being misanthropes, they want to live as far from each other as possible. Show that, no matter where they build on the island, some two will be within √2 kilometers of each other.
Solution. Divide the island into four one-kilometer-by-one-kilometer squares by drawing lines joining the midpoints of opposite sides. Since there are five people and four squares, the pigeonhole principle guarantees there will be two people living in one of those four squares. But people in one of those squares cannot be farther apart than the length of the diagonal of the square, which is, according to Pythagoras, √2 kilometers.
Example 32.3. For any positive integer n, there is a positive multiple of n made up of a number
of 1’s followed by a number of 0’s. For example, for n = 1084, we see 1084 · 1025 = 1111100.
Solution. Consider the n + 1 integers 1, 11, 111, · · · , 11 · · · 1, where the last one consists of 1
repeated n + 1 times. Some two of these must be the same modulo n, and so n will divide the
difference of some two of them. But the difference of two of those numbers is of the required
type.
Example 32.4. Bill has 20 days to prepare his tiddlywinks title defense. He has decided to practice
at least one hour every day. But, to avoid burn-out, he will not practice more than a total of 30
hours. Show there is a sequence of consecutive days during which he practices exactly 9 hours.
Solution. For j = 1, 2, · · · 20, let tj = the total number of hours Bill practices up to and including
day j. Since he practices at least one hour every day, and the total number of hours is no more
than 30, we see
0 < t1 < t2 < · · · < t20 ≤ 30.
So we have 40 integers t1 , t2 , · · · , t20 , t1 + 9, · · · t20 + 9, all between 1 and 39. By the pigeonhole
principle, some two must be equal, and the only way that can happen is for ti = tj + 9 for some i
and j. It follows that ti − tj = 9, and since the difference ti − tj is the total number of hours
Bill practiced from day j + 1 to day i, that shows there is a sequence of consecutive days during
which he practiced exactly 9 hours.
Exercises
Exercise 32.1. Show that in any group of eight people, at least two were born on the same day of
the week.
Exercise 32.2. Show that in any group of 100 people, at least 15 were born on the same day of the
week.
Exercise 32.3. How many cards must be selected from a deck to be sure that at least six of the
selected cards have the same suit?
Exercise 32.4. Show that in any set of n integers, where n ≥ 2, there must be a pair with a
difference that is a multiple of n − 1.
Exercise 32.5. Al has 75 days to master discrete mathematics. He decides to study at least one
hour every day, but no more than a total of 125 hours. Show there must be a sequence of consecutive
days during which he studies exactly 24 hours.
Exercise 32.6. Show that in any set of 217 integers, there must be a pair with a difference that is
a multiple of 216.
Problems
Problem 32.1. Show that in a town with population 18, 000, there must be at least two people with
the same three initials.
Problem 32.2. What is the smallest town population that will guarantee there will be at least two
people with the same three initials?
Problem 32.3. What is the smallest town population that will guarantee there will be at least five
people with the same three initials?
Problem 32.4. How many cards have to be selected from a 52 card deck to be sure there will be
two cards of the same suit?
Problem 32.5. How many cards have to be selected from a 52 card deck to be sure there will be
two cards of the same rank?
Problem 32.6. Five misanthropes buy a six mile by eight mile rectangular plot in the arctic. Show
that no matter where they build their houses, there will be at least two people that are no more than
five miles apart. (You can assume the ice sheet they buy is perfectly flat.)
Problem 32.7. In any list of n integers, there will be a chunk of consecutive entries from the list
that add up to a multiple of n. For example: in the list −8, 4, 22, −11, 7, we have 4 + 22 − 11 = 15
is a multiple of 5.
Problem 32.8. Suppose a1, a2, a3, . . . , a99 is a permutation of 1, 2, . . . , 99. Show that the product (a1 − 1)(a2 − 2) · · · (a99 − 99) is even.
Problem 32.9. In a rematch, Bill has 30 days to train for a new defense of his tiddlywinks title.
He plans to practice at least one hour every day, but no more than 45 hours total. Show there is a
sequence of consecutive days during which he practices exactly 14 hours.
Chapter 33

Tougher Counting Problems
The counting exercises you've been asked to complete so far have not been very realistic. In general
it won’t be true that a counting problem fits neatly into a section. So we need to work on the bigger
picture.
When we start any counting exercise it is true that there is an underlying exercise at the basic
level that we want to consider first. So instead of answering the question immediately we might
first want to decide on what type of exercise we have. So far we have seen three types which are
distinguishable by the answers to two questions.
(2) In forming the objects we want to count, does the order of selection matter?
The three scenarios we have seen so far are described in table 33.1.
There are two problems to address. First of all, table 33.1 is incomplete. What about, for example,
counting objects where repetition is allowed, but order doesn’t matter. Second of all, there are
connections among the types which make some solutions appear misleading. But as a general rule
of thumb, if we correctly identify the type of problem we are working on, then all we have to do
is use the principles of addition, multiplication, inclusion/exclusion, or exclusion to decompose our
problem into subproblems. The solutions to the subproblems often have the same form as the
underlying problem. The principles we employed direct us on how the sub-solutions should be
recombined to give the final answer.
Example 33.1. As an example of the second problem, if we ask how many binary strings of length 10 contain exactly three 1's, then the underlying problem is an r-string problem. But in this case the answer is C(10, 3). Of course this is really C(10, 3) · 1^3 · 1^7 from the binomial theorem. In this case the part of the answer which looks like n^r is suppressed since it's trivial. To see the difference we might ask how many ternary strings of length 10 contain exactly three 1's. Now the answer is C(10, 3) · 1^3 · 2^7, since we choose the three positions for the 1's to go in, and then fill in each of the 7 remaining positions with a 0 or a 2.
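Both counts in this example are small enough to verify by enumeration. A Python sketch:

    from itertools import product
    from math import comb

    binary = sum(1 for s in product('01', repeat=10) if s.count('1') == 3)
    ternary = sum(1 for s in product('012', repeat=10) if s.count('1') == 3)
    print(binary, comb(10, 3))            # 120 120
    print(ternary, comb(10, 3) * 2**7)    # 15360 15360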
To begin to address the first problem we introduce the basic donut shop problem: If you get to the
donut shop before the cops get there, you will find that they have a nice variety of donuts. You
might want to order several dozen. They will put your order in a box. You don’t particularly care
what order the donuts are put into the box. You do usually want more than one of several types.
The number of ways for you to complete your order is therefore a counting problem where order
doesn’t matter, and repetition is allowed.
In order to answer the question of how many ways you can complete your order, we first recast the
problem mathematically. From among n types of objects we want to select r objects. If xi denotes
the number of objects of the ith type selected, we have 0 ≤ xi , (since we cannot choose a negative
number of chocolate donuts), also xi ∈ Z, (since we cannot select fractional parts of donuts). So,
the different ways to order are in one-to-one correspondence with the solutions to

x1 + x2 + · · · + xn = r,  with each xi a non-negative integer.

A solution can be recorded as a string of r 1's and n − 1 +'s: write x1 1's, then a +, then x2 1's, then a +, and so on. For example, with n = 4 types and r = 6 donuts, the order x1 = 2, x2 = 0, x3 = 1, x4 = 3 is recorded as the string

11 + + 1 + 111.
Finally, we see that the total number of solutions in non-negative integers to x1 + · · · + xn = r is the number of strings of length r + n − 1 made up of exactly r 1's and (n − 1) +'s. From the remark above, the number of ways to select r donuts from n different types is

C(n + r − 1, r).
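The stars-and-bars count C(n + r − 1, r) can be compared against a direct enumeration of the solutions for small n and r. A Python sketch (the helper name count_orders is ours):

    from itertools import product
    from math import comb

    def count_orders(n, r):
        """Brute-force count of solutions of x1 + ... + xn = r in non-negative integers."""
        return sum(1 for xs in product(range(r + 1), repeat=n) if sum(xs) == r)

    for n, r in [(3, 5), (4, 6), (5, 4)]:
        assert count_orders(n, r) == comb(n + r - 1, r)
    print(comb(4 + 6 - 1, 6))   # 84 ways to choose 6 donuts from 4 types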
The basic donut shop problem is not very realistic in two ways. First it is common that some of
your order will be determined by other people. You might, for example, canvass the people in your
office before you go to see if there is anything you can pick up for them. So whereas you want to
order r donuts, you might have been asked to pick up a certain number of various types.
Now suppose that we know that we want to select r donuts from among n types so that at least ai (ai ≥ 0) donuts of type i are selected. In terms of our equation, we have x1 + x2 + · · · + xn = r, where ai ≤ xi and xi ∈ Z. If we set yi = xi − ai for i = 1, ..., n, and a = a1 + a2 + · · · + an, then 0 ≤ yi, yi ∈ Z, and

y1 + y2 + · · · + yn = (x1 − a1) + · · · + (xn − an) = (x1 + · · · + xn) − (a1 + · · · + an) = r − a.

So, the number of ways to complete our order is

C(n + (r − a) − 1, r − a).

First ask for the donuts your colleagues wanted (a total of a), then get the remaining r − a donuts any way you like.
Still, we qualified the donut shop problem by supposing that we arrived before the cops did.
If we arrive at the donut shop after canvassing our friends, we want to select r donuts from among
n types. The problem is that there are probably only a few left of each type. This may place an
upper limit on how often we can select a particular type. So now we wish to count solutions to
x1 + x2 + ... + xn = r, with ai ≤ xi ≤ bi , xi ∈ Z.
x1 + x2 + x3 + x4 = 34,
A4 = U and A1 ∩ A2 ∩ A3 ∩ A4 = A1 ∩ A2 ∩ A3 .
Now, to compute A1 , we must first rephrase x1 > 4 as a non-strict inequality, i.e. 5 ≤ x1 . So, it
follows that
|A1| = C(29 + 4 − 1, 29).
Similarly, we have
|A2| = C(28 + 4 − 1, 28), and |A3| = C(25 + 4 − 1, 25).
Next, we observe that A1 ∩ A2 represents the set of all solutions in non-negative integers to
x1 + x2 + x3 + x4 = 34 with 5 ≤ x1 and 6 ≤ x2 .
So, we have
|A1 ∩ A2| = C(23 + 4 − 1, 23).
We leave the answer in this form for clarity. The numerical value is not illuminating.
We can now solve general counting exercises where order is unimportant and repetition is restricted
somewhere between no repetition, and full repetition.
To complete the picture we should be able to also solve counting exercises where order is impor-
tant and repetition is partial. This is somewhat easier. It suffices to consider the sub-cases in
example 33.3.
Example 33.3. Let us take as initial problem the number of quaternary strings of length 15. There
are $4^{15}$ of these.
Now, if we ask how many contain exactly two 0's, the answer is $\binom{15}{2} 3^{13}$.
If we ask how many contain exactly two 0's and four 1's, the answer is
$\binom{15}{2}\binom{13}{4} 2^{9}$.
And, if we ask how many contain exactly two 0's, four 1's and five 2's, the answer is
$\binom{15}{2}\binom{13}{4}\binom{9}{5}\binom{4}{4} = \frac{15!}{2!\,4!\,5!\,4!}$.
So, in fact, many types of counting are related by what we call the multinomial theorem:
$(x_1 + x_2 + \cdots + x_n)^r = \sum \binom{r}{e_1, e_2, \ldots, e_n} x_1^{e_1} x_2^{e_2} \cdots x_n^{e_n}$,
where the sum runs over all non-negative integers $e_1, \ldots, e_n$ with $e_1 + e_2 + \cdots + e_n = r$, and
$\binom{r}{e_1, e_2, \ldots, e_n} = \frac{r!}{e_1! e_2! \cdots e_n!}$.
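A quick computational sanity check of the last count in example 33.3 (our own illustration, not part of the text): the product of binomial coefficients and the multinomial coefficient agree.

    from math import comb, factorial

    # Product of binomials: place the 0's, then the 1's, then the 2's, then the 3's.
    lhs = comb(15, 2) * comb(13, 4) * comb(9, 5) * comb(4, 4)

    # Multinomial coefficient 15! / (2! 4! 5! 4!).
    rhs = factorial(15) // (factorial(2) * factorial(4) * factorial(5) * factorial(4))

    assert lhs == rhs
    print(lhs)   # length-15 quaternary strings with two 0's, four 1's, five 2's, four 3's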
To recap, when we have a counting exercise, we should first ask whether order is important and
then ask whether repetition is allowed. This will get us into the right ballpark as far as the form of
the solution. We must use basic counting principles to decompose the exercise into sub-problems.
Solve the sub-problems, and put the pieces back together. Solutions to sub-problems usually take
the same form as the underlying problem, though they may be related to it via the multinomial
theorem. Table 33.2 synopsizes our six fundamental cases.
Exercises
Exercise 33.1. How many quaternary strings of length n are there (a quaternary string uses 0’s,
1’s, 2’s, and 3’s)?
Exercise 33.2. How many quaternary strings of length less than or equal to 7 are there?
Exercise 33.4. How many ternary strings of length n start 0101 and end 212?
Exercise 33.5. A doughnut shop has 8 kinds of doughnuts: chocolate, glazed, sugar, cherry, straw-
berry, vanilla, caramel, and jalapeño. How many ways are there to order three dozen doughnuts,
if at least 4 are jalapeño, at least 6 are cherry, and at least 8 are strawberry, but there are no
restrictions on the other varieties?
Exercise 33.6. How many strings of twelve lowercase English letters are there
(a) which start and end with the letter x, if letters may be repeated?
(b) which contain the letter x exactly once, if letters can be repeated?
(c) which contain each of the letters x and y both exactly once, if letters can be repeated?
(d) which contain at least one letter from the first half of the alphabet (a through m), where letters
may be repeated?
Exercise 33.7. How many bit strings of length 19 either begin 0101, or have 4th, 5th and 6th digits
101, or end 1010?
Exercise 33.8. How many pentary strings (i.e. strings using the digits 0,1,2,3,4) of length 15
consist of three 0’s, four 1’s, three 2’s, four 3’s and one 4?
Exercise 33.9. Seven lecturers and fourteen professors are on the faculty of a math department.
(a) How many ways are there to form a committee with seven members which contains more
lecturers than professors?
(b) How many ways are there to form a committee with seven members where the professors
outnumber the lecturers on the committee by at least a two-to-one margin?
(c) How many ways are there to form a committee consisting of at least five lecturers?
Exercise 33.10. In how many ways can twenty people form a line at a ticket window if Hans and
wife Brunhilda are having a spat, and refuse to stand in consecutive places in the line?
Exercise 33.11. Prove that a set with n ≥ 1 elements has the same number of subsets with an
even number of elements, as subsets with an odd number of elements.
Problems
Problem 33.1. A doughnut shop has 8 kinds of doughnuts: chocolate, glazed, sugar, cherry, straw-
berry, vanilla, caramel, and jalapeño. How many ways are there to order three dozen doughnuts,
if at most 4 are jalapeño, at most 6 are cherry, and at most 8 are strawberry, but there are no
restrictions on the other varieties?
Problem 33.2. How many strings of twelve lowercase English letters are there
(b) which contain the letter x at least once, if letters can be repeated?
(c) which contain each of the letters x and y at least once, if letters can be repeated?
(d) which contain at least one vowel, where letters may not be repeated?
Problem 33.4. How many ways are there to seat six people at a circular table where two seatings
are considered equivalent if one can be obtained from the other by rotating the table?
Problem 33.5. A donut shop sells six types of donut. You buy a scratch-off ticket that promises
you will win anywhere from one to two dozen donuts. How many different prizes are possible?
Problem 33.6. How many ternary strings of length n contain no two adjacent identical symbols?
Examples: (n = 8) 13123231 is good, but 13112321 is bad.
Problem 33.7. How many ternary strings of length n contain at least two adjacent identical
symbols? Examples: (n = 9) 131212321 is bad, but 131123321 is good.
Chapter 34
Counting Using Recurrence Relations
It is not always convenient to use the methods of earlier chapters to solve counting problems.
Another technique for finding the solution to a counting problem is recursive counting. The
method will be illustrated with several examples.
Example 34.1. Recall that a bit string is a list of 0’s and 1’s, and the length of a bit string is
the total number of 0’s and 1’s in the string. For example, 10111 is a bit string of length five, and
000100 is a bit string of length six. The problem of counting the number of bit strings of length n
is duck soup. There are two choices for each bit, and so, applying the product rule, there are 2^n
such strings. However, consider the problem of counting the number of bit strings of length n with
no adjacent 0’s.
Let’s use an to denote the number of bit strings of length n with no adjacent 0’s. Here are a few
sample cases for small values of n.
n=0: Just one good bit string of length zero, and that is λ, the empty bit string. So a0 = 1.
n=1: There are two good bit strings of length one. Namely 0 and 1. So a1 = 2.
n=2: There are three good bit strings of length two. Namely, 01, 10 and 11. (Of course, 00 is a bad
bit string.) That means a2 = 3.
n=3: Things start to get confusing now. But here is the list of good bit strings of length three: 010,
011, 101, 110, and 111. So a3 = 5.
n=4: A little scratch work produces the good bit strings 0101, 0111, 1011, 1101, 1111, 0110, 1010,
and 1110, for a total of eight. That means a4 = 8.
We can do a few more, but it is hard to see a formula for an like the 2^n formula that gives the
total number of all bit strings of length n. Even though a formula for an is difficult to spot, there
is a pattern to the list of values for an which looks like the Fibonacci sequence pattern. In fact, the
list so far looks like 1, 2, 3, 5, 8, and if a few more are worked out by brute force, it turns out the
list continues 13, 21, 34. So, it certainly seems that the solution to the counting problem can be
expressed recursively as a0 = 1, a1 = 2, and for n ≥ 2, an = an−1 + an−2 . If this guess is really
correct, then we can quickly compute the number of good bit strings of length n. We just calculate
a0 , a1 , a2 , etc., until we reach the an we are interested in.
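A short program (our own sketch, not part of the text) makes this concrete: it computes an from the guessed recurrence and, for small n, confirms the values against a brute-force count of the strings.

    from itertools import product

    def a_recursive(n):
        """a_n via the guessed recurrence a0 = 1, a1 = 2, a_n = a_{n-1} + a_{n-2}."""
        a, b = 1, 2                  # a0, a1
        if n == 0:
            return a
        for _ in range(n - 1):
            a, b = b, a + b
        return b

    def a_brute(n):
        """Count length-n bit strings with no two adjacent 0's directly."""
        return sum(1 for s in product("01", repeat=n) if "00" not in "".join(s))

    for n in range(10):
        assert a_recursive(n) == a_brute(n)
    print(a_recursive(20))    # 17711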
Such a recursive solution to counting problems is certainly less satisfactory than a simple formula,
but some counting problems are so messy that a simple formula might not be possible, and the
recursive solution is better than nothing in such a case.
There is one problem with the recursive solution offered in example 34.1. We said that it seems
that the solution to the counting problem can be expressed recursively as a0 = 1, a1 = 2, and for
n ≥ 2, an = an−1 + an−2 . That it seems that is not an acceptable justification of the formula.
After all, we are basing that guess on just eight or ten values of the infinite sequence an , and it is
certainly possible that those values happen to follow the pattern we’ve guessed simply by accident.
Maybe the true pattern is much more complicated, and we have been tricked by the small number
of cases we have considered. It is necessary to show that the guessed pattern is correct by supplying
a logical argument.
Our argument would begin by checking the initial conditions we offered. In other words, we would
verify by hand that a0 = 1 and a1 = 2. This serves as a basis for the verification of the recursive
formula. Now what we want to do is assume that we have already calculated all the values a0 , a1 ,
· · · , ak for some k ≥ 1, and show that ak+1 must equal ak +ak−1 . It is very important to understand
that we do not want to compute the value of ak+1 . We only want to prove that ak+1 = ak + ak−1 .
The major error made doing these types of problems is attempting to compute the specific value
of ak+1 . Don’t fall for that trap! After all, if it were possible to actually compute the specific
value of ak+1 , then we could find a formula for an in general, and we wouldn’t have to be seeking
a recursive relation at all.
Here is how the argument would go in the bit string example. Suppose we have lists of the good bit
strings of lengths 0, 1, · · · , k. Here is how to make a list of all the good bit strings of length k + 1.
First, take any good bit string of length k and add a 1 on the right hand end. The result must be
a good bit string of length k + 1 (since we added a 1 to the end, and the original bit string didn’t
have two consecutive 0’s, the new bit string cannot have two consecutive 0’s either). In that way
we form some good bit strings of length k + 1. In fact, we have built exactly ak good bit strings of
length k + 1. But wait, there’s more! (as they say in those simple-minded TV ads). Another way
to build a good bit string of length k + 1 is to take a good bit string of length k − 1 and add 10 to
the right end. Clearly these will also be good bit strings of length k + 1. And these all end with a
0, so they are all new ones, and not ones we built in the previous step. How many are there of this
type? One for each of the good bit strings of length k − 1, or a total of ak−1 . Thus, so far we have
built ak + ak−1 good bit strings of length k + 1. Now we will show that in fact we have a complete
list of all good bit string of length k + 1, and that will complete the proof that ak+1 = ak + ak−1 .
But before driving that last nail into the coffin, let’s look at the steps outlined above for the case
k + 1 = 4.
The previous paragraph essentially provides an algorithm for building good bit strings of length
k + 1 from good bits strings of lengths k and k − 1. The algorithm instructs us to add 1 to the
right end of all the good bit strings of length k and 10 to the right of all the good bit strings of
length k − 1. Applying the algorithm for the case k + 1 = 4, gives the following list, where the
added bits are put in parentheses to make them stand out. 010(1), 011(1), 101(1), 110(1), 111(1),
01(10), 10(10), and 11(10).
There remains one detail to iron out. It is clear that the algorithm will produce good bit strings
of length k + 1. But, does it produce every good bit string of length k + 1? If it does not, then
the recursive relation we are offering for the solution to the counting problem will eventually begin
to produce answers that are too small, and we will undercount the number of good bit strings. To
see that we do count all good bit strings of length k + 1, consider any particular good bit string of
length k + 1, call it s for short, and look at the right most bit of s. There are two possibilities for
that bit. It could be a 1. If that is so, then when the 1 is removed the remaining bit string is a good
string of length k (it can’t have two adjacent 0’s since s doesn’t have two adjacent 0’s). That means
the bit string s is produced by adding a 1 to the right end of a good bit string of length k, and so s
is produced by the first step in the algorithm. The other option for s is that the right most bit is a
0. But then the second bit in from the right must be a 1, since s is a good bit string, so it doesn’t
have adjacent 0’s. So the last two bits on the right of s are 10. If those two bits are removed, there
remains a good bit string of length k − 1. Thus s is produced by adding 10 to the right end of a
good bit string of length k − 1, and so s is produced by the second case in the algorithm.
In a nutshell, we have shown our algorithm produces ak + ak−1 good bit strings of length k + 1,
and that the algorithm does not miss any good bit strings of length k + 1. Thus we have proved
that ak+1 = ak + ak−1 for all k ≥ 1.
Example 34.1 was explained in excruciating detail. Normally, the verifications will be much more
briefly presented. It takes a while to get used to recursive counting, but once the light goes on, the
beauty and simplicity of the method will become apparent.
Example 34.2. This example is a little silly since it is very easy to write down a formula to solve
the counting problem. But the point of the example is not to find the solution to the problem but rather
to exhibit recursive counting in action. The problem is to compute the total number of individual
squares on an n × n checkerboard. If we let the total number of squares be denoted by sn , then
obviously sn = n^2. For example, an ordinary checkerboard is an 8 × 8 board, and it has a total of
s8 = 8^2 = 64 individual squares. But let's count the number of squares recursively. Clearly s0 = 0.
Now suppose we have computed the values of s0 , s1 , · · · , sk , for some k ≥ 0. We will show how to
compute sk+1 from those known values. To determine sk+1 , draw a (k + 1) × (k + 1) checkerboard.
(You should make a little sketch of such a board for say k + 1 = 5 so you can follow the process
described next.) From that (k + 1) × (k + 1) board, slice off the right hand column of squares, and the
bottom row of squares. What is left over will be a k × k checkerboard, so it will have sk individual
squares. That means that sk+1 = sk + (the number of squares sliced off).
Now ignore the lower right hand corner square for a moment. There are k other squares in the right
hand column that was sliced off. Likewise, ignoring the corner square, there are k other squares in
the bottom row that was sliced off. Hence the total number of squares sliced off was k + k + 1, the
1 accounting for the corner square. Thus
sk+1 = sk + k + k + 1 = sk + 2k + 1.
So the recursive solution to this counting problem is
s0 = 0, and
sk+1 = sk + 2k + 1, for k ≥ 0.
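The recurrence is easy to check against the closed form sn = n^2; here is a small sketch of our own.

    def squares(n):
        """Total number of unit squares on an n-by-n board, computed recursively."""
        s = 0                       # s_0 = 0
        for k in range(n):
            s = s + 2 * k + 1       # s_{k+1} = s_k + 2k + 1
        return s

    assert all(squares(n) == n * n for n in range(50))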
Example 34.3. Suppose we have available an unlimited number of pennies and nickels to deposit
in a vending machine (a really old vending machine it seems, since it even accepts pennies). Let dn
be the number of different ways of depositing a total of n cents in the machine. Just to make sure
we understand the problem, let’s compute dn for a few small values of n. Clearly d0 = 1 since there
is only one way to deposit no money in the machine (namely don’t put any money in the machine!).
d1 = 1 (put in one penny), d2 = 1 (put in two pennies), d3 = 1 (put in three pennies), d4 = 1 (put
in four pennies). Now things start to get exciting! d5 = 2 (put in five pennies or put in one nickel).
And even more thrilling is d6 = 3 (the three options are (1) six pennies, (2) one penny followed by
a nickel, and (3) one nickel followed by a penny). That last count indicates a fact that may not
have been clear: the order in which pennies and nickels are deposited is considered important. With
a little more trial and error with pencil and paper, further values are found to be d7 = 4, d8 = 5,
d9 = 6, d10 = 8, d11 = 11, and d12 = 15. It is hard to see a formula for these values. But it is
relatively easy to write down a recursive relation that produces this sequence of values. Think of it
this way, suppose we wanted to put n cents in the machine, where n ≥ 5. We can make the first
coin either a penny or a nickel. If we make the first coin a penny, then we will need to add n − 1
more cents, which can be done in dn−1 ways. On the other hand, if we make the first coin a nickel,
we will need to deposit n − 5 more cents, and that can be done in dn−5 ways. By the sum rule of
counting, we conclude that the number of ways of depositing n cents is dn−1 + dn−5 . In other words,
dn = dn−1 + dn−5 for n ≥ 5.
Since our recursive relation for dn does not kick in until n reaches 5, we will need to include
d0 , d1 , d2 , d3 , and d4 as initial terms. So the recursive solution to this counting problem is
d0 = 1 d1 = 1 d2 = 1 d3 = 1 d4 = 1
for n ≥ 5, dn = dn−1 + dn−5
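Translating the recurrence into a few lines of code (our own sketch) reproduces the values d5 = 2 through d12 = 15 found above by hand.

    def deposits(n):
        """Ordered ways to deposit n cents using pennies and nickels:
        d_0 = ... = d_4 = 1, and d_n = d_{n-1} + d_{n-5} for n >= 5."""
        d = [1, 1, 1, 1, 1]
        for m in range(5, n + 1):
            d.append(d[m - 1] + d[m - 5])
        return d[n]

    print([deposits(n) for n in range(13)])   # [1, 1, 1, 1, 1, 2, 3, 4, 5, 6, 8, 11, 15]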
Example 34.4 (The Tower of Hanoi). The classic example of recursive counting concerns the
story of the Tower of Hanoi. A group of monks wished a magical tower to be constructed from 1000
stone rings. The rings were to be of 1000 different sizes. The size and composition of the rings was
to be designed so that any ring could support the entire weight of all of the rings smaller than itself,
but each ring would be crushed beneath the weight of any larger ring.
The monks hired the lowest bidder to construct the tower in a clearing in the dense jungle nearby.
Upon completion of construction the engineers brought the monks to see their work. The monks
admired the exquisite workmanship, but informed the engineers that the tower was not in the proper
clearing.
In the jungle there were only three permanent clearings. The monks had labelled them A, B and C.
The engineers had labelled them in reverse order. The monks instructed the engineers to move the
tower from clearing A to clearing C!
Because of the massive size of the rings, the engineers could only move one per day. No ring could
be left anywhere in the jungle except one of A, B, or C. Finally each clearing was only large enough
so that rings could be stored there by stacking them one on top of another.
The monks then asked the engineers how long it would take for them to fix the problem.
Before they all flipped a gasket, the most mathematically talented engineer came upon the following
solution.
Let Hn denote the minimum number of days required to move an n ring tower from A to C under
the constraints given. Then H1 = 1, and in general an n ring tower can be moved from A to C by
first moving the top (n − 1) rings from A to B leaving the bottom ring at A, then moving the bottom
ring from A to C, and then moving the top (n − 1) rings from clearing B to clearing C. That shows
Hn ≤ 2 · Hn−1 + 1, for n ≥ 2, and a little more thought shows the algorithm just described cannot
be improved upon. Thus Hn = 2 · Hn−1 + 1.
Using the initial condition H1 = 1 together with the recursive relation Hn = 2 · Hn−1 + 1, we can
generate terms of the sequence: 1, 3, 7, 15, 31, 63, ..., and in general Hn = 2^n − 1.
So, the problem would be fixed in 2^1000 − 1 days, or approximately 2.93564 × 10^296 centuries. Now,
that is job security!
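The engineer's recurrence can be iterated directly; the sketch below (ours, not part of the text) also checks it against the familiar closed form Hn = 2^n − 1.

    def hanoi_moves(n):
        """Minimum days to move an n-ring tower: H_1 = 1, H_n = 2*H_{n-1} + 1."""
        h = 1
        for _ in range(n - 1):
            h = 2 * h + 1
        return h

    assert all(hanoi_moves(n) == 2 ** n - 1 for n in range(1, 30))
    print(hanoi_moves(10))     # 1023 days for a 10-ring tower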
Here are a few general rules for solving counting problems recursively:
(1) work out the values for several small cases by brute force,
(2) think recursively: how can a larger case be solved if the solutions to smaller cases are known,
and
(3) check the numbers produced by the recursive solution to make sure they agree with the values
obtained by brute force.
Exercises
Exercise 34.1. On day zero, a piggy bank contains $0. Each day, one more penny is added to
the bank than the day before. So, on day 1, one penny is added, on day 2, two pennies are added.
Write a recursive formula for the total number of pennies in the bank each day, n = 0, 1, 2, . . ..
Exercise 34.2. Al climbs stairs by taking either one or two steps at a time. For example, he can
climb a flight of three steps in three different ways: (1) one step, one step, one step or (2) two step,
one step, or (3) one step, two step. Determine a recursive formula for the number of different ways
Al can climb a flight of n steps.
Exercise 34.3. Find a recurrence relation for the number of bit strings of length n that contain an
even number of 0’s.
Exercise 34.4. Find a recurrence relation for the number of bit strings of length n that contain
two consecutive 0’s.
Exercise 34.5. Find a recurrence relation for the number of bit strings of length n that contain
the string 01.
Exercise 34.6. Find a recurrence relation for the number of ternary strings of length n that contain
two consecutive 0’s.
Exercise 34.7. Find a recurrence relation for the number of subsets of {1, 2, 3, . . . , n} that do not
contain any consecutive integers. Examples for n = 9: the subset {1, 3, 8} is good, but the subset
{2, 5, 6, 9} is bad since it contains the consecutive integers 5, 6.
Exercise 34.8. Suppose in the original Tower of Hanoi problem there are four clearings A, B, C, D.
Find a recursive relation for Jn , the minimum number of moves needed to transfer the tower from
clearing A to clearing D.
Problems
Problem 34.1. Suppose on December 31, 2000, a deposit of $100 is made in a savings account
that pays 10% annual interest (Ah, those were the days!). So one year after the initial deposit, on
December 31, 2001, the account will be credited with $10, and have a value of $110. On December
31, 2002 that account will be credited with an additional $11, and have value $121. Find a recursive
relation that gives the value of the account n years after the initial deposit.
Problem 34.2. Sal climbs stairs by taking either one, two, or three steps at a time. Determine a
recursive formula for the number of different ways Sal can climb a flight of n steps. In how many
ways can Sal climb a flight of 10 steps?
Problem 34.3. Passwords for a certain computer system are strings of uppercase letters. A valid
password must contain an even number of X’s. Determine a recurrence relation for the number of
valid passwords of length n.
Problem 34.4. A (cheap) vending machine accepts pennies, nickels, and dimes. Let dn be the
number of ways of depositing n cents in the machine, where the order in which the coins are
deposited matters. Determine a recurrence relation for dn . Give the initial conditions.
Problem 34.5. Suppose the Tower of Hanoi rules are changed so that stones may only be trans-
ferred to an adjacent clearing in one move. Let In be the minimum number of moves required to
transfer the tower from clearing A to clearing C. Find a recurrence relation for In.
Problem 34.6. Find a recurrence relation for the number of binary strings of length n which do
not contain the substring 010.
Problem 34.7. Find a recurrence relation for the number of ternary strings of length n that contain
three consecutive zeroes.
Problem 34.8. Find a recurrence relation for the number of quaternary strings of length n which
contain two consecutive 1's.
Problem 34.9. Let n be a positive integer. Find a recurrence relation that counts the number of
increasing sequences of distinct integers that start with 1 and end with n. Example: For n = 4,
there are 4 such sequences. They are 1, 4; 1, 2, 4; 1, 3, 4; and 1, 2, 3, 4.
Chapter 35
Solutions to Recurrence Relations
In chapter 34, it was pointed out that recursively defined sequences suffer from one major drawback:
In order to compute a particular term in the sequence, it is necessary to first compute all the terms
of the sequence leading up to the one that is wanted. Imagine the chore of calculating the 250th
Fibonacci number, f250 ! For problems of computation, there is nothing like having a formula like
an = n^2, into which it is merely necessary to plug the number of interest.
It may be possible to find a formula for a sequence that is defined recursively. When that can be
done, you have the best of both the formula and recursive worlds. If we find a formula for the terms
of a recursively defined sequence, we say we have solved the recursion.
Example 35.1. Here is an example: The sequence {an } is defined recursively by the initial con-
dition a0 = 2, and the recursive formula an = 2an−1 − 1 for n ≥ 1. If the first few terms of this
sequence are written out, the results are
2, 3, 5, 9, 17, 33, 65, ...,
and it shouldn't be too long before the pattern becomes clear. In fact, it looks like an = 2^n + 1 is the
formula for an . You do have to recognize the slightly hidden powers of 2: 1, 2, 4, 8, 16, 32, 64, . . . .
To prove that guess is correct, induction would be the best way to go. Here are the details. Just
to make everything clear, here is what we are going to show: If a0 = 2, and an = 2an−1 − 1 for
n ≥ 1, then an = 2^n + 1 for all n ≥ 0. The basis for the inductive proof is the case n = 0. The
correct value for a0 is 2, and the guessed formula has value 2 when n = 0, so that checks out.
Now for the inductive step: suppose that the formula for ak is correct for a particular k ≥ 0. That
is, assume ak = 2^k + 1 for some k ≥ 0. Let's show that the formula must also be correct for
ak+1. That is, we want to show ak+1 = 2^(k+1) + 1. Well, we know that ak+1 = 2ak − 1, and hence
ak+1 = 2(2^k + 1) − 1 = 2^(k+1) + 2 − 1 = 2^(k+1) + 1, just as was to be proved. It can now be concluded
that the formula we guessed is correct for all n ≥ 0.
In example 35.1, it was possible to guess the correct formula for an after looking at a few terms. In
most cases the formula will be so complicated that that sort of guessing will be out of the question.
There is a method that will nearly automatically solve any recurrence of the form a0 = a and, for
n ≥ 1, an = ban−1 + c (where a, b, c are constants). The method is called unfolding.
Example 35.2. As an example, let’s solve a0 = 2 and, for n ≥ 1, an = 5 + 2an−1 . The plan is
to write down the recurrence relation, and then substitute for an−1 , then for an−2 , and so on, until
we reach a0 . It looks like this
an = 5 + 2an−1
= 5 + 2(5 + 2an−2) = 5 + 5(2) + 2^2 an−2
= 5 + 5(2) + 2^2 (5 + 2an−3) = 5 + 5(2) + 5(2^2) + 2^3 an−3.
If this substitution is continued, eventually we reach an expression we can compute in closed form:
an = 5(1 + 2 + 2^2 + ... + 2^(n−1)) + 2^n a0 = 5(2^n − 1) + 2^n · 2 = 7 · 2^n − 5.
In the next to last step we use the formula for adding the terms of a geometric sequence.
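The unfolded formula can be verified numerically; here is a sketch of ours.

    def a_recursive(n):
        """a_0 = 2 and a_n = 5 + 2*a_{n-1} for n >= 1."""
        a = 2
        for _ in range(n):
            a = 5 + 2 * a
        return a

    # Closed form obtained by unfolding: a_n = 7*2^n - 5.
    assert all(a_recursive(n) == 7 * 2 ** n - 5 for n in range(25))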
Exercises
Exercise 35.1. Guess the solution to a0 = 2, and a1 = 4, and, for n ≥ 2, an = 4an−1 − 3an−2
and prove your guess is correct by induction.
Problems
Problem 35.1. Guess the solution to a0 = 1, and a1 = 5, and, for n ≥ 2, an = an−1 + 2an−2 and
prove your guess is correct by induction.
Chapter 36
The Method of Characteristic Roots
There is no method that will solve all recurrence relations. However, for one particular type,
there is a standard technique. The type is called a linear recurrence relation with constant
coefficients. In such a recurrence relation, the recurrence formula has the form
an = c1 an−1 + c2 an−2 + ... + ck an−k + f(n),
where the ci are constants and f(n) is some function of n.
The degree of the recurrence is k, the number of terms we need to go back in the sequence to
compute each new term. If f (n) = 0, then the recurrence relation is called homogeneous. Otherwise
it is called non-homogeneous.
In chapter 35, we noted that some simple non-homogeneous linear recurrence relations with constant
coefficients can be solved by unfolding. This method is not powerful enough for more general
problems. In this chapter we introduce a basic method that, in principle at least, can be used to
solve any homogeneous linear recurrence relation with constant coefficients.
We begin by considering the degree 2 case. That is, we have a recurrence relation of the form
an = c1 an−1 + c2 an−2 , for n ≥ 2, where c1 and c2 are real constants. We must also have two initial
conditions a0 and a1 . That is, we are given a0 and a1 and the formula an = c1 an−1 + c2 an−2 , for
n ≥ 2. Notice that c2 ≠ 0 or else we have a linear recurrence relation with constant coefficients
and degree 1. What we seek is a closed form formula for an , which is a function of n alone, and
which is therefore independent of the previous terms of the sequence.
Here’s the technique in a specific example: The problem we will solve is to find a formula for the
terms of the sequence
a0 = 4 and a1 = 8, with
an = 4an−1 + 12an−2 , for n ≥ 2.
The first thing to do is to ignore the initial conditions, and concentrate on the recurrence relation.
And the way to solve the recurrence relation is to guess the solution. Well, actually, it is to guess
the form of the solution - an educated guess! For such a recurrence you should guess that the
solution looks like an = r^n, for some constant r. In other words, guess the solution is simply the
powers of some fixed number. The good news is that this guess will always be correct! You will
always find some solutions of this form. When this guess is plugged into the recurrence relation and
the equation is simplified, the result is an equation that can be solved for r. That equation is called
the characteristic equation for the recurrence. In our example, when an = r^n for each n, the
result is r^n = 4r^(n−1) + 12r^(n−2), and canceling r^(n−2) from each term, and rearranging the equation,
we get r^2 − 4r − 12 = 0. That's the characteristic equation. The left side can be factored, and the
equation then looks like (r − 6)(r + 2) = 0, and we see the solutions for r are r = 6 and r = −2.
And, sure enough, if you check it out, you will see that an = 6^n and an = (−2)^n both satisfy the
given recurrence relation. In other words, we have found two solutions of the recurrence.
Using the characteristic equation, we have a method of finding some solutions to a recurrence
relation. This method will not find all possible solutions however. BUT... if we find all the
solutions to the characteristic equation, then they can be combined in a certain way to produce all
possible solutions to the recurrence relation. The fact to remember is that if r = a, b are the two
solutions to the characteristic equation (for a recurrence of order two), then every possible solution
to the linear homogeneous recurrence relation must look like αa^n + βb^n for some constants α, β.
In the example we have been working on, every possible solution looks like
an = α(6)^n + β(−2)^n.
Once we have figured out the general solution to the recurrence relation, it is time to think about
the initial conditions. In our case, the initial conditions are a0 = 4 and a1 = 8. The idea is to select
the constants α and β of the general solution an = α6^n + β(−2)^n so it will produce the correct two
initial values. For n = 0 we see we need 4 = a0 = α6^0 + β(−2)^0 = α + β, and for n = 1, we need
8 = a1 = α6^1 + β(−2)^1 = 6α − 2β. Now, we solve the following pair of equations for α and β:
α + β = 4,
6α − 2β = 8.
Performing a bit of algebra, we learn that α = 2 and β = 2. Thus the solution to the recurrence is
an = 2 · 6^n + 2 · (−2)^n.
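A quick numerical check of the closed form against the original recurrence (our own sketch, not part of the text):

    def a_recursive(n):
        """a_0 = 4, a_1 = 8, a_n = 4*a_{n-1} + 12*a_{n-2}."""
        a, b = 4, 8
        for _ in range(n):
            a, b = b, 4 * b + 12 * a
        return a

    def a_closed(n):
        return 2 * 6 ** n + 2 * (-2) ** n

    assert all(a_recursive(n) == a_closed(n) for n in range(20))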
To summarize, the method of characteristic roots proceeds in four steps:
(1) Write down the characteristic equation of the recurrence relation.
(2) Find the roots of the characteristic equation.
(3) Use the characteristic roots to write down the general solution to the recurrence relation.
(4) Select the constants in the general solution to produce the correct initial conditions.
One catch with the method of characteristic equation occurs when the equation has repeated roots.
Suppose, for example, that when the characteristic equation is factored the result is (r − 2)(r −
2)(r − 3)(r + 5) = 0. The characteristic roots are 2, 2, 3 and −5. Here 2 is a repeated root. If we
follow the instructions given above, then the general solution we would write down is
an = α2^n + β2^n + γ3^n + δ(−5)^n. (36.1)
However, this expression will not include all possible solutions to the recurrence relation. Happily,
the problem is not too hard to repair: each time a root of the characteristic equation is repeated,
multiply it by an additional factor of n in the general solution, and then proceed with step 4 as
described earlier.
For our example, we modify one of the 2^n terms in equation 36.1. The correct general solution
looks like
an = α2^n + βn · 2^n + γ3^n + δ(−5)^n.
Notice the extra factor of n in the second term. If (r − 2) had been a fourfold factor of the
characteristic equation, in other words, if 2 had been a characteristic root four times, then the part
of the general solution involving the 2's would look like
(α + βn + γn^2 + δn^3)2^n.
Let’s describe the method of characteristic equation a little more formally. First, the charac-
teristic equation is denoted by χ(x) = 0. Notice that the degree of χ(x) coincides with the degree
of the recurrence relation. Notice also that the non-leading coefficients of χ(x) are simply the
negatives of the coefficients of the recurrence relation. In general, the characteristic equation of
an = c1 an−1 + ... + ck an−k is
x^k − c1 x^(k−1) − c2 x^(k−2) − ... − ck = 0.
A number r (possibly complex) is a characteristic root if χ(r) = 0. From basic algebra we know
that r is a root of a polynomial if and only if (x − r) is a factor of the polynomial. When χ(x) is
a degree 2 polynomial, by the quadratic formula, either χ(x) = (x − r1 )(x − r2 ), where r1 ̸= r2 , or
χ(x) = (x − r)2 , for some r. So there are two theorems about degree 2 linear recurrence relations
with constant coefficients.
Theorem 36.1. Let c1 and c2 be real numbers. Suppose that the polynomial χ(x) = x^2 − c1 x − c2
has two distinct roots r1 and r2. Then a sequence a : N → R is a solution of the recurrence relation
an = c1 an−1 + c2 an−2, for n ≥ 2 if and only if am = αr1^m + βr2^m, for all m ∈ N, and for some
constants α and β. The constants are determined by the initial conditions (see equation 36.2).
Proof. If am = αr1^m + βr2^m for all m ∈ N, for some constants α and β, then since ri^2 − c1 ri − c2 = 0,
we have ri^2 = c1 ri + c2, for i = 1, 2. Hence, for n ≥ 2, we have
an = αr1^n + βr2^n = αr1^(n−2) r1^2 + βr2^(n−2) r2^2 = αr1^(n−2)(c1 r1 + c2) + βr2^(n−2)(c1 r2 + c2)
= c1 (αr1^(n−1) + βr2^(n−1)) + c2 (αr1^(n−2) + βr2^(n−2)) = c1 an−1 + c2 an−2.
Conversely, if a is a solution of the recurrence relation and has initial terms a0 and a1 , then one
checks that the sequence am = αr1^m + βr2^m with
α = (a1 − a0 · r2)/(r1 − r2), and β = (a0 · r1 − a1)/(r1 − r2) (36.2)
also satisfies the relation and has the same initial conditions. The equations for α and β come from
solving the system of linear equations
a0 = α(r1)^0 + β(r2)^0 = α + β
a1 = α(r1)^1 + β(r2)^1 = αr1 + βr2.
Example 36.2. Solve the recurrence relation a0 = 2, a1 = 3, and an = an−2, for n ≥ 2.
Solution. The recurrence relation is a linear homogeneous recurrence relation of degree 2 with
constant coefficients c1 = 0 and c2 = 1. The characteristic polynomial is
χ(x) = x^2 − 0 · x − 1 = x^2 − 1.
The characteristic polynomial factors as
x^2 − 1 = (x − 1)(x + 1),
so the characteristic roots are 1 and −1, and the general solution is an = α·1^n + β·(−1)^n. The initial conditions give
2 = a0 = α·1^0 + β·(−1)^0 = α + β
3 = a1 = α·1^1 + β·(−1)^1 = α + β(−1) = α − β.
Adding the two equations eliminates β and gives 5 = 2α, so α = 5/2. Substituting this into the first
equation, 2 = 5/2 + β, we see that β = −1/2. Thus, our solution is
an = (5/2) · 1^n + (−1/2) · (−1)^n = 5/2 − (1/2) · (−1)^n.
Example 36.3. Solve the recurrence relation a1 = 3, a2 = 5, and an = 5an−1 − 6an−2 for n ≥ 3.
Solution. The characteristic polynomial is χ(x) = x^2 − 5x + 6 = (x − 2)(x − 3), so the characteristic
roots are 2 and 3, and the general solution is an = α2^n + β3^n. The initial conditions give
3 = a1 = α2^1 + β3^1 = 2α + 3β
5 = a2 = α2^2 + β3^2 = 4α + 9β.
Multiplying the first equation by 2, we obtain the system
6 = 4α + 6β
5 = 4α + 9β.
Subtracting the second equation from the first eliminates α and yields 1 = −3β. So, we have found
that β = −1/3. Substitution into the first equation yields 3 = 2α + 3 · (−1/3), so α = 2. Thus
am = 2 · 2^m − (1/3) · 3^m = 2^(m+1) − 3^(m−1), for all m ≥ 1.
The other case we mentioned had a characteristic polynomial of degree two with one repeated root.
Since the proof is similar we simply state the theorem.
Theorem 36.4. Let c1 and c2 be real numbers with c2 ≠ 0 and suppose that the polynomial
x^2 − c1 x − c2 has a root r with multiplicity 2, so that x^2 − c1 x − c2 = (x − r)^2. Then, a sequence
a : N → R is a solution of the recurrence relation an = c1 an−1 + c2 an−2, for n ≥ 2 if and only if
am = (α + βm)r^m, for all m ∈ N, and for some constants α and β.
Example 36.5. Solve the recurrence relation a0 = −1, a1 = 4 and an = 4an−1 − 4an−2 , for n ≥ 2.
Solution. In this case we have χ(x) = x^2 − 4x + 4 = (x − 2)^2. So, we may suppose that
am = (α + βm)2^m. The initial conditions give
−1 = a0 = (α + β · 0)2^0 = α · 1 = α
4 = a1 = (α + β · 1)2^1 = 2(α + β).
Substituting α = −1 into the second equation gives 4 = 2(β − 1), so 2 = β − 1 and β = 3. Therefore
am = (3m − 1)2^m, for all m ∈ N.
Theorem 36.6. Let c1, c2, ..., ck ∈ R with ck ≠ 0. Suppose that the characteristic polynomial
factors as
χ(x) = (x − r1)^(j1) (x − r2)^(j2) · · · (x − rs)^(js),
where r1, r2, ..., rs are distinct roots of χ(x), and j1, j2, ..., js are positive integers that sum to
k. Then a sequence a : N → R is a solution of the recurrence relation
an = c1 an−1 + c2 an−2 + ... + ck an−k, for n ≥ k,
if and only if
am = p1(m) r1^m + p2(m) r2^m + ... + ps(m) rs^m, for all m ∈ N,
where
pi(m) = α0,i + α1,i m + α2,i m^2 + ... + αji−1,i m^(ji−1), for 1 ≤ i ≤ s, and the αl,i's are constants.
There is a problem with the general case. It is true that given the recurrence relation we can
simply write down the characteristic polynomial. However it can be quite a challenge to factor
it as required by the theorem. Even if we succeed in factoring it we are faced with the tedious
task of setting up and solving a system of k linear equations in k unknowns (the αl,i ’s). While in
theory such a system can be solved using the methods of elimination or substitution covered in a
college algebra course, in practice, the amount of labor involved can become overwhelming. For
this reason, computer algebra systems are often used in practice to help solve systems of equations,
or even the original recurrence relation.
Exercises
Exercise 36.1. For each of the following sequences find a recurrence relation satisfied by the se-
quence. Include a sufficient number of initial conditions to completely specify the sequence.
(a) an = 2n + 3, n ≥ 0
(b) an = 3 · 2^n, n ≥ 1
(c) an = n^2, n ≥ 1
(d) an = n + (−1)n , n ≥ 0
Exercise 36.9. Find a closed form formula for the terms of the Fibonacci sequence: f0 = 0, f1 = 1,
and for n ≥ 2, fn = fn−1 + fn−2 .
Problems
Solve each of the recurrence relations using the method of characteristic roots:
Chapter 37
Solving Non-homogeneous Recurrences
When a linear recurrence relation with constant coefficients for a sequence {sn} looks like
sn = c1 sn−1 + c2 sn−2 + ... + ck sn−k + f(n),
where f(n) is some (nonzero) function of n, then the recurrence relation is said to be non-
homogeneous. For example, sn = 2sn−1 + n^2 + 1 is a non-homogeneous recurrence. Here
f(n) = n^2 + 1. The methods used in the last chapter are not adequate to deal with non-homogeneous
problems. But it wasn’t all a waste since those methods do provide one step in the solution of non-
homogeneous problems.
The steps used to solve non-homogeneous linear recurrence relations with constant coefficients are:
Step (1): First, write down the related homogeneous recurrence relation obtained by dropping the f(n)
term. Now solve this and write down the general solution. We learned to do this in chapter 36. For
example, in the case of no repeated roots, the general solution will look something like:
sn = a1 r1^n + a2 r2^n + ... + ak rk^n,
where r1, ..., rk are the characteristic roots and a1, ..., ak are constants.
Step (2): Next, find one particular solution to the original non-homogeneous recursion. In other
words, one specific sequence that obeys the recursive formula (ignoring the initial conditions).
A method for finding a particular solution that works in many cases is to guess! Actually, it
is to make an educated guess. Reasonable guesses depend on the form of f (n). There is an
algorithm that will produce the correct guess, but it is so complicated it isn’t worth learning
for the few simple examples we will be doing. Instead, rely on the following guidelines to
guess the form of a particular solution.
Roughly, the plan is the guess a particular solution that is the most general function of the
same type as f (n). Specifically, table 37.1 shows reasonable guesses.
These guesses can be mixed-and-matched. For example, if
f(n) = 3n^2 + 5^n,
then a reasonable guess for a particular solution is
An^2 + Bn + C + D · 5^n.
Once a guess has been made for the form of a particular solution, that guess is plugged into
the recurrence relation, and the coefficients A, B, · · · are determined. In this way a specific
particular solution will be found.
It will sometimes happen that when the equations are set up to determine the coefficients of
the particular solution, an inconsistent system will appear. In such a case, as with repeated
characteristic roots, the trick is (more-or-less) to multiply the guess for the particular solution
by n, and try again.
Step (3): Once a particular solution has been found, add the particular solution of step (2) to the
general solution of the homogeneous recurrence found in step (1). If we denote a particular
solution by h(n), then the total general solution looks like
sn = a1 r1^n + a2 r2^n + ... + ak rk^n + h(n).
Step (4): Invoke the initial conditions to determine the values of the coefficients a1 , a2 , · · · , ak just as
we did for the homogeneous problems in chapter 36.
The major oversight made solving a non-homogeneous recurrence relation is trying to determine
the coefficients a1 , a2 , · · · , ak before the particular solution is added to the general solution. This
mistake will usually lead to inconsistent information about the coefficients, and no solution to the
recurrence will be found.
Example 37.1. Let’s solve the Tower of Hanoi recurrence using this method.
The recurrence is H0 = 0, and, for n ≥ 1, Hn = 2Hn−1 + 1. We know the closed form formula for
Hn is 2^n − 1 already, but let's work it out using the method outlined above.
Step (1): Find the general solution of related homogeneous recursion (indicated by the superscript (h)):
Hn^(h) = 2Hn−1^(h). That will be Hn^(h) = A·2^n.
Step (2): Guess the particular solution (indicated by superscript (p)): Hn^(p) = B, a constant. Plugging
that guess into the recurrence gives B = 2B + 1, and so we see B = −1.
Step (3): Hence, the general solution to the Tower of Hanoi recurrence is
Hn = A·2^n − 1.
Step (4): Now, use the initial condition to determine A: When n = 0, we want 0 = A·2^0 − 1 which
means A = 1. Thus, we find the expected result:
Hn = 2^n − 1, for n ≥ 0.
Example 37.2. Here is a more complicated example worked out in detail to exhibit the method.
s1 = 2, s2 = 5 and,
sn = sn−1 + 6sn−2 + 3n − 1, for n ≥ 3.
Step (1): Find the general solution of sn = sn−1 + 6sn−2 . After finding the characteristic equation, and
the characteristic roots, the general solution turns out to be sn = a1·3^n + a2·(−2)^n.
Step (2): To find a particular solution let’s guess that there is a solution h(n) that looks like h(n) =
an + b, where a and b are to be determined. To find values of a and b that work, we substitute
this guess for a solution into the original recurrence relation. In this case, the result of plugging
in the guess (sn = h(n) = an + b) gives us:
an + b = a(n − 1) + b + 6(a(n − 2) + b) + 3n − 1.
If this equation is to be correct for all n, then, in particular, it must be correct when n = 0
and when n = 1, and that tells us that
−13a + 6b − 1 = 0 and,
6a + 3 − 13a + 6b − 1 = 0.
Solving these two equations gives a = −1/2 and b = −11/12, so h(n) = −(1/2)n − 11/12.
Step (3): Write down the general solution to the original non-homogeneous problem by adding the par-
ticular solution of step (2) to the general solution from step (1) getting:
sn = a1·3^n + a2·(−2)^n − (1/2)n − 11/12.
Step (4): Now a1 , a2 can be calculated: For n = 1, the first initial condition gives
2 = a1·3^1 + a2·(−2)^1 − (1/2)·1 − 11/12 = 3a1 − 2a2 − 17/12,
and, for n = 2, the second initial condition gives
5 = a1·3^2 + a2·(−2)^2 − (1/2)·2 − 11/12 = 9a1 + 4a2 − 23/12.
Solving these two equations for a1 and a2, we find that a1 = 11/12 and a2 = −1/3. So the solution
to the recurrence is
sn = (11/12)·3^n − (1/3)·(−2)^n − (1/2)n − 11/12.
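Since the algebra invites arithmetic slips, it is worth checking the final formula against the recurrence numerically. Here is one way to do it (our own sketch, using the fractions module to avoid round-off).

    from fractions import Fraction as F

    def s_recursive(n):
        """s_1 = 2, s_2 = 5, s_n = s_{n-1} + 6*s_{n-2} + 3n - 1 for n >= 3."""
        a, b = 2, 5                      # s_1, s_2
        for m in range(3, n + 1):
            a, b = b, b + 6 * a + 3 * m - 1
        return b if n >= 2 else a

    def s_closed(n):
        return F(11, 12) * 3 ** n - F(1, 3) * (-2) ** n - F(1, 2) * n - F(11, 12)

    assert all(s_closed(n) == s_recursive(n) for n in range(1, 15))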
Exercises
Use the general solutions for the related homogeneous problems of chapter 36 to help solve the
following non-homogeneous recurrence relations with initial conditions.
Problems
Chapter 38
Graphs
In chapter 8 we represented a relation with a graph. In this chapter we discuss a more general
notion of a graph.
There is a lot of new vocabulary to absorb concerning graphs! For this chapter, a graph will consist
of a number of points (called vertices) (singular: vertex) together with lines (called edges) joining
some (possibly none, possibly all) pairs of vertices. Unlike the graphs of earlier chapters, we will
not allow an edge from a vertex back to itself (so no loops allowed), we will not allow multiple edges
between vertices, and the edges will not be directed (there will be no edges with arrowheads on one
or both ends). All of our graphs will have a finite vertex set, and consequently a finite number of
edges. Graphs are typically denoted by an uppercase letter such as G or H.
If you would like a formal definition: a graph G consists of a set of vertices V and a set E of
edges, where an edge t ∈ E is written as an unordered pair of vertices {u, v}, (in other words, a set
consisting of two different vertices). We say that the edge t = {u, v} has endpoints u and v, and
that the edge t is incident to both u and v. The vertices u and v are adjacent when there is an
edge with endpoints u and v; otherwise they are not adjacent. Such a formal definition is necessary,
but a more helpful way to think of a graph is as a diagram.
Here is an example of a graph G with vertex set {a, b, c, d, e} illustrating these concepts.
[Figure: a diagram of the graph G on the vertices a, b, c, d, e.]
The placement of the vertices in a diagram representing a graph is (within reason!) not important.
Here is another diagram of that same graph G.
[Figure: another diagram of the same graph G, with the vertices placed differently.]
Applying the a-picture-is-worth-a-thousand-words principle, for the small graphs we will be working
with, a graph diagram is generally the easiest way to represent a graph.
There are two standard ways to represent a graph in computer memory, both involving matrices
(in other words, tables of numbers). The matrices are of a special type called 0, 1-matrices since
the table entries will all be either 0 or 1.
Adjacency matrix: If there are n vertices in the graph G, the adjacency matrix is an n by n
square table of numbers. The rows and columns of the table are labeled with the symbols used to
name the vertices. The names are used in the same order for the rows and columns, so there are
n! possible labelings. Often there will be some natural choice of the order of the labels, such as
alphabetic or numeric order. The entries in the table are determined as follows: the matrix entry
with row label x and column label y is 1 if x and y are adjacent, and 0 otherwise.
Incidence matrix: Suppose the graph G has n vertices and m edges. The table will have n rows,
labeled with the names of the vertices, and m columns labeled with the edges. Which of the n!·m!
possible orderings of these labels is used has to be specified in some way. The entry in the row labeled
with vertex u and column labeled with edge e is 1 if e is incident with u, and 0 otherwise. Since
every edge is incident to exactly two vertices, every column of the incidence matrix will have exactly
two 1’s.
Example 38.1. Let G have vertex set {u1 , u2 , u3 , u4 , u5 } and edges {u1 , u2 }, {u2 , u3 }, {u3 , u4 },
{u4 , u5 }, {u5 , u1 }, {u5 , u3 }. A graphical representation of G is
[Figure: a diagram of the graph G of example 38.1 on the vertices u1, u2, u3, u4, u5.]
Here are the adjacency matrix AG , and the incidence matrix MG of G using the vertices and edges
in the orders given above.
AG =
0 1 0 0 1
1 0 1 0 0
0 1 0 1 1
0 0 1 0 1
1 0 1 1 0

MG =
1 0 0 0 1 0
1 1 0 0 0 0
0 1 1 0 0 1
0 0 1 1 0 0
0 0 0 1 1 1
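For larger graphs these matrices are usually built by a program rather than by hand. Here is a minimal sketch (ours, not part of the text); it assumes the vertex and edge orderings given above.

    vertices = ["u1", "u2", "u3", "u4", "u5"]
    edges = [("u1", "u2"), ("u2", "u3"), ("u3", "u4"),
             ("u4", "u5"), ("u5", "u1"), ("u5", "u3")]

    index = {v: i for i, v in enumerate(vertices)}
    n, m = len(vertices), len(edges)

    # Adjacency matrix: A[i][j] = 1 when vertex i and vertex j are adjacent.
    A = [[0] * n for _ in range(n)]
    # Incidence matrix: M[i][k] = 1 when vertex i is an endpoint of edge k.
    M = [[0] * m for _ in range(n)]

    for k, (u, v) in enumerate(edges):
        i, j = index[u], index[v]
        A[i][j] = A[j][i] = 1
        M[i][k] = M[j][k] = 1

    for row in A:
        print(row)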
Unlike most areas of mathematics, it is possible to point to a specific person as the creator of graph
theory and a specific problem that led to its creation. On the following pages the Seven Bridges of
Königsberg problem and the graph theoretic approach to a solution provided by Leonhard Euler in
1736 are described.
The notion of a graph discussed in the article is a little more general than the graphs we will be
working with in the chapter. To model the bridge problem as a graph, Euler allowed multiple edges
between vertices. In modern terminology, graphs with multiple edges are called multigraphs.
While we are on the topic of extensions of the definition of a graph, let’s also mention the case of
graphs with loops. Here we allow an edge to connect a vertex to itself, forming a loop. Multigraphs
with loops allowed are called pseudographs. Another generalization of the basic concept of a
graph is hypergraph: in a hypergraph, a single edge is allowed to connect not just two, but any
number of vertices.
Finally, for all these various types of graphs, we can consider the directed versions in which the
edges are given arrowheads on one or both ends to indicate the permitted direction of travel along
that edge.
For a vertex v in a graph we denote the number of edges incident to v as the degree of v, written
as deg(v). For example, consider the graph
[Figure: the graph G of example 38.1 again, on the vertices u1, u2, u3, u4, u5.]
Vertices u1 , u2 , u4 each have degree 2, while deg(u3 ) and deg(u5 ) are each 3. The list of the degrees
of the vertices of a graph is called the degree sequence of the graph. The degrees are traditionally
listed in increasing order. So the degree sequence of the graph G above is 2, 2, 2, 3, 3.
The following theorem is usually referred to as the First Theorem of Graph Theory; it is also called
the Hand-Shaking Theorem.
Theorem 38.2. The sum of the degrees of the vertices of a graph equals twice the number of edges.
In particular, the sum of the degrees is even.
Proof. Notice that, when adding the degrees for the vertices, each edge will contribute two to the
total, once for each end. So the sum of the degrees is twice the number of edges.
For example, in the graph G above, there are 6 edges, and the sum of the degrees of the vertices is
2 + 2 + 2 + 3 + 3 = 12 = 2(6).
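The theorem is easy to confirm on any particular graph; for example (our own sketch, using the graph of example 38.1):

    from collections import Counter

    edges = [("u1", "u2"), ("u2", "u3"), ("u3", "u4"),
             ("u4", "u5"), ("u5", "u1"), ("u5", "u3")]

    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    print(sorted(degree.values()))            # degree sequence: [2, 2, 2, 3, 3]
    assert sum(degree.values()) == 2 * len(edges)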
Corollary 38.3. A graph must have an even number of vertices of odd degree.
Proof. Split the vertices into two groups: the vertices with even degree and the vertices with odd
degree. The sum of all the degrees is even, and the sum of all the even degrees is also even. That
implies that the sum of all the odd degrees must also be even. Since an odd number of odd integers
adds up to an odd integer, it must be that there is an even number of odd degrees.
It is convenient to have names for some particular types of graphs that occur frequently.
For n ≥ 1, Kn denotes the graph with n vertices where every pair of vertices is adjacent. Kn is the
complete graph on n vertices. So Kn is the largest possible graph with n vertices in the sense
that it has the maximum possible number of edges.
[Figure: the complete graph K6.]
For n ≥ 3, Cn denotes the graph with n vertices, v1 , ..., vn , where each vertex in that list is adjacent
to the vertex that follows it and vn is adjacent to v1 . The graph Cn is called the n-cycle. The
graph C3 is called a triangle.
[Figure: the 6-cycle C6.]
For n ≥ 2, Ln denotes the n-link. An n-link is a row of n vertices with each vertex adjacent to
the following vertex. Alternatively, for n ≥ 3, an n-link is produced by erasing one edge from an
n-cycle.
[Figure: the 6-link L6.]
For n ≥ 3, Wn denotes the n-wheel. To form Wn add one vertex to Cn and make it adjacent to
every other vertex. Notice that the n-wheel has n + 1 vertices.
[Figure: the 6-wheel W6.]
For n ≥ 1, the n-cube, Qn , is the graph whose vertices are labeled with the 2n bit strings of length
n. The unusual choice of names for the vertices is made so it will be easy to describe the edges
in the graph: two vertices are adjacent provided their labels differ in exactly one bit. Except for
n = 1, 2, 3 it is not easy to draw a convincing diagram of Qn . The graph Q3 can be drawn so it
looks like what you would probably draw if you wanted a picture of a 3-dimensional cube. In the
graph below, there is a vertex placed at each of the eight corners of the 3-cube labeled with the
name of the vertex.
[Figure: the 3-cube Q3, with vertices labeled by the bit strings 000, 001, 010, 011, 100, 101, 110, 111.]
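Because the vertex labels are bit strings, Qn is easy to generate in code: two labels are adjacent exactly when they differ in a single bit, that is, when their bitwise XOR is a power of 2. Here is a sketch of ours.

    def n_cube_edges(n):
        """Edges of Q_n: vertices are 0..2^n - 1, read as n-bit labels; an edge
        joins labels that differ in exactly one bit."""
        edges = []
        for u in range(2 ** n):
            for bit in range(n):
                v = u ^ (1 << bit)       # flip one bit
                if u < v:                # record each edge once
                    edges.append((format(u, f"0{n}b"), format(v, f"0{n}b")))
        return edges

    print(len(n_cube_edges(3)))     # Q_3 has 12 edges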
A graph is bipartite if it is possible to split the vertices into two subsets, let’s call them T and B
for top and bottom, so that all the edges go from a vertex in one of the subsets to a vertex in the
other subset.
For example, the graph below is a bipartite graph with T = {a, b, c} and B = {d, e, f, g}.
[Figure: a bipartite graph with T = {a, b, c} on top and B = {d, e, f, g} on the bottom.]
If T has m vertices and B has n vertices, and every vertex in T is adjacent to every vertex in B,
the graph is called the complete bipartite graph, and it is denoted by Km,n . Here is the graph
K3,4 :
[Figure: the complete bipartite graph K3,4 with T = {a, b, c} and B = {d, e, f, g}.]
It is not always obvious if a graph is bipartite or not when looking at a diagram. For example the
square
drawn on the vertices a, b, c, d (that is, the 4-cycle) is bipartite: redrawing it with T = {a, c} on top and B = {b, d} on the bottom shows that every edge joins a top vertex to a bottom vertex.
[Figure: the square drawn in the usual way, and the same graph redrawn with a, c on top and b, d on the bottom.]
The graphs G and H are obviously really the same except for the labels used for the vertices.
[Figure: the graph G with vertices a, b, c and edges {a, b}, {a, c}, and the graph H with vertices x, y, z and edges {x, y}, {x, z}.]
This idea of sameness (the official phrase is the graphs G and H are isomorphic) for graphs
is defined as follows: Two graphs G and H are isomorphic provided we can relabel the vertices
of one of the graphs using the labels of the other graph in such a way that the two graphs will
have exactly the same edges. As you can probably guess, the notion of isomorphic graphs is an
equivalence relation on the collection of all graphs.
In the example above, if the vertices of H are relabeled as a → x (meaning replace x with a), and
b → y, c → z, then the graph H will have edges {a, b} and {a, c}, just like the graph G. So we have
shown that G and H are isomorphic. In fact, both are isomorphic to the 3-link L3 pictured below.
[Figure: the 3-link L3 on the vertices r, s, t.]
On the other hand, G is certainly not isomorphic to the 4-cycle, C4 since that graph does not even
have the same number of vertices as G. Also G is not isomorphic to the 3-cycle, C3. In this case,
the two graphs do have the same number of vertices, but not the same number of edges. For two
graphs to have a chance of being isomorphic, the two graphs must have the same number of vertices
and the same number of edges. But warning: even if two graphs have the same number of vertices
and the same number of edges, they need not be isomorphic. For example L4 and K1,3 are both
graphs with 4 vertices and 3 edges, but they are not isomorphic. This is so since L4 does not have
a vertex of degree 3, but K1,3 does.
Extending that idea: to have a chance of being isomorphic, two graphs will have to have the same
degree sequences since they will end up with the same edges after relabeling. But even having the
same degree sequences is not enough to conclude two graphs are isomorphic, as the graphs in figure 38.1 show.
We can see those two graphs are not isomorphic since G has three vertices that form a triangle,
but there are no triangles in H.
[Figure 38.1: two graphs G and H with the same degree sequence that are not isomorphic; G contains a triangle and H does not.]
For graphs with a few vertices and a few edges, a little trial and error is typically enough to
determine if the graphs are isomorphic. For more complicated graphs, it can be very difficult to
determine if they are isomorphic or not. One of the big goals in theoretical computer science is the
design of efficient algorithms to determine if two graphs are isomorphic.
[Figure 38.2: the 5-cycle G drawn as a regular pentagon on the vertices a, b, c, d, e, and the graph H drawn as a pentagram (five-pointed star) on the vertices v, w, x, y, z.]
Example 38.4. Let G be a 5-cycle on a, b, c, d, e drawn as a regular pentagon with vertices arranged
clockwise, in order, at the corners. Let H have vertex set v, w, x, y, z and graphical presentation
as a pentagram (five-pointed star), where the vertices of the graph are the ends of the points of the
star, and are arranged clockwise, (see figure 38.2).
An isomorphism is a → v, b → x, c → z, d → w, e → y.
Example 38.5. The two graphs in figure 38.3 are isomorphic as shown by using the relabeling
u1 → v1 , u2 → v2 , u3 → v3 , u4 → v4 , u5 → v9 ,
u6 → v10 , u7 → v5 , u8 → v7 , u9 → v8 , u10 → v6 .
The graph G is the traditional presentation of the Petersen Graph. It could be described as the
graph whose vertex set is labeled with all the two element subsets of a five element set, with an
edge joining two vertices if their labels have exactly one element in common.
The origins of graph theory had to do with bridges, and possible routes crossing the bridges. In
this section we will consider that sort of question in graphs in general. We will think of walking
along edges, from one vertex in the list to the next, and visiting vertices. Remember that we do
not allow multiple edges or loops in our graphs.
We begin with a collection of definitions. Warning: These terms are used differently in different
texts. If you look at another graph theory text, be sure to see how the terms are used there.
[Figure 38.3: two isomorphic graphs, G on the vertices u1, ..., u10 (the traditional drawing of the Petersen graph) and H on the vertices v1, ..., v10.]
A path in a graph is a sequence of vertices in which each vertex in the list is adjacent to the vertex that follows it; the length of the path is the number of edges traversed. A path is simple if it does not traverse any edge more than once. A circuit is a path of length at least three that begins and ends at the same vertex, and a circuit is simple if it does not traverse any edge more than once.
Example 38.6. In the graph shown in figure 38.4, a, b, e, c, f, c is a path of length 5. That is an
example of an a, c-path, meaning it starts at vertex a and ends at vertex c. That path is not simple
since the edge c, f is repeated. Note that direction does not matter. The vertex sequence a, b, c, f, e
is an a, e-path. Here are two simple circuits in that graph: a, b, e, d, a and a, b, c, e, d, a. Notice that
the circuit a, e, b, c, f, e, d, a is also simple even though it repeats the vertex e. It does not repeat any
edges.
A graph is connected if there is a path between any two vertices. In plain English, a connected
graph consists of a single piece. The individual connected pieces of a graph are called its connected
components. The length of the shortest path between two vertices in a connected component of
[Figure 38.4: a graph on the vertices a, b, c, d, e, f.]
a graph is called the distance between the vertices. In figure 38.4, the distance between a and f
is 2.
Theorem 38.7. In a connected graph there is a simple path between any two vertices. In other
words, if there is a way to get from one vertex to another vertex along edges, then there is a way to
get between those two vertices without repeating any edges.
Proof. Problem 38.7. The idea is simple: in a path with a repeated edge, just eliminate the side
trip made between the two occurrences of that edge from the path. Do that until all the repeated
edges are eliminated. For example, in the graph shown in figure 38.4, the a, c-path a, e, b, e, c can
be reduced to the path a, e, c, eliminating the side trip to b.
A vertex in a graph is a cut vertex, if removal of the vertex and edges incident to it results in a
graph with more connected components. Similarly a bridge is an edge whose removal (keeping the
vertices it is incident to) yields a graph with more connected components.
An Eulerian path in a graph is a simple path which traverses every edge of the graph. In
other words, an Eulerian path in a graph is a path that traverses every edge of the graph exactly
once. An interesting property of a graph with an Eulerian path is that it can be drawn completely
without lifting pencil from paper and without retracing any edges.
An Eulerian circuit is a simple circuit in a graph that traverses every edge of the graph. So
an Eulerian circuit is a path of length three or more that traverses every edge of the graph and
ends up at its initial vertex. A graph is called Eulerian if it has an Eulerian circuit.
Example 38.8. The graph C5 is an Eulerian graph. In fact, the graph itself is an Eulerian circuit.
Example 38.10. The graph Ln is itself an Eulerian path, but does not have an Eulerian circuit.
A Hamiltonian path in a graph is a simple path that visits every vertex in the graph exactly once.
A Hamiltonian circuit in a graph is a simple circuit that, except for the last vertex of the circuit,
visits every vertex in the graph exactly once. A graph is Hamiltonian if it has a Hamiltonian
circuit.
A few easy observations: if G is a graph with either an Eulerian circuit or a Hamiltonian circuit, then
(1) G is connected.
Leonhard Euler gave a simple way to determine exactly when a graph is Eulerian. On the other
hand, despite considerable effort, no one has been able to devise a test to distinguish between
Hamiltonian and non-Hamiltonian graphs that is much better than a brute force trial-and-error
search for a Hamiltonian circuit.
Theorem 38.15. A connected graph is Eulerian if and only if every vertex has even degree.
Proof. Let G be an Eulerian graph, and suppose that v is a vertex in G with odd degree, say
2m + 1. Let i denote the number of times an Eulerian circuit passes through v. Since every edge
is used exactly once in the circuit, and each time v is visited two different edges are used, we have
2i = 2m + 1, which is impossible. →←. So G cannot have any vertices of odd degree.
Conversely, let G be a connected graph where every vertex has even degree. Select a vertex u and
build a simple path starting at u that is as long as possible: each time we reach a vertex we select an unused
edge leaving that vertex to extend the simple path. For any vertex v ≠ u that we reach, its even degree
guarantees there will be an unused edge out: each earlier visit to v used two edges incident to v, and one
more edge was used to arrive at v this time, so an odd number of edges incident to v has been used; since
v has even degree, at least one unused edge leads out of v. The process of extending the simple path must
eventually come to an end, and by the remark above the end can only be at u. So we have constructed a
simple circuit starting and ending at u.
If this circuit contains every edge we are done. Otherwise, when its edges are removed from
G we obtain a set of connected components H1, ..., Hm which are subgraphs of G in which every vertex
still has even degree. Since they are smaller than G, we may inductively construct
an Eulerian circuit Ci for each Hi. Since G is connected, each Hi contains a vertex of the initial
circuit, say vj. If the initial circuit is v0, ..., vj, ..., vn, v0, then v0, ..., vj, Ci, vj, ..., vn, v0 is a circuit in
G. Since the Hi are disjoint, we may splice in each of these Eulerian partial circuits in this way, obtaining an Eulerian
circuit for G.
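The proof is constructive, and the splicing idea in it is essentially Hierholzer's algorithm. Here is a minimal Python sketch of that algorithm (the code is illustrative and assumes the input graph is connected, has no loops or multiple edges, and has every vertex of even degree):

def eulerian_circuit(edges):
    # Build adjacency sets and a pool of unused edges.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    unused = {frozenset(e) for e in edges}

    start = edges[0][0]
    stack, circuit = [start], []
    while stack:
        v = stack[-1]
        # Look for an unused edge leaving v.
        w = next((x for x in adj[v] if frozenset((v, x)) in unused), None)
        if w is None:
            circuit.append(stack.pop())   # no way to extend: record v and back up
        else:
            unused.discard(frozenset((v, w)))
            stack.append(w)
    return circuit[::-1]   # a closed walk that uses every edge exactly once

# The 5-cycle C5 of example 38.8 is its own Eulerian circuit:
print(eulerian_circuit([(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]))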
As a corollary we have
Theorem 38.16. A connected graph has an Eulerian path, but not an Eulerian circuit, if and only
if it has exactly two vertices of odd degree.
The following theorem is an example of a sufficient (but not necessary) condition for a graph to
have a Hamiltonian circuit.
Theorem 38.17. Let G be a connected graph with n ≥ 3 vertices. If deg(v) ≥ n/2 for every vertex
v, then G is Hamiltonian.
Proof. Suppose the theorem is false, and let G be a counterexample: a connected graph on n ≥ 3 vertices
with deg(v) ≥ n/2 for every vertex v, but with no Hamiltonian circuit. Moreover, among all counterexamples
on n vertices, suppose G has the largest possible number of edges.
G is not complete, since Kn has a Hamiltonian circuit for n ≥ 3. Therefore G has two nonadjacent
vertices v1 and vn. By maximality, the graph G1 formed by adding the edge {v1, vn} to G has a
Hamiltonian circuit. Moreover this circuit uses the edge {v1, vn}, since otherwise G itself would have a Hamiltonian
circuit. So we may suppose that the Hamiltonian circuit in G1 is of the form v1, v2, ..., vn, v1.
Thus v1, ..., vn is a path in G. Now let S be the set of indices i with v1 adjacent to vi+1, and let T be the
set of indices i with vn adjacent to vi. Both sets sit inside {1, ..., n − 1}, and
|S| + |T| = deg(v1) + deg(vn) ≥ n/2 + n/2 = n > n − 1, so some index i lies in both. But then
v1, v2, ..., vi, vn, vn−1, ..., vi+1, v1 is a Hamiltonian circuit in G. →← So no counterexample exists, and the theorem holds.
WARNING: Do not read too much into this theorem. The condition is not a necessary condition.
The 5-cycle, C5, is obviously Hamiltonian, but the vertices all have degree 2 which is less than 5/2.
Trees form an important class of graphs. A tree is a connected graph with no circuits. Trees are
traditionally drawn upside down, with the tree growing down rather than up, starting at a root
vertex.
[Figure: a small rooted tree drawn with the root at the top and children labeled left and right below it.]
Theorem 38.18. A graph G is a tree if and only if there is a unique path between any two vertices.
Proof. Suppose that G is a tree, and let u and v be two vertices of G. Since G is connected,
there is a path of the form u = v0, v1, ..., vn = v. If there is a different path from u to v, say
u = w0, w1, ..., wm = v, let i be the smallest subscript so that wi = vi but vi+1 ≠ wi+1. Also let j
be the next subscript after i where the two paths meet again, say vj = wk. By construction vi, vi+1, ..., vj = wk, wk−1, ..., wi is
a circuit in G. →←
Conversely, if G is a graph where there is a unique path between any pair of vertices, then by
definition G is connected. If G contained a circuit, C, then any two vertices of C would be joined
by two distinct paths.→← Therefore G contains no circuits, and is a tree.
A consequence of theorem 38.18 is that given any vertex r in a tree, we can draw the tree with r at
the top, as the root vertex, and the other vertices in levels below. The neighbors of r that appear
at the first level below r are called r’s children. The children of r’s children are put in the second
level below r, and are r’s grandchildren. In general the ith level consists of those vertices in the
tree which are at distance i from r. The result is called a rooted tree. The height of a rooted
tree is the maximum level number.
Naturally, besides child and parent, many genealogical terms apply to rooted trees, and are sugges-
tive of the structure. For example if a rooted tree has root r, and v ≠ r, the ancestors of v are
all vertices on the path from r to v, including r, but excluding v. The descendants of a vertex,
w consist of all vertices which have w as one of their ancestors. The subtree rooted at w is the
rooted tree consisting of w, its descendants, and all the required edges. A vertex with no children
is a leaf, and a vertex with at least one child is called an internal vertex.
To distinguish rooted trees by breadth, we use the term m-ary to mean that any internal vertex
has at most m children. An m-ary tree is full if every internal vertex has exactly m children. When
m = 2, we use the term binary.
Inductive Step: Suppose that for some n ≥ 1 every tree with n vertices has n − 1 edges. Now
suppose T is a tree with n + 1 vertices. Let v be a leaf of T . If we erase v and the edge leading
to it, we are left with a tree with n vertices. By the inductive hypothesis, this new tree will have
n − 1 edges. Since it has one less edge than the original tree, we conclude T has n edges.
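The induction above shows a tree with n vertices has n − 1 edges. Combined with connectedness this gives a quick computer test for being a tree; here is a minimal Python sketch (illustrative code, relying on the standard fact that a connected graph on n vertices with n − 1 edges has no circuits):

def is_tree(n, edges):
    # A graph on the vertices 0, 1, ..., n-1 given by an edge list.
    if len(edges) != n - 1:
        return False
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:                        # depth-first search starting from vertex 0
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n               # connected exactly when the search reaches every vertex

print(is_tree(5, [(0, 1), (0, 2), (2, 3), (2, 4)]))   # True
print(is_tree(5, [(0, 1), (1, 2), (2, 0), (3, 4)]))   # False: a circuit plus a separate edge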
Exercises
Exercise 38.1. Find a graph isomorphism ϕ : G → H. Verify the adjacency preserving property
by showing the adjacency matrices satisfy AG = AϕG .
[The graphs for exercise 38.1: G on the vertices a–h and H on the vertices s–z.]
[A second pair of graphs: G on the vertices a–h and H on the vertices s–z.]
[A graph G on the vertices 1–8, drawn with 1 2 3 4 in the top row and 8 7 6 5 in the bottom row.]
Exercise 38.4. Explain why the graph G is not Eulerian, but is Hamiltonian.
[The graph G for exercise 38.4, on the vertices a–j.]
Exercise 38.5. Find an Eulerian circuit for the graph G as a list of vertices.
[The graph G for exercise 38.5, on the vertices a–h.]
[A graph G on the vertices a–j.]
Problems
Problem 38.1.
(a) How many edges are there in Kn , the complete graph with n vertices?
(b) How many edges are there in Cn , the n-cycle with n vertices?
(c) How many edges are there in Ln , the n-link with n vertices?
(d) How many edges are there in Wn , the n-wheel with n + 1 vertices?
(e) How many edges are there in Qn , the n-cube with 2n vertices?
(f ) How many edges are there in Km,n , the complete bipartite graph with m top and n bottom
vertices?
Problem 38.2. Determine whether each graph is bipartite. If it is, redraw it as a bipartite graph.
[The four graphs for problem 38.2: (a) on the vertices a–e, (b) on the vertices u1–u6, (c) on the vertices v1–v6, (d) on the vertices a–f.]
Problem 38.4. Draw the Petersen graph with vertices labeled with the ten different two-element subsets of the
five element set {a, b, c, d, e} as suggested in example 38.5.
Problem 38.5. For each pair of graphs either prove that G1 and G2 are not isomorphic, or else
show they are isomorphic by exhibiting a graph isomorphism.
[The four pairs of graphs for problem 38.5: (a) G1 on u1–u7 and G2 on v1–v7, (b) G1 on u1–u8 and G2 on v1–v8, (c) G1 on u1–u10 and G2 on v1–v10, (d) G1 on u1–u6 and G2 on v1–v6.]
[A graph on the vertices a–h.]
Problem 38.7. Prove theorem 38.7: If G is a connected graph, then there is a simple path between
any two different vertices.
Problem 38.8. For each candidate degree sequence below, either draw a graph with that degree
sequence or explain why that list cannot be the degree sequence of a graph.
(1) 4, 4, 4, 4, 4
(2) 6, 4, 4, 4, 4
(3) 3, 2, 1, 1, 1
(4) 3, 3, 2, 2, 1
Problem 38.9. A tree is called star-like if there is exactly one vertex with degree greater than 2.
How many different (that is, nonisomorphic) star-like trees are there with six vertices? (Note: If
you draw the graph with the vertex of degree greater than 2 having the arms of the tree radiating
out from it like spokes on a wheel, the name star-like will make sense.)
Problem 38.10. For each graph below (i) find an Eulerian circuit, or prove that none exists, and
(ii) find a Hamiltonian circuit or prove that none exists.
(a) [A graph on the nine vertices a, b, c, d (top row) and e, f, g, h, i (bottom row).]
(b) [A graph on the nine vertices a–i arranged in a 3 × 3 grid.]
Problem 38.11. Answer the following questions about the rooted tree shown below.
(b) Which vertices are internal?
(c) Which vertices are leaves?
(d) Which vertices are children of b?
(g) Which vertices are siblings of q?
(h) Which vertices are ancestors of p?
(i) Which vertices are descendants of d?
[The rooted tree for problem 38.11, with rows of vertices b c; d e f g h; i j k l m; n o p q r s below the root.]
Problem 38.12. A forest is a graph consisting of one or more (separate) trees. If the total number
of vertices in a forest is f , and the number of trees in the forest is t, what is the total number of
edges in the forest?
Appendix A
Answers to Exercises
Chapter 1
1.1.
a) yes b) no
c) no d) yes
1.2.
a) p ⊕ ¬q:
   p  q  ¬q  p ⊕ ¬q
   T  T   F     T
   T  F   T     F
   F  T   F     F
   F  F   T     T

b) ¬(q → p):
   p  q  q → p  ¬(q → p)
   T  T    T       F
   T  F    T       F
   F  T    F       T
   F  F    T       F

c) q ∧ ¬p:
   p  q  ¬p  q ∧ ¬p
   T  T   F     F
   T  F   F     F
   F  T   T     T
   F  F   T     F

d) ∼q ∨ p:
   p  q  ∼q  ∼q ∨ p
   T  T   F     T
   T  F   T     T
   F  T   F     F
   F  F   T     T
e) p → (¬q ∧ r):
   p  q  r  ¬q ∧ r  p → (¬q ∧ r)
   T  T  T     F          F
   T  T  F     F          F
   T  F  T     T          T
   T  F  F     F          F
   F  T  T     F          T
   F  T  F     F          T
   F  F  T     T          T
   F  F  F     F          T
1.3. a) (1101 0111 ⊕ 1110 0010) ∧ 1100 1000 = (0011 0101) ∧ 1100 1000 = 0000 0000
b) (1111 1010 ∧ 0111 0010) ∨ (0101 0001) = (0111 0010) ∨ (0101 0001) = 0111 0011
c) (1001 0010 ∨ 0101 1101) ∧ (0110 0010 ∨ 0111 0101) = (1101 1111) ∧ (0111 0111) = 0101 0111
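These bit string computations are easy to double-check by machine; here is a short Python sketch (the helper names are just for illustration) that parses each string as a base-2 integer:

def bits(s):                      # parse a spaced bit string such as "1101 0111"
    return int(s.replace(" ", ""), 2)

def show(n):                      # format an 8-bit value back as "xxxx xxxx"
    b = format(n, "08b")
    return b[:4] + " " + b[4:]

a = (bits("1101 0111") ^ bits("1110 0010")) & bits("1100 1000")
b = (bits("1111 1010") & bits("0111 0010")) | bits("0101 0001")
c = (bits("1001 0010") | bits("0101 1101")) & (bits("0110 0010") | bits("0111 0101"))
print(show(a), show(b), show(c))  # 0000 0000  0111 0011  0101 0111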
1.4.
a) s ∧ ¬f b) f ∧ ¬s c) ¬s → ¬f
1.5.
d) Jordan didn’t play when the Wizards won. OR If the Wizards won, then Jordan did not play.
1.6.
b) (c ∧ ¬w) → b
Chapter 2
2.1.
a) (p ∨ q) ∨ r and p ∨ (q ∨ r):
   p  q  r  (p ∨ q) ∨ r  p ∨ (q ∨ r)
   T  T  T       T            T
   T  T  F       T            T
   T  F  T       T            T
   T  F  F       T            T
   F  T  T       T            T
   F  T  F       T            T
   F  F  T       T            T
   F  F  F       F            F

b) ¬p ∧ (p ∨ q) and ¬(q → p):
   p  q  ¬p ∧ (p ∨ q)  ¬(q → p)
   T  T        F           F
   T  F        F           F
   F  T        T           T
   F  F        F           F

c) p ∨ (q ∧ r) and (p ∨ q) ∧ (p ∨ r):
   p  q  r  p ∨ (q ∧ r)  (p ∨ q) ∧ (p ∨ r)
   T  T  T       T               T
   T  T  F       T               T
   T  F  T       T               T
   T  F  F       T               T
   F  T  T       T               T
   F  T  F       F               F
   F  F  T       F               F
   F  F  F       F               F
2.2.
a) p ∧ (q → r) and (p ∧ q) → r:
   p  q  r  p ∧ (q → r)  (p ∧ q) → r
   T  T  T       T             T
   T  T  F       F             F
   T  F  T       T             T
   T  F  F       T             T
   F  T  T       F             T
   F  T  F       F             T
   F  F  T       F             T
   F  F  F       F             T
Consider (p, q, r) = (F, T, T ), (p, q, r) = (F, T, F ), (p, q, r) = (F, F, T ), or (p, q, r) = (F, F, F ).
b) p → q and q → p:
   p  q  p → q  q → p
   T  T    T      T
   T  F    F      T
   F  T    T      F
   F  F    T      T
Consider either (p, q) = (T, F ) or (p, q) = (F, T ).
c) p → q and ¬p → ¬q:
   p  q  p → q  ¬p → ¬q
   T  T    T       T
   T  F    F       T
   F  T    T       F
   F  F    T       T
Consider either (p, q) = (T, F ) or (p, q) = (F, T ).
2.3.
a) (p ∧ (p → q)) → q:
   p  q  p ∧ (p → q)  (p ∧ (p → q)) → q
   T  T       T                T
   T  F       F                T
   F  T       F                T
   F  F       F                T
b) ((p → q) ∧ (q → r)) → (p → r):
   p  q  r  (p → q) ∧ (q → r)  p → r  ((p → q) ∧ (q → r)) → (p → r)
   T  T  T          T            T                 T
   T  T  F          F            F                 T
   T  F  T          F            T                 T
   T  F  F          F            F                 T
   F  T  T          T            T                 T
   F  T  F          F            T                 T
   F  F  T          T            T                 T
   F  F  F          T            T                 T
2.5. An implication is logically equivalent to its contrapositive. So, (If it is Saturday, then I mow
the lawn) is logically equivalent to (If I do not mow the lawn, then it is not Saturday). The
inverse and converse of an implication are logically equivalent (but not logically equivalent to the
implication) so (If it is not Saturday, then I do not mow the lawn) is logically equivalent to (If I
mow the lawn, then it is Saturday).
2.6.
2.7. Proof.
Chapter 3
3.1.
a) T b) F
c) F (e.g., (0 ≤ 10) → (2 · 0 ≥ 4) is false.)
d) T (e.g., ¬(2 · 0 ≥ 4) is true.)
3.2.
3.3.
a) ∀y F (I, y)
3.4.
3.5.
a) ∃y ¬F (I, y) b) ∃y F (George, y)
c) ∃x F (x, x) d) ∀x ∃y ¬F (x, y)
e) ∀y ∃x ¬F (x, y)
f) This one is a little complicated.
∀y∀z((y = z) ∨ ¬F (Ralph, y) ∨ ¬F (Ralph, z)).
In plain English name any two (not necessarily different) people. Either they are the same
person, or else Ralph cannot fool at least one of the two. A logically equivalent, and maybe
easier to fathom version converting the disjunctive form above to an implication:
∀y∀z((F (Ralph, y) ∧ F (Ralph, z)) → (y = z)).
In plain English: If Ralph can fool both y and z, then y and z are actually the same person.
Or, in more natural sounding English, Ralph can fool at most one person.
3.7. “There is an x so P (x) holds, and, for any x and y, if both P (x) and P (y) hold, then x and
y are equal.”
Or, more succinctly: “There is exactly one x for which P (x) is true.”
Chapter 4
4.1.
   p  q  r  (p ∨ q) ∧ (¬p ∨ r)  q ∨ r  ((p ∨ q) ∧ (¬p ∨ r)) → (q ∨ r)
   T  T  T           T            T                 T
   T  T  F           F            T                 T
   T  F  T           T            T                 T
   T  F  F           F            F                 T
   F  T  T           T            T                 T
   F  T  F           T            T                 T
   F  F  T           F            T                 T
   F  F  F           F            F                 T
The statement is a tautology, hence a valid rule of inference.
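Truth table checks like this one can be automated; the short Python sketch below (illustrative code, with hypothetical helper names) confirms the tautology by running through all eight truth assignments:

from itertools import product

def implies(a, b):
    return (not a) or b

def resolution(p, q, r):
    return implies((p or q) and ((not p) or r), q or r)

# A proposition is a tautology when it is true under every assignment of truth values.
print(all(resolution(p, q, r) for p, q, r in product([True, False], repeat=3)))   # True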
4.2. Proof: Make the assignments Porsche(x): “x owns a Porsche”, Speeder(x): “x is a speeder”, Sedan(x): “x owns a sedan”, and BuysPrem(x): “x buys premium fuel”. The argument has the symbolic form:
∀x (Porsche(x) → Speeder(x))
¬∃x (Sedan(x) ∧ BuysPrem(x))
∀x (¬BuysPrem(x) → ¬Speeder(x))
∴ ∀x (Porsche(x) → ¬Sedan(x))
4.3. 1) ¬w Hypothesis
2) u∨w Hypothesis
3) u Disjunctive syllogism
4) u → ¬p Hypothesis
5) ¬p Modus Ponens (3) and (4)
6) ¬p → (r ∧ ¬s) Hypothesis
7) r ∧ ¬s Modus Ponens (5) and (6)
8) ¬s Simplification
9) t→s Hypothesis
10) ¬t Modus Tollens (8) and (9)
11) ¬t ∨ w Addition
Chapter 5
5.1.
a) 2, 3, 4 b) −5, 5 c) 4, 5, 6
5.2.
a) {5x | x ∈ Z where − 1 ≤ x ≤ 3}
b) {x | x ∈ N where x ≤ 4} or {x | x ∈ Z where − 1 < x < 5}
c) {x ∈ R | π ≤ x < 4} or {x | x ∈ R where π ≤ x < 4}
5.4. This proposition is true. We may write this statement using symbols as ∀x ((x ∈ ∅) →
(“x has three toes”)). The hypothesis, (x ∈ ∅), of the implication is false, which makes the impli-
cation true.
5.6. True. Order and repetitions do not matter when the elements of a set are listed.
5.7. False. For example, 2 is in the set of even integers, but not in the set of integers that are
multiples of four. In fact, just the reverse is true since if an integer is a multiple of four, then it is
certainly even. So the set of integers that are multiples of four is a subset of the set of even integers.
Chapter 6
6.1.
a) A ∩ B = {2, 4, 6, 7, 8} b) A ∪ B = {1, 2, 3, 4, 5, 6, 7, 8, 9}
c) A − B = {3, 5} d) B − A = {1, 9}
6.2.
a) A = (A − B) ∪ (A ∩ B) = {1, 2, 5, 6, 7, 8, 9}
b) B = (B − A) ∪ (A ∩ B) = {3, 4, 5, 6, 9, 10}
6.3.
   A  B  A ⊕ B  (A ∪ B) − (A ∩ B)
   1  1    0            0
   1  0    1            1
   0  1    1            1
   0  0    0            0
The last two columns are identical, and that shows the set equality is correct.
6.4. [Venn diagrams illustrating the set identity; the figures are not reproduced here.]
Proof :
A ∪ (A ∩ B) = (A ∩ U) ∪ (A ∩ B) ∩-identity
= A ∩ (U ∪ B) Distributive law
=A∩U ∪-domination
=A ∩-identity
6.6.
a) A × B = {(1, a), (1, b), (1, c), (2, a), (2, b), (2, c), (3, a), (3, b), (3, c), (4, a), (4, b), (4, c)}
b) B × A = {(a, 1), (a, 2), (a, 3), (a, 4), (b, 1), (b, 2), (b, 3), (b, 4), (c, 1), (c, 2), (c, 3), (c, 4)}
c) C × B × D = {(α, a, 7), (α, a, 8), (α, a, 9), (α, b, 7), (α, b, 8), (α, b, 9), (α, c, 7), (α, c, 8), (α, c, 9),
(β, a, 7), (β, a, 8), (β, a, 9), (β, b, 7), (β, b, 8), (β, b, 9), (β, c, 7), (β, c, 8), (β, c, 9)}
6.7. A = ∅, B = ∅, or A = B
6.9. The elements in B are (1, 1), (2, 1), (2, 2), (3, 1), (3, 2), (3, 3).
Chapter 7
▶ Let m, n ∈ Z be given.
▶ Suppose m and n are even.
▶ =⇒ m = 2i and n = 2j, for some i, j ∈ Z, by the definition of even.
▶ =⇒ m + n = 2i + 2j = 2(i + j)
▶ =⇒ m + n = 2ℓ, where ℓ = i + j.
▶ =⇒ m + n is even, by the definition of even.
▶ ∴ if m and n are even, then m + n is even.
Proof. (Prose form) Suppose m and n are even integers. Then, by the definition of even, m = 2i
and n = 2j, for some i, j ∈ Z. Thus, we have
Proof. (Prose form) Suppose n is an even integer. Then, by the definition of even, n = 2j, for some
j ∈ Z. Thus, we have
n2 = (2j)2 = 2(2j 2 ) = 2ℓ, where ℓ = 2j 2 .
7.3.
Proof. (Prose form) Suppose x is rational and y is irrational. Suppose, for the sake of argument,
that x + y is rational. Then, y = (x + y) − x would be rational, since the difference of two rational
numbers is rational. But, y was given to be irrational. This is impossible. Therefore, x + y must
have been irrational.
7.4.
Proof. (Prose form) Suppose that 5n − 1 is odd for some n ∈ Z. And, suppose, for the sake of
argument, that n is odd. Then, by the definition of odd, n = 2j + 1 for some j ∈ Z. Thus,
5n − 1 = 5(2j + 1) − 1 = 10j + 4 = 2(5j + 2).
That is, 5n − 1 is even, since 5n − 1 = 2ℓ, for ℓ = 5j + 2. But, 5n − 1 was given to be odd. This is
impossible. Therefore, n must have been even.
7.5. The positive integer 77 ends in a 7, but is not prime since 77 = 7 · 11.
Chapter 8
8.1. MR =
     1 0 0 1
     0 1 1 0
     0 1 1 0
     1 0 0 1
[The answer also shows the digraph of R on the vertices a, b, c, d.]
8.2. S = {(1, a), (1, b), (2, c), (2, d), (3, a), (3, d), (4, a), (4, c), (5, b), (5, c)}
[The answer also shows the arrow diagram of S from {1, 2, 3, 4, 5} to {a, b, c, d}.]
8.3. MR◦S = MS ⊙ MR =
     1 1 1 1
     1 1 1 1
     1 1 1 1
     1 0 1 0
     1 1 1 1
8.4. (a)
R1 ∪ R2 = {(1, 2), (1, 3), (1, 5), (1, 6), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (3, 1), (3, 3), (3, 4), (3, 6),
(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (5, 1), (5, 5), (5, 6), (6, 2), (6, 3), (6, 6)}
R1 ∩ R2 = {(1, 2), (2, 1), (2, 2), (3, 3), (5, 5), (6, 6)}
R1 ⊕ R2 = {(1, 3), (1, 5), (1, 6), (2, 3), (2, 4), (2, 5), (3, 1), (3, 4), (3, 6),
(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (5, 1), (5, 6), (6, 2), (6, 3)}
(b)
MR1∪R2 = MR1 ∨ MR2 =
     0 1 1 0 1 1
     1 1 1 1 1 0
     1 0 1 1 0 1
     1 1 1 1 1 0
     1 0 0 0 1 1
     0 1 1 0 0 1

MR1∩R2 = MR1 ∧ MR2 =
     0 1 0 0 0 0
     1 1 0 0 0 0
     0 0 1 0 0 0
     0 0 0 0 0 0
     0 0 0 0 1 0
     0 0 0 0 0 1

MR1⊕R2 = MR1 ⊕ MR2 =
     0 0 1 0 1 1
     0 0 1 1 1 0
     1 0 0 1 0 1
     1 1 1 1 1 0
     1 0 0 0 0 1
     0 1 1 0 0 0
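These matrices follow mechanically from the entrywise rules MR1∪R2 = MR1 ∨ MR2, MR1∩R2 = MR1 ∧ MR2, and MR1⊕R2 = MR1 ⊕ MR2, together with the Boolean product ⊙ used in 8.3. Here is a small Python sketch of those operations on 0/1 matrices (illustrative code, not from the text):

def join(A, B):          # entrywise "or": the matrix of R1 ∪ R2
    return [[a | b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def meet(A, B):          # entrywise "and": the matrix of R1 ∩ R2
    return [[a & b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sym_diff(A, B):      # entrywise "xor": the matrix of R1 ⊕ R2
    return [[a ^ b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def bool_product(A, B):  # the Boolean product ⊙, giving the matrix of a composition
    return [[int(any(A[i][k] and B[k][j] for k in range(len(B))))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[0, 1], [1, 0]]
B = [[1, 1], [0, 0]]
print(join(A, B), meet(A, B), sym_diff(A, B), bool_product(A, B))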
Chapter 9
9.1. R = {(1, 1), (2, 2), (3, 3)} (There are seven more correct answers since any subset of R would
also be okay.)
9.2.
9.3.
9.4. Recall that for a set A, the notation |A| is the cardinal number of A (in other words, the
number of elements in A).
a) C is reflexive since any set A has the same number of elements as itself!
b) C is not irreflexive since, for example, |{1}| = |{1}|, so {1} C {1} is true.
c) C is not symmetric since, for example, {1} C {1, 2} is true, but {1, 2} C {1} is false.
d) C is not antisymmetric since, for example, {1} C {2} and {2} C {1} are both true, but {1} ≠ {2}.
e) C is transitive since if |A| ≤ |B| and |B| ≤ |D|, then |A| ≤ |D| (by the transitive property of
the less than or equal to relation for numbers).
9.5. For any universe of discourse U, a relation on U is defined to be any subset of U × U. The
empty set is a subset of U × U.
9.6. {1, 2} M {1, 2} is false, so M is not reflexive. {1} M {1} is true, so M is not irreflexive. If
|A ∩ B| = 1 then |B ∩ A| = 1 (since A ∩ B = B ∩ A), so M is symmetric. {1} M {1, 2} and
{1, 2} M {1} are true, but {1} ≠ {1, 2}, so M is not antisymmetric. {1} M {1, 2} and {1, 2} M {2} are
true, but {1} M {2} is false, so M is not transitive.
Chapter 10
10.1. The relation R is reflexive since (0, 0), (1, 1), (2, 2) are all in R. R is symmetric since the
reverse of each ordered pair in R is also in R. Finally, R is transitive (there are a lot of cases to
check for this condition. For example, (1, 1) and (1, 0) are both in R, so we need to check that
(1, 0) is also in R! But it is of course. There is a total of nine such checks needed to verify that R
is transitive, but they are all just as automatic as that one.) Alternatively, the transitive property
follows from verifying MR ⊙ MR ≤ MR . So R is an equivalence relation on A. There are two
equivalence classes: [0] = [1] = {0, 1} and [2] = {2}.
10.2. The relation R is not reflexive on A since (3, 3) is not in R. So R is not an equivalence
relation on A. Notice that R is symmetric and transitive.
10.3. The relation R is not symmetric on A since (1, 0) is in R, but (0, 1) is not in R. So R is not
an equivalence relation on A. Notice that R is reflexive on A and is transitive.
10.4. The relation R is not transitive on A since (0, 1) and (1, 2) are in R, but (0, 2) is not in R.
So R is not an equivalence relation on A. Notice that R is reflexive on A and is symmetric.
10.5. True. The relation is symmetric since the reverse of each ordered pair in R is also in R (after
all, the reverse of each ordered pair in R is just the ordered pair itself). To check the antisymmetric
condition, we need to look at all cases where an ordered pair and its reverse are both in R, and
make sure the two coordinates are equal in each such case. There are only two cases to check: (1, 1)
and its reverse are both in R, and sure enough, 1 = 1. The story is the same for (2, 2). So R passes
the antisymmetry test. The moral of the story: It is possible for a relation to be both symmetric
and antisymmetric.
10.6. The relation S is reflexive since, for any integer m, m² = m². S is symmetric since if m² = n²,
then n² = m². And S is transitive since if m² = n² and n² = k², then m² = k². So S is an equivalence relation on Z.
The equivalence class of an integer m is the set of all integers with the same square as m. For
m = 0, the only element in the equivalence class would be 0 itself: [0] = {0}. For any integer, m,
other than 0, we get [m] = {m, −m}. For example, [2] = {2, −2} since 2² = 4 and (−2)² = 4.
10.7. The relation C is obviously reflexive and symmetric. But it is not transitive. For example,
the line l1: y = x crosses the line l2: x + y = 2 at the point (1, 1), and the line l2: x + y = 2 crosses
the line l3: y = x + 2 at the point (0, 2). But the lines l1: y = x and l3: y = x + 2 are distinct
parallel lines, so they do not cross. In other words l1 C l2 and l2 C l3 are both true, but l1 C l3 is
false. So C is not transitive.
(∀a, b ∈ A) [([a] ∩ [b] = ∅) ∨ ([a] = [b])] ≡ (∀a, b ∈ A) [¬([a] ∩ [b] = ∅) → ([a] = [b])]
≡(∀a, b ∈ A) [([a] ∩ [b] ̸= ∅) → ([a] = [b])] ≡ (∀a, b ∈ A) [([a] ∩ [b] ̸= ∅) → ([a] ⊆ [b] ∧ [b] ⊆ [a])]
≡(∀a, b ∈ A) [([a] ∩ [b] ̸= ∅ → [a] ⊆ [b]) ∧ ([a] ∩ [b] ̸= ∅ → [b] ⊆ [a])]
10.10.
a) [The digraph GE of the equivalence relation, on the vertices 1–8.]
(a) [a] = {a, c, e, g} = [c] = [e] = [g], [b] = {b, d, f } = [d] = [f ], and [h] = {h} partition A.
(b) [a] = {a, b, g, h} = [b] = [g] = [h], [c] = {c, d} = [d], and [e] = {e, f } = [f ] partition A.
10.12. Here are the facts we know: (1) E is an equivalence relation on the set A, (2) a ∈ [b] (a is
in the equivalence class of b). Our job is to show that if c ∈ [a], then c ∈ [b]. So, suppose c ∈ [a].
That means cEa is true according to the definition of equivalence class. We also know aEb is true,
since a is in the equivalence class of b. Since cEa and aEb are true, the transitive condition tells
us cEb is true, and that means c ∈ [b], as we needed to show.
Chapter 11
a) f (x) = x + 1
b) f (x) = 1
c) f (x) = ex
d) f (x) = x3 − x
11.2.
a) As a set of ordered pairs f = {(1, a), (2, b), (3, c), (4, d), (5, e)}
d) As a set of ordered pairs g = {(a, 1), (b, 2), (c, 3), (d, 4), (e, 5), (f, 5)}
11.3. The composition of two functions is a function, but the composition of two equivalence
relations need not be an equivalence relation. However, the composition of an equivalence relation
with itself will be an equivalence relation. Here is a proof.
Since E is reflexive on A, for any a ∈ A, (a, a) ∈ E is true, and since (a, a) and (a, a) are in E, the
composition rule tells us (a, a) ∈ E ◦ E! So E ◦ E is reflexive on A.
Next, suppose (a, b) ∈ E ◦ E. That means there is an x in A such that (a, x) and (x, b) are in E.
Since E is an equivalence relation, it is symmetric, so (b, x) and (x, a) are in E. The composition
rule then tells us (b, a) is in E ◦ E. That proves E ◦ E is symmetric.
Finally, suppose (a, b) and (b, c) are in E ◦ E. That means there is an x in A such that (a, x) and
(x, b) are in E and there is a y in A such that (b, y) and (y, c) are in E. From (x, b) and (b, y) in
E, we see (x, y) is in E. Then from (a, x) and (x, y) in E, we get (a, y) is in E. And finally, (a, y)
and (y, c) in E tells us (a, c) is in E ◦ E. So E ◦ E is transitive.
Since (f ◦g)(x) = (f ◦g)(y), the definition of the composition of functions tells us f (g(x)) = f (g(y)).
Since f is one-to-one, we conclude g(x) = g(y), and then, since g is one-to-one, we get x = y.
Chapter 12
12.2.
[Graph for 12.2: a plot on the window −3 ≤ x ≤ 3, −4 ≤ y ≤ 3.]
12.3.
We could fall back on the classic plot-a-billion-points method of graphing, but it is a better idea to
think first. The graph ought to look a lot like the graph of y = ⌊x⌋, but the jumps will move to new
spots. In particular, there will be a jump up by 1 whenever x reaches a value where 2x − 1 = n,
for n equal to an integer. In other words, there will be jumps when x = (n + 1)/2 for integers n:
. . . , −2 = −4/2, −3/2, −1 = −2/2, −1/2, 0 = 0/2, 1/2, 1 = 2/2, 3/2, 2 = 4/2, . . . .
Moreover, for the integer n, the jump at x = (n + 1)/2 will be from n − 1 to n. That's enough to draw
the graph: it looks exactly like the graph of y = ⌊x⌋, except jumps come at multiples of 1/2 instead
of at the integers and care is needed with the heights of each step. That makes the graph easy to
draw since we can cheat by dividing all the integer marks on the x-axis by 2 in the graph of y = ⌊x⌋.
[Graph for 12.3: the step graph of y = ⌊2x − 1⌋, with jumps at the half-integers.]
12.4. Thinking: the graph of y = ⌊x − 1⌋ will look just like the graph of y = ⌊x⌋ shifted one unit
to the right, and the factor of 2 in y = 2⌊x − 1⌋ will move each horizontal segment of the graph
to twice its original distance from the x-axis. Put those two pieces together:
[Graph for 12.4: the step graph of y = 2⌊x − 1⌋ on −3 ≤ x ≤ 3.]
12.5. The functions f(x) = 18x and g(x) = x³/2 cross when 18x = x³/2. That is, when
36x = x3
x3 − 36x = 0
x(x2 − 36) = 0
x(x − 6)(x + 6) = 0
x = −6, 0, 6
[Graph for 12.5: the curves y = 18x and y = x³/2 plotted for 0 ≤ x ≤ 10.]
12.6. The exponential function g(x) = 2^x ultimately beats the power function f(x) = 4x^5. To
show that, let's find a number x > 1 so that ln(2^x) > ln(4x^5). That inequality can be rewritten
as x ln 2 > ln 4 + 5 ln x. Since all we care about is showing there is an x for which that inequality
is true, we can make some simplifying substitutions: if x/2 > 2 + 5 ln x, that would guarantee
x ln 2 > ln 4 + 5 ln x because x ln 2 > x/2 and 2 > ln 4. So let's see if we can find x so that x > 4 + 10 ln x, and
since we can make x as large as we want, we can also guarantee ln x ≥ 4. Consequently we need
only be sure we can pick x with x > 11 ln x. Thinking about the graphs of y = x and y = 11 ln x,
we can see there will certainly be such an x. If we know a bit of calculus, we can compute slopes of
tangent lines to the two curves to show there is such an x. If we have a computer algebra system,
we can find a specific value of such an x. In fact, it turns out that 41 is the smallest integer greater
than 1 for which x > 11 ln x.
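That last claim is easy to confirm numerically; here is a quick Python check (a sketch, not part of the original answer):

import math

smallest = next(x for x in range(2, 1000) if x > 11 * math.log(x))
print(smallest)   # 41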
12.7. Assuming the other buttons are in working order, we can use the fact that log(2^√2) = (√2) log 2. Here are the steps:
Chapter 13
13.1. A great source of information about numerical sequences is The Online Encyclopedia of
Integer Sequences at http://oeis.org. That site finds over 200 sequences that either begin or are
otherwise related to 1, 2, 4, 5, 7, 8
One possible answer: It looks like the list of positive integers in order, skipping the multiples of 3.
So the next few terms will be 10, 11, 13, 14, 16, 17.
Another answer: The positive integers n (in increasing order) for which 2n + 3 is a prime. The next
few terms would be 10, 13, 14, 17, 19, 20.
13.3. Let a be the initial term and d be the common difference. We have the system a + 9d = −4
and a + 15d = 47. Subtracting the first from the second gives 6d = 51, so d = 51/6 = 17/2, and then
a = −4 − 9d = −4 − 9(17/2) = −161/2. That means the 11th term is −161/2 + 10(17/2) = 9/2.
13.6. Suppose we have a sequence with initial term a that is arithmetic (with common difference
d) as well as geometric (with common ratio r). For the two terms following the initial term, we
have a + d = ar and a + 2d = ar². The first equation tells us that d = a(r − 1). Subtracting the
first equation from the second gives d = ar² − ar = ar(r − 1). So a(r − 1) = ar(r − 1), and that
equation leaves only a few options for a and r. It could be that (1) a = 0, which forces d = 0
and the sequence is 0, 0, 0, 0, . . ., or (2) r − 1 = 0, so r = 1, which forces d = 0, and the sequence
is a, a, a, a, . . ., or (3) neither a = 0 nor r = 1, in which case a(r − 1) = ar(r − 1) reduces to r = 1,
which cannot happen in this case. Conclusion: the constant sequences a, a, a, a, . . . are the only sequences
that are both arithmetic (initial term a and common difference 0) and geometric (initial term a
and common ratio 1).
13.7. ∑_{j=1}^{4} (j² + 1) = 2 + 5 + 10 + 17 = 34
13.8. ∑_{k=−2}^{4} (2k − 3) = −7 − 5 − 3 − 1 + 1 + 3 + 5 = −7
13.9. The initial term is 2, the 100th term is 2 + (99)(6), so the total is 100 · (2 + 2 + (99)(6))/2 = 29900.
13.10. 6 + 12 + 24 + 48 + 96 = 186
13.11. 1 − 3/2 + 9/4 − 27/8 + 81/16 = 55/16
13.12. ∑_{k=1}^{n} 1/2^k
Chapter 14
14.3. 1, 2, 3, 3, 4, 4, 4, 4, 5, 5
14.4. 1, 0, 3, 2, 5, 4, 7, 6, 9, 8. It looks like the terms alternately give one more and one less than
the term index. That suggests a_n = n + (−1)^n.
14.5. This is the Look-and-Say sequence introduced by John Horton Conway. After the initial term
equal to 1, each new term is produced by reading the previous term. Examples:
• for the second term, read the first term (1) as ”one 1” (so 11)
14.6. For n = 1, we define 1d to equal d. Now for the recursive part of the definition: For n > 1,
we define nd = (n − 1)d + d.
Chapter 15
15.4. Each string in S consists of zero or more a's followed on the right by an equal number of copies of
the two-letter combination bc. Examples: aabcbc and aaaabcbcbcbc.
15.5. The strings in S consist of one or more c's preceded on the left by any combination of zero
or more a's and b's. Examples: cccc, abbbac, babbacc, aaacc.
alternate solution: (1) λ, a, b, c ∈ S, and (2) if x, y ∈ S, then xyx ∈ S. In plain English, the recursive
rule (2) says we can build longer palindromes by adding the same palindrome to both ends of a palindrome.
Chapter 16
16.1. basis: For n = 1, the left side is 1 · 3 = 3, and the right side is (1 · 2 · 9)/6 = 3, so the equality is
correct for this case.
inductive hypothesis: Suppose
1 · 3 + 2 · 4 + 3 · 5 + · · · + n(n + 2) = n(n + 1)(2n + 7)/6
as we needed to show.
1 · 2¹ + 2 · 2² + 3 · 2³ + ... + n · 2^n = (n − 1)2^{n+1} + 2,
inductive step:
1 · 2¹ + 2 · 2² + 3 · 2³ + ... + n · 2^n + (n + 1)2^{n+1}
= (n − 1)2^{n+1} + 2 + (n + 1)2^{n+1}   using the inductive hypothesis
= ((n − 1) + (n + 1))2^{n+1} + 2   and the rest is just algebra
= (2n)2^{n+1} + 2
= n · 2^{n+2} + 2
= ((n + 1) − 1)2^{(n+1)+1} + 2,
as we needed to show.
f0 + f1 + f2 + · · · + fn = fn+2 − 1
f0 + f1 + f2 + · · · + fn + fn+1
= (fn+2 − 1) + fn+1 using the inductive hypothesis
= (fn+1 + fn+2 − 1)
= fn+3 − 1 using the recursive definition of the Fibonacci sequence
= f(n+1)+2 − 1,
as we needed to show.
Putting the pieces together, we get 2^{n+1} > (n + 1)² as we needed to show.
In that last expression, the term 2(5)(11^n) is certainly divisible by 5, and the expression 11^n − 6 is
divisible by 5 by the inductive hypothesis. That implies 2(5)(11^n) + (11^n − 6) is divisible by 5 as
we needed to show.
Note: Here is another way to see that 11^n − 6 is divisible by 5: for any integer n ≥ 0, the number
11^n will have units digit 1, and so 11^n − 6 will have units digit 5. Numbers with units digit 5 are
divisible by 5. This proof does not answer the question posed however, since it is not a proof by
induction.
16.6. basis: With 0 cuts, we end up with one piece of pizza, namely the whole thing. And, sure
enough, for n = 0, (n² + n + 2)/2 = (0² + 0 + 2)/2 = 1.
inductive hypothesis: Suppose that for some n ≥ 0, n straight cuts produce a maximum
number of (n² + n + 2)/2 pieces. Then
inductive step: Suppose we add one more cut. Notice that when the new cut crosses an old cut,
it will slice one old piece into two new pieces. We can't get more than two pieces when one cut
crosses another since two straight lines cannot cross each other more than once. So, to get the
maximum number of new pieces, we should make the new cut not parallel to the n previous cuts
(and so, with care, be sure to cross all the previous cuts, and not at a point where previous cuts
cross each other). This will give the maximum number of new pieces equal to n + 1. Conclusion:
the maximum number of pieces with n + 1 straight cuts is
(n² + n + 2)/2 + (n + 1) = (n² + n + 2 + 2(n + 1))/2 = ((n + 1)² + (n + 1) + 2)/2,
as we needed to show.
The list of maximums begins 1, 2, 4, 7, 11, 16, 22, 29, 37, 46, 56, 67, 79, 92.
16.7. basis: For n = 0, we are given a_0 = 0 and we see (5⁰ − 1)/4 = 0, so the basis step is good.
inductive hypothesis: Suppose a_n = (5^n − 1)/4 for some n ≥ 0. Then
inductive step:
as we needed to show.
16.8. basis: Since the recursive formula involves the two previous terms, we are going to have to
check the closed form formula for the two terms a_0 and a_1. But they both work okay since a_0 = 1
and 2(3⁰) − 2⁰ = 2(1) − 1 = 1, and a_1 = 4 and 2(3¹) − 2¹ = 2(3) − 2 = 4.
inductive hypothesis: For n ≥ 2, a_n depends on the two previous terms in the sequence, so it
would be wise to use the second form of induction this time. So, let's suppose that a_k = 2 · 3^k − 2^k
for all k from 0 to some n ≥ 1. Then
inductive step:
as we needed to show.
Chapter 20
20.1. Proof:
Suppose a > 0 and b > 0. Since a > 0, multiplying both sides of b > 0 by a gives ab > a0. We
know a0 = 0. It follows that ab > 0.
20.2. Proof:
Suppose neither a nor b is 0. Consider four cases:
(1) a > 0 and b > 0: In this case, we proved ab > 0 in Exercise 1. In particular then, ab ̸= 0 in
this case.
(2) a > 0 and b < 0: Since a > 0, multiplying both sides of b < 0 by a gives ab < a0. We know
a0 = 0. It follows that ab < 0. So ab ̸= 0 in this case.
(3) a < 0 and b > 0: Since a < 0, multiplying both sides of b > 0 by a gives ab < a0. We know
a0 = 0. It follows that ab < 0. So ab ̸= 0 in this case.
(4) a < 0 and b < 0: Since a < 0, multiplying both sides of b < 0 by a gives ab > a0. We know
a0 = 0. It follows that ab > 0. In particular then, ab ̸= 0 in this case.
20.3. Proof:
Suppose c ̸= 0 and ac = bc. We can rewrite ac = bc as ac − bc = 0, and then use the distributive
property to write that as (a−b)c = 0. Applying the result of Exercise 2, we conclude either a−b = 0
or c = 0. Since c ̸= 0, it must be that a − b = 0. Adding b to each side of that equation shows
a = b.
Chapter 21
21.1. 107653 = (4)(22869) + 16177. So the quotient is 4, and the remainder is 16177.
21.2. The square root of 1297 is 36.01..., so if 1297 is not a prime, it must have a prime divisor no
more than 36. Testing 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, and 31 (use a calculator if you want), we find
none of those eleven primes divides 1297. That means 1297 is a prime.
21.3. The divides relation is reflexive: For every integer a, a|a is true since a · 1 = a.
21.4. The divides relation is not symmetric. For example, 2|4 is true, but 4|2 is false.
21.5. The divides relation is transitive. Suppose a|b and b|c are both true. That means there are
integers d and e such that ad = b and be = c. Multiplying each side of the first of those two
equations by e gives ade = be, so a(de) = c. Since de is an integer, that equation shows a|c is true.
21.6. The expression 4|12 is a proposition, not a number. Note 4|12 is true, whereas 12/4 = 3 using
correct notation.
The first number in the list is not a prime since 2 is a factor of each term, and so 2 is a factor of
1001! + 2. Likewise, 3 is a proper factor of the second number, 4 is a proper factor on the third
number, and so on, until 1001 is a proper factor of the 1000th number in the list. So none of the
integers can be a prime.
21.8. Proof:
Suppose a|b. That means ac = b for some integer c. Then (−a)(−c) = ac = b, so −a|b.
Chapter 22
22.1a.
233 = 2 · 89 + 55
89 = 1 · 55 + 34
55 = 1 · 34 + 21
34 = 1 · 21 + 13
21 = 1 · 13 + 8
13 = 1 · 8 + 5
8=1·5+3
5=1·3+2
3=1·2+1
2=2·1+0
gcd(233, 89) = 1
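The same division steps can be produced by a short Python routine (a sketch, not part of the original answer; the name gcd_steps is just for illustration):

def gcd_steps(a, b):
    # The Euclidean algorithm, printing each division as it goes.
    while b != 0:
        q, r = divmod(a, b)
        print(f"{a} = {q} * {b} + {r}")
        a, b = b, r
    return a

print("gcd =", gcd_steps(233, 89))   # reproduces the divisions above; gcd = 1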
22.1b.
1001 = 77 · 13 + 0
gcd(1001, 13) = 13
22.1c.
gcd(2457, 1458) = 27
22.1d.
gcd(567, 349) = 1
22.2.
987654321 = 8 · 123456789 + 9
123456789 = 13717421 · 9 + 0
gcd(987654321, 123456789) = 9
22.4. Since n divides both n and 2n, that means n is a common divisor of n and 2n. On the other
hand, no integer larger than n can divide n. So n is the largest common divisor of n and 2n. So
gcd(n, 2n) = n.
Chapter 23
17 = (1)(119) + (−1)(102)
You’ll likely agree that the Extended Euclidean Algorithm table is a lot neater and much less prone
to error.
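For readers who prefer code to tables, here is a short recursive Extended Euclidean Algorithm in Python (a sketch, not part of the original answer):

def ext_gcd(a, b):
    # Returns (g, x, y) with g = gcd(a, b) and g = a*x + b*y.
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

print(ext_gcd(119, 102))   # (17, 1, -1), matching 17 = (1)(119) + (-1)(102)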
23.2. Since 1 is a linear combination of a and b, we know that 1 is a multiple of gcd(a, b) from
Theorem 23.2. Since gcd’s are positive, it follows that gcd(a, b) = 1.
23.3. Since 19 is a linear combination of a and b, we know that 19 is a multiple of gcd(a, b) from
Theorem 23.2. Since gcd’s are positive, the possible values for gcd(a, b) are 1 and 19.
23.4. Since 18 is a linear combination of a and b, we know that 18 is a multiple of gcd(a, b) from
Theorem 23.2. Since gcd’s are positive, the possible values for gcd(a, b) are 1, 2, 3, 6, 9, and 18.
Chapter 24
24.1. A calculator would be handy for this problem. The plan is to try divisions by 2, 3, 5, 7, 11, 13,
and so on, until we have the complete factorization into primes.
Note that in testing for prime divisors of 3389, we needed only test 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53.
With a calculator, the whole process took about two minutes.
24.2.
1016 = (2)(508) = (2)(2)(254) = (2)(2)(2)(127) = 2³ · 127
24.3. The positive divisors of 1016 will look like 2^a · 127^b where a = 0, 1, 2, 3 and b = 0, 1. Since
there are four choices for a and two choices for b, there will be a total of (4)(2) = 8 positive divisors
of 1016. They are:
2⁰127⁰ = 1     2¹127⁰ = 2     2²127⁰ = 4     2³127⁰ = 8
2⁰127¹ = 127   2¹127¹ = 254   2²127¹ = 508   2³127¹ = 1016
24.4. The positive divisors of 345678 will look like 2^a 3^b 17^c 3389^d where a, b, c, d can each be either
0 or 1. So, there will be a total of (2)(2)(2)(2) = 16 positive divisors of 345678.
Chapter 25
25.1. Since gcd(21, 48) = 3 and 3 does not divide 8, there are no integer solutions to this equation.
25.2. Since gcd(21, 48) = 3 and 3 does divide 9, there are integer solutions to this equation. To find
one solution, we could use the Extended Euclidean Algorithm, but in this case the numbers are small
enough that we can find a solution by inspection (in other words, we can guess an answer). First,
let's reduce the equation by dividing each side by 3 = gcd(21, 48) to get the equation 7x + 16y = 3.
We can quickly see a solution: x = 5, y = −2 since 7(5) + 16(−2) = 35 − 32 = 3. Multiplying
both sides of 7(5) + 16(−2) = 3 by 3 gives 21(5) + 48(−2) = 9. Now that we have one solution to
21x + 48y = 9 we can write down all solutions:
x = 5 + (48/gcd(21, 48))k = 5 + 16k
y = −2 − (21/gcd(21, 48))k = −2 − 7k
where k is any integer. Just to check our work, let’s test the solution when k = 10 (so x = 165 and
y = −72).
25.3. Since gcd(33, 12) = 3 and 3 does not divide 7, there are no solutions.
25.4. We need one solution to get the ball rolling. We could use the Extended Euclidean Algorithm
to write 3 = gcd(33, 12) as a linear combination of 33 and 12, and we would probably have to do
that if the numbers were larger. But with these small numbers we can do the work in our head:
3 = (33)(−1) + (12)(3).
Using the formulas that produce all solutions once one is known, we get that all solutions are given by
x = −2 + (12/3)k = −2 + 4k   and   y = 6 − (33/3)k = 6 − 11k
where k is any integer.
25.5. First, let’s find all solutions to 59x + 37y = 4270 by applying the Extended Euclidean
Algorithm.
remainders:     59   37   22   15    7    1    0
quotients:            1    1    1    2    7
coeff. of 37:    0    1   -1    2   -3    8  -59
coeff. of 59:    1    0    1   -1    2   -5   37
The table shows gcd(59, 37) = 1 = 59(−5) + 37(8). Multiplying by 4270, we see one solution to
59x + 37y = 4270 is given by x = −5(4270) and y = 8(4270). That means all solutions to that
equation are given by
We need to find values of k for which both x and y are 0 or more. In other words, we want to solve
Chapter 26
26.1a. Taking 0 hours as midnight, the time 3122 hours after 16 hundred hours is
16 + 3122 ≡ 3138 ≡ 18 (mod 24) hundred hours (or 6 p.m.).
26.1b. Taking Sunday as day 0 of a week, Monday will be 1. So, 3122 days after a Monday is
1 + 3122 ≡ 3123 ≡ 1 (mod 7). So, it is a Monday.
26.1c. Taking January as month 1, November will be month 11. So, 3122 months later it will be
11 + 3122 ≡ 3133 ≡ 1 (mod 12). So, it will be January.
26.2. The integers in [7]₁₁ are given by adding any number of 11's to 7. In other words, 7, 18, 29, 40, 51, . . .
and −4, −15, −26, . . .. More compactly: 7 + 11k, for all integers k.
26.3.
1211 ≡ 1 (mod 5)
218 ≡ 3 (mod 5)
−100 ≡ 0 (mod 5)
−3333 ≡ 2 (mod 5)
The missing equivalence class is [4]5 . Any value in that equivalence class will do for the fifth value.
So the possible answer is any number of the form 4 + 5k, for an integer k. In particular, 4 would
work (or −1, or 10004, or −6, and so on).
4¹ ≡ 4 (mod 9)
4² ≡ 16 ≡ 7 (mod 9)
4³ ≡ (4)(4²) ≡ (4)(7) ≡ 28 ≡ 1 (mod 9)
26.7. Since gcd(4, 7) = 1 and 1 divides 3, there will be exactly one solution modulo 7 to 4x ≡
3 (mod 7). Let’s use trial-and-error to find that solution. Testing x = 0, 1, 2, 3, 4, 5, 6, we find
(4)(6) ≡ 24 ≡ 3 (mod 7), and so the solution is x ≡ 6 (mod 7). Incidentally, it would be incorrect
to say the solution is x = 6 since we are working modulo 7 and so the solution has to be given
modulo 7.
26.8. Using the Extended Euclidean Algorithm, we get gcd(57, 11) = 1 = 57(−5) + 11(26).
Since 1 divides 8, there will be exactly one solution to 11x ≡ 8 (mod 57). To find that solution,
multiply both sides of 57(−5)+11(26) = 1 = gcd(57, 11) (cleverly) by 8 to get 57(−40)+11(208) = 8.
So, one solution to 11x ≡ 8 (mod 57) is x = 208. That means all solutions are given by x ≡ 208 ≡
37 (mod 57). Note that giving the solution as x ≡ 208 (mod 57) is correct, but people expect to
see solutions to x ≡ n (mod m) written with the value of n in the range 0 to m − 1.
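When gcd(a, m) = 1, this kind of congruence can also be solved with Python's built-in modular inverse; here is a quick check of 26.8 (a sketch, not part of the original answer, and it assumes Python 3.8 or later for three-argument pow with a negative exponent):

# pow(a, -1, m) returns the inverse of a modulo m when gcd(a, m) = 1 (Python 3.8+).
inv = pow(11, -1, 57)
x = (inv * 8) % 57
print(inv, x)   # 26 and 37: 11 * 26 ≡ 1 (mod 57), so x ≡ 26 * 8 ≡ 37 (mod 57)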
26.9. Observe gcd(14, 231) = 7 and 7 does not divide 3. So, 14x ≡ 3 (mod 231) has no solutions.
26.10. To solve 8x ≡ 16 (mod 28), we look for solutions to 8x + 28y = 16, We can simplify that
equation by dividing through by 4 = gcd(8, 28) to get 2x+7y = 4. Solving that equation is the same
as solving 2x ≡ 4 (mod 7). (Short cut: when solving ax ≡ b (mod m) where gcd(a, m) = d divides
b, we can simplify the original equation by dividing a, b, m each by d to get (a/d)x ≡ b/d (mod m/d).) The
equation 2x ≡ 4 (mod 7) will have exactly one solution modulo 7 (but remember that gcd(8, 28) =
4, so the original equation will have four solutions modulo 28). A little trial-and-error (or plain
old common sense) shows 2x ≡ 4 (mod 7) has solution x ≡ 2 (mod 7). It is acceptable to leave the
answer in this form, but since the problem was given modulo 28, it is good manners to provide the
solutions modulo 28. Since x has to be 2 modulo 7, the solutions modulo 28 will be the four values
from 0 to 27 that are equal to 2 modulo 7. Final answer then: x ≡ 2, 9, 16, 23 (mod 28).
26.11. Since gcd(91, 231) = 7, and 7|189, we see the congruence will have seven solutions modulo
231. As in exercise 10, to simplify the work a bit we could cancel 7’s in the given congruence, and
rewrite the problem as 13x ≡ 27 (mod 33). Solving that (using the Extended Euclidean Algorithm
for example), we get x ≡ 30 (mod 33) for the solution. Since the problem was given modulo 231,
we should express the solutions modulo 231 as well.
Solutions: x ≡ 30, 63, 96, 129, 162, 195, 228 (mod 231).
26.12a. Let d = gcd(a, m) and suppose s is a solution to ax ≡ b (mod m) so that as ≡ b (mod m).
If x represents a solution to ax ≡ b (mod m), then ax ≡ as (mod m). Rearrange that as ax − as ≡
0 (mod m), or a(x − s) ≡ 0 (mod m). That means m divides a(x − s), and so there is an integer k
with mk = a(x − s). Now d also divides a and m since d = gcd(a, m). So, we can divide both sides
of that equation by d to get
(m/d)k = (a/d)(x − s).
So m/d divides (a/d)(x − s). But m/d is relatively prime to a/d, and so m/d must divide x − s. In other words,
there is an integer r such that x − s = r(m/d). Rearrange that as x = s + r(m/d).
26.12b. Suppose 0 ≤ r1 < r2 < d, and that d is a positive divisor of m. We want to show the numbers
x1 = s + r1(m/d) and x2 = s + r2(m/d) are not equivalent modulo m. In other words, we want to show
that m does not divide
x2 − x1 = (s + r2(m/d)) − (s + r1(m/d)) = (m/d)(r2 − r1).
Well, suppose m does divide (m/d)(r2 − r1). That means there is an integer k such that mk = (m/d)(r2 − r1).
Cancel the common factor m, and multiply both sides by d to get dk = r2 − r1. That equation
tells us r2 − r1 is a multiple of d. Since 0 ≤ r1 < r2 < d, we know 0 < r2 − r1 < d, and
none of the integers in that range is a multiple of d. We have reached a contradiction, and so we
can conclude that x1 and x2 are different modulo m. (Notice that this means that the numbers
s, s + m/d, s + 2(m/d), s + 3(m/d), . . . , s + (d − 1)(m/d) are d different values modulo m.)
Chapter 27
27.1. 21₃ = 7
321₄ = 57
4321₅ = 586
FED₁₆ = 4077
11714 = 130122₆
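These conversions are easy to verify in Python; the sketch below (illustrative code, not part of the original answer) uses int(..., base) for reading a numeral and repeated division for writing one:

def to_base(n, b, digits="0123456789ABCDEF"):
    # Convert a nonnegative integer to a base-b numeral by repeated division.
    out = ""
    while n:
        n, r = divmod(n, b)
        out = digits[r] + out
    return out or "0"

print(int("21", 3), int("321", 4), int("4321", 5), int("FED", 16))   # 7 57 586 4077
print(to_base(11714, 6))                                             # 130122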
× 1 2 3 4 5 6
1 1 2 3 4 5 6
2 2 4 6 11 13 15
3 3 6 12 15 21 24
4 4 11 15 22 26 33
5 5 13 21 26 34 42
6 6 15 24 33 42 51
11714 = 2DC2₁₆
27.3. Base 6 addition table:
+ |  1  2  3  4  5
1 |  2  3  4  5 10
2 |  3  4  5 10 11
3 |  4  5 10 11 12
4 |  5 10 11 12 13
5 | 10 11 12 13 14

27.4. Base 6 multiplication table:
× |  1  2  3  4  5
1 |  1  2  3  4  5
2 |  2  4 10 12 14
3 |  3 10 13 20 23
4 |  4 12 20 24 32
5 |  5 14 23 32 41
27.5. Luckily we have a base 7 multiplication table above to make the work a little less onerous.
For neatness, the subscript 7 will be omitted.
gcd(5122₇, 1312₇) = 1₇.
Chapter 28
28.2. Number of options for a program of three courses, one from each area = 5 · 4 · 6 = 120.
28.4. 26⁶
28.5. 26 · 25 · 24 · 23 · 22 · 21
28.7. 1 + 2 + 2² + · · · + 2⁹ = (2¹⁰ − 1)/(2 − 1) = 1023.
If you don't want to include the empty string (of length 0) then the answer is 1022.
28.8. The total number of words of length 8 is 26⁸. The number with no A's is 25⁸. So the number
with at least one A is 26⁸ − 25⁸.
28.9. The number of seven letter words with no A's is 25⁷. The number of seven letter words with
exactly one A is 7 · 25⁶. The 7 accounts for the number of options for placing the A, and the 25⁶
accounts for the number of ways of filling in the remaining six spots. So, the number of seven letter
words with at most one A is 25⁷ + 7 · 25⁶.
28.10. The number of nine letter words with at least two A's is the total number of nine letter
words (26⁹) minus the number with at most one A (25⁹ + 9 · 25⁸). So, using the good = total minus
bad rule the number of nine letter words with at least two A's is 26⁹ − (25⁹ + 9 · 25⁸).
Chapter 29
29.1. 26!
29.2. Part 1: (vowels together, assume the vowels are a, e, i, o, u) (Task 1) arrange the five vowels
in some order: 5! ways to do that. (Task 2) arrange the 21 non-vowels in some order: 21! ways
to do that. (Task 3) pick a spot in the row of 21 non-vowels to place the row of five vowels: 22
choices for that spot. Since we need to do all three tasks, the number of possible arrangements is
5! · 21! · 22.
Part 2: (no adjacent vowels) (Task 1) arrange the 21 non-vowels in some order: 21! ways to do
that. (Task 2) There are 22 gaps around those non-vowels, and we need to select five of them for the
vowels: we can do that in C(22, 5) ways. (Task 3) arrange the five vowels in some order to place in
the five open spots: 5! ways to do that. Since we need to do all three tasks, the number of possible
arrangements is 21! · C(22, 5) · 5!.
29.3. There are 13! ways to arrange the books on each shelf. Since we need to arrange shelf 1 and shelf
2, there will be (13!)(13!) = (13!)² ways to arrange the bookcase.
29.4. People are normally considered distinguishable. (Task 1) arrange the seven men in some
order: 7! ways to do that. (Task 2) arrange the four women in some order: 4! ways to do that.
(Task 3): pick a spot in the row of four women to place the row of seven men: 5 choices for that
spot. Since we need to do all three tasks, the number of possible arrangements is 7! · 4! · 5.
29.5. We need to select 10 of the 20 to form one of the teams (the remaining 10 will form the other
team). That can be done in C(20, 10) ways. Since that counts each division into two teams twice, the
total number of ways to divide the group into two teams is (1/2) · C(20, 10).
29.6. The order of the numbers on the lottery ticket does not matter, so there are C(99, 5) lottery tickets
possible. In order to have at least a one-in-a-million chance of winning the lottery by matching
all five numbers we need to have n tickets, where n/C(99, 5) ≥ 1/1,000,000. In other words, we need
n ≥ C(99, 5)/1,000,000 = 71,523,144/1,000,000 = 71.52 . . .. That means we need to buy 72 tickets.
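Here is a quick check of these numbers with Python's math.comb (a sketch, not part of the original answer; math.comb requires Python 3.8 or later):

import math

tickets = math.comb(99, 5)
print(tickets)                          # 71523144
print(math.ceil(tickets / 1_000_000))   # 72 tickets needed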
29.7a. We need to select six of the 9 + 13 = 22 people available. That can be done in C(22, 6) ways.
29.7b. (Task 1) Select two deans: C(9, 2) ways.
(Task 2) Select four professors: C(13, 4) ways.
We need to do task 1 and task 2, so there are C(9, 2) · C(13, 4) such committees.
29.7c. The options are (1) six professors, (2) five professors and one dean, or (3) four professors
and two deans. So the total number of acceptable committees is
C(13, 6) + C(13, 5) · C(9, 1) + C(13, 4) · C(9, 2).
Chapter 30
30.1. Adding one more row to the triangle as given in the text using Pascal’s Identity, we get
Row 0: 1
Row 1: 1 1
Row 2: 1 2 1
Row 3: 1 3 3 1
Row 4: 1 4 6 4 1
Row 5: 1 5 10 10 5 1
Row 6: 1 6 15 20 15 6 1
30.2. C(10, 3) · 3³ · (−2)⁷
30.3.
C(2n, 2) = (2n)!/(2!(2n − 2)!) = (2n)(2n − 1)(2n − 2)!/(2!(2n − 2)!) = (2n)(2n − 1)/2 = n(2n − 1) = 2n² − n,
and
2 · C(n, 2) + n² = 2 · n!/(2!(n − 2)!) + n² = 2 · n(n − 1)(n − 2)!/(2!(n − 2)!) + n² = 2 · n(n − 1)/2 + n² = n(n − 1) + n² = 2n² − n.
30.4.
C(r, s) · C(s, t) = [r!/(s!(r − s)!)] · [s!/(t!(s − t)!)] = r!/((r − s)! t! (s − t)!)
and
C(r, t) · C(r − t, s − t) = [r!/(t!(r − t)!)] · [(r − t)!/((s − t)!((r − t) − (s − t))!)] = [r!/(t!(r − t)!)] · [(r − t)!/((s − t)!(r − s)!)] = r!/((r − s)! t! (s − t)!).
30.5. Scenario: We want to pick a committee of s people from a company with r employees, and
a subcommittee of t of those s to act as the committee's board. In how many ways can that be
done?
Method 1: Select the s people from the total of all r (C(r, s) ways to do that), and then select t of
those s to be the board (C(s, t) ways to do that). So, according to the product rule, there are C(r, s) · C(s, t) ways to form the committee and its board.
Method 2: First select the t people to serve on the board from the r people available (C(r, t) ways to
do that). To fill out the committee, we need to pick s − t more people from the remaining r − t
people (C(r − t, s − t) ways to do that). So, according to the product rule, there are C(r, t) · C(r − t, s − t) ways to form
the committee and its board.
Since the two counting methods must give the same answer, we get C(r, s) · C(s, t) = C(r, t) · C(r − t, s − t).
30.7. Suppose p is a prime and k is an integer with 1 < k < p. Let C(p, k) = n (note that n is
an integer). Expanding the binomial coefficient, we get p!/(k!(p − k)!) = n. Rewrite that equation as
p! = n · k! · (p − k)!. We want to show p divides n. Since the prime p divides the left side of that
equation, it must divide the right side, and one of the properties of primes we proved is that if
a prime divides a product of integers, it must divide one of those integers. So we can conclude p
divides n, or k!, or (p − k)!. Since k! = (1)(2) · · · (k) and k < p, there is no factor of p in k!. So p
does not divide k!. Likewise, (p − k)! = (1)(2) · · · (p − k) and p − k < p, so p does not divide (p − k)!.
We conclude p must divide n as we wanted to prove.
Chapter 31
31.1. Let M be the set of students with a math major, C: chemistry majors, B: biology majors,
G: geology majors, P : physics majors, and A: anthropology majors.
The total number of students is (taking advantage of the fact each student has at most two majors,
and some double majors have no students)
31.2. Let A be the length 15 bit strings that start with 1111 (they look like 1111...........). Let B
be the set of length 15 bit strings that end with 1000 (they look like ...........1000), and let C be
the set of length 15 bit strings with bits 4 through 7 equal to 1010 (they look like ...1010.........). We
need to compute |A ∪ B ∪ C|.
31.3. Let A be the set of integers between 1000 and 9999 (inclusive) that are multiples of 4. To
count the number of integers, n, in A, we want to solve 1000 ≤ 4n ≤ 9999. In other words, solve
250 ≤ n ≤ 2499.75. The number of integers in the range [250, 2499] is 2499 − 250 + 1 = 2250. So
|A| = 2250. Likewise, letting B be the set of multiples of 10 in the range 1000 to 9999, we get
|B| = 900, and with C being the set of multiples of 25 in that range, we get |C| = 360. Now things
get a little trickier: For A ∩ B we want to count the integers that are both multiples of 4 and 10.
But that is the same as the multiples of 20, so |A ∩ B| = 450. Likewise A ∩ C (multiples of 100)
has |A ∩ C| = 90, and B ∩ C (multiples of 50) has |B ∩ C| = 180. Finally, A ∩ B ∩ C (also multiples
of 100) has |A ∩ B ∩ C| = 90. So
All five of the counts |A1|, |A2|, . . . , |A5| will be 4! (place one number in its spot, and arrange
the other four in any way at all). Likewise, all ten counts like |A1 ∩ A2| will be 3! (place two
numbers in the correct spots, and arrange the remaining three in any way at all). Continuing,
counts of the A1 ∩ A2 ∩ A3 type will be 2!, and of the A1 ∩ A2 ∩ A3 ∩ A4 type will be 1!. Finally,
|A1 ∩ A2 ∩ A3 ∩ A4 ∩ A5| = 1 (every number is in its correct spot). So,
|A1 ∪ A2 ∪ A3 ∪ A4 ∪ A5 |
= 5(4!) − 10(3!) + 10(2!) − 5(1!) + 1 = 120 − 60 + 20 − 5 + 1 = 76.
31.5. Using the good = total − bad method, the number of permutations of 1, 2, 3, 4, 5 with no digit
in its correct spot will be the total number of permutations of 1, 2, 3, 4, 5 minus the number of
those permutations with at least one number in its correct spot. From the problem above, that is
5! − 76 = 120 − 76 = 44.
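The count of 44 can also be confirmed by brute force; here is a short Python sketch (not part of the original answer):

from itertools import permutations

# Count permutations of 1..5 in which no digit sits in its own position.
derangements = sum(1 for p in permutations(range(1, 6))
                   if all(p[i] != i + 1 for i in range(5)))
print(derangements)   # 44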
Chapter 32
32.1. If we distribute eight objects (people) into seven piles (days of the week), there will be at
least one pile (day) with ⌈8/7⌉ = 2 objects (people).
32.2. If we distribute 100 objects (people) into seven piles (days of the week), there will be at least
one pile (day) with ⌈100/7⌉ = 15 objects (people).
32.3. Since there are four suits (piles), we need the least value of n so that ⌈n/4⌉ = 6. That will
be n = 5 · 4 + 1 = 21.
32.5. Let ti denote the total number of hours studied from day 1 to day i.
We are told that at least one hour is studied each day, and at most 125 hours are studied in total, so
1 ≤ t1 < t2 < · · · < t75 ≤ 125, and therefore 25 ≤ t1 + 24 < t2 + 24 < · · · < t75 + 24 ≤ 149.
So we have 150 numbers (namely t1, t2, · · · , t75 and t1 + 24, t2 + 24, · · · , t75 + 24)
all between 1 and 149. By the Pigeonhole Principle some two of those must be equal. Since
the numbers in the first list are all different, and the numbers in the second list are also all
different, it must be that ti = tj + 24 for some i and j. That means ti − tj = 24.
That tells us that on days j + 1, j + 2, · · · , i, Al studied exactly a total of 24 hours.
32.6. Since there are only 216 different values modulo 216, the pigeonhole principle says some two
of the 217 numbers, say m and n, must have the same value modulo 216. So m ≡ n (mod 216).
That means 216 | m − n. So m and n have a difference that is a multiple of 216.
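A small Python sketch (added for illustration; any collection of 217 integers will do, and the particular list used here is just an example) finds the promised pair:

# Among any 217 integers, two must share a residue mod 216 (pigeonhole),
# so their difference is a multiple of 216.
def find_pair(nums, m=216):
    seen = {}  # residue -> first number seen with that residue
    for n in nums:
        r = n % m
        if r in seen:
            return seen[r], n  # guaranteed to happen for 217 numbers
        seen[r] = n

nums = [k * k + 7 for k in range(217)]  # an arbitrary example list
a, b = find_pair(nums)
print(a, b, (b - a) % 216)  # the last value is always 0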
Chapter 33
33.1. There are four choices for each of the n positions in the string. So there are 4^n such strings.
33.2. Including the string of length 0, there are 1 + 4 + 4^2 + · · · + 4^7 = (4^8 − 1)/(4 − 1) = (4^8 − 1)/3 such strings.
33.3. (a donut shop problem) Ask for 3 x1's, 4 x2's, 5 x3's, and 6 each of x4's, x5's, x6's, and x7's. That gives us 36 donuts so far. The remaining 18 can be selected in any way at all. So there are C(18 + 7 − 1, 18) = C(24, 18) acceptable solutions to the equation.
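The binomial count can be double-checked in Python by enumerating the multisets directly (an added sketch; the enumeration is small enough to run in a moment):

from itertools import combinations_with_replacement
from math import comb

# Choosing the remaining 18 donuts from 7 varieties, repetition allowed,
# is the same as counting multisets of size 18 drawn from 7 types.
brute = sum(1 for _ in combinations_with_replacement(range(7), 18))
print(brute, comb(18 + 7 - 1, 18))  # both should be 134596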
33.4. For ternary strings, each position is a 0, 1, or 2. If the ternary string begins 0101 and ends with 212, it must have length n ≥ 7. There are n − 7 positions left to fill, and there are three choices for each position, so there are 3^{n−7} such strings.
33.5. When you build your order, tell the clerk to start with four jalapeño, six cherry, and eight strawberry. That accounts for 18 donuts, and so you need 18 more, and any combination is okay for those last 18. So there are C(8 + 18 − 1, 18) = C(25, 18) ways to form the donut order.
33.6a. 26^10 (There are 26 choices for each of the middle ten spots.)
33.6b. Pick a spot for the x (12 options). Fill in the 11 empty spots (25 choices for each spot since we can't use the x again): Answer: (12)(25^11).
33.6c. Pick a spot for the x (12 choices), then pick a spot for the y (11 choices), then fill in the remaining 10 spots (24 choices for each spot): Answer: (12)(11)(24^10).
33.6d. (good = total − bad method) There are 13 letters in the second half of the alphabet, and so 13^12 twelve-letter words made up of only letters from the second half of the alphabet. These are all bad for this problem. There are 26^12 words of length twelve. So, there are 26^12 − 13^12 twelve-letter words with at least one letter from the first half of the alphabet.
(inclusion/exclusion)
33.8. (Task 1) Pick three of the fifteen spots for 0's: C(15, 3) ways to do that.
(Task 2) Pick four of the remaining twelve spots for 1's: C(12, 4) ways.
(Task 3) Pick three of the remaining eight spots for 2's: C(8, 3) ways.
(Task 4) Pick four of the remaining five spots for 3's: C(5, 4) ways.
(Task 5) Pick one of the remaining one spot for the 4: C(1, 1) = 1 way.
So there are C(15, 3) C(12, 4) C(8, 3) C(5, 4) C(1, 1) = 15!/(3! 4! 3! 4! 1!) = 63,063,000 such strings.
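As a numerical check (added here, not part of the original answer), the product of those binomial coefficients equals the multinomial coefficient 15!/(3! 4! 3! 4! 1!):

from math import comb, factorial

product = comb(15, 3) * comb(12, 4) * comb(8, 3) * comb(5, 4) * comb(1, 1)
multinomial = factorial(15) // (factorial(3) * factorial(4) *
                                factorial(3) * factorial(4) * factorial(1))
print(product, multinomial)  # both are 63063000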
33.9a. As usual, we assume people are distinguishable. We can pair 7, 6, 5, or 4 lecturers with 0, 1, 2, or 3 professors respectively:
C(7, 7) + C(7, 6) C(14, 1) + C(7, 5) C(14, 2) + C(7, 4) C(14, 3).
33.9b. C(7, 7) + C(7, 6) C(14, 1) + C(7, 5) C(14, 2).
33.9c. The final size of the committee isn't specified, so we will assume any size (five or more) is okay. We will pick 5, 6, or 7 lecturers, and pair each selection with any subset of the professors:
C(7, 5) 2^14 + C(7, 6) 2^14 + C(7, 7) 2^14.
33.10. (good = total - bad method) There are 20! ways to form a line of the 20 people. If we tie
Hans and Brunhilda together, there are 19 items, and so there are 19! ways to line those 19 items
up. Of course Hans and Brunhilda could be in either order, so there are 2(19!) bad lines. The
number of good lines is 20! − 2(19!).
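The same good = total − bad argument can be sanity-checked on a smaller case by brute force; the sketch below (an added illustration) uses 6 people, with persons 0 and 1 playing the roles of Hans and Brunhilda:

from itertools import permutations
from math import factorial

# Lines of n people in which persons 0 and 1 are NOT adjacent.
n = 6
brute = sum(1 for line in permutations(range(n))
            if abs(line.index(0) - line.index(1)) != 1)
formula = factorial(n) - 2 * factorial(n - 1)
print(brute, formula)  # both are 480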
(basis) For a set of one element, {a}, the two subsets are {} and {a}. The first has an even number
of elements, and the second has an odd number of elements, so we are okay in this case.
(inductive step) Suppose that for some n ≥ 1, an n-element set has the same number of subsets of even cardinality as of odd cardinality. Now consider a set with n + 1 elements. Say it consists of an n-element set A together with one additional element e (for extra). List all the subsets of A. By
the inductive hypothesis, there will be some number t with even cardinality, and the same number
t with odd cardinality. Adding the element e to the subsets of A with even cardinality will produce
t subsets of A ∪ {e} with odd cardinality, and adding the e to the subsets of A with odd cardinality
will produce t subsets of A ∪ {e} with even cardinality. Conclusion: A ∪ {e} has the same number
(2t in fact) of subsets with even and odd cardinality.
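A direct check for small sets (an added sketch, independent of the induction proof) confirms the even/odd split:

from itertools import combinations

# For an n-element set, count subsets of even and of odd cardinality.
for n in range(1, 8):
    even = sum(1 for k in range(0, n + 1, 2) for _ in combinations(range(n), k))
    odd = sum(1 for k in range(1, n + 1, 2) for _ in combinations(range(n), k))
    print(n, even, odd)  # the two counts agree (both are 2**(n-1))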
Chapter 34
There are usually many different recursive formulas that will be correct answers to the problems
below. If your answer does not agree with the answer provided, you should use both formulas to
generate a number of terms, say six or eight or so. If the numbers do not agree, your work is wrong.
If they do agree, then likely your work is okay. You could make sure of that by using induction, for
example, to show the two recursive formulas are equivalent.
34.1. Let pn be the number of pennies in the bank on day n. The initial value is p0 = 0. The
recursive relation is pn = pn−1 + n, for n ≥ 1.
34.2. For n ≥ 0, let cn be the number of different ways Al can climb n steps.
For the recursive formula: When climbing n ≥ 2 steps, Al can start with one step and finish the climb in cn−1 ways, or start with two steps and finish the climb in cn−2 ways.
So, for n ≥ 2, cn = cn−1 + cn−2 .
34.3. Let an be the number of bit strings of length n with an even number of 0’s. For an initial
condition, we have a1 = 1 since the only good length 1 bit string is 1. If the empty bit string
doesn’t bother you, we could use initial condition a0 = 1. Now we think recursively: If we have
a good bit string of length n − 1, we can add a 1 to the end to get a good bit string of length n.
That accounts for all the good length n bit strings that end with 1. We get the good bit strings of
length n that end with 0 by adding 0 to the end of a bad length n − 1 bit string. Using good = total − bad (well, actually bad = total − good in this case), we see there are 2^{n−1} − an−1 bad bit strings. So, the solution is an = an−1 + (2^{n−1} − an−1) = 2^{n−1}, for n ≥ 1, with a0 = 1.
34.4. We can be sneaky about this and use the example in the text: A recursive relation for the
number of bit strings with no adjacent 0’s is given by a0 = 1, a1 = 2, and an = an−1 + an−2 for
n ≥ 2. Let bn be the number of bit strings that do contain the pattern 00, the ones we are really interested in. Now, using good = total − bad, an = 2^n − bn. So b0 = 0 and b1 = 0, and for n ≥ 2 we get
bn = 2^n − an = 2^n − (an−1 + an−2) = 2^n − (2^{n−1} − bn−1) − (2^{n−2} − bn−2) = bn−1 + bn−2 + 2^{n−2}.
The sequence, for n ≥ 0, begins: 0, 0, 1, 3, 8, 19, 43, which agrees with a few brute force computa-
tions.
But that was sort of an unsportsmanlike solution since we didn’t really reason recursively. So, let’s
try it again. After getting b0 = 0 and b1 = 0, let’s think about how to build length n ≥ 2 good bit
strings. We could add 00 to the right end of any of the 2^{n−2} bit strings of length n − 2. Or, we could add 10 to any of the bn−2 good bit strings of length n − 2. Or we could add 1 to any of the bn−1 good bit strings of length n − 1. These three options account for all the good length n bit strings: the first two count the good length n bit strings that end with 0, and the last counts the good bit strings that end with 1. So, the recursive part of the solution is, for n ≥ 2,
bn = 2^{n−2} + bn−2 + bn−1,
as before.
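A brute-force comparison (added as a check, in the spirit of the remark above) confirms that the recursion produces the same counts as direct enumeration:

from itertools import product

def contains_00_count(n):
    """Count bit strings of length n that contain the pattern 00."""
    return sum(1 for s in product('01', repeat=n) if '00' in ''.join(s))

# Recursion from the solution: b_0 = b_1 = 0, b_n = b_{n-1} + b_{n-2} + 2**(n-2).
b = [0, 0]
for n in range(2, 10):
    b.append(b[n - 1] + b[n - 2] + 2 ** (n - 2))

print([contains_00_count(n) for n in range(10)])
print(b)  # both lists: 0, 0, 1, 3, 8, 19, 43, 94, 201, 423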
34.5. Again, there is an easy way to do this counting using the good = total - bad method. The
only bad bit strings have to look like a number of 1’s followed on the right by a number of 0’s.
Examples: (length n) 000 · · · 0, 100 · · · 0, 110 · · · 0, 111 · · · 0, and so on, until we get to 111 · · · 1. That is a total of n + 1 bad strings. So the number of good length n bit strings must be 2^n − (n + 1). The first few terms of the sequence, starting at n = 0, are: 0, 0, 1, 4, 11, 26, 57.
But, again, that wasn’t recursive counting. So let’s try that again. Letting gn be the number of
good strings of length n, we get g0 = 0, g1 = 0, and g2 = 1. That’s enough for a start. Now let’s
think recursively. If we take a good string of length n − 1 and add a 0 to the right end, we get a good string of length n, and that accounts for all the good length n strings ending with 0. Next, let's count the number of good length n strings ending with 1. Here there are several cases, depending on the number of 1's that end the bit string: if the string ends with exactly k 1's (for some k with 1 ≤ k ≤ n − 1), the character just before that block of 1's must be a 0 (which supplies the required 01), and the n − k − 1 characters before that 0 can be any bit string at all, giving 2^{n−k−1} strings. So, we get
gn = gn−1 + 2^{n−2} + 2^{n−3} + · · · + 2 + 1 = gn−1 + (2^{n−1} − 1)/(2 − 1) = gn−1 + 2^{n−1} − 1.
This doesn’t look exactly like our first solution, so let’s do a bit of testing. Checking this recursive
formula against the terms computed above, using initial value g0 = 0, we get
0, 0, 1, 4, 11, 26, 57
34.6. Let gn be the number of ternary strings of length n that contain 00. A little trial-and-error
gives the values g0 = 0, g1 = 0, g2 = 1, and g3 = 5. For larger values of n it is already too much
trouble writing down the good strings without some sort of organized plan.
Let's break the problem of counting longer good strings into cases, according to how the string begins: strings beginning with 1 or 2 are good exactly when the remaining n − 1 characters contain 00, giving 2gn−1 strings; strings beginning 01 or 02 are good exactly when the remaining n − 2 characters contain 00, giving 2gn−2 strings; and strings beginning 00 are good no matter what the remaining n − 2 characters are, giving 3^{n−2} strings.
That accounts for all the good strings of length n, so the recursive formula is
g0 = 0, g1 = 0,
gn = 2gn−1 + 2gn−2 + 3^{n−2} for n ≥ 2.
The first few values are 0, 0, 1, 5, 21, 79, 281, 963, 3217.
An alternative recursive answer (as given for sequence A186244 in The On-Line Encyclopedia of Integer Sequences) (Google it!) is, for n ≥ 3,
gn = 3gn−1 + 2(3^{n−3} − gn−3).
Reasoning: The recursive formula is based on adding any of 0, 1, 2 to strings of length n − 1 which already have 00 in them, or adding 100 or 200 to strings of length n − 3 which do not.
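Both recursions can be checked against a direct enumeration of ternary strings; the sketch below (added for verification, using the alternative recursion as reconstructed above) prints the three sequences side by side:

from itertools import product

def ternary_with_00(n):
    """Count ternary strings of length n containing the pattern 00."""
    return sum(1 for s in product('012', repeat=n) if '00' in ''.join(s))

g = [0, 0]                       # g_n = 2g_{n-1} + 2g_{n-2} + 3**(n-2)
for n in range(2, 9):
    g.append(2 * g[n - 1] + 2 * g[n - 2] + 3 ** (n - 2))

h = [0, 0, 1]                    # alternative: h_n = 3h_{n-1} + 2(3**(n-3) - h_{n-3})
for n in range(3, 9):
    h.append(3 * h[n - 1] + 2 * (3 ** (n - 3) - h[n - 3]))

print([ternary_with_00(n) for n in range(9)])
print(g)
print(h)  # all three: 0, 0, 1, 5, 21, 79, 281, 963, 3217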
34.7. Let An = {1, 2, 3, . . . , n}. A subset B of An is good if B does not contain any two consecutive
integers. Let gn be the number of good subsets of An. Split the good subsets of An into two groups: (1) those that contain n, and (2) those that do not contain n.
Good subsets in group (1) cannot contain n − 1, and so those good subsets of An are produced by
adding n to a good subset of An−2 (at least if n ≥ 2). That accounts for all the good subsets of An
that contain n. That shows there are gn−2 good subsets of An that contain n.
Next, let’s count the number of good subsets of An that do not contain n. But that is easy: these
are just the good subsets of An−1 , and so there are gn−1 of these.
Conclusion: for n ≥ 2, gn = gn−1 + gn−2 (the Fibonacci recurrence!). We need initial terms: g0 = 1
and g1 = 2.
The first few values are 1, 2, 3, 5, 8, 13, 21, 34. This is the Fibonacci sequence with the first two terms
discarded.
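For a quick check (not part of the original solution), enumerate the subsets of {1, . . . , n} directly and compare with the Fibonacci-style recursion:

from itertools import combinations

def no_consecutive_count(n):
    """Count subsets of {1,...,n} containing no two consecutive integers."""
    total = 0
    for k in range(n + 1):
        for sub in combinations(range(1, n + 1), k):
            if all(b - a > 1 for a, b in zip(sub, sub[1:])):
                total += 1
    return total

g = [1, 2]
for n in range(2, 9):
    g.append(g[n - 1] + g[n - 2])

print([no_consecutive_count(n) for n in range(9)])
print(g)  # both: 1, 2, 3, 5, 8, 13, 21, 34, 55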
34.8. Maybe a bit surprisingly, this is a very difficult problem. A conjectured answer was given in
1941, and evidently the conjecture was proved correct in 2014. No one has been able to solve the
problem for more than four pegs, though there are suspicions that remain unproven.
Chapter 35
35.1. Using the given recursive formula we see the sequence begins 2, 4, 10, 28, 82, 244, and those values look a lot like the powers of 3: 1, 3, 9, 27, 81, 243. So a reasonable guess is an = 3^n + 1.
The basis for the induction is the cases n = 0 and n = 1. We are given a0 = 2 while 3^0 + 1 = 1 + 1 = 2, and a1 = 4 while 3^1 + 1 = 4. So an = 3^n + 1 is correct for n = 0 and n = 1. For the inductive step, suppose ak = 3^k + 1 for all values of k ≤ n for some n ≥ 1. Then
35.2.
an = 5an−1 = 5(5an−2) = 5^2 an−2 = 5^2(5an−3) = 5^3 an−3.
As we continue to unfold, eventually we will reach a0 . Notice that the exponent on the 5 and the
subscript on the a always add up to n. That makes sense since at each step the exponent goes up 1
and the subscript goes down 1, and so the exponent and the subscript always add up to the n they
started at in the first step. That means we eventually reach an = 5^n a0. Since a0 = 2, we conclude an = 2 · 5^n.
35.3.
an = 5an−1 + 3 = 5(5an−2 + 3) + 3 = 5^2 an−2 + (3 · 5 + 3) = 5^2(5an−3 + 3) + (3 · 5 + 3) = 5^3 an−3 + (3 · 5^2 + 3 · 5 + 3).
As we continue to unfold, the group in parentheses will continue gaining one term at each step, and the last term will have the exponent on the 5 going up by one at a time while the subscript on the a decreases by one at a time. Eventually we will reach the expression
an = (3 + 3 · 5 + 3 · 5^2 + · · · + 3 · 5^{n−1}) + 5^n a0
= 3(1 + 5 + 5^2 + · · · + 5^{n−1}) + 5^n · 2
= 3 · (5^n − 1)/(5 − 1) + 2 · 5^n
= (11 · 5^n − 3)/4.
As a check of our work, we can use the recursive formula and the closed form formula to generate
six or so terms to see if they produce the same values (or, if we are really ambitious, we can use
induction to verify the closed form formula is correct). In any case, the recursive formula and the
closed form formula both give 2, 13, 68, 343, 1718, 8593 for the first six terms, and so we can be reasonably confident our work is okay.
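That kind of check is easy to automate; the sketch below assumes, as the unfolding above suggests, that the recursion was an = 5an−1 + 3 with a0 = 2:

# Compare the (assumed) recursive formula with the closed form (11*5**n - 3)/4.
a = [2]                          # assumed initial value a_0 = 2
for n in range(1, 6):
    a.append(5 * a[n - 1] + 3)   # assumed recursion a_n = 5a_{n-1} + 3

closed = [(11 * 5 ** n - 3) // 4 for n in range(6)]
print(a)        # [2, 13, 68, 343, 1718, 8593]
print(closed)   # the same six values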
Chapter 36
χ(x) = x^2 − x − 6 = (x + 2)(x − 3) = 0, so the general solution has the form an = α(−2)^n + β · 3^n. The initial conditions give the system
α + β = 3
−2α + 3β = 6
with solution α = 3/5 and β = 12/5. So an = (3/5)(−2)^n + (12/5)3^n.
χ(x) = x^2 − 5x + 6 = (x − 2)(x − 3) = 0, so the general solution has the form an = α · 2^n + β · 3^n. The initial conditions give the system
α + β = 4
2α + 3β = 7
with solution α = 5 and β = −1. So an = 5 · 2^n − 3^n
for n ≥ 0.
χ(x) = x^2 − 7x + 10 = (x − 2)(x − 5) = 0, so the general solution has the form an = α · 2^n + β · 5^n. The initial conditions give the system
4α + 25β = 5
8α + 125β = 13
with solution α = 1 and β = 1/25. So an = 2^n + (1/25)5^n = 2^n + 5^{n−2}.
χ(x) = x^2 − 4x + 4 = (x − 2)^2 = 0, so the general solution has the form an = α · 2^n + βn2^n. The initial conditions give the system
2α + 2β = 3
4α + 8β = 5
with solution α = 7/4 and β = −1/4. So
an = (7 · 2^n)/4 − (n · 2^n)/4 = (2^n(7 − n))/4 = 2^{n−2}(7 − n)
for n ≥ 1.
χ(x) = x^2 − 6x + 9 = (x − 3)^2 = 0, so the general solution has the form an = α · 3^n + βn3^n. The initial conditions give the system
α = 1
3α + 3β = 6
with solution α = 1 and β = 1. So an = 3^n + n3^n = (n + 1)3^n
for n ≥ 0.
36.7. This one is easily solved by inspection (that is, it is easy to guess the solution), but let's use the characteristic equation method for the practice.
χ(x) = x^2 − 1 = (x + 1)(x − 1) = 0, so the general solution has the form an = α(−1)^n + β. The initial conditions give the system
−α + β = 2
α + β = 8
with solution α = 3 and β = 5. So an = 3(−1)^n + 5
for n ≥ 1.
χ(x) = x^3 − 6x^2 + 11x − 6 = (x − 1)(x − 2)(x − 3) = 0. (You might need to review finding rational roots of polynomials, in a college algebra text or via an internet search, to refresh your memory about that factorization.) So the general solution has the form an = α + β · 2^n + γ · 3^n, and the initial conditions give the system
α + β + γ = 2
α + 2β + 3γ = 5
α + 4β + 9γ = 15
with solution α = 1, β = −1, and γ = 2. So an = 1 − 2^n + 2 · 3^n
for n ≥ 0.
χ(x) = x^2 − x − 1 = 0.
The characteristic roots (use the quadratic formula) are x = (1 ± √5)/2. To save a bit of writing, let's set r1 = (1 + √5)/2 and r2 = (1 − √5)/2. The general solution has the form fn = α r1^n + β r2^n, and the initial conditions give the system
α + β = 0
αr1 + βr2 = 1
with solution α = 1/(r1 − r2) = 1/√5 and β = −1/√5. So
fn = (1/√5)(r1^n − r2^n) = (1/√5)[((1 + √5)/2)^n − ((1 − √5)/2)^n]
for n ≥ 0. This closed form formula for the Fibonacci numbers is called Binet's Formula.
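A short numeric check (added here) compares Binet's Formula with the recursion fn = fn−1 + fn−2, f0 = 0, f1 = 1; rounding absorbs the floating-point error:

from math import sqrt

r1 = (1 + sqrt(5)) / 2
r2 = (1 - sqrt(5)) / 2

def binet(n):
    return round((r1 ** n - r2 ** n) / sqrt(5))

fib = [0, 1]
for n in range(2, 15):
    fib.append(fib[n - 1] + fib[n - 2])

print(fib)
print([binet(n) for n in range(15)])  # the two lists agree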
Chapter 37
37.1. For exercise 36.2, we know the general solution to the related homogeneous recursion is
an^(h) = α(−2)^n + β · 3^n.
That general solution of the related homogeneous recursion needs to be paired up with a particular
solution of the original recursive formula
an = an−1 + 6an−2 + 1.
Since the nonhomogeneous part of the original recursion is the constant 1, our first guess should be
that there will be a particular solution of the form an = A, a constant. Putting that guess in the
recursive formula, we get
A = A + 6A + 1, with solution A = −1/6. So the general solution is
an = α(−2)^n + β · 3^n − 1/6.
The initial conditions give the system
α + β − 1/6 = 3
−2α + 3β − 1/6 = 6
with solution α = 2/3 and β = 5/2. So the solution to the original recursive formula is
an = (2/3)(−2)^n + (5/2)(3^n) − 1/6.
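As a check of 37.1 (an added sketch; the initial conditions a0 = 3 and a1 = 6 are read off from the system above, so treat them as an assumption), the closed form reproduces the recursion an = an−1 + 6an−2 + 1:

from fractions import Fraction as F

# Recursion from 37.1 with the initial conditions implied by the system above.
a = [F(3), F(6)]
for n in range(2, 8):
    a.append(a[n - 1] + 6 * a[n - 2] + 1)

closed = [F(2, 3) * (-2) ** n + F(5, 2) * 3 ** n - F(1, 6) for n in range(8)]
print(a)
print(closed)  # the two lists agree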
37.2. For exercise 36.4, we know the general solution to the related homogeneous recursion is an^(h) = α · 2^n + β · 5^n.
That general solution of the related homogeneous recursion needs to be paired up with a particular
solution of the original recursive formula
an = 7an−1 − 10an−2 + n.
Since the nonhomogeneous part of the original recursion is n, our first guess should be that there
will be a particular solution of the form an = An + B, a general first degree expression. Putting
that guess in the recursive formula, we get
An + B = 7(A(n − 1) + B) − 10(A(n − 2) + B) + n.
Gathering all the terms on the left side of the equation gives (4A − 1)n + (4B − 13A) = 0.
That tells us 4A − 1 = 0 and −13A + 4B = 0. So, A = 1/4 and B = 13/16.
an = α · 2^n + β · 5^n + n/4 + 13/16.
The last step is using the initial conditions to determine α and β. Remember that the given initial
conditions were for n = 2, 3: a2 = 5 and a3 = 13. The system to solve is
4α + 25β + 2(1/4) + 13/16 = 5
8α + 125β + 3(1/4) + 13/16 = 13
with solution α = 7/12 and β = 13/240. Assembling the pieces, the solution is
an = (7/12)(2^n) + (13/240)(5^n) + n/4 + 13/16.
37.3. For exercise 36.5, we know the general solution to the related homogeneous recursion is
an^(h) = α · 2^n + βn2^n.
That general solution of the related homogeneous recursion needs to be paired up with a particular
solution of the original recursive formula
an = 4an−1 − 4an−2 + 2^n.
Since the nonhomogeneous part of the original recursion is 2^n, our first guess should be that there will be a particular solution of the form an = A · 2^n, a multiple of that exponential expression. Putting that guess in the recursive formula, we get
A · 2^n = 4A · 2^{n−1} − 4A · 2^{n−2} + 2^n.
Moving all the terms involving A to the left side of the equation gives
0 = 2^n,
which is impossible.
Now, hold on: that makes some sense because our general solution to the homogeneous equation already has a term of the form A · 2^n, so we won't need any more of those. Likewise, we are already accounting for terms like An2^n. So let's take our guess for a particular solution up two notches to an = An^2 2^n. Putting that guess in the recursive formula, we get
An^2 2^n = 4A(n − 1)^2 2^{n−1} − 4A(n − 2)^2 2^{n−2} + 2^n.
Moving all the terms involving A to the left side of the equation and combining terms gives
A · 2^{n+1} = 2^n and so A = 1/2.
So, a particular solution is an^(p) = (1/2)(2^n)n^2 = 2^{n−1} n^2.
The last step is using the initial conditions to determine α and β. Remember that the given initial
conditions were for n = 1, 2: a1 = 3 and a2 = 5. The system to solve is
2α + 2β + 1 = 3
4α + 8β + 8 = 5
with solution α = 11/4 and β = −7/4. Assembling the pieces, the solution is
an = (11/4)(2^n) − (7/4)(n2^n) + 2^{n−1} n^2.
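A similar added check works for 37.3, whose initial conditions a1 = 3 and a2 = 5 are stated above:

from fractions import Fraction as F

# Recursion from 37.3: a_n = 4a_{n-1} - 4a_{n-2} + 2**n, with a_1 = 3, a_2 = 5.
a = {1: F(3), 2: F(5)}
for n in range(3, 9):
    a[n] = 4 * a[n - 1] - 4 * a[n - 2] + 2 ** n

closed = {n: F(11, 4) * 2 ** n - F(7, 4) * n * 2 ** n + 2 ** (n - 1) * n ** 2
          for n in range(1, 9)}
print([a[n] for n in range(1, 9)])
print([closed[n] for n in range(1, 9)])  # the two lists agree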
37.4. For exercise 36.6, we know the general solution to the related homogeneous recursion is
an^(h) = α · 3^n + βn3^n.
That general solution of the related homogeneous recursion needs to be paired up with a particular
solution of the original recursive formula
an = 6an−1 − 9an−2 + n.
Since the nonhomogeneous part of the original recursion is n, our first guess should be that there
will be a particular solution of the form an = An + B, the general first degree polynomial in n.
Putting that guess in the recursive formula, we get
An + B = 6(A(n − 1) + B) − 9(A(n − 2) + B) + n.
Moving all the terms to the left side of the equation and combining terms gives
(4A − 1)n + (4B − 12A) = 0, which implies A = 1/4 and B = 3/4.
So, a particular solution is an^(p) = n/4 + 3/4.
an = α · 3^n + βn3^n + n/4 + 3/4.
The last step is using the initial conditions to determine α and β. The given initial conditions are
a0 = 1 and a1 = 6. The system to solve is
α + 3/4 = 1
3α + 3β + 1 = 6
with solution α = 1/4 and β = 17/12. Assembling the pieces, the solution is
an = (1/4)(3^n) + (17/12)(n3^n) + n/4 + 3/4.
37.5. For exercise 36.8, we know the general solution to the related homogeneous recursion is an^(h) = α + β · 2^n + γ · 3^n.
That general solution of the related homogeneous recursion needs to be paired up with a particular solution of the original recursive formula
an = 6an−1 − 11an−2 + 6an−3 + 2n + 1.
Since the nonhomogeneous part of the original recursion is the linear polynomial 2n + 1, our first
guess should be that there will be a particular solution of the form an = An + B, the general linear
polynomial. Putting that guess in the recursive formula, we get
Moving all the terms to the left side of the equation and combining terms gives
−2n + (2A − 1) = 0.
Well, that’s not possible, so we will need to lift the guess up a bit by multiplying by n: our new
guess for a particular solution is An^2 + Bn. Putting that guess in the recursive formula, we get
An^2 + Bn = 6(A(n − 1)^2 + B(n − 1)) − 11(A(n − 2)^2 + B(n − 2)) + 6(A(n − 3)^2 + B(n − 3)) + 2n + 1.
Moving all the terms to the left side of the equation and combining terms gives
(4A − 2)n + (2B − 16A − 1) = 0, which means A = 1/2 and B = 9/2.
So, we have a particular solution an^(p) = (1/2)n^2 + (9/2)n. Adding that to the general solution of the
related homogeneous recursion, we see the general solution to the original problem is
an = α + β · 2^n + γ · 3^n + n^2/2 + 9n/2.
The last step is using the initial conditions to determine α, β, and γ. The given initial conditions
are a0 = 2, a1 = 5, and a2 = 15. The system to solve is
α + β + γ = 2
α + 2β + 3γ + 5 = 5
α + 4β + 9γ + 11 = 15
with solution α = 8, β = −10, and γ = 4. Assembling the pieces, the solution is
an = 8 − 10 · 2^n + 4 · 3^n + n^2/2 + 9n/2.
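And one last added check, for 37.5, using the stated recursion and initial conditions:

from fractions import Fraction as F

# Recursion: a_n = 6a_{n-1} - 11a_{n-2} + 6a_{n-3} + 2n + 1, a_0 = 2, a_1 = 5, a_2 = 15.
a = [F(2), F(5), F(15)]
for n in range(3, 9):
    a.append(6 * a[n - 1] - 11 * a[n - 2] + 6 * a[n - 3] + 2 * n + 1)

closed = [8 - 10 * 2 ** n + 4 * 3 ** n + F(n * n, 2) + F(9 * n, 2)
          for n in range(9)]
print(a)
print(closed)  # the two lists agree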
Whew: I’m relieved there were only five of these annoying, tedious, tiresome, monotonous problems.
Chapter 38
AG =
0 1 0 1 0 0 0 1
1 0 1 0 0 0 1 0
0 1 0 1 0 1 0 0
1 0 1 0 1 0 0 0
0 0 0 1 0 1 0 1
0 0 1 0 1 0 1 0
0 1 0 0 0 1 0 1
1 0 0 0 1 0 1 0
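A small added sketch verifies that the matrix above (as reconstructed) is symmetric, as an adjacency matrix must be, and that every vertex has degree 3:

A = [
    [0, 1, 0, 1, 0, 0, 0, 1],
    [1, 0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0, 1],
    [1, 0, 0, 0, 1, 0, 1, 0],
]
symmetric = all(A[i][j] == A[j][i] for i in range(8) for j in range(8))
degrees = [sum(row) for row in A]
print(symmetric, degrees)  # True [3, 3, 3, 3, 3, 3, 3, 3]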
38.2. Let G∗ be the complement of G with respect to the graph K8 . Likewise, H∗ will be K8 with
the edges of H erased. It is easy to see that if G and H are isomorphic, then so are G∗ and H∗.
But G∗ is just an 8-cycle, while H∗ is two disjoint 4-cycles, and so they are not isomorphic. So G
and H are not isomorphic.
1 7 3 5
2 8 6 4
38.6. In a Hamiltonian circuit, exactly two edges must be used at each vertex. Therefore a Hamil-
tonian circuit would include the edges {a, b}, {a, j}, {b, c}, {c, d}, {d, g}, and {g, h}. But then the
edges {b, i}, {b, e}, and {e, d} would be forbidden. This leaves only the single edge {e, f } to include
e in the circuit. Conclusion: the graph G is not Hamiltonian.
Appendix B
GNU Free Documentation License
B.1 PREAMBLE
The purpose of this License is to make a manual, textbook, or other functional and useful document
”free” in the sense of freedom: to assure everyone the effective freedom to copy and redistribute
it, with or without modifying it, either commercially or noncommercially. Secondarily, this License
preserves for the author and publisher a way to get credit for their work, while not being considered
responsible for modifications made by others. This License is a kind of ”copyleft”, which means
that derivative works of the document must themselves be free in the same sense. It complements
the GNU General Public License, which is a copyleft license designed for free software. We have
designed this License in order to use it for manuals for free software, because free software needs free
documentation: a free program should come with manuals providing the same freedoms that the
software does. But this License is not limited to software manuals; it can be used for any textual
work, regardless of subject matter or whether it is published as a printed book. We recommend
this License principally for works whose purpose is instruction or reference.
1. APPLICABILITY AND DEFINITIONS This License applies to any manual or other work,
in any medium, that contains a notice placed by the copyright holder saying it can be distributed
under the terms of this License. Such a notice grants a world-wide, royalty-free license, unlimited in
duration, to use that work under the conditions stated herein. The ”Document”, below, refers to any
such manual or work. Any member of the public is a licensee, and is addressed as ”you”. You accept
the license if you copy, modify or distribute the work in a way requiring permission under copyright
law. A ”Modified Version” of the Document means any work containing the Document or a portion
of it, either copied verbatim, or with modifications and/or translated into another language. A
”Secondary Section” is a named appendix or a front-matter section of the Document that deals
exclusively with the relationship of the publishers or authors of the Document to the Document’s
overall subject (or to related matters) and contains nothing that could fall directly within that
overall subject. (Thus, if the Document is in part a textbook of mathematics, a Secondary Section
may not explain any mathematics.) The relationship could be a matter of historical connection
with the subject or with related matters, or of legal, commercial, philosophical, ethical or political
position regarding them. The ”Invariant Sections” are certain Secondary Sections whose titles are
designated, as being those of Invariant Sections, in the notice that says that the Document is released
under this License. If a section does not fit the above definition of Secondary then it is not allowed
to be designated as Invariant. The Document may contain zero Invariant Sections. If the Document
does not identify any Invariant Sections then there are none. The ”Cover Texts” are certain short
passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says
that the Document is released under this License. A Front-Cover Text may be at most 5 words,
and a Back-Cover Text may be at most 25 words. A ”Transparent” copy of the Document means a
machine-readable copy, represented in a format whose specification is available to the general public,
that is suitable for revising the document straightforwardly with generic text editors or (for images
composed of pixels) generic paint programs or (for drawings) some widely available drawing editor,
and that is suitable for input to text formatters or for automatic translation to a variety of formats
suitable for input to text formatters. A copy made in an otherwise Transparent file format whose
markup, or absence of markup, has been arranged to thwart or discourage subsequent modification
by readers is not Transparent. An image format is not Transparent if used for any substantial
amount of text. A copy that is not ”Transparent” is called ”Opaque”. Examples of suitable formats
for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input
format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML,
PostScript or PDF designed for human modification. Examples of transparent image formats
include PNG, XCF and JPG. Opaque formats include proprietary formats that can be read and
edited only by proprietary word processors, SGML or XML for which the DTD and/or processing
tools are not generally available, and the machine-generated HTML, PostScript or PDF produced
by some word processors for output purposes only. The ”Title Page” means, for a printed book, the
title page itself, plus such following pages as are needed to hold, legibly, the material this License
requires to appear in the title page. For works in formats which do not have any title page as such,
”Title Page” means the text near the most prominent appearance of the work’s title, preceding
the beginning of the body of the text. A section ”Entitled XYZ” means a named subunit of the
Document whose title either is precisely XYZ or contains XYZ in parentheses following text that
translates XYZ in another language. (Here XYZ stands for a specific section name mentioned
below, such as ”Acknowledgements”, ”Dedications”, ”Endorsements”, or ”History”.) To ”Preserve
the Title” of such a section when you modify the Document means that it remains a section ”Entitled
XYZ” according to this definition. The Document may include Warranty Disclaimers next to the
notice which states that this License applies to the Document. These Warranty Disclaimers are
considered to be included by reference in this License, but only as regards disclaiming warranties:
any other implication that these Warranty Disclaimers may have is void and has no effect on the
meaning of this License.
2. VERBATIM COPYING You may copy and distribute the Document in any medium, either
commercially or noncommercially, provided that this License, the copyright notices, and the license
notice saying this License applies to the Document are reproduced in all copies, and that you add
no other conditions whatsoever to those of this License. You may not use technical measures to
obstruct or control the reading or further copying of the copies you make or distribute. However,
you may accept compensation in exchange for copies. If you distribute a large enough number of
copies you must also follow the conditions in section 3. You may also lend copies, under the same
conditions stated above, and you may publicly display copies.
3. COPYING IN QUANTITY If you publish printed copies (or copies in media that commonly
have printed covers) of the Document, numbering more than 100, and the Document’s license notice
requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these
Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both
covers must also clearly and legibly identify you as the publisher of these copies. The front cover
must present the full title with all words of the title equally prominent and visible. You may add
other material on the covers in addition. Copying with changes limited to the covers, as long as
they preserve the title of the Document and satisfy these conditions, can be treated as verbatim
copying in other respects. If the required texts for either cover are too voluminous to fit legibly,
you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue
the rest onto adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more than 100, you must
either include a machine-readable Transparent copy along with each Opaque copy, or state in or
with each Opaque copy a computer-network location from which the general network-using public
has access to download using public-standard network protocols a complete Transparent copy of the
Document, free of added material. If you use the latter option, you must take reasonably prudent
steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent
copy will remain thus accessible at the stated location until at least one year after the last time
you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the
public. It is requested, but not required, that you contact the authors of the Document well before
redistributing any large number of copies, to give them a chance to provide you with an updated
version of the Document.
4. MODIFICATIONS You may copy and distribute a Modified Version of the Document under the
conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely
this License, with the Modified Version filling the role of the Document, thus licensing distribution
and modification of the Modified Version to whoever possesses a copy of it. In addition, you must
do these things in the Modified Version:
(1) A. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document,
and from those of previous versions (which should, if there were any, be listed in the History
section of the Document). You may use the same title as a previous version if the original
publisher of that version gives permission.
(2) B. List on the Title Page, as authors, one or more persons or entities responsible for authorship
of the modifications in the Modified Version, together with at least five of the principal authors
of the Document (all of its principal authors, if it has fewer than five), unless they release you
from this requirement.
(3) C. State on the Title page the name of the publisher of the Modified Version, as the publisher.
(4) D. Preserve all the copyright notices of the Document.
(5) E. Add an appropriate copyright notice for your modifications adjacent to the other copyright
notices.
(6) F. Include, immediately after the copyright notices, a license notice giving the public permis-
sion to use the Modified Version under the terms of this License, in the form shown in the
Addendum below.
(7) G. Preserve in that license notice the full lists of Invariant Sections and required Cover Texts
given in the Document’s license notice.
(8) H. Include an unaltered copy of this License.
(9) I. Preserve the section Entitled ”History”, Preserve its Title, and add to it an item stating at
least the title, year, new authors, and publisher of the Modified Version as given on the Title
Page. If there is no section Entitled ”History” in the Document, create one stating the title,
year, authors, and publisher of the Document as given on its Title Page, then add an item
describing the Modified Version as stated in the previous sentence.
(10) J. Preserve the network location, if any, given in the Document for public access to a Trans-
parent copy of the Document, and likewise the network locations given in the Document for
previous versions it was based on. These may be placed in the ”History” section. You may
omit a network location for a work that was published at least four years before the Document
itself, or if the original publisher of the version it refers to gives permission.
(11) K. For any section Entitled ”Acknowledgements” or ”Dedications”, Preserve the Title of the
section, and preserve in the section all the substance and tone of each of the contributor
acknowledgements and/or dedications given therein.
(12) L. Preserve all the Invariant Sections of the Document, unaltered in their text and in their
titles. Section numbers or the equivalent are not considered part of the section titles.
(13) M. Delete any section Entitled ”Endorsements”. Such a section may not be included in the
Modified Version.
(14) N. Do not re-title any existing section to be Entitled ”Endorsements” or to conflict in title
with any Invariant Section.
If the Modified Version includes new front-matter sections or appendices that qualify as Secondary
Sections and contain no material copied from the Document, you may at your option designate
some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections
in the Modified Version’s license notice. These titles must be distinct from any other section titles.
You may add a section Entitled ”Endorsements”, provided it contains nothing but endorsements
of your Modified Version by various parties–for example, statements of peer review or that the text
has been approved by an organization as the authoritative definition of a standard. You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one. The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.
5. COMBINING DOCUMENTS You may combine the Document with other documents released
under this License, under the terms defined in section 4 above for modified versions, provided
that you include in the combination all of the Invariant Sections of all of the original documents,
unmodified, and list them all as Invariant Sections of your combined work in its license notice, and
that you preserve all their Warranty Disclaimers. The combined work need only contain one copy of
this License, and multiple identical Invariant Sections may be replaced with a single copy. If there
are multiple Invariant Sections with the same name but different contents, make the title of each
such section unique by adding at the end of it, in parentheses, the name of the original author or
publisher of that section if known, or else a unique number. Make the same adjustment to the section
titles in the list of Invariant Sections in the license notice of the combined work. In the combination,
you must combine any sections Entitled ”History” in the various original documents, forming one
section Entitled ”History”; likewise combine any sections Entitled ”Acknowledgements”, and any
sections Entitled ”Dedications”. You must delete all sections Entitled ”Endorsements”.
6. COLLECTIONS OF DOCUMENTS You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License
in the various documents with a single copy that is included in the collection, provided that you
follow the rules of this License for verbatim copying of each of the documents in all other respects.
You may extract a single document from such a collection, and distribute it individually under this
License, provided you insert a copy of this License into the extracted document, and follow this
License in all other respects regarding verbatim copying of that document.
7. AGGREGATION WITH INDEPENDENT WORKS A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage
or distribution medium, is called an ”aggregate” if the copyright resulting from the compilation
is not used to limit the legal rights of the compilation’s users beyond what the individual works
permit. When the Document is included in an aggregate, this License does not apply to the other
works in the aggregate which are not themselves derivative works of the Document. If the Cover
Text requirement of section 3 is applicable to these copies of the Document, then if the Document is
less than one half of the entire aggregate, the Document’s Cover Texts may be placed on covers that
bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is
in electronic form. Otherwise they must appear on printed covers that bracket the whole aggregate.
8. TRANSLATION Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations
requires special permission from their copyright holders, but you may include translations of some
or all Invariant Sections in addition to the original versions of these Invariant Sections. You may
include a translation of this License, and all the license notices in the Document, and any Warranty
Disclaimers, provided that you also include the original English version of this License and the
original versions of those notices and disclaimers. In case of a disagreement between the translation
and the original version of this License or a notice or disclaimer, the original version will prevail.
If a section in the Document is Entitled ”Acknowledgements”, ”Dedications”, or ”History”, the
requirement (section 4) to Preserve its Title (section 1) will typically require changing the actual
title.
9. TERMINATION You may not copy, modify, sublicense, or distribute the Document except
as expressly provided for under this License. Any other attempt to copy, modify, sublicense or
distribute the Document is void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under this License will not have
their licenses terminated so long as such parties remain in full compliance.
10. FUTURE REVISIONS OF THIS LICENSE The Free Software Foundation may publish new,
revised versions of the GNU Free Documentation License from time to time. Such new versions
will be similar in spirit to the present version, but may differ in detail to address new problems or
concerns.
See http://www.gnu.org/copyleft/. Each version of the License is given a distinguishing version
number. If the Document specifies that a particular numbered version of this License ”or any
later version” applies to it, you have the option of following the terms and conditions either of
that specified version or of any later version that has been published (not as a draft) by the Free
Software Foundation. If the Document does not specify a version number of this License, you may
choose any version ever published (not as a draft) by the Free Software Foundation.
ADDENDUM: How to use this License for your documents To use this License in a document you
have written, include a copy of the License in the document and put the following copyright and
license notices just after the title page:
Copyright (c) YEAR YOUR NAME. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled ”GNU Free Documentation License”.
If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the ”with...Texts.” line with this: with the Invariant Sections being LIST THEIR TITLES, with the Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST.
If you have Invariant Sections without Cover Texts, or some other combination of the three, merge
those two alternatives to suit the situation. If your document contains nontrivial examples of
program code, we recommend releasing these examples in parallel under your choice of free software
license, such as the GNU General Public License, to permit their use in free software.