2025 Lent Part II Logic and Set Notes
András Zsák
Lent 2025
Contents
1 Propositional Logic
5 Set Theory
6 Cardinal Arithmetic
1 Propositional Logic
(i) P ⊂ L,
(iii) if p, q ∈ L then (p ⇒ q) ∈ L.
Remarks. 1. ‘L defined inductively’ means, more precisely, that L = ⋃_{n∈N} L_n,
where
L_1 = P ∪ {⊥}
and for n ∈ N,
L_{n+1} = L_n ∪ {(p ⇒ q) : p, q ∈ L_n} .
Semantic Entailment
Definition. A valuation on L is a function v : L → {0, 1} such that:
(i) v(⊥) = 0,
(ii) v(p ⇒ q) = 0 if v(p) = 1 and v(q) = 0, and v(p ⇒ q) = 1 otherwise,
for all p, q ∈ L.
Example. If v(p1) = 1 and v(p2) = 0, then v((⊥ ⇒ p1) ⇒ (p1 ⇒ p2)) = 0.
Proposition 1.
(i) If v and v′ are valuations with v|P = v′|P, then v = v′.
(ii) For any function w : P → {0, 1}, there exists a valuation v on L such that
v|P = w.
Definition. Say t ∈ L is a tautology if v(t) = 1 for all valuations v.
Definition. At this point, we enrich our language by adding the symbols ⊤
(‘true’ or ‘top’), ∧ (‘and’), ∨ (‘or’), ¬ (‘not’) and ⇔ (‘iff’) as abbreviations as
follows.
⊤ = (⊥ ⇒ ⊥)
¬p = (p ⇒ ⊥)
(p ∨ q) = (¬p ⇒ q)
(p ∧ q) = ¬(p ⇒ ¬q)
(p ⇔ q) = ((p ⇒ q) ∧ (q ⇒ p))
for any p, q ∈ L. We have v(⊤) = 1 for any valuation v, and so ⊤ is a tautology.
Similarly, v(¬p), v(p ∨ q) and v(p ∧ q) have the expected values.
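Since a valuation is determined by its values on the primitive propositions, whether a given t is a tautology can be checked by a finite truth table. The following Python sketch illustrates this; the tuple encoding of propositions and all function names are assumptions made for the illustration, not notation from these notes.

```python
from itertools import product

BOT = ("bot",)                      # the symbol ⊥

def var(name):                      # a primitive proposition
    return ("var", name)

def imp(p, q):                      # the proposition (p ⇒ q)
    return ("imp", p, q)

# The abbreviations, unfolded exactly as in the definition above.
TOP = imp(BOT, BOT)

def neg(p):
    return imp(p, BOT)

def lor(p, q):
    return imp(neg(p), q)

def land(p, q):
    return neg(imp(p, neg(q)))

def primitives(t):
    """Set of names of primitive propositions occurring in t."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "bot":
        return set()
    return primitives(t[1]) | primitives(t[2])

def value(t, w):
    """Value at t of the unique valuation extending w : P -> {0, 1}."""
    if t[0] == "bot":
        return 0
    if t[0] == "var":
        return w[t[1]]
    return 0 if value(t[1], w) == 1 and value(t[2], w) == 0 else 1

def is_tautology(t):
    """Check v(t) = 1 for every assignment to the primitives occurring in t."""
    names = sorted(primitives(t))
    return all(value(t, dict(zip(names, bits))) == 1
               for bits in product((0, 1), repeat=len(names)))

p, q, r = var("p"), var("q"), var("r")
example = imp(imp(BOT, var("p1")), imp(var("p1"), var("p2")))
```

For instance, value(example, {"p1": 1, "p2": 0}) evaluates the proposition from the Example above, and is_tautology(imp(neg(neg(p)), p)) checks ¬¬p ⇒ p.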
Examples. The following three examples of tautologies will play a role later.
so a tautology.
3. (p ⇒ (q ⇒ r)) ⇒ ((p ⇒ q) ⇒ (p ⇒ r))
Suppose this is not a tautology. Then there is a valuation v with
v(p ⇒ (q ⇒ r)) = 1 and v((p ⇒ q) ⇒ (p ⇒ r)) = 0
Examples. {p, p ⇒ q} ⊨ q and {p ⇒ q, q ⇒ r} ⊨ (p ⇒ r).
Syntactic Entailment
(A3) ¬¬p ⇒ p (p ∈ L)
These are more accurately called ‘axiom-schemes’, as each is an infinite collec-
tion of axioms.
We will have only one deduction rule, called modus ponens: ‘from p and p ⇒ q,
we can deduce q’.
(iii) or ti follows from earlier lines by modus ponens (MP): there exist j, k < i
with tk = (tj ⇒ ti ).
Examples. 1. {p ⇒ q, q ⇒ r} ⊢ (p ⇒ r)
(p ⇒ (q ⇒ r)) ⇒ ((p ⇒ q) ⇒ (p ⇒ r)) (A2)
(q ⇒ r) ⇒ (p ⇒ (q ⇒ r)) (A1)
q ⇒ r (premiss)
p ⇒ (q ⇒ r) (MP)
(p ⇒ q) ⇒ (p ⇒ r) (MP)
p ⇒ q (premiss)
p ⇒ r (MP)
2. ⊢ (p ⇒ p)
p ⇒ ((p ⇒ p) ⇒ p) (A1)
(p ⇒ ((p ⇒ p) ⇒ p)) ⇒ ((p ⇒ (p ⇒ p)) ⇒ (p ⇒ p)) (A2)
(p ⇒ (p ⇒ p)) ⇒ (p ⇒ p) (MP)
p ⇒ (p ⇒ p) (A1)
p ⇒ p (MP)
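Whether a sequence of lines is a proof is a purely mechanical matter, so the two examples above can be checked by a short program. The Python sketch below is an illustration only: the tuple encoding and the function names are assumptions, and the axioms are taken in the forms (A1) p ⇒ (q ⇒ p), (A2) (p ⇒ (q ⇒ r)) ⇒ ((p ⇒ q) ⇒ (p ⇒ r)) and (A3) ¬¬p ⇒ p.

```python
BOT = ("bot",)

def var(name):
    return ("var", name)

def imp(p, q):
    return ("imp", p, q)

def neg(p):
    return imp(p, BOT)

def is_imp(t):
    return t[0] == "imp"

def is_a1(t):
    # t has the shape p ⇒ (q ⇒ p)
    return is_imp(t) and is_imp(t[2]) and t[2][2] == t[1]

def is_a2(t):
    # t has the shape (p ⇒ (q ⇒ r)) ⇒ ((p ⇒ q) ⇒ (p ⇒ r))
    if not (is_imp(t) and is_imp(t[1]) and is_imp(t[1][2])):
        return False
    p, q, r = t[1][1], t[1][2][1], t[1][2][2]
    return t[2] == imp(imp(p, q), imp(p, r))

def is_a3(t):
    # t has the shape ¬¬p ⇒ p
    return is_imp(t) and t[1] == neg(neg(t[2]))

def check_proof(lines, premisses=()):
    """True if every line is an axiom, a premiss, or follows by modus ponens."""
    for i, t in enumerate(lines):
        earlier = lines[:i]
        by_mp = any(imp(s, t) in earlier for s in earlier)
        if not (is_a1(t) or is_a2(t) or is_a3(t) or t in premisses or by_mp):
            return False
    return True

p, q, r = var("p"), var("q"), var("r")
pp = imp(p, p)

proof_chain = [                      # Example 1
    imp(imp(p, imp(q, r)), imp(imp(p, q), imp(p, r))),
    imp(imp(q, r), imp(p, imp(q, r))),
    imp(q, r),
    imp(p, imp(q, r)),
    imp(imp(p, q), imp(p, r)),
    imp(p, q),
    imp(p, r),
]

proof_identity = [                   # Example 2
    imp(p, imp(pp, p)),
    imp(imp(p, imp(pp, p)), imp(imp(p, pp), pp)),
    imp(imp(p, pp), pp),
    imp(p, pp),
    pp,
]
```

Here check_proof(proof_chain, premisses=[imp(p, q), imp(q, r)]) and check_proof(proof_identity) both succeed, while dropping the premisses from the first call makes the check fail at the line q ⇒ r.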
Proof. Assume that S ⊢ (p ⇒ q). Write down a proof of p ⇒ q from S and add
the following two lines to obtain a proof of q from S ∪ {p}:
p (premiss)
q (MP)
ti ⇒ (p ⇒ ti ) (A1)
ti (axiom or premiss)
p ⇒ ti (MP)
is a proof of p ⇒ ti from S.
Case 2: If ti = p, then S ⊢ (p ⇒ ti) since ⊢ (p ⇒ p) (Example 2 above).
Case 3: Finally, if there exist j, k < i such that tk = (tj ⇒ ti), then by induction
hypothesis, there are proofs of p ⇒ tj and p ⇒ (tj ⇒ ti) from S. Adding the
lines
(p ⇒ (tj ⇒ ti)) ⇒ ((p ⇒ tj) ⇒ (p ⇒ ti)) (A2)
(p ⇒ tj) ⇒ (p ⇒ ti) (MP)
p ⇒ ti (MP)
Soundness says that our notion of proof is sound: it doesn’t lead to absurd
conclusions. Adequacy says that our notion of proof is sufficiently strong to
prove from S every semantic consequence of S.
S is deductively closed: if S ⊢ t, then t ∈ S. Indeed, if t ∉ S, then ¬t ∈ S. It
follows that S proves both t and ¬t, and hence S ⊢ ⊥, contradicting that S is
consistent.
We now define v : L → {0, 1} by
v(t) = 1 if S ⊢ t (or equivalently, if t ∈ S), and v(t) = 0 otherwise.
We show that v is a valuation on L. It will then follow that v is a model of S
completing the proof.
Firstly, v(⊥) = 0 since ⊥ ∉ S as S is consistent. Next, we examine v(p ⇒ q)
for arbitrary p, q ∈ L.
Case 1: v(p) = 1 and v(q) = 0, i.e., p ∈ S and q ∉ S. We need to show that
v(p ⇒ q) = 0. If not, then (p ⇒ q) ∈ S. Then the sequence
p (premiss)
p⇒q (premiss)
q (MP)
Proof. If S ⊨ t, then S ∪ {¬t} ⊨ ⊥. Hence by Theorem 4, we have S ∪ {¬t} ⊢ ⊥,
which in turn implies S ⊢ ¬¬t by the Deduction Theorem. We now obtain a
proof of t from S by adding the following lines to a proof of ¬¬t from S.
¬¬t ⇒ t (A3)
t (MP).
Remark. If S ⊢ t, then a proof can be found in finite time: by writing out all
proofs from S, we will eventually arrive at t. However, this algorithm does not
terminate if S ⊬ t.
2 Well-Orderings and Ordinals
A linear order or total order on a set X is a relation < on X that is
(i) irreflexive: ¬(x < x) for all x ∈ X.
(ii) transitive: (x < y) ∧ (y < z) ⇒ (x < z) for all x, y, z ∈ X.
Note. In (iii) exactly one of the three possibilities holds. E.g., if x < y and
y < x, then x < x by (ii), which contradicts (i).
Note. For a set X of size at least 2, the relation on the power set PX of X (the
set of all subsets of X) defined by a < b if a ⊂ b and a ≠ b is not trichotomous.
Note that a ⊂ b means: (x ∈ a) ⇒ (x ∈ b) for all x ∈ X, which includes the
case a = b.
Notation. We write ‘x > y’ for ‘y < x’ and ‘x ≤ y’ for ‘x < y or x = y’. Then
the relation ≤ is
(i) reflexive: x ≤ x for all x ∈ X
(ii) antisymmetric: (x ≤ y) ∧ (y ≤ x) ⇒ (x = y) for all x, y ∈ X
(iii) transitive: (x ≤ y) ∧ (y ≤ z) ⇒ (x ≤ z) for all x, y, z ∈ X
We say f is an order-isomorphism. Note that f⁻¹ is also an order-isomorphism,
and thus x < y ⇐⇒ f(x) < f(y).
The ‘base case’ (p(x) for the least element x of X) is included in the assumption
‘(∀ y < x)p(y) ⇒ p(x)’.
Proof. If S ≠ X, then X \ S has a least element x, say. For y < x, we have
y ∈ S by choice of x. It follows from the assumption on S that x ∈ S –
contradiction.
Note. This is false in general for linearly ordered sets. E.g., from Z → Z we
have n ↦ n or n ↦ n + 17; from [0, ∞) → [0, ∞) we have x ↦ x or x ↦ x².
Proof. Let f, g : X → Y be order-isomorphisms. We show (∀ x)(f (x) = g(x))
by induction.
Let x ∈ X and assume that f (y) = g(y) for all y < x (the induction hypothesis).
We need to show that f (x) = g(x), which will then complete the induction. By
Lemma 1, f (x) is the least element of A = Y \ {f (y) : y < x}, and g(x) is the
least element of B = Y \ {g(y) : y < x}. By the induction hypothesis, A = B,
and hence f (x) = g(x).
The domain of f is the union ⋃{dom(h) : h is an attempt}, and thus dom(f)
is an initial segment of X.
For any x ∈ dom(f), there is an attempt h such that f(x) = h(x). It follows
that f(y) = h(y) for all y < x, and hence f(x) = h(x) = G(h|Ix) = G(f|Ix).
Thus, f is an attempt.
Finally, assume that dom(f) ≠ X. Then dom(f) = Ix for some x ∈ X. In
particular, there is no attempt defined at x. However, f ∪ {(x, G(f))} is an
attempt defined at x, a contradiction. Thus, f is defined on the whole of X.
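For a finite well-ordered set the argument above becomes an algorithm: the function is built one element at a time, and at each stage the part built so far is an attempt. The following Python sketch is an illustration only; the names define_by_recursion and G_example are hypothetical, and a list given in increasing order stands in for X.

```python
def define_by_recursion(X, G):
    """X: list of distinct elements in increasing order (a finite well-ordered set).
    G: takes the restriction of f to an initial segment I_x, given as a dict."""
    f = {}
    for x in X:
        # Here f is an attempt whose domain is I_x = {y : y < x}.
        f[x] = G(dict(f))
    return f

def G_example(h):
    # Hypothetical rule, chosen only for illustration:
    # one more than the sum of all values taken so far.
    return 1 + sum(h.values())

f = define_by_recursion([0, 1, 2, 3], G_example)
```

With this choice of G the values double at each step, and for every x the value f(x) agrees with G applied to f restricted to Ix.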
We first show that the ‘otherwise’ clause never arises by showing that f(x) ≤ x
for all x ∈ X. Indeed, fix x ∈ X and assume that f(y) ≤ y holds for all y ∈ X
with y < x. Then x ∈ Y \ {f(y) : y ∈ X, y < x}, and hence f(x) ≤ x. The
claim follows by induction.
Given y < x in X, since
f (x) ∈ Y \ {f (z) : z ∈ X, z < x} ⊂ Y \ {f (z) : z ∈ X, z < y} ,
it follows that f (y) < f (x). Thus, f is order-preserving.
Finally, assume that a ∈ Y \ im(f ). We show by induction that f (x) < a for all
x ∈ X, which shows that im(f ) is an initial segment of Y . Fix x ∈ X and assume
that f (y) < a for all y ∈ X with y < x. Then a ∈ Y \ {f (y) : y ∈ X, y < x},
and thus f (x) < a, as required.
If the ‘otherwise’ clause ever arises, then let x be the least element of X for
which this happens. Then f (Ix ) = Y and for y < x the ‘otherwise’ clause does
not arise in the definition of f (y). It follows as in the proof of Proposition 5
that f is an order-isomorphism from Ix to Y, contradicting Y ≰ X. Hence
the ‘otherwise’ clause never arises, and then it follows again as in the proof
of Proposition 5 that f is an order-isomorphism from X to an initial segment
of Y .
Remark. What the above shows is that ‘≤’ is a linear order (reflexive, antisymmetric,
transitive and trichotomous) on the collection of well-ordered sets
provided we identify order-isomorphic sets. (We haven’t shown transitivity but
that is straightforward.) It is natural to introduce the corresponding ‘<’ sign
as follows. For well-ordered sets X, Y, write X < Y to mean ‘X ≤ Y and X
not order-isomorphic to Y’. Equivalently, X < Y if and only if X is order-isomorphic
to a proper initial segment of Y. Then ‘<’ is irreflexive, transitive
and (regarding order-isomorphic well-ordered sets the same) trichotomous. A
natural question arises: Is the collection of all well-ordered sets a well-ordered
set? We return to this question later in the chapter, but first show how to build
new well-ordered sets from old ones.
Note. For any set X, there is always some z not in X, for example, because
there is no surjection from X to the power set PX by Cantor’s diagonal argu-
ment.
Given a non-empty subset S ⊂ X, we have S ∩ Xi ≠ ∅ for some i ∈ I. Since Xi
is well-ordered, S ∩ Xi has a least element x. Since Xi is an initial segment of
X, it follows that x is a least element of S.
Remark. The same result holds without the assumption that the Xi are nested
(see Chapter 5). The well-ordered set X constructed in the proof above is in
fact a least upper bound for {Xi : i ∈ I}.
Ordinals
Definitions. An ordinal is a well-ordered set with two ordinals regarded the
same if they are order-isomorphic. The order-type of a well-ordered set X is the
unique ordinal to which X is order-isomorphic.
Remark. The formal definition of ordinal will be given in Chapter 5. For now
you can view the word ordinal as shorthand for identifying well-ordered sets
that are order-isomorphic. The results in this chapter can be expressed purely
in terms of well-ordered sets.
Note. The notions above are well-defined, i.e., they don’t depend on the choice
of X and Y. For arbitrary ordinals α, β, we have either α ≤ β or β ≤ α
(Theorem 6), and if α ≤ β and β ≤ α, then α = β (Proposition 7).
I = {order-type(Y) : Y ∈ X′}
which consists precisely of all ordinals strictly less than α. It is linearly ordered
by <, and moreover, the map Y ↦ order-type(Y) is an order-isomorphism
X′ → I, and hence I is well-ordered of order-type α.
Note. It is natural to denote the set of ordinals {β : β < α} by Iα . It is a
well-ordered set of order-type α.
0, 1, 2, 3, …, ω, ω + 1 (officially ω⁺), ω + 2 (officially ω⁺⁺), ω + 3, …,
ω + ω = ω·2 (officially sup{ω, ω + 1, ω + 2, …}), ω·2 + 1, ω·2 + 2, ω·2 + 3,
…, ω·3, …, ω·4, …, ω·5, …, ω·ω = ω^2 (officially sup{ω, ω·2, ω·3, …}), ω^2 + 1,
ω^2 + 2, …, ω^2 + ω, ω^2 + ω + 1, …, ω^2 + ω·2, …, ω^2 + ω·3, …, ω^2 + ω^2 = ω^2·2,
ω^2·2 + 1, …, ω^2·2 + ω, …, ω^2·3, …, ω^2·4, …, ω^3, …, ω^3·2, …, ω^3·3, …,
ω^4, …, ω^5, …, ω^ω (officially sup{ω, ω^2, ω^3, …}), …, ω^ω·2, …, ω^ω·3, …,
ω^ω·ω = ω^(ω+1), …, ω^(ω+2), …, ω^(ω+3), …, ω^(ω·2), …, ω^(ω·3), …, ω^(ω·4), …, ω^ω^2 = ω^(ω^2),
…, ω^(ω^2·2), …, ω^(ω^2·3), …, ω^(ω^3), …, ω^(ω^4), …, ω^(ω^ω), …, ω^(ω^(ω^2)), …, ω^(ω^(ω^3)), …, ω^(ω^(ω^ω)),
…, ε_0 = ω^ω^ω^⋰ (officially sup{ω, ω^ω, ω^ω^ω, …}), ε_0 + 1, ε_0 + 2, …, ε_0 + ω,
…, ε_0 + ε_0 = ε_0·2, …, ε_0·3, …, ε_0·ω, …, ε_0·ε_0 = ε_0^2, …, ε_0^3, …, ε_0^ω, …, ε_0^(ω^2),
…, ε_0^(ω^3), …, ε_0^(ω^ω), …, ε_0^(ω^ω^⋰) = ε_0^ε_0, …, ε_0^(ε_0^ε_0), …, ε_0^(ε_0^(ε_0^ε_0)), …, ε_1 = ε_0^ε_0^ε_0^⋰, …, ε_2,
…, ε_3, …, ε_ω, …, ε_(ε_0), …, ε_(ε_(ε_0)), …, ε_(ε_(ε_⋰)), …
Note. All the ordinals above are countable! By this we mean, of course, that
they are order-types of countable well-ordered sets.
Questions. Does there exist an uncountable ordinal? I.e., does there exist an
uncountable well-ordered set? Can we well-order R?
Idea. If there is an uncountable ordinal, then there is a least one, α say. Then
Iα is the set of countable ordinals, i.e., the set of order-types of well-orderings
of subsets of N.
Proof. We can form the set
B = {order-type(X) : X ∈ A}
Note. The ordinal ω1 constructed in the proof above is the least uncountable
ordinal. (Indeed, if α < ω1 , then α < β for some β ∈ B, and so α is countable.)
It follows that every proper initial segment of ω1 is countable. If α1, α2, … are
countable ordinals, then so is sup{α1, α2, …} being the order-type of ⋃_{n∈N} Iαn.
(Here we are using the fact that a countable union of countable sets is countable.)
Notation. The least ordinal that does not inject into X is denoted γ(X).
Ordinal Arithmetic
Ordinal addition. For ordinals α, β, we define α + β by recursion on β, with
α fixed, as follows.
α + 0 = α
α + β⁺ = (α + β)⁺
α + λ = sup{α + β : β < λ} for a non-zero limit ordinal λ .
Remark. Technically, since the ordinals do not form a set, we need to fix an
ordinal γ and define α + β for β < γ by recursion in the well-ordered set Iγ . The
definition is then independent of γ by uniqueness of recursion. This justifies the
recursive definition above and others given below.
In a similar way, induction on ordinals works even though ordinals do not form
a set. Indeed, assume p is a property of ordinals. Then
(∀ α)((∀ β < α)p(β) ⇒ p(α)) =⇒ (∀ α)p(α)
since otherwise the assumption (∀ α)((∀ β < α)p(β) ⇒ p(α)) holds, but for some
γ we have ¬p(γ). Then the non-empty set S = {β ≤ γ : ¬p(β)} has a least
element α. By minimality, we then have (∀ β < α)p(β) which implies p(α) by
our assumption, which contradicts α ∈ S.
γ = 0: if β ≤ γ, then β = 0, so α + β = α + γ = α.
Remark. It follows directly from the result above that β < γ implies α + β <
α + γ. Indeed, we have β⁺ ≤ γ, and hence
α + β < (α + β)⁺ = α + β⁺ ≤ α + γ .
Note, however, that β < γ does not imply β + α < γ + α in general. For
example, 1 < 2 but 1 + ω = 2 + ω = ω. On the other hand, β ≤ γ does imply
β + α ≤ γ + α (by induction on α).
α + sup S = sup{α + β : β ∈ S}
To show the reverse inequality, first assume that S has a greatest element γ.
Then γ = sup S, and α + γ is also the greatest element of T by Proposition 14.
It follows that sup T = α + γ = α + sup S.
Now assume that S has no greatest element. Then λ = sup S is a non-zero limit
ordinal. Indeed, λ ∉ S as S has no greatest element, and so S ⊂ Iλ, which
implies that λ = sup S ≤ sup Iλ. It follows that λ = sup Iλ and λ is a limit
ordinal (λ > 0 since S ≠ ∅). By definition of ordinal addition, we have
Proposition 16. α + (β + γ) = (α + β) + γ.
Proof. We proceed by induction on γ with α, β fixed. As usual, there are three
cases.
γ = 0: α + (β + 0) = α + β = (α + β) + 0.
α + (β + γ) = α + sup{β + δ : δ < γ}
= sup{α + (β + δ) : δ < γ}
= sup{(α + β) + δ : δ < γ} = (α + β) + γ .
Remark. The definition of ordinal addition we gave above is called the inductive
definition. We now give an alternative.
β = 0: α + 0 = α = order-type(α ⊔ 0) = α ∔ 0.
For δ < β, α ∔ δ is the order-type of α ⊔ δ, and the supremum of the nested set
{α ⊔ δ : δ < β} of well-ordered sets is ⋃_{δ<β} (α ⊔ δ) = α ⊔ β which has order-type
α ∔ β. This completes the proof that α + β = α ∔ β in this case.
α · 0 = 0
α · β⁺ = α · β + α
α · λ = sup{α · δ : δ < λ} for a non-zero limit ordinal λ .
For the synthetic definition, we first define the product X × Y of well-ordered
sets to be their Cartesian product well-ordered as follows:
(x, y) < (u, v) ⇐⇒ either y < v in Y, or y = v and x < u in X.
We then define α·β to be the order-type of α×β or, more precisely, the order-type
of X × Y , where X is a well-ordered set of order-type α and Y is a well-ordered
set of order-type β. It is then straightforward to verify (by induction on β) that
the two definitions coincide.
ω · 2 = ω · 1⁺ = ω · 1 + ω = ω · 0⁺ + ω = (ω · 0 + ω) + ω = ω + ω.
β ≤ γ ⇒ αβ ≤ αγ.
β ≤ γ ⇒ βα ≤ γα.
Note, however, that the last inequality cannot be strengthened. E.g., 1 < 2 but
1 · ω = 2 · ω = ω.
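The failure of commutativity can be explored by computing with ordinals below ω^ω, each written as a finite sum of terms ω^e · c with natural numbers e and c > 0 and strictly decreasing exponents. The Python sketch below is an illustration only: the representation, the function names, and the closed-form rules used for sums and products of such expressions are standard facts assumed here rather than derived in these notes.

```python
# An ordinal below ω^ω is a tuple of (exponent, coefficient) pairs with
# strictly decreasing exponents and positive coefficients; () is 0.
ZERO = ()
OMEGA = ((1, 1),)

def nat(n):
    """The natural number n as an ordinal."""
    return ((0, n),) if n > 0 else ZERO

def add(a, b):
    """Ordinal sum a + b: terms of a below the leading term of b are absorbed."""
    if not b:
        return a
    e, c = b[0]
    head = tuple(t for t in a if t[0] > e)
    same = sum(co for ex, co in a if ex == e)
    return head + ((e, same + c),) + b[1:]

def mul(a, b):
    """Ordinal product a · b, using a · (β + γ) = a · β + a · γ term by term."""
    if not a or not b:
        return ZERO
    e1, c1 = a[0]
    result = ZERO
    for e, c in b:
        if e > 0:
            term = ((e1 + e, c),)            # a · (ω^e · c) = ω^(e1 + e) · c
        else:
            term = ((e1, c1 * c),) + a[1:]   # a · c for a natural number c > 0
        result = add(result, term)
    return result
```

For example, add(nat(1), OMEGA) and mul(nat(2), OMEGA) both return OMEGA, whereas add(OMEGA, nat(1)) and mul(OMEGA, nat(2)) do not.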
α^0 = 1
α^(β⁺) = α^β · α
α^λ = sup{α^δ : δ < λ} for a non-zero limit ordinal λ .
Fact. Every separable Banach space embeds isometrically into the separable
Banach space C[0, 1] of continuous functions on [0, 1] with the uniform norm.
Thus, the class SB of all separable Banach spaces has a universal element: there
is a member Z of SB that contains isomorphic (or, in this case, even isometric)
copies of every other member of the class.
(i) Sz(X) ≤ ω1 and furthermore, Sz(X) < ω1 if and only if the dual space X*
of X is separable;
(ii) if the Banach space X isomorphically embeds into the Banach space Y,
then Sz(X) ≤ Sz(Y);
(iii) for every countable ordinal α, there exists a separable, reflexive Banach
space Xα such that Sz(Xα) > α.
It follows immediately that the answer to the question posed above is ‘no’.
Indeed, if Z is a universal member of the class SR, then each Xα embeds isomorphically
into Z. Then by property (ii) above, Sz(Z) > α for all countable
ordinals α. Thus, Sz(Z) = ω1, which implies that Z* is not separable by property (i).
Since Z is reflexive, Z** = Z is separable, and hence Z* is separable (as
the dual of a space is always at least as ‘big’ as the space itself), a contradiction.
3 Posets and Zorn’s Lemma
Definition. A partially ordered set or poset is a set with a partial order on it.
[Figure: Hasse diagrams of the power sets of {1, 2} and of {1, 2, 3}, ordered by inclusion.]
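[Figure: Hasse diagram of a poset with elements a, b, c, d, e.] If the ‘height’ of a, c, d, e is 0, 1, 2, 3, respectively, then what should be the ‘height’ of b?
7. [Figure: Hasse diagram of a poset with elements a, b, c in the bottom row and d, e in the top row.] There need not be any relation between different parts.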
5. In the poset consisting of four pairwise incomparable elements a, b, c, d, the whole set {a, b, c, d} is an antichain.
Definition. Let X be a poset, S ⊂ X and x ∈ X.
x is an upper bound for S if y ≤ x for all y ∈ S.
x is a least upper bound or supremum for S if x is an upper bound for S and
x ≤ y for every upper bound y for S.
If a supremum for S exists, it is unique and is denoted sup S or ⋁S.
Examples. 1. If S ⊂ PX, then sup S = ⋃{A : A ∈ S}.
2. In R, sup(0, 1) = 1 and sup[0, 1] = 1.
3. In Q, the set {x : x² < 2} has an upper bound, e.g., 2, but it has no
supremum.
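4. In the poset [Figure: Hasse diagram on {a, b, c, d, e} with a at the bottom and e at the top], sup{a, b, c} = e.
5. In the poset [Figure: Hasse diagram with a, b in the bottom row and c, d in the top row, each of a, b lying below each of c, d], {a, b} has upper bounds c, d but {a, b} has no supremum.
6. In the poset [Figure: Hasse diagram with a, b, c in the bottom row and d, e in the top row], {b, c} has no upper bounds.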
Note. If X is complete, then X has greatest element sup X and least element
sup ∅. In particular, X is non-empty.
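In a finite poset, upper bounds and suprema can be found by direct search. The following Python sketch is an illustration only; the function names are hypothetical. It encodes the four-element poset of Example 5 (a and b each below both c and d, with no other relations) and the power set of {1, 2, 3} ordered by inclusion.

```python
from itertools import combinations

def upper_bounds(elements, leq, subset):
    """All x with y ≤ x for every y in subset."""
    return {x for x in elements if all(leq(y, x) for y in subset)}

def supremum(elements, leq, subset):
    """The least upper bound of subset, or None if there is none."""
    ubs = upper_bounds(elements, leq, subset)
    least = [x for x in ubs if all(leq(x, y) for y in ubs)]
    return least[0] if least else None

# Example 5: a, b lie below both c and d, and there are no other relations.
elements = {"a", "b", "c", "d"}
strict = {("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")}

def leq(x, y):
    return x == y or (x, y) in strict

# The power set of {1, 2, 3}, ordered by inclusion.
power_set = {frozenset(c) for k in range(4) for c in combinations([1, 2, 3], k)}

def subset_leq(a, b):
    return a <= b
```

Here upper_bounds(elements, leq, {"a", "b"}) returns {"c", "d"} while supremum(elements, leq, {"a", "b"}) returns None. In the power set, the supremum of a family is its union, and the supremum of the empty family is the least element ∅.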
Examples. 1. f : N → N, f (n) = n + 1.
2. f : P(S) → P(S), f (A) = A ∪ B, where B ⊂ S is fixed.
Note. If f : X → Y is order-preserving and injective, then x < y implies
f (x) < f (y). The converse holds if X is linearly ordered.
Zorn’s Lemma
Definition. Let X be a poset. Say x ∈ X is a maximal element of X if x ≤ y
implies x = y for every y ∈ X, i.e., there is no y in X such that x < y.
Proof. Assume that X has no maximal elements. Then for every x ∈ X we
can fix an element x′ ∈ X with x < x′. Let us also fix, for every chain C, an
upper bound u(C) for C in X. Let γ = γ(X) (from Hartogs’ lemma) and define
f : γ → X by recursion as follows:
f(0) = u(∅)
f(α + 1) = f(α)′
f(λ) = u({f(α) : α < λ}) for a non-zero limit ordinal λ
An easy induction (on β with α fixed) shows that f(α) < f(β) for all α < β. It
follows that f is injective, contradicting the choice of γ.
Remark. Technically, the definition of f (λ) above is only valid when {f (α) :
α < λ} is a chain, otherwise we should define f (λ) differently, e.g., we could set
f (λ) = u(∅) in that case. Then an easy induction shows that the ‘otherwise’
clause never arises.
We next use Zorn’s Lemma to complete the proof of the Model Existence Lemma
from Chapter 1 without the assumption that the set of primitive propositions
is countable.
Theorem 5. Let P be any set of primitive propositions. Let S ⊂ L = L(P) be
consistent. Then there exists S̄ ⊂ L such that S ⊂ S̄ and for all t ∈ L, either
t ∈ S̄ or ¬t ∈ S̄.
Axiom of Choice (AC). This is the assertion that for every set X of non-empty
sets, X = {Ai : i ∈ I}, there is a function f : I → ⋃_{i∈I} Ai with f(i) ∈ Ai
for all i ∈ I, called a choice function for X.
Note. This rule differs in character from other rules for building sets (e.g.,
union, power set) in that the object whose existence it asserts is not unique.
It is therefore often of interest whether a result whose proof uses AC can be
proved without AC. We show in a moment that ZL and WO both need AC.
Theorem 7. AC ⇐⇒ ZL ⇐⇒ WO
Proof. We have already proved the implications AC⇒ZL (Theorem 3) and
ZL⇒WO (Theorem 6). It remains to show WO⇒AC. Let X = {Ai : i ∈ I} be
a set of non-empty sets. Let Y = ⋃_{i∈I} Ai and fix a well-ordering of Y. Define
f : I → Y by letting f(i) be the least element of Ai. Then f is a choice function
for X.
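The construction in WO⇒AC is explicit once a well-ordering of Y is fixed. The Python sketch below illustrates it for a finite family, where of course no appeal to choice is needed; the function name and the sample data are hypothetical.

```python
def choice_function(family, well_ordering):
    """family: dict mapping each index i to a non-empty set A_i.
    well_ordering: list of the elements of the union Y, in increasing order."""
    position = {y: n for n, y in enumerate(well_ordering)}
    # f(i) is the least element of A_i in the fixed well-ordering of Y.
    return {i: min(A, key=position.__getitem__) for i, A in family.items()}

family = {"i": {3, 5}, "j": {5, 2}, "k": {7}}
well_ordering = [7, 5, 3, 2]        # an arbitrary fixed well-ordering of Y
f = choice_function(family, well_ordering)
```

The point of the design is that f is defined by a single rule applied uniformly to every index, so no separate choice is made for each Ai.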
Examples. Every complete poset and every non-empty finite poset is chain-
complete. In general, if X is a poset, then
Y = {C ⊂ X : C is a chain}
partially ordered by inclusion is chain-complete.
Theorem 9. AC+Bourbaki–Witt =⇒ ZL
Remark. For this reason, the Bourbaki–Witt fixed point theorem is sometimes
called the ‘choice-free’ part of ZL.
C = {C ⊂ X : C is a chain}
4 First-order Predicate Logic
The Language
The language is specified by a disjoint pair of sets: the set Ω of operation
symbols and the set Π of predicate symbols together with an arity function
α : Ω ∪ Π → N0 = N ∪ {0}. The language L = L(Ω, Π) then consists of the
following.
(i) Every variable is a term.
Note that brackets are not needed as their positions are uniquely determined by
the order of operation symbols and variables. The following strings of operation
symbols and variables are not terms: mxymz, emx, mxyz.
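That brackets are not needed can be seen by writing a reader for terms: each operation symbol announces how many subterms follow it. The Python sketch below is an illustration only, for the language of groups with m of arity 2, i of arity 1 and e of arity 0; restricting the variables to the single letters x, y, z and the function names are assumptions made for the example.

```python
ARITY = {"m": 2, "i": 1, "e": 0}
VARIABLES = set("xyz")      # assumption: single-letter variables only

def parse_term(s, pos=0):
    """Read one term starting at s[pos].
    Return the index just after it, or None if no term starts there."""
    if pos >= len(s):
        return None
    c = s[pos]
    if c in VARIABLES:
        return pos + 1
    if c not in ARITY:
        return None
    pos += 1
    for _ in range(ARITY[c]):   # read exactly as many subterms as the arity
        pos = parse_term(s, pos)
        if pos is None:
            return None
    return pos

def is_term(s):
    """A string is a term exactly when one term consumes all of it."""
    return parse_term(s) == len(s)
```

With this reader, is_term accepts mxix and mixx and rejects each of mxymz, emx and mxyz.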
(x1 = x2) , (x1 ≤ x2)
(ii) ⊥ is a formula.
Examples. The following are formulae in the language of groups. In each case
we indicate for every occurrence of the variable x whether it is a free or bound
occurrence.
(mxix = e) ⇒ (mixx = e)   (all four occurrences of x are free)
Note that in the last example, the variable x has both free and bound occurrences.
Although such formulae are technically allowed, it is usual mathematical practice
to avoid them. In the second example, there are no free variables: both x
and y only have bound occurrences. Such formulae have a special name.
Structures
Definition. A structure in a language L = L(Ω, Π), or L-structure, is a non-
empty set A together with functions
ω_A : A^n → A (ω ∈ Ω, n = α(ω))
ϕ_A : A^n → {0, 1} (ϕ ∈ Π, n = α(ϕ))
m_A : A × A → A , i_A : A → A
Motivation. Let L = L(Ω, Π) be a first-order language and A an L-structure.
Given a formula p in the language L, we want to define the interpretation of
p in A and what it means that ‘p is satisfied in A’. For example, if p is the
formula (mxix = e) in the language of groups, then we let p_A be the subset
{a ∈ A : m_A(a, i_A(a)) = e_A} of A, and then say that p is satisfied in A if
p_A = A. Equivalently, identifying p_A with its indicator function, we say p is
satisfied in A if p_A : A → {0, 1} is the constant function with value 1, where
p_A(a) = 1 if m_A(a, i_A(a)) = e_A, and p_A(a) = 0 otherwise. If q now is the
sentence (∀ x)p, then its interpretation in A should be a function A^0 → {0, 1},
i.e., q_A is simply an element of {0, 1}. We set q_A = 1 if p_A(a) = 1 for every
a ∈ A, i.e., if p holds in A, otherwise we set q_A = 0. We now give the formal
description of how to interpret formulae in a structure. This is rather dry, and
it is best to let these motivating examples be the guide.
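The motivating example can be carried out concretely in a finite structure. In the Python sketch below, the integers modulo n under addition are taken as a structure for the language of groups; this choice of structure, the dictionary encoding and the function names are assumptions made for illustration.

```python
def make_structure(n):
    """The integers modulo n as a structure for the language of groups."""
    return {
        "A": range(n),
        "m": lambda a, b: (a + b) % n,   # interpretation of m
        "i": lambda a: (-a) % n,         # interpretation of i
        "e": 0,                          # interpretation of e
    }

def p_A(S, a):
    """Interpretation of the formula (mxix = e) with x assigned the value a."""
    return 1 if S["m"](a, S["i"](a)) == S["e"] else 0

def q_A(S):
    """Interpretation of the sentence (∀ x)(mxix = e): an element of {0, 1}."""
    return 1 if all(p_A(S, a) == 1 for a in S["A"]) else 0

Z4 = make_structure(4)
# A structure in the same language that is not a group: i is the identity map.
bad = dict(Z4, i=lambda a: a)
```

In Z4 the formula holds for every value of x, so q_A(Z4) is 1. In the modified structure p_A picks out only the subset {0, 2}, so the sentence is not satisfied.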
t_A : A^n → A , t_A(a1, …, an) = ai
• If p is (q ⇒ r), then
Theories and Models
Definition. Let L = L(Ω, Π) be a first-order language and A be an L-structure.
Given a formula p in the language L, we say p is satisfied in A (or p holds in A, or
p is true in A or A is a model of p) if p_A = A^n or, equivalently, p_A : A^n → {0, 1}
is the constant function with value 1, where n is the number of free variables
in p.
T = {(∀ x)(x ≤ x) ,
(∀ x)(∀ y)((x ≤ y ∧ y ≤ x) ⇒ x = y) ,
(∀ x)(∀ y)(∀ z)((x ≤ y ∧ y ≤ z) ⇒ x ≤ z)}
The models of T are precisely the rings with 1. Note that here we reverted to
writing x + y instead of +xy, xy instead of x × y, etc. This and the theory of
groups are examples of algebraic theories: the sentences only involve equations;
well, they don’t actually, but for example the sentence (∀ x)((x1 = x)∧(1x = x))
can be replaced with the two sentences (∀ x)(x1 = x) and (∀ x)(1x = x), etc.
4. Theory of fields The language is the same as for rings with 1. The theory is
the union of the theory for rings with 1 and the following set of three sentences.
(∀ x)(∀ y)(xy = yx), ¬(0 = 1), (∀ x)(¬(x = 0) ⇒ (∃ y)(xy = 1))
This axiomatises fields. Note that this is not an algebraic theory, and indeed
fields cannot be axiomatised as an algebraic theory. This is because every field
has at least two elements, and it is easy to see that the singleton set is a model
for every algebraic theory.
5. Graph theory The language consists of Ω = ∅, Π = {a} with a having
arity 2 (a is the adjacency predicate). The theory is
(∀ x)¬(axx), (∀ x)(∀ y)(axy ⇒ ayx)
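For a finite structure, being a model of this theory can be verified by checking both sentences over all elements. The following Python sketch is an illustration only; the function names and the sample structures are hypothetical.

```python
def is_graph_model(A, a):
    """A: finite non-empty set; a: function A × A → {0, 1} interpreting the predicate a."""
    irreflexive = all(a(x, x) == 0 for x in A)                 # (∀ x)¬(axx)
    symmetric = all(a(x, y) == 0 or a(y, x) == 1               # (∀ x)(∀ y)(axy ⇒ ayx)
                    for x in A for y in A)
    return irreflexive and symmetric

def from_pairs(pairs):
    """Turn a set of ordered pairs into a {0, 1}-valued predicate."""
    return lambda x, y: 1 if (x, y) in pairs else 0

A = {0, 1, 2}
path = from_pairs({(0, 1), (1, 0), (1, 2), (2, 1)})   # the path 0 - 1 - 2
one_way = from_pairs({(0, 1)})                        # fails symmetry
loop = from_pairs({(0, 0)})                           # fails irreflexivity
```

Only the first of the three sample structures is a model; the other two each violate one of the sentences.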
Semantic entailment
Definition. Given a first-order language L = L(Ω, Π), a set S of sentences in
L and a sentence t in L, we say S (semantically) entails t, written S ⊨ t, if t
holds in every model of S.
Examples. 1. Let T be the theory of groups (in the language of groups). Then
T ⊨ (∀ x)(xx = e) ⇒ (∀ x)(∀ y)(xy = yx)
2. Let T be the theory of fields (in the language of rings with 1). Then
T ⊨ (∀ x)(¬(x = 0) ⇒ (∀ y)(∀ z)(xy = xz ⇒ y = z))
Remark. We will also need to define S ⊨ t in the case when S ∪ {t} contains
formulae with free variables. The following example motivates the definition.
Let T be the theory of fields (in the language of rings with 1). Let p be the
formula ¬(x = 0), let t be the formula (∃ y)(xy = 1) and let S = T ∪ {p}. It
ought to be the case that S ⊨ t because, given a field F, if we assign a value
a ∈ F to the variable x, then according to the field axioms, if pF (a) is true (i.e.,
pF (a) = 1), then tF (a) is also true.
Definition. Let L = L(Ω, Π) be a first-order language, S a set of formulae in
L and t a formula in L. Introduce a new constant to L for each free variable
occurring in S ∪ {t}. For a formula u ∈ S ∪ {t}, let u′ be the sentence in the
new language L′ obtained from u by replacing each free occurrence of a variable
with the corresponding constant and set S′ = {s′ : s ∈ S}. We then say S
(semantically) entails t, written S ⊨ t, if S′ ⊨ t′.
(iii) If t is the term myy, then p[t/x] is not defined as y occurs bound in p.
Syntactic entailment
Axioms.
(A1) p ⇒ (q ⇒ p) (p, q any formulae)
(A4) (∀ x)(x = x)
Rules of deduction.
Generalisation (Gen): from p with free variable x, we can deduce (∀ x)p pro-
vided x does not occur free in any premiss used in the
proof of p.
Definition. Let S be a set of formulae and p be a formula. A proof of p from
S is a finite sequence t1 , . . . , tn of formulae such that tn = p and for every i,
(i) either ti is an axiom, or
Proposition 1. (Deduction Theorem) Let S be a set of formulae and p, q
be formulae. Then S ⊢ (p ⇒ q) if and only if S ∪ {p} ⊢ q.
p (premiss)
q (MP)
ti ⇒ (p ⇒ ti ) (A1)
ti (axiom or premiss)
p ⇒ ti (MP)
is a proof of p ⇒ ti from S.
Case 2: If ti = p, then S ⊢ (p ⇒ ti) since ⊢ (p ⇒ p).
Case 3: If there exist j, k < i such that tk = (tj ⇒ ti), then by induction
hypothesis, there are proofs of p ⇒ tj and p ⇒ (tj ⇒ ti) from S. Adding the
lines
(p ⇒ (tj ⇒ ti)) ⇒ ((p ⇒ tj) ⇒ (p ⇒ ti)) (A2)
(p ⇒ tj) ⇒ (p ⇒ ti) (MP)
p ⇒ ti (MP)
(∀ x)tj (Gen)
(∀ x)tj ⇒ (p ⇒ (∀ x)tj ) (A1)
p ⇒ (∀ x)tj (MP)
Case 4b: If x does not occur free in p, then we write down a proof of p ⇒ tj
from S in which x does not occur free in any premiss (possible by induction
hypothesis). We then append the following lines
(∀ x)(p ⇒ tj ) (Gen)
(∀ x)(p ⇒ tj) ⇒ (p ⇒ (∀ x)tj) (A7)
p ⇒ (∀ x)tj (MP)
Aim. We now embark on the proof of the Completeness Theorem that states
that, for first-order logic, ⊢ and ⊨ coincide.
Note. For the converse, i.e., that S ⊨ p implies S ⊢ p, we first consider the
special case when S is a theory and p is the formula ⊥.
Idea of proof. We will build a model from the language L itself. We initially
choose our structure to be the set A of all closed terms of L, i.e., terms not
involving variables. Examples of closed terms in the language of commutative
rings with 1:
For an example of the first issue, consider the theory S of fields with characteristic
2 or 3, which consists of the theory of fields together with the sentence
(1 + 1 = 0) ∨ (1 + 1 + 1 = 0). Then S ⊬ (1 + 1 = 0) since S has models that are fields
of characteristic 3. Similarly, S ⊬ (1 + 1 + 1 = 0). It follows that in our structure
A, we have [1] +_A [1] = [1 + 1] ≠ [0] and [1] +_A [1] +_A [1] = [1 + 1 + 1] ≠ [0], and
thus A is not a model. As for propositional logic, we will first extend the theory
S to a consistent theory that is complete. In general, we say that a theory S
in a first-order language L is complete if for every sentence p, either S ⊢ p or
S ⊢ ¬p.
For an example of the second issue, consider the theory S of fields in which
2 has a square root. This consists of the theory of fields together with the
sentence (∃ x)(xx = 1 + 1). Then the structure A defined above consisting of
∼-equivalence classes of closed terms is not a model, since there is no closed
term t such that [tt] = [1 + 1]. In other words, we lack a witness to the sentence
(∃ x)(xx = 1 + 1), i.e., we lack a closed term t such that S ⊢ p[t/x] where p
is the formula (xx = 1 + 1). The solution is to add a new constant c to our
language and the new sentence (cc = 1 + 1) to our theory.
The problem is that the two processes of adding witnesses and of completion
pull in different directions. When we add witnesses to a complete theory, the
new theory may no longer be complete. When we complete a theory which has
witnesses, the new theory may lack witnesses.
to verify that S* is a consistent theory in the language L* that is complete and
has witnesses. Since every model of S* is also a model of S, to complete the
proof, we may assume that S is complete and has witnesses.
We let A be the set of equivalence classes of closed terms in L under the equivalence
relation s ∼ t if and only if S ⊢ (s = t). We turn A into an L-structure
as follows. For ω ∈ Ω with arity n, we define ω_A : A^n → A by setting
Applications of completeness/compactness
Question. Can we axiomatise the theory of finite groups? In other words, does
there exist a first-order theory T in a suitable language such that every finite
group is a model of T and every model of T is a finite group?
Consider, for each n ∈ N, the following sentence tn defined in any language L:
Note. For any set X we can take I = γ(X) (from Hartogs’ Lemma). This
shows that S has models that do not inject into X.
Remark. We can easily write down uncountable groups or vector spaces, but
already for fields, the Upward Löwenheim–Skolem Theorem is not obvious.
Peano Arithmetic
We finish this chapter with another worked example. Our aim is to axiomatise
the set of natural numbers. The key defining property of N is induction which
we try to emulate with an axiom-scheme.
Peano Arithmetic (PA) (also known as formal number theory) is the theory in
the language above with sentences as follows.
(∀ x)¬(sx = 0)
(∀ x)(∀ y)(sx = sy ⇒ x = y)
(∀ x)(x + 0 = x)
(∀ x)(∀ y)(x + sy = s(x + y))
(∀ x)(x × 0 = 0)
(∀ x)(∀ y)(x × sy = (x × y) + x)
(∀ y1) … (∀ yn)((p[0/x] ∧ (∀ x)(p ⇒ p[sx/x])) ⇒ (∀ x)p)
for any formula p with FV(p) = {x, y1, …, yn}.
Remark. The last axiom is the axiom-scheme for induction. The variables
y1 , . . . , yn are parameters. To see why they are needed, consider the following
formula p: (x + y) + z = x + (y + z) with free variables x, y, z. We can prove
(∀ x)(∀ y)(∀ z)p by induction on z with x, y treated as parameters. Formally, we
verify that
p[0/z] ∧ (∀ z)(p ⇒ p[sz/z])
holds in any model of PA, and hence it is provable by completeness. We
then use the induction-scheme to deduce that (∀ z)p holds, and hence so does
(∀ x)(∀ y)(∀ z)p by Generalisation.
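In the standard model, the axioms for + and × are recursion equations, and the associativity statement discussed above can at least be tested on small values. The Python sketch below is an illustration only: it takes s to be n ↦ n + 1 on the natural numbers and uses the usual clauses x + 0 = x and x + sy = s(x + y) for addition, which are assumed here to be the intended ones.

```python
def s(x):
    """Successor in the standard model."""
    return x + 1

def add(x, y):
    # x + 0 = x ;  x + sy = s(x + y)
    return x if y == 0 else s(add(x, y - 1))

def mul(x, y):
    # x × 0 = 0 ;  x × sy = (x × y) + x
    return 0 if y == 0 else add(mul(x, y - 1), x)

def associativity_holds_up_to(bound):
    """Check (x + y) + z = x + (y + z) for all x, y, z below bound."""
    return all(add(add(x, y), z) == add(x, add(y, z))
               for x in range(bound) for y in range(bound) for z in range(bound))
```

Such a finite check is of course no substitute for the induction on z with parameters x, y; it only confirms the instances it visits.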
Examples. The following sets are definable using the given formula:
5 Set Theory
This is just another piece of mathematical theory like group theory, topology,
etc. We will axiomatise set theory as a first-order theory. So we could think
of this chapter as just another worked example of first-order logic. Since any
model of set theory should contain all of mathematics, it will obviously be a
very complicated example of a first-order theory.
The set x whose existence this axiom asserts is unique by (Ext). We call this set
the empty set denoted by ∅. (Formally, we add the constant ∅ to the language
of ZF with the sentence (∀ y)¬(y ∈ ∅).)
Strictly speaking, this axiom is not needed as it follows from (Sep). Indeed, in
a structure V , we can pick any set x and form the set {y ∈ x : ¬(y = y)} by
(Sep). However, if in first-order logic we allow the empty set as a structure,
then (Emp) is needed (or some axiom asserting the existence of some set).
4. Pair-set axiom (Pair).
‘For any sets x, y, we can form {x, y}.’
(Pair) (∀ x)(∀ y)(∃ z)(∀ t)(t ∈ z ⇔ (t = x ∨ t = y))
The unique (by Extensionality) set z whose existence is asserted here is denoted
by {x, y}. We shall write {x} for {x, x}. (Formally, we add a binary operation
{, } and a unary operation {} to the language of ZF.)
It follows from (Ext) that {x, y} = {y, x} for all x, y. Thus, (Pair) gives us
unordered pairs. Ordered pairs can be constructed using the following device
(due to K. Kuratowski): for sets x, y define
(x, y) = {{x}, {x, y}}. This satisfies:
(∀ x)(∀ y)(∀ z)(∀ t)((x, y) = (z, t) ⇔ (x = z ∧ y = t))
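Illustration. The defining property of ordered pairs can be checked exhaustively on a few hereditarily finite sets. The sketch below models sets as Python frozensets; this is an assumption of the illustration, as are the names `pair` and `universe`.

```python
from itertools import product

def pair(x, y):
    """The ordered pair (x, y) = {{x}, {x, y}} built from frozensets."""
    return frozenset({frozenset({x}), frozenset({x, y})})

# A small universe of hereditarily finite sets: 0, 1 and 2.
e = frozenset()
universe = [e, frozenset({e}), frozenset({e, frozenset({e})})]

# (x, y) = (z, t) holds exactly when x = z and y = t.
pairs_ok = all(
    (pair(x, y) == pair(z, t)) == (x == z and y == t)
    for x, y, z, t in product(universe, repeat=4)
)
```

Note that `pair(x, x)` collapses to {{x}}, so the two components are recovered by a case distinction rather than by position.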
‘f is a function’ means
(∀ x)(x ∈ f ⇒ ‘x is an ordered pair’)
∧ (∀ x)(∀ y)(∀ z)(((x, y) ∈ f ∧ (x, z) ∈ f ) ⇒ y = z)
Note. We do not need a separate axiom for intersections. Indeed, the following
sentence follows from the axioms so far:
(∀ x)(¬(x = ∅) ⇒ (∃ y)(∀ z)(z ∈ y ⇔ (∀ t)(t ∈ x ⇒ z ∈ t)))
Indeed, given a non-empty set x, we can form the set
y = {z ∈ ⋃x : (∀ t)(t ∈ x ⇒ z ∈ t)}
using (Un) and (Sep). Note that technically we work in a model here to construct
the set y; then the sentence above follows by the Completeness Theorem. We
will denote the unique (by Extensionality) set y constructed above by ⋂x. We
will write a ∩ b for ⋂{a, b}.
Note. We can now define the domain of a function f . Note that if (x, y) ∈ f ,
then since (x, y) = {{x}, {x, y}}, it follows that x, y ∈ ⋃⋃f . We can then form
the set
dom f = {x ∈ ⋃⋃f : (∃ y)((x, y) ∈ f )}
using (Un) and (Sep). This of course makes sense for any set f . Formally, we
introduce a new unary operation symbol dom to the language of ZF.
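Illustration. The construction of dom f from two applications of union followed by separation can be imitated on hereditarily finite sets. In the sketch below sets are modelled as frozensets and the names `pair`, `big_union` and `dom` are hypothetical; the witness y is searched for inside ⋃⋃f, which is enough because both components of a pair in f lie there.

```python
def pair(x, y):
    """The ordered pair (x, y) = {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def big_union(x):
    """The union of x: all members of members of x."""
    return frozenset(z for y in x for z in y)

def dom(f):
    """Members x of the double union of f such that (x, y) is in f for some y."""
    candidates = big_union(big_union(f))
    return frozenset(
        x for x in candidates
        if any(pair(x, y) in f for y in candidates)
    )

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})

# The function sending 0 to 1 and 1 to 2, as a set of ordered pairs.
f = frozenset({pair(zero, one), pair(one, two)})
```

As in the text, `dom` makes sense for any set, not only for functions.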
using (Un), (Pow) and (Sep). In turn, we can form the set y^x of all functions
from x to y:
y^x = {f ∈ P(x × y) : f : x → y}
using (Pow) and (Sep).
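Illustration. For finite sets the set of all functions from x to y can be carved out of P(x × y) exactly as described, by keeping the subsets that are functions. The sketch below uses Python tuples for ordered pairs (an assumption made for readability) and the hypothetical names `powerset`, `is_function_from_to` and `function_set`.

```python
from itertools import chain, combinations, product

def powerset(s):
    """All subsets of s, as a list of frozensets."""
    items = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def is_function_from_to(f, x, y):
    """f is a subset of x × y with exactly one pair (a, b) for each a in x."""
    inside = all(a in x and b in y for (a, b) in f)
    single_valued_total = all(
        sum(1 for (a2, _) in f if a2 == a) == 1 for a in x
    )
    return inside and single_valued_total

def function_set(x, y):
    """Separation applied to P(x × y): keep the subsets that are functions."""
    return [f for f in powerset(product(x, y)) if is_function_from_to(f, x, y)]
```

For |x| = 3 and |y| = 2 this filters the 64 subsets of x × y down to 8 functions.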
7. Axiom of infinity (Inf ). With the first six axioms we can already do
quite a bit of mathematics. Also, in any model V there will be infinitely many
elements. For example, it is easy to show that the sets ∅, P∅, PP∅, . . . are pairwise
distinct.
For another example, let us first introduce for a set x, the successor of x to be
the set x⁺ = x ∪ {x}. Then the sets ∅, ∅⁺, ∅⁺⁺, ∅⁺⁺⁺, . . . are pairwise distinct.
We shall denote these sets by 0, 1, 2, 3, . . . , respectively. Thus,
These examples show that from the outside, V is an infinite set. However, V is
not a set, i.e., the sentence (∃ x)(∀ y)(y ∈ x) does not hold in V . This is known
as Russell’s paradox. (Indeed, if the sentence holds in V , then we can form the
set y = {z ∈ x : ¬(z ∈ z)} by (Sep), and get a contradiction by considering
whether y ∈ y.) So we need an axiom that says that there is a set that contains,
for example, the elements 0, 1, 2, 3, . . . . We begin with a definition.
Say that ‘x is a successor set’ if
(0 ∈ x) ∧ (∀ y ∈ x)(y⁺ ∈ x)
It follows from this axiom (and the ones listed so far) that there is a smallest
successor set:
(∃ x)(‘x is a successor set’ ∧ (∀ y)(‘y is a successor set’ ⇒ x ⊂ y))
z = {t ∈ Py : ‘t is a successor set’}
by (Pow) and (Sep). It is easy to check that x = ⋂z is the smallest successor
set which will be denoted by ω.
Note that every successor set contained in ω is ω:
(∀ x ⊂ ω)((0 ∈ x ∧ (∀ y ∈ x)(y⁺ ∈ x)) ⇒ x = ω)
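Illustration. The first few members of ω can be built explicitly from the successor operation x ∪ {x}. The sketch below models sets as frozensets; the names `successor`, `numeral` and `numerals` are hypothetical.

```python
def successor(x):
    """The successor of x, namely x ∪ {x}."""
    return x | frozenset({x})

def numeral(n):
    """Apply the successor operation n times to the empty set."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x

# The sets denoted 0, 1, 2, 3, 4, 5.
numerals = [numeral(n) for n in range(6)]
```

Each numeral n turns out to be the set {0, 1, . . . , n − 1} of its predecessors, which is why the numerals are pairwise distinct.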
Digression on classes.
Definition. A class C, given by a formula p with free variable y, is a set if
(∃ x)(∀ y)(y ∈ x ⇔ p) holds in V . Otherwise we say C is a proper class.
End of digression.
The Axiom of Replacement is an axiom-scheme stating that the image of a set
under a function-class is a set. As usual, we use parameters.
(Rep) (∀ t1 ) . . . (∀ tn )([(∀ x)(∀ y)(∀ z)((p ∧ p[z/y]) ⇒ (y = z))]
⇒ (∀ x)(∃ y)(∀ z)(z ∈ y ⇔ (∃ u)(u ∈ x ∧ p[u/x, z/y])))
Remark. The nine axioms and axiom-schemes above form ZF set theory. Note
that the Axiom of Choice is not included. We shall write ZFC for ZF+AC, i.e.,
ZF set theory with the Axiom of Choice.
(AC) (∀ x)((∀ y ∈ x)¬(y = ∅) ⇒ (∃ f )(f : x → ⋃x ∧ (∀ y ∈ x)(f (y) ∈ y)))
Remark. For the rest of this chapter we work within ZF. Our ultimate aim is
to describe the set-theoretic universe V . We first prove versions of induction and
recursion similar to but more general than those introduced for well-ordered sets
in Chapter 2. This will eventually lead to a proper definition of ordinals thereby
filling the gap from Chapter 2. We then describe a picture of the universe in
which sets appear in ‘time’ measured by ordinals where no set appears before
all its members do.
Definition. A set x is transitive if every member of a member of x is a member
of x. Thus, ‘x is transitive’ is shorthand for
(∀ y)((∃ z)(z ∈ x ∧ y ∈ z) ⇒ y ∈ x)
or equivalently, if ⋃x ⊂ x.
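Illustration. The two formulations of transitivity can be compared on small hereditarily finite sets. The sketch below models sets as frozensets; the names `big_union`, `is_transitive` and `is_transitive_via_union` are hypothetical.

```python
def big_union(x):
    """The union of x: all members of members of x."""
    return frozenset(z for y in x for z in y)

def is_transitive(x):
    """Every member of a member of x is a member of x."""
    return all(z in x for y in x for z in y)

def is_transitive_via_union(x):
    """Equivalent formulation: the union of x is a subset of x."""
    return big_union(x) <= x

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
samples = [zero, one, two, frozenset({one}), frozenset({zero, two})]
```

Here {1} fails to be transitive because 0 ∈ 1 but 0 is not a member of {1}.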
A straightforward ω-induction shows that any two attempts agree on the inter-
section of their domains:
(*) (∀ f )(∀ g)(∀ n)((‘f is an attempt’ ∧ ‘g is an attempt’
∧ n ∈ dom f ∧ n ∈ dom g) ⇒ f (n) = g(n))
by (Sep). We show that w is a successor set from which it will follow that
w = ω, as required. Since f = {(0, x)} is an attempt, 0 ∈ w. If n ∈ w, then fix
an attempt f with n ∈ dom f . Since every member of ω is transitive, we have
n ⊂ dom f , and hence n+ ⊂ dom f . By restricting f to n+ , we can assume that
dom f = n+ in which case
g = f ∪ n+ , f (n)
S
Remark. The set t constructed in the proof above is in fact the transitive
closure TC(x) of x.
Proof. Fix values t1 , . . . , tn of the parameters and assume that p(x) holds when-
ever p(y) holds for all members y of x, i.e., that (∀ x)((∀ y ∈ x)p[y/x] ⇒ p) holds.
Assume for a contradiction that ¬(∀ x)p holds and fix any set x such that p(x)
fails.
(At this point we would like to take a minimal counterexample, i.e., an ∈-
minimal member of {y : ¬p(y)}. However, {y : ¬p(y)} may not be a set. This
is where transitive closure comes in.)
By Lemma 1 we can form the set t = TC({x}), and by (Sep) we can form the
set u = {y ∈ t : ¬p(y)}. Then x ∈ u, and hence u has an ∈-minimal member z.
If y ∈ z, then y ∈ t since t is transitive, and thus y ∉ u since z is ∈-minimal in
u. It follows that p(y) holds for all y ∈ z. By assumption on p, we deduce p(z)
contradicting the choice of z.
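Illustration. The search for a minimal counterexample inside TC({x}) can be carried out mechanically for hereditarily finite sets. The sketch below models sets as frozensets; `transitive_closure` and `minimal_counterexample` are hypothetical names, and the property p is an arbitrary Python predicate chosen for the example.

```python
def transitive_closure(x):
    """Smallest transitive set having x as a subset (x hereditarily finite)."""
    closure = set(x)
    frontier = list(x)
    while frontier:
        y = frontier.pop()
        for z in y:
            if z not in closure:
                closure.add(z)
                frontier.append(z)
    return frozenset(closure)

def minimal_counterexample(x, p):
    """Return a member z of u = {y in TC({x}) : not p(y)} with no member in u."""
    t = transitive_closure(frozenset({x}))
    u = {y for y in t if not p(y)}
    for z in u:
        if not any(y in u for y in z):
            return z
    return None  # p holds throughout TC({x})

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
x = frozenset({zero, two, frozenset({two})})

def p(y):
    """Example property: y has at most one member."""
    return len(y) <= 1

z = minimal_counterexample(x, p)
```

In this example p fails at x and at 2, and the minimal counterexample is 2: p holds for every member of 2 but not for 2 itself, so p is not inherited from members.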
Remark. In the presence of the first eight axioms of ZF, the Principle of ∈-
induction is equivalent to the Axiom of Foundation. One direction is Theorem 2.
For the converse, assume the Principle of ∈-induction. Say that a set x is regular
if
(∀ y)(x ∈ y ⇒ ‘y has an ∈-minimal member’)
(this definition is the clever bit). Then (Fnd) is equivalent to the assertion that
(∀ x)(‘x is regular’) which we prove by ∈-induction. Fix a set x and assume that
every y ∈ x is regular (the induction hypothesis). Let z be a set with x ∈ z.
We need to show that z has an ∈-minimal member. This is obviously true if x
itself is an ∈-minimal member of z. If not, then we have y ∈ z for some y ∈ x,
in which case z has an ∈-minimal member since y is regular by the induction
hypothesis. This shows that x is regular, as required.
We now turn to ∈-recursion. Informally, this is the statement that a function f
can be defined so that for every x, the value f (x) is given in terms of the values
f (y), y ∈ x.
Note that if x ∈ dom f , then x ⊂ dom f since dom f is transitive, and hence f↾x
makes sense. Now a straightforward ∈-induction (as in the proof of uniqueness)
shows that any two attempts agree on the intersection of their domains:
(*) (∀ f )(∀ g)(∀ x)((‘f is an attempt’ ∧ ‘g is an attempt’
∧ x ∈ dom f ∧ x ∈ dom g) ⇒ f (x) = g(x))
Another ∈-induction shows that every set is in the domain of some attempt:
To see this, fix a set x and assume that for all y ∈ x there is an attempt defined
at y. Note that an attempt defined at y is defined on TC({y}) since the domain
of an attempt is transitive, and the restriction to TC({y}) of this attempt is still
an attempt. Hence by (∗), for each y ∈ x there is a unique attempt f_y defined
on TC({y}). Then {f_y : y ∈ x} is a set by Replacement and f′ = ⋃{f_y : y ∈ x}
is an attempt whose domain contains x. Finally, f = f′ ∪ {(x, G(f′↾x))} is an
attempt defined at x.
We now let q be the formula
(∃ f )(‘f is an attempt’ ∧ y = f (x))
It is now straightforward to verify that q defines a function-class F with the
required properties.
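Illustration. On hereditarily finite sets, ∈-recursion is ordinary recursion on members. The sketch below is an assumption-laden model rather than the construction in the proof: sets are frozensets, the restriction of F to x is passed to G as a Python dict, and the names `epsilon_recursion`, `tc` and `ident` are hypothetical.

```python
def epsilon_recursion(G):
    """Return F with F(x) = G(F restricted to x); the restriction is a dict."""
    cache = {}

    def F(x):
        if x not in cache:
            cache[x] = G({y: F(y) for y in x})
        return cache[x]

    return F

# F(x) = x ∪ ⋃{F(y) : y ∈ x}: the keys of the dict are the members of x.
tc = epsilon_recursion(
    lambda restricted: frozenset(restricted)
    | frozenset(z for v in restricted.values() for z in v)
)

# F(x) = {F(y) : y ∈ x}: by ∈-induction this F is the identity.
ident = epsilon_recursion(lambda restricted: frozenset(restricted.values()))

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
```

The cache plays the role of an attempt: it is a partial function that already satisfies the recursion equation wherever it is defined.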
Proof. We begin with existence. By r-recursion there is a function-class f such
that
(∀ x ∈ a)(f (x) = {f (y) : y ∈ a, y r x})
We set b = {f (x) : x ∈ a} which is a set by (Rep). Since {(x, f (x)) : x ∈ a}
is also a set by (Rep), we can take f to be a function. We verify that (b, f )
satisfies the conclusions of the theorem.
We first show that b is transitive. Given z ∈ b, z = f (x) for some x ∈ a, and
hence z = {f (y) : y ∈ a, y r x}. It follows that w ∈ b whenever w ∈ z.
By definition, f is surjective and x r y implies f (x) ∈ f (y) for all x, y ∈ a. It
remains to show that f is injective. This will also show that f (x) ∈ f (y) implies
x r y for all x, y ∈ a. Indeed, if f (x) ∈ f (y), then f (x) = f (z) for some z ∈ a
with z r y. Then by injectivity z = x, and thus x r y.
For x ∈ a, say that f is injective at x if (∀ y ∈ a)(f (y) = f (x) ⇒ y = x).
Then f is injective if and only if (∀ x ∈ a)(f is injective at x) which we show
by r-induction. Fix x ∈ a and assume that f is injective at s for all s ∈ a with
s r x. Assume that f (x) = f (y) for some y ∈ a. Then
{f (s) : s ∈ a, s r x} = {f (t) : t ∈ a, t r y}
{s : s ∈ a, s r x} = {t : t ∈ a, t r y}
by r-induction. Fix x ∈ a and assume that f (y) = f′(y) for all y ∈ a with y r x.
Given w ∈ f (x), we have w ∈ b since b is transitive, and w = f (z) for some
z ∈ a since f is surjective. Then f (z) ∈ f (x), and hence z r x. It follows by
the induction hypothesis that w = f (z) = f′(z) ∈ f′(x). Similarly, w ∈ f′(x)
implies w ∈ f (x). Thus, by the Axiom of Extensionality, we have f (x) = f′(x).
This completes the r-induction which shows that f = f′, and thus b = b′.
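Illustration. For a finite well-founded relation the collapsing map f (x) = {f (y) : y r x} can be computed by direct recursion. In the sketch below the relation is given as a set of Python pairs (y, x) meaning y r x; this encoding and the name `mostowski_collapse` are assumptions, and well-foundedness of r is taken for granted rather than checked.

```python
def mostowski_collapse(a, r):
    """Map each x in a to f(x) = {f(y) : y in a, y r x}; r must be well-founded."""
    preds = {x: [y for y in a if (y, x) in r] for x in a}
    cache = {}

    def f(x):
        if x not in cache:
            cache[x] = frozenset(f(y) for y in preds[x])
        return cache[x]

    return {x: f(x) for x in a}

# A three-element strict linear order p < q < s.
a = {"p", "q", "s"}
r = {("p", "q"), ("p", "s"), ("q", "s")}
collapse = mostowski_collapse(a, r)
b = set(collapse.values())

# Without extensionality injectivity fails: two r-minimal points both go to ∅.
degenerate = mostowski_collapse({"u", "v"}, set())
```

The linear order collapses onto the ordinal 3 = {0, 1, 2}, with x r y holding exactly when f (x) ∈ f (y).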
Proposition 5. Let α, β ∈ ON and a be a set of ordinals.
(i) Every member of α is an ordinal.
(ii) β ∈ α ⇔ β < α
(iii) β ∈ α or β = α or α ∈ β
(iv) α+ = α ∪ {α}
(v) ⋃a is an ordinal and ⋃a = sup a.
Remarks. 1. Recall from Chapter 2 that the notation β < α in part (ii) means
that β is order-isomorphic to a proper initial segment of α. Parts (i) and (ii)
together show that the ordinal α really is the set of ordinals strictly less than α.
2. Part (iii) shows that ∈ is a linear order on the class ON.
3. Part (iv) reconciles two definitions. According to the definition in Chapter 2,
α+ is the unique (up to order-isomorphism) well-ordered set that consists of
α as a proper initial segment and one extra element that is a maximum. By
Mostowski, this well-ordered set is order-isomorphic to a unique ordinal (its
order-type). Part (iv) shows that this ordinal is the successor of the set α as defined
in this chapter. In particular, this shows that the successor of an ordinal is an
ordinal.
4. Part (v) implies that any set x of well-ordered sets has an upper bound. This
was owed from Chapter 2 (see the Remark following Proposition 2.8). Indeed,
a = {order-type(y) : y ∈ x} is a set of ordinals by (Rep) which by part (v) has
an upper bound.
Picture of the Universe
Idea. We build the entire universe V starting from the empty set by repeatedly
applying P and ⋃. So we have ∅, P∅, PP∅, . . . , then ⋃{∅, P∅, PP∅, . . . }, etc.
Formally, we define Vα , α ∈ ON, by ∈-recursion as follows.
V_0 = ∅
V_{α⁺} = PV_α
V_λ = ⋃{V_α : α < λ} for a non-zero limit ordinal λ .
The class of sets Vα , α ∈ ON, is called the von Neumann hierarchy. Our aim is
to show that every set appears in one of the sets Vα . This leads to the following,
somewhat unstable-looking, picture of the universe, which perhaps also explains
why it is usually denoted by V .
[Figure: the von Neumann hierarchy drawn as a V widening upwards along the ordinal axis ON. Levels, from top to bottom:]
      ⋮
α     Vα
      ⋮
ω+1   Vω+1 = PVω
ω     Vω = ⋃{V0 , V1 , V2 , . . . }
      ⋮
2     V2 = PP∅
1     V1 = P∅
0     V0 = ∅
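Illustration. The finite levels of the hierarchy are small enough to compute. The sketch below builds V0 , . . . , V4 by iterating the power set, with sets modelled as frozensets; the names `powerset`, `v_levels` and `V` are hypothetical.

```python
from itertools import chain, combinations

def powerset(x):
    """The power set of a finite set x, as a frozenset of frozensets."""
    items = list(x)
    return frozenset(frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)))

def v_levels(n):
    """The list [V_0, ..., V_n] with V_0 = ∅ and V_{k+1} = P(V_k)."""
    levels = [frozenset()]
    for _ in range(n):
        levels.append(powerset(levels[-1]))
    return levels

V = v_levels(4)
sizes = [len(level) for level in V]
```

The sizes grow as 0, 1, 2, 4, 16 (and V5 would have 65536 members), each level is transitive, and each level is contained in the next.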
Theorem 8. The von Neumann hierarchy exhausts the set-theoretic universe,
i.e., (∀ x)(∃ α ∈ ON)(x ∈ Vα ) holds in ZF.
rank(x) ≤ sup{rank(y)⁺ : y ∈ x} .
For the reverse inequality, we first show that if x ∈ Vα then rank(x) < α. So
let us assume that x ∈ Vα . Then α > 0. If α = β⁺ , then x ⊂ Vβ , and hence
rank(x) ≤ β < α. If α is a limit ordinal, then x ∈ Vβ for some β < α. It follows
that x ⊂ Vβ since Vβ is transitive, and thus rank(x) ≤ β < α.
Now set α = rank(x). Then for each y ∈ x, we have y ∈ Vα , and hence
rank(y) < α by the claim above. It follows that sup{rank(y)⁺ : y ∈ x} ≤ α.
Example. Using the formula above, an easy induction shows that rank(α) = α
for every ordinal α.
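Illustration. For hereditarily finite sets all ranks are natural numbers, so the formula rank(x) = sup{rank(y)⁺ : y ∈ x} becomes a one-line recursion. The sketch below models sets as frozensets and finite ordinals as Python integers; both choices, and the names `rank` and `numeral`, are assumptions of the example.

```python
def rank(x):
    """rank(x) = sup{rank(y) + 1 : y in x}, with the empty supremum equal to 0."""
    return max((rank(y) + 1 for y in x), default=0)

def numeral(n):
    """The finite ordinal n, built by iterating x ↦ x ∪ {x} from the empty set."""
    x = frozenset()
    for _ in range(n):
        x = x | frozenset({x})
    return x

ranks_of_numerals = [rank(numeral(n)) for n in range(6)]
```

The check rank(n) = n for n < 6 is the finite part of the Example above; by contrast the singleton {2} has rank 3 although it has only one member.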
6 Cardinal Arithmetic
In this chapter we are interested in the size of sets. So we will want to identify
sets that have the same size. We introduce the abbreviation ‘x ≡ y’ for the
formula (∃ f )(‘f is a bijection from x to y’). Note that this is an equivalence
relation on V .
Next, we wish to define the size, or cardinality, of a set x to be a set card(x)
such that the following holds.
Note. In the rest of this chapter we will work in ZFC, so we could adopt the
first definition of cardinality. However, the exact definition does not matter that
much. What is important is property (†). Also, much of what we do below is
valid in ZF.
The Alephs
Definition. Say α ∈ ON is an initial ordinal if (∀ β ∈ ON)(β < α ⇒ ¬(β ≡ α)).
Examples. For every set x, the Hartogs’ ordinal γ(x) is an initial ordinal. Since
for n < ω we have γ(n) = n+ (easy ω-induction), it follows that all members of
ω are initial ordinals, which in turn implies that ω is an initial ordinal.
The ordinals ω^2 , ω^3 or ε0 = ω^ω^ω^⋰ are not initial ordinals as they all biject
with ω. In fact, the next initial ordinal after ω is γ(ω) = ω1 . More generally,
we can index the infinite initial ordinals as follows.
Definition. Define ωα for α ∈ ON by recursion:
ω0 = ω
ω_{β⁺} = γ(ωβ )
ωλ = sup{ωβ : β < λ} (λ non-zero limit)
Remark. Note that for ordinals α < β, if β injects into α, then by the Schröder–
Bernstein theorem we have α ≡ β. It follows that if α < β and β is an initial
ordinal, then β cannot inject into α. We shall use this simple observation several
times below.
Proof. We first show that the ωα are initial ordinals by induction on α. We only
need to check the case when α is a non-zero limit. In this case, assume that
ωα ≡ γ for some γ < ωα . Then γ < ωβ for some β < α. Since ωβ < ωα (easy
induction on α), it follows that ωβ injects into γ contradicting the induction
hypothesis that ωβ is an initial ordinal.
Now assume that δ is an infinite initial ordinal. An easy induction shows that
α ≤ ωα for all α ∈ ON, and hence there is a least α with δ < ωα . Since δ is
infinite, α ≠ 0, and moreover α cannot be a limit otherwise δ < ωβ for some
β < α contradicting the minimality of α. Thus α = β⁺ for some β that satisfies
ωβ ≤ δ < ω_{β⁺} = γ(ωβ ). It follows that δ injects into ωβ , and thus δ = ωβ as δ
is an initial ordinal.
Note. The relation ≤ is well defined, i.e., does not depend on the choice of
the sets M, N . It is also easy to check that ≤ is a partial order on the class
of cardinals. Antisymmetry (m ≤ n and n ≤ m imply m = n) follows from
Schröder–Bernstein. In ZFC it is even a linear order.
m + n = card(M ⊔ N )
m · n = card(M × N )
m^n = card(M^N ) (M^N = {f ∈ P(N × M ) : f : N → M })
These operations are well-defined, i.e., they do not depend on the choice of the
sets M, N .
Properties. The following are straightforward to check by writing down a
bijection between appropriate sets.
m + n = n + m
(m + n) + p = m + (n + p)
m · n = n · m
(mn)p = m(np)
m(n + p) = mn + mp
(mn)^p = m^p n^p
m^(n+p) = m^n m^p
(m^n)^p = m^(np)
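Illustration. For finite sets the bijections behind these identities can be written down and tested. The sketch below checks the law m^(n+p) = m^n m^p by restricting a function on a disjoint union to each summand. Functions are encoded as tuples of (argument, value) pairs and the disjoint union uses tags 0 and 1; these encodings and the names `functions`, `disjoint_union` and `split` are assumptions of the example.

```python
from itertools import product

def functions(N, M):
    """All functions N → M, each encoded as a tuple of (argument, value) pairs."""
    N = sorted(N)
    return [tuple(zip(N, values)) for values in product(sorted(M), repeat=len(N))]

def disjoint_union(N, P):
    """The disjoint union of N and P, with members tagged by 0 and 1."""
    return {(0, n) for n in N} | {(1, p) for p in P}

def split(f):
    """Send f on the disjoint union to its pair of restrictions to N and P."""
    g = tuple((n, v) for ((tag, n), v) in f if tag == 0)
    h = tuple((p, v) for ((tag, p), v) in f if tag == 1)
    return (g, h)

M, N, P = {0, 1, 2}, {"a", "b"}, {"c"}
images = [split(f) for f in functions(disjoint_union(N, P), M)]
target = set(product(functions(N, M), functions(P, M)))
```

Here `split` is injective and its image is all of M^N × M^P, so it is a bijection between sets of size 3^3 and 3^2 · 3^1.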
Note. Cantor’s diagonal argument shows that m < 2^m for all cardinals m
(there is no surjection M → 2^M ). In particular, ℵ0 < 2^ℵ0 which contrasts with
ω = 2^ω for ordinal exponentiation.
Similarly, we have 2 · ℵ0 = ℵ0 · 2 in contrast with 2 · ω = ω ≠ ω · 2 for ordinal
multiplication.
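Illustration. The diagonal argument can be verified exhaustively for a small set. The sketch below enumerates every function g from a three-element set M to its power set and confirms that the diagonal set {x ∈ M : x ∉ g(x)} is never a value of g. The names `diagonal`, `subsets` and `never_surjective` are hypothetical.

```python
from itertools import product

def diagonal(M, g):
    """The diagonal set {x in M : x not in g(x)} for g given as a dict."""
    return frozenset(x for x in M if x not in g[x])

M = [0, 1, 2]

# All 8 subsets of M, obtained from their characteristic functions.
subsets = [
    frozenset(x for x, bit in zip(M, bits) if bit)
    for bits in product([0, 1], repeat=len(M))
]

# Each of the 8^3 = 512 functions g : M → P(M) misses its own diagonal set.
never_surjective = all(
    diagonal(M, dict(zip(M, images))) not in images
    for images in product(subsets, repeat=len(M))
)
```

The finite check adds nothing to the proof, but it makes visible that the diagonal set depends on g and differs from g(x) at the point x.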
A consequence of the next result is that addition and multiplication of alephs
are easy.
Note. In ZFC one can define more general infinite sums and products of car-
dinals. In the definitions below, as earlier, lower-case letters denote cardinals
and upper-case letters denote sets with cardinality the corresponding lower-case
letter.
So it is of interest to study cardinals of the form 2^ℵβ . We know that ℵβ < 2^ℵβ
but very little else is known. For example, a natural question is whether 2^ℵ0
is equal to ℵ1 . Since 2^ℵ0 is the cardinality of R, this became known as the
Continuum Hypothesis (or CH for short):
(CH) 2^ℵ0 = ℵ1
K. Gödel (in the late 1930s) and P. Cohen (in the 1960s) proved, respectively, that if ZFC is
consistent, then so are ZFC+CH and ZFC+¬CH. So CH is independent of ZFC.
7 *Classical descriptive set theory*
Polish spaces
Definition. A Polish space is a separable, completely metrizable topological
space.
Borel hierarchy
Definitions. Let X be an arbitrary set. A σ-field (or σ-algebra) on X is a
subset F of the power set PX such that
(i) ∅ ∈ F
(ii) A1 , A2 , · · · ∈ F implies ⋃_{n∈N} An ∈ F
(iii) A ∈ F implies X \ A ∈ F
Note that in particular a σ-field is closed under countable intersections (as well
as countable unions).
Now assume that X is a Polish space. The Borel σ-field B on X is the smallest
σ-field on X containing all the open sets. (Equivalently, B is the intersection
of all σ-fields on X that contain the open sets; there exists at least one such
σ-field, namely PX.) Members of B are called Borel sets.
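Illustration. On a finite set X, countable unions reduce to finite unions, so the smallest σ-field containing a given family can be computed by closing under complements and pairwise unions. The sketch below does this for a four-point set; the name `generated_sigma_field` and the choice of generators are assumptions of the example.

```python
def generated_sigma_field(X, generators):
    """Smallest family containing the generators, ∅ and X, closed under
    complements and pairwise unions (enough for a finite set X)."""
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for A in current:
            for B in [X - A] + [A | C for C in current]:
                if B not in family:
                    family.add(B)
                    changed = True
    return family

X = {1, 2, 3, 4}
F = generated_sigma_field(X, [{1}, {1, 2}])
```

The generators {1} and {1, 2} split X into the atoms {1}, {2} and {3, 4}, so the generated σ-field has 2^3 = 8 members and does not separate 3 from 4.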
Borel hierarchy. For a Polish space X, we define families Σ^0_α and Π^0_α of subsets
of X for ordinals 1 ≤ α < ω1 by recursion as follows.
Σ^0_1 = {U ⊂ X : U open}
Π^0_1 = {F ⊂ X : F closed}
Π^0_{α+1} = {X \ A : A ∈ Σ^0_{α+1} }
Π^0_λ = {X \ A : A ∈ Σ^0_λ }
where in the last two lines λ is a non-zero limit. The collection of these families
is the Borel hierarchy of X.
We define ∆^0_α = Σ^0_α ∩ Π^0_α for 1 ≤ α < ω1 .
Example. Σ^0_2 is the family of countable unions of closed sets known as Fσ -sets.
Π^0_2 is the family of countable intersections of open sets known as Gδ -sets.
Remark. Any open set in a Polish space (or indeed in any metric space) is
a countable union of closed sets, i.e., Σ^0_1 ⊂ Σ^0_2 . An easy induction then shows
that Σ^0_α ⊂ ∆^0_β and Π^0_α ⊂ ∆^0_β for 1 ≤ α < β < ω1 .
Lemma 3. ⋃_{1≤α<ω1} Σ^0_α = ⋃_{1≤α<ω1} Π^0_α = B in any Polish space.
Proof. The first equality follows from the inclusions above. For the second
equality, first show by induction that Σ^0_α ⊂ B for all 1 ≤ α < ω1 , and then show
that ⋃_{1≤α<ω1} Σ^0_α is a σ-field containing the open sets.
A = {(m, n) ∈ N × N : ∃ i ∈ N n ∈ Umi }
Corollary 5. For each 1 ≤ α < ω1 there is a Σ^0_α -subset of N that is not Π^0_α .
Remark. This leads to the following refinement of the picture of the Borel
hierarchy of N .
Projective hierarchy
Definition. A subset of a Polish space is analytic if it is a continuous image of
N (or it is empty).
Examples. It follows from Lemma 1 that any Polish space, and thus any closed
subset of a Polish space, is analytic.
Remark. We shall often use implicitly the following observation. The spaces
N × 𝒩 = N^({0}⊔N) , 𝒩 × 𝒩 = N^(N⊔N) and 𝒩^N = N^(N×N) are homeomorphic to 𝒩 = N^N in
the obvious way.
{(k, n, x) ∈ N × N × X : (n, x) ∈ Fk }
of N × N × X. Similarly, we have
x ∈ ⋂_{k∈N} Ak ⇔ ∀ k ∈ N ∃ n ∈ N (n, x) ∈ Fk
of N N × X.
Definition. We define Σ^1_1 to be the family of analytic sets (in some Polish
space) and Π^1_1 to be the family of complements of analytic sets called coanalytic
sets. Then inductively, for 1 ≤ n < ω, we define Σ^1_{n+1} to be the family of
continuous images of Π^1_n -sets, and Π^1_{n+1} to be the family of complements of
Σ^1_{n+1} -sets. We also let ∆^1_n = Σ^1_n ∩ Π^1_n for 1 ≤ n < ω.
[Diagram: the projective hierarchy, showing the inclusions among the families ∆^1_n , Σ^1_n and Π^1_n for n = 1, 2, 3, . . .]
Definition. The collection of families Σ^1_n and Π^1_n , 1 ≤ n < ω, is called the
projective hierarchy. Members of P, the union of all these families, are the projective sets.
B = {n ∈ N : ∃ m ∈ N (m, n) ∈ F }
F = {(m, n) ∈ N × N : (p, m, n) ∉ U } .
It follows that
B = {n ∈ N : ∃ m ∈ N (p, m, n) ∉ U } = {n ∈ N : (p, n) ∈ A}
as required.
B = {n ∈ N : (n, n) ∈ A} .
Remark. We have already observed that Borel sets are both analytic and
coanalytic. So the set B constructed in the proof above is analytic non-Borel.
We will now show the converse that a set that is both analytic and coanalytic
is Borel.
Example. Let Seq be the set of all finite sequences of positive integers. Then
P Seq = {0, 1}^Seq is a Polish space in the product topology (homeomorphic to
{0, 1}^N since Seq is countable).
Given s = (m1 , . . . , mk ) and t = (n1 , . . . , nl ) in Seq, write s ≺ t if 0 ≤ k ≤ l
and mi = ni for 1 ≤ i ≤ k.
Say T ⊂ Seq is a tree if s ∈ T whenever s ≺ t and t ∈ T . Say n ∈ N is an
infinite branch of T if (n1 , . . . , nk ) ∈ T for all k ∈ N. Say T is well-founded if
T has no infinite branch.
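Illustration. The relation ≺ and the tree condition are easy to test on finite sets of sequences. The sketch below represents members of Seq as Python tuples of positive integers; the names `is_prefix`, `is_tree` and `prefix_closure` are hypothetical. A finite tree is automatically well-founded, so the example only illustrates closure under initial segments.

```python
def is_prefix(s, t):
    """s ≺ t: s is an initial segment of t (s = t is allowed)."""
    return len(s) <= len(t) and t[:len(s)] == s

def is_tree(T):
    """T contains every initial segment of each of its members."""
    return all(t[:k] in T for t in T for k in range(len(t)))

def prefix_closure(seqs):
    """The smallest tree containing the given sequences."""
    return {t[:k] for t in seqs for k in range(len(t) + 1)}

T = {(), (1,), (1, 2), (3,)}
generated = prefix_closure({(1, 2, 1), (4,)})
```

Since the definition of ≺ allows k = 0, every non-empty tree contains the empty sequence.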
Let T be the set of all trees and WFT be the set of all well-founded trees. Note
that T is a closed subset of P Seq, and thus T is also a Polish space. We show
that the subset WFT of T is coanalytic. For any T ∈ T we have
T ∉ WFT ⇔ ∃ n ∈ N ∀ k ∈ N (n1 , . . . , nk ) ∈ T .
and thus analytic. It is possible to show that WFT is not analytic, and hence
WFT is a coanalytic non-Borel set.
Lemma 11. A non-empty perfect subset of a Polish space has cardinality 2^ℵ0 .
Proof. Let A be a non-empty perfect subset of a Polish space X. Given x ∈ A
and a radius r > 0, since x is not isolated in A, there exist y, z ∈ A and a radius
s > 0 such that the closed balls Bs (y) and Bs (z) are disjoint and contained in
Br (x). (Note that we are implicitly assuming that X comes with a complete
metric defining its topology.)
Since A is not empty, we can fix a point x_∅ in A. Using the observation above, we
inductively construct points x_{ε1,...,εk} in A indexed by finite sequences ε1 , . . . , εk
in {0, 1} (where ∅ is the sequence of length zero) and radii r1 , r2 , . . . such that
the closed ball B_{r_k}(x_{ε1,...,εk}) contains the disjoint closed balls B_{r_{k+1}}(x_{ε1,...,εk,0})
and B_{r_{k+1}}(x_{ε1,...,εk,1}), and moreover rk → 0 as k → ∞.
It is easy to verify that the function ϕ : {0, 1}^N → A given by
Theorem 12. Every analytic set either has a non-empty perfect subset or is countable. It
follows that every infinite analytic set has cardinality ℵ0 or 2^ℵ0 .
Proof. For a tree T let
be the set of all infinite branches of T . Note that for T = Seq we have [T ] = N ,
and for T ∈ WFT we have [T ] = ∅. For a tree T and s ∈ Seq, let
T (s) = {t ∈ T : t ≺ s or s ≺ t} .
Now fix an analytic set A in some Polish space X. Then A = f (N ) = f ([Seq])
for some continuous function f : N → X. For a tree T let
T^(0) = Seq
T^(α) = (T^(β))′ if α = β⁺
T^(α) = ⋂_{β<α} T^(β) if α is a non-zero limit
Since Seq is countable, there exists α < ω1 such that T^(α+1) = T^(α) . Set
T = T^(α) and consider the following two cases.
If T = ∅, then
A = ⋃_{β<α} f ([T^(β)] \ [T^(β+1)])
and
f ([T^(β)] \ [T^(β+1)]) = ⋃{f ([T^(β)(s)]) : f ([T^(β)(s)]) is countable}
One can show that M is compact and f (M) is a perfect set. This is left as an
exercise.