
THE OPEN LOGIC TEXT

Debug Build

Open Logic Project

Revision: 6891b66 (master)


2024-12-01

The Open Logic Text by the Open Logic Project is licensed under a Creative Commons Attribution 4.0 International License.
Contents

I Naïve Set Theory 20

1 Sets 20
1.1 Extensionality . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.2 Subsets and Power Sets . . . . . . . . . . . . . . . . . . . . . . 21
1.3 Some Important Sets . . . . . . . . . . . . . . . . . . . . . . . 23
1.4 Unions and Intersections . . . . . . . . . . . . . . . . . . . . . 24
1.5 Pairs, Tuples, Cartesian Products . . . . . . . . . . . . . . . . 27
1.6 Russell’s Paradox . . . . . . . . . . . . . . . . . . . . . . . . . 28

2 Relations 30
2.1 Relations as Sets . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.2 Philosophical Reflections . . . . . . . . . . . . . . . . . . . . . 32
2.3 Special Properties of Relations . . . . . . . . . . . . . . . . . . 33
2.4 Equivalence Relations . . . . . . . . . . . . . . . . . . . . . . . 34
2.5 Orders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.6 Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.7 Operations on Relations . . . . . . . . . . . . . . . . . . . . . 38

3 Functions 39
3.1 Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.2 Kinds of Functions . . . . . . . . . . . . . . . . . . . . . . . . 41
3.3 Functions as Relations . . . . . . . . . . . . . . . . . . . . . . 43
3.4 Inverses of Functions . . . . . . . . . . . . . . . . . . . . . . . 44
3.5 Composition of Functions . . . . . . . . . . . . . . . . . . . . . 46
3.6 Partial Functions . . . . . . . . . . . . . . . . . . . . . . . . . 47

4 The Size of Sets 48


4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.2 Enumerations and Enumerable Sets . . . . . . . . . . . . . . . 49
4.3 Cantor’s Zig-Zag Method . . . . . . . . . . . . . . . . . . . . . 53
4.4 Pairing Functions and Codes . . . . . . . . . . . . . . . . . . . 55
4.5 An Alternative Pairing Function . . . . . . . . . . . . . . . . . 56
4.6 Non-enumerable Sets . . . . . . . . . . . . . . . . . . . . . . . 58
4.7 Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.8 Equinumerosity . . . . . . . . . . . . . . . . . . . . . . . . . . 63


4.9 Sets of Different Sizes, and Cantor’s Theorem . . . . . . . . . 64


4.10 The Notion of Size, and Schröder-Bernstein . . . . . . . . . . 66
4.11 Enumerations and Enumerable Sets . . . . . . . . . . . . . . . 66
4.12 Non-enumerable Sets . . . . . . . . . . . . . . . . . . . . . . . 68
4.13 Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

5 Arithmetization 72
5.1 From N to Z . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.2 From Z to Q . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.3 The Real Line . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.4 From Q to R . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
5.5 Some Philosophical Reflections . . . . . . . . . . . . . . . . . . 78
5.6 Ordered Rings and Fields . . . . . . . . . . . . . . . . . . . . . 80
5.7 Appendix: the Reals as Cauchy Sequences . . . . . . . . . . . 83

6 Infinite Sets 87
6.1 Hilbert’s Hotel . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.2 Dedekind Algebras . . . . . . . . . . . . . . . . . . . . . . . . 88
6.3 Arithmetical Induction . . . . . . . . . . . . . . . . . . . . . . 90
6.4 Dedekind’s “Proof” . . . . . . . . . . . . . . . . . . . . . . . . 91
6.5 Appendix: Proving Schröder-Bernstein . . . . . . . . . . . . . 93

II Propositional Logic 95

7 Syntax and Semantics 95


7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
7.2 Propositional Formulas . . . . . . . . . . . . . . . . . . . . . . 97
7.3 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
7.4 Formation Sequences . . . . . . . . . . . . . . . . . . . . . . . 100
7.5 Valuations and Satisfaction . . . . . . . . . . . . . . . . . . . . 102
7.6 Semantic Notions . . . . . . . . . . . . . . . . . . . . . . . . . 104

8 Derivation Systems 105


8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
8.2 The Sequent Calculus . . . . . . . . . . . . . . . . . . . . . . . 107
8.3 Natural Deduction . . . . . . . . . . . . . . . . . . . . . . . . 108
8.4 Tableaux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
8.5 Axiomatic Derivations . . . . . . . . . . . . . . . . . . . . . . 110

9 The Sequent Calculus 112


9.1 Rules and Derivations . . . . . . . . . . . . . . . . . . . . . . . 112
9.2 Propositional Rules . . . . . . . . . . . . . . . . . . . . . . . . 113
9.3 Structural Rules . . . . . . . . . . . . . . . . . . . . . . . . . . 114
9.4 Derivations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
9.5 Examples of Derivations . . . . . . . . . . . . . . . . . . . . . 117


9.6 Proof-Theoretic Notions . . . . . . . . . . . . . . . . . . . . . 121


9.7 Derivability and Consistency . . . . . . . . . . . . . . . . . . . 123
9.8 Derivability and the Propositional Connectives . . . . . . . . . 124
9.9 Soundness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126

10 Natural Deduction 130


10.1 Rules and Derivations . . . . . . . . . . . . . . . . . . . . . . . 130
10.2 Propositional Rules . . . . . . . . . . . . . . . . . . . . . . . . 131
10.3 Derivations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
10.4 Examples of Derivations . . . . . . . . . . . . . . . . . . . . . 134
10.5 Proof-Theoretic Notions . . . . . . . . . . . . . . . . . . . . . 139
10.6 Derivability and Consistency . . . . . . . . . . . . . . . . . . . 140
10.7 Derivability and the Propositional Connectives . . . . . . . . . 142
10.8 Soundness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143

11 Tableaux 147
11.1 Rules and Tableaux . . . . . . . . . . . . . . . . . . . . . . . . 147
11.2 Propositional Rules . . . . . . . . . . . . . . . . . . . . . . . . 148
11.3 Tableaux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
11.4 Examples of Tableaux . . . . . . . . . . . . . . . . . . . . . . . 150
11.5 Proof-Theoretic Notions . . . . . . . . . . . . . . . . . . . . . 155
11.6 Derivability and Consistency . . . . . . . . . . . . . . . . . . . 158
11.7 Derivability and the Propositional Connectives . . . . . . . . . 159
11.8 Soundness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162

12 Axiomatic Derivations 164


12.1 Rules and Derivations . . . . . . . . . . . . . . . . . . . . . . . 164
12.2 Axiom and Rules for the Propositional Connectives . . . . . . 166
12.3 Examples of Derivations . . . . . . . . . . . . . . . . . . . . . 166
12.4 Proof-Theoretic Notions . . . . . . . . . . . . . . . . . . . . . 168
12.5 The Deduction Theorem . . . . . . . . . . . . . . . . . . . . . 170
12.6 Derivability and Consistency . . . . . . . . . . . . . . . . . . . 171
12.7 Derivability and the Propositional Connectives . . . . . . . . . 172
12.8 Soundness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173

13 The Completeness Theorem 174


13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
13.2 Outline of the Proof . . . . . . . . . . . . . . . . . . . . . . . . 175
13.3 Complete Consistent Sets of Sentences . . . . . . . . . . . . . 176
13.4 Lindenbaum’s Lemma . . . . . . . . . . . . . . . . . . . . . . . 178
13.5 Construction of a Model . . . . . . . . . . . . . . . . . . . . . 179
13.6 The Completeness Theorem . . . . . . . . . . . . . . . . . . . 179
13.7 The Compactness Theorem . . . . . . . . . . . . . . . . . . . . 180
13.8 A Direct Proof of the Compactness Theorem . . . . . . . . . . 181


III First-order Logic 183

14 Introduction to First-Order Logic 183


14.1 First-Order Logic . . . . . . . . . . . . . . . . . . . . . . . . . 183
14.2 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
14.3 Formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
14.4 Satisfaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
14.5 Sentences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
14.6 Semantic Notions . . . . . . . . . . . . . . . . . . . . . . . . . 189
14.7 Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
14.8 Models and Theories . . . . . . . . . . . . . . . . . . . . . . . 190
14.9 Soundness and Completeness . . . . . . . . . . . . . . . . . . . 191

15 Syntax of First-Order Logic 192


15.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
15.2 First-Order Languages . . . . . . . . . . . . . . . . . . . . . . 193
15.3 Terms and Formulas . . . . . . . . . . . . . . . . . . . . . . . 195
15.4 Unique Readability . . . . . . . . . . . . . . . . . . . . . . . . 197
15.5 Main operator of a Formula . . . . . . . . . . . . . . . . . . . 200
15.6 Subformulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
15.7 Formation Sequences . . . . . . . . . . . . . . . . . . . . . . . 203
15.8 Free Variables and Sentences . . . . . . . . . . . . . . . . . . . 206
15.9 Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207

16 Semantics of First-Order Logic 209


16.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
16.2 Structures for First-order Languages . . . . . . . . . . . . . . 210
16.3 Covered Structures for First-order Languages . . . . . . . . . 211
16.4 Satisfaction of a Formula in a Structure . . . . . . . . . . . . . 212
16.5 Variable Assignments . . . . . . . . . . . . . . . . . . . . . . . 217
16.6 Extensionality . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
16.7 Semantic Notions . . . . . . . . . . . . . . . . . . . . . . . . . 223

17 Theories and Their Models 225


17.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
17.2 Expressing Properties of Structures . . . . . . . . . . . . . . . 227
17.3 Examples of First-Order Theories . . . . . . . . . . . . . . . . 227
17.4 Expressing Relations in a Structure . . . . . . . . . . . . . . . 230
17.5 The Theory of Sets . . . . . . . . . . . . . . . . . . . . . . . . 232
17.6 Expressing the Size of Structures . . . . . . . . . . . . . . . . 234

18 Derivation Systems 235


18.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
18.2 The Sequent Calculus . . . . . . . . . . . . . . . . . . . . . . . 237
18.3 Natural Deduction . . . . . . . . . . . . . . . . . . . . . . . . 238
18.4 Tableaux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239


18.5 Axiomatic Derivations . . . . . . . . . . . . . . . . . . . . . . 241

19 The Sequent Calculus 242


19.1 Rules and Derivations . . . . . . . . . . . . . . . . . . . . . . . 242
19.2 Propositional Rules . . . . . . . . . . . . . . . . . . . . . . . . 243
19.3 Quantifier Rules . . . . . . . . . . . . . . . . . . . . . . . . . . 244
19.4 Structural Rules . . . . . . . . . . . . . . . . . . . . . . . . . . 245
19.5 Derivations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
19.6 Examples of Derivations . . . . . . . . . . . . . . . . . . . . . 248
19.7 Derivations with Quantifiers . . . . . . . . . . . . . . . . . . . 253
19.8 Proof-Theoretic Notions . . . . . . . . . . . . . . . . . . . . . 254
19.9 Derivability and Consistency . . . . . . . . . . . . . . . . . . . 256
19.10 Derivability and the Propositional Connectives . . . . . . . . . 258
19.11 Derivability and the Quantifiers . . . . . . . . . . . . . . . . . 259
19.12 Soundness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
19.13 Derivations with Identity predicate . . . . . . . . . . . . . . . 265
19.14 Soundness with Identity predicate . . . . . . . . . . . . . . . . 265

20 Natural Deduction 266


20.1 Rules and Derivations . . . . . . . . . . . . . . . . . . . . . . . 266
20.2 Propositional Rules . . . . . . . . . . . . . . . . . . . . . . . . 267
20.3 Quantifier Rules . . . . . . . . . . . . . . . . . . . . . . . . . . 268
20.4 Derivations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
20.5 Examples of Derivations . . . . . . . . . . . . . . . . . . . . . 271
20.6 Derivations with Quantifiers . . . . . . . . . . . . . . . . . . . 276
20.7 Proof-Theoretic Notions . . . . . . . . . . . . . . . . . . . . . 280
20.8 Derivability and Consistency . . . . . . . . . . . . . . . . . . . 282
20.9 Derivability and the Propositional Connectives . . . . . . . . . 283
20.10 Derivability and the Quantifiers . . . . . . . . . . . . . . . . . 285
20.11 Soundness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
20.12 Derivations with Identity predicate . . . . . . . . . . . . . . . 289
20.13 Soundness with Identity predicate . . . . . . . . . . . . . . . . 291

21 Tableaux 292
21.1 Rules and Tableaux . . . . . . . . . . . . . . . . . . . . . . . . 292
21.2 Propositional Rules . . . . . . . . . . . . . . . . . . . . . . . . 293
21.3 Quantifier Rules . . . . . . . . . . . . . . . . . . . . . . . . . . 294
21.4 Tableaux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
21.5 Examples of Tableaux . . . . . . . . . . . . . . . . . . . . . . . 296
21.6 Tableaux with Quantifiers . . . . . . . . . . . . . . . . . . . . 301
21.7 Proof-Theoretic Notions . . . . . . . . . . . . . . . . . . . . . 305
21.8 Derivability and Consistency . . . . . . . . . . . . . . . . . . . 308
21.9 Derivability and the Propositional Connectives . . . . . . . . . 309
21.10 Derivability and the Quantifiers . . . . . . . . . . . . . . . . . 312
21.11 Soundness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
21.12 Tableaux with Identity predicate . . . . . . . . . . . . . . . . 316


21.13 Soundness with Identity predicate . . . . . . . . . . . . . . . . 317

22 Axiomatic Derivations 318


22.1 Rules and Derivations . . . . . . . . . . . . . . . . . . . . . . . 318
22.2 Axiom and Rules for the Propositional Connectives . . . . . . 320
22.3 Axioms and Rules for Quantifiers . . . . . . . . . . . . . . . . 320
22.4 Examples of Derivations . . . . . . . . . . . . . . . . . . . . . 321
22.5 Derivations with Quantifiers . . . . . . . . . . . . . . . . . . . 323
22.6 Proof-Theoretic Notions . . . . . . . . . . . . . . . . . . . . . 323
22.7 The Deduction Theorem . . . . . . . . . . . . . . . . . . . . . 325
22.8 The Deduction Theorem with Quantifiers . . . . . . . . . . . . 326
22.9 Derivability and Consistency . . . . . . . . . . . . . . . . . . . 328
22.10 Derivability and the Propositional Connectives . . . . . . . . . 328
22.11 Derivability and the Quantifiers . . . . . . . . . . . . . . . . . 329
22.12 Soundness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
22.13 Derivations with Identity predicate . . . . . . . . . . . . . . . 331

23 The Completeness Theorem 332


23.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
23.2 Outline of the Proof . . . . . . . . . . . . . . . . . . . . . . . . 333
23.3 Complete Consistent Sets of Sentences . . . . . . . . . . . . . 335
23.4 Henkin Expansion . . . . . . . . . . . . . . . . . . . . . . . . . 337
23.5 Lindenbaum’s Lemma . . . . . . . . . . . . . . . . . . . . . . . 339
23.6 Construction of a Model . . . . . . . . . . . . . . . . . . . . . 340
23.7 Identity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
23.8 The Completeness Theorem . . . . . . . . . . . . . . . . . . . 345
23.9 The Compactness Theorem . . . . . . . . . . . . . . . . . . . . 346
23.10 A Direct Proof of the Compactness Theorem . . . . . . . . . . 348
23.11 The Löwenheim–Skolem Theorem . . . . . . . . . . . . . . . . 349

24 Beyond First-order Logic 350


24.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
24.2 Many-Sorted Logic . . . . . . . . . . . . . . . . . . . . . . . . 351
24.3 Second-Order logic . . . . . . . . . . . . . . . . . . . . . . . . 353
24.4 Higher-Order logic . . . . . . . . . . . . . . . . . . . . . . . . 356
24.5 Intuitionistic Logic . . . . . . . . . . . . . . . . . . . . . . . . 359
24.6 Modal Logics . . . . . . . . . . . . . . . . . . . . . . . . . . . 362
24.7 Other Logics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364

IV Model Theory 365

25 Basics of Model Theory 365


25.1 Reducts and Expansions . . . . . . . . . . . . . . . . . . . . . 365
25.2 Substructures . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
25.3 Overspill . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367


25.4 Isomorphic Structures . . . . . . . . . . . . . . . . . . . . . . . 367


25.5 The Theory of a Structure . . . . . . . . . . . . . . . . . . . . 369
25.6 Partial Isomorphisms . . . . . . . . . . . . . . . . . . . . . . . 370
25.7 Dense Linear Orders . . . . . . . . . . . . . . . . . . . . . . . 372

26 Models of Arithmetic 374


26.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
26.2 Standard Models of Arithmetic . . . . . . . . . . . . . . . . . 375
26.3 Non-Standard Models . . . . . . . . . . . . . . . . . . . . . . . 377
26.4 Models of Q . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
26.5 Models of PA . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
26.6 Computable Models of Arithmetic . . . . . . . . . . . . . . . . 384

27 The Interpolation Theorem 386


27.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 386
27.2 Separation of Sentences . . . . . . . . . . . . . . . . . . . . . . 386
27.3 Craig’s Interpolation Theorem . . . . . . . . . . . . . . . . . . 388
27.4 The Definability Theorem . . . . . . . . . . . . . . . . . . . . 390

28 Lindström’s Theorem 392


28.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392
28.2 Abstract Logics . . . . . . . . . . . . . . . . . . . . . . . . . . 392
28.3 Compactness and Löwenheim–Skolem Properties . . . . . . . 394
28.4 Lindström’s Theorem . . . . . . . . . . . . . . . . . . . . . . . 396

V Computability 398

29 Recursive Functions 398


29.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398
29.2 Primitive Recursion . . . . . . . . . . . . . . . . . . . . . . . . 399
29.3 Composition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401
29.4 Primitive Recursion Functions . . . . . . . . . . . . . . . . . . 403
29.5 Primitive Recursion Notations . . . . . . . . . . . . . . . . . . 405
29.6 Primitive Recursive Functions are Computable . . . . . . . . . 406
29.7 Examples of Primitive Recursive Functions . . . . . . . . . . . 407
29.8 Primitive Recursive Relations . . . . . . . . . . . . . . . . . . 410
29.9 Bounded Minimization . . . . . . . . . . . . . . . . . . . . . . 412
29.10 Primes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
29.11 Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
29.12 Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
29.13 Other Recursions . . . . . . . . . . . . . . . . . . . . . . . . . 418
29.14 Non-Primitive Recursive Functions . . . . . . . . . . . . . . . 419
29.15 Partial Recursive Functions . . . . . . . . . . . . . . . . . . . 420
29.16 The Normal Form Theorem . . . . . . . . . . . . . . . . . . . 422
29.17 The Halting Problem . . . . . . . . . . . . . . . . . . . . . . . 423


29.18 General Recursive Functions . . . . . . . . . . . . . . . . . . . 424

30 Computability Theory 425


30.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425
30.2 Coding Computations . . . . . . . . . . . . . . . . . . . . . . . 426
30.3 The Normal Form Theorem . . . . . . . . . . . . . . . . . . . 427
30.4 The s-m-n Theorem . . . . . . . . . . . . . . . . . . . . . . . . 428
30.5 The Universal Partial Computable Function . . . . . . . . . . 429
30.6 No Universal Computable Function . . . . . . . . . . . . . . . 429
30.7 The Halting Problem . . . . . . . . . . . . . . . . . . . . . . . 430
30.8 Comparison with Russell’s Paradox . . . . . . . . . . . . . . . 431
30.9 Computable Sets . . . . . . . . . . . . . . . . . . . . . . . . . 432
30.10 Computably Enumerable Sets . . . . . . . . . . . . . . . . . . 433
30.11 Definitions of C.E. Sets . . . . . . . . . . . . . . . . . . . . . 433
30.12 Union and Intersection of C.E. Sets . . . . . . . . . . . . . . . 436
30.13 Computably Enumerable Sets not Closed under Complement . 437
30.14 Reducibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438
30.15 Properties of Reducibility . . . . . . . . . . . . . . . . . . . . 439
30.16 Complete Computably Enumerable Sets . . . . . . . . . . . . 440
30.17 An Example of Reducibility . . . . . . . . . . . . . . . . . . . 441
30.18 Totality is Undecidable . . . . . . . . . . . . . . . . . . . . . . 442
30.19 Rice’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 442
30.20 The Fixed-Point Theorem . . . . . . . . . . . . . . . . . . . . 444
30.21 Applying the Fixed-Point Theorem . . . . . . . . . . . . . . . 448
30.22 Defining Functions using Self-Reference . . . . . . . . . . . . . 449
30.23 Minimization with Lambda Terms . . . . . . . . . . . . . . . . 449

VI Turing Machines 451

31 Turing Machine Computations 451


31.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
31.2 Representing Turing Machines . . . . . . . . . . . . . . . . . . 453
31.3 Turing Machines . . . . . . . . . . . . . . . . . . . . . . . . . . 457
31.4 Configurations and Computations . . . . . . . . . . . . . . . . 458
31.5 Unary Representation of Numbers . . . . . . . . . . . . . . . . 460
31.6 Halting States . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
31.7 Disciplined Machines . . . . . . . . . . . . . . . . . . . . . . . 464
31.8 Combining Turing Machines . . . . . . . . . . . . . . . . . . . 466
31.9 Variants of Turing Machines . . . . . . . . . . . . . . . . . . . 468
31.10 The Church–Turing Thesis . . . . . . . . . . . . . . . . . . . . 469

32 Undecidability 470
32.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471
32.2 Enumerating Turing Machines . . . . . . . . . . . . . . . . . . 472
32.3 Universal Turing Machines . . . . . . . . . . . . . . . . . . . . 474


32.4 The Halting Problem . . . . . . . . . . . . . . . . . . . . . . . 477


32.5 The Decision Problem . . . . . . . . . . . . . . . . . . . . . . 479
32.6 Representing Turing Machines . . . . . . . . . . . . . . . . . . 479
32.7 Verifying the Representation . . . . . . . . . . . . . . . . . . . 482
32.8 The Decision Problem is Unsolvable . . . . . . . . . . . . . . . 487
32.9 Trakhtenbrot’s Theorem . . . . . . . . . . . . . . . . . . . . . 488

VII Incompleteness 492

33 Introduction to Incompleteness 492


33.1 Historical Background . . . . . . . . . . . . . . . . . . . . . . 492
33.2 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496
33.3 Overview of Incompleteness Results . . . . . . . . . . . . . . . 501
33.4 Undecidability and Incompleteness . . . . . . . . . . . . . . . 502

34 Arithmetization of Syntax 504


34.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504
34.2 Coding Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . 506
34.3 Coding Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . 507
34.4 Coding Formulas . . . . . . . . . . . . . . . . . . . . . . . . . 509
34.5 Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 510
34.6 Derivations in LK . . . . . . . . . . . . . . . . . . . . . . . . . 511
34.7 Derivations in Natural Deduction . . . . . . . . . . . . . . . . 514
34.8 Axiomatic Derivations . . . . . . . . . . . . . . . . . . . . . . 519

35 Representability in Q 522
35.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 523
35.2 Functions Representable in Q are Computable . . . . . . . . . 525
35.3 The Beta Function Lemma . . . . . . . . . . . . . . . . . . . . 526
35.4 Simulating Primitive Recursion . . . . . . . . . . . . . . . . . 529
35.5 Basic Functions are Representable in Q . . . . . . . . . . . . . 530
35.6 Composition is Representable in Q . . . . . . . . . . . . . . . 532
35.7 Regular Minimization is Representable in Q . . . . . . . . . . 534
35.8 Computable Functions are Representable in Q . . . . . . . . . 537
35.9 Representing Relations . . . . . . . . . . . . . . . . . . . . . . 538
35.10 Undecidability . . . . . . . . . . . . . . . . . . . . . . . . . . . 538

36 Theories and Computability 539


36.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 540
36.2 Q is C.E.-Complete . . . . . . . . . . . . . . . . . . . . . . . . 540
36.3 ω-Consistent Extensions of Q are Undecidable . . . . . . . . . 541
36.4 Consistent Extensions of Q are Undecidable . . . . . . . . . . 542
36.5 Axiomatizable Theories . . . . . . . . . . . . . . . . . . . . . . 543
36.6 Axiomatizable Complete Theories are Decidable . . . . . . . . 543
36.7 Q has no Complete, Consistent, Axiomatizable Extensions . . 543


36.8 Sentences Provable and Refutable in Q are Computably Inseparable . . . 544
36.9 Theories Consistent with Q are Undecidable . . . . . . . . . . 545
36.10 Theories in which Q is Interpretable are Undecidable . . . . . 545

37 Incompleteness and Provability 546


37.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547
37.2 The Fixed-Point Lemma . . . . . . . . . . . . . . . . . . . . . 548
37.3 The First Incompleteness Theorem . . . . . . . . . . . . . . . 550
37.4 Rosser’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . 552
37.5 Comparison with Gödel’s Original Paper . . . . . . . . . . . . 553
37.6 The Derivability Conditions for PA . . . . . . . . . . . . . . . 554
37.7 The Second Incompleteness Theorem . . . . . . . . . . . . . . 555
37.8 Löb’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . 557
37.9 The Undefinability of Truth . . . . . . . . . . . . . . . . . . . 560

VIIISecond-order Logic 562

38 Syntax and Semantics 562


38.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 562
38.2 Terms and Formulas . . . . . . . . . . . . . . . . . . . . . . . 563
38.3 Satisfaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 564
38.4 Semantic Notions . . . . . . . . . . . . . . . . . . . . . . . . . 567
38.5 Expressive Power . . . . . . . . . . . . . . . . . . . . . . . . . 567
38.6 Describing Infinite and Enumerable Domains . . . . . . . . . . 569

39 Metatheory of Second-order Logic 570


39.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 571
39.2 Second-order Arithmetic . . . . . . . . . . . . . . . . . . . . . 571
39.3 Second-order Logic is not Axiomatizable . . . . . . . . . . . . 573
39.4 Second-order Logic is not Compact . . . . . . . . . . . . . . . 574
39.5 The Löwenheim–Skolem Theorem Fails for Second-order Logic 574

40 Second-order Logic and Set Theory 575


40.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 575
40.2 Comparing Sets . . . . . . . . . . . . . . . . . . . . . . . . . . 576
40.3 Cardinalities of Sets . . . . . . . . . . . . . . . . . . . . . . . . 577
40.4 The Power of the Continuum . . . . . . . . . . . . . . . . . . . 578

IX The Lambda Calculus 581

41 Introduction 581
41.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 581
41.2 The Syntax of the Lambda Calculus . . . . . . . . . . . . . . . 582
41.3 Reduction of Lambda Terms . . . . . . . . . . . . . . . . . . . 584


41.4 The Church–Rosser Property . . . . . . . . . . . . . . . . . . . 585


41.5 Currying . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 585
41.6 λ-Definable Arithmetical Functions . . . . . . . . . . . . . . . 586
41.7 λ-Definable Functions are Computable . . . . . . . . . . . . . 587
41.8 Computable Functions are λ-Definable . . . . . . . . . . . . . 587
41.9 The Basic Primitive Recursive Functions are λ-Definable . . . 588
41.10 The λ-Definable Functions are Closed under Composition . . . 588
41.11 λ-Definable Functions are Closed under Primitive Recursion . 589
41.12 Fixed-Point Combinators . . . . . . . . . . . . . . . . . . . . . 591
41.13 The λ-Definable Functions are Closed under Minimization . . 592

42 Syntax 593
42.1 Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 593
42.2 Unique Readability . . . . . . . . . . . . . . . . . . . . . . . . 594
42.3 Abbreviated Syntax . . . . . . . . . . . . . . . . . . . . . . . . 595
42.4 Free Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 596
42.5 Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 597
42.6 α-Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 600
42.7 The De Bruijn Index . . . . . . . . . . . . . . . . . . . . . . . 604
42.8 Terms as α-Equivalence Classes . . . . . . . . . . . . . . . . . 605
42.9 β-reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 606
42.10 η-conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 608

43 The Church–Rosser Property 609


43.1 Definition and Properties . . . . . . . . . . . . . . . . . . . . . 610
43.2 Parallel β-reduction . . . . . . . . . . . . . . . . . . . . . . . . 611
43.3 β-reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613
43.4 Parallel βη-reduction . . . . . . . . . . . . . . . . . . . . . . . 614
43.5 βη-reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 616

44 Lambda Definability 617


44.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617
44.2 λ-Definable Arithmetical Functions . . . . . . . . . . . . . . . 618
44.3 Pairs and Predecessor . . . . . . . . . . . . . . . . . . . . . . . 621
44.4 Truth Values and Relations . . . . . . . . . . . . . . . . . . . 621
44.5 Primitive Recursive Functions are λ-Definable . . . . . . . . . 623
44.6 Fixpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
44.7 Minimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628
44.8 Partial Recursive Functions are λ-Definable . . . . . . . . . . 629
44.9 λ-Definable Functions are Recursive . . . . . . . . . . . . . . . 629

X Many-valued Logic 631

45 Syntax and Semantics 631


45.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 631


45.2 Languages and Connectives . . . . . . . . . . . . . . . . . . . 632


45.3 Formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633
45.4 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633
45.5 Valuations and Satisfaction . . . . . . . . . . . . . . . . . . . . 634
45.6 Semantic Notions . . . . . . . . . . . . . . . . . . . . . . . . . 635
45.7 Many-valued logics as sublogics of C . . . . . . . . . . . . . . 635

46 Three-valued Logics 637


46.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 637
46.2 Łukasiewicz logic . . . . . . . . . . . . . . . . . . . . . . . . . 637
46.3 Kleene logics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 641
46.4 Gödel logics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 643
46.5 Designating not just T . . . . . . . . . . . . . . . . . . . . . . 645

47 Infinite-valued Logics 648


47.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 648
47.2 Łukasiewicz logic . . . . . . . . . . . . . . . . . . . . . . . . . 649
47.3 Gödel logics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 650

48 Sequent Calculus 651


48.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 652
48.2 Rules and Derivations . . . . . . . . . . . . . . . . . . . . . . . 653
48.3 Structural Rules . . . . . . . . . . . . . . . . . . . . . . . . . . 654
48.4 Propositional Rules for Selected Logics . . . . . . . . . . . . . 654

XI Normal Modal Logics 658

49 Syntax and Semantics 658


49.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 658
49.2 The Language of Basic Modal Logic . . . . . . . . . . . . . . . 660
49.3 Simultaneous Substitution . . . . . . . . . . . . . . . . . . . . 661
49.4 Relational Models . . . . . . . . . . . . . . . . . . . . . . . . . 662
49.5 Truth at a World . . . . . . . . . . . . . . . . . . . . . . . . . 663
49.6 Truth in a Model . . . . . . . . . . . . . . . . . . . . . . . . . 665
49.7 Validity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
49.8 Tautological Instances . . . . . . . . . . . . . . . . . . . . . . 667
49.9 Schemas and Validity . . . . . . . . . . . . . . . . . . . . . . . 669
49.10 Entailment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 671

50 Frame Definability 672


50.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673
50.2 Properties of Accessibility Relations . . . . . . . . . . . . . . . 673
50.3 Frames . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676
50.4 Frame Definability . . . . . . . . . . . . . . . . . . . . . . . . 676
50.5 First-order Definability . . . . . . . . . . . . . . . . . . . . . . 679


50.6 Equivalence Relations and S5 . . . . . . . . . . . . . . . . . . 680


50.7 Second-order Definability . . . . . . . . . . . . . . . . . . . . . 682

51 Axiomatic Derivations 684


51.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 684
51.2 Normal Modal Logics . . . . . . . . . . . . . . . . . . . . . . . 686
51.3 Derivations and Modal Systems . . . . . . . . . . . . . . . . . 687
51.4 Proofs in K . . . . . . . . . . . . . . . . . . . . . . . . . . . . 689
51.5 Derived Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . 691
51.6 More Proofs in K . . . . . . . . . . . . . . . . . . . . . . . . . 693
51.7 Dual Formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . 694
51.8 Proofs in Modal Systems . . . . . . . . . . . . . . . . . . . . . 694
51.9 Soundness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 696
51.10 Showing Systems are Distinct . . . . . . . . . . . . . . . . . . 697
51.11 Derivability from a Set of Formulas . . . . . . . . . . . . . . . 698
51.12 Properties of Derivability . . . . . . . . . . . . . . . . . . . . . 699
51.13 Consistency . . . . . . . . . . . . . . . . . . . . . . . . . . . . 699

52 Completeness and Canonical Models 700


52.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 700
52.2 Complete Σ-Consistent Sets . . . . . . . . . . . . . . . . . . . 701
52.3 Lindenbaum’s Lemma . . . . . . . . . . . . . . . . . . . . . . . 703
52.4 Modalities and Complete Consistent Sets . . . . . . . . . . . . 704
52.5 Canonical Models . . . . . . . . . . . . . . . . . . . . . . . . . 706
52.6 The Truth Lemma . . . . . . . . . . . . . . . . . . . . . . . . 706
52.7 Determination and Completeness for K . . . . . . . . . . . . . 708
52.8 Frame Completeness . . . . . . . . . . . . . . . . . . . . . . . 708

53 Filtrations and Decidability 711


53.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 712
53.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . 714
53.3 Filtrations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 715
53.4 Examples of Filtrations . . . . . . . . . . . . . . . . . . . . . . 717
53.5 Filtrations are Finite . . . . . . . . . . . . . . . . . . . . . . . 719
53.6 K and S5 have the Finite Model Property . . . . . . . . . . . 720
53.7 S5 is Decidable . . . . . . . . . . . . . . . . . . . . . . . . . . 721
53.8 Filtrations and Properties of Accessibility . . . . . . . . . . . 721
53.9 Filtrations of Euclidean Models . . . . . . . . . . . . . . . . . 723

54 Modal Tableaux 724


54.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 724
54.2 Rules for K . . . . . . . . . . . . . . . . . . . . . . . . . . . . 725
54.3 Tableaux for K . . . . . . . . . . . . . . . . . . . . . . . . . . 727
54.4 Soundness for K . . . . . . . . . . . . . . . . . . . . . . . . . . 728
54.5 Rules for Other Accessibility Relations . . . . . . . . . . . . . 731
54.6 Soundness for Additional Rules . . . . . . . . . . . . . . . . . 733


54.7 Simple Tableaux for S5 . . . . . . . . . . . . . . . . . . . . . . 735


54.8 Completeness for K . . . . . . . . . . . . . . . . . . . . . . . . 736
54.9 Countermodels from Tableaux . . . . . . . . . . . . . . . . . . 739

XII Intuitionistic Logic 741

55 Introduction 741
55.1 Constructive Reasoning . . . . . . . . . . . . . . . . . . . . . . 741
55.2 Syntax of Intuitionistic Logic . . . . . . . . . . . . . . . . . . . 743
55.3 The Brouwer–Heyting–Kolmogorov Interpretation . . . . . . . 744
55.4 Natural Deduction . . . . . . . . . . . . . . . . . . . . . . . . 746
55.5 Axiomatic Derivations . . . . . . . . . . . . . . . . . . . . . . 749

56 Semantics 750
56.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 751
56.2 Relational models . . . . . . . . . . . . . . . . . . . . . . . . . 752
56.3 Semantic Notions . . . . . . . . . . . . . . . . . . . . . . . . . 753
56.4 Topological Semantics . . . . . . . . . . . . . . . . . . . . . . 754

57 Soundness and Completeness 755


57.1 Soundness of Axiomatic Derivations . . . . . . . . . . . . . . . 755
57.2 Soundness of Natural Deduction . . . . . . . . . . . . . . . . . 756
57.3 Lindenbaum’s Lemma . . . . . . . . . . . . . . . . . . . . . . . 758
57.4 The Canonical Model . . . . . . . . . . . . . . . . . . . . . . . 760
57.5 The Truth Lemma . . . . . . . . . . . . . . . . . . . . . . . . 761
57.6 The Completeness Theorem . . . . . . . . . . . . . . . . . . . 761
57.7 Decidability . . . . . . . . . . . . . . . . . . . . . . . . . . . . 762

58 Propositions as Types 763


58.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 763
58.2 Sequent Natural Deduction . . . . . . . . . . . . . . . . . . . . 765
58.3 Proof Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 766
58.4 Converting Derivations to Proof Terms . . . . . . . . . . . . . 767
58.5 Recovering Derivations from Proof Terms . . . . . . . . . . . . 770
58.6 Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 772
58.7 Normalization . . . . . . . . . . . . . . . . . . . . . . . . . . . 774

XIIICounterfactuals 777

59 Introduction 777
59.1 The Material Conditional . . . . . . . . . . . . . . . . . . . . . 777
59.2 Paradoxes of the Material Conditional . . . . . . . . . . . . . 779
59.3 The Strict Conditional . . . . . . . . . . . . . . . . . . . . . . 779
59.4 Counterfactuals . . . . . . . . . . . . . . . . . . . . . . . . . . 782


60 Minimal Change Semantics 783


60.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 783
60.2 Sphere Models . . . . . . . . . . . . . . . . . . . . . . . . . . . 784
60.3 Truth and Falsity of Counterfactuals . . . . . . . . . . . . . . 786
60.4 Antecedent Strengthening . . . . . . . . . . . . . . . . . . . . 788
60.5 Transitivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 789
60.6 Contraposition . . . . . . . . . . . . . . . . . . . . . . . . . . . 790

XIVSet Theory 792

61 The Iterative Conception 792


61.1 Extensionality . . . . . . . . . . . . . . . . . . . . . . . . . . . 792
61.2 Russell’s Paradox (again) . . . . . . . . . . . . . . . . . . . . . 792
61.3 Predicative and Impredicative . . . . . . . . . . . . . . . . . . 794
61.4 The Cumulative-Iterative Approach . . . . . . . . . . . . . . . 795
61.5 Urelements or Not? . . . . . . . . . . . . . . . . . . . . . . . . 797
61.6 Appendix: Frege’s Basic Law V . . . . . . . . . . . . . . . . . 798

62 Steps towards Z 799


62.1 The Story in More Detail . . . . . . . . . . . . . . . . . . . . . 799
62.2 Separation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 800
62.3 Union . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 801
62.4 Pairs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 802
62.5 Powersets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 803
62.6 Infinity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 803
62.7 Z− : a Milestone . . . . . . . . . . . . . . . . . . . . . . . . . . 805
62.8 Selecting our Natural Numbers . . . . . . . . . . . . . . . . . 806
62.9 Appendix: Closure, Comprehension, and Intersection . . . . . 807

63 Ordinals 808
63.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 808
63.2 The General Idea of an Ordinal . . . . . . . . . . . . . . . . . 808
63.3 Well-Orderings . . . . . . . . . . . . . . . . . . . . . . . . . . . 809
63.4 Order-Isomorphisms . . . . . . . . . . . . . . . . . . . . . . . . 810
63.5 Von Neumann’s Construction . . . . . . . . . . . . . . . . . . 812
63.6 Basic Properties of the Ordinals . . . . . . . . . . . . . . . . . 813
63.7 Replacement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 816
63.8 ZF− : a milestone . . . . . . . . . . . . . . . . . . . . . . . . . 817
63.9 Ordinals as Order-Types . . . . . . . . . . . . . . . . . . . . . 817
63.10 Successor and Limit Ordinals . . . . . . . . . . . . . . . . . . 818

64 Stages and Ranks 820


64.1 Defining the Stages as the Vα s . . . . . . . . . . . . . . . . . . 820
64.2 The Transfinite Recursion Theorem(s) . . . . . . . . . . . . . 821
64.3 Basic Properties of Stages . . . . . . . . . . . . . . . . . . . . 823


64.4 Foundation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 824


64.5 Z and ZF: A Milestone . . . . . . . . . . . . . . . . . . . . . . 825
64.6 Rank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 826

65 Replacement 828
65.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 828
65.2 The Strength of Replacement . . . . . . . . . . . . . . . . . . 828
65.3 Extrinsic Considerations . . . . . . . . . . . . . . . . . . . . . 829
65.4 Limitation-of-size . . . . . . . . . . . . . . . . . . . . . . . . . 831
65.5 Replacement and “Absolute Infinity” . . . . . . . . . . . . . . 832
65.6 Replacement and Reflection . . . . . . . . . . . . . . . . . . . 834
65.7 Appendix: Results surrounding Replacement . . . . . . . . . . 834
65.8 Appendix: Finite axiomatizability . . . . . . . . . . . . . . . . 837

66 Ordinal Arithmetic 839


66.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 839
66.2 Ordinal Addition . . . . . . . . . . . . . . . . . . . . . . . . . 839
66.3 Using Ordinal Addition . . . . . . . . . . . . . . . . . . . . . . 842
66.4 Ordinal Multiplication . . . . . . . . . . . . . . . . . . . . . . 844
66.5 Ordinal Exponentiation . . . . . . . . . . . . . . . . . . . . . . 845

67 Cardinals 846
67.1 Cantor’s Principle . . . . . . . . . . . . . . . . . . . . . . . . . 846
67.2 Cardinals as Ordinals . . . . . . . . . . . . . . . . . . . . . . . 847
67.3 ZFC: A Milestone . . . . . . . . . . . . . . . . . . . . . . . . 849
67.4 Finite, Enumerable, Non-enumerable . . . . . . . . . . . . . . 849
67.5 Appendix: Hume’s Principle . . . . . . . . . . . . . . . . . . . 851

68 Cardinal Arithmetic 853


68.1 Defining the Basic Operations . . . . . . . . . . . . . . . . . . 853
68.2 Simplifying Addition and Multiplication . . . . . . . . . . . . 855
68.3 Some Simplifications . . . . . . . . . . . . . . . . . . . . . . . 857
68.4 The Continuum Hypothesis . . . . . . . . . . . . . . . . . . . 858
68.5 ℵ-Fixed Points . . . . . . . . . . . . . . . . . . . . . . . . . . . 860

69 Choice 862
69.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 862
69.2 The Tarski–Scott Trick . . . . . . . . . . . . . . . . . . . . . . 862
69.3 Comparability and Hartogs’ Lemma . . . . . . . . . . . . . . . 863
69.4 The Well-Ordering Problem . . . . . . . . . . . . . . . . . . . 865
69.5 Countable Choice . . . . . . . . . . . . . . . . . . . . . . . . . 866
69.6 Intrinsic Considerations about Choice . . . . . . . . . . . . . . 868
69.7 The Banach–Tarski Paradox . . . . . . . . . . . . . . . . . . . 869
69.8 Appendix: Vitali’s Paradox . . . . . . . . . . . . . . . . . . . 871


XV Methods 875

70 Proofs 875
70.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 875
70.2 Starting a Proof . . . . . . . . . . . . . . . . . . . . . . . . . . 876
70.3 Using Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 877
70.4 Inference Patterns . . . . . . . . . . . . . . . . . . . . . . . . . 878
70.5 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 884
70.6 Another Example . . . . . . . . . . . . . . . . . . . . . . . . . 887
70.7 Proof by Contradiction . . . . . . . . . . . . . . . . . . . . . . 889
70.8 Reading Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . 892
70.9 I Can’t Do It! . . . . . . . . . . . . . . . . . . . . . . . . . . . 894
70.10 Other Resources . . . . . . . . . . . . . . . . . . . . . . . . . . 895

71 Induction 895
71.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 895
71.2 Induction on N . . . . . . . . . . . . . . . . . . . . . . . . . . 896
71.3 Strong Induction . . . . . . . . . . . . . . . . . . . . . . . . . 898
71.4 Inductive Definitions . . . . . . . . . . . . . . . . . . . . . . . 899
71.5 Structural Induction . . . . . . . . . . . . . . . . . . . . . . . 902
71.6 Relations and Functions . . . . . . . . . . . . . . . . . . . . . 903

XVIHistory 906

72 Biographies 906
72.1 Georg Cantor . . . . . . . . . . . . . . . . . . . . . . . . . . . 906
72.2 Alonzo Church . . . . . . . . . . . . . . . . . . . . . . . . . . . 907
72.3 Gerhard Gentzen . . . . . . . . . . . . . . . . . . . . . . . . . 908
72.4 Kurt Gödel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 909
72.5 Emmy Noether . . . . . . . . . . . . . . . . . . . . . . . . . . 910
72.6 Rózsa Péter . . . . . . . . . . . . . . . . . . . . . . . . . . . . 912
72.7 Julia Robinson . . . . . . . . . . . . . . . . . . . . . . . . . . . 913
72.8 Bertrand Russell . . . . . . . . . . . . . . . . . . . . . . . . . . 915
72.9 Alfred Tarski . . . . . . . . . . . . . . . . . . . . . . . . . . . . 916
72.10 Alan Turing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 917
72.11 Ernst Zermelo . . . . . . . . . . . . . . . . . . . . . . . . . . . 918

73 History and Mythology of Set Theory 920


73.1 Infinitesimals and Differentiation . . . . . . . . . . . . . . . . 920
73.2 Rigorous Definition of Limits . . . . . . . . . . . . . . . . . . . 922
73.3 Pathologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 923
73.4 More Myth than History? . . . . . . . . . . . . . . . . . . . . 925
73.5 Cantor on the Line and the Plane . . . . . . . . . . . . . . . . 926
73.6 Appendix: Hilbert’s Space-filling Curves . . . . . . . . . . . . 927


XVII Reference 930

74 The Greek Alphabet 930

75 The Fraktur Alphabet 931

Photo Credits 931

Bibliography 933


This file loads all content included in the Open Logic Project. Editorial
notes like this, if displayed, indicate that the file was compiled without any
thought to how this material will be presented. If you can read this, it is
probably not advisable to teach or study from this PDF.
The Open Logic Project provides many mechanisms by which a text
can be generated that is more appropriate for teaching or self-study. For
instance, by default, the text will make all logical operators primitives and
carry out all cases for all operators in proofs. But it is much better to
leave some of these cases as exercises. The Open Logic Project is also a
work in progress. In an effort to stimulate collaboration and improvement,
material is included even if it is only in draft form, is missing exercises,
etc. A PDF produced for a course will exclude these sections.
To find PDFs more suitable for teaching and studying, have a look at
the sample courses available on the OLP website. To make your own, you
might start from the sample driver file or look at the sources of the derived
textbooks for fancier, more advanced examples.



Part I

Naïve Set Theory


The material in this part is an introduction to basic naive set theory.
With the inclusion of Tim Button’s Open Set Theory, this also covers the
construction of number systems and a discussion of infinity, which are not
required for the logical parts of the OLP.

Chapter 1

Sets

content/sets-functions-relations/sets/basics.tex

1.1 Extensionality
A set is a collection of objects, considered as a single object. The objects
making up the set are called elements or members of the set. If x is an element
of a set A, we write x ∈ A; if not, we write x ∉ A. The set which has no
elements is called the empty set and denoted “∅”.
It does not matter how we specify the set, or how we order its elements, or
indeed how many times we count its elements. All that matters is what its
elements are. We codify this in the following principle.
Definition 1.1 (Extensionality). If A and B are sets, then A = B iff every
element of A is also an element of B, and vice versa.

Extensionality licenses some notation. In general, when we have some
objects a1, . . . , an, then {a1, . . . , an} is the set whose elements are a1, . . . , an.
We emphasise the word “the”, since extensionality tells us that there can be
only one such set. Indeed, extensionality also licenses the following:

{a, a, b} = {a, b} = {b, a}.


This delivers on the point that, when we consider sets, we don’t care about the
order of their elements, or how many times they are specified.
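
As an informal aside (a sketch of ours, not part of the text), Python’s built-in
set type is extensional in exactly this sense, so it can be used to experiment
with the identities above:

    # Illustrative sketch: Python sets ignore order and multiplicity,
    # mirroring extensionality.
    assert {1, 2, 3} == {3, 2, 1} == {1, 2, 1, 2, 3}
    a, b = "a", "b"
    assert {a, a, b} == {a, b} == {b, a}   # {a, a, b} = {a, b} = {b, a}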
Example 1.2. Whenever you have a bunch of objects, you can collect them
together in a set. The set of Richard’s siblings, for instance, is a set that
contains one person, and we could write it as S = {Ruth}. The set of positive
integers less than 4 is {1, 2, 3}, but it can also be written as {3, 2, 1} or even as
{1, 2, 1, 2, 3}. These are all the same set, by extensionality. For every element
of {1, 2, 3} is also an element of {3, 2, 1} (and of {1, 2, 1, 2, 3}), and vice versa.

Frequently we’ll specify a set by some property that its elements share.
We’ll use the following shorthand notation for that: {x : φ(x)}, where the
φ(x) stands for the property that x has to have in order to be counted among
the elements of the set.
Example 1.3. In our example, we could have specified S also as

S = {x : x is a sibling of Richard}.

Example 1.4. A number is called perfect iff it is equal to the sum of its
proper divisors (i.e., numbers that evenly divide it but aren’t identical to the
number). For instance, 6 is perfect because its proper divisors are 1, 2, and 3,
and 6 = 1 + 2 + 3. In fact, 6 is the only positive integer less than 10 that is
perfect. So, using extensionality, we can say:

{6} = {x : x is perfect and 0 ≤ x ≤ 10}

We read the notation on the right as “the set of x’s such that x is perfect and 0 ≤
x ≤ 10”. The identity here confirms that, when we consider sets, we don’t care
about how they are specified. And, more generally, extensionality guarantees
that there is always only one set of x’s such that φ(x). So, extensionality
justifies calling {x : φ(x)} the set of x’s such that φ(x).
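
Set-builder notation has a direct computational analogue in set comprehensions.
The following sketch (ours, not the text’s) computes the set from Example 1.4:

    # Illustrative sketch: {x : x is perfect and 0 ≤ x ≤ 10}
    # as a Python set comprehension.
    def is_perfect(n):
        # the proper divisors of n are the divisors d with 1 <= d < n
        return n > 0 and n == sum(d for d in range(1, n) if n % d == 0)

    assert {x for x in range(0, 11) if is_perfect(x)} == {6}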

Extensionality gives us a way of showing that sets are identical: to show
that A = B, show that whenever x ∈ A then also x ∈ B, and whenever y ∈ B
then also y ∈ A.
Problem 1.1. Prove that there is at most one empty set, i.e., show that if A
and B are sets without elements, then A = B.

content/sets-functions-relations/sets/subsets.tex

1.2 Subsets and Power Sets


We will often want to compare sets. And one obvious kind of comparison
one might make is as follows: everything in one set is in the other too. This
situation is sufficiently important for us to introduce some new notation.


Definition 1.5 (Subset). If every element of a set A is also an element of B,
then we say that A is a subset of B, and write A ⊆ B. If A is not a subset
of B we write A ⊈ B. If A ⊆ B but A ≠ B, we write A ⊊ B and say that A is
a proper subset of B.

Example 1.6. Every set is a subset of itself, and ∅ is a subset of every set.
The set of even numbers is a subset of the set of natural numbers. Also,
{a, b} ⊆ {a, b, c}. But {a, b, e} is not a subset of {a, b, c}.

Example 1.7. The number 2 is an element of the set of integers, whereas the
set of even numbers is a subset of the set of integers. However, a set may happen
to both be an element and a subset of some other set, e.g., {0} ∈ {0, {0}} and
also {0} ⊆ {0, {0}}.

Extensionality gives a criterion of identity for sets: A = B iff every element
of A is also an element of B and vice versa. The definition of “subset” defines
A ⊆ B precisely as the first half of this criterion: every element of A is also
an element of B. Of course the definition also applies if we switch A and B:
that is, B ⊆ A iff every element of B is also an element of A. And that, in turn,
is exactly the “vice versa” part of extensionality. In other words, extensionality
entails that sets are equal iff they are subsets of one another.
Proposition 1.8. A = B iff both A ⊆ B and B ⊆ A.

Now is also a good opportunity to introduce some further bits of helpful
notation. In defining when A is a subset of B we said that “every element of A
is . . . ,” and filled the “. . . ” with “an element of B”. But this is such a common
shape of expression that it will be helpful to introduce some formal notation
for it.

Definition 1.9. (∀x ∈ A)φ abbreviates ∀x(x ∈ A → φ). Similarly, (∃x ∈ A)φ
abbreviates ∃x(x ∈ A ∧ φ).

Using this notation, we can say that A ⊆ B iff (∀x ∈ A)x ∈ B.
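
As a programming aside (our sketch, not the text’s), bounded quantifiers over
finite sets correspond to the familiar all and any constructs:

    # Illustrative sketch: (∀x ∈ A)φ is all(...), (∃x ∈ A)φ is any(...).
    A = {0, 2, 4}
    B = {0, 1, 2, 3, 4}
    assert all(x in B for x in A)          # A ⊆ B iff (∀x ∈ A) x ∈ B
    assert any(x % 2 == 1 for x in B)      # (∃x ∈ B) x is odd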


Now we move on to considering a certain kind of set: the set of all subsets
of a given set.
Definition 1.10 (Power Set). The set consisting of all subsets of a set A is
called the power set of A, written ℘(A).

℘(A) = {B : B ⊆ A}

Example 1.11. What are all the possible subsets of {a, b, c}? They are: ∅,
{a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}. The set of all these subsets is
℘({a, b, c}):

℘({a, b, c}) = {∅, {a}, {b}, {c}, {a, b}, {b, c}, {a, c}, {a, b, c}}
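
For readers who want to experiment, here is an illustrative sketch (ours, not
the text’s) that enumerates the subsets of a finite set and confirms the count
from Example 1.11:

    # Illustrative sketch: enumerate the power set of a finite set.
    from itertools import combinations

    def power_set(s):
        s = list(s)
        # subsets of every size 0, 1, ..., len(s)
        return [set(c) for r in range(len(s) + 1) for c in combinations(s, r)]

    subsets = power_set({"a", "b", "c"})
    assert len(subsets) == 8                       # as listed in Example 1.11
    assert set() in subsets and {"a", "b", "c"} in subsets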

Problem 1.2. List all subsets of {a, b, c, d}.


Problem 1.3. Show that if A has n elements, then ℘(A) has 2ⁿ elements.

content/sets-functions-relations/sets/important-sets.tex

1.3 Some Important Sets


Example 1.12. We will mostly be dealing with sets whose elements are math-
ematical objects. Four such sets are important enough to have specific names:

N = {0, 1, 2, 3, . . .}                the set of natural numbers
Z = {. . . , −2, −1, 0, 1, 2, . . .}   the set of integers
Q = {m/n : m, n ∈ Z and n ≠ 0}         the set of rationals
R = (−∞, ∞)                            the set of real numbers (the continuum)

These are all infinite sets, that is, they each have infinitely many elements.
As we move through these sets, we are adding more numbers to our stock.
Indeed, it should be clear that N ⊆ Z ⊆ Q ⊆ R: after all, every natural number
is an integer; every integer is a rational; and every rational is a real. Equally,
it should be clear that N ⊊ Z ⊊ Q, since −1 is an integer but not a natural
number, and 1/2 is rational but not an integer. It is less obvious that Q ⊊ R, i.e.,
that there are some real numbers which are not rational.
We’ll sometimes also use the set of positive integers Z+ = {1, 2, 3, . . . } and
the set containing just the first two natural numbers B = {0, 1}.

Example 1.13 (Strings). Another interesting example is the set A∗ of finite strings over an alphabet A: any finite sequence of elements of A is a string over A. We include the empty string Λ among the strings over A, for every alphabet A. For instance,

B∗ = {Λ, 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, 100, 101, 110, 111, 0000, . . .}.

If x = x1 . . . xn ∈ A∗ is a string consisting of n “letters” from A, then we say the length of the string is n and write len(x) = n.

Example 1.14 (Infinite sequences). For any set A we may also consider
the set Aω of infinite sequences of elements of A. An infinite sequence a1 a2 a3 a4 . . .
consists of a one-way infinite list of objects, each one of which is an element
of A.


Figure 1.1: The union A ∪ B of two sets is the set of elements of A together with those of B.


1.4 Unions and Intersections


In section 1.1, we introduced definitions of sets by abstraction, i.e., definitions of the form {x : φ(x)}. Here, we invoke some property φ, and this property can mention sets we’ve already defined. So for instance, if A and B are sets, the set {x : x ∈ A ∨ x ∈ B} consists of all those objects which are elements of either A or B, i.e., it’s the set that combines the elements of A and B. We can visualize this as in Figure 1.1, where the highlighted area indicates the elements of the two sets A and B together.
This operation on sets—combining them—is very useful and common, and
so we give it a formal name and a symbol.

Definition 1.15 (Union). The union of two sets A and B, written A ∪ B, is the set of all things which are elements of A, B, or both.

A ∪ B = {x : x ∈ A ∨ x ∈ B}

Example 1.16. Since the multiplicity of elements doesn’t matter, the union
of two sets which have an element in common contains that element only once,
e.g., {a, b, c} ∪ {a, 0, 1} = {a, b, c, 0, 1}.
The union of a set and one of its subsets is just the bigger set: {a, b, c} ∪
{a} = {a, b, c}.
The union of a set with the empty set is identical to the set: {a, b, c} ∪ ∅ =
{a, b, c}.

Problem 1.4. Prove that if A ⊆ B, then A ∪ B = B.

We can also consider a “dual” operation to union. This is the operation that forms the set of all elements that are elements of A and are also elements of B. This operation is called intersection, and can be depicted as in Figure 1.2.

Figure 1.2: The intersection A ∩ B of two sets is the set of elements they have in common.

Definition 1.17 (Intersection). The intersection of two sets A and B, written A ∩ B, is the set of all things which are elements of both A and B.

A ∩ B = {x : x ∈ A ∧ x ∈ B}

Two sets are called disjoint if their intersection is empty. This means they
have no elements in common.

Example 1.18. If two sets have no elements in common, their intersection is empty: {a, b, c} ∩ {0, 1} = ∅.
If two sets do have elements in common, their intersection is the set of all
those: {a, b, c} ∩ {a, b, d} = {a, b}.
The intersection of a set with one of its subsets is just the smaller set:
{a, b, c} ∩ {a, b} = {a, b}.
The intersection of any set with the empty set is empty: {a, b, c} ∩ ∅ = ∅.

Problem 1.5. Prove rigorously that if A ⊆ B, then A ∩ B = A.

We can also form the union or intersection of more than two sets. An elegant way of dealing with this in general is the following: suppose you collect
all the sets you want to form the union (or intersection) of into a single set.
Then we can define the union of all our original sets as the set of all objects
which belong to at least one element of the set, and the intersection as the set
of all objects which belong to every element of the set.
Definition 1.19. If A is a set of sets, then ⋃A is the set of elements of elements of A:

⋃A = {x : x belongs to an element of A}, i.e.,
   = {x : there is a B ∈ A so that x ∈ B}


Figure 1.3: The difference A \ B of two sets is the set of those elements of A which are not also elements of B.

Definition 1.20. If A is a set of sets, then ⋂A is the set of objects which all elements of A have in common:

⋂A = {x : x belongs to every element of A}, i.e.,
   = {x : for all B ∈ A, x ∈ B}
Example 1.21. Suppose A = {{a, b}, {a, d, e}, {a, d}}. Then ⋃A = {a, b, d, e} and ⋂A = {a}.
Problem 1.6. Show that if A is a set and A ∈ B, then A ⊆ ⋃B.
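
For finite collections of sets, these generalized unions and intersections can be computed directly. A short Python sketch (ours, purely illustrative), reproducing Example 1.21 with strings standing in for the objects a, b, d, e:

    A = [{"a", "b"}, {"a", "d", "e"}, {"a", "d"}]

    big_union = set().union(*A)              # ⋃A = {'a', 'b', 'd', 'e'}
    big_intersection = set.intersection(*A)  # ⋂A = {'a'}; requires A to be non-empty

    print(big_union, big_intersection)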

We could also do the same for a sequence of sets A1, A2, . . .

⋃_i A_i = {x : x belongs to one of the A_i}
⋂_i A_i = {x : x belongs to every A_i}.

When we have an index of sets, i.e., some set I such that we are considering A_i for each i ∈ I, we may also use these abbreviations:

⋃_{i∈I} A_i = ⋃{A_i : i ∈ I}
⋂_{i∈I} A_i = ⋂{A_i : i ∈ I}

Finally, we may want to think about the set of all elements in A which are
not in B. We can depict this as in Figure 1.3.
Definition 1.22 (Difference). The set difference A \ B is the set of all elements of A which are not also elements of B, i.e.,

A \ B = {x : x ∈ A and x ∉ B}.


Problem 1.7. Prove that if A ⊊ B, then B \ A ≠ ∅.


1.5 Pairs, Tuples, Cartesian Products


It follows from extensionality that sets have no order to their elements. So if we want to represent order, we use ordered pairs ⟨x, y⟩. In an unordered pair {x, y}, the order does not matter: {x, y} = {y, x}. In an ordered pair, it does: if x ≠ y, then ⟨x, y⟩ ≠ ⟨y, x⟩.
How should we think about ordered pairs in set theory? Crucially, we want
to preserve the idea that ordered pairs are identical iff they share the same first
element and share the same second element, i.e.:

⟨a, b⟩ = ⟨c, d⟩ iff both a = c and b = d.

We can define ordered pairs in set theory using the Wiener–Kuratowski definition.

Definition 1.23 (Ordered pair). ⟨a, b⟩ = {{a}, {a, b}}.

Problem 1.8. Using Definition 1.23, prove that ⟨a, b⟩ = ⟨c, d⟩ iff both a = c
and b = d.

Having fixed a definition of an ordered pair, we can use it to define further
sets. For example, sometimes we also want ordered sequences of more than
two objects, e.g., triples ⟨x, y, z⟩, quadruples ⟨x, y, z, u⟩, and so on. We can
think of triples as special ordered pairs, where the first element is itself an
ordered pair: ⟨x, y, z⟩ is ⟨⟨x, y⟩, z⟩. The same is true for quadruples: ⟨x, y, z, u⟩
is ⟨⟨⟨x, y⟩, z⟩, u⟩, and so on. In general, we talk of ordered n-tuples ⟨x1 , . . . , xn ⟩.
Certain sets of ordered pairs, or other ordered n-tuples, will be useful.
Definition 1.24 (Cartesian product). Given sets A and B, their Cartesian
product A × B is defined by

A × B = {⟨x, y⟩ : x ∈ A and y ∈ B}.

Example 1.25. If A = {0, 1}, and B = {1, a, b}, then their product is

A × B = {⟨0, 1⟩, ⟨0, a⟩, ⟨0, b⟩, ⟨1, 1⟩, ⟨1, a⟩, ⟨1, b⟩}.

Example 1.26. If A is a set, the product of A with itself, A × A, is also written A^2. It is the set of all pairs ⟨x, y⟩ with x, y ∈ A. The set of all triples ⟨x, y, z⟩ is A^3, and so on. We can give a recursive definition:

A^1 = A
A^{k+1} = A^k × A


Problem 1.9. List all elements of {1, 2, 3}^3.

Proposition 1.27. If A has n elements and B has m elements, then A × B has n · m elements.

Proof. For every element x in A, there are m elements of the form ⟨x, y⟩ ∈ A × B. Let Bx = {⟨x, y⟩ : y ∈ B}. Since whenever x1 ≠ x2, ⟨x1, y⟩ ≠ ⟨x2, y⟩, we have Bx1 ∩ Bx2 = ∅. But if A = {x1, . . . , xn}, then A × B = Bx1 ∪ · · · ∪ Bxn, and so has n · m elements.
To visualize this, arrange the elements of A × B in a grid:

Bx1 = {⟨x1, y1⟩  ⟨x1, y2⟩  . . .  ⟨x1, ym⟩}
Bx2 = {⟨x2, y1⟩  ⟨x2, y2⟩  . . .  ⟨x2, ym⟩}
 ⋮
Bxn = {⟨xn, y1⟩  ⟨xn, y2⟩  . . .  ⟨xn, ym⟩}

Since the xi are all different, and the yj are all different, no two of the pairs in this grid are the same, and there are n · m of them.

Problem 1.10. Show, by induction on k, that for all k ≥ 1, if A has n elements, then A^k has n^k elements.
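
Finite Cartesian products can be computed with Python’s itertools.product; the following sketch (ours, purely illustrative) reproduces Example 1.25 and the recursive definition of A^k:

    from itertools import product

    A = {0, 1}
    B = {1, "a", "b"}

    AxB = set(product(A, B))        # A × B: all pairs ⟨x, y⟩; 2 · 3 = 6 of them
    A3 = set(product(A, repeat=3))  # A^3: all triples; 2^3 = 8 of them

    print(len(AxB), len(A3))  # 6 8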

Example 1.28. If A is a set, a word over A is any sequence of elements of A. A sequence can be thought of as an n-tuple of elements of A. For instance, if A = {a, b, c}, then the sequence “bac” can be thought of as the triple ⟨b, a, c⟩.
Words, i.e., sequences of symbols, are of crucial importance in computer science.
By convention, we count elements of A as sequences of length 1, and ∅ as the
sequence of length 0. The set of all words over A then is

A∗ = {∅} ∪ A ∪ A^2 ∪ A^3 ∪ . . .


1.6 Russell’s Paradox


Extensionality licenses the notation {x : φ(x)}, for the set of x’s such that φ(x).
However, all that extensionality really licenses is the following thought. If there
is a set whose members are all and only the φ’s, then there is only one such set.
Otherwise put: having fixed some φ, the set {x : φ(x)} is unique, if it exists.
But this conditional is important! Crucially, not every property lends itself
to comprehension. That is, some properties do not define sets. If they all did,
then we would run into outright contradictions. The most famous example of
this is Russell’s Paradox.
Sets may be elements of other sets—for instance, the power set of a set A
is made up of sets. And so it makes sense to ask or investigate whether a set is an element of another set. Can a set be a member of itself? Nothing about the
idea of a set seems to rule this out. For instance, if all sets form a collection of
objects, one might think that they can be collected into a single set—the set
of all sets. And it, being a set, would be an element of the set of all sets.
Russell’s Paradox arises when we consider the property of not having itself
as an element, of being non-self-membered. What if we suppose that there is a
set of all sets that do not have themselves as an element? Does

R = {x : x ∉ x}

exist? It turns out that we can prove that it does not.

Theorem 1.29 (Russell’s Paradox). There is no set R = {x : x ∉ x}.

Proof. If R = {x : x ∉ x} exists, then R ∈ R iff R ∉ R, which is a contradiction.

Let’s run through this proof more slowly. If R exists, it makes sense to ask whether R ∈ R or not. Suppose that indeed R ∈ R. Now, R was defined as the set of all sets that are not elements of themselves. So, if R ∈ R, then R does not itself have R’s defining property. But only sets that have this property are in R, hence, R cannot be an element of R, i.e., R ∉ R. But R can’t both be and not be an element of R, so we have a contradiction.
Since the assumption that R ∈ R leads to a contradiction, we have R ∉ R. But this also leads to a contradiction! For if R ∉ R, then R itself does have R’s defining property, and so R would be an element of R just like all the other non-self-membered sets. And again, it can’t both not be and be an element of R.
How do we set up a set theory which avoids falling into Russell’s Paradox, i.e., which avoids making the inconsistent claim that R = {x : x ∉ x} exists? Well, we would need to lay down axioms which give us very precise conditions for stating when sets exist (and when they don’t).
The set theory sketched in this chapter doesn’t do this. It’s genuinely naïve.
It tells you only that sets obey extensionality and that, if you have some sets,
you can form their union, intersection, etc. It is possible to develop set theory

more rigorously than this.

Chapter 2

Relations


2.1 Relations as Sets


In section 1.3, we mentioned some important sets: N, Z, Q, R. You will no
doubt remember some interesting relations between the elements of some of
these sets. For instance, each of these sets has a completely standard order
relation on it. There is also the relation is identical with that every object
bears to itself and to no other thing. There are many more interesting relations
that we’ll encounter, and even more possible relations. Before we review them,
though, we will start by pointing out that we can look at relations as a special
sort of set.
For this, recall two things from section 1.5. First, recall the notion of an ordered pair: given a and b, we can form ⟨a, b⟩. Importantly, the order of elements does matter here. So if a ≠ b then ⟨a, b⟩ ≠ ⟨b, a⟩. (Contrast this with unordered pairs, i.e., 2-element sets, where {a, b} = {b, a}.) Second, recall the notion of a Cartesian product: if A and B are sets, then we can form A × B, the set of all pairs ⟨x, y⟩ with x ∈ A and y ∈ B. In particular, A^2 = A × A is the set of all ordered pairs from A.
Now we will consider a particular relation on a set: the <-relation on the
set N of natural numbers. Consider the set of all pairs of numbers ⟨n, m⟩ where
n < m, i.e.,
R = {⟨n, m⟩ : n, m ∈ N and n < m}.
There is a close connection between n being less than m, and the pair ⟨n, m⟩
being a member of R, namely:

n < m iff ⟨n, m⟩ ∈ R.

Indeed, without any loss of information, we can consider the set R to be the
<-relation on N.


In the same way we can construct a subset of N^2 for any relation between numbers. Conversely, given any set of pairs of numbers S ⊆ N^2, there is a corresponding relation between numbers, namely, the relationship n bears to m if and only if ⟨n, m⟩ ∈ S. This justifies the following definition:
Definition 2.1 (Binary relation). A binary relation on a set A is a subset of A^2. If R ⊆ A^2 is a binary relation on A and x, y ∈ A, we sometimes write Rxy (or xRy) for ⟨x, y⟩ ∈ R.

Example 2.2. The set N^2 of pairs of natural numbers can be listed in a 2-dimensional matrix like this:
⟨0, 0⟩ ⟨0, 1⟩ ⟨0, 2⟩ ⟨0, 3⟩ ...
⟨1, 0⟩ ⟨1, 1⟩ ⟨1, 2⟩ ⟨1, 3⟩ ...
⟨2, 0⟩ ⟨2, 1⟩ ⟨2, 2⟩ ⟨2, 3⟩ ...
⟨3, 0⟩ ⟨3, 1⟩ ⟨3, 2⟩ ⟨3, 3⟩ ...
  ⋮      ⋮      ⋮      ⋮      ⋱

We have put the diagonal, here, in bold, since the subset of N^2 consisting of the pairs lying on the diagonal, i.e.,

{⟨0, 0⟩, ⟨1, 1⟩, ⟨2, 2⟩, . . . },

is the identity relation on N. (Since the identity relation is popular, let’s define
IdA = {⟨x, x⟩ : x ∈ A} for any set A.) The subset of all pairs lying above the
diagonal, i.e.,

L = {⟨0, 1⟩, ⟨0, 2⟩, . . . , ⟨1, 2⟩, ⟨1, 3⟩, . . . , ⟨2, 3⟩, ⟨2, 4⟩, . . .},

is the less than relation, i.e., Lnm iff n < m. The subset of pairs below the
diagonal, i.e.,

G = {⟨1, 0⟩, ⟨2, 0⟩, ⟨2, 1⟩, ⟨3, 0⟩, ⟨3, 1⟩, ⟨3, 2⟩, . . . },

is the greater than relation, i.e., Gnm iff n > m. The union of L with the identity relation IdN, which we might call K = L ∪ IdN, is the less than or equal to relation: Knm iff n ≤ m. Similarly, H = G ∪ IdN is the greater than or equal to relation. These relations
L, G, K, and H are special kinds of relations called orders. L and G have
the property that no number bears L or G to itself (i.e., for all n, neither Lnn
nor Gnn). Relations with this property are called irreflexive, and, if they also
happen to be orders, they are called strict orders.

Although orders and identity are important and natural relations, it should be emphasized that according to our definition any subset of A^2 is a relation on A, regardless of how unnatural or contrived it seems. In particular, ∅ is a relation on any set (the empty relation, which no pair of elements bears), and A^2 itself is a relation on A as well (one which every pair bears), called the universal relation. But also something like E = {⟨n, m⟩ : n > 5 or m × n ≥ 34} counts as a relation.


Problem 2.1. List the elements of the relation ⊆ on the set ℘({a, b, c}).
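
Treating relations as sets of pairs is directly usable in computation, at least over finite sets. A short Python sketch (ours, purely illustrative) builds the relations Id, L, G, and K of Example 2.2 on a finite initial segment of N:

    N = range(6)  # a finite initial segment of the natural numbers

    Id = {(x, x) for x in N}                     # identity
    L = {(n, m) for n in N for m in N if n < m}  # less than
    G = {(n, m) for n in N for m in N if n > m}  # greater than
    K = L | Id                                   # less than or equal to

    assert (2, 5) in L and (5, 2) in G and (3, 3) in K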


2.2 Philosophical Reflections


In section 2.1, we defined relations as certain sets. We should pause and ask a
quick philosophical question: what is such a definition doing? It is extremely
doubtful that we should want to say that we have discovered some metaphysical
identity facts; that, for example, the order relation on N turned out to be the
set R = {⟨n, m⟩ : n, m ∈ N and n < m} that we defined in section 2.1. Here
are three reasons why.
First: in Definition 1.23, we defined ⟨a, b⟩ = {{a}, {a, b}}. Consider instead the definition ∥a, b∥ = {{b}, {a, b}} = ⟨b, a⟩. When a ≠ b, we have that ⟨a, b⟩ ≠ ∥a, b∥. But we could equally have regarded ∥a, b∥ as our definition of an ordered pair, rather than ⟨a, b⟩. Both definitions would have worked equally well. So now we have two equally good candidates to “be” the order relation on the natural numbers, namely:
natural numbers, namely:

R = {⟨n, m⟩ : n, m ∈ N and n < m}
S = {∥n, m∥ : n, m ∈ N and n < m}.

Since R ≠ S, by extensionality, it is clear that they cannot both be identical to the order relation on N. But it would just be arbitrary, and hence a bit embarrassing, to claim that R rather than S (or vice versa) is the ordering relation, as a matter of fact. (This is a very simple instance of an argument against set-theoretic reductionism which Benacerraf made famous in 1965. We will revisit it several times.)
Second: if we think that every relation should be identified with a set, then
the relation of set-membership itself, ∈, should be a particular set. Indeed,
it would have to be the set {⟨x, y⟩ : x ∈ y}. But does this set exist? Given
Russell’s Paradox, it is a non-trivial claim that such a set exists. In fact, it is
possible to develop set theory in a rigorous way as an axiomatic theory, and
that theory will indeed deny the existence of this set. So, even if some relations
can be treated as sets, the relation of set-membership will have to be a special
case.
Third: when we “identify” relations with sets, we said that we would allow
ourselves to write Rxy for ⟨x, y⟩ ∈ R. This is fine, provided that the member-
ship relation, “∈”, is treated as a predicate. But if we think that “∈” stands
for a certain kind of set, then the expression “⟨x, y⟩ ∈ R” just consists of three
singular terms which stand for sets: “⟨x, y⟩”, “∈”, and “R”. And such a list of
names is no more capable of expressing a proposition than the nonsense string:
“the cup penholder the table”. Again, even if some relations can be treated as
sets, the relation of set-membership must be a special case. (This rolls together a simple version of Frege’s concept horse paradox, and a famous objection that
Wittgenstein once raised against Russell.)
So where does this leave us? Well, there is nothing wrong with our saying
that the relations on the numbers are sets. We just have to understand the
spirit in which that remark is made. We are not stating a metaphysical identity
fact. We are simply noting that, in certain contexts, we can (and will) treat
(certain) relations as certain sets.


2.3 Special Properties of Relations


Some kinds of relations turn out to be so common that they have been given
special names. For instance, ≤ and ⊆ both relate their respective domains
(say, N in the case of ≤ and ℘(A) in the case of ⊆) in similar ways. To get at
exactly how these relations are similar, and how they differ, we categorize them
according to some special properties that relations can have. It turns out that
(combinations of) some of these special properties are especially important:
orders and equivalence relations.

Definition 2.3 (Reflexivity). A relation R ⊆ A^2 is reflexive iff, for every x ∈ A, Rxx.

Definition 2.4 (Transitivity). A relation R ⊆ A^2 is transitive iff, whenever Rxy and Ryz, then also Rxz.

Definition 2.5 (Symmetry). A relation R ⊆ A^2 is symmetric iff, whenever Rxy, then also Ryx.

Definition 2.6 (Anti-symmetry). A relation R ⊆ A^2 is anti-symmetric iff, whenever both Rxy and Ryx, then x = y (or, in other words: if x ≠ y then either ¬Rxy or ¬Ryx).

In a symmetric relation, Rxy and Ryx always hold together, or neither holds. In an anti-symmetric relation, the only way for Rxy and Ryx to hold together is if x = y. Note that this does not require that Rxy and Ryx hold when x = y, only that it isn’t ruled out. So an anti-symmetric relation can be reflexive, but it is not the case that every anti-symmetric relation is reflexive. Also note that being anti-symmetric and merely not being symmetric are different conditions. In fact, a relation can be both symmetric and anti-symmetric at the same time (e.g., the identity relation is).

Definition 2.7 (Connectivity). A relation R ⊆ A^2 is connected if for all x, y ∈ A, if x ≠ y, then either Rxy or Ryx.


Problem 2.2. Give examples of relations that are (a) reflexive and symmetric
but not transitive, (b) reflexive and anti-symmetric, (c) anti-symmetric, tran-
sitive, but not reflexive, and (d) reflexive, symmetric, and transitive. Do not
use relations on numbers or sets.

Definition 2.8 (Irreflexivity). A relation R ⊆ A^2 is called irreflexive if, for all x ∈ A, not Rxx.

Definition 2.9 (Asymmetry). A relation R ⊆ A^2 is called asymmetric if for no pair x, y ∈ A we have both Rxy and Ryx.

Note that if A ≠ ∅, then no irreflexive relation on A is reflexive and every asymmetric relation on A is also anti-symmetric. However, there are R ⊆ A^2 that are not reflexive and also not irreflexive, and there are anti-symmetric relations that are not asymmetric.
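
On a finite set, each of these properties can be checked by brute force against the definitions. A Python sketch (ours, purely illustrative; R is a set of pairs over a set A):

    def is_reflexive(R, A):
        return all((x, x) in R for x in A)

    def is_irreflexive(R, A):
        return all((x, x) not in R for x in A)

    def is_symmetric(R):
        return all((y, x) in R for (x, y) in R)

    def is_asymmetric(R):
        return all((y, x) not in R for (x, y) in R)

    def is_antisymmetric(R):
        return all(x == y for (x, y) in R if (y, x) in R)

    def is_transitive(R):
        return all((x, w) in R
                   for (x, y) in R for (z, w) in R if y == z)

    A = {1, 2, 3}
    Id = {(x, x) for x in A}
    # The identity relation is both symmetric and anti-symmetric:
    assert is_symmetric(Id) and is_antisymmetric(Id)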


2.4 Equivalence Relations


The identity relation on a set is reflexive, symmetric, and transitive. Relations R that have all three of these properties are very common.

Definition 2.10 (Equivalence relation). A relation R ⊆ A^2 that is reflexive, symmetric, and transitive is called an equivalence relation. Elements x and y of A are said to be R-equivalent if Rxy.

Equivalence relations give rise to the notion of an equivalence class. An equivalence relation “chunks up” the domain into different partitions. Within each partition, all the objects are related to one another; and no objects from different partitions relate to one another. Sometimes, it’s helpful just to talk about these partitions directly. To that end, we introduce a definition:

Definition 2.11. Let R ⊆ A^2 be an equivalence relation. For each x ∈ A, the equivalence class of x in A is the set [x]_R = {y ∈ A : Rxy}. The quotient of A under R is A/R = {[x]_R : x ∈ A}, i.e., the set of these equivalence classes.

The next result vindicates the definition of an equivalence class, in proving that the equivalence classes are indeed the partitions of A:

Proposition 2.12. If R ⊆ A^2 is an equivalence relation, then Rxy iff [x]_R = [y]_R.

Proof. For the left-to-right direction, suppose Rxy, and let z ∈ [x]_R. By definition, then, Rxz. Since R is an equivalence relation, Ryz. (Spelling this out: as Rxy and R is symmetric we have Ryx, and as Rxz and R is transitive we have Ryz.) So z ∈ [y]_R. Generalising, [x]_R ⊆ [y]_R. But exactly similarly, [y]_R ⊆ [x]_R. So [x]_R = [y]_R, by extensionality.


For the right-to-left direction, suppose [x]_R = [y]_R. Since R is reflexive, Ryy, so y ∈ [y]_R. Thus also y ∈ [x]_R by the assumption that [x]_R = [y]_R. So Rxy.

Example 2.13. A nice example of equivalence relations comes from modular arithmetic. For any a, b, and n ∈ Z+, say that a ≡n b iff dividing a by n gives
the same remainder as dividing b by n. (Somewhat more symbolically: a ≡n b
iff, for some k ∈ Z, a − b = kn.) Now, ≡n is an equivalence relation, for any n.
And there are exactly n distinct equivalence classes generated by ≡n ; that is,
N/≡n has n elements. These are: the set of numbers divisible by n without
remainder, i.e., [0]≡n ; the set of numbers divisible by n with remainder 1,
i.e., [1]≡n ; . . . ; and the set of numbers divisible by n with remainder n − 1,
i.e., [n − 1]≡n .

Problem 2.3. Show that ≡n is an equivalence relation, for any n ∈ Z+, and that N/≡n has exactly n members.
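
Equivalence classes and quotients can likewise be computed for finite sets. A Python sketch (ours, purely illustrative) of ≡3 restricted to {0, . . . , 11}:

    def equiv_class(x, A, R):
        # [x]_R = {y ∈ A : Rxy}
        return frozenset(y for y in A if (x, y) in R)

    def quotient(A, R):
        # A/R = {[x]_R : x ∈ A}
        return {equiv_class(x, A, R) for x in A}

    A = range(12)
    R = {(a, b) for a in A for b in A if a % 3 == b % 3}  # ≡_3, restricted to A

    print(quotient(A, R))  # exactly 3 classes: remainders 0, 1, and 2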


2.5 Orders
Many of our comparisons involve describing some objects as being “less than”,
“equal to”, or “greater than” other objects, in a certain respect. These involve
order relations. But there are different kinds of order relations. For instance,
some require that any two objects be comparable, others don’t. Some include
identity (like ≤) and some exclude it (like <). It will help us to have a taxonomy
here.
Definition 2.14 (Preorder). A relation which is both reflexive and transitive is called a preorder.

Definition 2.15 (Partial order). A preorder which is also anti-symmetric is called a partial order.

Definition 2.16 (Linear order). A partial order which is also connected is called a total order or linear order.

Example 2.17. Every linear order is also a partial order, and every partial
order is also a preorder, but the converses don’t hold. The universal relation
on A is a preorder, since it is reflexive and transitive. But, if A has more than
one element, the universal relation is not anti-symmetric, and so not a partial
order.

Example 2.18. Consider the no longer than relation ≼ on B∗: x ≼ y iff len(x) ≤ len(y). This is a preorder (reflexive and transitive), and even connected, but not a partial order, since it is not anti-symmetric. For instance, 01 ≼ 10 and 10 ≼ 01, but 01 ≠ 10.


Example 2.19. An important partial order is the relation ⊆ on a set of sets. This is not in general a linear order, since if a ≠ b and we consider ℘({a, b}) = {∅, {a}, {b}, {a, b}}, we see that {a} ⊈ {b} and {a} ≠ {b} and {b} ⊈ {a}.

Example 2.20. The relation of divisibility without remainder gives us a partial order which isn’t a linear order. For integers n, m, we write n | m to mean n (evenly) divides m, i.e., iff there is some integer k so that m = kn. On N, this is a partial order, but not a linear order: for instance, 2 ∤ 3 and also 3 ∤ 2. Considered as a relation on Z, divisibility is only a preorder since it is not anti-symmetric: 1 | −1 and −1 | 1 but 1 ≠ −1.

Definition 2.21 (Strict order). A strict order is a relation which is irreflexive, asymmetric, and transitive.

Definition 2.22 (Strict linear order). A strict order which is also connected is called a strict total order or strict linear order.

Example 2.23. ≤ is the linear order corresponding to the strict linear order <.
⊆ is the partial order corresponding to the strict order ⊊.

Any strict order R on A can be turned into a partial order by adding the
diagonal IdA , i.e., adding all the pairs ⟨x, x⟩. (This is called the reflexive closure
of R.) Conversely, starting from a partial order, one can get a strict order by
removing IdA . These next two results make this precise.

Proposition 2.24. If R is a strict order on A, then R+ = R ∪ IdA is a partial order. Moreover, if R is a strict linear order, then R+ is a linear order.

Proof. Suppose R is a strict order, i.e., R ⊆ A^2 and R is irreflexive, asymmetric, and transitive. Let R+ = R ∪ IdA. We have to show that R+ is reflexive, anti-symmetric, and transitive.
R+ is clearly reflexive, since ⟨x, x⟩ ∈ IdA ⊆ R+ for all x ∈ A.
To show R+ is anti-symmetric, suppose for reductio that R+xy and R+yx but x ≠ y. Since ⟨x, y⟩ ∈ R ∪ IdA, but ⟨x, y⟩ ∉ IdA, we must have ⟨x, y⟩ ∈ R, i.e., Rxy. Similarly, Ryx. But this contradicts the assumption that R is asymmetric.
To establish transitivity, suppose that R+xy and R+yz. If both ⟨x, y⟩ ∈ R and ⟨y, z⟩ ∈ R, then ⟨x, z⟩ ∈ R since R is transitive. Otherwise, either ⟨x, y⟩ ∈ IdA, i.e., x = y, or ⟨y, z⟩ ∈ IdA, i.e., y = z. In the first case, we have that R+yz by assumption, x = y, hence R+xz. Similarly in the second case. In either case, R+xz, thus, R+ is also transitive.
Concerning the “moreover” clause, suppose that R is also connected. So for all x ≠ y, either Rxy or Ryx, i.e., either ⟨x, y⟩ ∈ R or ⟨y, x⟩ ∈ R. Since R ⊆ R+, this remains true of R+, so R+ is connected as well.

Proposition 2.25. If R is a partial order on A, then R− = R \ IdA is a strict order. Moreover, if R is a linear order, then R− is a strict linear order.


Proof. This is left as an exercise.

Problem 2.4. Give a proof of Proposition 2.25.

The following simple result establishes that strict linear orders satisfy an
extensionality-like property:

Proposition 2.26. If < is a strict linear order on A, then:

(∀a, b ∈ A)((∀x ∈ A)(x < a ↔ x < b) → a = b).

Proof. Suppose (∀x ∈ A)(x < a ↔ x < b). If a < b, then a < a, contradicting
the fact that < is irreflexive; so a ≮ b. Exactly similarly, b ≮ a. So a = b, as <
is connected.


2.6 Graphs
A graph is a diagram in which points—called “nodes” or “vertices” (plural of
“vertex”)—are connected by edges. Graphs are a ubiquitous tool in discrete
mathematics and in computer science. They are incredibly useful for repre-
senting, and visualizing, relationships and structures, from concrete things like
networks of various kinds to abstract structures such as the possible outcomes
of decisions. There are many different kinds of graphs in the literature which
differ, e.g., according to whether the edges are directed or not, have labels or
not, whether there can be edges from a node to the same node, multiple edges
between the same nodes, etc. Directed graphs have a special connection to
relations.

Definition 2.27 (Directed graph). A directed graph G = ⟨V, E⟩ is a set of vertices V and a set of edges E ⊆ V^2.

According to our definition, a graph just is a set together with a relation on that set. Of course, when talking about graphs, it’s only natural to expect that they are graphically represented: we can draw a graph by connecting two vertices v1 and v2 by an arrow iff ⟨v1, v2⟩ ∈ E. The only difference between a relation by itself and a graph is that a graph specifies the set of vertices, i.e., a graph may have isolated vertices. The important point, however, is that every relation R on a set X can be seen as a directed graph ⟨X, R⟩, and conversely, a directed graph ⟨V, E⟩ can be seen as a relation E ⊆ V^2 with the set V explicitly specified.


Example 2.28. The graph ⟨V, E⟩ with V = {1, 2, 3, 4} and E = {⟨1, 1⟩, ⟨1, 2⟩, ⟨1, 3⟩, ⟨2, 3⟩} looks like this:

[diagram: vertices 1, 2, 3 joined by the arrows of E, with a loop at 1; vertex 4 isolated]

This is a different graph than ⟨V′, E⟩ with V′ = {1, 2, 3}, which looks like this:

[diagram: the same picture without the isolated vertex 4]

Problem 2.5. Consider the less-than-or-equal-to relation ≤ on the set {1, 2, 3, 4} as a graph and draw the corresponding diagram.


2.7 Operations on Relations


It is often useful to modify or combine relations. In Proposition 2.24, we
considered the union of relations, which is just the union of two relations
considered as sets of pairs. Similarly, in Proposition 2.25, we considered the
relative difference of relations. Here are some other operations we can perform
on relations.
Definition 2.29. Let R, S be relations, and A be any set.
The inverse of R is R^−1 = {⟨y, x⟩ : ⟨x, y⟩ ∈ R}.
The relative product of R and S is (R | S) = {⟨x, z⟩ : ∃y(Rxy ∧ Syz)}.
The restriction of R to A is R↾A = R ∩ A^2.
The application of R to A is R[A] = {y : (∃x ∈ A)Rxy}.
Example 2.30. Let S ⊆ Z^2 be the successor relation on Z, i.e., S = {⟨x, y⟩ ∈ Z^2 : x + 1 = y}, so that Sxy iff x + 1 = y.
S^−1 is the predecessor relation on Z, i.e., {⟨x, y⟩ ∈ Z^2 : x − 1 = y}.
S | S is {⟨x, y⟩ ∈ Z^2 : x + 2 = y}.
S↾N is the successor relation on N.
S[{1, 2, 3}] is {2, 3, 4}.
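
Each of these operations is a one-liner on finite relations. A Python sketch (ours, purely illustrative), checking the claims of Example 2.30 on a finite piece of the successor relation:

    def inverse(R):
        return {(y, x) for (x, y) in R}

    def rel_product(R, S):
        # R | S = {⟨x, z⟩ : there is a y with Rxy and Syz}
        return {(x, z) for (x, y) in R for (w, z) in S if y == w}

    def restriction(R, A):
        return {(x, y) for (x, y) in R if x in A and y in A}

    def application(R, A):
        # R[A] = {y : (∃x ∈ A)Rxy}
        return {y for (x, y) in R if x in A}

    S = {(x, x + 1) for x in range(-5, 5)}  # a finite piece of the successor relation

    assert application(S, {1, 2, 3}) == {2, 3, 4}
    assert (0, 2) in rel_product(S, S)  # S | S relates x to x + 2
    assert (3, 2) in inverse(S)         # the predecessor relation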

Definition 2.31 (Transitive closure). Let R ⊆ A^2 be a binary relation. The transitive closure of R is R^+ = ⋃_{0<n∈N} R^n, where we recursively define R^1 = R and R^{n+1} = R^n | R.
The reflexive transitive closure of R is R^∗ = R^+ ∪ IdA.

Example 2.32. Take the successor relation S ⊆ Z^2. S^2xy iff x + 2 = y, S^3xy iff x + 3 = y, etc. So S^+xy iff x + n = y for some n ≥ 1. In other words, S^+xy iff x < y, and S^∗xy iff x ≤ y.

Problem 2.6. Show that the transitive closure of R is in fact transitive.
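
For a finite relation, the transitive closure can be computed by adding relative products until nothing new appears, mirroring the union of the R^n. A Python sketch (ours, purely illustrative):

    def transitive_closure(R):
        closure = set(R)
        while True:
            new = {(x, w)
                   for (x, y) in closure
                   for (z, w) in closure if y == z}
            if new <= closure:
                return closure
            closure |= new

    S = {(x, x + 1) for x in range(5)}  # successor, restricted to {0, ..., 5}
    # As in Example 2.32, the closure relates x to y exactly when x < y:
    assert transitive_closure(S) == {(x, y) for x in range(6)
                                            for y in range(6) if x < y}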

Chapter 3

Functions


3.1 Basics
A function is a map which sends each element of a given set to a specific
element in some (other) given set. For instance, the operation of adding 1
defines a function: each number n is mapped to a unique number n + 1.
More generally, functions may take pairs, triples, etc., as inputs and return
some kind of output. Many functions are familiar to us from basic arithmetic.
For instance, addition and multiplication are functions. They take in two
numbers and return a third.
In this mathematical, abstract sense, a function is a black box : what matters
is only what output is paired with what input, not the method for calculating
the output.
Definition 3.1 (Function). A function f : A → B is a mapping of each element of A to an element of B.
We call A the domain of f and B the codomain of f . The elements of A
are called inputs or arguments of f , and the element of B that is paired with
an argument x by f is called the value of f for argument x, written f (x).
The range ran(f ) of f is the subset of the codomain consisting of the values
of f for some argument; ran(f ) = {f (x) : x ∈ A}.


Figure 3.1: A function is a mapping of each element of one set to an element of another. An arrow points from an argument in the domain to the corresponding value in the codomain.
The diagram in Figure 3.1 may help to think about functions. The ellipse
on the left represents the function’s domain; the ellipse on the right represents
the function’s codomain; and an arrow points from an argument in the domain
to the corresponding value in the codomain.
Example 3.2. Multiplication takes pairs of natural numbers as inputs and
maps them to natural numbers as outputs, so goes from N × N (the domain)
to N (the codomain). As it turns out, the range is also N, since every n ∈ N is
n × 1.

Example 3.3. Multiplication is a function because it pairs each input—each pair of natural numbers—with a single output: × : N^2 → N. By contrast, the square root operation applied to the domain N is not functional, since each positive integer n has two square roots: √n and −√n. We can make it functional by only returning the positive square root: √ : N → R.

Example 3.4. The relation that pairs each student in a class with their final
grade is a function—no student can get two different final grades in the same
class. The relation that pairs each student in a class with their parents is not
a function: students can have zero, or two, or more parents.

We can define functions by specifying in some precise way what the value
of the function is for every possible argument. Different ways of doing this are
by giving a formula, describing a method for computing the value, or listing
the values for each argument. However functions are defined, we must make
sure that for each argument we specify one, and only one, value.
Example 3.5. Let f : N → N be defined such that f (x) = x + 1. This is a
definition that specifies f as a function which takes in natural numbers and
outputs natural numbers. It tells us that, given a natural number x, f will
output its successor x + 1. In this case, the codomain N is not the range of f ,
since the natural number 0 is not the successor of any natural number. The
range of f is the set of all positive integers, Z+ .

Example 3.6. Let g : N → N be defined such that g(x) = x + 2 − 1. This tells us that g is a function which takes in natural numbers and outputs natural numbers. Given a natural number x, g will output the predecessor of the successor of the successor of x, i.e., x + 1.

Figure 3.2: A surjective function has every element of the codomain as a value.

We just considered two functions, f and g, with different definitions. However, these are the same function. After all, for any natural number n, we have that f(n) = n + 1 = n + 2 − 1 = g(n). Otherwise put: our definitions for f and g specify the same mapping by means of different equations. Implicitly, then, we are relying upon a principle of extensionality for functions:

if ∀x f(x) = g(x), then f = g,

provided that f and g share the same domain and codomain.
Example 3.7. We can also define functions by cases. For instance, we could define h : N → N by

h(x) = x/2        if x is even
h(x) = (x + 1)/2  if x is odd.

Since every natural number is either even or odd, the output of this function will always be a natural number. Just remember that if you define a function by cases, every possible input must fall into exactly one case. In some cases, this will require a proof that the cases are exhaustive and exclusive.


3.2 Kinds of Functions


It will be useful to introduce a kind of taxonomy for some of the kinds of functions which we encounter most frequently.
To start, we might want to consider functions which have the property that
every member of the codomain is a value of the function. Such functions are
called surjective, and can be pictured as in Figure 3.2.
Definition 3.8 (Surjective function). A function f : A → B is surjective
iff B is also the range of f , i.e., for every y ∈ B there is at least one x ∈ A
such that f (x) = y, or in symbols:
(∀y ∈ B)(∃x ∈ A)f (x) = y.


Figure 3.3: An injective function never maps two different arguments to the same value.
We call such a function a surjection from A to B.

If you want to show that f is a surjection, then you need to show that every object in f’s codomain is the value of f(x) for some input x.
Note that any function induces a surjection. After all, given a function f : A → B, let f′ : A → ran(f) be defined by f′(x) = f(x). Since ran(f) is defined as {f(x) ∈ B : x ∈ A}, this function f′ is guaranteed to be a surjection.
Now, any function maps each possible input to a unique output. But there are also functions which never map different inputs to the same outputs. Such functions are called injective, and can be pictured as in Figure 3.3.
Definition 3.9 (Injective function). A function f : A → B is injective iff
for each y ∈ B there is at most one x ∈ A such that f (x) = y. We call such a
function an injection from A to B.

If you want to show that f is an injection, you need to show that for any
elements x and y of f ’s domain, if f (x) = f (y), then x = y.
Example 3.10. The constant function f : N → N given by f(x) = 1 is neither injective, nor surjective.
The identity function f : N → N given by f(x) = x is both injective and surjective.
The successor function f : N → N given by f(x) = x + 1 is injective but not surjective.
The function f : N → N defined by

f(x) = x/2        if x is even
f(x) = (x + 1)/2  if x is odd

is surjective, but not injective.
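
For functions with finite domains and codomains, injectivity and surjectivity can be tested directly from the definitions. A Python sketch (ours, purely illustrative), using the functions of Example 3.10 restricted to a finite segment of N (note that the properties are relative to the chosen domain and codomain):

    def is_injective(f, A):
        # No two distinct arguments share a value.
        return len({f(x) for x in A}) == len(set(A))

    def is_surjective(f, A, B):
        # Every element of the codomain B is a value.
        return {f(x) for x in A} == set(B)

    A = range(6)
    assert not is_injective(lambda x: 1, A)                 # constant function
    assert is_injective(lambda x: x, A)                     # identity
    assert is_injective(lambda x: x + 1, A)                 # successor
    assert not is_surjective(lambda x: x + 1, A, range(6))  # 0 is never a value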

Often enough, we want to consider functions which are both injective and surjective. We call such functions bijective. They look like the function pictured in Figure 3.4. Bijections are also sometimes called one-to-one correspondences, since they uniquely pair elements of the codomain with elements of the domain.


Figure 3.4: A bijective function uniquely pairs the elements of the codomain with those of the domain.
Definition 3.11 (Bijection). A function f : A → B is bijective iff it is both
surjective and injective. We call such a function a bijection from A to B (or
between A and B).


3.3 Functions as Relations


A function which maps elements of A to elements of B obviously defines a relation between A and B, namely the relation which holds between x and y iff
f (x) = y. In fact, we might even—if we are interested in reducing the building
blocks of mathematics for instance—identify the function f with this relation,
i.e., with a set of pairs. This then raises the question: which relations define
functions in this way?
Definition 3.12 (Graph of a function). Let f : A → B be a function. The
graph of f is the relation Rf ⊆ A × B defined by

Rf = {⟨x, y⟩ : f (x) = y}.

The graph of a function is uniquely determined, by extensionality. Moreover, extensionality (on sets) will immediately vindicate the implicit principle of extensionality for functions, whereby if f and g share a domain and codomain then they are identical if they agree on all values.
Similarly, if a relation is “functional”, then it is the graph of a function.
Proposition 3.13. Let R ⊆ A × B be such that:

1. If Rxy and Rxz then y = z; and
2. for every x ∈ A there is some y ∈ B such that ⟨x, y⟩ ∈ R.

Then R is the graph of the function f : A → B defined by f(x) = y iff Rxy.

Proof. Suppose there is a y such that Rxy. If there were another z ̸= y such
that Rxz, the condition on R would be violated. Hence, if there is a y such
that Rxy, this y is unique, and so f is well-defined. Obviously, Rf = R.
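
In programming terms, a functional relation is exactly what a dictionary encodes. A Python sketch (ours, purely illustrative) recovering a function from its graph, along the lines of Proposition 3.13:

    def function_from_graph(R):
        # R: a set of pairs satisfying condition 1 of Proposition 3.13.
        f = {}
        for (x, y) in R:
            if x in f and f[x] != y:
                raise ValueError("relation is not functional")
            f[x] = y
        return f

    R = {(0, 1), (1, 2), (2, 3)}
    f = function_from_graph(R)
    assert f[1] == 2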


Every function f : A → B has a graph, i.e., a relation on A × B defined by f(x) = y. On the other hand, every relation R ⊆ A × B with the properties given in Proposition 3.13 is the graph of a function f : A → B. Because of this close connection between functions and their graphs, we can think of a function simply as its graph. In other words, functions can be identified with certain relations, i.e., with certain sets of tuples. Note, though, that the spirit of this “identification” is as in section 2.2: it is not a claim about the metaphysics of functions, but an observation that it is convenient to treat functions as certain sets. One reason that this is so convenient, is that we can now consider performing similar operations on functions as we performed on relations (see section 2.7). In particular:

Definition 3.14. Let f : A → B be a function with C ⊆ A.
The restriction of f to C is the function f↾C : C → B defined by (f↾C)(x) = f(x) for all x ∈ C. In other words, f↾C = {⟨x, y⟩ ∈ Rf : x ∈ C}.
The application of f to C is f[C] = {f(x) : x ∈ C}. We also call this the image of C under f.

It follows from these definitions that ran(f) = f[dom(f)], for any function f. These notions are exactly as one would expect, given the definitions
in section 2.7 and our identification of functions with relations. But two other
operations—inverses and relative products—require a little more detail. We
will provide that in section 3.4 and section 3.5.


3.4 Inverses of Functions


We think of functions as maps. An obvious question to ask about functions,
then, is whether the mapping can be “reversed.” For instance, the successor
function f (x) = x+1 can be reversed, in the sense that the function g(y) = y−1
“undoes” what f does.
But we must be careful. Although the definition of g defines a function Z → Z, it does not define a function N → N, since g(0) ∉ N. So even in simple
cases, it is not quite obvious whether a function can be reversed; it may depend
on the domain and codomain.
This is made more precise by the notion of an inverse of a function.

Definition 3.15. A function g : B → A is an inverse of a function f : A → B if f(g(y)) = y and g(f(x)) = x for all x ∈ A and y ∈ B.

If f has an inverse g, we often write f^−1 instead of g.


Now we will determine when functions have inverses. A good candidate for an inverse of f : A → B is g : B → A “defined by”

g(y) = “the” x such that f (x) = y.


But the scare quotes around “defined by” (and “the”) suggest that this is
not a definition. At least, it will not always work, with complete generality.
For, in order for this definition to specify a function, there has to be one and
only one x such that f (x) = y—the output of g has to be uniquely specified.
Moreover, it has to be specified for every y ∈ B. If there are x1 and x2 ∈ A
with x1 ̸= x2 but f (x1 ) = f (x2 ), then g(y) would not be uniquely specified
for y = f (x1 ) = f (x2 ). And if there is no x at all such that f (x) = y, then
g(y) is not specified at all. In other words, for g to be defined, f must be both
injective and surjective.
Let’s go slowly. We’ll divide the question into two: Given a function f : A →
B, when is there a function g : B → A so that g(f (x)) = x? Such a g “undoes”
what f does, and is called a left inverse of f . Secondly, when is there a function
h : B → A so that f (h(y)) = y? Such an h is called a right inverse of f —f
“undoes” what h does.
Proposition 3.16. If f : A → B is injective, then there is a left inverse g : B →
A of f so that g(f (x)) = x for all x ∈ A.

Proof. Suppose that f : A → B is injective. Consider a y ∈ B. If y ∈ ran(f), there is an x ∈ A so that f(x) = y. Because f is injective, there is only one such x ∈ A. Then we can define: g(y) = x, i.e., g(y) is “the” x ∈ A such that f(x) = y. If y ∉ ran(f), we can map it to any a ∈ A. So, we can pick an a ∈ A and define g : B → A by:

g(y) = x   if f(x) = y
g(y) = a   if y ∉ ran(f).

It is defined for all y ∈ B, since for each such y ∈ ran(f) there is exactly one x ∈ A such that f(x) = y. By definition, if y = f(x), then g(y) = x, i.e., g(f(x)) = x.

Problem 3.1. Show that if f : A → B has a left inverse g, then f is injective.

Proposition 3.17. If f : A → B is surjective, then there is a right inverse h : B → A of f so that f(h(y)) = y for all y ∈ B.

Proof. Suppose that f : A → B is surjective. Consider a y ∈ B. Since f is surjective, there is an x_y ∈ A with f(x_y) = y. Then we can define: h(y) = x_y, i.e., for each y ∈ B we choose some x ∈ A so that f(x) = y; since f is surjective there is always at least one to choose from.¹ By definition, if x = h(y), then f(x) = y, i.e., for any y ∈ B, f(h(y)) = y.
¹ Since f is surjective, for every y ∈ B the set {x : f(x) = y} is nonempty. Our definition of h requires that we choose a single x from each of these sets. That this is always possible is
actually not obvious—the possibility of making these choices is simply assumed as an axiom.
In other words, this proposition assumes the so-called Axiom of Choice, an issue we will
revisit in chapter 69. However, in many specific cases, e.g., when A = N or is finite, or when
f is bijective, the Axiom of Choice is not required. (In the particular case when f is bijective,
for each y ∈ B the set {x : f (x) = y} has exactly one element, so that there is no choice to
make.)


Problem 3.2. Show that if f : A → B has a right inverse h, then f is surjective.
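
On finite sets, the two constructions in the proofs above can be carried out explicitly. A Python sketch (ours, purely illustrative); here a function is given as a dictionary on a finite domain:

    def left_inverse(f, default):
        # Assumes f is injective: then g(f(x)) = x for every x in f's domain.
        g = {y: x for x, y in f.items()}
        return lambda y: g.get(y, default)  # y outside ran(f) is sent to `default`

    def right_inverse(f):
        # Assumes f is surjective onto its range: f(h(y)) = y for every such y.
        h = {}
        for x, y in f.items():
            h.setdefault(y, x)  # pick *some* x with f(x) = y
        return lambda y: h[y]

    f = {0: "a", 1: "b", 2: "b"}  # surjective onto {"a", "b"}, not injective
    h = right_inverse(f)
    assert all(f[h(y)] == y for y in {"a", "b"})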

By combining the ideas in the previous proof, we now get that every bijection has an inverse, i.e., there is a single function which is both a left and right inverse of f.
Proposition 3.18. If f : A → B is bijective, there is a function f^−1 : B → A so that for all x ∈ A, f^−1(f(x)) = x and for all y ∈ B, f(f^−1(y)) = y.

Proof. Exercise.

Problem 3.3. Prove Proposition 3.18. You have to define f^−1, show that it is a function, and show that it is an inverse of f, i.e., f^−1(f(x)) = x and f(f^−1(y)) = y for all x ∈ A and y ∈ B.

There is a slightly more general way to extract inverses. We saw in section 3.2 that every function f induces a surjection f′ : A → ran(f) by letting f′(x) = f(x) for all x ∈ A. Clearly, if f is injective, then f′ is bijective, so that it has a unique inverse by Proposition 3.18. By a very minor abuse of notation, we sometimes call the inverse of f′ simply “the inverse of f.”
Proposition 3.19. If f : A → B has a left inverse g and a right inverse h, then h = g.

Proof. Exercise.

Problem 3.4. Prove Proposition 3.19.

Proposition 3.20. Every function f has at most one inverse.

Proof. Suppose g and h are both inverses of f. Then in particular g is a left inverse of f and h is a right inverse. By Proposition 3.19, g = h.


3.5 Composition of Functions


We saw in section 3.4 that the inverse f^−1 of a bijection f is itself a function.
Another operation on functions is composition: we can define a new function
by composing two functions, f and g, i.e., by first applying f and then g. Of
course, this is only possible if the ranges and domains match, i.e., the range
of f must be a subset of the domain of g. This operation on functions is the
analogue of the operation of relative product on relations from section 2.7.
A diagram might help to explain the idea of composition. In Figure 3.5, we
depict two functions f : A → B and g : B → C and their composition (g ◦ f ).
The function (g ◦ f ) : A → C pairs each element of A with an element of C.


Figure 3.5: The composition g ◦ f of two functions f and g.
We specify which element of C an element of A is paired with as follows:
given an input x ∈ A, first apply the function f to x, which will output
some f (x) = y ∈ B, then apply the function g to y, which will output some
g(f (x)) = g(y) = z ∈ C.
Definition 3.21 (Composition). Let f : A → B and g : B → C be functions. The composition of f with g is g ◦ f : A → C, where (g ◦ f)(x) = g(f(x)).

Example 3.22. Consider the functions f (x) = x + 1, and g(x) = 2x. Since
(g ◦ f )(x) = g(f (x)), for each input x you must first take its successor, then
multiply the result by two. So their composition is given by (g◦f )(x) = 2(x+1).

Problem 3.5. Show that if f : A → B and g : B → C are both injective, then g ◦ f : A → C is injective.

Problem 3.6. Show that if f : A → B and g : B → C are both surjective, then g ◦ f : A → C is surjective.

Problem 3.7. Suppose f : A → B and g : B → C. Show that the graph of g ◦ f is Rf | Rg.
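
Composition is immediate to express in code. A Python sketch (ours, purely illustrative) of Definition 3.21 and Example 3.22:

    def compose(g, f):
        # (g ∘ f)(x) = g(f(x)): first apply f, then g.
        return lambda x: g(f(x))

    f = lambda x: x + 1  # successor
    g = lambda x: 2 * x  # doubling

    h = compose(g, f)    # h(x) = 2(x + 1)
    assert h(3) == 8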


3.6 Partial Functions


It is sometimes useful to relax the definition of function so that it is not required that the output of the function is defined for all possible inputs. Such mappings are called partial functions.
Definition 3.23. A partial function f : A ⇸ B is a mapping which assigns to every element of A at most one element of B. If f assigns an element of B to x ∈ A, we say f(x) is defined, and otherwise undefined. If f(x) is defined, we write f(x) ↓, otherwise f(x) ↑. The domain of a partial function f is the subset of A where it is defined, i.e., dom(f) = {x ∈ A : f(x) ↓}.



Example 3.24. Every function f : A → B is also a partial function. Partial
functions that are defined everywhere on A—i.e., what we so far have simply
called a function—are also called total functions.

Example 3.25. The partial function f : R ⇸ R given by f(x) = 1/x is undefined for x = 0, and defined everywhere else.

Problem 3.8. Given f : A ⇸ B, define the partial function g : B ⇸ A by: for any y ∈ B, if there is a unique x ∈ A such that f(x) = y, then g(y) = x; otherwise g(y) ↑. Show that if f is injective, then g(f(x)) = x for all x ∈ dom(f), and f(g(y)) = y for all y ∈ ran(f).
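
A partial function can be modelled in Python by returning a special “undefined” marker outside its domain (raising an exception is another common choice). A sketch (ours, purely illustrative) of Example 3.25:

    UNDEFINED = object()  # a unique marker playing the role of "f(x) ↑"

    def f(x):
        # The partial function f : R ⇸ R with f(x) = 1/x, undefined at x = 0.
        return UNDEFINED if x == 0 else 1 / x

    def defined(value):
        return value is not UNDEFINED

    assert defined(f(2)) and f(2) == 0.5
    assert not defined(f(0))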

Definition 3.26 (Graph of a partial function). Let f : A ⇸ B be a partial function. The graph of f is the relation Rf ⊆ A × B defined by

Rf = {⟨x, y⟩ : f (x) = y}.

Proposition 3.27. Suppose R ⊆ A × B has the property that whenever Rxy and Rxy′ then y = y′. Then R is the graph of the partial function f : A ⇸ B defined by: if there is a y such that Rxy, then f(x) = y, otherwise f(x) ↑. If R is also serial, i.e., for each x ∈ A there is a y ∈ B such that Rxy, then f is total.

Proof. Suppose there is a y such that Rxy. If there were another y′ ≠ y such that Rxy′, the condition on R would be violated. Hence, if there is a y such that Rxy, that y is unique, and so f is well-defined. Obviously, Rf = R and f is total if R is serial.

Chapter 4

The Size of Sets

This chapter discusses enumerations, countability and uncountability. Several sections come in two versions: a more elementary one, that takes enumerations to be lists, or surjections from Z+; and a more abstract one that defines enumerations as bijections with N.



4.1 Introduction
When Georg Cantor developed set theory in the 1870s, one of his aims was
to make palatable the idea of an infinite collection—an actual infinity, as the
medievals would say. A key part of this was his treatment of the size of different
sets. If a, b and c are all distinct, then the set {a, b, c} is intuitively larger than
{a, b}. But what about infinite sets? Are they all as large as each other? It
turns out that they are not.
The first important idea here is that of an enumeration. We can list every
finite set by listing all its elements. For some infinite sets, we can also list
all their elements if we allow the list itself to be infinite. Such sets are called
enumerable. Cantor’s surprising result, which we will fully understand by the
end of this chapter, was that some infinite sets are not enumerable.


4.2 Enumerations and Enumerable Sets


This section discusses enumerations of sets, defining them as surjections from Z+. It does things slowly, for readers with little mathematical background. An alternative, terser version is given in section 4.11, which defines enumerations differently: as bijections with N (or an initial segment).

We’ve already given examples of sets by listing their elements. Let’s discuss
in more general terms how and when we can list the elements of a set, even if
that set is infinite.
Definition 4.1 (Enumeration, informally). Informally, an enumeration of
a set A is a list (possibly infinite) of elements of A such that every element of
A appears on the list at some finite position. If A has an enumeration, then A
is said to be enumerable.
A couple of points about enumerations:

1. We count as enumerations only lists which have a beginning and in which every element other than the first has a single element immediately preceding it. In other words, there are only finitely many elements between
the first element of the list and any other element. In particular, this
means that every element of an enumeration has a finite position: the
first element has position 1, the second position 2, etc.


2. We can have different enumerations of the same set A which differ by the
order in which the elements appear: 4, 1, 25, 16, 9 enumerates the (set
of the) first five square numbers just as well as 1, 4, 9, 16, 25 does.
3. Redundant enumerations are still enumerations: 1, 1, 2, 2, 3, 3, . . . enu-
merates the same set as 1, 2, 3, . . . does.
4. Order and redundancy do matter when we specify an enumeration: we
can enumerate the positive integers beginning with 1, 2, 3, 1, . . . , but
the pattern is easier to see when enumerated in the standard way as 1,
2, 3, 4, . . .
5. Enumerations must have a beginning: . . . , 3, 2, 1 is not an enumeration
of the positive integers because it has no first element. To see how this
follows from the informal definition, ask yourself, “at what position in
the list does the number 76 appear?”
6. The following is not an enumeration of the positive integers: 1, 3, 5, . . . ,
2, 4, 6, . . . The problem is that the even numbers occur at places ∞ + 1,
∞ + 2, ∞ + 3, rather than at finite positions.
7. The empty set is enumerable: it is enumerated by the empty list!
Proposition 4.2. If A has an enumeration, it has an enumeration without
repetitions.

Proof. Suppose A has an enumeration x1 , x2 , . . . in which each xi is an element of A. We can remove repetitions from an enumeration by removing repeated elements. For instance, we can turn the enumeration into a new one in which we list xi if it is an element of A that is not among x1 , . . . , xi−1 or remove xi from the list if it already appears among x1 , . . . , xi−1 .

The last argument shows that in order to get a good handle on enumerations
and enumerable sets and to prove things about them, we need a more precise
definition. The following provides it.
Definition 4.3 (Enumeration, formally). An enumeration of a set A ̸= ∅
is any surjective function f : Z+ → A.

Let’s convince ourselves that the formal definition and the informal defini-
tion using a possibly infinite list are equivalent. First, any surjective function
from Z+ to a set A enumerates A. Such a function determines an enumeration
as defined informally above: the list f (1), f (2), f (3), . . . . Since f is surjective,
every element of A is guaranteed to be the value of f (n) for some n ∈ Z+ .
Hence, every element of A appears at some finite position in the list. Since the
function may not be injective, the list may be redundant, but that is acceptable
(as noted above).
On the other hand, given a list that enumerates all elements of A, we can
define a surjective function f : Z+ → A by letting f (n) be the nth element of


the list, or the final element of the list if there is no nth element. The only
case where this does not produce a surjective function is when A is empty,
and hence the list is empty. So, every non-empty list determines a surjective
function f : Z+ → A.

Definition 4.4. A set A is enumerable iff it is empty or has an enumeration.

Example 4.5. A function enumerating the positive integers (Z+ ) is simply the identity function given by f (n) = n. A function enumerating the natural numbers N is the function g(n) = n − 1.

Example 4.6. The functions f : Z+ → Z+ and g : Z+ → Z+ given by

f (n) = 2n and
g(n) = 2n − 1

enumerate the even positive integers and the odd positive integers, respectively.
However, neither function is an enumeration of Z+ , since neither is surjective.

Problem 4.1. Define an enumeration of the positive squares 1, 4, 9, 16, . . .

Example 4.7. The function f (n) = (−1)^n ⌈(n − 1)/2⌉ (where ⌈x⌉ denotes the ceiling function, which rounds x up to the nearest integer) enumerates the set of integers Z. Notice how f generates the values of Z by “hopping” back and forth between positive and negative integers:

f (1)      f (2)     f (3)      f (4)     f (5)      f (6)     f (7)      ...
−⌈0/2⌉     ⌈1/2⌉     −⌈2/2⌉     ⌈3/2⌉     −⌈4/2⌉     ⌈5/2⌉     −⌈6/2⌉     ...
0          1         −1         2         −2         3         −3         ...

You can also think of f as defined by cases as follows:

f (n) = 0             if n = 1,
        n/2           if n is even,
        −(n − 1)/2    if n is odd and > 1.
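For readers who like to experiment, here is a minimal Python sketch (our own illustration, not part of the text) of the function f from Example 4.7; the function name and the printed range are just for demonstration:

    from math import ceil

    def f(n: int) -> int:
        # Example 4.7: f(n) = (-1)^n * ceil((n - 1) / 2)
        return (-1) ** n * ceil((n - 1) / 2)

    # The first ten values hop back and forth through Z:
    print([f(n) for n in range(1, 11)])
    # [0, 1, -1, 2, -2, 3, -3, 4, -4, 5]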

Problem 4.2. Show that if A and B are enumerable, so is A ∪ B. To do this, suppose there are surjective functions f : Z+ → A and g : Z+ → B, and define a surjective function h : Z+ → A ∪ B and prove that it is surjective. Also consider the cases where A or B = ∅.

Problem 4.3. Show that if B ⊆ A and A is enumerable, so is B. To do this, suppose there is a surjective function f : Z+ → A. Define a surjective function g : Z+ → B and prove that it is surjective. What happens if B = ∅?


Problem 4.4. Show by induction on n that if A1 , A2 , . . . , An are all enumerable, so is A1 ∪ · · · ∪ An . You may assume the fact that if two sets A and B are enumerable, so is A ∪ B.

Although it is perhaps more natural when listing the elements of a set to start counting from the 1st element, mathematicians like to use the natural numbers N for counting things. They talk about the 0th, 1st, 2nd, and so on, elements of a list. Correspondingly, we can define an enumeration as a surjective function from N to A. Of course, the two definitions are equivalent.
Proposition 4.8. There is a surjection f : Z+ → A iff there is a surjection g : N → A.

Proof. Given a surjection f : Z+ → A, we can define g(n) = f (n + 1) for all n ∈ N. It is easy to see that g : N → A is surjective. Conversely, given a surjection g : N → A, define f (n) = g(n − 1).

This gives us the following result:


Corollary 4.9. A set A is enumerable iff it is empty or there is a surjective function f : N → A.

We discussed above that a list of elements of a set A can be turned into a list without repetitions. This is also true for enumerations, but a bit harder to formulate and prove rigorously. Any function f : Z+ → A must be defined for all n ∈ Z+ . If there are only finitely many elements in A then we clearly cannot have a function defined on the infinitely many elements of Z+ that takes as values all the elements of A but never takes the same value twice. In that case, i.e., in the case where the list without repetitions is finite, we must choose a different domain for f , one with only finitely many elements. Not having repetitions means that f must be injective. Since it is also surjective, we are looking for a bijection between some finite set {1, . . . , n} or Z+ and A.
Proposition 4.10. If f : Z+ → A is surjective (i.e., an enumeration of A), there is a bijection g : Z → A where Z is either Z+ or {1, . . . , n} for some n ∈ Z+ .

Proof. We define the function g recursively: Let g(1) = f (1). If g(i) has already
been defined, let g(i + 1) be the first value of f (1), f (2), . . . not already among
g(1), . . . , g(i), if there is one. If A has just n elements, then g(1), . . . , g(n)
are all defined, and so we have defined a function g : {1, . . . , n} → A. If A has
infinitely many elements, then for any i there must be an element of A in the
enumeration f (1), f (2), . . . , which is not already among g(1), . . . , g(i). In this
case we have defined a function g : Z+ → A.
The function g is surjective, since any element of A is among f (1), f (2), . . .
(since f is surjective) and so will eventually be a value of g(i) for some i. It is
also injective, since if there were j < i such that g(j) = g(i), then g(i) would
already be among g(1), . . . , g(i − 1), contrary to how we defined g.
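The recursive construction of g can be imitated computationally. The following Python sketch (ours; the helper name dedup is invented for illustration) takes the values f (1), f (2), . . . as an iterable and yields each element the first time it occurs, which is exactly the repetition-free enumeration built in the proof:

    from itertools import islice

    def dedup(values):
        # Yield g(1), g(2), ...: each value of f on its first occurrence.
        seen = set()
        for v in values:
            if v not in seen:
                seen.add(v)
                yield v

    # f enumerates {1, 2, 3} with repetitions: 1, 1, 2, 2, 3, 3, ...
    f_values = [1, 1, 2, 2, 3, 3, 1, 2, 3]
    print(list(islice(dedup(f_values), 3)))  # [1, 2, 3]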


Corollary 4.11. A set A is enumerable iff it is empty or there is a bijection f : N → A where either N = N or N = {0, . . . , n} for some n ∈ N.
Proof. A is enumerable iff A is empty or there is a surjective f : Z+ → A. By
Proposition 4.10, the latter holds iff there is a bijective function f : Z → A
where Z = Z+ or Z = {1, . . . , n} for some n ∈ Z+ . By the same argument as
in the proof of Proposition 4.8, that in turn is the case iff there is a bijection
g : N → A where either N = N or N = {0, . . . , n − 1}.
Problem 4.5. According to Definition 4.4, a set A is enumerable iff A = ∅ or
there is a surjective f : Z+ → A. It is also possible to define “enumerable set”
precisely by: a set is enumerable iff there is an injective function g : A → Z+ .
Show that the definitions are equivalent, i.e., show that there is an injective
function g : A → Z+ iff either A = ∅ or there is a surjective f : Z+ → A.


4.3 Cantor’s Zig-Zag Method


We’ve already considered some “easy” enumerations. Now we will consider
something a bit harder. Consider the set of pairs of natural numbers, which
we defined in section 1.5 thus:
N × N = {⟨n, m⟩ : n, m ∈ N}
We can organize these ordered pairs into an array, like so:
0 1 2 3 ...
0 ⟨0, 0⟩ ⟨0, 1⟩ ⟨0, 2⟩ ⟨0, 3⟩ ...
1 ⟨1, 0⟩ ⟨1, 1⟩ ⟨1, 2⟩ ⟨1, 3⟩ ...
2 ⟨2, 0⟩ ⟨2, 1⟩ ⟨2, 2⟩ ⟨2, 3⟩ ...
3 ⟨3, 0⟩ ⟨3, 1⟩ ⟨3, 2⟩ ⟨3, 3⟩ ...
.. .. .. .. .. ..
. . . . . .
Clearly, every ordered pair in N × N will appear exactly once in the array. In
particular, ⟨n, m⟩ will appear in the nth row and mth column. But how do
we organize the elements of such an array into a “one-dimensional” list? The
pattern in the array below demonstrates one way to do this (although of course
there are many other options):
0 1 2 3 4 ...
0 0 1 3 6 10 ...
1 2 4 7 11 ... ...
2 5 8 12 ... ... ...
3 9 13 ... ... ... ...
4 14 ... ... ... ... ...
.. .. .. .. .. ..
. . . . . ... .


This pattern is called Cantor’s zig-zag method. It enumerates N × N as follows:

⟨0, 0⟩, ⟨0, 1⟩, ⟨1, 0⟩, ⟨0, 2⟩, ⟨1, 1⟩, ⟨2, 0⟩, ⟨0, 3⟩, ⟨1, 2⟩, ⟨2, 1⟩, ⟨3, 0⟩, . . .

And this establishes the following:

Proposition 4.12. N × N is enumerable.

Proof. Let f : N → N × N take each k ∈ N to the tuple ⟨n, m⟩ ∈ N × N such that k is the value of the nth row and mth column in Cantor’s zig-zag array.
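Concretely, the zig-zag enumeration just walks the finite diagonals n + m = 0, 1, 2, . . . in order. Here is a minimal Python sketch of this (our illustration; the generator name zigzag is ours):

    from itertools import islice

    def zigzag():
        # Enumerate N x N by walking the diagonals n + m = 0, 1, 2, ...
        d = 0
        while True:
            for n in range(d + 1):
                yield (n, d - n)
            d += 1

    print(list(islice(zigzag(), 10)))
    # [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0),
    #  (0, 3), (1, 2), (2, 1), (3, 0)]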

This technique also generalises rather nicely. For example, we can use it to
enumerate the set of ordered triples of natural numbers, i.e.:

N × N × N = {⟨n, m, k⟩ : n, m, k ∈ N}

We think of N × N × N as the Cartesian product of N × N with N, that is,

N^3 = (N × N) × N = {⟨⟨n, m⟩, k⟩ : n, m, k ∈ N}

and thus we can enumerate N^3 with an array by labelling one axis with the enumeration of N, and the other axis with the enumeration of N^2 :

0 1 2 3 ...
⟨0, 0⟩ ⟨0, 0, 0⟩ ⟨0, 0, 1⟩ ⟨0, 0, 2⟩ ⟨0, 0, 3⟩ ...
⟨0, 1⟩ ⟨0, 1, 0⟩ ⟨0, 1, 1⟩ ⟨0, 1, 2⟩ ⟨0, 1, 3⟩ ...
⟨1, 0⟩ ⟨1, 0, 0⟩ ⟨1, 0, 1⟩ ⟨1, 0, 2⟩ ⟨1, 0, 3⟩ ...
⟨0, 2⟩ ⟨0, 2, 0⟩ ⟨0, 2, 1⟩ ⟨0, 2, 2⟩ ⟨0, 2, 3⟩ ...
.. .. .. .. .. ..
. . . . . .

Thus, by using a method like Cantor’s zig-zag method, we may similarly obtain an enumeration of N^3 . And we can keep going, obtaining enumerations of N^n for any natural number n. So, we have:

Proposition 4.13. N^n is enumerable, for every n ∈ N.

Problem 4.6. Show that (Z+ )^n is enumerable, for every n ∈ N.

Problem 4.7. Show that (Z+ )∗ is enumerable. You may assume Problem 4.6.


4.4 Pairing Functions and Codes


Cantor’s zig-zag method makes the enumerability of N^n visually evident. But
let us focus on our array depicting N^2 . Following the zig-zag line in the array and counting the places, we can check that ⟨1, 2⟩ is associated with the number 7. However, it would be nice if we could compute this more directly. That is, it would be nice to have to hand the inverse of the zig-zag enumeration, g : N^2 → N, such that

g(⟨0, 0⟩) = 0, g(⟨0, 1⟩) = 1, g(⟨1, 0⟩) = 2, . . . , g(⟨1, 2⟩) = 7, . . .

This would enable us to calculate exactly where ⟨n, m⟩ will occur in our enu-
meration.
In fact, we can define g directly by making two observations. First: if the
nth row and mth column contains value v, then the (n + 1)st row and (m − 1)st
column contains value v + 1. Second: the first row of our enumeration consists
of the triangular numbers, starting with 0, 1, 3, 6, etc. The kth triangular number is the sum of the natural numbers up to and including k, which can be computed as k(k + 1)/2. Putting these two observations together, consider this function:

g(n, m) = (n + m + 1)(n + m)/2 + n
We often just write g(n, m) rather than g(⟨n, m⟩), since it is easier on the eyes. This tells you first to determine the (n + m)th triangle number, and then add n to it. And it populates the array in exactly the way we would like. So in particular, the pair ⟨1, 2⟩ is sent to (4 × 3)/2 + 1 = 7.
This function g is the inverse of an enumeration of a set of pairs. Such
functions are called pairing functions.
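As a quick sanity check on the formula, here is a Python sketch (ours, with invented helper names) of g together with an inverse that recovers ⟨n, m⟩ from a code by first locating its diagonal:

    from math import isqrt

    def g(n: int, m: int) -> int:
        # Position of <n, m> in Cantor's zig-zag enumeration of N x N.
        return (n + m + 1) * (n + m) // 2 + n

    def g_inverse(k: int):
        # The diagonal d = n + m is the largest d with d(d+1)/2 <= k.
        d = (isqrt(8 * k + 1) - 1) // 2
        n = k - d * (d + 1) // 2
        return (n, d - n)

    assert g(0, 0) == 0 and g(0, 1) == 1 and g(1, 0) == 2
    assert g(1, 2) == 7 and g_inverse(7) == (1, 2)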

Definition 4.14 (Pairing function). A function f : A × B → N is an arithmetical pairing function if f is injective. We also say that f encodes A × B, and that f (x, y) is the code for ⟨x, y⟩.

We can use pairing functions to encode, e.g., pairs of natural numbers; or,

in other words, we can represent each pair of elements using a single number.
Using the inverse of the pairing function, we can decode the number, i.e., find
out which pair it represents.

Problem 4.8. Give an enumeration of the set of all non-negative rational numbers.

Problem 4.9. Show that Q is enumerable. Recall that any rational number
can be written as a fraction z/m with z ∈ Z, m ∈ N+ .

Problem 4.10. Define an enumeration of B∗ .


Problem 4.11. Recall from your introductory logic course that each possible truth table expresses a truth function. In other words, the truth functions are all functions from B^k → B for some k. Prove that the set of all truth functions is enumerable.

Problem 4.12. Show that the set of all finite subsets of an arbitrary infinite
enumerable set is enumerable.

Problem 4.13. A subset of N is said to be cofinite iff it is the complement of a finite set; that is, A ⊆ N is cofinite iff N \ A is finite. Let I be the set whose elements are exactly the finite and cofinite subsets of N. Show that I is enumerable.

Problem 4.14. Show that the enumerable union of enumerable sets is enumerable. That is, whenever A1 , A2 , . . . are sets, and each Ai is enumerable, then the union A1 ∪ A2 ∪ · · · of all of them is also enumerable. [NB: this is hard!]

Problem 4.15. Let f : A × B → N be an arbitrary pairing function. Show that the inverse of f is an enumeration of A × B.

Problem 4.16. Specify a function that encodes N^3 .


4.5 An Alternative Pairing Function


There are other enumerations of N^2 that make it easier to figure out what their
inverses are. Here is one. Instead of visualizing the enumeration in an array,
start with the list of positive integers associated with (initially) empty spaces.
Imagine filling these spaces successively with pairs ⟨n, m⟩ as follows. Starting
with the pairs that have 0 in the first place (i.e., pairs ⟨0, m⟩), put the first (i.e.,
⟨0, 0⟩) in the first empty place, then skip an empty space, put the second (i.e.,
⟨0, 1⟩) in the next empty place, skip one again, and so forth. The (incomplete)
beginning of our enumeration now looks like this

1        2    3        4    5        6    7        8    9        10   ...
⟨0, 0⟩        ⟨0, 1⟩        ⟨0, 2⟩        ⟨0, 3⟩        ⟨0, 4⟩        ...

Repeat this with pairs ⟨1, m⟩ for the places that still remain empty, again skip-
ping every other empty place:

1        2        3        4    5        6        7        8    9        10       ...
⟨0, 0⟩   ⟨1, 0⟩   ⟨0, 1⟩        ⟨0, 2⟩   ⟨1, 1⟩   ⟨0, 3⟩        ⟨0, 4⟩   ⟨1, 2⟩   ...


Enter pairs ⟨2, m⟩, ⟨3, m⟩, etc., in the same way. Our completed enumeration
thus starts like this:
1 2 3 4 5 6 7 8 9 10 ...

⟨0, 0⟩ ⟨1, 0⟩ ⟨0, 1⟩ ⟨2, 0⟩ ⟨0, 2⟩ ⟨1, 1⟩ ⟨0, 3⟩ ⟨3, 0⟩ ⟨0, 4⟩ ⟨1, 2⟩ ...
If we number the cells in the array above according to this enumeration, we
will not find a neat zig-zag line, but this arrangement:
0 1 2 3 4 5 ...
0 1 3 5 7 9 11 ...
1 2 6 10 14 18 ... ...
2 4 12 20 28 ... ... ...
3 8 24 40 ... ... ... ...
4 16 48 ... ... ... ... ...
5 32 ... ... ... ... ... ...
.. .. .. .. .. .. .. ..
. . . . . . . .
We can see that the pairs in row 0 are in the odd numbered places of our
enumeration, i.e., pair ⟨0, m⟩ is in place 2m + 1; pairs in the second row,
⟨1, m⟩, are in places whose number is the double of an odd number, specifically,
2 · (2m + 1); pairs in the third row, ⟨2, m⟩, are in places whose number is four
times an odd number, 4 · (2m + 1); and so on. The factors of (2m + 1) for
each row, 1, 2, 4, 8, . . . , are exactly the powers of 2: 1 = 2^0 , 2 = 2^1 , 4 = 2^2 , 8 = 2^3 , . . . In fact, the relevant exponent is always the first member of the pair in question. Thus, for pair ⟨n, m⟩ the factor is 2^n . This gives us the general formula: 2^n · (2m + 1). However, this is a mapping of pairs to positive integers, i.e., ⟨0, 0⟩ has position 1. If we want to begin at position 0 we must subtract 1 from the result. This gives us:
Example 4.15. The function h : N^2 → N given by

h(n, m) = 2^n (2m + 1) − 1

is a pairing function for the set of pairs of natural numbers N^2 .
Accordingly, in our second enumeration of N^2 , the pair ⟨0, 0⟩ has code h(0, 0) = 2^0 (2 · 0 + 1) − 1 = 0; ⟨1, 2⟩ has code 2^1 · (2 · 2 + 1) − 1 = 2 · 5 − 1 = 9; ⟨2, 6⟩ has code 2^2 · (2 · 6 + 1) − 1 = 51.
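Computationally, decoding h amounts to counting how often 2 divides k + 1 and then reading off the remaining odd factor. Here is a Python sketch (ours, with invented helper names) of h and its inverse:

    def h(n: int, m: int) -> int:
        # h(n, m) = 2^n * (2m + 1) - 1
        return 2 ** n * (2 * m + 1) - 1

    def h_inverse(k: int):
        # Write k + 1 as 2^n * (2m + 1) and return <n, m>.
        k += 1
        n = 0
        while k % 2 == 0:
            k //= 2
            n += 1
        return (n, (k - 1) // 2)

    assert h(0, 0) == 0 and h(1, 2) == 9 and h(2, 6) == 51
    assert all(h_inverse(h(n, m)) == (n, m)
               for n in range(8) for m in range(8))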
Sometimes it is enough to encode pairs of natural numbers N^2 without
requiring that the encoding is surjective. Such encodings have inverses that
are only partial functions.
Example 4.16. The function j : N^2 → N+ given by

j(n, m) = 2^n 3^m

is an injective function N^2 → N.


4.6 Non-enumerable Sets



This section proves the non-enumerability of Bω and ℘(Z+ ) using the definition in section 4.2. It is designed to be a little more elementary and a little more detailed than the version in section 4.11.

Some sets, such as the set Z+ of positive integers, are infinite. So far we’ve
seen examples of infinite sets which were all enumerable. However, there are
also infinite sets which do not have this property. Such sets are called non-
enumerable.
First of all, it is perhaps already surprising that there are non-enumerable
sets. For any enumerable set A there is a surjective function f : Z+ → A. If a
set is non-enumerable there is no such function. That is, no function mapping
the infinitely many elements of Z+ to A can exhaust all of A. So there are
“more” elements of A than the infinitely many positive integers.
How would one prove that a set is non-enumerable? You have to show that
no such surjective function can exist. Equivalently, you have to show that the
elements of A cannot be enumerated in a one-way infinite list. The best way
to do this is to show that every list of elements of A must leave at least one
element out; or that no function f : Z+ → A can be surjective. We can do this
using Cantor’s diagonal method. Given a list of elements of A, say, x1 , x2 , . . . ,
we construct another element of A which, by its construction, cannot possibly
be on that list.
Our first example is the set Bω of all infinite, non-gappy sequences of 0’s
and 1’s.

Theorem 4.17. Bω is non-enumerable.

Proof. Suppose, by way of contradiction, that Bω is enumerable, i.e., suppose that there is a list s1 , s2 , s3 , s4 , . . . of all elements of Bω . Each of these si is itself an infinite sequence of 0’s and 1’s. Let’s call the j-th element of the i-th sequence in this list si (j). Then the i-th sequence si is

si (1), si (2), si (3), . . .

We may arrange this list, and the elements of each sequence si in it, in an
array:
1 2 3 4 ...
1 s1 (1) s1 (2) s1 (3) s1 (4) . . .
2 s2 (1) s2 (2) s2 (3) s2 (4) . . .
3 s3 (1) s3 (2) s3 (3) s3 (4) . . .
4 s4 (1) s4 (2) s4 (3) s4 (4) . . .
.. .. .. .. .. ..
. . . . . .


The labels down the side give the number of the sequence in the list s1 , s2 , . . . ;
the numbers across the top label the elements of the individual sequences. For
instance, s1 (1) is a name for whatever number, a 0 or a 1, is the first element
in the sequence s1 , and so on.
Now we construct an infinite sequence, s, of 0’s and 1’s which cannot pos-
sibly be on this list. The definition of s will depend on the list s1 , s2 , . . . . Any
infinite list of infinite sequences of 0’s and 1’s gives rise to an infinite sequence s
which is guaranteed to not appear on the list.
To define s, we specify what all its elements are, i.e., we specify s(n) for all
n ∈ Z+ . We do this by reading down the diagonal of the array above (hence
the name “diagonal method”) and then changing every 1 to a 0 and every 0
to a 1. More abstractly, we define s(n) to be 0 or 1 according to whether the
n-th element of the diagonal, sn (n), is 1 or 0.
s(n) = 1 if sn (n) = 0,
       0 if sn (n) = 1.

If you like formulas better than definitions by cases, you could also define
s(n) = 1 − sn (n).
Clearly s is an infinite sequence of 0’s and 1’s, since it is just the mirror
sequence to the sequence of 0’s and 1’s that appear on the diagonal of our
array. So s is an element of Bω . But it cannot be on the list s1 , s2 , . . . Why
not?
It can’t be the first sequence in the list, s1 , because it differs from s1 in the
first element. Whatever s1 (1) is, we defined s(1) to be the opposite. It can’t be
the second sequence in the list, because s differs from s2 in the second element:
if s2 (2) is 0, s(2) is 1, and vice versa. And so on.
More precisely: if s were on the list, there would be some k so that s = sk .
Two sequences are identical iff they agree at every place, i.e., for any n, s(n) =
sk (n). So in particular, taking n = k as a special case, s(k) = sk (k) would
have to hold. sk (k) is either 0 or 1. If it is 0 then s(k) must be 1—that’s how
we defined s. But if sk (k) = 1 then, again because of the way we defined s,
s(k) = 0. In either case s(k) ̸= sk (k).
We started by assuming that there is a list of elements of Bω , s1 , s2 , . . .
From this list we constructed a sequence s which we proved cannot be on the
list. But it definitely is a sequence of 0’s and 1’s if all the si are sequences of
0’s and 1’s, i.e., s ∈ Bω . This shows in particular that there can be no list of
all elements of Bω , since for any such list we could also construct a sequence s
guaranteed to not be on the list, so the assumption that there is a list of all
sequences in Bω leads to a contradiction.

This proof method is called “diagonalization” because it uses the diagonal of the array to define s. Diagonalization need not involve the presence of an array: we can show that sets are not enumerable by using a similar idea even when no array and no actual diagonal is involved.
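Although the theorem concerns infinite lists, the diagonal construction itself is completely concrete, and its finite stages can be computed. In this Python sketch (ours, with invented names), we are given the first k sequences, each represented by its first k bits, and build the first k bits of the flipped diagonal s:

    def diagonal_flip(rows):
        # rows[i] holds (at least) the first len(rows) bits of the (i+1)-st
        # sequence; return the matching initial bits of s, where
        # s(n) = 1 - s_n(n).
        return [1 - row[i] for i, row in enumerate(rows)]

    rows = [[0, 0, 0, 0],
            [1, 1, 1, 1],
            [0, 1, 0, 1],
            [1, 0, 1, 0]]
    print(diagonal_flip(rows))  # [1, 0, 1, 1]: differs from row i at place i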
Theorem 4.18. ℘(Z+ ) is not enumerable.


Proof. We proceed in the same way, by showing that for every list of subsets
of Z+ there is a subset of Z+ which cannot be on the list. Suppose the following
is a given list of subsets of Z+ :

Z1 , Z2 , Z3 , . . .

We now define a set Z such that for any n ∈ Z+ , n ∈ Z iff n ∉ Zn :

Z = {n ∈ Z+ : n ∉ Zn }

Z is clearly a set of positive integers, since by assumption each Zn is, and thus
Z ∈ ℘(Z+ ). But Z cannot be on the list. To show this, we’ll establish that for
each k ∈ Z+ , Z ̸= Zk .
So let k ∈ Z+ be arbitrary. We’ve defined Z so that for any n ∈ Z+ , n ∈ Z iff n ∉ Zn . In particular, taking n = k, k ∈ Z iff k ∉ Zk . But this shows that
Z ̸= Zk , since k is an element of one but not the other, and so Z and Zk have
different elements. Since k was arbitrary, Z is not on the list Z1 , Z2 , . . .

The preceding proof did not mention a diagonal, but you can think of it as
involving a diagonal if you picture it this way: Imagine the sets Z1 , Z2 , . . . ,
written in an array, where each element j ∈ Zi is listed in the j-th column.
Say the first four sets on that list are {1, 2, 3, . . . }, {2, 4, 6, . . . }, {1, 2, 5}, and
{3, 4, 5, . . . }. Then the array would begin with

Z1 = {1, 2, 3, 4, 5, 6, . . . }
Z2 ={ 2, 4, 6, . . . }
Z3 = {1, 2, 5 }
Z4 ={ 3, 4, 5, 6, . . . }
.. ..
. .

Then Z is the set obtained by going down the diagonal, leaving out any numbers that appear along the diagonal and including those j where the array has a gap in the j-th row/column. In the above case, we would leave out 1 and 2, include 3, leave out 4, etc.
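The set Z can likewise be sampled on any finite initial segment of the list. A Python sketch (ours), using the four example sets above truncated to finitely many elements:

    def diagonal_set(sets):
        # Given Z_1, ..., Z_k as Python sets, return {n <= k : n not in Z_n}.
        return {n for n, Zn in enumerate(sets, start=1) if n not in Zn}

    Zs = [{1, 2, 3, 4, 5}, {2, 4, 6}, {1, 2, 5}, {3, 4, 5, 6}]
    print(diagonal_set(Zs))  # {3}: leave out 1 and 2, include 3, leave out 4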

Problem 4.17. Show that ℘(N) is non-enumerable by a diagonal argument.

Problem 4.18. Show that the set of functions f : Z+ → Z+ is non-enumerable by an explicit diagonal argument. That is, show that if f1 , f2 , . . . , is a list of functions and each fi : Z+ → Z+ , then there is some f : Z+ → Z+ not on this list.


4.7 Reduction

This section proves non-enumerability by reduction, matching the results in section 4.6. An alternative, slightly more condensed version matching the results in section 4.12 is provided in section 4.13.

We showed ℘(Z+ ) to be non-enumerable by a diagonalization argument. We already had a proof that Bω , the set of all infinite sequences of 0s and 1s, is non-
enumerable. Here’s another way we can prove that ℘(Z+ ) is non-enumerable:
Show that if ℘(Z+ ) is enumerable then Bω is also enumerable. Since we know
Bω is not enumerable, ℘(Z+ ) can’t be either. This is called reducing one
problem to another—in this case, we reduce the problem of enumerating Bω to
the problem of enumerating ℘(Z+ ). A solution to the latter—an enumeration
of ℘(Z+ )—would yield a solution to the former—an enumeration of Bω .
How do we reduce the problem of enumerating a set B to that of enumer-
ating a set A? We provide a way of turning an enumeration of A into an
enumeration of B. The easiest way to do that is to define a surjective function
f : A → B. If x1 , x2 , . . . enumerates A, then f (x1 ), f (x2 ), . . . would enumer-
ate B. In our case, we are looking for a surjective function f : ℘(Z+ ) → Bω .

Problem 4.19. Show that if there is an injective function g : B → A, and B is non-enumerable, then so is A. Do this by showing how you can use g to turn an enumeration of A into one of B.

Proof of Theorem 4.18 by reduction. Suppose that ℘(Z+ ) were enumerable, and
thus that there is an enumeration of it, Z1 , Z2 , Z3 , . . .
Define the function f : ℘(Z+ ) → Bω by letting f (Z) be the sequence sk
such that sk (n) = 1 iff n ∈ Z, and sk (n) = 0 otherwise. This clearly defines
a function, since whenever Z ⊆ Z+ , any n ∈ Z+ either is an element of Z or
isn’t. For instance, the set 2Z+ = {2, 4, 6, . . . } of positive even numbers gets
mapped to the sequence 010101 . . . , the empty set gets mapped to 0000 . . . and
the set Z+ itself to 1111 . . . .
It also is surjective: Every sequence of 0s and 1s corresponds to some set
of positive integers, namely the one which has as its members those integers
corresponding to the places where the sequence has 1s. More precisely, suppose
s ∈ Bω . Define Z ⊆ Z+ by:

Z = {n ∈ Z+ : s(n) = 1}

Then f (Z) = s, as can be verified by consulting the definition of f .


Now consider the list

f (Z1 ), f (Z2 ), f (Z3 ), . . .


Since f is surjective, every member of Bω must appear as a value of f for some argument, and so must appear on the list. This list must therefore enumerate all of Bω .
So if ℘(Z+ ) were enumerable, Bω would be enumerable. But Bω is non-enumerable (Theorem 4.17). Hence ℘(Z+ ) is non-enumerable.
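The reduction function f is easy to compute on finite prefixes: a set goes to its characteristic sequence, and conversely a sequence determines a set. A Python sketch (ours, with invented helper names):

    def char_prefix(Z, k):
        # First k entries of f(Z): entry n is 1 iff n is in Z (n = 1, ..., k).
        return [1 if n in Z else 0 for n in range(1, k + 1)]

    def set_from(s):
        # The set Z with f(Z) = s: the positions where s has a 1.
        return {n for n, bit in enumerate(s, start=1) if bit == 1}

    evens = {2, 4, 6, 8}
    print(char_prefix(evens, 8))               # [0, 1, 0, 1, 0, 1, 0, 1]
    print(set_from([0, 1, 0, 1, 0, 1, 0, 1]))  # {2, 4, 6, 8}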

It is easy to be confused about the direction the reduction goes in. For
instance, a surjective function g : Bω → B does not establish that B is non-
enumerable. (Consider g : Bω → B defined by g(s) = s(1), the function that
maps a sequence of 0’s and 1’s to its first element. It is surjective, because
some sequences start with 0 and some start with 1. But B is finite.) Note also
that the function f must be surjective, or otherwise the argument does not go
through: f (x1 ), f (x2 ), . . . would then not be guaranteed to include all the
elements of B. For instance,

h(n) = 000 . . . 0 (a string of n 0’s)

defines a function h : Z+ → Bω , but Z+ is enumerable.

Problem 4.20. Show that the set of all sets of pairs of positive integers is
non-enumerable by a reduction argument.

Problem 4.21. Show that the set X of all functions f : N → N is non-enumerable by a reduction argument (Hint: give a surjective function from X to Bω .)

Problem 4.22. Show that Nω , the set of infinite sequences of natural num-
bers, is non-enumerable by a reduction argument.

Problem 4.23. Let P be the set of functions from the set of positive integers
to the set {0}, and let Q be the set of partial functions from the set of positive
integers to the set {0}. Show that P is enumerable and Q is not. (Hint: reduce
the problem of enumerating Bω to enumerating Q).

Problem 4.24. Let S be the set of all surjective functions from the set of
positive integers to the set {0,1}, i.e., S consists of all surjective f : Z+ → B.
Show that S is non-enumerable.

Problem 4.25. Show that the set R of all real numbers is non-enumerable.


4.8 Equinumerosity
We have an intuitive notion of “size” of sets, which works fine for finite sets.
But what about infinite sets? If we want to come up with a formal way of
comparing the sizes of two sets of any size, it is a good idea to start by defining
when sets are the same size. Here is Frege:

If a waiter wants to be sure that he has laid exactly as many knives as plates on the table, he does not need to count either of them, if he simply lays a knife to the right of each plate, so that every knife on the table lies to the right of some plate. The plates and knives are thus uniquely correlated to each other, and indeed through that same spatial relationship. (Frege, 1884, §70)

The insight of this passage can be brought out through a formal definition:

Definition 4.19. A is equinumerous with B, written A ≈ B, iff there is a bijection f : A → B.

Proposition 4.20. Equinumerosity is an equivalence relation.

Proof. We must show that equinumerosity is reflexive, symmetric, and transitive. Let A, B, and C be sets.
Reflexivity. The identity map IdA : A → A, where IdA (x) = x for all x ∈ A,
is a bijection. So A ≈ A.
Symmetry. Suppose A ≈ B, i.e., there is a bijection f : A → B. Since f
is bijective, its inverse f −1 exists and is also bijective. Hence, f −1 : B → A is
a bijection, so B ≈ A.
Transitivity. Suppose that A ≈ B and B ≈ C, i.e., there are bijections
f : A → B and g : B → C. Then the composition g ◦ f : A → C is bijective, so
that A ≈ C.

Proposition 4.21. If A ≈ B, then A is enumerable if and only if B is.

The following proof uses Definition 4.4 if section 4.2 is included and
Definition 4.27 otherwise.

Proof. Suppose A ≈ B, so there is some bijection f : A → B, and suppose that A is enumerable. Then either A = ∅ or there is a surjective function g : Z+ → A. If A = ∅, then B = ∅ also (otherwise there would be an element y ∈ B but no x ∈ A with f (x) = y). If, on the other hand, g : Z+ → A is surjective, then
f ◦ g : Z+ → B is surjective. To see this, let y ∈ B. Since f is surjective, there
is an x ∈ A such that f (x) = y. Since g is surjective, there is an n ∈ Z+ such
that g(n) = x. Hence,

(f ◦ g)(n) = f (g(n)) = f (x) = y


and thus f ◦ g is surjective. We have that f ◦ g is an enumeration of B, and so B is enumerable.
If B is enumerable, we obtain that A is enumerable by repeating the argument with the bijection f −1 : B → A instead of f .

Problem 4.26. Show that if A ≈ C and B ≈ D, and A ∩ B = C ∩ D = ∅, then A ∪ B ≈ C ∪ D.

Problem 4.27. Show that if A is infinite and enumerable, then A ≈ N.


4.9 Sets of Different Sizes, and Cantor’s Theorem


We have offered a precise statement of the idea that two sets have the same size.
We can also offer a precise statement of the idea that one set is smaller than
another. Our definition of “is smaller than (or equinumerous)” will require,
instead of a bijection between the sets, an injection from the first set to the
second. If such a function exists, the size of the first set is less than or equal
to the size of the second. Intuitively, an injection from one set to another
guarantees that the range of the function has at least as many elements as the
domain, since no two elements of the domain map to the same element of the
range.

Definition 4.22. A is no larger than B, written A ⪯ B, iff there is an injection f : A → B.

It is clear that this is a reflexive and transitive relation, but that it is not
symmetric (this is left as an exercise). We can also introduce a notion, which
states that one set is (strictly) smaller than another.

Definition 4.23. A is smaller than B, written A ≺ B, iff there is an injection f : A → B but no bijection g : A → B, i.e., A ⪯ B and A ̸≈ B.

It is clear that this relation is irreflexive and transitive. (This is left as an exercise.) Using this notation, we can say that a set A is enumerable iff A ⪯ N, and that A is non-enumerable iff N ≺ A. This allows us to restate Theorem 4.32 as the observation that N ≺ ℘(N). In fact, Cantor (1892) proved that this last point is perfectly general:

Theorem 4.24 (Cantor). A ≺ ℘(A), for any set A.

Proof. The map f (x) = {x} is an injection f : A → ℘(A), since if x ̸= y, then also {x} ̸= {y} by extensionality, and so f (x) ̸= f (y). So we have that A ⪯ ℘(A).


We present the slow proof if section 4.6 is present, otherwise a faster proof matching section 4.12.

We will now show that there cannot be a surjective function g : A → ℘(A), let alone a bijective one, and hence that A ̸≈ ℘(A). For suppose that g : A → ℘(A). Since g is total, every x ∈ A is mapped to a subset g(x) ⊆ A. We can show that g cannot be surjective. To do this, we define a subset B ⊆ A which by definition cannot be in the range of g. Let

B = {x ∈ A : x ∉ g(x)}.

Since g(x) is defined for all x ∈ A, B is clearly a well-defined subset of A. But, it cannot be in the range of g. Let x ∈ A be arbitrary; we will show that B ̸= g(x). If x ∈ g(x), then x does not satisfy the condition x ∉ g(x), and so by the definition of B, we have x ∉ B. If x ∈ B, it must satisfy the defining property of B, i.e., x ∈ A and x ∉ g(x). Since x was arbitrary, this shows that for each x ∈ A, x ∈ g(x) iff x ∉ B, and so g(x) ̸= B. In other words, B cannot be in the range of g, contradicting the assumption that g is surjective.
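For a finite set A the diagonal set B can be computed outright, which makes it easy to check that it is never a value of g. A Python sketch (ours; the particular g is an arbitrary made-up example):

    def cantor_diagonal(A, g):
        # The set B = {x in A : x not in g(x)}, never in the range of g.
        return {x for x in A if x not in g(x)}

    A = {0, 1, 2}
    g = {0: {0, 1}, 1: set(), 2: {0, 2}}   # some g : A -> P(A)
    B = cantor_diagonal(A, g.__getitem__)
    print(B)                               # {1}
    assert all(g[x] != B for x in A)       # B is missed by g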
It’s instructive to compare the proof of Theorem 4.24 to that of Theo-
rem 4.18. There we showed that for any list Z1 , Z2 , . . . , of subsets of Z+ one
can construct a set Z of numbers guaranteed not to be on the list. It was guar-
anteed not to be on the list because, for every n ∈ Z+ , n ∈ Zn iff n ∉ Z. This
way, there is always some number that is an element of one of Zn or Z but not
the other. We follow the same idea here, except the indices n are now elements
of A instead of Z+ . The set B is defined so that it is different from g(x) for
each x ∈ A, because x ∈ g(x) iff x ∉ B. Again, there is always an element of A
which is an element of one of g(x) and B but not the other. And just as Z
therefore cannot be on the list Z1 , Z2 , . . . , B cannot be in the range of g.
It’s instructive to compare the proof of Theorem 4.24 to that of Theo-
rem 4.32. There we showed that for any list N0 , N1 , N2 , . . . , of subsets of N
we can construct a set D of numbers guaranteed not to be on the list. It was
guaranteed not to be on the list because n ∈ Nn iff n ∉ D, for every n ∈ N. We
follow the same idea here, except the indices n are now elements of A rather
than of N. The set B is defined so that it is different from g(x) for each x ∈ A,
because x ∈ g(x) iff x ∉ B.
The proof is also worth comparing with the proof of Russell’s Paradox,
Theorem 1.29. Indeed, Cantor’s Theorem was the inspiration for Russell’s own
paradox.
Problem 4.28. Show that there cannot be an injection g : ℘(A) → A, for any set A. Hint: Suppose g : ℘(A) → A is injective. Consider D = {g(B) : B ⊆ A and g(B) ∉ B}. Let x = g(D). Use the fact that g is injective to derive a contradiction.


4.10 The Notion of Size, and Schröder-Bernstein


Here is an intuitive thought: if A is no larger than B and B is no larger
than A, then A and B are equinumerous. To be honest, if this thought were
wrong, then we could scarcely justify the thought that our defined notion of
equinumerosity has anything to do with comparisons of “sizes” between sets!
Fortunately, though, the intuitive thought is correct. This is justified by the
Schröder-Bernstein Theorem.

Theorem 4.25 (Schröder-Bernstein). If A ⪯ B and B ⪯ A, then A ≈ B.

In other words, if there is an injection from A to B, and an injection from B to A, then there is a bijection from A to B.
This result, however, is really rather difficult to prove. Indeed, although
Cantor stated the result, others proved it.1 For now, you can (and must) take
it on trust.
Fortunately, Schröder-Bernstein is correct, and it vindicates our thinking of
the relations we defined, i.e., A ≈ B and A ⪯ B, as having something to do
with “size”. Moreover, Schröder-Bernstein is very useful. It can be difficult to
think of a bijection between two equinumerous sets. The Schröder-Bernstein
Theorem allows us to break the comparison down into cases so we only have
to think of an injection from the first to the second, and vice-versa.

The following section 4.11, section 4.12, section 4.13 are alternative
versions of section 4.2, section 4.6, section 4.7 due to Tim Button for use
in his Open Set Theory text. They are slightly more advanced and use a
different definition of enumerability more suitable in a set theory context
(i.e., bijection with N or an initial segment, rather than being listable or
being the range of a surjective function from Z+ ).


4.11 Enumerations and Enumerable Sets



This section defines enumerations as bijections with (initial segments) of N, the way it’s done in set theory. So it conflicts slightly with the definitions in section 4.2, and repeats all the examples there. It is also a bit more terse than that section.

1 For more on the history, see e.g., Potter (2004, pp. 165–6).


We can specify a finite set by simply enumerating its elements. We do this when we define a set like so:

A = {a1 , a2 , . . . , an }.

Assuming that the elements a1 , . . . , an are all distinct, this gives us a bijection
between A and the first n natural numbers 0, . . . , n−1. Conversely, since every
finite set has only finitely many elements, every finite set can be put into such
a correspondence. In other words, if A is finite, there is a bijection between A
and {0, . . . , n − 1}, where n is the number of elements of A.
If we allow for certain kinds of infinite sets, then we will also allow some
infinite sets to be enumerated. We can make this precise by saying that an
infinite set is enumerated by a bijection between it and all of N.
Definition 4.26 (Enumeration, set-theoretic). An enumeration of a set
A is a bijection whose range is A and whose domain is either an initial set of
natural numbers {0, 1, . . . , n} or the entire set of natural numbers N.

There is an intuitive underpinning to this use of the word enumeration. For
to say that we have enumerated a set A is to say that there is a bijection f
which allows us to count out the elements of the set A. The 0th element is
f (0), the 1st is f (1), . . . the nth is f (n). . . .2 The rationale for this may be
made even clearer by adding the following:
Definition 4.27. A set A is enumerable iff either A = ∅ or there is an enu-
meration of A. We say that A is non-enumerable iff A is not enumerable.

So a set is enumerable iff it is empty or you can use an enumeration to count out its elements.
Example 4.28. A function enumerating the natural numbers is simply the
identity function IdN : N → N given by IdN (n) = n. A function enumerating
the positive natural numbers, N+ = N \ {0}, is the function g(n) = n + 1, i.e.,
the successor function.

Problem 4.29. Show that a set A is enumerable iff either A = ∅ or there is a surjection f : N → A. Show that A is enumerable iff there is an injection g : A → N.

Example 4.29. The functions f : N → N and g : N → N given by

f (n) = 2n and
g(n) = 2n + 1

respectively enumerate the even natural numbers and the odd natural numbers.
But neither is surjective, so neither is an enumeration of N.
2 Yes, we count from 0. Of course we could also start with 1. This would make no big

difference. We would just have to replace N by Z+ .


Problem 4.30. Define an enumeration of the square numbers 1, 4, 9, 16, . . .

Example 4.30. Let ⌈x⌉ be the ceiling function, which rounds x up to the
nearest integer. Then the function f : N → Z given by:

f (n) = (−1)^n ⌈n/2⌉

enumerates the set of integers Z as follows:

f (0)     f (1)      f (2)     f (3)      f (4)     f (5)      f (6)     ...
⌈0/2⌉     −⌈1/2⌉     ⌈2/2⌉     −⌈3/2⌉     ⌈4/2⌉     −⌈5/2⌉     ⌈6/2⌉     ...
0         −1         1         −2         2         −3         3         ...

Notice how f generates the values of Z by “hopping” back and forth between positive and negative integers. You can also think of f as defined by cases as follows:

f (n) = n/2           if n is even,
        −(n + 1)/2    if n is odd.

Problem 4.31. Show that if A and B are enumerable, so is A ∪ B.

Problem 4.32. Show by induction on n that if A1 , A2 , . . . , An are all enumerable, so is A1 ∪ · · · ∪ An .


4.12 Non-enumerable Sets



This section proves the non-enumerability of Bω and ℘(N) using the definitions in section 4.11, i.e., requiring a bijection with N instead of a surjection from Z+ .

The set N of natural numbers is infinite. It is also trivially enumerable. But
the remarkable fact is that there are non-enumerable sets, i.e., sets which are
not enumerable (see Definition 4.27).
This might be surprising. After all, to say that A is non-enumerable is
to say that there is no bijection f : N → A; that is, no function mapping the
infinitely many elements of N to A exhausts all of A. So if A is non-enumerable,
there are “more” elements of A than there are natural numbers.
To prove that a set is non-enumerable, you have to show that no appropriate
bijection can exist. The best way to do this is to show that every attempt to

68 Release : 6891b66 (2024-12-01)


4.12. NON-ENUMERABLE SETS

enumerate elements of A must leave at least one element out; this shows that
no function f : N → A is surjective. And a general strategy for establishing
this is to use Cantor’s diagonal method. Given a list of elements of A, say, x1 ,
x2 , . . . , we construct another element of A which, by its construction, cannot
possibly be on that list.
But all of this is best understood by example. So, our first example is the
set Bω of all infinite strings of 0’s and 1’s. (The ‘B’ stands for binary, and we
can just think of it as the two-element set {0, 1}.)

Theorem 4.31. Bω is non-enumerable.

Proof. Consider any enumeration of a subset of Bω . So we have some list s0 , s1 , s2 , . . . where every sn is an infinite string of 0’s and 1’s. Let sn (m) be the mth digit of the nth string in this list. So we can now think of our list as an array, where sn (m) is placed at the nth row and mth column:

0 1 2 3 ...
0 s0 (0) s0 (1) s0 (2) s0 (3) ...
1 s1 (0) s1 (1) s1 (2) s1 (3) ...
2 s2 (0) s2 (1) s2 (2) s2 (3) ...
3 s3 (0) s3 (1) s3 (2) s3 (3) ...
.. .. .. .. .. ..
. . . . . .

We will now construct an infinite string, d, of 0’s and 1’s which is not on this list. We will do this by specifying each of its entries, i.e., we specify d(n) for all n ∈ N. Intuitively, we do this by reading down the diagonal of the array above (hence the name “diagonal method”) and then changing every 1 to a 0 and every 0 to a 1. More abstractly, we define d(n) to be 0 or 1 according to whether the n-th element of the diagonal, sn (n), is 1 or 0, that is:

d(n) = 1 if sn (n) = 0,
       0 if sn (n) = 1.

Clearly d ∈ Bω , since it is an infinite string of 0’s and 1’s. But we have constructed d so that d(n) ̸= sn (n) for any n ∈ N. That is, d differs from sn in its nth entry. So d ̸= sn for any n ∈ N. So d cannot be on the list s0 , s1 , s2 , . . .
We have shown, given an arbitrary enumeration of some subset of Bω , that it will omit some element of Bω . So there is no enumeration of the set Bω , i.e., Bω is non-enumerable.

This proof method is called “diagonalization” because it uses the diagonal

of the array to define d. However, diagonalization need not involve the presence
of an array. Indeed, we can show that some set is non-enumerable by using
a similar idea, even when no array and no actual diagonal is involved. The
following result illustrates how.


Theorem 4.32. ℘(N) is not enumerable.

Proof. We proceed in the same way, by showing that every list of subsets of N omits some subset of N. So, suppose that we have some list N0 , N1 , N2 , . . . of subsets of N. We define a set D as follows: n ∈ D iff n ∉ Nn :

D = {n ∈ N : n ∉ Nn }

Clearly D ⊆ N. But D cannot be on the list. After all, by construction n ∈ D iff n ∉ Nn , so that D ̸= Nn for any n ∈ N.

The preceding proof did not mention a diagonal. Still, you can think of it
as involving a diagonal if you picture it this way: Imagine the sets N0 , N1 ,
. . . , written in an array, where we write Nn on the nth row by writing m in
the mth column iff m ∈ Nn . For example, say the first four sets on that list
are {0, 1, 2, . . . }, {1, 3, 5, . . . }, {0, 1, 4}, and {2, 3, 4, . . . }; then our array would
begin with
N0 = {0, 1, 2, ...}
N1 = { 1, 3, 5, . . . }
N2 = {0, 1, 4 }
N3 = { 2, 3, 4, ...}
.. ..
. .
Then D is the set obtained by going down the diagonal, placing n ∈ D iff n
is not on the diagonal. So in the above case, we would leave out 0 and 1, we
would include 2, we would leave out 3, etc.

Problem 4.33. Show that the set of all functions f : N → N is non-enumerable by an explicit diagonal argument. That is, show that if f1 , f2 , . . . , is a list of functions and each fi : N → N, then there is some g : N → N not on this list.


4.13 Reduction

This section proves non-enumerability by reduction, matching the results in section 4.12. An alternative, slightly more elaborate version matching the results in section 4.6 is provided in section 4.7.

We proved that Bω is non-enumerable by a diagonalization argument. We used a similar diagonalization argument to show that ℘(N) is non-enumerable. But here’s another way we can prove that ℘(N) is non-enumerable: show that if


℘(N) is enumerable then Bω is also enumerable. Since we know Bω is non-enumerable, it will follow that ℘(N) is too.
This is called reducing one problem to another. In this case, we reduce the
problem of enumerating Bω to the problem of enumerating ℘(N). A solution to
the latter—an enumeration of ℘(N)—would yield a solution to the former—an
enumeration of Bω .
To reduce the problem of enumerating a set B to that of enumerating a
set A, we provide a way of turning an enumeration of A into an enumeration
of B. The easiest way to do that is to define a surjection f : A → B. If x1 , x2 ,
. . . enumerates A, then f (x1 ), f (x2 ), . . . would enumerate B. In our case, we
are looking for a surjection f : ℘(N) → Bω .
Problem 4.34. Show that if there is an injective function g : B → A, and B is
non-enumerable, then so is A. Do this by showing how you can use g to turn
an enumeration of A into one of B.

Proof of Theorem 4.32 by reduction. For a reduction, suppose that ℘(N) is enu-
merable, and thus that there is an enumeration of it, N1 , N2 , N3 , . . .
Define the function f : ℘(N) → Bω by letting f (N ) be the string sk such
that sk (n) = 1 iff n ∈ N , and sk (n) = 0 otherwise.
This clearly defines a function, since whenever N ⊆ N, any n ∈ N either
is an element of N or isn’t. For instance, the set 2N = {2n : n ∈ N} =
{0, 2, 4, 6, . . . } of even naturals gets mapped to the string 1010101 . . . ; ∅ gets
mapped to 0000 . . . ; N gets mapped to 1111 . . . .
It is also surjective: every string of 0s and 1s corresponds to some set
of natural numbers, namely the one which has as its members those natural
numbers corresponding to the places where the string contains a 1. More
precisely, if s ∈ Bω , then define N ⊆ N by:

N = {n ∈ N : s(n) = 1}

Then f (N ) = s, as can be verified by consulting the definition of f .


Now consider the list

f (N1 ), f (N2 ), f (N3 ), . . .

Since f is surjective, every member of Bω must appear as a value of f for some argument, and so must appear on the list. This list must therefore enumerate all of Bω .
So if ℘(N) were enumerable, Bω would be enumerable. But Bω is non-enumerable (Theorem 4.31). Hence ℘(N) is non-enumerable.

Problem 4.35. Show that the set X of all functions f : N → N is non-enumerable by a reduction argument (Hint: give a surjective function from X to Bω .)

Problem 4.36. Show that the set of all sets of pairs of natural numbers, i.e.,
℘(N × N), is non-enumerable by a reduction argument.



Problem 4.37. Show that Nω , the set of infinite sequences of natural num-
bers, is non-enumerable by a reduction argument.

Problem 4.38. Let S be the set of all surjections from N to the set {0, 1},
i.e., S consists of all surjections f : N → B. Show that S is non-enumerable.

Problem 4.39. Show that the set R of all real numbers is non-enumerable.

Chapter 5

Arithmetization

The material in this chapter presents the construction of the number systems in naı̈ve set theory. It is taken from Tim Button’s Open Set Theory text.


5.1 From N to Z
Here are two basic realisations:

1. Every integer can be written in the form n − m, with n, m ∈ N.

2. The information encoded in an expression n − m can equally be encoded by an ordered pair ⟨n, m⟩.

We already know that the ordered pairs of natural numbers are the elements of
N^2 . And we are assuming that we understand N. So here is a naı̈ve suggestion,
based on the two realisations we have had: let’s treat integers as ordered pairs
of natural numbers.
In fact, this suggestion is too naı̈ve. Obviously we want it to be the case
that 0 − 2 = 4 − 6. But evidently ⟨0, 2⟩ ≠ ⟨4, 6⟩. So we cannot simply say that
N^2 is the set of integers.


Generalising from the preceding problem, what we want is the following:

a − b = c − d iff a + d = c + b

(It should be obvious that this is how integers are meant to behave: just add
b and d to both sides.) And the easy way to guarantee this behaviour is just
to define an equivalence relation between ordered pairs, ∼, as follows:

⟨a, b⟩ ∼ ⟨c, d⟩ iff a + d = c + b

We now have to show that this is an equivalence relation.


Proposition 5.1. ∼ is an equivalence relation.
Proof. We must show that ∼ is reflexive, symmetric, and transitive.
Reflexivity: Evidently ⟨a, b⟩ ∼ ⟨a, b⟩, since a + b = b + a.
Symmetry: Suppose ⟨a, b⟩ ∼ ⟨c, d⟩, so a + d = c + b. Then c + b = a + d, so
that ⟨c, d⟩ ∼ ⟨a, b⟩.
Transitivity: Suppose ⟨a, b⟩ ∼ ⟨c, d⟩ ∼ ⟨m, n⟩. So a + d = c + b and
c + n = m + d. So a + d + c + n = c + b + m + d, and so a + n = m + b. Hence
⟨a, b⟩ ∼ ⟨m, n⟩.

Now we can use this equivalence relation to take equivalence classes:


Definition 5.2. The integers are the equivalence classes, under ∼, of ordered
pairs of natural numbers; that is, Z = N^2 /∼ .

Now, one might have plenty of different philosophical reactions to this stip-
ulative definition. Before we consider those reactions, though, it is worth con-
tinuing with some of the technicalities.
Having said what the integers are, we shall need to define basic functions
and relations on them. Let’s write [m, n]∼ for the equivalence class under ∼
with ⟨m, n⟩ as an element.1 That is:

[m, n]∼ = {⟨a, b⟩ ∈ N^2 : ⟨a, b⟩ ∼ ⟨m, n⟩}

So now we offer some definitions:

[a, b]∼ + [c, d]∼ = [a + c, b + d]∼
[a, b]∼ × [c, d]∼ = [ac + bd, ad + bc]∼
[a, b]∼ ≤ [c, d]∼ iff a + d ≤ b + c

(As is common, I’m using ‘ab’ to stand for ‘(a × b)’, just to make the axioms easier to read.) Now, we need to make sure that these definitions behave as they ought to. Spelling out what this means, and checking it through, is rather laborious; we relegate the details to section 5.6. But the short point is: everything works!
1 Note: using the notation introduced in Definition 2.11, we would have written [⟨m, n⟩]∼ for the same thing. But that’s just a bit harder to read.
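To see that these stipulations are computable, here is a Python sketch (ours; all function names are our own) that works directly with pair representatives ⟨a, b⟩ and tests equality of integers via ∼:

    def equiv(p, q):
        # <a, b> ~ <c, d> iff a + d = c + b
        (a, b), (c, d) = p, q
        return a + d == c + b

    def add(p, q):
        (a, b), (c, d) = p, q
        return (a + c, b + d)

    def mul(p, q):
        (a, b), (c, d) = p, q
        return (a * c + b * d, a * d + b * c)

    # <0, 2> and <4, 6> both represent -2, and the operations respect ~:
    assert equiv((0, 2), (4, 6))
    assert equiv(add((0, 2), (3, 0)), (1, 0))   # -2 + 3 = 1
    assert equiv(mul((0, 2), (0, 1)), (2, 0))   # -2 * -1 = 2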


One final thing remains. We have constructed the integers using natural
numbers. But this will mean that the natural numbers are not themselves
integers. We will return to the philosophical significance of this in section 5.5.
On a purely technical front, though, we will need some way to be able to
treat natural numbers as integers. The idea is quite easy: for each n ∈ N,
we just stipulate that nZ = [n, 0]∼ . We need to confirm that this definition is
well-behaved, i.e., that for any m, n ∈ N

(m + n)Z = mZ + nZ
(m × n)Z = mZ × nZ
m ≤ n ↔ mZ ≤ nZ

But this is all pretty straightforward. For example, to show that the second
of these obtains, we can simply help ourselves to the behaviour of the natural
numbers and reason as follows:

(m × n)Z = [m × n, 0]∼
= [m × n + 0 × 0, m × 0 + 0 × n]∼
= [m, 0]∼ × [n, 0]∼
= mZ × nZ

We leave it as an exercise to confirm that the other two conditions hold.

Problem 5.1. Show that (m + n)Z = mZ + nZ and m ≤ n ↔ mZ ≤ nZ , for any m, n ∈ N.


5.2 From Z to Q
We just saw how to construct the integers from the natural numbers, using
some naı̈ve set theory. We shall now see how to construct the rationals from
the integers in a very similar way. Our initial realisations are:

1. Every rational can be written in the form i/j , where both i and j are
integers but j is non-zero.

2. The information encoded in an expression i/j can equally be encoded in an ordered pair ⟨i, j⟩.

The obvious approach would be to think of the rationals as ordered pairs drawn
from Z × (Z \ {0Z }). As before, though, that would be a bit too naı̈ve, since we
want 3/2 = 6/4, but ⟨3, 2⟩ ̸= ⟨6, 4⟩. More generally, we will want the following:

a/b = c/d iff a × d = b × c


To get this, we define an equivalence relation on Z × (Z \ {0Z }) thus:

⟨a, b⟩ ∽ ⟨c, d⟩ iff a × d = b × c

We must check that this is an equivalence relation. This is very much like the
case of ∼, and we will leave it as an exercise.
Problem 5.2. Show that ∽ is an equivalence relation.
But it allows us to say:
Definition 5.3. The rationals are the equivalence classes, under ∽, of pairs of
integers (whose second element is non-zero). That is, Q = (Z × (Z \ {0Z }))/∽ .
As with the integers, we also want to define some basic operations. Where
[i, j]∽ is the equivalence class under ∽ with ⟨i, j⟩ as an element, we say:

[a, b]∽ + [c, d]∽ = [ad + bc, bd]∽
[a, b]∽ × [c, d]∽ = [ac, bd]∽ .

To define r ≤ s on these rationals, we use the fact that r ≤ s iff s − r is not negative, i.e., s − r can be written as i/j with i non-negative and j positive:

[a, b]∽ ≤ [c, d]∽ iff [c, d]∽ − [a, b]∽ = [iZ , jZ ]∽

for some i ∈ N and 0 ̸= j ∈ N.


We then need to check that these definitions behave as they ought to; and
we relegate this to section 5.6. But they indeed do! Finally, we want some way
to treat integers as rationals; so for each i ∈ Z, we stipulate that iQ = [i, 1Z ]∽ .
Again, we check that all of this behaves correctly in section 5.6.
Problem 5.3. Show that (i + j)Q = iQ + jQ and (i × j)Q = iQ × jQ and
i ≤ j ↔ iQ ≤ jQ , for any i, j ∈ Z.


5.3 The Real Line


The next step is to show how to construct the reals from the rationals. Before that, we need to understand what is distinctive about the reals.
The reals behave very much like the rationals. (Technically, both are ex-
amples of ordered fields; for the definition of this, see Definition 5.9.) Now, if
you worked through the exercises to chapter 4, you will know that there are
strictly more reals than rationals, i.e., that Q ≺ R. This was first proved by
Cantor. But it’s been known for about two and a half millennia that there are
irrational numbers, i.e., reals which are not rational. Indeed:
Theorem 5.4. √2 is not rational, i.e., √2 ∉ Q.


Proof. Suppose, for reductio, that √2 is rational. So √2 = m/n for some natural numbers m and n. Indeed, we can choose m and n so that the fraction cannot be reduced any further. Re-organising, m² = 2n². From here, we can complete the proof in two ways:
First, geometrically (following Tennenbaum).2 Consider these squares:

[Figure: a square of side m, with two overlapping squares of side n placed in opposite corners; the central region covered by both is the shaded (orange) square, and the two corner regions covered by neither are the unshaded squares.]

Since m² = 2n², the region where the two squares of side n overlap has the same area as the region which neither of the two squares cover; i.e., the area of the orange square equals the sum of the area of the two unshaded squares. So where the orange square has side p, and each unshaded square has side q, p² = 2q². But now √2 = p/q, with p < m and q < n and p, q ∈ N. This contradicts the fact that m and n were chosen to be as small as possible.
Second, formally. Since m² = 2n², it follows that m is even. (It is easy to show that, if x is odd, then x² is odd.) So m = 2r, for some r ∈ N. Rearranging, 2r² = n², so n is also even. So both m and n are even, and hence the fraction m/n can be reduced further. Contradiction!

In passing, this diagrammatic proof allows us to revisit the material from


section 73.4. Tennenbaum (1927–2006) was a thoroughly modern mathemati-
cian; but the proof is undeniably lovely, completely rigorous, and appeals to
geometric intuition!
In any case: the reals are “more expansive” than the rationals. In some
sense, there are “gaps” in the rationals, and these are filled by the reals. Weier-
strass realised that this describes a single property of the real numbers, which
distinguishes them from the rationals, namely the Completeness Property: Ev-
ery non-empty set of real numbers with an upper bound has a least upper bound.
It is easy to see that the rationals do not have the Completeness Property. For example, consider the set of rationals less than √2, i.e.:

{p ∈ Q : p² < 2 or p < 0}

This has an upper bound in the rationals; its elements are all smaller than 3, for example. But what is its least upper bound? We want to say ‘√2’; but we have just seen that √2 is not rational. And there is no least rational number greater than √2. So the set has an upper bound but no least upper bound. Hence the rationals lack the Completeness Property.
2 This proof is reported by Conway (2006).


By contrast, the continuum “morally ought” to have the Completeness Property. We do not just want √2 to be a real number; we want to fill all the “gaps” in the rational line. Indeed, we want the continuum itself to have no “gaps” in it. That is just what we will get via Completeness.


5.4 From Q to R
In essence, the Completeness Property shows that any point α of the real line divides that line into two halves perfectly: those for which α is the least upper bound, and those for which α is the greatest lower bound. To construct the real numbers from the rational numbers, Dedekind suggested that we simply think of the reals as the cuts that partition the rationals. That is, we identify √2 with the cut which separates the rationals < √2 from the rationals > √2.
Let’s tidy this up. If we cut the rational numbers into two halves, we can
uniquely identify the partition we made just by considering its bottom half. So,
getting precise, we offer the following definition:

Definition 5.5 (Cut). A cut α is any non-empty proper initial segment of


the rationals with no greatest element. That is, α is a cut iff:

1. non-empty, proper: ∅ ≠ α ⊊ Q

2. initial : for all p, q ∈ Q: if p < q ∈ α then p ∈ α

3. no maximum: for all p ∈ α there is a q ∈ α such that p < q

Then R is the set of cuts.

So now we can say that √2 = {p ∈ Q : p² < 2 or p < 0}. Of course, we need to check that this is a cut, but we relegate that to section 5.6.
As before, having defined some entities, we next need to define basic func-
tions and relations upon them. We begin with an easy one:

α ≤ β iff α ⊆ β

This definition of an order allows us to state the central result, that the set of
cuts has the Completeness Property. Spelled out fully, the statement has this
shape. If S is a non-empty set of cuts with an upper bound, then S has a least
upper bound. In more detail: there is a cut, λ, which is an upper bound for
S, i.e. (∀α ∈ S)α ⊆ λ, and λ is the least such cut, i.e. (∀β ∈ R)((∀α ∈ S)α ⊆
β → λ ⊆ β). Now here is the proof of the result:

Theorem 5.6. The set of cuts has the Completeness Property.

Proof. Let S be any non-empty set of cuts with an upper bound. Let λ = ⋃S. We first claim that λ is a cut:


1. Since S has an upper bound, at least one cut is in S, so ∅ ≠ λ. Since S is a set of cuts, λ ⊆ Q. Since S has an upper bound, some p ∈ Q is absent from every cut α ∈ S. So p ∉ λ, and hence λ ⊊ Q.

2. Suppose p < q ∈ λ. So there is some α ∈ S such that q ∈ α. Since α is a cut, p ∈ α. So p ∈ λ.

3. Suppose p ∈ λ. So there is some α ∈ S such that p ∈ α. Since α is a cut, there is some q ∈ α such that p < q. So q ∈ λ.
This proves the claim. Moreover, clearly (∀α ∈ S)α ⊆ ⋃S = λ, i.e. λ is an upper bound on S. So now suppose β ∈ R is also an upper bound, i.e. (∀α ∈ S)α ⊆ β. For any p ∈ Q, if p ∈ λ, then there is α ∈ S such that p ∈ α, so that p ∈ β. Generalizing, λ ⊆ β. So λ is the least upper bound on S.

So we have a bunch of entities which satisfy the Completeness Property.


And one way to put this is: there are no “gaps” in our cuts. (So: taking
further “cuts” of reals, rather than rationals, would yield no interesting new
objects.)
Next, we must define some operations on the reals. We start by embedding
the rationals into the reals by stipulating that pR = {q ∈ Q : q < p} for each
p ∈ Q. We then define:

α + β = {p + q : p ∈ α ∧ q ∈ β}
α × β = {p × q : 0 ≤ p ∈ α ∧ 0 ≤ q ∈ β} ∪ 0R if α, β ≥ 0R

To handle the other multiplication cases, first let:

−α = {p − q : p < 0 ∧ q ∉ α}

and then stipulate:



α × β = −α × −β, if α < 0R and β < 0R
α × β = −(−α × β), if α < 0R and β > 0R
α × β = −(α × −β), if α > 0R and β < 0R
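As a quick sanity check on these definitions (with the cut chosen purely for illustration), take α = 2R = {q ∈ Q : q < 2}. Then −α = {p − q : p < 0 ∧ q ∉ α} = {p − q : p < 0 ∧ q ≥ 2}. Every such p − q is less than −2; and conversely, any rational r < −2 can be written as (r + 2) − 2, where r + 2 < 0 and 2 ∉ α. So −α = {r ∈ Q : r < −2} = (−2)R, exactly as we would want.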

We then need to check that each of these definitions always yields a cut. And
finally, we need to go through an easy (but long-winded) demonstration that
the cuts, so defined, behave exactly as they should. But we relegate all of this
to section 5.6.


5.5 Some Philosophical Reflections


So much for the technicalities. But what did they achieve?


Well, pretty uncontestably, they gave us some lovely pure mathematics.


Moreover, there were some deep conceptual achievements. It was a profound
insight, to see that the Completeness Property expresses the crucial difference
between the reals and the rationals. Moreover, the explicit construction of
reals, as Dedekind cuts, puts the subject matter of analysis on a firm footing.
We know that the notion of a complete ordered field is coherent, for the cuts
form just such a field.
For all that, we should air a few reservations about these achievements.
First, it is not clear that thinking of reals in terms of cuts is any more
rigorous than thinking of reals in terms of their familiar (possibly infinite) dec-
imal expansions. This latter “construction” of the reals has some resemblance
to the construction of the reals via Cauchy sequences; but in fact, it was es-
sentially known to mathematicians from the early 17th century onwards (see
section 5.7). The real increase in rigour came from the realisation that the
reals have the Completeness Property; the ability to construct real numbers as
particular sets is perhaps not, by itself, so very interesting.
It is even less clear that the (much easier) arithmetization of the integers,
or of the rationals, increases rigour in those areas. Here, it is worth making
a simple observation. Having constructed the integers as equivalence classes
of ordered pairs of naturals, and then constructed the rationals as equivalence
classes of ordered pairs of integers, and then constructed the reals as sets of
rationals, we immediately forget about the constructions. In particular: no one
would ever want to invoke these constructions during a mathematical proof
(excepting, of course, a proof that the constructions behaved as they were
supposed to). It’s much easier to speak about a real, directly, than to speak
about some set of sets of sets of sets of sets of sets of sets of naturals.
It is most doubtful of all that these definitions tell us what the integers,
rationals, or reals are, metaphysically speaking. That is, it is doubtful that the
reals (say) are certain sets (of sets of sets. . . ). The main barrier to such a view
is that the construction could have been done in many different ways. In the
case of the reals, there are some genuinely interestingly different constructions
(see section 5.7). But here is a really trivial way to obtain some different
constructions: as in section 2.2, we could have defined ordered pairs slightly
differently; if we had used this alternative notion of an ordered pair, then our
constructions would have worked precisely as well as they did, but we would
have ended up with different objects. As such, there are many rival set-theoretic
constructions of the integers, the rationals, and the reals. And now it would
just be arbitrary (and embarrassing) to claim that the integers (say) are these
sets, rather than those. (As in section 2.2, this is an instance of an argument
made famous by Benacerraf 1965.)
A further point is worth raising: there is something quite odd about our
constructions. We started with the natural numbers. We then construct the
integers, and construct “the 0 of the integers”, i.e., [0, 0]∼. But 0 ≠ [0, 0]∼.
Indeed, given our constructions, no natural number is an integer. But that
seems extremely counter-intuitive. Indeed, in section 1.3, we claimed without
much argument that N ⊆ Q. If the constructions tell us exactly what the


numbers are, this claim was trivially false.


Standing back, then, where do we get to? Working in a naı̈ve set theory,
and helping ourselves to the naturals, we are able to treat integers, rationals,
and reals as certain sets. In that sense, we can embed the theories of these
entities within a set theory. But the philosophical import of this embedding is
just not that straightforward.
Of course, none of this is the last word! The point is only this. Showing
that the arithmetization of the reals is of deep philosophical significance would
require some additional philosophical argument.


5.6 Ordered Rings and Fields


Throughout this chapter, we claimed that certain definitions behave “as they ought”. In this technical appendix, we will spell out what we mean, and (sketch how to) show that the definitions do behave “correctly”.
In section 5.1, we defined addition and multiplication on Z. We want to
show that, as defined, they endow Z with the structure we “would want” it to
have. In particular, the structure in question is that of a commutative ring.

Definition 5.7. A commutative ring is a set S, equipped with specific ele-


ments 0 and 1 and operations + and ×, satisfying these eight formulas:

Associativity:      a + (b + c) = (a + b) + c
                    (a × b) × c = a × (b × c)
Commutativity:      a + b = b + a
                    a × b = b × a
Identities:         a + 0 = a
                    a × 1 = a
Additive Inverse:   (∃b ∈ S) 0 = a + b
Distributivity:     a × (b + c) = (a × b) + (a × c)

Implicitly, these are all bound with universal quantifiers restricted to S. And
note that the elements 0 and 1 here need not be the natural numbers with the
same name.

So, to check that the integers form a commutative ring, we just need to
check that we meet these eight conditions. None of the conditions is difficult
to establish, but this is a bit laborious. For example, here is how to prove
Associativity, in the case of addition:

Proof. Fix i, j, k ∈ Z. So there are a1 , b1 , a2 , b2 , a3 , b3 ∈ N such that i = [a1 , b1 ]


and j = [a2 , b2 ] and k = [a3 , b3 ]. (For legibility, we write “[x, y]” rather than


“[x, y]∼ ”; we’ll do this throughout this section.) Now:

i + (j + k) = [a1 , b1 ] + ([a2 , b2 ] + [a3 , b3 ])


= [a1 , b1 ] + [a2 + a3 , b2 + b3 ]
= [a1 + (a2 + a3 ), b1 + (b2 + b3 )]
= [(a1 + a2 ) + a3 , (b1 + b2 ) + b3 ]
= [a1 + a2 , b1 + b2 ] + [a3 , b3 ]
= ([a1 , b1 ] + [a2 , b2 ]) + [a3 , b3 ]
= (i + j) + k

helping ourselves freely to the behavior of addition on N.

Equally, here is how to prove Additive Inverse:

Proof. Fix i ∈ Z, so that i = [a, b] for some a, b ∈ N. Let j = [b, a] ∈ Z.


Helping ourselves to the behaviour of the naturals, (a + b) + 0 = 0 + (a + b), so
that ⟨a + b, b + a⟩ ∼Z ⟨0, 0⟩ by definition, and hence [a + b, b + a] = [0, 0] = 0Z .
So now i + j = [a, b] + [b, a] = [a + b, b + a] = [0, 0] = 0Z .

And here is a proof of Distributivity:

Proof. As above, fix i = [a1 , b1 ] and j = [a2 , b2 ] and k = [a3 , b3 ]. Now:

i × (j + k) = [a1 , b1 ] × ([a2 , b2 ] + [a3 , b3 ])


= [a1 , b1 ] × [a2 + a3 , b2 + b3 ]
= [a1 (a2 + a3 ) + b1 (b2 + b3 ), a1 (b2 + b3 ) + b1 (a2 + a3 )]
= [a1 a2 + a1 a3 + b1 b2 + b1 b3 , a1 b2 + a1 b3 + a2 b1 + a3 b1 ]
= [a1 a2 + b1 b2 , a1 b2 + a2 b1 ] + [a1 a3 + b1 b3 , a1 b3 + a3 b1 ]
= ([a1 , b1 ] × [a2 , b2 ]) + ([a1 , b1 ] × [a3 , b3 ])
= (i × j) + (i × k)

We leave it as an exercise to prove the remaining five conditions. Having


done that, we have shown that Z constitutes a commutative ring, i.e., that
addition and multiplication (as defined) behave as they should.

Problem 5.4. Prove that Z is a commutative ring.

But our task is not over. As well as defining addition and multiplication
over Z, we defined an ordering relation, ≤, and we must check that this behaves
as it should. In more detail, we must show that Z constitutes an ordered ring.3
3 Recall from Definition 2.16 that a total order is a relation which is reflexive, transitive,

anti-symmetric, and connected. In the context of order relations, connectedness is sometimes


called trichotomy, since for any a and b we have a ≤ b ∨ a = b ∨ a ≥ b.


Definition 5.8. An ordered ring is a commutative ring which is also equipped


with a total order relation, ≤, such that:

a ≤ b → a + c ≤ b + c
(a ≤ b ∧ 0 ≤ c) → a × c ≤ b × c

Problem 5.5. Prove that Z is an ordered ring.

As before, it is laborious but routine to show that Z, as constructed, is an


ordered ring. We will leave that to you.
This takes care of the integers. But now we need to show very similar things
of the rationals. In particular, we now need to show that the rationals form an
ordered field, under our given definitions of +, ×, and ≤:

Definition 5.9. An ordered field is an ordered ring which also satisfies:

Multiplicative Inverse: (∀a ∈ S \ {0})(∃b ∈ S) a × b = 1

Once you have shown that Z constitutes an ordered ring, it is easy but
laborious to show that Q constitutes an ordered field.
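Here, for illustration, is a sketch of the key new step. Given a rational [a, b]∽ other than 0Q, we have a ≠ 0Z, so [b, a]∽ is itself a rational (its second element is non-zero); and [a, b]∽ × [b, a]∽ = [ab, ba]∽ = [1Z, 1Z]∽ = 1Q, since ⟨ab, ab⟩ ∽ ⟨1Z, 1Z⟩. So [b, a]∽ is the required multiplicative inverse of [a, b]∽.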

Problem 5.6. Prove that Q is an ordered field.

Having dealt with the integers and the rationals, it only remains to deal with
the reals. In particular, we need to show that R constitutes a complete ordered
field, i.e., an ordered field with the Completeness Property. Now, Theorem 5.6
established that R has the Completeness Property. However, it remains to run
through the (tedious) task of checking that R is an ordered field.
Before tearing off into that laborious exercise, we need to check some more
“immediate” things. For example, we need a guarantee that α + β, as defined,
is indeed a cut, for any cuts α and β. Here is a proof of that fact:

Proof. Since α and β are both cuts, α + β = {p + q : p ∈ α ∧ q ∈ β} is a


non-empty proper subset of Q. Now suppose x < p + q for some p ∈ α and
q ∈ β. Then x − p < q, so x − p ∈ β, and x = p + (x − p) ∈ α + β. So
α + β is an initial segment of Q. Finally, for any p + q ∈ α + β, since α and
β are both cuts, there are p1 ∈ α and q1 ∈ β such that p < p1 and q < q1 ; so
p + q < p1 + q1 ∈ α + β; so α + β has no maximum.

Similar efforts will allow you to check that α − β and α × β and α ÷ β


are cuts (in the last case, ignoring the case where β is the zero-cut). Again,
though, we will simply leave this to you.

Problem 5.7. Prove that R is an ordered field.

But here is a small loose end to tidy up. In section 5.4, we suggested that we can take √2 = {p ∈ Q : p < 0 or p² < 2}. But we do need to show that this set is a cut. Here is a proof of that fact:


Proof. Clearly this is a nonempty proper initial segment of the rationals; so it suffices to show that it has no maximum. In particular, it suffices to show that, where p is a positive rational with p² < 2 and q = (2p + 2)/(p + 2), both p < q and q² < 2. To see that p < q, just note:

p² < 2
p² + 2p < 2 + 2p
p(p + 2) < 2 + 2p
p < (2 + 2p)/(p + 2) = q

To see that q² < 2, just note:

p² < 2
2p² + 4p + 2 < p² + 4p + 4
4p² + 8p + 4 < 2(p² + 4p + 4)
(2p + 2)² < 2(p + 2)²
(2p + 2)²/(p + 2)² < 2
q² < 2


5.7 Appendix: the Reals as Cauchy Sequences


In section 5.4, we constructed the reals as Dedekind cuts. In this section, we
explain an alternative construction. It builds on Cauchy’s definition of (what
we now call) a Cauchy sequence; but the use of this definition to construct the
reals is due to other nineteenth-century authors, notably Weierstrass, Heine,
Méray and Cantor. (For a nice history, see O’Connor and Robertson 2005.)
Before we get to the nineteenth century, it’s worth considering Simon Stevin
(1548–1620). In brief, Stevin realised that we can think of each real in terms of its decimal expansion. Thus even an irrational number, like √2, has a nice decimal expansion, beginning:

1.41421356237 . . .

It is very easy to model decimal expansions in set theory: simply consider


them as functions d : N → N, where d(n) is the nth decimal place that we
are interested in. We will then need a bit of a tweak, to handle the bit of the
real number that comes before the decimal point (here, just 1). We will also
need a further tweak (an equivalence relation) to guarantee that, for example,
0.999 . . . = 1. But it is not difficult to offer a perfectly rigorous construction of
the real numbers, in the manner of Stevin, within set theory.


Stevin is not our focus. (For more on Stevin, see Katz and Katz 2012.) But here is a closely related thought. Instead of treating √2’s decimal expansion directly, we can instead consider a sequence of increasingly accurate rational approximations to √2, by considering the increasingly precise expansions:

1, 1.4, 1.414, 1.4142, 1.41421, . . .

The idea that reals can be considered via “increasingly good approximations”
provides us with the basis for another sequence of insights (akin to the reali-
sations that we used when constructing Q from Z, or Z from N). The basic
insights are these:

1. Every real can be written as a (perhaps infinite) decimal expansion.

2. The information encoded by a (perhaps infinite) decimal expansion can equally be encoded by a sequence of rational numbers.

3. A sequence of rational numbers can be thought of as a function from N


to Q; just let f (n) be the nth rational in the sequence.

Of course, not just any function from N to Q will give us a real number. For
instance, consider this function:
f(n) = 1 if n is odd, and f(n) = 0 if n is even.

Essentially the worry here is that the sequence 0, 1, 0, 1, 0, 1, 0, . . . doesn’t seem


to “hone in” on any real. So: to ensure that we consider sequences which do
hone in on some real, we need to restrict our attention to sequences which have
some limit.
We have already encountered the idea of a limit, in section 73.2. But
we cannot use quite the same definition as we used there. The expression
“(∀ε > 0)” there tacitly involved quantification over the real numbers; and we
were considering the limits of functions on the real numbers; so invoking that
definition would be to help ourselves to the real numbers; and they are exactly
what we were aiming to construct. Fortunately, we can work with a closely
related idea of a limit.

Definition 5.10. A function f : N → Q is a Cauchy sequence iff for any positive ε ∈ Q we have that (∃ℓ ∈ N)(∀m, n > ℓ)|f(m) − f(n)| < ε.

The general idea of a limit is the same as before: if you want a certain
level of precision (measured by ε), there is a “region” to look in (any input
greater than ℓ). And it is easy to see that our sequence 1, 1.4, 1.414, 1.4142, 1.41421, . . . has a limit: if you want to approximate √2 to within an error of 1/10ⁿ, then just look to any entry after the nth.
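To see Definition 5.10 at work in the simplest possible case: for any q ∈ Q, the constant function which maps every n to q is a Cauchy sequence, since the relevant difference is always |q − q| = 0 < ε for any positive ε ∈ Q, so ℓ = 0 is a witness. (Constant sequences of this kind will reappear below.)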

The obvious thought, then, would be to say that a real number just is
any Cauchy sequence. But, as in the constructions of Z and Q, this would


be too naı̈ve: for any given real number, multiple different Cauchy sequences
indicate that real number. A simple way to see this as follows. Given a Cauchy
sequence f , define g to be exactly the same function as f , except that g(0) ̸=
f (0). Since the two sequences agree everywhere after the first number, we will
(ultimately) want to say that they have the same limit, in the sense employed
in Definition 5.10, and so should be thought of as “defining” the same real. So,
we should really think of these Cauchy sequences as the same real number.
Consequently, we again need to define an equivalence relation on the Cauchy sequences, and identify real numbers with equivalence classes. First we need the idea of a function which tends to 0 in the limit. For any function h : N → Q, say that h tends to 0 iff for any positive ε ∈ Q we have that (∃ℓ ∈ N)(∀n > ℓ)|h(n)| < ε.4 Further, where f and g are functions N → Q, let (f − g)(n) =
f (n) − g(n). Now define:

f ≎ g iff (f − g) tends to 0.

We need to check that ≎ is an equivalence relation; and it is. We can then,


if we like, define the reals as the equivalence classes, under ≎, of all Cauchy
sequences from N → Q.
Problem 5.8. Let f(n) = 0 for every n. Let g(n) = 1/(n + 1)². Show that both are Cauchy sequences, and indeed that the limit of both functions is 0, so that also f ≎ g.

Having done this, we shall as usual write [f ]≎ for the equivalence class with
f as an element. However, to keep things readable, in what follows we will
drop the subscript and write just [f ]. We also stipulate that, for each q ∈ Q,
we have qR = [cq ], where cq is the constant function cq (n) = q for all n ∈ N.
We then define basic relations and operations on the reals, e.g.:

[f ] + [g] = [(f + g)]


[f ] × [g] = [(f × g)]

where (f +g)(n) = f (n)+g(n) and (f ×g)(n) = f (n)×g(n). Of course, we also


need to check that each of (f + g), (f − g) and (f × g) are Cauchy sequences
when f and g are; but they are, and we leave this to you.
Finally, we define a notion of order. Say [f] is positive iff both [f] ≠ 0R and (∃ℓ ∈ N)(∀n > ℓ) 0 < f(n). Then say [f] < [g] iff [(g − f)] is positive. We
have to check that this is well-defined (i.e., that it does not depend upon choice
of “representative” function from the equivalence class). But having done this,
it is quite easy to show that these yield the right algebraic properties; that is:

Theorem 5.11. The Cauchy sequences constitute an ordered field.

Proof. Exercise.
4 Compare this with the definition of limx→∞ f (x) = 0 in section 73.2.



Problem 5.9. Prove that the Cauchy sequences constitute an ordered field.

It is harder to prove that the reals, so constructed, have the Completeness Property, so we will give the proof.

Theorem 5.12. Every non-empty set of Cauchy sequences with an upper bound
has a least upper bound.

Proof sketch. Let S be any non-empty set of Cauchy sequences with an upper
bound. So there is some p ∈ Q such that pR is an upper bound for S. Let
r ∈ S; then there is some q ∈ Q such that qR < r. So if a least upper bound
on S exists, it is between qR and pR (inclusive).
We will hone in on the l.u.b., by approaching it simultaneously from below
and above. In particular, we define two functions, f, g : N → Q, with the aim
that f will hone in on the l.u.b. from above, and g will hone on in it from
below. We start by defining:

f (0) = p
g(0) = q

Then, where an = (f(n) + g(n))/2, let:5

f(n + 1) = an if (∀h ∈ S)[h] ≤ (an)R, and f(n + 1) = f(n) otherwise;
g(n + 1) = an if (∃h ∈ S)[h] ≥ (an)R, and g(n + 1) = g(n) otherwise.

Both f and g are Cauchy sequences. (This can be checked fairly easily; but we
leave it as an exercise.) Note that the function (f − g) tends to 0, since the
difference between f and g halves at every step. Hence [f ] = [g].
We will show that (∀h ∈ S)[h] ≤ [f ], invoking Theorem 5.11 as we go. Let
h ∈ S and suppose, for reductio, that [f ] < [h], so that 0R < [(h − f )]. Since f
is a monotonically decreasing Cauchy sequence, there is some n ∈ N such that
[(cf (n) − f )] < [(h − f )]. So:

(f(n))R = [cf(n)] < [f] + [(h − f)] = [h],

contradicting the fact that, by construction, [h] ≤ (f(n))R.


In an exactly similar way, we can show that [g] ≤ [h] for any upper bound [h] on S. So [f] = [g] is the least upper bound for S.

5 This is a recursive definition. But we have not yet given any reason to think that

recursive definitions are ok.

Chapter 6

Infinite Sets

This chapter on infinite sets is taken from Tim Button’s Open Set
Theory.


6.1 Hilbert’s Hotel


The set of the natural numbers is obviously infinite. So, if we do not want to help ourselves to the natural numbers, our first step must be to characterize an infinite set in terms that do not require mentioning the natural numbers themselves. Here is a nice approach, presented by Hilbert in a lecture from 1924. He asks us to imagine
[. . . ] a hotel with a finite number of rooms. All of these rooms
should be occupied by exactly one guest. If the guests now swap
their rooms somehow, [but] so that each room still contains no more
than one person, then no rooms will become free, and the hotel-
owner cannot in this way create a new place for a newly arriving
guest [. . . ¶. . . ]
Now we stipulate that the hotel shall have infinitely many numbered
rooms 1, 2, 3, 4, 5, . . . , each of which is occupied by exactly one
guest. As soon as a new guest comes along, the owner only needs
to move each of the old guests into the room associated with the
number one higher, and room 1 will be free for the newly-arriving
guest.

[Figure: the hotel’s infinitely many rooms 1, 2, 3, 4, . . . , with each guest moving from room n to room n + 1, so that room 1 becomes free.]


(published in Hilbert 2013, 730; our translation)

The crucial point is that Hilbert’s Hotel has infinitely many rooms; and we
can take his explanation to define what it means to say this. Indeed, this was
Dedekind’s approach (presented here, of course, with massive anachronism;
Dedekind’s definition is from 1888):

Definition 6.1. A set A is Dedekind infinite iff there is an injection from A to a proper subset of A. That is, there is some o ∈ A and an injection f : A → A such that o ∉ ran(f).
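(For a quick illustration, helping ourselves to the natural numbers for a moment: N is Dedekind infinite, since n ↦ n + 1 is an injection from N to N whose range leaves out 0.)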


6.2 Dedekind Algebras


We not only want natural numbers to be infinite; we want them to have certain (algebraic) properties: they need to behave well under addition, multiplication,
and so forth.
Dedekind’s idea was to take the idea of the successor function as basic, and
then characterise the numbers as those with the following properties:

1. There is a number, 0, which is not the successor of any number
   i.e., 0 ∉ ran(s)
   i.e., ∀x s(x) ≠ 0

2. Distinct numbers have distinct successors
   i.e., s is an injection
   i.e., ∀x∀y(s(x) = s(y) → x = y)

3. Every number is obtained from 0 by repeated applications of the successor function.

The first two conditions are easy to deal with using first-order logic (see above).
But we cannot deal with (3) just using first-order logic. Dedekind’s break-
through was to reformulate condition (3), set-theoretically, as follows:

3′ . The natural numbers are the smallest set that is closed under the succes-
sor function: that is, if we apply s to any element of the set, we obtain
another element of the set.

But we shall need to spell this out slowly.

Definition 6.2. For any function f, the set X is f-closed iff (∀x ∈ X)f(x) ∈ X. Now define, for any o:

clof(o) = ⋂{X : o ∈ X and X is f-closed}


So clof (o) is the intersection of all the f -closed sets with o as an element.
Intuitively, then, clof (o) is the smallest f -closed set with o as an element. This
next result makes that intuitive thought precise:

Lemma 6.3. For any function f and any o ∈ A:

1. o ∈ clof(o); and

2. clof(o) is f-closed; and

3. if X is f-closed and o ∈ X, then clof(o) ⊆ X.
Proof. Note that there is at least one f -closed set with o as an element, namely
ran(f ) ∪ {o}. So clof (o), the intersection of all such sets, exists. We must now
check (1)–(3).
Concerning (1): o ∈ clof (o) as it is an intersection of sets which all have o
as an element.
Concerning (2): suppose x ∈ clof(o). So if o ∈ X and X is f-closed, then x ∈ X, and now f(x) ∈ X as X is f-closed. So f(x) ∈ clof(o).
Concerning (3): quite generally, if X ∈ C then ⋂C ⊆ X.
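To illustrate, borrowing the natural numbers again for a moment: where s is the successor function n ↦ n + 1 on N, clos(2) = {2, 3, 4, . . .}. This set contains 2 and is s-closed; and any s-closed set containing 2 must contain s(2) = 3, then s(3) = 4, and so on, so it contains all of {2, 3, 4, . . .}. So clos(2) really is the smallest s-closed set with 2 as an element.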

Using this, we can say:


Definition 6.4. A Dedekind algebra is a set A together with a function f : A →
A and some o ∈ A such that:
1. o ∉ ran(f)

2. f is an injection

3. A = clof(o)

Since A = clof (o), our earlier result tells us that A is the smallest f -closed
set with o as an element. Clearly a Dedekind algebra is Dedekind infinite; just
look at clauses (1) and (2) of the definition. But the more exciting fact is that
any Dedekind infinite set can be turned into a Dedekind algebra.
Theorem 6.5. If there is a Dedekind infinite set, then there is a Dedekind algebra.

Proof. Let D be Dedekind infinite. So there is an injection g : D → D and an


element o ∈ D \ ran(g). Now let A = clog (o); by Lemma 6.3, A exists and
o ∈ A. Let f = g↾A . We will show that A, f, o comprise a Dedekind algebra.
Concerning (1): o ∉ ran(g) and ran(f) ⊆ ran(g), so o ∉ ran(f).
Concerning (2): g is an injection on D; so f ⊆ g must be an injection.
Concerning (3): by Lemma 6.3, A is g-closed; a fortiori, A is f -closed. So
clof (o) ⊆ A by Lemma 6.3. Since also clof (o) is f -closed and f = g↾A , it
follows that clof (o) is g-closed. So A ⊆ clof (o) by Lemma 6.3.



6.3 Dedekind Algebras and Arithmetical Induction


Crucially, now, a Dedekind algebra—indeed, any Dedekind algebra—will serve as a surrogate for the natural numbers. This is thanks to the following trivial consequence:

Theorem 6.6 (Arithmetical induction). Let N, s, o comprise a Dedekind algebra. Then for any set X:

if o ∈ X and (∀n ∈ N ∩ X)s(n) ∈ X, then N ⊆ X.

Proof. By the definition of a Dedekind algebra, N = clos (o). Now if both


o ∈ X and (∀n ∈ N )(n ∈ X → s(n) ∈ X), then N = clos (o) ⊆ X.

Since induction is characteristic of the natural numbers, the point is this.


Given any Dedekind infinite set, we can form a Dedekind algebra, and use that
algebra as our surrogate for the natural numbers.
Admittedly, Theorem 6.6 formulates induction in set-theoretic terms. But
we can easily put the principle in terms which might be more familiar:

Corollary 6.7. Let N, s, o comprise a Dedekind algebra. Then for any formula φ(x), which may have parameters:

if φ(o) and (∀n ∈ N )(φ(n) → φ(s(n))), then (∀n ∈ N )φ(n)

Proof. Let X = {n ∈ N : φ(n)}, and now use Theorem 6.6.

In this result, we spoke of a formula “having parameters”. What this means,


roughly, is that for any objects c1 , . . . , ck , we can work with φ(x, c1 , . . . , ck ).
More precisely, we can state the result without mentioning “parameters” as
follows. For any formula φ(x, v1 , . . . , vk ), whose free variables are all displayed,
we have:

∀v1 . . . ∀vk ((φ(o, v1, . . . , vk) ∧
(∀x ∈ N)(φ(x, v1, . . . , vk) → φ(s(x), v1, . . . , vk))) →
(∀x ∈ N)φ(x, v1, . . . , vk))

Evidently, speaking of “having parameters” can make things much easier to


read. (In part XIV, we will use this device rather frequently.)
Returning to Dedekind algebras: given any Dedekind algebra, we can also
define the usual arithmetical functions of addition, multiplication and exponen-
tiation. This is non-trivial, however, and it involves the technique of recursive
definition. That is a technique which we shall introduce and justify much later,
and in a much more general context. (Enthusiasts might want to revisit this
after chapter 66, or perhaps read an alternative treatment, such as Potter 2004,


pp. 95–8.) But, where N, s, o comprise a Dedekind algebra, we will ultimately


be able to stipulate the following:

a + o = a             a × o = o                a^o = s(o)
a + s(b) = s(a + b)   a × s(b) = (a × b) + a   a^s(b) = a^b × a

and show that these behave as one would hope.


6.4 Dedekind’s “Proof” of the Existence of an Infinite Set

In this chapter, we have offered a set-theoretic treatment of the natural numbers, in terms of Dedekind algebras. In section 5.5, we reflected on the philo-
sophical significance of the arithmetisation of analysis (among other things).
Now we should reflect on the significance of what we have achieved here.
Throughout chapter 5, we took the natural numbers as given, and used
them to construct the integers, rationals, and reals, explicitly. In this chapter,
we have not given an explicit construction of the natural numbers. We have
just shown that, given any Dedekind infinite set, we can define a set which will
behave just like we want N to behave.
Obviously, then, we cannot claim to have answered a metaphysical ques-
tion, such as which objects are the natural numbers. But that’s a good thing.
After all, in section 5.5, we emphasized that we would be wrong to think of the
definition of R as the set of Dedekind cuts as a discovery, rather than a conve-
nient stipulation. The crucial observation is that the Dedekind cuts exemplify
the key mathematical properties of the real numbers. So too here: the crucial
observation is that any Dedekind algebra exemplifies the key mathematical
properties of the natural numbers. (Indeed, Dedekind pushed this point home
by proving that all Dedekind algebras are isomorphic (1888, Theorems 132–3).
It is no surprise, then, that many contemporary “structuralists” cite Dedekind
as a forerunner.)
Moreover, we have shown how to embed the theory of the natural num-
bers into a naı̈ve simple set theory, which itself still remains rather informal,
but which doesn’t (apparently) assume the natural numbers as given. So, we
may be on the way to realising Dedekind’s own ambitious project, which he
explained thus:
In science nothing capable of proof ought to be believed without
proof. Though this demand seems reasonable, I cannot regard it
as having been met even in the most recent methods of laying the
foundations of the simplest science; viz., that part of logic which
deals with the theory of numbers. In speaking of arithmetic (al-
gebra, analysis) as merely a part of logic I mean to imply that I
consider the number-concept entirely independent of the notions or


intuitions of space and time—that I rather consider it an immediate


product of the pure laws of thought. (Dedekind, 1888, preface)

Dedekind’s bold idea is this. We have just shown how to build the natural
numbers using (naı̈ve) set theory alone. In chapter 5, we saw how to construct
the reals given the natural numbers and some set theory. So, perhaps, “arith-
metic (algebra, analysis)” turn out to be “merely a part of logic” (in Dedekind’s
extended sense of the word “logic”).
That’s the idea. But hold on for a moment. Our construction of a Dedekind
algebra (our surrogate for the natural numbers) is conditional on the existence
of a Dedekind infinite set. (Just look back to Theorem 6.5.) Unless the ex-
istence of a Dedekind infinite set can be established via “logic” or “the pure
laws of thought”, the project stalls.
So, can the existence of a Dedekind infinite set be established by “the pure
laws of thought”? Here was Dedekind’s effort:

My own realm of thoughts, i.e., the totality S of all things which can
be objects of my thought, is infinite. For if s signifies an element
of S, then the thought s′ that s can be an object of my thought,
is itself an element of S. If we regard this as an image φ(s) of the
element s, then . . . S is [Dedekind] infinite, which was to be proved.
(Dedekind, 1888, §66)

This is quite an astonishing thing to find in the middle of a book which largely
consists of highly rigorous mathematical proofs. Two remarks are worth mak-
ing.
First: this “proof” scarcely has what we would now recognize as a “math-
ematical” character. It speaks of psychological objects (thoughts), and merely
possible ones at that.
Second: at least as we have presented Dedekind algebras, this “proof” has
a straightforward technical shortcoming. If Dedekind’s argument is successful,
it establishes only that there are infinitely many things (specifically, infinitely
many thoughts). But Dedekind also needs to give us a reason to regard S as
a single set, with infinitely many elements, rather than thinking of S as some
things (in the plural).
The fact that Dedekind did not see a gap here might suggest that his use
of the word “totality” does not precisely track our use of the word “set”.1
But this would not be too surprising. The project we have pursued in the last
two chapters—a “construction” of the naturals, and from them a “construc-
tion” of the integers, reals and rationals—has all been carried out naı̈vely. We
have helped ourselves to this set, or that set, as and when we have needed
them, without laying down many general principles concerning exactly which
sets exist, and when. But we know that we need some general principles, for
otherwise we will fall into Russell’s Paradox.
The time has come for us to outgrow our naı̈vety.
1 Indeed, we have other reasons to think it did not; see Potter (2004, p. 23).



6.5 Appendix: Proving Schröder-Bernstein


Before we depart from naïve set theory, we have one last naïve (but sophisticated!) proof to consider. This is a proof of Schröder-Bernstein (Theorem 4.25):
if A ⪯ B and B ⪯ A then A ≈ B; i.e., given injections f : A → B and g : B → A
there is a bijection h : A → B.
In this chapter, we followed Dedekind’s notion of closures. In fact, Dedekind
provided a lovely proof of Schröder-Bernstein using this notion, and we will
present it here. The proof closely follows Potter (2004, pp. 157–8), if you want
a slightly different but essentially similar treatment. A little googling will also convince you that this is a theorem—rather like the irrationality of √2—for which many interesting and different proofs exist.
Using notation similar to Definition 6.2, let

Clof(B) = ⋂{X : B ⊆ X and X is f-closed}

for each set B and function f . Defined thus, Clof (B) is the smallest f -closed
set containing B, in that:

Lemma 6.8. For any function f, and any B:

1. B ⊆ Clof(B); and

2. Clof(B) is f-closed; and

3. if X is f-closed and B ⊆ X, then Clof(B) ⊆ X.

Proof. Exactly as in Lemma 6.3.

We need one last fact to get to Schröder-Bernstein:

Proposition 6.9. If A ⊆ B ⊆ C and A ≈ C, then A ≈ B ≈ C.

Proof. Given a bijection f : C → A, let F = Clof (C \ B) and define a function


g with domain C as follows:
g(x) = f(x) if x ∈ F, and g(x) = x otherwise.

We’ll show that g is a bijection from C → B, from which it will follow that
g ◦ f −1 : A → B is a bijection, completing the proof.
First we claim that if x ∈ F but y ∈ / F then g(x) ̸= g(y). For reductio
suppose otherwise, so that y = g(y) = g(x) = f (x). Since x ∈ F and F is
f -closed by Lemma 6.8, we have y = f (x) ∈ F , a contradiction.


Now suppose g(x) = g(y). So, by the above, x ∈ F iff y ∈ F . If x, y ∈ F ,


then f(x) = g(x) = g(y) = f(y), so that x = y since f is a bijection. If x, y ∉ F, then x = g(x) = g(y) = y. So g is an injection.
It remains to show that ran(g) = B. So fix x ∈ B ⊆ C. If x ∉ F, then
g(x) = x. If x ∈ F , then x = f (y) for some y ∈ F , since otherwise F \ {x}
would be f -closed and extend C \ B, which is impossible by Lemma 6.8; now
g(y) = f (y) = x.

Finally, here is the proof of the main result. Recall that given a function h
and set D, we define h[D] = {h(x) : x ∈ D}.

Proof of Schröder-Bernstein. Let f : A → B and g : B → A be injections.


Since f [A] ⊆ B we have that g[f [A]] ⊆ g[B] ⊆ A. Also, g ◦ f : A → g[f [A]] is
an injection since both g and f are; and indeed g ◦ f is a bijection, just by the
way we defined its codomain. So g[f [A]] ≈ A, and hence by Proposition 6.9
there is a bijection h : A → g[B]. Moreover, g −1 is a bijection g[B] → B. So
g −1 ◦ h : A → B is a bijection.



Part II

Propositional Logic
This part contains material on classical propositional logic. The first
chapter is relatively rudimentary and just lists definitions and results; many proofs are not carried out but are left as exercises. The material on proof
systems and the completeness theorem is included from the part on first-
order logic, with the “FOL” tag set to false. This leaves out everything
related to predicates, terms, and quantifiers, and replaces talk of struc-
tures M with talk about valuations v.
It is planned to expand this part to include more detail, and to add
further topics and results, such as truth-functional completeness.

Chapter 7

Syntax and Semantics

This is a very quick summary of definitions only. It should be expanded


to provide a gentle intro to proofs by induction on formulas, with lots more
examples.


7.1 Introduction
Propositional logic deals with formulas that are built from propositional variables using the propositional connectives ¬, ∧, ∨, →, and ↔. Intuitively,
a propositional variable p stands for a sentence or proposition that is true or
false. Whenever the “truth value” of the propositional variable in a formula


is determined, so is the truth value of any formulas formed from them using
propositional connectives. We say that propositional logic is truth functional,
because its semantics is given by functions of truth values. In particular, in
propositional logic we leave out of consideration any further determination of
truth and falsity, e.g., whether something is necessarily true rather than just
contingently true, or whether something is known to be true, or whether some-
thing is true now rather than was true or will be true. We only consider two
truth values true (T) and false (F), and so exclude from discussion the possibil-
ity that a statement may be neither true nor false, or only half true. We also
concentrate only on connectives where the truth value of a formula built from
them is completely determined by the truth values of its parts (and not, say, on
its meaning). In particular, whether the truth value of conditionals in English
is truth functional in this sense is contentious. The material conditional → is;
other logics deal with conditionals that are not truth functional.
In order to develop the theory and metatheory of truth-functional propo-
sitional logic, we must first define the syntax and semantics of its expressions.
We will describe one way of constructing formulas from propositional variables
using the connectives. Alternative definitions are possible. Other systems will
choose different symbols, will select different sets of connectives as primitive,
and will use parentheses differently (or even not at all, as in the case of so-called
Polish notation). What all approaches have in common, though, is that the
formation rules define the set of formulas inductively. If done properly, every
expression can result essentially in only one way according to the formation
rules. The inductive definition resulting in expressions that are uniquely read-
able means we can give meanings to these expressions using the same method—
inductive definition.
Giving the meaning of expressions is the domain of semantics. The central
concept in semantics for propositional logic is that of satisfaction in a valuation.
A valuation v assigns truth values T, F to the propositional variables. Any
valuation determines a truth value v(φ) for any formula φ. A formula is satisfied
in a valuation v iff v(φ) = T—we write this as v ⊨ φ. This relation can also
be defined by induction on the structure of φ, using the truth functions for the
logical connectives to define, say, satisfaction of φ ∧ ψ in terms of satisfaction
(or not) of φ and ψ.
On the basis of the satisfaction relation v ⊨ φ for sentences we can then
define the basic semantic notions of tautology, entailment, and satisfiability.
A formula is a tautology, ⊨ φ, if every valuation satisfies it, i.e., v(φ) = T for
any v. It is entailed by a set of formulas, Γ ⊨ φ, if every valuation that satisfies
all the formulas in Γ also satisfies φ. And a set of formulas is satisfiable if
some valuation satisfies all formulas in it at the same time. Because formulas
are inductively defined, and satisfaction is in turn defined by induction on the
structure of formulas, we can use induction to prove properties of our semantics
and to relate the semantic notions defined.



7.2 Propositional Formulas


Formulas of propositional logic are built up from propositional variables, the propositional constant ⊥, and the propositional constant ⊤, using logical connectives.

1. A denumerable set At0 of propositional variables p0 , p1 , . . .

2. The propositional constant for falsity ⊥.

3. The propositional constant for truth ⊤.

4. The logical connectives: ¬ (negation), ∧ (conjunction), ∨ (disjunction),


→ (conditional), ↔ (biconditional)

5. Punctuation marks: (, ), and the comma.

We denote this language of propositional logic by L0.

You may be familiar with different terminology and symbols than the ones we use above. Logic texts (and teachers) commonly use either ∼, ¬, and ! for
“negation”, ∧, ·, and & for “conjunction”. Commonly used symbols for the
“conditional” or “implication” are →, ⇒, and ⊃. Symbols for “biconditional,”
“bi-implication,” or “(material) equivalence” are ↔, ⇔, and ≡. The ⊥ symbol
is variously called “falsity,” “falsum,” “absurdity,” or “bottom.” The ⊤ symbol
is variously called “truth,” “verum,” or “top.”

Definition 7.1 (Formula). The set Frm(L0) of formulas of propositional logic is defined inductively as follows:

1. ⊥ is an atomic formula.

2. ⊤ is an atomic formula.

3. Every propositional variable pi is an atomic formula.

4. If φ is a formula, then ¬φ is a formula.

5. If φ and ψ are formulas, then (φ ∧ ψ) is a formula.

6. If φ and ψ are formulas, then (φ ∨ ψ) is a formula.

7. If φ and ψ are formulas, then (φ → ψ) is a formula.

8. If φ and ψ are formulas, then (φ ↔ ψ) is a formula.

9. Nothing else is a formula.


The definition of formulas is an inductive definition. Essentially, we con-


struct the set of formulas in infinitely many stages. In the initial stage, we
pronounce all atomic formulas to be formulas; this corresponds to the first few
cases of the definition, i.e., the cases for ⊤, ⊥, pi . “Atomic formula” thus
means any formula of this form.
The other cases of the definition give rules for constructing new formulas
out of formulas already constructed. At the second stage, we can use them to
construct formulas out of atomic formulas. At the third stage, we construct
new formulas from the atomic formulas and those obtained in the second stage,
and so on. A formula is anything that is eventually constructed at such a stage,
and nothing else.
When writing a formula (ψ ∗ χ) constructed from ψ, χ using a two-place
connective ∗, we will often leave out the outermost pair of parentheses and
write simply ψ ∗ χ.
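For example, here is a worked instance of Definition 7.1 (with the variables chosen purely for illustration): ((p0 ∨ p1) → ¬p2) is a formula. At the first stage, p0, p1, and p2 are atomic formulas; at the second stage, clause (6) yields (p0 ∨ p1) and clause (4) yields ¬p2; at the third stage, clause (7) yields ((p0 ∨ p1) → ¬p2).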

Definition 7.2 (Syntactic identity). The symbol ≡ expresses syntactic iden-


tity between strings of symbols, i.e., φ ≡ ψ iff φ and ψ are strings of symbols
of the same length and which contain the same symbol in each place.

The ≡ symbol may be flanked by strings obtained by concatenation, e.g.,


φ ≡ (ψ ∨ χ) means: the string of symbols φ is the same string as the one
obtained by concatenating an opening parenthesis, the string ψ, the ∨ symbol,
the string χ, and a closing parenthesis, in this order. If this is the case, then
we know that the first symbol of φ is an opening parenthesis, φ contains ψ as a
substring (starting at the second symbol), that substring is followed by ∨, etc.


7.3 Preliminaries
Theorem 7.3 (Principle of induction on formulas). If some property P holds for all the atomic formulas and is such that

1. it holds for ¬φ whenever it holds for φ;

2. it holds for (φ ∧ ψ) whenever it holds for φ and ψ;

3. it holds for (φ ∨ ψ) whenever it holds for φ and ψ;

4. it holds for (φ → ψ) whenever it holds for φ and ψ;

5. it holds for (φ ↔ ψ) whenever it holds for φ and ψ;

then P holds for all formulas.


Proof. Let S be the collection of all formulas with property P . Clearly S ⊆


Frm(L0 ). S satisfies all the conditions of Definition 7.1: it contains all atomic
formulas and is closed under the logical operators. Frm(L0 ) is the smallest such
class, so Frm(L0 ) ⊆ S. So Frm(L0 ) = S, and every formula has property P .

Proposition 7.4. Any formula in Frm(L0) is balanced, in that it has as many left parentheses as right ones.

Problem 7.1. Prove Proposition 7.4.

Proposition 7.5. No proper initial segment of a formula is a formula.

Problem 7.2. Prove Proposition 7.5.

Proposition 7.6 (Unique Readability). Any formula φ in Frm(L0 ) has ex-


actly one parsing as one of the following
1. ⊥.
2. ⊤.
3. pn for some pn ∈ At0 .
4. ¬ψ for some formula ψ.
5. (ψ ∧ χ) for some formulas ψ and χ.
6. (ψ ∨ χ) for some formulas ψ and χ.
7. (ψ → χ) for some formulas ψ and χ.
8. (ψ ↔ χ) for some formulas ψ and χ.
Moreover, this parsing is unique.

Proof. By induction on φ. For instance, suppose that φ has two distinct read-
ings as (ψ → χ) and (ψ ′ → χ′ ). Then ψ and ψ ′ must be the same (or else one
would be a proper initial segment of the other); so if the two readings of φ are
distinct it must be because χ and χ′ are distinct readings of the same sequence
of symbols, which is impossible by the inductive hypothesis.

Definition 7.7 (Uniform Substitution). If φ and ψ are formulas, and pi


is a propositional variable, then φ[ψ/pi ] denotes the result of replacing each
occurrence of pi by an occurrence of ψ in φ; similarly, the simultaneous substi-
tution of p1 , . . . , pn by formulas ψ1 , . . . , ψn is denoted by φ[ψ1 /p1 , . . . , ψn /pn ].
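For example (an illustrative instance of Definition 7.7): where φ is (¬p0 ∧ p1), we have φ[(p2 ∨ p0)/p1] ≡ (¬p0 ∧ (p2 ∨ p0)): the sole occurrence of p1 is replaced by (p2 ∨ p0), while p0 and the rest of φ are left untouched.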

Problem 7.3. For each of the five formulas below determine whether the for-
mula can be expressed as a substitution φ[ψ/pi ] where φ is (i) p0 ; (ii) (¬p0 ∧p1 );
and (iii) ((¬p0 → p1 ) ∧ p2 ). In each case specify the relevant substitution.
1. p1


2. (¬p0 ∧ p0 )

3. ((p0 ∨ p1 ) ∧ p2 )

4. ¬((p0 → p1 ) ∧ p2 )

5. ((¬(p0 → p1 ) → (p0 ∨ p1 )) ∧ ¬(p0 ∧ p1 ))

Problem 7.4. Give a mathematically rigorous definition of φ[ψ/p] by induc-


tion.


7.4 Formation Sequences


Defining formulas via an inductive definition, and the complementary technique of proving properties of formulas via induction, is an elegant and effi-
cient approach. However, it can also be useful to consider a more bottom-up,
step-by-step approach to the construction of formulas, which we do here using
the notion of a formation sequence.

Definition 7.8 (Formation sequences for formulas). A finite sequence ⟨φ0, . . . , φn⟩ of strings of symbols from the language L0 is a formation sequence for φ if φ ≡ φn and for all i ≤ n, either φi is an atomic formula or there exist j, k < i such that one of the following holds:

1. φi ≡ ¬φj .

2. φi ≡ (φj ∧ φk ).

3. φi ≡ (φj ∨ φk ).

4. φi ≡ (φj → φk ).

5. φi ≡ (φj ↔ φk ).

Example 7.9.

⟨p0 , p1 , (p1 ∧ p0 ), ¬(p1 ∧ p0 )⟩

is a formation sequence of ¬(p1 ∧ p0 ), as is

⟨p0 , p1 , p0 , (p1 ∧ p0 ), (p0 → p1 ), ¬(p1 ∧ p0 )⟩.

As can be seen from the second example, formation sequences may contain
‘junk’: formulas which are redundant or do not contribute to the construction.

Proposition 7.10. Every formula φ in Frm(L0) has a formation sequence.


Proof. Suppose φ is atomic. Then the sequence ⟨φ⟩ is a formation sequence
for φ. Now suppose that ψ and χ have formation sequences ⟨ψ0 , . . . , ψn ⟩ and
⟨χ0 , . . . , χm ⟩ respectively.

1. If φ ≡ ¬ψ, then ⟨ψ0 , . . . , ψn , ¬ψn ⟩ is a formation sequence for φ.

2. If φ ≡ (ψ ∧ χ), then ⟨ψ0 , . . . , ψn , χ0 , . . . , χm , (ψn ∧ χm )⟩ is a formation
sequence for φ.

3. If φ ≡ (ψ ∨ χ), then ⟨ψ0 , . . . , ψn , χ0 , . . . , χm , (ψn ∨ χm )⟩ is a formation
sequence for φ.

4. If φ ≡ (ψ → χ), then ⟨ψ0 , . . . , ψn , χ0 , . . . , χm , (ψn → χm )⟩ is a formation
sequence for φ.

5. If φ ≡ (ψ ↔ χ), then ⟨ψ0 , . . . , ψn , χ0 , . . . , χm , (ψn ↔ χm )⟩ is a formation
sequence for φ.

By the principle of induction on formulas, every formula has a formation
sequence.

We can also prove the converse. This is important because it shows that
our two ways of defining formulas are equivalent: they give the same results.
It also means that we can prove theorems about formulas by using ordinary
induction on the length of formation sequences.

Lemma 7.11. Suppose that ⟨φ0 , . . . , φn ⟩ is a formation sequence for φn , and
that k ≤ n. Then ⟨φ0 , . . . , φk ⟩ is a formation sequence for φk .

Proof. Exercise.

Theorem 7.12. Frm(L0 ) is the set of all strings of symbols in the language L0
with a formation sequence.

Proof. Let F be the set of all strings of symbols in the language L0 that have
a formation sequence. We have seen in Proposition 7.10 that Frm(L0 ) ⊆ F , so
now we prove the converse.
Suppose φ has a formation sequence ⟨φ0 , . . . , φn ⟩. We prove that φ ∈
Frm(L0 ) by strong induction on n. Our induction hypothesis is that every
string of symbols with a formation sequence of length m < n is in Frm(L0 ).
By the definition of a formation sequence, either φn is atomic or there must
exist j, k < n such that one of the following is the case:

1. φn ≡ ¬φj .

2. φn ≡ (φj ∧ φk ).

3. φn ≡ (φj ∨ φk ).

4. φn ≡ (φj → φk ).

5. φn ≡ (φj ↔ φk ).

Now we reason by cases. If φn is atomic then φn ∈ Frm(L0 ). Suppose in-
stead that φn ≡ (φj ∧ φk ). By Lemma 7.11, ⟨φ0 , . . . , φj ⟩ and ⟨φ0 , . . . , φk ⟩ are
formation sequences for φj and φk respectively. Since these are proper ini-
tial subsequences of the formation sequence for φ, they both have length less
than n. Therefore by the induction hypothesis, φj and φk are in Frm(L0 ), and
so by the definition of a formula, so is (φj ∧ φk ). The other cases follow by
parallel reasoning.


7.5 Valuations and Satisfaction


Definition 7.13 (Valuations). Let {T, F} be the set of the two truth values,
“true” and “false.” A valuation for L0 is a function v assigning either T or F
to the propositional variables of the language, i.e., v : At0 → {T, F}.

Definition 7.14. Given a valuation v, define the evaluation function
v̄ : Frm(L0 ) → {T, F} inductively by:

v̄(⊥) = F;
v̄(⊤) = T;
v̄(pn ) = v(pn );
v̄(¬φ) = T if v̄(φ) = F, and F otherwise;
v̄(φ ∧ ψ) = T if v̄(φ) = T and v̄(ψ) = T, and F otherwise;
v̄(φ ∨ ψ) = T if v̄(φ) = T or v̄(ψ) = T, and F otherwise;
v̄(φ → ψ) = T if v̄(φ) = F or v̄(ψ) = T, and F otherwise;
v̄(φ ↔ ψ) = T if v̄(φ) = v̄(ψ), and F otherwise.

The clauses correspond to the following truth tables:

φ   ¬φ
T   F
F   T

φ   ψ   φ ∧ ψ   φ ∨ ψ   φ → ψ   φ ↔ ψ
T   T     T       T       T       T
T   F     F       T       F       F
F   T     F       T       T       F
F   F     F       F       T       T
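
The inductive clauses of Definition 7.14 become a short recursive function.
A Python sketch, again with the illustrative tuple encoding used in the
sketches above; a valuation is a dict mapping variable names to booleans,
with True playing the role of T:

    # Sketch of the evaluation function of Definition 7.14.

    def value(phi, v):
        op = phi[0]
        if op == "bot":
            return False
        if op == "top":
            return True
        if op == "var":
            return v[phi[1]]
        if op == "not":
            return not value(phi[1], v)
        left, right = value(phi[1], v), value(phi[2], v)
        if op == "and":
            return left and right
        if op == "or":
            return left or right
        if op == "imp":
            return (not left) or right
        if op == "iff":
            return left == right
        raise ValueError("unknown connective: " + op)

    phi = ("imp", ("var", "p0"), ("or", ("var", "p0"), ("var", "p1")))
    print(value(phi, {"p0": True, "p1": False}))  # True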

Problem 7.5. Consider adding to L0 a ternary connective ♢ with evaluation
given by

v̄(♢(φ, ψ, χ)) = v̄(ψ) if v̄(φ) = T, and v̄(χ) if v̄(φ) = F.

Write down the truth table for this connective.

Theorem 7.15 (Local Determination). Suppose that v1 and v2 are valu-
ations that agree on the propositional variables occurring in φ, i.e., v1 (pn ) =
v2 (pn ) whenever pn occurs in φ. Then v1 and v2 also agree on φ, i.e.,
v̄1 (φ) = v̄2 (φ).

Proof. By induction on φ.

Definition 7.16 (Satisfaction). We can inductively define the notion of sat-
isfaction of a formula φ by a valuation v, v ⊨ φ, as follows. (We write v ⊭ φ
to mean “not v ⊨ φ.”)

1. φ ≡ ⊥: v ⊭ φ.

2. φ ≡ ⊤: v ⊨ φ.

3. φ ≡ pi : v ⊨ φ iff v(pi ) = T.

4. φ ≡ ¬ψ: v ⊨ φ iff v ⊭ ψ.

5. φ ≡ (ψ ∧ χ): v ⊨ φ iff v ⊨ ψ and v ⊨ χ.

6. φ ≡ (ψ ∨ χ): v ⊨ φ iff v ⊨ ψ or v ⊨ χ (or both).

7. φ ≡ (ψ → χ): v ⊨ φ iff v ⊭ ψ or v ⊨ χ (or both).

8. φ ≡ (ψ ↔ χ): v ⊨ φ iff either both v ⊨ ψ and v ⊨ χ, or neither v ⊨ ψ nor


v ⊨ χ.

If Γ is a set of formulas, v ⊨ Γ iff v ⊨ φ for every φ ∈ Γ .

Proposition 7.17. v ⊨ φ iff v̄(φ) = T.

Proof. By induction on φ.

Problem 7.6. Prove Proposition 7.17.


7.6 Semantic Notions


We define the following semantic notions:

Definition 7.18.

1. A formula φ is satisfiable if for some v, v ⊨ φ; it is unsatisfiable if for
no v, v ⊨ φ;

2. A formula φ is a tautology if v ⊨ φ for all valuations v;

3. A formula φ is contingent if it is satisfiable but not a tautology;

4. If Γ is a set of formulas, Γ ⊨ φ (“Γ entails φ”) if and only if v ⊨ φ for
every valuation v for which v ⊨ Γ .

5. If Γ is a set of formulas, Γ is satisfiable if there is a valuation v for which
v ⊨ Γ , and Γ is unsatisfiable otherwise.
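
Since a formula contains only finitely many propositional variables, Local
Determination (Theorem 7.15) means each of these notions can be decided by
checking the finitely many relevant valuations. A brute-force Python sketch,
building on the value function sketched earlier (with 2^n valuations to
check, this is practical only for small n):

    # Brute-force versions of the notions in Definition 7.18.
    from itertools import product

    def variables(phi):
        if phi[0] == "var":
            return {phi[1]}
        if phi[0] in ("bot", "top"):
            return set()
        return set().union(*(variables(sub) for sub in phi[1:]))

    def all_valuations(phi):
        vs = sorted(variables(phi))
        for bits in product([True, False], repeat=len(vs)):
            yield dict(zip(vs, bits))

    def satisfiable(phi):
        return any(value(phi, v) for v in all_valuations(phi))

    def tautology(phi):
        return all(value(phi, v) for v in all_valuations(phi))

    def contingent(phi):
        return satisfiable(phi) and not tautology(phi)

    excluded_middle = ("or", ("var", "p0"), ("not", ("var", "p0")))
    print(tautology(excluded_middle))  # True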

Problem 7.7. For each of the following four formulas determine whether it is
(a) satisfiable, (b) a tautology, and (c) contingent.
1. (p0 → (¬p1 → ¬p0 )).
2. ((p0 ∧ ¬p1 ) → (¬p0 ∧ p2 )) ↔ ((p2 → p0 ) → (p0 → p1 )).
3. (p0 ↔ p1 ) → (p2 ↔ ¬p1 ).
4. ((p0 ↔ (¬p1 ∧ p2 )) ∨ (p2 → (p0 ↔ p1 ))).

Proposition 7.19.

1. φ is a tautology if and only if ∅ ⊨ φ;

2. If Γ ⊨ φ and Γ ⊨ φ → ψ then Γ ⊨ ψ;

3. If Γ is satisfiable then every finite subset of Γ is also satisfiable;

4. Monotonicity: if Γ ⊆ ∆ and Γ ⊨ φ then also ∆ ⊨ φ;

5. Transitivity: if Γ ⊨ φ and ∆ ∪ {φ} ⊨ ψ then Γ ∪ ∆ ⊨ ψ.

Proof. Exercise.

Problem 7.8. Prove Proposition 7.19.

Proposition 7.20. Γ ⊨ φ if and only if Γ ∪ {¬φ} is unsatisfiable.

Proof. Exercise.

Problem 7.9. Prove Proposition 7.20.

Theorem 7.21 (Semantic Deduction Theorem). Γ ⊨ φ → ψ if and only
if Γ ∪ {φ} ⊨ ψ.

Proof. Exercise.

Problem 7.10. Prove Theorem 7.21.

Chapter 8

Derivation Systems

This chapter collects general material on derivation systems. A text-
book using a specific system can insert the introduction section plus the
relevant survey section at the beginning of the chapter introducing that
system.


8.1 Introduction
Logics commonly have both a semantics and a derivation system. The seman-
tics concerns concepts such as truth, satisfiability, validity, and entailment.
The purpose of derivation systems is to provide a purely syntactic method of
establishing entailment and validity. They are purely syntactic in the sense
that a derivation in such a system is a finite syntactic object, usually a se-
quence (or other finite arrangement) of sentences or formulas. Good derivation
systems have the property that any given sequence or arrangement of sentences
or formulas can be verified mechanically to be “correct.”


The simplest (and historically first) derivation systems for first-order logic
were axiomatic. A sequence of formulas counts as a derivation in such a sys-
tem if each individual formula in it is either among a fixed set of “axioms”
or follows from formulas coming before it in the sequence by one of a fixed
number of “inference rules”—and it can be mechanically verified whether a formula
is an axiom and whether it follows correctly from other formulas by one of the
inference rules. Axiomatic derivation systems are easy to describe—and also
easy to handle meta-theoretically—but derivations in them are hard to read
and understand, and are also hard to produce.
Other derivation systems have been developed with the aim of making it
easier to construct derivations or easier to understand derivations once they
are complete. Examples are natural deduction, truth trees, also known as
tableaux proofs, and the sequent calculus. Some derivation systems are de-
signed especially with mechanization in mind, e.g., the resolution method is
easy to implement in software (but its derivations are essentially impossible to
understand). Most of these other derivation systems represent derivations as
trees of formulas rather than sequences. This makes it easier to see which parts
of a derivation depend on which other parts.
So for a given logic, such as first-order logic, the different derivation systems
will give different explications of what it is for a sentence to be a theorem and
what it means for a sentence to be derivable from some others. However that is
done (via axiomatic derivations, natural deductions, sequent derivations, truth
trees, resolution refutations), we want these relations to match the semantic
notions of validity and entailment. Let’s write ⊢ φ for “φ is a theorem” and
“Γ ⊢ φ” for “φ is derivable from Γ .” However ⊢ is defined, we want it to match
up with ⊨, that is:

1. ⊢ φ if and only if ⊨ φ
2. Γ ⊢ φ if and only if Γ ⊨ φ

The “only if” direction of the above is called soundness. A derivation system is
sound if derivability guarantees entailment (or validity). Every decent deriva-
tion system has to be sound; unsound derivation systems are not useful at all.
After all, the entire purpose of a derivation is to provide a syntactic guarantee
of validity or entailment. We’ll prove soundness for the derivation systems we
present.
The converse “if” direction is also important: it is called completeness.
A complete derivation system is strong enough to show that φ is a theorem
whenever φ is valid, and that Γ ⊢ φ whenever Γ ⊨ φ. Completeness is harder
to establish, and some logics have no complete derivation systems. First-order
logic does. Kurt Gödel was the first one to prove completeness for a derivation
system of first-order logic in his 1929 dissertation.
Another concept that is connected to derivation systems is that of consis-
tency. A set of sentences is called inconsistent if anything whatsoever can be
derived from it, and consistent otherwise. Inconsistency is the syntactic coun-
terpart to unsatisfiability: like unsatisfiable sets, inconsistent sets of sentences
do not make good theories, they are defective in a fundamental way. Consis-
tent sets of sentences may not be true or useful, but at least they pass that
minimal threshold of logical usefulness. For different derivation systems the
specific definition of consistency of sets of sentences might differ, but like ⊢, we
want consistency to coincide with its semantic counterpart, satisfiability. We
want it to always be the case that Γ is consistent if and only if it is satisfi-
able. Here, the “if” direction amounts to completeness (consistency guarantees
satisfiability), and the “only if” direction amounts to soundness (satisfiability
guarantees consistency). In fact, for classical first-order logic, the two versions
of soundness and completeness are equivalent.


8.2 The Sequent Calculus


While many derivation systems operate with arrangements of sentences, the
sequent calculus operates with sequents. A sequent is an expression of the
form
φ1 , . . . , φm ⇒ ψ1 , . . . , ψn ,
that is a pair of sequences of sentences, separated by the sequent symbol ⇒.
Either sequence may be empty. A derivation in the sequent calculus is a tree
of sequents, where the topmost sequents are of a special form (they are called
“initial sequents” or “axioms”) and every other sequent follows from the se-
quents immediately above it by one of the rules of inference. The rules of
inference either manipulate the sentences in the sequents (adding, removing,
or rearranging them on either the left or the right), or they introduce a com-
plex formula in the conclusion of the rule. For instance, the ∧L rule allows the
inference from φ, Γ ⇒ ∆ to φ ∧ ψ, Γ ⇒ ∆, and the →R allows the inference
from φ, Γ ⇒ ∆, ψ to Γ ⇒ ∆, φ → ψ, for any Γ , ∆, φ, and ψ. (In particular, Γ
and ∆ may be empty.)
The ⊢ relation based on the sequent calculus is defined as follows: Γ ⊢ φ
iff there is some sequence Γ0 such that every φ in Γ0 is in Γ and there is a
derivation with the sequent Γ0 ⇒ φ at its root. φ is a theorem in the sequent
calculus if the sequent ⇒ φ has a derivation. For instance, here is a derivation
that shows that ⊢ (φ ∧ ψ) → φ:
φ ⇒ φ
∧L
φ∧ψ ⇒ φ
→R
⇒ (φ ∧ ψ) → φ
A set Γ is inconsistent in the sequent calculus if there is a derivation of
Γ0 ⇒ (where every φ ∈ Γ0 is in Γ and the right side of the sequent is empty).
Using the rule WR, any sentence can be derived from an inconsistent set.
The sequent calculus was invented in the 1930s by Gerhard Gentzen. Be-
cause of its systematic and symmetric design, it is a very useful formalism for
developing a theory of derivations. It is relatively easy to find derivations in

the sequent calculus, but these derivations are often hard to read and their
connection to proofs is sometimes not easy to see. It has proved to be a very
elegant approach to derivation systems, however, and many logics have sequent
calculus systems.


8.3 Natural Deduction


Natural deduction is a derivation system intended to mirror actual reasoning
(especially the kind of regimented reasoning employed by mathematicians).
Actual reasoning proceeds by a number of “natural” patterns. For instance,
proof by cases allows us to establish a conclusion on the basis of a disjunctive
premise, by establishing that the conclusion follows from either of the disjuncts.
Indirect proof allows us to establish a conclusion by showing that its negation
leads to a contradiction. Conditional proof establishes a conditional claim “if
. . . then . . . ” by showing that the consequent follows from the antecedent.
Natural deduction is a formalization of some of these natural inferences. Each
of the logical connectives and quantifiers comes with two rules, an introduction
and an elimination rule, and they each correspond to one such natural inference
pattern. For instance, →Intro corresponds to conditional proof, and ∨Elim to
proof by cases. A particularly simple rule is ∧Elim which allows the inference
from φ ∧ ψ to φ (or ψ).
One feature that distinguishes natural deduction from other derivation sys-
tems is its use of assumptions. A derivation in natural deduction is a tree
of formulas. A single formula stands at the root of the tree of formulas, and
the “leaves” of the tree are formulas from which the conclusion is derived. In
natural deduction, some leaf formulas play a role inside the derivation but are
“used up” by the time the derivation reaches the conclusion. This corresponds
to the practice, in actual reasoning, of introducing hypotheses which only re-
main in effect for a short while. For instance, in a proof by cases, we assume
the truth of each of the disjuncts; in conditional proof, we assume the truth
of the antecedent; in indirect proof, we assume the truth of the negation of
the conclusion. This way of introducing hypothetical assumptions and then
doing away with them in the service of establishing an intermediate step is a
hallmark of natural deduction. The formulas at the leaves of a natural de-
duction derivation are called assumptions, and some of the rules of inference
may “discharge” them. For instance, if we have a derivation of ψ from some
assumptions which include φ, then the →Intro rule allows us to infer φ→ψ and
discharge any assumption of the form φ. (To keep track of which assumptions
are discharged at which inferences, we label the inference and the assumptions
it discharges with a number.) The assumptions that remain undischarged at
the end of the derivation are together sufficient for the truth of the conclu-
sion, and so a derivation establishes that its undischarged assumptions entail
its conclusion.

The relation Γ ⊢ φ based on natural deduction holds iff there is a derivation
in which φ is the last sentence in the tree, and every leaf which is undischarged
is in Γ . φ is a theorem in natural deduction iff there is a derivation in which
φ is the last sentence and all assumptions are discharged. For instance, here is
a derivation that shows that ⊢ (φ ∧ ψ) → φ:

[φ ∧ ψ]1
φ ∧Elim
1 →Intro
(φ ∧ ψ) → φ

The label 1 indicates that the assumption φ ∧ ψ is discharged at the →Intro


inference.
A set Γ is inconsistent iff Γ ⊢ ⊥ in natural deduction. The rule ⊥I makes
it so that from an inconsistent set, any sentence can be derived.
Natural deduction systems were developed by Gerhard Gentzen and Sta-
nislaw Jaśkowski in the 1930s, and later developed by Dag Prawitz and Frederic
Fitch. Because its inferences mirror natural methods of proof, it is favored by
philosophers. The versions developed by Fitch are often used in introductory
logic textbooks. In the philosophy of logic, the rules of natural deduction have
sometimes been taken to give the meanings of the logical operators (“proof-
theoretic semantics”).

content/propositional-logic/../first-order-logic/proof-systems/tableaux.tex

8.4 Tableaux
While many derivation systems operate with arrangements of sentences, tableaux
operate with signed formulas. A signed formula is a pair consisting of a truth
value sign (T or F) and a sentence

T φ or F φ.

A tableau consists of signed formulas arranged in a downward-branching tree.


It begins with a number of assumptions and continues with signed formulas
which result from one of the signed formulas above it by applying one of the
rules of inference. Each rule allows us to add one or more signed formulas to
the end of a branch, or two signed formulas side by side—in this case a branch
splits into two, with the two added signed formulas forming the ends of the
two branches.
A rule applied to a complex signed formula results in the addition of signed
formulas which are immediate sub-formulas. They come in pairs, one rule for
each of the two signs. For instance, the ∧T rule applies to T φ ∧ ψ, and allows
the addition of both the two signed formulas T φ and T ψ to the end of any
branch containing T φ ∧ ψ; the ∧F rule applies to F φ ∧ ψ and allows a branch to be split by
adding F φ and F ψ side-by-side. A tableau is closed if every one of its branches
contains a matching pair of signed formulas T φ and F φ.

Release : 6891b66 (2024-12-01) 109


CHAPTER 8. DERIVATION SYSTEMS

The ⊢ relation based on tableaux is defined as follows: Γ ⊢ φ iff there is
some finite set Γ0 = {ψ1 , . . . , ψn } ⊆ Γ such that there is a closed tableau for
the assumptions
{F φ, T ψ1 , . . . , T ψn }
For instance, here is a closed tableau that shows that ⊢ (φ ∧ ψ) → φ:

1. F (φ ∧ ψ) → φ     Assumption
2. T φ ∧ ψ           →F 1
3. F φ               →F 1
4. T φ               ∧T 2
5. T ψ               ∧T 2

A set Γ is inconsistent in the tableau calculus if there is a closed tableau
for assumptions
{T ψ1 , . . . , T ψn }
for some ψi ∈ Γ .
Tableaux were invented in the 1950s independently by Evert Beth and
Jaakko Hintikka, and simplified and popularized by Raymond Smullyan. They
are very easy to use, since constructing a tableau is a very systematic proce-
dure. Because of the systematic nature of tableaux, they also lend themselves
to implementation by computer. However, tableaux are often hard to read and
their connection to proofs is sometimes not easy to see. The approach is also
quite general, and many different logics have tableau systems. Tableaux also
help us to find structures that satisfy given (sets of) sentences: if the set is
satisfiable, it won’t have a closed tableau, i.e., any tableau will have an open
branch. The satisfying structure can be “read off” an open branch, provided
every rule it is possible to apply has been applied on that branch. There is also
a very close connection to the sequent calculus: essentially, a closed tableau is
a condensed derivation in the sequent calculus, written upside-down.


8.5 Axiomatic Derivations


Axiomatic derivations are the oldest and simplest logical derivation systems. Their
derivations are simply sequences of sentences. A sequence of sentences counts
as a correct derivation if every sentence φ in it satisfies one of the following
conditions:

1. φ is an axiom, or

2. φ is an element of a given set Γ of sentences, or

3. φ is justified by a rule of inference.

To be an axiom, φ has to have the form of one of a number of fixed sentence
schemas. There are many sets of axiom schemas that provide a satisfactory
(sound and complete) derivation system for first-order logic. Some are orga-
nized according to the connectives they govern, e.g., the schemas

φ → (ψ → φ) ψ → (ψ ∨ χ) (ψ ∧ χ) → ψ

are common axioms that govern →, ∨ and ∧. Some axiom systems aim at a
minimal number of axioms. Depending on the connectives that are taken as
primitives, it is even possible to find axiom systems that consist of a single
axiom.
A rule of inference is a conditional statement that gives a sufficient condition
for a sentence in a derivation to be justified. Modus ponens is one very common
such rule: it says that if φ and φ → ψ are already justified, then ψ is justified.
This means that a line in a derivation containing the sentence ψ is justified,
provided that both φ and φ → ψ (for some sentence φ) appear in the derivation
before ψ.
The ⊢ relation based on axiomatic derivations is defined as follows: Γ ⊢ φ
iff there is a derivation with the sentence φ as its last formula (and Γ is taken
as the set of sentences in that derivation which are justified by (2) above). φ
is a theorem if φ has a derivation where Γ is empty, i.e., every sentence in the
derivation is justified either by (1) or (3). For instance, here is a derivation
that shows that ⊢ φ → (ψ → (ψ ∨ φ)):

1. ψ → (ψ ∨ φ)
2. (ψ → (ψ ∨ φ)) → (φ → (ψ → (ψ ∨ φ)))
3. φ → (ψ → (ψ ∨ φ))

The sentence on line 1 is of the form of the axiom φ → (φ ∨ ψ) (with the
roles of φ and ψ reversed). The sentence on line 2 is of the form of the axiom
φ→(ψ →φ). Thus, both lines are justified. Line 3 is justified by modus ponens:
if we abbreviate it as θ, then line 2 has the form χ → θ, where χ is ψ → (ψ ∨ φ),
i.e., line 1.
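
The claim that such derivations can be verified mechanically is easy to make
concrete. Below is a Python sketch of a checker over the tuple-encoded
formulas used in the earlier sketches; the two axiom schemas and all names
here are illustrative, since the text deliberately does not fix a particular
axiom set:

    # Sketch: check that every line of a derivation is an instance of an
    # axiom schema or follows from two earlier lines by modus ponens.

    def match(schema, phi, env):
        # Strings in a schema are metavariables matching any formula.
        if isinstance(schema, str):
            if schema in env:
                return env[schema] == phi
            env[schema] = phi
            return True
        return (isinstance(phi, tuple) and len(schema) == len(phi)
                and schema[0] == phi[0]
                and all(match(s, p, env)
                        for s, p in zip(schema[1:], phi[1:])))

    AXIOMS = [
        ("imp", "A", ("imp", "B", "A")),   # A -> (B -> A)
        ("imp", "B", ("or", "B", "C")),    # B -> (B v C)
    ]

    def is_axiom(phi):
        return any(match(ax, phi, {}) for ax in AXIOMS)

    def checks(derivation):
        for i, phi in enumerate(derivation):
            earlier = derivation[:i]
            by_mp = any(("imp", chi, phi) in earlier for chi in earlier)
            if not (is_axiom(phi) or by_mp):
                return False
        return True

    # The three-line derivation above, with p for φ and q for ψ:
    p, q = ("var", "p"), ("var", "q")
    line1 = ("imp", q, ("or", q, p))
    line2 = ("imp", line1, ("imp", p, line1))
    line3 = ("imp", p, line1)
    print(checks([line1, line2, line3]))  # True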
A set Γ is inconsistent if Γ ⊢ ⊥. A complete axiom system will also prove
that ⊥ → φ for any φ, and so if Γ is inconsistent, then Γ ⊢ φ for any φ.
Systems of axiomatic derivations for logic were first given by Gottlob Frege
in his 1879 Begriffsschrift, which for this reason is often considered the first
work of modern logic. They were perfected in Alfred North Whitehead and
Bertrand Russell’s Principia Mathematica and by David Hilbert and his stu-
dents in the 1920s. They are thus often called “Frege systems” or “Hilbert
systems.” They are very versatile in that it is often easy to find an axiomatic
system for a logic. Because derivations have a very simple structure and only
one or two inference rules, it is also relatively easy to prove things about them.

However, they are very hard to use in practice, i.e., it is difficult to find and
write proofs.

Chapter 9

The Sequent Calculus

This chapter presents Gentzen’s standard sequent calculus LK for clas-
sical first-order logic. It could use more examples and exercises. To include
or exclude material relevant to the sequent calculus as a proof system, use
the “prfLK” tag.


9.1 Rules and Derivations


For the following, let Γ, ∆, Π, Λ represent finite sequences of sentences.

Definition 9.1 (Sequent). A sequent is an expression of the form

Γ ⇒∆

where Γ and ∆ are finite (possibly empty) sequences of sentences of the lan-
guage L. Γ is called the antecedent, while ∆ is the succedent.

The intuitive idea behind a sequent is: if all of the sentences in the an-
tecedent hold, then at least one of the sentences in the succedent holds. That
is, if Γ = ⟨φ1 , . . . , φm ⟩ and ∆ = ⟨ψ1 , . . . , ψn ⟩, then Γ ⇒ ∆ holds iff

(φ1 ∧ · · · ∧ φm ) → (ψ1 ∨ · · · ∨ ψn )

holds. There are two special cases: when Γ is empty and when ∆ is empty.
When Γ is empty, i.e., m = 0, ⇒ ∆ holds iff ψ1 ∨ · · · ∨ ψn holds. When ∆
is empty, i.e., n = 0, Γ ⇒ holds iff ¬(φ1 ∧ · · · ∧ φm ) does. We say a sequent
is valid iff the corresponding sentence is valid.
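
In the propositional case this reduction is easy to implement. The sketch
below, which reuses the value and all_valuations functions from the
chapter 7 sketches (all helper names are our own), decides validity of a
sequent by testing the corresponding sentence at every valuation, reading
the empty antecedent as ⊤ and the empty succedent as ⊥ in line with the two
special cases just described:

    # Sketch: Γ ⇒ ∆ is valid iff (φ1 ∧ ... ∧ φm) → (ψ1 ∨ ... ∨ ψn) is
    # a tautology.

    def conj(formulas):
        result = ("top",)
        for phi in formulas:
            result = phi if result == ("top",) else ("and", result, phi)
        return result

    def disj(formulas):
        result = ("bot",)
        for psi in formulas:
            result = psi if result == ("bot",) else ("or", result, psi)
        return result

    def sequent_valid(antecedent, succedent):
        phi = ("imp", conj(antecedent), disj(succedent))
        return all(value(phi, v) for v in all_valuations(phi))

    p, q = ("var", "p"), ("var", "q")
    print(sequent_valid([("and", p, q)], [p]))         # True: φ ∧ ψ ⇒ φ
    print(sequent_valid([], [("or", p, ("not", p))]))  # True: ⇒ φ ∨ ¬φ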

If Γ is a sequence of sentences, we write Γ, φ for the result of appending φ
to the right end of Γ (and φ, Γ for the result of appending φ to the left end
of Γ ). If ∆ is a sequence of sentences also, then Γ, ∆ is the concatenation of
the two sequences.

Definition 9.2 (Initial Sequent). An initial sequent is a sequent of one of
the following forms:

1. φ ⇒ φ

2. ⇒⊤

3. ⊥ ⇒

for any sentence φ in the language.

Derivations in the sequent calculus are certain trees of sequents, where the
topmost sequents are initial sequents, and if a sequent stands below one or two
other sequents, it must follow correctly by a rule of inference. The rules for LK
are divided into two main types: logical rules and structural rules. The logical
rules are named for the main operator of the sentence containing φ and/or ψ in
the lower sequent. Each one comes in two versions, one for inferring a sequent
with the sentence containing the logical operator on the left, and one with the
sentence on the right.

content/propositional-logic/../first-order-logic/sequent-calculus/propositional-rules.tex

9.2 Propositional Rules


Rules for ¬

Γ ⇒ ∆, φ
¬L
¬φ, Γ ⇒ ∆

φ, Γ ⇒ ∆
¬R
Γ ⇒ ∆, ¬φ

Rules for ∧

φ, Γ ⇒ ∆
∧L
φ ∧ ψ, Γ ⇒ ∆

ψ, Γ ⇒ ∆
∧L
φ ∧ ψ, Γ ⇒ ∆

Γ ⇒ ∆, φ    Γ ⇒ ∆, ψ
∧R
Γ ⇒ ∆, φ ∧ ψ

Rules for ∨

φ, Γ ⇒ ∆    ψ, Γ ⇒ ∆
∨L
φ ∨ ψ, Γ ⇒ ∆

Γ ⇒ ∆, φ
∨R
Γ ⇒ ∆, φ ∨ ψ

Γ ⇒ ∆, ψ
∨R
Γ ⇒ ∆, φ ∨ ψ

Rules for →

Γ ⇒ ∆, φ    ψ, Π ⇒ Λ
→L
φ → ψ, Γ, Π ⇒ ∆, Λ

φ, Γ ⇒ ∆, ψ
→R
Γ ⇒ ∆, φ → ψ


9.3 Structural Rules


We also need a few rules that allow us to rearrange sentences in the left and
right side of a sequent. Since the logical rules require that the sentences in
the premise which the rule acts upon stand either to the far left or to the far
right, we need an “exchange” rule that allows us to move sentences to the right
position. It’s also important sometimes to be able to combine two identical
sentences into one, and to add a sentence on either side.

Weakening

Γ ⇒ ∆
WL
φ, Γ ⇒ ∆

Γ ⇒ ∆
WR
Γ ⇒ ∆, φ

Contraction

φ, φ, Γ ⇒ ∆
CL
φ, Γ ⇒ ∆

Γ ⇒ ∆, φ, φ
CR
Γ ⇒ ∆, φ

Exchange

Γ, φ, ψ, Π ⇒ ∆
XL
Γ, ψ, φ, Π ⇒ ∆

Γ ⇒ ∆, φ, ψ, Λ
XR
Γ ⇒ ∆, ψ, φ, Λ

A series of weakening, contraction, and exchange inferences will often be indi-
cated by double inference lines.
The following rule, called “cut,” is not strictly speaking necessary, but
makes it a lot easier to reuse and combine derivations.

Γ ⇒ ∆, φ φ, Π ⇒ Λ
Cut
Γ, Π ⇒ ∆, Λ


9.4 Derivations
We’ve said what an initial sequent looks like, and we’ve given the rules of
inference. Derivations in the sequent calculus are inductively generated from
these: each derivation either is an initial sequent on its own, or consists of one
or two derivations followed by an inference.

Definition 9.3 (LK derivation). An LK-derivation of a sequent S is a finite
tree of sequents satisfying the following conditions:

1. The topmost sequents of the tree are initial sequents.

2. The bottommost sequent of the tree is S.

3. Every sequent in the tree except S is a premise of a correct application
of an inference rule whose conclusion stands directly below that sequent
in the tree.

We then say that S is the end-sequent of the derivation and that S is derivable
in LK (or LK-derivable).

Example 9.4. Every initial sequent, e.g., χ ⇒ χ is a derivation. We can
obtain a new derivation from this by applying, say, the WL rule,

Γ ⇒ ∆
WL
φ, Γ ⇒ ∆

The rule, however, is meant to be general: we can replace the φ in the rule
with any sentence, e.g., also with θ. If the premise matches our initial sequent
χ ⇒ χ, that means that both Γ and ∆ are just χ, and the conclusion would
then be θ, χ ⇒ χ. So, the following is a derivation:
χ ⇒ χ
WL
θ, χ ⇒ χ

We can now apply another rule, say XL, which allows us to switch two sentences
on the left. So, the following is also a correct derivation:
χ ⇒ χ
WL
θ, χ ⇒ χ
XL
χ, θ ⇒ χ

In this application of the rule, which was given as


Γ, φ, ψ, Π ⇒ ∆
XL
Γ, ψ, φ, Π ⇒ ∆,

both Γ and Π were empty, ∆ is χ, and the roles of φ and ψ are played by θ
and χ, respectively. In much the same way, we also see that
θ ⇒ θ
WL
χ, θ ⇒ θ

is a derivation. Now we can take these two derivations, and combine them
using ∧R. That rule was
Γ ⇒ ∆, φ Γ ⇒ ∆, ψ
∧R
Γ ⇒ ∆, φ ∧ ψ

In our case, the premises must match the last sequents of the derivations ending
in the premises. That means that Γ is χ, θ, ∆ is empty, φ is χ and ψ is θ. So
the conclusion, if the inference should be correct, is χ, θ ⇒ χ ∧ θ.
χ ⇒ χ
WL
θ, χ ⇒ χ                θ ⇒ θ
XL                      WL
χ, θ ⇒ χ                χ, θ ⇒ θ
∧R
χ, θ ⇒ χ ∧ θ

Of course, we can also reverse the premises, then φ would be θ and ψ would
be χ.
                        χ ⇒ χ
                        WL
θ ⇒ θ                   θ, χ ⇒ χ
WL                      XL
χ, θ ⇒ θ                χ, θ ⇒ χ
∧R
χ, θ ⇒ θ ∧ χ

9.5 Examples of Derivations


Example 9.5. Give an LK-derivation for the sequent φ ∧ ψ ⇒ φ.
We begin by writing the desired end-sequent at the bottom of the derivation.

φ∧ψ ⇒ φ

Next, we need to figure out what kind of inference could have a lower sequent
of this form. This could be a structural rule, but it is a good idea to start by
looking for a logical rule. The only logical connective occurring in the lower
sequent is ∧, so we’re looking for an ∧ rule, and since the ∧ symbol occurs in
the antecedent, we’re looking at the ∧L rule.
∧L
φ∧ψ ⇒ φ

There are two options for what could have been the upper sequent of the ∧L
inference: we could have an upper sequent of φ ⇒ φ, or of ψ ⇒ φ. Clearly,
φ ⇒ φ is an initial sequent (which is a good thing), while ψ ⇒ φ is not
derivable in general. We fill in the upper sequent:
φ ⇒ φ
∧L
φ∧ψ ⇒ φ

We now have a correct LK-derivation of the sequent φ ∧ ψ ⇒ φ.

Example 9.6. Give an LK-derivation for the sequent ¬φ ∨ ψ ⇒ φ → ψ.
Begin by writing the desired end-sequent at the bottom of the derivation.

¬φ ∨ ψ ⇒ φ → ψ

To find a logical rule that could give us this end-sequent, we look at the logical
connectives in the end-sequent: ¬, ∨, and →. We only care at the moment
about ∨ and → because they are main operators of sentences in the end-sequent,
while ¬ is inside the scope of another connective, so we will take care of it later.
Our options for logical rules for the final inference are therefore the ∨L rule
and the →R rule. We could pick either rule, really, but let’s pick the →R rule
(if for no reason other than it allows us to put off splitting into two branches).
According to the form of →R inferences which can yield the lower sequent, this
must look like:

φ, ¬φ ∨ ψ ⇒ ψ
→R
¬φ ∨ ψ ⇒ φ → ψ

If we move ¬φ ∨ ψ to the outside of the antecedent, we can apply the ∨L
rule. According to the schema, this must split into two upper sequents as
follows:

¬φ, φ ⇒ ψ ψ, φ ⇒ ψ
∨L
¬φ ∨ ψ, φ ⇒ ψ
XR
φ, ¬φ ∨ ψ ⇒ ψ
→R
¬φ ∨ ψ ⇒ φ→ψ
Remember that we are trying to wind our way up to initial sequents; we seem
to be pretty close! The right branch is just one weakening and one exchange
away from an initial sequent and then it is done:
ψ ⇒ ψ
WL
φ, ψ ⇒ ψ
XL
¬φ, φ ⇒ ψ ψ, φ ⇒ ψ
∨L
¬φ ∨ ψ, φ ⇒ ψ
XR
φ, ¬φ ∨ ψ ⇒ ψ
→R
¬φ ∨ ψ ⇒ φ → ψ
Now looking at the left branch, the only logical connective in any sentence
is the ¬ symbol in the antecedent sentences, so we’re looking at an instance of
the ¬L rule.
ψ ⇒ ψ
WL
φ ⇒ ψ, φ φ, ψ ⇒ ψ
¬L XL
¬φ, φ ⇒ ψ ψ, φ ⇒ ψ
∨L
¬φ ∨ ψ, φ ⇒ ψ
XR
φ, ¬φ ∨ ψ ⇒ ψ
→R
¬φ ∨ ψ ⇒ φ → ψ
Similarly to how we finished off the right branch, we are just one weakening
and one exchange away from finishing off this left branch as well.
φ ⇒ φ
WR
φ ⇒ φ, ψ ψ ⇒ ψ
XR WL
φ ⇒ ψ, φ φ, ψ ⇒ ψ
¬L XL
¬φ, φ ⇒ ψ ψ, φ ⇒ ψ
∨L
¬φ ∨ ψ, φ ⇒ ψ
XR
φ, ¬φ ∨ ψ ⇒ ψ
→R
¬φ ∨ ψ ⇒ φ → ψ

Example 9.7. Give an LK-derivation of the sequent ¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ).
Using the techniques from above, we start by writing the desired end-
sequent at the bottom.

¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ)
The available main connectives of sentences in the end-sequent are the ∨ symbol
and the ¬ symbol. It would work to apply either the ∨L or the ¬R rule here,
but we start with the ¬R rule because it avoids splitting up into two branches
for a moment:

φ ∧ ψ, ¬φ ∨ ¬ψ ⇒
¬R
¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ)
Now we have a choice of whether to look at the ∧L or the ∨L rule. Let’s see
what happens when we apply the ∧L rule: we have a choice to start with either
the sequent φ, ¬φ ∨ ¬ψ ⇒ or the sequent ψ, ¬φ ∨ ¬ψ ⇒ . Since the derivation
is symmetric with regard to φ and ψ, let’s go with the former:

φ, ¬φ ∨ ¬ψ ⇒
∧L
φ ∧ ψ, ¬φ ∨ ¬ψ ⇒
¬R
¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ)
Continuing to fill in the derivation, we see that we run into a problem:
?
φ ⇒ φ φ ⇒ ψ
¬L ¬L
¬φ, φ ⇒ ¬ψ, φ ⇒
∨L
¬φ ∨ ¬ψ, φ ⇒
XL
φ, ¬φ ∨ ¬ψ ⇒
∧L
φ ∧ ψ, ¬φ ∨ ¬ψ ⇒
¬R
¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ)
The top of the right branch cannot be reduced any further, and it cannot be
brought by way of structural inferences to an initial sequent, so this is not the
right path to take. So clearly, it was a mistake to apply the ∧L rule above.
Going back to what we had before and carrying out the ∨L rule instead, we
get

¬φ, φ ∧ ψ ⇒ ¬ψ, φ ∧ ψ ⇒
∨L
¬φ ∨ ¬ψ, φ ∧ ψ ⇒
XL
φ ∧ ψ, ¬φ ∨ ¬ψ ⇒
¬R
¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ)
Completing each branch as we’ve done before, we get
φ ⇒ φ ψ ⇒ ψ
∧L ∧L
φ∧ψ ⇒ φ φ∧ψ ⇒ ψ
¬L ¬L
¬φ, φ ∧ ψ ⇒ ¬ψ, φ ∧ ψ ⇒
∨L
¬φ ∨ ¬ψ, φ ∧ ψ ⇒
XL
φ ∧ ψ, ¬φ ∨ ¬ψ ⇒
¬R
¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ)
(We could have carried out the ∧ rules lower than the ¬ rules in these steps
and still obtained a correct derivation).

Example 9.8. So far we haven’t used the contraction rule, but it is sometimes
required. Here’s an example where that happens. Suppose we want to prove
⇒ φ ∨ ¬φ. Applying ∨R backwards would give us one of these two derivations:

⇒ φ
∨R
⇒ φ ∨ ¬φ

φ ⇒
¬R
⇒ ¬φ
∨R
⇒ φ ∨ ¬φ

Neither of these of course ends in an initial sequent. The trick is to realize
that the contraction rule allows us to combine two copies of a sentence into
one—and when we’re searching for a proof, i.e., going from bottom to top, we
can keep a copy of φ ∨ ¬φ in the premise, e.g.,

⇒ φ ∨ ¬φ, φ
⇒ φ ∨ ¬φ, φ ∨ ¬φ ∨R
⇒ φ ∨ ¬φ CR

Now we can apply ∨R a second time, and also get ¬φ, which leads to a complete
derivation.
φ ⇒ φ
⇒ φ, ¬φ ¬R
⇒ φ, φ ∨ ¬φ ∨R
⇒ φ ∨ ¬φ, φ XR
⇒ φ ∨ ¬φ, φ ∨ ¬φ ∨R
⇒ φ ∨ ¬φ CR

Problem 9.1. Give derivations of the following sequents:

1. φ ∧ (ψ ∧ χ) ⇒ (φ ∧ ψ) ∧ χ.

2. φ ∨ (ψ ∨ χ) ⇒ (φ ∨ ψ) ∨ χ.

3. φ → (ψ → χ) ⇒ ψ → (φ → χ).

4. φ ⇒ ¬¬φ.

Problem 9.2. Give derivations of the following sequents:

1. (φ ∨ ψ) → χ ⇒ φ → χ.

2. (φ → χ) ∧ (ψ → χ) ⇒ (φ ∨ ψ) → χ.

3. ⇒ ¬(φ ∧ ¬φ).

4. ψ → φ ⇒ ¬φ → ¬ψ.

5. ⇒ (φ → ¬φ) → ¬φ.

6. ⇒ ¬(φ → ψ) → ¬ψ.

7. φ → χ ⇒ ¬(φ ∧ ¬χ).

8. φ ∧ ¬χ ⇒ ¬(φ → χ).

9. φ ∨ ψ, ¬ψ ⇒ φ.

10. ¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ).


11. ⇒ (¬φ ∧ ¬ψ) → ¬(φ ∨ ψ).
12. ⇒ ¬(φ ∨ ψ) → (¬φ ∧ ¬ψ).

Problem 9.3. Give derivations of the following sequents:


1. ¬(φ → ψ) ⇒ φ.
2. ¬(φ ∧ ψ) ⇒ ¬φ ∨ ¬ψ.
3. φ → ψ ⇒ ¬φ ∨ ψ.
4. ⇒ ¬¬φ → φ.
5. φ → ψ, ¬φ → ψ ⇒ ψ.
6. (φ ∧ ψ) → χ ⇒ (φ → χ) ∨ (ψ → χ).
7. (φ → ψ) → φ ⇒ φ.
8. ⇒ (φ → ψ) ∨ (ψ → χ).
(These all require the CR rule.)

This section collects the definitions of the provability relation and con-
sistency for the sequent calculus.


9.6 Proof-Theoretic Notions


Just as we’ve defined a number of important semantic notions (validity, en-
tailment, satisfiability), we now define corresponding proof-theoretic notions.
These are not defined by appeal to satisfaction of sentences in structures, but
by appeal to the derivability or non-derivability of certain sequents. It was an
important discovery that these notions coincide. That they do is the content
of the soundness and completeness theorem.
Definition 9.9 (Theorems). A sentence φ is a theorem if there is a deriva-
tion in LK of the sequent ⇒ φ. We write ⊢ φ if φ is a theorem and ⊬ φ if
it is not.

Definition 9.10 (Derivability). A sentence φ is derivable from a set of sen-
tences Γ , Γ ⊢ φ, iff there is a finite subset Γ0 ⊆ Γ and a sequence Γ0′ of the
sentences in Γ0 such that LK derives Γ0′ ⇒ φ. If φ is not derivable from Γ we
write Γ ⊬ φ.

Because of the contraction, weakening, and exchange rules, the order and
number of sentences in Γ0′ does not matter: if a sequent Γ0′ ⇒ φ is derivable,
then so is Γ0′′ ⇒ φ for any Γ0′′ that contains the same sentences as Γ0′ . For
instance, if Γ0 = {ψ, χ} then both Γ0′ = ⟨ψ, ψ, χ⟩ and Γ0′′ = ⟨χ, χ, ψ⟩ are
sequences containing just the sentences in Γ0 . If a sequent containing one is
derivable, so is the other, e.g.:

ψ, ψ, χ ⇒ φ
CL
ψ, χ ⇒ φ
XL
χ, ψ ⇒ φ
WL
χ, χ, ψ ⇒ φ

From now on, if Γ0 is a finite set of sentences, we’ll write Γ0 ⇒ φ for any
sequent whose antecedent is a sequence of the sentences in Γ0 , and tacitly
include contractions, exchanges, and weakenings if necessary.

Definition 9.11 (Consistency). A set of sentences Γ is inconsistent iff there
is a finite subset Γ0 ⊆ Γ such that LK derives Γ0 ⇒ . If Γ is not inconsistent,
i.e., if for every finite Γ0 ⊆ Γ , LK does not derive Γ0 ⇒ , we say it is
consistent.

Proposition 9.12 (Reflexivity). If φ ∈ Γ , then Γ ⊢ φ.

Proof. The initial sequent φ ⇒ φ is derivable, and {φ} ⊆ Γ .

Proposition 9.13 (Monotonicity). If Γ ⊆ ∆ and Γ ⊢ φ, then ∆ ⊢ φ.

Proof. Suppose Γ ⊢ φ, i.e., there is a finite Γ0 ⊆ Γ such that Γ0 ⇒ φ is
derivable. Since Γ ⊆ ∆, Γ0 is also a finite subset of ∆. The derivation of
Γ0 ⇒ φ thus also shows ∆ ⊢ φ.

Proposition 9.14 (Transitivity). If Γ ⊢ φ and {φ} ∪ ∆ ⊢ ψ, then Γ ∪ ∆ ⊢ ψ.

Proof. If Γ ⊢ φ, there is a finite Γ0 ⊆ Γ and a derivation π0 of Γ0 ⇒ φ. If
{φ} ∪ ∆ ⊢ ψ, then for some finite subset ∆0 ⊆ ∆, there is a derivation π1 of
φ, ∆0 ⇒ ψ. Consider the following derivation:
φ, ∆0 ⇒ ψ. Consider the following derivation:

π0 π1

Γ0 ⇒ φ φ, ∆0 ⇒ ψ
Cut
Γ0 , ∆0 ⇒ ψ

Since Γ0 ∪ ∆0 ⊆ Γ ∪ ∆, this shows Γ ∪ ∆ ⊢ ψ.

Note that this means that in particular if Γ ⊢ φ and φ ⊢ ψ, then Γ ⊢ ψ. It
follows also that if φ1 , . . . , φn ⊢ ψ and Γ ⊢ φi for each i, then Γ ⊢ ψ.
Proposition 9.15. Γ is inconsistent iff Γ ⊢ φ for every sentence φ.

Proof. Exercise.

Problem 9.4. Prove Proposition 9.15.

Proposition 9.16 (Compactness).
1. If Γ ⊢ φ then there is a finite subset Γ0 ⊆ Γ such that Γ0 ⊢ φ.
2. If every finite subset of Γ is consistent, then Γ is consistent.

Proof. 1. If Γ ⊢ φ, then there is a finite subset Γ0 ⊆ Γ such that the sequent
Γ0 ⇒ φ has a derivation. Consequently, Γ0 ⊢ φ.
2. If Γ is inconsistent, there is a finite subset Γ0 ⊆ Γ such that LK derives
Γ0 ⇒ . But then Γ0 is a finite subset of Γ that is inconsistent.


9.7 Derivability and Consistency


We will now establish a number of properties of the derivability relation. They
are independently interesting, but each will play a role in the proof of the
completeness theorem.

Proposition 9.17. If Γ ⊢ φ and Γ ∪ {φ} is inconsistent, then Γ is inconsis-
tent.

Proof. There are finite Γ0 and Γ1 ⊆ Γ such that LK derives Γ0 ⇒ φ and
φ, Γ1 ⇒ . Let the LK-derivation of Γ0 ⇒ φ be π0 and the LK-derivation of
φ, Γ1 ⇒ be π1 . We can then derive

π0 π1

Γ0 ⇒ φ φ, Γ1 ⇒
Cut
Γ0 , Γ1 ⇒
Since Γ0 ⊆ Γ and Γ1 ⊆ Γ , Γ0 ∪ Γ1 ⊆ Γ , hence Γ is inconsistent.

Proposition 9.18. Γ ⊢ φ iff Γ ∪ {¬φ} is inconsistent.

Proof. First suppose Γ ⊢ φ, i.e., there is a derivation π0 of Γ ⇒ φ. By adding
a ¬L rule, we obtain a derivation of ¬φ, Γ ⇒ , i.e., Γ ∪ {¬φ} is inconsistent.
If Γ ∪ {¬φ} is inconsistent, there is a derivation π1 of ¬φ, Γ ⇒ . The
following is a derivation of Γ ⇒ φ:

φ ⇒ φ                       π1
¬R
⇒ φ, ¬φ                    ¬φ, Γ ⇒
Cut
Γ ⇒ φ

Problem 9.5. Prove that Γ ⊢ ¬φ iff Γ ∪ {φ} is inconsistent.

Proposition 9.19. If Γ ⊢ φ and ¬φ ∈ Γ , then Γ is inconsistent.

Proof. Suppose Γ ⊢ φ and ¬φ ∈ Γ . Then there is a derivation π of a sequent
Γ0 ⇒ φ. The sequent ¬φ, Γ0 ⇒ is also derivable:

                            φ ⇒ φ
π                           ¬L
                            ¬φ, φ ⇒
                            XL
Γ0 ⇒ φ                     φ, ¬φ ⇒
Cut
Γ0 , ¬φ ⇒
Γ0 , ¬φ ⇒
Since ¬φ ∈ Γ and Γ0 ⊆ Γ , this shows that Γ is inconsistent.

Proposition 9.20. If Γ ∪ {φ} and Γ ∪ {¬φ} are both inconsistent, then Γ is
inconsistent.

Proof. There are finite sets Γ0 ⊆ Γ and Γ1 ⊆ Γ and LK-derivations π0 and π1
of φ, Γ0 ⇒ and ¬φ, Γ1 ⇒ , respectively. We can then derive

π0
φ, Γ0 ⇒                     π1
¬R
Γ0 ⇒ ¬φ                    ¬φ, Γ1 ⇒
Cut
Γ0 , Γ1 ⇒
Since Γ0 ⊆ Γ and Γ1 ⊆ Γ , Γ0 ∪ Γ1 ⊆ Γ . Hence Γ is inconsistent.


9.8 Derivability and the Propositional Connectives


We establish that the derivability relation ⊢ of the sequent calculus is strong
enough to establish some basic facts involving the propositional connectives,
such as that φ ∧ ψ ⊢ φ and φ, φ → ψ ⊢ ψ (modus ponens). These facts are
needed for the proof of the completeness theorem.
Proposition 9.21.

1. Both φ ∧ ψ ⊢ φ and φ ∧ ψ ⊢ ψ.

2. φ, ψ ⊢ φ ∧ ψ.

Proof. 1. Both sequents φ ∧ ψ ⇒ φ and φ ∧ ψ ⇒ ψ are derivable:

φ ⇒ φ
∧L
φ ∧ ψ ⇒ φ

ψ ⇒ ψ
∧L
φ ∧ ψ ⇒ ψ

2. Here is a derivation of the sequent φ, ψ ⇒ φ ∧ ψ:

φ ⇒ φ ψ ⇒ ψ
∧R
φ, ψ ⇒ φ ∧ ψ

Proposition 9.22.

1. φ ∨ ψ, ¬φ, ¬ψ is inconsistent.

2. Both φ ⊢ φ ∨ ψ and ψ ⊢ φ ∨ ψ.

Proof. 1. We give a derivation of the sequent φ ∨ ψ, ¬φ, ¬ψ ⇒:

φ ⇒ φ                       ψ ⇒ ψ
¬L                          ¬L
¬φ, φ ⇒                    ¬ψ, ψ ⇒

φ, ¬φ, ¬ψ ⇒               ψ, ¬φ, ¬ψ ⇒
∨L
φ ∨ ψ, ¬φ, ¬ψ ⇒

(Recall that double inference lines indicate several weakening, contrac-
tion, and exchange inferences.)

2. Both sequents φ ⇒ φ ∨ ψ and ψ ⇒ φ ∨ ψ have derivations:

φ ⇒ φ
∨R
φ ⇒ φ ∨ ψ

ψ ⇒ ψ
∨R
ψ ⇒ φ ∨ ψ

Proposition 9.23.

1. φ, φ → ψ ⊢ ψ.

2. Both ¬φ ⊢ φ → ψ and ψ ⊢ φ → ψ.

Proof. 1. The sequent φ → ψ, φ ⇒ ψ is derivable:

φ ⇒ φ ψ ⇒ ψ
→L
φ → ψ, φ ⇒ ψ

2. Both sequents ¬φ ⇒ φ → ψ and ψ ⇒ φ → ψ are derivable:

φ ⇒ φ
¬L
¬φ, φ ⇒
XL
φ, ¬φ ⇒
WR
φ, ¬φ ⇒ ψ
→R
¬φ ⇒ φ → ψ

ψ ⇒ ψ
WL
φ, ψ ⇒ ψ
→R
ψ ⇒ φ → ψ


9.9 Soundness
A derivation system, such as the sequent calculus, is sound if it cannot derive
things that do not actually hold. Soundness is thus a kind of guaranteed safety
property for derivation systems. Depending on which proof-theoretic property
is in question, we would like to know, for instance, that
1. every derivable φ is a tautology;
2. if a sentence is derivable from some others, it is also a consequence of
them;
3. if a set of sentences is inconsistent, it is unsatisfiable.
These are important properties of a derivation system. If any of them do
not hold, the derivation system is deficient—it would derive too much. Con-
sequently, establishing the soundness of a derivation system is of the utmost
importance.
Because all these proof-theoretic properties are defined via derivability in
the sequent calculus of certain sequents, proving (1)–(3) above requires proving
something about the semantic properties of derivable sequents. We will first
define what it means for a sequent to be valid, and then show that every
derivable sequent is valid. (1)–(3) then follow as corollaries from this result.
Definition 9.24. A valuation v satisfies a sequent Γ ⇒ ∆ iff either v ⊭ φ for
some φ ∈ Γ or v ⊨ φ for some φ ∈ ∆.
A sequent is valid iff every valuation v satisfies it.

Theorem 9.25 (Soundness). If LK derives Θ ⇒ Ξ, then Θ ⇒ Ξ is valid.

Proof. Let π be a derivation of Θ ⇒ Ξ. We proceed by induction on the
number of inferences n in π.
If the number of inferences is 0, then π consists only of an initial sequent.
Every initial sequent φ ⇒ φ is obviously valid, since for every v, either v ⊭ φ
or v ⊨ φ. The initial sequents ⇒ ⊤ and ⊥ ⇒ are also valid, since v ⊨ ⊤ and
v ⊭ ⊥ for every v.
If the number of inferences is greater than 0, we distinguish cases according
to the type of the lowermost inference. By induction hypothesis, we can assume
that the premises of that inference are valid, since the number of inferences in
the derivation of any premise is smaller than n.
First, we consider the possible inferences with only one premise.

1. The last inference is a weakening. Then Θ ⇒ Ξ is either φ, Γ ⇒ ∆ (if
the last inference is WL) or Γ ⇒ ∆, φ (if it’s WR), and the derivation
ends in one of

Γ ⇒ ∆                       Γ ⇒ ∆
WL                          WR
φ, Γ ⇒ ∆                   Γ ⇒ ∆, φ

By induction hypothesis, Γ ⇒ ∆ is valid, i.e., for every valuation v, either
there is some χ ∈ Γ such that v ⊭ χ or there is some χ ∈ ∆ such that
v ⊨ χ.
If v ⊭ χ for some χ ∈ Γ , then χ ∈ Θ as well since Θ = φ, Γ , and so v ⊭ χ
for some χ ∈ Θ. Similarly, if v ⊨ χ for some χ ∈ ∆, as χ ∈ Ξ, v ⊨ χ for
some χ ∈ Ξ. Consequently, Θ ⇒ Ξ is valid.

2. The last inference is ¬L: Then the premise of the last inference is Γ ⇒
∆, φ and the conclusion is ¬φ, Γ ⇒ ∆, i.e., the derivation ends in

Γ ⇒ ∆, φ
¬L
¬φ, Γ ⇒ ∆

and Θ = ¬φ, Γ while Ξ = ∆.


The induction hypothesis tells us that Γ ⇒ ∆, φ is valid, i.e., for every
v, either (a) for some χ ∈ Γ , v ⊭ χ, or (b) for some χ ∈ ∆, v ⊨ χ, or (c)
v ⊨ φ. We want to show that Θ ⇒ Ξ is also valid. Let v be a valuation.
If (a) holds, then there is χ ∈ Γ so that v ⊭ χ, but χ ∈ Θ as well. If
(b) holds, there is χ ∈ ∆ such that v ⊨ χ, but χ ∈ Ξ as well. Finally,
if v ⊨ φ, then v ⊭ ¬φ. Since ¬φ ∈ Θ, there is χ ∈ Θ such that v ⊭ χ.
Consequently, Θ ⇒ Ξ is valid.

3. The last inference is ¬R: Exercise.

4. The last inference is ∧L: There are two variants: φ ∧ ψ may be inferred
on the left from φ or from ψ on the left side of the premise. In the first
case, π ends in

φ, Γ ⇒ ∆
∧L
φ ∧ ψ, Γ ⇒ ∆

and Θ = φ ∧ ψ, Γ while Ξ = ∆. Consider a valuation v. Since by
induction hypothesis, φ, Γ ⇒ ∆ is valid, (a) v ⊭ φ, (b) v ⊭ χ for some
χ ∈ Γ , or (c) v ⊨ χ for some χ ∈ ∆. In case (a), v ⊭ φ ∧ ψ, so there
is χ ∈ Θ (namely, φ ∧ ψ) such that v ⊭ χ. In case (b), there is χ ∈ Γ
such that v ⊭ χ, and χ ∈ Θ as well. In case (c), there is χ ∈ ∆ such
that v ⊨ χ, and χ ∈ Ξ as well since Ξ = ∆. So in each case, v satisfies
φ ∧ ψ, Γ ⇒ ∆, i.e., Θ ⇒ Ξ. Since v was arbitrary, Θ ⇒ Ξ is valid. The case where
φ ∧ ψ is inferred from ψ is handled the same, changing φ to ψ.

5. The last inference is ∨R: There are two variants: φ ∨ ψ may be inferred
on the right from φ or from ψ on the right side of the premise. In the
first case, π ends in

Γ ⇒ ∆, φ
∨R
Γ ⇒ ∆, φ ∨ ψ

Now Θ = Γ and Ξ = ∆, φ ∨ ψ. Consider a valuation v. Since Γ ⇒ ∆, φ


is valid, (a) v ⊨ φ, (b) v ⊭ χ for some χ ∈ Γ , or (c) v ⊨ χ for some
χ ∈ ∆. In case (a), v ⊨ φ ∨ ψ. In case (b), there is χ ∈ Γ such that v ⊭ χ.
In case (c), there is χ ∈ ∆ such that v ⊨ χ. So in each case, v satisfies
Γ ⇒ ∆, φ ∨ ψ, i.e., Θ ⇒ Ξ. Since v was arbitrary, Θ ⇒ Ξ is valid. The
case where φ ∨ ψ is inferred from ψ is handled the same, changing φ to
ψ.

6. The last inference is →R: Then π ends in

φ, Γ ⇒ ∆, ψ
→R
Γ ⇒ ∆, φ → ψ

Again, the induction hypothesis says that the premise is valid; we want
to show that the conclusion is valid as well. Let v be arbitrary. Since
φ, Γ ⇒ ∆, ψ is valid, at least one of the following cases obtains: (a) v ⊭ φ,
(b) v ⊨ ψ, (c) v ⊭ χ for some χ ∈ Γ , or (d) v ⊨ χ for some χ ∈ ∆. In
cases (a) and (b), v ⊨ φ → ψ and so there is a χ ∈ ∆, φ → ψ such that
v ⊨ χ. In case (c), for some χ ∈ Γ , v ⊭ χ. In case (d), for some χ ∈ ∆,
v ⊨ χ. In each case, v satisfies Γ ⇒ ∆, φ → ψ. Since v was arbitrary,
Γ ⇒ ∆, φ → ψ is valid.

Now let’s consider the possible inferences with two premises.

1. The last inference is a cut: then π ends in

Γ ⇒ ∆, φ φ, Π ⇒ Λ
Cut
Γ, Π ⇒ ∆, Λ

Let v be a valuation. By induction hypothesis, the premises are valid,
so v satisfies both premises. We distinguish two cases: (a) v ⊭ φ and
(b) v ⊨ φ. In case (a), in order for v to satisfy the left premise, it must
satisfy Γ ⇒ ∆. But then it also satisfies the conclusion. In case (b), in
order for v to satisfy the right premise, it must satisfy Π ⇒ Λ. Again, v
satisfies the conclusion.

2. The last inference is ∧R. Then π ends in

Γ ⇒ ∆, φ Γ ⇒ ∆, ψ
∧R
Γ ⇒ ∆, φ ∧ ψ

Consider a valuation v. If v satisfies Γ ⇒ ∆, we are done. So suppose
it doesn’t. Since Γ ⇒ ∆, φ is valid by induction hypothesis, v ⊨ φ.
Similarly, since Γ ⇒ ∆, ψ is valid, v ⊨ ψ. But then v ⊨ φ ∧ ψ.

3. The last inference is ∨L: Exercise.

4. The last inference is →L. Then π ends in

Γ ⇒ ∆, φ ψ, Π ⇒ Λ
→L
φ → ψ, Γ, Π ⇒ ∆, Λ

Again, consider a valuation v and suppose v doesn’t satisfy Γ, Π ⇒ ∆, Λ.
We have to show that v ⊭ φ → ψ. If v doesn’t satisfy Γ, Π ⇒ ∆, Λ, it
satisfies neither Γ ⇒ ∆ nor Π ⇒ Λ. Since Γ ⇒ ∆, φ is valid, we have
v ⊨ φ. Since ψ, Π ⇒ Λ is valid, we have v ⊭ ψ. But then v ⊭ φ → ψ,
which is what we wanted to show.

Problem 9.6. Complete the proof of Theorem 9.25.

Corollary 9.26. If ⊢ φ then φ is a tautology.

Corollary 9.27. If Γ ⊢ φ then Γ ⊨ φ.

Proof. If Γ ⊢ φ then for some finite subset Γ0 ⊆ Γ , there is a derivation of
Γ0 ⇒ φ. By Theorem 9.25, every valuation v either makes some ψ ∈ Γ0 false
or makes φ true. Hence, if v ⊨ Γ then also v ⊨ φ.

Corollary 9.28. If Γ is satisfiable, then it is consistent.

Proof. We prove the contrapositive. Suppose that Γ is not consistent. Then
there is a finite Γ0 ⊆ Γ and a derivation of Γ0 ⇒ . By Theorem 9.25, Γ0 ⇒
is valid. In other words, for every valuation v, there is χ ∈ Γ0 so that v ⊭ χ,
and since Γ0 ⊆ Γ , that χ is also in Γ . Thus, no v satisfies Γ , and Γ is not
satisfiable.

Chapter 10

Natural Deduction

This chapter presents a natural deduction system in the style of
Gentzen/Prawitz.
To include or exclude material relevant to natural deduction as a proof
system, use the “prfND” tag.


10.1 Rules and Derivations


Natural deduction systems are meant to closely parallel the informal reason-
ing used in mathematical proof (hence it is somewhat “natural”). Natural
deduction proofs begin with assumptions. Inference rules are then applied.
Assumptions are “discharged” by the ¬Intro, →Intro, and ∨Elim inference
rules, and the label of the discharged assumption is placed beside the inference
for clarity.

Definition 10.1 (Assumption). An assumption is any sentence in the top-
most position of any branch.

Derivations in natural deduction are certain trees of sentences, where the
topmost sentences are assumptions, and if a sentence stands below one, two,
or three other sentences, it must follow correctly by a rule of inference. The
sentences at the top of the inference are called the premises and the sentence
below the conclusion of the inference. The rules come in pairs, an introduction
and an elimination rule for each logical operator. They introduce a logical
operator in the conclusion or remove a logical operator from a premise of the
rule. Some of the rules allow an assumption of a certain type to be discharged.
To indicate which assumption is discharged by which inference, we also assign
labels to both the assumption and the inference. This is indicated by writing
the assumption as “[φ]n .”
It is customary to consider rules for all the logical operators ∧, ∨, →, ¬,
and ⊥, even if some of those are defined.


10.2 Propositional Rules


Rules for ∧

φ ∧ ψ
∧Elim
φ

φ ∧ ψ
∧Elim
ψ

φ   ψ
∧Intro
φ ∧ ψ

Rules for ∨

φ
∨Intro
φ ∨ ψ

ψ
∨Intro
φ ∨ ψ

           [φ]n   [ψ]n
            ⋮      ⋮
φ ∨ ψ       χ      χ
n ∨Elim
χ

Rules for →

[φ]n
 ⋮
 ψ
n →Intro
φ → ψ

φ → ψ   φ
→Elim
ψ

Rules for ¬

[φ]n
 ⋮
 ⊥
n ¬Intro
¬φ

¬φ   φ
¬Elim
⊥

Rules for ⊥

⊥
⊥I
φ

[¬φ]n
 ⋮
 ⊥
n ⊥C
φ

Note that ¬Intro and ⊥C are very similar: The difference is that ¬Intro derives
a negated sentence ¬φ but ⊥C a positive sentence φ.
Whenever a rule indicates that some assumption may be discharged, we
take this to be a permission, but not a requirement. E.g., in the →Intro rule,
we may discharge any number of assumptions of the form φ in the derivation
of the premise ψ, including zero.


10.3 Derivations
We’ve said what an assumption is, and we’ve given the rules of inference.
Derivations in natural deduction are inductively generated from these: each
derivation either is an assumption on its own, or consists of one, two, or three
derivations followed by a correct inference.
Definition 10.2 (Derivation). A derivation of a sentence φ from assump-
tions Γ is a finite tree of sentences satisfying the following conditions:

1. The topmost sentences of the tree are either in Γ or are discharged by
an inference in the tree.

2. The bottommost sentence of the tree is φ.

3. Every sentence in the tree except the sentence φ at the bottom is a
premise of a correct application of an inference rule whose conclusion
stands directly below that sentence in the tree.

We then say that φ is the conclusion of the derivation and Γ its undischarged
assumptions.
If a derivation of φ from Γ exists, we say that φ is derivable from Γ , or
in symbols: Γ ⊢ φ. If there is a derivation of φ in which every assumption is
discharged, we write ⊢ φ.

Example 10.3. Every assumption on its own is a derivation. So, e.g., φ by
itself is a derivation, and so is ψ by itself. We can obtain a new derivation from
these by applying, say, the ∧Intro rule,
φ ψ
∧Intro
φ∧ψ

These rules are meant to be general: we can replace the φ and ψ in it with any
sentences, e.g., by χ and θ. Then the conclusion would be χ ∧ θ, and so
χ θ
∧Intro
χ∧θ

is a correct derivation. Of course, we can also switch the assumptions, so that
θ plays the role of φ and χ that of ψ. Thus,
θ χ
∧Intro
θ∧χ

is also a correct derivation.


We can now apply another rule, say, →Intro, which allows us to conclude
a conditional and allows us to discharge any assumption that is identical to
the antecedent of that conditional. So both of the following would be correct
derivations:
[χ]1   θ
∧Intro
χ ∧ θ
1 →Intro
χ → (χ ∧ θ)

χ   [θ]1
∧Intro
χ ∧ θ
1 →Intro
θ → (χ ∧ θ)

They show, respectively, that θ ⊢ χ → (χ ∧ θ) and χ ⊢ θ → (χ ∧ θ).


Remember that discharging of assumptions is a permission, not a require-
ment: we don’t have to discharge the assumptions. In particular, we can apply
a rule even if the assumptions are not present in the derivation. For instance,
the following is legal, even though there is no assumption φ to be discharged:

ψ
1 →Intro
φ→ψ


10.4 Examples of Derivations


Example 10.4. Let’s give a derivation of the sentence (φ ∧ ψ) → φ.
We begin by writing the desired conclusion at the bottom of the derivation.

(φ ∧ ψ) → φ

Next, we need to figure out what kind of inference could result in a sentence
of this form. The main operator of the conclusion is →, so we’ll try to arrive at
the conclusion using the →Intro rule. It is best to write down the assumptions
involved and label the inference rules as you progress, so it is easy to see whether
all assumptions have been discharged at the end of the proof.

[φ ∧ ψ]1
    ⋮
    φ
----------- 1 →Intro
(φ ∧ ψ) → φ

We now need to fill in the steps from the assumption φ ∧ ψ to φ. Since we
only have one connective to deal with, ∧, we must use the ∧Elim rule. This
gives us the following proof:

 [φ ∧ ψ]1
 --------- ∧Elim
     φ
----------- 1 →Intro
(φ ∧ ψ) → φ

We now have a correct derivation of (φ ∧ ψ) → φ.

Example 10.5. Now let’s give a derivation of (¬φ ∨ ψ) → (φ → ψ).


We begin by writing the desired conclusion at the bottom of the derivation.

(¬φ ∨ ψ) → (φ → ψ)

To find a logical rule that could give us this conclusion, we look at the logical
connectives in the conclusion: ¬, ∨, and →. We only care at the moment about
the first occurrence of → because it is the main operator of the sentence we
want to derive, while ¬, ∨ and the second occurrence of → are inside the scope
of another connective, so we will take care of those later. We therefore start
with the →Intro rule. A correct application must look like this:


[¬φ ∨ ψ]1
    ⋮
  φ → ψ
------------------- 1 →Intro
(¬φ ∨ ψ) → (φ → ψ)

This leaves us with two possibilities to continue. Either we can keep working
from the bottom up and look for another application of the →Intro rule, or we
can work from the top down and apply a ∨Elim rule. Let us apply the latter.
We will use the assumption ¬φ ∨ ψ as the leftmost premise of ∨Elim. For a
valid application of ∨Elim, the other two premises must be identical to the
conclusion φ → ψ, but each may be derived in turn from another assumption,
namely one of the two disjuncts of ¬φ ∨ ψ. So our derivation will look like this:

            [¬φ]2     [ψ]2
              ⋮         ⋮
[¬φ ∨ ψ]1   φ → ψ     φ → ψ
---------------------------- 2 ∨Elim
          φ → ψ
------------------- 1 →Intro
(¬φ ∨ ψ) → (φ → ψ)

In each of the two branches on the right, we want to derive φ → ψ, which


is best done using →Intro.

        [¬φ]2, [φ]3          [ψ]2, [φ]4
             ⋮                    ⋮
             ψ                    ψ
         -------- 3 →Intro    -------- 4 →Intro
[¬φ ∨ ψ]1  φ → ψ               φ → ψ
--------------------------------------- 2 ∨Elim
              φ → ψ
------------------- 1 →Intro
(¬φ ∨ ψ) → (φ → ψ)

For the two missing parts of the derivation, we need derivations of ψ from
¬φ and φ in the middle branch, and from φ and ψ in the rightmost branch.
Let’s take the former first. ¬φ and φ are the two premises of ¬Elim:

[¬φ]2    [φ]3
-------------- ¬Elim
      ⊥

By using ⊥I , we can obtain ψ as a conclusion and complete the branch.


           [¬φ]2   [φ]3
           ------------- ¬Elim
                ⊥                      [ψ]2, [φ]4
               ---- ⊥I                      ⋮
                ψ                           ψ
            -------- 3 →Intro           -------- 4 →Intro
[¬φ ∨ ψ]1    φ → ψ                       φ → ψ
----------------------------------------------- 2 ∨Elim
                   φ → ψ
          ------------------- 1 →Intro
          (¬φ ∨ ψ) → (φ → ψ)

Let’s now look at the rightmost branch. Here it’s important to realize
that the definition of derivation allows assumptions to be discharged but does
not require them to be. In other words, if we can derive ψ from one of the
assumptions φ and ψ without using the other, that’s ok. And to derive ψ
from ψ is trivial: ψ by itself is such a derivation, and no inferences are needed.
So we can simply delete the assumption φ.

           [¬φ]2   [φ]3
           ------------- ¬Elim
                ⊥
               ---- ⊥I
                ψ                        [ψ]2
            -------- 3 →Intro          -------- →Intro
[¬φ ∨ ψ]1    φ → ψ                      φ → ψ
----------------------------------------------- 2 ∨Elim
                   φ → ψ
          ------------------- 1 →Intro
          (¬φ ∨ ψ) → (φ → ψ)

Note that in the finished derivation, the rightmost →Intro inference does not
actually discharge any assumptions.

Example 10.6. So far we have not needed the ⊥C rule. It is special in that it
allows us to discharge an assumption that isn’t a sub-formula of the conclusion
of the rule. It is closely related to the ⊥I rule. In fact, the ⊥I rule is a special
case of the ⊥C rule—there is a logic called “intuitionistic logic” in which only
⊥I is allowed. The ⊥C rule is a last resort when nothing else works. For
instance, suppose we want to derive φ ∨ ¬φ. Our usual strategy would be to
attempt to derive φ ∨ ¬φ using ∨Intro. But this would require us to derive
either φ or ¬φ from no assumptions, and this can’t be done. ⊥C to the rescue!

[¬(φ ∨ ¬φ)]1
      ⋮
      ⊥
  ---------- 1 ⊥C
   φ ∨ ¬φ

Now we’re looking for a derivation of ⊥ from ¬(φ ∨ ¬φ). Since ⊥ is the
conclusion of ¬Elim we might try that:


[¬(φ ∨ ¬φ)]1    [¬(φ ∨ ¬φ)]1
     ⋮               ⋮
    ¬φ               φ
----------------------------- ¬Elim
             ⊥
         ---------- 1 ⊥C
          φ ∨ ¬φ

Our strategy for finding a derivation of ¬φ calls for an application of ¬Intro:


[¬(φ ∨ ¬φ)]1 , [φ]2
[¬(φ ∨ ¬φ)]1


2
¬φ ¬Intro φ
¬Elim
1
⊥ ⊥C
φ ∨ ¬φ

Here, we can get ⊥ easily by applying ¬Elim to the assumption ¬(φ ∨ ¬φ) and
φ ∨ ¬φ which follows from our new assumption φ by ∨Intro:
                  [φ]2
                -------- ∨Intro
[¬(φ ∨ ¬φ)]1     φ ∨ ¬φ
------------------------- ¬Elim
           ⊥
         ----- 2 ¬Intro      [¬(φ ∨ ¬φ)]1
          ¬φ                      ⋮
                                  φ
-------------------------------------- ¬Elim
                 ⊥
             ---------- 1 ⊥C
              φ ∨ ¬φ

On the right side we use the same strategy, except we get φ by ⊥C :


                  [φ]2                                     [¬φ]3
                -------- ∨Intro                          -------- ∨Intro
[¬(φ ∨ ¬φ)]1     φ ∨ ¬φ                [¬(φ ∨ ¬φ)]1       φ ∨ ¬φ
------------------------- ¬Elim        ------------------------- ¬Elim
           ⊥                                       ⊥
         ----- 2 ¬Intro                          ----- 3 ⊥C
          ¬φ                                       φ
--------------------------------------------------------- ¬Elim
                            ⊥
                        ---------- 1 ⊥C
                         φ ∨ ¬φ

Problem 10.1. Give derivations that show the following:

1. φ ∧ (ψ ∧ χ) ⊢ (φ ∧ ψ) ∧ χ.

2. φ ∨ (ψ ∨ χ) ⊢ (φ ∨ ψ) ∨ χ.

3. φ → (ψ → χ) ⊢ ψ → (φ → χ).

4. φ ⊢ ¬¬φ.

Problem 10.2. Give derivations that show the following:

1. (φ ∨ ψ) → χ ⊢ φ → χ.


2. (φ → χ) ∧ (ψ → χ) ⊢ (φ ∨ ψ) → χ.

3. ⊢ ¬(φ ∧ ¬φ).

4. ψ → φ ⊢ ¬φ → ¬ψ.

5. ⊢ (φ → ¬φ) → ¬φ.

6. ⊢ ¬(φ → ψ) → ¬ψ.

7. φ → χ ⊢ ¬(φ ∧ ¬χ).

8. φ ∧ ¬χ ⊢ ¬(φ → χ).

9. φ ∨ ψ, ¬ψ ⊢ φ.

10. ¬φ ∨ ¬ψ ⊢ ¬(φ ∧ ψ).

11. ⊢ (¬φ ∧ ¬ψ) → ¬(φ ∨ ψ).

12. ⊢ ¬(φ ∨ ψ) → (¬φ ∧ ¬ψ).

Problem 10.3. Give derivations that show the following:

1. ¬(φ → ψ) ⊢ φ.

2. ¬(φ ∧ ψ) ⊢ ¬φ ∨ ¬ψ.

3. φ → ψ ⊢ ¬φ ∨ ψ.

4. ⊢ ¬¬φ → φ.

5. φ → ψ, ¬φ → ψ ⊢ ψ.

6. (φ ∧ ψ) → χ ⊢ (φ → χ) ∨ (ψ → χ).

7. (φ → ψ) → φ ⊢ φ.

8. ⊢ (φ → ψ) ∨ (ψ → χ).

(These all require the ⊥C rule.)



10.5 Proof-Theoretic Notions



This section collects the definitions of the provability relation and consis-
tency for natural deduction.

Just as we’ve defined a number of important semantic notions (validity, en-
tailment, satisfiability), we now define corresponding proof-theoretic notions.
These are not defined by appeal to satisfaction of sentences in structures, but
by appeal to the derivability or non-derivability of certain sentences from oth-
ers. It was an important discovery that these notions coincide. That they do
is the content of the soundness and completeness theorems.
Definition 10.7 (Theorems). A sentence φ is a theorem if there is a deriva-
tion of φ in natural deduction in which all assumptions are discharged. We
write ⊢ φ if φ is a theorem and ⊬ φ if it is not.

Definition 10.8 (Derivability). A sentence φ is derivable from a set of sen-


tences Γ , Γ ⊢ φ, if there is a derivation with conclusion φ and in which every
assumption is either discharged or is in Γ . If φ is not derivable from Γ we
write Γ ⊬ φ.

Definition 10.9 (Consistency). A set of sentences Γ is inconsistent iff Γ ⊢


⊥. If Γ is not inconsistent, i.e., if Γ ⊬ ⊥, we say it is consistent.

Proposition 10.10 (Reflexivity). If φ ∈ Γ , then Γ ⊢ φ.

Proof. The assumption φ by itself is a derivation of φ where every undischarged
assumption (i.e., φ) is in Γ .

Proposition 10.11 (Monotonicity). If Γ ⊆ ∆ and Γ ⊢ φ, then ∆ ⊢ φ.

Proof. Any derivation of φ from Γ is also a derivation of φ from ∆.

Proposition 10.12 (Transitivity). If Γ ⊢ φ and {φ} ∪ ∆ ⊢ ψ, then Γ ∪ ∆ ⊢ ψ.

Proof. If Γ ⊢ φ, there is a derivation δ0 of φ with all undischarged assumptions


in Γ . If {φ} ∪ ∆ ⊢ ψ, then there is a derivation δ1 of ψ with all undischarged
assumptions in {φ} ∪ ∆. Now consider:
∆, [φ]1
   ⋮ δ1
   ψ                      Γ
------ 1 →Intro           ⋮ δ0
φ → ψ                     φ
----------------------------- →Elim
              ψ


The undischarged assumptions are now all among Γ ∪ ∆, so this shows Γ ∪ ∆ ⊢


ψ.

When Γ = {φ1 , φ2 , . . . , φk } is a finite set we may use the simplified notation


φ1 , φ2 , . . . , φk ⊢ ψ for Γ ⊢ ψ, in particular φ ⊢ ψ means that {φ} ⊢ ψ.
Note that if Γ ⊢ φ and φ ⊢ ψ, then Γ ⊢ ψ. It follows also that if φ1 , . . . , φn ⊢
ψ and Γ ⊢ φi for each i, then Γ ⊢ ψ.

Proposition 10.13. The following are equivalent:

1. Γ is inconsistent.

2. Γ ⊢ φ for every sentence φ.

3. Γ ⊢ φ and Γ ⊢ ¬φ for some sentence φ.

Proof. Exercise.

Problem 10.4. Prove Proposition 10.13.

Proposition 10.14 (Compactness).

1. If Γ ⊢ φ then there is a finite subset Γ0 ⊆ Γ such that Γ0 ⊢ φ.

2. If every finite subset of Γ is consistent, then Γ is consistent.

Proof. 1. If Γ ⊢ φ, then there is a derivation δ of φ from Γ . Let Γ0 be the


set of undischarged assumptions of δ. Since any derivation is finite, Γ0
can only contain finitely many sentences. So, δ is a derivation of φ from
a finite Γ0 ⊆ Γ .

2. This is the contrapositive of (1) for the special case φ ≡ ⊥.


10.6 Derivability and Consistency


We will now establish a number of properties of the derivability relation. They
are independently interesting, but each will play a role in the proof of the
completeness theorem.

Proposition 10.15. If Γ ⊢ φ and Γ ∪ {φ} is inconsistent, then Γ is inconsistent.

Proof. Let the derivation of φ from Γ be δ1 and the derivation of ⊥ from


Γ ∪ {φ} be δ2 . We can then derive:


Γ, [φ]1
   ⋮ δ2                  Γ
   ⊥                     ⋮ δ1
------ 1 ¬Intro
  ¬φ                     φ
---------------------------- ¬Elim
              ⊥

In the new derivation, the assumption φ is discharged, so it is a derivation
from Γ .

Proposition 10.16. Γ ⊢ φ iff Γ ∪ {¬φ} is inconsistent.

Proof. First suppose Γ ⊢ φ, i.e., there is a derivation δ0 of φ from undischarged


assumptions Γ . We obtain a derivation of ⊥ from Γ ∪ {¬φ} as follows:
          Γ
          ⋮ δ0
¬φ        φ
------------- ¬Elim
      ⊥

Now assume Γ ∪{¬φ} is inconsistent, and let δ1 be the corresponding deriva-
tion of ⊥ from undischarged assumptions in Γ ∪ {¬φ}. We obtain a derivation
of φ from Γ alone by using ⊥C :
Γ, [¬φ]1
   ⋮ δ1
   ⊥
--------- 1 ⊥C
    φ

Problem 10.5. Prove that Γ ⊢ ¬φ iff Γ ∪ {φ} is inconsistent.

Proposition 10.17. If Γ ⊢ φ and ¬φ ∈ Γ , then Γ is inconsistent.

Proof. Suppose Γ ⊢ φ and ¬φ ∈ Γ . Then there is a derivation δ of φ from Γ .


Consider this simple application of the ¬Elim rule:
          Γ
          ⋮ δ
¬φ        φ
------------- ¬Elim
      ⊥

Since ¬φ ∈ Γ , all undischarged assumptions are in Γ ; this shows that Γ ⊢ ⊥.

Proposition 10.18. If Γ ∪ {φ} and Γ ∪ {¬φ} are both inconsistent, then Γ
is inconsistent.


Proof. There are derivations δ1 and δ2 of ⊥ from Γ ∪{φ} and ⊥ from Γ ∪{¬φ},
respectively. We can then derive

Γ, [¬φ]2               Γ, [φ]1
   ⋮ δ2                   ⋮ δ1
   ⊥                      ⊥
------ 2 ¬Intro        ------ 1 ¬Intro
 ¬¬φ                     ¬φ
------------------------------- ¬Elim
               ⊥

Since the assumptions φ and ¬φ are discharged, this is a derivation of ⊥ from Γ


alone. Hence Γ is inconsistent.


10.7 Derivability and the Propositional Connectives


We establish that the derivability relation ⊢ of natural deduction is strong
enough to establish some basic facts involving the propositional connectives,
such as that φ ∧ ψ ⊢ φ and φ, φ → ψ ⊢ ψ (modus ponens). These facts are
needed for the proof of the completeness theorem.

Proposition 10.19.

1. Both φ ∧ ψ ⊢ φ and φ ∧ ψ ⊢ ψ.

2. φ, ψ ⊢ φ ∧ ψ.

Proof. 1. We can derive both

   φ ∧ ψ              φ ∧ ψ
   ------ ∧Elim       ------ ∧Elim
     φ                  ψ

2. We can derive:

   φ    ψ
   ------- ∧Intro
    φ ∧ ψ

Proposition 10.20.

1. φ ∨ ψ, ¬φ, ¬ψ is inconsistent.

2. Both φ ⊢ φ ∨ ψ and ψ ⊢ φ ∨ ψ.

Proof. 1. Consider the following derivation:


         ¬φ    [φ]1             ¬ψ    [ψ]1
         ----------- ¬Elim      ----------- ¬Elim
φ ∨ ψ         ⊥                      ⊥
------------------------------------------- 1 ∨Elim
                      ⊥

This is a derivation of ⊥ from undischarged assumptions φ ∨ ψ, ¬φ, and


¬ψ.

2. We can derive both

  φ                  ψ
------ ∨Intro      ------ ∨Intro
φ ∨ ψ              φ ∨ ψ

Proposition 10.21.

1. φ, φ → ψ ⊢ ψ.

2. Both ¬φ ⊢ φ → ψ and ψ ⊢ φ → ψ.

Proof. 1. We can derive:

φ → ψ    φ
----------- →Elim
     ψ

2. This is shown by the following two derivations:

¬φ    [φ]1
----------- ¬Elim
     ⊥
    ---- ⊥I
     ψ                       ψ
------ 1 →Intro           ------ →Intro
φ → ψ                     φ → ψ

Note that →Intro may, but does not have to, discharge the assumption φ.


10.8 Soundness
A derivation system, such as natural deduction, is sound if it cannot derive
things that do not actually follow. Soundness is thus a kind of guaranteed
safety property for derivation systems. Depending on which proof theoretic
property is in question, we would like to know, for instance, that

1. every derivable sentence is a tautology;

2. if a sentence is derivable from some others, it is also a consequence of


them;


3. if a set of sentences is inconsistent, it is unsatisfiable.

These are important properties of a derivation system. If any of them do


not hold, the derivation system is deficient—it would derive too much. Con-
sequently, establishing the soundness of a derivation system is of the utmost
importance.
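
Since soundness relates derivability to valuations, it helps to have the
semantic side in computational form. The following Python sketch (our own
illustration; the nested-tuple representation of formulas is an assumption of
the sketch) evaluates a sentence under a valuation and tests tautology-hood by
checking all valuations, which is what Corollary 10.23 at the end of this
section guarantees for derivable sentences.

    from itertools import product

    # Formulas as nested tuples: ("atom", name), ("bot",), ("not", A),
    # ("and", A, B), ("or", A, B), ("imp", A, B).
    def value(phi, v):
        """Truth value of phi under valuation v (a dict: atom name -> bool)."""
        op = phi[0]
        if op == "atom": return v[phi[1]]
        if op == "bot":  return False
        if op == "not":  return not value(phi[1], v)
        if op == "and":  return value(phi[1], v) and value(phi[2], v)
        if op == "or":   return value(phi[1], v) or value(phi[2], v)
        if op == "imp":  return (not value(phi[1], v)) or value(phi[2], v)
        raise ValueError(op)

    def atoms(phi):
        if phi[0] == "atom":
            return {phi[1]}
        return set().union(set(), *(atoms(a) for a in phi[1:]))

    def tautology(phi):
        """True iff phi is true under every valuation of its atoms."""
        names = sorted(atoms(phi))
        return all(value(phi, dict(zip(names, bits)))
                   for bits in product([False, True], repeat=len(names)))

    # (φ ∧ ψ) → φ, derived in Example 10.4, is indeed a tautology:
    p, q = ("atom", "p"), ("atom", "q")
    assert tautology(("imp", ("and", p, q), p))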

Theorem 10.22 (Soundness). If φ is derivable from the undischarged as-
sumptions Γ , then Γ ⊨ φ.

Proof. Let δ be a derivation of φ. We proceed by induction on the number of


inferences in δ.
For the induction basis we show the claim if the number of inferences is 0.
In this case, δ consists only of a single sentence φ, i.e., an assumption. That
assumption is undischarged, since assumptions can only be discharged by in-
ferences, and there are no inferences. So, any valuation v that satisfies all of
the undischarged assumptions of the proof also satisfies φ.
Now for the inductive step. Suppose that δ contains n inferences. The
premise(s) of the lowermost inference are derived using sub-derivations, each
of which contains fewer than n inferences. We assume the induction hypoth-
esis: The premises of the lowermost inference follow from the undischarged
assumptions of the sub-derivations ending in those premises. We have to show
that the conclusion φ follows from the undischarged assumptions of the entire
proof.
We distinguish cases according to the type of the lowermost inference. First,
we consider the possible inferences with only one premise.

1. Suppose that the last inference is ¬Intro: The derivation has the form

Γ, [φ]n
   ⋮ δ1
   ⊥
------ n ¬Intro
  ¬φ

By inductive hypothesis, ⊥ follows from the undischarged assumptions


Γ ∪ {φ} of δ1 . Consider a valuation v. We need to show that, if v ⊨ Γ ,
then v ⊨ ¬φ. Suppose for reductio that v ⊨ Γ , but v ⊭ ¬φ, i.e., v ⊨ φ.
This would mean that v ⊨ Γ ∪ {φ}. This is contrary to our inductive
hypothesis. So, v ⊨ ¬φ.

2. The last inference is ∧Elim: There are two variants: φ or ψ may be


inferred from the premise φ ∧ ψ. Consider the first case. The derivation δ
looks like this:


    Γ
    ⋮ δ1
  φ ∧ ψ
  ------- ∧Elim
    φ

By inductive hypothesis, φ ∧ ψ follows from the undischarged assumptions Γ
of δ1 . Consider a valuation v. We need to show that, if v ⊨ Γ , then
v ⊨ φ. Suppose v ⊨ Γ . By our inductive hypothesis (Γ ⊨ φ ∧ ψ), we
know that v ⊨ φ ∧ ψ. By definition, v ⊨ φ ∧ ψ iff v ⊨ φ and v ⊨ ψ, so in
particular v ⊨ φ. (The case where ψ is inferred from φ ∧ ψ is handled similarly.)

3. The last inference is ∨Intro: There are two variants: φ ∨ ψ may be


inferred from the premise φ or the premise ψ. Consider the first case.
The derivation has the form

    Γ
    ⋮ δ1
    φ
  ------ ∨Intro
  φ ∨ ψ

By inductive hypothesis, φ follows from the undischarged assumptions Γ


of δ1 . Consider a valuation v. We need to show that, if v ⊨ Γ , then
v ⊨ φ ∨ ψ. Suppose v ⊨ Γ ; then v ⊨ φ since Γ ⊨ φ (the inductive
hypothesis). So it must also be the case that v ⊨ φ ∨ ψ. (The case where
φ ∨ ψ is inferred from ψ is handled similarly.)

4. The last inference is →Intro: φ → ψ is inferred from a subproof with


assumption φ and conclusion ψ, i.e.,

Γ, [φ]n
   ⋮ δ1
   ψ
------ n →Intro
φ → ψ

By inductive hypothesis, ψ follows from the undischarged assumptions


of δ1 , i.e., Γ ∪ {φ} ⊨ ψ. Consider a valuation v. The undischarged
assumptions of δ are just Γ , since φ is discharged at the last inference.
So we need to show that Γ ⊨ φ → ψ. For reductio, suppose that for
some valuation v, v ⊨ Γ but v ⊭ φ → ψ. So, v ⊨ φ and v ⊭ ψ. But
by hypothesis, ψ is a consequence of Γ ∪ {φ}, i.e., v ⊨ ψ, which is a
contradiction. So, Γ ⊨ φ → ψ.

5. The last inference is ⊥I : Here, δ ends in


    Γ
    ⋮ δ1
    ⊥
   ---- ⊥I
    φ

By induction hypothesis, Γ ⊨ ⊥. We have to show that Γ ⊨ φ. Suppose


not; then for some v we have v ⊨ Γ and v ⊭ φ. But we always have v ⊭ ⊥,
so this would mean that Γ ⊭ ⊥, contrary to the induction hypothesis.
6. The last inference is ⊥C : Exercise.

Now let’s consider the possible inferences with several premises: ∨Elim,
∧Intro, and →Elim.
1. The last inference is ∧Intro. φ ∧ ψ is inferred from the premises φ and ψ
and δ has the form

Γ1          Γ2
 ⋮ δ1        ⋮ δ2
 φ           ψ
---------------- ∧Intro
     φ ∧ ψ

By induction hypothesis, φ follows from the undischarged assumptions Γ1


of δ1 and ψ follows from the undischarged assumptions Γ2 of δ2 . The
undischarged assumptions of δ are Γ1 ∪ Γ2 , so we have to show that
Γ1 ∪ Γ2 ⊨ φ ∧ ψ. Consider a valuation v with v ⊨ Γ1 ∪ Γ2 . Since v ⊨ Γ1 ,
it must be the case that v ⊨ φ as Γ1 ⊨ φ, and since v ⊨ Γ2 , v ⊨ ψ since
Γ2 ⊨ ψ. Together, v ⊨ φ ∧ ψ.
2. The last inference is ∨Elim: Exercise.
3. The last inference is →Elim. ψ is inferred from the premises φ→ψ and φ.
The derivation δ looks like this:
Γ1           Γ2
 ⋮ δ1         ⋮ δ2
φ → ψ         φ
----------------- →Elim
        ψ

By induction hypothesis, φ → ψ follows from the undischarged assump-


tions Γ1 of δ1 and φ follows from the undischarged assumptions Γ2 of δ2 .
Consider a valuation v. We need to show that, if v ⊨ Γ1 ∪ Γ2 , then v ⊨ ψ.
Suppose v ⊨ Γ1 ∪ Γ2 . Since Γ1 ⊨ φ → ψ, v ⊨ φ → ψ. Since Γ2 ⊨ φ, we
have v ⊨ φ. This means that v ⊨ ψ (for if v ⊭ ψ, since v ⊨ φ, we’d have
v ⊭ φ → ψ, contradicting v ⊨ φ → ψ).



4. The last inference is ¬Elim: Exercise.

Problem 10.6. Complete the proof of Theorem 10.22.

Corollary 10.23. If ⊢ φ, then φ is a tautology.

Corollary 10.24. If Γ is satisfiable, then it is consistent.

Proof. We prove the contrapositive. Suppose that Γ is not consistent. Then


Γ ⊢ ⊥, i.e., there is a derivation of ⊥ from undischarged assumptions in Γ . By
Theorem 10.22, any valuation v that satisfies Γ must satisfy ⊥. Since v ⊭ ⊥
for every valuation v, no v can satisfy Γ , i.e., Γ is not satisfiable.

Chapter 11

Tableaux

This chapter presents a signed analytic tableaux system.


To include or exclude material relevant to tableaux as a proof
system, use the “prfTab” tag.


11.1 Rules and Tableaux


A tableau is a systematic survey of the possible ways a sentence can be true
or false in a structure. The building blocks of a tableau are signed formulas:
sentences plus a truth value “sign,” either T or F. These signed formulas are
arranged in a (downward growing) tree.

Definition 11.1. A signed formula is a pair consisting of a truth value and


a sentence, i.e., either:
T φ or F φ.


Intuitively, we might read T φ as “φ might be true” and F φ as “φ might


be false” (in some structure).
Each signed formula in the tree is either an assumption (which are listed
at the very top of the tree), or it is obtained from a signed formula above it
by one of a number of rules of inference. There are two rules for each possible
main operator of the preceding formula, one for the case where the sign is T,
and one for the case where the sign is F. Some rules allow the tree to branch,
and some only add signed formulas to the branch. A rule may be (and often
must be) applied not to the immediately preceding signed formula, but to any
signed formula in the branch from the root to the place the rule is applied.
A branch is closed when it contains both T φ and F φ. A closed tableau
is one where every branch is closed. Under the intuitive interpretation, any
branch describes a joint possibility, but T φ and F φ are not jointly possible.
In other words, if a branch is closed, the possibility it describes has been ruled
out. In particular, that means that a closed tableau rules out all possibilities
of simultaneously making every assumption of the form T φ true and every
assumption of the form F φ false.
A closed tableau for φ is a closed tableau with root F φ. If such a closed
tableau exists, all possibilities for φ being false have been ruled out; i.e., φ
must be true in every structure.


11.2 Propositional Rules

Rules for ¬

T ¬φ                F ¬φ
----- ¬T            ----- ¬F
 F φ                 T φ

Rules for ∧

T φ ∧ ψ               F φ ∧ ψ
-------- ∧T           ---------- ∧F
  T φ                 F φ | F ψ
  T ψ

Rules for ∨


T φ ∨ ψ               F φ ∨ ψ
---------- ∨T         -------- ∨F
T φ | T ψ               F φ
                        F ψ

Rules for →

T φ → ψ               F φ → ψ
---------- →T         -------- →F
F φ | T ψ               T φ
                        F ψ

The Cut Rule

---------- Cut
T φ | F φ

The Cut rule is not applied “to” a previous signed formula; rather, it allows
every branch in a tableau to be split in two, one branch containing T φ, the
other F φ. It is not necessary—any set of signed formulas with a closed tableau
has one not using Cut—but it allows us to combine tableaux in a convenient
way.
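
The rules above have a uniform shape that is easy to implement. The following
Python sketch (ours, for illustration; formulas are nested tuples such as
("and", A, B), and all names are assumptions of the sketch) returns, for a
given signed formula, the branches its rule produces: a one-element list means
the branch is merely extended, a two-element list means it splits.

    def expand(sf):
        """Branches produced by the rule for sf = (sign, phi): a list of
        lists of new signed formulas."""
        sign, phi = sf
        op = phi[0]
        if op == "not":
            # ¬T and ¬F: flip the sign on the immediate subformula
            return [[("F" if sign == "T" else "T", phi[1])]]
        if op in ("and", "or", "imp"):
            A, B = phi[1], phi[2]
        if op == "and":   # ∧T extends the branch; ∧F splits it
            return [[("T", A), ("T", B)]] if sign == "T" else [[("F", A)], [("F", B)]]
        if op == "or":    # ∨T splits; ∨F extends
            return [[("T", A)], [("T", B)]] if sign == "T" else [[("F", A), ("F", B)]]
        if op == "imp":   # →T splits; →F extends
            return [[("F", A)], [("T", B)]] if sign == "T" else [[("T", A), ("F", B)]]
        return []         # atomic signed formulas: no rule applies

The Cut rule does not fit this pattern, since it is not applied to a previous
signed formula; in a sketch like this it would simply split any branch into one
extended with ("T", phi) and one extended with ("F", phi).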


11.3 Tableaux
We’ve said what an assumption is, and we’ve given the rules of inference.
Tableaux are inductively generated from these: each tableau either is a single
branch consisting of one or more assumptions, or it results from a tableau by
applying one of the rules of inference on a branch.

Definition 11.2 (Tableau). A tableau for assumptions S1φ1 , . . . , Snφn (where


each Si is either T or F) is a finite tree of signed formulas satisfying the following
conditions:

1. The n topmost signed formulas of the tree are Siφi , one below the other.

2. Every signed formula in the tree that is not one of the assumptions results
from a correct application of an inference rule to a signed formula in the
branch above it.


A branch of a tableau is closed iff it contains both T φ and F φ, and open


otherwise. A tableau in which every branch is closed is a closed tableau (for its
set of assumptions). If a tableau is not closed, i.e., if it contains at least one
open branch, it is open.

Example 11.3. Every set of assumptions on its own is a tableau, but it will
generally not be closed. (Obviously, it is closed only if the assumptions already
contain a pair of signed formulas T φ and F φ.)
From a tableau (open or closed) we can obtain a new, larger one by applying
one of the rules of inference to a signed formula φ in it. The rule will append
one or more signed formulas to the end of any branch containing the occurrence
of φ to which we apply the rule.
For instance, consider the assumption T φ ∧ ¬φ. Here is the (open) tableau
consisting of just that assumption:

1. T φ ∧ ¬φ Assumption

We obtain a new tableau from it by applying the ∧T rule to the assumption.


That rule allows us to add two new lines to the tableau, T φ and T ¬φ:

1. T φ ∧ ¬φ Assumption
2. Tφ ∧T 1
3. T ¬φ ∧T 1

When we write down tableaux, we record the rules we’ve applied on the right
(e.g., ∧T1 means that the signed formula on that line is the result of applying
the ∧T rule to the signed formula on line 1). This new tableau now contains
additional signed formulas, but to only one (T ¬φ) can we apply a rule (in this
case, the ¬T rule). This results in the closed tableau

1. T φ ∧ ¬φ Assumption
2. Tφ ∧T 1
3. T ¬φ ∧T 1
4. Fφ ¬T 3
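
The construction just illustrated can be automated. The following recursive
sketch in Python (again our own illustration) uses the expand function
sketched in section 11.2: a branch is kept as a set of signed formulas, and we
test whether applying the rules must eventually close every branch.

    def closes(branch):
        """True iff there is a closed tableau for the signed formulas in
        branch."""
        branch = set(branch)
        # a branch is closed when it contains both T phi and F phi
        if any(("T", phi) in branch and ("F", phi) in branch
               for (_, phi) in branch):
            return True
        for sf in branch:
            new = expand(sf)
            if not new:
                continue              # atomic: no rule applies to sf
            rest = branch - {sf}      # sf's rule need only be applied once
            return all(closes(rest | set(extra)) for extra in new)
        return False                  # saturated and open

    # Example 11.4 below: there is a closed tableau for F (φ ∧ ψ) → φ.
    p, q = ("atom", "p"), ("atom", "q")
    assert closes({("F", ("imp", ("and", p, q), p))})

Since each rule application replaces a signed formula by strictly shorter
ones, the recursion terminates; and the order in which signed formulas are
expanded affects only the shape of the tableau, not whether it closes (compare
Example 11.6).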


11.4 Examples of Tableaux


Example 11.4. Let’s find a closed tableau for the sentence (φ ∧ ψ) → φ.
We begin by writing the corresponding assumption at the top of the tableau.

1. F (φ ∧ ψ) → φ Assumption


There is only one assumption, so only one signed formula to which we can
apply a rule. (For every signed formula, there is always at most one rule that
can be applied: it’s the rule for the corresponding sign and main operator of
the sentence.) In this case, this means we must apply →F.

1. F (φ ∧ ψ) → φ ✓ Assumption
2. Tφ ∧ ψ →F 1
3. Fφ →F 1

To keep track of which signed formulas we have applied their corresponding


rules to, we write a checkmark next to the sentence. However, only write a
checkmark if the rule has been applied to all open branches. Once a signed
formula has had the corresponding rule applied in every open branch, we will
not have to return to it and apply the rule again. In this case, there is only
one branch, so the rule only has to be applied once. (Note that checkmarks
are only a convenience for constructing tableaux and are not officially part of
the syntax of tableaux.)
There is one new signed formula to which we can apply a rule: the T φ ∧ ψ
on line 2. Applying the ∧T rule results in:

1. F (φ ∧ ψ) → φ ✓ Assumption
2. Tφ ∧ ψ ✓ →F 1
3. Fφ →F 1
4. Tφ ∧T 2
5. Tψ ∧T 2

Since the branch now contains both T φ (on line 4) and F φ (on line 3), the
branch is closed. Since it is the only branch, the tableau is closed. We have
found a closed tableau for (φ ∧ ψ) → φ.

Example 11.5. Now let’s find a closed tableau for (¬φ ∨ ψ) → (φ → ψ).
We begin with the corresponding assumption:

1. F (¬φ ∨ ψ) → (φ → ψ) Assumption

The one signed formula in this tableau has main operator → and sign F, so we
apply the →F rule to it to obtain:

1. F (¬φ ∨ ψ) → (φ → ψ) ✓ Assumption
2. T ¬φ ∨ ψ →F 1
3. F (φ → ψ) →F 1

We now have a choice as to whether to apply ∨T to line 2 or →F to line 3.


It actually doesn’t matter which order we pick, as long as each signed formula
has its corresponding rule applied in every branch. So let’s pick the first one.
The ∨T rule allows the tableau to branch, and the two conclusions of the rule


will be the new signed formulas added to the two new branches. This results
in:

1. F (¬φ ∨ ψ) → (φ → ψ) ✓ Assumption
2. T ¬φ ∨ ψ ✓ →F 1
3. F (φ → ψ) →F 1

4. T ¬φ Tψ ∨T 2

We have not applied the →F rule to line 3 yet: let’s do that now. To save
time, we apply it to both branches. Recall that we write a checkmark next to
a signed formula only if we have applied the corresponding rule in every open
branch. So it’s a good idea to apply a rule at the end of every branch that
contains the signed formula the rule applies to. That way we won’t have to
return to that signed formula lower down in the various branches.

1. F (¬φ ∨ ψ) → (φ → ψ) ✓ Assumption
2. T ¬φ ∨ ψ ✓ →F 1
3. F (φ → ψ) ✓ →F 1

4. T ¬φ Tψ ∨T 2
5. Tφ Tφ →F 3
6. Fψ Fψ →F 3

The right branch is now closed. On the left branch, we can still apply the ¬T
rule to line 4. This results in F φ and closes the left branch:

1. F (¬φ ∨ ψ) → (φ → ψ) ✓ Assumption
2. T ¬φ ∨ ψ ✓ →F 1
3. F (φ → ψ) ✓ →F 1

4. T ¬φ Tψ ∨T 2
5. Tφ Tφ →F 3
6. Fψ Fψ →F 3
7. Fφ ⊗ ¬T 4

Example 11.6. We can give tableaux for any number of signed formulas as
assumptions. Often it is also necessary to apply more than one rule that allows
branching; and in general a tableau can have any number of branches. For
instance, consider a tableau for {T φ ∨ (ψ ∧ χ), F (φ ∨ ψ) ∧ (φ ∨ χ)}. We start
by applying the ∨T to the first assumption:


1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F (φ ∨ ψ) ∧ (φ ∨ χ) Assumption

3. Tφ Tψ ∧ χ ∨T 1

Now we can apply the ∧F rule to line 2. We do this on both branches simul-
taneously, and can therefore check off line 2:

1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F (φ ∨ ψ) ∧ (φ ∨ χ) ✓ Assumption

3. Tφ Tψ ∧ χ ∨T 1

4. Fφ∨ψ Fφ∨χ Fφ∨ψ Fφ∨χ ∧F 2

Now we can apply ∨F to all the branches containing φ ∨ ψ:

1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F (φ ∨ ψ) ∧ (φ ∨ χ) ✓ Assumption

3. Tφ Tψ ∧ χ ∨T 1

4. Fφ∨ψ ✓ Fφ∨χ Fφ∨ψ ✓ Fφ∨χ ∧F 2


5. Fφ Fφ ∨F 4
6. Fψ Fψ ∨F 4

The leftmost branch is now closed. Let’s now apply ∨F to φ ∨ χ:

1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F (φ ∨ ψ) ∧ (φ ∨ χ) ✓ Assumption

3. Tφ Tψ ∧ χ ∨T 1

4. Fφ∨ψ ✓ Fφ∨χ ✓ Fφ∨ψ ✓ Fφ∨χ ✓ ∧F 2


5. Fφ Fφ ∨F 4
6. Fψ Fψ ∨F 4
7. ⊗ Fφ Fφ ∨F 4
8. Fχ Fχ ∨F 4

Note that we moved the result of applying ∨F a second time below for clarity.
In this instance it would not have been needed, since the justifications would
have been the same.


Two branches remain open, and T ψ ∧ χ on line 3 remains unchecked. We


apply ∧T to it to obtain a closed tableau:

1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F (φ ∨ ψ) ∧ (φ ∨ χ) ✓ Assumption

3. Tφ Tψ ∧ χ ✓ ∨T 1

4. Fφ∨ψ ✓ Fφ∨χ ✓ Fφ∨ψ ✓ Fφ∨χ ✓ ∧F 2


5. Fφ Fφ Fφ Fφ ∨F 4
6. Fψ Fχ Fψ Fχ ∨F 4
7. ⊗ ⊗ Tψ Tψ ∧T 3
8. Tχ Tχ ∧T 3
⊗ ⊗

For comparison, here’s a closed tableau for the same set of assumptions in
which the rules are applied in a different order:

1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F (φ ∨ ψ) ∧ (φ ∨ χ) ✓ Assumption

3. Fφ∨ψ ✓ Fφ∨χ ✓ ∧F 2
4. Fφ Fφ ∨F 3
5. Fψ Fχ ∨F 3

6. Tφ Tψ ∧ χ ✓ Tφ Tψ ∧ χ ✓ ∨T 1
7. ⊗ Tψ ⊗ Tψ ∧T 6
8. Tχ Tχ ∧T 6
⊗ ⊗

Problem 11.1. Give closed tableaux of the following:


1. T φ ∧ (ψ ∧ χ), F (φ ∧ ψ) ∧ χ.
2. T φ ∨ (ψ ∨ χ), F (φ ∨ ψ) ∨ χ.
3. T φ → (ψ → χ), F ψ → (φ → χ).
4. T φ, F ¬¬φ.

Problem 11.2. Give closed tableaux of the following:


1. T (φ ∨ ψ) → χ, F φ → χ.
2. T (φ → χ) ∧ (ψ → χ), F (φ ∨ ψ) → χ.
3. F ¬(φ ∧ ¬φ).


4. T ψ → φ, F ¬φ → ¬ψ.
5. F (φ → ¬φ) → ¬φ.
6. F ¬(φ → ψ) → ¬ψ.
7. T φ → χ, F ¬(φ ∧ ¬χ).
8. T φ ∧ ¬χ, F ¬(φ → χ).
9. T φ ∨ ψ, ¬ψ, F φ.
10. T ¬φ ∨ ¬ψ, F ¬(φ ∧ ψ).
11. F (¬φ ∧ ¬ψ) → ¬(φ ∨ ψ).
12. F ¬(φ ∨ ψ) → (¬φ ∧ ¬ψ).

Problem 11.3. Give closed tableaux of the following:


1. T ¬(φ → ψ), F φ.
2. T ¬(φ ∧ ψ), F ¬φ ∨ ¬ψ.
3. T φ → ψ, F ¬φ ∨ ψ.
4. F ¬¬φ → φ.
5. T φ → ψ, T ¬φ → ψ, F ψ.
6. T (φ ∧ ψ) → χ, F (φ → χ) ∨ (ψ → χ).
7. T (φ → ψ) → φ, F φ.
8. F (φ → ψ) ∨ (ψ → χ).


11.5 Proof-Theoretic Notions



This section collects the definitions of the provability relation and con-
sistency for tableaux.

Just as we’ve defined a number of important semantic notions (validity, en-
tailment, satisfiability), we now define corresponding proof-theoretic notions.
These are not defined by appeal to satisfaction of sentences in structures, but
by appeal to the existence of certain closed tableaux. It was an important dis-
covery that these notions coincide. That they do is the content of the soundness
and completeness theorems.


Definition 11.7 (Theorems). A sentence φ is a theorem if there is a closed


tableau for F φ. We write ⊢ φ if φ is a theorem and ⊬ φ if it is not.

Definition 11.8 (Derivability). A sentence φ is derivable from a set of sen-


tences Γ , Γ ⊢ φ iff there is a finite set {ψ1 , . . . , ψn } ⊆ Γ and a closed tableau
for the set
{F φ, T ψ1 , . . . , T ψn }.
If φ is not derivable from Γ we write Γ ⊬ φ.

Definition 11.9 (Consistency). A set of sentences Γ is inconsistent iff there


is a finite set {ψ1 , . . . , ψn } ⊆ Γ and a closed tableau for the set

{T ψ1 , . . . , T ψn }.

If Γ is not inconsistent, we say it is consistent.
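
For finite Γ these definitions translate directly into the closes sketch from
section 11.3 (again our own illustration, not part of the text):

    def derivable(gamma, phi):
        """Γ ⊢ φ for finite Γ: a closed tableau for {F φ} ∪ {T ψ : ψ ∈ Γ}."""
        return closes({("F", phi)} | {("T", psi) for psi in gamma})

    def inconsistent(gamma):
        """Finite Γ is inconsistent iff {T ψ : ψ ∈ Γ} has a closed tableau."""
        return closes({("T", psi) for psi in gamma})

For infinite Γ the definitions quantify over finite subsets; compare
Proposition 11.14 below.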

Proposition 11.10 (Reflexivity). If φ ∈ Γ , then Γ ⊢ φ.


Proof. If φ ∈ Γ , {φ} is a finite subset of Γ and the tableau

1. Fφ Assumption
2. Tφ Assumption

is closed.

Proposition 11.11 (Monotonicity). If Γ ⊆ ∆ and Γ ⊢ φ, then ∆ ⊢ φ.


Proof. Any finite subset of Γ is also a finite subset of ∆.

Proposition 11.12 (Transitivity). If Γ ⊢ φ and {φ} ∪ ∆ ⊢ ψ, then Γ ∪ ∆ ⊢ ψ.

Proof. If {φ} ∪ ∆ ⊢ ψ, then there is a finite subset ∆0 = {χ1 , . . . , χn } ⊆ ∆


such that

{F ψ,T φ, T χ1 , . . . , T χn }

has a closed tableau. If Γ ⊢ φ then there are θ1 , . . . , θm such that

{F φ,T θ1 , . . . , T θm }

has a closed tableau.


Now consider the tableau with assumptions

F ψ, T χ1 , . . . , T χn , T θ1 , . . . , T θm .


Apply the Cut rule on φ. This generates two branches, one has T φ in it, the
other F φ. Thus, on the one branch, all of

{F ψ, T φ, T χ1 , . . . , T χn }

are available. Since there is a closed tableau for these assumptions, we can
attach it to that branch; every branch through T φ closes. On the other branch,
all of
{F φ, T θ1 , . . . , T θm }

are available, so we can also complete the other side to obtain a closed tableau.
This shows Γ ∪ ∆ ⊢ ψ.

Note that this means that in particular if Γ ⊢ φ and φ ⊢ ψ, then Γ ⊢ ψ. It


follows also that if φ1 , . . . , φn ⊢ ψ and Γ ⊢ φi for each i, then Γ ⊢ ψ.

Proposition 11.13. Γ is inconsistent iff Γ ⊢ φ for every sentence φ.

Proof. Exercise.

Problem 11.4. Prove Proposition 11.13.

Proposition 11.14 (Compactness).

1. If Γ ⊢ φ then there is a finite subset Γ0 ⊆ Γ such that Γ0 ⊢ φ.

2. If every finite subset of Γ is consistent, then Γ is consistent.

Proof. 1. If Γ ⊢ φ, then there is a finite subset Γ0 = {ψ1 , . . . , ψn } and a


closed tableau for
{F φ, T ψ1 , . . . , T ψn }

This tableau also shows Γ0 ⊢ φ.

2. If Γ is inconsistent, then for some finite subset Γ0 = {ψ1 , . . . , ψn } there


is a closed tableau for
{T ψ1 , . . . , T ψn }

This closed tableau shows that Γ0 is inconsistent.


11.6 Derivability and Consistency


We will now establish a number of properties of the derivability relation. They
are independently interesting, but each will play a role in the proof of the
completeness theorem.
Proposition 11.15. If Γ ⊢ φ and Γ ∪ {φ} is inconsistent, then Γ is inconsistent.
Proof. There are finite Γ0 = {ψ1 , . . . , ψn } and Γ1 = {χ1 , . . . , χm } ⊆ Γ such
that
{F φ, T ψ1 , . . . , T ψn }
{T φ, T χ1 , . . . , T χm }
have closed tableaux. Using the Cut rule on φ we can combine these into a
single closed tableau that shows Γ0 ∪ Γ1 is inconsistent. Since Γ0 ⊆ Γ and
Γ1 ⊆ Γ , Γ0 ∪ Γ1 ⊆ Γ , hence Γ is inconsistent.
Proposition 11.16. Γ ⊢ φ iff Γ ∪ {¬φ} is inconsistent.
Proof. First suppose Γ ⊢ φ, i.e., there is a closed tableau for
{F φ, T ψ1 , . . . , T ψn }
Using the ¬T rule, this can be turned into a closed tableau for
{T ¬φ, T ψ1 , . . . , T ψn }.
On the other hand, if there is a closed tableau for the latter, we can turn it
into a closed tableau of the former by removing every formula that results from
¬T applied to the first assumption T ¬φ as well as that assumption, and adding
the assumption F φ. For if a branch was closed before because it contained the
conclusion of ¬T applied to T ¬φ, i.e., F φ, the corresponding branch in the
new tableau is also closed. If a branch in the old tableau was closed because
it contained the assumption T ¬φ as well as F ¬φ we can turn it into a closed
branch by applying ¬F to F ¬φ to obtain T φ. This closes the branch since we
added F φ as an assumption.
Problem 11.5. Prove that Γ ⊢ ¬φ iff Γ ∪ {φ} is inconsistent.
Proposition 11.17. If Γ ⊢ φ and ¬φ ∈ Γ , then Γ is inconsistent.
Proof. Suppose Γ ⊢ φ and ¬φ ∈ Γ . Then there are ψ1 , . . . , ψn ∈ Γ such that
{F φ, T ψ1 , . . . , T ψn }
has a closed tableau. Replace the assumption F φ by T ¬φ, and insert the
conclusion of ¬T applied to F φ after the assumptions. Any sentence in the
tableau justified by appeal to line 1 in the old tableau is now justified by appeal
to line n + 1. So if the old tableau was closed, the new one is. It shows that Γ
is inconsistent, since all assumptions are in Γ .


Proposition 11.18. If Γ ∪ {φ} and Γ ∪ {¬φ} are both inconsistent, then Γ
is inconsistent.

Proof. If there are ψ1 , . . . , ψn ∈ Γ and χ1 , . . . , χm ∈ Γ such that

{T φ,T ψ1 , . . . , T ψn } and
{T ¬φ,T χ1 , . . . , T χm }

both have closed tableaux, we can construct a single, combined tableau that
shows that Γ is inconsistent by using as assumptions T ψ1 , . . . , T ψn together
with T χ1 , . . . , T χm , followed by an application of the Cut rule. This yields
two branches, one starting with T φ, the other with F φ.
On the left side, add the part of the first tableau below its assumptions.
Here, every rule application is still correct, since each of the assumptions of the
first tableau, including T φ, is available. Thus, every branch below T φ closes.
On the right side, add the part of the second tableau below its assumption,
with the results of any applications of ¬T to T ¬φ removed. The conclusion of
¬T to T ¬φ is F φ, which is nevertheless available, as it is the conclusion of the
Cut rule on the right side of the combined tableau.
If a branch in the second tableau was closed because it contained the as-
sumption T ¬φ (which no longer appears as an assumption in the combined
tableau) as well as F ¬φ, we can apply ¬F to F ¬φ to obtain T φ. Now
the corresponding branch in the combined tableau also closes, because it con-
tains the right-hand conclusion of the Cut rule, F φ. If a branch in the second
tableau closed for any other reason, the corresponding branch in the combined
tableau also closes, since any signed formulas other than T ¬φ occurring on the
branch in the old, second tableau also occur on the corresponding branch in
the combined tableau.


11.7 Derivability and the Propositional Connectives


We establish that the derivability relation ⊢ of tableaux is strong enough to
establish some basic facts involving the propositional connectives, such as that
φ ∧ ψ ⊢ φ and φ, φ → ψ ⊢ ψ (modus ponens). These facts are needed for the
proof of the completeness theorem.

Proposition 11.19.

1. Both φ ∧ ψ ⊢ φ and φ ∧ ψ ⊢ ψ.

2. φ, ψ ⊢ φ ∧ ψ.

Proof. 1. Both {F φ, T φ ∧ ψ} and {F ψ, T φ ∧ ψ} have closed tableaux:


1. Fφ Assumption
2. Tφ ∧ ψ Assumption
3. Tφ ∧T 2
4. Tψ ∧T 2

1. Fψ Assumption
2. Tφ ∧ ψ Assumption
3. Tφ ∧T 2
4. Tψ ∧T 2

2. Here is a closed tableau for {T φ, T ψ, F φ ∧ ψ}:

1. Fφ∧ψ Assumption
2. Tφ Assumption
3. Tψ Assumption

4. Fφ Fψ ∧F 1
⊗ ⊗

Proposition 11.20.

1. {φ ∨ ψ, ¬φ, ¬ψ} is inconsistent.

2. Both φ ⊢ φ ∨ ψ and ψ ⊢ φ ∨ ψ.

Proof. 1. We give a closed tableau of {T φ ∨ ψ, T ¬φ, T ¬ψ}:

1. Tφ ∨ ψ Assumption
2. T ¬φ Assumption
3. T ¬ψ Assumption
4. Fφ ¬T 2
5. Fψ ¬T 3

6. Tφ Tψ ∨T 1
⊗ ⊗

2. Both {F φ ∨ ψ, T φ} and {F φ ∨ ψ, T ψ} have closed tableaux:


1. Fφ∨ψ Assumption
2. Tφ Assumption
3. Fφ ∨F 1
4. Fψ ∨F 1

1. Fφ∨ψ Assumption
2. Tψ Assumption
3. Fφ ∨F 1
4. Fψ ∨F 1

Proposition 11.21.

1. φ, φ → ψ ⊢ ψ.

2. Both ¬φ ⊢ φ → ψ and ψ ⊢ φ → ψ.

Proof. 1. {F ψ, T φ → ψ, T φ} has a closed tableau:

1. Fψ Assumption
2. Tφ → ψ Assumption
3. Tφ Assumption

4. Fφ Tψ →T 2
⊗ ⊗

2. Both {F φ → ψ, T ¬φ} and {F φ → ψ, T ψ} have closed tableaux:

1. Fφ→ψ Assumption
2. T ¬φ Assumption
3. Tφ →F 1
4. Fψ →F 1
5. Fφ ¬T 2

1. Fφ→ψ Assumption
2. Tψ Assumption
3. Tφ →F 1
4. Fψ →F 1


11.8 Soundness
A derivation system, such as tableaux, is sound if it cannot derive things that
do not actually hold. Soundness is thus a kind of guaranteed safety property
for derivation systems. Depending on which proof theoretic property is in
question, we would like to know, for instance, that

1. every derivable φ is a tautology;

2. if a sentence is derivable from some others, it is also a consequence of


them;

3. if a set of sentences is inconsistent, it is unsatisfiable.

These are important properties of a derivation system. If any of them do


not hold, the derivation system is deficient—it would derive too much. Con-
sequently, establishing the soundness of a derivation system is of the utmost
importance.
Because all these proof-theoretic properties are defined via closed tableaux
of some kind or other, proving (1)–(3) above requires proving something about
the semantic properties of closed tableaux. We will first define what it means
for a signed formula to be satisfied in a structure, and then show that if a
tableau is closed, no structure satisfies all its assumptions. (1)–(3) then follow
as corollaries from this result.

Definition 11.22. A valuation v satisfies a signed formula T φ iff v ⊨ φ, and


it satisfies F φ iff v ⊭ φ. v satisfies a set of signed formulas Γ iff it satisfies every
S φ ∈ Γ . Γ is satisfiable if there is a valuation that satisfies it, and unsatisfiable
otherwise.
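
In terms of the value function sketched in section 10.8, this definition comes
out as follows (our illustration):

    def satisfies_signed(v, sf):
        """v satisfies T phi iff phi is true under v, and F phi iff false."""
        sign, phi = sf
        return value(phi, v) if sign == "T" else not value(phi, v)

    def satisfies(v, gamma):
        """v satisfies a set of signed formulas iff it satisfies each one."""
        return all(satisfies_signed(v, sf) for sf in gamma)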

Theorem 11.23 (Soundness). If Γ has a closed tableau, Γ is unsatisfiable.

Proof. Let’s call a branch of a tableau satisfiable iff the set of signed formulas
on it is satisfiable, and let’s call a tableau satisfiable if it contains at least one
satisfiable branch.
We show the following: Extending a satisfiable tableau by one of the rules
of inference always results in a satisfiable tableau. This will prove the theo-
rem: any closed tableau results by applying rules of inference to the tableau
consisting only of assumptions from Γ . So if Γ were satisfiable, any tableau
for it would be satisfiable. A closed tableau, however, is clearly not satisfiable:
every branch contains both T φ and F φ, and no structure can both satisfy and
not satisfy φ.
Suppose we have a satisfiable tableau, i.e., a tableau with at least one
satisfiable branch. Applying a rule of inference either adds signed formulas
to a branch, or splits a branch in two. If the tableau has a satisfiable branch
which is not extended by the rule application in question, it remains a satisfiable
branch in the extended tableau, so the extended tableau is satisfiable. So we
only have to consider the case where a rule is applied to a satisfiable branch.


Let Γ be the set of signed formulas on that branch, and let S φ ∈ Γ be the
signed formula to which the rule is applied. If the rule does not result in a split
branch, we have to show that the extended branch, i.e., Γ together with the
conclusions of the rule, is still satisfiable. If the rule results in a split branch,
we have to show that at least one of the two resulting branches is satisfiable.
First, we consider the possible inferences that do not result in a split branch.

1. The branch is expanded by applying ¬T to T ¬ψ ∈ Γ . Then the extended


branch contains the signed formulas Γ ∪ {F ψ}. Suppose v ⊨ Γ . In
particular, v ⊨ ¬ψ. Thus, v ⊭ ψ, i.e., v satisfies F ψ.

2. The branch is expanded by applying ¬F to F ¬ψ ∈ Γ : Exercise.

3. The branch is expanded by applying ∧T to T ψ ∧ χ ∈ Γ , which results in


two new signed formulas on the branch: T ψ and T χ. Suppose v ⊨ Γ , in
particular v ⊨ ψ ∧ χ. Then v ⊨ ψ and v ⊨ χ. This means that v satisfies
both T ψ and T χ.

4. The branch is expanded by applying ∨F to F ψ ∨ χ ∈ Γ : Exercise.

5. The branch is expanded by applying →F to F ψ → χ ∈ Γ : This results in


two new signed formulas on the branch: T ψ and F χ. Suppose v ⊨ Γ , in
particular v ⊭ ψ → χ. Then v ⊨ ψ and v ⊭ χ. This means that v satisfies
both T ψ and F χ.

Now let’s consider the possible inferences that result in a split branch.

1. The branch is expanded by applying ∧F to F ψ ∧ χ ∈ Γ , which results in


two branches, a left one continuing through F ψ and a right one through
F χ. Suppose v ⊨ Γ , in particular v ⊭ ψ ∧ χ. Then v ⊭ ψ or v ⊭ χ. In
the former case, v satisfies F ψ, i.e., v satisfies the formulas on the left
branch. In the latter, v satisfies F χ, i.e., v satisfies the formulas on the
right branch.

2. The branch is expanded by applying ∨T to T ψ ∨ χ ∈ Γ : Exercise.

3. The branch is expanded by applying →T to T ψ → χ ∈ Γ : Exercise.

4. The branch is expanded by Cut: This results in two branches, one con-
taining T ψ, the other containing F ψ. Since v ⊨ Γ and either v ⊨ ψ or
v ⊭ ψ, v satisfies either the left or the right branch.

Problem 11.6. Complete the proof of Theorem 11.23.

Corollary 11.24. If ⊢ φ then φ is a tautology.

Corollary 11.25. If Γ ⊢ φ then Γ ⊨ φ.



Proof. If Γ ⊢ φ then for some ψ1 , . . . , ψn ∈ Γ , {F φ, T ψ1 , . . . , T ψn } has a
closed tableau. By Theorem 11.23, every valuation v either makes some ψi
false or makes φ true. Hence, if v ⊨ Γ then also v ⊨ φ.

Corollary 11.26. If Γ is satisfiable, then it is consistent.


Proof. We prove the contrapositive. Suppose that Γ is not consistent. Then


there are ψ1 , . . . , ψn ∈ Γ and a closed tableau for {T ψ1 , . . . , T ψn }. By The-
orem 11.23, there is no v such that v ⊨ ψi for all i = 1, . . . , n. But then Γ is
not satisfiable.

Chapter 12

Axiomatic Derivations

No effort has been made yet to ensure that the material in this chap-
ter respects various tags indicating which connectives and quantifiers are
primitive or defined: all are assumed to be primitive, except ↔ which is
assumed to be defined. If the FOL tag is true, we produce a version with
quantifiers, otherwise without.


12.1 Rules and Derivations


Axiomatic derivations are perhaps the simplest derivation system for logic.
A derivation is just a sequence of formulas. To count as a derivation, every
formula in the sequence must either be an instance of an axiom, or must follow
from one or more formulas that precede it in the sequence by a rule of inference.
A derivation derives its last formula.

Definition 12.1 (Derivability). If Γ is a set of formulas of L then a deriva-


tion from Γ is a finite sequence φ1 , . . . , φn of formulas where for each i ≤ n
one of the following holds:


1. φi ∈ Γ ; or

2. φi is an axiom; or

3. φi follows from some φj (and φk ) with j < i (and k < i) by a rule of


inference.

What counts as a correct derivation depends on which inference rules we


allow (and of course what we take to be axioms). And an inference rule is
an if-then statement that tells us that, under certain conditions, a step φi in
a derivation is a correct inference step.

Definition 12.2 (Rule of inference). A rule of inference gives a sufficient


condition for what counts as a correct inference step in a derivation from Γ .

For instance, since any one-element sequence φ with φ ∈ Γ trivially counts


as a derivation, the following might be a very simple rule of inference:

If φ ∈ Γ , then φ is always a correct inference step in any derivation


from Γ .

Similarly, if φ is one of the axioms, then φ by itself is a derivation, and so this


is also a rule of inference:

If φ is an axiom, then φ is a correct inference step.

It gets more interesting if the rule of inference appeals to formulas that appear
before the step considered. The following rule is called modus ponens:

If ψ →φ and ψ occur higher up in the derivation, then φ is a correct


inference step.

If this is the only rule of inference, then our definition of derivation above
amounts to this: φ1 , . . . , φn is a derivation iff for each i ≤ n one of the
following holds:

1. φi ∈ Γ ; or

2. φi is an axiom; or

3. for some j < i, φj is ψ → φi , and for some k < i, φk is ψ.

The last clause says that φi follows from φk (ψ) and φj (ψ → φi ) by modus
ponens. If we can go from 1 to n, and each time we find a formula φi that is
either in Γ , an axiom, or which a rule of inference tells us that it is a correct
inference step, then the entire sequence counts as a correct derivation.
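
This definition, with modus ponens as the only rule, translates into a
straightforward checker. The sketch below (ours; is_axiom is a hypothetical
predicate for whichever axioms one adopts, e.g., those of section 12.2, and
formulas are nested tuples with ("imp", A, B) for A → B) verifies that a
sequence of formulas is a derivation from Γ .

    def is_derivation(seq, gamma, is_axiom):
        """Each line must be in gamma, an axiom, or follow from two earlier
        lines by modus ponens."""
        for i, phi in enumerate(seq):
            if phi in gamma or is_axiom(phi):
                continue
            earlier = seq[:i]
            # modus ponens: some earlier chi is psi -> phi, with psi earlier too
            if any(chi[0] == "imp" and chi[2] == phi and chi[1] in earlier
                   for chi in earlier):
                continue
            return False
        return bool(seq)   # the derivation derives its last formula, seq[-1]

For instance, the five-line derivation of θ → θ in Example 12.8 passes this
check as soon as is_axiom recognizes instances of eq. (12.7) and eq. (12.8).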

Definition 12.3 (Derivability). A formula φ is derivable from Γ , written


Γ ⊢ φ, if there is a derivation from Γ ending in φ.


Definition 12.4 (Theorems). A formula φ is a theorem if there is a deriva-


tion of φ from the empty set. We write ⊢ φ if φ is a theorem and ⊬ φ if it is
not.


12.2 Axiom and Rules for the Propositional Connectives


Definition 12.5 (Axioms). The set Ax0 of axioms for the propositional
connectives comprises all formulas of the following forms:

(φ ∧ ψ) → φ                                      (12.1)
(φ ∧ ψ) → ψ                                      (12.2)
φ → (ψ → (φ ∧ ψ))                                (12.3)
φ → (φ ∨ ψ)                                      (12.4)
φ → (ψ ∨ φ)                                      (12.5)
(φ → χ) → ((ψ → χ) → ((φ ∨ ψ) → χ))              (12.6)
φ → (ψ → φ)                                      (12.7)
(φ → (ψ → χ)) → ((φ → ψ) → (φ → χ))              (12.8)
(φ → ψ) → ((φ → ¬ψ) → ¬φ)                        (12.9)
¬φ → (φ → ψ)                                     (12.10)
⊤                                                (12.11)
⊥ → φ                                            (12.12)
(φ → ⊥) → ¬φ                                     (12.13)
¬¬φ → φ                                          (12.14)

Definition 12.6 (Modus ponens). If ψ and ψ→φ already occur in a deriva-


tion, then φ is a correct inference step.

We’ll abbreviate the rule modus ponens as “mp.”


12.3 Examples of Derivations


Example 12.7. Suppose we want to prove (¬θ ∨ α) → (θ → α). Clearly, this is
not an instance of any of our axioms, so we have to use the mp rule to derive
it. Our only rule is mp, which given φ and φ → ψ allows us to justify ψ. One


strategy would be to use eq. (12.6) with φ being ¬θ, ψ being α, and χ being
θ → α, i.e., the instance
(¬θ → (θ → α)) → ((α → (θ → α)) → ((¬θ ∨ α) → (θ → α))).
Why? Two applications of mp yield the last part, which is what we want. And
we easily see that ¬θ → (θ → α) is an instance of eq. (12.10), and α → (θ → α)
is an instance of eq. (12.7). So our derivation is:
1. ¬θ → (θ → α) eq. (12.10)
2. (¬θ → (θ → α)) →
((α → (θ → α)) → ((¬θ ∨ α) → (θ → α))) eq. (12.6)
3. (α → (θ → α)) → ((¬θ ∨ α) → (θ → α)) 1, 2, mp
4. α → (θ → α) eq. (12.7)
5. (¬θ ∨ α) → (θ → α) 3, 4, mp

Example 12.8. Let’s try to find a derivation of θ → θ. It is not an instance
of an axiom, so we have to use mp to derive it. eq. (12.7) is an axiom of the
form φ → ψ to which we could apply mp. To be useful, of course, the ψ which
mp would justify as a correct step in this case would have to be θ → θ, since
this is what we want to derive. That means φ would also have to be θ, i.e., we
might look at this instance of eq. (12.7):
θ → (θ → θ)
In order to apply mp, we would also need to justify the corresponding second
premise, namely φ. But in our case, that would be θ, and we won’t be able to
derive θ by itself. So we need a different strategy.
The other axiom involving just → is eq. (12.8), i.e.,
(φ → (ψ → χ)) → ((φ → ψ) → (φ → χ))
We could get to the last nested conditional by applying mp twice. Again, that
would mean that we want an instance of eq. (12.8) where φ → χ is θ → θ, the
formula we are aiming for. Then of course, φ and χ are both θ. How should
we pick ψ so that both φ → (ψ → χ) and φ → ψ, i.e., in our case θ → (ψ → θ)
and θ → ψ, are also derivable? Well, the first of these is already an instance of
eq. (12.7), whatever we decide ψ to be. And θ → ψ would be another instance
of eq. (12.7) if ψ were (θ → θ). So, our derivation is:
1. θ → ((θ → θ) → θ) eq. (12.7)
2. (θ → ((θ → θ) → θ)) →
((θ → (θ → θ)) → (θ → θ)) eq. (12.8)
3. (θ → (θ → θ)) → (θ → θ) 1, 2, mp
4. θ → (θ → θ) eq. (12.7)
5. θ→θ 3, 4, mp

Example 12.9. Sometimes we want to show that there is a derivation of
some formula from some other formulas Γ . For instance, let’s show that we
can derive φ → χ from Γ = {φ → ψ, ψ → χ}.


1. φ→ψ Hyp
2. ψ→χ Hyp
3. (ψ → χ) → (φ → (ψ → χ)) eq. (12.7)
4. φ → (ψ → χ) 2, 3, mp
5. (φ → (ψ → χ)) →
((φ → ψ) → (φ → χ)) eq. (12.8)
6. ((φ → ψ) → (φ → χ)) 4, 5, mp
7. φ→χ 1, 6, mp

The lines labelled “Hyp” (for “hypothesis”) indicate that the formula on that
line is an element of Γ .

Proposition 12.10. If Γ ⊢ φ → ψ and Γ ⊢ ψ → χ, then Γ ⊢ φ → χ.

Proof. Suppose Γ ⊢ φ → ψ and Γ ⊢ ψ → χ. Then there is a derivation of φ → ψ


from Γ ; and a derivation of ψ → χ from Γ as well. Combine these into a single
derivation by concatenating them. Now add lines 3–7 of the derivation in the
preceding example. This is a derivation of φ → χ—which is the last line of the
new derivation—from Γ . Note that the justifications of lines 4 and 7 remain
valid if the reference to line number 2 is replaced by reference to the last line
of the derivation of ψ → χ, and reference to line number 1 by reference to the
last line of the derivation of φ → ψ.

Problem 12.1. Show that the following hold by exhibiting derivations from
the axioms:

1. (φ ∧ ψ) → (ψ ∧ φ)

2. ((φ ∧ ψ) → χ) → (φ → (ψ → χ))

3. ¬(φ ∨ ψ) → ¬φ


12.4 Proof-Theoretic Notions


Just as we’ve defined a number of important semantic notions (tautology, en-
tailment, satisfiability), we now define corresponding proof-theoretic notions.
These are not defined by appeal to satisfaction of sentences in structures, but
by appeal to the derivability or non-derivability of certain formulas. It was an
important discovery that these notions coincide. That they do is the content
of the soundness and completeness theorems.

Definition 12.11 (Derivability). A formula φ is derivable from Γ , written


Γ ⊢ φ, if there is a derivation from Γ ending in φ.


Definition 12.12 (Theorems). A formula φ is a theorem if there is a deriva-


tion of φ from the empty set. We write ⊢ φ if φ is a theorem and ⊬ φ if it is
not.
Definition 12.13 (Consistency). A set Γ of formulas is consistent if and
only if Γ ⊬ ⊥; it is inconsistent otherwise.
Proposition 12.14 (Reflexivity). If φ ∈ Γ , then Γ ⊢ φ.
Proof. The formula φ by itself is a derivation of φ from Γ .
Proposition 12.15 (Monotonicity). If Γ ⊆ ∆ and Γ ⊢ φ, then ∆ ⊢ φ.
Proof. Any derivation of φ from Γ is also a derivation of φ from ∆.
Proposition 12.16 (Transitivity). If Γ ⊢ φ and {φ} ∪ ∆ ⊢ ψ, then Γ ∪ ∆ ⊢ ψ.
Proof. Suppose {φ} ∪ ∆ ⊢ ψ. Then there is a derivation ψ1 , . . . , ψl = ψ
from {φ} ∪ ∆. Some of the steps in that derivation will be correct because of
a rule which refers to a prior line ψi = φ. By hypothesis, there is a derivation
of φ from Γ , i.e., a derivation φ1 , . . . , φk = φ where every φi is an axiom,
an element of Γ , or correct by a rule of inference. Now consider the sequence
φ1 , . . . , φk = φ, ψ1 , . . . , ψl = ψ.
This is a correct derivation of ψ from Γ ∪ ∆ since every ψi = φ is now justified
by the same rule which justifies φk = φ.
Note that this means that in particular if Γ ⊢ φ and φ ⊢ ψ, then Γ ⊢ ψ. It
follows also that if φ1 , . . . , φn ⊢ ψ and Γ ⊢ φi for each i, then Γ ⊢ ψ.
Proposition 12.17. Γ is inconsistent iff Γ ⊢ φ for every φ.
Proof. Exercise.
Problem 12.2. Prove Proposition 12.17.
Proposition 12.18 (Compactness).
1. If Γ ⊢ φ then there is a finite subset Γ0 ⊆ Γ such that Γ0 ⊢ φ.
2. If every finite subset of Γ is consistent, then Γ is consistent.
Proof. 1. If Γ ⊢ φ, then there is a finite sequence of formulas φ1 , . . . , φn
so that φ ≡ φn and each φi is either a logical axiom, an element of Γ or
follows from previous formulas by modus ponens. Take Γ0 to be those
φi which are in Γ . Then the derivation is likewise a derivation from Γ0 ,
and so Γ0 ⊢ φ.
2. This is the contrapositive of (1) for the special case φ ≡ ⊥.
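The notion of derivation at work in this proof is simple enough to check mechanically. Below is a minimal Python sketch; the tuple encoding of formulas (a propositional variable is a string, an implication is ("->", φ, ψ)) and the restriction to the axiom schemas eq. (12.7) and eq. (12.8) are our own illustrative choices, not part of the text.

def imp(f):
    # return (antecedent, consequent) if f is an implication, else None
    if isinstance(f, tuple) and len(f) == 3 and f[0] == "->":
        return f[1], f[2]
    return None

def is_axiom(f):
    p = imp(f)
    if p:
        # instance of eq. (12.7), phi -> (psi -> phi)?
        q = imp(p[1])
        if q and q[1] == p[0]:
            return True
        # instance of eq. (12.8), (phi -> (psi -> chi)) -> ((phi -> psi) -> (phi -> chi))?
        l, r = imp(p[0]), imp(p[1])
        if l and r:
            inner, r1, r2 = imp(l[1]), imp(r[0]), imp(r[1])
            if inner and r1 and r2:
                phi, (psi, chi) = l[0], inner
                if r1 == (phi, psi) and r2 == (phi, chi):
                    return True
    return False

def check_derivation(lines, Gamma):
    # each line must be an axiom, an element of Gamma, or follow from
    # two earlier lines by modus ponens
    seen = []
    for f in lines:
        if not (is_axiom(f) or f in Gamma
                or any(imp(g) == (h, f) for g in seen for h in seen)):
            return False
        seen.append(f)
    return True

# a three-line derivation of q from {p, p -> q} by modus ponens:
print(check_derivation(["p", ("->", "p", "q"), "q"], {"p", ("->", "p", "q")}))  # True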


12.5 The Deduction Theorem


As we've seen, giving derivations in an axiomatic system is cumbersome, and
derivations may be hard to find. Rather than actually write out long lists of
formulas, it is generally easier to argue that such derivations exist, by mak-
ing use of a few simple results. We’ve already established three such results:
Proposition 12.14 says we can always assert that Γ ⊢ φ when we know that
φ ∈ Γ . Proposition 12.15 says that if Γ ⊢ φ then also Γ ∪ {ψ} ⊢ φ. And
Proposition 12.16 implies that if Γ ⊢ φ and φ ⊢ ψ, then Γ ⊢ ψ. Here’s another
simple result, a “meta”-version of modus ponens:

Proposition 12.19. If Γ ⊢ φ and Γ ⊢ φ → ψ, then Γ ⊢ ψ.

Proof. We have that {φ, φ → ψ} ⊢ ψ:

1. φ Hyp.
2. φ→ψ Hyp.
3. ψ 1, 2, MP

By Proposition 12.16, Γ ⊢ ψ.

The most important result we’ll use in this context is the deduction theorem:

Theorem 12.20 (Deduction Theorem). Γ ∪ {φ} ⊢ ψ if and only if Γ ⊢ φ → ψ.

Proof. The “if” direction is immediate. If Γ ⊢ φ → ψ then also Γ ∪ {φ} ⊢ φ → ψ by Proposition 12.15. Also, Γ ∪ {φ} ⊢ φ by Proposition 12.14. So, by
Proposition 12.19, Γ ∪ {φ} ⊢ ψ.
For the “only if” direction, we proceed by induction on the length of the
derivation of ψ from Γ ∪ {φ}.
For the induction basis, we prove the claim for every derivation of length 1.
A derivation of ψ from Γ ∪ {φ} of length 1 consists of ψ by itself; and if it is
correct ψ is either ∈ Γ ∪ {φ} or is an axiom. If ψ ∈ Γ or is an axiom, then
Γ ⊢ ψ. We also have that Γ ⊢ ψ →(φ→ψ) by eq. (12.7), and Proposition 12.19
gives Γ ⊢ φ → ψ. If ψ ∈ {φ}, then Γ ⊢ φ → ψ because then φ → ψ is the same as φ → φ, and we have derived that in Example 12.8.
For the inductive step, suppose a derivation of ψ from Γ ∪ {φ} ends with
a step ψ which is justified by modus ponens. (If it is not justified by modus
ponens, ψ ∈ Γ , ψ ≡ φ, or ψ is an axiom, and the same reasoning as in the
induction basis applies.) Then some previous steps in the derivation are χ → ψ
and χ, for some formula χ, i.e., Γ ∪ {φ} ⊢ χ → ψ and Γ ∪ {φ} ⊢ χ, and the
respective derivations are shorter, so the inductive hypothesis applies to them.
We thus have both:

Γ ⊢ φ → (χ → ψ);
Γ ⊢ φ → χ.


But also
Γ ⊢ (φ → (χ → ψ)) → ((φ → χ) → (φ → ψ)),
by eq. (12.8), and two applications of Proposition 12.19 give Γ ⊢ φ → ψ, as
required.

Notice how eq. (12.7) and eq. (12.8) were chosen precisely so that the De-
duction Theorem would hold.
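The “only if” direction of this proof is in effect an algorithm: it rewrites, line by line, a derivation of ψ from Γ ∪ {φ} into a derivation of φ → ψ from Γ. Here is a hedged Python sketch of that rewriting, reusing the tuple encoding and the is_axiom and check_derivation helpers from the sketch in section 12.4; the function names are our own.

def I(a, b):
    # implication constructor, for readability
    return ("->", a, b)

def self_implication(p):
    # the five-line derivation of p -> p from Example 12.8
    return [
        I(I(p, I(I(p, p), p)), I(I(p, I(p, p)), I(p, p))),  # eq. (12.8)
        I(p, I(I(p, p), p)),                                # eq. (12.7)
        I(I(p, I(p, p)), I(p, p)),                          # 2, 1, mp
        I(p, I(p, p)),                                      # eq. (12.7)
        I(p, p),                                            # 4, 3, mp
    ]

def deduction(derivation, Gamma, phi):
    # produce a derivation from Gamma whose last line is phi -> psi,
    # mirroring the induction basis and inductive step of the proof
    out, done = [], set()
    for psi in derivation:
        if psi == phi:
            out += self_implication(phi)
        elif is_axiom(psi) or psi in Gamma:
            out += [psi, I(psi, I(phi, psi)), I(phi, psi)]  # eq. (12.7), mp
        else:
            # psi was obtained by modus ponens from earlier lines chi and
            # chi -> psi; phi -> chi and phi -> (chi -> psi) are already in out
            chi = next(c for c in done if I(c, psi) in done)
            out += [I(I(phi, I(chi, psi)), I(I(phi, chi), I(phi, psi))),  # eq. (12.8)
                    I(I(phi, chi), I(phi, psi)),                          # mp
                    I(phi, psi)]                                          # mp
        done.add(psi)
    return out

d = ["p", I("p", "q"), "q"]              # derives q from {p -> q} together with p
new = deduction(d, {I("p", "q")}, "p")
print(new[-1], check_derivation(new, {I("p", "q")}))  # ('->', 'p', 'q') True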
The following are some useful facts about derivability, which we leave as
exercises.
Proposition 12.21.

1. ⊢ (φ → ψ) → ((ψ → χ) → (φ → χ));

2. If Γ ∪ {¬φ} ⊢ ¬ψ then Γ ∪ {ψ} ⊢ φ (Contraposition);

3. {φ, ¬φ} ⊢ ψ (Ex Falso Quodlibet, Explosion);

4. {¬¬φ} ⊢ φ (Double Negation Elimination);

5. If Γ ⊢ ¬¬φ then Γ ⊢ φ.

Problem 12.3. Prove Proposition 12.21.


12.6 Derivability and Consistency


We will now establish a number of properties of the derivability relation. They
are independently interesting, but each will play a role in the proof of the
completeness theorem.
Proposition 12.22. If Γ ⊢ φ and Γ ∪ {φ} is inconsistent, then Γ is inconsistent.

Proof. If Γ ∪ {φ} is inconsistent, then Γ ∪ {φ} ⊢ ⊥. By Proposition 12.14, Γ ⊢ ψ for every ψ ∈ Γ. Since also Γ ⊢ φ by hypothesis, Γ ⊢ ψ for every
ψ ∈ Γ ∪ {φ}. By Proposition 12.16, Γ ⊢ ⊥, i.e., Γ is inconsistent.

Proposition 12.23. Γ ⊢ φ iff Γ ∪ {¬φ} is inconsistent.

Proof. First suppose Γ ⊢ φ. Then Γ ∪ {¬φ} ⊢ φ by Proposition 12.15. Γ ∪ {¬φ} ⊢ ¬φ by Proposition 12.14. We also have ⊢ ¬φ → (φ → ⊥) by eq. (12.10).
So by two applications of Proposition 12.19, we have Γ ∪ {¬φ} ⊢ ⊥.
Now assume Γ ∪ {¬φ} is inconsistent, i.e., Γ ∪ {¬φ} ⊢ ⊥. By the deduction
theorem, Γ ⊢ ¬φ → ⊥. Γ ⊢ (¬φ → ⊥) → ¬¬φ by eq. (12.13), so Γ ⊢ ¬¬φ
by Proposition 12.19. Since Γ ⊢ ¬¬φ → φ (eq. (12.14)), we have Γ ⊢ φ by
Proposition 12.19 again.


Problem 12.4. Prove that Γ ⊢ ¬φ iff Γ ∪ {φ} is inconsistent.

Proposition 12.24. If Γ ⊢ φ and ¬φ ∈ Γ, then Γ is inconsistent.

Proof. Since ¬φ ∈ Γ, Γ ⊢ ¬φ by Proposition 12.14, and Γ ⊢ φ by hypothesis. Also, Γ ⊢ ¬φ → (φ → ⊥) by eq. (12.10). So Γ ⊢ ⊥ by two applications of Proposition 12.19.

Proposition 12.25. If Γ ∪ {φ} and Γ ∪ {¬φ} are both inconsistent, then Γ is inconsistent.

Proof. Exercise.

Problem 12.5. Prove Proposition 12.25.


12.7 Derivability and the Propositional Connectives


We establish that the derivability relation ⊢ of axiomatic deduction is strong
enough to establish some basic facts involving the propositional connectives,
such as that φ ∧ ψ ⊢ φ and φ, φ → ψ ⊢ ψ (modus ponens). These facts are
needed for the proof of the completeness theorem.

Proposition 12.26.

1. Both φ ∧ ψ ⊢ φ and φ ∧ ψ ⊢ ψ.

2. φ, ψ ⊢ φ ∧ ψ.

Proof. 1. From eq. (12.1) and eq. (12.2), respectively, by modus ponens.

2. From eq. (12.3) by two applications of modus ponens.

Proposition 12.27.

1. {φ ∨ ψ, ¬φ, ¬ψ} is inconsistent.

2. Both φ ⊢ φ ∨ ψ and ψ ⊢ φ ∨ ψ.

Proof. 1. From eq. (12.10) we get ⊢ ¬φ → (φ → ⊥) and ⊢ ¬ψ → (ψ → ⊥). So by the deduction theorem, we have {¬φ} ⊢ φ → ⊥ and {¬ψ} ⊢ ψ → ⊥. From eq. (12.6) we get {¬φ, ¬ψ} ⊢ (φ ∨ ψ) → ⊥. By the deduction theorem, {φ ∨ ψ, ¬φ, ¬ψ} ⊢ ⊥.

2. From eq. (12.4) and eq. (12.5) by modus ponens.

Proposition 12.28.


1. φ, φ → ψ ⊢ ψ.

2. Both ¬φ ⊢ φ → ψ and ψ ⊢ φ → ψ.

Proof. 1. We can derive:

1. φ Hyp
2. φ→ψ Hyp
3. ψ 1, 2, mp

2. By eq. (12.10) and eq. (12.7) and the deduction theorem, respectively.


12.8 Soundness
A derivation system, such as axiomatic deduction, is sound if it cannot derive
things that do not actually hold. Soundness is thus a kind of guaranteed safety
property for derivation systems. Depending on which proof-theoretic property is in question, we would like to know, for instance, that
1. every derivable φ is valid;
2. if φ is derivable from some others Γ , it is also a consequence of them;
3. if a set of formulas Γ is inconsistent, it is unsatisfiable.
These are important properties of a derivation system. If any of them do
not hold, the derivation system is deficient—it would derive too much. Con-
sequently, establishing the soundness of a derivation system is of the utmost
importance.
Proposition 12.29. If φ is an axiom, then v ⊨ φ for each valuation v.

Proof. Do truth tables for each axiom to verify that they are tautologies.
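This check is mechanical. A small Python sketch follows, with formulas again encoded as nested tuples (our own convention) and only ¬ and → handled; checking an instance of each schema in distinct variables suffices for the schema as a whole.

from itertools import product

def variables(f):
    if isinstance(f, str):
        return {f}
    return set().union(*(variables(g) for g in f[1:]))

def evaluate(f, v):
    if isinstance(f, str):
        return v[f]
    if f[0] == "not":
        return not evaluate(f[1], v)
    if f[0] == "->":
        return (not evaluate(f[1], v)) or evaluate(f[2], v)
    raise ValueError(f[0])

def is_tautology(f):
    # truth table: try every row of truth values for the variables of f
    vs = sorted(variables(f))
    return all(evaluate(f, dict(zip(vs, row)))
               for row in product([True, False], repeat=len(vs)))

# instances of eq. (12.7) and eq. (12.8):
print(is_tautology(("->", "p", ("->", "q", "p"))))                  # True
print(is_tautology(("->", ("->", "p", ("->", "q", "r")),
                    ("->", ("->", "p", "q"), ("->", "p", "r")))))   # True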

Theorem 12.30 (Soundness). If Γ ⊢ φ then Γ ⊨ φ.

Proof. By induction on the length of the derivation of φ from Γ. If there are no steps justified by inferences, then all formulas in the derivation are either
instances of axioms or are in Γ . By the previous proposition, all the axioms
are tautologies, and hence if φ is an axiom then Γ ⊨ φ. If φ ∈ Γ , then trivially
Γ ⊨ φ.
If the last step of the derivation of φ is justified by modus ponens, then
there are formulas ψ and ψ → φ in the derivation, and the induction hypothesis
applies to the part of the derivation ending in those formulas (since they contain
at least one fewer steps justified by an inference). So, by induction hypothesis,
Γ ⊨ ψ and Γ ⊨ ψ → φ. Then Γ ⊨ φ by Theorem 7.21.



Corollary 12.31. If ⊢ φ, then φ is a tautology.

Corollary 12.32. If Γ is satisfiable, then it is consistent.
Proof. We prove the contrapositive. Suppose that Γ is not consistent. Then
Γ ⊢ ⊥, i.e., there is a derivation of ⊥ from Γ . By Theorem 12.30, any valua-
tion v that satisfies Γ must satisfy ⊥. Since v ⊭ ⊥ for every valuation v, no v
can satisfy Γ , i.e., Γ is not satisfiable.

Chapter 13

The Completeness Theorem


13.1 Introduction
The completeness theorem is one of the most fundamental results about logic.
It comes in two formulations, the equivalence of which we’ll prove. In its
first formulation it says something fundamental about the relationship between
semantic consequence and our derivation system: if a sentence φ follows from
some sentences Γ , then there is also a derivation that establishes Γ ⊢ φ. Thus,
the derivation system is as strong as it can possibly be without proving things
that don’t actually follow.
In its second formulation, it can be stated as a model existence result: every
consistent set of sentences is satisfiable. Consistency is a proof-theoretic notion:
it says that our derivation system is unable to produce certain derivations. But
who’s to say that just because there are no derivations of a certain sort from Γ ,
it’s guaranteed that there is a valuation v with v ⊨ Γ ? Before the completeness
theorem was first proved—in fact before we had the derivation systems we
now do—the great German mathematician David Hilbert held the view that
consistency of mathematical theories guarantees the existence of the objects
they are about. He put it as follows in a letter to Gottlob Frege:
If the arbitrarily given axioms do not contradict one another with
all their consequences, then they are true and the things defined by
the axioms exist. This is for me the criterion of truth and existence.


Frege vehemently disagreed. The second formulation of the completeness theorem shows that Hilbert was right in at least the sense that if the axioms are
consistent, then some valuation exists that makes them all true.
These aren’t the only reasons the completeness theorem—or rather, its
proof—is important. It has a number of important consequences, some of
which we’ll discuss separately. For instance, since any derivation that shows
Γ ⊢ φ is finite and so can only use finitely many of the sentences in Γ , it follows
by the completeness theorem that if φ is a consequence of Γ , it is already a
consequence of a finite subset of Γ . This is called compactness. Equivalently,
if every finite subset of Γ is consistent, then Γ itself must be consistent.
Although the compactness theorem follows from the completeness theorem
via the detour through derivations, it is also possible to use the proof of the completeness theorem to establish it directly. For what the proof does is take a set of sentences with a certain property—consistency—and construct
a structure out of this set that has certain properties (in this case, that it
satisfies the set). Almost the very same construction can be used to directly
establish compactness, by starting from “finitely satisfiable” sets of sentences
instead of consistent ones.


13.2 Outline of the Proof


The proof of the completeness theorem is a bit complex, and upon first reading
it, it is easy to get lost. So let us outline the proof. The first step is a shift
of perspective, that allows us to see a route to a proof. When completeness
is thought of as “whenever Γ ⊨ φ then Γ ⊢ φ,” it may be hard to even come
up with an idea: for to show that Γ ⊢ φ we have to find a derivation, and
it does not look like the hypothesis that Γ ⊨ φ helps us for this in any way.
For some proof systems it is possible to directly construct a derivation, but we
will take a slightly different approach. The shift in perspective required is this:
completeness can also be formulated as: “if Γ is consistent, it is satisfiable.”
Perhaps we can use the information in Γ together with the hypothesis that it is
consistent to construct a valuation that satisfies every formula in Γ . After all,
we know what kind of valuation we are looking for: one that is as Γ describes
it!
If Γ contains only propositional variables, it is easy to construct a model
for it. All we have to do is come up with a valuation v such that v ⊨ p for all
p ∈ Γ . Well, let v(p) = T iff p ∈ Γ .
Now suppose Γ contains some formula ¬ψ, with ψ atomic. We might worry
that the construction of v interferes with the possibility of making ¬ψ true.
But here’s where the consistency of Γ comes in: if ¬ψ ∈ Γ, then ψ ∉ Γ, or else Γ would be inconsistent. And if ψ ∉ Γ, then according to our construction
of v, v ⊭ ψ, so v ⊨ ¬ψ. So far so good.


What if Γ contains complex, non-atomic formulas? Say it contains φ ∧ ψ.


To make that true, we should proceed as if both φ and ψ were in Γ . And if
φ ∨ ψ ∈ Γ , then we will have to make at least one of them true, i.e., proceed
as if one of them was in Γ .
This suggests the following idea: we add additional formulas to Γ so as to
(a) keep the resulting set consistent and (b) make sure that for every possible
sentence φ, either φ is in the resulting set, or ¬φ is, and (c) such that,
whenever φ ∧ ψ is in the set, so are both φ and ψ, if φ ∨ ψ is in the set, at least
one of φ or ψ is also, etc. We keep doing this (potentially forever). Call the
set of all formulas so added Γ ∗ . Then our construction above would provide
us with a valuation v for which we could prove, by induction, that it satisfies
all sentences in Γ ∗ , and hence also all sentences in Γ since Γ ⊆ Γ ∗ . It turns
out that guaranteeing (a) and (b) is enough. A set of sentences for which (b)
holds is called complete. So our task will be to extend the consistent set Γ to
a consistent and complete set Γ ∗ .
So here’s what we’ll do. First we investigate the properties of complete
consistent sets, in particular we prove that a complete consistent set contains
φ ∧ ψ iff it contains both φ and ψ, φ ∨ ψ iff it contains at least one of them,
etc. (Proposition 13.2). We’ll then take the consistent set Γ and show that
it can be extended to a consistent and complete set Γ ∗ (Lemma 13.3). This
set Γ ∗ is what we’ll use to define our valuation v(Γ ∗ ). The valuation is de-
termined by the propositional variables in Γ ∗ (Definition 13.4). We’ll use the
properties of complete consistent sets to show that indeed v(Γ ∗ ) ⊨ φ iff φ ∈ Γ ∗
(Lemma 13.5), and thus in particular, v(Γ ∗ ) ⊨ Γ .


13.3 Complete Consistent Sets of Sentences


Definition 13.1 (Complete set). A set Γ of sentences is complete iff for any sentence φ, either φ ∈ Γ or ¬φ ∈ Γ.

Complete sets of sentences leave no questions unanswered. For any sen-
tence φ, Γ “says” if φ is true or false. The importance of complete sets extends
beyond the proof of the completeness theorem. A theory which is complete and
axiomatizable, for instance, is always decidable.
Complete consistent sets are important in the completeness proof since we
can guarantee that every consistent set of sentences Γ is contained in a com-
plete consistent set Γ ∗ . A complete consistent set contains, for each sentence φ,
either φ or its negation ¬φ, but not both. This is true in particular for proposi-
tional variables, so from a complete consistent set, we can construct a valuation
where the truth value assigned to propositional variables is defined according
to which propositional variables are in Γ ∗ . This valuation can then be shown
to make all sentences in Γ ∗ (and hence also all those in Γ ) true. The proof of this latter fact requires that ¬φ ∈ Γ ∗ iff φ ∉ Γ ∗ , (φ ∨ ψ) ∈ Γ ∗ iff φ ∈ Γ ∗ or ψ ∈ Γ ∗ , etc.
In what follows, we will often tacitly use the properties of reflexivity, mono-
tonicity, and transitivity of ⊢ (see sections 9.6, 10.5, 11.5 and 12.4).

Proposition 13.2. Suppose Γ is complete and consistent. Then:

1. If Γ ⊢ φ, then φ ∈ Γ.

2. φ ∧ ψ ∈ Γ iff both φ ∈ Γ and ψ ∈ Γ.

3. φ ∨ ψ ∈ Γ iff either φ ∈ Γ or ψ ∈ Γ.

4. φ → ψ ∈ Γ iff either φ ∉ Γ or ψ ∈ Γ.

Proof. Let us suppose for all of the following that Γ is complete and consistent.

1. If Γ ⊢ φ, then φ ∈ Γ.
Suppose that Γ ⊢ φ. Suppose to the contrary that φ ∉ Γ. Since Γ is complete, ¬φ ∈ Γ. By Propositions 9.19, 10.17, 11.17 and 12.24, Γ is inconsistent. This contradicts the assumption that Γ is consistent. Hence, it cannot be the case that φ ∉ Γ, so φ ∈ Γ.

2. φ ∧ ψ ∈ Γ iff both φ ∈ Γ and ψ ∈ Γ :


For the forward direction, suppose φ∧ψ ∈ Γ . Then by Propositions 9.21,
10.19, 11.19 and 12.26, item (1), Γ ⊢ φ and Γ ⊢ ψ. By (1), φ ∈ Γ and
ψ ∈ Γ , as required.
For the reverse direction, let φ ∈ Γ and ψ ∈ Γ . By Propositions 9.21,
10.19, 11.19 and 12.26, item (2), Γ ⊢ φ ∧ ψ. By (1), φ ∧ ψ ∈ Γ .

3. First we show that if φ ∨ ψ ∈ Γ, then either φ ∈ Γ or ψ ∈ Γ. Suppose φ ∨ ψ ∈ Γ but φ ∉ Γ and ψ ∉ Γ. Since Γ is complete, ¬φ ∈ Γ and ¬ψ ∈ Γ. By Propositions 9.22, 10.20, 11.20 and 12.27, item (1), Γ is inconsistent, a contradiction. Hence, either φ ∈ Γ or ψ ∈ Γ.
For the reverse direction, suppose that φ ∈ Γ or ψ ∈ Γ. By Propositions 9.22, 10.20, 11.20 and 12.27, item (2), Γ ⊢ φ ∨ ψ. By (1), φ ∨ ψ ∈ Γ, as required.

4. For the forward direction, suppose φ → ψ ∈ Γ, and suppose to the contrary that φ ∈ Γ and ψ ∉ Γ. On these assumptions, φ → ψ ∈ Γ and φ ∈ Γ. By Propositions 9.23, 10.21, 11.21 and 12.28, item (1), Γ ⊢ ψ. But then by (1), ψ ∈ Γ, contradicting the assumption that ψ ∉ Γ.
For the reverse direction, first consider the case where φ ∉ Γ. Since Γ is complete, ¬φ ∈ Γ. By Propositions 9.23, 10.21, 11.21 and 12.28, item (2), Γ ⊢ φ → ψ. Again by (1), we get that φ → ψ ∈ Γ, as required.
Now consider the case where ψ ∈ Γ. By Propositions 9.23, 10.21, 11.21 and 12.28, item (2) again, Γ ⊢ φ → ψ. By (1), φ → ψ ∈ Γ.


Problem 13.1. Complete the proof of Proposition 13.2.


13.4 Lindenbaum’s Lemma


We now prove a lemma that shows that any consistent set of sentences is con-
tained in some set of sentences which is not just consistent, but also complete.
The proof works by adding one sentence at a time, guaranteeing at each step
that the set remains consistent. We do this so that for every φ, either φ or ¬φ
gets added at some stage. The union of all stages in that construction then
contains either φ or its negation ¬φ and is thus complete. It is also consistent,
since we make sure at each stage not to introduce an inconsistency.
Lemma 13.3 (Lindenbaum’s Lemma). Every consistent set Γ in a language L can be extended to a complete and consistent set Γ ∗ .

Proof. Let Γ be consistent. Let φ0 , φ1 , . . . be an enumeration of all the sentences of L. Define Γ0 = Γ, and

Γn+1 = Γn ∪ {φn } if Γn ∪ {φn } is consistent;
Γn+1 = Γn ∪ {¬φn } otherwise.

Let Γ ∗ = ⋃n≥0 Γn .
Each Γn is consistent: Γ0 is consistent by definition. If Γn+1 = Γn ∪ {φn },
this is because the latter is consistent. If it isn’t, Γn+1 = Γn ∪ {¬φn }. We have
to verify that Γn ∪ {¬φn } is consistent. Suppose it’s not. Then both Γn ∪ {φn }
and Γn ∪ {¬φn } are inconsistent. This means that Γn would be inconsistent by
Propositions 9.20, 10.18, 11.18 and 12.25, contrary to the induction hypothesis.
For every n and every i < n, Γi ⊆ Γn . This follows by a simple induction
on n. For n = 0, there are no i < 0, so the claim holds automatically. For
the inductive step, suppose it is true for n. We show that if i < n + 1 then
Γi ⊆ Γn+1 . We have Γn+1 = Γn ∪ {φn } or = Γn ∪ {¬φn } by construction. So
Γn ⊆ Γn+1 . If i < n + 1, then Γi ⊆ Γn by inductive hypothesis (if i < n) or
the trivial fact that Γn ⊆ Γn (if i = n). We get that Γi ⊆ Γn+1 by transitivity
of ⊆.
From this it follows that Γ ∗ is consistent. Here’s why: Let Γ ′ ⊆ Γ ∗ be
finite. Each ψ ∈ Γ ′ is also in Γi for some i. Let n be the largest of these.
Since Γi ⊆ Γn if i ≤ n, every ψ ∈ Γ ′ is also ∈ Γn , i.e., Γ ′ ⊆ Γn , and Γn is
consistent. So, every finite subset Γ ′ ⊆ Γ ∗ is consistent. By Propositions 9.16,
10.14, 11.14 and 12.18, Γ ∗ is consistent.
Every sentence of Frm(L) appears on the list used to define Γ ∗ . If φn ∉ Γ ∗ , then that is because Γn ∪ {φn } was inconsistent. But then ¬φn ∈ Γn+1 ⊆ Γ ∗ , so Γ ∗ is complete.
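For finitely many sentences the construction can be carried out directly. A toy Python sketch, reusing variables and evaluate from the sketch in section 12.8, and using truth-table satisfiability as a stand-in for the consistency test (which soundness and completeness justify for propositional logic):

from itertools import product

def satisfiable(Gamma):
    vs = sorted(set().union(*(variables(f) for f in Gamma)) or {"p"})
    return any(all(evaluate(f, dict(zip(vs, row))) for f in Gamma)
               for row in product([True, False], repeat=len(vs)))

def lindenbaum(Gamma, enumeration):
    # stage n + 1: add phi_n if the result stays consistent, else add its negation
    stage = set(Gamma)
    for phi in enumeration:
        stage |= {phi} if satisfiable(stage | {phi}) else {("not", phi)}
    return stage

print(lindenbaum({"p"}, ["q", ("->", "p", "q"), ("not", "p")]))
# contains p, q, p -> q, and the negation of "not p"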


13.5 Construction of a Model


We are now ready to define a valuation that makes all φ ∈ Γ true. To do this,
we first apply Lindenbaum’s Lemma: we get a complete consistent Γ ∗ ⊇ Γ .
We let the propositional variables in Γ ∗ determine v(Γ ∗ ).
Definition 13.4. Suppose Γ ∗ is a complete consistent set of formulas. Then we let

v(Γ ∗ )(p) = T if p ∈ Γ ∗ ;
v(Γ ∗ )(p) = F if p ∉ Γ ∗ .

Lemma 13.5 (Truth Lemma). v(Γ ∗ ) ⊨ φ iff φ ∈ Γ ∗ .

Proof. We prove both directions simultaneously, and by induction on φ.

1. φ ≡ ⊥: v(Γ ∗ ) ⊭ ⊥ by definition of satisfaction. On the other hand, ⊥ ∉ Γ ∗ since Γ ∗ is consistent.

2. φ ≡ ⊤: v(Γ ∗ ) ⊨ ⊤ by definition of satisfaction. On the other hand, ⊤ ∈ Γ ∗ since Γ ∗ is consistent and complete, and Γ ∗ ⊢ ⊤.

3. φ ≡ p: v(Γ ∗ ) ⊨ p iff v(Γ ∗ )(p) = T (by the definition of satisfaction) iff p ∈ Γ ∗ (by the construction of v(Γ ∗ )).

4. φ ≡ ¬ψ: v(Γ ∗ ) ⊨ φ iff v(Γ ∗ ) ⊭ ψ (by definition of satisfaction). By induction hypothesis, v(Γ ∗ ) ⊭ ψ iff ψ ∉ Γ ∗ . Since Γ ∗ is consistent and complete, ψ ∉ Γ ∗ iff ¬ψ ∈ Γ ∗ .
5. φ ≡ ψ ∧ χ: v(Γ ∗ ) ⊨ φ iff we have both v(Γ ∗ ) ⊨ ψ and v(Γ ∗ ) ⊨ χ (by
definition of satisfaction) iff both ψ ∈ Γ ∗ and χ ∈ Γ ∗ (by the induction
hypothesis). By Proposition 13.2(2), this is the case iff (ψ ∧ χ) ∈ Γ ∗ .
6. φ ≡ ψ ∨ χ: v(Γ ∗ ) ⊨ φ iff v(Γ ∗ ) ⊨ ψ or v(Γ ∗ ) ⊨ χ (by definition of
satisfaction) iff ψ ∈ Γ ∗ or χ ∈ Γ ∗ (by induction hypothesis). This is the
case iff (ψ ∨ χ) ∈ Γ ∗ (by Proposition 13.2(3)).
7. φ ≡ ψ → χ: v(Γ ∗ ) ⊨ φ iff v(Γ ∗ ) ⊭ ψ or v(Γ ∗ ) ⊨ χ (by definition of satisfaction) iff ψ ∉ Γ ∗ or χ ∈ Γ ∗ (by induction hypothesis). This is the case iff (ψ → χ) ∈ Γ ∗ (by Proposition 13.2(4)).
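The induction in this proof is exactly the recursion one would program. A sketch, with connectives encoded as tuples (our own convention):

def v_star(Gamma_star):
    # Definition 13.4: a propositional variable is true iff it is in Gamma*
    return lambda p: p in Gamma_star

def models(f, v):
    # each clause mirrors one case of the induction in the Truth Lemma
    if f == "bot":
        return False
    if f == "top":
        return True
    if isinstance(f, str):
        return v(f)
    op = f[0]
    if op == "not":
        return not models(f[1], v)
    if op == "and":
        return models(f[1], v) and models(f[2], v)
    if op == "or":
        return models(f[1], v) or models(f[2], v)
    if op == "->":
        return (not models(f[1], v)) or models(f[2], v)
    raise ValueError(op)

# a toy fragment of a complete consistent set:
Gamma_star = {"p", "q", ("->", "p", "q"), ("and", "p", "q")}
v = v_star(Gamma_star)
print(models(("and", "p", "q"), v))  # True, matching (p ∧ q) ∈ Γ*
print(models(("not", "r"), v))       # True: r ∉ Γ*, so ¬r would be in Γ*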


13.6 The Completeness Theorem


Let’s combine our results: we arrive at the completeness theorem.

Theorem 13.6 (Completeness Theorem). Let Γ be a set of sentences. If Γ is consistent, it is satisfiable.


Proof. Suppose Γ is consistent. By Lemma 13.3, there is a Γ ∗ ⊇ Γ which is consistent and complete. By Lemma 13.5, v(Γ ∗ ) ⊨ φ iff φ ∈ Γ ∗ . From this it
follows in particular that for all φ ∈ Γ , v(Γ ∗ ) ⊨ φ, so Γ is satisfiable.

Corollary 13.7 (Completeness Theorem, Second Version). For all Γ and sentences φ: if Γ ⊨ φ then Γ ⊢ φ.

Proof. Note that the Γ ’s in Corollary 13.7 and Theorem 13.6 are universally
quantified. To make sure we do not confuse ourselves, let us restate Theo-
rem 13.6 using a different variable: for any set of sentences ∆, if ∆ is con-
sistent, it is satisfiable. By contraposition, if ∆ is not satisfiable, then ∆ is
inconsistent. We will use this to prove the corollary.
Suppose that Γ ⊨ φ. Then Γ ∪ {¬φ} is unsatisfiable by Proposition 7.20.
Taking Γ ∪ {¬φ} as our ∆, the previous version of Theorem 13.6 gives us that
Γ ∪{¬φ} is inconsistent. By Propositions 9.18, 10.16, 11.16 and 12.23, Γ ⊢ φ.

Problem 13.2. Use Corollary 13.7 to prove Theorem 13.6, thus showing that
the two formulations of the completeness theorem are equivalent.

Problem 13.3. In order for a derivation system to be complete, its rules must
be strong enough to prove every unsatisfiable set inconsistent. Which of the
rules of derivation were necessary to prove completeness? Are any of these rules
not used anywhere in the proof? In order to answer these questions, make a
list or diagram that shows which of the rules of derivation were used in which
results that lead up to the proof of Theorem 13.6. Be sure to note any tacit
uses of rules in these proofs.


13.7 The Compactness Theorem


One important consequence of the completeness theorem is the compactness
theorem. The compactness theorem states that if each finite subset of a set
of sentences is satisfiable, the entire set is satisfiable—even if the set itself is
infinite. This is far from obvious. There is nothing that seems to rule out,
at first glance at least, the possibility of there being infinite sets of sentences
which are contradictory, but the contradiction only arises, so to speak, from
the infinite number. The compactness theorem says that such a scenario can
be ruled out: there are no unsatisfiable infinite sets of sentences each finite
subset of which is satisfiable. Like the completeness theorem, it has a version
related to entailment: if an infinite set of sentences entails something, a finite subset already does.

Definition 13.8. A set Γ of formulas is finitely satisfiable iff every finite Γ0 ⊆ Γ is satisfiable.


Theorem 13.9 (Compactness Theorem). The following hold for any set of sentences Γ and sentence φ:

1. Γ ⊨ φ iff there is a finite Γ0 ⊆ Γ such that Γ0 ⊨ φ.

2. Γ is satisfiable iff it is finitely satisfiable.

Proof. We prove (2). If Γ is satisfiable, then there is a valuation v such that v ⊨ φ for all φ ∈ Γ. Of course, this v also satisfies every finite subset of Γ, so
Γ is finitely satisfiable.
Now suppose that Γ is finitely satisfiable. Then every finite subset Γ0 ⊆ Γ is
satisfiable. By soundness (Corollaries 9.28, 10.24, 11.26 and 12.32), every finite
subset is consistent. Then Γ itself must be consistent by Propositions 9.16,
10.14, 11.14 and 12.18. By completeness (Theorem 13.6), since Γ is consistent,
it is satisfiable.

Problem 13.4. Prove (1) of Theorem 13.9.


13.8 A Direct Proof of the Compactness Theorem


We can prove the Compactness Theorem directly, without appealing to the
Completeness Theorem, using the same ideas as in the proof of the complete-
ness theorem. In the proof of the Completeness Theorem we started with a
consistent set Γ of sentences, expanded it to a consistent and complete set Γ ∗
of sentences, and then showed that in the valuation v(Γ ∗ ) constructed from
Γ ∗ , all sentences of Γ are true, so Γ is satisfiable.
We can use the same method to show that a finitely satisfiable set of sen-
tences is satisfiable. We just have to prove the corresponding versions of the
results leading to the truth lemma where we replace “consistent” with “finitely
satisfiable.”

Proposition 13.10. Suppose Γ is complete and finitely satisfiable. Then:

1. (φ ∧ ψ) ∈ Γ iff both φ ∈ Γ and ψ ∈ Γ .

2. (φ ∨ ψ) ∈ Γ iff either φ ∈ Γ or ψ ∈ Γ .

3. (φ → ψ) ∈ Γ iff either φ ∉ Γ or ψ ∈ Γ.

Problem 13.5. Prove Proposition 13.10. Avoid the use of ⊢.

Lemma 13.11. Every finitely satisfiable set Γ can be extended to a complete and finitely satisfiable set Γ ∗ .


Problem 13.6. Prove Lemma 13.11. (Hint: the crucial step is to show that
if Γn is finitely satisfiable, then either Γn ∪ {φn } or Γn ∪ {¬φn } is finitely
satisfiable.)

Theorem 13.12 (Compactness). Γ is satisfiable if and only if it is finitely satisfiable.

Proof. If Γ is satisfiable, then there is a valuation v such that v ⊨ φ for all φ ∈ Γ. Of course, this v also satisfies every finite subset of Γ, so Γ is finitely
satisfiable.
Now suppose that Γ is finitely satisfiable. By Lemma 13.11, Γ can be
extended to a complete and finitely satisfiable set Γ ∗ . Construct the valua-
tion v(Γ ∗ ) as in Definition 13.4. The proof of the Truth Lemma (Lemma 13.5)
goes through if we replace references to Proposition 13.2 with references to Proposition 13.10.

Problem 13.7. Write out the complete proof of the Truth Lemma (Lemma 13.5)
in the version required for the proof of Theorem 13.12.



Part III

First-order Logic
This part covers the metatheory of first-order logic through complete-
ness. Currently it does not rely on a separate treatment of propositional
logic; everything is proved. The source files will exclude the material on
quantifiers (and replace “structure” with “valuation”, M with v, etc.) if
the “FOL” tag is false. In fact, most of the material in the part on propo-
sitional logic is simply the first-order material with the “FOL” tag turned
off.
If the part on propositional logic is included, this results in a lot of
repetition. It is planned, however, to make it possible to let this part take
into account the material on propositional logic (and exclude the material
already covered, as well as shorten proofs with references to the respective
places in the propositional part).

Chapter 14

Introduction to First-Order Logic


14.1 First-Order Logic


You are probably familiar with first-order logic from your first introduction to
formal logic.1 You may know it as “quantificational logic” or “predicate logic.”
First-order logic, first of all, is a formal language. That means, it has a certain
vocabulary, and its expressions are strings from this vocabulary. But not every
string is permitted. There are different kinds of permitted expressions: terms,
1 In fact, we more or less assume you are! If you’re not, you could review a more elementary textbook, such as forall x (Magnus et al., 2021).


formulas, and sentences. We are mainly interested in sentences of first-order logic: they provide us with a formal analogue of sentences of English, and
about them we can ask the questions a logician typically is interested in. For
instance:

• Does ψ follow from φ logically?

• Is φ logically true, logically false, or contingent?

• Are φ and ψ equivalent?

These questions are primarily questions about the “meaning” of sentences of first-order logic. For instance, a philosopher would analyze the question of
whether ψ follows logically from φ as asking: is there a case where φ is true
but ψ is false (ψ doesn’t follow from φ), or does every case that makes φ true
also make ψ true (ψ does follow from φ)? But we haven’t been told yet what
a “case” is—that is the job of semantics. The semantics of first-order logic
provides a mathematically precise model of the philosopher’s intuitive idea of
“case,” and also—and this is important—of what it is for a sentence φ to be
true in a case. We call the mathematically precise model that we will develop
a structure. The relation which makes “true in” precise, is called the relation of
satisfaction. So what we will define is “φ is satisfied in M” (in symbols: M ⊨ φ)
for sentences φ and structures M. Once this is done, we can also give precise
definitions of the other semantical terms such as “follows from” or “is logically
true.” These definitions will make it possible to settle, again with mathematical
precision, whether, e.g., ∀x (φ(x) → ψ(x)), ∃x φ(x) ⊨ ∃x ψ(x). The answer will,
of course, be “yes.” If you’ve already been trained to symbolize sentences of
English in first-order logic, you will recognize this as, e.g., the symbolizations
of, say, “All ants are insects, there are ants, therefore there are insects.” That
is obviously a valid argument, and so our mathematical model of “follows from”
for our formal language should give the same answer.
Another topic you probably remember from your first introduction to for-
mal logic is that there are derivations. If you have taken a first formal logic
course, your instructor will have made you practice finding such derivations,
perhaps even a derivation that shows that the above entailment holds. There
are many different ways to give derivations: you may have done something
called “natural deduction” or “truth trees,” but there are many others. The
purpose of derivation systems is to provide tools using which the logicians’
questions above can be answered: e.g., a natural deduction derivation in which
∀x (φ(x) → ψ(x)) and ∃x φ(x) are premises and ∃x ψ(x) is the conclusion (last
line) verifies that ∃x ψ(x) logically follows from ∀x (φ(x) → ψ(x)) and ∃x φ(x).
But why is that? On the face of it, derivation systems have nothing to do
with semantics: giving a formal derivation merely involves arranging symbols in
certain rule-governed ways; they don’t mention “cases” or “true in” at all. The
connection between derivation systems and semantics has to be established by
a meta-logical investigation. What’s needed is a mathematical proof, e.g., that
a formal derivation of ∃x ψ(x) from premises ∀x (φ(x) → ψ(x)) and ∃x φ(x) is
possible, if, and only if, ∀x (φ(x) → ψ(x)) and ∃x φ(x) together entail ∃x ψ(x).
Before this can be done, however, a lot of painstaking work has to be carried
out to get the definitions of syntax and semantics correct.


14.2 Syntax
We first must make precise what strings of symbols count as sentences of first-
order logic. We’ll do this later; for now we’ll just proceed by example. The basic
building blocks—the vocabulary—of first-order logic divides into two parts.
The first part is the symbols we use to say specific things or to pick out specific
things. We pick out things using constant symbols, and we say stuff about the
things we pick out using predicate symbols. E.g., we might use a as a constant
symbol to pick out a single thing, and then say something about it using the
sentence P (a). If you have meanings for “a” and “P ” in mind, you can read
P (a) as a sentence of English (and you probably have done so when you first
learned formal logic). Once you have such simple sentences of first-order logic,
you can build more complex ones using the second part of the vocabulary: the
logical symbols (connectives and quantifiers). So, for instance, we can form
expressions like (P (a) ∧ Q(b)) or ∃x P (x).
In order to provide the precise definitions of semantics and the rules of
our derivation systems required for rigorous meta-logical study, we first of all
have to give a precise definition of what counts as a sentence of first-order
logic. The basic idea is easy enough to understand: there are some simple
sentences we can form from just predicate symbols and constant symbols, such
as P (a). And then from these we form more complex ones using the connectives
and quantifiers. But what exactly are the rules by which we are allowed to
form more complex sentences? These must be specified, otherwise we have
not defined “sentence of first-order logic” precisely enough. There are a few
issues. The first one is to get the right strings to count as sentences. The
second one is to do this in such a way that we can give mathematical proofs
about all sentences. Finally, we’ll have to also give precise definitions of some
rudimentary operations with sentences, such as “replace every x in φ by b.”
The trouble is that the quantifiers and variables we have in first-order logic
make it not entirely obvious how this should be done. E.g., should ∃x P (a)
count as a sentence? What about ∃x ∃x P (x)? What should the result of
“replace x by b in (P (x) ∧ ∃x P (x))” be?


14.3 Formulas
Here is the approach we will use to rigorously specify sentences of first-order
logic and to deal with the issues arising from the use of variables. We first define a different set of expressions: formulas. Once we’ve done that, we can
consider the role variables play in them—and on the basis of some other ideas,
namely those of “free” and “bound” variables, we can define what a sentence
is (namely, a formula without free variables). We do this not just because it
makes the definition of “sentence” more manageable, but also because it will
be crucial to the way we define the semantic notion of satisfaction.
Let’s define “formula” for a simple first-order language, one containing only
a single predicate symbol P and a single constant symbol a, and only the logical
symbols ¬, ∧, and ∃. Our full definitions will be much more general: we’ll allow
infinitely many predicate symbols and constant symbols. In fact, we will also
consider function symbols which can be combined with constant symbols and
variables to form “terms.” For now, a and the variables will be our only terms.
We do need infinitely many variables. We’ll officially use the symbols v0 , v1 ,
. . . , as variables.
Definition 14.1. The set of formulas Frm is defined as follows:
1. P (a) and P (vi ) are formulas (i ∈ N).

2. If φ is a formula, then ¬φ is a formula.

3. If φ and ψ are formulas, then (φ ∧ ψ) is a formula.

4. If φ is a formula and x is a variable, then ∃x φ is a formula.

5. Nothing else is a formula.
(1) tells us that P (a) and P (vi ) are formulas, for any i ∈ N. These are
the so-called atomic formulas. They give us something to start from. The
other clauses give us ways of forming new formulas from ones we have already
formed. So for instance, by (2), we get that ¬P (v2 ) is a formula, since P (v2 )
is already a formula by (1). Then, by (4), we get that ∃v2 ¬P (v2 ) is another
formula, and so on. (5) tells us that only strings we can form in this way count
as formulas. In particular, ∃v0 P (a) and ∃v0 ∃v0 P (a) do count as formulas, and
(¬P (a)) does not, because of the extraneous outer parentheses.
This way of defining formulas is called an inductive definition, and it allows
us to prove things about formulas using a version of proof by induction called
structural induction. These are discussed in a general way in section 71.4 and
section 71.5, which you should review before delving into the proofs later on.
Basically, the idea is that if you want to give a proof that something is true for
all formulas, you show first that it is true for the atomic formulas, and then
that if it’s true for any formula φ (and ψ), it’s also true for ¬φ, (φ ∧ ψ), and
∃x φ. For instance, this proves that it’s true for ∃v2 ¬P (v2 ): from the first part
you know that it’s true for the atomic formula P (v2 ). Then you get that it’s
true for ¬P (v2 ) by the second part, and then again that it’s true for ∃v2 ¬P (v2 )
itself. Since all formulas are inductively generated from atomic formulas, this
works for any of them.
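The inductive definition translates directly into a recursive membership test. A minimal Python sketch, with formulas encoded as tuples (our own convention): ("P", t) for atomic formulas, ("not", φ), ("and", φ, ψ), and ("exists", x, φ).

def is_variable(x):
    return isinstance(x, str) and x.startswith("v") and x[1:].isdigit()

def is_formula(f):
    if not isinstance(f, tuple) or not f:
        return False
    if f[0] == "P":                      # clause (1): P(a) and P(v_i)
        return len(f) == 2 and (f[1] == "a" or is_variable(f[1]))
    if f[0] == "not":                    # clause (2)
        return len(f) == 2 and is_formula(f[1])
    if f[0] == "and":                    # clause (3)
        return len(f) == 3 and is_formula(f[1]) and is_formula(f[2])
    if f[0] == "exists":                 # clause (4)
        return len(f) == 3 and is_variable(f[1]) and is_formula(f[2])
    return False                         # clause (5): nothing else

print(is_formula(("exists", "v2", ("not", ("P", "v2")))))  # True
print(is_formula(("exists", "a", ("P", "a"))))             # False: a is not a variable

A function defined by recursion on this encoding is structural induction in code: one case for the atomic formulas and one per formation rule.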


14.4 Satisfaction
We can already skip ahead to the semantics of first-order logic once we know
what formulas are: here, the basic definition is that of a structure. For our
simple language, a structure M has just three components: a non-empty set
|M| called the domain, what a picks out in M, and what P is true of in M.
The object picked out by a is denoted aM and the set of things P is true of
by P M . A structure M consists of just these three things: |M|, aM ∈ |M| and
P M ⊆ |M|. The general case will be more complicated, since there will be
many predicate symbols and constant symbols, the predicate symbols can have more than one place, and there will also be function symbols.
This is enough to give a definition of satisfaction for formulas that don’t
contain variables. The idea is to give an inductive definition that mirrors the
way we have defined formulas. We specify when an atomic formula is satisfied
in M, and then when, e.g., ¬φ is satisfied in M on the basis of whether or not
φ is satisfied in M. E.g., we could define:
1. P (a) is satisfied in M iff aM ∈ P M .
2. ¬φ is satisfied in M iff φ is not satisfied in M.
3. (φ ∧ ψ) is satisfied in M iff φ is satisfied in M, and ψ is satisfied in M as
well.
Let’s say that |M| = {0, 1, 2}, aM = 1, and P M = {1, 2}. This definition would
tell us that P (a) is satisfied in M (since aM = 1 ∈ {1, 2} = P M ). It tells
us further that ¬P (a) is not satisfied in M, and that in turn ¬¬P (a) is and
(¬P (a) ∧ P (a)) is not satisfied, and so on.
The trouble comes when we want to give a definition for the quantifiers:
we’d like to say something like, “∃v0 P (v0 ) is satisfied iff P (v0 ) is satisfied.”
But the structure M doesn’t tell us what to do about variables. What we
actually want to say is that P (v0 ) is satisfied for some value of v0 . To make
this precise we need a way to assign elements of |M| not just to a but also
to v0 . To this end, we introduce variable assignments. A variable assignment
is simply a function s that maps variables to elements of |M| (in our example,
to one of 0, 1, or 2). Since we don’t know beforehand which variables might
appear in a formula we can’t limit which variables s assigns values to. The
simple solution is to require that s assigns values to all variables v0 , v1 , . . .
We’ll just use only the ones we need.
Instead of defining satisfaction of formulas just relative to a structure, we’ll
define it relative to a structure M and a variable assignment s, and write
M, s ⊨ φ for short. Our definition will now include an additional clause to deal
with atomic formulas containing variables:
1. M, s ⊨ P (a) iff aM ∈ P M .
2. M, s ⊨ P (vi ) iff s(vi ) ∈ P M .
3. M, s ⊨ ¬φ iff not M, s ⊨ φ.


4. M, s ⊨ (φ ∧ ψ) iff M, s ⊨ φ and M, s ⊨ ψ.

Ok, this solves one problem: we can now say when M satisfies P (v0 ) for the
value s(v0 ). To get the definition right for ∃v0 P (v0 ) we have to do one more
thing: We want to have that M, s ⊨ ∃v0 P (v0 ) iff M, s′ ⊨ P (v0 ) for some way
s′ of assigning a value to v0 . But the value assigned to v0 does not necessarily
have to be the value that s(v0 ) picks out. We’ll introduce a notation for that:
if m ∈ |M|, then we let s[m/v0 ] be the assignment that is just like s (for all
variables other than v0 ), except to v0 it assigns m. Now our definition can be:

5. M, s ⊨ ∃vi φ iff M, s[m/vi ] ⊨ φ for some m ∈ |M|.

Does it work out? Let’s say we let s(vi ) = 0 for all i ∈ N. M, s ⊨ ∃v0 P (v0 ) iff
there is an m ∈ |M| so that M, s[m/v0 ] ⊨ P (v0 ). And there is: we can choose
m = 1 or m = 2. Note that this is true even if the value s(v0 ) assigned to v0
by s itself—in this case, 0—doesn’t do the job. We have M, s[1/v0 ] ⊨ P (v0 )
but not M, s ⊨ P (v0 ).
If this looks confusing and cumbersome: it is. But the added complexity is
required to give a precise, inductive definition of satisfaction for all formulas,
and we need something like it to precisely define the semantic notions. There
are other ways of doing it, but they are all equally (in)elegant.
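Cumbersome or not, the definition is short when programmed. A sketch for the simple language above, assuming a structure given as a triple (|M|, aM, P M) and an assignment s given as a dictionary; {**s, x: m} plays the role of s[m/x]. The encoding is our own.

def sat(M, s, f):
    domain, a_M, P_M = M
    op = f[0]
    if op == "P":                         # atomic: P(a) or P(v_i)
        t = f[1]
        return (a_M if t == "a" else s[t]) in P_M
    if op == "not":
        return not sat(M, s, f[1])
    if op == "and":
        return sat(M, s, f[1]) and sat(M, s, f[2])
    if op == "exists":                    # M, s |= ∃x φ iff some m in |M| works
        x, body = f[1], f[2]
        return any(sat(M, {**s, x: m}, body) for m in domain)
    raise ValueError(op)

M = ({0, 1, 2}, 1, {1, 2})                # |M|, a^M, P^M from the example above
s = {"v0": 0}
print(sat(M, s, ("P", "v0")))                     # False: s(v0) = 0 is not in P^M
print(sat(M, s, ("exists", "v0", ("P", "v0"))))   # True: choose m = 1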


14.5 Sentences
Ok, now we have a (sketch of a) definition of satisfaction (“true in”) for struc-
tures and formulas. But it needs this additional bit—a variable assignment—
and what we wanted is a definition of sentences. How do we get rid of assign-
ments, and what are sentences?
You probably remember a discussion in your first introduction to formal
logic about the relation between variables and quantifiers. A quantifier is al-
ways followed by a variable, and then in the part of the sentence to which that
quantifier applies (its “scope”), we understand that the variable is “bound”
by that quantifier. In formulas it was not required that every variable has a
matching quantifier, and variables without matching quantifiers are “free” or
“unbound.” We will take sentences to be all those formulas that have no free
variables.
Again, the intuitive idea of when an occurrence of a variable in a formula φ
is bound, which quantifier binds it, and when it is free, is not difficult to get.
You may have learned a method for testing this, perhaps involving counting
parentheses. We have to insist on a precise definition—and because we have
defined formulas by induction, we can give a definition of the free and bound
occurrences of a variable x in a formula φ also by induction. E.g., it might
look like this for our simplified language:


1. If φ is atomic, all occurrences of x in it are free (that is, the occurrence of x in P (x) is free).

2. If φ is of the form ¬ψ, then an occurrence of x in ¬ψ is free iff the corresponding occurrence of x is free in ψ (that is, the free occurrences of variables in ψ are exactly the corresponding occurrences in ¬ψ).

3. If φ is of the form (ψ ∧ χ), then an occurrence of x in (ψ ∧ χ) is free iff the corresponding occurrence of x is free in ψ or in χ.

4. If φ is of the form ∃x ψ, then no occurrence of x in φ is free; if it is of the form ∃y ψ where y is a different variable than x, then an occurrence of x in ∃y ψ is free iff the corresponding occurrence of x is free in ψ.

Once we have a precise definition of free and bound occurrences of variables, we can simply say: a sentence is any formula without free occurrences of variables.
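For our simplified language the clauses above amount to a short recursion. A sketch, with the same tuple encoding as before (our own convention):

def free_vars(f):
    op = f[0]
    if op == "P":                       # clause 1: occurrences in atoms are free
        return {f[1]} if f[1] != "a" else set()
    if op == "not":                     # clause 2
        return free_vars(f[1])
    if op == "and":                     # clause 3
        return free_vars(f[1]) | free_vars(f[2])
    if op == "exists":                  # clause 4: the quantified variable is bound
        return free_vars(f[2]) - {f[1]}
    raise ValueError(op)

def is_sentence(f):
    return not free_vars(f)

print(is_sentence(("exists", "v0", ("P", "v0"))))                        # True
print(is_sentence(("and", ("P", "v0"), ("exists", "v0", ("P", "v0")))))  # False: the first v0 is free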


14.6 Semantic Notions


We mentioned above that when we consider whether M, s ⊨ φ holds, we (for
convenience) let s assign values to all variables, but only the values it assigns
to variables in φ are used. In fact, it’s only the values of free variables in φ
that matter. Of course, because we’re careful, we are going to prove this fact.
Since sentences have no free variables, s doesn’t matter at all when it comes to
whether or not they are satisfied in a structure. So, when φ is a sentence we
can define M ⊨ φ to mean “M, s ⊨ φ for all s,” which as it happens is true iff
M, s ⊨ φ for at least one s. We need to introduce variable assignments to get
a working definition of satisfaction for formulas, but for sentences, satisfaction
is independent of the variable assignments.
Once we have a definition of “M ⊨ φ,” we know what “case” and “true
in” mean as far as sentences of first-order logic are concerned. On the basis
of the definition of M ⊨ φ for sentences we can then define the basic semantic
notions of validity, entailment, and satisfiability. A sentence is valid, ⊨ φ, if
every structure satisfies it. It is entailed by a set of sentences, Γ ⊨ φ, if every
structure that satisfies all the sentences in Γ also satisfies φ. And a set of
sentences is satisfiable if some structure satisfies all sentences in it at the same
time.
Because formulas are inductively defined, and satisfaction is in turn defined
by induction on the structure of formulas, we can use induction to prove prop-
erties of our semantics and to relate the semantic notions defined. We’ll collect
and prove some of these properties, partly because they are individually inter-
esting, but mainly because many of them will come in handy when we go on
to investigate the relation between semantics and derivation systems. In order to do so, we’ll also have to define (precisely, i.e., by induction) some syntactic
notions and operations we haven’t mentioned yet.


14.7 Substitution
We’ll discuss an example to illustrate how things hang together, and how the
development of syntax and semantics lays the foundation for our more advanced
investigations later. Our derivation systems should let us derive P (a) from
∀v0 P (v0 ). Maybe we even want to state this as a rule of inference. However,
to do so, we must be able to state it in the most general terms: not just for
P , a, and v0 , but for any formula φ, and term t, and variable x. (Recall that
constant symbols are terms, but we’ll consider also more complicated terms
built from constant symbols and function symbols.) So we want to be able
to say something like, “whenever you have derived ∀x φ(x) you are justified
in inferring φ(t)—the result of removing ∀x and replacing x by t.” But what
exactly does “replacing x by t” mean? What is the relation between φ(x)
and φ(t)? Does this always work?
To make this precise, we define the operation of substitution. Substitution is
actually tricky, because we can’t just replace all x’s in φ by t, and not every t
can be substituted for any x. We’ll deal with this, again, using inductive
definitions. But once this is done, specifying an inference rule as “infer φ(t)
from ∀x φ(x)” becomes a precise definition. Moreover, we’ll be able to show
that this is a good inference rule in the sense that ∀x φ(x) entails φ(t). But to
prove this, we have to again prove something that may at first glance prompt
you to ask “why are we doing this?” That ∀x φ(x) entails φ(t) relies on the fact
that whether or not M ⊨ φ(t) holds depends only on the value of the term t,
i.e., if we let m be whatever element of |M| is picked out by t, then M, s ⊨ φ(t)
iff M, s[m/x] ⊨ φ(x). This holds even when t contains variables, but we’ll have
to be careful with how exactly we state the result.
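As a first approximation, here is a naive substitution sketch in the tuple encoding used earlier (our own convention). The guard in the quantifier case is where the trickiness the text mentions begins; a full definition would also rename bound variables so that no variable of t is captured.

def substitute(f, x, t):
    op = f[0]
    if op == "P":
        return ("P", t) if f[1] == x else f
    if op == "not":
        return ("not", substitute(f[1], x, t))
    if op == "and":
        return ("and", substitute(f[1], x, t), substitute(f[2], x, t))
    if op == "exists":
        if f[1] == x:              # x is bound here: leave the body untouched
            return f
        return ("exists", f[1], substitute(f[2], x, t))
    raise ValueError(op)

# "replace x by b in (P(x) ∧ ∃x P(x))", with x = v0 and t = a:
f = ("and", ("P", "v0"), ("exists", "v0", ("P", "v0")))
print(substitute(f, "v0", "a"))
# ('and', ('P', 'a'), ('exists', 'v0', ('P', 'v0')))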


14.8 Models and Theories


Once we’ve defined the syntax and semantics of first-order logic, we can get to
work investigating the properties of structures and the semantic notions. We
can also define derivation systems, and investigate those. For a set of sentences,
we can ask: what structures make all the sentences in that set true? Given a
set of sentences Γ , a structure M that satisfies them is called a model of Γ .
We might start from Γ and try to find its models—what do they look like?
How big or small do they have to be? But we might also start with a single
structure or collection of structures and ask: what sentences are true in them?
Are there sentences that characterize these structures in the sense that they, and only they, are true in them? These kinds of questions are the domain of
model theory. They also underlie the axiomatic method: describing a collection
of structures by a set of sentences, the axioms of a theory. This is made possible
by the observation that exactly those sentences entailed in first-order logic by
the axioms are true in all models of the axioms.
As a very simple example, consider preorders. A preorder is a relation R
on some set A which is both reflexive and transitive. A set A with a two-place
relation R ⊆ A × A on it is exactly what we would need to give a structure for
a first-order language with a single two-place relation symbol P : we would set
|M| = A and P M = R. Since R is a preorder, it is reflexive and transitive, and
we can find a set Γ of sentences of first-order logic that say this:

∀v0 P (v0 , v0 )
∀v0 ∀v1 ∀v2 ((P (v0 , v1 ) ∧ P (v1 , v2 )) → P (v0 , v2 ))

These sentences are just the symbolizations of “for any x, Rxx” (R is reflexive)
and “whenever Rxy and Ryz then also Rxz” (R is transitive). We see that
a structure M is a model of these two sentences Γ iff R (i.e., P M ) is a preorder
on A (i.e., |M|). In other words, the models of Γ are exactly the preorders. Any
property of all preorders that can be expressed in the first-order language with
just P as predicate symbol (like reflexivity and transitivity above), is entailed
by the two sentences in Γ and vice versa. So anything we can prove about
models of Γ we have proved about all preorders.
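For a finite structure this can be checked directly. A sketch, assuming |M| = A and P M = R are given as a Python set and a set of pairs; the two sentences of Γ hold in M iff the two checks below succeed.

def is_model_of_Gamma(A, R):
    # ∀v0 P(v0, v0): R is reflexive on A
    reflexive = all((x, x) in R for x in A)
    # ∀v0 ∀v1 ∀v2 ((P(v0, v1) ∧ P(v1, v2)) → P(v0, v2)): R is transitive
    transitive = all((x, z) in R
                     for x in A for y in A for z in A
                     if (x, y) in R and (y, z) in R)
    return reflexive and transitive

A = {0, 1, 2}
R = {(0, 0), (1, 1), (2, 2), (0, 1), (1, 2), (0, 2)}
print(is_model_of_Gamma(A, R))  # True: R is a preorder on A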
For any particular theory and class of models (such as Γ and all preorders),
there will be interesting questions about what can be expressed in the corre-
sponding first-order language, and what cannot be expressed. There are some
properties of structures that are interesting for all languages and classes of mod-
els, namely those concerning the size of the domain. One can always express,
for instance, that the domain contains exactly n elements, for any n ∈ Z+ . One
can also express, using a set of infinitely many sentences, that the domain is
infinite. But one cannot express that the domain is finite, or that the domain
is non-enumerable. These results about the limitations of first-order languages
are consequences of the compactness and Löwenheim–Skolem theorems.


14.9 Soundness and Completeness


We’ll also introduce derivation systems for first-order logic. There are many
derivation systems that logicians have developed, but they all define the same
derivability relation between sentences. We say that Γ derives φ, Γ ⊢ φ,
if there is a derivation of a certain precisely defined sort. Derivations are
always finite arrangements of symbols—perhaps a list of sentences, or some
more complicated structure. The purpose of derivation systems is to provide
a tool to determine if a sentence is entailed by some set Γ . In order to serve
that purpose, it must be true that Γ ⊨ φ if, and only if, Γ ⊢ φ.



If Γ ⊢ φ but not Γ ⊨ φ, our derivation system would be too strong; it would prove too much. The property that if Γ ⊢ φ then Γ ⊨ φ is called soundness, and it is a minimal requirement on any good derivation system. On the other hand, if Γ ⊨ φ but not Γ ⊢ φ, then our derivation system is too weak; it doesn’t prove enough. The property that if Γ ⊨ φ then Γ ⊢ φ is called completeness.
Soundness is usually relatively easy to prove (by induction on the structure of
derivations, which are inductively defined). Completeness is harder to prove.
Soundness and completeness have a number of important consequences.
If a set of sentences Γ derives a contradiction (such as φ ∧ ¬φ) it is called
inconsistent. Inconsistent Γ s cannot have any models; they are unsatisfiable.
From completeness the converse follows: any Γ that is not inconsistent—or, as
we will say, consistent—has a model. In fact, this is equivalent to completeness,
and is the form of completeness we will actually prove. It is a deep and perhaps
surprising result: the mere fact that you cannot prove φ ∧ ¬φ from Γ guarantees that
there is a structure that is as Γ describes it. So completeness gives an answer
to the question: which sets of sentences have models? Answer: all and only
consistent sets do.
The soundness and completeness theorems have two important consequences:
the compactness and the Löwenheim–Skolem theorem. These are important
results in the theory of models, and can be used to establish many interesting
results. We’ve already mentioned two: first-order logic cannot express that the
domain of a structure is finite or that it is non-enumerable.
Historically, all of this—how to define syntax and semantics of first-order
logic, how to define good derivation systems, how to prove that they are sound
and complete, getting clear about what can and cannot be expressed in first-
order languages—took a long time to figure out and get right. We now know
how to do it, but going through all the details can still be confusing and tedious.
But it’s also important, because the methods developed here for the formal
language of first-order logic are applied all over the place in logic, computer
science, and linguistics. So working through the details pays off in the long
run.

Chapter 15

Syntax of First-Order Logic


15.1 Introduction
In order to develop the theory and metatheory of first-order logic, we must
first define the syntax and semantics of its expressions. The expressions of
first-order logic are terms and formulas. Terms are formed from variables,
constant symbols, and function symbols. Formulas, in turn, are formed from
predicate symbols together with terms (these form the smallest, “atomic” for-
mulas), and then from atomic formulas we can form more complex ones using
logical connectives and quantifiers. There are many different ways to set down
the formation rules; we give just one possible one. Other systems will choose
different symbols, will select different sets of connectives as primitive, and will
use parentheses differently (or even not at all, as in the case of so-called Polish
notation). What all approaches have in common, though, is that the formation
rules define the set of terms and formulas inductively. If done properly, every
expression can be formed in essentially only one way according to the formation
rules. Because the inductive definition yields uniquely readable expressions, we
can give meanings to these expressions by the same method—inductive definition.


15.2 First-Order Languages


Expressions of first-order logic are built up from a basic vocabulary containing
variables, constant symbols, predicate symbols and sometimes function symbols.
From them, together with logical connectives, quantifiers, and punctuation
symbols such as parentheses and commas, terms and formulas are formed.
Informally, predicate symbols are names for properties and relations, con-
stant symbols are names for individual objects, and function symbols are names
for mappings. These, except for the identity predicate =, are the non-logical
symbols and together make up a language. Any first-order language L is deter-
mined by its non-logical symbols. In the most general case, L contains infinitely
many symbols of each kind.
In the general case, we make use of the following symbols in first-order logic:

1. Logical symbols
a) Logical connectives: ¬ (negation), ∧ (conjunction), ∨ (disjunction),
→ (conditional), ↔ (biconditional), ∀ (universal quantifier), ∃ (ex-
istential quantifier).
b) The propositional constant for falsity ⊥.
c) The propositional constant for truth ⊤.
d) The two-place identity predicate =.


e) A denumerable set of variables: v0 , v1 , v2 , . . .

2. Non-logical symbols, making up the standard language of first-order logic

a) A denumerable set of n-place predicate symbols for each n > 0: An0 , An1 , An2 , . . .
b) A denumerable set of constant symbols: c0 , c1 , c2 , . . . .
c) A denumerable set of n-place function symbols for each n > 0: f0n ,
f1n , f2n , . . .

3. Punctuation marks: (, ), and the comma.

Most of our definitions and results will be formulated for the full standard
language of first-order logic. However, depending on the application, we may
also restrict the language to only a few predicate symbols, constant symbols,
and function symbols.

Example 15.1. The language LA of arithmetic contains a single two-place


predicate symbol <, a single constant symbol 0, one one-place function sym-
bol ′, and two two-place function symbols + and ×.

Example 15.2. The language of set theory LZ contains only the single two-
place predicate symbol ∈.

Example 15.3. The language of orders L≤ contains only the two-place pred-
icate symbol ≤.

Again, these are conventions: officially, these are just aliases, e.g., <, ∈,
and ≤ are aliases for A20 , 0 for c0 , ′ for f01 , + for f02 , × for f12 .
You may be familiar with different terminology and symbols than the ones
we use above. Logic texts (and teachers) commonly use ∼, ¬, or ! for “nega-
tion”, ∧, ·, or & for “conjunction”. Commonly used symbols for the “con-
ditional” or “implication” are →, ⇒, and ⊃. Symbols for “biconditional,”
“bi-implication,” or “(material) equivalence” are ↔, ⇔, and ≡. The ⊥ sym-
bol is variously called “falsity,” “falsum,” “absurdity,” or “bottom.” The ⊤
symbol is variously called “truth,” “verum,” or “top.”
It is conventional to use lower case letters (e.g., a, b, c) from the beginning
of the Latin alphabet for constant symbols (sometimes called names), and lower
case letters from the end (e.g., x, y, z) for variables. Quantifiers combine with
variables, e.g., x; notational variations include ∀x, (∀x), (x), Πx, ⋀x for the
universal quantifier and ∃x, (∃x), (Ex), Σx, ⋁x for the existential quantifier.
We might treat all the propositional operators and both quantifiers as prim-
itive symbols of the language. We might instead choose a smaller stock of
primitive symbols and treat the other logical operators as defined. “Truth
functionally complete” sets of Boolean operators include {¬, ∨}, {¬, ∧}, and


{¬, →}—these can be combined with either quantifier for an expressively com-
plete first-order language.
You may be familiar with two other logical operators: the Sheffer stroke |
(named after Henry Sheffer), and Peirce’s arrow ↓, also known as Quine’s dag-
ger. When given their usual readings of “nand” and “nor” (respectively), these
operators are truth functionally complete by themselves.


15.3 Terms and Formulas


Once a first-order language L is given, we can define expressions built up from
the basic vocabulary of L. These include in particular terms and formulas.
Definition 15.4 (Terms). The set of terms Trm(L) of L is defined induc-
tively by:
1. Every variable is a term.
2. Every constant symbol of L is a term.
3. If f is an n-place function symbol and t1 , . . . , tn are terms, then f (t1 , . . . , tn )
is a term.
4. Nothing else is a term.
A term containing no variables is a closed term.

The constant symbols appear in our specification of the language and the explanation
terms as a separate category of symbols, but they could instead have been
included as zero-place function symbols. We could then do without the second
clause in the definition of terms. We just have to understand f (t1 , . . . , tn ) as
just f by itself if n = 0.
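To make the inductive character of this definition concrete, here is a minimal
sketch in Python (our own illustration, not part of the official definitions; the
tuple encoding and the name is_term are ours). Terms are represented as nested
tuples, and the function checks the clauses of Definition 15.4 recursively.

    def is_term(t):
        # Terms are nested tuples (our encoding):
        #   ('var', 'v0'), ('const', 'c0'), or ('fun', 'f', t1, ..., tn)
        if t[0] in ('var', 'const'):
            return len(t) == 2          # clauses (1) and (2)
        if t[0] == 'fun':
            # clause (3): every argument must itself be a term
            return len(t) >= 3 and all(is_term(arg) for arg in t[2:])
        return False                    # clause (4): nothing else is a term

    # f01(f02(c0, v0)) is a term:
    t = ('fun', 'f01', ('fun', 'f02', ('const', 'c0'), ('var', 'v0')))
    assert is_term(t)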
Definition 15.5 (Formulas). The set of formulas Frm(L) of the language L
is defined inductively as follows:
1. ⊥ is an atomic formula.
2. ⊤ is an atomic formula.
3. If R is an n-place predicate symbol of L and t1 , . . . , tn are terms of L,
then R(t1 , . . . , tn ) is an atomic formula.
4. If t1 and t2 are terms of L, then =(t1 , t2 ) is an atomic formula.
5. If φ is a formula, then ¬φ is a formula.
6. If φ and ψ are formulas, then (φ ∧ ψ) is a formula.
7. If φ and ψ are formulas, then (φ ∨ ψ) is a formula.


8. If φ and ψ are formulas, then (φ → ψ) is a formula.


9. If φ and ψ are formulas, then (φ ↔ ψ) is a formula.
10. If φ is a formula and x is a variable, then ∀x φ is a formula.
11. If φ is a formula and x is a variable, then ∃x φ is a formula.
12. Nothing else is a formula.

The definitions of the set of terms and that of formulas are inductive defi-
nitions. Essentially, we construct the set of formulas in infinitely many stages.
In the initial stage, we pronounce all atomic formulas to be formulas; this
corresponds to the first few cases of the definition, i.e., the cases for ⊤, ⊥,
R(t1 , . . . , tn ) and =(t1 , t2 ). “Atomic formula” thus means any formula of this
form.
The other cases of the definition give rules for constructing new formulas
out of formulas already constructed. At the second stage, we can use them to
construct formulas out of atomic formulas. At the third stage, we construct
new formulas from the atomic formulas and those obtained in the second stage,
and so on. A formula is anything that is eventually constructed at such a stage,
and nothing else.
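The formula clauses can be rendered the same way. The following Python
sketch extends the tuple encoding used for terms above (again our own
illustration; it reuses is_term from the previous sketch and assumes the
same encoding).

    def is_formula(phi):
        # Formulas (our encoding): ('bot',), ('top',),
        # ('atom', 'R', t1, ..., tn), ('eq', t1, t2), ('not', psi),
        # ('and'/'or'/'imp'/'iff', psi, chi), ('all'/'ex', 'x', psi)
        op = phi[0]
        if op in ('bot', 'top'):
            return len(phi) == 1                        # clauses (1)-(2)
        if op == 'atom':
            return all(is_term(t) for t in phi[2:])     # clause (3)
        if op == 'eq':
            return is_term(phi[1]) and is_term(phi[2])  # clause (4)
        if op == 'not':
            return is_formula(phi[1])                   # clause (5)
        if op in ('and', 'or', 'imp', 'iff'):           # clauses (6)-(9)
            return is_formula(phi[1]) and is_formula(phi[2])
        if op in ('all', 'ex'):                         # clauses (10)-(11)
            return isinstance(phi[1], str) and is_formula(phi[2])
        return False                                    # clause (12)

    # ∀v0 (A10(v0) → ⊥) is a formula:
    phi = ('all', 'v0', ('imp', ('atom', 'A10', ('var', 'v0')), ('bot',)))
    assert is_formula(phi)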
By convention, we write = between its arguments and leave out the paren-
theses: t1 = t2 is an abbreviation for =(t1 , t2 ). Moreover, ¬=(t1 , t2 ) is ab-
breviated as t1 ̸= t2 . When writing a formula (ψ ∗ χ) constructed from ψ, χ
using a two-place connective ∗, we will often leave out the outermost pair of
parentheses and write simply ψ ∗ χ.
Some logic texts require that the variable x must occur in φ in order for
∃x φ and ∀x φ to count as formulas. Nothing bad happens if you don’t require
this, and it makes things easier.
If we work in a language for a specific application, we will often write two-
place predicate symbols and function symbols between the respective terms,
e.g., t1 < t2 and (t1 + t2 ) in the language of arithmetic and t1 ∈ t2 in the
language of set theory. The successor function in the language of arithmetic
is even written conventionally after its argument: t′ . Officially, however, these
are just conventional abbreviations for A20 (t1 , t2 ), f02 (t1 , t2 ), A20 (t1 , t2 ) and f01 (t),
respectively.
Definition 15.6 (Syntactic identity). The symbol ≡ expresses syntactic iden-
tity between strings of symbols, i.e., φ ≡ ψ iff φ and ψ are strings of symbols
of the same length and which contain the same symbol in each place.

The ≡ symbol may be flanked by strings obtained by concatenation, e.g.,


φ ≡ (ψ ∨ χ) means: the string of symbols φ is the same string as the one
obtained by concatenating an opening parenthesis, the string ψ, the ∨ symbol,
the string χ, and a closing parenthesis, in this order. If this is the case, then
we know that the first symbol of φ is an opening parenthesis, φ contains ψ as a
substring (starting at the second symbol), that substring is followed by ∨, etc.


As terms and formulas are built up from basic elements via inductive def-
initions, we can use the following induction principles to prove things about
them.

Lemma 15.7 (Principle of induction on terms). Let L be a first-order
language. If some property P is such that

1. it holds for every variable v,

2. it holds for every constant symbol a of L, and

3. it holds for f (t1 , . . . , tn ) whenever it holds for t1 , . . . , tn and f is an


n-place function symbol of L

(assuming t1 , . . . , tn are terms of L), then P holds for every term in Trm(L).

Problem 15.1. Prove Lemma 15.7.

Lemma 15.8 (Principle of induction on formulas). Let L be a first-
order language. If some property P holds for all the atomic formulas and
is such that

1. it holds for ¬φ whenever it holds for φ;

2. it holds for (φ ∧ ψ) whenever it holds for φ and ψ;

3. it holds for (φ ∨ ψ) whenever it holds for φ and ψ;

4. it holds for (φ → ψ) whenever it holds for φ and ψ;

5. it holds for (φ ↔ ψ) whenever it holds for φ and ψ;

6. it holds for ∃x φ whenever it holds for φ;

7. it holds for ∀x φ whenever it holds for φ;

(assuming φ and ψ are formulas of L), then P holds for all formulas in Frm(L).


15.4 Unique Readability


The way we defined formulas guarantees that every formula has a unique read-
ing, i.e., there is essentially only one way of constructing it according to our
formation rules for formulas and only one way of “interpreting” it. If this were
not so, we would have ambiguous formulas, i.e., formulas that have more than
one reading or interpretation—and that is clearly something we want to avoid.
But more importantly, without this property, most of the definitions and proofs
we are going to give will not go through.


Perhaps the best way to make this clear is to see what would happen if we
had given bad rules for forming formulas that would not guarantee unique read-
ability. For instance, we could have forgotten the parentheses in the formation
rules for connectives, e.g., we might have allowed this:
If φ and ψ are formulas, then so is φ → ψ.
Starting from an atomic formula θ, this would allow us to form θ → θ. From
this, together with θ, we would get θ → θ → θ. But there are two ways to do
this:
1. We take θ to be φ and θ → θ to be ψ.
2. We take φ to be θ → θ and ψ is θ.
Correspondingly, there are two ways to “read” the formula θ → θ → θ. It is of
the form ψ → χ where ψ is θ and χ is θ → θ, but it is also of the form ψ → χ
with ψ being θ → θ and χ being θ.
If this happens, our definitions will not always work. For instance, when we
define the main operator of a formula, we say: in a formula of the form ψ → χ,
the main operator is the indicated occurrence of →. But if we can match the
formula θ → θ → θ with ψ → χ in the two different ways mentioned above,
then in one case we get the first occurrence of → as the main operator, and
in the second case the second occurrence. But we intend the main operator to
be a function of the formula, i.e., every formula must have exactly one main
operator occurrence.
Lemma 15.9. The number of left parentheses in a formula φ is equal to the
number of right parentheses.

Proof. We prove this by induction on the way φ is constructed. This requires


two things: (a) We have to prove first that all atomic formulas have the prop-
erty in question (the induction basis). (b) Then we have to prove that when
we construct new formulas out of given formulas, the new formulas have the
property provided the old ones do.
Let l(φ) be the number of left parentheses, and r(φ) the number of right
parentheses in φ, and l(t) and r(t) similarly the number of left and right paren-
theses in a term t.
Problem 15.2. Prove that for any term t, l(t) = r(t).

1. φ ≡ ⊥: φ has 0 left and 0 right parentheses.


2. φ ≡ ⊤: φ has 0 left and 0 right parentheses.
3. φ ≡ R(t1 , . . . , tn ): l(φ) = 1 + l(t1 ) + · · · + l(tn ) = 1 + r(t1 ) + · · · + r(tn ) =
r(φ). Here we make use of the fact, left as an exercise, that l(t) = r(t)
for any term t.
4. φ ≡ t1 = t2 : l(φ) = l(t1 ) + l(t2 ) = r(t1 ) + r(t2 ) = r(φ).


5. φ ≡ ¬ψ: By induction hypothesis, l(ψ) = r(ψ). Thus l(φ) = l(ψ) =


r(ψ) = r(φ).
6. φ ≡ (ψ ∗ χ): By induction hypothesis, l(ψ) = r(ψ) and l(χ) = r(χ).
Thus l(φ) = 1 + l(ψ) + l(χ) = 1 + r(ψ) + r(χ) = r(φ).
7. φ ≡ ∀x ψ: By induction hypothesis, l(ψ) = r(ψ). Thus, l(φ) = l(ψ) =
r(ψ) = r(φ).
8. φ ≡ ∃x ψ: Similarly.
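The induction in this proof can be mirrored computationally: render a formula
as a string in the official notation and count parentheses. A small Python
sketch, continuing our tuple encoding (render and render_t are our own names):

    def render_t(t):
        if t[0] in ('var', 'const'):
            return t[1]
        return t[1] + '(' + ','.join(render_t(a) for a in t[2:]) + ')'

    def render(phi):
        op = phi[0]
        if op == 'bot': return '⊥'
        if op == 'top': return '⊤'
        if op == 'atom':
            return phi[1] + '(' + ','.join(render_t(t) for t in phi[2:]) + ')'
        if op == 'eq':
            return '=(' + render_t(phi[1]) + ',' + render_t(phi[2]) + ')'
        if op == 'not':
            return '¬' + render(phi[1])
        if op in ('and', 'or', 'imp', 'iff'):
            sym = {'and': '∧', 'or': '∨', 'imp': '→', 'iff': '↔'}[op]
            return '(' + render(phi[1]) + ' ' + sym + ' ' + render(phi[2]) + ')'
        return ('∀' if op == 'all' else '∃') + phi[1] + ' ' + render(phi[2])

    s = render(('imp', ('atom', 'R', ('var', 'x')),
                       ('all', 'x', ('atom', 'R', ('var', 'x')))))
    assert s.count('(') == s.count(')')   # the property of Lemma 15.9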

Definition 15.10 (Proper prefix). A string of symbols ψ is a proper prefix


of a string of symbols φ if concatenating ψ and a non-empty string of symbols
yields φ.

Lemma 15.11. If φ is a formula, and ψ is a proper prefix of φ, then ψ is
not a formula.

Proof. Exercise.

Problem 15.3. Prove Lemma 15.11.

Proposition 15.12. If φ is an atomic formula, then it satisfies one, and only
one of the following conditions.
1. φ ≡ ⊥.
2. φ ≡ ⊤.
3. φ ≡ R(t1 , . . . , tn ) where R is an n-place predicate symbol, t1 , . . . , tn are
terms, and each of R, t1 , . . . , tn is uniquely determined.
4. φ ≡ t1 = t2 where t1 and t2 are uniquely determined terms.

Proof. Exercise.

Problem 15.4. Prove Proposition 15.12 (Hint: Formulate and prove a version
of Lemma 15.11 for terms.)

Proposition 15.13 (Unique Readability). Every formula satisfies one, and


only one of the following conditions.
1. φ is atomic.
2. φ is of the form ¬ψ.
3. φ is of the form (ψ ∧ χ).
4. φ is of the form (ψ ∨ χ).
5. φ is of the form (ψ → χ).


6. φ is of the form (ψ ↔ χ).


7. φ is of the form ∀x ψ.
8. φ is of the form ∃x ψ.
Moreover, in each case ψ, or ψ and χ, are uniquely determined. This means
that, e.g., there are no different pairs ψ, χ and ψ ′ , χ′ so that φ is both of the
form (ψ → χ) and (ψ ′ → χ′ ).

Proof. The formation rules require that if a formula is not atomic, it must
start with an opening parenthesis (, ¬, or a quantifier. On the other hand,
every formula that starts with one of the following symbols must be atomic:
a predicate symbol, a function symbol, a constant symbol, ⊥, ⊤.
So we really only have to show that if φ is of the form (ψ ∗ χ) and also of
the form (ψ ′ ∗′ χ′ ), then ψ ≡ ψ ′ , χ ≡ χ′ , and ∗ = ∗′ .
So suppose both φ ≡ (ψ ∗ χ) and φ ≡ (ψ ′ ∗′ χ′ ). Then either ψ ≡ ψ ′ or not.
If it is, clearly ∗ = ∗′ and χ ≡ χ′ , since they then are substrings of φ that begin
in the same place and are of the same length. The other case is ψ ̸≡ ψ ′ . Since
ψ and ψ ′ are both substrings of φ that begin at the same place, one must be
a proper prefix of the other. But this is impossible by Lemma 15.11.


15.5 Main operator of a Formula


It is often useful to talk about the last operator used in constructing a for-
mula φ. This operator is called the main operator of φ. Intuitively, it is the
“outermost” operator of φ. For example, the main operator of ¬φ is ¬, the
main operator of (φ ∨ ψ) is ∨, etc.
Definition 15.14 (Main operator). The main operator of a formula φ is
defined as follows:
1. φ is atomic: φ has no main operator.
2. φ ≡ ¬ψ: the main operator of φ is ¬.
3. φ ≡ (ψ ∧ χ): the main operator of φ is ∧.
4. φ ≡ (ψ ∨ χ): the main operator of φ is ∨.
5. φ ≡ (ψ → χ): the main operator of φ is →.
6. φ ≡ (ψ ↔ χ): the main operator of φ is ↔.
7. φ ≡ ∀x ψ: the main operator of φ is ∀.
8. φ ≡ ∃x ψ: the main operator of φ is ∃.


In each case, we intend the specific indicated occurrence of the main oper-
ator in the formula. For instance, since the formula ((θ → α) → (α → θ)) is of
the form (ψ → χ) where ψ is (θ → α) and χ is (α → θ), the second occurrence
of → is the main operator.
This is a recursive definition of a function which maps all non-atomic formu-
las to their main operator occurrence. Because of the way formulas are defined
inductively, every formula φ satisfies one of the cases in Definition 15.14. This
guarantees that for each non-atomic formula φ a main operator exists. Because
each formula satisfies only one of these conditions, and because the smaller for-
mulas from which φ is constructed are uniquely determined in each case, the
main operator occurrence of φ is unique, and so we have defined a function.
We call formulas by the names in Table 15.1 depending on which symbol
their main operator is.
Main operator   Type of formula        Example
none            atomic (formula)       ⊥, ⊤, R(t1 , . . . , tn ), t1 = t2
¬               negation               ¬φ
∧               conjunction            (φ ∧ ψ)
∨               disjunction            (φ ∨ ψ)
→               conditional            (φ → ψ)
↔               biconditional          (φ ↔ ψ)
∀               universal (formula)    ∀x φ
∃               existential (formula)  ∃x φ

Table 15.1: Main operator and names of formulas
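In the tuple encoding of the earlier sketches, the main operator is simply the
outermost constructor, which makes the functional character of Definition 15.14
explicit (a sketch; MAIN_OP and main_operator are our own names):

    MAIN_OP = {'not': '¬', 'and': '∧', 'or': '∨', 'imp': '→',
               'iff': '↔', 'all': '∀', 'ex': '∃'}

    def main_operator(phi):
        # Returns the main operator symbol; None for atomic formulas.
        return MAIN_OP.get(phi[0])

    # ((θ → α) → (α → θ)): the main operator is the outermost →,
    # i.e., the second occurrence in the written formula.
    theta, alpha = ('top',), ('bot',)
    phi = ('imp', ('imp', theta, alpha), ('imp', alpha, theta))
    assert main_operator(phi) == '→'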

15.6 Subformulas
It is often useful to talk about the formulas that “make up” a given formula.
We call these its subformulas. Any formula counts as a subformula of itself; a
subformula of φ other than φ itself is a proper subformula.
Definition 15.15 (Immediate Subformula). If φ is a formula, the imme-
diate subformulas of φ are defined inductively as follows:
1. Atomic formulas have no immediate subformulas.
2. φ ≡ ¬ψ: The only immediate subformula of φ is ψ.
3. φ ≡ (ψ ∗ χ): The immediate subformulas of φ are ψ and χ (∗ is any one
of the two-place connectives).
4. φ ≡ ∀x ψ: The only immediate subformula of φ is ψ.
5. φ ≡ ∃x ψ: The only immediate subformula of φ is ψ.

Definition 15.16 (Proper Subformula). If φ is a formula, the proper sub-


formulas of φ are defined recursively as follows:


1. Atomic formulas have no proper subformulas.

2. φ ≡ ¬ψ: The proper subformulas of φ are ψ together with all proper


subformulas of ψ.

3. φ ≡ (ψ ∗ χ): The proper subformulas of φ are ψ, χ, together with all


proper subformulas of ψ and those of χ.

4. φ ≡ ∀x ψ: The proper subformulas of φ are ψ together with all proper


subformulas of ψ.

5. φ ≡ ∃x ψ: The proper subformulas of φ are ψ together with all proper


subformulas of ψ.

Definition 15.17 (Subformula). The subformulas of φ are φ itself together


with all its proper subformulas.

Note the subtle difference in how we have defined immediate subformulas
and proper subformulas. In the first case, we have directly defined the imme-
diate subformulas of a formula φ for each possible form of φ. It is an explicit
definition by cases, and the cases mirror the inductive definition of the set of
formulas. In the second case, we have also mirrored the way the set of all
formulas is defined, but in each case we have also included the proper subfor-
mulas of the smaller formulas ψ, χ in addition to these formulas themselves.
This makes the definition recursive. In general, a definition of a function on
an inductively defined set (in our case, formulas) is recursive if the cases in the
definition of the function make use of the function itself. To be well defined,
we must make sure, however, that we only ever use the values of the function
for arguments that come “before” the one we are defining—in our case, when
defining “proper subformula” for (ψ ∗ χ) we only use the proper subformulas
of the “earlier” formulas ψ and χ.

Proposition 15.18. Suppose ψ is a subformula of φ and χ is a subformula
of ψ. Then χ is a subformula of φ. In other words, the subformula relation is
transitive.

Problem 15.5. Prove Proposition 15.18.

Proposition 15.19. Suppose φ is a formula with n connectives and quanti-
fiers. Then φ has at most 2n + 1 subformulas.

Problem 15.6. Prove Proposition 15.19.
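The recursive definition of proper subformulas translates into a short function,
and comparing the size of the resulting set with the number of connectives and
quantifiers illustrates the bound of Proposition 15.19. A sketch in our tuple
encoding (subformulas and count_ops are our own names):

    def subformulas(phi):
        subs = {phi}            # every formula is a subformula of itself
        if phi[0] == 'not':
            subs |= subformulas(phi[1])
        elif phi[0] in ('and', 'or', 'imp', 'iff'):
            subs |= subformulas(phi[1]) | subformulas(phi[2])
        elif phi[0] in ('all', 'ex'):
            subs |= subformulas(phi[2])
        return subs             # atomic formulas contribute only themselves

    def count_ops(phi):
        # number of connectives and quantifiers in phi
        if phi[0] == 'not':
            return 1 + count_ops(phi[1])
        if phi[0] in ('and', 'or', 'imp', 'iff'):
            return 1 + count_ops(phi[1]) + count_ops(phi[2])
        if phi[0] in ('all', 'ex'):
            return 1 + count_ops(phi[2])
        return 0

    phi = ('and', ('top',), ('not', ('top',)))     # (⊤ ∧ ¬⊤)
    assert len(subformulas(phi)) <= 2 * count_ops(phi) + 1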


15.7 Formation Sequences


Defining formulas via an inductive definition, and the complementary tech-
nique of proving properties of formulas via induction, is an elegant and effi-
cient approach. However, it can also be useful to consider a more bottom-up,
step-by-step approach to the construction of formulas, which we do here using
the notion of a formation sequence. To show how terms and formulas can be
introduced in this way without needing to refer to their inductive definitions,
we first introduce the notion of an arbitrary string of symbols drawn from some
language L.

Definition 15.20 (Strings). Suppose L is a first-order language. An L-
string is a finite sequence of symbols of L. Where the language L is clearly
fixed by the context, we will often refer to an L-string simply as a string.

Example 15.21. For any first-order language L, all L-formulas are L-strings,
but not conversely. For example,

)(v0 → ∃

is an L-string but not an L-formula.

Definition 15.22 (Formation sequences for terms). A finite sequence of
L-strings ⟨t0 , . . . , tn ⟩ is a formation sequence for a term t if t ≡ tn and for all
i ≤ n, either ti is a variable or a constant symbol, or L contains a k-ary function
symbol f and there exist m1 , . . . , mk < i such that ti ≡ f (tm1 , . . . , tmk ). When
it is necessary to distinguish, we will refer to formation sequences for terms as
term formation sequences.

Example 15.23. The sequence

⟨c0 , v0 , f02 (c0 , v0 ), f01 (f02 (c0 , v0 ))⟩

is a formation sequence for the term f01 (f02 (c0 , v0 )), as is

⟨v0 , c0 , f02 (c0 , v0 ), f01 (f02 (c0 , v0 ))⟩.

Definition 15.24 (Formation sequences for formulas). A finite sequence
of L-strings ⟨φ0 , . . . , φn ⟩ is a formation sequence for φ if φ ≡ φn and for all
i ≤ n, either φi is an atomic formula or there exist j, k < i and a variable x
such that one of the following holds:

1. φi ≡ ¬φj .

2. φi ≡ (φj ∧ φk ).

3. φi ≡ (φj ∨ φk ).

4. φi ≡ (φj → φk ).


5. φi ≡ (φj ↔ φk ).
6. φi ≡ ∀x φj .
7. φi ≡ ∃x φj .
When it is necessary to distinguish, we will refer to formation sequences for
formulas as formula formation sequences.
Example 15.25.

⟨A10 (v0 ), A11 (c1 ), (A11 (c1 ) ∧ A10 (v0 )), ∃v0 (A11 (c1 ) ∧ A10 (v0 ))⟩

is a formation sequence of ∃v0 (A11 (c1 ) ∧ A10 (v0 )), as is

⟨A10 (v0 ), A11 (c1 ), (A11 (c1 ) ∧ A10 (v0 )), A11 (c1 ),
∀v1 A10 (v0 ), ∃v0 (A11 (c1 ) ∧ A10 (v0 ))⟩.

As can be seen from the second example, formation sequences may contain
“junk”: formulas which are redundant or do not contribute to the construction.

Proposition 15.26. Every formula φ in Frm(L) has a formation sequence.

Proof. Suppose φ is atomic. Then the sequence ⟨φ⟩ is a formation sequence


for φ. Now suppose that ψ and χ have formation sequences ⟨ψ0 , . . . , ψn ⟩ and
⟨χ0 , . . . , χm ⟩ respectively.
1. If φ ≡ ¬ψ, then ⟨ψ0 , . . . , ψn , ¬ψn ⟩ is a formation sequence for φ.
2. If φ ≡ (ψ ∧ χ), then ⟨ψ0 , . . . , ψn , χ0 , . . . , χm , (ψn ∧ χm )⟩ is a formation
sequence for φ.
3. If φ ≡ (ψ ∨ χ), then ⟨ψ0 , . . . , ψn , χ0 , . . . , χm , (ψn ∨ χm )⟩ is a formation
sequence for φ.
4. If φ ≡ (ψ → χ), then ⟨ψ0 , . . . , ψn , χ0 , . . . , χm , (ψn → χm )⟩ is a formation
sequence for φ.
5. If φ ≡ (ψ ↔ χ), then ⟨ψ0 , . . . , ψn , χ0 , . . . , χm , (ψn ↔ χm )⟩ is a formation
sequence for φ.
6. If φ ≡ ∀x ψ, then ⟨ψ0 , . . . , ψn , ∀x ψn ⟩ is a formation sequence for φ.
7. If φ ≡ ∃x ψ, then ⟨ψ0 , . . . , ψn , ∃x ψn ⟩ is a formation sequence for φ.
By the principle of induction on formulas, every formula has a formation se-
quence.
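The proof above is effective: it tells us how to compute a formation sequence.
A direct Python transcription, in our tuple encoding (the function name is
ours):

    def formation_sequence(phi):
        # Build a formation sequence by recursion on phi, concatenating
        # the sequences for the immediate subformulas and appending phi.
        if phi[0] in ('bot', 'top', 'atom', 'eq'):
            return [phi]
        if phi[0] in ('not', 'all', 'ex'):
            return formation_sequence(phi[-1]) + [phi]
        # two-place connectives
        return formation_sequence(phi[1]) + formation_sequence(phi[2]) + [phi]

    # ∃v0 (A11(c1) ∧ A10(v0)), as in Example 15.25:
    phi = ('ex', 'v0', ('and', ('atom', 'A11', ('const', 'c1')),
                               ('atom', 'A10', ('var', 'v0'))))
    for step in formation_sequence(phi):
        print(step)   # each entry is atomic or built from earlier entries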

We can also prove the converse. This is important because it shows that
our two ways of defining formulas are equivalent: they give the same results.
It also means that we can prove theorems about formulas by using ordinary
induction on the length of formation sequences.


Lemma 15.27. Suppose that ⟨φ0 , . . . , φn ⟩ is a formation sequence for φn ,
and that k ≤ n. Then ⟨φ0 , . . . , φk ⟩ is a formation sequence for φk .

Proof. Exercise.

Problem 15.7. Prove Lemma 15.27.

Theorem 15.28. Frm(L) is the set of all L-strings φ such that there exists
a formula formation sequence for φ.

Proof. Let F be the set of all strings of symbols in the language L that have a
formation sequence. We have seen in Proposition 15.26 that Frm(L) ⊆ F , so
now we prove the converse.
Suppose φ has a formation sequence ⟨φ0 , . . . , φn ⟩. We prove that φ ∈
Frm(L) by strong induction on n. Our induction hypothesis is that every
string of symbols with a formation sequence of length m < n is in Frm(L). By
the definition of a formation sequence, either φ ≡ φn is atomic or there must
exist j, k < n such that one of the following is the case:
1. φ ≡ ¬φj .
2. φ ≡ (φj ∧ φk ).
3. φ ≡ (φj ∨ φk ).
4. φ ≡ (φj → φk ).
5. φ ≡ (φj ↔ φk ).
6. φ ≡ ∀x φj .
7. φ ≡ ∃x φj .
Now we reason by cases. If φ is atomic then φn ∈ Frm(L). Suppose in-
stead that φ ≡ (φj ∧ φk ). By Lemma 15.27, ⟨φ0 , . . . , φj ⟩ and ⟨φ0 , . . . , φk ⟩
are formation sequences for φj and φk , respectively. Since these are proper
initial subsequences of the formation sequence for φ, they both have length
less than n. Therefore by the induction hypothesis, φj and φk are in Frm(L),
and by the definition of a formula, so is (φj ∧ φk ). The other cases follow by
parallel reasoning.

Formation sequences for terms have similar properties to those for formulas.
Proposition 15.29. Trm(L) is the set of all L-strings t such that there exists
a term formation sequence for t.

Proof. Exercise.

Problem 15.8. Prove Proposition 15.29. Hint: use a similar strategy to that
used in the proof of Theorem 15.28.


There are two types of “junk” that can appear in formation sequences:
repeated elements, and elements that are irrelevant to the construction of the
formula or term. We can eliminate both by looking at minimal formation
sequences.
Definition 15.30 (Minimal formation sequences). A formation sequence
⟨φ0 , . . . , φn ⟩ for a formula φ is a minimal formation sequence for φ if for ev-

ery other formation sequence s for φ, the length of s is greater than or equal
to n + 1.
Similarly, a formation sequence ⟨t0 , . . . , tn ⟩ for a term t is a minimal for-
mation sequence for t if for every other formation sequence s for t, the length
of s is greater than or equal to n + 1.

Note that a formula or term can have more than one minimal formation
sequence, but they will contain exactly the same strings.
Proposition 15.31. The following are equivalent:
1. ψ is a subformula of φ.
2. ψ occurs in every formation sequence of φ.
3. ψ occurs in a minimal formation sequence of φ.

Proof. Exercise.

Problem 15.9. Prove Proposition 15.31.

Historical Remarks Formation sequences were introduced by Raymond


Smullyan in his textbook First-Order Logic (Smullyan, 1968). Additional prop-
erties of formation sequences were established by Zuckerman (1973).


15.8 Free Variables and Sentences


Definition 15.32 (Free occurrences of a variable). The free occurrences
of a variable in a formula are defined inductively as follows:
1. φ is atomic: all variable occurrences in φ are free.
2. φ ≡ ¬ψ: the free variable occurrences of φ are exactly those of ψ.
3. φ ≡ (ψ ∗ χ): the free variable occurrences of φ are those in ψ together
with those in χ.
4. φ ≡ ∀x ψ: the free variable occurrences in φ are all of those in ψ except
for occurrences of x.


5. φ ≡ ∃x ψ: the free variable occurrences in φ are all of those in ψ except


for occurrences of x.

Definition 15.33 (Bound Variables). An occurrence of a variable in a for-


mula φ is bound if it is not free.

Problem 15.10. Give an inductive definition of the bound variable occur-


rences along the lines of Definition 15.32.

Definition 15.34 (Scope). If ∀x ψ is an occurrence of a subformula in a for-


mula φ, then the corresponding occurrence of ψ in φ is called the scope of the
corresponding occurrence of ∀x. Similarly for ∃x.
If ψ is the scope of a quantifier occurrence ∀x or ∃x in φ, then the free oc-
currences of x in ψ are bound in ∀x ψ and ∃x ψ. We say that these occurrences
are bound by the mentioned quantifier occurrence.

Example 15.35. Consider the following formula:

∃v0 ψ where ψ ≡ A20 (v0 , v1 ).

ψ represents the scope of ∃v0 . The quantifier binds the occurrence of v0 in ψ,


but does not bind the occurrence of v1 . So v1 is a free variable in this case.
We can now see how this might work in a more complicated formula φ:
∀v0 (A10 (v0 ) → A20 (v0 , v1 )) → ∃v1 (A21 (v0 , v1 ) ∨ ∀v0 ¬A11 (v0 ))

where ψ ≡ (A10 (v0 ) → A20 (v0 , v1 )), χ ≡ (A21 (v0 , v1 ) ∨ ∀v0 ¬A11 (v0 )), and
θ ≡ ¬A11 (v0 ).

ψ is the scope of the first ∀v0 , χ is the scope of ∃v1 , and θ is the scope of
the second ∀v0 . The first ∀v0 binds the occurrences of v0 in ψ, ∃v1 binds the
occurrence of v1 in χ, and the second ∀v0 binds the occurrence of v0 in θ. The
first occurrence of v1 and the fourth occurrence of v0 are free in φ. The last
occurrence of v0 is free in θ, but bound in χ and φ.

Definition 15.36 (Sentence). A formula φ is a sentence iff it contains no


free occurrences of variables.
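The definition of free occurrences can be turned into a function that collects
the free variables of a formula; a formula is then a sentence iff that set is
empty. A sketch in our tuple encoding (free_vars, free_vars_term and
is_sentence are our own names):

    def free_vars_term(t):
        if t[0] == 'var':
            return {t[1]}
        if t[0] == 'const':
            return set()
        return set().union(*[free_vars_term(a) for a in t[2:]])

    def free_vars(phi):
        op = phi[0]
        if op in ('bot', 'top'):
            return set()
        if op == 'atom':
            return set().union(*[free_vars_term(t) for t in phi[2:]])
        if op == 'eq':
            return free_vars_term(phi[1]) | free_vars_term(phi[2])
        if op == 'not':
            return free_vars(phi[1])
        if op in ('and', 'or', 'imp', 'iff'):
            return free_vars(phi[1]) | free_vars(phi[2])
        # quantifier: occurrences of the bound variable are no longer free
        return free_vars(phi[2]) - {phi[1]}

    def is_sentence(phi):
        return not free_vars(phi)

    # ∃v0 A20(v0, v1): v0 is bound, v1 stays free, so not a sentence.
    phi = ('ex', 'v0', ('atom', 'A20', ('var', 'v0'), ('var', 'v1')))
    assert free_vars(phi) == {'v1'} and not is_sentence(phi)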


15.9 Substitution
Definition 15.37 (Substitution in a term). We define s[t/x], the result of
substituting t for every occurrence of x in s, recursively:


1. s ≡ c: s[t/x] is just s.
2. s ≡ y: s[t/x] is also just s, provided y is a variable and y ̸≡ x.
3. s ≡ x: s[t/x] is t.
4. s ≡ f (t1 , . . . , tn ): s[t/x] is f (t1 [t/x], . . . , tn [t/x]).

Definition 15.38. A term t is free for x in φ if none of the free occurrences


of x in φ occur in the scope of a quantifier that binds a variable in t.

Example 15.39.
1. v8 is free for v1 in ∃v3 A24 (v3 , v1 )
2. f12 (v1 , v2 ) is not free for v0 in ∀v2 A24 (v0 , v2 )

Definition 15.40 (Substitution in a formula). If φ is a formula, x is a vari-


able, and t is a term free for x in φ, then φ[t/x] is the result of substituting t
for all free occurrences of x in φ.
1. φ ≡ ⊥: φ[t/x] is ⊥.
2. φ ≡ ⊤: φ[t/x] is ⊤.
3. φ ≡ P (t1 , . . . , tn ): φ[t/x] is P (t1 [t/x], . . . , tn [t/x]).
4. φ ≡ t1 = t2 : φ[t/x] is t1 [t/x] = t2 [t/x].
5. φ ≡ ¬ψ: φ[t/x] is ¬ψ[t/x].
6. φ ≡ (ψ ∧ χ): φ[t/x] is (ψ[t/x] ∧ χ[t/x]).
7. φ ≡ (ψ ∨ χ): φ[t/x] is (ψ[t/x] ∨ χ[t/x]).
8. φ ≡ (ψ → χ): φ[t/x] is (ψ[t/x] → χ[t/x]).
9. φ ≡ (ψ ↔ χ): φ[t/x] is (ψ[t/x] ↔ χ[t/x]).
10. φ ≡ ∀y ψ: φ[t/x] is ∀y ψ[t/x], provided y is a variable other than x;
otherwise φ[t/x] is just φ.
11. φ ≡ ∃y ψ: φ[t/x] is ∃y ψ[t/x], provided y is a variable other than x;
otherwise φ[t/x] is just φ.

Note that substitution may be vacuous: If x does not occur in φ at all, then
φ[t/x] is just φ.
The restriction that t must be free for x in φ is necessary to exclude cases like
the following. If φ ≡ ∃y x < y and t ≡ y, then φ[t/x] would be ∃y y < y. In this
case the free variable y is “captured” by the quantifier ∃y upon substitution,
and that is undesirable. For instance, we would like it to be the case that
whenever ∀x ψ holds, so does ψ[t/x]. But consider ∀x ∃y x < y (here ψ is

∃y x < y). It is a sentence that is true about, e.g., the natural numbers: for
every number x there is a number y greater than it. If we allowed y as a
possible substitution for x, we would end up with ψ[y/x] ≡ ∃y y < y, which is
false. We prevent this by requiring that none of the free variables in t would
end up being bound by a quantifier in φ.
We often use the following convention to avoid cumbersome notation: If
φ is a formula which may contain the variable x free, we also write φ(x) to
indicate this. When it is clear which φ and x we have in mind, and t is a term
(assumed to be free for x in φ(x)), then we write φ(t) as short for φ[t/x]. So
for instance, we might say, “we call φ(t) an instance of ∀x φ(x).” By this we
mean that if φ is any formula, x a variable, and t a term that’s free for x in φ,
then φ[t/x] is an instance of ∀x φ.
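Substitution, too, is a recursive function on our tuple encoding. The sketch
below (function names ours) follows Definitions 15.37 and 15.40; it does not
check that t is free for x in φ, which a fuller implementation would have to
do before substituting.

    def subst_term(s, t, x):
        # substitute t for the variable x in the term s
        if s == ('var', x):
            return t
        if s[0] in ('var', 'const'):
            return s
        return s[:2] + tuple(subst_term(a, t, x) for a in s[2:])

    def subst(phi, t, x):
        # substitute t for the *free* occurrences of x in phi
        op = phi[0]
        if op in ('bot', 'top'):
            return phi
        if op == 'atom':
            return phi[:2] + tuple(subst_term(a, t, x) for a in phi[2:])
        if op == 'eq':
            return ('eq', subst_term(phi[1], t, x), subst_term(phi[2], t, x))
        if op == 'not':
            return ('not', subst(phi[1], t, x))
        if op in ('and', 'or', 'imp', 'iff'):
            return (op, subst(phi[1], t, x), subst(phi[2], t, x))
        y, psi = phi[1], phi[2]
        if y == x:              # x is bound here; its occurrences stay put
            return phi
        return (op, y, subst(psi, t, x))

    # (∃v3 A24(v3, v1))[v8/v1], as in Example 15.39:
    phi = ('ex', 'v3', ('atom', 'A24', ('var', 'v3'), ('var', 'v1')))
    print(subst(phi, ('var', 'v8'), 'v1'))
    # ('ex', 'v3', ('atom', 'A24', ('var', 'v3'), ('var', 'v8')))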

Chapter 16

Semantics of First-Order Logic


16.1 Introduction
Giving the meaning of expressions is the domain of semantics. The central
concept in semantics is that of satisfaction in a structure. A structure gives
meaning to the building blocks of the language: a domain is a non-empty
set of objects. The quantifiers are interpreted as ranging over this domain,
constant symbols are assigned elements in the domain, function symbols are
assigned functions from the domain to itself, and predicate symbols are as-
signed relations on the domain. The domain together with assignments to the
basic vocabulary constitutes a structure. Variables may appear in formulas,
and in order to give a semantics, we also have to assign elements of the do-
main to them—this is a variable assignment. The satisfaction relation, finally,
brings these together. A formula may be satisfied in a structure M relative
to a variable assignment s, written as M, s ⊨ φ. This relation is also defined
by induction on the structure of φ, using the truth tables for the logical con-
nectives to define, say, satisfaction of (φ ∧ ψ) in terms of satisfaction (or not)
of φ and ψ. It then turns out that the variable assignment is irrelevant if


the formula φ is a sentence, i.e., has no free variables, and so we can talk of
sentences being simply satisfied (or not) in structures.
On the basis of the satisfaction relation M ⊨ φ for sentences we can then de-
fine the basic semantic notions of validity, entailment, and satisfiability. A sen-
tence is valid, ⊨ φ, if every structure satisfies it. It is entailed by a set of
sentences, Γ ⊨ φ, if every structure that satisfies all the sentences in Γ also
satisfies φ. And a set of sentences is satisfiable if some structure satisfies all
sentences in it at the same time. Because formulas are inductively defined, and
satisfaction is in turn defined by induction on the structure of formulas, we can
use induction to prove properties of our semantics and to relate the semantic
notions defined.


16.2 Structures for First-order Languages


First-order languages are, by themselves, uninterpreted: the constant symbols,
function symbols, and predicate symbols have no specific meaning attached to
them. Meanings are given by specifying a structure. It specifies the domain, i.e.,
the objects which the constant symbols pick out, the function symbols operate
on, and the quantifiers range over. In addition, it specifies which constant
symbols pick out which objects, how a function symbol maps objects to objects,
and which objects the predicate symbols apply to. Structures are the basis for
semantic notions in logic, e.g., the notion of consequence, validity, satisfiability.
They are variously called “structures,” “interpretations,” or “models” in the
literature.
Definition 16.1 (Structures). A structure M for a language L of first-order
logic consists of the following elements:
1. Domain: a non-empty set, |M|
2. Interpretation of constant symbols: for each constant symbol c of L, an el-
ement cM ∈ |M|
3. Interpretation of predicate symbols: for each n-place predicate symbol R
of L (other than =), an n-place relation RM ⊆ |M|n
4. Interpretation of function symbols: for each n-place function symbol f of
L, an n-place function f M : |M|n → |M|

Example 16.2. A structure M for the language of arithmetic consists of a


set, an element of |M|, 0M , as interpretation of the constant symbol 0, a one-
place function ′M : |M| → |M|, two two-place functions +M and ×M , both
|M|2 → |M|, and a two-place relation <M ⊆ |M|2 .
An obvious example of such a structure is the following:
1. |N| = N


2. 0N = 0

3. ′N (n) = n + 1 for all n ∈ N

4. +N (n, m) = n + m for all n, m ∈ N

5. ×N (n, m) = n · m for all n, m ∈ N

6. <N = {⟨n, m⟩ : n ∈ N, m ∈ N, n < m}

The structure N for LA so defined is called the standard model of arithmetic,


because it interprets the non-logical constants of LA exactly how you would
expect.
However, there are many other possible structures for LA . For instance,
we might take as the domain the set Z of integers instead of N, and define the
interpretations of 0, ′, +, ×, < accordingly. But we can also define structures
for LA which have nothing even remotely to do with numbers.

Example 16.3. A structure M for the language LZ of set theory requires just
a set and a single two-place relation. So technically, e.g., the set of people plus
the relation “x is older than y” could be used as a structure for LZ , as well as
N together with n ≥ m for n, m ∈ N.
A particularly interesting structure for LZ in which the elements of the
domain are actually sets, and the interpretation of ∈ actually is the relation
“x is an element of y” is the structure HF of hereditarily finite sets:

1. |HF| = ∅ ∪ ℘(∅) ∪ ℘(℘(∅)) ∪ ℘(℘(℘(∅))) ∪ . . . ;

2. ∈HF = {⟨x, y⟩ : x, y ∈ |HF| , x ∈ y}.

The stipulations we make as to what counts as a structure impact our
logic. For example, the choice to prevent empty domains ensures, given the
usual account of satisfaction (or truth) for quantified sentences, that ∃x (φ(x)∨
¬φ(x)) is valid—that is, a logical truth. And the stipulation that all constant
symbols must refer to an object in the domain ensures that existential
generalization is a sound pattern of inference: φ(a), therefore ∃x φ(x). If we
allowed names to refer outside the domain, or to not refer, then we would be on
our way to a free logic, in which existential generalization requires an additional
premise: φ(a) and ∃x x = a, therefore ∃x φ(x).


16.3 Covered Structures for First-order Languages


Recall that a term is closed if it contains no variables.

Definition 16.4 (Value of closed terms). If t is a closed term of the lan-


guage L and M is a structure for L, the value ValM (t) is defined as follows:


1. If t is just the constant symbol c, then ValM (c) = cM .

2. If t is of the form f (t1 , . . . , tn ), then

ValM (t) = f M (ValM (t1 ), . . . , ValM (tn )).

Definition 16.5 (Covered structure). A structure is covered if every ele-


ment of the domain is the value of some closed term.

Example 16.6. Let L be the language with constant symbols zero, one,
two, . . . , the binary predicate symbol <, and the binary function symbols +
and ×. Then a structure M for L is the one with domain |M| = {0, 1, 2, . . .}
and assignments zeroM = 0, oneM = 1, twoM = 2, and so forth. For the
binary relation symbol <, the set <M is the set of all pairs ⟨c1 , c2 ⟩ ∈ |M|2 such
that c1 is less than c2 : for example, ⟨1, 3⟩ ∈ <M but ⟨2, 2⟩ ∉ <M . For the
binary function symbol +, define +M in the usual way—for example, +M (2, 3)
maps to 5, and similarly for the binary function symbol ×. Hence, the value of
four is just 4, and the value of ×(two, +(three, zero)) (or in infix notation,
two × (three + zero)) is

ValM (×(two, +(three, zero)))
= ×M (ValM (two), ValM (+(three, zero)))
= ×M (ValM (two), +M (ValM (three), ValM (zero)))
= ×M (twoM , +M (threeM , zeroM ))
= ×M (2, +M (3, 0))
= ×M (2, 3)
= 6

Problem 16.1. Is N, the standard model of arithmetic, covered? Explain.
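For a concrete sense of how ValM is computed, here is a minimal Python
sketch of Definition 16.4, in our tuple encoding (the dictionaries consts and
funcs standing in for the interpretation are our own hypothetical choices,
mirroring Example 16.6):

    consts = {'zero': 0, 'one': 1, 'two': 2, 'three': 3}
    funcs = {'+': lambda a, b: a + b, '*': lambda a, b: a * b}

    def value(t):
        # ValM(t) for closed terms, by recursion on t
        if t[0] == 'const':
            return consts[t[1]]            # clause (1): ValM(c) = cM
        args = [value(a) for a in t[2:]]
        return funcs[t[1]](*args)          # clause (2): apply fM to the values

    # two × (three + zero) evaluates to 6, as in Example 16.6:
    t = ('fun', '*', ('const', 'two'),
         ('fun', '+', ('const', 'three'), ('const', 'zero')))
    assert value(t) == 6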


16.4 Satisfaction of a Formula in a Structure


The basic notion that relates expressions such as terms and formulas, on the one
hand, and structures on the other, are those of value of a term and satisfaction
of a formula. Informally, the value of a term is an element of a structure—if the
term is just a constant, its value is the object assigned to the constant by the
structure, and if it is built up using function symbols, the value is computed
from the values of constants and the functions assigned to the functions in the
term. A formula is satisfied in a structure if the interpretation given to the
predicates makes the formula true in the domain of the structure. This notion
of satisfaction is specified inductively: the specification of the structure directly


states when atomic formulas are satisfied, and we define when a complex for-
mula is satisfied depending on the main connective or quantifier and whether
or not the immediate subformulas are satisfied.
The case of the quantifiers here is a bit tricky, as the immediate subformula
of a quantified formula has a free variable, and structures don’t specify the
values of variables. In order to deal with this difficulty, we also introduce
variable assignments and define satisfaction not with respect to a structure
alone, but with respect to a structure plus a variable assignment.
Definition 16.7 (Variable Assignment). A variable assignment s for a struc-
ture M is a function which maps each variable to an element of |M|, i.e.,
s : Var → |M|.

A structure assigns a value to each constant symbol, and a variable assign-
ment to each variable. But we want to use terms built up from them to also
name elements of the domain. For this we define the value of terms induc-
tively. For constant symbols and variables the value is just as the structure
or the variable assignment specifies it; for more complex terms it is computed
recursively using the functions the structure assigns to the function symbols.
Definition 16.8 (Value of Terms). If t is a term of the language L, M is
a structure for L, and s is a variable assignment for M, the value ValM s (t) is
defined as follows:
1. t ≡ c: ValM s (t) = cM .

2. t ≡ x: ValM s (t) = s(x).

3. t ≡ f (t1 , . . . , tn ): ValM s (t) = f M (ValM s (t1 ), . . . , ValM s (tn )).

Definition 16.9 (x-Variant). If s is a variable assignment for a structure M,


then any variable assignment s′ for M which differs from s at most in what
it assigns to x is called an x-variant of s. If s′ is an x-variant of s we write
s′ ∼x s.

Note that an x-variant of an assignment s does not have to assign something
different to x. In fact, every assignment counts as an x-variant of itself.
Definition 16.10. If s is a variable assignment for a structure M and m ∈ |M|,
then the assignment s[m/x] is the variable assignment defined by
s[m/x](y) = m if y ≡ x, and s[m/x](y) = s(y) otherwise.

In other words, s[m/x] is the particular x-variant of s which assigns the


domain element m to x, and assigns the same things to variables other than x
that s does.


Definition 16.11 (Satisfaction). Satisfaction of a formula φ in a struc-
ture M relative to a variable assignment s, in symbols: M, s ⊨ φ, is defined
recursively as follows. (We write M, s ⊭ φ to mean “not M, s ⊨ φ.”)

1. φ ≡ ⊥: M, s ⊭ φ.

2. φ ≡ ⊤: M, s ⊨ φ.

3. φ ≡ R(t1 , . . . , tn ): M, s ⊨ φ iff ⟨ValM s (t1 ), . . . , ValM s (tn )⟩ ∈ RM .

4. φ ≡ t1 = t2 : M, s ⊨ φ iff ValM s (t1 ) = ValM s (t2 ).

5. φ ≡ ¬ψ: M, s ⊨ φ iff M, s ⊭ ψ.

6. φ ≡ (ψ ∧ χ): M, s ⊨ φ iff M, s ⊨ ψ and M, s ⊨ χ.

7. φ ≡ (ψ ∨ χ): M, s ⊨ φ iff M, s ⊨ ψ or M, s ⊨ χ (or both).

8. φ ≡ (ψ → χ): M, s ⊨ φ iff M, s ⊭ ψ or M, s ⊨ χ (or both).

9. φ ≡ (ψ ↔ χ): M, s ⊨ φ iff either both M, s ⊨ ψ and M, s ⊨ χ, or neither


M, s ⊨ ψ nor M, s ⊨ χ.

10. φ ≡ ∀x ψ: M, s ⊨ φ iff for every element m ∈ |M|, M, s[m/x] ⊨ ψ.

11. φ ≡ ∃x ψ: M, s ⊨ φ iff for at least one element m ∈ |M|, M, s[m/x] ⊨ ψ.

The variable assignments are important in the last two clauses. We cannot
define satisfaction of ∀x ψ(x) by “for all m ∈ |M|, M ⊨ ψ(m).” We cannot
define satisfaction of ∃x ψ(x) by “for at least one m ∈ |M|, M ⊨ ψ(m).” The
reason is that if m ∈ |M|, it is not a symbol of the language, and so ψ(m) is
not a formula (that is, ψ[m/x] is undefined). We also cannot assume that
we have constant symbols or terms available that name every element of M,
since there is nothing in the definition of structures that requires it. In the
standard language, the set of constant symbols is denumerable, so if |M| is not
enumerable there aren’t even enough constant symbols to name every object.
We solve this problem by introducing variable assignments, which allow us
to link variables directly with elements of the domain. Then instead of saying
that, e.g., ∃x ψ(x) is satisfied in M iff for at least one m ∈ |M|, we say it is
satisfied in M relative to s iff ψ(x) is satisfied relative to s[m/x] for at least
one m ∈ |M|.
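For finite structures the satisfaction clauses can be checked mechanically, and
writing them out as a program makes the role of the variable assignment and
its x-variants vivid. A minimal sketch in our tuple encoding (the dictionary
layout for structures, and the names val and sat, are our own):

    def val(t, M, s):
        # value of a term relative to structure M and assignment s
        if t[0] == 'var':
            return s[t[1]]
        if t[0] == 'const':
            return M['consts'][t[1]]
        return M['funcs'][t[1]](*(val(a, M, s) for a in t[2:]))

    def sat(phi, M, s):
        op = phi[0]
        if op == 'bot': return False
        if op == 'top': return True
        if op == 'atom':
            return tuple(val(t, M, s) for t in phi[2:]) in M['preds'][phi[1]]
        if op == 'eq':  return val(phi[1], M, s) == val(phi[2], M, s)
        if op == 'not': return not sat(phi[1], M, s)
        if op == 'and': return sat(phi[1], M, s) and sat(phi[2], M, s)
        if op == 'or':  return sat(phi[1], M, s) or sat(phi[2], M, s)
        if op == 'imp': return (not sat(phi[1], M, s)) or sat(phi[2], M, s)
        if op == 'iff': return sat(phi[1], M, s) == sat(phi[2], M, s)
        x, psi = phi[1], phi[2]
        # {**s, x: m} is exactly the x-variant s[m/x]
        if op == 'all':
            return all(sat(psi, M, {**s, x: m}) for m in M['domain'])
        return any(sat(psi, M, {**s, x: m}) for m in M['domain'])

    # The structure of Example 16.12 below:
    M = {'domain': {1, 2, 3, 4},
         'consts': {'a': 1, 'b': 2},
         'funcs': {'f': lambda x, y: x + y if x + y <= 3 else 3},
         'preds': {'R': {(1, 1), (1, 2), (2, 3), (2, 4)}}}
    s = {'x': 1, 'y': 1}
    phi = ('ex', 'x', ('or', ('atom', 'R', ('const', 'b'), ('var', 'x')),
                             ('atom', 'R', ('var', 'x'), ('const', 'b'))))
    assert sat(phi, M, s)   # M, s ⊨ ∃x (R(b, x) ∨ R(x, b))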

Example 16.12. Let L = {a, b, f, R} where a and b are constant symbols, f is


a two-place function symbol, and R is a two-place predicate symbol. Consider
the structure M defined by:

1. |M| = {1, 2, 3, 4}

2. aM = 1

3. bM = 2


4. f M (x, y) = x + y if x + y ≤ 3 and = 3 otherwise.

5. RM = {⟨1, 1⟩, ⟨1, 2⟩, ⟨2, 3⟩, ⟨2, 4⟩}

The function s(x) = 1 that assigns 1 ∈ |M| to every variable is a variable


assignment for M.
Then

ValM s (f (a, b)) = f M (ValM s (a), ValM s (b)).

Since a and b are constant symbols, ValM s (a) = aM = 1 and ValM s (b) = bM = 2.
So

ValM s (f (a, b)) = f M (1, 2) = 1 + 2 = 3.

To compute the value of f (f (a, b), a) we have to consider

ValM s (f (f (a, b), a)) = f M (ValM s (f (a, b)), ValM s (a)) = f M (3, 1) = 3,

since 3 + 1 > 3. Since s(x) = 1 and ValM s (x) = s(x), we also have

ValM s (f (f (a, b), x)) = f M (ValM s (f (a, b)), ValM s (x)) = f M (3, 1) = 3.

An atomic formula R(t1 , t2 ) is satisfied if the tuple of values of its ar-


guments, i.e., ⟨ValM s (t1 ), ValM s (t2 )⟩, is an element of RM . So, e.g., we have
M, s ⊨ R(b, f (a, b)) since ⟨ValM s (b), ValM s (f (a, b))⟩ = ⟨2, 3⟩ ∈ RM , but
M, s ⊭ R(x, f (a, b)) since ⟨1, 3⟩ ∉ RM .
To determine if a non-atomic formula φ is satisfied, you apply the clauses
in the inductive definition that applies to the main connective. For instance,
the main connective in R(a, a) → (R(b, x) ∨ R(x, b)) is the →, and

M, s ⊨ R(a, a) → (R(b, x) ∨ R(x, b)) iff


M, s ⊭ R(a, a) or M, s ⊨ R(b, x) ∨ R(x, b)

Since M, s ⊨ R(a, a) (because ⟨1, 1⟩ ∈ RM ) we can’t yet determine the answer


and must first figure out if M, s ⊨ R(b, x) ∨ R(x, b):

M, s ⊨ R(b, x) ∨ R(x, b) iff


M, s ⊨ R(b, x) or M, s ⊨ R(x, b)

And this is the case, since M, s ⊨ R(x, b) (because ⟨1, 2⟩ ∈ RM ).


Recall that an x-variant of s is a variable assignment that differs from s at


most in what it assigns to x. For every element of |M|, there is an x-variant
of s:

s1 = s[1/x], s2 = s[2/x],
s3 = s[3/x], s4 = s[4/x].

So, e.g., s2 (x) = 2 and s2 (y) = s(y) = 1 for all variables y other than x. These
are all the x-variants of s for the structure M, since |M| = {1, 2, 3, 4}. Note,
in particular, that s1 = s (s is always an x-variant of itself).
To determine if an existentially quantified formula ∃x φ(x) is satisfied, we
have to determine if M, s[m/x] ⊨ φ(x) for at least one m ∈ |M|. So,

M, s ⊨ ∃x (R(b, x) ∨ R(x, b)),

since M, s[1/x] ⊨ R(b, x) ∨ R(x, b) (s[3/x] would also fit the bill). But,

M, s ⊭ ∃x (R(b, x) ∧ R(x, b))

since, whichever m ∈ |M| we pick, M, s[m/x] ⊭ R(b, x) ∧ R(x, b).


To determine if a universally quantified formula ∀x φ(x) is satisfied, we have
to determine if M, s[m/x] ⊨ φ(x) for all m ∈ |M|. So,

M, s ⊨ ∀x (R(x, a) → R(a, x)),

since M, s[m/x] ⊨ R(x, a) → R(a, x) for all m ∈ |M|. For m = 1, we have


M, s[1/x] ⊨ R(a, x) so the consequent is true; for m = 2, 3, and 4, we have
M, s[m/x] ⊭ R(x, a), so the antecedent is false. But,

M, s ⊭ ∀x (R(a, x) → R(x, a))

since M, s[2/x] ⊭ R(a, x)→R(x, a) (because M, s[2/x] ⊨ R(a, x) and M, s[2/x] ⊭


R(x, a)).
For a more complicated case, consider

∀x (R(a, x) → ∃y R(x, y)).

Since M, s[3/x] ⊭ R(a, x) and M, s[4/x] ⊭ R(a, x), the interesting cases where
we have to worry about the consequent of the conditional are only m = 1
and m = 2. Does M, s[1/x] ⊨ ∃y R(x, y) hold? It does if there is at least one
n ∈ |M| so that M, s[1/x][n/y] ⊨ R(x, y). In fact, if we take n = 1, we have
s[1/x][n/y] = s[1/y] = s. Since s(x) = 1, s(y) = 1, and ⟨1, 1⟩ ∈ RM , the
answer is yes.
To determine if M, s[2/x] ⊨ ∃y R(x, y), we have to look at the variable
assignments s[2/x][n/y]. Here, for n = 1, this assignment is s2 = s[2/x], which
does not satisfy R(x, y) (s2 (x) = 2, s2 (y) = 1, and ⟨2, 1⟩ ∉ RM ). However,
consider s[2/x][3/y] = s2 [3/y]. M, s2 [3/y] ⊨ R(x, y) since ⟨2, 3⟩ ∈ RM , and so
M, s2 ⊨ ∃y R(x, y).


So, for all m ∈ |M|, either M, s[m/x] ⊭ R(a, x) (if m = 3, 4) or M, s[m/x] ⊨


∃y R(x, y) (if m = 1, 2), and so

M, s ⊨ ∀x (R(a, x) → ∃y R(x, y)).

On the other hand,

M, s ⊭ ∃x (R(a, x) ∧ ∀y R(x, y)).

We have M, s[m/x] ⊨ R(a, x) only for m = 1 and m = 2. But for both


of these values of m, there is in turn an n ∈ |M|, namely n = 4, so that
M, s[m/x][n/y] ⊭ R(x, y) and so M, s[m/x] ⊭ ∀y R(x, y) for m = 1 and m = 2.
In sum, there is no m ∈ |M| such that M, s[m/x] ⊨ R(a, x) ∧ ∀y R(x, y).

Problem 16.2. Let L = {c, f, A} with one constant symbol, one one-place
function symbol and one two-place predicate symbol, and let the structure M
be given by
1. |M| = {1, 2, 3}
2. cM = 3
3. f M (1) = 2, f M (2) = 3, f M (3) = 2
4. AM = {⟨1, 2⟩, ⟨2, 3⟩, ⟨3, 3⟩}
(a) Let s(v) = 1 for all variables v. Find out whether

M, s ⊨ ∃x (A(f (z), c) → ∀y (A(y, x) ∨ A(f (y), x)))

Explain why or why not.


(b) Give a different structure and variable assignment in which the formula
is not satisfied.


16.5 Variable Assignments


A variable assignment s provides a value for every variable—and there are
infinitely many of them. This is of course not necessary. We require variable
assignments to assign values to all variables simply because it makes things a
lot easier. The value of a term t, and whether or not a formula φ is satisfied in
a structure with respect to s, only depend on the assignments s makes to the
variables in t and the free variables of φ. This is the content of the next two
propositions. To make the idea of “depends on” precise, we show that any two
variable assignments that agree on all the variables in t give the same value,
and that φ is satisfied relative to one iff it is satisfied relative to the other if
two variable assignments agree on all free variables of φ.


Proposition 16.13. If the variables in a term t are among x1 , . . . , xn , and
s1 (xi ) = s2 (xi ) for i = 1, . . . , n, then ValM s1 (t) = ValM s2 (t).

Proof. By induction on the complexity of t. For the base case, t can be a con-
stant symbol or one of the variables x1 , . . . , xn . If t = c, then ValM s1 (t) =
cM = ValM s2 (t). If t = xi , then s1 (xi ) = s2 (xi ) by the hypothesis of the
proposition, and so ValM s1 (t) = s1 (xi ) = s2 (xi ) = ValM s2 (t).
For the inductive step, assume that t = f (t1 , . . . , tk ) and that the claim
holds for t1 , . . . , tk . Then

ValM s1 (t) = ValM s1 (f (t1 , . . . , tk )) = f M (ValM s1 (t1 ), . . . , ValM s1 (tk )).

For j = 1, . . . , k, the variables of tj are among x1 , . . . , xn . By the induction
hypothesis, ValM s1 (tj ) = ValM s2 (tj ). So,

ValM s1 (t) = ValM s1 (f (t1 , . . . , tk ))
= f M (ValM s1 (t1 ), . . . , ValM s1 (tk ))
= f M (ValM s2 (t1 ), . . . , ValM s2 (tk ))
= ValM s2 (f (t1 , . . . , tk )) = ValM s2 (t).

Proposition 16.14. If the free variables in φ are among x1 , . . . , xn , and
s1 (xi ) = s2 (xi ) for i = 1, . . . , n, then M, s1 ⊨ φ iff M, s2 ⊨ φ.
Proof. We use induction on the complexity of φ. For the base case, where φ
is atomic, φ can be: ⊤, ⊥, R(t1 , . . . , tk ) for a k-place predicate R and terms
t1 , . . . , tk , or t1 = t2 for terms t1 and t2 . In the latter two cases, we only
demonstrate the forward direction of the biconditional, since the proof of the
reverse is symmetrical.
1. φ ≡ ⊤: both M, s1 ⊨ φ and M, s2 ⊨ φ.
2. φ ≡ ⊥: both M, s1 ⊭ φ and M, s2 ⊭ φ.
3. φ ≡ R(t1 , . . . , tk ): let M, s1 ⊨ φ. Then
⟨ValM s1 (t1 ), . . . , ValM s1 (tk )⟩ ∈ RM .

For i = 1, . . . , k, ValM s1 (ti ) = ValM s2 (ti ) by Proposition 16.13. So we also
have ⟨ValM s2 (t1 ), . . . , ValM s2 (tk )⟩ ∈ RM , and hence M, s2 ⊨ φ.

4. φ ≡ t1 = t2 : suppose M, s1 ⊨ φ. Then ValM s1 (t1 ) = ValM s1 (t2 ). So,

ValM s2 (t1 ) = ValM s1 (t1 ) (by Proposition 16.13)
= ValM s1 (t2 ) (since M, s1 ⊨ t1 = t2 )
= ValM s2 (t2 ) (by Proposition 16.13),

so M, s2 ⊨ t1 = t2 .


Now assume M, s1 ⊨ ψ iff M, s2 ⊨ ψ for all formulas ψ less complex than φ.


The induction step proceeds by cases determined by the main operator of φ.
In each case, we only demonstrate the forward direction of the biconditional;
the proof of the reverse direction is symmetrical. In all cases except those for
the quantifiers, we apply the induction hypothesis to sub-formulas ψ of φ. The
free variables of ψ are among those of φ. Thus, if s1 and s2 agree on the free
variables of φ, they also agree on those of ψ, and the induction hypothesis
applies to ψ.

1. φ ≡ ¬ψ: if M, s1 ⊨ φ, then M, s1 ⊭ ψ, so by the induction hypothesis,


M, s2 ⊭ ψ, hence M, s2 ⊨ φ.

2. φ ≡ ψ ∧ χ: if M, s1 ⊨ φ, then M, s1 ⊨ ψ and M, s1 ⊨ χ, so by induction


hypothesis, M, s2 ⊨ ψ and M, s2 ⊨ χ. Hence, M, s2 ⊨ φ.

3. φ ≡ ψ ∨ χ: if M, s1 ⊨ φ, then M, s1 ⊨ ψ or M, s1 ⊨ χ. By induction
hypothesis, M, s2 ⊨ ψ or M, s2 ⊨ χ, so M, s2 ⊨ φ.

4. φ ≡ ψ → χ: if M, s1 ⊨ φ, then M, s1 ⊭ ψ or M, s1 ⊨ χ. By the induction


hypothesis, M, s2 ⊭ ψ or M, s2 ⊨ χ, so M, s2 ⊨ φ.

5. φ ≡ ψ ↔ χ: if M, s1 ⊨ φ, then either M, s1 ⊨ ψ and M, s1 ⊨ χ, or


M, s1 ⊭ ψ and M, s1 ⊭ χ. By the induction hypothesis, either M, s2 ⊨ ψ
and M, s2 ⊨ χ or M, s2 ⊭ ψ and M, s2 ⊭ χ. In either case, M, s2 ⊨ φ.

6. φ ≡ ∃x ψ: if M, s1 ⊨ φ, there is an m ∈ |M| so that M, s1 [m/x] ⊨ ψ. Let


s′1 = s1 [m/x] and s′2 = s2 [m/x]. The free variables of ψ are among x1 ,
. . . , xn , and x. s′1 (xi ) = s′2 (xi ), since s′1 and s′2 are x-variants of s1 and s2 ,
respectively, and by hypothesis s1 (xi ) = s2 (xi ). s′1 (x) = s′2 (x) = m by
the way we have defined s′1 and s′2 . Then the induction hypothesis applies
to ψ and s′1 , s′2 , so M, s′2 ⊨ ψ. Hence, since s′2 = s2 [m/x], there is an
m ∈ |M| such that M, s2 [m/x] ⊨ ψ, and so M, s2 ⊨ φ.

7. φ ≡ ∀x ψ: if M, s1 ⊨ φ, then for every m ∈ |M|, M, s1 [m/x] ⊨ ψ.


We want to show that also, for every m ∈ |M|, M, s2[m/x] ⊨ ψ. So
let m ∈ |M| be arbitrary, and consider s′1 = s1[m/x] and s′2 = s2[m/x].
We have that M, s′1 ⊨ ψ. The free variables of ψ are among x1 , . . . ,
xn , and x. s′1 (xi ) = s′2 (xi ), since s′1 and s′2 are x-variants of s1 and s2 ,
respectively, and by hypothesis s1 (xi ) = s2 (xi ). s′1 (x) = s′2 (x) = m by
the way we have defined s′1 and s′2 . Then the induction hypothesis applies
to ψ and s′1 , s′2 , and we have M, s′2 ⊨ ψ. This applies to every m ∈ |M|,
i.e., M, s2 [m/x] ⊨ ψ for all m ∈ |M|, so M, s2 ⊨ φ.

By induction, we get that M, s1 ⊨ φ iff M, s2 ⊨ φ whenever the free variables


in φ are among x1 , . . . , xn and s1 (xi ) = s2 (xi ) for i = 1, . . . , n.

Problem 16.3. Complete the proof of Proposition 16.14.


Sentences have no free variables, so any two variable assignments assign the
same things to all the (zero) free variables of any sentence. The proposition
just proved then means that whether or not a sentence is satisfied in a structure
relative to a variable assignment is completely independent of the assignment.
We’ll record this fact. It justifies the definition of satisfaction of a sentence in
a structure (without mentioning a variable assignment) that follows.
Corollary 16.15. If φ is a sentence and s a variable assignment, then M, s ⊨ φ iff M, s′ ⊨ φ for every variable assignment s′.

Proof. Let s′ be any variable assignment. Since φ is a sentence, it has no free


variables, and so every variable assignment s′ trivially assigns the same things
to all free variables of φ as does s. So the condition of Proposition 16.14 is
satisfied, and we have M, s ⊨ φ iff M, s′ ⊨ φ.

Definition 16.16. If φ is a sentence, we say that a structure M satisfies φ, M ⊨ φ, iff M, s ⊨ φ for all variable assignments s.

If M ⊨ φ, we also simply say that φ is true in M. The notion of satisfaction


naturally extends from individual sentences to sets of sentences.
Definition 16.17. If Γ is a set of sentences, we say that a structure M satisfies Γ, M ⊨ Γ, iff M ⊨ φ for all φ ∈ Γ.

Proposition 16.18. Let M be a structure, φ a sentence, and s a variable assignment. Then M ⊨ φ iff M, s ⊨ φ.

Proof. Exercise.

Problem 16.4. Prove Proposition 16.18.

Proposition 16.19. Suppose φ(x) only contains x free, and M is a structure. Then:
1. M ⊨ ∃x φ(x) iff M, s ⊨ φ(x) for at least one variable assignment s.
2. M ⊨ ∀x φ(x) iff M, s ⊨ φ(x) for all variable assignments s.

Proof. Exercise.

Problem 16.5. Prove Proposition 16.19.
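Over a finite domain, the quantifier clauses of the satisfaction definition can be checked by exhaustive search, which makes Proposition 16.19 completely mechanical. Here is a small illustrative sketch; the tuple encoding of formulas and the particular three-element structure are our own choices.

```python
# A toy model checker for a finite structure whose only non-logical
# symbol is the two-place predicate "<". Formulas are tuples:
# ("<", "x", "y"), ("not", A), ("and", A, B), ("or", A, B),
# ("forall", "x", A), ("exists", "x", A).
DOMAIN = {0, 1, 2}
LESS = {(a, b) for a in DOMAIN for b in DOMAIN if a < b}

def sat(phi, s):
    """M, s ⊨ phi for the structure above; s maps variables to elements."""
    op = phi[0]
    if op == "<":
        return (s[phi[1]], s[phi[2]]) in LESS
    if op == "not":
        return not sat(phi[1], s)
    if op == "and":
        return sat(phi[1], s) and sat(phi[2], s)
    if op == "or":
        return sat(phi[1], s) or sat(phi[2], s)
    if op == "forall":   # try every x-variant of s
        return all(sat(phi[2], {**s, phi[1]: m}) for m in DOMAIN)
    if op == "exists":
        return any(sat(phi[2], {**s, phi[1]: m}) for m in DOMAIN)
    raise ValueError(op)

# ∀x ∃y x < y fails here, since 2 has nothing above it in the domain:
assert not sat(("forall", "x", ("exists", "y", ("<", "x", "y"))), {})
```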

Problem 16.6. Suppose L is a language without function symbols. Given


a structure M, c a constant symbol and a ∈ |M|, define M[a/c] to be the
structure that is just like M, except that cM[a/c] = a. Define M ||= φ for
sentences φ by:
1. φ ≡ ⊥: not M ||= φ.
2. φ ≡ ⊤: M ||= φ.


3. φ ≡ R(d1, ..., dn): M ||= φ iff ⟨d1^M, ..., dn^M⟩ ∈ R^M.

4. φ ≡ d1 = d2: M ||= φ iff d1^M = d2^M.

5. φ ≡ ¬ψ: M ||= φ iff not M ||= ψ.

6. φ ≡ (ψ ∧ χ): M ||= φ iff M ||= ψ and M ||= χ.

7. φ ≡ (ψ ∨ χ): M ||= φ iff M ||= ψ or M ||= χ (or both).

8. φ ≡ (ψ → χ): M ||= φ iff not M ||= ψ or M ||= χ (or both).

9. φ ≡ (ψ ↔ χ): M ||= φ iff either both M ||= ψ and M ||= χ, or neither


M ||= ψ nor M ||= χ.

10. φ ≡ ∀x ψ: M ||= φ iff for all a ∈ |M|, M[a/c] ||= ψ[c/x], if c does not
occur in ψ.

11. φ ≡ ∃x ψ: M ||= φ iff there is an a ∈ |M| such that M[a/c] ||= ψ[c/x],
if c does not occur in ψ.

Let x1 , . . . , xn be all free variables in φ, c1 , . . . , cn constant symbols not in φ,


a1 , . . . , an ∈ |M|, and s(xi ) = ai .
Show that M, s ⊨ φ iff M[a1 /c1 , . . . , an /cn ] ||= φ[c1 /x1 ] . . . [cn /xn ].
(This problem shows that it is possible to give a semantics for first-order
logic that makes do without variable assignments.)

Problem 16.7. Suppose that f is a function symbol not in φ(x, y). Show that
there is a structure M such that M ⊨ ∀x ∃y φ(x, y) iff there is an M′ such that
M′ ⊨ ∀x φ(x, f (x)).
(This problem is a special case of what’s known as Skolem’s Theorem;
∀x φ(x, f (x)) is called a Skolem normal form of ∀x ∃y φ(x, y).)


16.6 Extensionality
Extensionality, sometimes called relevance, can be expressed informally as follows: the only factors that bear upon the satisfaction of a formula φ in a structure M relative to a variable assignment s are the size of the domain and the assignments made by M and s to the elements of the language that actually appear in φ.
One immediate consequence of extensionality is that where two structures M
and M′ agree on all the elements of the language appearing in a sentence φ
and have the same domain, M and M′ must also agree on whether or not φ
itself is true.


Proposition 16.20 (Extensionality). Let φ be a formula, M1 and M2 be structures with |M1| = |M2|, and s a variable assignment on |M1| = |M2|. If c^{M1} = c^{M2}, R^{M1} = R^{M2}, and f^{M1} = f^{M2} for every constant symbol c, relation symbol R, and function symbol f occurring in φ, then M1, s ⊨ φ iff M2, s ⊨ φ.

Proof. First prove (by induction on t) that for every term, Val^{M1}_s(t) = Val^{M2}_s(t). Then prove the proposition by induction on φ, making use of the claim just proved for the induction basis (where φ is atomic).

Problem 16.8. Carry out the proof of Proposition 16.20 in detail.

Corollary 16.21 (Extensionality for Sentences). Let φ be a sentence and M1, M2 as in Proposition 16.20. Then M1 ⊨ φ iff M2 ⊨ φ.

Proof. Follows from Proposition 16.20 by Corollary 16.15.

Moreover, the value of a term, and whether or not a structure satisfies


a formula, only depend on the values of its subterms.

Proposition 16.22. Let M be a structure, t and t′ terms, and s a variable assignment. Then Val^M_s(t[t′/x]) = Val^M_{s[Val^M_s(t′)/x]}(t).

Proof. By induction on t.

1. If t is a constant, say, t ≡ c, then t[t′/x] = c, and Val^M_s(c) = c^M = Val^M_{s[Val^M_s(t′)/x]}(c).

2. If t is a variable other than x, say, t ≡ y, then t[t′/x] = y, and Val^M_s(y) = Val^M_{s[Val^M_s(t′)/x]}(y), since s ∼_x s[Val^M_s(t′)/x].

3. If t ≡ x, then t[t′/x] = t′. But Val^M_{s[Val^M_s(t′)/x]}(x) = Val^M_s(t′) by definition of s[Val^M_s(t′)/x].

4. If t ≡ f(t1, ..., tn), then we have:

   Val^M_s(t[t′/x])
      = Val^M_s(f(t1[t′/x], ..., tn[t′/x]))                 by definition of t[t′/x]
      = f^M(Val^M_s(t1[t′/x]), ..., Val^M_s(tn[t′/x]))      by definition of Val^M_s(f(...))
      = f^M(Val^M_{s[Val^M_s(t′)/x]}(t1), ..., Val^M_{s[Val^M_s(t′)/x]}(tn))
                                                            by induction hypothesis
      = Val^M_{s[Val^M_s(t′)/x]}(t)                         by definition of Val^M_{s[Val^M_s(t′)/x]}(f(...))


Proposition 16.23. Let M be a structure, φ a formula, t′ a term, and s a variable assignment. Then M, s ⊨ φ[t′/x] iff M, s[Val^M_s(t′)/x] ⊨ φ.
s (t )/x] ⊨ φ.

Proof. Exercise.

Problem 16.9. Prove Proposition 16.23.

The point of Propositions 16.22 and 16.23 is the following. Suppose we have a term t or a formula φ and some term t′, and we want to know the value of t[t′/x] or whether or not φ[t′/x] is satisfied in a structure M relative to a variable assignment s. Then we can either perform the substitution first and then consider the value or satisfaction relative to M and s, or we can first determine the value m = Val^M_s(t′) of t′ in M relative to s, change the variable assignment to s[m/x], and then consider the value of t in M relative to s[m/x], or whether M, s[m/x] ⊨ φ. Propositions 16.22 and 16.23 guarantee that the answer will be the same, whichever way we do it.
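The following sketch checks this commutation on a concrete example, reusing the illustrative term representation from the sketch in section 16.5; the helper subst and the chosen terms are our own.

```python
FUNCS = {"succ": lambda n: n + 1, "+": lambda n, m: n + m}

def value(term, s):
    if isinstance(term, str):
        return s[term]
    f, *args = term
    return FUNCS[f](*(value(t, s) for t in args))

def subst(term, x, tprime):
    """The term term[tprime/x]."""
    if isinstance(term, str):
        return tprime if term == x else term
    f, *args = term
    return (f, *(subst(t, x, tprime) for t in args))

t = ("+", "x", ("succ", "x"))   # t  = (x + x′)
tp = ("succ", "y")              # t′ = y′
s = {"x": 0, "y": 4}
m = value(tp, s)                # m = Val^M_s(t′) = 5
# Substitute first, or modify the assignment first: same value.
assert value(subst(t, "x", tp), s) == value(t, {**s, "x": m}) == 11
```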


16.7 Semantic Notions


Given the definition of structures for first-order languages, we can define some basic semantic properties of and relationships between sentences. The simplest of these is the notion of validity of a sentence. A sentence is valid if it is satisfied in every structure. Valid sentences are those that are satisfied regardless of how the non-logical symbols in them are interpreted. Valid sentences are therefore also called logical truths—they are true, i.e., satisfied, in any structure, and hence their truth depends only on the logical symbols occurring in them and their syntactic structure, but not on the non-logical symbols or their interpretation.
Definition 16.24 (Validity). A sentence φ is valid, ⊨ φ, iff M ⊨ φ for every
structure M.

Definition 16.25 (Entailment). A set of sentences Γ entails a sentence φ,


Γ ⊨ φ, iff for every structure M with M ⊨ Γ , M ⊨ φ.

Definition 16.26 (Satisfiability). A set of sentences Γ is satisfiable if M ⊨


Γ for some structure M. If Γ is not satisfiable it is called unsatisfiable.

Proposition 16.27. A sentence φ is valid iff Γ ⊨ φ for every set of sen-


tences Γ .

Proof. For the forward direction, let φ be valid, and let Γ be a set of sentences.
Let M be a structure so that M ⊨ Γ . Since φ is valid, M ⊨ φ, hence Γ ⊨ φ.
For the contrapositive of the reverse direction, let φ be invalid, so there is
a structure M with M ⊭ φ. When Γ = {⊤}, since ⊤ is valid, M ⊨ Γ . Hence,
there is a structure M so that M ⊨ Γ but M ⊭ φ, hence Γ does not entail φ.



Proposition 16.28. Γ ⊨ φ iff Γ ∪ {¬φ} is unsatisfiable.

Proof. For the forward direction, suppose Γ ⊨ φ and suppose to the contrary
that there is a structure M so that M ⊨ Γ ∪ {¬φ}. Since M ⊨ Γ and Γ ⊨ φ,
M ⊨ φ. Also, since M ⊨ Γ ∪ {¬φ}, M ⊨ ¬φ, so we have both M ⊨ φ and
M ⊭ φ, a contradiction. Hence, there can be no such structure M, so Γ ∪ {¬φ}
is unsatisfiable.
For the reverse direction, suppose Γ ∪ {¬φ} is unsatisfiable. So for every
structure M, either M ⊭ Γ or M ⊨ φ. Hence, for every structure M with
M ⊨ Γ , M ⊨ φ, so Γ ⊨ φ.
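All first-order structures cannot be surveyed mechanically, but the propositional analogue of Proposition 16.28 can be verified by brute force over valuations. This is an illustrative stand-in only; the encoding of formulas as Python predicates on valuations is our own.

```python
from itertools import product

def entails(gamma, phi, n):
    """Γ ⊨ φ over all valuations of n atoms (formulas = predicates)."""
    return all(phi(v) for v in product([True, False], repeat=n)
               if all(g(v) for g in gamma))

def unsat(formulas, n):
    """No valuation of n atoms satisfies all the formulas."""
    return not any(all(f(v) for f in formulas)
                   for v in product([True, False], repeat=n))

# Γ = {p, p → q} and φ = q, with v = (p, q):
gamma = [lambda v: v[0], lambda v: (not v[0]) or v[1]]
phi = lambda v: v[1]
assert entails(gamma, phi, 2)
assert unsat(gamma + [lambda v: not phi(v)], 2)   # Γ ∪ {¬φ} unsatisfiable
```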

Problem 16.10. 1. Show that Γ ⊨ ⊥ iff Γ is unsatisfiable.

2. Show that Γ ∪ {φ} ⊨ ⊥ iff Γ ⊨ ¬φ.

3. Suppose c does not occur in φ or Γ . Show that Γ ⊨ ∀x φ iff Γ ⊨ φ[c/x].

Proposition 16.29. If Γ ⊆ Γ ′ and Γ ⊨ φ, then Γ ′ ⊨ φ.

Proof. Suppose that Γ ⊆ Γ ′ and Γ ⊨ φ. Let M be a structure such that


M ⊨ Γ ′ ; then M ⊨ Γ , and since Γ ⊨ φ, we get that M ⊨ φ. Hence, whenever
M ⊨ Γ ′ , M ⊨ φ, so Γ ′ ⊨ φ.

Theorem 16.30 (Semantic Deduction Theorem). Γ ∪ {φ} ⊨ ψ iff Γ ⊨ φ → ψ.

Proof. For the forward direction, let Γ ∪ {φ} ⊨ ψ and let M be a structure so
that M ⊨ Γ . If M ⊨ φ, then M ⊨ Γ ∪ {φ}, so since Γ ∪ {φ} entails ψ, we get
M ⊨ ψ. Therefore, M ⊨ φ → ψ, so Γ ⊨ φ → ψ.
For the reverse direction, let Γ ⊨ φ → ψ and M be a structure so that
M ⊨ Γ ∪ {φ}. Then M ⊨ Γ , so M ⊨ φ → ψ, and since M ⊨ φ, M ⊨ ψ. Hence,
whenever M ⊨ Γ ∪ {φ}, M ⊨ ψ, so Γ ∪ {φ} ⊨ ψ.

Proposition 16.31. Let M be a structure, φ(x) a formula with one free variable x, and t a closed term. Then:

1. φ(t) ⊨ ∃x φ(x)

2. ∀x φ(x) ⊨ φ(t)

Proof. 1. Suppose M ⊨ φ(t). Let s be a variable assignment with s(x) = Val^M(t). Then M, s ⊨ φ(t) since φ(t) is a sentence. By Proposition 16.23, M, s ⊨ φ(x). By Proposition 16.19, M ⊨ ∃x φ(x).

2. Suppose M ⊨ ∀x φ(x). Let s be a variable assignment with s(x) = Val^M(t). By Proposition 16.19, M, s ⊨ φ(x). By Proposition 16.23, M, s ⊨ φ(t). By Proposition 16.18, M ⊨ φ(t) since φ(t) is a sentence.

Problem 16.11. Complete the proof of Proposition 16.31.

Chapter 17

Theories and Their Models


17.1 Introduction
The development of the axiomatic method is a significant achievement in the
history of science, and is of special importance in the history of mathematics.
An axiomatic development of a field involves the clarification of many questions:
What is the field about? What are the most fundamental concepts? How are
they related? Can all the concepts of the field be defined in terms of these
fundamental concepts? What laws do, and must, these concepts obey?
The axiomatic method and logic were made for each other. Formal logic
provides the tools for formulating axiomatic theories, for proving theorems
from the axioms of the theory in a precisely specified way, for studying the
properties of all systems satisfying the axioms in a systematic way.

Definition 17.1. A set of sentences Γ is closed iff, whenever Γ ⊨ φ then


φ ∈ Γ . The closure of a set of sentences Γ is {φ : Γ ⊨ φ}.
We say that Γ is axiomatized by a set of sentences ∆ if Γ is the closure
of ∆.

We can think of an axiomatic theory as the set of sentences that is axiomatized by its set of axioms ∆. In other words, when we have a first-order


language which contains non-logical symbols for the primitives of the axiomat-
ically developed science we wish to study, together with a set of sentences
that express the fundamental laws of the science, we can think of the theory
as represented by all the sentences in this language that are entailed by the
axioms. This ranges from simple examples with only a single primitive and
simple axioms, such as the theory of partial orders, to complex theories such
as Newtonian mechanics.


The logical facts that make this formal approach to the axiomatic method so fruitful are the following. Suppose Γ is an axiom system for a theory, i.e., a set of sentences.

1. We can state precisely when an axiom system captures an intended class


of structures. That is, if we are interested in a certain class of struc-
tures, we will successfully capture that class by an axiom system Γ iff
the structures are exactly those M such that M ⊨ Γ .

2. We may fail in this respect because there are M such that M ⊨ Γ , but M
is not one of the structures we intend. This may lead us to add axioms
which are not true in M.

3. If we are successful at least in the respect that Γ is true in all the intended
structures, then a sentence φ is true in all intended structures whenever
Γ ⊨ φ. Thus we can use logical tools (such as derivation methods) to
show that sentences are true in all intended structures simply by showing
that they are entailed by the axioms.

4. Sometimes we don’t have intended structures in mind, but instead start


from the axioms themselves: we begin with some primitives that we
want to satisfy certain laws which we codify in an axiom system. One
thing that we would like to verify right away is that the axioms do not
contradict each other: if they do, there can be no concepts that obey
these laws, and we have tried to set up an incoherent theory. We can
verify that this doesn’t happen by finding a model of Γ . And if there
are models of our theory, we can use logical methods to investigate them,
and we can also use logical methods to construct models.

5. The independence of the axioms is likewise an important question. It may


happen that one of the axioms is actually a consequence of the others,
and so is redundant. We can prove that an axiom φ in Γ is redundant by
proving Γ \ {φ} ⊨ φ. We can also prove that an axiom is not redundant
by showing that (Γ \{φ})∪{¬φ} is satisfiable. For instance, this is how it
was shown that the parallel postulate is independent of the other axioms
of geometry.

6. Another important question is that of definability of concepts in a theory:


The choice of the language determines what the models of a theory consist
of. But not every aspect of a theory must be represented separately in its
models. For instance, every ordering ≤ determines a corresponding strict
ordering <—given one, we can define the other. So it is not necessary
that a model of a theory involving such an order must also contain the
corresponding strict ordering. When is it the case, in general, that one
relation can be defined in terms of others? When is it impossible to define
a relation in terms of others (and hence must add it to the primitives of
the language)?


17.2 Expressing Properties of Structures


It is often useful and important to express conditions on functions and relations,
or more generally, that the functions and relations in a structure satisfy these
conditions. For instance, we would like to have ways of distinguishing those
structures for a language which “capture” what we want the predicate symbols
to “mean” from those that do not. Of course we’re completely free to specify
which structures we “intend,” e.g., we can specify that the interpretation of
the predicate symbol ≤ must be an ordering, or that we are only interested in
interpretations of L in which the domain consists of sets and ∈ is interpreted
by the “is an element of” relation. But can we do this with sentences of the
language? In other words, which conditions on a structure M can we express
by a sentence (or perhaps a set of sentences) in the language of M? There are
some conditions that we will not be able to express. For instance, there is no
sentence of LA which is only true in a structure M if |M| = N. We cannot
express “the domain contains only natural numbers.” But there are “structural
properties” of structures that we perhaps can express. Which properties of
structures can we express by sentences? Or, to put it another way, which
collections of structures can we describe as those making a sentence (or set of
sentences) true?
Definition 17.2 (Model of a set). Let Γ be a set of sentences in a lan-
guage L. We say that a structure M is a model of Γ if M ⊨ φ for all φ ∈ Γ .

Example 17.3. The sentence ∀x x ≤ x is true in M iff ≤^M is a reflexive relation. The sentence ∀x ∀y ((x ≤ y ∧ y ≤ x) → x = y) is true in M iff ≤^M is anti-symmetric. The sentence ∀x ∀y ∀z ((x ≤ y ∧ y ≤ z) → x ≤ z) is true in M iff ≤^M is transitive. Thus, the models of

{ ∀x x ≤ x,
∀x ∀y ((x ≤ y ∧ y ≤ x) → x = y),
∀x ∀y ∀z ((x ≤ y ∧ y ≤ z) → x ≤ z) }

are exactly those structures in which ≤^M is reflexive, anti-symmetric, and


transitive, i.e., a partial order. Hence, we can take them as axioms for the
first-order theory of partial orders.
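For a concrete finite interpretation, being a model of these three axioms can be checked directly; the following sketch (with a domain and relation of our own choosing) does so for divisibility on {1, 2, 3, 6}.

```python
DOMAIN = {1, 2, 3, 6}
LEQ = {(a, b) for a in DOMAIN for b in DOMAIN if b % a == 0}  # a divides b

reflexive     = all((a, a) in LEQ for a in DOMAIN)
antisymmetric = all(a == b or not ((a, b) in LEQ and (b, a) in LEQ)
                    for a in DOMAIN for b in DOMAIN)
transitive    = all((a, c) in LEQ or not ((a, b) in LEQ and (b, c) in LEQ)
                    for a in DOMAIN for b in DOMAIN for c in DOMAIN)

# Divisibility on this domain is reflexive, anti-symmetric, and transitive,
# so the structure is a model of the partial-order axioms:
assert reflexive and antisymmetric and transitive
```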


17.3 Examples of First-Order Theories




Example 17.4. The theory of strict linear orders in the language L< is ax-
iomatized by the set

{ ∀x ¬x < x,
∀x ∀y ((x < y ∨ y < x) ∨ x = y),
∀x ∀y ∀z ((x < y ∧ y < z) → x < z) }

It completely captures the intended structures: every strict linear order is a


model of this axiom system, and vice versa: if R is a strict linear order on a set X, then the structure M with |M| = X and <^M = R is a model of this theory.

Example 17.5. The theory of groups in the language 1 (constant symbol), ·


(two-place function symbol) is axiomatized by

∀x (x · 1) = x
∀x ∀y ∀z (x · (y · z)) = ((x · y) · z)
∀x ∃y (x · y) = 1
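As a quick sanity check, a familiar structure can be verified to be a model of these axioms. The sketch below (our own illustration) reads · as addition modulo 5 and the constant 1 as 0.

```python
D = range(5)
op = lambda x, y: (x + y) % 5   # the interpretation of ·
e = 0                           # the interpretation of the constant 1

identity      = all(op(x, e) == x for x in D)
associativity = all(op(x, op(y, z)) == op(op(x, y), z)
                    for x in D for y in D for z in D)
inverses      = all(any(op(x, y) == e for y in D) for x in D)
assert identity and associativity and inverses   # ⟨Z_5, +, 0⟩ is a group
```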

Example 17.6. The theory of Peano arithmetic is axiomatized by the follow-


ing sentences in the language of arithmetic LA .

∀x ∀y (x′ = y ′ → x = y)
∀x 0 ≠ x′
∀x (x + 0) = x
∀x ∀y (x + y ′ ) = (x + y)′
∀x (x × 0) = 0
∀x ∀y (x × y ′ ) = ((x × y) + x)
∀x ∀y (x < y ↔ ∃z (z ′ + x) = y)

plus all sentences of the form

(φ(0) ∧ ∀x (φ(x) → φ(x′ ))) → ∀x φ(x)

Since there are infinitely many sentences of the latter form, this axiom system
is infinite. The latter form is called the induction schema. (Actually, the
induction schema is a bit more complicated than we let on here.)
The last axiom is an explicit definition of <.

Example 17.7. The theory of pure sets plays an important role in the foun-
dations (and in the philosophy) of mathematics. A set is pure if all its elements
are also pure sets. The empty set counts therefore as pure, but a set that has
something as an element that is not a set would not be pure. So the pure sets
are those that are formed just from the empty set and no “urelements,” i.e.,
objects that are not themselves sets.


The following might be considered as an axiom system for a theory of pure


sets:

∃x ¬∃y y ∈ x
∀x ∀y (∀z(z ∈ x ↔ z ∈ y) → x = y)
∀x ∀y ∃z ∀u (u ∈ z ↔ (u = x ∨ u = y))
∀x ∃y ∀z (z ∈ y ↔ ∃u (z ∈ u ∧ u ∈ x))

plus all sentences of the form

∃x ∀y (y ∈ x ↔ φ(y))

The first axiom says that there is a set with no elements (i.e., ∅ exists); the
second says that sets are extensional; the third that for any sets X and Y , the
set {X, Y } exists; the fourth that for any set X, the set ∪X exists, where ∪X
is the union of all the elements of X.
The sentences mentioned last are collectively called the naive comprehen-
sion scheme. It essentially says that for every φ(x), the set {x : φ(x)} exists—so
at first glance a true, useful, and perhaps even necessary axiom. It is called
“naive” because, as it turns out, it makes this theory unsatisfiable: if you take
φ(y) to be ¬y ∈ y, you get the sentence

∃x ∀y (y ∈ x ↔ ¬y ∈ y)

and this sentence is not satisfied in any structure.

Example 17.8. In the area of mereology, the relation of parthood is a funda-


mental relation. Just like theories of sets, there are theories of parthood that
axiomatize various conceptions (sometimes conflicting) of this relation.
The language of mereology contains a single two-place predicate symbol P ,
and P (x, y) “means” that x is a part of y. When we have this interpretation
in mind, a structure for this language is called a parthood structure. Of course,
not every structure for a single two-place predicate will really deserve this
name. To have a chance of capturing "parthood," P^M must satisfy some
conditions, which we can lay down as axioms for a theory of parthood. For
instance, parthood is a partial order on objects: every object is a part (albeit
an improper part) of itself; no two different objects can be parts of each other;
a part of a part of an object is itself part of that object. Note that in this sense
“is a part of” resembles “is a subset of,” but does not resemble “is an element
of” which is neither reflexive nor transitive.

∀x P (x, x)
∀x ∀y ((P (x, y) ∧ P (y, x)) → x = y)
∀x ∀y ∀z ((P (x, y) ∧ P (y, z)) → P (x, z))


Moreover, any two objects have a mereological sum (an object that has these
two objects as parts, and is minimal in this respect).

∀x ∀y ∃z ∀u (P (z, u) ↔ (P (x, u) ∧ P (y, u)))

These are only some of the basic principles of parthood considered by meta-
physicians. Further principles, however, quickly become hard to formulate or
write down without first introducing some defined relations. For instance, most
metaphysicians interested in mereology also view the following as a valid prin-
ciple: whenever an object x has a proper part y, it also has a part z that has
no parts in common with y, and so that the fusion of y and z is x.


17.4 Expressing Relations in a Structure


One main use formulas can be put to is to express properties and relations in a structure M in terms of the primitives of the language L of M. By this we mean the following: the domain of M is a set of objects. The constant symbols, function symbols, and predicate symbols are interpreted in M by some objects in |M|, functions on |M|, and relations on |M|. For instance, if A^2_0 is in L, then M assigns to it a relation R = (A^2_0)^M. Then the formula A^2_0(v1, v2) expresses that very relation, in the following sense: if a variable assignment s maps v1 to a ∈ |M| and v2 to b ∈ |M|, then

   Rab iff M, s ⊨ A^2_0(v1, v2).

Note that we have to involve variable assignments here: we can't just say "Rab iff M ⊨ A^2_0(a, b)," because a and b are not symbols of our language: they are elements of |M|.
Since we don’t just have atomic formulas, but can combine them using the
logical connectives and the quantifiers, more complex formulas can define other
relations which aren’t directly built into M. We’re interested in how to do that,
and specifically, which relations we can define in a structure.

Definition 17.9. Let φ(v1, ..., vn) be a formula of L in which only v1, ..., vn occur free, and let M be a structure for L. φ(v1, ..., vn) expresses the relation R ⊆ |M|^n iff

   Ra1 ... an iff M, s ⊨ φ(v1, ..., vn)

for any variable assignment s with s(vi) = ai (i = 1, ..., n).

Example 17.10. In the standard model of arithmetic N, the formula v1 < v2 ∨ v1 = v2 expresses the ≤ relation on N. The formula v2 = v1′ expresses the successor relation, i.e., the relation R ⊆ N^2 where Rnm holds if m is the

successor of n. The formula v1 = v2′ expresses the predecessor relation. The formulas ∃v3 (v3 ≠ 0 ∧ v2 = (v1 + v3)) and ∃v3 (v1 + v3′) = v2 both express the < relation. This means that the predicate symbol < is actually superfluous in the language of arithmetic; it can be defined.
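Over a finite initial segment of the standard model, such claims can be tested mechanically. The cutoff below is our own illustrative choice; the standard model itself is of course infinite.

```python
N = range(20)   # a finite stand-in for the domain of N

def phi(a, b):
    """Satisfaction of ∃v3 (v3 ≠ 0 ∧ v2 = (v1 + v3)) with s(v1)=a, s(v2)=b."""
    return any(c != 0 and b == a + c for c in N)

# On this fragment the formula's extension agrees with <:
assert all(phi(a, b) == (a < b) for a in range(10) for b in range(10))
```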

This idea is not just interesting in specific structures, but generally when-

ever we use a language to describe an intended model or models, i.e., when we


consider theories. These theories often only contain a few predicate symbols as
basic symbols, but in the domain they are used to describe often many other
relations play an important role. If these other relations can be systematically
expressed by the relations that interpret the basic predicate symbols of the
language, we say we can define them in the language.

Problem 17.1. Find formulas in LA which define the following relations:

1. n is between i and j;

2. n evenly divides m (i.e., m is a multiple of n);

3. n is a prime number (i.e., no number other than 1 and n evenly divides n).
Problem 17.2. Suppose the formula φ(v1, v2) expresses the relation R ⊆ |M|^2
in a structure M. Find formulas that express the following relations:

1. the inverse R−1 of R;

2. the relative product R | R;

Can you find a way to express R+ , the transitive closure of R?

Problem 17.3. Let L be the language containing a 2-place predicate symbol


< only (no other constant symbols, function symbols or predicate symbols—
except of course =). Let N be the structure such that |N| = N, and <^N = {⟨n, m⟩ : n < m}. Prove the following:

1. {0} is definable in N;

2. {1} is definable in N;

3. {2} is definable in N;

4. for each n ∈ N, the set {n} is definable in N;

5. every finite subset of |N| is definable in N;

6. every co-finite subset of |N| is definable in N (where X ⊆ N is co-finite


iff N \ X is finite).


17.5 The Theory of Sets


Almost all of mathematics can be developed in the theory of sets. Developing
mathematics in this theory involves a number of things. First, it requires a
set of axioms for the relation ∈. A number of different axiom systems have
been developed, sometimes with conflicting properties of ∈. The axiom system
known as ZFC, Zermelo–Fraenkel set theory with the axiom of choice stands
out: it is by far the most widely used and studied, because it turns out that its
axioms suffice to prove almost all the things mathematicians expect to be able
to prove. But before that can be established, it first is necessary to make clear
how we can even express all the things mathematicians would like to express.
For starters, the language contains no constant symbols or function symbols, so
it seems at first glance unclear that we can talk about particular sets (such as
∅ or N), can talk about operations on sets (such as X ∪ Y and ℘(X)), let alone
other constructions which involve things other than sets, such as relations and
functions.
To begin with, “is an element of” is not the only relation we are interested
in: “is a subset of” seems almost as important. But we can define “is a subset
of” in terms of “is an element of.” To do this, we have to find a formula φ(x, y)
in the language of set theory which is satisfied by a pair of sets ⟨X, Y ⟩ iff
X ⊆ Y . But X is a subset of Y just in case all elements of X are also elements
of Y . So we can define ⊆ by the formula

∀z (z ∈ x → z ∈ y)

Now, whenever we want to use the relation ⊆ in a formula, we could instead


use that formula (with x and y suitably replaced, and the bound variable z
renamed if necessary). For instance, extensionality of sets means that if any
sets x and y are contained in each other, then x and y must be the same set.
This can be expressed by ∀x ∀y ((x ⊆ y ∧ y ⊆ x) → x = y), or, if we replace ⊆
by the above definition, by

∀x ∀y ((∀z (z ∈ x → z ∈ y) ∧ ∀z (z ∈ y → z ∈ x)) → x = y).

This is in fact one of the axioms of ZFC, the “axiom of extensionality.”


There is no constant symbol for ∅, but we can express “x is empty” by
¬∃y y ∈ x. Then “∅ exists” becomes the sentence ∃x ¬∃y y ∈ x. This is
another axiom of ZFC. (Note that the axiom of extensionality implies that
there is only one empty set.) Whenever we want to talk about ∅ in the language
of set theory, we would write this as “there is a set that’s empty and . . . ” As
an example, to express the fact that ∅ is a subset of every set, we could write

∃x (¬∃y y ∈ x ∧ ∀z x ⊆ z)

where, of course, x ⊆ z would in turn have to be replaced by its definition.


To talk about operations on sets, such as X ∪ Y and ℘(X), we have to use
a similar trick. There are no function symbols in the language of set theory,


but we can express the functional relations X ∪ Y = Z and ℘(X) = Y by

∀u ((u ∈ x ∨ u ∈ y) ↔ u ∈ z)
∀u (u ⊆ x ↔ u ∈ y)

since the elements of X ∪Y are exactly the sets that are either elements of X or
elements of Y , and the elements of ℘(X) are exactly the subsets of X. However,
this doesn’t allow us to use x ∪ y or ℘(x) as if they were terms: we can only use
the entire formulas that define the relations X ∪ Y = Z and ℘(X) = Y . In fact,
we do not know that these relations are ever satisfied, i.e., we do not know that
unions and power sets always exist. For instance, the sentence ∀x ∃y ℘(x) = y
is another axiom of ZFC (the power set axiom).
Now what about talk of ordered pairs or functions? Here we have to explain
how we can think of ordered pairs and functions as special kinds of sets. One
way to define the ordered pair ⟨x, y⟩ is as the set {{x}, {x, y}}. But like before,
we cannot introduce a function symbol that names this set; we can only define
the relation ⟨x, y⟩ = z, i.e., {{x}, {x, y}} = z:

∀u (u ∈ z ↔ (∀v (v ∈ u ↔ v = x) ∨ ∀v (v ∈ u ↔ (v = x ∨ v = y))))

This says that the elements u of z are exactly those sets which either have x as their only element or have x and y as their only elements (in other words, those
sets that are either identical to {x} or identical to {x, y}). Once we have this,
we can say further things, e.g., that X × Y = Z:

∀z (z ∈ Z ↔ ∃x ∃y (x ∈ X ∧ y ∈ Y ∧ ⟨x, y⟩ = z))
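The coding ⟨x, y⟩ = {{x}, {x, y}} is only serviceable because distinct ordered pairs receive distinct codes. A small check of this injectivity over a finite collection (the frozenset encoding is our own illustration):

```python
from itertools import product

def kpair(x, y):
    """The Kuratowski pair {{x}, {x, y}}, coded with frozensets."""
    return frozenset({frozenset({x}), frozenset({x, y})})

codes = {}
for x, y in product(range(5), repeat=2):
    code = kpair(x, y)
    assert code not in codes     # no two distinct pairs share a code
    codes[code] = (x, y)
assert len(codes) == 25          # 25 ordered pairs, 25 distinct codes
```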

A function f : X → Y can be thought of as the relation f (x) = y, i.e., as


the set of pairs {⟨x, y⟩ : f (x) = y}. We can then say that a set f is a function
from X to Y if (a) it is a relation ⊆ X × Y , (b) it is total, i.e., for all x ∈ X
there is some y ∈ Y such that ⟨x, y⟩ ∈ f and (c) it is functional, i.e., whenever
⟨x, y⟩, ⟨x, y ′ ⟩ ∈ f , y = y ′ (because values of functions must be unique). So “f
is a function from X to Y ” can be written as:

∀u (u ∈ f → ∃x ∃y (x ∈ X ∧ y ∈ Y ∧ ⟨x, y⟩ = u)) ∧
∀x (x ∈ X → (∃y (y ∈ Y ∧ maps(f, x, y)) ∧
∀y ∀y′ ((maps(f, x, y) ∧ maps(f, x, y′)) → y = y′)))

where maps(f, x, y) abbreviates ∃v (v ∈ f ∧ ⟨x, y⟩ = v) (this formula expresses


“f (x) = y”).
It is now also not hard to express that f : X → Y is injective, for instance:

f : X → Y ∧ ∀x ∀x′ ((x ∈ X ∧ x′ ∈ X ∧
∃y (maps(f, x, y) ∧ maps(f, x′ , y))) → x = x′ )

A function f : X → Y is injective iff, whenever f maps x, x′ ∈ X to a single y,


x = x′ . If we abbreviate this formula as inj(f, X, Y ), we’re already in a position


to state in the language of set theory something as non-trivial as Cantor’s


theorem: there is no injective function from ℘(X) to X:
∀X ∀Y (℘(X) = Y → ¬∃f inj(f, Y, X))
One might think that set theory requires another axiom that guarantees
the existence of a set for every defining property. If φ(x) is a formula of set
theory with the variable x free, we can consider the sentence
∃y ∀x (x ∈ y ↔ φ(x)).
This sentence states that there is a set y whose elements are all and only those
x that satisfy φ(x). This schema is called the “comprehension principle.” It
looks very useful; unfortunately it is inconsistent. Take φ(x) ≡ ¬x ∈ x, then
the comprehension principle states
∃y ∀x (x ∈ y ↔ x ∉ x),
i.e., it states the existence of a set of all sets that are not elements of themselves.
No such set can exist—this is Russell’s Paradox. ZFC, in fact, contains a
restricted—and consistent—version of this principle, the separation principle:
∀z ∃y ∀x (x ∈ y ↔ (x ∈ z ∧ φ(x))).
Problem 17.4. Show that the comprehension principle is inconsistent by giv-
ing a derivation that shows
∃y ∀x (x ∈ y ↔ x ∉ x) ⊢ ⊥.
It may help to first show (A → ¬A) ∧ (¬A → A) ⊢ ⊥.


17.6 Expressing the Size of Structures


There are some properties of structures we can express even without using the
non-logical symbols of a language. For instance, there are sentences which are
true in a structure iff the domain of the structure has at least, at most, or
exactly a certain number n of elements.
Proposition 17.11. The sentence

   φ≥n ≡ ∃x1 ∃x2 ... ∃xn
          (x1 ≠ x2 ∧ x1 ≠ x3 ∧ x1 ≠ x4 ∧ ··· ∧ x1 ≠ xn ∧
           x2 ≠ x3 ∧ x2 ≠ x4 ∧ ··· ∧ x2 ≠ xn ∧
           ⋮
           xn−1 ≠ xn)

is true in a structure M iff |M| contains at least n elements. Consequently, M ⊨ ¬φ≥n+1 iff |M| contains at most n elements.

Proposition 17.12. The sentence

   φ=n ≡ ∃x1 ∃x2 ... ∃xn
          (x1 ≠ x2 ∧ x1 ≠ x3 ∧ x1 ≠ x4 ∧ ··· ∧ x1 ≠ xn ∧
           x2 ≠ x3 ∧ x2 ≠ x4 ∧ ··· ∧ x2 ≠ xn ∧
           ⋮
           xn−1 ≠ xn ∧
           ∀y (y = x1 ∨ ··· ∨ y = xn))

is true in a structure M iff |M| contains exactly n elements.
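Both families of sentences are entirely schematic and can be generated mechanically. Here is an illustrative generator for φ≥n, a helper of our own (with n ≥ 2 assumed, so that the conjunction is non-empty):

```python
def phi_geq(n):
    """The sentence φ≥n of Proposition 17.11, as a string (n ≥ 2)."""
    vs = [f"x{i}" for i in range(1, n + 1)]
    quants = " ".join(f"∃{v}" for v in vs)
    diffs = " ∧ ".join(f"{vs[i]} ≠ {vs[j]}"
                       for i in range(n) for j in range(i + 1, n))
    return f"{quants} ({diffs})"

print(phi_geq(3))   # ∃x1 ∃x2 ∃x3 (x1 ≠ x2 ∧ x1 ≠ x3 ∧ x2 ≠ x3)
```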

Proposition 17.13. A structure is infinite iff it is a model of

{φ≥1 , φ≥2 , φ≥3 , . . . }.

There is no single purely logical sentence which is true in M iff |M| is


infinite. However, one can give sentences with non-logical predicate symbols
which only have infinite models (although not every infinite structure is a model
of them). The property of being a finite structure, and the property of being
a non-enumerable structure cannot even be expressed with an infinite set of
sentences. These facts follow from the compactness and Löwenheim–Skolem
theorems.

Chapter 18

Derivation Systems

This chapter collects general material on derivation systems. A text-


book using a specific system can insert the introduction section plus the
relevant survey section at the beginning of the chapter introducing that
system.



18.1 Introduction
Logics commonly have both a semantics and a derivation system. The semantics concerns concepts such as truth, satisfiability, validity, and entailment.
The purpose of derivation systems is to provide a purely syntactic method of
establishing entailment and validity. They are purely syntactic in the sense
that a derivation in such a system is a finite syntactic object, usually a se-
quence (or other finite arrangement) of sentences or formulas. Good derivation
systems have the property that any given sequence or arrangement of sentences
or formulas can be verified mechanically to be “correct.”
The simplest (and historically first) derivation systems for first-order logic
were axiomatic. A sequence of formulas counts as a derivation in such a sys-
tem if each individual formula in it is either among a fixed set of “axioms”
or follows from formulas coming before it in the sequence by one of a fixed
number of “inference rules”—and it can be mechanically verified if a formula
is an axiom and whether it follows correctly from other formulas by one of the
inference rules. Axiomatic derivation systems are easy to describe—and also
easy to handle meta-theoretically—but derivations in them are hard to read
and understand, and are also hard to produce.
Other derivation systems have been developed with the aim of making it
easier to construct derivations or easier to understand derivations once they
are complete. Examples are natural deduction, truth trees, also known as
tableaux proofs, and the sequent calculus. Some derivation systems are de-
signed especially with mechanization in mind, e.g., the resolution method is
easy to implement in software (but its derivations are essentially impossible to
understand). Most of these other derivation systems represent derivations as
trees of formulas rather than sequences. This makes it easier to see which parts
of a derivation depend on which other parts.
So for a given logic, such as first-order logic, the different derivation systems
will give different explications of what it is for a sentence to be a theorem and
what it means for a sentence to be derivable from some others. However that is
done (via axiomatic derivations, natural deductions, sequent derivations, truth
trees, resolution refutations), we want these relations to match the semantic
notions of validity and entailment. Let’s write ⊢ φ for “φ is a theorem” and
“Γ ⊢ φ” for “φ is derivable from Γ .” However ⊢ is defined, we want it to match
up with ⊨, that is:
1. ⊢ φ if and only if ⊨ φ
2. Γ ⊢ φ if and only if Γ ⊨ φ
The “only if” direction of the above is called soundness. A derivation system is
sound if derivability guarantees entailment (or validity). Every decent deriva-
tion system has to be sound; unsound derivation systems are not useful at all.


After all, the entire purpose of a derivation is to provide a syntactic guarantee


of validity or entailment. We’ll prove soundness for the derivation systems we
present.
The converse “if” direction is also important: it is called completeness.
A complete derivation system is strong enough to show that φ is a theorem
whenever φ is valid, and that Γ ⊢ φ whenever Γ ⊨ φ. Completeness is harder
to establish, and some logics have no complete derivation systems. First-order
logic does. Kurt Gödel was the first one to prove completeness for a derivation
system of first-order logic in his 1929 dissertation.
Another concept that is connected to derivation systems is that of consis-
tency. A set of sentences is called inconsistent if anything whatsoever can be
derived from it, and consistent otherwise. Inconsistency is the syntactic coun-
terpart to unsatisfiablity: like unsatisfiable sets, inconsistent sets of sentences
do not make good theories, they are defective in a fundamental way. Consis-
tent sets of sentences may not be true or useful, but at least they pass that
minimal threshold of logical usefulness. For different derivation systems the
specific definition of consistency of sets of sentences might differ, but like ⊢, we
want consistency to coincide with its semantic counterpart, satisfiability. We
want it to always be the case that Γ is consistent if and only if it is satisfi-
able. Here, the “if” direction amounts to completeness (consistency guarantees
satisfiability), and the “only if” direction amounts to soundness (satisfiability
guarantees consistency). In fact, for classical first-order logic, the two versions
of soundness and completeness are equivalent.


18.2 The Sequent Calculus


While many derivation systems operate with arrangements of sentences, the
sequent calculus operates with sequents. A sequent is an expression of the
form
φ1, ..., φm ⇒ ψ1, ..., ψn,
that is a pair of sequences of sentences, separated by the sequent symbol ⇒.
Either sequence may be empty. A derivation in the sequent calculus is a tree
of sequents, where the topmost sequents are of a special form (they are called
“initial sequents” or “axioms”) and every other sequent follows from the se-
quents immediately above it by one of the rules of inference. The rules of
inference either manipulate the sentences in the sequents (adding, removing,
or rearranging them on either the left or the right), or they introduce a com-
plex formula in the conclusion of the rule. For instance, the ∧L rule allows the
inference from φ, Γ ⇒ ∆ to φ ∧ ψ, Γ ⇒ ∆, and the →R allows the inference
from φ, Γ ⇒ ∆, ψ to Γ ⇒ ∆, φ → ψ, for any Γ , ∆, φ, and ψ. (In particular, Γ
and ∆ may be empty.)
The ⊢ relation based on the sequent calculus is defined as follows: Γ ⊢ φ
iff there is some sequence Γ0 such that every φ in Γ0 is in Γ and there is a


derivation with the sequent Γ0 ⇒ φ at its root. φ is a theorem in the sequent


calculus if the sequent ⇒ φ has a derivation. For instance, here is a derivation
that shows that ⊢ (φ ∧ ψ) → φ:
φ ⇒ φ
∧L
φ∧ψ ⇒ φ
→R
⇒ (φ ∧ ψ) → φ

A set Γ is inconsistent in the sequent calculus if there is a derivation of


Γ0 ⇒ (where every φ ∈ Γ0 is in Γ and the right side of the sequent is empty).
Using the rule WR, any sentence can be derived from an inconsistent set.
The sequent calculus was invented in the 1930s by Gerhard Gentzen. Be-
cause of its systematic and symmetric design, it is a very useful formalism for
developing a theory of derivations. It is relatively easy to find derivations in
the sequent calculus, but these derivations are often hard to read and their
connection to proofs are sometimes not easy to see. It has proved to be a very
elegant approach to derivation systems, however, and many logics have sequent
calculus systems.


18.3 Natural Deduction


Natural deduction is a derivation system intended to mirror actual reasoning
(especially the kind of regimented reasoning employed by mathematicians).
Actual reasoning proceeds by a number of “natural” patterns. For instance,
proof by cases allows us to establish a conclusion on the basis of a disjunctive
premise, by establishing that the conclusion follows from either of the disjuncts.
Indirect proof allows us to establish a conclusion by showing that its negation
leads to a contradiction. Conditional proof establishes a conditional claim “if
. . . then . . . ” by showing that the consequent follows from the antecedent.
Natural deduction is a formalization of some of these natural inferences. Each
of the logical connectives and quantifiers comes with two rules, an introduction
and an elimination rule, and they each correspond to one such natural inference
pattern. For instance, →Intro corresponds to conditional proof, and ∨Elim to
proof by cases. A particularly simple rule is ∧Elim which allows the inference
from φ ∧ ψ to φ (or ψ).
One feature that distinguishes natural deduction from other derivation sys-
tems is its use of assumptions. A derivation in natural deduction is a tree
of formulas. A single formula stands at the root of the tree of formulas, and
the “leaves” of the tree are formulas from which the conclusion is derived. In
natural deduction, some leaf formulas play a role inside the derivation but are
“used up” by the time the derivation reaches the conclusion. This corresponds
to the practice, in actual reasoning, of introducing hypotheses which only re-
main in effect for a short while. For instance, in a proof by cases, we assume
the truth of each of the disjuncts; in conditional proof, we assume the truth


of the antecedent; in indirect proof, we assume the truth of the negation of


the conclusion. This way of introducing hypothetical assumptions and then
doing away with them in the service of establishing an intermediate step is a
hallmark of natural deduction. The formulas at the leaves of a natural de-
duction derivation are called assumptions, and some of the rules of inference
may “discharge” them. For instance, if we have a derivation of ψ from some
assumptions which include φ, then the →Intro rule allows us to infer φ→ψ and
discharge any assumption of the form φ. (To keep track of which assumptions
are discharged at which inferences, we label the inference and the assumptions
it discharges with a number.) The assumptions that remain undischarged at
the end of the derivation are together sufficient for the truth of the conclu-
sion, and so a derivation establishes that its undischarged assumptions entail
its conclusion.
The relation Γ ⊢ φ based on natural deduction holds iff there is a derivation
in which φ is the last sentence in the tree, and every leaf which is undischarged
is in Γ . φ is a theorem in natural deduction iff there is a derivation in which
φ is the last sentence and all assumptions are discharged. For instance, here is
a derivation that shows that ⊢ (φ ∧ ψ) → φ:
[φ ∧ ψ]1
φ ∧Elim
1 →Intro
(φ ∧ ψ) → φ
The label 1 indicates that the assumption φ ∧ ψ is discharged at the →Intro
inference.
A set Γ is inconsistent iff Γ ⊢ ⊥ in natural deduction. The rule ⊥I makes
it so that from an inconsistent set, any sentence can be derived.
Natural deduction systems were developed by Gerhard Gentzen and Sta-
nislaw Jaśkowski in the 1930s, and later developed by Dag Prawitz and Frederic
Fitch. Because its inferences mirror natural methods of proof, it is favored by
philosophers. The versions developed by Fitch are often used in introductory
logic textbooks. In the philosophy of logic, the rules of natural deduction have
sometimes been taken to give the meanings of the logical operators (“proof-
theoretic semantics”).


18.4 Tableaux
While many derivation systems operate with arrangements of sentences, tableaux
operate with signed formulas. A signed formula is a pair consisting of a truth
value sign (T or F) and a sentence

T φ or F φ.

A tableau consists of signed formulas arranged in a downward-branching tree.


It begins with a number of assumptions and continues with signed formulas


which result from one of the signed formulas above it by applying one of the
rules of inference. Each rule allows us to add one or more signed formulas to
the end of a branch, or two signed formulas side by side—in this case a branch
splits into two, with the two added signed formulas forming the ends of the
two branches.
A rule applied to a complex signed formula results in the addition of signed
formulas which are immediate sub-formulas. They come in pairs, one rule for
each of the two signs. For instance, the ∧T rule applies to T φ ∧ ψ, and allows
the addition of both the two signed formulas T φ and T ψ to the end of any
branch containing T φ ∧ ψ, and the ∧F rule allows a branch to be split by
adding F φ and F ψ side-by-side. A tableau is closed if every one of its branches
contains a matching pair of signed formulas T φ and F φ.
The ⊢ relation based on tableaux is defined as follows: Γ ⊢ φ iff there is
some finite set Γ0 = {ψ1 , . . . , ψn } ⊆ Γ such that there is a closed tableau for
the assumptions
{F φ, T ψ1 , . . . , T ψn }
For instance, here is a closed tableau that shows that ⊢ (φ ∧ ψ) → φ:

1. F (φ ∧ ψ) → φ     Assumption
2. T φ ∧ ψ           →F 1
3. F φ               →F 1
4. T φ               ∧T 2
5. T ψ               ∧T 2
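The systematic character of the method makes it easy to implement. Below is a minimal propositional tableau prover of our own devising, restricted to ¬, ∧, and →; it is a sketch of the idea, not the tableau system presented later in this text.

```python
def closed(branch):
    """A branch is closed if it contains both T f and F f for some f."""
    return any((True, f) in branch and (False, f) in branch
               for _, f in branch)

def expand(branch):
    """True iff every completion of this branch closes."""
    if closed(branch):
        return True
    for sf in branch:
        sign, f = sf
        if isinstance(f, str):                 # atoms cannot be expanded
            continue
        rest = [x for x in branch if x != sf]
        if f[0] == "not":                      # T¬A adds FA; F¬A adds TA
            return expand(rest + [(not sign, f[1])])
        if f[0] == "and":
            if sign:                           # T(A ∧ B): add TA and TB
                return expand(rest + [(True, f[1]), (True, f[2])])
            return (expand(rest + [(False, f[1])]) and    # F(A ∧ B):
                    expand(rest + [(False, f[2])]))       # split FA | FB
        if f[0] == "->":
            if sign:                           # T(A → B): split FA | TB
                return (expand(rest + [(False, f[1])]) and
                        expand(rest + [(True, f[2])]))
            return expand(rest + [(True, f[1]), (False, f[2])])
    return False                               # fully expanded, still open

def valid(f):
    """f is valid iff the tableau starting from F f closes."""
    return expand([(False, f)])

assert valid(("->", ("and", "p", "q"), "p"))   # the example above
assert not valid(("->", "p", "q"))             # p → q is not valid
```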

A set Γ is inconsistent in the tableau calculus if there is a closed tableau


for assumptions
{T ψ1 , . . . , T ψn }
for some ψi ∈ Γ .
Tableaux were invented in the 1950s independently by Evert Beth and
Jaakko Hintikka, and simplified and popularized by Raymond Smullyan. They
are very easy to use, since constructing a tableau is a very systematic proce-
dure. Because of the systematic nature of tableaux, they also lend themselves
to implementation by computer. However, a tableau is often hard to read and
their connection to proofs are sometimes not easy to see. The approach is also
quite general, and many different logics have tableau systems. Tableaux also
help us to find structures that satisfy given (sets of) sentences: if the set is
satisfiable, it won’t have a closed tableau, i.e., any tableau will have an open
branch. The satisfying structure can be “read off” an open branch, provided
every rule it is possible to apply has been applied on that branch. There is also
a very close connection to the sequent calculus: essentially, a closed tableau is
a condensed derivation in the sequent calculus, written upside-down.


18.5 Axiomatic Derivations


Axiomatic systems are the oldest and simplest kind of logical derivation system. Derivations in them are simply sequences of sentences. A sequence of sentences counts as a correct derivation if every sentence φ in it satisfies one of the following conditions:

1. φ is an axiom, or

2. φ is an element of a given set Γ of sentences, or

3. φ is justified by a rule of inference.

To be an axiom, φ has to have the form of one of a number of fixed sentence


schemas. There are many sets of axiom schemas that provide a satisfactory
(sound and complete) derivation system for first-order logic. Some are orga-
nized according to the connectives they govern, e.g., the schemas

φ → (ψ → φ) ψ → (ψ ∨ χ) (ψ ∧ χ) → ψ

are common axioms that govern →, ∨ and ∧. Some axiom systems aim at a
minimal number of axioms. Depending on the connectives that are taken as
primitives, it is even possible to find axiom systems that consist of a single
axiom.
A rule of inference is a conditional statement that gives a sufficient condition
for a sentence in a derivation to be justified. Modus ponens is one very common
such rule: it says that if φ and φ → ψ are already justified, then ψ is justified.
This means that a line in a derivation containing the sentence ψ is justified,
provided that both φ and φ → ψ (for some sentence φ) appear in the derivation
before ψ.
The ⊢ relation based on axiomatic derivations is defined as follows: Γ ⊢ φ
iff there is a derivation with the sentence φ as its last formula (and Γ is taken
as the set of sentences in that derivation which are justified by (2) above). φ
is a theorem if φ has a derivation where Γ is empty, i.e., every sentence in the
derivation is justified either by (1) or (3). For instance, here is a derivation
that shows that ⊢ φ → (ψ → (ψ ∨ φ)):

1. ψ → (ψ ∨ φ)
2. (ψ → (ψ ∨ φ)) → (φ → (ψ → (ψ ∨ φ)))
3. φ → (ψ → (ψ ∨ φ))

The sentence on line 1 is of the form of the axiom φ → (φ ∨ ψ) (with the


roles of φ and ψ reversed). The sentence on line 2 is of the form of the axiom
φ→(ψ →φ). Thus, both lines are justified. Line 3 is justified by modus ponens:
if we abbreviate it as θ, then line 2 has the form χ → θ, where χ is ψ → (ψ ∨ φ),
i.e., line 1.
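Checking such a derivation is itself mechanical, which is much of the point of the axiomatic format. Below is a toy checker of our own, covering only the schema φ → (ψ → φ), premises, and modus ponens; it is an illustration, not a full proof checker.

```python
def imp(a, b):
    return ("->", a, b)

def is_ax1(f):
    """Is f an instance of the schema φ → (ψ → φ)?"""
    return (isinstance(f, tuple) and f[0] == "->" and
            isinstance(f[2], tuple) and f[2][0] == "->" and f[2][2] == f[1])

def check(lines, premises=()):
    """Each line must be an axiom, a premise, or follow by modus ponens."""
    for i, f in enumerate(lines):
        mp = any(lines[k] == ("->", lines[j], f)
                 for j in range(i) for k in range(i))
        if not (f in premises or is_ax1(f) or mp):
            return False
    return True

p, q = "p", "q"
# From the premise p, derive q → p:
#   1. p → (q → p)   (axiom schema φ → (ψ → φ))
#   2. p             (premise)
#   3. q → p         (modus ponens from 2 and 1)
assert check([imp(p, imp(q, p)), p, imp(q, p)], premises=(p,))
```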
A set Γ is inconsistent if Γ ⊢ ⊥. A complete axiom system will also prove
that ⊥ → φ for any φ, and so if Γ is inconsistent, then Γ ⊢ φ for any φ.



Systems of axiomatic derivations for logic were first given by Gottlob Frege
in his 1879 Begriffsschrift, which for this reason is often considered the first
work of modern logic. They were perfected in Alfred North Whitehead and
Bertrand Russell’s Principia Mathematica and by David Hilbert and his stu-
dents in the 1920s. They are thus often called “Frege systems” or “Hilbert
systems.” They are very versatile in that it is often easy to find an axiomatic
system for a logic. Because derivations have a very simple structure and only
one or two inference rules, it is also relatively easy to prove things about them.
However, they are very hard to use in practice, i.e., it is difficult to find and
write proofs.

Chapter 19

The Sequent Calculus

This chapter presents Gentzen’s standard sequent calculus LK for clas-


sical first-order logic. It could use more examples and exercises. To include
or exclude material relevant to the sequent calculus as a proof system, use
the “prfLK” tag.


19.1 Rules and Derivations


For the following, let Γ, ∆, Π, Λ represent finite sequences of sentences.

Definition 19.1 (Sequent). A sequent is an expression of the form

Γ ⇒∆

where Γ and ∆ are finite (possibly empty) sequences of sentences of the lan-
guage L. Γ is called the antecedent, while ∆ is the succedent.


The intuitive idea behind a sequent is: if all of the sentences in the an-
tecedent hold, then at least one of the sentences in the succedent holds. That
is, if Γ = ⟨φ1 , . . . , φm ⟩ and ∆ = ⟨ψ1 , . . . , ψn ⟩, then Γ ⇒ ∆ holds iff

(φ1 ∧ · · · ∧ φm ) → (ψ1 ∨ · · · ∨ ψn )

holds. There are two special cases: where Γ is empty and when ∆ is empty.
When Γ is empty, i.e., m = 0, ⇒ ∆ holds iff ψ1 ∨ · · · ∨ ψn holds. When ∆
is empty, i.e., n = 0, Γ ⇒ holds iff ¬(φ1 ∧ · · · ∧ φm ) does. We say a sequent
is valid iff the corresponding sentence is valid.
If Γ is a sequence of sentences, we write Γ, φ for the result of appending φ
to the right end of Γ (and φ, Γ for the result of appending φ to the left end
of Γ ). If ∆ is a sequence of sentences also, then Γ, ∆ is the concatenation of
the two sequences.

Definition 19.2 (Initial Sequent). An initial sequent is a sequent of one of


the following forms:

1. φ ⇒ φ

2. ⇒⊤

3. ⊥ ⇒

for any sentence φ in the language.

Derivations in the sequent calculus are certain trees of sequents, where the
topmost sequents are initial sequents, and if a sequent stands below one or two
other sequents, it must follow correctly by a rule of inference. The rules for LK
are divided into two main types: logical rules and structural rules. The logical
rules are named for the main operator of the sentence containing φ and/or ψ in
the lower sequent. Each one comes in two versions, one for inferring a sequent
with the sentence containing the logical operator on the left, and one with the
sentence on the right.


19.2 Propositional Rules


Rules for ¬

   Γ ⇒ ∆, φ
   ─────────── ¬L
   ¬φ, Γ ⇒ ∆

   φ, Γ ⇒ ∆
   ─────────── ¬R
   Γ ⇒ ∆, ¬φ


Rules for ∧

   φ, Γ ⇒ ∆
   ────────────── ∧L
   φ ∧ ψ, Γ ⇒ ∆

   ψ, Γ ⇒ ∆
   ────────────── ∧L
   φ ∧ ψ, Γ ⇒ ∆

   Γ ⇒ ∆, φ    Γ ⇒ ∆, ψ
   ────────────────────── ∧R
   Γ ⇒ ∆, φ ∧ ψ

Rules for ∨

   φ, Γ ⇒ ∆    ψ, Γ ⇒ ∆
   ────────────────────── ∨L
   φ ∨ ψ, Γ ⇒ ∆

   Γ ⇒ ∆, φ
   ────────────── ∨R
   Γ ⇒ ∆, φ ∨ ψ

   Γ ⇒ ∆, ψ
   ────────────── ∨R
   Γ ⇒ ∆, φ ∨ ψ

Rules for →

   Γ ⇒ ∆, φ    ψ, Π ⇒ Λ
   ────────────────────── →L
   φ → ψ, Γ, Π ⇒ ∆, Λ

   φ, Γ ⇒ ∆, ψ
   ────────────── →R
   Γ ⇒ ∆, φ → ψ

content/first-order-logic/sequent-calculus/quantifier-rules.tex

19.3 Quantifier Rules

Rules for ∀ fol:seq:qrl:


sec

φ(t), Γ ⇒ ∆ Γ ⇒ ∆, φ(a)
∀L ∀R
∀x φ(x), Γ ⇒ ∆ Γ ⇒ ∆, ∀x φ(x)

In ∀L, t is a closed term (i.e., one without variables). In ∀R, a is a constant


symbol which must not occur anywhere in the lower sequent of the ∀R rule.
We call a the eigenvariable of the ∀R inference.1
1 We use the term “eigenvariable” even though a in the above rule is a constant symbol.

This has historical reasons.

244 Release : 6891b66 (2024-12-01)


19.4. STRUCTURAL RULES

Rules for ∃

φ(a), Γ ⇒ ∆ Γ ⇒ ∆, φ(t)
∃L ∃R
∃x φ(x), Γ ⇒ ∆ Γ ⇒ ∆, ∃x φ(x)

Again, t is a closed term, and a is a constant symbol which does not occur
in the lower sequent of the ∃L rule. We call a the eigenvariable of the ∃L
inference.
The condition that an eigenvariable not occur in the lower sequent of the
∀R or ∃L inference is called the eigenvariable condition.
Recall the convention that when φ is a formula with the variable x free, we explanation
indicate this by writing φ(x). In the same context, φ(t) then is short for φ[t/x].
So we could also write the ∃R rule as:
Γ ⇒ ∆, φ[t/x]
∃R
Γ ⇒ ∆, ∃x φ
Note that t may already occur in φ, e.g., φ might be P (t, x). Thus, inferring
Γ ⇒ ∆, ∃x P (t, x) from Γ ⇒ ∆, P (t, t) is a correct application of ∃R—you
may “replace” one or more, and not necessarily all, occurrences of t in the
premise by the bound variable x. However, the eigenvariable conditions in ∀R
and ∃L require that the constant symbol a does not occur in φ. So, you cannot
correctly infer Γ ⇒ ∆, ∀x P (a, x) from Γ ⇒ ∆, P (a, a) using ∀R.
In ∃R and ∀L there are no restrictions on the term t. On the other hand, explanation
in the ∃L and ∀R rules, the eigenvariable condition requires that the constant
symbol a does not occur anywhere outside of φ(a) in the upper sequent. It is
necessary to ensure that the system is sound, i.e., only derives sequents that
are valid. Without this condition, the following would be allowed:
φ(a) ⇒ φ(a) φ(a) ⇒ φ(a)
*∃L *∀R
∃x φ(x) ⇒ φ(a) φ(a) ⇒ ∀x φ(x)
∀R ∃L
∃x φ(x) ⇒ ∀x φ(x) ∃x φ(x) ⇒ ∀x φ(x)
However, ∃x φ(x) ⇒ ∀x φ(x) is not valid.

content/first-order-logic/sequent-calculus/structural-rules.tex

19.4 Structural Rules


fol:seq:srl: We also need a few rules that allow us to rearrange sentences in the left and
sec
right side of a sequent. Since the logical rules require that the sentences in
the premise which the rule acts upon stand either to the far left or to the far
right, we need an “exchange” rule that allows us to move sentences to the right
position. It’s also important sometimes to be able to combine two identical
sentences into one, and to add a sentence on either side.

Release : 6891b66 (2024-12-01) 245


CHAPTER 19. THE SEQUENT CALCULUS

Weakening

Γ ⇒ ∆ Γ ⇒ ∆
WL WR
φ, Γ ⇒ ∆ Γ ⇒ ∆, φ

Contraction

φ, φ, Γ ⇒ ∆ Γ ⇒ ∆, φ, φ
CL CR
φ, Γ ⇒ ∆ Γ ⇒ ∆, φ

Exchange

Γ, φ, ψ, Π ⇒ ∆ Γ ⇒ ∆, φ, ψ, Λ
XL XR
Γ, ψ, φ, Π ⇒ ∆ Γ ⇒ ∆, ψ, φ, Λ

A series of weakening, contraction, and exchange inferences will often be indi-


cated by double inference lines.
The following rule, called “cut,” is not strictly speaking necessary, but
makes it a lot easier to reuse and combine derivations.

Γ ⇒ ∆, φ φ, Π ⇒ Λ
Cut
Γ, Π ⇒ ∆, Λ

content/first-order-logic/sequent-calculus/derivations.tex

19.5 Derivations
explanation We’ve said what an initial sequent looks like, and we’ve given the rules of fol:seq:der:
sec
inference. Derivations in the sequent calculus are inductively generated from
these: each derivation either is an initial sequent on its own, or consists of one
or two derivations followed by an inference.

Definition 19.3 (LK derivation). An LK-derivation of a sequent S is a


finite tree of sequents satisfying the following conditions:

1. The topmost sequents of the tree are initial sequents.

2. The bottommost sequent of the tree is S.

246 Release : 6891b66 (2024-12-01)


19.5. DERIVATIONS

3. Every sequent in the tree except S is a premise of a correct application


of an inference rule whose conclusion stands directly below that sequent
in the tree.
We then say that S is the end-sequent of the derivation and that S is derivable
in LK (or LK-derivable).

Example 19.4. Every initial sequent, e.g., χ ⇒ χ is a derivation. We can


obtain a new derivation from this by applying, say, the WL rule,
Γ ⇒ ∆
WL
φ, Γ ⇒ ∆
The rule, however, is meant to be general: we can replace the φ in the rule
with any sentence, e.g., also with θ. If the premise matches our initial sequent
χ ⇒ χ, that means that both Γ and ∆ are just χ, and the conclusion would
then be θ, χ ⇒ χ. So, the following is a derivation:
χ ⇒ χ
WL
θ, χ ⇒ χ
We can now apply another rule, say XL, which allows us to switch two sentences
on the left. So, the following is also a correct derivation:
χ ⇒ χ
WL
θ, χ ⇒ χ
XL
χ, θ ⇒ χ
In this application of the rule, which was given as
Γ, φ, ψ, Π ⇒ ∆
XL
Γ, ψ, φ, Π ⇒ ∆,
both Γ and Π were empty, ∆ is χ, and the roles of φ and ψ are played by θ
and χ, respectively. In much the same way, we also see that
θ ⇒ θ
WL
χ, θ ⇒ θ
is a derivation. Now we can take these two derivations, and combine them
using ∧R. That rule was
Γ ⇒ ∆, φ Γ ⇒ ∆, ψ
∧R
Γ ⇒ ∆, φ ∧ ψ
In our case, the premises must match the last sequents of the derivations ending
in the premises. That means that Γ is χ, θ, ∆ is empty, φ is χ and ψ is θ. So
the conclusion, if the inference should be correct, is χ, θ ⇒ χ ∧ θ.
χ ⇒ χ
WL
θ, χ ⇒ χ θ ⇒ θ
XL WL
χ, θ ⇒ χ χ, θ ⇒ θ
∧R
χ, θ ⇒ χ ∧ θ

Release : 6891b66 (2024-12-01) 247


CHAPTER 19. THE SEQUENT CALCULUS

Of course, we can also reverse the premises, then φ would be θ and ψ would
be χ.
χ ⇒ χ
WL
θ ⇒ θ θ, χ ⇒ χ
WL XL
χ, θ ⇒ θ χ, θ ⇒ χ
∧R
χ, θ ⇒ θ ∧ χ

content/first-order-logic/sequent-calculus/proving-things.tex

19.6 Examples of Derivations


fol:seq:pro:
sec
Example 19.5. Give an LK-derivation for the sequent φ ∧ ψ ⇒ φ.
We begin by writing the desired end-sequent at the bottom of the derivation.

φ∧ψ ⇒ φ
Next, we need to figure out what kind of inference could have a lower sequent
of this form. This could be a structural rule, but it is a good idea to start by
looking for a logical rule. The only logical connective occurring in the lower
sequent is ∧, so we’re looking for an ∧ rule, and since the ∧ symbol occurs in
the antecedent, we’re looking at the ∧L rule.
∧L
φ∧ψ ⇒ φ
There are two options for what could have been the upper sequent of the ∧L
inference: we could have an upper sequent of φ ⇒ φ, or of ψ ⇒ φ. Clearly,
φ ⇒ φ is an initial sequent (which is a good thing), while ψ ⇒ φ is not
derivable in general. We fill in the upper sequent:
φ ⇒ φ
∧L
φ∧ψ ⇒ φ
We now have a correct LK-derivation of the sequent φ ∧ ψ ⇒ φ.

Example 19.6. Give an LK-derivation for the sequent ¬φ ∨ ψ ⇒ φ → ψ.


Begin by writing the desired end-sequent at the bottom of the derivation.

¬φ ∨ ψ ⇒ φ → ψ
To find a logical rule that could give us this end-sequent, we look at the logical
connectives in the end-sequent: ¬, ∨, and →. We only care at the moment
about ∨ and → because they are main operators of sentences in the end-sequent,
while ¬ is inside the scope of another connective, so we will take care of it later.
Our options for logical rules for the final inference are therefore the ∨L rule
and the →R rule. We could pick either rule, really, but let’s pick the →R rule
(if for no reason other than it allows us to put off splitting into two branches).

248 Release : 6891b66 (2024-12-01)


19.6. EXAMPLES OF DERIVATIONS

According to the form of →R inferences which can yield the lower sequent, this
must look like:

φ, ¬φ ∨ ψ ⇒ ψ
→R
¬φ ∨ ψ ⇒ φ → ψ

If we move ¬φ ∨ ψ to the outside of the antecedent, we can apply the ∨L


rule. According to the schema, this must split into two upper sequents as
follows:

¬φ, φ ⇒ ψ ψ, φ ⇒ ψ
∨L
¬φ ∨ ψ, φ ⇒ ψ
XR
φ, ¬φ ∨ ψ ⇒ ψ
→R
¬φ ∨ ψ ⇒ φ→ψ

Remember that we are trying to wind our way up to initial sequents; we seem
to be pretty close! The right branch is just one weakening and one exchange
away from an initial sequent and then it is done:
ψ ⇒ ψ
WL
φ, ψ ⇒ ψ
XL
¬φ, φ ⇒ ψ ψ, φ ⇒ ψ
∨L
¬φ ∨ ψ, φ ⇒ ψ
XR
φ, ¬φ ∨ ψ ⇒ ψ
→R
¬φ ∨ ψ ⇒ φ → ψ

Now looking at the left branch, the only logical connective in any sentence
is the ¬ symbol in the antecedent sentences, so we’re looking at an instance of
the ¬L rule.
ψ ⇒ ψ
WL
φ ⇒ ψ, φ φ, ψ ⇒ ψ
¬L XL
¬φ, φ ⇒ ψ ψ, φ ⇒ ψ
∨L
¬φ ∨ ψ, φ ⇒ ψ
XR
φ, ¬φ ∨ ψ ⇒ ψ
→R
¬φ ∨ ψ ⇒ φ → ψ

Similarly to how we finished off the right branch, we are just one weakening
and one exchange away from finishing off this left branch as well.
φ ⇒ φ
WR
φ ⇒ φ, ψ ψ ⇒ ψ
XR WL
φ ⇒ ψ, φ φ, ψ ⇒ ψ
¬L XL
¬φ, φ ⇒ ψ ψ, φ ⇒ ψ
∨L
¬φ ∨ ψ, φ ⇒ ψ
XR
φ, ¬φ ∨ ψ ⇒ ψ
→R
¬φ ∨ ψ ⇒ φ→ψ

Release : 6891b66 (2024-12-01) 249


CHAPTER 19. THE SEQUENT CALCULUS

Example 19.7. Give an LK-derivation of the sequent ¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ)


Using the techniques from above, we start by writing the desired end-
sequent at the bottom.

¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ)

The available main connectives of sentences in the end-sequent are the ∨ symbol
and the ¬ symbol. It would work to apply either the ∨L or the ¬R rule here,
but we start with the ¬R rule because it avoids splitting up into two branches
for a moment:

φ ∧ ψ, ¬φ ∨ ¬ψ ⇒
¬R
¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ)

Now we have a choice of whether to look at the ∧L or the ∨L rule. Let’s see
what happens when we apply the ∧L rule: we have a choice to start with either
the sequent φ, ¬φ ∨ ψ ⇒ or the sequent ψ, ¬φ ∨ ψ ⇒ . Since the derivation
is symmetric with regards to φ and ψ, let’s go with the former:

φ, ¬φ ∨ ¬ψ ⇒
∧L
φ ∧ ψ, ¬φ ∨ ¬ψ ⇒
¬R
¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ)

Continuing to fill in the derivation, we see that we run into a problem:

?
φ ⇒ φ φ ⇒ ψ
¬L ¬L
¬φ, φ ⇒ ¬ψ, φ ⇒
∨L
¬φ ∨ ¬ψ, φ ⇒
XL
φ, ¬φ ∨ ¬ψ ⇒
∧L
φ ∧ ψ, ¬φ ∨ ¬ψ ⇒
¬R
¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ)

The top of the right branch cannot be reduced any further, and it cannot be
brought by way of structural inferences to an initial sequent, so this is not the
right path to take. So clearly, it was a mistake to apply the ∧L rule above.
Going back to what we had before and carrying out the ∨L rule instead, we
get

¬φ, φ ∧ ψ ⇒ ¬ψ, φ ∧ ψ ⇒
∨L
¬φ ∨ ¬ψ, φ ∧ ψ ⇒
XL
φ ∧ ψ, ¬φ ∨ ¬ψ ⇒
¬R
¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ)

Completing each branch as we’ve done before, we get

250 Release : 6891b66 (2024-12-01)


19.6. EXAMPLES OF DERIVATIONS

φ ⇒ φ ψ ⇒ ψ
∧L ∧L
φ∧ψ ⇒ φ φ∧ψ ⇒ ψ
¬L ¬L
¬φ, φ ∧ ψ ⇒ ¬ψ, φ ∧ ψ ⇒
∨L
¬φ ∨ ¬ψ, φ ∧ ψ ⇒
XL
φ ∧ ψ, ¬φ ∨ ¬ψ ⇒
¬R
¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ)
(We could have carried out the ∧ rules lower than the ¬ rules in these steps
and still obtained a correct derivation).

Example 19.8. So far we haven’t used the contraction rule, but it is some-
times required. Here’s an example where that happens. Suppose we want to
prove ⇒ φ ∨ ¬φ. Applying ∨R backwards would give us one of these two
derivations:
φ ⇒
⇒ φ ⇒ ¬φ ¬R
⇒ φ ∨ ¬φ ∨R ⇒ φ ∨ ¬φ ∨R
Neither of these of course ends in an initial sequent. The trick is to realize
that the contraction rule allows us to combine two copies of a sentence into
one—and when we’re searching for a proof, i.e., going from bottom to top, we
can keep a copy of φ ∨ ¬φ in the premise, e.g.,

⇒ φ ∨ ¬φ, φ
⇒ φ ∨ ¬φ, φ ∨ ¬φ ∨R
⇒ φ ∨ ¬φ CR

Now we can apply ∨R a second time, and also get ¬φ, which leads to a complete
derivation.
φ ⇒ φ
⇒ φ, ¬φ ¬R
⇒ φ, φ ∨ ¬φ ∨R
⇒ φ ∨ ¬φ, φ XR
⇒ φ ∨ ¬φ, φ ∨ ¬φ ∨R
⇒ φ ∨ ¬φ CR

Problem 19.1. Give derivations of the following sequents:


1. φ ∧ (ψ ∧ χ) ⇒ (φ ∧ ψ) ∧ χ.
2. φ ∨ (ψ ∨ χ) ⇒ (φ ∨ ψ) ∨ χ.
3. φ → (ψ → χ) ⇒ ψ → (φ → χ).
4. φ ⇒ ¬¬φ.

Problem 19.2. Give derivations of the following sequents:


1. (φ ∨ ψ) → χ ⇒ φ → χ.

Release : 6891b66 (2024-12-01) 251


CHAPTER 19. THE SEQUENT CALCULUS

2. (φ → χ) ∧ (ψ → χ) ⇒ (φ ∨ ψ) → χ.

3. ⇒ ¬(φ ∧ ¬φ).

4. ψ → φ ⇒ ¬φ → ¬ψ.

5. ⇒ (φ → ¬φ) → ¬φ.

6. ⇒ ¬(φ → ψ) → ¬ψ.

7. φ → χ ⇒ ¬(φ ∧ ¬χ).

8. φ ∧ ¬χ ⇒ ¬(φ → χ).

9. φ ∨ ψ, ¬ψ ⇒ φ.

10. ¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ).

11. ⇒ (¬φ ∧ ¬ψ) → ¬(φ ∨ ψ).

12. ⇒ ¬(φ ∨ ψ) → (¬φ ∧ ¬ψ).

Problem 19.3. Give derivations of the following sequents:

1. ¬(φ → ψ) ⇒ φ.

2. ¬(φ ∧ ψ) ⇒ ¬φ ∨ ¬ψ.

3. φ → ψ ⇒ ¬φ ∨ ψ.

4. ⇒ ¬¬φ → φ.

5. φ → ψ, ¬φ → ψ ⇒ ψ.

6. (φ ∧ ψ) → χ ⇒ (φ → χ) ∨ (ψ → χ).

7. (φ → ψ) → φ ⇒ φ.

8. ⇒ (φ → ψ) ∨ (ψ → χ).

(These all require the CR rule.)

content/first-order-logic/sequent-calculus/proving-things-quant.tex

252 Release : 6891b66 (2024-12-01)


19.7. DERIVATIONS WITH QUANTIFIERS

19.7 Derivations with Quantifiers


fol:seq:prq:
sec
Example 19.9. Give an LK-derivation of the sequent ∃x ¬φ(x) ⇒ ¬∀x φ(x).
When dealing with quantifiers, we have to make sure not to violate the
eigenvariable condition, and sometimes this requires us to play around with
the order of carrying out certain inferences. In general, it helps to try and
take care of rules subject to the eigenvariable condition first (they will be lower
down in the finished proof). Also, it is a good idea to try and look ahead and
try to guess what the initial sequent might look like. In our case, it will have to
be something like φ(a) ⇒ φ(a). That means that when we are “reversing” the
quantifier rules, we will have to pick the same term—what we will call a—for
both the ∀ and the ∃ rule. If we picked different terms for each rule, we would
end up with something like φ(a) ⇒ φ(b), which, of course, is not derivable.
Starting as usual, we write

∃x ¬φ(x) ⇒ ¬∀x φ(x)

We could either carry out the ∃L rule or the ¬R rule. Since the ∃L rule is
subject to the eigenvariable condition, it’s a good idea to take care of it sooner
rather than later, so we’ll do that one first.

¬φ(a) ⇒ ¬∀x φ(x)


∃L
∃x ¬φ(x) ⇒ ¬∀x φ(x)

Applying the ¬L and ¬R rules backwards, we get

∀x φ(x) ⇒ φ(a)
¬L
¬φ(a), ∀x φ(x) ⇒
XL
∀x φ(x), ¬φ(a) ⇒
¬R
¬φ(a) ⇒ ¬∀xφ(x)
∃L
∃x¬φ(x) ⇒ ¬∀xφ(x)

At this point, our only option is to carry out the ∀L rule. Since this rule is not
subject to the eigenvariable restriction, we’re in the clear. Remember, we want
to try and obtain an initial sequent (of the form φ(a) ⇒ φ(a)), so we should
choose a as our argument for φ when we apply the rule.

φ(a) ⇒ φ(a)
∀L
∀x φ(x) ⇒ φ(a)
¬L
¬φ(a), ∀x φ(x) ⇒
XL
∀x φ(x), ¬φ(a) ⇒
¬R
¬φ(a) ⇒ ¬∀x φ(x)
∃L
∃x ¬φ(x) ⇒ ¬∀x φ(x)

Release : 6891b66 (2024-12-01) 253


CHAPTER 19. THE SEQUENT CALCULUS

It is important, especially when dealing with quantifiers, to double check at


this point that the eigenvariable condition has not been violated. Since the
only rule we applied that is subject to the eigenvariable condition was ∃L, and
the eigenvariable a does not occur in its lower sequent (the end-sequent), this
is a correct derivation.

Problem 19.4. Give derivations of the following sequents:

1. ⇒ (∀x φ(x) ∧ ∀y ψ(y)) → ∀z (φ(z) ∧ ψ(z)).

2. ⇒ (∃x φ(x) ∨ ∃y ψ(y)) → ∃z (φ(z) ∨ ψ(z)).

3. ∀x (φ(x) → ψ) ⇒ ∃y φ(y) → ψ.

4. ∀x ¬φ(x) ⇒ ¬∃x φ(x).

5. ⇒ ¬∃x φ(x) → ∀x ¬φ(x).

6. ⇒ ¬∃x ∀y ((φ(x, y) → ¬φ(y, y)) ∧ (¬φ(y, y) → φ(x, y))).

Problem 19.5. Give derivations of the following sequents:

1. ⇒ ¬∀x φ(x) → ∃x ¬φ(x).

2. (∀x φ(x) → ψ) ⇒ ∃y (φ(y) → ψ).

3. ⇒ ∃x (φ(x) → ∀y φ(y)).

(These all require the CR rule.)

This section collects the definitions of the provability relation and con-
sistency for natural deduction.

content/first-order-logic/sequent-calculus/proof-theoretic-notions.tex

19.8 Proof-Theoretic Notions


explanation Just as we’ve defined a number of important semantic notions (validity, en- fol:seq:ptn:
sec
tailment, satisfiabilty), we now define corresponding proof-theoretic notions.
These are not defined by appeal to satisfaction of sentences in structures, but
by appeal to the derivability or non-derivability of certain sequents. It was an
important discovery that these notions coincide. That they do is the content
of the soundness and completeness theorem.

Definition 19.10 (Theorems). A sentence φ is a theorem if there is a deriva-


tion in LK of the sequent ⇒ φ. We write ⊢ φ if φ is a theorem and ⊬ φ if
it is not.

254 Release : 6891b66 (2024-12-01)


19.8. PROOF-THEORETIC NOTIONS

Definition 19.11 (Derivability). A sentence φ is derivable from a set of


sentences Γ , Γ ⊢ φ, iff there is a finite subset Γ0 ⊆ Γ and a sequence Γ0′ of the
sentences in Γ0 such that LK derives Γ0′ ⇒ φ. If φ is not derivable from Γ we
write Γ ⊬ φ.

Because of the contraction, weakening, and exchange rules, the order and
number of sentences in Γ0′ does not matter: if a sequent Γ0′ ⇒ φ is derivable,
then so is Γ0′′ ⇒ φ for any Γ0′′ that contains the same sentences as Γ0′ . For
instance, if Γ0 = {ψ, χ} then both Γ0′ = ⟨ψ, ψ, χ⟩ and Γ0′′ = ⟨χ, χ, ψ⟩ are
sequences containing just the sentences in Γ0 . If a sequent containing one is
derivable, so is the other, e.g.:

ψ, ψ, χ ⇒ φ
CL
ψ, χ ⇒ φ
XL
χ, ψ ⇒ φ
WL
χ, χ, ψ ⇒ φ

From now on we’ll say that if Γ0 is a finite set of sentences then Γ0 ⇒ φ is


any sequent where the antecedent is a sequence of sentences in Γ0 and tacitly
include contractions, exchanges, and weakenings if necessary.

Definition 19.12 (Consistency). A set of sentences Γ is inconsistent iff


there is a finite subset Γ0 ⊆ Γ such that LK derives Γ0 ⇒ . If Γ is not
inconsistent, i.e., if for every finite Γ0 ⊆ Γ , LK does not derive Γ0 ⇒ , we
say it is consistent.

fol:seq:ptn: Proposition 19.13 (Reflexivity). If φ ∈ Γ , then Γ ⊢ φ.


prop:reflexivity

Proof. The initial sequent φ ⇒ φ is derivable, and {φ} ⊆ Γ .

fol:seq:ptn: Proposition 19.14 (Monotonicity). If Γ ⊆ ∆ and Γ ⊢ φ, then ∆ ⊢ φ.


prop:monotonicity

Proof. Suppose Γ ⊢ φ, i.e., there is a finite Γ0 ⊆ Γ such that Γ0 ⇒ φ is


derivable. Since Γ ⊆ ∆, then Γ0 is also a finite subset of ∆. The derivation of
Γ0 ⇒ φ thus also shows ∆ ⊢ φ.

fol:seq:ptn: Proposition 19.15 (Transitivity). If Γ ⊢ φ and {φ}∪∆ ⊢ ψ, then Γ ∪∆ ⊢


prop:transitivity
ψ.

Proof. If Γ ⊢ φ, there is a finite Γ0 ⊆ Γ and a derivation π0 of Γ0 ⇒ φ. If


{φ} ∪ ∆ ⊢ ψ, then for some finite subset ∆0 ⊆ ∆, there is a derivation π1 of
φ, ∆0 ⇒ ψ. Consider the following derivation:

Release : 6891b66 (2024-12-01) 255


CHAPTER 19. THE SEQUENT CALCULUS

π0 π1

Γ0 ⇒ φ φ, ∆0 ⇒ ψ
Cut
Γ0 , ∆0 ⇒ ψ
Since Γ0 ∪ ∆0 ⊆ Γ ∪ ∆, this shows Γ ∪ ∆ ⊢ ψ.

Note that this means that in particular if Γ ⊢ φ and φ ⊢ ψ, then Γ ⊢ ψ. It


follows also that if φ1 , . . . , φn ⊢ ψ and Γ ⊢ φi for each i, then Γ ⊢ ψ.
Proposition 19.16. Γ is inconsistent iff Γ ⊢ φ for every sentence φ. fol:seq:ptn:
prop:incons

Proof. Exercise.

Problem 19.6. Prove Proposition 19.16

Proposition 19.17 (Compactness). fol:seq:ptn:


prop:proves-compact
1. If Γ ⊢ φ then there is a finite subset Γ0 ⊆ Γ such that Γ0 ⊢ φ.
2. If every finite subset of Γ is consistent, then Γ is consistent.

Proof. 1. If Γ ⊢ φ, then there is a finite subset Γ0 ⊆ Γ such that the sequent


Γ0 ⇒ φ has a derivation. Consequently, Γ0 ⊢ φ.
2. If Γ is inconsistent, there is a finite subset Γ0 ⊆ Γ such that LK derives
Γ0 ⇒ . But then Γ0 is a finite subset of Γ that is inconsistent.

content/first-order-logic/sequent-calculus/provability-consistency.tex

19.9 Derivability and Consistency


We will now establish a number of properties of the derivability relation. They fol:seq:prv:
sec
are independently interesting, but each will play a role in the proof of the
completeness theorem.
Proposition 19.18. If Γ ⊢ φ and Γ ∪ {φ} is inconsistent, then Γ is incon- fol:seq:prv:
prop:provability-contr
sistent.

Proof. There are finite Γ0 and Γ1 ⊆ Γ such that LK derives Γ0 ⇒ φ and


φ, Γ1 ⇒ . Let the LK-derivation of Γ0 ⇒ φ be π0 and the LK-derivation of
Γ1 , φ ⇒ be π1 . We can then derive

π0 π1

Γ0 ⇒ φ φ, Γ1 ⇒
Cut
Γ0 , Γ1 ⇒

256 Release : 6891b66 (2024-12-01)


19.9. DERIVABILITY AND CONSISTENCY

Since Γ0 ⊆ Γ and Γ1 ⊆ Γ , Γ0 ∪ Γ1 ⊆ Γ , hence Γ is inconsistent.

fol:seq:prv: Proposition 19.19. Γ ⊢ φ iff Γ ∪ {¬φ} is inconsistent.


prop:prov-incons

Proof. First suppose Γ ⊢ φ, i.e., there is a derivation π0 of Γ ⇒ φ. By adding


a ¬L rule, we obtain a derivation of ¬φ, Γ ⇒ , i.e., Γ ∪ {¬φ} is inconsistent.
If Γ ∪ {¬φ} is inconsistent, there is a derivation π1 of ¬φ, Γ ⇒ . The
following is a derivation of Γ ⇒ φ:

π1
φ ⇒ φ
⇒ φ, ¬φ ¬R ¬φ, Γ ⇒
Cut
Γ ⇒ φ

Problem 19.7. Prove that Γ ⊢ ¬φ iff Γ ∪ {φ} is inconsistent.

fol:seq:prv: Proposition 19.20. If Γ ⊢ φ and ¬φ ∈ Γ , then Γ is inconsistent.


prop:explicit-inc

Proof. Suppose Γ ⊢ φ and ¬φ ∈ Γ . Then there is a derivation π of a sequent


Γ0 ⇒ φ. The sequent ¬φ, Γ0 ⇒ is also derivable:

π φ ⇒ φ
¬φ, φ ⇒ ¬L
Γ0 ⇒ φ φ, ¬φ ⇒ XL
Cut
Γ0 , ¬φ ⇒

Since ¬φ ∈ Γ and Γ0 ⊆ Γ , this shows that Γ is inconsistent.

fol:seq:prv: Proposition 19.21. If Γ ∪ {φ} and Γ ∪ {¬φ} are both inconsistent, then Γ
prop:provability-exhaustive
is inconsistent.

Proof. There are finite sets Γ0 ⊆ Γ and Γ1 ⊆ Γ and LK-derivations π0 and π1


of φ, Γ0 ⇒ and ¬φ, Γ1 ⇒ , respectively. We can then derive

π0
π1
φ, Γ0 ⇒
¬R
Γ0 ⇒ ¬φ ¬φ, Γ1 ⇒
Cut
Γ0 , Γ1 ⇒

Since Γ0 ⊆ Γ and Γ1 ⊆ Γ , Γ0 ∪ Γ1 ⊆ Γ . Hence Γ is inconsistent.

content/first-order-logic/sequent-calculus/provability-propositional.tex

Release : 6891b66 (2024-12-01) 257


CHAPTER 19. THE SEQUENT CALCULUS

19.10 Derivability and the Propositional Connectives


explanation We establish that the derivability relation ⊢ of the sequent calculus is strong fol:seq:ppr:
sec
enough to establish some basic facts involving the propositional connectives,
such as that φ ∧ ψ ⊢ φ and φ, φ → ψ ⊢ ψ (modus ponens). These facts are
needed for the proof of the completeness theorem.

Proposition 19.22. fol:seq:ppr:


prop:provability-land
1. Both φ ∧ ψ ⊢ φ and φ ∧ ψ ⊢ ψ. fol:seq:ppr:
prop:provability-land-left
2. φ, ψ ⊢ φ ∧ ψ. fol:seq:ppr:
prop:provability-land-right

Proof. 1. Both sequents φ ∧ ψ ⇒ φ and φ ∧ ψ ⇒ ψ are derivable:

φ ⇒ φ ψ ⇒ ψ
∧L ∧L
φ∧ψ ⇒ φ φ∧ψ ⇒ ψ

2. Here is a derivation of the sequent φ, ψ ⇒ φ ∧ ψ:

φ ⇒ φ ψ ⇒ ψ
∧R
φ, ψ ⇒ φ ∧ ψ

Proposition 19.23. fol:seq:ppr:


prop:provability-lor
1. φ ∨ ψ, ¬φ, ¬ψ is inconsistent.

2. Both φ ⊢ φ ∨ ψ and ψ ⊢ φ ∨ ψ.

Proof. 1. We give a derivation of the sequent φ ∨ ψ, ¬φ, ¬ψ ⇒:

φ ⇒ φ ψ ⇒ ψ
¬L ¬L
¬φ, φ ⇒ ¬ψ, ψ ⇒
φ, ¬φ, ¬ψ ⇒ ψ, ¬φ, ¬ψ ⇒
∨L
φ ∨ ψ, ¬φ, ¬ψ ⇒

(Recall that double inference lines indicate several weakening, contrac-


tion, and exchange inferences.)

2. Both sequents φ ⇒ φ ∨ ψ and ψ ⇒ φ ∨ ψ have derivations:

φ ⇒ φ ψ ⇒ ψ
∨R ∨R
φ ⇒ φ∨ψ ψ ⇒ φ∨ψ

Proposition 19.24. fol:seq:ppr:


prop:provability-lif
1. φ, φ → ψ ⊢ ψ. fol:seq:ppr:
prop:provability-lif-left

258 Release : 6891b66 (2024-12-01)


19.11. DERIVABILITY AND THE QUANTIFIERS

fol:seq:ppr: 2. Both ¬φ ⊢ φ → ψ and ψ ⊢ φ → ψ.


prop:provability-lif-right

Proof. 1. The sequent φ → ψ, φ ⇒ ψ is derivable:

φ ⇒ φ ψ ⇒ ψ
→L
φ → ψ, φ ⇒ ψ

2. Both sequents ¬φ ⇒ φ → ψ and ψ ⇒ φ → ψ are derivable:


φ ⇒ φ
¬φ, φ ⇒ ¬L
φ, ¬φ ⇒ XL ψ ⇒ ψ
WR WL
φ, ¬φ ⇒ ψ φ, ψ ⇒ ψ
→R →R
¬φ ⇒ φ→ψ ψ ⇒ φ→ψ

content/first-order-logic/sequent-calculus/provability-quantifiers.tex

19.11 Derivability and the Quantifiers


fol:seq:qpr: The completeness theorem also requires that the sequent calculus rules rules explanation
sec
yield the facts about ⊢ established in this section.
fol:seq:qpr: Theorem 19.25. If c is a constant not occurring in Γ or φ(x) and Γ ⊢ φ(c),
thm:strong-generalization
then Γ ⊢ ∀x φ(x).

Proof. Let π0 be an LK-derivation of Γ0 ⇒ φ(c) for some finite Γ0 ⊆ Γ . By


adding a ∀R inference, we obtain a derivation of Γ0 ⇒ ∀x φ(x), since c does
not occur in Γ or φ(x) and thus the eigenvariable condition is satisfied.

fol:seq:qpr: Proposition 19.26.


prop:provability-quantifiers
1. φ(t) ⊢ ∃x φ(x).
2. ∀x φ(x) ⊢ φ(t).

Proof. 1. The sequent φ(t) ⇒ ∃x φ(x) is derivable:

φ(t) ⇒ φ(t)
∃R
φ(t) ⇒ ∃x φ(x)

2. The sequent ∀x φ(x) ⇒ φ(t) is derivable:

φ(t) ⇒ φ(t)
∀L
∀x φ(x) ⇒ φ(t)

content/first-order-logic/sequent-calculus/soundness.tex

Release : 6891b66 (2024-12-01) 259


CHAPTER 19. THE SEQUENT CALCULUS

19.12 Soundness
explanation A derivation system, such as the sequent calculus, is sound if it cannot derive fol:seq:sou:
sec
things that do not actually hold. Soundness is thus a kind of guaranteed safety
property for derivation systems. Depending on which proof theoretic property
is in question, we would like to know for instance, that
1. every derivable φ is valid;
2. if a sentence is derivable from some others, it is also a consequence of
them;
3. if a set of sentences is inconsistent, it is unsatisfiable.
These are important properties of a derivation system. If any of them do
not hold, the derivation system is deficient—it would derive too much. Con-
sequently, establishing the soundness of a derivation system is of the utmost
importance.
Because all these proof-theoretic properties are defined via derivability in
the sequent calculus of certain sequents, proving (1)–(3) above requires proving
something about the semantic properties of derivable sequents. We will first
define what it means for a sequent to be valid, and then show that every
derivable sequent is valid. (1)–(3) then follow as corollaries from this result.
Definition 19.27. A structure M satisfies a sequent Γ ⇒ ∆ iff either M ⊭ φ
for some φ ∈ Γ or M ⊨ φ for some φ ∈ ∆.
A sequent is valid iff every structure M satisfies it.

Theorem 19.28 (Soundness). If LK derives Θ ⇒ Ξ, then Θ ⇒ Ξ is valid. fol:seq:sou:


thm:sequent-soundness
Proof. Let π be a derivation of Θ ⇒ Ξ. We proceed by induction on the
number of inferences n in π.
If the number of inferences is 0, then π consists only of an initial sequent.
Every initial sequent φ ⇒ φ is obviously valid, since for every M, either M ⊭ φ
or M ⊨ φ.
If the number of inferences is greater than 0, we distinguish cases according
to the type of the lowermost inference. By induction hypothesis, we can assume
that the premises of that inference are valid, since the number of inferences in
the derivation of any premise is smaller than n.
First, we consider the possible inferences with only one premise.
1. The last inference is a weakening. Then Θ ⇒ Ξ is either φ, Γ ⇒ ∆ (if
the last inference is WL) or Γ ⇒ ∆, φ (if it’s WR), and the derivation
ends in one of

Γ ⇒ ∆ Γ ⇒ ∆
WL WR
φ, Γ ⇒ ∆ Γ ⇒ ∆, φ

260 Release : 6891b66 (2024-12-01)


19.12. SOUNDNESS

By induction hypothesis, Γ ⇒ ∆ is valid, i.e., for every structure M,


either there is some χ ∈ Γ such that M ⊭ χ or there is some χ ∈ ∆ such
that M ⊨ χ.
If M ⊭ χ for some χ ∈ Γ , then χ ∈ Θ as well since Θ = φ, Γ , and so
M ⊭ χ for some χ ∈ Θ. Similarly, if M ⊨ χ for some χ ∈ ∆, as χ ∈ Ξ,
M ⊨ χ for some χ ∈ Ξ. Consequently, Θ ⇒ Ξ is valid.

2. The last inference is ¬L: Then the premise of the last inference is Γ ⇒
∆, φ and the conclusion is ¬φ, Γ ⇒ ∆, i.e., the derivation ends in

Γ ⇒ ∆, φ
¬L
¬φ, Γ ⇒ ∆

and Θ = ¬φ, Γ while Ξ = ∆.


The induction hypothesis tells us that Γ ⇒ ∆, φ is valid, i.e., for every
M, either (a) for some χ ∈ Γ , M ⊭ χ, or (b) for some χ ∈ ∆, M ⊨ χ,
or (c) M ⊨ φ. We want to show that Θ ⇒ Ξ is also valid. Let M be
a structure. If (a) holds, then there is χ ∈ Γ so that M ⊭ χ, but χ ∈ Θ
as well. If (b) holds, there is χ ∈ ∆ such that M ⊨ χ, but χ ∈ Ξ as well.
Finally, if M ⊨ φ, then M ⊭ ¬φ. Since ¬φ ∈ Θ, there is χ ∈ Θ such that
M ⊭ χ. Consequently, Θ ⇒ Ξ is valid.

3. The last inference is ¬R: Exercise.

4. The last inference is ∧L: There are two variants: φ ∧ ψ may be inferred
on the left from φ or from ψ on the left side of the premise. In the first
case, the π ends in

φ, Γ ⇒ ∆
∧L
φ ∧ ψ, Γ ⇒ ∆

and Θ = φ ∧ ψ, Γ while Ξ = ∆. Consider a structure M. Since by


induction hypothesis, φ, Γ ⇒ ∆ is valid, (a) M ⊭ φ, (b) M ⊭ χ for some
χ ∈ Γ , or (c) M ⊨ χ for some χ ∈ ∆. In case (a), M ⊭ φ ∧ ψ, so there
is χ ∈ Θ (namely, φ ∧ ψ) such that M ⊭ χ. In case (b), there is χ ∈ Γ
such that M ⊭ χ, and χ ∈ Θ as well. In case (c), there is χ ∈ ∆ such
that M ⊨ χ, and χ ∈ Ξ as well since Ξ = ∆. So in each case, M satisfies
φ ∧ ψ, Γ ⇒ ∆. Since M was arbitrary, Γ ⇒ ∆ is valid. The case where
φ ∧ ψ is inferred from ψ is handled the same, changing φ to ψ.

Release : 6891b66 (2024-12-01) 261


CHAPTER 19. THE SEQUENT CALCULUS

5. The last inference is ∨R: There are two variants: φ ∨ ψ may be inferred
on the right from φ or from ψ on the right side of the premise. In the
first case, π ends in

Γ ⇒ ∆, φ
∨R
Γ ⇒ ∆, φ ∨ ψ

Now Θ = Γ and Ξ = ∆, φ ∨ ψ. Consider a structure M. Since Γ ⇒ ∆, φ


is valid, (a) M ⊨ φ, (b) M ⊭ χ for some χ ∈ Γ , or (c) M ⊨ χ for some
χ ∈ ∆. In case (a), M ⊨ φ ∨ ψ. In case (b), there is χ ∈ Γ such that
M ⊭ χ. In case (c), there is χ ∈ ∆ such that M ⊨ χ. So in each case,
M satisfies Γ ⇒ ∆, φ ∨ ψ, i.e., Θ ⇒ Ξ. Since M was arbitrary, Θ ⇒ Ξ
is valid. The case where φ ∨ ψ is inferred from ψ is handled the same,
changing φ to ψ.
6. The last inference is →R: Then π ends in

φ, Γ ⇒ ∆, ψ
→R
Γ ⇒ ∆, φ → ψ

Again, the induction hypothesis says that the premise is valid; we want
to show that the conclusion is valid as well. Let M be arbitrary. Since
φ, Γ ⇒ ∆, ψ is valid, at least one of the following cases obtains: (a)
M ⊭ φ, (b) M ⊨ ψ, (c) M ⊭ χ for some χ ∈ Γ , or (d) M ⊨ χ for some
χ ∈ ∆. In cases (a) and (b), M ⊨ φ → ψ and so there is a χ ∈ ∆, φ → ψ
such that M ⊨ χ. In case (c), for some χ ∈ Γ , M ⊭ χ. In case (d), for
some χ ∈ ∆, M ⊨ χ. In each case, M satisfies Γ ⇒ ∆, φ → ψ. Since M
was arbitrary, Γ ⇒ ∆, φ → ψ is valid.
7. The last inference is ∀L: Then there is a formula φ(x) and a closed term t
such that π ends in

φ(t), Γ ⇒ ∆
∀L
∀x φ(x), Γ ⇒ ∆

We want to show that the conclusion ∀x φ(x), Γ ⇒ ∆ is valid. Consider


a structure M. Since the premise φ(t), Γ ⇒ ∆ is valid, (a) M ⊭ φ(t),
(b) M ⊭ χ for some χ ∈ Γ , or (c) M ⊨ χ for some χ ∈ ∆. In case (a),
by Proposition 16.31, if M ⊨ ∀x φ(x), then M ⊨ φ(t). Since M ⊭ φ(t),

262 Release : 6891b66 (2024-12-01)


19.12. SOUNDNESS

M ⊭ ∀x φ(x) . In case (b) and (c), M also satisfies ∀x φ(x), Γ ⇒ ∆.


Since M was arbitrary, ∀x φ(x), Γ ⇒ ∆ is valid.
8. The last inference is ∃R: Exercise.
9. The last inference is ∀R: Then there is a formula φ(x) and a constant
symbol a such that π ends in

Γ ⇒ ∆, φ(a)
∀R
Γ ⇒ ∆, ∀x φ(x)

where the eigenvariable condition is satisfied, i.e., a does not occur in


φ(x), Γ , or ∆. By induction hypothesis, the premise of the last inference
is valid. We have to show that the conclusion is valid as well, i.e., that
for any structure M, (a) M ⊨ ∀x φ(x), (b) M ⊭ χ for some χ ∈ Γ , or
(c) M ⊨ χ for some χ ∈ ∆.
Suppose M is an arbitrary structure. If (b) or (c) holds, we are done, so
suppose neither holds: for all χ ∈ Γ , M ⊨ χ, and for all χ ∈ ∆, M ⊭ χ.
We have to show that (a) holds, i.e., M ⊨ ∀x φ(x). By Proposition 16.19,
if suffices to show that M, s ⊨ φ(x) for all variable assignments s. So let s
be an arbitrary variable assignment. Consider the structure M′ which is

just like M except aM = s(x). By Corollary 16.21, for any χ ∈ Γ , M′ ⊨ χ
since a does not occur in Γ , and for any χ ∈ ∆, M′ ⊭ χ. But the premise
is valid, so M′ ⊨ φ(a). By Proposition 16.18, M′ , s ⊨ φ(a), since φ(a) is

a sentence. Now s ∼x s with s(x) = ValM s (a), since we’ve defined M


in just this way. So Proposition 16.23 applies, and we get M , s ⊨ φ(x).
Since a does not occur in φ(x), by Proposition 16.20, M, s ⊨ φ(x). Since s
was arbitrary, we’ve completed the proof that M, s ⊨ φ(x) for all variable
assignments.
10. The last inference is ∃L: Exercise.
Now let’s consider the possible inferences with two premises.
1. The last inference is a cut: then π ends in

Γ ⇒ ∆, φ φ, Π ⇒ Λ
Cut
Γ, Π ⇒ ∆, Λ

Let M be a structure. By induction hypothesis, the premises are valid,


so M satisfies both premises. We distinguish two cases: (a) M ⊭ φ and
(b) M ⊨ φ. In case (a), in order for M to satisfy the left premise, it must

Release : 6891b66 (2024-12-01) 263


CHAPTER 19. THE SEQUENT CALCULUS

satisfy Γ ⇒ ∆. But then it also satisfies the conclusion. In case (b), in


order for M to satisfy the right premise, it must satisfy Π \ Λ. Again, M
satisfies the conclusion.
2. The last inference is ∧R. Then π ends in

Γ ⇒ ∆, φ Γ ⇒ ∆, ψ
∧R
Γ ⇒ ∆, φ ∧ ψ

Consider a structure M. If M satisfies Γ ⇒ ∆, we are done. So suppose


it doesn’t. Since Γ ⇒ ∆, φ is valid by induction hypothesis, M ⊨ φ.
Similarly, since Γ ⇒ ∆, ψ is valid, M ⊨ ψ. But then M ⊨ φ ∧ ψ.
3. The last inference is ∨L: Exercise.
4. The last inference is →L. Then π ends in

Γ ⇒ ∆, φ ψ, Π ⇒ Λ
→L
φ → ψ, Γ, Π ⇒ ∆, Λ

Again, consider a structure M and suppose M doesn’t satisfy Γ, Π ⇒


∆, Λ. We have to show that M ⊭ φ→ψ. If M doesn’t satisfy Γ, Π ⇒ ∆, Λ,
it satisfies neither Γ ⇒ ∆ nor Π ⇒ Λ. Since, Γ ⇒ ∆, φ is valid, we have
M ⊨ φ. Since ψ, Π ⇒ Λ is valid, we have M ⊭ ψ. But then M ⊭ φ → ψ,
which is what we wanted to show.

Problem 19.8. Complete the proof of Theorem 19.28.


Corollary 19.29. If ⊢ φ then φ is valid. fol:seq:sou:
cor:weak-soundness
Corollary 19.30. If Γ ⊢ φ then Γ ⊨ φ. fol:seq:sou:
cor:entailment-soundness
Proof. If Γ ⊢ φ then for some finite subset Γ0 ⊆ Γ , there is a derivation of
Γ0 ⇒ φ. By Theorem 19.28, every structure M either makes some ψ ∈ Γ0 false
or makes φ true. Hence, if M ⊨ Γ then also M ⊨ φ.
Corollary 19.31. If Γ is satisfiable, then it is consistent. fol:seq:sou:
cor:consistency-soundness
Proof. We prove the contrapositive. Suppose that Γ is not consistent. Then
there is a finite Γ0 ⊆ Γ and a derivation of Γ0 ⇒ . By Theorem 19.28, Γ0 ⇒
is valid. In other words, for every structure M, there is χ ∈ Γ0 so that M ⊭ χ,
and since Γ0 ⊆ Γ , that χ is also in Γ . Thus, no M satisfies Γ , and Γ is not
satisfiable.

content/first-order-logic/sequent-calculus/identity.tex

264 Release : 6891b66 (2024-12-01)


19.13. DERIVATIONS WITH IDENTITY PREDICATE

19.13 Derivations with Identity predicate


fol:seq:ide: Derivations with identity predicate require additional initial sequents and in-
sec
ference rules.

Definition 19.32 (Initial sequents for =). If t is a closed term, then ⇒


t = t is an initial sequent.

The rules for = are (t1 and t2 are closed terms):

t1 = t2 , Γ ⇒ ∆, φ(t1 ) t1 = t2 , Γ ⇒ ∆, φ(t2 )
= =
t1 = t2 , Γ ⇒ ∆, φ(t2 ) t1 = t2 , Γ ⇒ ∆, φ(t1 )

Example 19.33. If s and t are closed terms, then s = t, φ(s) ⊢ φ(t):

φ(s) ⇒ φ(s)
WL
s = t, φ(s) ⇒ φ(s)
=
s = t, φ(s) ⇒ φ(t)

This may be familiar as the principle of substitutability of identicals, or Leibniz’


Law.
LK proves that = is symmetric and transitive:

t1 = t2 ⇒ t1 = t2
⇒ t1 = t1 t2 = t3 , t1 = t2 ⇒ t1 = t2 WL
=
t1 = t2 ⇒ t1 = t1 WL t2 = t3 , t1 = t2 ⇒ t1 = t3
=
t1 = t2 ⇒ t2 = t1 t1 = t2 , t2 = t3 ⇒ t1 = t3 XL

In the derivation on the left, the formula x = t1 is our φ(x). On the right, we
take φ(x) to be t1 = x.

Problem 19.9. Give derivations of the following sequents:

1. ⇒ ∀x ∀y ((x = y ∧ φ(x)) → φ(y))

2. ∃x φ(x) ∧ ∀y ∀z ((φ(y) ∧ φ(z)) → y = z) ⇒ ∃x (φ(x) ∧ ∀y (φ(y) → y = x))

content/first-order-logic/sequent-calculus/soundness-identity.tex

19.14 Soundness with Identity predicate


fol:seq:sid:
sec
Proposition 19.34. LK with initial sequents and rules for identity is sound.

Release : 6891b66 (2024-12-01) 265


Proof. Initial sequents of the form ⇒ t = t are valid, since for every struc-
ture M, M ⊨ t = t. (Note that we assume the term t to be closed, i.e., it
contains no variables, so variable assignments are irrelevant).
Suppose the last inference in a derivation is =. Then the premise is t1 =
t2 , Γ ⇒ ∆, φ(t1 ) and the conclusion is t1 = t2 , Γ ⇒ ∆, φ(t2 ). Consider a struc-
ture M. We need to show that the conclusion is valid, i.e., if M ⊨ t1 = t2 and
M ⊨ Γ , then either M ⊨ χ for some χ ∈ ∆ or M ⊨ φ(t2 ).
By induction hypothesis, the premise is valid. This means that if M ⊨
t1 = t2 and M ⊨ Γ either (a) for some χ ∈ ∆, M ⊨ χ or (b) M ⊨ φ(t1 ).
In case (a) we are done. Consider case (b). Let s be a variable assignment
with s(x) = ValM (t1 ). By Proposition 16.18, M, s ⊨ φ(t1 ). Since s ∼x s,
by Proposition 16.23, M, s ⊨ φ(x). since M ⊨ t1 = t2 , we have ValM (t1 ) =
ValM (t2 ), and hence s(x) = ValM (t2 ). By applying Proposition 16.23 again,
we also have M, s ⊨ φ(t2 ). By Proposition 16.18, M ⊨ φ(t2 ).

Chapter 20

Natural Deduction

This chapter presents a natural deduction system in the style of


Gentzen/Prawitz.
To include or exclude material relevant to natural deduction as a proof
system, use the “prfND” tag.

content/first-order-logic/natural-deduction/rules-and-proofs.tex

20.1 Rules and Derivations


explanation Natural deduction systems are meant to closely parallel the informal reason- fol:ntd:rul:
sec
ing used in mathematical proof (hence it is somewhat “natural”). Natural
deduction proofs begin with assumptions. Inference rules are then applied.
Assumptions are “discharged” by the ¬Intro, →Intro, ∨Elim and ∃Elim in-
ference rules, and the label of the discharged assumption is placed beside the
inference for clarity.

266
20.2. PROPOSITIONAL RULES

Definition 20.1 (Assumption). An assumption is any sentence in the top-


most position of any branch.

Derivations in natural deduction are certain trees of sentences, where the


topmost sentences are assumptions, and if a sentence stands below one, two,
or three other sequents, it must follow correctly by a rule of inference. The
sentences at the top of the inference are called the premises and the sentence
below the conclusion of the inference. The rules come in pairs, an introduction
and an elimination rule for each logical operator. They introduce a logical
operator in the conclusion or remove a logical operator from a premise of the
rule. Some of the rules allow an assumption of a certain type to be discharged.
To indicate which assumption is discharged by which inference, we also assign
labels to both the assumption and the inference. This is indicated by writing
the assumption as “[φ]n .”
It is customary to consider rules for all the logical operators ∧, ∨, →, ¬,
and ⊥, even if some of those are defined.

content/first-order-logic/natural-deduction/propositional-rules.tex

20.2 Propositional Rules


fol:ntd:prl:
sec
Rules for ∧

φ∧ψ
φ ∧Elim
φ ψ
∧Intro
φ∧ψ φ∧ψ
∧Elim
ψ

Rules for ∨

φ [φ]n [ψ]n
∨Intro
φ∨ψ
ψ
∨Intro φ∨ψ χ χ
φ∨ψ n ∨Elim
χ

Rules for →

Release : 6891b66 (2024-12-01) 267


CHAPTER 20. NATURAL DEDUCTION

[φ]n

φ→ψ φ
→Elim
ψ
ψ
n →Intro
φ→ψ

Rules for ¬

[φ]n
¬φ φ
¬Elim


n
¬φ ¬Intro

Rules for ⊥

[¬φ]n

⊥ ⊥
φ I

n
⊥ ⊥
φ C

Note that ¬Intro and ⊥C are very similar: The difference is that ¬Intro derives
a negated sentence ¬φ but ⊥C a positive sentence φ.
Whenever a rule indicates that some assumption may be discharged, we
take this to be a permission, but not a requirement. E.g., in the →Intro rule,
we may discharge any number of assumptions of the form φ in the derivation
of the premise ψ, including zero.

content/first-order-logic/natural-deduction/quantifier-rules.tex

20.3 Quantifier Rules


Rules for ∀ fol:ntd:qrl:
sec

φ(a) ∀x φ(x)
∀Intro ∀Elim
∀x φ(x) φ(t)

268 Release : 6891b66 (2024-12-01)


20.3. QUANTIFIER RULES

In the rules for ∀, t is a closed term (a term that does not contain any variables),
and a is a constant symbol which does not occur in the conclusion ∀x φ(x), or
in any assumption which is undischarged in the derivation ending with the
premise φ(a). We call a the eigenvariable of the ∀Intro inference.1

Rules for ∃

[φ(a)]n
φ(t)
∃Intro
∃x φ(x)
∃x φ(x) χ
n
χ ∃Elim

Again, t is a closed term, and a is a constant which does not occur in the
premise ∃x φ(x), in the conclusion χ, or any assumption which is undischarged
in the derivations ending with the two premises (other than the assumptions
φ(a)). We call a the eigenvariable of the ∃Elim inference.
The condition that an eigenvariable neither occur in the premises nor in
any assumption that is undischarged in the derivations leading to the premises
for the ∀Intro or ∃Elim inference is called the eigenvariable condition.
Recall the convention that when φ is a formula with the variable x free, we explanation

indicate this by writing φ(x). In the same context, φ(t) then is short for φ[t/x].
So we could also write the ∃Intro rule as:

φ[t/x]
∃Intro
∃x φ

Note that t may already occur in φ, e.g., φ might be P (t, x). Thus, inferring
∃x P (t, x) from P (t, t) is a correct application of ∃Intro—you may “replace” one
or more, and not necessarily all, occurrences of t in the premise by the bound
variable x. However, the eigenvariable conditions in ∀Intro and ∃Elim require
that the constant symbol a does not occur in φ. So, you cannot correctly infer
∀x P (a, x) from P (a, a) using ∀Intro.
In ∃Intro and ∀Elim there are no restrictions, and the term t can be any- explanation

thing, so we do not have to worry about any conditions. On the other hand,
in the ∃Elim and ∀Intro rules, the eigenvariable condition requires that the
constant symbol a does not occur anywhere in the conclusion or in an undis-
charged assumption. The condition is necessary to ensure that the system is
sound, i.e., only derives sentences from undischarged assumptions from which
they follow. Without this condition, the following would be allowed:
1 We use the term “eigenvariable” even though a in the above rule is a constant. This

has historical reasons.

Release : 6891b66 (2024-12-01) 269


CHAPTER 20. NATURAL DEDUCTION

[φ(a)]1
*∀Intro
∃x φ(x) ∀x φ(x)
∃Elim
∀x φ(x)

However, ∃x φ(x) ⊭ ∀x φ(x).


As the elimination rules for quantifiers only allow substituting closed terms
for variables, it follows that any formula that can be derived from a set of
sentences is itself a sentence.

content/first-order-logic/natural-deduction/derivations.tex

20.4 Derivations
explanation We’ve said what an assumption is, and we’ve given the rules of inference. fol:ntd:der:
sec
Derivations in natural deduction are inductively generated from these: each
derivation either is an assumption on its own, or consists of one, two, or three
derivations followed by a correct inference.

Definition 20.2 (Derivation). A derivation of a sentence φ from assump-


tions Γ is a finite tree of sentences satisfying the following conditions:

1. The topmost sentences of the tree are either in Γ or are discharged by


an inference in the tree.

2. The bottommost sentence of the tree is φ.

3. Every sentence in the tree except the sentence φ at the bottom is a


premise of a correct application of an inference rule whose conclusion
stands directly below that sentence in the tree.

We then say that φ is the conclusion of the derivation and Γ its undischarged
assumptions.
If a derivation of φ from Γ exists, we say that φ is derivable from Γ , or
in symbols: Γ ⊢ φ. If there is a derivation of φ in which every assumption is
discharged, we write ⊢ φ.

Example 20.3. Every assumption on its own is a derivation. So, e.g., φ by


itself is a derivation, and so is ψ by itself. We can obtain a new derivation from
these by applying, say, the ∧Intro rule,
φ ψ
∧Intro
φ∧ψ

These rules are meant to be general: we can replace the φ and ψ in it with any
sentences, e.g., by χ and θ. Then the conclusion would be χ ∧ θ, and so
χ θ
∧Intro
χ∧θ

270 Release : 6891b66 (2024-12-01)


20.5. EXAMPLES OF DERIVATIONS

is a correct derivation. Of course, we can also switch the assumptions, so that


θ plays the role of φ and χ that of ψ. Thus,
θ χ
∧Intro
θ∧χ

is also a correct derivation.


We can now apply another rule, say, →Intro, which allows us to conclude
a conditional and allows us to discharge any assumption that is identical to
the antecedent of that conditional. So both of the following would be correct
derivations:
[χ]1 θ χ [θ]1
∧Intro ∧Intro
χ∧θ χ∧θ
1 →Intro 1 →Intro
χ → (χ ∧ θ) θ → (χ ∧ θ)

They show, respectively, that θ ⊢ χ → (χ ∧ θ) and χ ⊢ θ → (χ ∧ θ).


Remember that discharging of assumptions is a permission, not a require-
ment: we don’t have to discharge the assumptions. In particular, we can apply
a rule even if the assumptions are not present in the derivation. For instance,
the following is legal, even though there is no assumption φ to be discharged:
ψ
1 →Intro
φ→ψ

content/first-order-logic/natural-deduction/proving-things.tex

20.5 Examples of Derivations


fol:ntd:pro:
sec
Example 20.4. Let’s give a derivation of the sentence (φ ∧ ψ) → φ.
We begin by writing the desired conclusion at the bottom of the derivation.

(φ ∧ ψ) → φ

Next, we need to figure out what kind of inference could result in a sentence
of this form. The main operator of the conclusion is →, so we’ll try to arrive at
the conclusion using the →Intro rule. It is best to write down the assumptions
involved and label the inference rules as you progress, so it is easy to see whether
all assumptions have been discharged at the end of the proof.

[φ ∧ ψ]1

φ
1 →Intro
(φ ∧ ψ) → φ

Release : 6891b66 (2024-12-01) 271


CHAPTER 20. NATURAL DEDUCTION

We now need to fill in the steps from the assumption φ ∧ ψ to φ. Since we


only have one connective to deal with, ∧, we must use the ∧ elim rule. This
gives us the following proof:

[φ ∧ ψ]1
φ ∧Elim
1 →Intro
(φ ∧ ψ) → φ

We now have a correct derivation of (φ ∧ ψ) → φ.

Example 20.5. Now let’s give a derivation of (¬φ ∨ ψ) → (φ → ψ).


We begin by writing the desired conclusion at the bottom of the derivation.

(¬φ ∨ ψ) → (φ → ψ)

To find a logical rule that could give us this conclusion, we look at the logical
connectives in the conclusion: ¬, ∨, and →. We only care at the moment about
the first occurrence of → because it is the main operator of the sentence in the
end-sequent, while ¬, ∨ and the second occurrence of → are inside the scope
of another connective, so we will take care of those later. We therefore start
with the →Intro rule. A correct application must look like this:

[¬φ ∨ ψ]1

φ→ψ
1 →Intro
(¬φ ∨ ψ) → (φ → ψ)

This leaves us with two possibilities to continue. Either we can keep working
from the bottom up and look for another application of the →Intro rule, or we
can work from the top down and apply a ∨Elim rule. Let us apply the latter.
We will use the assumption ¬φ ∨ ψ as the leftmost premise of ∨Elim. For a
valid application of ∨Elim, the other two premises must be identical to the
conclusion φ → ψ, but each may be derived in turn from another assumption,
namely one of the two disjuncts of ¬φ ∨ ψ. So our derivation will look like this:

[¬φ]2 [ψ]2

[¬φ ∨ ψ]1 φ→ψ φ→ψ


2 ∨Elim
φ→ψ
1 →Intro
(¬φ ∨ ψ) → (φ → ψ)

In each of the two branches on the right, we want to derive φ → ψ, which


is best done using →Intro.

272 Release : 6891b66 (2024-12-01)


20.5. EXAMPLES OF DERIVATIONS

[¬φ]2 , [φ]3 [ψ]2 , [φ]4

ψ ψ
1 3 →Intro 4 →Intro
[¬φ ∨ ψ] φ→ψ φ→ψ
2 ∨Elim
φ→ψ
1 →Intro
(¬φ ∨ ψ) → (φ → ψ)

For the two missing parts of the derivation, we need derivations of ψ from
¬φ and φ in the middle, and from φ and ψ on the left. Let’s take the former
first. ¬φ and φ are the two premises of ¬Elim:

[¬φ]2 [φ]3
¬Elim

By using ⊥I , we can obtain ψ as a conclusion and complete the branch.

[ψ]2 , [φ]4
[¬φ]2 [φ]3
⊥Intro
⊥ ⊥
I
ψ ψ
3 →Intro 4 →Intro
[¬φ ∨ ψ]1 φ→ψ φ→ψ
2 ∨Elim
φ→ψ
1 →Intro
(¬φ ∨ ψ) → (φ → ψ)

Let’s now look at the rightmost branch. Here it’s important to realize
that the definition of derivation allows assumptions to be discharged but does
not require them to be. In other words, if we can derive ψ from one of the
assumptions φ and ψ without using the other, that’s ok. And to derive ψ
from ψ is trivial: ψ by itself is such a derivation, and no inferences are needed.
So we can simply delete the assumption φ.

[¬φ]2 [φ]3
¬Elim
⊥ ⊥
I
ψ [ψ]2
3 →Intro →Intro
[¬φ ∨ ψ]1 φ→ψ φ→ψ
2 ∨Elim
φ→ψ
1 →Intro
(¬φ ∨ ψ) → (φ → ψ)

Note that in the finished derivation, the rightmost →Intro inference does not
actually discharge any assumptions.

Release : 6891b66 (2024-12-01) 273


CHAPTER 20. NATURAL DEDUCTION

Example 20.6. So far we have not needed the ⊥C rule. It is special in that it
allows us to discharge an assumption that isn’t a sub-formula of the conclusion
of the rule. It is closely related to the ⊥I rule. In fact, the ⊥I rule is a special
case of the ⊥C rule—there is a logic called “intuitionistic logic” in which only
⊥I is allowed. The ⊥C rule is a last resort when nothing else works. For
instance, suppose we want to derive φ ∨ ¬φ. Our usual strategy would be to
attempt to derive φ ∨ ¬φ using ∨Intro. But this would require us to derive
either φ or ¬φ from no assumptions, and this can’t be done. ⊥C to the rescue!

[¬(φ ∨ ¬φ)]1

1
⊥ ⊥
φ ∨ ¬φ C

Now we’re looking for a derivation of ⊥ from ¬(φ ∨ ¬φ). Since ⊥ is the
conclusion of ¬Elim we might try that:

[¬(φ ∨ ¬φ)]1 [¬(φ ∨ ¬φ)]1

¬φ φ
¬Elim
1
⊥ ⊥
φ ∨ ¬φ C

Our strategy for finding a derivation of ¬φ calls for an application of ¬Intro:

[¬(φ ∨ ¬φ)]1 , [φ]2


[¬(φ ∨ ¬φ)]1


2
¬φ ¬Intro φ
¬Elim
1
⊥ ⊥C
φ ∨ ¬φ

Here, we can get ⊥ easily by applying ¬Elim to the assumption ¬(φ ∨ ¬φ) and
φ ∨ ¬φ which follows from our new assumption φ by ∨Intro:

[¬(φ ∨ ¬φ)]1
[φ]2
[¬(φ ∨ ¬φ)]1 φ ∨ ¬φ ∨Intro
¬Elim

2
¬φ ¬Intro φ
¬Elim
1
⊥ ⊥
φ ∨ ¬φ C

On the right side we use the same strategy, except we get φ by ⊥C :

274 Release : 6891b66 (2024-12-01)


20.5. EXAMPLES OF DERIVATIONS

[φ]2 [¬φ]3
[¬(φ ∨ ¬φ)] 1
φ ∨ ¬φ ∨Intro [¬(φ ∨ ¬φ)]1
φ ∨ ¬φ ∨Intro
¬Elim ¬Elim
⊥ ⊥ ⊥
2
¬φ ¬Intro 3
φ C
¬Elim
1
⊥ ⊥C
φ ∨ ¬φ

Problem 20.1. Give derivations that show the following:


1. φ ∧ (ψ ∧ χ) ⊢ (φ ∧ ψ) ∧ χ.
2. φ ∨ (ψ ∨ χ) ⊢ (φ ∨ ψ) ∨ χ.
3. φ → (ψ → χ) ⊢ ψ → (φ → χ).
4. φ ⊢ ¬¬φ.

Problem 20.2. Give derivations that show the following:


1. (φ ∨ ψ) → χ ⊢ φ → χ.
2. (φ → χ) ∧ (ψ → χ) ⊢ (φ ∨ ψ) → χ.
3. ⊢ ¬(φ ∧ ¬φ).
4. ψ → φ ⊢ ¬φ → ¬ψ.
5. ⊢ (φ → ¬φ) → ¬φ.
6. ⊢ ¬(φ → ψ) → ¬ψ.
7. φ → χ ⊢ ¬(φ ∧ ¬χ).
8. φ ∧ ¬χ ⊢ ¬(φ → χ).
9. φ ∨ ψ, ¬ψ ⊢ φ.
10. ¬φ ∨ ¬ψ ⊢ ¬(φ ∧ ψ).
11. ⊢ (¬φ ∧ ¬ψ) → ¬(φ ∨ ψ).
12. ⊢ ¬(φ ∨ ψ) → (¬φ ∧ ¬ψ).

Problem 20.3. Give derivations that show the following:


1. ¬(φ → ψ) ⊢ φ.
2. ¬(φ ∧ ψ) ⊢ ¬φ ∨ ¬ψ.
3. φ → ψ ⊢ ¬φ ∨ ψ.
4. ⊢ ¬¬φ → φ.
5. φ → ψ, ¬φ → ψ ⊢ ψ.

Release : 6891b66 (2024-12-01) 275


CHAPTER 20. NATURAL DEDUCTION

6. (φ ∧ ψ) → χ ⊢ (φ → χ) ∨ (ψ → χ).

7. (φ → ψ) → φ ⊢ φ.

8. ⊢ (φ → ψ) ∨ (ψ → χ).

(These all require the ⊥C rule.)

content/first-order-logic/natural-deduction/proving-things-quant.tex

20.6 Derivations with Quantifiers


fol:ntd:prq:
sec
Example 20.7. When dealing with quantifiers, we have to make sure not
to violate the eigenvariable condition, and sometimes this requires us to play
around with the order of carrying out certain inferences. In general, it helps
to try and take care of rules subject to the eigenvariable condition first (they
will be lower down in the finished proof).
Let’s see how we’d give a derivation of the formula ∃x ¬φ(x) → ¬∀x φ(x).
Starting as usual, we write

∃x ¬φ(x) → ¬∀x φ(x)

We start by writing down what it would take to justify that last step using the
→Intro rule.
[∃x ¬φ(x)]1

¬∀x φ(x)
1 →Intro
∃x ¬φ(x) → ¬∀x φ(x)

Since there is no obvious rule to apply to ¬∀x φ(x), we will proceed by setting
up the derivation so we can use the ∃Elim rule. Here we must pay attention
to the eigenvariable condition, and choose a constant that does not appear in
∃x φ(x) or any assumptions that it depends on. (Since no constant symbols
appear, however, any choice will do fine.)

[¬φ(a)]2

[∃x ¬φ(x)]1 ¬∀x φ(x)


2 ∃Elim
¬∀x φ(x)
1 →Intro
∃x ¬φ(x) → ¬∀x φ(x)

276 Release : 6891b66 (2024-12-01)


20.6. DERIVATIONS WITH QUANTIFIERS

In order to derive ¬∀x φ(x), we will attempt to use the ¬Intro rule: this re-
quires that we derive a contradiction, possibly using ∀x φ(x) as an additional
assumption. Of course, this contradiction may involve the assumption ¬φ(a)
which will be discharged by the ∃Elim inference. We can set it up as follows:

[¬φ(a)]2 , [∀x φ(x)]3


3 ¬Intro
[∃x ¬φ(x)]1 ¬∀x φ(x)
2 ∃Elim
¬∀x φ(x)
1 →Intro
∃x ¬φ(x) → ¬∀x φ(x)

It looks like we are close to getting a contradiction. The easiest rule to apply is
the ∀Elim, which has no eigenvariable conditions. Since we can use any term
we want to replace the universally quantified x, it makes the most sense to
continue using a so we can reach a contradiction.

[∀x φ(x)]3
2 ∀Elim
[¬φ(a)] φ(a)
¬Elim

3 ¬Intro
[∃x ¬φ(x)]1 ¬∀x φ(x)
2 ∃Elim
¬∀x φ(x)
1 →Intro
∃x ¬φ(x) → ¬∀x φ(x)

It is important, especially when dealing with quantifiers, to double check


at this point that the eigenvariable condition has not been violated. Since the
only rule we applied that is subject to the eigenvariable condition was ∃Elim,
and the eigenvariable a does not occur in any assumptions it depends on, this
is a correct derivation.

Example 20.8. Sometimes we may derive a formula from other formulas. In


these cases, we may have undischarged assumptions. It is important to keep
track of our assumptions as well as the end goal.
Let’s see how we’d give a derivation of the formula ∃x χ(x, b) from the
assumptions ∃x (φ(x) ∧ ψ(x)) and ∀x (ψ(x) → χ(x, b)). Starting as usual, we
write the conclusion at the bottom.

∃x χ(x, b)

We have two premises to work with. To use the first, i.e., try to find
a derivation of ∃x χ(x, b) from ∃x (φ(x) ∧ ψ(x)) we would use the ∃Elim rule.
Since it has an eigenvariable condition, we will apply that rule first. We get
the following:

Release : 6891b66 (2024-12-01) 277


CHAPTER 20. NATURAL DEDUCTION

[φ(a) ∧ ψ(a)]1

∃x (φ(x) ∧ ψ(x)) ∃x χ(x, b)


1 ∃Elim
∃x χ(x, b)

The two assumptions we are working with share ψ. It may be useful at this
point to apply ∧Elim to separate out ψ(a).

[φ(a) ∧ ψ(a)]1
∧Elim
ψ(a)

∃x (φ(x) ∧ ψ(x)) ∃x χ(x, b)


1 ∃Elim
∃x χ(x, b)

The second assumption we have to work with is ∀x (ψ(x) → χ(x, b)). Since
there is no eigenvariable condition we can instantiate x with the constant sym-
bol a using ∀Elim to get ψ(a) → χ(a, b). We now have both ψ(a) → χ(a, b) and
ψ(a). Our next move should be a straightforward application of the →Elim
rule.
∀x (ψ(x) → χ(x, b)) [φ(a) ∧ ψ(a)]1
∀Elim ∧Elim
ψ(a) → χ(a, b) ψ(a)
→Elim
χ(a, b)

∃x (φ(x) ∧ ψ(x)) ∃x χ(x, b)


1 ∃Elim
∃x χ(x, b)

We are so close! One application of ∃Intro and we have reached our goal.

∀x (ψ(x) → χ(x, b)) [φ(a) ∧ ψ(a)]1


∀Elim ∧Elim
ψ(a) → χ(a, b) ψ(a)
→Elim
χ(a, b)
∃Intro
∃x (φ(x) ∧ ψ(x)) ∃x χ(x, b)
1 ∃Elim
∃x χ(x, b)

Since we ensured at each step that the eigenvariable conditions were not vio-
lated, we can be confident that this is a correct derivation.

Example 20.9. Give a derivation of the formula ¬∀x φ(x) from the assump-
tions ∀x φ(x) → ∃y ψ(y) and ¬∃y ψ(y). Starting as usual, we write the target
formula at the bottom.

278 Release : 6891b66 (2024-12-01)


20.6. DERIVATIONS WITH QUANTIFIERS

¬∀x φ(x)

The last line of the derivation is a negation, so let’s try using ¬Intro. This will
require that we figure out how to derive a contradiction.
[∀x φ(x)]1


1 ¬Intro
¬∀x φ(x)

So far so good. We can use ∀Elim but it’s not obvious if that will help us get to
our goal. Instead, let’s use one of our assumptions. ∀x φ(x) → ∃y ψ(y) together
with ∀x φ(x) will allow us to use the →Elim rule.
∀x φ(x) → ∃y ψ(y) [∀x φ(x)]1
→Elim
∃y ψ(y)


1 ¬Intro
¬∀x φ(x)

We now have one final assumption to work with, and it looks like this will help
us reach a contradiction by using ¬Elim.
∀x φ(x) → ∃y ψ(y) [∀x φ(x)]1
→Elim
¬∃y ψ(y) ∃y ψ(y)
¬Elim

1 ¬Intro
¬∀x φ(x)

Problem 20.4. Give derivations that show the following:

1. ⊢ (∀x φ(x) ∧ ∀y ψ(y)) → ∀z (φ(z) ∧ ψ(z)).

2. ⊢ (∃x φ(x) ∨ ∃y ψ(y)) → ∃z (φ(z) ∨ ψ(z)).

3. ∀x (φ(x) → ψ) ⊢ ∃y φ(y) → ψ.

4. ∀x ¬φ(x) ⊢ ¬∃x φ(x).

5. ⊢ ¬∃x φ(x) → ∀x ¬φ(x).

6. ⊢ ¬∃x ∀y ((φ(x, y) → ¬φ(y, y)) ∧ (¬φ(y, y) → φ(x, y))).

Problem 20.5. Give derivations that show the following:

1. ⊢ ¬∀x φ(x) → ∃x ¬φ(x).

2. (∀x φ(x) → ψ) ⊢ ∃y (φ(y) → ψ).

Release : 6891b66 (2024-12-01) 279


CHAPTER 20. NATURAL DEDUCTION

3. ⊢ ∃x (φ(x) → ∀y φ(y)).

(These all require the ⊥C rule.)

content/first-order-logic/natural-deduction/proof-theoretic-notions.tex

20.7 Proof-Theoretic Notions


fol:ntd:ptn:
sec

This section collects the definitions the provability relation and consis-
tency for natural deduction.

explanation Just as we’ve defined a number of important semantic notions (validity, en-
tailment, satisfiability), we now define corresponding proof-theoretic notions.
These are not defined by appeal to satisfaction of sentences in structures, but
by appeal to the derivability or non-derivability of certain sentences from oth-
ers. It was an important discovery that these notions coincide. That they do
is the content of the soundness and completeness theorems.

Definition 20.10 (Theorems). A sentence φ is a theorem if there is a deriva-


tion of φ in natural deduction in which all assumptions are discharged. We
write ⊢ φ if φ is a theorem and ⊬ φ if it is not.

Definition 20.11 (Derivability). A sentence φ is derivable from a set of


sentences Γ , Γ ⊢ φ, if there is a derivation with conclusion φ and in which
every assumption is either discharged or is in Γ . If φ is not derivable from Γ
we write Γ ⊬ φ.

Definition 20.12 (Consistency). A set of sentences Γ is inconsistent iff Γ ⊢


⊥. If Γ is not inconsistent, i.e., if Γ ⊬ ⊥, we say it is consistent.

Proposition 20.13 (Reflexivity). If φ ∈ Γ , then Γ ⊢ φ. fol:ntd:ptn:


prop:reflexivity

Proof. The assumption φ by itself is a derivation of φ where every undischarged


assumption (i.e., φ) is in Γ .

Proposition 20.14 (Monotonicity). If Γ ⊆ ∆ and Γ ⊢ φ, then ∆ ⊢ φ. fol:ntd:ptn:


prop:monotonicity

Proof. Any derivation of φ from Γ is also a derivation of φ from ∆.

Proposition 20.15 (Transitivity). If Γ ⊢ φ and {φ} ∪ ∆ ⊢ ψ, then Γ ∪ ∆ ⊢ ψ.


Proof. If Γ ⊢ φ, there is a derivation δ0 of φ with all undischarged assumptions in Γ . If {φ} ∪ ∆ ⊢ ψ, then there is a derivation δ1 of ψ with all undischarged assumptions in {φ} ∪ ∆. Now consider:

∆, [φ]1

δ1 Γ
δ0
ψ
1 →Intro
φ→ψ φ
→Elim
ψ

The undischarged assumptions are now all among Γ ∪ ∆, so this shows Γ ∪ ∆ ⊢ ψ.

When Γ = {φ1 , φ2 , . . . , φk } is a finite set we may use the simplified notation φ1 , φ2 , . . . , φk ⊢ ψ for Γ ⊢ ψ, in particular φ ⊢ ψ means that {φ} ⊢ ψ.
Note that if Γ ⊢ φ and φ ⊢ ψ, then Γ ⊢ ψ. It follows also that if φ1 , . . . , φn ⊢ ψ and Γ ⊢ φi for each i, then Γ ⊢ ψ.

Proposition 20.16. The following are equivalent.

1. Γ is inconsistent.

2. Γ ⊢ φ for every sentence φ.

3. Γ ⊢ φ and Γ ⊢ ¬φ for some sentence φ.

Proof. Exercise.

Problem 20.6. Prove Proposition 20.16.

Proposition 20.17 (Compactness).

1. If Γ ⊢ φ then there is a finite subset Γ0 ⊆ Γ such that Γ0 ⊢ φ.

2. If every finite subset of Γ is consistent, then Γ is consistent.

Proof. 1. If Γ ⊢ φ, then there is a derivation δ of φ from Γ . Let Γ0 be the set of undischarged assumptions of δ. Since any derivation is finite, Γ0 can only contain finitely many sentences. So, δ is a derivation of φ from a finite Γ0 ⊆ Γ .

2. This is the contrapositive of (1) for the special case φ ≡ ⊥: if Γ were inconsistent, i.e., Γ ⊢ ⊥, then by (1) some finite subset Γ0 ⊆ Γ would already have Γ0 ⊢ ⊥, i.e., be inconsistent.


20.8 Derivability and Consistency


We will now establish a number of properties of the derivability relation. They are independently interesting, but each will play a role in the proof of the completeness theorem.

Proposition 20.18. If Γ ⊢ φ and Γ ∪ {φ} is inconsistent, then Γ is inconsistent.

Proof. Let the derivation of φ from Γ be δ1 and the derivation of ⊥ from Γ ∪ {φ} be δ2 . We can then derive:

Γ, [φ]1 Γ

δ2 δ1

⊥
1 ¬Intro
¬φ φ
¬Elim
⊥

In the new derivation, the assumption φ is discharged, so it is a derivation from Γ .

Proposition 20.19. Γ ⊢ φ iff Γ ∪ {¬φ} is inconsistent.

Proof. First suppose Γ ⊢ φ, i.e., there is a derivation δ0 of φ from undischarged assumptions Γ . We obtain a derivation of ⊥ from Γ ∪ {¬φ} as follows:
Γ

δ0

¬φ φ
¬Elim
⊥

Now assume Γ ∪ {¬φ} is inconsistent, and let δ1 be the corresponding derivation of ⊥ from undischarged assumptions in Γ ∪ {¬φ}. We obtain a derivation of φ from Γ alone by using ⊥C :

Γ, [¬φ]1

δ1

⊥
1 ⊥C
φ

Problem 20.7. Prove that Γ ⊢ ¬φ iff Γ ∪ {φ} is inconsistent.

Proposition 20.20. If Γ ⊢ φ and ¬φ ∈ Γ , then Γ is inconsistent.


Proof. Suppose Γ ⊢ φ and ¬φ ∈ Γ . Then there is a derivation δ of φ from Γ . Consider this simple application of the ¬Elim rule:
Γ

δ
¬φ φ
¬Elim
⊥

Since ¬φ ∈ Γ , all undischarged assumptions are in Γ . This shows that Γ ⊢ ⊥.

Proposition 20.21. If Γ ∪ {φ} and Γ ∪ {¬φ} are both inconsistent, then Γ is inconsistent.

Proof. There are derivations δ1 and δ2 of ⊥ from Γ ∪{φ} and ⊥ from Γ ∪{¬φ},
respectively. We can then derive
Γ, [¬φ]2 Γ, [φ]1

δ2 δ1

⊥ ⊥
2 ¬Intro 1 ¬Intro
¬¬φ ¬φ
¬Elim
⊥

Since the assumptions φ and ¬φ are discharged, this is a derivation of ⊥ from Γ
alone. Hence Γ is inconsistent.


20.9 Derivability and the Propositional Connectives


We establish that the derivability relation ⊢ of natural deduction is strong enough to establish some basic facts involving the propositional connectives, such as that φ ∧ ψ ⊢ φ and φ, φ → ψ ⊢ ψ (modus ponens). These facts are needed for the proof of the completeness theorem.
Proposition 20.22.

1. Both φ ∧ ψ ⊢ φ and φ ∧ ψ ⊢ ψ.

2. φ, ψ ⊢ φ ∧ ψ.

Proof. 1. We can derive both

φ∧ψ φ∧ψ
∧Elim ∧Elim
φ ψ

2. We can derive:


φ ψ
∧Intro
φ∧ψ

Proposition 20.23.

1. {φ ∨ ψ, ¬φ, ¬ψ} is inconsistent.

2. Both φ ⊢ φ ∨ ψ and ψ ⊢ φ ∨ ψ.

Proof. 1. Consider the following derivation:

¬φ [φ]1 ¬ψ [ψ]1
¬Elim ¬Elim
φ∨ψ ⊥ ⊥
1 ∨Elim
⊥

This is a derivation of ⊥ from undischarged assumptions φ ∨ ψ, ¬φ, and ¬ψ.

2. We can derive both

φ ψ
∨Intro ∨Intro
φ∨ψ φ∨ψ

Proposition 20.24.

1. φ, φ → ψ ⊢ ψ.

2. Both ¬φ ⊢ φ → ψ and ψ ⊢ φ → ψ.

Proof. 1. We can derive:

φ→ψ φ
→Elim
ψ

2. This is shown by the following two derivations:

¬φ [φ]1
¬Elim
⊥
⊥I
ψ ψ
1 →Intro →Intro
φ→ψ φ→ψ

Note that →Intro may, but does not have to, discharge the assumption φ.


20.10 Derivability and the Quantifiers


The completeness theorem also requires that the natural deduction rules yield the facts about ⊢ established in this section.

Theorem 20.25. If c is a constant not occurring in Γ or φ(x) and Γ ⊢ φ(c), then Γ ⊢ ∀x φ(x).

Proof. Let δ be a derivation of φ(c) from Γ . By adding a ∀Intro inference, we obtain a derivation of ∀x φ(x). Since c does not occur in Γ or φ(x), the eigenvariable condition is satisfied.

Proposition 20.26.

1. φ(t) ⊢ ∃x φ(x).

2. ∀x φ(x) ⊢ φ(t).

Proof. 1. The following is a derivation of ∃x φ(x) from φ(t):

φ(t)
∃Intro
∃x φ(x)

2. The following is a derivation of φ(t) from ∀x φ(x):

∀x φ(x)
∀Elim
φ(t)


20.11 Soundness
A derivation system, such as natural deduction, is sound if it cannot derive things that do not actually follow. Soundness is thus a kind of guaranteed safety property for derivation systems. Depending on which proof-theoretic property is in question, we would like to know, for instance, that
1. every derivable sentence is valid;
2. if a sentence is derivable from some others, it is also a consequence of
them;
3. if a set of sentences is inconsistent, it is unsatisfiable.
These are important properties of a derivation system. If any of them do not hold, the derivation system is deficient—it would derive too much. Consequently, establishing the soundness of a derivation system is of the utmost importance.


Theorem 20.27 (Soundness). If φ is derivable from the undischarged assumptions Γ , then Γ ⊨ φ.

Proof. Let δ be a derivation of φ. We proceed by induction on the number of inferences in δ.
For the induction basis we show the claim if the number of inferences is 0. In this case, δ consists only of a single sentence φ, i.e., an assumption. That assumption is undischarged, since assumptions can only be discharged by inferences, and there are no inferences. So, any structure M that satisfies all of the undischarged assumptions of the proof also satisfies φ.
Now for the inductive step. Suppose that δ contains n inferences. The premise(s) of the lowermost inference are derived using sub-derivations, each of which contains fewer than n inferences. We assume the induction hypothesis: The premises of the lowermost inference follow from the undischarged assumptions of the sub-derivations ending in those premises. We have to show that the conclusion φ follows from the undischarged assumptions of the entire proof.
We distinguish cases according to the type of the lowermost inference. First,
we consider the possible inferences with only one premise.

1. Suppose that the last inference is ¬Intro: The derivation has the form

Γ, [φ]n

δ1

⊥
n ¬Intro
¬φ

By inductive hypothesis, ⊥ follows from the undischarged assumptions Γ ∪ {φ} of δ1 . Consider a structure M. We need to show that, if M ⊨ Γ , then M ⊨ ¬φ. Suppose for reductio that M ⊨ Γ , but M ⊭ ¬φ, i.e., M ⊨ φ. This would mean that M ⊨ Γ ∪ {φ}. This is contrary to our inductive hypothesis. So, M ⊨ ¬φ.

2. The last inference is ∧Elim: There are two variants: φ or ψ may be inferred from the premise φ ∧ ψ. Consider the first case. The derivation δ looks like this:

Γ
δ1

φ∧ψ
φ ∧Elim

By inductive hypothesis, φ ∧ ψ follows from the undischarged assumptions Γ of δ1 . Consider a structure M. We need to show that, if M ⊨ Γ , then M ⊨ φ. Suppose M ⊨ Γ . By our inductive hypothesis (Γ ⊨ φ ∧ ψ), we know that M ⊨ φ ∧ ψ. By definition, M ⊨ φ ∧ ψ iff M ⊨ φ and M ⊨ ψ. (The case where ψ is inferred from φ ∧ ψ is handled similarly.)

3. The last inference is ∨Intro: There are two variants: φ ∨ ψ may be inferred from the premise φ or the premise ψ. Consider the first case. The derivation has the form

Γ
δ1
φ
∨Intro
φ∨ψ

By inductive hypothesis, φ follows from the undischarged assumptions Γ of δ1 . Consider a structure M. We need to show that, if M ⊨ Γ , then M ⊨ φ ∨ ψ. Suppose M ⊨ Γ ; then M ⊨ φ since Γ ⊨ φ (the inductive hypothesis). So it must also be the case that M ⊨ φ ∨ ψ. (The case where φ ∨ ψ is inferred from ψ is handled similarly.)

4. The last inference is →Intro: φ → ψ is inferred from a subproof with assumption φ and conclusion ψ, i.e.,

Γ, [φ]n

δ1

ψ
n →Intro
φ→ψ

By inductive hypothesis, ψ follows from the undischarged assumptions of δ1 , i.e., Γ ∪ {φ} ⊨ ψ. Consider a structure M. The undischarged assumptions of δ are just Γ , since φ is discharged at the last inference. So we need to show that Γ ⊨ φ → ψ. For reductio, suppose that for some structure M, M ⊨ Γ but M ⊭ φ → ψ. So, M ⊨ φ and M ⊭ ψ. But by hypothesis, ψ is a consequence of Γ ∪ {φ}, i.e., M ⊨ ψ, which is a contradiction. So, Γ ⊨ φ → ψ.

5. The last inference is ⊥I : Here, δ ends in

Γ

δ1

⊥
⊥I
φ


By induction hypothesis, Γ ⊨ ⊥. We have to show that Γ ⊨ φ. Suppose not; then for some M we have M ⊨ Γ and M ⊭ φ. But we always have M ⊭ ⊥, so this would mean that Γ ⊭ ⊥, contrary to the induction hypothesis.

6. The last inference is ⊥C : Exercise.

7. The last inference is ∀Intro: Then δ has the form

Γ
δ1

φ(a)
∀Intro
∀x φ(x)

The premise φ(a) is a consequence of the undischarged assumptions Γ by induction hypothesis. Consider some structure M such that M ⊨ Γ . We need to show that M ⊨ ∀x φ(x). Since ∀x φ(x) is a sentence, this means we have to show that for every variable assignment s, M, s ⊨ φ(x) (Proposition 16.19). Since Γ consists entirely of sentences, M, s ⊨ ψ for all ψ ∈ Γ by Definition 16.11. Let M′ be like M except that aM′ = s(x). Since a does not occur in Γ , M′ ⊨ Γ by Corollary 16.21. Since Γ ⊨ φ(a), M′ ⊨ φ(a). Since φ(a) is a sentence, M′ , s ⊨ φ(a) by Proposition 16.18. M′ , s ⊨ φ(x) iff M′ ⊨ φ(a) by Proposition 16.23 (recall that φ(a) is just φ(x)[a/x]). So, M′ , s ⊨ φ(x). Since a does not occur in φ(x), by Proposition 16.20, M, s ⊨ φ(x). But s was an arbitrary variable assignment, so M ⊨ ∀x φ(x).

8. The last inference is ∃Intro: Exercise.

9. The last inference is ∀Elim: Exercise.

Now let’s consider the possible inferences with several premises: ∨Elim,
∧Intro, →Elim, and ∃Elim.

1. The last inference is ∧Intro. φ ∧ ψ is inferred from the premises φ and ψ and δ has the form

Γ1 Γ2

δ1 δ2

φ ψ
∧Intro
φ∧ψ

By induction hypothesis, φ follows from the undischarged assumptions Γ1 of δ1 and ψ follows from the undischarged assumptions Γ2 of δ2 . The undischarged assumptions of δ are Γ1 ∪ Γ2 , so we have to show that Γ1 ∪ Γ2 ⊨ φ ∧ ψ. Consider a structure M with M ⊨ Γ1 ∪ Γ2 . Since M ⊨ Γ1 , it must be the case that M ⊨ φ as Γ1 ⊨ φ, and since M ⊨ Γ2 , M ⊨ ψ since Γ2 ⊨ ψ. Together, M ⊨ φ ∧ ψ.

2. The last inference is ∨Elim: Exercise.

3. The last inference is →Elim. ψ is inferred from the premises φ→ψ and φ.
The derivation δ looks like this:

Γ1 Γ2
δ1 δ2
φ→ψ φ
→Elim
ψ

By induction hypothesis, φ → ψ follows from the undischarged assumptions Γ1 of δ1 and φ follows from the undischarged assumptions Γ2 of δ2 . Consider a structure M. We need to show that, if M ⊨ Γ1 ∪ Γ2 , then M ⊨ ψ. Suppose M ⊨ Γ1 ∪ Γ2 . Since Γ1 ⊨ φ → ψ, M ⊨ φ → ψ. Since Γ2 ⊨ φ, we have M ⊨ φ. This means that M ⊨ ψ (for if M ⊭ ψ, since M ⊨ φ, we'd have M ⊭ φ → ψ, contradicting M ⊨ φ → ψ).

4. The last inference is ¬Elim: Exercise.

5. The last inference is ∃Elim: Exercise.

Problem 20.8. Complete the proof of Theorem 20.27.

Corollary 20.28. If ⊢ φ, then φ is valid.

Corollary 20.29. If Γ is satisfiable, then it is consistent.

Proof. We prove the contrapositive. Suppose that Γ is not consistent. Then Γ ⊢ ⊥, i.e., there is a derivation of ⊥ from undischarged assumptions in Γ . By Theorem 20.27, any structure M that satisfies Γ must satisfy ⊥. Since M ⊭ ⊥ for every structure M, no M can satisfy Γ , i.e., Γ is not satisfiable.


20.12 Derivations with Identity predicate


Derivations with the identity predicate require additional inference rules.

=
T absent: t = t =Intro

t1 = t2 φ(t1 )
=Elim
φ(t2 )

t1 = t2 φ(t2 )
=Elim
φ(t1 )

In the above rules, t, t1 , and t2 are closed terms. The =Intro rule allows us to derive any identity statement of the form t = t outright, from no assumptions.

Example 20.30. If s and t are closed terms, then φ(s), s = t ⊢ φ(t):

s=t φ(s)
=Elim
φ(t)

This may be familiar as the "principle of substitutability of identicals," or Leibniz' Law.
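
The effect of =Elim can be pictured as rewriting terms. Here is a small sketch in Python (our own illustration, not the text's official machinery), under the simplifying assumption that terms and formulas are nested tuples and that every occurrence of the term is replaced; this is one admissible way of applying the rule, since the choice of φ(x) may mark any selection of occurrences:

    # A sketch: terms and formulas are nested tuples, e.g.
    # ("atom", "P", ("f", "s")) for P(f(s)); closed terms have no variables.

    def replace_term(phi, s, t):
        """Replace every occurrence of the closed term s in phi by t.
        Applied to the premises s = t and phi, this yields one admissible
        conclusion of =Elim; the rule also permits replacing only some
        occurrences, via the choice of phi(x)."""
        if phi == s:
            return t
        if isinstance(phi, tuple):
            return tuple(replace_term(part, s, t) for part in phi)
        return phi

    # Example: from s = t and P(f(s)), =Elim licenses P(f(t)).
    assert replace_term(("atom", "P", ("f", "s")), "s", "t") \
        == ("atom", "P", ("f", "t"))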

Problem 20.9. Prove that = is both symmetric and transitive, i.e., give derivations of ∀x ∀y (x = y → y = x) and ∀x ∀y ∀z ((x = y ∧ y = z) → x = z).

Example 20.31. We derive the sentence

∀x ∀y ((φ(x) ∧ φ(y)) → x = y)

from the sentence

∃x ∀y (φ(y) → y = x)

We develop the derivation backwards:

∃x ∀y (φ(y) → y = x) [φ(a) ∧ φ(b)]1

a=b
1 →Intro
((φ(a) ∧ φ(b)) → a = b)
∀Intro
∀y ((φ(a) ∧ φ(y)) → a = y)
∀Intro
∀x ∀y ((φ(x) ∧ φ(y)) → x = y)

We’ll now have to use the main assumption: since it is an existential formula,
we use ∃Elim to derive the intermediary conclusion a = b.


[∀y (φ(y) → y = c)]2


[φ(a) ∧ φ(b)]1

∃x ∀y (φ(y) → y = x) a=b
2 ∃Elim
a=b
1 →Intro
((φ(a) ∧ φ(b)) → a = b)
∀Intro
∀y ((φ(a) ∧ φ(y)) → a = y)
∀Intro
∀x ∀y ((φ(x) ∧ φ(y)) → x = y)

The sub-derivation on the top right is completed by using its assumptions to show that a = c and b = c. This requires two separate derivations. The derivation for a = c is as follows:
[∀y (φ(y) → y = c)]2 [φ(a) ∧ φ(b)]1
∀Elim ∧Elim
φ(a) → a = c φ(a)
a=c →Elim

From a = c and b = c we derive a = b by =Elim.

Problem 20.10. Give derivations of the following formulas:

1. ∀x ∀y ((x = y ∧ φ(x)) → φ(y))

2. ∃x φ(x) ∧ ∀y ∀z ((φ(y) ∧ φ(z)) → y = z) → ∃x (φ(x) ∧ ∀y (φ(y) → y = x))


20.13 Soundness with Identity predicate


Proposition 20.32. Natural deduction with rules for = is sound.

Proof. Any formula of the form t = t is valid, since for every structure M,
M ⊨ t = t. (Note that we assume the term t to be closed, i.e., it contains no
variables, so variable assignments are irrelevant).
Suppose the last inference in a derivation is =Elim, i.e., the derivation has
the following form:

Γ1 Γ2

δ1 δ2

t1 = t2 φ(t1 )
=Elim
φ(t2 )



The premises t1 = t2 and φ(t1 ) are derived from undischarged assumptions Γ1
and Γ2 , respectively. We want to show that φ(t2 ) follows from Γ1 ∪ Γ2 . Con-
sider a structure M with M ⊨ Γ1 ∪ Γ2 . By induction hypothesis, M ⊨ φ(t1 )
and M ⊨ t1 = t2 . Therefore, ValM (t1 ) = ValM (t2 ). Let s be any variable as-
signment, and m = ValM (t1 ) = ValM (t2 ). By Proposition 16.23, M, s ⊨ φ(t1 )
iff M, s[m/x] ⊨ φ(x) iff M, s ⊨ φ(t2 ). Since M ⊨ φ(t1 ), we have M ⊨ φ(t2 ).

Chapter 21

Tableaux

This chapter presents a signed analytic tableaux system.


To include or exclude material relevant to tableaux as a proof system, use the "prfTab" tag.


21.1 Rules and Tableaux


A tableau is a systematic survey of the possible ways a sentence can be true or false in a structure. The building blocks of a tableau are signed formulas: sentences plus a truth value "sign," either T or F. These signed formulas are arranged in a (downward growing) tree.

Definition 21.1. A signed formula is a pair consisting of a truth value and a sentence, i.e., either:

T φ or F φ.

Intuitively, we might read T φ as "φ might be true" and F φ as "φ might be false" (in some structure).
Each signed formula in the tree is either an assumption (which are listed
at the very top of the tree), or it is obtained from a signed formula above it
by one of a number of rules of inference. There are two rules for each possible
main operator of the preceding formula, one for the case where the sign is T,


and one for the case where the sign is F. Some rules allow the tree to branch,
and some only add signed formulas to the branch. A rule may be (and often
must be) applied not to the immediately preceding signed formula, but to any
signed formula in the branch from the root to the place the rule is applied.
A branch is closed when it contains both T φ and F φ. A closed tableau
is one where every branch is closed. Under the intuitive interpretation, any
branch describes a joint possibility, but T φ and F φ are not jointly possible.
In other words, if a branch is closed, the possibility it describes has been ruled
out. In particular, that means that a closed tableau rules out all possibilities
of simultaneously making every assumption of the form T φ true and every
assumption of the form F φ false.
A closed tableau for φ is a closed tableau with root F φ. If such a closed
tableau exists, all possibilities for φ being false have been ruled out; i.e., φ
must be true in every structure.
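
Because signed formulas are such simple objects, it can help to see them as a data structure. The following Python sketch (our own illustration, not part of the official definitions) represents a signed formula as a pair of a sign and a formula, and tests whether a branch, viewed as the set of signed formulas on it, is closed:

    # A sketch: signs are the strings "T" and "F"; formulas are nested
    # tuples, e.g. ("and", ("atom", "p"), ("not", ("atom", "q"))).

    def is_closed(branch):
        """A branch is closed iff it contains both T phi and F phi for
        some sentence phi (not necessarily an atomic one)."""
        return any((("F" if sign == "T" else "T"), phi) in branch
                   for sign, phi in branch)

    assert is_closed({("T", ("atom", "p")), ("F", ("atom", "p"))})
    assert not is_closed({("T", ("atom", "p")), ("F", ("atom", "q"))})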


21.2 Propositional Rules


Rules for ¬

T ¬φ
¬T
F φ

F ¬φ
¬F
T φ

Rules for ∧

T φ ∧ ψ
∧T
T φ
T ψ

F φ ∧ ψ
∧F
F φ | F ψ

Rules for ∨

T φ ∨ ψ
∨T
T φ | T ψ

F φ ∨ ψ
∨F
F φ
F ψ

Rules for →

T φ → ψ
→T
F φ | T ψ

F φ → ψ
→F
T φ
F ψ

The Cut Rule

Cut
T φ | F φ

The Cut rule is not applied “to” a previous signed formula; rather, it allows
every branch in a tableau to be split in two, one branch containing T φ, the
other F φ. It is not necessary—any set of signed formulas with a closed tableau
has one not using Cut—but it allows us to combine tableaux in a convenient
way.


21.3 Quantifier Rules


Rules for ∀

T ∀x φ(x)
∀T
T φ(t)

F ∀x φ(x)
∀F
F φ(a)

In ∀T, t is a closed term (i.e., one without variables). In ∀F, a is a constant symbol which must not occur anywhere in the branch above the ∀F rule. We call a the eigenvariable of the ∀F inference.1

Rules for ∃

T ∃x φ(x)
∃T
T φ(a)

F ∃x φ(x)
∃F
F φ(t)

1 We use the term “eigenvariable” even though a in the above rule is a constant symbol.

This has historical reasons.


Again, t is a closed term, and a is a constant symbol which does not occur in
the branch above the ∃T rule. We call a the eigenvariable of the ∃T inference.
The condition that an eigenvariable not occur in the branch above the ∀F
or ∃T inference is called the eigenvariable condition.
Recall the convention that when φ is a formula with the variable x free, we indicate this by writing φ(x). In the same context, φ(t) then is short for φ[t/x]. So we could also write the ∃F rule as:

F ∃x φ
∃F
F φ[t/x]

Note that t may already occur in φ, e.g., φ might be P (t, x). Thus, inferring F P (t, t) from F ∃x P (t, x) is a correct application of ∃F. However, the eigenvariable conditions in ∀F and ∃T require that the constant symbol a does not occur in φ. So, you cannot correctly infer F P (a, a) from F ∀x P (a, x) using ∀F.
In ∀T and ∃F there are no restrictions on the term t. On the other hand, in the ∃T and ∀F rules, the eigenvariable condition requires that the constant symbol a does not occur anywhere in the branches above the respective inference. This condition is necessary to ensure that the system is sound. Without it, the following would be a closed tableau for ∃x φ(x) → ∀x φ(x):

1. F ∃x φ(x) → ∀x φ(x) Assumption


2. T ∃x φ(x) →F 1
3. F ∀x φ(x) →F 1
4. T φ(a) ∃T 2
5. F φ(a) ∀F 3

However, ∃x φ(x) → ∀x φ(x) is not valid.
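
Mechanically, satisfying the eigenvariable condition just means choosing a constant symbol that is new to the branch. A small Python sketch (our own illustration, continuing the tuple representation used above, with the assumption that constant symbols have the form "c0", "c1", . . . ):

    import itertools

    def constants_in(phi, acc=None):
        """Collect the constant symbols occurring in a nested-tuple
        formula, assuming constants are strings starting with "c"."""
        if acc is None:
            acc = set()
        if isinstance(phi, str) and phi.startswith("c"):
            acc.add(phi)
        elif isinstance(phi, tuple):
            for part in phi:
                constants_in(part, acc)
        return acc

    def fresh_constant(branch):
        """Return a constant symbol occurring nowhere in the branch,
        as the eigenvariable condition of the ∀F and ∃T rules requires."""
        used = set()
        for _, phi in branch:
            used |= constants_in(phi)
        return next(c for c in ("c%d" % i for i in itertools.count())
                    if c not in used)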


21.4 Tableaux
We've said what an assumption is, and we've given the rules of inference. Tableaux are inductively generated from these: each tableau either is a single branch consisting of one or more assumptions, or it results from a tableau by applying one of the rules of inference on a branch.
Definition 21.2 (Tableau). A tableau for assumptions S1 φ1 , . . . , Sn φn (where each Si is either T or F) is a finite tree of signed formulas satisfying the following conditions:

1. The n topmost signed formulas of the tree are Si φi , one below the other.

2. Every signed formula in the tree that is not one of the assumptions results from a correct application of an inference rule to a signed formula in the branch above it.


A branch of a tableau is closed iff it contains both T φ and F φ, and open otherwise. A tableau in which every branch is closed is a closed tableau (for its set of assumptions). If a tableau is not closed, i.e., if it contains at least one open branch, it is open.

Example 21.3. Every set of assumptions on its own is a tableau, but it will
generally not be closed. (Obviously, it is closed only if the assumptions already
contain a pair of signed formulas T φ and F φ.)
From a tableau (open or closed) we can obtain a new, larger one by applying
one of the rules of inference to a signed formula φ in it. The rule will append
one or more signed formulas to the end of any branch containing the occurrence
of φ to which we apply the rule.
For instance, consider the assumption T φ ∧ ¬φ. Here is the (open) tableau
consisting of just that assumption:

1. T φ ∧ ¬φ Assumption

We obtain a new tableau from it by applying the ∧T rule to the assumption.


That rule allows us to add two new lines to the tableau, T φ and T ¬φ:

1. T φ ∧ ¬φ Assumption
2. Tφ ∧T 1
3. T ¬φ ∧T 1

When we write down tableaux, we record the rules we’ve applied on the right
(e.g., ∧T1 means that the signed formula on that line is the result of applying
the ∧T rule to the signed formula on line 1). This new tableau now contains
additional signed formulas, but to only one (T ¬φ) can we apply a rule (in this
case, the ¬T rule). This results in the closed tableau

1. T φ ∧ ¬φ Assumption
2. Tφ ∧T 1
3. T ¬φ ∧T 1
4. Fφ ¬T 3


21.5 Examples of Tableaux


Example 21.4. Let’s find a closed tableau for the sentence (φ ∧ ψ) → φ.
We begin by writing the corresponding assumption at the top of the tableau.

1. F (φ ∧ ψ) → φ Assumption


There is only one assumption, so only one signed formula to which we can
apply a rule. (For every signed formula, there is always at most one rule that
can be applied: it’s the rule for the corresponding sign and main operator of
the sentence.) In this case, this means we must apply →F.

1. F (φ ∧ ψ) → φ ✓ Assumption
2. Tφ ∧ ψ →F 1
3. Fφ →F 1

To keep track of which signed formulas we have applied their corresponding rules to, we write a checkmark next to the sentence. However, only write a checkmark if the rule has been applied to all open branches. Once a signed formula has had the corresponding rule applied in every open branch, we will not have to return to it and apply the rule again. In this case, there is only one branch, so the rule only has to be applied once. (Note that checkmarks are only a convenience for constructing tableaux and are not officially part of the syntax of tableaux.)
There is one new signed formula to which we can apply a rule: the T φ ∧ ψ
on line 2. Applying the ∧T rule results in:

1. F (φ ∧ ψ) → φ ✓ Assumption
2. Tφ ∧ ψ ✓ →F 1
3. Fφ →F 1
4. Tφ ∧T 2
5. Tψ ∧T 2

Since the branch now contains both T φ (on line 4) and F φ (on line 3), the
branch is closed. Since it is the only branch, the tableau is closed. We have
found a closed tableau for (φ ∧ ψ) → φ.

Example 21.5. Now let’s find a closed tableau for (¬φ ∨ ψ) → (φ → ψ).
We begin with the corresponding assumption:

1. F (¬φ ∨ ψ) → (φ → ψ) Assumption

The one signed formula in this tableau has main operator → and sign F, so we
apply the →F rule to it to obtain:

1. F (¬φ ∨ ψ) → (φ → ψ) ✓ Assumption
2. T ¬φ ∨ ψ →F 1
3. F (φ → ψ) →F 1

We now have a choice as to whether to apply ∨T to line 2 or →F to line 3. It actually doesn't matter which order we pick, as long as each signed formula has its corresponding rule applied in every branch. So let's pick the first one. The ∨T rule allows the tableau to branch, and the two conclusions of the rule will be the new signed formulas added to the two new branches. This results in:

1. F (¬φ ∨ ψ) → (φ → ψ) ✓ Assumption
2. T ¬φ ∨ ψ ✓ →F 1
3. F (φ → ψ) →F 1

4. T ¬φ Tψ ∨T 2

We have not applied the →F rule to line 3 yet: let’s do that now. To save
time, we apply it to both branches. Recall that we write a checkmark next to
a signed formula only if we have applied the corresponding rule in every open
branch. So it’s a good idea to apply a rule at the end of every branch that
contains the signed formula the rule applies to. That way we won’t have to
return to that signed formula lower down in the various branches.

1. F (¬φ ∨ ψ) → (φ → ψ) ✓ Assumption
2. T ¬φ ∨ ψ ✓ →F 1
3. F (φ → ψ) ✓ →F 1

4. T ¬φ Tψ ∨T 2
5. Tφ Tφ →F 3
6. Fψ Fψ →F 3

The right branch is now closed. On the left branch, we can still apply the ¬T
rule to line 4. This results in F φ and closes the left branch:

1. F (¬φ ∨ ψ) → (φ → ψ) ✓ Assumption
2. T ¬φ ∨ ψ ✓ →F 1
3. F (φ → ψ) ✓ →F 1

4. T ¬φ Tψ ∨T 2
5. Tφ Tφ →F 3
6. Fψ Fψ →F 3
7. Fφ ⊗ ¬T 4

Example 21.6. We can give tableaux for any number of signed formulas as
assumptions. Often it is also necessary to apply more than one rule that allows
branching; and in general a tableau can have any number of branches. For
instance, consider a tableau for {T φ ∨ (ψ ∧ χ), F (φ ∨ ψ) ∧ (φ ∨ χ)}. We start
by applying the ∨T to the first assumption:


1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F (φ ∨ ψ) ∧ (φ ∨ χ) Assumption

3. Tφ Tψ ∧ χ ∨T 1

Now we can apply the ∧F rule to line 2. We do this on both branches simultaneously, and can therefore check off line 2:

1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F (φ ∨ ψ) ∧ (φ ∨ χ) ✓ Assumption

3. Tφ Tψ ∧ χ ∨T 1

4. Fφ∨ψ Fφ∨χ Fφ∨ψ Fφ∨χ ∧F 2

Now we can apply ∨F to all the branches containing φ ∨ ψ:

1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F (φ ∨ ψ) ∧ (φ ∨ χ) ✓ Assumption

3. Tφ Tψ ∧ χ ∨T 1

4. Fφ∨ψ ✓ Fφ∨χ Fφ∨ψ ✓ Fφ∨χ ∧F 2


5. Fφ Fφ ∨F 4
6. Fψ Fψ ∨F 4

The leftmost branch is now closed. Let’s now apply ∨F to φ ∨ χ:

1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F (φ ∨ ψ) ∧ (φ ∨ χ) ✓ Assumption

3. Tφ Tψ ∧ χ ∨T 1

4. Fφ∨ψ ✓ Fφ∨χ ✓ Fφ∨ψ ✓ Fφ∨χ ✓ ∧F 2


5. Fφ Fφ ∨F 4
6. Fψ Fψ ∨F 4
7. ⊗ Fφ Fφ ∨F 4
8. Fχ Fχ ∨F 4

Note that we moved the result of applying ∨F a second time below for clarity.
In this instance it would not have been needed, since the justifications would
have been the same.


Two branches remain open, and T ψ ∧ χ on line 3 remains unchecked. We apply ∧T to it to obtain a closed tableau:

1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F (φ ∨ ψ) ∧ (φ ∨ χ) ✓ Assumption

3. Tφ Tψ ∧ χ ✓ ∨T 1

4. Fφ∨ψ ✓ Fφ∨χ ✓ Fφ∨ψ ✓ Fφ∨χ ✓ ∧F 2


5. Fφ Fφ Fφ Fφ ∨F 4
6. Fψ Fχ Fψ Fχ ∨F 4
7. ⊗ ⊗ Tψ Tψ ∧T 3
8. Tχ Tχ ∧T 3
⊗ ⊗

For comparison, here’s a closed tableau for the same set of assumptions in
which the rules are applied in a different order:

1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F (φ ∨ ψ) ∧ (φ ∨ χ) ✓ Assumption

3. Fφ∨ψ ✓ Fφ∨χ ✓ ∧F 2
4. Fφ Fφ ∨F 3
5. Fψ Fχ ∨F 3

6. Tφ Tψ ∧ χ ✓ Tφ Tψ ∧ χ ✓ ∨T 1
7. ⊗ Tψ ⊗ Tψ ∧T 6
8. Tχ Tχ ∧T 6
⊗ ⊗
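
For the propositional rules, the examples suggest a genuinely mechanical procedure: expand signed formulas until every branch either closes or consists of unexpanded atoms only. Here is a compact Python sketch of that procedure (our own illustration, not part of the text; for propositional logic the order of rule applications does not affect whether a closed tableau exists, so the sketch simply expands the first non-atomic signed formula it finds):

    # Formulas: ("atom", "p"), ("not", A), ("and", A, B), ("or", A, B),
    # ("imp", A, B). A branch is a frozenset of pairs (sign, formula).

    def closes(branch):
        """True iff the tableau starting from this branch closes."""
        for sign, phi in branch:
            if (("F" if sign == "T" else "T"), phi) in branch:
                return True                   # both T phi and F phi: closed
        for sf in branch:
            sign, phi = sf
            if phi[0] == "atom":
                continue
            rest = branch - {sf}
            if phi[0] == "not":               # ¬T and ¬F flip the sign
                return closes(rest | {(("F" if sign == "T" else "T"), phi[1])})
            a, b = phi[1], phi[2]
            stacking = {("T", "and"): [("T", a), ("T", b)],    # ∧T
                        ("F", "or"): [("F", a), ("F", b)],     # ∨F
                        ("F", "imp"): [("T", a), ("F", b)]}    # →F
            branching = {("F", "and"): [("F", a), ("F", b)],   # ∧F
                         ("T", "or"): [("T", a), ("T", b)],    # ∨T
                         ("T", "imp"): [("F", a), ("T", b)]}   # →T
            key = (sign, phi[0])
            if key in stacking:               # add both conclusions
                return closes(rest | set(stacking[key]))
            left, right = branching[key]      # split the branch
            return closes(rest | {left}) and closes(rest | {right})
        return False                          # only atoms left: branch open

    p, q = ("atom", "p"), ("atom", "q")
    # (φ ∧ ψ) → φ is a theorem: the tableau for F (φ ∧ ψ) → φ closes.
    assert closes(frozenset({("F", ("imp", ("and", p, q), p))}))
    # Modus ponens: {F ψ, T φ → ψ, T φ} has a closed tableau.
    assert closes(frozenset({("F", q), ("T", ("imp", p, q)), ("T", p)}))

The second assertion corresponds to the derivability claim φ, φ → ψ ⊢ ψ discussed later in this chapter.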

Problem 21.1. Give closed tableaux of the following:


1. T φ ∧ (ψ ∧ χ), F (φ ∧ ψ) ∧ χ.
2. T φ ∨ (ψ ∨ χ), F (φ ∨ ψ) ∨ χ.
3. T φ → (ψ → χ), F ψ → (φ → χ).
4. T φ, F ¬¬φ.

Problem 21.2. Give closed tableaux of the following:


1. T (φ ∨ ψ) → χ, F φ → χ.
2. T (φ → χ) ∧ (ψ → χ), F (φ ∨ ψ) → χ.
3. F ¬(φ ∧ ¬φ).


4. T ψ → φ, F ¬φ → ¬ψ.

5. F (φ → ¬φ) → ¬φ.

6. F ¬(φ → ψ) → ¬ψ.

7. T φ → χ, F ¬(φ ∧ ¬χ).

8. T φ ∧ ¬χ, F ¬(φ → χ).

9. T φ ∨ ψ, ¬ψ, F φ.

10. T ¬φ ∨ ¬ψ, F ¬(φ ∧ ψ).

11. F (¬φ ∧ ¬ψ) → ¬(φ ∨ ψ).

12. F ¬(φ ∨ ψ) → (¬φ ∧ ¬ψ).

Problem 21.3. Give closed tableaux of the following:

1. T ¬(φ → ψ), F φ.

2. T ¬(φ ∧ ψ), F ¬φ ∨ ¬ψ.

3. T φ → ψ, F ¬φ ∨ ψ.

4. F ¬¬φ → φ.

5. T φ → ψ, T ¬φ → ψ, F ψ.

6. T (φ ∧ ψ) → χ, F (φ → χ) ∨ (ψ → χ).

7. T (φ → ψ) → φ, F φ.

8. F (φ → ψ) ∨ (ψ → χ).


21.6 Tableaux with Quantifiers


Example 21.7. When dealing with quantifiers, we have to make sure not
to violate the eigenvariable condition, and sometimes this requires us to play
around with the order of carrying out certain inferences. In general, it helps
to try and take care of rules subject to the eigenvariable condition first (they
will be higher up in the finished tableau).
Let’s see how we’d give a tableau for the sentence ∃x ¬φ(x) → ¬∀x φ(x).
As usual, we start by recording the assumption:

1. F ∃x ¬φ(x) → ¬∀x φ(x) Assumption


Since the main operator is →, we apply the →F rule:

1. F ∃x ¬φ(x) → ¬∀x φ(x) ✓ Assumption


2. T ∃x ¬φ(x) →F 1
3. F ¬∀x φ(x) →F 1

The next line to deal with is 2. We use ∃T. This requires a new constant
symbol; since no constant symbols yet occur, we can pick any one, say, a.

1. F ∃x ¬φ(x) → ¬∀x φ(x) ✓ Assumption


2. T ∃x ¬φ(x) ✓ →F 1
3. F ¬∀x φ(x) →F 1
4. T ¬φ(a) ∃T 2

Now we apply ¬F to line 3:

1. F ∃x ¬φ(x) → ¬∀x φ(x) ✓ Assumption


2. T ∃x ¬φ(x) ✓ →F 1
3. F ¬∀x φ(x) ✓ →F 1
4. T ¬φ(a) ∃T 2
5. T ∀x φ(x) ¬F 3

We obtain a closed tableau by applying ¬T to line 4, followed by ∀T to line 5.

1. F ∃x ¬φ(x) → ¬∀x φ(x) ✓ Assumption


2. T ∃x ¬φ(x) ✓ →F 1
3. F ¬∀x φ(x) ✓ →F 1
4. T ¬φ(a) ∃T 2
5. T ∀x φ(x) ¬F 3
6. F φ(a) ¬T 4
7. T φ(a) ∀T 5

Example 21.8. Let’s see how we’d give a tableau for the set

F ∃x χ(x, b), T ∃x (φ(x) ∧ ψ(x)), T ∀x (ψ(x) → χ(x, b)).

As usual, we start with the assumptions:

1. F ∃x χ(x, b) Assumption
2. T ∃x (φ(x) ∧ ψ(x)) Assumption
3. T ∀x (ψ(x) → χ(x, b)) Assumption

We should always apply a rule with the eigenvariable condition first; in this
case that would be ∃T to line 2. Since the assumptions contain the constant
symbol b, we have to use a different one; let’s pick a again.


1. F ∃x χ(x, b) Assumption
2. T ∃x (φ(x) ∧ ψ(x)) ✓ Assumption
3. T ∀x (ψ(x) → χ(x, b)) Assumption
4. T φ(a) ∧ ψ(a) ∃T 2

If we now apply ∃F to line 1 or ∀T to line 3, we have to decide which term t to substitute for x. Since there is no eigenvariable condition for these rules, we can pick any term we like. In some cases we may even have to apply the rule several times with different ts. But as a general rule, it pays to pick one of the terms already occurring in the tableau—in this case, a and b—and in this case we can guess that a will be more likely to result in a closed branch.

1. F ∃x χ(x, b) Assumption
2. T ∃x (φ(x) ∧ ψ(x)) ✓ Assumption
3. T ∀x (ψ(x) → χ(x, b)) Assumption
4. T φ(a) ∧ ψ(a) ∃T 2
5. F χ(a, b) ∃F 1
6. T ψ(a) → χ(a, b) ∀T 3

We don’t check the signed formulas in lines 1 and 3, since we may have to use
them again. Now apply ∧T to line 4:

1. F ∃x χ(x, b) Assumption
2. T ∃x (φ(x) ∧ ψ(x)) ✓ Assumption
3. T ∀x (ψ(x) → χ(x, b)) Assumption
4. T φ(a) ∧ ψ(a) ✓ ∃T 2
5. F χ(a, b) ∃F 1
6. T ψ(a) → χ(a, b) ∀T 3
7. T φ(a) ∧T 4
8. T ψ(a) ∧T 4

If we now apply →T to line 6, the tableau closes:

1. F ∃x χ(x, b) Assumption
2. T ∃x (φ(x) ∧ ψ(x)) ✓ Assumption
3. T ∀x (ψ(x) → χ(x, b)) Assumption
4. T φ(a) ∧ ψ(a) ✓ ∃T 2
5. F χ(a, b) ∃F 1
6. T ψ(a) → χ(a, b) ✓ ∀T 3
7. T φ(a) ∧T 4
8. T ψ(a) ∧T 4

9. F ψ(a) T χ(a, b) →T 6
⊗ ⊗


Example 21.9. We construct a tableau for the set

T ∀x φ(x), T ∀x φ(x) → ∃y ψ(y), T ¬∃y ψ(y).

Starting as usual, we write down the assumptions:

1. T ∀x φ(x) Assumption
2. T ∀x φ(x) → ∃y ψ(y) Assumption
3. T ¬∃y ψ(y) Assumption

We begin by applying the ¬T rule to line 3. A corollary to the rule "always apply rules with eigenvariable conditions first" is "defer applying quantifier rules without eigenvariable conditions until needed." Also, defer rules that result in a split.

1. T ∀x φ(x) Assumption
2. T ∀x φ(x) → ∃y ψ(y) Assumption
3. T ¬∃y ψ(y) ✓ Assumption
4. F ∃y ψ(y) ¬T 3

The new line 4 requires ∃F, a quantifier rule without the eigenvariable condi-
tion. So we defer this in favor of using →T on line 2.

1. T ∀x φ(x) Assumption
2. T ∀x φ(x) → ∃y ψ(y) ✓ Assumption
3. T ¬∃y ψ(y) ✓ Assumption
4. F ∃y ψ(y) ¬T 3

5. F ∀x φ(x) T ∃y ψ(y) →T 2

Both new signed formulas require rules with eigenvariable conditions, so these
should be next:

1. T ∀x φ(x) Assumption
2. T ∀x φ(x) → ∃y ψ(y) ✓ Assumption
3. T ¬∃y ψ(y) ✓ Assumption
4. F ∃y ψ(y) ¬T 3

5. F ∀x φ(x) ✓ T ∃y ψ(y) ✓ →T 2
6. F φ(b) T ψ(c) ∀F 5; ∃T 5

To close the branches, we have to use the signed formulas on lines 1 and 3.
The corresponding rules (∀T and ∃F) don’t have eigenvariable conditions, so
we are free to pick whichever terms are suitable. In this case, that’s b and c,
respectively.


1. T ∀x φ(x) Assumption
2. T ∀x φ(x) → ∃y ψ(y) ✓ Assumption
3. T ¬∃y ψ(y) ✓ Assumption
4. F ∃y ψ(y) ¬T 3

5. F ∀x φ(x) ✓ T ∃y ψ(y) ✓ →T 2
6. F φ(b) T ψ(c) ∀F 5; ∃T 5
7. T φ(b) F ψ(c) ∀T 1; ∃F 4
⊗ ⊗

Problem 21.4. Give closed tableaux of the following:


1. F (∀x φ(x) ∧ ∀y ψ(y)) → ∀z (φ(z) ∧ ψ(z)).
2. F (∃x φ(x) ∨ ∃y ψ(y)) → ∃z (φ(z) ∨ ψ(z)).
3. T ∀x (φ(x) → ψ), F ∃y φ(y) → ψ.
4. T ∀x ¬φ(x), F ¬∃x φ(x).
5. F ¬∃x φ(x) → ∀x ¬φ(x).
6. F ¬∃x ∀y ((φ(x, y) → ¬φ(y, y)) ∧ (¬φ(y, y) → φ(x, y))).

Problem 21.5. Give closed tableaux of the following:


1. F ¬∀x φ(x) → ∃x ¬φ(x).
2. T (∀x φ(x) → ψ), F ∃y (φ(y) → ψ).
3. F ∃x (φ(x) → ∀y φ(y)).


21.7 Proof-Theoretic Notions


This section collects the definitions of the provability relation and consistency for tableaux.

Just as we've defined a number of important semantic notions (validity, entailment, satisfiability), we now define corresponding proof-theoretic notions. These are not defined by appeal to satisfaction of sentences in structures, but by appeal to the existence of certain closed tableaux. It was an important discovery that these notions coincide. That they do is the content of the soundness and completeness theorems.


Definition 21.10 (Theorems). A sentence φ is a theorem if there is a closed tableau for F φ. We write ⊢ φ if φ is a theorem and ⊬ φ if it is not.

Definition 21.11 (Derivability). A sentence φ is derivable from a set of sentences Γ , Γ ⊢ φ, iff there is a finite set {ψ1 , . . . , ψn } ⊆ Γ and a closed tableau for the set

{F φ, T ψ1 , . . . , T ψn }.

If φ is not derivable from Γ we write Γ ⊬ φ.

Definition 21.12 (Consistency). A set of sentences Γ is inconsistent iff there is a finite set {ψ1 , . . . , ψn } ⊆ Γ and a closed tableau for the set

{T ψ1 , . . . , T ψn }.

If Γ is not inconsistent, we say it is consistent.

Proposition 21.13 (Reflexivity). If φ ∈ Γ , then Γ ⊢ φ.

Proof. If φ ∈ Γ , {φ} is a finite subset of Γ and the tableau

1. Fφ Assumption
2. Tφ Assumption

is closed.

Proposition 21.14 (Monotonicity). If Γ ⊆ ∆ and Γ ⊢ φ, then ∆ ⊢ φ.

Proof. Any finite subset of Γ is also a finite subset of ∆.

Proposition 21.15 (Transitivity). If Γ ⊢ φ and {φ} ∪ ∆ ⊢ ψ, then Γ ∪ ∆ ⊢ ψ.

Proof. If {φ} ∪ ∆ ⊢ ψ, then there is a finite subset ∆0 = {χ1 , . . . , χn } ⊆ ∆ such that

{F ψ,T φ, T χ1 , . . . , T χn }

has a closed tableau. If Γ ⊢ φ then there are θ1 , . . . , θm such that

{F φ,T θ1 , . . . , T θm }

has a closed tableau.


Now consider the tableau with assumptions

F ψ, T χ1 , . . . , T χn , T θ1 , . . . , T θm .


Apply the Cut rule on φ. This generates two branches, one has T φ in it, the
other F φ. Thus, on the one branch, all of

{F ψ, T φ, T χ1 , . . . , T χn }

are available. Since there is a closed tableau for these assumptions, we can
attach it to that branch; every branch through T φ closes. On the other branch,
all of
{F φ, T θ1 , . . . , T θm }

are available, so we can also complete the other side to obtain a closed tableau.
This shows Γ ∪ ∆ ⊢ ψ.

Note that this means that in particular if Γ ⊢ φ and φ ⊢ ψ, then Γ ⊢ ψ. It follows also that if φ1 , . . . , φn ⊢ ψ and Γ ⊢ φi for each i, then Γ ⊢ ψ.

Proposition 21.16. Γ is inconsistent iff Γ ⊢ φ for every sentence φ.

Proof. Exercise.

Problem 21.6. Prove Proposition 21.16.

Proposition 21.17 (Compactness).

1. If Γ ⊢ φ then there is a finite subset Γ0 ⊆ Γ such that Γ0 ⊢ φ.

2. If every finite subset of Γ is consistent, then Γ is consistent.

Proof. 1. If Γ ⊢ φ, then there is a finite subset Γ0 = {ψ1 , . . . , ψn } and a closed tableau for
{F φ, T ψ1 , . . . , T ψn }

This tableau also shows Γ0 ⊢ φ.

2. If Γ is inconsistent, then for some finite subset Γ0 = {ψ1 , . . . , ψn } there is a closed tableau for
{T ψ1 , . . . , T ψn }

This closed tableau shows that Γ0 is inconsistent.


21.8 Derivability and Consistency


We will now establish a number of properties of the derivability relation. They are independently interesting, but each will play a role in the proof of the completeness theorem.
Proposition 21.18. If Γ ⊢ φ and Γ ∪ {φ} is inconsistent, then Γ is inconsistent.
Proof. There are finite Γ0 = {ψ1 , . . . , ψn } ⊆ Γ and Γ1 = {χ1 , . . . , χm } ⊆ Γ such that

{F φ, T ψ1 , . . . , T ψn }
{T φ, T χ1 , . . . , T χm }

have closed tableaux. Using the Cut rule on φ we can combine these into a single closed tableau that shows Γ0 ∪ Γ1 is inconsistent. Since Γ0 ⊆ Γ and Γ1 ⊆ Γ , Γ0 ∪ Γ1 ⊆ Γ , hence Γ is inconsistent.
Proposition 21.19. Γ ⊢ φ iff Γ ∪ {¬φ} is inconsistent.
Proof. First suppose Γ ⊢ φ, i.e., there is a closed tableau for
{F φ, T ψ1 , . . . , T ψn }
Using the ¬T rule, this can be turned into a closed tableau for
{T ¬φ, T ψ1 , . . . , T ψn }.
On the other hand, if there is a closed tableau for the latter, we can turn it
into a closed tableau of the former by removing every formula that results from
¬T applied to the first assumption T ¬φ as well as that assumption, and adding
the assumption F φ. For if a branch was closed before because it contained the
conclusion of ¬T applied to T ¬φ, i.e., F φ, the corresponding branch in the
new tableau is also closed. If a branch in the old tableau was closed because
it contained the assumption T ¬φ as well as F ¬φ we can turn it into a closed
branch by applying ¬F to F ¬φ to obtain T φ. This closes the branch since we
added F φ as an assumption.
Problem 21.7. Prove that Γ ⊢ ¬φ iff Γ ∪ {φ} is inconsistent.
Proposition 21.20. If Γ ⊢ φ and ¬φ ∈ Γ , then Γ is inconsistent.
Proof. Suppose Γ ⊢ φ and ¬φ ∈ Γ . Then there are ψ1 , . . . , ψn ∈ Γ such that
{F φ, T ψ1 , . . . , T ψn }
has a closed tableau. Replace the assumption F φ by T ¬φ, and insert F φ, the conclusion of ¬T applied to T ¬φ, after the assumptions. Any sentence in the tableau justified by appeal to line 1 in the old tableau is now justified by appeal to line n + 1. So if the old tableau was closed, the new one is. It shows that Γ is inconsistent, since all assumptions are in Γ .


Proposition 21.21. If Γ ∪ {φ} and Γ ∪ {¬φ} are both inconsistent, then Γ is inconsistent.

Proof. If there are ψ1 , . . . , ψn ∈ Γ and χ1 , . . . , χm ∈ Γ such that

{T φ,T ψ1 , . . . , T ψn } and
{T ¬φ,T χ1 , . . . , T χm }

both have closed tableaux, we can construct a single, combined tableau that
shows that Γ is inconsistent by using as assumptions T ψ1 , . . . , T ψn together
with T χ1 , . . . , T χm , followed by an application of the Cut rule. This yields
two branches, one starting with T φ, the other with F φ.
On the left side, add the part of the first tableau below its assumptions. Here, every rule application is still correct, since each of the assumptions of the first tableau, including T φ, is available. Thus, every branch below T φ closes.
On the right side, add the part of the second tableau below its assumption,
with the results of any applications of ¬T to T ¬φ removed. The conclusion of
¬T to T ¬φ is F φ, which is nevertheless available, as it is the conclusion of the
Cut rule on the right side of the combined tableau.
If a branch in the second tableau was closed because it contained the assumption T ¬φ (which no longer appears as an assumption in the combined tableau) as well as F ¬φ, we can apply ¬F to F ¬φ to obtain T φ. Now the corresponding branch in the combined tableau also closes, because it contains the right-hand conclusion of the Cut rule, F φ. If a branch in the second
tableau closed for any other reason, the corresponding branch in the combined
tableau also closes, since any signed formulas other than T ¬φ occurring on the
branch in the old, second tableau also occur on the corresponding branch in
the combined tableau.


21.9 Derivability and the Propositional Connectives


We establish that the derivability relation ⊢ of tableaux is strong enough to establish some basic facts involving the propositional connectives, such as that φ ∧ ψ ⊢ φ and φ, φ → ψ ⊢ ψ (modus ponens). These facts are needed for the proof of the completeness theorem.

Proposition 21.22.

1. Both φ ∧ ψ ⊢ φ and φ ∧ ψ ⊢ ψ.

2. φ, ψ ⊢ φ ∧ ψ.

Proof. 1. Both {F φ, T φ ∧ ψ} and {F ψ, T φ ∧ ψ} have closed tableaux:


1. Fφ Assumption
2. Tφ ∧ ψ Assumption
3. Tφ ∧T 2
4. Tψ ∧T 2

1. Fψ Assumption
2. Tφ ∧ ψ Assumption
3. Tφ ∧T 2
4. Tψ ∧T 2

2. Here is a closed tableau for {T φ, T ψ, F φ ∧ ψ}:

1. Fφ∧ψ Assumption
2. Tφ Assumption
3. Tψ Assumption

4. Fφ Fψ ∧F 1
⊗ ⊗

Proposition 21.23.

1. {φ ∨ ψ, ¬φ, ¬ψ} is inconsistent.

2. Both φ ⊢ φ ∨ ψ and ψ ⊢ φ ∨ ψ.

Proof. 1. We give a closed tableau of {T φ ∨ ψ, T ¬φ, T ¬ψ}:

1. Tφ ∨ ψ Assumption
2. T ¬φ Assumption
3. T ¬ψ Assumption
4. Fφ ¬T 2
5. Fψ ¬T 3

6. Tφ Tψ ∨T 1
⊗ ⊗

2. Both {F φ ∨ ψ, T φ} and {F φ ∨ ψ, T ψ} have closed tableaux:


1. Fφ∨ψ Assumption
2. Tφ Assumption
3. Fφ ∨F 1
4. Fψ ∨F 1

1. Fφ∨ψ Assumption
2. Tψ Assumption
3. Fφ ∨F 1
4. Fψ ∨F 1

Proposition 21.24.

1. φ, φ → ψ ⊢ ψ.

2. Both ¬φ ⊢ φ → ψ and ψ ⊢ φ → ψ.
Proof. 1. {F ψ, T φ → ψ, T φ} has a closed tableau:

1. Fψ Assumption
2. Tφ → ψ Assumption
3. Tφ Assumption

4. Fφ Tψ →T 2
⊗ ⊗

2. Both {F φ → ψ, T ¬φ} and {F φ → ψ, T ψ} have closed tableaux:

1. Fφ→ψ Assumption
2. T ¬φ Assumption
3. Tφ →F 1
4. Fψ →F 1
5. Fφ ¬T 2

1. Fφ→ψ Assumption
2. Tψ Assumption
3. Tφ →F 1
4. Fψ →F 1


21.10 Derivability and the Quantifiers


The completeness theorem also requires that the tableaux rules yield the facts about ⊢ established in this section.

Theorem 21.25. If c is a constant not occurring in Γ or φ(x) and Γ ⊢ φ(c), then Γ ⊢ ∀x φ(x).

Proof. Suppose Γ ⊢ φ(c), i.e., there are ψ1 , . . . , ψn ∈ Γ and a closed tableau for

{F φ(c), T ψ1 , . . . , T ψn }.

We have to show that there is also a closed tableau for

{F ∀x φ(x),T ψ1 , . . . , T ψn }.

Take the closed tableau and replace the first assumption with F ∀x φ(x), and
insert F φ(c) after the assumptions.

F φ(c)        F ∀x φ(x)
T ψ1          T ψ1
 ...           ...
T ψn          T ψn
              F φ(c)

The tableau is still closed, since all sentences available as assumptions before
are still available at the top of the tableau. The inserted line is the result of
a correct application of ∀F, since the constant symbol c does not occur in ψ1 ,
. . . , ψn or ∀x φ(x), i.e., it does not occur above the inserted line in the new
tableau.

Proposition 21.26.

1. φ(t) ⊢ ∃x φ(x).

2. ∀x φ(x) ⊢ φ(t).

Proof. 1. A closed tableau for F ∃x φ(x), T φ(t) is:

1. F ∃x φ(x) Assumption
2. T φ(t) Assumption
3. F φ(t) ∃F 1

2. A closed tableau for F φ(t), T ∀x φ(x) is:


1. F φ(t) Assumption
2. T ∀x φ(x) Assumption
3. T φ(t) ∀T 2

content/first-order-logic/tableaux/soundness.tex

21.11 Soundness
A derivation system, such as tableaux, is sound if it cannot derive things that do not actually hold. Soundness is thus a kind of guaranteed safety property for derivation systems. Depending on which proof-theoretic property is in question, we would like to know, for instance, that

1. every derivable φ is valid;

2. if a sentence is derivable from some others, it is also a consequence of them;

3. if a set of sentences is inconsistent, it is unsatisfiable.

These are important properties of a derivation system. If any of them do not hold, the derivation system is deficient—it would derive too much. Consequently, establishing the soundness of a derivation system is of the utmost importance.
Because all these proof-theoretic properties are defined via closed tableaux
of some kind or other, proving (1)–(3) above requires proving something about
the semantic properties of closed tableaux. We will first define what it means
for a signed formula to be satisfied in a structure, and then show that if a
tableau is closed, no structure satisfies all its assumptions. (1)–(3) then follow
as corollaries from this result.

Definition 21.27. A structure M satisfies a signed formula T φ iff M ⊨ φ, and it satisfies F φ iff M ⊭ φ. M satisfies a set of signed formulas Γ iff it satisfies every S φ ∈ Γ . Γ is satisfiable if there is a structure that satisfies it, and unsatisfiable otherwise.
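
For the propositional fragment, Definition 21.27 can be mirrored directly in code, with valuations playing the role of structures. A sketch (our own illustration, using the tuple representation of formulas from the earlier examples):

    def value(phi, v):
        """Truth value of a propositional formula under the valuation v,
        a dict mapping atom names to True or False."""
        op = phi[0]
        if op == "atom":
            return v[phi[1]]
        if op == "not":
            return not value(phi[1], v)
        if op == "and":
            return value(phi[1], v) and value(phi[2], v)
        if op == "or":
            return value(phi[1], v) or value(phi[2], v)
        if op == "imp":
            return (not value(phi[1], v)) or value(phi[2], v)
        raise ValueError("unknown connective: %r" % (op,))

    def satisfies(v, signed):
        """v satisfies T phi iff phi is true under v, and F phi iff false."""
        sign, phi = signed
        return value(phi, v) if sign == "T" else not value(phi, v)

    # No valuation satisfies both T p and F p; this is why a closed
    # branch rules out the possibility it describes.
    p = ("atom", "p")
    assert not any(satisfies(v, ("T", p)) and satisfies(v, ("F", p))
                   for v in ({"p": True}, {"p": False}))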

Theorem 21.28 (Soundness). If Γ has a closed tableau, Γ is unsatisfiable.

Proof. Let’s call a branch of a tableau satisfiable iff the set of signed formulas
on it is satisfiable, and let’s call a tableau satisfiable if it contains at least one
satisfiable branch.
We show the following: Extending a satisfiable tableau by one of the rules of inference always results in a satisfiable tableau. This will prove the theorem: any closed tableau results by applying rules of inference to the tableau consisting only of assumptions from Γ . So if Γ were satisfiable, any tableau for it would be satisfiable. A closed tableau, however, is clearly not satisfiable: every branch contains both T φ and F φ, and no structure can both satisfy and not satisfy φ.
Suppose we have a satisfiable tableau, i.e., a tableau with at least one
satisfiable branch. Applying a rule of inference either adds signed formulas
to a branch, or splits a branch in two. If the tableau has a satisfiable branch
which is not extended by the rule application in question, it remains a satisfiable
branch in the extended tableau, so the extended tableau is satisfiable. So we
only have to consider the case where a rule is applied to a satisfiable branch.
Let Γ be the set of signed formulas on that branch, and let S φ ∈ Γ be the
signed formula to which the rule is applied. If the rule does not result in a split
branch, we have to show that the extended branch, i.e., Γ together with the
conclusions of the rule, is still satisfiable. If the rule results in a split branch,
we have to show that at least one of the two resulting branches is satisfiable.
First, we consider the possible inferences that do not result in a split branch.

1. The branch is expanded by applying ¬T to T ¬ψ ∈ Γ . Then the extended branch contains the signed formulas Γ ∪ {F ψ}. Suppose M ⊨ Γ . In particular, M ⊨ ¬ψ. Thus, M ⊭ ψ, i.e., M satisfies F ψ.
2. The branch is expanded by applying ¬F to F ¬ψ ∈ Γ : Exercise.
3. The branch is expanded by applying ∧T to T ψ ∧ χ ∈ Γ , which results in
two new signed formulas on the branch: T ψ and T χ. Suppose M ⊨ Γ ,
in particular M ⊨ ψ ∧ χ. Then M ⊨ ψ and M ⊨ χ. This means that M
satisfies both T ψ and T χ.
4. The branch is expanded by applying ∨F to F ψ ∨ χ ∈ Γ : Exercise.
5. The branch is expanded by applying →F to F ψ → χ ∈ Γ : This results in
two new signed formulas on the branch: T ψ and F χ. Suppose M ⊨ Γ ,
in particular M ⊭ ψ → χ. Then M ⊨ ψ and M ⊭ χ. This means that M
satisfies both T ψ and F χ.

6. The branch is expanded by applying ∀T to T ∀x ψ(x) ∈ Γ : This results in a new signed formula T ψ(t) on the branch. Suppose M ⊨ Γ , in particular, M ⊨ ∀x ψ(x). By Proposition 16.31, M ⊨ ψ(t). Consequently, M satisfies T ψ(t).
7. The branch is expanded by applying ∀F to F ∀x ψ(x) ∈ Γ : This results in a new signed formula F ψ(a) where a is a constant symbol not occurring in Γ . Since Γ is satisfiable, there is an M such that M ⊨ Γ , in particular M ⊭ ∀x ψ(x). We have to show that Γ ∪ {F ψ(a)} is satisfiable. To do this, we define a suitable M′ as follows.
By Proposition 16.19, M ⊭ ∀x ψ(x) iff for some s, M, s ⊭ ψ(x). Now let M′ be just like M, except aM′ = s(x). By Corollary 16.21, for any T χ ∈ Γ , M′ ⊨ χ, and for any F χ ∈ Γ , M′ ⊭ χ, since a does not occur in Γ .


By Proposition 16.20, M′ , s ⊭ ψ(x). By Proposition 16.23, M′ , s ⊭ ψ(a). Since ψ(a) is a sentence, by Proposition 16.18, M′ ⊭ ψ(a), i.e., M′ satisfies F ψ(a).

8. The branch is expanded by applying ∃T to T ∃x ψ(x) ∈ Γ : Exercise.

9. The branch is expanded by applying ∃F to F ∃x ψ(x) ∈ Γ : Exercise.

Now let’s consider the possible inferences that result in a split branch.

1. The branch is expanded by applying ∧F to F ψ ∧ χ ∈ Γ , which results in two branches, a left one continuing through F ψ and a right one through F χ. Suppose M ⊨ Γ , in particular M ⊭ ψ ∧ χ. Then M ⊭ ψ or M ⊭ χ. In the former case, M satisfies F ψ, i.e., M satisfies the formulas on the left branch. In the latter, M satisfies F χ, i.e., M satisfies the formulas on the right branch.

2. The branch is expanded by applying ∨T to T ψ ∨ χ ∈ Γ : Exercise.

3. The branch is expanded by applying →T to T ψ → χ ∈ Γ : Exercise.

4. The branch is expanded by Cut: This results in two branches, one con-
taining T ψ, the other containing F ψ. Since M ⊨ Γ and either M ⊨ ψ or
M ⊭ ψ, M satisfies either the left or the right branch.

Problem 21.8. Complete the proof of Theorem 21.28.

Corollary 21.29. If ⊢ φ then φ is valid.

Corollary 21.30. If Γ ⊢ φ then Γ ⊨ φ.

Proof. If Γ ⊢ φ then for some ψ1 , . . . , ψn ∈ Γ , {F φ, T ψ1 , . . . , T ψn } has a closed tableau. By Theorem 21.28, every structure M either makes some ψi false or makes φ true. Hence, if M ⊨ Γ then also M ⊨ φ.

Corollary 21.31. If Γ is satisfiable, then it is consistent.

Proof. We prove the contrapositive. Suppose that Γ is not consistent. Then there are ψ1 , . . . , ψn ∈ Γ and a closed tableau for {T ψ1 , . . . , T ψn }. By Theorem 21.28, there is no M such that M ⊨ ψi for all i = 1, . . . , n. But then Γ is not satisfiable.


21.12 Tableaux with Identity predicate


Tableaux with the identity predicate require additional inference rules. The rules for = are (t, t1 , and t2 are closed terms):

=
T t = t

T t1 = t2
T φ(t1 )
=T
T φ(t2 )

T t1 = t2
F φ(t1 )
=F
F φ(t2 )

Note that in contrast to all the other rules, =T and =F require that two
signed formulas already appear on the branch, namely both T t1 = t2 and
S φ(t1 ).

Example 21.32. If s and t are closed terms, then s = t, φ(s) ⊢ φ(t):

1. F φ(t) Assumption
2. Ts = t Assumption
3. T φ(s) Assumption
4. T φ(t) =T 2, 3

This may be familiar as the principle of substitutability of identicals, or Leibniz' Law.
Tableaux prove that = is symmetric, i.e., that s1 = s2 ⊢ s2 = s1 :

1. F s2 = s1 Assumption
2. T s1 = s2 Assumption
3. T s1 = s1 =
4. T s2 = s1 =T 2, 3

Here, line 2 is the first prerequisite formula T s1 = s2 of =T. Line 3 is the second one, of the form T φ(s1 )—think of φ(x) as x = s1 , then φ(s1 ) is s1 = s1 and φ(s2 ) is s2 = s1 .
They also prove that = is transitive, i.e., that s1 = s2 , s2 = s3 ⊢ s1 = s3 :

1. F s1 = s3 Assumption
2. T s1 = s2 Assumption
3. T s2 = s3 Assumption
4. T s1 = s3 =T 3, 2

In this tableau, the first prerequisite formula of =T is line 3, T s2 = s3 (s2 plays the role of t1 , and s3 the role of t2 ). The second prerequisite, of the form T φ(s2 ), is line 2. Here, think of φ(x) as s1 = x; that makes φ(s2 ) into s1 = s2 (i.e., line 2) and φ(s3 ) into the formula s1 = s3 in the conclusion.

Problem 21.9. Give closed tableaux for the following:

1. F ∀x ∀y ((x = y ∧ φ(x)) → φ(y))

2. F ∃x (φ(x) ∧ ∀y (φ(y) → y = x)),


T ∃x φ(x) ∧ ∀y ∀z ((φ(y) ∧ φ(z)) → y = z)


21.13 Soundness with Identity predicate


Proposition 21.33. Tableaux with rules for identity are sound: no closed
tableau is satisfiable.

Proof. We just have to show as before that if a tableau has a satisfiable branch,
the branch resulting from applying one of the rules for = to it is also satisfiable.
Let Γ be the set of signed formulas on the branch, and let M be a structure
satisfying Γ .
Suppose the branch is expanded using =, i.e., by adding the signed for-
mula T t = t. Trivially, M ⊨ t = t, so M also satisfies Γ ∪ {T t = t}.
If the branch is expanded using =T, we add a signed formula S φ(t2 ), but Γ
contains both T t1 = t2 and T φ(t1 ). Thus we have M ⊨ t1 = t2 and M ⊨ φ(t1 ).
Let s be a variable assignment with s(x) = ValM (t1 ). By Proposition 16.18,
M, s ⊨ φ(t1 ). Since s ∼x s, by Proposition 16.23, M, s ⊨ φ(x). Since M ⊨ t1 =
t2 , we have ValM (t1 ) = ValM (t2 ), and hence s(x) = ValM (t2 ). By applying
Proposition 16.23 again, we also have M, s ⊨ φ(t2 ). By Proposition 16.18,
M ⊨ φ(t2 ). The case of =F is treated similarly.


Chapter 22

Axiomatic Derivations

No effort has been made yet to ensure that the material in this chapter respects various tags indicating which connectives and quantifiers are primitive or defined: all are assumed to be primitive, except ↔ which is assumed to be defined. If the FOL tag is true, we produce a version with quantifiers, otherwise without.


22.1 Rules and Derivations


explanation Axiomatic derivations are perhaps the simplest derivation system for logic. fol:axd:rul:
sec
A derivation is just a sequence of formulas. To count as a derivation, every
formula in the sequence must either be an instance of an axiom, or must follow
from one or more formulas that precede it in the sequence by a rule of inference.
A derivation derives its last formula.
Definition 22.1 (Derivability). If Γ is a set of formulas of L then a deriva-
tion from Γ is a finite sequence φ1 , . . . , φn of formulas where for each i ≤ n
one of the following holds:
1. φi ∈ Γ ; or
2. φi is an axiom; or
3. φi follows from some φj (and φk ) with j < i (and k < i) by a rule of
inference.

What counts as a correct derivation depends on which inference rules we
allow (and of course what we take to be axioms). An inference rule is
an if-then statement that tells us that, under certain conditions, a step φi in
a derivation is a correct inference step.


Definition 22.2 (Rule of inference). A rule of inference gives a sufficient
condition for what counts as a correct inference step in a derivation from Γ .

For instance, since any one-element sequence φ with φ ∈ Γ trivially counts
as a derivation, the following might be a very simple rule of inference:

If φ ∈ Γ , then φ is always a correct inference step in any derivation
from Γ .

Similarly, if φ is one of the axioms, then φ by itself is a derivation, and so
this is also a rule of inference:

If φ is an axiom, then φ is a correct inference step.

It gets more interesting if the rule of inference appeals to formulas that appear
before the step considered. The following rule is called modus ponens:

If ψ → φ and ψ occur higher up in the derivation, then φ is a correct
inference step.

If this is the only rule of inference, then our definition of derivation above
amounts to this: φ1 , . . . , φn is a derivation iff for each i ≤ n one of the
following holds:

1. φi ∈ Γ ; or

2. φi is an axiom; or

3. for some j < i, φj is ψ → φi , and for some k < i, φk is ψ.

The last clause says that φi follows from φj (ψ → φi ) and φk (ψ) by modus
ponens. If we can go from 1 to n, and each time we find that the formula φi is
either in Γ , an axiom, or a correct inference step according to a rule of
inference, then the entire sequence counts as a correct derivation.
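Because this characterization is completely local (each line is checked against
Γ , the axioms, and earlier lines only), it is directly algorithmic. Here is a
short Python sketch (ours, not from the text) of a checker for derivations in
which modus ponens is the only rule; formulas are nested tuples like
("->", "p", "q"), and what counts as an axiom is passed in as a predicate.

    def is_derivation(lines, gamma, is_axiom):
        """Check Definition 22.1 when modus ponens is the only rule:
        each line must be in gamma, an axiom, or of the form phi where
        some earlier lines are chi and ("->", chi, phi)."""
        for i, phi in enumerate(lines):
            if phi in gamma or is_axiom(phi):
                continue
            earlier = lines[:i]
            if any(("->", chi, phi) in earlier for chi in earlier):
                continue  # phi follows by modus ponens
            return False
        return True

    # q is derivable from {p, p -> q} in one mp step:
    gamma = {"p", ("->", "p", "q")}
    print(is_derivation(["p", ("->", "p", "q"), "q"], gamma, lambda f: False))
    # True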

Definition 22.3 (Derivability). A formula φ is derivable from Γ , written


Γ ⊢ φ, if there is a derivation from Γ ending in φ.

Definition 22.4 (Theorems). A formula φ is a theorem if there is a deriva-


tion of φ from the empty set. We write ⊢ φ if φ is a theorem and ⊬ φ if it is
not.


22.2 Axioms and Rules for the Propositional Connectives
Definition 22.5 (Axioms). The set Ax0 of axioms for the propositional
connectives comprises all formulas of the following forms:

(φ ∧ ψ) → φ                                             (22.1)
(φ ∧ ψ) → ψ                                             (22.2)
φ → (ψ → (φ ∧ ψ))                                       (22.3)
φ → (φ ∨ ψ)                                             (22.4)
φ → (ψ ∨ φ)                                             (22.5)
(φ → χ) → ((ψ → χ) → ((φ ∨ ψ) → χ))                     (22.6)
φ → (ψ → φ)                                             (22.7)
(φ → (ψ → χ)) → ((φ → ψ) → (φ → χ))                     (22.8)
(φ → ψ) → ((φ → ¬ψ) → ¬φ)                               (22.9)
¬φ → (φ → ψ)                                            (22.10)
⊤                                                       (22.11)
⊥ → φ                                                   (22.12)
(φ → ⊥) → ¬φ                                            (22.13)
¬¬φ → φ                                                 (22.14)

Definition 22.6 (Modus ponens). If ψ and ψ → φ already occur in a
derivation, then φ is a correct inference step.

We’ll abbreviate the rule modus ponens as “mp.”
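Each of eqs. (22.1)–(22.14) is a schema: any formula of that shape is an axiom.
A checker therefore needs pattern matching against metavariables, which the
following Python sketch illustrates (ours, under the same tuple representation
as in the earlier sketch; strings beginning with "?" stand for the schematic
φ, ψ, χ).

    def match(schema, formula, subst=None):
        """Return a substitution making schema equal to formula, or None.
        Note: mutates subst in place; fine for a one-shot check."""
        if subst is None:
            subst = {}
        if isinstance(schema, str) and schema.startswith("?"):
            if schema in subst:  # metavariable already bound
                return subst if subst[schema] == formula else None
            subst[schema] = formula
            return subst
        if (isinstance(schema, tuple) and isinstance(formula, tuple)
                and len(schema) == len(formula)):
            for s, f in zip(schema, formula):
                subst = match(s, f, subst)
                if subst is None:
                    return None
            return subst
        return subst if schema == formula else None

    # eq. (22.7): phi -> (psi -> phi)
    AX_7 = ("->", "?phi", ("->", "?psi", "?phi"))
    # theta -> ((theta -> theta) -> theta) is an instance (cf. Example 22.10):
    inst = ("->", "t", ("->", ("->", "t", "t"), "t"))
    print(match(AX_7, inst))
    # {'?phi': 't', '?psi': ('->', 't', 't')}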

22.3 Axioms and Rules for Quantifiers

Definition 22.7 (Axioms for quantifiers). The axioms governing quantifiers
are all instances of the following:

∀x ψ → ψ(t),                                            (22.15)
ψ(t) → ∃x ψ,                                            (22.16)

for any closed term t.

Definition 22.8 (Rules for quantifiers).

If ψ → φ(a) already occurs in the derivation and a does not occur in Γ or ψ,
then ψ → ∀x φ(x) is a correct inference step.


If φ(a) → ψ already occurs in the derivation and a does not occur in Γ or ψ,
then ∃x φ(x) → ψ is a correct inference step.

We’ll abbreviate either of these by “qr.”

22.4 Examples of Derivations
Example 22.9. Suppose we want to prove (¬θ ∨ α) → (θ → α). Clearly, this is
not an instance of any of our axioms, so we have to use the mp rule to derive
it. Our only rule is MP, which given φ and φ → ψ allows us to justify ψ. One
strategy would be to use eq. (22.6) with φ being ¬θ, ψ being α, and χ being
θ → α, i.e., the instance

(¬θ → (θ → α)) → ((α → (θ → α)) → ((¬θ ∨ α) → (θ → α))).

Why? Two applications of MP yield the last part, which is what we want. And
we easily see that ¬θ → (θ → α) is an instance of eq. (22.10), and α → (θ → α)
is an instance of eq. (22.7). So our derivation is:

1. ¬θ → (θ → α) eq. (22.10)
2. (¬θ → (θ → α)) →
((α → (θ → α)) → ((¬θ ∨ α) → (θ → α))) eq. (22.6)
3. (α → (θ → α)) → ((¬θ ∨ α) → (θ → α)) 1, 2, mp
4. α → (θ → α) eq. (22.7)
5. (¬θ ∨ α) → (θ → α) 3, 4, mp

Example 22.10. Let’s try to find a derivation of θ → θ. It is not an instance
of an axiom, so we have to use mp to derive it. eq. (22.7) is an axiom of the
form φ → ψ to which we could apply mp. To be useful, of course, the ψ which
mp would justify as a correct step in this case would have to be θ → θ, since
this is what we want to derive. That means φ would also have to be θ, i.e., we
might look at this instance of eq. (22.7):

θ → (θ → θ)

In order to apply mp, we would also need to justify the corresponding second
premise, namely φ. But in our case, that would be θ, and we won’t be able to
derive θ by itself. So we need a different strategy.
The other axiom involving just → is eq. (22.8), i.e.,

(φ → (ψ → χ)) → ((φ → ψ) → (φ → χ))

We could get to the last nested conditional by applying mp twice. Again, that
would mean that we want an instance of eq. (22.8) where φ → χ is θ → θ, the


formula we are aiming for. Then of course, φ and χ are both θ. How should
we pick ψ so that both φ → (ψ → χ) and φ → ψ, i.e., in our case θ → (ψ → θ)
and θ → ψ, are also derivable? Well, the first of these is already an instance of
eq. (22.7), whatever we decide ψ to be. And θ → ψ would be another instance
of eq. (22.7) if ψ were (θ → θ). So, our derivation is:
1. θ → ((θ → θ) → θ) eq. (22.7)
2. (θ → ((θ → θ) → θ)) →
((θ → (θ → θ)) → (θ → θ)) eq. (22.8)
3. (θ → (θ → θ)) → (θ → θ) 1, 2, mp
4. θ → (θ → θ) eq. (22.7)
5. θ→θ 3, 4, mp
Example 22.11. Sometimes we want to show that there is a derivation of
some formula from some other formulas Γ . For instance, let’s show that we
can derive φ → χ from Γ = {φ → ψ, ψ → χ}.
1. φ→ψ Hyp
2. ψ→χ Hyp
3. (ψ → χ) → (φ → (ψ → χ)) eq. (22.7)
4. φ → (ψ → χ) 2, 3, mp
5. (φ → (ψ → χ)) →
((φ → ψ) → (φ → χ)) eq. (22.8)
6. ((φ → ψ) → (φ → χ)) 4, 5, mp
7. φ→χ 1, 6, mp
The lines labelled “Hyp” (for “hypothesis”) indicate that the formula on that
line is an element of Γ .
Proposition 22.12. If Γ ⊢ φ → ψ and Γ ⊢ ψ → χ, then Γ ⊢ φ → χ.

Proof. Suppose Γ ⊢ φ → ψ and Γ ⊢ ψ → χ. Then there is a derivation of φ → ψ
from Γ ; and a derivation of ψ → χ from Γ as well. Combine these into a single
derivation by concatenating them. Now add lines 3–7 of the derivation in the
preceding example. This is a derivation of φ → χ—which is the last line of the
new derivation—from Γ . Note that the justifications of lines 4 and 7 remain
valid if the reference to line number 2 is replaced by reference to the last line
of the derivation of φ → ψ, and reference to line number 1 by reference to the
last line of the derivation of ψ → χ.
Problem 22.1. Show that the following hold by exhibiting derivations from
the axioms:
1. (φ ∧ ψ) → (ψ ∧ φ)
2. ((φ ∧ ψ) → χ) → (φ → (ψ → χ))
3. ¬(φ ∨ ψ) → ¬φ

22.5 Derivations with Quantifiers
Example 22.13. Let us give a derivation of (∀x φ(x) ∧ ∀y ψ(y)) → ∀x (φ(x) ∧
ψ(x)).
First, note that

(∀x φ(x) ∧ ∀y ψ(y)) → ∀x φ(x)

is an instance of eq. (22.1), and

∀x φ(x) → φ(a)

of eq. (22.15). So, by Proposition 22.12, we know that

(∀x φ(x) ∧ ∀y ψ(y)) → φ(a)

is derivable. Likewise, since

(∀x φ(x) ∧ ∀y ψ(y)) → ∀y ψ(y) and


∀y ψ(y) → ψ(a)

are instances of eq. (22.2) and eq. (22.15), respectively,

(∀x φ(x) ∧ ∀y ψ(y)) → ψ(a)

is derivable by Proposition 22.12. Using an appropriate instance of eq. (22.3)


and two applications of mp, we see that

(∀x φ(x) ∧ ∀y ψ(y)) → (φ(a) ∧ ψ(a))

is derivable. We can now apply qr to obtain

(∀x φ(x) ∧ ∀y ψ(y)) → ∀x (φ(x) ∧ ψ(x)).

22.6 Proof-Theoretic Notions

Just as we’ve defined a number of important semantic notions (validity, entailment,
satisfiability), we now define corresponding proof-theoretic notions.
These are not defined by appeal to satisfaction of sentences in structures, but
by appeal to the derivability or non-derivability of certain formulas. It was an
important discovery that these notions coincide. That they do is the content
of the soundness and completeness theorems.


Definition 22.14 (Derivability). A formula φ is derivable from Γ , written
Γ ⊢ φ, if there is a derivation from Γ ending in φ.

Definition 22.15 (Theorems). A formula φ is a theorem if there is a derivation
of φ from the empty set. We write ⊢ φ if φ is a theorem and ⊬ φ if it is not.

Definition 22.16 (Consistency). A set Γ of formulas is consistent if and
only if Γ ⊬ ⊥; it is inconsistent otherwise.

Proposition 22.17 (Reflexivity). If φ ∈ Γ , then Γ ⊢ φ.

Proof. The formula φ by itself is a derivation of φ from Γ .

Proposition 22.18 (Monotonicity). If Γ ⊆ ∆ and Γ ⊢ φ, then ∆ ⊢ φ.

Proof. Any derivation of φ from Γ is also a derivation of φ from ∆.

Proposition 22.19 (Transitivity). If Γ ⊢ φ and {φ} ∪ ∆ ⊢ ψ, then Γ ∪ ∆ ⊢ ψ.

Proof. Suppose {φ} ∪ ∆ ⊢ ψ. Then there is a derivation ψ1 , . . . , ψl = ψ


from {φ} ∪ ∆. Some of the steps in that derivation will be correct because of
a rule which refers to a prior line ψi = φ. By hypothesis, there is a derivation
of φ from Γ , i.e., a derivation φ1 , . . . , φk = φ where every φi is an axiom,
an element of Γ , or correct by a rule of inference. Now consider the sequence

φ1 , . . . , φk = φ, ψ1 , . . . , ψl = ψ.

This is a correct derivation of ψ from Γ ∪ ∆, since every step ψi = φ is now
justified by the same rule which justifies φk = φ.

Note that this means that in particular if Γ ⊢ φ and φ ⊢ ψ, then Γ ⊢ ψ. It
follows also that if φ1 , . . . , φn ⊢ ψ and Γ ⊢ φi for each i, then Γ ⊢ ψ.

Proposition 22.20. Γ is inconsistent iff Γ ⊢ φ for every φ.

Proof. Exercise.

Problem 22.2. Prove Proposition 22.20.

Proposition 22.21 (Compactness).

1. If Γ ⊢ φ then there is a finite subset Γ0 ⊆ Γ such that Γ0 ⊢ φ.

2. If every finite subset of Γ is consistent, then Γ is consistent.

324 Release : 6891b66 (2024-12-01)


22.7. THE DEDUCTION THEOREM

Proof. 1. If Γ ⊢ φ, then there is a finite sequence of formulas φ1 , . . . , φn
so that φ ≡ φn and each φi is either a logical axiom, an element of Γ , or
follows from previous formulas by modus ponens. Take Γ0 to be those
follows from previous formulas by modus ponens. Take Γ0 to be those
φi which are in Γ . Then the derivation is likewise a derivation from Γ0 ,
and so Γ0 ⊢ φ.
2. This is the contrapositive of (1) for the special case φ ≡ ⊥.

22.7 The Deduction Theorem

As we’ve seen, giving derivations in an axiomatic system is cumbersome, and
derivations may be hard to find. Rather than actually write out long lists of
formulas, it is generally easier to argue that such derivations exist, by mak-
ing use of a few simple results. We’ve already established three such results:
Proposition 22.17 says we can always assert that Γ ⊢ φ when we know that
φ ∈ Γ . Proposition 22.18 says that if Γ ⊢ φ then also Γ ∪ {ψ} ⊢ φ. And
Proposition 22.19 implies that if Γ ⊢ φ and φ ⊢ ψ, then Γ ⊢ ψ. Here’s another
simple result, a “meta”-version of modus ponens:
Proposition 22.22. If Γ ⊢ φ and Γ ⊢ φ → ψ, then Γ ⊢ ψ.

Proof. We have that {φ, φ → ψ} ⊢ ψ:


1. φ Hyp.
2. φ→ψ Hyp.
3. ψ 1, 2, mp
By Proposition 22.19, Γ ⊢ ψ.

The most important result we’ll use in this context is the deduction theorem:
Theorem 22.23 (Deduction Theorem). Γ ∪ {φ} ⊢ ψ if and only if Γ ⊢ φ → ψ.

Proof. The “if” direction is immediate. If Γ ⊢ φ → ψ then also Γ ∪ {φ} ⊢
φ → ψ by Proposition 22.18. Also, Γ ∪ {φ} ⊢ φ by Proposition 22.17. So, by
Proposition 22.22, Γ ∪ {φ} ⊢ ψ.
For the “only if” direction, we proceed by induction on the length of the
derivation of ψ from Γ ∪ {φ}.
For the induction basis, we prove the claim for every derivation of length 1.
A derivation of ψ from Γ ∪ {φ} of length 1 consists of ψ by itself; and if it is
correct, ψ is either ∈ Γ ∪ {φ} or is an axiom. If ψ ∈ Γ or is an axiom, then
Γ ⊢ ψ. We also have that Γ ⊢ ψ → (φ → ψ) by eq. (22.7), and Proposition 22.22
gives Γ ⊢ φ → ψ. If ψ ∈ {φ}, then Γ ⊢ φ → ψ because then the last sentence φ → ψ
is the same as φ → φ, and we have derived that in Example 22.10.


For the inductive step, suppose a derivation of ψ from Γ ∪ {φ} ends with
a step ψ which is justified by modus ponens. (If it is not justified by modus
ponens, ψ ∈ Γ , ψ ≡ φ, or ψ is an axiom, and the same reasoning as in the
induction basis applies.) Then some previous steps in the derivation are χ → ψ
and χ, for some formula χ, i.e., Γ ∪ {φ} ⊢ χ → ψ and Γ ∪ {φ} ⊢ χ, and the
respective derivations are shorter, so the inductive hypothesis applies to them.
We thus have both:

Γ ⊢ φ → (χ → ψ);
Γ ⊢ φ → χ.

But also
Γ ⊢ (φ → (χ → ψ)) → ((φ → χ) → (φ → ψ)),
by eq. (22.8), and two applications of Proposition 22.22 give Γ ⊢ φ → ψ, as
required.

Notice how eq. (22.7) and eq. (22.8) were chosen precisely so that the De-
duction Theorem would hold.
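Indeed, the induction in the proof can be read as an algorithm that rewrites
derivations. The following Python sketch (ours; same tuple representation for
formulas as in the earlier sketches) converts a derivation of ψ from Γ ∪ {φ}
into a derivation of φ → ψ from Γ , producing for each original line χ the line
φ → χ via eq. (22.7), eq. (22.8), or Example 22.10.

    def imp(a, b):
        return ("->", a, b)

    def derive_self_implication(phi):
        """The five-line derivation of phi -> phi from Example 22.10."""
        pp = imp(phi, phi)
        return [imp(phi, imp(pp, phi)),                              # eq. (22.7)
                imp(imp(phi, imp(pp, phi)), imp(imp(phi, pp), pp)),  # eq. (22.8)
                imp(imp(phi, pp), pp),                               # 1, 2, mp
                imp(phi, pp),                                        # eq. (22.7)
                pp]                                                  # 3, 4, mp

    def find_mp_premise(chi, lines):
        """Find theta such that theta and theta -> chi occur in lines."""
        for theta in lines:
            if imp(theta, chi) in lines:
                return theta
        raise ValueError("line not justified by mp")

    def ded_thm(phi, lines, gamma, axioms):
        """Turn a derivation `lines` of psi from gamma | {phi} into a
        derivation of phi -> psi from gamma ('only if' direction)."""
        out = []
        for chi in lines:
            if chi == phi:
                out += derive_self_implication(phi)
            elif chi in gamma or chi in axioms:
                out += [chi, imp(chi, imp(phi, chi)), imp(phi, chi)]
            else:  # chi came by mp from theta and theta -> chi
                theta = find_mp_premise(chi, lines)
                out += [imp(imp(phi, imp(theta, chi)),
                            imp(imp(phi, theta), imp(phi, chi))),    # eq. (22.8)
                        imp(imp(phi, theta), imp(phi, chi)),
                        imp(phi, chi)]
        return out

    # Move p into the conclusion of the derivation [p, p -> q, q]:
    print(ded_thm("p", ["p", imp("p", "q"), "q"], {imp("p", "q")}, set()))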
The following are some useful facts about derivability, which we leave as
exercises.

Proposition 22.24.

1. ⊢ (φ → ψ) → ((ψ → χ) → (φ → χ));

2. If Γ ∪ {¬φ} ⊢ ¬ψ then Γ ∪ {ψ} ⊢ φ (Contraposition);

3. {φ, ¬φ} ⊢ ψ (Ex Falso Quodlibet, Explosion);

4. {¬¬φ} ⊢ φ (Double Negation Elimination);

5. If Γ ⊢ ¬¬φ then Γ ⊢ φ.

Problem 22.3. Prove Proposition 22.24.

22.8 The Deduction Theorem with Quantifiers

Theorem 22.25 (Deduction Theorem). If Γ ∪ {φ} ⊢ ψ, then Γ ⊢ φ → ψ.

Proof. We again proceed by induction on the length of the derivation of ψ
from Γ ∪ {φ}.
The proof of the induction basis is identical to that in the proof of Theo-
rem 22.23.


For the inductive step, suppose again that the derivation of ψ from Γ ∪ {φ}
ends with a step ψ which is justified by an inference rule. If the inference rule
is modus ponens, we proceed as in the proof of Theorem 22.23. If the inference
rule is qr, we know that ψ ≡ χ → ∀x θ(x) and a formula of the form χ → θ(a)
appears earlier in the derivation, where a does not occur in χ, φ, or Γ . We
thus have that

Γ ∪ {φ} ⊢ χ → θ(a),

and the induction hypothesis applies, i.e., we have that

Γ ⊢ φ → (χ → θ(a)).

By

⊢ (φ → (χ → θ(a))) → ((φ ∧ χ) → θ(a))

and modus ponens we get

Γ ⊢ (φ ∧ χ) → θ(a).

Since the eigenvariable condition still applies, we can add a step to this deriva-
tion justified by qr, and get

Γ ⊢ (φ ∧ χ) → ∀x θ(x).

We also have

⊢ ((φ ∧ χ) → ∀x θ(x)) → (φ → (χ → ∀x θ(x))),

so by modus ponens,

Γ ⊢ φ → (χ → ∀x θ(x)),

i.e., Γ ⊢ ψ.
We leave the case where ψ is justified by the rule qr, but is of the form
∃x θ(x) → χ, as an exercise.

Problem 22.4. Complete the proof of Theorem 22.25.

22.9 Derivability and Consistency

We will now establish a number of properties of the derivability relation. They
are independently interesting, but each will play a role in the proof of the
completeness theorem.

Proposition 22.26. If Γ ⊢ φ and Γ ∪ {φ} is inconsistent, then Γ is inconsistent.

Proof. If Γ ∪ {φ} is inconsistent, then Γ ∪ {φ} ⊢ ⊥. By Proposition 22.17,


Γ ⊢ ψ for every ψ ∈ Γ . Since also Γ ⊢ φ by hypothesis, Γ ⊢ ψ for every
ψ ∈ Γ ∪ {φ}. By Proposition 22.19, Γ ⊢ ⊥, i.e., Γ is inconsistent.

Proposition 22.27. Γ ⊢ φ iff Γ ∪ {¬φ} is inconsistent.

Proof. First suppose Γ ⊢ φ. Then Γ ∪ {¬φ} ⊢ φ by Proposition 22.18. Γ ∪
{¬φ} ⊢ ¬φ by Proposition 22.17. We also have ⊢ ¬φ → (φ → ⊥) by eq. (22.10).
So by two applications of Proposition 22.22, we have Γ ∪ {¬φ} ⊢ ⊥.
Now assume Γ ∪ {¬φ} is inconsistent, i.e., Γ ∪ {¬φ} ⊢ ⊥. By the deduction
theorem, Γ ⊢ ¬φ → ⊥. Γ ⊢ (¬φ → ⊥) → ¬¬φ by eq. (22.13), so Γ ⊢ ¬¬φ
by Proposition 22.22. Since Γ ⊢ ¬¬φ → φ (eq. (22.14)), we have Γ ⊢ φ by
Proposition 22.22 again.

Problem 22.5. Prove that Γ ⊢ ¬φ iff Γ ∪ {φ} is inconsistent.

Proposition 22.28. If Γ ⊢ φ and ¬φ ∈ Γ , then Γ is inconsistent.

Proof. Γ ⊢ ¬φ → (φ → ⊥) by eq. (22.10). Γ ⊢ ⊥ by two applications of
Proposition 22.22.

Proposition 22.29. If Γ ∪ {φ} and Γ ∪ {¬φ} are both inconsistent, then Γ
is inconsistent.

Proof. Exercise.

Problem 22.6. Prove Proposition 22.29.

22.10 Derivability and the Propositional Connectives

We establish that the derivability relation ⊢ of axiomatic deduction is strong
enough to establish some basic facts involving the propositional connectives,
such as that φ ∧ ψ ⊢ φ and φ, φ → ψ ⊢ ψ (modus ponens). These facts are
needed for the proof of the completeness theorem.

Proposition 22.30.


1. Both φ ∧ ψ ⊢ φ and φ ∧ ψ ⊢ ψ.

2. φ, ψ ⊢ φ ∧ ψ.

Proof. 1. From eq. (22.1) and eq. (22.2) by modus ponens.

2. From eq. (22.3) by two applications of modus ponens.

Proposition 22.31.

1. φ ∨ ψ, ¬φ, ¬ψ is inconsistent.

2. Both φ ⊢ φ ∨ ψ and ψ ⊢ φ ∨ ψ.

Proof. 1. From eq. (22.10) we get ⊢ ¬φ → (φ → ⊥) and ⊢ ¬ψ → (ψ → ⊥). So by
the deduction theorem, we have {¬φ} ⊢ φ → ⊥ and {¬ψ} ⊢ ψ → ⊥. From
eq. (22.6) we get {¬φ, ¬ψ} ⊢ (φ ∨ ψ) → ⊥. By the deduction theorem,
{φ ∨ ψ, ¬φ, ¬ψ} ⊢ ⊥.

2. From eq. (22.4) and eq. (22.5) by modus ponens.

Proposition 22.32.

1. φ, φ → ψ ⊢ ψ.

2. Both ¬φ ⊢ φ → ψ and ψ ⊢ φ → ψ.

Proof. 1. We can derive:

1. φ Hyp
2. φ→ψ Hyp
3. ψ 1, 2, mp

2. By eq. (22.10) and eq. (22.7) and the deduction theorem, respectively.

22.11 Derivability and the Quantifiers

The completeness theorem also requires that axiomatic deductions yield the
facts about ⊢ established in this section.

Theorem 22.33. If c is a constant symbol not occurring in Γ or φ(x) and
Γ ⊢ φ(c), then Γ ⊢ ∀x φ(x).

Proof. By the deduction theorem, Γ ⊢ ⊤ → φ(c). Since c does not occur in Γ
or ⊤, we get Γ ⊢ ⊤ → ∀x φ(x) by qr. Since ⊤ is an axiom (eq. (22.11)), Γ ⊢ ⊤,
and so Γ ⊢ ∀x φ(x) by mp.


Proposition 22.34.
1. φ(t) ⊢ ∃x φ(x).

2. ∀x φ(x) ⊢ φ(t).

Proof. 1. By eq. (22.16) and the deduction theorem.

2. By eq. (22.15) and the deduction theorem.

22.12 Soundness

A derivation system, such as axiomatic deduction, is sound if it cannot derive
things that do not actually hold. Soundness is thus a kind of guaranteed safety
property for derivation systems. Depending on which proof theoretic property
is in question, we would like to know for instance, that

1. every derivable φ is valid;

2. if φ is derivable from some others Γ , it is also a consequence of them;

3. if a set of formulas Γ is inconsistent, it is unsatisfiable.

These are important properties of a derivation system. If any of them do
not hold, the derivation system is deficient—it would derive too much.
Consequently, establishing the soundness of a derivation system is of the utmost
importance.

Proposition 22.35. If φ is an axiom, then M, s ⊨ φ for each structure M
and assignment s.

Proof. We have to verify that all the axioms are valid. For instance, here is the
case for eq. (22.15): suppose t is free for x in φ, and assume M, s ⊨ ∀x φ. Then
by definition of satisfaction, for each s′ ∼x s, also M, s′ ⊨ φ, and in particular
this holds when s′ (x) = ValM s (t). By Proposition 16.23, M, s ⊨ φ[t/x]. This
shows that M, s ⊨ (∀x φ → φ[t/x]).

Theorem 22.36 (Soundness). If Γ ⊢ φ then Γ ⊨ φ.

Proof. By induction on the length of the derivation of φ from Γ . If there are
no steps justified by inferences, then all formulas in the derivation are either
instances of axioms or are in Γ . By the previous proposition, all the axioms are
valid, and hence if φ is an axiom then Γ ⊨ φ. If φ ∈ Γ , then trivially Γ ⊨ φ.
If the last step of the derivation of φ is justified by modus ponens, then
there are formulas ψ and ψ → φ in the derivation, and the induction hypothesis
applies to the part of the derivation ending in those formulas (since they contain


at least one fewer step justified by an inference). So, by induction hypothesis,
Γ ⊨ ψ and Γ ⊨ ψ → φ. Then Γ ⊨ φ by Theorem 16.30.
Now suppose the last step is justified by qr. Then that step has the form
χ → ∀x ψ(x) and there is a preceding step χ → ψ(c) with c not in Γ , χ, or
∀x ψ(x). By induction hypothesis, Γ ⊨ χ → ψ(c). By Theorem 16.30,
Γ ∪ {χ} ⊨ ψ(c).
Consider some structure M such that M ⊨ Γ ∪ {χ}. We need to show that
M ⊨ ∀x ψ(x). Since ∀x ψ(x) is a sentence, this means we have to show that for
every variable assignment s, M, s ⊨ ψ(x) (Proposition 16.19). Since Γ ∪ {χ}
consists entirely of sentences, M, s ⊨ θ for all θ ∈ Γ ∪ {χ} by Definition 16.11.
Let M′ be like M except that cM′ = s(x). Since c does not occur in Γ or χ,
M′ ⊨ Γ ∪ {χ} by Corollary 16.21. Since Γ ∪ {χ} ⊨ ψ(c), M′ ⊨ ψ(c). Since ψ(c)
is a sentence, M′ , s ⊨ ψ(c) by Proposition 16.18. M′ , s ⊨ ψ(x) iff M′ ⊨ ψ(c) by
Proposition 16.23 (recall that ψ(c) is just ψ(x)[c/x]). So, M′ , s ⊨ ψ(x). Since
c does not occur in ψ(x), by Proposition 16.20, M, s ⊨ ψ(x). But s was an
arbitrary variable assignment, so M ⊨ ∀x ψ(x). Thus Γ ∪ {χ} ⊨ ∀x ψ(x). By
Theorem 16.30, Γ ⊨ χ → ∀x ψ(x).
The case where φ is justified by qr but is of the form ∃x ψ(x) → χ is left
as an exercise.

Problem 22.7. Complete the proof of Theorem 22.36.

Corollary 22.37. If ⊢ φ, then φ is valid.

Corollary 22.38. If Γ is satisfiable, then it is consistent.

Proof. We prove the contrapositive. Suppose that Γ is not consistent. Then


Γ ⊢ ⊥, i.e., there is a derivation of ⊥ from Γ . By Theorem 22.36, any struc-
ture M that satisfies Γ must satisfy ⊥. Since M ⊭ ⊥ for every structure M,
no M can satisfy Γ , i.e., Γ is not satisfiable.

22.13 Derivations with Identity predicate

In order to accommodate = in derivations, we simply add new axiom schemas.
The definition of derivation and ⊢ remains the same, we just also allow the
new axioms.

Definition 22.39 (Axioms for identity predicate).

t = t,                                                  (22.17)
t1 = t2 → (ψ(t1 ) → ψ(t2 )),                            (22.18)

for any closed terms t, t1 , t2 .

Proposition 22.40. The axioms eq. (22.17) and eq. (22.18) are valid.

Proof. Exercise.

Problem 22.8. Prove Proposition 22.40.

Proposition 22.41. Γ ⊢ t = t, for any term t and set Γ .

Proposition 22.42. If Γ ⊢ φ(t1 ) and Γ ⊢ t1 = t2 , then Γ ⊢ φ(t2 ).

Proof. The formula


(t1 = t2 → (φ(t1 ) → φ(t2 )))
is an instance of eq. (22.18). The conclusion follows by two applications of mp.

Chapter 23

The Completeness Theorem

23.1 Introduction

The completeness theorem is one of the most fundamental results about logic.
It comes in two formulations, the equivalence of which we’ll prove. In its
first formulation it says something fundamental about the relationship between
semantic consequence and our derivation system: if a sentence φ follows from
some sentences Γ , then there is also a derivation that establishes Γ ⊢ φ. Thus,
the derivation system is as strong as it can possibly be without proving things
that don’t actually follow.
In its second formulation, it can be stated as a model existence result: every
consistent set of sentences is satisfiable. Consistency is a proof-theoretic notion:
it says that our derivation system is unable to produce certain derivations. But
who’s to say that just because there are no derivations of a certain sort from Γ ,
it’s guaranteed that there is a structure M? Before the completeness theorem
was first proved—in fact before we had the derivation systems we now do—the
great German mathematician David Hilbert held the view that consistency of


mathematical theories guarantees the existence of the objects they are about.
He put it as follows in a letter to Gottlob Frege:

If the arbitrarily given axioms do not contradict one another with
all their consequences, then they are true and the things defined by
the axioms exist. This is for me the criterion of truth and existence.

Frege vehemently disagreed. The second formulation of the completeness
theorem shows that Hilbert was right in at least the sense that if the axioms are
consistent, then some structure exists that makes them all true.
These aren’t the only reasons the completeness theorem—or rather, its
proof—is important. It has a number of important consequences, some of
which we’ll discuss separately. For instance, since any derivation that shows
Γ ⊢ φ is finite and so can only use finitely many of the sentences in Γ , it follows
by the completeness theorem that if φ is a consequence of Γ , it is already a
consequence of a finite subset of Γ . This is called compactness. Equivalently,
if every finite subset of Γ is consistent, then Γ itself must be consistent.
Although the compactness theorem follows from the completeness theorem
via the detour through derivations, it is also possible to use the proof of the
completeness theorem to establish it directly. For what the proof does is take a
set of sentences with a certain property—consistency—and constructs a struc-
ture out of this set that has certain properties (in this case, that it satisfies
the set). Almost the very same construction can be used to directly establish
compactness, by starting from “finitely satisfiable” sets of sentences instead
of consistent ones. The construction also yields other consequences, e.g., that
any satisfiable set of sentences has a finite or denumerable model. (This re-
sult is called the Löwenheim–Skolem theorem.) In general, the construction of
structures from sets of sentences is used often in logic, and sometimes even in
philosophy.

23.2 Outline of the Proof

The proof of the completeness theorem is a bit complex, and upon first reading
it, it is easy to get lost. So let us outline the proof. The first step is a shift
of perspective, that allows us to see a route to a proof. When completeness
is thought of as “whenever Γ ⊨ φ then Γ ⊢ φ,” it may be hard to even come
up with an idea: for to show that Γ ⊢ φ we have to find a derivation, and
it does not look like the hypothesis that Γ ⊨ φ helps us for this in any way.
For some proof systems it is possible to directly construct a derivation, but we
will take a slightly different approach. The shift in perspective required is this:
completeness can also be formulated as: “if Γ is consistent, it is satisfiable.”
Perhaps we can use the information in Γ together with the hypothesis that it is
consistent to construct a structure that satisfies every sentence in Γ . After all,


we know what kind of structure we are looking for: one that is as Γ describes
it!
If Γ contains only atomic sentences, it is easy to construct a model for it.
Suppose the atomic sentences are all of the form P (a1 , . . . , an ) where the ai
are constant symbols. All we have to do is come up with a domain |M| and
an assignment for P so that M ⊨ P (a1 , . . . , an ). But that’s not very hard: put
|M| = N, ciM = i, and for every P (a1 , . . . , an ) ∈ Γ , put the tuple ⟨k1 , . . . , kn ⟩
into P M , where ki is the index of the constant symbol ai (i.e., ai ≡ cki ).
Now suppose Γ contains some formula ¬ψ, with ψ atomic. We might worry
that the construction of M interferes with the possibility of making ¬ψ true.
But here’s where the consistency of Γ comes in: if ¬ψ ∈ Γ , then ψ ∉ Γ , or else
Γ would be inconsistent. And if ψ ∉ Γ , then according to our construction
of M, M ⊭ ψ, so M ⊨ ¬ψ. So far so good.
What if Γ contains complex, non-atomic formulas? Say it contains φ ∧ ψ.
To make that true, we should proceed as if both φ and ψ were in Γ . And if
φ ∨ ψ ∈ Γ , then we will have to make at least one of them true, i.e., proceed
as if one of them was in Γ .
This suggests the following idea: we add additional formulas to Γ so as to
(a) keep the resulting set consistent and (b) make sure that for every possible
atomic sentence φ, either φ is in the resulting set, or ¬φ is, and (c) such that,
whenever φ ∧ ψ is in the set, so are both φ and ψ, if φ ∨ ψ is in the set, at least
one of φ or ψ is also, etc. We keep doing this (potentially forever). Call the
set of all formulas so added Γ ∗ . Then our construction above would provide
us with a structure M for which we could prove, by induction, that it satisfies
all sentences in Γ ∗ , and hence also all sentence in Γ since Γ ⊆ Γ ∗ . It turns
out that guaranteeing (a) and (b) is enough. A set of sentences for which (b)
holds is called complete. So our task will be to extend the consistent set Γ to
a consistent and complete set Γ ∗ .
There is one wrinkle in this plan: if ∃x φ(x) ∈ Γ we would hope to be able
to pick some constant symbol c and add φ(c) in this process. But how do we
know we can always do that? Perhaps we only have a few constant symbols
in our language, and for each one of them we have ¬φ(c) ∈ Γ . We can’t also
add φ(c), since this would make the set inconsistent, and we wouldn’t know
whether M has to make φ(c) or ¬φ(c) true. Moreover, it might happen that
Γ contains only sentences in a language that has no constant symbols at all
(e.g., the language of set theory).
The solution to this problem is to simply add infinitely many constants at
the beginning, plus sentences that connect them with the quantifiers in the right
way. (Of course, we have to verify that this cannot introduce an inconsistency.)
Our original construction works well if we only have constant symbols in the
atomic sentences. But the language might also contain function symbols. In
that case, it might be tricky to find the right functions on N to assign to these
function symbols to make everything work. So here’s another trick: instead of
using i to interpret ci , just take the set of constant symbols itself as the domain.
Then M can assign every constant symbol to itself: ciM = ci . But why not go
all the way: let |M| be all terms of the language! If we do this, there is an


obvious assignment of functions (that take terms as arguments and have terms
as values) to function symbols: we assign to the function symbol fin the function
which, given n terms t1 , . . . , tn as input, produces the term fin (t1 , . . . , tn ) as
value.
The last piece of the puzzle is what to do with =. The predicate symbol =
has a fixed interpretation: M ⊨ t = t′ iff ValM (t) = ValM (t′ ). Now if we set
things up so that the value of a term t is t itself, then this structure will make
no sentence of the form t = t′ true unless t and t′ are one and the same term.
And of course this is a problem, since basically every interesting theory in a
language with function symbols will have as theorems sentences t = t′ where t
and t′ are not the same term (e.g., in theories of arithmetic: (0 + 0) = 0). To
solve this problem, we change the domain of M: instead of using terms as the
objects in |M|, we use sets of terms, and each set is so that it contains all those
terms which the sentences in Γ require to be equal. So, e.g., if Γ is a theory of
arithmetic, one of these sets will contain: 0, (0 + 0), (0 × 0), etc. This will be
the set we assign to 0, and it will turn out that this set is also the value of all
the terms in it, e.g., also of (0 + 0). Therefore, the sentence (0 + 0) = 0 will
be true in this revised structure.
So here’s what we’ll do. First we investigate the properties of complete
consistent sets, in particular we prove that a complete consistent set contains
φ ∧ ψ iff it contains both φ and ψ, φ ∨ ψ iff it contains at least one of them,
etc. (Proposition 23.2). Then we define and investigate “saturated” sets of
sentences. A saturated set is one which contains conditionals that link each
quantified sentence to instances of it (Definition 23.5). We show that any
consistent set Γ can always be extended to a saturated set Γ ′ (Lemma 23.6).
If a set is consistent, saturated, and complete it also has the property that
it contains ∃x φ(x) iff it contains φ(t) for some closed term t and ∀x φ(x) iff
it contains φ(t) for all closed terms t (Proposition 23.7). We’ll then take the
saturated consistent set Γ ′ and show that it can be extended to a saturated,
consistent, and complete set Γ ∗ (Lemma 23.8). This set Γ ∗ is what we’ll
use to define our term model M(Γ ∗ ). The term model has the set of closed
terms as its domain, and the interpretation of its predicate symbols is given
by the atomic sentences in Γ ∗ (Definition 23.9). We’ll use the properties of
saturated, complete consistent sets to show that indeed M(Γ ∗ ) ⊨ φ iff φ ∈ Γ ∗
(Lemma 23.12), and thus in particular, M(Γ ∗ ) ⊨ Γ . Finally, we’ll consider
how to define a term model if Γ contains = as well (Definition 23.16) and show
that it satisfies Γ ∗ (Lemma 23.19).

23.3 Complete Consistent Sets of Sentences

Definition 23.1 (Complete set). A set Γ of sentences is complete iff for
any sentence φ, either φ ∈ Γ or ¬φ ∈ Γ .


Complete sets of sentences leave no questions unanswered. For any sen-
tence φ, Γ “says” if φ is true or false. The importance of complete sets extends
beyond the proof of the completeness theorem. A theory which is complete and
axiomatizable, for instance, is always decidable.
Complete consistent sets are important in the completeness proof since we
can guarantee that every consistent set of sentences Γ is contained in a complete
consistent set Γ ∗ . A complete consistent set contains, for each sentence φ,
either φ or its negation ¬φ, but not both. This is true in particular for atomic
sentences, so from a complete consistent set in a language suitably expanded
by constant symbols, we can construct a structure where the interpretation of
predicate symbols is defined according to which atomic sentences are in Γ ∗ .
This structure can then be shown to make all sentences in Γ ∗ (and hence also
all those in Γ ) true. The proof of this latter fact requires that ¬φ ∈ Γ ∗ iff
φ ∉ Γ ∗ , (φ ∨ ψ) ∈ Γ ∗ iff φ ∈ Γ ∗ or ψ ∈ Γ ∗ , etc.
In what follows, we will often tacitly use the properties of reflexivity, mono-
tonicity, and transitivity of ⊢ (see sections 19.8, 20.7, 21.7 and 22.6).
Proposition 23.2. Suppose Γ is complete and consistent. Then:

1. If Γ ⊢ φ, then φ ∈ Γ .

2. φ ∧ ψ ∈ Γ iff both φ ∈ Γ and ψ ∈ Γ .

3. φ ∨ ψ ∈ Γ iff either φ ∈ Γ or ψ ∈ Γ .

4. φ → ψ ∈ Γ iff either φ ∉ Γ or ψ ∈ Γ .
Proof. Let us suppose for all of the following that Γ is complete and consistent.
1. If Γ ⊢ φ, then φ ∈ Γ .
Suppose that Γ ⊢ φ. Suppose to the contrary that φ ∉ Γ . Since Γ
is complete, ¬φ ∈ Γ . By Propositions 19.20, 20.20, 21.20 and 22.28,
Γ is inconsistent. This contradicts the assumption that Γ is consistent.
Hence, it cannot be the case that φ ∉ Γ , so φ ∈ Γ .
2. φ ∧ ψ ∈ Γ iff both φ ∈ Γ and ψ ∈ Γ :
For the forward direction, suppose φ∧ψ ∈ Γ . Then by Propositions 19.22,
20.22, 21.22 and 22.30, item (1), Γ ⊢ φ and Γ ⊢ ψ. By (1), φ ∈ Γ and
ψ ∈ Γ , as required.
For the reverse direction, let φ ∈ Γ and ψ ∈ Γ . By Propositions 19.22,
20.22, 21.22 and 22.30, item (2), Γ ⊢ φ ∧ ψ. By (1), φ ∧ ψ ∈ Γ .
3. First we show that if φ ∨ ψ ∈ Γ , then either φ ∈ Γ or ψ ∈ Γ . Suppose
φ ∨ ψ ∈ Γ but φ ∉ Γ and ψ ∉ Γ . Since Γ is complete, ¬φ ∈ Γ and
¬ψ ∈ Γ . By Propositions 19.23, 20.23, 21.23 and 22.31, item (1), Γ is
inconsistent, a contradiction. Hence, either φ ∈ Γ or ψ ∈ Γ .
For the reverse direction, suppose that φ ∈ Γ or ψ ∈ Γ . By Proposi-
tions 19.23, 20.23, 21.23 and 22.31, item (2), Γ ⊢ φ∨ψ. By (1), φ∨ψ ∈ Γ ,
as required.


4. For the forward direction, suppose φ → ψ ∈ Γ , and suppose to the contrary
that φ ∈ Γ and ψ ∉ Γ . On these assumptions, φ → ψ ∈ Γ and φ ∈ Γ .
By Propositions 19.24, 20.24, 21.24 and 22.32, item (1), Γ ⊢ ψ. But then
by (1), ψ ∈ Γ , contradicting the assumption that ψ ∉ Γ .
For the reverse direction, first consider the case where φ ∉ Γ . Since Γ is
complete, ¬φ ∈ Γ . By Propositions 19.24, 20.24, 21.24 and 22.32, item
(2), Γ ⊢ φ → ψ. Again by (1), we get that φ → ψ ∈ Γ , as required.
Now consider the case where ψ ∈ Γ . By Propositions 19.24, 20.24, 21.24
and 22.32, item (2) again, Γ ⊢ φ → ψ. By (1), φ → ψ ∈ Γ .

Problem 23.1. Complete the proof of Proposition 23.2.

23.4 Henkin Expansion

Part of the challenge in proving the completeness theorem is that the model
we construct from a complete consistent set Γ must make all the quantified
formulas in Γ true. In order to guarantee this, we use a trick due to Leon
Henkin. In essence, the trick consists in expanding the language by infinitely
many constant symbols and adding, for each formula with one free variable
φ(x) a formula of the form ∃x φ(x) → φ(c), where c is one of the new constant
symbols. When we construct the structure satisfying Γ , this will guarantee
that each true existential sentence has a witness among the new constants.

Proposition 23.3. If Γ is consistent in L and L′ is obtained from L by adding
a denumerable set of new constant symbols d0 , d1 , . . . , then Γ is consistent
in L′ .

Definition 23.4 (Saturated set). A set Γ of formulas of a language L is


saturated iff for each formula φ(x) ∈ Frm(L) with one free variable x there is
a constant symbol c ∈ L such that ∃x φ(x) → φ(c) ∈ Γ .

The following definition will be used in the proof of the next theorem.

Definition 23.5. Let L′ be as in Proposition 23.3. Fix an enumeration
φ0 (x0 ), φ1 (x1 ), . . . of all formulas φi (xi ) of L′ in which one variable (xi ) occurs
free. We define the sentences θn by induction on n.
Let c0 be the first constant symbol among the di we added to L which does
not occur in φ0 (x0 ). Assuming that θ0 , . . . , θn−1 have already been defined,
let cn be the first among the new constant symbols di that occurs neither in
θ0 , . . . , θn−1 nor in φn (xn ).
Now let θn be the formula ∃xn φn (xn ) → φn (cn ).
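The freshness condition in this definition is easy to mechanize. Here is a
Python sketch (ours; formulas as strings, with a naive substring test standing
in for proper occurrence-checking) that produces the sentences θn for a given
enumeration of formulas, choosing for each the first constant di that occurs
neither in the earlier θ’s nor in the current formula.

    import itertools

    def fresh_constant(phi, used):
        """First d_i occurring neither in phi nor in any earlier theta."""
        for i in itertools.count():
            d = f"d{i}"
            if d not in phi and all(d not in theta for theta in used):
                return d

    def henkin_sentences(formulas):
        """formulas: pairs (phi, x) with x the one free variable of phi.
        Yields the sentences theta_n of Definition 23.5."""
        used = []
        for phi, x in formulas:
            c = fresh_constant(phi, used)
            theta = f"∃{x} {phi} → {phi.replace(x, c)}"
            used.append(theta)
            yield theta

    # theta_1 must avoid d0, since d0 occurs in the second formula:
    for theta in henkin_sentences([("P(x0)", "x0"), ("Q(x1, d0)", "x1")]):
        print(theta)
    # ∃x0 P(x0) → P(d0)
    # ∃x1 Q(x1, d0) → Q(d1, d0)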

Lemma 23.6. Every consistent set Γ can be extended to a saturated consistent
set Γ ′ .

Proof. Given a consistent set of sentences Γ in a language L, expand the lan-


guage by adding a denumerable set of new constant symbols to form L′ . By
Proposition 23.3, Γ is still consistent in the richer language. Further, let θi be
as in Definition 23.5. Let
Γ0 = Γ
Γn+1 = Γn ∪ {θn }
i.e., Γn+1 = Γ ∪ {θ0 , . . . , θn }, and let Γ ′ = ⋃n Γn . Γ ′ is clearly saturated.
If Γ ′ were inconsistent, then for some n, Γn would be inconsistent (Exercise:
explain why). So to show that Γ ′ is consistent it suffices to show, by induction
on n, that each set Γn is consistent.
The induction basis is simply the claim that Γ0 = Γ is consistent, which
is the hypothesis of the theorem. For the induction step, suppose that Γn is
consistent but Γn+1 = Γn ∪{θn } is inconsistent. Recall that θn is ∃xn φn (xn )→
φn (cn ), where φn (xn ) is a formula of L′ with only the variable xn free. By the
way we’ve chosen the cn (see Definition 23.5), cn does not occur in φn (xn ) nor
in Γn .
If Γn ∪ {θn } is inconsistent, then Γn ⊢ ¬θn , and hence both of the following
hold:
Γn ⊢ ∃xn φn (xn ) Γn ⊢ ¬φn (cn )
Since cn does not occur in Γn or in φn (xn ), Theorems 19.25, 20.25, 21.25
and 22.33 applies. From Γn ⊢ ¬φn (cn ), we obtain Γn ⊢ ∀xn ¬φn (xn ). Thus
we have that both Γn ⊢ ∃xn φn (xn ) and Γn ⊢ ∀xn ¬φn (xn ), so Γn itself is
inconsistent. (Note that ∀xn ¬φn (xn ) ⊢ ¬∃xn φn (xn ).) Contradiction: Γn was
supposed to be consistent. Hence Γn ∪ {θn } is consistent.

We’ll now show that complete, consistent sets which are saturated have the
property that they contain a universally quantified sentence iff they contain
all its instances, and that they contain an existentially quantified sentence iff
they contain at least one instance. We’ll use this to show that the structure
we’ll generate from a complete, consistent, saturated set makes all its
quantified sentences true.

Proposition 23.7. Suppose Γ is complete, consistent, and saturated.
1. ∃x φ(x) ∈ Γ iff φ(t) ∈ Γ for at least one closed term t.
2. ∀x φ(x) ∈ Γ iff φ(t) ∈ Γ for all closed terms t.

Proof. 1. First suppose that ∃x φ(x) ∈ Γ . Because Γ is saturated, (∃x φ(x)→


φ(c)) ∈ Γ for some constant symbol c. By Propositions 19.24, 20.24,
21.24 and 22.32, item (1), and Proposition 23.2(1), φ(c) ∈ Γ .
For the other direction, saturation is not necessary: Suppose φ(t) ∈ Γ .
Then Γ ⊢ ∃x φ(x) by Propositions 19.26, 20.26, 21.26 and 22.34, item (1).
By Proposition 23.2(1), ∃x φ(x) ∈ Γ .


2. Suppose that φ(t) ∈ Γ for all closed terms t. By way of contradiction,


assume ∀x φ(x) ∉ Γ . Since Γ is complete, ¬∀x φ(x) ∈ Γ . By saturation,
(∃x ¬φ(x) → ¬φ(c)) ∈ Γ for some constant symbol c. By assumption,
since c is a closed term, φ(c) ∈ Γ . But this would make Γ inconsistent.
(Exercise: give the derivation that shows

¬∀x φ(x), ∃x ¬φ(x) → ¬φ(c), φ(c)

is inconsistent.)
For the reverse direction, we do not need saturation: Suppose ∀x φ(x) ∈
Γ . Then Γ ⊢ φ(t) by Propositions 19.26, 20.26, 21.26 and 22.34, item (2).
We get φ(t) ∈ Γ by Proposition 23.2.

23.5 Lindenbaum’s Lemma

We now prove a lemma that shows that any consistent set of sentences is con-
tained in some set of sentences which is not just consistent, but also complete.
The proof works by adding one sentence at a time, guaranteeing at each step
that the set remains consistent. We do this so that for every φ, either φ or ¬φ
gets added at some stage. The union of all stages in that construction then
contains either φ or its negation ¬φ and is thus complete. It is also consistent,
since we make sure at each stage not to introduce an inconsistency.

Lemma 23.8 (Lindenbaum’s Lemma). Every consistent set Γ in a language
L can be extended to a complete and consistent set Γ ∗ .

Proof. Let Γ be consistent. Let φ0 , φ1 , . . . be an enumeration of all the
sentences of L. Define Γ0 = Γ , and

Γn+1 = Γn ∪ {φn }     if Γn ∪ {φn } is consistent;
Γn+1 = Γn ∪ {¬φn }    otherwise.

Let Γ ∗ = ⋃n≥0 Γn .
Each Γn is consistent: Γ0 is consistent by definition. If Γn+1 = Γn ∪ {φn },
this is because the latter is consistent. If it isn’t, Γn+1 = Γn ∪ {¬φn }. We
have to verify that Γn ∪ {¬φn } is consistent. Suppose it’s not. Then both
Γn ∪ {φn } and Γn ∪ {¬φn } are inconsistent. This means that Γn would be
inconsistent by Propositions 19.21, 20.21, 21.21 and 22.29, contrary to the
induction hypothesis.
For every n and every i < n, Γi ⊆ Γn . This follows by a simple induction
on n. For n = 0, there are no i < 0, so the claim holds automatically. For
the inductive step, suppose it is true for n. We show that if i < n + 1 then
Γi ⊆ Γn+1 . We have Γn+1 = Γn ∪ {φn } or = Γn ∪ {¬φn } by construction. So


Γn ⊆ Γn+1 . If i < n + 1, then Γi ⊆ Γn by inductive hypothesis (if i < n) or


the trivial fact that Γn ⊆ Γn (if i = n). We get that Γi ⊆ Γn+1 by transitivity
of ⊆.
From this it follows that Γ ∗ is consistent. Here’s why: Let Γ ′ ⊆ Γ ∗ be
finite. Each ψ ∈ Γ ′ is also in Γi for some i. Let n be the largest of these. Since
Γi ⊆ Γn if i ≤ n, every ψ ∈ Γ ′ is also ∈ Γn , i.e., Γ ′ ⊆ Γn , and Γn is consistent.
So, every finite subset Γ ′ ⊆ Γ ∗ is consistent. By Propositions 19.17, 20.17,
21.17 and 22.21, Γ ∗ is consistent.
Every sentence of Frm(L) appears on the list used to define Γ ∗ . If φn ∉ Γ ∗ ,
then that is because Γn ∪ {φn } was inconsistent. But then ¬φn ∈ Γ ∗ , so Γ ∗ is
complete.
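The stage-by-stage construction in this proof can be stated as a procedure,
provided we help ourselves to an oracle for consistency; this oracle is the one
genuinely non-effective ingredient, since consistency is undecidable in general.
A Python sketch (ours):

    def lindenbaum_stages(gamma, sentences, is_consistent, n):
        """Build Gamma_n of Lemma 23.8 for the first n sentences.
        `is_consistent` is an assumed oracle for consistency of a set
        of sentences (not computable in general!)."""
        stage = set(gamma)
        for phi in sentences[:n]:
            if is_consistent(stage | {phi}):
                stage |= {phi}
            else:
                stage |= {("¬", phi)}  # add the negation instead
        return stage

    # Toy oracle: "consistent" = containing no sentence with its negation.
    def toy_consistent(s):
        return not any(("¬", phi) in s for phi in s)

    print(lindenbaum_stages({"p"}, ["p", "q", ("¬", "q")], toy_consistent, 3))
    # {'p', 'q', ('¬', ('¬', 'q'))}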

23.6 Construction of a Model

Right now we are not concerned about =, i.e., we only want to show that a
consistent set Γ of sentences not containing = is satisfiable. We first extend Γ
to a consistent, complete, and saturated set Γ ∗ . In this case, the definition of
a model M(Γ ∗ ) is simple: We take the set of closed terms of L′ as the domain.
We assign every constant symbol to itself, and make sure that more generally,
for every closed term t, ValM(Γ ∗ ) (t) = t. The predicate symbols are assigned
extensions in such a way that an atomic sentence is true in M(Γ ∗ ) iff it is
in Γ ∗ . This will obviously make all the atomic sentences in Γ ∗ true in M(Γ ∗ ).
The rest are true provided the Γ ∗ we start with is consistent, complete, and
saturated.

Definition 23.9 (Term model). Let Γ ∗ be a complete and consistent,
saturated set of sentences in a language L. The term model M(Γ ∗ ) of Γ ∗ is the
structure defined as follows:

1. The domain |M(Γ ∗ )| is the set of all closed terms of L.

2. The interpretation of a constant symbol c is c itself: cM(Γ ∗ ) = c.

3. The function symbol f is assigned the function which, given as arguments
the closed terms t1 , . . . , tn , has as value the closed term f (t1 , . . . , tn ):

f M(Γ ∗ ) (t1 , . . . , tn ) = f (t1 , . . . , tn )

4. If R is an n-place predicate symbol, then

⟨t1 , . . . , tn ⟩ ∈ RM(Γ ∗ ) iff R(t1 , . . . , tn ) ∈ Γ ∗ .
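Viewed as a data structure, the term model is almost trivial: every closed
term denotes itself, and atomic truth is membership in Γ ∗ . A Python sketch
of Definition 23.9 (ours, with terms as nested tuples and the atomic sentences
of Γ ∗ passed in as a set):

    class TermModel:
        """Closed terms as nested tuples, e.g. ("f", ("c",)) for f(c);
        atomic sentences as tuples ("R", t1, ..., tn)."""
        def __init__(self, gamma_star_atoms):
            self.atoms = gamma_star_atoms  # atomic sentences in Gamma*

        def value(self, t):
            # Clauses 2 and 3 make every closed term denote itself
            # (this is Lemma 23.10).
            return t

        def satisfies_atomic(self, R, *terms):
            # Clause 4: <t1,...,tn> is in R's extension iff
            # R(t1,...,tn) is in Gamma*.
            return (R, *terms) in self.atoms

    m = TermModel({("P", ("f", ("c",)))})          # Gamma* contains P(f(c))
    print(m.value(("f", ("c",))))                  # f(c) denotes itself
    print(m.satisfies_atomic("P", ("f", ("c",))))  # True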


We will now check that we indeed have ValM(Γ ∗ ) (t) = t.

Lemma 23.10. Let M(Γ ∗ ) be the term model of Definition 23.9; then
ValM(Γ ∗ ) (t) = t.

Proof. The proof is by induction on t, where the base case, when t is a
constant symbol, follows directly from the definition of the term model. For the
induction step assume t1 , . . . , tn are closed terms such that ValM(Γ ∗ ) (ti ) = ti
and that f is an n-ary function symbol. Then

ValM(Γ ∗ ) (f (t1 , . . . , tn )) = f M(Γ ∗ ) (ValM(Γ ∗ ) (t1 ), . . . , ValM(Γ ∗ ) (tn ))
                               = f M(Γ ∗ ) (t1 , . . . , tn )
                               = f (t1 , . . . , tn ),

and so by induction this holds for every closed term t.

A structure M may make an existentially quantified sentence ∃x φ(x) true

without there being an instance φ(t) that it makes true. A structure M may
make all instances φ(t) of a universally quantified sentence ∀x φ(x) true, with-
out making ∀x φ(x) true. This is because in general not every element of |M|
is the value of a closed term (M may not be covered). This is the reason the
satisfaction relation is defined via variable assignments. However, for our term
model M(Γ ∗ ) this wouldn’t be necessary—because it is covered. This is the
content of the next result.

Proposition 23.11. Let M(Γ ∗ ) be the term model of Definition 23.9.

1. M(Γ ∗ ) ⊨ ∃x φ(x) iff M(Γ ∗ ) ⊨ φ(t) for at least one closed term t.

2. M(Γ ∗ ) ⊨ ∀x φ(x) iff M(Γ ∗ ) ⊨ φ(t) for all closed terms t.

Proof. 1. By Proposition 16.19, M(Γ ∗ ) ⊨ ∃x φ(x) iff for at least one variable
assignment s, M(Γ ∗ ), s ⊨ φ(x). As |M(Γ ∗ )| consists of the closed terms
of L, this is the case iff there is at least one closed term t such that
s(x) = t and M(Γ ∗ ), s ⊨ φ(x). By Proposition 16.23, M(Γ ∗ ), s ⊨ φ(x) iff
M(Γ ∗ ), s ⊨ φ(t), where s(x) = t. By Proposition 16.18, M(Γ ∗ ), s ⊨ φ(t)
iff M(Γ ∗ ) ⊨ φ(t), since φ(t) is a sentence.

2. By Proposition 16.19, M(Γ ∗ ) ⊨ ∀x φ(x) iff for every variable assignment


s, M(Γ ∗ ), s ⊨ φ(x). Recall that |M(Γ ∗ )| consists of the closed terms
of L, so for every closed term t, s(x) = t is such a variable assignment,
and for any variable assignment, s(x) is some closed term t. By Propo-
sition 16.23, M(Γ ∗ ), s ⊨ φ(x) iff M(Γ ∗ ), s ⊨ φ(t), where s(x) = t. By
Proposition 16.18, M(Γ ∗ ), s ⊨ φ(t) iff M(Γ ∗ ) ⊨ φ(t), since φ(t) is a
sentence.

Lemma 23.12 (Truth Lemma). Suppose φ does not contain =. Then
M(Γ ∗ ) ⊨ φ iff φ ∈ Γ ∗ .


Proof. We prove both directions simultaneously, and by induction on φ.


1. φ ≡ ⊥: M(Γ ∗ ) ⊭ ⊥ by definition of satisfaction. On the other hand,
⊥ ∉ Γ ∗ since Γ ∗ is consistent.
2. φ ≡ ⊤: M(Γ ∗ ) ⊨ ⊤ by definition of satisfaction. On the other hand,
⊤ ∈ Γ ∗ since Γ ∗ is consistent and complete, and Γ ∗ ⊢ ⊤.

3. φ ≡ R(t1 , . . . , tn ): M(Γ ∗ ) ⊨ R(t1 , . . . , tn ) iff ⟨t1 , . . . , tn ⟩ ∈ RM(Γ ∗ ) (by
the definition of satisfaction) iff R(t1 , . . . , tn ) ∈ Γ ∗ (by the construction
of M(Γ ∗ )).
4. φ ≡ ¬ψ: M(Γ ∗ ) ⊨ φ iff M(Γ ∗ ) ⊭ ψ (by definition of satisfaction). By
induction hypothesis, M(Γ ∗ ) ⊭ ψ iff ψ ∉ Γ ∗ . Since Γ ∗ is consistent and
complete, ψ ∉ Γ ∗ iff ¬ψ ∈ Γ ∗ .
5. φ ≡ ψ ∧ χ: M(Γ ∗ ) ⊨ φ iff we have both M(Γ ∗ ) ⊨ ψ and M(Γ ∗ ) ⊨ χ (by
definition of satisfaction) iff both ψ ∈ Γ ∗ and χ ∈ Γ ∗ (by the induction
hypothesis). By Proposition 23.2(2), this is the case iff (ψ ∧ χ) ∈ Γ ∗ .
6. φ ≡ ψ ∨ χ: M(Γ ∗ ) ⊨ φ iff M(Γ ∗ ) ⊨ ψ or M(Γ ∗ ) ⊨ χ (by definition of
satisfaction) iff ψ ∈ Γ ∗ or χ ∈ Γ ∗ (by induction hypothesis). This is the
case iff (ψ ∨ χ) ∈ Γ ∗ (by Proposition 23.2(3)).
7. φ ≡ ψ → χ: M(Γ ∗ ) ⊨ φ iff M(Γ ∗ ) ⊭ ψ or M(Γ ∗ ) ⊨ χ (by definition of
satisfaction) iff ψ ∉ Γ ∗ or χ ∈ Γ ∗ (by induction hypothesis). This is the
case iff (ψ → χ) ∈ Γ ∗ (by Proposition 23.2(4)).
8. φ ≡ ∀x ψ(x): M(Γ ∗ ) ⊨ φ iff M(Γ ∗ ) ⊨ ψ(t) for all terms t (Proposition
23.11). By induction hypothesis, this is the case iff ψ(t) ∈ Γ ∗ for all
terms t; by Proposition 23.7, this in turn is the case iff ∀x ψ(x) ∈ Γ ∗ .
9. φ ≡ ∃x ψ(x): M(Γ ∗ ) ⊨ φ iff M(Γ ∗ ) ⊨ ψ(t) for at least one term t
(Proposition 23.11). By induction hypothesis, this is the case iff ψ(t) ∈
Γ ∗ for at least one term t. By Proposition 23.7, this in turn is the case
iff ∃x ψ(x) ∈ Γ ∗ .

23.7 Identity

The construction of the term model given in the preceding section is enough
to establish completeness for first-order logic for sets Γ that do not contain =.
The term model satisfies every φ ∈ Γ ∗ which does not contain = (and hence
all φ ∈ Γ ). It does not work, however, if = is present. The reason is that Γ ∗
then may contain a sentence t = t′ , but in the term model the value of any
term is that term itself. Hence, if t and t′ are different terms, their values in
the term model—i.e., t and t′ , respectively—are different, and so t = t′ is false.
We can fix this, however, using a construction known as “factoring.”


Definition 23.13. Let Γ ∗ be a consistent and complete set of sentences in L.


We define the relation ≈ on the set of closed terms of L by
t ≈ t′ iff t = t′ ∈ Γ ∗

Proposition 23.14. The relation ≈ has the following properties:
1. ≈ is reflexive.
2. ≈ is symmetric.
3. ≈ is transitive.
4. If t ≈ t′ , f is a function symbol, and t1 , . . . , ti−1 , ti+1 , . . . , tn are closed
terms, then
f (t1 , . . . , ti−1 , t, ti+1 , . . . , tn ) ≈ f (t1 , . . . , ti−1 , t′ , ti+1 , . . . , tn ).

5. If t ≈ t′ , R is a predicate symbol, and t1 , . . . , ti−1 , ti+1 , . . . , tn are


closed terms, then

R(t1 , . . . , ti−1 , t, ti+1 , . . . , tn ) ∈ Γ ∗ iff


R(t1 , . . . , ti−1 , t′ , ti+1 , . . . , tn ) ∈ Γ ∗ .

Proof. Since Γ ∗ is consistent and complete, t = t′ ∈ Γ ∗ iff Γ ∗ ⊢ t = t′ . Thus it


is enough to show the following:
1. Γ ∗ ⊢ t = t for all closed terms t.
2. If Γ ∗ ⊢ t = t′ then Γ ∗ ⊢ t′ = t.
3. If Γ ∗ ⊢ t = t′ and Γ ∗ ⊢ t′ = t′′ , then Γ ∗ ⊢ t = t′′ .
4. If Γ ∗ ⊢ t = t′ , then
Γ ∗ ⊢ f (t1 , . . . , ti−1 , t, ti+1 , . . . , tn ) = f (t1 , . . . , ti−1 , t′ , ti+1 , . . . , tn )
for every n-place function symbol f and closed terms t1 , . . . , ti−1 , ti+1 ,
. . . , tn .
5. If Γ ∗ ⊢ t = t′ and Γ ∗ ⊢ R(t1 , . . . , ti−1 , t, ti+1 , . . . , tn ), then Γ ∗ ⊢ R(t1 , . . . , ti−1 , t′ , ti+1 , . . . , tn )
for every n-place predicate symbol R and closed terms t1 , . . . , ti−1 , ti+1 ,
. . . , tn .

Problem 23.2. Complete the proof of Proposition 23.14.

Definition 23.15. Suppose Γ ∗ is a consistent and complete set in a lan-


guage L, t is a closed term, and ≈ as in the previous definition. Then:
[t]≈ = {t′ : t′ ∈ Trm(L), t ≈ t′ }
and Trm(L)/≈ = {[t]≈ : t ∈ Trm(L)}.
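Computationally, factoring is the familiar union-find construction for deciding
equalities: merging the classes of t and t′ for each identity t = t′ ∈ Γ ∗
automatically yields reflexivity, symmetry, and transitivity. A Python sketch
for finitely many terms (ours):

    def factor(terms, identities):
        """Compute the classes [t] from pairs (t, u) with t = u in Gamma*.
        A miniature union-find; terms may be any hashable objects."""
        parent = {t: t for t in terms}

        def find(t):
            while parent[t] != t:
                t = parent[t]
            return t

        for t, u in identities:
            parent[find(t)] = find(u)  # merge the two classes

        classes = {}
        for t in terms:
            classes.setdefault(find(t), set()).add(t)
        return list(classes.values())

    # In a theory of arithmetic, 0, (0+0) and (0*0) land in one class:
    print(factor(["0", "0+0", "0*0", "1"], [("0", "0+0"), ("0+0", "0*0")]))
    # [{'0', '0+0', '0*0'}, {'1'}]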

Definition 23.16. Let M = M(Γ ∗ ) be the term model for Γ ∗ from Defini-
tion 23.9. Then M/≈ is the following structure:

1. |M/≈ | = Trm(L)/≈ .

2. cM/≈ = [c]≈

3. f M/≈ ([t1 ]≈ , . . . , [tn ]≈ ) = [f (t1 , . . . , tn )]≈

4. ⟨[t1 ]≈ , . . . , [tn ]≈ ⟩ ∈ RM/≈ iff M ⊨ R(t1 , . . . , tn ), i.e., iff R(t1 , . . . , tn ) ∈ Γ ∗ .

Note that we have defined f M/≈ and RM/≈ for elements of Trm(L)/≈ by
referring to them as [t]≈ , i.e., via representatives t ∈ [t]≈ . We have to make
sure that these definitions do not depend on the choice of these representatives,
i.e., that for some other choices t′ which determine the same equivalence classes
([t]≈ = [t′ ]≈ ), the definitions yield the same result. For instance, if R is a one-
place predicate symbol, the last clause of the definition says that [t]≈ ∈ RM/≈
iff M ⊨ R(t). If for some other term t′ with t ≈ t′ , M ⊭ R(t′ ), then the definition
would require [t′ ]≈ ∉ RM/≈ . If t ≈ t′ , then [t]≈ = [t′ ]≈ , but we can’t have both
[t]≈ ∈ RM/≈ and [t]≈ ∉ RM/≈ . However, Proposition 23.14 guarantees that
this cannot happen.

Proposition 23.17. M/≈ is well defined, i.e., if t1 , . . . , tn , t′1 , . . . , t′n are
closed terms, and ti ≈ t′i , then

1. [f (t1 , . . . , tn )]≈ = [f (t′1 , . . . , t′n )]≈ , i.e.,

f (t1 , . . . , tn ) ≈ f (t′1 , . . . , t′n )

and

2. M ⊨ R(t1 , . . . , tn ) iff M ⊨ R(t′1 , . . . , t′n ), i.e.,

R(t1 , . . . , tn ) ∈ Γ ∗ iff R(t′1 , . . . , t′n ) ∈ Γ ∗ .

Proof. Follows from Proposition 23.14 by induction on n.

As in the case of the term model, before proving the truth lemma we need
the following lemma.

Lemma 23.18. Let M = M(Γ ∗ ). Then ValM/≈ (t) = [t]≈ .

Proof. The proof is similar to that of Lemma 23.10.

Problem 23.3. Complete the proof of Lemma 23.18.

Lemma 23.19. M/≈ ⊨ φ iff φ ∈ Γ ∗ for all sentences φ.


Proof. By induction on φ, just as in the proof of Lemma 23.12. The only case
that needs additional attention is when φ ≡ t = t′ .

M/≈ ⊨ t = t′ iff [t]≈ = [t′ ]≈ (by definition of M/≈ )
iff t ≈ t′ (by definition of [t]≈ )
iff t = t′ ∈ Γ ∗ (by definition of ≈).

Note that while M(Γ ∗ ) is always enumerable and infinite, M/≈ may be
finite, since it may turn out that there are only finitely many classes [t]≈ . This
is to be expected, since Γ may contain sentences which require any structure
in which they are true to be finite. For instance, ∀x ∀y x = y is a consistent
sentence, but is satisfied only in structures with a domain that contains exactly
one element.


23.8 The Completeness Theorem


Let’s combine our results: we arrive at the completeness theorem.

Theorem 23.20 (Completeness Theorem). Let Γ be a set of sentences.
If Γ is consistent, it is satisfiable.

Proof. Suppose Γ is consistent. By Lemma 23.6, there is a saturated consistent
set Γ ′ ⊇ Γ . By Lemma 23.8, there is a Γ ∗ ⊇ Γ ′ which is consistent and
complete. Since Γ ′ ⊆ Γ ∗ , for each formula φ(x), Γ ∗ contains a sentence of
the form ∃x φ(x) → φ(c) and so Γ ∗ is saturated. If Γ does not contain =, then
by Lemma 23.12, M(Γ ∗ ) ⊨ φ iff φ ∈ Γ ∗ . From this it follows in particular that
for all φ ∈ Γ , M(Γ ∗ ) ⊨ φ, so Γ is satisfiable. If Γ does contain =, then by
Lemma 23.19, for all sentences φ, M/≈ ⊨ φ iff φ ∈ Γ ∗ . In particular, M/≈ ⊨ φ
for all φ ∈ Γ , so Γ is satisfiable.

Corollary 23.21 (Completeness Theorem, Second Version). For all Γ
and sentences φ: if Γ ⊨ φ, then Γ ⊢ φ.

Proof. Note that the Γ ’s in Corollary 23.21 and Theorem 23.20 are univer-
sally quantified. To make sure we do not confuse ourselves, let us restate
Theorem 23.20 using a different variable: for any set of sentences ∆, if ∆ is
consistent, it is satisfiable. By contraposition, if ∆ is not satisfiable, then ∆ is
inconsistent. We will use this to prove the corollary.
Suppose that Γ ⊨ φ. Then Γ ∪ {¬φ} is unsatisfiable by Proposition 16.28.
Taking Γ ∪ {¬φ} as our ∆, the previous version of Theorem 23.20 gives us
that Γ ∪ {¬φ} is inconsistent. By Propositions 19.19, 20.19, 21.19 and 22.27,
Γ ⊢ φ.

Problem 23.4. Use Corollary 23.21 to prove Theorem 23.20, thus showing
that the two formulations of the completeness theorem are equivalent.


Problem 23.5. In order for a derivation system to be complete, its rules must
be strong enough to prove every unsatisfiable set inconsistent. Which of the
rules of derivation were necessary to prove completeness? Are any of these rules
not used anywhere in the proof? In order to answer these questions, make a
list or diagram that shows which of the rules of derivation were used in which
results that lead up to the proof of Theorem 23.20. Be sure to note any tacit
uses of rules in these proofs.


23.9 The Compactness Theorem


One important consequence of the completeness theorem is the compactness
theorem. The compactness theorem states that if each finite subset of a set
of sentences is satisfiable, the entire set is satisfiable—even if the set itself is
infinite. This is far from obvious. There is nothing that seems to rule out,
at first glance at least, the possibility of there being infinite sets of sentences
which are contradictory, but the contradiction only arises, so to speak, from
the infinite number. The compactness theorem says that such a scenario can
be ruled out: there are no unsatisfiable infinite sets of sentences each finite
subset of which is satisfiable. Like the completeness theorem, it has a version
related to entailment: if an infinite set of sentences entails something, already
a finite subset does.

Definition 23.22. A set Γ of formulas is finitely satisfiable iff every finite
Γ0 ⊆ Γ is satisfiable.

Theorem 23.23 (Compactness Theorem). The following hold for any set Γ
of sentences and any sentence φ:

1. Γ ⊨ φ iff there is a finite Γ0 ⊆ Γ such that Γ0 ⊨ φ.

2. Γ is satisfiable iff it is finitely satisfiable.

Proof. We prove (2). If Γ is satisfiable, then there is a structure M such that
M ⊨ φ for all φ ∈ Γ . Of course, this M also satisfies every finite subset of Γ ,
so Γ is finitely satisfiable.
Now suppose that Γ is finitely satisfiable. Then every finite subset Γ0 ⊆ Γ
is satisfiable. By soundness (Corollaries 19.31, 20.29, 21.31 and 22.38), ev-
ery finite subset is consistent. Then Γ itself must be consistent by Proposi-
tions 19.17, 20.17, 21.17 and 22.21. By completeness (Theorem 23.20), since
Γ is consistent, it is satisfiable.

Problem 23.6. Prove (1) of Theorem 23.23.


Example 23.24. In every model M of a theory Γ , each term t of course picks
out an element of |M|. Can we guarantee that it is also true that every element
of |M| is picked out by some term or other? In other words, are there theories Γ
all models of which are covered? The compactness theorem shows that this is
not the case if Γ has infinite models. Here’s how to see this: Let M be an
infinite model of Γ , and let c be a constant symbol not in the language of Γ .
Let ∆ be the set of all sentences c ̸= t for t a term in the language L of Γ , i.e.,
∆ = {c ̸= t : t ∈ Trm(L)}.
A finite subset of Γ ∪ ∆ can be written as Γ ′ ∪ ∆′ , with Γ ′ ⊆ Γ and ∆′ ⊆ ∆.
Since ∆′ is finite, it can contain only finitely many terms. Let a ∈ |M| be
an element of |M| not picked out by any of them, and let M′ be the structure
that is just like M, but also cM′ = a. Since a ̸= ValM (t) for all t occurring
in ∆′ , M′ ⊨ ∆′ . Since M ⊨ Γ , Γ ′ ⊆ Γ , and c does not occur in Γ , also M′ ⊨ Γ ′ .
Together, M′ ⊨ Γ ′ ∪ ∆′ for every finite subset Γ ′ ∪ ∆′ of Γ ∪ ∆. So every finite
subset of Γ ∪ ∆ is satisfiable. By compactness, Γ ∪ ∆ itself is satisfiable. So
there are models M ⊨ Γ ∪∆. Every such M is a model of Γ , but is not covered,
since ValM (c) ̸= ValM (t) for all terms t of L.

Example 23.25. Consider a language L containing the predicate symbol <,
constant symbols 0, 1, and function symbols +, ×, and −. Let Γ be the set of
all sentences in this language true in the structure Q with domain Q and the
obvious interpretations. Γ is the set of all sentences of L true about the rational
numbers. Of course, in Q (and even in R), there are no numbers r which are
greater than 0 but less than 1/k for all k ∈ Z+ . Such a number, if it existed,
would be an infinitesimal: non-zero, but infinitely small. The compactness
theorem can be used to show that there are models of Γ in which infinitesimals
exist. We do not have a function symbol for division in our language (division
by zero is undefined, and function symbols have to be interpreted by total
functions). However, we can still express that r < 1/k, since this is the case iff
r · k < 1. Now let c be a new constant symbol and let ∆ be
{0 < c} ∪ {c × k < 1 : k ∈ Z+ }
(where k = (1 + (1 + · · · + (1 + 1) . . . )) with k 1’s). For any finite subset ∆0
of ∆ there is a K such that all the sentences c × k < 1 in ∆0 have k < K.
If we expand Q to Q′ with cQ′ = 1/K we have that Q′ ⊨ Γ0 ∪ ∆0 for any finite
Γ0 ⊆ Γ , and so Γ ∪ ∆ is finitely satisfiable (Exercise: prove this in detail).
By compactness, Γ ∪ ∆ is satisfiable. Any model S of Γ ∪ ∆ contains an
infinitesimal, namely cS .

Problem 23.7. In the standard model of arithmetic N, there is no element k ∈
|N| which satisfies every formula n < x (where n is 0′...′ with n ′’s). Use the
compactness theorem to show that the set of sentences in the language of
arithmetic which are true in the standard model of arithmetic N are also true
in a structure N′ that contains an element which does satisfy every formula
n < x.


Example 23.26. We know that first-order logic with identity predicate can
express that the domain must have some minimal size: The sen-
tence φ≥n (which says “there are at least n distinct objects”) is true only in
structures where |M| has at least n objects. So if we take

∆ = {φ≥n : n ≥ 1}

then any model of ∆ must be infinite. Thus, we can guarantee that a theory
only has infinite models by adding ∆ to it: the models of Γ ∪ ∆ are all and
only the infinite models of Γ .
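
For concreteness, φ≥n can be taken to be the sentence

∃x1 . . . ∃xn (x1 ̸= x2 ∧ x1 ̸= x3 ∧ · · · ∧ xn−1 ̸= xn ),

where the conjunction contains xi ̸= xj for all 1 ≤ i < j ≤ n; it is true in M
iff |M| contains at least n distinct elements.
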
So first-order logic can express infinitude. The compactness theorem shows
that it cannot express finitude, however. For suppose some set of sentences Λ
were satisfied in all and only finite structures. Then ∆ ∪ Λ is finitely satisfiable.
Why? Suppose ∆′ ∪ Λ′ ⊆ ∆ ∪ Λ is finite with ∆′ ⊆ ∆ and Λ′ ⊆ Λ. Let n be the
largest number such that φ≥n ∈ ∆′ . Λ, being satisfied in all finite structures,
has a model M with finitely many but ≥ n elements. But then M ⊨ ∆′ ∪ Λ′ .
By compactness, ∆ ∪ Λ has an infinite model, contradicting the assumption
that Λ is satisfied only in finite structures.


23.10 A Direct Proof of the Compactness Theorem


We can prove the Compactness Theorem directly, without appealing to the
Completeness Theorem, using the same ideas as in the proof of the complete-
ness theorem. In the proof of the Completeness Theorem we started with a
consistent set Γ of sentences, expanded it to a consistent, saturated, and com-
plete set Γ ∗ of sentences, and then showed that in the term model M(Γ ∗ )
constructed from Γ ∗ , all sentences of Γ are true, so Γ is satisfiable.
We can use the same method to show that a finitely satisfiable set of sen-
tences is satisfiable. We just have to prove the corresponding versions of the
results leading to the truth lemma where we replace “consistent” with “finitely
satisfiable.”

Proposition 23.27. Suppose Γ is complete and finitely satisfiable. Then:

1. (φ ∧ ψ) ∈ Γ iff both φ ∈ Γ and ψ ∈ Γ .

2. (φ ∨ ψ) ∈ Γ iff either φ ∈ Γ or ψ ∈ Γ .

3. (φ → ψ) ∈ Γ iff either φ ∉ Γ or ψ ∈ Γ .

Problem 23.8. Prove Proposition 23.27. Avoid the use of ⊢.

Lemma 23.28. Every finitely satisfiable set Γ can be extended to a saturated
finitely satisfiable set Γ ′ .


Problem 23.9. Prove Lemma 23.28. (Hint: The crucial step is to show that
if Γn is finitely satisfiable, so is Γn ∪ {θn }, without any appeal to derivations
or consistency.)

Proposition 23.29. Suppose Γ is complete, finitely satisfiable, and saturated.
Then:
1. ∃x φ(x) ∈ Γ iff φ(t) ∈ Γ for at least one closed term t.

2. ∀x φ(x) ∈ Γ iff φ(t) ∈ Γ for all closed terms t.

Problem 23.10. Prove Proposition 23.29.

Lemma 23.30. Every finitely satisfiable set Γ can be extended to a complete
and finitely satisfiable set Γ ∗ .

Problem 23.11. Prove Lemma 23.30. (Hint: the crucial step is to show that
if Γn is finitely satisfiable, then either Γn ∪ {φn } or Γn ∪ {¬φn } is finitely
satisfiable.)

Theorem 23.31 (Compactness). Γ is satisfiable if and only if it is finitely
satisfiable.

Proof. If Γ is satisfiable, then there is a structure M such that M ⊨ φ for all
φ ∈ Γ . Of course, this M also satisfies every finite subset of Γ , so Γ is finitely
satisfiable.
Now suppose that Γ is finitely satisfiable. By Lemma 23.28, there is a
finitely satisfiable, saturated set Γ ′ ⊇ Γ . By Lemma 23.30, Γ ′ can be extended
to a complete and finitely satisfiable set Γ ∗ , and Γ ∗ is still saturated. Construct
the term model M(Γ ∗ ) as in Definition 23.9. Note that Proposition 23.11 did
not rely on the fact that Γ ∗ is consistent (or complete or saturated, for that
matter), but just on the fact that M(Γ ∗ ) is covered. The proof of the Truth
Lemma (Lemma 23.12) goes through if we replace references to Proposition 23.2
and Proposition 23.7 by references to Proposition 23.27 and Proposition 23.29.

Problem 23.12. Write out the complete proof of the Truth Lemma (Lemma 23.12)
in the version required for the proof of Theorem 23.31.


23.11 The Löwenheim–Skolem Theorem


The Löwenheim–Skolem Theorem says that if a theory has an infinite model,
then it also has a model that is at most denumerable. An immediate con-
sequence of this fact is that first-order logic cannot express that the size of
a structure is non-enumerable: any sentence or set of sentences satisfied in all
non-enumerable structures is also satisfied in some enumerable structure.



Theorem 23.32. If Γ is consistent, then it has an enumerable model, i.e., it
is satisfiable in a structure whose domain is either finite or denumerable.

Proof. If Γ is consistent, the structure M delivered by the proof of the com-
pleteness theorem has a domain |M| that is no larger than the set of the terms
of the language L. So M is at most denumerable.

Theorem 23.33. If Γ is a consistent set of sentences in the language of
first-order logic without identity, then it has a denumerable model, i.e., it is
satisfiable in a structure whose domain is infinite and enumerable.

Proof. If Γ is consistent and contains no sentences in which identity appears,
then the structure M delivered by the proof of the completeness theorem has
a domain |M| identical to the set of terms of the language L′ . So M is denu-
merable, since Trm(L′ ) is.

Example 23.34 (Skolem’s Paradox). Zermelo–Fraenkel set theory ZFC is
a very powerful framework in which practically all mathematical statements
can be expressed, including facts about the sizes of sets. So for instance,
ZFC can prove that the set R of real numbers is non-enumerable, it can prove
Cantor’s Theorem that the power set of any set is larger than the set itself,
etc. If ZFC is consistent, its models are all infinite, and moreover, they all
contain elements about which the theory says that they are non-enumerable,
such as the element that makes true the theorem of ZFC that the power set
of the natural numbers exists. By the Löwenheim–Skolem Theorem, ZFC also
has enumerable models—models that contain “non-enumerable” sets but which
themselves are enumerable.

Chapter 24

Beyond First-order Logic

This chapter, adapted from Jeremy Avigad’s logic notes, gives the
briefest of glimpses into which other logical systems there are. It is in-
tended as a chapter suggesting further topics for study in a course that does
not cover them. Each one of the topics mentioned here will—hopefully—
eventually receive its own part-level treatment in the Open Logic Project.


24.1 Overview
First-order logic is not the only system of logic of interest: there are many
extensions and variations of first-order logic. A logic typically consists of the
formal specification of a language, usually, but not always, a deductive system,
and usually, but not always, an intended semantics. But the technical use of
the term raises an obvious question: what do logics that are not first-order
logic have to do with the word “logic,” used in the intuitive or philosophical
sense? All of the systems described below are designed to model reasoning of
some form or another; can we say what makes them logical?
No easy answers are forthcoming. The word “logic” is used in different
ways and in different contexts, and the notion, like that of “truth,” has been
analyzed from numerous philosophical stances. For example, one might take
the goal of logical reasoning to be the determination of which statements are
necessarily true, true a priori, true independent of the interpretation of the
nonlogical terms, true by virtue of their form, or true by linguistic convention;
and each of these conceptions requires a good deal of clarification. Even if
one restricts one’s attention to the kind of logic used in mathematics, there is
little agreement as to its scope. For example, in the Principia Mathematica,
Russell and Whitehead tried to develop mathematics on the basis of logic,
in the logicist tradition begun by Frege. Their system of logic was a form
of higher-type logic similar to the one described below. In the end they were
forced to introduce axioms which, by most standards, do not seem purely logical
(notably, the axiom of infinity, and the axiom of reducibility), but one might
nonetheless hold that some forms of higher-order reasoning should be accepted
as logical. In contrast, Quine, whose ontology does not admit “propositions”
as legitimate objects of discourse, argues that second-order and higher-order
logic are really manifestations of set theory in sheep’s clothing; in other words,
systems involving quantification over predicates are not purely logical.
For now, it is best to leave such philosophical issues for a rainy day, and
simply think of the systems below as formal idealizations of various kinds of
reasoning, logical or otherwise.


24.2 Many-Sorted Logic



In first-order logic, variables and quantifiers range over a single domain. But
it is often useful to have multiple (disjoint) domains: for example, you might
want to have a domain of numbers, a domain of geometric objects, a domain
of functions from numbers to numbers, a domain of abelian groups, and so on.
Many-sorted logic provides this kind of framework. One starts with a list
of “sorts”—the “sort” of an object indicates the “domain” it is supposed to
inhabit. One then has variables and quantifiers for each sort, and (usually)
an identity predicate for each sort. Functions and relations are also “typed”
by the sorts of objects they can take as arguments. Otherwise, one keeps the
usual rules of first-order logic, with versions of the quantifier-rules repeated for
each sort.
For example, to study international relations we might choose a language
with two sorts of objects, French citizens and German citizens. We might have
a unary relation, “drinks wine,” for objects of the first sort; another unary
relation, “eats wurst,” for objects of the second sort; and a binary relation,
“forms a multinational married couple,” which takes two arguments, where
the first argument is of the first sort and the second argument is of the second
sort. If we use variables a, b, c to range over French citizens and x, y, z to
range over German citizens, then

∀a ∀x (MarriedTo(a, x) → (DrinksWine(a) ∨ ¬EatsWurst(x)))

asserts that if any French person is married to a German, either the French
person drinks wine or the German doesn’t eat wurst.
Many-sorted logic can be embedded in first-order logic in a natural way,
by lumping all the objects of the many-sorted domains together into one first-
order domain, using unary predicate symbols to keep track of the sorts, and
relativizing quantifiers. For example, the first-order language corresponding
to the example above would have unary predicate symbols “German” and
“French,” in addition to the other relations described, with the sort require-
ments erased. A sorted quantifier ∀x φ, where x is a variable of the German
sort, translates to
∀x (German(x) → φ).

We need to add axioms that ensure that the sorts are separate—e.g., ∀x ¬(German(x) ∧
French(x))—as well as axioms that guarantee that “drinks wine” only holds
of objects satisfying the predicate French(x), etc. With these conventions
and axioms, it is not difficult to show that many-sorted sentences translate
to first-order sentences, and many-sorted derivations translate to first-order
derivations. Also, many-sorted structures “translate” to corresponding first-
order structures and vice-versa, so we also have a completeness theorem for
many-sorted logic.
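
For instance, relativizing both quantifiers in this way, the sample sentence
about multinational married couples becomes the first-order sentence

∀a (French(a) → ∀x (German(x) → (MarriedTo(a, x) → (DrinksWine(a) ∨ ¬EatsWurst(x))))).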



24.3 Second-Order Logic


The language of second-order logic allows one to quantify not just over a domain
of individuals, but over relations on that domain as well. Given a first-order
language L, for each k one adds variables R which range over k-ary relations,
and allows quantification over those variables. If R is a variable for a k-ary
relation, and t1 , . . . , tk are ordinary (first-order) terms, R(t1 , . . . , tk ) is an
atomic formula. Otherwise, the set of formulas is defined just as in the case of
first-order logic, with additional clauses for second-order quantification. Note
that we only have the identity predicate for first-order terms: if R and S are
relation variables of the same arity k, we can define R = S to be an abbreviation
for
∀x1 . . . ∀xk (R(x1 , . . . , xk ) ↔ S(x1 , . . . , xk )).
The rules for second-order logic simply extend the quantifier rules to the
new second order variables. Here, however, one has to be a little bit careful to
explain how these variables interact with the predicate symbols of L, and with
formulas of L more generally. At the bare minimum, relation variables count
as terms, so one has inferences of the form

φ(R) ⊢ ∃R φ(R)

But if L is the language of arithmetic with a constant relation symbol <, one
would also expect the following inference to be valid:

x < y ⊢ ∃R R(x, y)

or for a given formula φ,

φ(x1 , . . . , xk ) ⊢ ∃R R(x1 , . . . , xk )

More generally, we might want to allow inferences of the form

φ[λ⃗x. ψ(⃗x)/R] ⊢ ∃R φ

where φ[λ⃗x. ψ(⃗x)/R] denotes the result of replacing every atomic formula of
the form R(t1 , . . . , tk ) in φ by ψ(t1 , . . . , tk ). This last rule is equivalent to having
a comprehension schema, i.e., an axiom of the form

∃R ∀x1 , . . . , xk (φ(x1 , . . . , xk ) ↔ R(x1 , . . . , xk )),

one for each formula φ in the second-order language, in which R is not a free
variable. (Exercise: show that if R is allowed to occur in φ, this schema is
inconsistent!)
When logicians refer to the “axioms of second-order logic” they usually
mean the minimal extension of first-order logic by second-order quantifier rules
together with the comprehension schema. But it is often interesting to study
weaker subsystems of these axioms and rules. For example, note that in its full
generality the axiom schema of comprehension is impredicative: it allows one to


assert the existence of a relation R(x1 , . . . , xk ) that is “defined” by a formula
with second-order quantifiers; and these quantifiers range over the set of all
such relations—a set which includes R itself! Around the turn of the twentieth
century, a common reaction to Russell’s paradox was to lay the blame on such
definitions, and to avoid them in developing the foundations of mathematics.
If one prohibits the use of second-order quantifiers in the formula φ, one has a
predicative form of comprehension, which is somewhat weaker.
From the semantic point of view, one can think of a second-order struc-
ture as consisting of a first-order structure for the language, coupled with a
set of relations on the domain over which the second-order quantifiers range
(more precisely, for each k there is a set of relations of arity k). Of course, if
comprehension is included in the derivation system, then we have the added
requirement that there are enough relations in the “second-order part” to sat-
isfy the comprehension axioms—otherwise the derivation system is not sound!
One easy way to ensure that there are enough relations around is to take the
second-order part to consist of all the relations on the first-order part. Such
a structure is called full, and, in a sense, is really the “intended structure”
for the language. If we restrict our attention to full structures we have what
is known as the full second-order semantics. In that case, specifying a struc-
ture boils down to specifying the first-order part, since the contents of the
second-order part follow from that implicitly.
To summarize, there is some ambiguity when talking about second-order
logic. In terms of the derivation system, one might have in mind either

1. A “minimal” second-order derivation system, together with some com-
prehension axioms.
2. The “standard” second-order derivation system, with full comprehension.
In terms of the semantics, one might be interested in either

1. The “weak” semantics, where a structure consists of a first-order part,
together with a second-order part big enough to satisfy the comprehension
axioms.
2. The “standard” second-order semantics, in which one considers full struc-
tures only.

When logicians do not specify the derivation system or the semantics they have
in mind, they are usually referring to the second item on each list. The ad-
vantage to using this semantics is that, as we will see, it gives us categorical
descriptions of many natural mathematical structures; at the same time, the
derivation system is quite strong, and sound for this semantics. The drawback
is that the derivation system is not complete for the semantics; in fact, no
effectively given derivation system is complete for the full second-order seman-
tics. On the other hand, we will see that the derivation system is complete for
the weakened semantics; this implies that if a sentence is not provable, then
there is some structure, not necessarily the full one, in which it is false.


The language of second-order logic is quite rich. One can identify unary
relations with subsets of the domain, and so in particular you can quantify over
these sets; for example, one can express induction for the natural numbers with
a single axiom
∀R ((R(0) ∧ ∀x (R(x) → R(x′ ))) → ∀x R(x)).
If one takes the language of arithmetic to have symbols 0, ′, +, × and <, one
can add the following axioms to describe their behavior:
1. ∀x ¬x′ = 0
2. ∀x ∀y (x′ = y ′ → x = y)
3. ∀x (x + 0) = x
4. ∀x ∀y (x + y ′ ) = (x + y)′
5. ∀x (x × 0) = 0
6. ∀x ∀y (x × y ′ ) = ((x × y) + x)
7. ∀x ∀y (x < y ↔ ∃z y = (x + z ′ ))
It is not difficult to show that these axioms, together with the axiom of induc-
tion above, provide a categorical description of the structure N, the standard
model of arithmetic, provided we are using the full second-order semantics.
Given any structure M in which these axioms are true, define a function f
from N to the domain of M using ordinary recursion on N, so that f (0) = 0M
and f (x + 1) = ′M (f (x)). Using ordinary induction on N and the fact that
axioms (1) and (2) hold in M, we see that f is injective. To see that f is
surjective, let P be the set of elements of |M| that are in the range of f . Since
M is full, P is in the second-order domain. By the construction of f , we know
that 0M is in P , and that P is closed under ′M . The fact that the induction
axiom holds in M (in particular, for P ) guarantees that P is equal to the entire
first-order domain of M. This shows that f is a bijection. Showing that f is a
homomorphism is no more difficult, using ordinary induction on N repeatedly.
In set-theoretic terms, a function is just a special kind of relation; for ex-
ample, a unary function f can be identified with a binary relation R satisfying
∀x ∃!y R(x, y). As a result, one can quantify over functions too. Using the full
semantics, one can then define the class of infinite structures to be the class of
structures M for which there is an injective function from the domain of M to
a proper subset of itself:
∃f (∀x ∀y (f (x) = f (y) → x = y) ∧ ∃y ∀x f (x) ̸= y).
The negation of this sentence then defines the class of finite structures.
In addition, one can define the class of well-orderings, by adding the follow-
ing to the definition of a linear ordering:
∀P (∃x P (x) → ∃x (P (x) ∧ ∀y (y < x → ¬P (y)))).


This asserts that every non-empty set has a least element, modulo the identifi-
cation of “set” with “one-place relation”. For another example, one can express
the notion of connectedness for graphs, by saying that there is no nontrivial
separation of the vertices into disconnected parts:

¬∃A (∃x A(x) ∧ ∃y ¬A(y) ∧ ∀w ∀z ((A(w) ∧ ¬A(z)) → ¬R(w, z))).

For yet another example, you might try as an exercise to define the class of
finite structures whose domain has even size. More strikingly, one can pro-
vide a categorical description of the real numbers as a complete ordered field
containing the rationals.
In short, second-order logic is much more expressive than first-order logic.
That’s the good news; now for the bad. We have already mentioned that
there is no effective derivation system that is complete for the full second-order
semantics. For better or for worse, many of the properties of first-order logic
are absent, including compactness and the Löwenheim–Skolem theorems.
On the other hand, if one is willing to give up the full second-order semantics
in terms of the weaker one, then the minimal second-order derivation system
is complete for this semantics. In other words, if we read ⊢ as “proves in
the minimal system” and ⊨ as “logically implies in the weaker semantics”,
we can show that whenever Γ ⊨ φ then Γ ⊢ φ. If one wants to include
specific comprehension axioms in the derivation system, one has to restrict the
semantics to second-order structures that satisfy these axioms: for example, if
∆ consists of a set of comprehension axioms (possibly all of them), we have
that if Γ ∪ ∆ ⊨ φ, then Γ ∪ ∆ ⊢ φ. In particular, if φ is not provable using
the comprehension axioms we are considering, then there is a model of ¬φ in
which these comprehension axioms nonetheless hold.
The easiest way to see that the completeness theorem holds for the weaker
semantics is to think of second-order logic as a many-sorted logic, as follows.
One sort is interpreted as the ordinary “first-order” domain, and then for each
k we have a domain of “relations of arity k.” We take the language to have
built-in relation symbols “truek (R, x1 , . . . , xk )”, which are meant to assert that
R holds of x1 , . . . , xk , where R is a variable of the sort “k-ary relation” and
x1 , . . . , xk are objects of the first-order sort.
With this identification, the weak second-order semantics is essentially the
usual semantics for many-sorted logic; and we have already observed that
many-sorted logic can be embedded in first-order logic. Modulo the trans-
lations back and forth, then, the weaker conception of second-order logic is
really a form of first-order logic in disguise, where the domain contains both
“objects” and “relations” governed by the appropriate axioms.


24.4 Higher-Order Logic




Passing from first-order logic to second-order logic enabled us to talk about


sets of objects in the first-order domain, within the formal language. Why stop
there? For example, third-order logic should enable us to deal with sets of sets
of objects, or perhaps even sets which contain both objects and sets of objects.
And fourth-order logic will let us talk about sets of objects of that kind. As
you may have guessed, one can iterate this idea arbitrarily.
In practice, higher-order logic is often formulated in terms of functions in-
stead of relations. (Modulo the natural identifications, this difference is inessen-
tial.) Given some basic “sorts” A, B, C, . . . (which we will now call “types”),
we can create new ones by stipulating

If σ and τ are finite types then so is σ → τ .

Think of types as syntactic “labels,” which classify the objects we want in our
domain; σ → τ describes those objects that are functions which take objects
of type σ to objects of type τ . For example, we might want to have a type
Ω of truth values, “true” and “false,” and a type N of natural numbers. In
that case, you can think of objects of type N → Ω as unary relations, or
subsets of N; objects of type N → N are functions from natural numbers to
natural numbers; and objects of type (N → N) → N are “functionals,” that is,
higher-type functions that take functions to numbers.
As in the case of second-order logic, one can think of higher-order logic as a
kind of many-sorted logic, where there is a sort for each type of object we want
to consider. But it is usually clearer just to define the syntax of higher-type
logic from the ground up. For example, we can define a set of finite types
inductively, as follows:

1. N is a finite type.

2. If σ and τ are finite types, then so is σ → τ .

3. If σ and τ are finite types, so is σ × τ .

Intuitively, N denotes the type of the natural numbers, σ → τ denotes the
type of functions from σ to τ , and σ × τ denotes the type of pairs of objects,
one from σ and one from τ . We can then define a set of terms inductively, as
follows:

1. For each type σ, there is a stock of variables x, y, z, . . . of type σ

2. 0 is a term of type N

3. S (successor) is a term of type N → N

4. If s is a term of type σ, and t is a term of type N → (σ → σ), then Rst
is a term of type N → σ

5. If s is a term of type τ → σ and t is a term of type τ , then s(t) is a term
of type σ


6. If s is a term of type σ and x is a variable of type τ , then λx. s is a term
of type τ → σ.
7. If s is a term of type σ and t is a term of type τ , then ⟨s, t⟩ is a term of
type σ × τ .
8. If s is a term of type σ × τ then p1 (s) is a term of type σ and p2 (s) is a
term of type τ .
Intuitively, Rst denotes the function defined recursively by

Rst (0) = s
Rst (x + 1) = t(x, Rst (x)),

⟨s, t⟩ denotes the pair whose first component is s and whose second component
is t, and p1 (s) and p2 (s) denote the first and second elements (“projections”)
of s. Finally, λx. s denotes the function f defined by

f (x) = s

for any x of type σ; so item (6) gives us a form of comprehension, enabling us
to define functions using terms. Formulas are built up from identity predicate
statements s = t between terms of the same type, the usual propositional
connectives, and higher-type quantification. One can then take the axioms
of the system to be the basic equations governing the terms defined above,
together with the usual rules of logic with quantifiers and identity predicate.
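
To get a feel for the recursor, here is a minimal sketch (not part of the text)
of R as a Python higher-order function satisfying the two recursion equations
above:

def R(s, t):
    # R(s, t)(0) = s and R(s, t)(n + 1) = t(n, R(s, t)(n))
    def f(n):
        return s if n == 0 else t(n - 1, f(n - 1))
    return f

# Example: addition m + n, by recursion on the second argument.
add = lambda m: R(m, lambda x, r: r + 1)
print(add(3)(4))   # prints 7
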
If one augments the finite type system with a type Ω of truth values, one
has to include axioms which govern its use as well. In fact, if one is clever, one
can get rid of complex formulas entirely, replacing them with terms of type Ω!
The proof system can then be modified accordingly. The result is essentially
the simple theory of types set forth by Alonzo Church in the 1930s.
As in the case of second-order logic, there are different versions of higher-
type semantics that one might want to use. In the full version, variables of
type σ → τ range over the set of all functions from the objects of type σ to
objects of type τ . As you might expect, this semantics is too strong to admit a
complete, effective derivation system. But one can consider a weaker semantics,
in which a structure consists of sets of elements Tτ for each type τ , together
with appropriate operations for application, projection, etc. If the details are
carried out correctly, one can obtain completeness theorems for the kinds of
derivation systems described above.
Higher-type logic is attractive because it provides a framework in which
we can embed a good deal of mathematics in a natural way: starting with
N, one can define real numbers, continuous functions, and so on. It is also
particularly attractive in the context of intuitionistic logic, since the types
have clear “constructive” interpretations. In fact, one can develop constructive
versions of higher-type semantics (based on intuitionistic, rather than classical
logic) that clarify these constructive interpretations quite nicely, and are, in
many ways, more interesting than the classical counterparts.



24.5 Intuitionistic Logic


In contrast to second-order and higher-order logic, intuitionistic first-order logic
represents a restriction of the classical version, intended to model a more “con-
structive” kind of reasoning. The following examples may serve to illustrate
some of the underlying motivations.
Suppose someone came up to you one day and announced that they had
determined a natural number x, with the property that if x is prime, the
Riemann hypothesis is true, and if x is composite, the Riemann hypothesis is
false. Great news! Whether the Riemann hypothesis is true or not is one of
the big open questions of mathematics, and here they seem to have reduced
the problem to one of calculation, that is, to the determination of whether a
specific number is prime or not.
What is the magic value of x? They describe it as follows: x is the natural
number that is equal to 7 if the Riemann hypothesis is true, and 9 otherwise.
Angrily, you demand your money back. From a classical point of view, the
description above does in fact determine a unique value of x; but what you
really want is a value of x that is given explicitly.
To take another, perhaps less contrived example, consider the following
question. We know that it is possible to raise an irrational number to a rational
power, and get a rational result. For example, (√2)^2 = 2. What is less clear
is whether or not it is possible to raise an irrational number to an irrational
power, and get a rational result. The following theorem answers this in the
affirmative:

Theorem 24.1. There are irrational numbers a and b such that a^b is rational.

Proof. Consider √2^√2. If this is rational, we are done: we can let a = b = √2.
Otherwise, it is irrational. Then we have

(√2^√2)^√2 = √2^(√2·√2) = √2^2 = 2,

which is certainly rational. So, in this case, let a be √2^√2, and let b be √2.

Does this constitute a valid proof? Most mathematicians feel that it does.
But again, there is something a little bit unsatisfying here: we have proved
the existence of a pair of real numbers with a certain property, without being
able to say which pair of numbers it is. It is possible to prove the same result,
but in such a way that the pair a, b is given in the proof: take a = √3 and
b = log_3 4. Then

a^b = (√3)^(log_3 4) = 3^((1/2)·log_3 4) = (3^(log_3 4))^(1/2) = 4^(1/2) = 2,


since 3^(log_3 x) = x.
Intuitionistic logic is designed to model a kind of reasoning where moves
like the one in the first proof are disallowed. Proving the existence of an x
satisfying φ(x) means that you have to give a specific x, and a proof that it
satisfies φ, like in the second proof. Proving that φ or ψ holds requires that
you can prove one or the other.
Formally speaking, intuitionistic first-order logic is what you get if you
restrict a derivation system for first-order logic in a certain way. Similarly,
there are intuitionistic versions of second-order or higher-order logic. From the
mathematical point of view, these are just formal deductive systems, but, as
already noted, they are intended to model a kind of mathematical reasoning.
One can take this to be the kind of reasoning that is justified on a certain
philosophical view of mathematics (such as Brouwer’s intuitionism); one can
take it to be a kind of mathematical reasoning which is more “concrete” and
satisfying (along the lines of Bishop’s constructivism); and one can argue about
whether or not the formal description captures the informal motivation. But
whatever philosophical positions we may hold, we can study intuitionistic logic
as a formally presented logic; and for whatever reasons, many mathematical
logicians find it interesting to do so.
There is an informal constructive interpretation of the intuitionist connec-
tives, usually known as the BHK interpretation (named after Brouwer, Heyting,
and Kolmogorov). It runs as follows: a proof of φ ∧ ψ consists of a proof of
φ paired with a proof of ψ; a proof of φ ∨ ψ consists of either a proof of φ,
or a proof of ψ, where we have explicit information as to which is the case;
a proof of φ → ψ consists of a procedure, which transforms a proof of φ to a
proof of ψ; a proof of ∀x φ(x) consists of a procedure which returns a proof of
φ(x) for any value of x; and a proof of ∃x φ(x) consists of a value of x, together
with a proof that this value satisfies φ. One can describe the interpretation in
computational terms known as the “Curry–Howard isomorphism” or the “for-
mulas-as-types paradigm”: think of a formula as specifying a certain kind of
data type, and proofs as computational objects of these data types that enable
us to see that the corresponding formula is true.
Intuitionistic logic is often thought of as being classical logic “minus” the
law of the excluded middle. The following theorem makes this more precise.

Theorem 24.2. Intuitionistically, the following axiom schemata are equiva-
lent:

1. (¬φ → ⊥) → φ.

2. φ ∨ ¬φ

3. ¬¬φ → φ

Obtaining instances of one schema from either of the others is a good exercise
in intuitionistic logic.


The first deductive systems for intuitionistic propositional logic, put forth
as formalizations of Brouwer’s intuitionism, are due, independently, to Kol-
mogorov, Glivenko, and Heyting. The first formalization of intuitionistic first-
order logic (and parts of intuitionist mathematics) is due to Heyting. Though
a number of classically valid schemata are not intuitionistically valid, many
are.
The double-negation translation describes an important relationship be-
tween classical and intuitionist logic. It is defined inductively as follows (think of
φN as the “intuitionist” translation of the classical formula φ):

φN ≡ ¬¬φ for atomic formulas φ


(φ ∧ ψ)N ≡ (φN ∧ ψ N )
(φ ∨ ψ)N ≡ ¬¬(φN ∨ ψ N )
(φ → ψ)N ≡ (φN → ψ N )
(∀x φ)N ≡ ∀x φN
(∃x φ)N ≡ ¬¬∃x φN
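
The translation is immediate to implement. As a minimal sketch (not part of
the text), with formulas represented as nested Python tuples such as
('and', ('atom', 'p'), ('atom', 'q')), and with ¬φ treated as φ → ⊥:

def neg(phi):
    return ('->', phi, ('bot',))

def N(phi):
    # the double-negation translation, clause by clause as above
    op = phi[0]
    if op == 'bot':
        return phi                 # ⊥ may be left as is (¬¬⊥ is equivalent)
    if op == 'atom':
        return neg(neg(phi))
    if op in ('and', '->'):
        return (op, N(phi[1]), N(phi[2]))
    if op == 'or':
        return neg(neg(('or', N(phi[1]), N(phi[2]))))
    if op == 'forall':             # phi = ('forall', x, body)
        return ('forall', phi[1], N(phi[2]))
    if op == 'exists':
        return neg(neg(('exists', phi[1], N(phi[2]))))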

Kolmogorov and Glivenko had versions of this translation for propositional
logic; for predicate logic, it is due to Gödel and Gentzen, independently. We
have
Theorem 24.3. 1. φ ↔ φN is provable classically
2. If φ is provable classically, then φN is provable intuitionistically.

We can now envision the following dialogue. Classical mathematician: “I’ve
proved φ!” Intuitionist mathematician: “Your proof isn’t valid. What you’ve
really proved is φN .” Classical mathematician: “Fine by me!” As far as the
classical mathematician is concerned, the intuitionist is just splitting hairs,
since the two are equivalent. But the intuitionist insists there is a difference.
Note that the above translation concerns pure logic only; it does not address
the question as to what the appropriate nonlogical axioms are for classical and
intuitionistic mathematics, or what the relationship is between them. But the
following slight extension of the theorem above provides some useful informa-
tion:
Theorem 24.4. If Γ proves φ classically, Γ N proves φN intuitionistically.

In other words, if φ is provable from some hypotheses classically, then φN
is provable from their double-negation translations.
To show that a sentence or propositional formula is intuitionistically valid,
all you have to do is provide a proof. But how can you show that it is not
valid? For that purpose, we need a semantics that is sound, and preferably
complete. A semantics due to Kripke nicely fits the bill.
We can play the same game we did for classical logic: define the semantics,
and prove soundness and completeness. It is worthwhile, however, to note


the following distinction. In the case of classical logic, the semantics was the
“obvious” one, in a sense implicit in the meaning of the connectives. Though
one can provide some intuitive motivation for Kripke semantics, the latter does
not offer the same feeling of inevitability. In addition, the notion of a classical
structure is a natural mathematical one, so we can either take the notion of
a structure to be a tool for studying classical first-order logic, or take classical
first-order logic to be a tool for studying mathematical structures. In contrast,
Kripke structures can only be viewed as a logical construct; they don’t seem
to have independent mathematical interest.
A Kripke structure M = ⟨W, R, V ⟩ for a propositional language consists
of a set W , a partial order R on W with a least element, and a “monotone”
assignment of propositional variables to the elements of W . The intuition is
that the elements of W represent “worlds,” or “states of knowledge”; an element
v ≥ u represents a “possible future state” of u; and the propositional variables
assigned to u are the propositions that are known to be true in state u. The
forcing relation M, w ⊩ φ then extends this relationship to arbitrary formulas
in the language; read M, w ⊩ φ as “φ is true in state w.” The relationship is
defined inductively, as follows:

1. M, w ⊩ pi iff pi is one of the propositional variables assigned to w.

2. M, w ⊮ ⊥.

3. M, w ⊩ (φ ∧ ψ) iff M, w ⊩ φ and M, w ⊩ ψ.

4. M, w ⊩ (φ ∨ ψ) iff M, w ⊩ φ or M, w ⊩ ψ.

5. M, w ⊩ (φ → ψ) iff, whenever w′ ≥ w and M, w′ ⊩ φ, then M, w′ ⊩ ψ.

It is a good exercise to try to show that ¬(p ∧ q) → (¬p ∨ ¬q) is not intuitionis-
tically valid, by cooking up a Kripke structure that provides a counterexample.
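
One way to experiment with such counterexamples is to implement the forcing
clauses directly. The following is a minimal sketch (not part of the text): a
model with root w0 and two incomparable future states w1 and w2, with ¬φ
encoded as φ → ⊥.

W = ['w0', 'w1', 'w2']
# reflexive partial order with least element w0
leq = {('w0', 'w0'), ('w1', 'w1'), ('w2', 'w2'), ('w0', 'w1'), ('w0', 'w2')}
V = {'w0': set(), 'w1': {'p'}, 'w2': {'q'}}     # monotone valuation

def forces(w, phi):
    if phi == 'bot':
        return False
    if isinstance(phi, str):                    # propositional variable
        return phi in V[w]
    op, a, b = phi
    if op == 'and':
        return forces(w, a) and forces(w, b)
    if op == 'or':
        return forces(w, a) or forces(w, b)
    if op == '->':                              # quantifies over future states
        return all(forces(v, b) for v in W
                   if (w, v) in leq and forces(v, a))

def neg(phi):
    return ('->', phi, 'bot')

phi = ('->', neg(('and', 'p', 'q')), ('or', neg('p'), neg('q')))
print(forces('w0', phi))   # False: w0 forces ¬(p ∧ q) but neither ¬p nor ¬q
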


24.6 Modal Logics


Consider the following example of a conditional sentence:

If Jeremy is alone in that room, then he is drunk and naked and
dancing on the chairs.

This is an example of a conditional assertion that may be materially true
but nonetheless misleading, since it seems to suggest that there is a stronger
link between the antecedent and consequent than simply that either the
antecedent is false or the consequent true. That is, the wording suggests that
the claim is not only true in this particular world (where it may be trivially true,
because Jeremy is not alone in the room), but that, moreover, the conclusion
would have been true had the antecedent been true. In other words, one can


take the assertion to mean that the claim is true not just in this world, but in
any “possible” world; or that it is necessarily true, as opposed to just true in
this particular world.
Modal logic was designed to make sense of this kind of necessity. One
obtains modal propositional logic from ordinary propositional logic by adding
a box operator; which is to say, if φ is a formula, so is □φ. Intuitively, □φ
asserts that φ is necessarily true, or true in any possible world. ♢φ is usually
taken to be an abbreviation for ¬□¬φ, and can be read as asserting that φ is
possibly true. Of course, modality can be added to predicate logic as well.
Kripke structures can be used to provide a semantics for modal logic; in
fact, Kripke first designed this semantics with modal logic in mind. Rather than
restricting to partial orders, more generally one has a set of “possible worlds,”
P , and a binary “accessibility” relation R(x, y) between worlds. Intuitively,
R(p, q) asserts that the world q is compatible with p; i.e., if we are “in” world p,
we have to entertain the possibility that the world could have been like q.
Modal logic is sometimes called an “intensional” logic, as opposed to an
“extensional” one. The intended semantics for an extensional logic, like classi-
cal logic, will only refer to a single world, the “actual” one; while the semantics
for an “intensional” logic relies on a more elaborate ontology. In addition to
modeling necessity, one can use modality to model other linguistic con-
structions, reinterpreting □ and ♢ according to the application. For example:

1. In provability logic, □φ is read “φ is provable” and ♢φ is read “φ is
consistent.”
2. In epistemic logic, one might read □φ as “I know φ” or “I believe φ.”
3. In temporal logic, one can read □φ as “φ is always true” and ♢φ as “φ
is sometimes true.”

One would like to augment logic with rules and axioms dealing with modal-
ity. For example, the system S4 consists of the ordinary axioms and rules of
propositional logic, together with the following axioms:

□(φ → ψ) → (□φ → □ψ)


□φ → φ
□φ → □□φ

as well as a rule, “from φ conclude □φ.” S5 adds the following axiom:

♢φ → □♢φ

Variations of these axioms may be suitable for different applications; for ex-
ample, S5 is usually taken to characterize the notion of logical necessity. And
the nice thing is that one can usually find a semantics for which the derivation
system is sound and complete by restricting the accessibility relation in the
Kripke structures in natural ways. For example, S4 corresponds to the class


of Kripke structures in which the accessibility relation is reflexive and transi-
tive. S5 corresponds to the class of Kripke structures in which the accessibility
relation is universal, which is to say that every world is accessible from every
other; so □φ holds if and only if φ holds in every world.
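
As a minimal sketch (not part of the text) of this correspondence: in a Kripke
structure whose accessibility relation is not reflexive, the S4 axiom □φ → φ
can fail.

W = [0, 1]
acc = {(0, 1)}                 # world 0 does not see itself: not reflexive
V = {0: set(), 1: {'p'}}

def sat(w, phi):
    if isinstance(phi, str):
        return phi in V[w]
    if phi[0] == 'box':        # true iff phi holds at all accessible worlds
        return all(sat(v, phi[1]) for v in W if (w, v) in acc)
    if phi[0] == '->':
        return (not sat(w, phi[1])) or sat(w, phi[2])

print(sat(0, ('->', ('box', 'p'), 'p')))   # False: box p holds at 0, p does not
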


24.7 Other Logics


As you may have gathered by now, it is not hard to design a new logic. You
too can create your own syntax, make up a deductive system, and fashion
a semantics to go with it. You might have to be a bit clever if you want
the derivation system to be complete for the semantics, and it might take
some effort to convince the world at large that your logic is truly interesting.
But, in return, you can enjoy hours of good, clean fun, exploring your logic’s
mathematical and computational properties.
Recent decades have witnessed a veritable explosion of formal logics. Fuzzy
logic is designed to model reasoning about vague properties. Probabilistic logic
is designed to model reasoning about uncertainty. Default logics and nonmono-
tonic logics are designed to model defeasible forms of reasoning, which is to say,
“reasonable” inferences that can later be overturned in the face of new informa-
tion. There are epistemic logics, designed to model reasoning about knowledge;
causal logics, designed to model reasoning about causal relationships; and even
“deontic” logics, which are designed to model reasoning about moral and ethi-
cal obligations. Depending on whether the primary motivation for introducing
these systems is philosophical, mathematical, or computational, you may find
such creatures studied under the rubric of mathematical logic, philosophical
logic, artificial intelligence, cognitive science, or elsewhere.
The list goes on and on, and the possibilities seem endless. We may never
attain Leibniz’ dream of reducing all of human reason to calculation—but that
can’t stop us from trying.



Part IV

Model Theory
Material on model theory is incomplete and experimental. It is cur-
rently simply an adaptation of Aldo Antonelli’s notes on model theory, less
those topics covered in the part on first-order logic (theories, completeness,
compactness). It requires much more introduction, motivation, and expla-
nation, as well as exercises, to be useful for a textbook. Andy Arana is
planning to work on this part specifically (issue #65).

Chapter 25

Basics of Model Theory


25.1 Reducts and Expansions


Often it is useful or necessary to compare languages which have symbols in
common, as well as structures for these languages. The most common case is
when all the symbols in a language L are also part of a language L′ , i.e., L ⊆ L′ .
An L-structure M can then always be expanded to an L′ -structure by adding
interpretations of the additional symbols while leaving the interpretations of
the common symbols the same. On the other hand, from an L′ -structure M′
we can obtain an L-structure simply by “forgetting” the interpretations of the
symbols that do not occur in L.

Definition 25.1. Suppose L ⊆ L′ , M is an L-structure and M′ is an L′ -
structure. M is the reduct of M′ to L, and M′ is an expansion of M to L′
iff

1. |M| = |M′ |



2. For every constant symbol c ∈ L, cM = cM′ .

3. For every function symbol f ∈ L, f M = f M′ .

4. For every predicate symbol P ∈ L, P M = P M′ .

Proposition 25.2. If an L-structure M is a reduct of an L′ -structure M′ ,
then for all L-sentences φ,

M ⊨ φ iff M′ ⊨ φ.

Proof. Exercise.

Problem 25.1. Prove Proposition 25.2.

Definition 25.3. When we have an L-structure M, and L′ = L ∪ {P } is the
expansion of L obtained by adding a single n-place predicate symbol P , and
R ⊆ |M|n is an n-place relation, then we write (M, R) for the expansion M′
of M with P M′ = R.
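
As a minimal sketch (not part of the text), with finite structures represented
as Python dicts (a hypothetical encoding: domain plus interpretation tables),
taking a reduct is just a matter of forgetting entries:

def reduct(M2, L):
    # keep the domain; drop interpretations of symbols not in L
    return {'dom': M2['dom'],
            'const': {c: v for c, v in M2['const'].items() if c in L},
            'func': {f: v for f, v in M2['func'].items() if f in L},
            'pred': {P: v for P, v in M2['pred'].items() if P in L}}

An expansion goes the other way: it adds interpretations for the new symbols
while leaving the domain and the existing entries untouched.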


25.2 Substructures
The domain of a structure M may be a subset of another M′ . But we should
obviously only consider M a “part” of M′ if not only |M| ⊆ |M′ |, but M and
M′ “agree” in how they interpret the symbols of the language at least on the
shared part |M|.
Definition 25.4. Given structures M and M′ for the same language L, we say
that M is a substructure of M′ , and M′ an extension of M, written M ⊆ M′ ,
iff
1. |M| ⊆ |M′ |,

2. For each constant c ∈ L, cM = cM′ ;

3. For each n-place function symbol f ∈ L, f M (a1 , . . . , an ) = f M′ (a1 , . . . , an )
for all a1 , . . . , an ∈ |M|.
4. For each n-place predicate symbol R ∈ L, ⟨a1 , . . . , an ⟩ ∈ RM iff
⟨a1 , . . . , an ⟩ ∈ RM′ , for all a1 , . . . , an ∈ |M|.

Remark 1. If the language contains no constant or function symbols, then any
N ⊆ |M| determines a substructure N of M with domain |N| = N by putting
RN = RM ∩ N n .



25.3 Overspill
Theorem 25.5. If a set Γ of sentences has arbitrarily large finite models,
then it has an infinite model.

Proof. Expand the language of Γ by adding countably many new constants c0 ,
c1 , . . . , and consider the set Γ ∪ {ci ̸= cj : i ̸= j}. To say that Γ has arbitrarily
large finite models means that for every m > 0 there is n ≥ m such that Γ
has a model of cardinality n. This implies that Γ ∪ {ci ̸= cj : i ̸= j} is finitely
satisfiable. By compactness, Γ ∪{ci ̸= cj : i ̸= j} has a model M whose domain
must be infinite, since it satisfies all inequalities ci ̸= cj .

Proposition 25.6. There is no sentence φ of any first-order language that is
true in a structure M if and only if the domain |M| of the structure is infinite.

Proof. If there were such a φ, its negation ¬φ would be true in all and only
the finite structures, and it would therefore have arbitrarily large finite models
but it would lack an infinite model, contradicting Theorem 25.5.


25.4 Isomorphic Structures


First-order structures can be alike in one of two ways. One way in which they
can be alike is that they make the same sentences true. We call such structures
elementarily equivalent. But structures can be very different and still make the
same sentences true—for instance, one can be enumerable and the other not.
This is because there are lots of features of a structure that cannot be expressed
in first-order languages, either because the language is not rich enough, or
because of fundamental limitations of first-order logic such as the Löwenheim–
Skolem theorem. So another, stricter, aspect in which structures can be alike
is if they are fundamentally the same, in the sense that they only differ in the
objects that make them up, but not in their structural features. A way of
making this precise is by the notion of an isomorphism.
Definition 25.7. Given two structures M and M′ for the same language L,
we say that M is elementarily equivalent to M′ , written M ≡ M′ , if and only
if for every sentence φ of L, M ⊨ φ iff M′ ⊨ φ.

Definition 25.8. Given two structures M and M′ for the same language L,
we say that M is isomorphic to M′ , written M ≃ M′ , if and only if there is a
function h : |M| → |M′ | such that:

1. h is injective: if h(x) = h(y) then x = y;

2. h is surjective: for every y ∈ |M′ | there is x ∈ |M| such that h(x) = y;

3. for every constant symbol c: h(cM ) = cM′ ;

4. for every n-place predicate symbol P :

⟨a1 , . . . , an ⟩ ∈ P M iff ⟨h(a1 ), . . . , h(an )⟩ ∈ P M′ ;

5. for every n-place function symbol f :

h(f M (a1 , . . . , an )) = f M′ (h(a1 ), . . . , h(an )).
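A concrete example (not in the original text): let E = {0, 2, 4, . . .} and
consider the structures (N, <) and (E, <) for the language containing just <.
The map h(n) = 2n is injective and surjective, and n < m iff 2n < 2m, so h
is an isomorphism and (N, <) ≃ (E, <).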

Theorem 25.9. If M ≃ M′ then M ≡ M′ .

Proof. Let h be an isomorphism of M onto M′ . For any assignment s, h ◦ s is the
composition of h and s, i.e., the assignment in M′ such that (h ◦ s)(x) = h(s(x)).
By induction on t and φ one can prove the stronger claims:

a. h(ValM s (t)) = ValM′ h◦s (t).

b. M, s ⊨ φ iff M′ , h ◦ s ⊨ φ.

The first is proved by induction on the complexity of t.

1. If t ≡ c, then ValM s (c) = cM and ValM′ h◦s (c) = cM′ . Thus, h(ValM s (t)) =
h(cM ) = cM′ (by (3) of Definition 25.8) = ValM′ h◦s (t).

2. If t ≡ x, then ValM s (x) = s(x) and ValM′ h◦s (x) = h(s(x)). Thus, h(ValM s (x)) =
h(s(x)) = ValM′ h◦s (x).

3. If t ≡ f (t1 , . . . , tn ), then

   ValM s (t) = f M (ValM s (t1 ), . . . , ValM s (tn )) and
   ValM′ h◦s (t) = f M′ (ValM′ h◦s (t1 ), . . . , ValM′ h◦s (tn )).

   The induction hypothesis is that for each i, h(ValM s (ti )) = ValM′ h◦s (ti ). So,

   h(ValM s (t)) = h(f M (ValM s (t1 ), . . . , ValM s (tn )))
                = f M′ (h(ValM s (t1 )), . . . , h(ValM s (tn )))      (25.1)
                = f M′ (ValM′ h◦s (t1 ), . . . , ValM′ h◦s (tn ))      (25.2)
                = ValM′ h◦s (t)

Here, eq. (25.1) follows by (5) of Definition 25.8 and eq. (25.2) by the
induction hypothesis.

Part (b) is left as an exercise.


If φ is a sentence, the assignments s and h ◦ s are irrelevant, and we have
M ⊨ φ iff M′ ⊨ φ.


Problem 25.2. Carry out the proof of (b) of Theorem 25.9 in detail. Make
sure to note where each of the five properties characterizing isomorphisms of
Definition 25.8 is used.

Definition 25.10. An automorphism of a structure M is an isomorphism of


M onto itself.

Problem 25.3. Show that for any structure M, if X is a definable subset of


M, and h is an automorphism of M, then X = {h(x) : x ∈ X} (i.e., X is fixed
under h).


25.5 The Theory of a Structure


Every structure M makes some sentences true, and some false. The set of all
the sentences it makes true is called its theory. That set is in fact a theory,
since anything it entails must be true in all its models, including M.

Definition 25.11. Given a structure M, the theory of M is the set Th(M) of


sentences that are true in M, i.e., Th(M) = {φ : M ⊨ φ}.

We also use the term “theory” informally to refer to sets of sentences having
an intended interpretation, whether deductively closed or not.

Proposition 25.12. For any M, Th(M) is complete.

Proof. For any sentence φ either M ⊨ φ or M ⊨ ¬φ, so either φ ∈ Th(M) or


¬φ ∈ Th(M).
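For example, Th(N), the set of sentences of the language of arithmetic true in
the standard model N, is complete by this proposition; it is the theory TA
considered in Chapter 26.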

Proposition 25.13. If N ⊨ φ for every φ ∈ Th(M), then M ≡ N.

Proof. Since N ⊨ φ for all φ ∈ Th(M), Th(M) ⊆ Th(N). If N ⊨ φ, then
N ⊭ ¬φ, so ¬φ ∉ Th(M). Since Th(M) is complete, φ ∈ Th(M). So, Th(N) ⊆
Th(M), and we have M ≡ N.

Remark 2. Consider R = ⟨R, <⟩, the structure whose domain is the set R of
the real numbers, in the language comprising only a 2-place predicate sym-
bol interpreted as the < relation over the reals. Clearly R is non-enumerable;
however, since Th(R) is obviously consistent, by the Löwenheim–Skolem the-
orem it has an enumerable model, say S, and by Proposition 25.13, R ≡ S.
Moreover, since R and S are not isomorphic, this shows that the converse of
Theorem 25.9 fails in general.


25.6 Partial Isomorphisms


Definition 25.14. Given two structures M and N, a partial isomorphism from
M to N is a finite partial function p taking arguments in |M| and returning
values in |N|, which satisfies the isomorphism conditions from Definition 25.8
on its domain:

1. p is injective;

2. for every constant symbol c: if p(cM ) is defined, then p(cM ) = cN ;

3. for every n-place predicate symbol P : if a1 , . . . , an are in the domain of


p, then ⟨a1 , . . . , an ⟩ ∈ P M if and only if ⟨p(a1 ), . . . , p(an )⟩ ∈ P N ;

4. for every n-place function symbol f : if a1 , . . . , an are in the domain of


p, then p(f M (a1 , . . . , an )) = f N (p(a1 ), . . . , p(an )).

That p is finite means that dom(p) is finite.
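For instance (an illustration not in the original text): if the language contains
just the 2-place predicate symbol <, a finite function p is a partial isomorphism
between linear orders M and N just in case it is order-preserving in both
directions, i.e., a <M b iff p(a) <N p(b) for all a, b in its domain; injectivity
then follows from linearity.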

Notice that the empty function ∅ is always a partial isomorphism between


any two structures.

Definition 25.15. Two structures M and N are partially isomorphic, written
M ≃p N, if and only if there is a non-empty set I of partial isomorphisms

between M and N satisfying the back-and-forth property:

1. (Forth) For every p ∈ I and a ∈ |M| there is q ∈ I such that p ⊆ q and


a is in the domain of q;

2. (Back ) For every p ∈ I and b ∈ |N| there is q ∈ I such that p ⊆ q and b


is in the range of q.

Theorem 25.16. If M ≃p N and M and N are enumerable, then M ≃ N.

Proof. Since M and N are enumerable, let |M| = {a0 , a1 , . . .} and |N| =
{b0 , b1 , . . .}. Starting with an arbitrary p0 ∈ I, we define an increasing se-
quence of partial isomorphisms p0 ⊆ p1 ⊆ p2 ⊆ · · · as follows:

1. if n + 1 is odd, say n = 2r, then using the Forth property find a pn+1 ∈ I
such that pn ⊆ pn+1 and ar is in the domain of pn+1 ;

2. if n + 1 is even, say n + 1 = 2r, then using the Back property find a


pn+1 ∈ I such that pn ⊆ pn+1 and br is in the range of pn+1 .

If we now put

p = ⋃n≥0 pn ,

we have that p is an isomorphism between M and N.


Problem 25.4. Show in detail that p as defined in Theorem 25.16 is in fact


an isomorphism.

Theorem 25.17. Suppose M and N are structures for a purely relational lan-
guage (a language containing only predicate symbols, and no function symbols
or constants). Then if M ≃p N, also M ≡ N.

Proof. By induction on formulas, one shows that if a1 , . . . , an and b1 , . . . ,


bn are such that there is a partial isomorphism p mapping each ai to bi and
s1 (xi ) = ai and s2 (xi ) = bi (for i = 1, . . . , n), then M, s1 ⊨ φ if and only if
N, s2 ⊨ φ. The case for n = 0 gives M ≡ N.

Remark 3. If function symbols are present, the previous result is still true, but
one needs to consider the isomorphism induced by p between the substructure
of M generated by a1 , . . . , an and the substructure of N generated by b1 , . . . ,
bn .

The previous result can be “broken down” into stages by establishing a


connection between the number of nested quantifiers in a formula and how
many times the relevant partial isomorphisms can be extended.

Definition 25.18. For any formula φ, the quantifier rank of φ, denoted by


qr(φ) ∈ N, is recursively defined as the highest number of nested quantifiers in
φ. Two structures M and N are n-equivalent, written M ≡n N, if they agree
on all sentences of quantifier rank less than or equal to n.
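Spelled out (a standard rendering of the recursion, not given explicitly in the
text):

qr(φ) = 0 for atomic φ;
qr(¬φ) = qr(φ);
qr(φ ∧ ψ) = qr(φ ∨ ψ) = qr(φ → ψ) = max(qr(φ), qr(ψ));
qr(∀x φ) = qr(∃x φ) = qr(φ) + 1.

For instance, qr(∃x (P (x) ∧ ∀y Q(x, y))) = 2.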

Proposition 25.19. Let L be a finite purely relational lan-
guage containing finitely many predicate symbols and constant symbols, and no
function symbols. Then for each n ∈ N there are only finitely many first-order
sentences in the language L that have quantifier rank no greater than n, up to
logical equivalence.

Proof. By induction on n.

Definition 25.20. Given a structure M, let |M|<ω be the set of all finite
sequences over |M|. We use a, b, c, . . . to range over finite sequences of elements.
If a ∈ |M|<ω and a ∈ |M|, then aa represents the concatenation of a with a.

Definition 25.21. Given structures M and N, we define relations In ⊆ |M|<ω ×
|N|<ω between sequences of equal length, by recursion on n as follows:

1. I0 (a, b) if and only if a and b satisfy the same atomic formulas in M and
N; i.e., if s1 (xi ) = ai and s2 (xi ) = bi and φ is atomic with all variables
among x1 , . . . , xn , then M, s1 ⊨ φ if and only if N, s2 ⊨ φ.

2. In+1 (a, b) if and only if for every a ∈ |M| there is a b ∈ |N| such that
In (aa, bb), and vice-versa.


Definition 25.22. Write M ≈n N if In (Λ, Λ) holds of M and N (where Λ is


the empty sequence).

Theorem 25.23. Let L be a purely relational language. Then In (a, b) implies
that for every φ such that qr(φ) ≤ n, we have M, a ⊨ φ if and only if N, b ⊨ φ
(where again a satisfies φ if any s such that s(xi ) = ai satisfies φ). Moreover,
if L is finite, the converse also holds.

Proof. The proof that In (a, b) implies that a and b satisfy the same formulas
of quantifier rank no greater than n is by an easy induction on φ. For the
converse we proceed by induction on n, using Proposition 25.19, which ensures
that for each n there are at most finitely many non-equivalent formulas of that
quantifier rank.
For n = 0 the hypothesis that a and b satisfy the same quantifier-free
formulas gives that they satisfy the same atomic ones, so that I0 (a, b).
For the n + 1 case, suppose that a and b satisfy the same formulas of
quantifier rank no greater than n + 1. In order to show that In+1 (a, b), it
suffices to show that for each a ∈ |M| there is a b ∈ |N| such that In (aa, bb),
and by the inductive hypothesis again it suffices to show that for each a ∈ |M|
there is a b ∈ |N| such that aa and bb satisfy the same formulas of quantifier
rank no greater than n.
Given a ∈ |M|, let τna be the set of formulas ψ(x, y) of rank no greater than
n satisfied by aa in M; by Proposition 25.19, τna is finite up to logical
equivalence, so we can assume it is a single first-order formula (the conjunction
of representatives of the finitely many equivalence classes). It follows that a
satisfies ∃x τna (x, y), which has quantifier rank no greater than n + 1. By
hypothesis b satisfies the same formula in N, so that there is a b ∈ |N| such
that bb satisfies τna ; in particular, bb satisfies the same formulas of quantifier
rank no greater than n as aa. Similarly one shows that for every b ∈ |N| there
is a ∈ |M| such that aa and bb satisfy the same formulas of quantifier rank no
greater than n, which completes the proof.

Corollary 25.24. If M and N are purely relational structures in a finite
language, then M ≈n N if and only if M ≡n N. In particular, M ≡ N if and
only if M ≈n N for each n.


25.7 Dense Linear Orders


Definition 25.25. A dense linear ordering without endpoints is a structure M
for the language containing a single 2-place predicate symbol < satisfying the
following sentences:

1. ∀x ¬x < x;

2. ∀x ∀y ∀z (x < y → (y < z → x < z));

3. ∀x ∀y (x < y ∨ x = y ∨ y < x);

4. ∀x ∃y x < y;

5. ∀x ∃y y < x;

6. ∀x ∀y (x < y → ∃z (x < z ∧ z < y)).
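For example, Q = (Q, <) and R = (R, <) both satisfy (1)–(6), and so are dense
linear orderings without endpoints. By contrast, (N, <) fails (5), since 0 is a
least element, and (Z, <) fails (6), since no integer lies strictly between n and
n + 1.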

Theorem 25.26. Any two enumerable dense linear orderings without end-
points are isomorphic.

Proof. Let M1 and M2 be enumerable dense linear orderings without end-


points, with <1 = <M1 and <2 = <M2 , and let I be the set of all partial
isomorphisms between them. I is not empty since at least ∅ ∈ I. We show
that I satisfies the Back-and-Forth property. Then M1 ≃p M2 , and the theo-
rem follows by Theorem 25.16.
To show I satisfies the Forth property, let p ∈ I and let p(ai ) = bi for i = 1,
. . . , n, and without loss of generality suppose a1 <1 a2 <1 · · · <1 an . Given
a ∈ |M1 |, find b ∈ |M2 | as follows:

1. if a <1 a1 , let b ∈ |M2 | be such that b <2 b1 ;

2. if an <1 a, let b ∈ |M2 | be such that bn <2 b;

3. if ai <1 a <1 ai+1 for some i, then let b ∈ |M2 | be such that bi <2 b <2
bi+1 .

It is always possible to find a b with the desired property since M2 is a dense


linear ordering without endpoints. Define q = p ∪ {⟨a, b⟩} so that q ∈ I is the
desired extension of p. This establishes the Forth property. The Back property
is similar. So M1 ≃p M2 ; by Theorem 25.16, M1 ≃ M2 .

Problem 25.5. Complete the proof of Theorem 25.26 by verifying that I


satisfies the Back property.

Remark 4. Let S be any enumerable dense linear ordering without endpoints.


Then (by Theorem 25.26) S ≃ Q, where Q = (Q, <) is the enumerable dense
linear ordering having the set Q of the rational numbers as its domain. Now
consider again the structure R = (R, <) from Remark 2. We saw that there
is an enumerable structure S such that R ≡ S. But S is an enumerable
dense linear ordering without endpoints, and so it is isomorphic (and hence
elementarily equivalent) to the structure Q. By transitivity of elementary
equivalence, R ≡ Q. (We could have shown this directly by establishing R ≃p
Q by the same back-and-forth argument.)


Chapter 26

Models of Arithmetic


26.1 Introduction
The standard model of arithmetic is the structure N with |N| = N in which 0,
′, +, ×, and < are interpreted as you would expect. That is, 0 is 0, ′ is the
successor function, + is interpreted as addition and × as multiplication of the
numbers in N. Specifically,

0N = 0
′N (n) = n + 1
+N (n, m) = n + m
×N (n, m) = nm

Of course, there are structures for LA that have domains other than N. For
instance, we can take M with domain |M| = {a}∗ (the finite sequences of the
single symbol a, i.e., ∅, a, aa, aaa, . . . ), and interpretations

0M = ∅
′M (s) = s ⌢ a
+M (n, m) = an+m
×M (n, m) = anm

These two structures are “essentially the same” in the sense that the only
difference is the elements of the domains but not how the elements of the
domains are related among each other by the interpretation functions. We say
that the two structures are isomorphic.
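Explicitly (an illustration not spelled out in the text): the map g : N → {a}∗
sending n to the string of n a’s is an isomorphism between N and M, since
g(0) = ∅ = 0M , g(n + 1) = g(n) ⌢ a = ′M (g(n)), and g(n + m) = an+m =
+M (g(n), g(m)), and similarly for ×M .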
It is an easy consequence of the compactness theorem that any theory true
in N also has models that are not isomorphic to N. Such structures are called


non-standard. The interesting thing about them is that while the elements of a
standard model (i.e., N, but also all structures isomorphic to it) are exhausted
by the values of the standard numerals n, i.e.,
|N| = {ValN (n) : n ∈ N}
that isn’t the case in non-standard models: if M is non-standard, then there is
at least one x ∈ |M| such that x ̸= ValM (n) for all n.
These non-standard elements are pretty neat: they are “infinite natural
numbers.” But their existence also explains, in a sense, the incompleteness
phenomena. Consider an example, e.g., the consistency statement for Peano
arithmetic, ConPA , i.e., ¬∃x Prf PA (x, ⌜⊥⌝). Since PA neither proves ConPA
nor ¬ConPA , either can be consistently added to PA. Since PA is consis-
tent, N ⊨ ConPA , and consequently N ⊭ ¬ConPA . So N is not a model
of PA ∪ {¬ConPA }, and all its models must be nonstandard. Models of
PA ∪ {¬ConPA } must contain some element that serves as the witness that
makes ∃x Prf PA (x, ⌜⊥⌝) true, i.e., a Gödel number of a derivation of a contradic-
tion from PA. Such an element can’t be standard—since PA ⊢ ¬Prf PA (n, ⌜⊥⌝)
for every n.


26.2 Standard Models of Arithmetic


The language of arithmetic LA is obviously intended to be about numbers,
specifically, about natural numbers. So, “the” standard model N is special: it
is the model we want to talk about. But in logic, we are often just interested in
structural properties, and any two structures that are isomorphic share those.
So we can be a bit more liberal, and consider any structure that is isomorphic
to N “standard.”
Definition 26.1. A structure for LA is standard if it is isomorphic to N.

Proposition 26.2. If a structure M is standard, then its domain is the set
of values of the standard numerals, i.e.,
of values of the standard numerals, i.e.,
|M| = {ValM (n) : n ∈ N}

Proof. Clearly, every ValM (n) ∈ |M|. We just have to show that every x ∈ |M|
is equal to ValM (n) for some n. Since M is standard, it is isomorphic to N.
Suppose g : N → |M| is an isomorphism. Then g(n) = g(ValN (n)) = ValM (n).
But for every x ∈ |M|, there is an n ∈ N such that g(n) = x, since g is
surjective.

If a structure M for LA is standard, the elements of its domain can all be
named by the standard numerals 0, 1, 2, . . . , i.e., the terms 0, 0′ , 0′′ , etc. Of
course, this does not mean that the elements of |M| are the numbers, just that
we can pick them out the same way we can pick out the numbers in |N|.


Problem 26.1. Show that the converse of Proposition 26.2 is false, i.e., give
an example of a structure M with |M| = {ValM (n) : n ∈ N} that is not
isomorphic to N.

Proposition 26.3. If M ⊨ Q, and |M| = {ValM (n) : n ∈ N}, then M is
standard.

Proof. We have to show that M is isomorphic to N. Consider the function


g : N → |M| defined by g(n) = ValM (n). By the hypothesis, g is surjective. It
is also injective: Q ⊢ n ̸= m whenever n ̸= m. Thus, since M ⊨ Q, M ⊨ n ̸= m,
whenever n ̸= m. Thus, if n ̸= m, then ValM (n) ̸= ValM (m), i.e., g(n) ̸= g(m).
We also have to verify that g is an isomorphism.

1. We have g(0N ) = g(0) since, 0N = 0. By definition of g, g(0) = ValM (0).


But 0 is just 0, and the value of a term which happens to be a constant
symbol is given by what the structure assigns to that constant symbol,
i.e., ValM (0) = 0M . So we have g(0N ) = 0M as required.

2. g(′N (n)) = g(n + 1), since ′ in N is the successor function on N. Then,


g(n + 1) = ValM (n + 1) by definition of g. But n + 1 is the same term
as n′ , so ValM (n + 1) = ValM (n′ ). By the definition of the value func-
tion, this is = ′M (ValM (n)). Since ValM (n) = g(n) we get g(′N (n)) =
′M (g(n)).

3. g(+N (n, m)) = g(n + m), since + in N is the addition function on N.


Then, g(n + m) = ValM (n + m) by definition of g. But Q ⊢ n + m =
(n + m), so ValM (n + m) = ValM (n + m). By the definition of the value
function, this is = +M (ValM (n), ValM (m)). Since ValM (n) = g(n) and
ValM (m) = g(m), we get g(+N (n, m)) = +M (g(n), g(m)).

4. g(×N (n, m)) = ×M (g(n), g(m)): Exercise.

5. ⟨n, m⟩ ∈ <N iff n < m. If n < m, then Q ⊢ n < m, and also M ⊨ n < m.
Thus ⟨ValM (n), ValM (m)⟩ ∈ <M , i.e., ⟨g(n), g(m)⟩ ∈ <M . If n ̸< m,
then Q ⊢ ¬n < m, and consequently M ⊭ n < m. Thus, as before,
⟨g(n), g(m)⟩ ∉ <M . Together, we get: ⟨n, m⟩ ∈ <N iff ⟨g(n), g(m)⟩ ∈ <M .

The function g is the most obvious way of defining a mapping from N to the
domain of any other structure M for LA , since every such M contains elements
named by 0, 1, 2, etc. So it isn’t surprising that if M makes at least some
basic statements about the n’s true in the same way that N does, and g is
also bijective, then g turns out to be an isomorphism. In fact, if |M| contains no
elements other than what the n’s name, it’s the only one.

Proposition 26.4. If M is standard, then g from the proof of Proposition 26.3
is the only isomorphism from N to M.


Proof. Suppose h : N → |M| is an isomorphism between N and M. We show


that g = h by induction on n. If n = 0, then g(0) = 0M by definition of g. But
since h is an isomorphism, h(0) = h(0N ) = 0M , so g(0) = h(0).
Now consider the case for n + 1. We have

g(n + 1) = ValM (n + 1) by definition of g


= ValM (n′ ) since n + 1 ≡ n′
= ′M (ValM (n)) by definition of ValM (t′ )
= ′M (g(n)) by definition of g
= ′M (h(n)) by induction hypothesis
= h(′N (n)) since h is an isomorphism
= h(n + 1)

For any denumerable set M , there’s a bijection between N and M , so every
such set M is potentially the domain of a standard model M. In fact, once you
pick an object z ∈ M and a suitable function s as 0M and ′M , the interpreta-
tions of +, ×, and < are already fixed. Only functions s : M → M \ {z} that
are both injective and surjective are suitable in a standard model as ′M . The
range of s cannot contain z, since otherwise ∀x 0 ̸= x′ would be false. That
sentence is true in N, and so M also has to make it true. The function s has
to be injective, since the successor function ′N in N is, and that ′N is injective
is expressed by a sentence true in N. It has to be surjective because otherwise
there would be some x ∈ M \ {z} not in the range of s, i.e., the sentence
∀x (x = 0 ∨ ∃y y ′ = x) would be false in M—but it is true in N.


26.3 Non-Standard Models


We call a structure for LA standard if it is isomorphic to N. If a structure isn’t

isomorphic to N, it is called non-standard.

Definition 26.5. A structure M for LA is non-standard if it is not isomorphic


to N. The elements x ∈ |M| which are equal to ValM (n) for some n ∈ N are
called standard numbers (of M), and those not, non-standard numbers.

By Proposition 26.2, any standard structure for LA contains only stan-

dard elements. Consequently, a non-standard structure must contain at least


one non-standard element. In fact, the existence of a non-standard element
guarantees that the structure is non-standard.

Proposition 26.6. If a structure M for LA contains a non-standard number,


M is non-standard.


Proof. Suppose not, i.e., suppose M standard but contains a non-standard


number x. Let g : N → |M| be an isomorphism. It is easy to see (by induction
on n) that g(ValN (n)) = ValM (n). In other words, g maps standard numbers
of N to standard numbers of M. If M contains a non-standard number, g
cannot be surjective, contrary to hypothesis.

Problem 26.2. Recall that Q contains the axioms


∀x ∀y (x′ = y ′ → x = y) (Q1 )

∀x 0 ̸= x (Q2 )
∀x (x = 0 ∨ ∃y x = y ′ ) (Q3 )
Give structures M1 , M2 , M3 such that
1. M1 ⊨ Q1 , M1 ⊨ Q2 , M1 ⊭ Q3 ;
2. M2 ⊨ Q1 , M2 ⊭ Q2 , M2 ⊨ Q3 ; and
3. M3 ⊭ Q1 , M3 ⊨ Q2 , M3 ⊨ Q3 ;
Obviously, you just have to specify 0Mi and ′Mi for each.

It is easy enough to specify non-standard structures for LA . For instance,


take the structure with domain Z and interpret all non-logical symbols as usual.
Since negative numbers are not values of n for any n, this structure is non-
standard. Of course, it will not be a model of arithmetic in the sense that it
makes the same sentences true as N. For instance, ∀x x′ ̸= 0 is false. However,
we can prove that non-standard models of arithmetic exist easily enough, using
the compactness theorem.
Proposition 26.7. Let TA = {φ : N ⊨ φ} be the theory of N. TA has
an enumerable non-standard model.

Proof. Expand LA by a new constant symbol c and consider the set of sentences
Γ = TA ∪ {c ̸= 0, c ̸= 1, c ̸= 2, . . . }
Any model Mc of Γ would contain an element x = cMc which is non-standard,
since x ̸= ValM (n) for all n ∈ N. Also, obviously, Mc ⊨ TA, since TA ⊆ Γ . If
we turn Mc into a structure M for LA simply by forgetting about c, its domain
still contains the non-standard x, and also M ⊨ TA. The latter is guaranteed
since c does not occur in TA. So, it suffices to show that Γ has a model.
We use the compactness theorem to show that Γ has a model. If every
finite subset of Γ is satisfiable, so is Γ . Consider any finite subset Γ0 ⊆ Γ . Γ0
includes some sentences of TA and some of the form c ̸= n, but only finitely
many. Suppose k is the largest number so that c ̸= k ∈ Γ0 . Define Nk by
expanding N to include the interpretation cNk = k + 1. Nk ⊨ Γ0 : if φ ∈ TA,
Nk ⊨ φ since Nk is just like N in all respects except c, and c does not occur
in φ. And Nk ⊨ c ̸= n, since n ≤ k, and ValNk (c) = k + 1. Thus, every finite
subset of Γ is satisfiable.



26.4 Models of Q
We know that there are non-standard structures that make the same sentences
true as N does, i.e., that are models of TA. Since N ⊨ Q, any model of TA is also
true as N does, i.e., is a model of TA. Since N ⊨ Q, any model of TA is also


a model of Q. Q is much weaker than TA, e.g., Q ⊬ ∀x ∀y (x + y) = (y + x).
Weaker theories are easier to satisfy: they have more models. E.g., Q has
models which make ∀x ∀y (x + y) = (y + x) false, but those cannot also be
models of TA, or PA for that matter. Models of Q are also relatively simple:
we can specify them explicitly.

Example 26.8. Consider the structure K with domain |K| = N ∪ {a} and
interpretations

0K = 0

′K (x) = x + 1   if x ∈ N,
         a       if x = a

+K (x, y) = x + y   if x, y ∈ N,
            a       otherwise

×K (x, y) = xy   if x, y ∈ N,
            0    if x = 0 or y = 0,
            a    otherwise

<K = {⟨x, y⟩ : x, y ∈ N and x < y} ∪ {⟨x, a⟩ : x ∈ |K|}

To show that K ⊨ Q we have to verify that all axioms of Q are true in K.


For convenience, let’s write x∗ for ′K (x) (the “successor” of x in K), x ⊕ y for
+K (x, y) (the “sum” of x and y in K), x ⊗ y for ×K (x, y) (the “product” of x
and y in K), and x 4 y for ⟨x, y⟩ ∈ <K . With these abbreviations, we can give
the operations in K more perspicuously as

x   x∗
n   n + 1
a   a

x ⊕ y   0   m       a
0       0   m       a
n       n   n + m   a
a       a   a       a

x ⊗ y   0   m    a
0       0   0    0
n       0   nm   a
a       0   a    a

We have n 4 m iff n < m for n, m ∈ N and x 4 a for all x ∈ |K|.


K ⊨ ∀x ∀y (x′ = y ′ → x = y) since ∗ is injective. K ⊨ ∀x 0 ̸= x′ since 0 is
not a ∗-successor in K. K ⊨ ∀x (x = 0 ∨ ∃y x = y ′ ) since for every n > 0,
n = (n − 1)∗ , and a = a∗ .
K ⊨ ∀x (x + 0) = x since n ⊕ 0 = n + 0 = n, and a ⊕ 0 = a by definition
of ⊕. K ⊨ ∀x ∀y (x + y ′ ) = (x + y)′ is a bit trickier. If n, m are both standard,


we have:
(n ⊕ m∗ ) = (n + (m + 1)) = (n + m) + 1 = (n ⊕ m)∗

since ⊕ and ∗ agree with + and ′ on standard numbers. Now suppose x ∈ |K|.
Then

(x ⊕ a∗ ) = (x ⊕ a) = a = a∗ = (x ⊕ a)∗

The remaining case is if y ∈ |K| but x = a. Here we also have to distinguish
cases according to whether y = n is standard or y = a:

(a ⊕ n∗ ) = (a ⊕ (n + 1)) = a = a∗ = (a ⊕ n)∗
(a ⊕ a∗ ) = (a ⊕ a) = a = a∗ = (a ⊕ a)∗
This is of course a bit more detailed than needed. For instance, since a ⊕ z = a
whatever z is, we can immediately conclude a ⊕ a∗ = a. The remaining axioms
can be verified the same way.
K is thus a model of Q. Its “addition” ⊕ is also commutative. But there
are other sentences true in N but false in K, and vice versa. For instance, a 4 a,
so K ⊨ ∃x x < x and K ⊭ ∀x ¬x < x. This shows that Q ⊬ ∀x ¬x < x.
Problem 26.3. Prove that K from Example 26.8 satisfies the remaining ax-
ioms of Q,
∀x (x × 0) = 0 (Q6 )
∀x ∀y (x × y ′ ) = ((x × y) + x) (Q7 )

∀x ∀y (x < y ↔ ∃z (z + x) = y) (Q8 )
Find a sentence only involving ′ true in N but false in K.
Example 26.9. Consider the structure L with domain |L| = N ∪ {a, b} and
interpretations ′L = ∗, +L = ⊕ given by
interpretations ′L = ∗, +L = ⊕ given by ex:model-L-of-Q

x   x∗
n   n + 1
a   a
b   b

x ⊕ y   m       a   b
n       n + m   b   a
a       a       b   a
b       b       b   a
Since ∗ is injective, 0 is not in its range, and every x ∈ |L| other than 0 is,
axioms Q1 –Q3 are true in L. For any x, x ⊕ 0 = x, so Q4 is true as well. For
Q5 , consider x ⊕ y ∗ and (x ⊕ y)∗ . They are equal if x and y are both standard,
since then ∗ and ⊕ agree with ′ and +. If x is non-standard, and y is standard,
we have x ⊕ y ∗ = x = x∗ = (x ⊕ y)∗ . If x and y are both non-standard, we
have four cases:
a ⊕ a∗ = b = b∗ = (a ⊕ a)∗
b ⊕ b∗ = a = a∗ = (b ⊕ b)∗
b ⊕ a∗ = b = b∗ = (b ⊕ a)∗
a ⊕ b∗ = a = a∗ = (a ⊕ b)∗


If x is standard, but y is non-standard, we have

n ⊕ a∗ = n ⊕ a = b = b∗ = (n ⊕ a)∗
n ⊕ b∗ = n ⊕ b = a = a∗ = (n ⊕ b)∗

So, L ⊨ Q5 . However, a ⊕ 0 ̸= 0 ⊕ a, so L ⊭ ∀x ∀y (x + y) = (y + x).

Problem 26.4. Expand L of Example 26.9 to include ⊗ and 4 that inter-


pret × and <. Show that your structure satisfies the remaining axioms of Q,

∀x (x × 0) = 0 (Q6 )

∀x ∀y (x × y ′ ) = ((x × y) + x) (Q7 )

∀x ∀y (x < y ↔ ∃z (z + x) = y) (Q8 )

Problem 26.5. In L of Example 26.9, a∗ = a and b∗ = b. Is there a model


of Q in which a∗ = b and b∗ = a?

We’ve explicitly constructed models of Q in which the non-standard ele-
ments live “beyond” the standard elements. In fact, that much is required by
the axioms. A non-standard element x cannot satisfy x 4 0, since Q ⊢ ∀x ¬x < 0
(see Lemma 35.23). Also, for every n, Q ⊢ ∀x (x < n′ → (x = 0 ∨ x = 1 ∨ · · · ∨ x = n))
(Lemma 35.24), so we can’t have x 4 n for any n > 0.


26.5 Models of PA
Any non-standard model of TA is also one of PA. We know that non-standard
models of TA and hence of PA exist. We also know that such non-standard
models contain non-standard “numbers,” i.e., elements of the domain that are
“beyond” all the standard “numbers.” But how are they arranged? How many
are there? We’ve seen that models of the weaker theory Q can contain as few
as a single non-standard number. But these simple structures are not models
of PA or TA.
The key to understanding the structure of models of PA or TA is to see
what facts are derivable in these theories. For instance, already PA proves
that ∀x x ̸= x′ and ∀x ∀y (x + y) = (y + x), so this rules out simple structures
(in which these sentences are false) as models of PA.
Suppose M is a model of PA. Then if PA ⊢ φ, M ⊨ φ. Let’s again use z
for 0M , ∗ for ′M , ⊕ for +M , ⊗ for ×M , and 4 for <M . Any sentence φ then
states some condition about z, ∗, ⊕, ⊗, and 4, and if M ⊨ φ that condition
must be satisfied. For instance, if M ⊨ Q1 , i.e., M ⊨ ∀x ∀y (x′ = y ′ → x = y),
then ∗ must be injective.
Proposition 26.10. In M, 4 is a linear strict order, i.e., it satisfies:
1. Not x 4 x for any x ∈ |M|.


2. If x 4 y and y 4 z then x 4 z.

3. For any x ̸= y, x 4 y or y 4 x

Proof. PA proves:

1. ∀x ¬x < x

2. ∀x ∀y ∀z ((x < y ∧ y < z) → x < z)

3. ∀x ∀y (x < y ∨ y < x ∨ x = y)

Proposition 26.11. z is the least element of |M| in the 4-ordering. For any
x, x 4 x∗ , and x∗ is the 4-least element with that property. For any x other
than z, there is a unique y such that y ∗ = x. (We call y the “predecessor” of x
in M, and denote it by ∗ x.)

Proof. Exercise.

Problem 26.6. Find sentences in LA derivable in PA (and hence true in N)


which guarantee the properties of z, ∗, and 4 in Proposition 26.11

Proposition 26.12. All standard elements of M are less than (according to 4)


all non-standard elements.

Proof. We’ll use n as short for ValM (n), a standard element of M. Already
Q proves that, for any n ∈ N, ∀x (x < n′ → (x = 0 ∨ x = 1 ∨ · · · ∨ x = n)).
There are no elements that are 4z. So if n is standard and x is non-standard,
we cannot have x 4 n. By definition, a non-standard element is one that isn’t
ValM (n) for any n ∈ N, so x ̸= n as well. Since 4 is a linear order, we must
have n 4 x.

Proposition 26.13. Every nonstandard element x of |M| is an element of the
subset

. . . ∗∗∗x 4 ∗∗x 4 ∗x 4 x 4 x∗ 4 x∗∗ 4 x∗∗∗ 4 . . .

We call this subset the block of x and write it as [x]. It has no least and no
greatest element. It can be characterized as the set of those y ∈ |M| such that,
for some standard n, x ⊕ n = y or y ⊕ n = x.

Proof. Clearly, such a set [x] always exists since every element y of |M| has
a unique successor y ∗ and unique predecessor ∗ y. For successive elements y,
y ∗ we have y 4 y ∗ and y ∗ is the 4-least element of |M| such that y is 4-less
than it. Since always ∗ y 4 y and y 4 y ∗ , [x] has no least or greatest element. If
y ∈ [x] then x ∈ [y], for then either y ∗...∗ = x or x∗...∗ = y. If y ∗...∗ = x (with
n ∗’s), then y ⊕ n = x and conversely, since PA ⊢ ∀x x′...′ = (x + n) (if n is the
number of ′’s).


Proposition 26.14. If [x] ̸= [y] and x 4 y, then for any u ∈ [x] and any
v ∈ [y], u 4 v.

Proof. Note that PA ⊢ ∀x ∀y (x < y → (x′ < y ∨ x′ = y)). Thus, if u 4 v, we


also have u ⊕ n∗ 4 v for any n if [u] ̸= [v].
Any u ∈ [x] is 4y: x 4 y by assumption. If u 4 x, u 4 y by transitivity. And
if x 4 u but u ∈ [x], we have u = x ⊕ n∗ for some n, and so u 4 y by the fact
just proved.
Now suppose that v ∈ [y] is 4y, i.e., v ⊕ m∗ = y for some standard m.
This rules out v 4 x, otherwise y = v ⊕ m∗ 4 x. Clearly also, x ̸= v, otherwise
x ⊕ m∗ = v ⊕ m∗ = y and we would have [x] = [y]. So, x 4 v. But then also
x ⊕ n∗ 4 v for any n. Hence, if x 4 u and u ∈ [x], we have u 4 v. If u 4 x then
u 4 v by transitivity.
Lastly, if y 4 v, u 4 v since, as we’ve shown, u 4 y and y 4 v.

Corollary 26.15. If [x] ̸= [y], [x] ∩ [y] = ∅.

Proof. Suppose z ∈ [x] and x 4 y. Then z 4 u for all u ∈ [y]. If z ∈ [y], we


would have z 4 z. Similarly if y 4 x.

This means that the blocks themselves can be ordered in a way that respects

4: [x]4[y] iff x4y, or, equivalently, if u4v for any u ∈ [x] and v ∈ [y]. Clearly,
the standard block [0] is the least block. It intersects with no non-standard
block, and no two non-standard blocks intersect either. Specifically, you cannot
“reach” a different block by taking repeated successors or predecessors.

Proposition 26.16. If x and y are non-standard, then x 4 x ⊕ y and x ⊕ y ∉ [x].

Proof. If y is nonstandard, then y ̸= z. PA ⊢ ∀x ∀y (y ̸= 0 → x < (x + y)).
Now suppose x ⊕ y ∈ [x]. Since x 4 x ⊕ y, we would have x ⊕ n∗ = x ⊕ y.
But PA ⊢ ∀x ∀y ∀z ((x + y) = (x + z) → y = z) (the cancellation law for
addition). This would mean y = n∗ for some standard n; but y is assumed to
be non-standard.

Proposition 26.17. There is no least non-standard block.

Proof. PA ⊢ ∀x ∃y ((y + y) = x ∨ (y + y)′ = x), i.e., that every x is divisible by 2
(possibly with remainder 1). If x is non-standard, so is y. By the preceding
proposition, y 4 y ⊕ y and y ⊕ y ∉ [y]. Then also y 4 (y ⊕ y)∗ and (y ⊕ y)∗ ∉ [y].
But x = y ⊕ y or x = (y ⊕ y)∗ , so y 4 x and y ∉ [x].

Proposition 26.18. There is no largest block.

Proof. Exercise.

Problem 26.7. Show that in a non-standard model of PA, there is no largest


block.


Proposition 26.19. The ordering of the blocks is dense. That is, if x 4 y
and [x] ̸= [y], then there is a block [z] distinct from both that is between them.

Proof. Suppose x 4 y. As before, x ⊕ y is divisible by two (possibly with


remainder): there is a z ∈ |M| such that either x⊕y = z ⊕z or x⊕y = (z ⊕z)∗ .
The element z is the “average” of x and y, and x 4 z and z 4 y.

Problem 26.8. Write out a detailed proof of Proposition 26.19. Which sen-
tence must PA derive in order to guarantee the existence of z? Why is x 4 z
and z 4 y, and why is [x] ̸= [z] and [z] ̸= [y]?

The non-standard blocks are therefore ordered like the rationals: they form
a denumerable dense linear ordering without endpoints. One can show that
any two such denumerable orderings are isomorphic. It follows that for any
two enumerable non-standard models M1 and M2 of true arithmetic, their
reducts to the language containing < and = only are isomorphic. Indeed, an
isomorphism h can be defined as follows: the standard parts of M1 and M2
are isomorphic to the standard model N and hence to each other. The blocks
making up the non-standard part are themselves ordered like the rationals
and therefore isomorphic; an isomorphism of the blocks can be extended to
an isomorphism within the blocks by matching up arbitrary elements in each,
and then taking the image of the successor of x in M1 to be the successor
of the image of x in M2 . Note that it does not follow that M1 and M2 are
isomorphic in the full language of arithmetic (indeed, isomorphism is always
relative to a language), as there are non-isomorphic ways to define addition and
multiplication over |M1 | and |M2 |. (This also follows from a famous theorem
due to Vaught that the number of countable models of a complete theory cannot
be 2.)


26.6 Computable Models of Arithmetic


The standard model N has two nice features. Its domain is the natural num-
bers N, i.e., its elements are just the kinds of things we want to talk about using
the language of arithmetic, and the standard numeral n actually picks out n.
The other nice feature is that the interpretations of the non-logical symbols
of LA are all computable. The successor, addition, and multiplication functions
which serve as ′N , +N , and ×N are computable functions of numbers. (Com-
putable by Turing machines, or definable by primitive recursion, say.) And the
less-than relation on N, i.e., <N , is decidable.
Non-standard models of arithmetical theories such as Q and PA must con-
tain non-standard elements. Thus their domains typically include elements in
addition to N. However, any countable structure can be built on any denumer-
able set, including N. So there are also non-standard models with domain N.


In such models M, of course, at least some numbers cannot play the roles they
usually play, since some k must be different from ValM (n) for all n ∈ N.
Definition 26.20. A structure M for LA is computable iff |M| = N and ′M ,
+M , ×M are computable functions and <M is a decidable relation.

Example 26.21. Recall the structure K from Example 26.8. Its domain was
|K| = N ∪ {a} and interpretations

0K = 0

′K (x) = x + 1   if x ∈ N,
         a       if x = a

+K (x, y) = x + y   if x, y ∈ N,
            a       otherwise

×K (x, y) = xy   if x, y ∈ N,
            0    if x = 0 or y = 0,
            a    otherwise

<K = {⟨x, y⟩ : x, y ∈ N and x < y} ∪ {⟨x, a⟩ : x ∈ |K|}

But |K| is denumerable and so is equinumerous with N. For instance, g : N →
|K| with g(0) = a and g(n) = n − 1 for n > 0 is a bijection. We can turn it into
an isomorphism between a new model K′ of Q and K. In K′ , we have to assign
different functions and relations to the symbols of LA , since different elements
of N play the roles of standard and non-standard numbers.
Specifically, 0 now plays the role of a, not of the smallest standard number.
The smallest standard number is now 1. So we assign 0K′ = 1. The successor
function is also different now: given a standard number, i.e., an n > 0, it still
returns n + 1. But 0 now plays the role of a, which is its own successor. So
′K′ (0) = 0. For addition and multiplication we likewise have

+K′ (x, y) = x + y − 1   if x, y > 0,
             0           otherwise

×K′ (x, y) = 1                if x = 1 or y = 1,
             xy − x − y + 2   if x, y > 1,
             0                otherwise

And we have ⟨x, y⟩ ∈ <K′ iff x < y and x > 0 and y > 0, or if y = 0.
All of these functions are computable functions of natural numbers and <K′
is a decidable relation on N—but they are not the same functions as successor,
addition, and multiplication on N, and <K′ is not the same relation as < on N.
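To make the computability claim concrete, here is a small sketch (not part of
the text) of the interpretations of K′ as Python functions on N; the names
succ, add, mult, and less are ours, not the text’s.

    # Interpretations of the non-standard model K' as computable
    # functions on N: 0 plays the role of the non-standard element a,
    # and the standard number n is represented by n + 1.

    def succ(x: int) -> int:
        # 0 (playing the role of a) is its own successor
        return x + 1 if x > 0 else 0

    def add(x: int, y: int) -> int:
        return x + y - 1 if x > 0 and y > 0 else 0

    def mult(x: int, y: int) -> int:
        if x == 1 or y == 1:              # 1 represents the standard 0
            return 1
        if x > 1 and y > 1:
            return x * y - x - y + 2      # represents (x - 1)(y - 1)
        return 0                          # an argument plays the role of a

    def less(x: int, y: int) -> bool:
        return y == 0 or (0 < x < y)

Each function is plainly computable, as Definition 26.20 requires.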

Problem 26.9. Give a structure L′ with |L′ | = N isomorphic to L of Exam-


ple 26.9.

Example 26.21 shows that Q has computable non-standard models with
domain N. However, the following result shows that this is not true for models
of PA (and thus also for models of TA).
Theorem 26.22 (Tennenbaum’s Theorem). N is the only computable model
of PA.

Chapter 27

The Interpolation Theorem


27.1 Introduction
The interpolation theorem is the following result: Suppose ⊨ φ → ψ. Then
there is a sentence χ such that ⊨ φ → χ and ⊨ χ → ψ. Moreover, every constant
symbol, function symbol, and predicate symbol (other than =) in χ occurs
both in φ and ψ. The sentence χ is called an interpolant of φ and ψ.
The interpolation theorem is interesting in its own right, but its main im-
portance lies in the fact that it can be used to prove results about definability
in a theory, and the conditions under which combining two consistent theories
results in a consistent theory. The first result is known as the Beth definability
theorem; the second, Robinson’s joint consistency theorem.


27.2 Separation of Sentences


A bit of groundwork is needed before we can proceed with the proof of the
interpolation theorem. An interpolant for φ and ψ is a sentence χ such that
φ ⊨ χ and χ ⊨ ψ. By contraposition, the latter is true iff ¬ψ ⊨ ¬χ. A sentence χ
with this property is said to separate φ and ¬ψ. So finding an interpolant for
φ and ψ amounts to finding a sentence that separates φ and ¬ψ. As so often,
it will be useful to consider a generalization: a sentence that separates two sets
of sentences.


Definition 27.1. A sentence χ separates sets of sentences Γ and ∆ if and only


if Γ ⊨ χ and ∆ ⊨ ¬χ. If no such sentence exists, then Γ and ∆ are inseparable.

The inclusion relations between the classes of models of Γ , ∆, and χ are
represented in Figure 27.1.

[Figure 27.1: χ separates Γ and ∆. The models of Γ are included among the
models of χ, and the models of ∆ among the models of ¬χ.]

Lemma 27.2. Suppose L0 is the language containing every constant symbol,
function symbol and predicate symbol (other than =) that occurs in both Γ
and ∆, and let L′0 be obtained by the addition of infinitely many new constant
symbols cn for n ≥ 0. Then if Γ and ∆ are inseparable in L0 , they are also
inseparable in L′0 .

Proof. We proceed indirectly: suppose by way of contradiction that Γ and ∆


are separated in L′0 . Then Γ ⊨ χ[c/x] and ∆ ⊨ ¬χ[c/x] for some χ ∈ L0 (where
c is a new constant symbol—the case where χ contains more than one such new
constant symbol is similar). By compactness, there are finite subsets Γ0 of Γ
and ∆0 of ∆ such that Γ0 ⊨ χ[c/x] and ∆0 ⊨ ¬χ[c/x]. Let γ be the conjunction
of all formulas in Γ0 and δ the conjunction of all formulas in ∆0 . Then
γ ⊨ χ[c/x], δ ⊨ ¬χ[c/x].
From the former, by Generalization, we have γ ⊨ ∀x χ, and from the latter
by contraposition, χ[c/x] ⊨ ¬δ, whence also ∀x χ ⊨ ¬δ. Contraposition again
gives δ ⊨ ¬∀x χ. By monotonicity,
Γ ⊨ ∀x χ, ∆ ⊨ ¬∀x χ,
so that ∀x χ separates Γ and ∆ in L0 .

Lemma 27.3. Suppose that Γ ∪ {∃x σ} and ∆ are inseparable, and c is a new
constant symbol not in Γ , ∆, or σ. Then Γ ∪ {∃x σ, σ[c/x]} and ∆ are also
inseparable.

Proof. Suppose for contradiction that χ separates Γ ∪ {∃x σ, σ[c/x]} and ∆,


while at the same time Γ ∪ {∃xσ} and ∆ are inseparable. We distinguish two
cases:
1. c does not occur in χ: in this case Γ ∪ {∃x σ, ¬χ} is satisfiable (otherwise
χ separates Γ ∪ {∃x σ} and ∆). It remains so if σ[c/x] is added, so χ
does not separate Γ ∪ {∃x σ, σ[c/x]} and ∆ after all.


2. c does occur in χ so that χ has the form χ[c/x]. Then we have that
Γ ∪ {∃x σ, σ[c/x]} ⊨ χ[c/x],
whence Γ, ∃x σ ⊨ ∀x (σ → χ) by the Deduction Theorem and Generaliza-
tion, and finally Γ ∪ {∃x σ} ⊨ ∃x χ. On the other hand, ∆ ⊨ ¬χ[c/x] and
hence by Generalization ∆ ⊨ ¬∃x χ. So Γ ∪ {∃x σ} and ∆ are separable,
a contradiction.


27.3 Craig’s Interpolation Theorem


Theorem 27.4 (Craig’s Interpolation Theorem). If ⊨ φ → ψ, then there
is a sentence χ such that ⊨ φ → χ and ⊨ χ → ψ, and every constant symbol,

function symbol, and predicate symbol (other than =) in χ occurs both in φ


and ψ. The sentence χ is called an interpolant of φ and ψ.

Proof. Suppose L1 is the language of φ and L2 is the language of ψ. Let


L0 = L1 ∩ L2 . For each i ∈ {0, 1, 2}, let L′i be obtained from Li by adding the
infinitely many new constant symbols c0 , c1 , c2 , . . . .
If φ is unsatisfiable, ∃x x ̸= x is an interpolant. If ¬ψ is unsatisfiable (and
hence ψ is valid), ∃x x = x is an interpolant. So we may assume also that both
φ and ¬ψ are satisfiable.
In order to prove the contrapositive of the Interpolation Theorem, assume
that there is no interpolant for φ and ψ. In other words, assume that {φ} and
{¬ψ} are inseparable in L0 .
Our goal is to extend the pair ({φ}, {¬ψ}) to a maximally inseparable pair
(Γ ∗ , ∆∗ ). Let φ0 , φ1 , φ2 , . . . enumerate the sentences of L1 , and ψ0 , ψ1 , ψ2 ,
. . . enumerate the sentences of L2 . We define two increasing sequences of sets
of sentences (Γn , ∆n ), for n ≥ 0, as follows. Put Γ0 = {φ} and ∆0 = {¬ψ}.
Assuming (Γn , ∆n ) are already defined, define Γn+1 and ∆n+1 by:
1. If Γn ∪ {φn } and ∆n are inseparable in L′0 , put φn in Γn+1 . Moreover, if
φn is an existential formula ∃x σ then pick a new constant symbol c not
occurring in Γn , ∆n , φn or ψn , and put σ[c/x] in Γn+1 .
2. If Γn+1 and ∆n ∪ {ψn } are inseparable in L′0 , put ψn in ∆n+1 . Moreover,
if ψn is an existential formula ∃x σ, then pick a new constant symbol c
not occurring in Γn+1 , ∆n , φn or ψn , and put σ[c/x] in ∆n+1 .
Finally, define:
[ [
Γ∗ = Γn , ∆∗ = ∆n .
n≥0 n≥0

By simultaneous induction on n we can now prove:

388 Release : 6891b66 (2024-12-01)


27.3. CRAIG’S INTERPOLATION THEOREM

1. Γn and ∆n are inseparable in L′0 ;

2. Γn+1 and ∆n are inseparable in L′0 .

The basis for (1) is given by Lemma 27.2. For part (2), we need to distinguish
three cases:

1. If Γ0 ∪ {φ0 } and ∆0 are separable, then Γ1 = Γ0 and (2) is just (1);

2. If Γ1 = Γ0 ∪ {φ0 }, then Γ1 and ∆0 are inseparable by construction.

3. It remains to consider the case where φ0 is existential, so that Γ1 =


Γ0 ∪ {∃x σ, σ[c/x]}. By construction, Γ0 ∪ {∃x σ} and ∆0 are inseparable,
so that by Lemma 27.3 also Γ0 ∪ {∃x σ, σ[c/x]} and ∆0 are inseparable.

This completes the basis of the induction for (1) and (2) above. Now for
the inductive step. For (1), if ∆n+1 = ∆n ∪ {ψn } then Γn+1 and ∆n+1 are
inseparable by construction (even when ψn is existential, by Lemma 27.3);
if ∆n+1 = ∆n (because Γn+1 and ∆n ∪ {ψn } are separable), then we use
the induction hypothesis on (2). For the inductive step for (2), if Γn+2 =
Γn+1 ∪ {φn+1 } then Γn+2 and ∆n+1 are inseparable by construction (even
when φn+1 is existential, by Lemma 27.3); and if Γn+2 = Γn+1 then we use
the inductive case for (1) just proved. This concludes the induction on (1) and
(2).
It follows that Γ ∗ and ∆∗ are inseparable; if not, by compactness, some
sentence would separate Γn and ∆n for some n ≥ 0, against (1). In particular,
Γ ∗ and ∆∗ are consistent: for if the former or the latter is inconsistent, then
they are separated by ∃x x ̸= x or ∀x x = x, respectively.
We now show that Γ ∗ is maximally consistent in L′1 and likewise ∆∗ in
L′2 . For the former, suppose that φn ∉ Γ ∗ and ¬φn ∉ Γ ∗ , for some n ≥ 0. If
φn ∉ Γ ∗ then Γn ∪ {φn } is separable from ∆n , and so there is χ ∈ L′0 such
that both:

Γ ∗ ⊨ φn → χ, ∆∗ ⊨ ¬χ.

Likewise, if ¬φn ∉ Γ ∗ , there is χ′ ∈ L′0 such that both:

Γ ∗ ⊨ ¬φn → χ′ , ∆∗ ⊨ ¬χ′ .

By propositional logic, Γ ∗ ⊨ χ ∨ χ′ and ∆∗ ⊨ ¬(χ ∨ χ′ ), so χ ∨ χ′ separates


Γ ∗ and ∆∗ . A similar argument establishes that ∆∗ is maximal.
Finally, we show that Γ ∗ ∩ ∆∗ is maximally consistent in L′0 . It is obviously
consistent, since it is the intersection of consistent sets. To show maximality,
let σ ∈ L′0 . Now, Γ ∗ is maximal in L′1 ⊇ L′0 , and similarly ∆∗ is maximal in
L′2 ⊇ L′0 . It follows that either σ ∈ Γ ∗ or ¬σ ∈ Γ ∗ , and either σ ∈ ∆∗ or
¬σ ∈ ∆∗ . If σ ∈ Γ ∗ and ¬σ ∈ ∆∗ then σ would separate Γ ∗ and ∆∗ ; and if
¬σ ∈ Γ ∗ and σ ∈ ∆∗ then Γ ∗ and ∆∗ would be separated by ¬σ. Hence, either
σ ∈ Γ ∗ ∩ ∆∗ or ¬σ ∈ Γ ∗ ∩ ∆∗ , and Γ ∗ ∩ ∆∗ is maximal.


Since Γ ∗ is maximally consistent, it has a model M′1 whose domain |M′1 |
comprises all and only the elements cM′1 interpreting the constant symbols—
just like in the proof of the completeness theorem (Theorem 23.20). Similarly,
∆∗ has a model M′2 whose domain |M′2 | is given by the interpretations cM′2 of
the constant symbols.
Let M1 be obtained from M′1 by dropping interpretations for constant sym-
bols, function symbols, and predicate symbols in L′1 \ L′0 , and similarly for M2 .
Then the map h : M1 → M2 defined by h(cM′1 ) = cM′2 is an isomorphism in
L′0 , because Γ ∗ ∩ ∆∗ is maximally consistent in L′0 , as shown. This follows
because any L′0 -sentence either belongs to both Γ ∗ and ∆∗ , or to neither: so
cM′1 ∈ P M′1 if and only if P (c) ∈ Γ ∗ if and only if P (c) ∈ ∆∗ if and only if
cM′2 ∈ P M′2 . The other conditions satisfied by isomorphisms can be established
similarly.
Let us now define a model M for the language L1 ∪ L2 as follows:

1. The domain |M| is just |M2 |, i.e., the set of all elements cM′2 ;

2. If a predicate symbol P is in L2 \ L1 then P M = P M′2 ;

3. If a predicate symbol P is in L1 \ L2 then P M = h(P M′1 ), i.e., ⟨c1M′2 , . . . , cnM′2 ⟩ ∈
P M if and only if ⟨c1M′1 , . . . , cnM′1 ⟩ ∈ P M′1 ;

4. If a predicate symbol P is in L0 then P M = P M′2 = h(P M′1 );

5. Function symbols of L1 ∪ L2 , including constant symbols, are handled
similarly.
Finally, one shows by induction on formulas that M agrees with M′1 on all
formulas of L′1 and with M′2 on all formulas of L′2 . In particular, M ⊨ Γ ∗ ∪ ∆∗ ,
whence M ⊨ φ and M ⊨ ¬ψ, and ̸⊨ φ → ψ. This concludes the proof of Craig’s
Interpolation Theorem.


27.4 The Definability Theorem


One important application of the interpolation theorem is Beth’s definability
theorem. To define an n-place relation R we can give a formula χ with n free
variables which does not involve R. This would be an explicit definition of R in
terms of χ. We can then say also that a theory Σ(P ) in a language containing
the n-place predicate symbol P explicitly defines P if it contains (or at least
entails) a formalized explicit definition, i.e.,

Σ(P ) ⊨ ∀x1 . . . ∀xn (P (x1 , . . . , xn ) ↔ χ(x1 , . . . , xn )).

But an explicit definition is only one way of defining—in the sense of determin-
ing completely—a relation. A theory may also be such that the interpretation


of P is fixed by the interpretation of the rest of the language in any model. The
definability theorem states that whenever a theory fixes the interpretation of P
in this way—whenever it implicitly defines P —then it also explicitly defines it.
Definition 27.5. Suppose L is a language not containing the predicate sym-
bol P . A set Σ(P ) of sentences of L ∪ {P } explicitly defines P if and only if
there is a formula χ(x1 , . . . , xn ) of L such that

Σ(P ) ⊨ ∀x1 . . . ∀xn (P (x1 , . . . , xn ) ↔ χ(x1 , . . . , xn )).
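For instance (a standard illustration, not in the text): in the language of
arithmetic, Σ(P ) = {∀x (P (x) ↔ ∃y (y + y) = x)} explicitly defines P as the
evenness predicate, with χ(x) the formula ∃y (y + y) = x.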

Definition 27.6. Suppose L is a language not containing the predicate sym-


bols P and P ′ . A set Σ(P ) of sentences of L ∪ {P } implicitly defines P if and
only if

Σ(P ) ∪ Σ(P ′ ) ⊨ ∀x1 . . . ∀xn (P (x1 , . . . , xn ) ↔ P ′ (x1 , . . . , xn )),

where Σ(P ′ ) is the result of uniformly replacing P with P ′ in Σ(P ).


In other words, for any model M and R, R′ ⊆ |M|n , if both (M, R) ⊨ Σ(P )
and (M, R′ ) ⊨ Σ(P ′ ), then R = R′ ; where (M, R) is the structure M′ for the
expansion of L to L ∪ {P } such that P M′ = R, and similarly for (M, R′ ).
Theorem 27.7 (Beth Definability Theorem). A set Σ(P ) of L ∪ {P }-
formulas implicitly defines P if and only if Σ(P ) explicitly defines P .

Proof. If Σ(P ) explicitly defines P then both

Σ(P ) ⊨ ∀x1 . . . ∀xn (P (x1 , . . . , xn ) ↔ χ(x1 , . . . , xn ))



Σ(P ′ ) ⊨ ∀x1 . . . ∀xn (P ′ (x1 , . . . , xn ) ↔ χ(x1 , . . . , xn ))

and the conclusion follows. For the converse: assume that Σ(P ) implicitly
defines P . First, we add constant symbols c1 , . . . , cn to L. Then

Σ(P ) ∪ Σ(P ′ ) ⊨ P (c1 , . . . , cn ) → P ′ (c1 , . . . , cn ).

By compactness, there are finite sets ∆0 ⊆ Σ(P ) and ∆1 ⊆ Σ(P ′ ) such that

∆0 ∪ ∆1 ⊨ P (c1 , . . . , cn ) → P ′ (c1 , . . . , cn ).

Let θ(P ) be the conjunction of all sentences φ(P ) such that either φ(P ) ∈ ∆0
or φ(P ′ ) ∈ ∆1 and let θ(P ′ ) be the conjunction of all sentences φ(P ′ ) such
that either φ(P ) ∈ ∆0 or φ(P ′ ) ∈ ∆1 . Then θ(P ) ∧ θ(P ′ ) ⊨ P (c1 , . . . , cn ) →
P ′ (c1 , . . . , cn ). We can re-arrange this so that each predicate symbol occurs on
one side of ⊨:

θ(P ) ∧ P (c1 , . . . , cn ) ⊨ θ(P ′ ) → P ′ (c1 , . . . , cn ).

By Craig’s Interpolation Theorem there is a sentence χ(c1 , . . . , cn ) not contain-


ing P or P ′ such that:

θ(P ) ∧ P (c1 , . . . , cn ) ⊨ χ(c1 , . . . , cn ); χ(c1 , . . . , cn ) ⊨ θ(P ′ ) → P ′ (c1 , . . . , cn ).



From the former of these two entailments we have: θ(P ) ⊨ P (c1 , . . . , cn ) →
χ(c1 , . . . , cn ). And from the latter, since an L ∪ {P }-model (M, R) ⊨ φ(P )
if and only if the corresponding L ∪ {P ′ }-model (M, R) ⊨ φ(P ′ ), we have
χ(c1 , . . . , cn ) ⊨ θ(P ) → P (c1 , . . . , cn ), from which:

θ(P ) ⊨ χ(c1 , . . . , cn ) → P (c1 , . . . , cn ).

Putting the two together, θ(P ) ⊨ P (c1 , . . . , cn ) ↔ χ(c1 , . . . , cn ), and by mono-


tonicity and generalization also

Σ(P ) ⊨ ∀x1 . . . ∀xn (P (x1 , . . . , xn ) ↔ χ(x1 , . . . , xn )).

Chapter 28

Lindström’s Theorem


28.1 Introduction
In this chapter we aim to prove Lindström’s characterization of first-order logic
as the maximal logic for which (given certain further constraints) the Compact-
ness and the Downward Löwenheim–Skolem theorems hold (Theorem 23.23 and
Theorem 23.32). First, we need a more general characterization of the general
class of logics to which the theorem applies. We will restrict ourselves to re-
lational languages, i.e., languages which only contain predicate symbols and
individual constants, but no function symbols.


28.2 Abstract Logics


Definition 28.1. An abstract logic is a pair ⟨L, |=L ⟩, where L is a function that
assigns to each language L a set L(L) of sentences, and |=L is a relation between
structures for the language L and elements of L(L). In particular, ⟨F, |=⟩ is


ordinary first-order logic, i.e., F is the function assigning to the language L


the set of first-order sentences built from the constants in L, and |= is the
satisfaction relation of first-order logic.

Notice that we are still employing the same notion of structure for a given
language as for first-order logic, but we do not presuppose that sentences are
built up from the basic symbols in L in the usual way, nor that the rela-
tion |=L is recursively defined in the same way as for first-order logic. So for
instance the definition, being completely general, is intended to capture the
case where sentences in ⟨L, |=L ⟩ contain infinitely long conjunctions or disjunc-
tions, or quantifiers other than ∃ and ∀ (e.g., “there are infinitely many x such
that . . . ”), or perhaps infinitely long quantifier prefixes. To emphasize that
“sentences” in L(L) need not be ordinary sentences of first-order logic, in this
chapter we use variables α, β, . . . to range over them, and reserve φ, ψ, . . . for
ordinary first-order formulas.
Definition 28.2. Let ModL (α) denote the class {M : M |=L α}. If the lan-
guage needs to be made explicit, we write ModL L (α). Two structures M and
N for L are elementarily equivalent in ⟨L, |=L ⟩, written M ≡L N, if the same
sentences from L(L) are true in each.

Definition 28.3. An abstract logic ⟨L, |=L ⟩ for the language L is normal if it
satisfies the following properties:
1. (L-Monotonicity) For languages L and L′ , if L ⊆ L′ , then L(L) ⊆ L(L′ ).
2. (Expansion Property) For each α ∈ L(L) there is a finite subset L′ of L
such that the relation M |=L α depends only on the reduct of M to L′ ;
i.e., if M and N have the same reduct to L′ then M |=L α if and only if
N |=L α.
3. (Isomorphism Property) If M |=L α and M ≃ N then also N |=L α.
4. (Renaming Property) The relation |=L is preserved under renaming: if the
language L′ is obtained from L by replacing each symbol P by a symbol
P ′ of the same arity and each constant c by a distinct constant c′ , then
for each structure M and sentence α, M |=L α if and only if M′ |=L α′ ,
where M′ is the L′ -structure corresponding to M and α′ ∈ L(L′ ) is the
renamed sentence.
5. (Boolean Property) The abstract logic ⟨L, |=L ⟩ is closed under the Boolean
connectives in the sense that for each α ∈ L(L) there is a β ∈ L(L) such
that M |=L β if and only if M ̸|=L α, and for each α and β there is a γ
such that ModL (γ) = ModL (α)∩ModL (β). Similarly for atomic formulas
and the other connectives.
6. (Quantifier Property) For each constant c in L and α ∈ L(L) there is a
β ∈ L(L′ ) such that

Mod^{L′}_L (β) = {M : (M, a) ∈ Mod^L_L (α) for some a ∈ |M|},


where L′ = L \ {c} and (M, a) is the expansion of M to L assigning a


to c.

7. (Relativization Property) Given a sentence α ∈ L(L) and symbols R, c1 ,


. . . , cn not in L, there is a sentence β ∈ L(L ∪ {R, c1 , . . . , cn }) called the
relativization of α to R(x, c1 , . . . , cn ), such that for each structure M:

(M, X, b1 , . . . , bn ) |=L β if and only if N |=L α,

where N is the substructure of M with domain |N| = {a ∈ |M| :


RM (a, b1 , . . . , bn )} (see Remark 1), and (M, X, b1 , . . . , bn ) is the expan-
sion of M interpreting R, c1 , . . . , cn by X, b1 , . . . , bn , respectively (with
X ⊆ |M|^{n+1} ).

Definition 28.4. Given two abstract logics ⟨L1 , |=L1 ⟩ and ⟨L2 , |=L2 ⟩ we say
that the latter is at least as expressive as the former, written ⟨L1 , |=L1 ⟩ ≤
⟨L2 , |=L2 ⟩, if for each language L and sentence α ∈ L1 (L) there is a sentence β ∈
L2 (L) such that Mod^L_{L1} (α) = Mod^L_{L2} (β). The logics ⟨L1 , |=L1 ⟩ and ⟨L2 , |=L2 ⟩
are equivalent if ⟨L1 , |=L1 ⟩ ≤ ⟨L2 , |=L2 ⟩ and ⟨L2 , |=L2 ⟩ ≤ ⟨L1 , |=L1 ⟩.

Remark 5. First-order logic, i.e., the abstract logic ⟨F, |=⟩, is normal. In fact,
the above properties are mostly straightforward for first-order logic. We just
remark that the expansion property comes down to extensionality, and that
the relativization of a sentence α to R(x, c1 , . . . , cn ) is obtained by replacing
each subformula ∀x β by ∀x (R(x, c1 , . . . , cn ) → β). Moreover, if ⟨L, |=L ⟩ is
normal, then ⟨F, |=⟩ ≤ ⟨L, |=L ⟩, as can be shown by induction on first-
order formulas. Accordingly, with no loss in generality, we can assume that
every first-order sentence belongs to every normal logic.
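
For instance, since ∃y β can be written as ¬∀y ¬β, relativizing the sentence
∀x ∃y Q(x, y) to R(x, c) yields a sentence equivalent to
∀x (R(x, c) → ∃y (R(y, c) ∧ Q(x, y))): every quantifier is restricted to the
elements a with RM (a, cM ).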


28.3 Compactness and Löwenheim–Skolem Properties


We now give the obvious extensions of compactness and Löwenheim–Skolem
to the case of abstract logics.

Definition 28.5. An abstract logic ⟨L, |=L ⟩ has the Compactness Property
if each set Γ of L(L)-sentences is satisfiable whenever each finite Γ0 ⊆ Γ is
satisfiable.

Definition 28.6. ⟨L, |=L ⟩ has the Downward Löwenheim–Skolem property if


any satisfiable Γ has an enumerable model.

The notion of partial isomorphism from Definition 25.15 is purely “alge-


braic” (i.e., given without reference to the sentences of the language but only
to the constants provided by the language L of the structures), and hence it


applies to the case of abstract logics. In case of first-order logic, we know from
Theorem 25.17 that if two structures are partially isomorphic then they are
elementarily equivalent. That proof does not carry over to abstract logics, for
induction on formulas need not be available for arbitrary α ∈ L(L), but the
theorem is true nonetheless, provided the Löwenheim–Skolem property holds.
Theorem 28.7. Suppose ⟨L, |=L ⟩ is a normal logic with the Löwenheim–
Skolem property. Then any two structures that are partially isomorphic are
elementarily equivalent in ⟨L, |=L ⟩.

Proof. Suppose M ≃p N, but for some α also M |=L α while N ̸|=L α. By the
Isomorphism Property we can assume that |M| and |N| are disjoint, and by
the Expansion Property we can assume that α ∈ L(L) for a finite language L.
Let I be a set of partial isomorphisms between M and N, and with no loss of
generality also assume that if p ∈ I and q ⊆ p then also q ∈ I.

|M|^{<ω} is the set of finite sequences of elements of |M|. Let S be the ternary
relation over |M|^{<ω} representing concatenation, i.e., if a, b, c ∈ |M|^{<ω} then
S(a, b, c) holds if and only if c is the concatenation of a and b; and let T be
the ternary relation such that T (a, b, c) holds for b ∈ |M| and a, c ∈ |M|^{<ω}
if and only if a = a1 , . . . , an and c = a1 , . . . , an , b. Pick new 3-place predicate
symbols P and Q and form the structure M∗ having the universe |M| ∪ |M|^{<ω},
having M as a substructure, and interpreting P and Q by the concatenation
relations S and T (so M∗ is in the language L ∪ {P, Q}).

Define |N|^{<ω}, S ′ , T ′ , P ′ , Q′ and N∗ analogously. Since by hypothesis
M ≃p N, there is a relation I between |M|^{<ω} and |N|^{<ω} such that I(a, b)
holds if and only if a and b are isomorphic and satisfy the back-and-forth
condition of Definition 25.15. Now, let M be the structure whose domain is
the union of the domains of M∗ and N∗ , having M∗ and N∗ as substructures,
in the language with one extra binary predicate symbol R interpreted by the
relation I and predicate symbols denoting the domains |M∗ | and |N∗ |.

[Figure 28.1: The structure M with the internal partial isomorphism.]


The crucial observation is that in the language of the structure M there is
a first-order sentence θ1 true in M saying that M |=L α and N ̸|=L α (this
requires the Relativization Property), as well as a first-order sentence θ2 true
in M saying that M ≃p N via the partial isomorphism I. By the Löwenheim–
Skolem Property, θ1 and θ2 are jointly true in an enumerable model M0 con-
taining partially isomorphic substructures M0 and N0 such that M0 |=L α


and N0 ̸|=L α. But enumerable partially isomorphic structures are in fact iso-
morphic by Theorem 25.16, contradicting the Isomorphism Property of normal
abstract logics.


28.4 Lindström’s Theorem


Lemma 28.8. Suppose α ∈ L(L), with L finite, and assume also that there is
an n ∈ N such that for any two structures M and N, if M ≡n N and M |=L α

then also N |=L α. Then α is equivalent to a first-order sentence, i.e., there is


a first-order θ such that ModL (α) = ModL (θ).

Proof. Let n be such that any two n-equivalent structures M and N agree on
the value assigned to α. Recall Proposition 25.19: there are only finitely many
first-order sentences in a finite language that have quantifier rank no greater
than n, up to logical equivalence. Now, for each fixed structure M let θM
be the conjunction of all first-order sentences φ true in M with qr(φ) ≤ n
(this conjunction is finite), so that N |= θM if and only if N ≡n M. Then put
θ = ⋁{θM : M |=L α}; this disjunction is also finite (up to logical equivalence).
The conclusion ModL (α) = ModL (θ) follows. In fact, if N |=L θ then for
some M |=L α we have N |= θM , whence also N |=L α (by the hypothesis of the
lemma). Conversely, if N |=L α then θN is a disjunct in θ, and since N |= θN ,
also N |=L θ.

Theorem 28.9 (Lindström’s Theorem). Suppose ⟨L, |=L ⟩ has the Com-
pactness and the Löwenheim–Skolem Properties. Then ⟨L, |=L ⟩ ≤ ⟨F, |=⟩ (so

⟨L, |=L ⟩ is equivalent to first-order logic).

Proof. By Lemma 28.8, it suffices to show that for any α ∈ L(L), with L finite,
there is n ∈ N such that for any two structures M and N: if M ≡n N then M
and N agree on α. For then α is equivalent to a first-order sentence, from which
⟨L, |=L ⟩ ≤ ⟨F, |=⟩ follows. Since we are working in a finite, purely relational
language, by Theorem 25.23 we can replace the statement that M ≡n N by
the corresponding algebraic statement that In (∅, ∅).
Given α, suppose towards a contradiction that for each n there are struc-
tures Mn and Nn such that In (∅, ∅), but (say) Mn |=L α whereas Nn ̸|=L α.
By the Isomorphism Property we can assume that all the Mn ’s interpret the
constants of the language by the same objects; furthermore, since there are
only finitely many atomic sentences in the language, we may also assume that
they satisfy the same atomic sentences (we can take a subsequence of the M’s
otherwise). Let M be the union of all the Mn ’s, i.e., the unique minimal struc-
ture having each Mn as a substructure. As in the proof of Theorem 28.7, let



M∗ be the extension of M with domain |M| ∪ |M|^{<ω}, in the expanded language
comprising the concatenation predicates P and Q.
Similarly, define Nn , N and N∗ . Now let M be the structure whose domain
comprises the domains of M∗ and N∗ as well as the natural numbers N along
with their natural ordering ≤, in the language with extra predicates represent-
ing the domains |M|, |N|, |M|^{<ω} and |N|^{<ω}, as well as predicates coding the
domains of Mn and Nn in the sense that:

|Mn | = {a ∈ |M| : R(a, n)};   |Nn | = {a ∈ |N| : S(a, n)};
|M|^{<ω}_n = {a ∈ |M|^{<ω} : R(a, n)};   |N|^{<ω}_n = {a ∈ |N|^{<ω} : S(a, n)}.

The structure M also has a ternary relation J such that J(n, a, b) holds if and
only if In (a, b).
Now there is a sentence θ in the language L augmented by R, S, J, etc.,
saying that ≤ is a discrete linear ordering with first but no last element and
such that Mn |=L α, Nn ̸|=L α, and for each n in the ordering, J(n, a, b) holds if
and only if In (a, b).
Using the Compactness Property, we can find a model M∗ of θ in which
the ordering contains a non-standard element n∗ . In particular then M∗ will
contain substructures Mn∗ and Nn∗ such that Mn∗ |=L α and Nn∗ ̸|=L α. But
now we can define a set I of pairs of k-tuples from |Mn∗ | and |Nn∗ | by putting
⟨a, b⟩ ∈ I if and only if J(n∗ − k, a, b), where k is the length of a and b.
Since n∗ is non-standard, for each standard k we have that n∗ − k > 0, and
the set I witnesses the fact that Mn∗ ≃p Nn∗ . But by Theorem 28.7, Mn∗ is
L-equivalent to Nn∗ , a contradiction.



Part V

Computability
This part is based on Jeremy Avigad’s notes on computability theory.
So far, only the chapter on recursive functions contains exercises, and every-
thing could stand to be expanded with motivation, examples, details, and
exercises.

Chapter 29

Recursive Functions

These are Jeremy Avigad’s notes on recursive functions, revised and


expanded by Richard Zach. This chapter does contain some exercises,
and can be included independently to provide the basis for a discussion of
arithmetization of syntax.


29.1 Introduction
In order to develop a mathematical theory of computability, one has to, first of
all, develop a model of computability. We now think of computability as the
kind of thing that computers do, and computers work with symbols. But at the
beginning of the development of theories of computability, the paradigmatic
example of computation was numerical computation. Mathematicians were
always interested in number-theoretic functions, i.e., functions f : Nn → N that
can be computed. So it is not surprising that at the beginning of the theory
of computability, it was such functions that were studied. The most familiar


examples of computable numerical functions, such as addition, multiplication,


exponentiation (of natural numbers) share an interesting feature: they can be
defined recursively. It is thus quite natural to attempt a general definition of
computable function on the basis of recursive definitions. Among the many
possible ways to define number-theoretic functions recursively, one particularly
simple pattern of definition here becomes central: so-called primitive recursion.
In addition to computable functions, we might be interested in computable
sets and relations. A set is computable if we can compute the answer to
whether or not a given number is an element of the set, and a relation is
computable iff we can compute whether or not a tuple ⟨n1 , . . . , nk ⟩ is an element
of the relation. By considering the characteristic function of a set or relation,
discussion of computable sets and relations can be subsumed under that of
computable functions. Thus we can define primitive recursive relations as well,
e.g., the relation “n evenly divides m” is a primitive recursive relation.
Primitive recursive functions—those that can be defined using just primitive
recursion—are not, however, the only computable number-theoretic functions.
Many generalizations of primitive recursion have been considered, but the most
powerful and widely-accepted additional way of computing functions is by un-
bounded search. This leads to the definition of partial recursive functions, and
a related definition of general recursive functions. General recursive functions
are computable and total, and the definition characterizes exactly the partial
recursive functions that happen to be total. Recursive functions can simulate
every other model of computation (Turing machines, lambda calculus, etc.)
and so represent one of the many accepted models of computation.


29.2 Primitive Recursion


A characteristic of the natural numbers is that every natural number can be
reached from 0 by applying the successor operation +1 finitely many times—
any natural number is either 0 or the successor of . . . the successor of 0.
One way to specify a function h : N → N that makes use of this fact is this:
(a) specify what the value of h is for argument 0, and (b) also specify how to,
given the value of h(x), compute the value of h(x + 1). For (a) tells us directly
what h(0) is, so h is defined for 0. Now, using the instruction given by (b) for
x = 0, we can compute h(1) = h(0 + 1) from h(0). Using the same instructions
for x = 1, we compute h(2) = h(1 + 1) from h(1), and so on. For every natural
number x, we’ll eventually reach the step where we define h(x + 1) from h(x),
and so h(x) is defined for all x ∈ N.
For instance, suppose we specify h : N → N by the following two equations:

h(0) = 1
h(x + 1) = 2 · h(x)


If we already know how to multiply, then these equations give us the infor-
mation required for (a) and (b) above. By successively applying the second
equation, we get that

h(1) = 2 · h(0) = 2,
h(2) = 2 · h(1) = 2 · 2,
h(3) = 2 · h(2) = 2 · 2 · 2,
..
.

We see that the function h we have specified is h(x) = 2^x .
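
To see the recipe in action, here is a minimal Python sketch (an editorial
illustration; the names are ours, not part of the text) that computes h by
starting from clause (a) and applying clause (b) x times:

```python
def h(x):
    # h(0) = 1; h(x + 1) = 2 * h(x)
    value = 1              # clause (a): the value of h at 0
    for _ in range(x):     # clause (b), applied x times
        value = 2 * value
    return value

assert [h(i) for i in range(5)] == [1, 2, 4, 8, 16]   # h(x) = 2^x
```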


The characteristic feature of the natural numbers guarantees that there is
only one function h that meets these two criteria. A pair of equations like these
is called a definition by primitive recursion of the function h. It is so-called
because we define h “recursively,” i.e., the definition, specifically the second
equation, involves h itself on the right-hand-side. It is “primitive” because in
defining h(x + 1) we only use the value h(x), i.e., the immediately preceding
value. This is the simplest way of defining a function on N recursively.
We can define even more fundamental functions like addition and multipli-
cation by primitive recursion. In these cases, however, the functions in question
are 2-place. We fix one of the argument places, and use the other for the recur-
sion. E.g., to define add(x, y) we can fix x and define the value first for y = 0
and then for y + 1 in terms of y. Since x is fixed, it will appear on the left and
on the right side of the defining equations.

add(x, 0) = x
add(x, y + 1) = add(x, y) + 1

These equations specify the value of add for all x and y. To find add(2, 3),
for instance, we apply the defining equations for x = 2, using the first to find
add(2, 0) = 2, then using the second to successively find add(2, 1) = 2 + 1 = 3,
add(2, 2) = 3 + 1 = 4, add(2, 3) = 4 + 1 = 5.
In the definition of add we used + on the right-hand-side of the second
equation, but only to add 1. In other words, we used the successor function
succ(z) = z+1 and applied it to the previous value add(x, y) to define add(x, y+
1). So we can think of the recursive definition as given in terms of a single
function which we apply to the previous value. However, it doesn’t hurt—
and sometimes is necessary—to allow the function to depend not just on the
previous value but also on x and y. Consider:

mult(x, 0) = 0
mult(x, y + 1) = add(mult(x, y), x)

This is a primitive recursive definition of a function mult by applying the func-


tion add to both the preceding value mult(x, y) and the first argument x. It
also defines the function mult(x, y) for all arguments x and y. For instance,


mult(2, 3) is determined by successively computing mult(2, 0), mult(2, 1), mult(2, 2),
and mult(2, 3):
mult(2, 0) = 0
mult(2, 1) = mult(2, 0 + 1) = add(mult(2, 0), 2) = add(0, 2) = 2
mult(2, 2) = mult(2, 1 + 1) = add(mult(2, 1), 2) = add(2, 2) = 4
mult(2, 3) = mult(2, 2 + 1) = add(mult(2, 2), 2) = add(4, 2) = 6
The general pattern then is this: to give a primitive recursive definition of
a function h(x0 , . . . , xk−1 , y), we provide two equations. The first defines the
value of h(x0 , . . . , xk−1 , 0) without reference to h. The second defines the value
of h(x0 , . . . , xk−1 , y + 1) in terms of h(x0 , . . . , xk−1 , y), the other arguments x0 ,
. . . , xk−1 , and y. Only the immediately preceding value of h may be used in
that second equation. If we think of the operations given by the right-hand-
sides of these two equations as themselves being functions f and g, then the
general pattern to define a new function h by primitive recursion is this:
h(x0 , . . . , xk−1 , 0) = f (x0 , . . . , xk−1 )
h(x0 , . . . , xk−1 , y + 1) = g(x0 , . . . , xk−1 , y, h(x0 , . . . , xk−1 , y))
In the case of add, we have k = 1 and f (x0 ) = x0 (the identity function), and
g(x0 , y, z) = z + 1 (the 3-place function that returns the successor of its third
argument):
add(x0 , 0) = f (x0 ) = x0
add(x0 , y + 1) = g(x0 , y, add(x0 , y)) = succ(add(x0 , y))
In the case of mult, we have f (x0 ) = 0 (the constant function always return-
ing 0) and g(x0 , y, z) = add(z, x0 ) (the 3-place function that returns the sum
of its last and first argument):
mult(x0 , 0) = f (x0 ) = 0
mult(x0 , y + 1) = g(x0 , y, mult(x0 , y)) = add(mult(x0 , y), x0 )
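
The general pattern can itself be rendered as a higher-order procedure. The
following Python sketch (ours, for illustration) takes f and g and returns the
function h defined by primitive recursion from them:

```python
def primitive_recursion(f, g):
    """h(x0, ..., xk-1, 0)     = f(x0, ..., xk-1)
       h(x0, ..., xk-1, y + 1) = g(x0, ..., xk-1, y, h(x0, ..., xk-1, y))"""
    def h(*args):
        *xs, y = args
        value = f(*xs)           # the base case
        for i in range(y):       # apply g for i = 0, ..., y - 1
            value = g(*xs, i, value)
        return value
    return h

# add and mult exactly as defined above:
add = primitive_recursion(lambda x0: x0, lambda x0, y, z: z + 1)
mult = primitive_recursion(lambda x0: 0, lambda x0, y, z: add(z, x0))
assert add(2, 3) == 5 and mult(2, 3) == 6
```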


29.3 Composition
If f and g are two one-place functions of natural numbers, we can compose
them: h(x) = g(f (x)). The new function h(x) is then defined by composition
from the functions f and g. We’d like to generalize this to functions of more
than one argument.
Here’s one way of doing this: suppose f is a k-place function, and g0 , . . . ,
gk−1 are k functions which are all n-place. Then we can define a new n-place
function h as follows:
h(x0 , . . . , xn−1 ) = f (g0 (x0 , . . . , xn−1 ), . . . , gk−1 (x0 , . . . , xn−1 ))


If f and all gi are computable, so is h: To compute h(x0 , . . . , xn−1 ), first


compute the values yi = gi (x0 , . . . , xn−1 ) for each i = 0, . . . , k − 1. Then feed
these values into f to compute h(x0 , . . . , xk−1 ) = f (y0 , . . . , yk−1 ).
This may seem like an overly restrictive characterization of what happens
when we compute a new function using some existing ones. For one thing,
sometimes we do not use all the arguments of a function, as when we defined
g(x, y, z) = succ(z) for use in the primitive recursive definition of add. Suppose
we are allowed use of the following functions:
Pin (x0 , . . . , xn−1 ) = xi
The functions Pin are called projection functions: Pin is an n-place function.
Then g can be defined by
g(x, y, z) = succ(P23 (x, y, z)).
Here the role of f is played by the 1-place function succ, so k = 1. And we
have one 3-place function P23 which plays the role of g0 . The result is a 3-place
function that returns the successor of the third argument.
The projection functions also allow us to define new functions by reordering
or identifying arguments. For instance, the function h(x) = add(x, x) can be
defined by
h(x0 ) = add(P01 (x0 ), P01 (x0 )).
Here k = 2, n = 1, the role of f (y0 , y1 ) is played by add, and the roles of g0 (x0 )
and g1 (x0 ) are both played by P01 (x0 ), the one-place projection function (aka
the identity function).
If f (y0 , y1 ) is a function we already have, we can define the function h(x0 , x1 ) =
f (x1 , x0 ) by
h(x0 , x1 ) = f (P12 (x0 , x1 ), P02 (x0 , x1 )).
Here k = 2, n = 2, and the roles of g0 and g1 are played by P12 and P02 ,
respectively.
You may also worry that g0 , . . . , gk−1 are all required to have the same
arity n. (Remember that the arity of a function is the number of arguments;
an n-place function has arity n.) But adding the projection functions provides
the desired flexibility. For example, suppose f and g are 3-place functions and
h is the 2-place function defined by
h(x, y) = f (x, g(x, x, y), y).
The definition of h can be rewritten with the projection functions, as
h(x, y) = f (P02 (x, y), g(P02 (x, y), P02 (x, y), P12 (x, y)), P12 (x, y)).
Then h is the composition of f with P02 , l, and P12 , where
l(x, y) = g(P02 (x, y), P02 (x, y), P12 (x, y)),
i.e., l is the composition of g with P02 , P02 , and P12 .
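
Composition and the projections are also easy to prototype. In this hedged
Python sketch (our names), the example g(x, y, z) = succ(P23 (x, y, z)) from
above comes out as follows:

```python
def proj(n, i):
    """The n-place projection function P_i^n."""
    return lambda *xs: xs[i]

def compose(f, *gs):
    """h(xs) = f(g0(xs), ..., g_{k-1}(xs))."""
    return lambda *xs: f(*(g(*xs) for g in gs))

succ = lambda x: x + 1
g = compose(succ, proj(3, 2))    # g(x, y, z) = succ(P_2^3(x, y, z))
assert g(5, 7, 9) == 10          # the successor of the third argument
```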



29.4 Primitive Recursive Functions


Let us record again how we can define new functions from existing ones using
primitive recursion and composition.

Definition 29.1. Suppose f is a k-place function (k ≥ 1) and g is a (k + 2)-
place function. The function defined by primitive recursion from f and g is
the (k + 1)-place function h defined by the equations

h(x0 , . . . , xk−1 , 0) = f (x0 , . . . , xk−1 )


h(x0 , . . . , xk−1 , y + 1) = g(x0 , . . . , xk−1 , y, h(x0 , . . . , xk−1 , y))

Definition 29.2. Suppose f is a k-place function, and g0 , . . . , gk−1 are k
functions which are all n-place. The function defined by composition from f
and g0 , . . . , gk−1 is the n-place function h defined by

h(x0 , . . . , xn−1 ) = f (g0 (x0 , . . . , xn−1 ), . . . , gk−1 (x0 , . . . , xn−1 )).

In addition to succ and the projection functions

Pin (x0 , . . . , xn−1 ) = xi ,

for each natural number n and i < n, we will include among the primitive
recursive functions the function zero(x) = 0.

Definition 29.3. The set of primitive recursive functions is the set of functions
from Nn to N, defined inductively by the following clauses:

1. zero is primitive recursive.

2. succ is primitive recursive.

3. Each projection function Pin is primitive recursive.

4. If f is a k-place primitive recursive function and g0 , . . . , gk−1 are n-place


primitive recursive functions, then the composition of f with g0 , . . . , gk−1
is primitive recursive.

5. If f is a k-place primitive recursive function and g is a k+2-place primitive


recursive function, then the function defined by primitive recursion from
f and g is primitive recursive.

Put more concisely, the set of primitive recursive functions is the smallest

set containing zero, succ, and the projection functions Pjn , and which is closed
under composition and primitive recursion.
Another way of describing the set of primitive recursive functions is by
defining it in terms of “stages.” Let S0 denote the set of starting functions:
zero, succ, and the projections. These are the primitive recursive functions of
stage 0. Once a stage Si has been defined, let Si+1 be the set of all functions


you get by applying a single instance of composition or primitive recursion to


functions already in Si . Then
S = ⋃_{i∈N} S_i

is the set of all primitive recursive functions.


Let us verify that add is a primitive recursive function.

Proposition 29.4. The addition function add(x, y) = x + y is primitive re-


cursive.

Proof. We already have a primitive recursive definition of add in terms of two


functions f and g which matches the format of Definition 29.1:

add(x0 , 0) = f (x0 ) = x0
add(x0 , y + 1) = g(x0 , y, add(x0 , y)) = succ(add(x0 , y))

So add is primitive recursive provided f and g are as well. f (x0 ) = x0 = P01 (x0 ),
and the projection functions count as primitive recursive, so f is primitive
recursive. The function g is the three-place function g(x0 , y, z) defined by

g(x0 , y, z) = succ(z).

This does not yet tell us that g is primitive recursive, since g and succ are not
quite the same function: succ is one-place, and g has to be three-place. But
we can define g “officially” by composition as

g(x0 , y, z) = succ(P23 (x0 , y, z))

Since succ and P23 count as primitive recursive functions, g does as well, since
it can be defined by composition from primitive recursive functions.

Proposition 29.5. The multiplication function mult(x, y) = x · y is primitive
recursive.

Proof. Exercise.

Problem 29.1. Prove Proposition 29.5 by showing that the primitive recur-
sive definition of mult can be put into the form required by Definition 29.1 and
showing that the corresponding functions f and g are primitive recursive.

Example 29.6. Here’s our very first example of a primitive recursive defini-
tion:

h(0) = 1
h(y + 1) = 2 · h(y).


This function cannot fit into the form required by Definition 29.1, since k = 0.
The definition also involves the constants 1 and 2. To get around the first
problem, let’s introduce a dummy argument and define the function h′ :
h′ (x0 , 0) = f (x0 ) = 1
h′ (x0 , y + 1) = g(x0 , y, h′ (x0 , y)) = 2 · h′ (x0 , y).

The function f (x0 ) = 1 can be defined from succ and zero by composition:
f (x0 ) = succ(zero(x0 )). The function g can be defined by composition from
g ′ (z) = 2 · z and projections:
g(x0 , y, z) = g ′ (P23 (x0 , y, z))

and g ′ in turn can be defined by composition as

g ′ (z) = mult(g ′′ (z), P01 (z))

and

g ′′ (z) = succ(f (z)),


where f is as above: f (z) = succ(zero(z)). Now that we have h′ , we can use
composition again to let h(y) = h′ (P01 (y), P01 (y)). This shows that h can be
defined from the basic functions using a sequence of compositions and primitive
recursions, so h is primitive recursive.
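
Reusing the proj, compose, primitive_recursion, and mult sketches from the
previous sections, the “official” construction of this example can be spelled out
step by step (again a sketch, not part of the text):

```python
zero = lambda x: 0
succ = lambda x: x + 1
f = compose(succ, zero)                       # f(z) = succ(zero(z)) = 1
g2 = compose(succ, f)                         # g''(z) = succ(f(z)) = 2
g1 = compose(mult, g2, proj(1, 0))            # g'(z) = mult(2, z) = 2 * z
g = compose(g1, proj(3, 2))                   # g(x0, y, z) = 2 * z
h_prime = primitive_recursion(f, g)           # h'(x0, y) = 2^y
h = compose(h_prime, proj(1, 0), proj(1, 0))  # h(y) = h'(y, y)
assert [h(i) for i in range(5)] == [1, 2, 4, 8, 16]
```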


29.5 Primitive Recursion Notations


One advantage to having the precise inductive description of the primitive
recursive functions is that we can be systematic in describing them. For exam-
ple, we can assign a “notation” to each such function, as follows. Use symbols
zero, succ, and Pin for zero, successor, and the projections. Now suppose h
is defined by composition from a k-place function f and n-place functions g0 ,
. . . , gk−1 , and we have assigned notations F , G0 , . . . , Gk−1 to the latter func-
tions. Then, using a new symbol Compk,n , we can denote the function h by
Compk,n [F, G0 , . . . , Gk−1 ].
For functions defined by primitive recursion, we can use analogous nota-
tions. Suppose the (k + 1)-ary function h is defined by primitive recursion
from the k-ary function f and the (k + 2)-ary function g, and the notations
assigned to f and g are F and G, respectively. Then the notation assigned to h
is Reck [F, G].
Recall that the addition function is defined by primitive recursion as
add(x0 , 0) = P01 (x0 ) = x0
add(x0 , y + 1) = succ(P23 (x0 , y, add(x0 , y))) = add(x0 , y) + 1


Here the role of f is played by P01 , and the role of g is played by succ(P23 (x0 , y, z)),
which is assigned the notation Comp1,3 [succ, P23 ] as it is the result of defining a
function by composition from the 1-ary function succ and the 3-ary function P23 .
With this setup, we can denote the addition function by

Rec1 [P01 , Comp1,3 [succ, P23 ]].

Having these notations sometimes proves useful, e.g., when enumerating prim-
itive recursive functions.
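
To make this concrete, one possible machine-readable rendering of the
notations uses nested tuples (a sketch; the tags and helper names are our
choice, not fixed by the text):

```python
ZERO, SUCC = ("zero",), ("succ",)
def P(n, i): return ("proj", n, i)
def Comp(k, n, f, *gs): return ("comp", k, n, f, *gs)
def Rec(k, f, g): return ("rec", k, f, g)

# Rec1[P01, Comp1,3[succ, P23]], the notation for addition:
ADD = Rec(1, P(1, 0), Comp(1, 3, SUCC, P(3, 2)))
```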
Problem 29.2. Give the complete primitive recursive notation for mult.


29.6 Primitive Recursive Functions are Computable


Suppose a function h is defined by primitive recursion

h(⃗x, 0) = f (⃗x)
h(⃗x, y + 1) = g(⃗x, y, h(⃗x, y))

and suppose the functions f and g are computable. (We use ⃗x to abbreviate x0 ,
. . . , xk−1 .) Then h(⃗x, 0) can obviously be computed, since it is just f (⃗x) which
we assume is computable. h(⃗x, 1) can then also be computed, since 1 = 0 + 1
and so h(⃗x, 1) is just

h(⃗x, 1) = g(⃗x, 0, h(⃗x, 0)) = g(⃗x, 0, f (⃗x)).

We can go on in this way and compute

h(⃗x, 2) = g(⃗x, 1, h(⃗x, 1)) = g(⃗x, 1, g(⃗x, 0, f (⃗x)))


h(⃗x, 3) = g(⃗x, 2, h(⃗x, 2)) = g(⃗x, 2, g(⃗x, 1, g(⃗x, 0, f (⃗x))))
h(⃗x, 4) = g(⃗x, 3, h(⃗x, 3)) = g(⃗x, 3, g(⃗x, 2, g(⃗x, 1, g(⃗x, 0, f (⃗x)))))
..
.

Thus, to compute h(⃗x, y) in general, successively compute h(⃗x, 0), h(⃗x, 1), . . . ,
until we reach h(⃗x, y).
Thus, a primitive recursive definition yields a new computable function if
the functions f and g are computable. Composition of functions also results in
a computable function if the functions f and gi are computable.
Since the basic functions zero, succ, and Pin are computable, and com-
position and primitive recursion yield computable functions from computable
functions, this means that every primitive recursive function is computable.
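
This computation scheme is just an interpreter. Here is a hedged Python
sketch of such an interpreter for the tuple rendering of notations given above
(editorial; evaluate is our name):

```python
def evaluate(F, args):
    """Compute the primitive recursive function denoted by F on args."""
    tag = F[0]
    if tag == "zero":
        return 0
    if tag == "succ":
        return args[0] + 1
    if tag == "proj":                 # F = ("proj", n, i)
        return args[F[2]]
    if tag == "comp":                 # F = ("comp", k, n, f, g0, ..., g_{k-1})
        f, gs = F[3], F[4:]
        return evaluate(f, tuple(evaluate(g, args) for g in gs))
    if tag == "rec":                  # F = ("rec", k, f, g)
        f, g = F[2], F[3]
        *xs, y = args
        value = evaluate(f, tuple(xs))
        for i in range(y):
            value = evaluate(g, (*xs, i, value))
        return value
    raise ValueError("not a notation")

assert evaluate(ADD, (2, 3)) == 5     # ADD as in the previous sketch
```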



29.7 Examples of Primitive Recursive Functions


We already have some examples of primitive recursive functions: the addition
and multiplication functions add and mult. The identity function id(x) = x is
primitive recursive, since it is just P01 . The constant functions constn (x) = n
are primitive recursive since they can be defined from zero and succ by suc-
cessive composition. This is useful when we want to use constants in primi-
tive recursive definitions, e.g., if we want to define the function f (x) = 2 · x
can obtain it by composition from constn (x) and multiplication as f (x) =
mult(const2 (x), P01 (x)). We’ll make use of this trick from now on.

Proposition 29.7. The exponentiation function exp(x, y) = xy is primitive


recursive.

Proof. We can define exp primitive recursively as

exp(x, 0) = 1
exp(x, y + 1) = mult(x, exp(x, y)).

Strictly speaking, this is not a recursive definition from primitive recursive


functions. Officially, though, we have:

exp(x, 0) = f (x)
exp(x, y + 1) = g(x, y, exp(x, y)).

where

f (x) = succ(zero(x)) = 1
g(x, y, z) = mult(P03 (x, y, z), P23 (x, y, z)) = x · z

and so f and g are defined from primitive recursive functions by composition.

Proposition 29.8. The predecessor function pred(y) defined by


pred(y) = { 0       if y = 0
          { y − 1   otherwise

is primitive recursive.

Proof. Note that

pred(0) = 0 and
pred(y + 1) = y.

This is almost a primitive recursive definition. It does not, strictly speaking, fit
into the pattern of definition by primitive recursion, since that pattern requires


at least one extra argument x. It is also odd in that it does not actually use
pred(y) in the definition of pred(y + 1). But we can first define pred′ (x, y) by

pred′ (x, 0) = zero(x) = 0,


pred′ (x, y + 1) = P13 (x, y, pred′ (x, y)) = y.

and then define pred from it by composition, e.g., as pred(x) = pred′ (zero(x), P01 (x)).

Proposition 29.9. The factorial function fac(x) = x ! = 1 · 2 · 3 · · · · · x is


primitive recursive.

Proof. The obvious primitive recursive definition is

fac(0) = 1
fac(y + 1) = fac(y) · (y + 1).

Officially, we have to first define a two-place function h

h(x, 0) = const1 (x)


h(x, y + 1) = g(x, y, h(x, y))

where g(x, y, z) = mult(P23 (x, y, z), succ(P13 (x, y, z))) and then let

fac(y) = h(P01 (y), P01 (y)) = h(y, y).

From now on we’ll be a bit more laissez-faire and not give the official definitions
by composition and primitive recursion.

Proposition 29.10. Truncated subtraction, x −̇ y, defined by


x −̇ y = { 0       if x < y
        { x − y   otherwise

is primitive recursive.

Proof. We have:

x −̇ 0 = x
x −̇ (y + 1) = pred(x −̇ y)

Proposition 29.11. The distance between x and y, |x − y|, is primitive re-


cursive.

Proof. We have |x − y| = (x −̇ y) + (y −̇ x), so the distance can be defined by


composition from + and −̇, which are primitive recursive.
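
The last few definitions translate directly into a Python sketch (ours), which
can be handy for checking the equations:

```python
def pred(y):
    return 0 if y == 0 else y - 1

def monus(x, y):                 # truncated subtraction x -. y
    for _ in range(y):           # x -. (y + 1) = pred(x -. y)
        x = pred(x)
    return x

def dist(x, y):                  # |x - y| = (x -. y) + (y -. x)
    return monus(x, y) + monus(y, x)

assert (monus(3, 5), monus(5, 3), dist(3, 5)) == (0, 2, 2)
```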


Proposition 29.12. The maximum of x and y, max(x, y), is primitive recur-


sive.

Proof. We can define max(x, y) by composition from + and −̇ by

max(x, y) = x + (y −̇ x).

If x is the maximum, i.e., x ≥ y, then y −̇ x = 0, so x + (y −̇ x) = x + 0 = x. If


y is the maximum, then y −̇ x = y − x, and so x + (y −̇ x) = x + (y − x) = y.

Proposition 29.13. The minimum of x and y, min(x, y), is primitive recur-
sive.

Proof. Exercise.

Problem 29.3. Prove Proposition 29.13.

Problem 29.4. Show that


f (x, y) = 2^(2^(⋯^(2^x)))   (a stack of y 2’s)

is primitive recursive.

Problem 29.5. Show that integer division d(x, y) = ⌊x/y⌋ (i.e., division,
where you disregard everything after the decimal point) is primitive recur-
sive. When y = 0, we stipulate d(x, y) = 0. Give an explicit definition of d
using primitive recursion and composition.

Proposition 29.14. The set of primitive recursive functions is closed under


the following two operations:
1. Finite sums: if f (⃗x, z) is primitive recursive, then so is the function
g(⃗x, y) = ∑_{z=0}^{y} f (⃗x, z).

2. Finite products: if f (⃗x, z) is primitive recursive, then so is the function


h(⃗x, y) = ∏_{z=0}^{y} f (⃗x, z).

Proof. For example, finite sums are defined recursively by the equations

g(⃗x, 0) = f (⃗x, 0)
g(⃗x, y + 1) = g(⃗x, y) + f (⃗x, y + 1).



29.8 Primitive Recursive Relations


Definition 29.15. A relation R(⃗x) is said to be primitive recursive if its char-
acteristic function,

χR (⃗x) = { 1   if R(⃗x)
         { 0   otherwise

is primitive recursive.
In other words, when one speaks of a primitive recursive relation R(⃗x),
one is referring to a relation of the form χR (⃗x) = 1, where χR is a primitive
recursive function which, on any input, returns either 1 or 0. For example, the
relation IsZero(x), which holds if and only if x = 0, corresponds to the function
χIsZero , defined using primitive recursion by
χIsZero (0) = 1,
χIsZero (x + 1) = 0.
It should be clear that one can compose relations with other primitive
recursive functions. So the following are also primitive recursive:
1. The equality relation, x = y, defined by IsZero(|x − y|)
2. The less-than-or-equal relation, x ≤ y, defined by IsZero(x −̇ y)
Proposition 29.16. The set of primitive recursive relations is closed under
Boolean operations, that is, if P (⃗x) and Q(⃗x) are primitive recursive, so are
1. ¬P (⃗x)
2. P (⃗x) ∧ Q(⃗x)
3. P (⃗x) ∨ Q(⃗x)
4. P (⃗x) → Q(⃗x)
Proof. Suppose P (⃗x) and Q(⃗x) are primitive recursive, i.e., their characteristic
functions χP and χQ are. We have to show that the characteristic functions of
¬P (⃗x), etc., are also primitive recursive.
χ¬P (⃗x) = { 0   if χP (⃗x) = 1
          { 1   otherwise

We can define χ¬P (⃗x) as 1 −̇ χP (⃗x).


χP ∧Q (⃗x) = { 1   if χP (⃗x) = χQ (⃗x) = 1
            { 0   otherwise
We can define χP ∧Q (⃗x) as χP (⃗x) · χQ (⃗x) or as min(χP (⃗x), χQ (⃗x)). Similarly,
χP ∨Q (⃗x) = max(χP (⃗x), χQ (⃗x)) and
χP →Q (⃗x) = max(1 −̇ χP (⃗x), χQ (⃗x)).


Proposition 29.17. The set of primitive recursive relations is closed under


bounded quantification, i.e., if R(⃗x, z) is a primitive recursive relation, then so
are the relations
(∀z < y) R(⃗x, z) and
(∃z < y) R(⃗x, z).
(∀z < y) R(⃗x, z) holds of ⃗x and y if and only if R(⃗x, z) holds for every z less
than y, and similarly for (∃z < y) R(⃗x, z).
Proof. By convention, we take (∀z < 0) R(⃗x, z) to be true (for the trivial reason
that there are no z less than 0) and (∃z < 0) R(⃗x, z) to be false. A bounded
universal quantifier functions just like a finite product or iterated minimum,
i.e., if P (⃗x, y) ⇔ (∀z < y) R(⃗x, z) then χP (⃗x, y) can be defined by
χP (⃗x, 0) = 1
χP (⃗x, y + 1) = min(χP (⃗x, y), χR (⃗x, y)).
Bounded existential quantification can similarly be defined using max. Al-
ternatively, it can be defined from bounded universal quantification, using
the equivalence (∃z < y) R(⃗x, z) ↔ ¬(∀z < y) ¬R(⃗x, z). Note that, for ex-
ample, a bounded quantifier of the form (∃x ≤ y) . . . x . . . is equivalent to
(∃x < y + 1) . . . x . . . .
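
The recursion for the bounded universal quantifier is easy to mirror in code.
A minimal Python sketch (our names), with min playing the role of iterated
conjunction:

```python
def chi_forall_below(chi_R):
    """Characteristic function of (forall z < y) R(xs, z), given chi_R."""
    def chi_P(*args):
        *xs, y = args
        value = 1                              # (forall z < 0) R is true
        for z in range(y):
            value = min(value, chi_R(*xs, z))  # one conjunct per z < y
        return value
    return chi_P

is_zero = lambda x: 1 if x == 0 else 0
all_zero_below = chi_forall_below(is_zero)
assert all_zero_below(0) == 1 and all_zero_below(3) == 0
```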
Problem 29.6. Show that the three place relation x ≡ y mod n (congruence
modulo n) is primitive recursive.
Another useful primitive recursive function is the conditional function,
cond(x, y, z), defined by
cond(x, y, z) = { y   if x = 0
                { z   otherwise.

This is defined recursively by

cond(0, y, z) = y,
cond(x + 1, y, z) = z.
One can use this to justify definitions of primitive recursive functions by cases
from primitive recursive relations:
Proposition 29.18. If g0 (⃗x), . . . , gm (⃗x) are primitive recursive functions,
and R0 (⃗x), . . . , Rm−1 (⃗x) are primitive recursive relations, then the function f
defined by


f (⃗x) = { g0 (⃗x)     if R0 (⃗x)
        { g1 (⃗x)     if R1 (⃗x) and not R0 (⃗x)
        { ⋮
        { gm−1 (⃗x)   if Rm−1 (⃗x) and none of the previous hold
        { gm (⃗x)     otherwise


is also primitive recursive.

Proof. When m = 1, this is just the function defined by

f (⃗x) = cond(χ¬R0 (⃗x), g0 (⃗x), g1 (⃗x)).

For m greater than 1, one can just compose definitions of this form.


29.9 Bounded Minimization


It is often useful to define a function as the least number satisfying some prop-
erty or relation P . If P is decidable, we can compute this function simply by
trying out all the possible numbers, 0, 1, 2, . . . , until we find the least one
satisfying P . This kind of unbounded search takes us out of the realm of prim-
itive recursive functions. However, if we’re only interested in the least number
less than some independently given bound, we stay primitive recursive. In other
words, and a bit more generally, suppose we have a primitive recursive rela-
tion R(x, z). Consider the function that maps x and y to the least z < y such
that R(x, z). It, too, can be computed, by testing whether R(x, 0), R(x, 1),
. . . , R(x, y − 1). But why is it primitive recursive?
Proposition 29.19. If R(⃗x, z) is primitive recursive, so is the function mR (⃗x, y)
which returns the least z less than y such that R(⃗x, z) holds, if there is one,
and y otherwise. We will write the function mR as

(min z < y) R(⃗x, z).

Proof. Note that there can be no z < 0 such that R(⃗x, z) since there is no
z < 0 at all. So mR (⃗x, 0) = 0.
In case the bound is of the form y + 1 we have three cases:
1. There is a z < y such that R(⃗x, z), in which case mR (⃗x, y+1) = mR (⃗x, y).
2. There is no such z < y but R(⃗x, y) holds, in which case mR (⃗x, y + 1) = y.
3. There is no z < y + 1 such that R(⃗x, z), in which case mR (⃗x, y + 1) = y + 1.
So we can define mR (⃗x, y) by primitive recursion as follows:

mR (⃗x, 0) = 0
mR (⃗x, y + 1) = { mR (⃗x, y)   if mR (⃗x, y) ̸= y
               { y            if mR (⃗x, y) = y and R(⃗x, y)
               { y + 1        otherwise.

Note that there is a z < y such that R(⃗x, z) iff mR (⃗x, y) ̸= y.
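
The three cases of the recursion can be checked against a direct Python sketch
(our names):

```python
def bounded_min(chi_R):
    """m_R(xs, y): least z < y with R(xs, z), and y if there is none."""
    def m_R(*args):
        *xs, y = args
        m = 0                                 # m_R(xs, 0) = 0
        for i in range(y):                    # step from bound i to i + 1
            if m == i and chi_R(*xs, i) == 0:
                m = i + 1                     # no witness found so far
        return m
    return m_R

# e.g., the least z < y with z * z > x:
m = bounded_min(lambda x, z: 1 if z * z > x else 0)
assert m(10, 5) == 4 and m(30, 5) == 5        # no witness below 5 for x = 30
```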


Problem 29.7. Suppose R(⃗x, z) is primitive recursive. Define the function


m′R (⃗x, y) which returns the least z less than y such that R(⃗x, z) holds, if there
is one, and 0 otherwise, by primitive recursion from χR .


29.10 Primes
Bounded quantification and bounded minimization provide us with a good
deal of machinery to show that natural functions and relations are primitive
recursive. For example, consider the relation “x divides y”, written x | y. The
relation x | y holds if division of y by x is possible without remainder, i.e., if y
is an integer multiple of x. (If it doesn’t hold, i.e., the remainder when dividing
y by x is > 0, we write x ∤ y.) In other words, x | y iff for some z, x · z = y.
Obviously, any such z, if it exists, must be ≤ y. So, we have that x | y iff for
some z ≤ y, x · z = y. We can define the relation x | y by bounded existential
quantification from = and multiplication by

x | y ⇔ (∃z ≤ y) (x · z = y).

We’ve thus shown that x | y is primitive recursive.


A natural number x is prime if it is neither 0 nor 1 and is only divisible
by 1 and itself. In other words, prime numbers are such that, whenever y | x,
either y = 1 or y = x. To test if x is prime, we only have to check if y | x for
all y ≤ x, since if y > x, then automatically y ∤ x. So, the relation Prime(x),
which holds iff x is prime, can be defined by

Prime(x) ⇔ x ≥ 2 ∧ (∀y ≤ x) (y | x → y = 1 ∨ y = x)

and is thus primitive recursive.


The primes are 2, 3, 5, 7, 11, etc. Consider the function p(x) which returns
the xth prime in that sequence, i.e., p(0) = 2, p(1) = 3, p(2) = 5, etc. (For
convenience we will often write p(x) as px ; so p0 = 2, p1 = 3, etc.)
If we had a function nextPrime(x), which returns the first prime number
larger than x, p can be easily defined using primitive recursion:

p(0) = 2
p(x + 1) = nextPrime(p(x))

Since nextPrime(x) is the least y such that y > x and y is prime, it can be
easily computed by unbounded search. But it can also be defined by bounded
minimization, thanks to a result due to Euclid: there is always a prime number
between x and x ! + 1.

nextPrime(x) = (min y ≤ x ! + 1) (y > x ∧ Prime(y)).


This shows that nextPrime(x) and hence p(x) are (not just computable but)
primitive recursive.
(If you’re curious, here’s a quick proof of Euclid’s theorem. Suppose pn
is the largest prime ≤ x and consider the product p = p0 · p1 · · · · · pn of all
primes ≤ x. Either p + 1 is prime or there is a prime between x and p + 1.
Why? Suppose p + 1 is not prime. Then some prime number q | p + 1 where
q < p + 1. None of the primes ≤ x divide p + 1. (By definition of p, each of the
primes pi ≤ x divides p, i.e., with remainder 0. So, each of the primes pi ≤ x
divides p + 1 with remainder 1, and so pi ∤ p + 1.) Hence, q is a prime > x and
< p + 1. And p ≤ x !, so there is a prime > x and ≤ x ! + 1.)
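
A Python sketch of these definitions (ours; the bound x! + 1 comes from
Euclid’s theorem, and the search stops at the first witness):

```python
from math import factorial

def is_prime(x):
    return x >= 2 and all(x % y != 0 for y in range(2, x))

def next_prime(x):
    # the least prime y with x < y <= x! + 1
    return next(y for y in range(x + 1, factorial(x) + 2) if is_prime(y))

def p(x):
    """p(0) = 2, p(1) = 3, ...: the x-th prime."""
    q = 2
    for _ in range(x):
        q = next_prime(q)
    return q

assert [p(i) for i in range(5)] == [2, 3, 5, 7, 11]
```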

Problem 29.8. Define integer division d(x, y) using bounded minimization.


29.11 Sequences
The set of primitive recursive functions is remarkably robust. But we will be
able to do even more once we have developed an adequate means of handling
sequences. We will identify finite sequences of natural numbers with natural
numbers in the following way: the sequence ⟨a0 , a1 , a2 , . . . , ak ⟩ corresponds to
the number
p_0^{a_0+1} · p_1^{a_1+1} · p_2^{a_2+1} · ⋯ · p_k^{a_k+1} .
We add one to the exponents to guarantee that, for example, the sequences
⟨2, 7, 3⟩ and ⟨2, 7, 3, 0, 0⟩ have distinct numeric codes. We can take both 0
and 1 to code the empty sequence; for concreteness, let Λ denote 0.
The reason that this coding of sequences works is the so-called Fundamental
Theorem of Arithmetic: every natural number n ≥ 2 can be written in one and
only one way in the form

n = p_0^{a_0} · p_1^{a_1} · ⋯ · p_k^{a_k}

with ak ≥ 1. This guarantees that the mapping ⟨⟩(a0 , . . . , ak ) = ⟨a0 , . . . , ak ⟩ is


injective: different sequences are mapped to different numbers; to each number
only at most one sequence corresponds.
We’ll now show that the operations of determining the length of a sequence,
determining its ith element, appending an element to a sequence, and concate-
nating two sequences, are all primitive recursive.

Proposition 29.20. The function len(s), which returns the length of the se-
quence s, is primitive recursive.

Proof. Let R(i, s) be the relation defined by

R(i, s) iff pi | s ∧ pi+1 ∤ s.


R is clearly primitive recursive. Whenever s is the code of a non-empty se-


quence, i.e.,
s = p_0^{a_0+1} · ⋯ · p_k^{a_k+1} ,
R(i, s) holds if pi is the largest prime such that pi | s, i.e., i = k. The length
of s thus is i + 1 iff pi is the largest prime that divides s, so we can let
len(s) = { 0                          if s = 0 or s = 1
         { 1 + (min i < s) R(i, s)    otherwise

We can use bounded minimization, since there is only one i that satisfies R(i, s)
when s is a code of a sequence, and if i exists it is less than s itself.

Proposition 29.21. The function append(s, a), which returns the result of
appending a to the sequence s, is primitive recursive.

Proof. append can be defined by:


append(s, a) = { 2^{a+1}                 if s = 0 or s = 1
               { s · p_{len(s)}^{a+1}    otherwise.

Proposition 29.22. The function element(s, i), which returns the ith element
of s (where the initial element is called the 0th), or 0 if i is greater than or
equal to the length of s, is primitive recursive.

Proof. Note that a is the ith element of s iff p_i^{a+1} is the largest power of p_i
that divides s, i.e., p_i^{a+1} | s but p_i^{a+2} ∤ s. So:

element(s, i) = { 0                             if i ≥ len(s)
                { (min a < s) (p_i^{a+2} ∤ s)   otherwise.

Instead of using the official names for the functions defined above, we intro-
duce a more compact notation. We will use (s)i instead of element(s, i), and
⟨s0 , . . . , sk ⟩ to abbreviate

append(append(. . . append(Λ, s0 ) . . . ), sk ).

Note that if s has length k, the elements of s are (s)0 , . . . , (s)k−1 .

Proposition 29.23. The function concat(s, t), which concatenates two se-
quences, is primitive recursive.

Proof. We want a function concat with the property that

concat(⟨a0 , . . . , ak ⟩, ⟨b0 , . . . , bl ⟩) = ⟨a0 , . . . , ak , b0 , . . . , bl ⟩.


We’ll use a “helper” function hconcat(s, t, n) which concatenates the first n


symbols of t to s. This function can be defined by primitive recursion as
follows:

hconcat(s, t, 0) = s
hconcat(s, t, n + 1) = append(hconcat(s, t, n), (t)n )

Then we can define concat by

concat(s, t) = hconcat(s, t, len(t)).

We will write s ⌢ t instead of concat(s, t).
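
The coding and the operations just defined are easy to prototype. In the
following Python sketch (ours; for brevity it uses unbounded while-loops rather
than the official bounded minimization):

```python
def nth_prime(i):
    """p_i: the i-th prime (p_0 = 2)."""
    count, q = -1, 1
    while count < i:
        q += 1
        if all(q % d != 0 for d in range(2, q)):
            count += 1
    return q

def code(seq):
    """<a_0, ..., a_k> maps to p_0^(a_0+1) * ... * p_k^(a_k+1); Lambda = 0."""
    n = 1 if seq else 0
    for i, a in enumerate(seq):
        n *= nth_prime(i) ** (a + 1)
    return n

def length(s):
    i = 0
    while s not in (0, 1) and s % nth_prime(i) == 0:
        i += 1
    return i

def element(s, i):
    if i >= length(s):
        return 0
    a = 0
    while s % nth_prime(i) ** (a + 2) == 0:   # p_i^(a+2) still divides s
        a += 1
    return a

s = code((2, 7, 3))
assert length(s) == 3 and [element(s, i) for i in range(4)] == [2, 7, 3, 0]
```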


It will be useful for us to be able to bound the numeric code of a sequence in
terms of its length and its largest element. Suppose s is a sequence of length k,
each element of which is less than or equal to some number x. Then s has at
most k prime factors, each at most pk−1 , and each raised to at most x + 1 in
the prime factorization of s. In other words, if we define
sequenceBound(x, k) = p_{k−1}^{k·(x+1)} ,

then the numeric code of the sequence s described above is at most sequenceBound(x, k).
Having such a bound on sequences gives us a way of defining new functions
using bounded search. For example, we can define concat using bounded search.
All we need to do is write down a primitive recursive specification of the object
(number of the concatenated sequence) we are looking for, and a bound on how
far to look. The following works:

concat(s, t) = (min v < sequenceBound(s + t, len(s) + len(t)))


(len(v) = len(s) + len(t) ∧
(∀i < len(s)) ((v)i = (s)i ) ∧
(∀j < len(t)) ((v)len(s)+j = (t)j ))

Problem 29.9. Show that there is a primitive recursive function sconcat(s)


with the property that

sconcat(⟨s0 , . . . , sk ⟩) = s0 ⌢ . . . ⌢ sk .

Problem 29.10. Show that there is a primitive recursive function tail(s) with
the property that

tail(Λ) = 0 and
tail(⟨s0 , . . . , sk ⟩) = ⟨s1 , . . . , sk ⟩.

Proposition 29.24. The function subseq(s, i, n) which returns the subse-
quence of s of length n beginning at the ith element, is primitive recursive.

Proof. Exercise.


Problem 29.11. Prove Proposition 29.24.


29.12 Trees
Sometimes it is useful to represent trees as natural numbers, just like we can
represent sequences by numbers and properties of and operations on them by
primitive recursive relations and functions on their codes. We’ll use sequences
and their codes to do this. A tree can be either a single node (possibly with a
label) or else a node (possibly with a label) connected to a number of subtrees.
The node is called the root of the tree, and the subtrees it is connected to are
called its immediate subtrees.
We code trees recursively as a sequence ⟨k, d1 , . . . , dk ⟩, where k is the num-
ber of immediate subtrees and d1 , . . . , dk the codes of the immediate subtrees.
If the nodes have labels, they can be included after the immediate subtrees. So
a tree consisting just of a single node with label l would be coded by ⟨0, l⟩, and
a tree consisting of a root (labelled l1 ) connected to two single nodes (labelled
l2 , l3 ) would be coded by ⟨2, ⟨0, l2 ⟩, ⟨0, l3 ⟩, l1 ⟩.

Proposition 29.25. The function SubtreeSeq(t), which returns the code of
a sequence the elements of which are the codes of all subtrees of the tree with
code t, is primitive recursive.

Proof. First note that ISubtrees(t) = subseq(t, 1, (t)0 ) is primitive recursive


and returns the codes of the immediate subtrees of a tree t. Now we can
define a helper function hSubtreeSeq(t, n) which computes the sequence of all
subtrees which are n nodes removed from the root. The sequence of subtrees
of t which is 0 nodes removed from the root—in other words, begins at the root
of t—is the sequence consisting just of t. To obtain a sequence of all level n + 1
subtrees of t, we concatenate the level n subtrees with a sequence consisting of
all immediate subtrees of the level n subtrees. To get a list of all these, note
that if f (x) is a primitive recursive function returning codes of sequences, then
gf (s, k) = f ((s)0 ) ⌢ . . . ⌢ f ((s)k ) is also primitive recursive:

gf (s, 0) = f ((s)0 )
gf (s, k + 1) = gf (s, k) ⌢ f ((s)k+1 )

For instance, if s is a sequence of trees, then h(s) = gISubtrees (s, len(s)) gives
the sequence of the immediate subtrees of the elements of s. We can use it to
define hSubtreeSeq by

hSubtreeSeq(t, 0) = ⟨t⟩
hSubtreeSeq(t, n + 1) = hSubtreeSeq(t, n) ⌢ h(hSubtreeSeq(t, n)).


The maximum level of subtrees in a tree coded by t, i.e., the maximum distance
between the root and a leaf node, is bounded by the code t. So a sequence of
codes of all subtrees of the tree coded by t is given by hSubtreeSeq(t, t).
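
Rendering trees directly as nested Python tuples of the same shape
(k, d1 , . . . , dk , label) gives a quick way to experiment with the level-by-level
traversal used in the proof (a sketch; the names are ours):

```python
def subtrees(t):
    """List all subtrees of t, level by level, as in hSubtreeSeq."""
    result, level = [t], [t]
    while level:
        # the immediate subtrees of a coded tree are entries 1, ..., k:
        level = [d for s in level for d in s[1 : s[0] + 1]]
        result.extend(level)
    return result

leaf2, leaf3 = (0, "l2"), (0, "l3")
tree = (2, leaf2, leaf3, "l1")
assert subtrees(tree) == [tree, leaf2, leaf3]
```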

Problem 29.12. The definition of hSubtreeSeq in the proof of Proposition 29.25


in general includes repetitions. Give an alternative definition which guarantees
that the code of a subtree occurs only once in the resulting list.


29.13 Other Recursions


Using pairing and sequencing, we can justify more exotic (and useful) forms
of primitive recursion. For example, it is often useful to define two functions
simultaneously, such as in the following definition:

h0 (⃗x, 0) = f0 (⃗x)
h1 (⃗x, 0) = f1 (⃗x)
h0 (⃗x, y + 1) = g0 (⃗x, y, h0 (⃗x, y), h1 (⃗x, y))
h1 (⃗x, y + 1) = g1 (⃗x, y, h0 (⃗x, y), h1 (⃗x, y))

This is an instance of simultaneous recursion. Another useful way of defining


functions is to give the value of h(⃗x, y + 1) in terms of all the values h(⃗x, 0),
. . . , h(⃗x, y), as in the following definition:

h(⃗x, 0) = f (⃗x)
h(⃗x, y + 1) = g(⃗x, y, ⟨h(⃗x, 0), . . . , h(⃗x, y)⟩).

The following schema captures this idea more succinctly:

h(⃗x, y) = g(⃗x, y, ⟨h(⃗x, 0), . . . , h(⃗x, y − 1)⟩)

with the understanding that the last argument to g is just the empty sequence
when y is 0. In either formulation, the idea is that in computing the “successor
step,” the function h can make use of the entire sequence of values computed
so far. This is known as a course-of-values recursion. For a particular example,
it can be used to justify the following type of definition:
h(⃗x, y) = { g(⃗x, y, h(⃗x, k(⃗x, y)))   if k(⃗x, y) < y
          { f (⃗x)                      otherwise

In other words, the value of h at y can be computed in terms of the value of h


at any previous value, given by k.
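
A Python sketch of course-of-values recursion (our rendering), together with
the classic example of the Fibonacci numbers, where each value depends on
the two preceding ones:

```python
def course_of_values(f, g):
    """h(xs, 0) = f(xs); h(xs, y+1) = g(xs, y, (h(xs,0), ..., h(xs,y)))."""
    def h(*args):
        *xs, y = args
        history = [f(*xs)]                       # h(xs, 0)
        for i in range(y):
            history.append(g(*xs, i, tuple(history)))
        return history[-1]
    return h

fib = course_of_values(
    lambda: 0,                                   # fib(0) = 0
    lambda y, hist: 1 if y == 0 else hist[y] + hist[y - 1],
)
assert [fib(n) for n in range(8)] == [0, 1, 1, 2, 3, 5, 8, 13]
```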


Problem 29.13. Define the remainder function r(x, y) by course-of-values re-


cursion. (If x, y are natural numbers and y > 0, r(x, y) is the number less
than y such that x = z × y + r(x, y) for some z. For definiteness, let’s say that
if y = 0, r(x, 0) = 0.)

You should think about how to obtain these functions using ordinary prim-
itive recursion. One final version of primitive recursion is more flexible in that
one is allowed to change the parameters (side values) along the way:

h(⃗x, 0) = f (⃗x)
h(⃗x, y + 1) = g(⃗x, y, h(k(⃗x), y))

This, too, can be simulated with ordinary primitive recursion. (Doing so is


tricky. For a hint, try unwinding the computation by hand.)


29.14 Non-Primitive Recursive Functions


The primitive recursive functions do not exhaust the intuitively computable
functions. It should be intuitively clear that we can make a list of all the
unary primitive recursive functions, f0 , f1 , f2 , . . . such that we can effectively
compute the value of fx on input y; in other words, the function g(x, y), defined
by
g(x, y) = fx (y)
is computable. But then so is the function

h(x) = g(x, x) + 1
= fx (x) + 1.

For each primitive recursive function fi , the value of h and fi differ at i. So h


is computable, but not primitive recursive; and one can say the same about g.
This is an “effective” version of Cantor’s diagonalization argument.
One can provide more explicit examples of computable functions that are
not primitive recursive. For example, let the notation g n (x) denote g(g(. . . g(x))),
with n g’s in all; and define a sequence g0 , g1 , . . . of functions by

g0 (x) = x+1
gn+1 (x) = gnx (x)

You can confirm that each function gn is primitive recursive. Each successive
function grows much faster than the one before; g1 (x) is equal to 2x, g2 (x) is
equal to 2^x · x, and g3 (x) grows roughly like an exponential stack of x 2’s. The
Ackermann–Péter function is essentially the function G(x) = gx (x), and one
can show that this grows faster than any primitive recursive function.


Let us return to the issue of enumerating the primitive recursive functions.


Remember that we have assigned symbolic notations to each primitive recursive
function; so it suffices to enumerate notations. We can assign a natural number
#(F ) to each notation F , recursively, as follows:

#(0) = ⟨0⟩
#(S) = ⟨1⟩
#(Pin ) = ⟨2, n, i⟩
#(Compk,l [H, G0 , . . . , Gk−1 ]) = ⟨3, k, l, #(H), #(G0 ), . . . , #(Gk−1 )⟩
#(Recl [G, H]) = ⟨4, l, #(G), #(H)⟩

Here we are using the fact that every sequence of numbers can be viewed as
a natural number, using the codes from the last section. The upshot is that
every code is assigned a natural number. Of course, some sequences (and
hence some numbers) do not correspond to notations; but we can let fi be the
unary primitive recursive function with notation coded as i, if i codes such a
notation; and the constant 0 function otherwise. The net result is that we have
an explicit way of enumerating the unary primitive recursive functions.
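
Using the tuple rendering of notations and the sequence coding from the earlier
sketches, the assignment of numbers to notations is a few lines of Python
(ours; the numbers explode in size very quickly, so only tiny examples are
feasible):

```python
def number(F):
    """#(F), computed via the sequence coding `code` sketched earlier."""
    tag = F[0]
    if tag == "zero":
        return code((0,))
    if tag == "succ":
        return code((1,))
    if tag == "proj":
        return code((2, F[1], F[2]))
    if tag == "comp":
        k, n, f, gs = F[1], F[2], F[3], F[4:]
        return code((3, k, n, number(f), *map(number, gs)))
    if tag == "rec":
        k, f, g = F[1], F[2], F[3]
        return code((4, k, number(f), number(g)))

assert number(SUCC) == 4    # code((1,)) = 2^(1+1)
```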
(In fact, some functions, like the constant zero function, will appear more than once on the list. This is not just an artifact of our coding, but also a result of the fact that the constant zero function has more than one notation. We will later see that one cannot computably avoid these repetitions; for example, there is no computable function that decides whether or not a given notation represents the constant zero function.)
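To see how such a coding might be set up in practice, here is a Python sketch; it simplifies by coding notations as nested tuples rather than single numbers, leaving the sequence-coding step implicit, and all the function names are ours:

# Notations coded as nested tuples; a sequence-coding function as in
# the last section would collapse these to single natural numbers.
def code_zero():           return (0,)
def code_succ():           return (1,)
def code_proj(n, i):       return (2, n, i)
def code_comp(l, h, *gs):  # Comp_{k,l}: k = len(gs) functions of arity l
    return (3, len(gs), l, h, *gs)
def code_rec(l, g, h):     # Rec_l
    return (4, l, g, h)

# Example: a notation for add(x, y), defined by primitive recursion:
#   add(x, 0) = P^1_0(x),  add(x, y+1) = S(P^3_2(x, y, add(x, y)))
add = code_rec(1, code_proj(1, 0),
               code_comp(3, code_succ(), code_proj(3, 2)))
print(add)   # a finite object from which the definition is recoverable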
We can now take the function g(x, y) to be given by fx (y), where fx refers
to the enumeration we have just described. How do we know that g(x, y) is
computable? Intuitively, this is clear: to compute g(x, y), first “unpack” x,
and see if it is a notation for a unary function. If it is, compute the value of
that function on input y.
You may already be convinced that (with some work!) one can write a
program (say, in Java or C++) that does this; and now we can appeal to the
Church–Turing thesis, which says that anything that, intuitively, is computable
can be computed by a Turing machine.
Of course, a more direct way to show that g(x, y) is computable is to de-
scribe a Turing machine that computes it, explicitly. This would, in particular,
avoid the Church–Turing thesis and appeals to intuition. Soon we will have
built up enough machinery to show that g(x, y) is computable, appealing to a
model of computation that can be simulated on a Turing machine: namely, the
recursive functions.


29.15 Partial Recursive Functions



To motivate the definition of the recursive functions, note that our proof that
there are computable functions that are not primitive recursive actually estab-
lishes much more. The argument was simple: all we used was the fact that it
is possible to enumerate functions f0 , f1 , . . . such that, as a function of x and
y, fx (y) is computable. So the argument applies to any class of functions that
can be enumerated in such a way. This puts us in a bind: we would like to
describe the computable functions explicitly; but any explicit description of a
collection of computable functions cannot be exhaustive!
The way out is to allow partial functions to come into play. We will see
that it is possible to enumerate the partial computable functions. In fact, we
already pretty much know that this is the case, since it is possible to enumerate
Turing machines in a systematic way. We will come back to our diagonal
argument later, and explore why it does not go through when partial functions
are included.
The question is now this: what do we need to add to the primitive recursive
functions to obtain all the partial recursive functions? We need to do two
things:

1. Modify our definition of the primitive recursive functions to allow for partial functions as well.

2. Add something to the definition, so that some new partial functions are
included.

The first is easy. As before, we will start with zero, successor, and projec-
tions, and close under composition and primitive recursion. The only difference
is that we have to modify the definitions of composition and primitive recur-
sion to allow for the possibility that some of the terms in the definition are not
defined. If f and g are partial functions, we will write f (x) ↓ to mean that f
is defined at x, i.e., x is in the domain of f ; and f (x) ↑ to mean the opposite,
i.e., that f is not defined at x. We will use f (x) ≃ g(x) to mean that either
f (x) and g(x) are both undefined, or they are both defined and equal. We
will use these notations for more complicated terms as well. We will adopt the
convention that if h and g0 , . . . , gk all are partial functions, then

h(g0 (⃗x), . . . , gk (⃗x))

is defined if and only if each gi is defined at ⃗x, and h is defined at g0(⃗x), . . . , gk(⃗x). With this understanding, the definitions of composition and primitive recursion for partial functions are just as above, except that we have to replace "=" by "≃".
What we will add to the definition of the primitive recursive functions to
obtain partial functions is the unbounded search operator. If f (x, ⃗z) is any
partial function on the natural numbers, define µx f (x, ⃗z) to be

the least x such that f (0, ⃗z), f (1, ⃗z), . . . , f (x, ⃗z) are all defined, and
f (x, ⃗z) = 0, if such an x exists


with the understanding that µx f (x, ⃗z) is undefined otherwise. This defines
µx f (x, ⃗z) uniquely.
Note that our definition makes no reference to Turing machines, or al-
gorithms, or any specific computational model. But like composition and
primitive recursion, there is an operational, computational intuition behind
unbounded search. When it comes to the computability of a partial func-
tion, arguments where the function is undefined correspond to inputs for which
the computation does not halt. The procedure for computing µx f (x, ⃗z) will
amount to this: compute f (0, ⃗z), f (1, ⃗z), f (2, ⃗z) until a value of 0 is returned.
If any of the intermediate computations do not halt, however, neither does the
computation of µx f (x, ⃗z).
If R(x, ⃗z) is any relation, µx R(x, ⃗z) is defined to be µx (1 −̇ χR (x, ⃗z)). In
other words, µx R(x, ⃗z) returns the least value of x such that R(x, ⃗z) holds.
So, if f (x, ⃗z) is a total function, µx f (x, ⃗z) is the same as µx (f (x, ⃗z) = 0).
But note that our original definition is more general, since it allows for the
possibility that f (x, ⃗z) is not everywhere defined (whereas, in contrast, the
characteristic function of a relation is always total).
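As a computational sketch, the unbounded search operator is just a while-loop; the following Python rendering (our own, with a made-up relation for the example) diverges exactly when µx f(x, ⃗z) is undefined:

def mu(f, *z):
    # Unbounded search: return the least x with f(x, *z) == 0.
    # If f is undefined (here: loops or raises) at some earlier x, or
    # no such x exists, this loop never returns -- matching the
    # semantics of the mu operator.
    x = 0
    while f(x, *z) != 0:
        x += 1
    return x

# Example: integer square root via the relational form
# mu x ((x+1)^2 > z), encoded as a 0/1-valued function.
def past_sqrt(x, z):
    return 0 if (x + 1) ** 2 > z else 1

print(mu(past_sqrt, 10))   # 3, since 3^2 <= 10 < 4^2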

Definition 29.26. The set of partial recursive functions is the smallest set of
partial functions from the natural numbers to the natural numbers (of various
arities) containing zero, successor, and projections, and closed under composi-
tion, primitive recursion, and unbounded search.

Of course, some of the partial recursive functions will happen to be total, i.e., defined for every argument.

Definition 29.27. The set of recursive functions is the set of partial recursive functions that are total.

A recursive function is sometimes called "total recursive" to emphasize that it is defined everywhere.


29.16 The Normal Form Theorem


Theorem 29.28 (Kleene's Normal Form Theorem). There is a primitive recursive relation T(e, x, s) and a primitive recursive function U(s), with
the following property: if f is any partial recursive function, then for some e,

f (x) ≃ U (µs T (e, x, s))

for every x.

The proof of the normal form theorem is involved, but the basic idea is
simple. Every partial recursive function has an index e, intuitively, a number


coding its program or definition. If f(x) ↓, the computation can be recorded systematically and coded by some number s, and the fact that s codes the
computation of f on input x can be checked primitive recursively using only x
and the definition e. Consequently, the relation T , “the function with index e
has a computation for input x, and s codes this computation,” is primitive
recursive. Given the full record of the computation s, the “upshot” of s is the
value of f (x), and it can be obtained from s primitive recursively as well.
The normal form theorem shows that only a single unbounded search is
required for the definition of any partial recursive function. Basically, we can
search through all numbers until we find one that codes a computation of the
function with index e for input x. We can use the numbers e as “names”
of partial recursive functions, and write φe for the function f defined by the
equation in the theorem. Note that any partial recursive function can have
more than one index—in fact, every partial recursive function has infinitely
many indices.


29.17 The Halting Problem


The halting problem in general is the problem of deciding, given the specifica-
tion e (e.g., program) of a computable function and a number n, whether the
computation of the function on input n halts, i.e., produces a result. Famously,
Alan Turing proved that this problem itself cannot be solved by a computable
function, i.e., the function
h(e, n) = 1   if computation e halts on input n
h(e, n) = 0   otherwise,

is not computable.
In the context of partial recursive functions, the role of the specification of a
program may be played by the index e given in Kleene’s normal form theorem.
If f is a partial recursive function, any e for which the equation in the normal
form theorem holds, is an index of f . Given a number e, the normal form
theorem states that
φe (x) ≃ U (µs T (e, x, s))
is partial recursive, and for every partial recursive f : N → N, there is an e ∈ N
such that φe (x) ≃ f (x) for all x ∈ N. In fact, for each such f there is not just
one, but infinitely many such e. The halting function h is defined by
h(e, x) = 1   if φe(x) ↓
h(e, x) = 0   otherwise.

Note that h(e, x) = 0 if φe (x) ↑, but also when e is not the index of a partial
recursive function at all.


Theorem 29.29. The halting function h is not partial recursive.

Proof. If h were partial recursive, we could define

d(y) = 1           if h(y, y) = 0
d(y) = µx x ≠ x    otherwise.

Since no number x satisfies x ≠ x, there is no µx x ≠ x, and so d(y) ↑ iff h(y, y) ≠ 0. From this definition it follows that

1. d(y) ↓ iff φy (y) ↑ or y is not the index of a partial recursive function.

2. d(y) ↑ iff φy (y) ↓.

If h were partial recursive, then d would be partial recursive as well. Thus, by the Kleene normal form theorem, it has an index ed. Consider the value of h(ed, ed). There are two possible cases, 0 and 1.

1. If h(ed , ed ) = 1 then φed (ed ) ↓. But φed ≃ d, and d(ed ) is defined iff
h(ed , ed ) = 0. So h(ed , ed ) ̸= 1.

2. If h(ed, ed) = 0 then either ed is not the index of a partial recursive function, or it is and φed(ed) ↑. But again, φed ≃ d, and d(ed) is undefined iff φed(ed) ↓.

The upshot is that ed cannot, after all, be the index of a partial recursive
function. But if h were partial recursive, d would be too, and so our definition
of ed as an index of it would be admissible. We must conclude that h cannot
be partial recursive.


29.18 General Recursive Functions


There is another way to obtain a set of total functions. Say a total function
f (x, ⃗z) is regular if for every sequence of natural numbers ⃗z, there is an x
such that f (x, ⃗z) = 0. In other words, the regular functions are exactly those
functions to which one can apply unbounded search, and end up with a to-
tal function. One can, conservatively, restrict unbounded search to regular
functions:

Definition 29.30. The set of general recursive functions is the smallest set
of functions from the natural numbers to the natural numbers (of various ari-
ties) containing zero, successor, and projections, and closed under composition,
primitive recursion, and unbounded search applied to regular functions.



Clearly every general recursive function is total. The difference between
Definition 29.30 and Definition 29.27 is that in the latter one is allowed to
use partial recursive functions along the way; the only requirement is that
the function you end up with at the end is total. So the word “general,” a
historic relic, is a misnomer; on the surface, Definition 29.30 is less general
than Definition 29.27. But, fortunately, the difference is illusory; though the
definitions are different, the set of general recursive functions and the set of
recursive functions are one and the same.

Chapter 30

Computability Theory

Material in this chapter should be reviewed and expanded. In particular, there are no exercises yet.


30.1 Introduction
The branch of logic known as Computability Theory deals with issues having
to do with the computability, or relative computability, of functions and sets.
It is evidence of Kleene's influence that the subject used to be known as Recursion Theory, and today both names are commonly used.
Let us call a function f : N ⇀ N partial computable if it can be computed
in some model of computation. If f is total we will simply say that f is
computable. A relation R with computable characteristic function χR is also
called computable. If f and g are partial functions, we will write f (x) ↓ to
mean that f is defined at x, i.e., x is in the domain of f ; and f (x) ↑ to mean
the opposite, i.e., that f is not defined at x. We will use f (x) ≃ g(x) to mean
that either f (x) and g(x) are both undefined, or they are both defined and
equal.
One can explore the subject without having to refer to a specific model
of computation. To do this, one shows that there is a universal partial computable function, Un(k, x). This allows us to enumerate the partial computable
functions. We will adopt the notation φk to denote the k-th unary partial
computable function, defined by φk (x) ≃ Un(k, x). (Kleene used {k} for this
purpose, but this notation has not been used as much recently.) Slightly more
generally, we can uniformly enumerate the partial computable functions of ar-
bitrary arities, and we will use φ^n_k to denote the k-th n-ary partial recursive
function.
Recall that if f (⃗x, y) is a total or partial function, then µy f (⃗x, y) is the
function of ⃗x that returns the least y such that f (⃗x, y) = 0, assuming that all of
f (⃗x, 0), . . . , f (⃗x, y − 1) are defined; if there is no such y, µy f (⃗x, y) is undefined.
If R(⃗x, y) is a relation, µy R(⃗x, y) is defined to be the least y such that R(⃗x, y) is
true; in other words, the least y such that one minus the characteristic function
of R is equal to zero at ⃗x, y.
To show that a function is computable, there are two ways one can proceed:
1. Rigorously: describe a Turing machine or partial recursive function ex-
plicitly, and show that it computes the function you have in mind;
2. Informally: describe an algorithm that computes it, and appeal to Church’s
thesis.
There is no fine line between the two; a detailed description of an algorithm
should provide enough information so that it is relatively clear how one could,
in principle, design the right Turing machine or sequence of partial recursive
definitions. Fully rigorous definitions are unlikely to be informative, and we
will try to find a happy medium between these two approaches; in short, we
will try to find intuitive yet rigorous proofs that the precise definitions could
be obtained.


30.2 Coding Computations


In every model of computation, it is possible to do the following:

1. Describe the definitions of computable functions in a systematic way. For instance, you can think of Turing machine specifications, recursive definitions, or programs in a programming language as providing these definitions.
2. Describe the complete record of the computation of a function given
by some definition for a given input. For instance, a Turing machine
computation can be described by the sequence of configurations (state of
the machine, contents of the tape) for each step of computation.
3. Test whether a putative record of a computation is in fact the record of
how a computable function with a given definition would be computed
for a given input.


4. Extract from such a description of the complete record of a computation the value of the function for a given input. For instance, the contents of
the tape in the very last step of a halting Turing machine computation
is the value.

Using coding, it is possible to assign to each description of a computable function a numerical index in such a way that the instructions can be recovered
from the index in a computable way. Similarly, the complete record of a com-
putation can be coded by a single number as well. The resulting arithmetical
relation “s codes the record of computation of the function with index e for
input x” and the function “output of computation sequence with code s” are
then computable; in fact, they are primitive recursive.
This fundamental fact is very powerful, and allows us to prove a number
of striking and important results about computability, independently of the
model of computation chosen.
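For concreteness, here is a Python sketch of one standard device for the coding step: the Cantor pairing function, which codes pairs (and, by iteration, finite sequences, and hence computation records) as single natural numbers. It is only one of many workable choices:

def pair(x, y):
    # Cantor pairing: a bijection between N x N and N.
    return (x + y) * (x + y + 1) // 2 + y

def unpair(z):
    # Invert the Cantor pairing function by searching for the diagonal.
    w = 0
    while (w + 1) * (w + 2) // 2 <= z:
        w += 1
    y = z - w * (w + 1) // 2
    return w - y, y

# A finite sequence of numbers can be collapsed to one number and
# recovered again:
z = pair(3, pair(1, 4))
x, yz = unpair(z)
print(x, unpair(yz))   # 3 (1, 4)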


30.3 The Normal Form Theorem


Theorem 30.1 (Kleene's Normal Form Theorem). There are a primitive recursive relation T(k, x, s) and a primitive recursive function U(s), with
the following property: if f is any partial computable function, then for some k,

f (x) ≃ U (µs T (k, x, s))

for every x.

Proof Sketch. For any model of computation one can rigorously define a de-
scription of the computable function f and code such description using a nat-
ural number k. One can also rigorously define a notion of “computation se-
quence” which records the process of computing the function with index k for
input x. These computation sequences can likewise be coded as numbers s.
This can be done in such a way that (a) it is decidable whether a number s
codes the computation sequence of the function with index k on input x and
(b) what the end result of the computation sequence coded by s is. In fact, the
relation in (a) and the function in (b) are primitive recursive.

In order to give a rigorous proof of the Normal Form Theorem, we would

have to fix a model of computation and carry out the coding of descriptions of
computable functions and of computation sequences in detail, and verify that
the relation T and function U are primitive recursive. For most applications,
it suffices that T and U are computable and that U is total.
It is probably best to remember the proof of the normal form theorem in
slogan form: µs T (k, x, s) searches for a computation sequence of the function


with index k on input x, and U returns the output of the computation sequence
if one can be found.
T and U can be used to define the enumeration φ0 , φ1 , φ2 , . . . . From now
on, we will assume that we have fixed a suitable choice of T and U , and take
the equation
φe (x) ≃ U (µs T (e, x, s))

to be the definition of φe .
Here is another useful fact:

Theorem 30.2. Every partial computable function has infinitely many in-
dices.

Again, this is intuitively clear. Given any (description of) a computable function, one can come up with a different description which computes the
same function (input-output pair) but does so, e.g., by first doing something
that has no effect on the computation (say, test if 0 = 0, or count to 5, etc.).
The index of the altered description will always be different from the original
index. Both are indices of the same function, just computed slightly differently.


30.4 The s-m-n Theorem


The next theorem is known as the "s-m-n theorem," for a reason that will be
clear in a moment. The hard part is understanding just what the theorem says;
once you understand the statement, it will seem fairly obvious.

Theorem 30.3. For each pair of natural numbers n and m, there is a primitive recursive function s^m_n such that for every sequence x, a_0, . . . , a_{m−1}, y_0, . . . , y_{n−1}, we have

φ^n_{s^m_n(x, a_0, . . . , a_{m−1})}(y_0, . . . , y_{n−1}) ≃ φ^{m+n}_x(a_0, . . . , a_{m−1}, y_0, . . . , y_{n−1}).

It is helpful to think of s^m_n as acting on programs. That is, s^m_n takes a program, x, for an (m + n)-ary function, as well as fixed inputs a_0, . . . , a_{m−1}; and it returns a program, s^m_n(x, a_0, . . . , a_{m−1}), for the n-ary function of the remaining arguments. If you think of x as the description of a Turing machine, then s^m_n(x, a_0, . . . , a_{m−1}) is the Turing machine that, on input y_0, . . . , y_{n−1}, prepends a_0, . . . , a_{m−1} to the input string, and runs x. Each s^m_n is then just
a primitive recursive function that finds a code for the appropriate Turing
machine.
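In programming terms, s^m_n is just partial application. The following Python sketch (the names are ours, and closures stand in for indices of programs) illustrates the idea:

def s_m_n(program, *fixed):
    # Given an (m+n)-ary function and m fixed arguments, return the
    # n-ary function of the remaining arguments. The real s-m-n
    # theorem does this on indices, primitive recursively; here a
    # closure plays the role of the new index.
    def specialized(*rest):
        return program(*fixed, *rest)
    return specialized

def add3(a, b, c):
    return a + b + c

add_10_20 = s_m_n(add3, 10, 20)
print(add_10_20(5))   # 35 == add3(10, 20, 5)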


30.5 The Universal Partial Computable Function


Theorem 30.4. There is a universal partial computable function Un(k, x).
In other words, there is a function Un(k, x) such that:
1. Un(k, x) is partial computable.
2. If f (x) is any partial computable function, then there is a natural number
k such that f (x) ≃ Un(k, x) for every x.

Proof. Let Un(k, x) ≃ U (µs T (k, x, s)) in Kleene’s normal form theorem.

This is just a precise way of saying that we have an effective enumeration of
the partial computable functions; the idea is that if we write fk for the function
defined by fk (x) = Un(k, x), then the sequence f0 , f1 , f2 , . . . includes all the
partial computable functions, with the property that fk (x) can be computed
“uniformly” in k and x. For simplicity, we are using a binary function that
is universal for unary functions, but by coding sequences of numbers we can
easily generalize this to more arguments. For example, note that if f (x, y, z) is
a 3-place partial recursive function, then the function g(x) ≃ f ((x)0 , (x)1 , (x)2 )
is a unary recursive function.


30.6 No Universal Computable Function


Theorem 30.5. There is no universal computable function. In other words,
the universal function Un′ (k, x) = φk (x) is not computable.

Proof. This theorem says that there is no total computable function that is
universal for the total computable functions. The proof is a simple diagonal-
ization: if Un′ (k, x) were total and computable, then

d(x) = Un′ (x, x) + 1

would also be total and computable. However, for every k, d(k) is not equal to
Un′ (k, k).

Theorem 30.4 above shows that we can get around this diagonal-
ization argument, but only at the expense of allowing partial functions. It is
worth trying to understand what goes wrong with the diagonalization argu-
ment, when we try to apply it in the partial case. In particular, the function
h(x) = Un(x, x) + 1 is partial recursive. Suppose h is the k-th function in the
enumeration; what can we say about h(k)?


30.7 The Halting Problem


Since, in our construction, Un(k, x) is defined if and only if the computation
of the function coded by k produces a value for input x, it is natural to ask
if we can decide whether this is the case. And in fact, it is not. For the
Turing machine model of computation, this means that whether a given Turing
machine halts on a given input is computationally undecidable. The following
theorem is therefore known as the “undecidability of the halting problem.” We
will provide two proofs below. The first continues the thread of our previous
discussion, while the second is more direct.
Theorem 30.6. Let

h(k, x) = 1   if Un(k, x) is defined
h(k, x) = 0   otherwise.

Then h is not computable.

Proof. If h were computable, we would have a universal computable function, as follows. Suppose h is computable, and define

Un′(k, x) = Un(k, x)   if h(k, x) = 1
Un′(k, x) = 0          otherwise.

But now Un′ (k, x) is a total function, and is computable if h is. For instance,
we could define g using primitive recursion, by
g(0, k, x) ≃ 0
g(y + 1, k, x) ≃ Un(k, x);
then
Un′ (k, x) ≃ g(h(k, x), k, x).
And since Un′ (k, x) agrees with Un(k, x) wherever the latter is defined, Un′ is
universal for those partial computable functions that happen to be total. But
this contradicts Theorem 30.5.

Proof. Suppose h(k, x) were computable. Define the function g by

g(x) ≃ 0          if h(x, x) = 0
g(x) undefined    otherwise.

The function g is partial computable; for example, one can define it as g(x) ≃ µy (h(x, x) = 0). So, for some k, g(x) ≃ Un(k, x) for every x. Is g defined at k? If it is, then, by the definition of g, h(k, k) = 0. By the definition of h, this means that Un(k, k) is undefined; but by our assumption that g(x) ≃ Un(k, x) for every x, this means that g(k) is undefined, a contradiction. On the other hand, if g(k) is undefined, then h(k, k) ≠ 0, and so h(k, k) = 1. But this means that Un(k, k) is defined, i.e., that g(k) is defined, again a contradiction.


We can describe this argument in terms of Turing machines. Suppose there were a Turing machine H that took as input a description of a Turing machine K and an input x, and decided whether or not K halts on input x. Then we could build another Turing machine G which takes a single input x, calls H to decide if machine x halts on input x, and does the opposite. In other words, if H reports that x halts on input x, G goes into an infinite loop, and if H reports that x doesn't halt on input x, then G just halts. Does G halt on input G? The argument above shows that it does if and only if it doesn't, a contradiction. So our supposition that there is such a Turing machine H is false.
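In Python-flavored form, the machine G of the previous paragraph looks as follows; halts is the hypothetical decision procedure H, which, as the theorem shows, cannot actually be implemented:

def halts(program, x):
    # Hypothetical: return True iff program(x) halts. The theorem
    # shows no such total computable function can exist.
    raise NotImplementedError

def G(program):
    # Do the opposite of what H predicts about program run on itself.
    if halts(program, program):
        while True:      # H says it halts: loop forever instead
            pass
    else:
        return 0         # H says it loops: halt immediately

# Running G on (a description of) G itself would have to halt if and
# only if it does not halt -- the contradiction refuting halts.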


30.8 Comparison with Russell’s Paradox


It is instructive to compare and contrast the arguments in this section with
Russell’s paradox:

1. Russell's paradox: let S = {x : x ∉ x}. Then S ∈ S if and only if S ∉ S, a contradiction.
Conclusion: There is no such set S. Assuming the existence of a "set of all sets" is inconsistent with the other axioms of set theory.

2. A modification of Russell's paradox: let F be the "function" from the set of all functions to {0, 1}, defined by

F(f) = 1   if f is in the domain of f, and f(f) = 0
F(f) = 0   otherwise

A similar argument shows that F(F) = 0 if and only if F(F) = 1, a contradiction.
Conclusion: F is not a function. The "set of all functions" is too big to be the domain of a function.

3. The diagonalization argument: let f0, f1, . . . be the enumeration of the partial computable functions, and let G : N → {0, 1} be defined by

G(x) = 1   if fx(x) ↓= 0
G(x) = 0   otherwise

If G is computable, then it is the function fk for some k. But then G(k) = 1 if and only if G(k) = 0, a contradiction.
Conclusion: G is not computable. Note that according to the axioms of set theory, G is still a function; there is no paradox here, just a clarification.


All this talk of partial functions, computable functions, partial computable
functions, and so on can be confusing. The set of all partial functions from N
to N is a big collection of objects. Some of them are total, some of them are
computable, some are both total and computable, and some are neither. Keep
in mind that when we say “function,” by default, we mean a total function.
Thus we have:

1. computable functions

2. partial computable functions that are not total

3. functions that are not computable

4. partial functions that are neither total nor computable

To sort this out, it might help to draw a big square representing all the partial
functions from N to N, and then mark off two overlapping regions, correspond-
ing to the total functions and the computable partial functions, respectively.
It is a good exercise to see if you can describe an object in each of the resulting
regions in the diagram.


30.9 Computable Sets


We can extend the notion of computability from computable functions to com-
putable sets:

Definition 30.7. Let S be a set of natural numbers. Then S is computable iff its characteristic function is. In other words, S is computable iff the function

χS(x) = 1   if x ∈ S
χS(x) = 0   otherwise

is computable. Similarly, a relation R(x0, . . . , xk−1) is computable if and only if its characteristic function is.

Computable sets are also called decidable.


Notice that we now have a number of notions of computability: for partial
functions, for functions, and for sets. Do not get them confused! The Turing
machine computing a partial function returns the output of the function, for
input values at which the function is defined; the Turing machine computing
a set returns either 1 or 0, after deciding whether or not the input value is in the set.


30.10 Computably Enumerable Sets


Definition 30.8. A set is computably enumerable if it is empty or the range
of a computable function.

Historical Remarks. Computably enumerable sets are also called recursively enumerable; this is the original terminology, and today both are commonly used, as well as the abbreviations "c.e." and "r.e."
You should think about what the definition means, and why the terminology is appropriate. The idea is that if S is the range of the computable function f, then
S = {f (0), f (1), f (2), . . . },
and so f can be seen as “enumerating” the elements of S. Note that according
to the definition, f need not be an increasing function, i.e., the enumeration
need not be in increasing order. In fact, f need not even be injective, so that
the constant function f (x) = 0 enumerates the set {0}.
Any computable set is computably enumerable. To see this, suppose S is
computable. If S is empty, then by definition it is computably enumerable.
Otherwise, let a be any element of S. Define f by
f(x) = x   if χS(x) = 1
f(x) = a   otherwise.

Then f is a computable function, and S is the range of f .
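A Python sketch of this construction, with a stand-in characteristic function (here S is taken to be the even numbers, purely for illustration):

def chi_S(x):
    # Stand-in characteristic function: S = the even numbers.
    return 1 if x % 2 == 0 else 0

a = 0   # some fixed element of S

def f(x):
    # Enumerates S: outputs x when x is in S, the default a otherwise.
    return x if chi_S(x) == 1 else a

print(sorted({f(x) for x in range(10)}))   # [0, 2, 4, 6, 8]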


30.11 Equivalent Definitions of Computably Enumerable Sets

The following gives a number of important equivalent statements of what it
means to be computably enumerable.

Theorem 30.9. Let S be a set of natural numbers. Then the following are
equivalent:

1. S is computably enumerable.

2. S is the range of a partial computable function.

3. S is empty or the range of a primitive recursive function.

4. S is the domain of a partial computable function.


The first three clauses say that we can equivalently take any non-empty
computably enumerable set to be enumerated by either a computable function,
a partial computable function, or a primitive recursive function. The fourth
clause tells us that if S is computably enumerable, then for some index e,
S = {x : φe (x) ↓}.
In other words, S is the set of inputs for which the computation of φe
halts. For that reason, computably enumerable sets are sometimes called semi-
decidable: if a number is in the set, you eventually get a “yes,” but if it isn’t,
you never get a “no”!
Proof. Since every primitive recursive function is computable and every com-
putable function is partial computable, (3) implies (1) and (1) implies (2).
(Note that if S is empty, S is the range of the partial computable function that
is nowhere defined.) If we show that (2) implies (3), we will have shown the
first three clauses equivalent.
So, suppose S is the range of the partial computable function φe . If S is
empty, we are done. Otherwise, let a be any element of S. By Kleene’s normal
form theorem, we can write
φe (x) = U (µs T (e, x, s)).
In particular, φe(x) ↓= y if and only if there is an s such that T(e, x, s)
and U (s) = y. Define f (z) by
f(z) = U((z)1)   if T(e, (z)0, (z)1)
f(z) = a         otherwise.

Then f is primitive recursive, because T and U are. Expressed in terms of Turing machines, if z codes a pair ⟨(z)0, (z)1⟩ such that (z)1 is a halting computation of machine e on input (z)0, then f returns the output of the computation; otherwise, it returns a.
We need to show that S is the range of f, i.e.,
for any natural number y, y ∈ S if and only if it is in the range of f . In the
forwards direction, suppose y ∈ S. Then y is in the range of φe , so for some
x and s, T (e, x, s) and U (s) = y; but then y = f (⟨x, s⟩). Conversely, suppose
y is in the range of f. Then either y = a, or for some z, T(e, (z)0, (z)1) and U((z)1) = y. Since, in the latter case, φe((z)0) ↓= y, either way, y is in S.
(The notation φe (x) ↓= y means “φe (x) is defined and equal to y.” We
could just as well use φe (x) = y, but the extra arrow is sometimes helpful in
reminding us that we are dealing with a partial function.)
To finish up the proof of Theorem 30.9, it suffices to show that (1) and (4)
are equivalent. First, let us show that (1) implies (4). Suppose S is the range
of a computable function f , i.e.,
S = {y : for some x,f (x) = y}.
Let
g(y) = µx (f(x) = y).


Then g is a partial computable function, and g(y) is defined if and only if for
some x, f (x) = y. In other words, the domain of g is the range of f . Expressed
in terms of Turing machines: given a Turing machine F that enumerates the
elements of S, let G be the Turing machine that semi-decides S by searching
through the outputs of F to see if a given element is in the set.
Finally, to show (4) implies (1), suppose that S is the domain of the partial
computable function φe , i.e.,

S = {x : φe (x) ↓}.

If S is empty, we are done; otherwise, let a be any element of S. Define f by


f(z) = (z)0   if T(e, (z)0, (z)1)
f(z) = a      otherwise.

Then, as above, a number x is in the range of f if and only if φe (x) ↓, i.e., if and
only if x ∈ S. Expressed in terms of Turing machines: given a machine Me that
semi-decides S, enumerate the elements of S by running through all possible
Turing machine computations, and returning the inputs that correspond to
halting computations.

The fourth clause of Theorem 30.9 provides us with a convenient way of enumerating the computably enumerable sets: for each e, let We denote the domain of φe. Then if A is any computably enumerable set, A = We for some e.
The following provides yet another characterization of the computably enu-
merable sets.

Theorem 30.10. A set S is computably enumerable if and only if there is a computable relation R(x, y) such that

S = {x : ∃y R(x, y)}.

Proof. In the forward direction, suppose S is computably enumerable. Then for some e, S = We. For this value of e we can write S as

S = {x : ∃y T (e, x, y)}.

In the reverse direction, suppose S = {x : ∃y R(x, y)}. Define f by

f(x) ≃ µy R(x, y).

Then f is partial computable, and S is the domain of f .
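Computationally, the reverse direction is a simple witness search. A Python sketch, with R a stand-in computable relation chosen only for illustration:

def R(x, y):
    # Stand-in computable relation: y witnesses x iff y * y == x.
    return y * y == x

def semi_decide(x):
    # Search for a witness y; halts (returning y) iff x is in S.
    y = 0
    while not R(x, y):
        y += 1
    return y

print(semi_decide(49))   # halts with 7; semi_decide(50) would loop forever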


30.12 Computably Enumerable Sets are Closed under Union and Intersection

The following theorem gives some closure properties on the set of computably
enumerable sets.
Theorem 30.11. Suppose A and B are computably enumerable. Then so are
A ∩ B and A ∪ B.

Proof. Theorem 30.9 allows us to use various characterizations of the computably enumerable sets. By way of illustration, we will provide a few different
proofs.
For the first proof, suppose A is enumerated by a computable function f ,
and B is enumerated by a computable function g. Let

h(x) = µy (f(y) = x ∨ g(y) = x)   and
j(x) = µy (f((y)0) = x ∧ g((y)1) = x).

Then A ∪ B is the domain of h, and A ∩ B is the domain of j.


Here is what is going on, in computational terms: given procedures that enumerate A and B, we can semi-decide if an element x is in A ∪ B by looking for x in either enumeration; and we can semi-decide if an element x is in A ∩ B by looking for x in both enumerations at the same time.
For the second proof, suppose again that A is enumerated by f and B is
enumerated by g. Let
k(x) = f(x/2)         if x is even
k(x) = g((x − 1)/2)    if x is odd.

Then k enumerates A ∪ B; the idea is that k just alternates between the enumerations offered by f and g. Enumerating A ∩ B is trickier. If A ∩ B is empty, it is trivially computably enumerable. Otherwise, let c be any element of A ∩ B, and define l by

l(x) = f((x)0)   if f((x)0) = g((x)1)
l(x) = c         otherwise.
In computational terms, l runs through pairs of elements in the enumerations of
f and g, and outputs every match it finds; otherwise, it just stalls by outputting
c.
For the last proof, suppose A is the domain of the partial function m(x)
and B is the domain of the partial function n(x). Then A ∩ B is the domain
of the partial function m(x) + n(x).
In computational terms, if A is the set of values for which m halts and B
is the set of values for which n halts, A ∩ B is the set of values for which both
procedures halt.
Expressing A ∪ B as a set of halting values is more difficult, because one
has to simulate m and n in parallel. Let d be an index for m and let e be an index for n; in other words, m = φd and n = φe. Then A ∪ B is the domain of
the function
p(x) = µy (T (d, x, y) ∨ T (e, x, y)).
In computational terms, on input x, p searches for either a halting computation
for m or a halting computation for n, and halts if it finds either one.
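This parallel simulation is often called dovetailing: at stage s, give each machine s steps. A rough Python sketch of the idea, where m_steps(x, s) and n_steps(x, s) are stand-ins for the decidable, step-bounded halting checks that the predicate T provides:

from itertools import count

def dovetail(m_steps, n_steps, x):
    # Semi-decide membership of x in the union of the domains of m and
    # n: at stage s, give each machine s steps, and halt as soon as
    # one of the two halts; otherwise, loop forever.
    for s in count():
        if m_steps(x, s) or n_steps(x, s):
            return True   # x is in A union B

# Stand-ins: m halts on even x after x steps, n on multiples of 3
# after 2x steps; both diverge everywhere else.
m_steps = lambda x, s: x % 2 == 0 and s >= x
n_steps = lambda x, s: x % 3 == 0 and s >= 2 * x
print(dovetail(m_steps, n_steps, 9))   # True: the second machine halts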


30.13 Computably Enumerable Sets not Closed under Complement

Suppose A is computably enumerable. Is the complement of A, Ā = N \ A, necessarily computably enumerable as well? The following theorem and
corollary show that the answer is “no.”

Theorem 30.12. Let A be any set of natural numbers. Then A is computable if and only if both A and Ā are computably enumerable.

Proof. The forwards direction is easy: if A is computable, then Ā is computable as well (χĀ = 1 −̇ χA), and so both are computably enumerable.
In the other direction, suppose A and Ā are both computably enumerable. Let A be the domain of φd, and let Ā be the domain of φe. Define h by

h(x) = µs (T(d, x, s) ∨ T(e, x, s)).

In other words, on input x, h searches for either a halting computation of φd or a halting computation of φe. Now, if x ∈ A, it will succeed in the first case, and if x ∈ Ā, it will succeed in the second case. So, h is a total computable function. But now we have that for every x, x ∈ A if and only if T(d, x, h(x)), i.e., if φd is the one that is defined. Since T(d, x, h(x)) is a computable relation, A is computable.

It is easier to understand what is going on in informal computational terms: to decide A, on input x search for halting computations of φd and φe. One of them is bound to halt; if it is φd, then x is in A, and otherwise, x is in Ā.

Corollary 30.13. The complement of K0 is not computably enumerable.

Proof. We know that K0 is computably enumerable, but not computable. If the complement of K0 were computably enumerable, then K0 would be computable by Theorem 30.12.


30.14 Reducibility
We now know that there is at least one set, K0, that is computably enumerable
but not computable. It should be clear that there are others. The method of
reducibility provides a powerful method of showing that other sets have these
properties, without constantly having to return to first principles.
Generally speaking, a “reduction” of a set A to a set B is a method of
transforming answers to whether or not elements are in B into answers as to
whether or not elements are in A. We will focus on a notion called “many-
one reducibility,” but there are many other notions of reducibility available,
with varying properties. Notions of reducibility are also central to the study
of computational complexity, where efficiency issues have to be considered as
well. For example, a set is said to be “NP-complete” if it is in NP and every
NP problem can be reduced to it, using a notion of reduction that is similar to
the one described below, only with the added requirement that the reduction
can be computed in polynomial time.
We have already used this notion implicitly. Define the set K by
K = {x : φx (x) ↓},
i.e., K = {x : x ∈ Wx}. Our proof that the halting problem is unsolvable,
Theorem 30.6, shows most directly that K is not computable. Recall that K0
is the set
K0 = {⟨e, x⟩ : φe (x) ↓}.
i.e., K0 = {⟨e, x⟩ : x ∈ We}. It is easy to extend any proof of the uncomputabil-
ity of K to the uncomputability of K0 : if K0 were computable, we could decide
whether or not an element x is in K simply by asking whether or not the pair
⟨x, x⟩ is in K0 . The function f which maps x to ⟨x, x⟩ is an example of a
reduction of K to K0 .
Definition 30.14. Let A and B be sets. Then A is said to be many-one
reducible to B, written A ≤m B, if there is a computable function f such that
for every natural number x,
x∈A if and only if f (x) ∈ B.
If A is many-one reducible to B and vice-versa, then A and B are said to be
many-one equivalent, written A ≡m B.

If the function f in the definition above happens to be injective, A is said to be one-one reducible to B. Most of the reductions described below meet
this stronger requirement, but we will not use this fact.
It is true, but by no means obvious, that one-one reducibility really is a
stronger requirement than many-one reducibility. In other words, there are
infinite sets A and B such that A is many-one reducible to B but not one-one
reducible to B.


30.15 Properties of Reducibility


The intuition behind writing A ≤m B is that A is "no harder than" B. The
following two propositions support this intuition.

Proposition 30.15. If A ≤m B and B ≤m C, then A ≤m C.

Proof. Composing a reduction of A to B with a reduction of B to C yields a reduction of A to C. (You should check the details!)

Proposition 30.16. Let A and B be any sets, and suppose A is many-one
reducible to B.

1. If B is computably enumerable, so is A.

2. If B is computable, so is A.

Proof. Let f be a many-one reduction from A to B. For the first claim, just
check that if B is the domain of a partial function g, then A is the domain
of g ◦ f :

x ∈ A   iff f(x) ∈ B
        iff g(f(x)) ↓.

For the second claim, remember that if B is computable then B and B̄ are computably enumerable. It is not hard to check that f is also a many-one reduction of Ā to B̄, so, by the first part of this proof, A and Ā are computably enumerable. So A is computable as well. (Alternatively, you can check that χA = χB ◦ f; so if χB is computable, then so is χA.)

A more general notion of reducibility called Turing reducibility is useful

in other contexts, especially for proving undecidability results. Note that by Corollary 30.13, the complement of K0 is not reducible to K0, since it is not
computably enumerable. But, intuitively, if you knew the answers to questions
about K0 , you would know the answer to questions about its complement as
well. A set A is said to be Turing reducible to B if one can determine answers to
questions in A using a computable procedure that can ask questions about B.
This is more liberal than many-one reducibility, in which (1) you are only
allowed to ask one question about B, and (2) a “yes” answer has to translate
to a “yes” answer to the question about A, and similarly for “no.” It is still
the case that if A is Turing reducible to B and B is computable then A is
computable as well (though, as we have seen, the analogous statement does
not hold for computable enumerability).
You should think about the various notions of reducibility we have dis-
cussed, and understand the distinctions between them. We will, however, only
deal with many-one reducibility in this chapter. Incidentally, both types of
reducibility discussed in the last paragraph have analogues in computational


complexity, with the added requirement that the Turing machines run in poly-
nomial time: the complexity version of many-one reducibility is known as Karp
reducibility, while the complexity version of Turing reducibility is known as
Cook reducibility.


30.16 Complete Computably Enumerable Sets


Definition 30.17. A set A is a complete computably enumerable set (under
many-one reducibility) if

1. A is computably enumerable, and

2. for any other computably enumerable set B, B ≤m A.

In other words, complete computably enumerable sets are the "hardest" computably enumerable sets possible; they allow one to answer questions about
any computably enumerable set.

Theorem 30.18. K, K0, and K1 are all complete computably enumerable sets.

Proof. To see that K0 is complete, let B be any computably enumerable set. Then for some index e,

B = We = {x : φe (x) ↓}.

Let f be the function f (x) = ⟨e, x⟩. Then for every natural number x, x ∈ B
if and only if f (x) ∈ K0 . In other words, f reduces B to K0 .
To see that K1 is complete, note that in the proof of Proposition 30.19 we
reduced K0 to it. So, by Proposition 30.15, any computably enumerable set
can be reduced to K1 as well.
K can be reduced to K0 in much the same way.

Problem 30.1. Give a reduction of K to K0 .

So, it turns out that all the examples of computably enumerable sets that
we have considered so far are either computable, or complete. This should
seem strange! Are there any examples of computably enumerable sets that
are neither computable nor complete? The answer is yes, but it wasn’t until
the middle of the 1950s that this was established by Friedberg and Muchnik,
independently.


30.17 An Example of Reducibility


Let us consider an application of Proposition 30.16.

Proposition 30.19. Let
K1 = {e : φe (0) ↓}.
Then K1 is computably enumerable but not computable.
Proof. Since K1 = {e : ∃s T (e, 0, s)}, K1 is computably enumerable by Theo-
rem 30.10.
To show that K1 is not computable, let us show that K0 is reducible to it.
This is a little bit tricky, since using K1 we can only ask questions about
computations that start with a particular input, 0. Suppose you have a smart
friend who can answer questions of this type (friends like this are known as
“oracles”). Then suppose someone comes up to you and asks you whether or
not ⟨e, x⟩ is in K0 , that is, whether or not machine e halts on input x. One
thing you can do is build another machine, ex , that, for any input, ignores that
input and instead runs e on input x. Then clearly the question as to whether
machine e halts on input x is equivalent to the question as to whether machine
ex halts on input 0 (or any other input). So, then you ask your friend whether
this new machine, ex , halts on input 0; your friend’s answer to the modified
question provides the answer to the original one. This provides the desired
reduction of K0 to K1 .
Using the universal partial computable function, let f be the 3-ary function
defined by
f (x, y, z) ≃ φx (y).
Note that f ignores its third input entirely. Pick an index e such that f = φ^3_e; so we have

φ^3_e(x, y, z) ≃ φx(y).

By the s-m-n theorem, there is a function s(e, x, y) such that, for every z,

φs(e,x,y)(z) ≃ φ^3_e(x, y, z) ≃ φx(y).
In terms of the informal argument above, s(e, x, y) is an index for the ma-
chine that, for any input z, ignores that input and computes φx (y).
In particular, we have
φs(e,x,y) (0) ↓ if and only if φx (y) ↓ .
In other words, ⟨x, y⟩ ∈ K0 if and only if s(e, x, y) ∈ K1 . So the function g
defined by
g(w) = s(e, (w)0 , (w)1 )
is a reduction of K0 to K1 .


30.18 Totality is Undecidable


Let us consider one more example of using the s-m-n theorem to show that
something is noncomputable. Let Tot be the set of indices of total computable
functions, i.e.
Tot = {x : for every y, φx (y) ↓}.
Proposition 30.20. Tot is not computable.
Proof. To see that Tot is not computable, it suffices to show that K is reducible
to it. Let h(x, y) be defined by
h(x, y) ≃ 0          if x ∈ K
h(x, y) undefined    otherwise

Note that h(x, y) does not depend on y at all. It should not be hard to see that
h is partial computable: on input x, y, we compute h by first simulating
the function φx on input x; if this computation halts, h(x, y) outputs 0 and
halts. So h(x, y) is just Z(µs T (x, x, s)), where Z is the constant zero function.
Using the s-m-n theorem, there is a primitive recursive function k(x) such
that for every x and y,
φk(x)(y) = 0          if x ∈ K
φk(x)(y) undefined    otherwise

So φk(x) is total if x ∈ K, and undefined otherwise. Thus, k is a reduction of K to Tot.

It turns out that Tot is not even computably enumerable—its complexity
lies further up on the “arithmetic hierarchy.” But we will not worry about this
strengthening here.


30.19 Rice’s Theorem


If you think about it, you will see that the specifics of Tot do not play into
the proof of Proposition 30.20. We designed h(x, y) to act like the constant
function j(y) = 0 exactly when x is in K; but we could just as well have made
it act like any other partial computable function under those circumstances.
This observation lets us state a more general theorem, which says, roughly,
that no nontrivial property of computable functions is decidable.
Keep in mind that φ0 , φ1 , φ2 , . . . is our standard enumeration of the partial
computable functions.
Theorem 30.21 (Rice’s Theorem). Let C be any set of partial computable
functions, and let A = {n : φn ∈ C}. If A is computable, then either C is ∅ or
C is the set of all the partial computable functions.


An index set is a set A with the property that if n and m are indices which
“compute” the same function, then either both n and m are in A, or neither is.
It is not hard to see that the set A in the theorem has this property. Conversely,
if A is an index set and C is the set of functions computed by these indices,
then A = {n : φn ∈ C}.
With this terminology, Rice's theorem is equivalent to saying that no non-

trivial index set is decidable. To understand what the theorem says, it is helpful
to emphasize the distinction between programs (say, in your favorite program-
ming language) and the functions they compute. There are certainly questions
about programs (indices), which are syntactic objects, that are computable:
does this program have more than 150 symbols? Does it have more than 22
lines? Does it have a "while" statement? Does the string "hello world" ever
appear in the argument to a “print” statement? Rice’s theorem says that no
nontrivial question about the program’s behavior is computable. This includes
questions like these: does the program halt on input 0? Does it ever halt?
Does it ever output an even number?

Proof of Rice’s theorem. Suppose C is neither ∅ nor the set of all the partial
computable functions, and let A be the set of indices of functions in C. We
will show that if A were computable, we could solve the halting problem; so
A is not computable.
Without loss of generality, we can assume that the function f which is
nowhere defined is not in C (otherwise, switch C and its complement in the
argument below). Let g be any function in C. The idea is that if we could
decide A, we could tell the difference between indices computing f , and in-
dices computing g; and then we could use that capability to solve the halting
problem.
Here's how. Using the universal computation predicate, we can define a function

h(x, y) undefined    if φx(x) ↑
h(x, y) ≃ g(y)       otherwise.

To compute h, first we try to compute φx(x); if that computation halts, we go on to compute g(y); and if that computation halts, we return the output. More formally, we can write
More formally, we can write

h(x, y) ≃ P^2_0(g(y), Un(x, x)),

where P^2_0(z0, z1) = z0 is the 2-place projection function returning the 0-th
argument, which is computable.
Then h is a composition of partial computable functions, and the right side
is defined and equal to g(y) just when Un(x, x) and g(y) are both defined.
Notice that for a fixed x, if φx (x) is undefined, then h(x, y) is undefined for
every y; and if φx (x) is defined, then h(x, y) ≃ g(y). So, for any fixed value
of x, either h(x, y) acts just like f or it acts just like g, and deciding whether or
not φx (x) is defined amounts to deciding which of these two cases holds. But

Release : 6891b66 (2024-12-01) 443


CHAPTER 30. COMPUTABILITY THEORY

this amounts to deciding whether or not the function hx, defined by hx(y) ≃ h(x, y), is in C, and if A were computable, we could do just that.
More formally, since h is partial computable, it is equal to the function φk
for some index k. By the s-m-n theorem there is a primitive recursive function
s such that for each x, φs(k,x) (y) = hx (y). Now we have that for each x, if
φx (x) ↓, then φs(k,x) is the same function as g, and so s(k, x) is in A. On the
other hand, if φx (x) ↑, then φs(k,x) is the same function as f , and so s(k, x)
is not in A. In other words we have that for every x, x ∈ K if and only if
s(k, x) ∈ A. If A were computable, K would be also, which is a contradiction.
So A is not computable.

Rice's theorem is very powerful. The following immediate corollary shows some sample applications.
Corollary 30.22. The following sets are undecidable.
1. {x : 17 is in the range of φx }
2. {x : φx is constant}
3. {x : φx is total}
4. {x : whenever y < y ′ , φx (y) ↓, and if φx (y ′ ) ↓, then φx (y) < φx (y ′ )}
Proof. These are all nontrivial index sets.


30.20 The Fixed-Point Theorem


Let's consider the halting problem again. As temporary notation, let us write
⌜φx (y)⌝ for ⟨x, y⟩; think of this as representing a “name” for the value φx (y).
With this notation, we can reword one of our proofs that the halting problem
is undecidable.
Question: is there a computable function h, with the following property?
For every x and y,

h(⌜φx(y)⌝) = 1   if φx(y) ↓
h(⌜φx(y)⌝) = 0   otherwise.
Answer: No; otherwise, the partial function
g(x) ≃ 0          if h(⌜φx(x)⌝) = 0
g(x) undefined    otherwise

would be computable, and so have some index e. But then we have

φe(e) ≃ 0          if h(⌜φe(e)⌝) = 0
φe(e) undefined    otherwise,


in which case φe (e) is defined if and only if it isn’t, a contradiction.


Now, take a look at the equation with φe . There is an instance of self-
reference there, in a sense: we have arranged for the value of φe (e) to depend
on ⌜φe (e)⌝, in a certain way. The fixed-point theorem says that we can do this,
in general—not just for the sake of proving contradictions.
Lemma 30.23 gives two equivalent ways of stating the fixed-point theorem.
Logically speaking, the fact that the statements are equivalent follows from the
fact that they are both true; but what we really mean is that each one follows
straightforwardly from the other, so that they can be taken as alternative
statements of the same theorem.

Lemma 30.23. The following statements are equivalent:

1. For every partial computable function g(x, y), there is an index e such
that for every y,
φe (y) ≃ g(e, y).

2. For every computable function f (x), there is an index e such that for
every y,
φe (y) ≃ φf (e) (y).

Proof. (1) ⇒ (2): Given f , define g by g(x, y) ≃ Un(f (x), y). Use (1) to get
an index e such that for every y,

φe(y) = Un(f(e), y) = φf(e)(y).

(2) ⇒ (1): Given g, use the s-m-n theorem to get f such that for every x
and y, φf (x) (y) ≃ g(x, y). Use (2) to get an index e such that

    φe(y) = φf(e)(y)
          = g(e, y).

This concludes the proof.

Before showing that statement (1) is true (and hence (2) as well), consider how
bizarre it is. Think of e as being a computer program; statement (1) says
that given any partial computable g(x, y), you can find a computer program
e that computes ge (y) ≃ g(e, y). In other words, you can find a computer
program that computes a function that references the program itself.

Theorem 30.24. The two statements in Lemma 30.23 are true. Specifically,
for every partial computable function g(x, y), there is an index e such that for
every y,
φe (y) ≃ g(e, y).


Proof. The ingredients are already implicit in the discussion of the halting
problem above. Let diag(x) be a computable function which for each x returns
an index for the function fx (y) ≃ φx (x, y), i.e.

φdiag(x) (y) ≃ φx (x, y).

Think of diag as a function that transforms a program for a 2-ary function
into a program for a 1-ary function, obtained by fixing the original program as
its first argument. The function diag can be defined formally as follows: first
define s by
s(x, y) ≃ Un2 (x, x, y),
where Un2 is a 3-ary function that is universal for partial computable 2-ary
functions. Then, by the s-m-n theorem, we can find a primitive recursive
function diag satisfying
φdiag(x) (y) ≃ s(x, y).
Now, define the function l by

    l(x, y) ≃ g(diag(x), y),

and let ⌜l⌝ be an index for l. Finally, let e = diag(⌜l⌝). Then for every y, we
have

    φe(y) ≃ φdiag(⌜l⌝)(y)
          ≃ φ⌜l⌝(⌜l⌝, y)
          ≃ l(⌜l⌝, y)
          ≃ g(diag(⌜l⌝), y)
          ≃ g(e, y),

as required.

What's going on? Suppose you are given the task of writing a computer
program that prints itself out. Suppose further, however, that you are working
with a programming language with a rich and bizarre library of string functions.
In particular, suppose your programming language has a function diag which
works as follows: given an input string s, diag locates each instance of the
symbol ‘x’ occurring in s, and replaces it by a quoted version of the original
string. For example, given the string

hello x world

as input, the function returns

hello 'hello x world' world

as output. In that case, it is easy to write the desired program; you can check
that


    print(diag('print(diag(x))'))
does the trick. For more common programming languages like C++ and Java,
the same idea (with a more involved implementation) still works.
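To make this idea concrete, here is a minimal Python sketch. The function diag
below plays the role of the imagined library function; it is our stand-in, not
something Python provides. Since diag must be defined in the program itself,
the output reproduces the final print line rather than the whole file, but the
mechanism is exactly the one described above.

    def diag(s):
        # Replace the placeholder 'x' in s by a quoted copy of s itself.
        return s.replace("x", repr(s))

    print(diag('print(diag(x))'))
    # prints: print(diag('print(diag(x))'))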
We are only a couple of steps away from the proof of the fixed-point theo-
rem. Suppose a variant of the print function print(x, y) accepts a string x and
another numeric argument y, and prints the string x repeatedly, y times. Then
the “program”
    getinput(y); print(diag('getinput(y); print(diag(x), y)'), y)
prints itself out y times, on input y. Replacing the getinput—print—diag
skeleton by an arbitrary function g(x, y) yields
    g(diag('g(diag(x), y)'), y)
which is a program that, on input y, runs g on the program itself and y.
Thinking of “quoting” with “using an index for,” we have the proof above.
For now, it is o.k. if you want to think of the proof as formal trickery, or
black magic. But you should be able to reconstruct the details of the argument
given above. When we prove the incompleteness theorems (and the related
“fixed-point theorem”) we will discuss other ways of understanding why it
works.
The same idea can be used to get a "fixed point" combinator. Suppose you
have a lambda term g, and you want another term k with the property that k
is β-equivalent to gk. Define terms
diag(x) = xx
and
l(x) = g(diag(x))
using our notational conventions; in other words, l is the term λx. g(xx). Let
k be the term ll. Then we have
    k = (λx. g(xx))(λx. g(xx))
      → g((λx. g(xx))(λx. g(xx)))
      = gk.
If one takes
Y = λg. ((λx. g(xx))(λx. g(xx)))
then Y g and g(Y g) reduce to a common term; so Y g ≡β g(Y g). This is known
as “Curry’s combinator.” If instead one takes
Y = (λxg. g(xxg))(λxg. g(xxg))
then in fact Y g reduces to g(Y g), which is a stronger statement. This latter
version of Y is known as “Turing’s combinator.”
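Fixed-point combinators can be observed at work in any language with
first-class functions. A direct transcription of Curry's Y into Python would
loop forever, since Python evaluates arguments eagerly; the usual fix is the
η-expanded variant below (often called Z). This is only an illustrative sketch,
not part of the formal development:

    # Z = λg. (λx. g(λv. xxv))(λx. g(λv. xxv)), a call-by-value variant of Y
    Z = lambda g: (lambda x: g(lambda v: x(x)(v)))(
        lambda x: g(lambda v: x(x)(v)))

    # The fixed point of a non-recursive functional yields recursion:
    fact = Z(lambda f: lambda n: 1 if n == 0 else n * f(n - 1))
    print(fact(5))  # prints 120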


30.21 Applying the Fixed-Point Theorem


The fixed-point theorem essentially lets us define partial computable functions
in terms of their indices. For example, we can find an index e such that for
every y,
φe (y) = e + y.

As another example, one can use the proof of the fixed-point theorem to design
a program in Java or C++ that prints itself out.
Remember that if for each e, we let We be the domain of φe , then the
sequence W0 , W1 , W2 , . . . enumerates the computably enumerable sets. Some
of these sets are computable. One can ask if there is an algorithm which takes
as input a value x, and, if Wx happens to be computable, returns an index for
its characteristic function. The answer is “no,” there is no such algorithm:

Theorem 30.25. There is no partial computable function f with the following
property: whenever We is computable, then f(e) is defined and φf(e) is its
characteristic function.

Proof. Let f be any computable function; we will construct an e such that We
is computable, but φf(e) is not its characteristic function. Using the fixed-point
theorem, we can find an index e such that

    φe(y) ≃ { 0           if y = 0 and φf(e)(0) ↓= 0
            { undefined   otherwise.

That is, e is obtained by applying the fixed-point theorem to the function
defined by

    g(x, y) ≃ { 0           if y = 0 and φf(x)(0) ↓= 0
              { undefined   otherwise.

Informally, we can see that g is partial computable, as follows: on input x and
y, the algorithm first checks to see if y is equal to 0. If it is, the algorithm
computes f (x), and then uses the universal machine to compute φf (x) (0). If
this last computation halts and returns 0, the algorithm returns 0; otherwise,
the algorithm doesn’t halt.
But now notice that if φf (e) (0) is defined and equal to 0, then φe (y) is
defined exactly when y is equal to 0, so We = {0}. If φf (e) (0) is not defined,
or is defined but not equal to 0, then We = ∅. Either way, φf (e) is not the
characteristic function of We , since it gives the wrong answer on input 0.


30.22 Defining Functions using Self-Reference


It is generally useful to be able to define functions in terms of themselves. For
example, given computable functions k, l, and m, the fixed-point lemma tells us
that there is a partial computable function f satisfying the following equation
for every y:

    f(y) ≃ { k(y)      if l(y) = 0
           { f(m(y))   otherwise.
Again, more specifically, f is obtained by letting
    g(x, y) ≃ { k(y)       if l(y) = 0
              { φx(m(y))   otherwise

and then using the fixed-point lemma to find an index e such that φe (y) =
g(e, y).
For a concrete example, the “greatest common divisor” function gcd(u, v)
can be defined by
    gcd(u, v) ≃ { v                   if u = 0
                { gcd(mod(v, u), u)   otherwise

where mod(v, u) denotes the remainder of dividing v by u. An appeal to the
fixed-point lemma shows that gcd is partial computable. (In fact, this can be
put in the format above, letting y code the pair ⟨u, v⟩.) A subsequent induction
on u then shows that, in fact, gcd is total.
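As a sanity check, the definition can be transcribed directly into Python,
where the recursion bottoms out exactly when u = 0 (we use Python's built-in
% operator for mod):

    def gcd(u, v):
        # Direct transcription of the self-referential definition above.
        if u == 0:
            return v
        return gcd(v % u, u)  # v % u: the remainder of dividing v by u

    print(gcd(12, 18))  # prints 6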
Of course, one can cook up self-referential definitions that are much fancier
than the examples just discussed. Most programming languages support def-
initions of functions in terms of themselves, one way or another. Note that
this is a little bit less dramatic than being able to define a function in terms
of an index for an algorithm computing the function, which is what, in full
generality, the fixed-point theorem lets you do.


30.23 Minimization with Lambda Terms


When it comes to the lambda calculus, we've shown the following:

1. Every primitive recursive function is represented by a lambda term.

2. There is a lambda term Y such that for any lambda term G, Y G →→ G(Y G).

To show that every partial computable function is represented by some lambda
term, we only need to show the following.


Lemma 30.26. Suppose f (x, y) is primitive recursive. Let g be defined by

g(x) ≃ µy f (x, y) = 0.

Then g is represented by a lambda term.

Proof. The idea is roughly as follows. Given x, we will use the fixed-point
lambda term Y to define a function hx (n) which searches for a y starting at n;
then g(x) is just hx (0). The function hx can be expressed as the solution of a
fixed-point equation:
    hx(n) ≃ { n           if f(x, n) = 0
            { hx(n + 1)   otherwise.

Here are the details. Since f is primitive recursive, it is represented by some
term F. Remember that we also have a lambda term D such that D(M, N, 0) →→ M
and D(M, N, 1) →→ N. Fixing x for the moment, to represent hx we want to find
a term H (depending on x) satisfying

    H(n) ≡ D(n, H(S(n)), F(x, n)).

We can do this using the fixed-point term Y . First, let U be the term

λh. λz. D(z, (h(Sz)), F (x, z)),

and then let H be the term Y U . Notice that the only free variable in H is x.
Let us show that H satisfies the equation above.
By the definition of Y , we have

H = Y U ≡ U (Y U ) = U (H).

In particular, for each natural number n, we have

    H(n) ≡ U(H, n)
         → D(n, H(S(n)), F(x, n)),

as required. Notice that if you substitute a numeral m for x in the last line,
the expression reduces to n if F (m, n) reduces to 0, and it reduces to H(S(n))
if F (m, n) reduces to any other numeral.
To finish off the proof, let G be λx. H(0). Then G represents g; in other
words, for every m, G(m) reduces to g(m), if g(m) is defined, and has no
normal form otherwise.
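Outside the lambda calculus, the behavior being represented is just unbounded
search. As an informal gloss, here is a Python sketch, with f standing in for
the primitive recursive function of the lemma:

    def mu(f, x):
        # Least n such that f(x, n) == 0, mirroring the fixed-point equation
        # h_x(n) = n if f(x, n) == 0, and h_x(n) = h_x(n + 1) otherwise.
        n = 0
        while f(x, n) != 0:   # loops forever if no such n exists
            n += 1
        return n

    print(mu(lambda x, n: x - n * n, 16))  # prints 4, since 4 * 4 = 16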



Part VI

Turing Machines

Chapter 31

Turing Machine Computations


31.1 Introduction

What does it mean for a function, say, from N to N to be computable? Among
the first answers, and the most well known one, is that a function is computable
if it can be computed by a Turing machine. This notion was set out by Alan
Turing in 1936. Turing machines are an example of a model of computation—
they are a mathematically precise way of defining the idea of a “computational
procedure.” What exactly that means is debated, but it is widely agreed that
Turing machines are one way of specifying computational procedures. Even
though the term “Turing machine” evokes the image of a physical machine
with moving parts, strictly speaking a Turing machine is a purely mathematical
construct, and as such it idealizes the idea of a computational procedure. For
instance, we place no restriction on either the time or memory requirements
of a Turing machine: Turing machines can compute something even if the
computation would require more storage space or more steps than there are
atoms in the universe.
It is perhaps best to think of a Turing machine as a program for a special
kind of imaginary mechanism. This mechanism consists of a tape and a read-
write head. In our version of Turing machines, the tape is infinite in one
direction (to the right), and it is divided into squares, each of which may contain
a symbol from a finite alphabet. Such alphabets can contain any number of
different symbols, but we will mainly make do with three: ▷, 0, and 1. When
the mechanism is started, the tape is empty (i.e., each square contains the
symbol 0) except for the leftmost square, which contains ▷, and a finite number

451
CHAPTER 31. TURING MACHINE COMPUTATIONS

Figure 31.1: A Turing machine executing its program.


tur:mac:int:
fig:tm

of squares which contain the input. At any time, the mechanism is in one of a
finite number of states. At the outset, the head scans the leftmost square and
in a specified initial state. At each step of the mechanism’s run, the content
of the square currently scanned together with the state the mechanism is in
and the Turing machine program determine what happens next. The Turing
machine program is given by a partial function which takes as input a state q
and a symbol σ and outputs a triple ⟨q ′ , σ ′ , D⟩. Whenever the mechanism is in
state q and reads symbol σ, it replaces the symbol on the current square with
σ ′ , the head moves left, right, or stays put according to whether D is L, R, or
N , and the mechanism goes into state q ′ .
Figure 31.1: A Turing machine executing its program.

For instance, consider the situation in Figure 31.1. The visible part of the
tape of the Turing machine contains the end-of-tape symbol ▷ on the leftmost
square, followed by three 1’s, a 0, and four more 1’s. The head is reading the
third square from the left, which contains a 1, and is in state q1 —we say “the
machine is reading a 1 in state q1 .” If the program of the Turing machine
returns, for input ⟨q1 , 1⟩, the triple ⟨q2 , 0, N ⟩, the machine would now replace
the 1 on the third square with a 0, leave the read/write head where it is, and
switch to state q2 . If then the program returns ⟨q3 , 0, R⟩ for input ⟨q2 , 0⟩, the
machine would now overwrite the 0 with another 0 (effectively, leaving the
content of the tape under the read/write head unchanged), move one square
to the right, and enter state q3 . And so on.
We say that the machine halts when it encounters some state, qn , and sym-
bol, σ such that there is no instruction for ⟨qn , σ⟩, i.e., the transition function
for input ⟨qn , σ⟩ is undefined. In other words, the machine has no instruction
to carry out, and at that point, it ceases operation. Halting is sometimes rep-
resented by a specific halt state h. This will be demonstrated in more detail
later on.
The beauty of Turing's paper, "On computable numbers," is that he presents
not only a formal definition, but also an argument that the definition captures
the intuitive notion of computability. From the definition, it should be clear
that any function computable by a Turing machine is computable in the intu-
itive sense. Turing offers three types of argument that the converse is true, i.e.,
that any function that we would naturally regard as computable is computable
by such a machine. They are (in Turing’s words):


1. A direct appeal to intuition.

2. A proof of the equivalence of two definitions (in case the new definition
   has a greater intuitive appeal).

3. Giving examples of large classes of numbers which are computable.

Our goal is to try to define the notion of computability “in principle,” i.e.,
without taking into account practical limitations of time and space. Of course,
with the broadest definition of computability in place, one can then go on
to consider computation with bounded resources; this forms the heart of the
subject known as “computational complexity.”

Historical Remarks Alan Turing invented Turing machines in 1936. While
his interest at the time was the decidability of first-order logic, the paper has
his interest at the time was the decidability of first-order logic, the paper has
been described as a definitive paper on the foundations of computer design.
In the paper, Turing focuses on computable real numbers, i.e., real numbers
whose decimal expansions are computable; but he notes that it is not hard
to adapt his notions to computable functions on the natural numbers, and
so on. Notice that this was a full five years before the first working general
purpose computer was built in 1941 (by the German Konrad Zuse in his
parents' living room), seven years before Turing and his colleagues at Bletchley
Park built the code-breaking Colossus (1943), nine years before the American
ENIAC (1945), twelve years before the first British general purpose computer—
the Manchester Small-Scale Experimental Machine—was built in Manchester
(1948), and thirteen years before the Americans first tested the BINAC (1949).
The Manchester SSEM has the distinction of being the first stored-program
computer—previous machines had to be rewired by hand for each new task.


31.2 Representing Turing Machines


Turing machines can be represented visually by state diagrams. The diagrams
are composed of state cells connected by arrows. Unsurprisingly, each state cell
represents a state of the machine. Each arrow represents an instruction that
can be carried out from that state, with the specifics of the instruction written
above or below the appropriate arrow. Consider the following machine, which
has only two internal states, q0 and q1 , and one instruction:

0, 1, R
start q0 q1

Recall that the Turing machine has a read/write head and a tape with the
input written on it. The instruction can be read as: if reading a 0 in state q0,
write a 1, move right, and go to state q1. This is equivalent to the transition
function mapping ⟨q0 , 0⟩ to ⟨q1 , 1, R⟩.


Example 31.1. Even Machine: The following Turing machine halts if, and
only if, there are an even number of 1’s on the tape (under the assumption
that all 1’s come before the first 0 on the tape).

0, 0, R
1, 1, R

start q0 q1

1, 1, R

The state diagram corresponds to the following transition function:


δ(q0 , 1) = ⟨q1 , 1, R⟩,
δ(q1 , 1) = ⟨q0 , 1, R⟩,
δ(q1 , 0) = ⟨q1 , 0, R⟩
The above machine halts only when the input is an even number of strokes.
Otherwise, the machine (theoretically) continues to operate indefinitely. For
any machine and input, it is possible to trace through the configurations of
the machine in order to determine the output. We will give a formal definition
of configurations later. For now, we can intuitively think of configurations as
a series of diagrams showing the state of the machine at any point in time
during operation. Configurations show the content of the tape, the state of the
machine and the location of the read/write head.
Let us trace through the configurations of the even machine if it is started
with an input of four 1’s. In this case, we expect that the machine will halt.
We will then run the machine on an input of three 1’s, where the machine will
run forever.
The machine starts in state q0 , scanning the leftmost 1. We can represent
the initial state of the machine as follows:
    ▷1_{q0}1110 . . .

The above configuration is straightforward. As can be seen, the machine starts
in state q0, scanning the leftmost 1. This is represented by a subscript of the
state name on the first 1. The applicable instruction at this point is δ(q0, 1) =
⟨q1, 1, R⟩, and so the machine moves right on the tape and changes to state q1.

    ▷11_{q1}110 . . .

Since the machine is now in state q1 scanning a 1, we have to "follow" the
instruction δ(q1, 1) = ⟨q0, 1, R⟩. This results in the configuration

    ▷111_{q0}10 . . .
As the machine continues, the rules are applied again in the same order, re-
sulting in the following two configurations:
    ▷1111_{q1}0 . . .


    ▷11110_{q0} . . .
The machine is now in state q0 scanning a 0. Based on the transition diagram,
we can easily see that there is no instruction to be carried out, and thus the
machine has halted. This means that the input has been accepted.
Suppose next we start the machine with an input of three 1’s. The first few
configurations are similar, as the same instructions are carried out, with only
a small difference of the tape input:

    ▷1_{q0}110 . . .

    ▷11_{q1}10 . . .

    ▷111_{q0}0 . . .

    ▷1110_{q1} . . .
The machine has now traversed past all the 1’s, and is reading a 0 in state q1 . As
shown in the diagram, there is an instruction of the form δ(q1 , 0) = ⟨q1 , 0, R⟩.
Since the tape is filled with 0 indefinitely to the right, the machine will continue
to execute this instruction forever, staying in state q1 and moving ever further
to the right. The machine will never halt, and does not accept the input.
It is important to note that not all machines will halt. If halting means that
the machine runs out of instructions to execute, then we can create a machine
that never halts simply by ensuring that there is an outgoing arrow for each
symbol at each state. The even machine can be modified to run indefinitely by
adding an instruction for scanning a 0 at q0 .
Example 31.2.

0, 0, R 0, 0, R
1, 1, R

start q0 q1

1, 1, R

Machine tables are another way of representing Turing machines. Machine
tables have the tape alphabet displayed on the x-axis, and the set of machine
states across the y-axis. Inside the table, at the intersection of each state and
symbol, is written the rest of the instruction—the new state, new symbol, and
direction of movement. Machine tables make it easy to determine in what
state, and for what symbol, the machine halts. Wherever there is a gap in the
table, there is a possible point for the machine to halt. Unlike state diagrams and
instruction sets, where the points at which the machine halts are not always
immediately obvious, any halting points are quickly identified by finding the
gaps in the machine table.


Figure 31.2: A doubler machine

Example 31.3. The machine table for the even machine is:
             0           1           ▷
    q0                   1, q1, R
    q1       0, q1, R    1, q0, R
As we can see, the machine halts when scanning a 0 in state q0 .

So far we have only considered machines that read and accept input. How-
ever, Turing machines have the capacity to both read and write. An example
of such a machine (although there are many, many examples) is a doubler. A
doubler, when started with a block of n 1’s on the tape, outputs a block of 2n
1’s.
Example 31.4. Before building a doubler machine, it is important to come
up with a strategy for solving the problem. Since the machine (as we have
formulated it) cannot remember how many 1’s it has read, we need to come
up with a way to keep track of all the 1’s on the tape. One such way is to
separate the output from the input with a 0. The machine can then erase the
first 1 from the input, traverse over the rest of the input, leave a 0, and write
two new 1’s. The machine will then go back and find the second 1 in the input,
and double that one as well. For each one 1 of input, it will write two 1’s of
output. By erasing the input as the machine goes, we can guarantee that no
1 is missed or doubled twice. When the entire input is erased, there will be
2n 1’s left on the tape. The state diagram of the resulting Turing machine is
depicted in Figure 31.2.

Problem 31.1. Choose an arbitrary input and trace through the configura-
tions of the doubler machine in Example 31.4.


Problem 31.2. Design a Turing machine with alphabet {▷, 0, A, B} that ac-
cepts, i.e., halts on, any string of A's and B's where the number of A's is the
same as the number of B’s and all the A’s precede all the B’s, and rejects,
i.e., does not halt on, any string where the number of A’s is not equal to the
number of B’s or the A’s do not precede all the B’s. (E.g., the machine should
accept AABB, and AAABBB, but reject both AAB and AABBAABB.)

Problem 31.3. Design a Turing machine with alphabet {▷, 0, A, B} that takes
as input any string α of A's and B's and duplicates them to produce an output
of the form αα. (E.g. input ABBA should result in output ABBAABBA).

Problem 31.4. Alphabetical?: Design a Turing machine with alphabet {▷, 0, A, B}
that when given as input a finite sequence of A's and B's checks to see if all
the A’s appear to the left of all the B’s or not. The machine should leave the
input string on the tape, and either halt if the string is “alphabetical”, or loop
forever if the string is not.

Problem 31.5. Alphabetizer: Design a Turing machine with alphabet {▷, 0, A, B}
that takes as input a finite sequence of A's and B's and rearranges them so that
all the A’s are to the left of all the B’s. (e.g., the sequence BABAA should
become the sequence AAABB, and the sequence ABBABB should become
the sequence AABBBB).


31.3 Turing Machines


The formal definition of what constitutes a Turing machine looks abstract, but
is actually simple: it merely packs into one mathematical structure all the
information needed to specify the workings of a Turing machine. This includes
(1) which states the machine can be in, (2) which symbols are allowed to be
on the tape, (3) which state the machine should start in, and (4) what the
instruction set of the machine is.

Definition 31.5 (Turing machine). A Turing machine M is a tuple ⟨Q, Σ, q0, δ⟩
consisting of

1. a finite set of states Q,

2. a finite alphabet Σ which includes ▷ and 0,

3. an initial state q0 ∈ Q,

4. a finite instruction set δ : Q × Σ ⇀ Q × Σ × {L, R, N}.

The partial function δ is also called the transition function of M .


We assume that the tape is infinite in one direction only. For this reason
it is useful to designate a special symbol ▷ as a marker for the left end of the
tape. This makes it easier for Turing machine programs to tell when they’re
“in danger” of running off the tape. We could assume that this symbol is never
overwritten, i.e., that δ(q, ▷) = ⟨q′, ▷, x⟩ if δ(q, ▷) is defined. Some textbooks
do this; we do not. You can simply be careful when constructing your Turing
machine that it never overwrites ▷. Moreover, there are cases where allowing
such overwriting provides some convenient flexibility.

Example 31.6. Even Machine: The even machine is formally the quadruple
⟨Q, Σ, q0 , δ⟩ where

Q = {q0, q1},
Σ = {▷, 0, 1},
δ(q0 , 1) = ⟨q1 , 1, R⟩,
δ(q1 , 1) = ⟨q0 , 1, R⟩,
δ(q1 , 0) = ⟨q1 , 0, R⟩.


31.4 Configurations and Computations


Recall tracing through the configurations of the even machine earlier. The
imaginary mechanism consisting of tape, read/write head, and Turing machine
program is really just an intuitive way of visualizing what a Turing machine
computation is. Formally, we can define the computation of a Turing machine
on a given input as a sequence of configurations—and a configuration in turn
is a sequence of symbols (corresponding to the contents of the tape at a given
point in the computation), a number indicating the position of the read/write
head, and a state. Using these, we can define what the Turing machine M
computes on a given input.

Definition 31.7 (Configuration). A configuration of Turing machine M =
⟨Q, Σ, q0, δ⟩ is a triple ⟨C, m, q⟩ where

1. C ∈ Σ ∗ is a finite sequence of symbols from Σ,

2. m ∈ N is a number < len(C), and

3. q ∈ Q.

Intuitively, the sequence C is the content of the tape (symbols of all squares
from the leftmost square to the last non-blank or previously visited square),
m is the number of the square the read/write head is scanning (beginning with
0 being the number of the leftmost square), and q is the current state of the
machine.


The potential input for a Turing machine is a sequence of symbols, usually
a sequence that encodes a number in some form. The initial configuration of
the Turing machine is that configuration in which we start the Turing machine
to work on that input: the tape contains the tape end marker immediately
followed by the input written on the squares to the right, the read/write head
is scanning the leftmost square of the input (i.e., the square to the right of the
left end marker), and the mechanism is in the designated start state q0 .

Definition 31.8 (Initial configuration). The initial configuration of M for
input I ∈ Σ∗ is
⟨▷ ⌢ I, 1, q0 ⟩.

The ⌢ symbol is for concatenation—the input string begins immediately to
the right of the left end marker.

Definition 31.9. We say that a configuration ⟨C, m, q⟩ yields the configuration
⟨C′, m′, q′⟩ in one step (according to M), iff

1. the m-th symbol of C is σ,

2. the instruction set of M specifies δ(q, σ) = ⟨q ′ , σ ′ , D⟩,

3. the m-th symbol of C ′ is σ ′ , and

4. a) D = L and m′ = m − 1 if m > 0, otherwise m′ = 0, or
   b) D = R and m′ = m + 1, or
   c) D = N and m′ = m,

5. if m′ = len(C), then len(C′) = len(C) + 1 and the m′-th symbol of C′
   is 0; otherwise len(C′) = len(C), and

6. for all i such that i < len(C) and i ̸= m, C′(i) = C(i).

Definition 31.10. A run of M on input I is a sequence Ci of configurations
of M, where C0 is the initial configuration of M for input I, and each Ci yields
Ci+1 in one step.
We say that M halts on input I after k steps if Ck = ⟨C, m, q⟩, the mth
symbol of C is σ, and δ(q, σ) is undefined. In that case, the output of M
for input I is O, where O is a string of symbols not ending in 0 such that
C = ▷ ⌢ O ⌢ 0^j for some j ∈ N. (0^j is a sequence of j 0's.)

According to this definition, the output O of M always ends in a symbol
other than 0, or, if at time k the entire tape is filled with 0 (except for the
leftmost ▷), O is the empty string.
leftmost ▷), O is the empty string.


Figure 31.3: A machine computing f(x, y) = x + y

31.5 Unary Representation of Numbers

Turing machines work on sequences of symbols written on their tape. Depend-
ing on the alphabet a Turing machine uses, these sequences of symbols can
represent various inputs and outputs. Of particular interest, of course, are
Turing machines which compute arithmetical functions, i.e., functions of nat-
ural numbers. A simple way to represent positive integers is by coding them
as sequences of a single symbol 1. If n ∈ N, let 1^n be the empty sequence if
n = 0, and otherwise the sequence consisting of exactly n 1's.
Definition 31.11 (Computation). A Turing machine M computes the func-
tion f : Nk → N iff M halts on input

    1^{n1} 0 1^{n2} 0 . . . 0 1^{nk}

with output 1^{f(n1,...,nk)}.

Problem 31.6. Give a definition for when a Turing machine M computes the
function f : Nk → Nm .

Example 31.12. Addition: Let's build a machine that computes the function
f(n, m) = n + m. This requires a machine that starts with two blocks of 1's of
length n and m on the tape, and halts with one block consisting of n + m 1’s.
The two input blocks of 1’s are separated by a 0, so one method would be to
write a stroke on the square containing the 0, and erase the last 1.
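In the dictionary encoding of the run() sketch above, one table implementing
this strategy (our reconstruction of the machine shown in Figure 31.3) is:

    adder = {
        ("q0", "1"): ("q0", "1", "R"),  # move right across the first block
        ("q0", "0"): ("q1", "1", "N"),  # overwrite the separating 0 with a 1
        ("q1", "1"): ("q1", "1", "R"),  # move right across the second block
        ("q1", "0"): ("q2", "0", "L"),  # step back onto the last 1
        ("q2", "1"): ("q2", "0", "N"),  # erase it; no rule for (q2, 0): halt
    }

    print(run(adder, "111011"))  # halts with ▷1111100 . . . , i.e., 3 + 2 = 5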

Problem 31.7. Trace through the configurations of the machine from Exam-
ple 31.12 for input ⟨3, 2⟩. What happens if the machine computes 0 + 0?

In Example 31.4, we gave an example of a Turing machine that takes as
input a sequence of 1’s and halts with a sequence of twice as many 1’s on
the tape—the doubler machine. However, because the output contains 0’s to
the left of the doubled block of 1’s, it does not actually compute the function
f (x) = 2x, as you might have assumed. We’ll describe two ways of fixing that.
Example 31.13. The machine in Figure 31.4 computes the function f (x) =
2x. Instead of erasing the input and writing two 1’s at the far right for every
1 in the input as the machine from Example 31.4 does, this machine adds a

single 1 to the right for every 1 in the input. It has to keep track of where the
input ends, so it leaves a 0 between the input and the added strokes, which it
fills with a 1 at the very end. And we have to "remember" where we are in the
input, so we temporarily replace a 1 in the input block by a 0.

Figure 31.4: A machine computing f(x) = 2x

Example 31.14. A second possibility for computing f(x) = 2x is to keep
the original doubler machine, but add states and instructions at the end which
move the doubled block of strokes to the far left of the tape. The machine in
Figure 31.5 does just this last part: started on a tape consisting of a block of 0’s
followed by a block of 1’s (and the head positioned anywhere in the block of 0’s),
it erases the 1’s one at a time and writes them at the beginning of the tape. In
order to be able to tell when it is done, it first marks the end of the block of
1’s with a ▷ symbol, which gets deleted at the end. We’ve started numbering
the states at q6 , so they can be added to the doubler machine. All you’ll need
is an additional instruction δ(q0 , 0) = ⟨q6 , 0, N ⟩, i.e., an arrow from q0 to q6
labelled 0, 0, N . (There is one subtle problem: the resulting machine does not
work for input x = 0. We’ll leave this as an exercise.)

Figure 31.5: Moving a block of 1's to the left

Problem 31.8. In Example 31.14 we described a machine consisting of a
combination of the doubler machine from Figure 31.4 and the mover machine
from Figure 31.5. What happens if you start this combined machine on input x = 0,
i.e., on an empty tape? How would you fix the machine so that in this case the
machine halts with output 2x = 0? (You should be able to do this by adding
one state and one transition.)

Problem 31.9. Subtraction: Design a Turing machine that when given an
input of two non-empty strings of strokes of length n and m, where n > m,
input of two non-empty strings of strokes of length n and m, where n > m,
computes the function f (n, m) = n − m.

Problem 31.10. Equality: Design a Turing machine to compute the following
function:

    equality(n, m) = { 1   if n = m
                     { 0   if n ̸= m

where n and m ∈ Z+.

Problem 31.11. Design a Turing machine to compute the function min(x, y)
where x and y are positive integers represented on the tape by strings of 1's
where x and y are positive integers represented on the tape by strings of 1’s
separated by a 0. You may use additional symbols in the alphabet of the
machine.


The function min selects the smallest value from its arguments, so min(3, 5) =
3, min(20, 16) = 16, min(4, 4) = 4, and so on.

Definition 31.15. A Turing machine M computes the partial function
f : Nk ⇀ N iff,

1. M halts on input 1^{n1} ⌢ 0 ⌢ . . . ⌢ 0 ⌢ 1^{nk} with output 1^m if
   f(n1, . . . , nk) = m.

2. M does not halt at all, or halts with an output that is not a single block
   of 1's, if f(n1, . . . , nk) is undefined.


31.6 Halting States


Although we have defined our machines to halt only when there is no instruction
to carry out, common representations of Turing machines have a dedicated
halting state h, such that h ∈ Q.
The idea behind a halting state is simple: when the machine has finished
operation (it is ready to accept input, or has finished writing the output), it
goes into a state h where it halts. Some machines have two halting states, one
that accepts input and one that rejects input.

Example 31.16. Halting States. To elucidate this concept, let us begin with
an alteration of the even machine. Instead of having the machine halt in
state q0 if the input is even, we can add an instruction to send the machine
into a halting state.

0, 0, R
1, 1, R

start q0 q1

1, 1, R
0, 0, N

Let us further expand the example. When the machine determines that
the input is odd, it never halts. We can alter the machine to include a reject
state by replacing the looping instruction with an instruction to go to a reject
state r.
1, 1, R

start q0 q1

1, 1, R
0, 0, N 0, 0, N

h r

Adding a dedicated halting state can be advantageous in cases like this,
where it makes explicit when the machine accepts/rejects certain inputs. How-
ever, it is important to note that no computing power is gained by adding
a dedicated halting state. Similarly, a less formal notion of halting has its
own advantages. The definition of halting used so far in this chapter makes
the proof of the Halting Problem intuitive and easy to demonstrate. For this
reason, we continue with our original definition.


31.7 Disciplined Machines


In section 31.6, we considered Turing machines that have a single,
designated halting state h—such machines are guaranteed to halt, if they halt
at all, in state h. In this way, machines with a single halting state are more
“disciplined” than we allow Turing machines in general to be. There are other
restrictions we might impose on the behavior of Turing machines. For instance,
we also have not prohibited Turing machines from ever erasing the tape-end
marker on square 0, or to attempt to move left from square 0. (Our definition
states that the head simply stays on square 0 in this case; other definitions
have the machine halt.) It is likewise sometimes desirable to be able to assume
that a Turing machine, if it halts at all, halts on square 1.

Definition 31.17. A Turing machine M is disciplined iff

1. it has a designated single halting state h,

2. it halts, if it halts at all, while scanning square 1,

3. it never erases the ▷ symbol on square 0, and

4. it never attempts to move left from square 0.


Figure 31.6: A disciplined addition machine

We have already discussed that any Turing machine can be changed into
one with the same behavior but with a designated halting state. This is done
one with the same behavior but with a designated halting state. This is done
simply by adding a new state h, and adding an instruction δ(q, σ) = ⟨h, σ, N ⟩
for any pair ⟨q, σ⟩ where the original δ is undefined. It is true, although tedious
to prove, that any Turing machine M can be turned into a disciplined Turing
machine M ′ which halts on the same inputs and produces the same output.
For instance, if the Turing machine halts and is not on square 1, we can add
some instructions to make the head move left until it finds the tape-end marker,
then move one square to the right, then halt. We’ll leave you to think about
how the other conditions can be dealt with.

Example 31.18. In Figure 31.6, we turn the addition machine from Exam-
ple 31.12 into a disciplined machine.

Proposition 31.19. For every Turing machine M, there is a disciplined Tur-
ing machine M′ which halts with output O if M halts with output O, and does
not halt if M does not halt. In particular, any function f : Nn → N computable
by a Turing machine is also computable by a disciplined Turing machine.

Problem 31.12. Give a disciplined machine that computes f (x) = x + 1.

Problem 31.13. Find a disciplined machine which, when started on input 1^n
produces output 1^n ⌢ 0 ⌢ 1^n.


31.8 Combining Turing Machines


The examples of Turing machines we have seen so far have been fairly simple
in nature. But in fact, any problem that can be solved with any modern
programming language can also be solved with Turing machines. To build
more complex Turing machines, it is important to convince ourselves that we
can combine them, so we can build machines to solve more complex problems by
breaking the procedure into simpler parts. If we can find a natural way to break
a complex problem down into constituent parts, we can tackle the problem in
several stages, creating several simple Turing machines and combining them
into one machine that can solve the problem. This point is especially important
when tackling the Halting Problem in the next section.
How do we combine Turing machines M = ⟨Q, Σ, q0 , δ⟩ and M ′ = ⟨Q′ , Σ ′ , q0′ , δ ′ ⟩?
We now use the configuration of the tape after M has halted as the input con-
figuration of a run of machine M ′ . To get a single Turing machine M ⌢ M ′
that does this, do the following:

1. Renumber (or relabel) all the states Q′ of M′ so that M and M′ have no
   states in common (Q ∩ Q′ = ∅).

2. The states of M ⌢ M ′ are Q ∪ Q′ .

3. The tape alphabet is Σ ∪ Σ ′ .

4. The start state is q0 .

5. The transition function is the function δ′′ given by:

                 { δ(q, σ)       if q ∈ Q
    δ′′(q, σ) =  { δ′(q, σ)      if q ∈ Q′
                 { ⟨q0′, σ, N⟩   if q ∈ Q and δ(q, σ) is undefined

The resulting machine uses the instructions of M when it is in a state q ∈ Q,
the instructions of M′ when it is in a state q ∈ Q′. When it is in a state q ∈ Q
and is scanning a symbol σ for which M has no transition (i.e., M would have
halted), it enters the start state of M ′ (and leaves the tape contents and head
position as it is).
Note that unless the machine M is disciplined, we don’t know where the
tape head is when M halts, so the halting configuration of M need not have
the head scanning square 1. When combining machines, it’s important to keep
this in mind.
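In the dictionary encoding used in the run() sketch above, the construction of
δ′′ can be written out directly; here Q and Sigma are the states and alphabet
of M, and q0_prime is the (renamed-apart) start state of M′. This is only an
illustration of the definition:

    def combine(Q, Sigma, delta_M, delta_Mprime, q0_prime):
        delta = dict(delta_M)
        delta.update(delta_Mprime)
        # Wherever M would halt, hand control to M′'s start state instead.
        for q in Q:
            for sym in Sigma:
                if (q, sym) not in delta_M:
                    delta[(q, sym)] = (q0_prime, sym, "N")
        return delta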

Example 31.20. Combining Machines: We'll design a machine which, when
started on input consisting of two blocks of 1's of length n and m, halts with a
single block of 2(m + n) 1’s on the tape. In order to build this machine, we can
combine two machines we are already familiar with: the addition machine, and
the doubler. We begin by drawing a state diagram for the addition machine.

1, 1, R 1, 1, R 1, 0, N

0, 1, N 0, 0, L
start q0 q1 q2

Instead of halting in state q2, we want to continue operation in order to double
the output. Recall that the doubler machine erases the first stroke in the input
and writes two strokes in a separate output. Let’s add an instruction to make
sure the tape head is reading the first stroke of the output of the addition
machine.
1, 1, R 1, 1, R

0, 1, N 0, 0, L
start q0 q1 q2

1, 0, L

1, 1, L q3

▷, ▷, R

q4

It is now easy to double the input—all we have to do is connect the doubler
machine onto state q4. This requires renaming the states of the doubler machine
machine onto state q4 . This requires renaming the states of the doubler machine
so that they start at q4 instead of q0 —this way we don’t end up with two
starting states. The final diagram should look as in Figure 31.7.

Proposition 31.21. If M and M′ are disciplined and compute the functions
f : Nk → N and f′ : N → N, respectively, then M ⌢ M′ is disciplined and
computes f ′ ◦ f .

Proof. Since M is disciplined, when it halts with output f(n1, . . . , nk) = m, the
head is scanning square 1. If we now enter the start state of M′, the machine
will halt with output f ′ (m), again scanning square 1. The other conditions of
Definition 31.17 are also satisfied.

Problem 31.14. Give a disciplined Turing machine computing f(x) = x + 2
by taking the machine M from Problem 31.12 and constructing M ⌢ M.


Figure 31.7: Combining adder and doubler machines

31.9 Variants of Turing Machines


There are in fact many possible ways to define Turing machines, of which ours
is only one. In some ways, our definition is more liberal than others. We allow
arbitrary finite alphabets; a more restricted definition might allow only two
tape symbols, 1 and 0. We allow the machine to write a symbol to the tape
and move at the same time; other definitions allow either writing or moving. We
allow the possibility of writing without moving the tape head; other definitions
leave out the N "instruction." In other ways, our definition is more restrictive.
We assumed that the tape is infinite in one direction only; other definitions
allow the tape to be infinite both to the left and the right. In fact, one can
even allow any number of separate tapes, or even an infinite grid of squares.
We represent the instruction set of the Turing machine by a transition function;
other definitions use a transition relation where the machine has more than one
possible instruction in any given situation.
This last relaxation of the definition is particularly interesting. In our
definition, when the machine is in state q reading symbol σ, δ(q, σ) determines

what the new symbol, state, and tape head position is. But if we allow the
instruction set to be a relation between current state-symbol pairs ⟨q, σ⟩ and
new state-symbol-direction triples ⟨q ′ , σ ′ , D⟩, the action of the Turing machine
may not be uniquely determined—the instruction relation may contain both
⟨q, σ, q ′ , σ ′ , D⟩ and ⟨q, σ, q ′′ , σ ′′ , D′ ⟩. In this case we have a non-deterministic
Turing machine. These play an important role in computational complexity
theory.
There are also different conventions for when a Turing machine halts: we
say it halts when the transition function is undefined; other definitions require
the machine to be in a special designated halting state. We have explained in
section 31.6 why requiring a designated halting state is not a restriction which
impacts what Turing machines can compute. Since the tapes of our Turing
machines are infinite in one direction only, there are cases where a Turing
machine can’t properly carry out an instruction: if it reads the leftmost square
and is supposed to move left. According to our definition, it just stays put
instead of “falling off”, but we could have defined it so that it halts when that
happens. This definition is also equivalent: we could simulate the behavior
of a Turing machine that halts when it attempts to move left from square 0
by deleting every transition δ(q, ▷) = ⟨q ′ , σ, L⟩—then instead of attempting to
move left on ▷ the machine halts.¹
There are also different ways of representing numbers (and hence the input-
output function computed by a Turing machine): we use unary representation,
but you can also use binary representation. This requires two symbols in
addition to 0 and ▷.
Now here is an interesting fact: none of these variations matters as to which
functions are Turing computable. If a function is Turing computable according
to one definition, it is Turing computable according to all of them.
We won’t go into the details of verifying this. Here’s just one example: we
gain no additional computing power by allowing a tape that is infinite in both
directions, or multiple tapes. The reason is, roughly, that a Turing machine
with a single one-way infinite tape can simulate multiple or two-way infinite
tapes. E.g., using additional states and instructions, we can “translate” a
program for a machine with multiple tapes or two-way infinite tape into one
with a single one-way infinite tape. The translated machine can use the even
squares for the squares of tape 1 (or the “positive” squares of a two-way infinite
tape) and the odd squares for the squares of tape 2 (or the “negative” squares).

¹ This doesn't quite work, since nothing prevents us from writing and reading ▷
on squares other than square 0 (see Example 31.14). We can get around that by
adding a second ▷′ symbol to use instead for such a purpose.

31.10 The Church–Turing Thesis


Turing machines are supposed to be a precise replacement for the concept of an
effective procedure. Turing thought that anyone who grasped both the concept
of an effective procedure and the concept of a Turing machine would have the
intuition that anything that could be done via an effective procedure could be
done by Turing machine. This claim is given support by the fact that all the
other proposed precise replacements for the concept of an effective procedure
turn out to be extensionally equivalent to the concept of a Turing machine
—that is, they can compute exactly the same set of functions. This claim is
called the Church–Turing thesis.

Definition 31.22 (Church–Turing thesis). The Church–Turing Thesis states


that anything computable via an effective procedure is Turing computable.

The Church–Turing thesis is appealed to in two ways. The first kind of
use of the Church–Turing thesis is an excuse for laziness. Suppose we have a
description of an effective procedure to compute something, say, in “pseudo-
code.” Then we can invoke the Church–Turing thesis to justify the claim that
the same function is computed by some Turing machine, even if we have not
in fact constructed it.
The other use of the Church–Turing thesis is more philosophically interest-
ing. It can be shown that there are functions which cannot be computed by
Turing machines. From this, using the Church–Turing thesis, one can conclude
that it cannot be effectively computed, using any procedure whatsoever. For
if there were such a procedure, by the Church–Turing thesis, it would follow
that there would be a Turing machine for it. So if we can prove that there is
no Turing machine that computes it, there also can’t be an effective procedure.
In particular, the Church–Turing thesis is invoked to claim that the so-called
halting problem not only cannot be solved by Turing machines, it cannot be
effectively solved at all.

Chapter 32

Undecidability


32.1 Introduction

It might seem obvious that not every function, even every arithmetical func-
tion, can be computable. There are just too many, whose behavior is too
complicated. Functions defined from the decay of radioactive particles, for
instance, or other chaotic or random behavior. Suppose we start counting 1-
second intervals from a given time, and define the function f (n) as the number
of particles in the universe that decay in the n-th 1-second interval after that
initial moment. This seems like a candidate for a function we cannot ever hope
to compute.
But it is one thing to not be able to imagine how one would compute such
functions, and quite another to actually prove that they are uncomputable.
In fact, even functions that seem hopelessly complicated may, in an abstract
sense, be computable. For instance, suppose the universe is finite in time—
some day, in the very distant future the universe will contract into a single
point, as some cosmological theories predict. Then there is only a finite (but
incredibly large) number of seconds from that initial moment for which f (n)
is defined. And any function which is defined for only finitely many inputs is
computable: we could list the outputs in one big table, or code it in one very
big Turing machine state transition diagram.
We are often interested in special cases of functions whose values give the
answers to yes/no questions. For instance, the question “is n a prime number?”
is associated with the function
    isprime(n) = { 1   if n is prime
                 { 0   otherwise.

We say that a yes/no question can be effectively decided, if the associated
1/0-valued function is effectively computable.
To prove mathematically that there are functions which cannot be effec-
tively computed, or problems that cannot be effectively decided, it is essential to
fix a specific model of computation, and show that there are functions it cannot
compute or problems it cannot decide. We can show, for instance, that not
every function can be computed by Turing machines, and not every problem
can be decided by Turing machines. We can then appeal to the Church–Turing
thesis to conclude that not only are Turing machines not powerful enough to
compute every function, but no effective procedure can.
The key to proving such negative results is the fact that we can assign
numbers to Turing machines themselves. The easiest way to do this is to enu-
merate them, perhaps by fixing a specific way to write down Turing machines
and their programs, and then listing them in a systematic fashion. Once we
see that this can be done, then the existence of Turing-uncomputable functions
follows by simple cardinality considerations: the set of functions from N to N
(in fact, even just from N to {0, 1}) is non-enumerable, but since we can enu-
merate all the Turing machines, the set of Turing-computable functions is only
denumerable.


We can also define specific functions and problems which we can prove to be
uncomputable and undecidable, respectively. One such problem is the so-called
Halting Problem. Turing machines can be finitely described by listing their
instructions. Such a description of a Turing machine, i.e., a Turing machine
program, can of course be used as input to another Turing machine. So we can
consider Turing machines that decide questions about other Turing machines.
One particularly interesting question is this: “Does the given Turing machine
eventually halt when started on input n?” It would be nice if there were a
Turing machine that could decide this question: think of it as a quality-control
Turing machine which ensures that Turing machines don’t get caught in infinite
loops and such. The interesting fact, which Turing proved, is that there cannot
be such a Turing machine. There cannot be a single Turing machine which,
when started on input consisting of a description of a Turing machine M and
some number n, will always halt with either output 1 or 0 according to whether
M machine would have halted when started on input n or not.
Once we have examples of specific undecidable problems we can use them to
show that other problems are undecidable, too. For instance, one celebrated un-
decidable problem is the question, “Is the first-order formula φ valid?”. There is
no Turing machine which, given as input a first-order formula φ, is guaranteed
to halt with output 1 or 0 according to whether φ is valid or not. Historically,
the question of finding a procedure to effectively solve this problem was called
simply “the” decision problem; and so we say that the decision problem is un-
solvable. Turing and Church proved this result independently at around the
same time, so it is also called the Church–Turing Theorem.


32.2 Enumerating Turing Machines


We can show that the set of all Turing machines is enumerable. This follows
from the fact that each Turing machine can be finitely described. The set of
states and the tape vocabulary are finite sets. The transition function is a
partial function from Q × Σ to Q × Σ × {L, R, N }, and so likewise can be
specified by listing its values for the finitely many argument pairs for which it
is defined.
This is true as far as it goes, but there is a subtle difference. The definition
of Turing machines made no restriction on what elements the set of states and
tape alphabet can have. So, e.g., for every real number, there technically is
a Turing machine that uses that number as a state. However, the behavior
of the Turing machine is independent of which objects serve as states and
vocabulary. Consider the two Turing machines in Figure 32.1. These two
diagrams correspond to two machines, M with the tape alphabet Σ = {▷, 0, 1}
and set of states {q0 , q1 }, and M ′ with alphabet Σ ′ = {▷, 0, A} and states {s, h}.
But their instructions are otherwise the same: M will halt on a sequence of
n 1’s iff n is even, and M ′ will halt on a sequence of n A’s iff n is even. All

472 Release : 6891b66 (2024-12-01)


32.2. ENUMERATING TURING MACHINES

[Figure 32.1: Variants of the Even machine. Two otherwise identical state diagrams: M with states q0, q1 over alphabet Σ = {▷, 0, 1}, and M′ with states s, h over Σ′ = {▷, 0, A}.]

[Figure 32.2: A standard Even machine. The same diagram with states 1, 2 and symbols 1, 2, 3.]
we’ve done is rename 1 to A, q0 to s, and q1 to h. This example generalizes:
we can think of Turing machines as the same as long as one results from the
other by such a renaming of symbols and states. In fact, we can simply think
of the symbols and states of a Turing machine as positive integers: instead
of σ0 think 1, instead of σ1 think 2, etc.; ▷ is 1, 0 is 2, etc. In this way, the
Even machine becomes the machine depicted in Figure 32.2. We might call a
Turing machine with states and symbols that are positive integers a standard
machine, and only consider standard machines from now on.1
We wanted to show that the set of Turing machines is enumerable, and
with the above considerations in mind, it is enough to show that the set of
standard Turing machines is enumerable. Suppose we are given a standard
Turing machine M = ⟨Q, Σ, q0 , δ⟩. How could we describe it using a finite
string of positive integers? We’ll first list the number of states, the states
themselves, the number of symbols, the symbols themselves, and the starting
state. (Remember, all of these are positive integers, since M is a standard
machine.) What about δ? The set of possible arguments, i.e., pairs ⟨q, σ⟩, is
1 The terminology “standard machine” is not standard.


finite, since Q and Σ are finite. So the information in δ is simply the finite list
of all 5-tuples ⟨q, σ, q ′ , σ ′ , d⟩ where δ(q, σ) = ⟨q ′ , σ ′ , D⟩, and d is a number that
codes the direction D (say, 1 for L, 2 for R, and 3 for N ).
In this way, every standard Turing machine can be described by a finite list
of positive integers, i.e., as a sequence sM ∈ (Z+ )∗ . For instance, the standard
Even machine is coded by the sequence

2, 1, 2, 3, 1, 2, 3, 1, 1, 3, 2, 3, 2, 2, 2, 2, 2, 2, 2, 3, 1, 3, 2

Here the initial 2, 1, 2 lists Q (the number of states followed by the states themselves), 3, 1, 2, 3 lists Σ, the next 1 is the start state, and the three 5-tuples 1, 3, 2, 3, 2 and 2, 2, 2, 2, 2 and 2, 3, 1, 3, 2 code the instructions δ(1, 3) = ⟨2, 3, R⟩, δ(2, 2) = ⟨2, 2, R⟩, and δ(2, 3) = ⟨1, 3, R⟩.
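
A hypothetical encoder for this scheme might look as follows (the function and variable names are illustrative, not from the text):

# Sketch: encode a standard Turing machine, whose states and symbols
# are positive integers, as a finite sequence of positive integers
DIRECTION_CODE = {"L": 1, "R": 2, "N": 3}

def encode_machine(states, symbols, start, delta):
    # delta maps (state, symbol) to (new state, new symbol, direction)
    seq = [len(states), *sorted(states), len(symbols), *sorted(symbols), start]
    for (q, s), (q2, s2, d) in sorted(delta.items()):
        seq += [q, s, q2, s2, DIRECTION_CODE[d]]
    return seq

# The standard Even machine:
even = encode_machine(
    states={1, 2}, symbols={1, 2, 3}, start=1,
    delta={(1, 3): (2, 3, "R"), (2, 2): (2, 2, "R"), (2, 3): (1, 3, "R")})
assert even == [2, 1, 2, 3, 1, 2, 3, 1,
                1, 3, 2, 3, 2, 2, 2, 2, 2, 2, 2, 3, 1, 3, 2]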

Theorem 32.1. There are functions from N to N which are not Turing com-
putable.

Proof. We know that the set of finite sequences of positive integers (Z+ )∗ is
enumerable (Problem 4.7). This gives us that the set of descriptions of stan-
dard Turing machines, as a subset of (Z+ )∗ , is itself enumerable. Every Turing
computable function from N to N is computed by some (in fact, many) Turing ma-
chines. By renaming its states and symbols to positive integers (in particular,
▷ as 1, 0 as 2, and 1 as 3) we can see that every Turing computable function is
computed by a standard Turing machine. This means that the set of all Turing
computable functions from N to N is also enumerable.
On the other hand, the set of all functions from N to N is not enumerable
(Problem 4.35). If all functions were computable by some Turing machine, we
could enumerate the set of all functions by listing all the descriptions of Turing
machines that compute them. So there are some functions that are not Turing
computable.

Problem 32.1. Can you think of a way to describe Turing machines that does
not require that the states and alphabet symbols are explicitly listed? You may
define your own notion of “standard” machine, but say something about why
every Turing machine can be computed by a “standard” machine in your new
sense.


32.3 Universal Turing Machines


In section 32.2 we discussed how every Turing machine can be described by a
finite sequence of integers. This sequence encodes the states, alphabet, start
state, and instructions of the Turing machine. We also pointed out that the
set of all of these descriptions is enumerable. Since the set of such descriptions
is denumerable, this means that there is a surjective function from N to these
descriptions. Such a surjective function can be obtained, for instance, using
Cantor’s zig-zag method. It gives us a way of enumerating all (descriptions) of
Turing machines. If we fix one such enumeration, it now makes sense to talk
of the 1st, 2nd, . . . , eth Turing machine. These numbers are called indices.
Definition 32.2. If M is the eth Turing machine (in our fixed enumeration),
we say that e is an index of M . We write Me for the eth Turing machine.

A machine may have more than one index, e.g., two descriptions of M
may differ in the order in which we list its instructions, and these different
descriptions will have different indices.
Importantly, it is possible to give the enumeration of Turing machine de-
scriptions in such a way that we can effectively compute the description of M
from its index, and to effectively compute an index of a machine M from its
description. By the Church–Turing thesis, it is then possible to find a Turing
machine which recovers the description of the Turing machine with index e
and writes the corresponding description on its tape as output. The descrip-
tion would be a sequence of blocks of 1’s (representing the positive integers in
the sequence describing Me ).
Given this, it now becomes natural to ask: what functions of Turing machine
indices are themselves computable by Turing machines? What properties of
Turing machine indices can be decided by Turing machines? An example: the
function that maps an index e to the number of states the Turing machine with
index e has, is computable by a Turing machine. Here’s what such a Turing
machine would do: started on a tape containing a single block of e 1’s, it
would first decode e into its description. The description is now represented by
a sequence of blocks of 1’s on the tape. The first element in this sequence is the number of states, so all that has to be done now is to erase everything but the first block of 1’s and then halt.
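
As a sketch (decode_index is a hypothetical stand-in for the effective decoding of an index into a description, which we do not implement here):

def number_of_states(e, decode_index):
    # decode_index(e) is assumed to return the description sequence of
    # machine M_e; its first element is the number of states
    return decode_index(e)[0]
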
A remarkable result is the following:
Theorem 32.3. There is a universal Turing machine U which, when started
on input ⟨e, n⟩
1. halts iff Me halts on input n, and
2. if Me halts with output m, so does U .
U thus computes the partial function f : N × N ⇀ N given by f (e, n) = m if Me started on input n halts with output m, and undefined otherwise.

Proof. To actually produce U is basically impossible, since it is an extremely complicated machine. But we can describe in outline how it works, and then
invoke the Church–Turing thesis. When it starts, U ’s tape contains a block of e
1’s followed by a block of n 1’s. It first “decodes” the index e to the right of the
input n. This produces a list of numbers (i.e., blocks of 1’s separated by 0’s)
that describes the instructions of machine Me . U then writes the number of the
start state of Me and the number 1 on the tape to the right of the description
of Me . (Again, these are represented in unary, as blocks of 1’s.) Next, it copies
the input (block of n 1’s) to the right—but it replaces each 1 by a block of three 1’s (remember, the number of the 1 symbol is 3, 1 being the number of ▷ and
2 being the number of 0). At the left end of this sequence of blocks (separated
by 0 symbols on the tape of U ), it writes a single 1, the code for ▷.
U now has on its tape: the index e, the number n, the code number of the
start state (the “current state”), the number of the initial head position 1 (the
“current head position”), and the initial contents of the “tape” (a sequence
of blocks of 1’s representing the code numbers of the symbols of Me —the
“symbols”—separated by 0’s).
It now simulates what Me would do if started on input n, by doing the
following:

1. Find the number k of the “current head position” (at the beginning,
that’s 1),

2. Move to the kth block in the “tape” to see what the “symbol” there is,

3. Find the instruction matching the current “state” and “symbol,”

4. Move back to the kth block on the “tape” and replace the “symbol” there
with the code number of the symbol Me would write,

5. Move the head to where it records the current “state” and replace the
number there with the number of the new state,

6. Move to the place where it records the “tape position” and erase a 1 or
add a 1 (if the instruction says to move left or right, respectively).

7. Repeat.2

If Me started on input n never halts, then U also never halts, so its output is
undefined.
If in step (3) it turns out that the description of Me contains no instruction
for the current “state”/“symbol” pair, then Me would halt. If this happens, U
erases the part of its tape to the left of the “tape.” For each block of three 1’s
(representing a 1 on Me ’s tape), it writes a 1 on the left end of its own tape,
and successively erases the “tape.” When this is done, U ’s tape contains a
single block of 1’s of length m.
If U encounters something other than a block of three 1’s on the “tape,” it
immediately halts. Since U ’s tape in this case does not contain a single block
of 1’s, its output is not a natural number, i.e., f (e, n) is undefined in this case.
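
The simulation loop of U is easy to mirror in ordinary code. Here is a sketch of an interpreter for standard machines; it operates on the description sequence directly rather than on U's unary tape representation, so it illustrates the idea rather than being the machine U itself.

def parse(description):
    # split a description sequence (as in section 32.2) into start state
    # and transition function
    i = 0
    nq = description[i]; i += 1 + nq          # skip the list of states
    ns = description[i]; i += 1 + ns          # skip the list of symbols
    start = description[i]; i += 1
    delta = {}
    while i < len(description):
        q, s, q2, s2, d = description[i:i + 5]; i += 5
        delta[(q, s)] = (q2, s2, d)
    return start, delta

def run(description, n, max_steps=10**6):
    # simulate the machine on input n: symbol 1 codes the end-of-tape
    # marker, 2 codes the blank, 3 codes the stroke "1"
    state, delta = parse(description)
    tape, head = [1] + [3] * n, 1
    for _ in range(max_steps):                # a real U loops forever
        symbol = tape[head] if head < len(tape) else 2
        if (state, symbol) not in delta:      # no instruction: halt
            return tape
        state, written, d = delta[(state, symbol)]
        while head >= len(tape):
            tape.append(2)
        tape[head] = written
        head = max(head + {1: -1, 2: 1, 3: 0}[d], 0)
    raise RuntimeError("no halt within max_steps")

With the coded Even machine from above, run(even, 4) halts, while run(even, 3) exhausts the step bound, mirroring the fact that the machine halts only on an even number of strokes.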

2 We’re glossing over some subtle difficulties here. E.g., U may need some extra space when it increases the counter where it keeps track of the “current head position”—in that case it will have to move the entire “tape” to the right.


32.4 The Halting Problem


Assume we have fixed some enumeration of Turing machine descriptions. Each Turing machine thus receives an index: its place in the enumeration M1 , M2 ,
M3 , . . . of Turing machine descriptions.
We know that there must be non-Turing-computable functions: the set of
Turing machine descriptions—and hence the set of Turing machines—is enu-
merable, but the set of all functions from N to N is not. But we can find specific
examples of non-computable functions as well. One such function is the halting
function.

Definition 32.4 (Halting function). The halting function h is defined as

h(e, n) = 0 if machine Me does not halt for input n
h(e, n) = 1 if machine Me halts for input n

Definition 32.5 (Halting problem). The Halting Problem is the problem of determining (for any e, n) whether the Turing machine Me halts for an input of n strokes.

We show that h is not Turing-computable by showing that a related function, s, is not Turing-computable. This proof relies on the fact that anything
that can be computed by a Turing machine can be computed by a disciplined
Turing machine (section 31.7), and the fact that two Turing machines can be
hooked together to create a single machine (section 31.8).

Definition 32.6. The function s is defined as

s(e) = 0 if machine Me does not halt for input e
s(e) = 1 if machine Me halts for input e

Lemma 32.7. The function s is not Turing computable.

Proof. We suppose, for contradiction, that the function s is Turing computable. Then there would be a Turing machine S that computes s. We may assume,
without loss of generality, that when S halts, it does so while scanning the
first square (i.e., that it is disciplined). This machine can be “hooked up” to
another machine J, which halts if it is started on input 0 (i.e., if it reads 0
in the initial state while scanning the square to the right of the end-of-tape
symbol), and otherwise wanders off to the right, never halting. S ⌢ J, the
machine created by hooking S to J, is a Turing machine, so it is Me for some e
(i.e., it appears somewhere in the enumeration). Start Me on an input of e 1s.
There are two possibilities: either Me halts or it does not halt.

1. Suppose Me halts for an input of e 1s. Then s(e) = 1. So S, when started on e, halts with a single 1 as output on the tape. Then J starts with a 1
on the tape. In that case J does not halt. But Me is the machine S ⌢ J, so it should do exactly what S followed by J would do (i.e., in this case, wander off to the right and never halt). So Me cannot halt for an input of e 1’s.
2. Now suppose Me does not halt for an input of e 1s. Then s(e) = 0, and
S, when started on input e, halts with a blank tape. J, when started on
a blank tape, immediately halts. Again, Me does what S followed by J
would do, so Me must halt for an input of e 1’s.
In each case we arrive at a contradiction with our assumption. This shows
there cannot be a Turing machine S: s is not Turing computable.
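
The machine S ⌢ J is the Turing-machine version of a diagonal trick that works in any programming language. As a sketch, suppose (hypothetically) that halts(p, x) decided whether program p halts on input x; then the following program is its own counterexample:

def j(p):
    # plays the role of S followed by J: loop forever if p halts on
    # itself, halt otherwise; halts() is hypothetical and cannot exist
    if halts(p, p):
        while True:
            pass
    return 0

# j(j) halts iff halts(j, j) returns False iff j(j) does not halt --
# a contradiction, so no such halts() can be written.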

Theorem 32.8 (Unsolvability of the Halting Problem). The halting problem is unsolvable, i.e., the function h is not Turing computable.

Proof. Suppose h were Turing computable, say, by a Turing machine H. We could use H to build a Turing machine that computes s: First, make a copy
of the input (separated by a 0 symbol). Then move back to the beginning,
and run H. We can clearly make a machine that does the former (see Prob-
lem 31.13), and if H existed, we would be able to “hook it up” to such a copier
machine to get a new machine which would determine if Me halts on input e,
i.e., computes s. But we’ve already shown that no such machine can exist.
Hence, h is also not Turing computable.
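
In programming terms, the reduction in this proof is one line: the copier turns input e into the pair ⟨e, e⟩ and H does the rest. A hedged sketch:

def s_from_h(h):
    # if h(e, n) were computable, s would be too: s(e) = h(e, e)
    return lambda e: h(e, e)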

Problem 32.2. The Three Halting (3-Halt) problem is the problem of giving
a decision procedure to determine whether or not an arbitrarily chosen Turing
machine halts for an input of three 1’s on an otherwise blank tape. Prove that
the 3-Halt problem is unsolvable.

Problem 32.3. Show that if the halting problem is solvable for Turing ma-
chine and input pairs Me and n where e ̸= n, then it is also solvable for the
cases where e = n.

Problem 32.4. We proved that the halting problem is unsolvable if the input
is a number e, which identifies a Turing machine Me via an enumeration of all
Turing machines. What if we allow the description of Turing machines from
section 32.2 directly as input? Can there be a Turing machine which decides
the halting problem but takes as input descriptions of Turing machines rather
than indices? Explain why or why not.

Problem 32.5. Show that the partial function s′ defined as

s′ (e) = 1 if machine Me halts for input e
s′ (e) = undefined if machine Me does not halt for input e

is Turing computable.



32.5 The Decision Problem


We say that first-order logic is decidable iff there is an effective method for
determining whether or not a given sentence is valid. As it turns out, there
is no such method: the problem of deciding validity of first-order sentences is
unsolvable.
In order to establish this important negative result, we prove that the de-
cision problem cannot be solved by a Turing machine. That is, we show that
there is no Turing machine which, whenever it is started on a tape that contains
a first-order sentence, eventually halts and outputs either 1 or 0 depending on
whether the sentence is valid or not. By the Church–Turing thesis, every func-
tion which is computable is Turing computable. So if this “validity function”
were effectively computable at all, it would be Turing computable. If it isn’t
Turing computable, then, it also cannot be effectively computable.
Our strategy for proving that the decision problem is unsolvable is to reduce
the halting problem to it. This means the following: We have proved that the
function h(e, w) that halts with output 1 if the Turing machine described by e
halts on input w and outputs 0 otherwise, is not Turing computable. We will
show that if there were a Turing machine that decides validity of first-order
sentences, then there is also a Turing machine that computes h. Since h cannot
be computed by a Turing machine, there cannot be a Turing machine that
decides validity either.
The first step in this strategy is to show that for every input w and Turing machine M , we can effectively describe a sentence τ (M, w) representing the
instruction set of M and the input w and a sentence α(M, w) expressing “M
eventually halts” such that:

⊨ τ (M, w) → α(M, w) iff M halts for input w.

The bulk of our proof will consist in describing these sentences τ (M, w) and α(M, w)
and in verifying that τ (M, w) → α(M, w) is valid iff M halts on input w.
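
Schematically, the reduction looks like this (tau, alpha, and decide_valid are hypothetical stand-ins for the constructions of the following sections and the assumed decision-problem machine):

def solve_halting(e, w, tau, alpha, decide_valid):
    # M_e halts on w iff tau(M_e, w) -> alpha(M_e, w) is valid, so a
    # validity decider would yield a halting decider
    sentence = "({}) -> ({})".format(tau(e, w), alpha(e, w))
    return decide_valid(sentence)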


32.6 Representing Turing Machines


In order to represent Turing machines and their behavior by a sentence of first-
order logic, we have to define a suitable language. The language consists of
two parts: predicate symbols for describing configurations of the machine, and
expressions for numbering execution steps (“moments”) and positions on the
tape.
We introduce two kinds of predicate symbols, both of them 2-place: For
each state q, a predicate symbol Qq , and for each tape symbol σ, a predicate
symbol Sσ . The former allow us to describe the state of M and the position of
its tape head, the latter allow us to describe the contents of the tape.

In order to express the positions of the tape head and the number of steps
executed, we need a way to express numbers. This is done using a constant
symbol 0, and a 1-place function ′, the successor function. By convention it is
written after its argument (and we leave out the parentheses).
For each number n there is a canonical term n, the numeral for n, which
represents it in LM . 0 is 0, 1 is 0′ , 2 is 0′′ , and so on. More formally:

0=0
n + 1 = n′

The term 0, i.e., 0 names the leftmost position on the tape as well as the time
before the first execution step (the initial configuration). The term 1, i.e., 0′
names the square to the right of the leftmost square, and the time after the
first execution step, and so on.
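
As a quick sketch, numerals can be generated mechanically:

def numeral(n):
    # the canonical term for n: the symbol 0 followed by n successor strokes
    return "0" + "'" * n

assert numeral(0) == "0" and numeral(2) == "0''"
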
We also introduce a predicate symbol < to express both the ordering of
tape positions (when it means “to the left of”) and execution steps (then it
means “before”).
Once we have the language in place, we list the “axioms” of τ (M, w), i.e.,
the sentences which, taken together, describe the behavior of M when run on
input w. There will be sentences which lay down conditions on 0, ′, and <,
sentences that describe the input configuration, and sentences that describe
what the configuration of M is after it executes a particular instruction.

Definition 32.9. Given a Turing machine M = ⟨Q, Σ, q0 , δ⟩, the language LM
consists of:

1. A two-place predicate symbol Qq (x, y) for every state q ∈ Q. Intuitively, Qq (m, n) expresses “after n steps, M is in state q scanning the mth square.”

2. A two-place predicate symbol Sσ (x, y) for every symbol σ ∈ Σ. Intuitively, Sσ (m, n) expresses “after n steps, the mth square contains symbol σ.”

3. A constant symbol 0

4. A one-place function symbol ′

5. A two-place predicate symbol <

The sentences describing the operation of the Turing machine M on input w = σi1 . . . σik are the following:

1. Axioms describing numbers and <:

a) A sentence that says that every number is less than its successor:

∀x x < x′


b) A sentence that ensures that < is transitive:

∀x ∀y ∀z ((x < y ∧ y < z) → x < z)

2. Axioms describing the input configuration:


a) After 0 steps—before the machine starts—M is in the initial state q0 ,
scanning square 1:
Qq0 (1, 0)
b) The first k + 1 squares contain the symbols ▷, σi1 , . . . , σik :

S▷ (0, 0) ∧ Sσi1 (1, 0) ∧ · · · ∧ Sσik (k, 0)

c) Otherwise, the tape is empty:

∀x (k < x → S0 (x, 0))

3. Axioms describing the transition from one configuration to the next:


For the following, let φ(x, y) be the conjunction of all sentences of the
form
∀z (((z < x ∨ x < z) ∧ Sσ (z, y)) → Sσ (z, y ′ ))
where σ ∈ Σ. We use φ(m, n) to express “other than at square m, the
tape after n + 1 steps is the same as after n steps.”
a) For every instruction δ(qi , σ) = ⟨qj , σ ′ , R⟩, the sentence:

∀x ∀y ((Qqi (x, y) ∧ Sσ (x, y)) → (Qqj (x′ , y ′ ) ∧ Sσ′ (x, y ′ ) ∧ φ(x, y)))

This says that if, after y steps, the machine is in state qi scanning
square x which contains symbol σ, then after y+1 steps it is scanning
square x+1, is in state qj , square x now contains σ ′ , and every square
other than x contains the same symbol as it did after y steps.
b) For every instruction δ(qi , σ) = ⟨qj , σ ′ , L⟩, the sentence:

∀x ∀y ((Qqi (x′ , y) ∧ Sσ (x′ , y)) → (Qqj (x, y ′ ) ∧ Sσ′ (x′ , y ′ ) ∧ φ(x, y))) ∧
∀y ((Qqi (0, y) ∧ Sσ (0, y)) → (Qqj (0, y ′ ) ∧ Sσ′ (0, y ′ ) ∧ φ(0, y)))

Take a moment to think about how this works: now we don’t start
with “if scanning square x . . . ” but: “if scanning square x+1 . . . ” A
move to the left means that in the next step the machine is scanning
square x. But the square that is written on is x + 1. We do it this
way since we don’t have subtraction or a predecessor function.

Note that numbers of the form x + 1 are 1, 2, . . . , i.e., this doesn’t cover the case where the machine is scanning square 0 and is sup-
posed to move left (which of course it can’t—it just stays put). That
special case is covered by the second conjunction: it says that if, af-
ter y steps, the machine is scanning square 0 in state qi and square 0
contains symbol σ, then after y + 1 steps it’s still scanning square 0,
is now in state qj , the symbol on square 0 is σ ′ , and the squares
other than square 0 contain the same symbols they contained after
y steps.
c) For every instruction δ(qi , σ) = ⟨qj , σ ′ , N ⟩, the sentence:

∀x ∀y ((Qqi (x, y) ∧ Sσ (x, y)) → (Qqj (x, y ′ ) ∧ Sσ′ (x, y ′ ) ∧ φ(x, y)))

Let τ (M, w) be the conjunction of all the above sentences for Turing machine M
and input w.
In order to express that M eventually halts, we have to find a sentence that
says “after some number of steps, the transition function will be undefined.”
Let X be the set of all pairs ⟨q, σ⟩ such that δ(q, σ) is undefined. Let α(M, w)
then be the sentence
∃x ∃y ⋁⟨q,σ⟩∈X (Qq (x, y) ∧ Sσ (x, y))

If we use a Turing machine with a designated halting state h, it is even easier: then the sentence α(M, w)

∃x ∃y Qh (x, y)

expresses that the machine eventually halts.
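
To make the construction concrete, here is a sketch that assembles a few of these sentences as ASCII strings (Ax abbreviates ∀x, & abbreviates ∧, "marker" stands for ▷, and phi is left as an uninterpreted string; only the axioms for <, the input configuration, and the right-move instructions are shown):

def tau_sketch(delta, w):
    # delta maps (q, s) to (q2, s2, d); w is a list of input symbols
    def num(n):                       # the numeral for n
        return "0" + "'" * n
    axioms = ["Ax x < x'",
              "Ax Ay Az ((x < y & y < z) -> x < z)",
              "Q_q0({}, {})".format(num(1), num(0))]
    tape = ["S_marker({}, {})".format(num(0), num(0))]
    tape += ["S_{}({}, {})".format(s, num(i + 1), num(0))
             for i, s in enumerate(w)]
    axioms.append(" & ".join(tape))
    for (q, s), (q2, s2, d) in delta.items():
        if d == "R":                  # cf. Definition 32.9(3a)
            axioms.append(
                ("Ax Ay ((Q_{q}(x, y) & S_{s}(x, y)) -> "
                 "(Q_{q2}(x', y') & S_{s2}(x, y') & phi(x, y)))"
                 ).format(q=q, s=s, q2=q2, s2=s2))
    return axioms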

Proposition 32.10. If m < k, then τ (M, w) ⊨ m < k.

Proof. Exercise.

Problem 32.6. Prove Proposition 32.10. (Hint: use induction on k − m).


32.7 Verifying the Representation


In order to verify that our representation works, we have to prove two things.
First, we have to show that if M halts on input w, then τ (M, w) → α(M, w) is
valid. Then, we have to show the converse, i.e., that if τ (M, w) → α(M, w) is
valid, then M does in fact eventually halt when run on input w.


The strategy for proving these is very different. For the first result, we have
to show that a sentence of first-order logic (namely, τ (M, w) → α(M, w)) is
valid. The easiest way to do this is to give a derivation. Our proof is supposed
to work for all M and w, though, so there isn’t really a single sentence for which
we have to give a derivation, but infinitely many. So the best we can do is to
prove by induction that, whatever M and w look like, and however many steps
it takes M to halt on input w, there will be a derivation of τ (M, w) → α(M, w).
Naturally, our induction will proceed on the number of steps M takes before
it reaches a halting configuration. In our inductive proof, we’ll establish that
for each step n of the run of M on input w, τ (M, w) ⊨ χ(M, w, n), where
χ(M, w, n) correctly describes the configuration of M run on w after n steps.
Now if M halts on input w after, say, n steps, χ(M, w, n) will describe a
halting configuration. We’ll also show that χ(M, w, n) ⊨ α(M, w), whenever
χ(M, w, n) describes a halting configuration. So, if M halts on input w, then for
some n, M will be in a halting configuration after n steps. Hence, τ (M, w) ⊨
χ(M, w, n) where χ(M, w, n) describes a halting configuration, and since in
that case χ(M, w, n) ⊨ α(M, w), we get that τ (M, w) ⊨ α(M, w), i.e., that
⊨ τ (M, w) → α(M, w).
The strategy for the converse is very different. Here we assume that ⊨
τ (M, w) → α(M, w) and have to prove that M halts on input w. From the hy-
pothesis we get that τ (M, w) ⊨ α(M, w), i.e., α(M, w) is true in every structure
in which τ (M, w) is true. So we’ll describe a structure M in which τ (M, w)
is true: its domain will be N, and the interpretation of all the Qq and Sσ
will be given by the configurations of M during a run on input w. So, e.g.,
M ⊨ Qq (m, n) iff M , when run on input w for n steps, is in state q and scanning
square m. Now since τ (M, w) ⊨ α(M, w) by hypothesis, and since M ⊨ τ (M, w)
by construction, M ⊨ α(M, w). But M ⊨ α(M, w) iff there is some n ∈ |M| = N
so that M , run on input w, is in a halting configuration after n steps.

Definition 32.11. Let χ(M, w, n) be the sentence

Qq (m, n) ∧ Sσ0 (0, n) ∧ · · · ∧ Sσk (k, n) ∧ ∀x (k < x → S0 (x, n))

where q is the state of M at time n, M is scanning square m at time n, square i contains symbol σi at time n for 0 ≤ i ≤ k and k is the right-most non-blank
square of the tape at time 0, or the right-most square the tape head has visited
after n steps, whichever is greater.

Lemma 32.12. If M run on input w is in a halting configuration after n steps, then χ(M, w, n) ⊨ α(M, w).

Proof. Suppose that M halts for input w after n steps. There is some state q,
square m, and symbol σ such that:

1. After n steps, M is in state q scanning square m on which σ appears.

2. The transition function δ(q, σ) is undefined.

χ(M, w, n) is the description of this configuration and will include the clauses Qq (m, n) and Sσ (m, n). These clauses together imply α(M, w), i.e.,

∃x ∃y ⋁⟨q′ ,σ′ ⟩∈X (Qq′ (x, y) ∧ Sσ′ (x, y)),

since Qq (m, n) ∧ Sσ (m, n) ⊨ ⋁⟨q′ ,σ′ ⟩∈X (Qq′ (m, n) ∧ Sσ′ (m, n)), as ⟨q, σ⟩ ∈ X.

So if M halts for input w, then there is some n such that χ(M, w, n) ⊨
α(M, w). We will now show that for any time n, τ (M, w) ⊨ χ(M, w, n).
Lemma 32.13. For each n, if M has not halted after n steps, τ (M, w) ⊨ χ(M, w, n).

Proof. Induction basis: If n = 0, then the conjuncts of χ(M, w, 0) are also conjuncts of τ (M, w), so entailed by it.
Inductive hypothesis: If M has not halted before the nth step, then τ (M, w) ⊨
χ(M, w, n). We have to show that (unless χ(M, w, n) describes a halting con-
figuration), τ (M, w) ⊨ χ(M, w, n + 1).
Suppose n > 0 and after n steps, M started on w is in state q scanning
square m. Since M does not halt after n steps, there must be an instruction
of one of the following three forms in the program of M :

1. δ(q, σ) = ⟨q ′ , σ ′ , R⟩
2. δ(q, σ) = ⟨q ′ , σ ′ , L⟩
3. δ(q, σ) = ⟨q ′ , σ ′ , N ⟩

We will consider each of these three cases in turn.

1. Suppose there is an instruction of the form (1). By Definition 32.9(3a), this means that

∀x ∀y ((Qq (x, y) ∧ Sσ (x, y)) → (Qq′ (x′ , y ′ ) ∧ Sσ′ (x, y ′ ) ∧ φ(x, y)))

is a conjunct of τ (M, w). This entails the following sentence (universal instantiation, m for x and n for y):

(Qq (m, n) ∧ Sσ (m, n)) → (Qq′ (m′ , n′ ) ∧ Sσ′ (m, n′ ) ∧ φ(m, n)).

By induction hypothesis, τ (M, w) ⊨ χ(M, w, n), i.e.,

Qq (m, n) ∧ Sσ0 (0, n) ∧ · · · ∧ Sσk (k, n) ∧ ∀x (k < x → S0 (x, n))


Since after n steps, tape square m contains σ, the corresponding conjunct is Sσ (m, n), so this entails:

Qq (m, n) ∧ Sσ (m, n)

We now get

Qq′ (m′ , n′ ) ∧ Sσ′ (m, n′ ) ∧ Sσ0 (0, n′ ) ∧ · · · ∧ Sσk (k, n′ ) ∧ ∀x (k < x → S0 (x, n′ ))

as follows: The first line comes directly from the consequent of the pre-
ceding conditional, by modus ponens. Each conjunct in the middle
line—which excludes Sσm (m, n′ )—follows from the corresponding con-
junct in χ(M, w, n) together with φ(m, n).
If m < k, τ (M, w) ⊢ m < k (Proposition 32.10) and by transitivity of <,
we have ∀x (k < x → m < x). If m = k, then ∀x (k < x → m < x) by
logic alone. The last line then follows from the corresponding conjunct
in χ(M, w, n), ∀x (k < x → m < x), and φ(m, n). If m < k, this already
is χ(M, w, n + 1).
Now suppose m = k. In that case, after n + 1 steps, the tape head has also visited square k + 1, which now is the right-most square visited. So χ(M, w, n + 1) has a new conjunct, S0 (k′ , n′ ), and the last conjunct is ∀x (k′ < x → S0 (x, n′ )). We have to verify that these two sentences are also implied.
We already have ∀x (k < x → S0 (x, n′ )). In particular, this gives us k < k′ → S0 (k′ , n′ ). From the axiom ∀x x < x′ we get k < k′ . By modus ponens, S0 (k′ , n′ ) follows.
Also, since τ (M, w) ⊢ k < k′ , the axiom for transitivity of < gives us ∀x (k′ < x → S0 (x, n′ )). (We leave the verification of this as an exercise.)
2. Suppose there is an instruction of the form (2). Then, by Definition 32.9(3b),

∀x ∀y ((Qq (x′ , y) ∧ Sσ (x′ , y)) → (Qq′ (x, y ′ ) ∧ Sσ′ (x′ , y ′ ) ∧ φ(x, y))) ∧
∀y ((Qq (0, y) ∧ Sσ (0, y)) → (Qq′ (0, y ′ ) ∧ Sσ′ (0, y ′ ) ∧ φ(0, y)))

is a conjunct of τ (M, w). If m > 0, then let l = m − 1 (i.e., m = l + 1). The first conjunct of the above sentence entails the following:
(Qq (l′ , n) ∧ Sσ (l′ , n)) → (Qq′ (l, n′ ) ∧ Sσ′ (l′ , n′ ) ∧ φ(l, n))

Otherwise, let l = m = 0 and consider the following sentence entailed by the second conjunct:

((Qq (0, n) ∧ Sσ (0, n)) → (Qq′ (0, n′ ) ∧ Sσ′ (0, n′ ) ∧ φ(0, n)))

Either sentence implies

Qq′ (l, n′ ) ∧ Sσ′ (m, n′ ) ∧ Sσ0 (0, n′ ) ∧ · · · ∧ Sσk (k, n′ ) ∧ ∀x (k < x → S0 (x, n′ ))


as before. (Note that in the first case, l′ ≡ l + 1 ≡ m and in the second case l ≡ 0.) But this just is χ(M, w, n + 1).

3. Case (3) is left as an exercise.

We have shown that for any n, τ (M, w) ⊨ χ(M, w, n).

Problem 32.7. Complete case (3) of the proof of Lemma 32.13.

Problem 32.8. Give a derivation of Sσi (i, n′ ) from Sσi (i, n) and φ(m, n) (as-
suming i ̸= m, i.e., either i < m or m < i).


Problem 32.9. Give a derivation of ∀x (k′ < x → S0 (x, n′ )) from ∀x (k < x → S0 (x, n′ )), ∀x x < x′ , and ∀x ∀y ∀z ((x < y ∧ y < z) → x < z).

Lemma 32.14. If M halts on input w, then τ (M, w) → α(M, w) is valid.

Proof. By Lemma 32.13, we know that, for any time n, the description χ(M, w, n)
of the configuration of M at time n is entailed by τ (M, w). Suppose M halts af-
ter k steps. At that point, it will be scanning square m, for some m ∈ N. Then
χ(M, w, k) describes a halting configuration of M , i.e., it contains as conjuncts
both Qq (m, k) and Sσ (m, k) with δ(q, σ) undefined. Thus, by Lemma 32.12,
χ(M, w, k) ⊨ α(M, w). But since τ (M, w) ⊨ χ(M, w, k), we have τ (M, w) ⊨
α(M, w) and therefore τ (M, w) → α(M, w) is valid.

To complete the verification of our claim, we also have to establish the
reverse direction: if τ (M, w) → α(M, w) is valid, then M does in fact halt when
started on input w.

Lemma 32.15. If ⊨ τ (M, w) → α(M, w), then M halts on input w.


Proof. Consider the LM -structure M with domain N which interprets 0 as 0, ′ as the successor function, and < as the less-than relation, and the predicates Qq and Sσ as follows:

QqM = {⟨m, n⟩ : started on w, after n steps, M is in state q scanning square m}
SσM = {⟨m, n⟩ : started on w, after n steps, square m of M contains symbol σ}

In other words, we construct the structure M so that it describes what M started on input w actually does, step by step. Clearly, M ⊨ τ (M, w). If
⊨ τ (M, w) → α(M, w), then also M ⊨ α(M, w), i.e.,
M ⊨ ∃x ∃y ⋁⟨q,σ⟩∈X (Qq (x, y) ∧ Sσ (x, y)).

As |M| = N, there must be m, n ∈ N so that M ⊨ Qq (m, n) ∧ Sσ (m, n) for some q and σ such that δ(q, σ) is undefined. By the definition of M, this means
that M started on input w after n steps is in state q and reading symbol σ,
and the transition function is undefined, i.e., M has halted.


32.8 The Decision Problem is Unsolvable


Theorem 32.16. The decision problem is unsolvable: There is no Turing machine D which, when started on a tape that contains a sentence ψ of first-order logic as input, eventually halts and outputs 1 iff ψ is valid and 0 otherwise.

Proof. Suppose the decision problem were solvable, i.e., suppose there were a
Turing machine D. Then we could solve the halting problem as follows. We
construct a Turing machine E that, given as input the number e of Turing
machine Me and input w, computes the corresponding sentence τ (Me , w) →
α(Me , w) and halts, scanning the leftmost square on the tape. The machine
E ⌢ D would then, given input e and w, first compute τ (Me , w) → α(Me , w)
and then run the decision problem machine D on that input. D halts with out-
put 1 iff τ (Me , w)→α(Me , w) is valid and outputs 0 otherwise. By Lemma 32.15
and Lemma 32.14, τ (Me , w) → α(Me , w) is valid iff Me halts on input w. Thus,
E ⌢ D, given input e and w halts with output 1 iff Me halts on input w and
halts with output 0 otherwise. In other words, E ⌢ D would solve the halting
problem. But we know, by Theorem 32.8, that no such Turing machine can
exist.


Corollary 32.17. It is undecidable if an arbitrary sentence of first-order logic is satisfiable.

Proof. Suppose satisfiability were decidable by a Turing machine S. Then we could solve the decision problem as follows: Given a sentence ψ as input, move ψ to the right one square. Return to square 1 and write the symbol ¬.
Now run the Turing machine S. It eventually halts with output either 1 (if
¬ψ is satisfiable) or 0 (if ¬ψ is unsatisfiable) on the tape. If there is a 1 on
square 1, erase it; if square 1 is empty, write a 1, then halt.
This Turing machine always halts, and its output is 1 iff ¬ψ is unsatisfiable
and 0 otherwise. Since ψ is valid iff ¬ψ is unsatisfiable, the machine outputs 1
iff ψ is valid, and 0 otherwise, i.e., it would solve the decision problem.

So there is no Turing machine which always gives a correct “yes” or “no”
answer to the question “Is ψ a valid sentence of first-order logic?” However,
there is a Turing machine that always gives a correct “yes” answer—but simply
does not halt if the answer is “no.” This follows from the soundness and
completeness theorem of first-order logic, and the fact that derivations can be
effectively enumerated.

Theorem 32.18. Validity of first-order sentences is semi-decidable: There is a Turing machine E which, when started on a tape that contains a sentence ψ of first-order logic as input, eventually halts and outputs 1 iff ψ is valid, but does not halt otherwise.

Proof. All possible derivations of first-order logic can be generated, one after
another, by an effective algorithm. The machine E does this, and when it finds
a derivation that shows that ⊢ ψ, it halts with output 1. By the soundness
theorem, if E halts with output 1, it’s because ⊨ ψ. By the completeness the-
orem, if ⊨ ψ there is a derivation that shows that ⊢ ψ. Since E systematically
generates all possible derivations, it will eventually find one that shows ⊢ ψ, so
will eventually halt with output 1.
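
The proof describes a standard “enumerate and check” semi-decision procedure. As a sketch (derivations_of_size and conclusion are hypothetical stand-ins for an effective enumeration of derivations and for reading off what a derivation proves):

import itertools

def semi_decide_valid(sentence, derivations_of_size, conclusion):
    # halts with output 1 iff some derivation proves the sentence;
    # otherwise this loop runs forever
    for n in itertools.count():
        for d in derivations_of_size(n):
            if conclusion(d) == sentence:
                return 1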


32.9 Trakhtenbrot’s Theorem


In section 32.6 we defined sentences τ (M, w) and α(M, w) for a Turing ma-
chine M and input string w. Then we showed in Lemma 32.14 and Lemma 32.15
that τ (M, w) → α(M, w) is valid iff M , started on input w, eventually halts.
Since the Halting Problem is undecidable, this implies that validity and satisfiability of sentences of first-order logic are undecidable (Theorem 32.16 and Corollary 32.17).
But validity and satisfiability of sentences are defined for arbitrary structures, finite or infinite. You might suspect that it is easier to decide if a sentence is satisfiable in a finite structure (or valid in all finite structures). We can adapt
the proof of the unsolvability of the decision problem so that it shows this is
not the case.
First, if you go back to the proof of Lemma 32.15, you’ll see that what
we did there is produce a model M of τ (M, w) which describes exactly what
machine M does when started on input w. The domain of that model was N,
i.e., infinite. But if M actually halts on input w, we can build a finite model M′
in the same way. Suppose M started on input w halts after k steps. Take as
domain |M′ | the set {0, . . . , n}, where n is the larger of k and the length of w,
and let

′M′ (x) = x + 1 if x < n, and ′M′ (x) = n otherwise,

and ⟨x, y⟩ ∈ <M′ iff x < y or x = y = n. Otherwise M′ is defined just like M.
By the definition of M′ , just like in the proof of Lemma 32.15, M′ ⊨ τ (M, w).
And since we assumed that M halts on input w, M′ ⊨ α(M, w). So, M′ is a
finite model of τ (M, w) ∧ α(M, w) (note that we’ve replaced → with ∧).
We are halfway to a proof: we’ve shown that if M halts on input w, then
τ (M, w) ∧ α(M, w) has a finite model. Unfortunately, the converse of this does
not hold, i.e., there are Turing machines that don’t halt on some input w,
but τ (M, w) ∧ α(M, w) still has a finite model. For instance, consider the ma-
chine M with the single state q0 and instruction δ(q0 , 0) = ⟨q0 , 0, N ⟩. Started
on empty input w = Λ, this machine never halts: it is in an infinite loop, but
does not change the tape or move the head. All configurations are the same
(same state, same head position, same tape contents). We can define a finite
structure M′′ that satisfies τ (M, Λ) ∧ α(M, Λ) (exercise). We can, however,
change τ (M, w) in a suitable way so that such structures are ruled out.
Problem 32.10. Let M be a Turing machine with the single state q0 and single instruction δ(q0 , 0) = ⟨q0 , 0, N ⟩. Let |M′′ | = {0, 1, 2}, ′M′′ (0) = ′M′′ (1) = 1 and ′M′′ (2) = 2, and <M′′ = {⟨0, 1⟩, ⟨1, 1⟩, ⟨2, 2⟩}. Define Qq0M′′ , S0M′′ , and S▷M′′ so that τ (M, Λ) and α(M, Λ) become true and explain why they are. Hint: Observe that δ(q0 , ▷) is undefined. Ensure that

Qq0 (1, n) ∧ S▷ (0, n) ∧ ∀x (0 < x → S0 (x, n)) for all n ∈ N, and
∃y (Qq0 (0, y) ∧ S▷ (0, y))

are both true in M′′ .

Consider the sentences describing the operation of the Turing machine M on input w = σi1 . . . σik :
1. Axioms describing numbers and < (just like in the definition of τ (M, w)
in section 32.6).
2. Axioms describing the input configuration: just like in the definition
of τ (M, w).


3. Axioms describing the transition from one configuration to the next:


For the following, let φ(x, y) be as before, and let

ψ(y) ≡ ∀x (x < y → x ̸= y).

a) For every instruction δ(qi , σ) = ⟨qj , σ ′ , R⟩, the sentence:

∀x ∀y ((Qqi (x, y) ∧ Sσ (x, y)) → (Qqj (x′ , y ′ ) ∧ Sσ′ (x, y ′ ) ∧ φ(x, y) ∧ ψ(y ′ )))

b) For every instruction δ(qi , σ) = ⟨qj , σ ′ , L⟩, the sentence:

∀x ∀y ((Qqi (x′ , y) ∧ Sσ (x′ , y)) → (Qqj (x, y ′ ) ∧ Sσ′ (x′ , y ′ ) ∧ φ(x, y))) ∧
∀y ((Qqi (0, y) ∧ Sσ (0, y)) → (Qqj (0, y ′ ) ∧ Sσ′ (0, y ′ ) ∧ φ(0, y) ∧ ψ(y ′ )))

c) For every instruction δ(qi , σ) = ⟨qj , σ ′ , N ⟩, the sentence:

∀x ∀y ((Qqi (x, y) ∧ Sσ (x, y)) → (Qqj (x, y ′ ) ∧ Sσ′ (x, y ′ ) ∧ φ(x, y) ∧ ψ(y ′ )))

As you can see, the sentences describing the transitions of M are the
same as the corresponding sentence in τ (M, w), except we add ψ(y ′ ) at
the end. ψ(y ′ ) ensures that the number y ′ of the “next” configuration is
different from all previous numbers 0, 0′ , . . . .
Let τ ′ (M, w) be the conjunction of all the above sentences for Turing ma-
chine M and input w.
Lemma 32.19. If M started on input w halts, then τ ′ (M, w) ∧ α(M, w) has a finite model.

Proof. Let M′ be as in the proof of Lemma 32.15, except

|M′ | = {0, . . . , n},
′M′ (x) = x + 1 if x < n, and ′M′ (x) = n otherwise,
⟨x, y⟩ ∈ <M′ iff x < y or x = y = n,

where n = max(k, len(w)) and k is the least number such that M started on input w has halted after k steps. We leave the verification that M′ ⊨ τ ′ (M, w) ∧ α(M, w) as an exercise.

Problem 32.11. Complete the proof of Lemma 32.19 by proving that M′ ⊨ τ ′ (M, w) ∧ α(M, w).


Lemma 32.20. If τ ′ (M, w) ∧ α(M, w) has a finite model, then M started on input w halts.

Proof. We show the contrapositive. Suppose that M started on w does not halt. If τ ′ (M, w) ∧ α(M, w) has no model at all, we are done. So assume M is a model of τ ′ (M, w) ∧ α(M, w). We have to show that it cannot be finite.
We can prove, just like in Lemma 32.13, that if M , started on input w, has
not halted after n steps, then τ ′ (M, w) ⊨ χ(M, w, n)∧ψ(n). Since M started on
input w does not halt, τ ′ (M, w) ⊨ χ(M, w, n)∧ψ(n) for all n ∈ N. Note that by
Proposition 32.10, τ ′ (M, w) ⊨ k < n for all k < n. Also ψ(n) ⊨ k < n → k ̸= n.
So, M ⊨ k ̸= n for all k < n, i.e., the infinitely many terms k must all have
different values in M. But this requires that |M| be infinite, so M cannot be a
finite model of τ ′ (M, w) ∧ α(M, w).

Problem 32.12. Complete the proof of Lemma 32.20 by proving that if M , started on input w, has not halted after n steps, then τ ′ (M, w) ⊨ ψ(n).

Theorem 32.21 (Trakhtenbrot’s Theorem). It is undecidable if an arbitrary sentence of first-order logic has a finite model (i.e., is finitely satisfiable).

Proof. Suppose there were a Turing machine F that decides the finite satis-
fiability problem. Then given any Turing machine M and input w, we could
compute the sentence τ ′ (M, w) ∧ α(M, w), and use F to decide if it has a finite
model. By Lemmata 32.19 and 32.20, it does iff M started on input w halts.
So we could use F to solve the halting problem, which we know is unsolvable.
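
Schematically (with hypothetical stand-ins as before):

def solve_halting_via_finite_sat(e, w, tau_prime, alpha, decide_finite_sat):
    # M_e halts on w iff tau'(M_e, w) & alpha(M_e, w) has a finite model,
    # so a finite-satisfiability decider would solve the halting problem
    return decide_finite_sat("({}) & ({})".format(tau_prime(e, w), alpha(e, w)))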

Corollary 32.22. There can be no derivation system that is sound and com-
plete for finite validity, i.e., a derivation system which has ⊢ ψ iff M ⊨ ψ for
every finite structure M.

Proof. Exercise.

Problem 32.13. Prove Corollary 32.22. Observe that ψ is satisfied in every finite structure iff ¬ψ is not finitely satisfiable. Explain why finite satisfiability
is semi-decidable in the sense of Theorem 32.18. Use this to argue that if there
were a derivation system for finite validity, then finite satisfiability would be
decidable.



Part VII

Incompleteness
Material in this part covers the incompleteness theorems. It depends
on material in the parts on first-order logic (esp., the proof system), the
material on recursive functions (in the computability part). It is based on
Jeremy Avigad’s notes with revisions by Richard Zach.

Chapter 33

Introduction to Incompleteness


33.1 Historical Background


In this section, we will briefly discuss historical developments that will help
put the incompleteness theorems in context. In particular, we will give a very
sketchy overview of the history of mathematical logic; and then say a few words
about the history of the foundations of mathematics.
The phrase “mathematical logic” is ambiguous. One can interpret the word
“mathematical” as describing the subject matter, as in, “the logic of mathe-
matics,” denoting the principles of mathematical reasoning; or as describing
the methods, as in “the mathematics of logic,” denoting a mathematical study
of the principles of reasoning. The account that follows involves mathematical
logic in both senses, often at the same time.
The study of logic began, essentially, with Aristotle, who lived approxi-
mately 384–322 bce. His Categories, Prior analytics, and Posterior analytics
include systematic studies of the principles of scientific reasoning, including a
thorough and systematic study of the syllogism.
Aristotle’s logic dominated scholastic philosophy through the middle ages;
indeed, as late as the eighteenth century, Kant maintained that Aristotle’s logic was perfect and in no need of revision. But the theory of the syllogism is far
too limited to model anything but the most superficial aspects of mathematical
reasoning. A century earlier, Leibniz, a contemporary of Newton’s, imagined
a complete “calculus” for logical reasoning, and made some rudimentary steps
towards designing such a calculus, essentially describing a version of proposi-
tional logic.
The nineteenth century was a watershed for logic. In 1854 George Boole
wrote The Laws of Thought, with a thorough algebraic study of propositional
logic that is not far from modern presentations. In 1879 Gottlob Frege pub-
lished his Begriffsschrift (Concept writing) which extends propositional logic
with quantifiers and relations, and thus includes first-order logic. In fact,
Frege’s logical systems included higher-order logic as well, and more. In his
Basic Laws of Arithmetic, Frege set out to show that all of arithmetic could
be derived in his Begriffsschrift from purely logical assumptions. Unfortunately,
these assumptions turned out to be inconsistent, as Russell showed in 1902.
But setting aside the inconsistent axiom, Frege more or less invented modern
logic singlehandedly, a startling achievement. Quantificational logic was also
developed independently by algebraically-minded thinkers after Boole, includ-
ing Peirce and Schröder.
Let us now turn to developments in the foundations of mathematics. Of
course, since logic plays an important role in mathematics, there is a good
deal of interaction with the developments just described. For example, Frege
developed his logic with the explicit purpose of showing that all of mathematics
could be based solely on his logical framework; in particular, he wished to show
that mathematics consists of a priori analytic truths instead of, as Kant had
maintained, a priori synthetic ones.
Many take the birth of mathematics proper to have occurred with the
Greeks. Euclid’s Elements, written around 300 B.C., is already a mature repre-
sentative of Greek mathematics, with its emphasis on rigor and precision. The
definitions and proofs in Euclid’s Elements survive more or less intact in high
school geometry textbooks today (to the extent that geometry is still taught
in high schools). This model of mathematical reasoning has been held to be a
paradigm for rigorous argumentation not only in mathematics but in branches
of philosophy as well. (Spinoza even presented moral and religious arguments
in the Euclidean style, which is strange to see!)
Calculus was invented by Newton and Leibniz in the seventeenth century.
(A fierce priority dispute raged for centuries, but most scholars today hold that
the two developments were for the most part independent.) Calculus involves
reasoning about, for example, infinite sums of infinitely small quantities; these
features fueled criticism by Bishop Berkeley, who argued that belief in God was
no less rational than the mathematics of his time. The methods of calculus
were widely used in the eighteenth century, for example by Leonhard Euler,
who used calculations involving infinite sums with dramatic results.
In the nineteenth century, mathematicians tried to address Berkeley’s crit-
icisms by putting calculus on a firmer foundation. Efforts by Cauchy, Weier-
strass, Bolzano, and others led to our contemporary definitions of limits, continuity, differentiation, and integration in terms of “epsilons and deltas,” in other words, devoid of any reference to infinitesimals. Later in the century, mathe-
maticians tried to push further, and explain all aspects of calculus, including
the real numbers themselves, in terms of the natural numbers. (Kronecker:
“God created the whole numbers, all else is the work of man.”) In 1872,
Dedekind wrote “Continuity and the irrational numbers,” where he showed
how to “construct” the real numbers as sets of rational numbers (which, as
you know, can be viewed as pairs of natural numbers); in 1888 he wrote “Was
sind und was sollen die Zahlen” (roughly, “What are the natural numbers, and
what should they be?”) which aimed to explain the natural numbers in purely
“logical” terms. In 1887 Kronecker wrote “Über den Zahlbegriff” (“On the
concept of number”) where he spoke of representing all mathematical objects
in terms of the integers; in 1889 Giuseppe Peano gave formal, symbolic axioms
for the natural numbers.
The end of the nineteenth century also brought a new boldness in dealing
with the infinite. Before then, infinitary objects and structures (like the set
of natural numbers) were treated gingerly; “infinitely many” was understood
as “as many as you want,” and “approaches in the limit” was understood as
“gets as close as you want.” But Georg Cantor showed that it was possible to
take the infinite at face value. Work by Cantor, Dedekind, and others help to
introduce the general set-theoretic understanding of mathematics that is now
widely accepted.
This brings us to twentieth century developments in logic and foundations.
In 1902 Russell discovered the paradox in Frege’s logical system. In 1904 Zer-
melo proved Cantor’s well-ordering principle, using the so-called “axiom of
choice”; the legitimacy of this axiom prompted a good deal of debate. Between
1910 and 1913 the three volumes of Russell and Whitehead’s Principia Mathe-
matica appeared, extending the Fregean program of establishing mathematics
on logical grounds. Unfortunately, Russell and Whitehead were forced to adopt
two principles that seemed hard to justify as purely logical: an axiom of in-
finity and an axiom of “reducibility.” In the 1900’s Poincaré criticized the use
of “impredicative definitions” in mathematics, and in the 1910’s Brouwer be-
gan proposing to refound all of mathematics in an “intuitionistic” basis, which
avoided the use of the law of the excluded middle (φ ∨ ¬φ).
Strange days indeed! The program of reducing all of mathematics to logic
is now referred to as “logicism,” and is commonly viewed as having failed, due
to the difficulties mentioned above. The program of developing mathematics
in terms of intuitionistic mental constructions is called “intuitionism,” and is
viewed as posing overly severe restrictions on everyday mathematics. Around
the turn of the century, David Hilbert, one of the most influential mathe-
maticians of all time, was a strong supporter of the new, abstract methods
introduced by Cantor and Dedekind: “no one will drive us from the paradise
that Cantor has created for us.” At the same time, he was sensitive to founda-
tional criticisms of these new methods (oddly enough, now called “classical”).
He proposed a way of having one’s cake and eating it too:


1. Represent classical methods with formal axioms and rules; represent mathematical questions as formulas in an axiomatic system.
2. Use safe, “finitary” methods to prove that these formal deductive systems
are consistent.
Hilbert’s work went a long way toward accomplishing the first goal. In 1899,
he had done this for geometry in his celebrated book Foundations of geometry.
In subsequent years, he and a number of his students and collaborators worked
on other areas of mathematics to do what Hilbert had done for geometry.
Hilbert himself gave axiom systems for arithmetic and analysis. Zermelo gave
an axiomatization of set theory, which was expanded on by Fraenkel, Skolem,
von Neumann, and others. By the mid-1920s, there were two approaches that
laid claim to the title of an axiomatization of “all” of mathematics, the Prin-
cipia mathematica of Russell and Whitehead, and what came to be known as
Zermelo–Fraenkel set theory.
In 1921, Hilbert set out on a research project to establish the goal of proving
these systems to be consistent. He was aided in this project by several of his
students, in particular Bernays, Ackermann, and later Gentzen. The basic
idea for accomplishing this goal was to cast the question of the possibility of
a derivation of an inconsistency in mathematics as a combinatorial problem
about possible sequences of symbols, namely possible sequences of sentences
which meet the criterion of being a correct derivation of, say, φ ∧ ¬φ from
the axioms of an axiom system for arithmetic, analysis, or set theory. A proof
of the impossibility of such a sequence of symbols would—since it is itself
a mathematical proof—be formalizable in these axiomatic systems. In other
words, there would be some sentence Con which states that, say, arithmetic
is consistent. Moreover, this sentence should be provable in the systems in
question, especially if its proof requires only very restricted, “finitary” means.
The second aim, that the axiom systems developed would settle every math-
ematical question, can be made precise in two ways. In one way, we can for-
mulate it as follows: For any sentence φ in the language of an axiom system
for mathematics, either φ or ¬φ is provable from the axioms. If this were true,
then there would be no sentences which can neither be proved nor refuted on
the basis of the axioms, no questions which the axioms do not settle. An axiom
system with this property is called complete. Of course, for any given sentence
it might still be a difficult task to determine which of the two alternatives
holds. But in principle there should be a method to do so. In fact, for the ax-
iom and derivation systems considered by Hilbert, completeness would imply
that such a method exists—although Hilbert did not realize this. The second
way to interpret the question would be this stronger requirement: that there
be a mechanical, computational method which would determine, for a given
sentence φ, whether it is derivable from the axioms or not.
In 1931, Gödel proved the two “incompleteness theorems,” which showed
that this program could not succeed. There is no axiom system for mathematics
which is complete, specifically, the sentence that expresses the consistency of
the axioms is a sentence which can neither be proved nor refuted.


This struck a lethal blow to Hilbert’s original program. However, as is so often the case in mathematics, it also opened up exciting new avenues for research. If there is no one, all-encompassing formal system of mathematics, it
research. If there is no one, all-encompassing formal system of mathematics, it
makes sense to develop more circumscribed systems and investigate what can be
proved in them. It also makes sense to develop less restricted methods of proof
for establishing the consistency of these systems, and to find ways to measure
how hard it is to prove their consistency. Since Gödel showed that (almost)
every formal system has questions it cannot settle, it makes sense to look for
“interesting” questions a given formal system cannot settle, and to figure out
how strong a formal system has to be to settle them. To the present day,
logicians have been pursuing these questions in a new mathematical discipline,
the theory of proofs.


33.2 Definitions
In order to carry out Hilbert’s project of formalizing mathematics and showing
that such a formalization is consistent and complete, the first order of busi-
ness would be that of picking a language, logical framework, and a system of
axioms. For our purposes, let us suppose that mathematics can be formalized
in a first-order language, i.e., that there is some set of constant symbols, func-
tion symbols, and predicate symbols which, together with the connectives and
quantifiers of first-order logic, allow us to express the claims of mathematics.
Most people agree that such a language exists: the language of set theory, in
which ∈ is the only non-logical symbol. That such a simple language is so
expressive is of course a very implausible claim at first sight, and it took a
lot of work to establish that practically all of mathematics can be expressed
in this very austere vocabulary. To keep things simple, for now, let’s restrict
our discussion to arithmetic, so the part of mathematics that just deals with
the natural numbers N. The natural language in which to express facts of
arithmetic is LA . LA contains a single two-place predicate symbol <, a single
constant symbol 0, one one-place function symbol ′, and two two-place function
symbols + and ×.

Definition 33.1. A set of sentences Γ is a theory if it is closed under entailment, i.e., if Γ = {φ : Γ ⊨ φ}.

There are two easy ways to specify theories. One is as the set of sentences
true in some structure. For instance, consider the structure for LA in which the
domain is N and all non-logical symbols are interpreted as you would expect.

Definition 33.2. The standard model of arithmetic is the structure N defined
as follows:

1. |N| = N

2. 0^N = 0

3. ′^N(n) = n + 1 for all n ∈ N

4. +^N(n, m) = n + m for all n, m ∈ N

5. ×^N(n, m) = n · m for all n, m ∈ N

6. <^N = {⟨n, m⟩ : n ∈ N, m ∈ N, n < m}

Note the difference between × and ·: × is a symbol in the language of arithmetic. Of course, we’ve chosen it to remind us of multiplication, but × is not the multiplication operation but a two-place function symbol (officially, f^2_1). By contrast, · is the ordinary multiplication function. When you see something like n · m, we mean the product of the numbers n and m; when you see something like x × y we are talking about a term in the language of arithmetic. In the standard model, the function symbol × is interpreted as the function · on the natural numbers. For addition, we use + as both the function symbol of the language of arithmetic, and the addition function on the natural numbers. Here you have to use the context to determine what is meant.
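
To make the symbol/function distinction concrete, here is a minimal sketch in Python (not part of the text; the tuple representation of terms and the name eval_term are invented for illustration) of how terms of LA are interpreted in the standard model:

    # Terms of L_A as nested tuples; 'zero' is the constant symbol 0,
    # strings like 'x' are variables, and ('succ', t), ('plus', t1, t2),
    # ('times', t1, t2) stand for t', (t1 + t2), and (t1 x t2).
    def eval_term(t, s):
        """Value of the term t in the standard model N under assignment s."""
        if t == 'zero':
            return 0                      # 0 denotes the number zero
        if isinstance(t, str):
            return s[t]                   # a variable gets its value from s
        op, *args = t
        vals = [eval_term(u, s) for u in args]
        if op == 'succ':
            return vals[0] + 1            # the symbol ' denotes successor
        if op == 'plus':
            return vals[0] + vals[1]      # the symbol + denotes addition
        if op == 'times':
            return vals[0] * vals[1]      # the symbol x denotes multiplication .
        raise ValueError(op)

    # (0' + 0'') x 0'' denotes (1 + 2) . 2 = 6:
    one = ('succ', 'zero')
    two = ('succ', one)
    print(eval_term(('times', ('plus', one, two), two), {}))  # 6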

Definition 33.3. The theory of true arithmetic is the set of sentences satisfied
in the standard model of arithmetic, i.e.,

TA = {φ : N ⊨ φ}.

TA is a theory, for whenever TA ⊨ φ, φ is satisfied in every structure which satisfies TA. Since N ⊨ TA, N ⊨ φ, and so φ ∈ TA.
The other way to specify a theory Γ is as the set of sentences entailed
by some set of sentences Γ0 . In that case, Γ is the “closure” of Γ0 under
entailment. Specifying a theory this way is only interesting if Γ0 is explicitly
specified, e.g., if the elements of Γ0 are listed. At the very least, Γ0 has to be
decidable, i.e., there has to be a computable test for when a sentence counts
as an element of Γ0 or not. We call the sentences in Γ0 axioms for Γ , and Γ
axiomatized by Γ0 .

Definition 33.4. A theory Γ is axiomatized by Γ0 iff

Γ = {φ : Γ0 ⊨ φ}

Definition 33.5. The theory Q axiomatized by the following sentences is known as “Robinson’s Q” and is a very simple theory of arithmetic.

∀x ∀y (x′ = y′ → x = y) (Q1)
∀x 0 ̸= x′ (Q2)
∀x (x = 0 ∨ ∃y x = y′) (Q3)
∀x (x + 0) = x (Q4)
∀x ∀y (x + y′) = (x + y)′ (Q5)
∀x (x × 0) = 0 (Q6)
∀x ∀y (x × y′) = ((x × y) + x) (Q7)
∀x ∀y (x < y ↔ ∃z (z′ + x) = y) (Q8)

The sentences Q1, . . . , Q8 are the axioms of Q, so Q consists of all sentences entailed by them:

Q = {φ : {Q1 , . . . , Q8 } ⊨ φ}.

Definition 33.6. Suppose φ(x) is a formula in LA with free variables x and y1, . . . , yn. Then any sentence of the form

∀y1 . . . ∀yn ((φ(0) ∧ ∀x (φ(x) → φ(x′ ))) → ∀x φ(x))

is an instance of the induction schema.


Peano arithmetic PA is the theory axiomatized by the axioms of Q together
with all instances of the induction schema.

Every instance of the induction schema is true in N. This is easiest to see
if the formula φ only has one free variable x. Then φ(x) defines a subset Xφ
of N in N. Xφ is the set of all n ∈ N such that N, s ⊨ φ(x) when s(x) = n.
The corresponding instance of the induction schema is

((φ(0) ∧ ∀x (φ(x) → φ(x′ ))) → ∀x φ(x)).

If its antecedent is true in N, then 0 ∈ Xφ and, whenever n ∈ Xφ , so is n + 1.


Since 0 ∈ Xφ , we get 1 ∈ Xφ . With 1 ∈ Xφ we get 2 ∈ Xφ . And so on. So for
every n ∈ N, n ∈ Xφ . But this means that ∀x φ(x) is satisfied in N.
Both Q and PA are axiomatized theories. The big question is, how strong
are they? For instance, can PA prove all the truths about N that can be
expressed in LA ? Specifically, do the axioms of PA settle all the questions
that can be formulated in LA ?
Another way to put this is to ask: Is PA = TA? TA obviously does prove
(i.e., it includes) all the truths about N, and it settles all the questions that
can be formulated in LA , since if φ is a sentence in LA , then either N ⊨ φ or
N ⊨ ¬φ, and so either TA ⊨ φ or TA ⊨ ¬φ. Call such a theory complete.
Definition 33.7. A theory Γ is complete iff for every sentence φ in its lan-
guage, either Γ ⊨ φ or Γ ⊨ ¬φ.

By the Completeness Theorem, Γ ⊨ φ iff Γ ⊢ φ, so Γ is complete iff for every sentence φ in its language, either Γ ⊢ φ or Γ ⊢ ¬φ.


Another question we are led to ask is this: Is there a computational proce-
dure we can use to test if a sentence is in TA, in PA, or even just in Q? We
can make this more precise by defining when a set (e.g., a set of sentences) is
decidable.

Definition 33.8. A set X is decidable iff there is a computational procedure


which on input x returns 1 if x ∈ X and 0 otherwise.

So our question becomes: Is TA (PA, Q) decidable?


The answer to all these questions will be: no. None of these theories are
decidable. However, this phenomenon is not specific to these particular theo-
ries. In fact, any theory that satisfies certain conditions is subject to the same
results. One of these conditions, which Q and PA satisfy, is that they are
axiomatized by a decidable set of axioms.

Definition 33.9. A theory is axiomatizable if it is axiomatized by a decidable


set of axioms.

Example 33.10. Any theory axiomatized by a finite set of sentences is axiomatizable, since any finite set is decidable. Thus, Q, for instance, is axiomatizable.
Schematically axiomatized theories like PA are also axiomatizable. For to
test if ψ is among the axioms of PA, i.e., to compute the function χX where
χX (ψ) = 1 if ψ is an axiom of PA and = 0 otherwise, we can do the following:
First, check if ψ is one of the axioms of Q. If it is, the answer is “yes” and the
value of χX (ψ) = 1. If not, test if it is an instance of the induction schema.
This can be done systematically; in this case, perhaps it’s easiest to see that
it can be done as follows: Any instance of the induction schema begins with a
number of universal quantifiers, and then a sub-formula that is a conditional.
The consequent of that conditional is ∀x φ(x, y1 , . . . , yn ) where x and y1 , . . . ,
yn are all the free variables of φ and the initial quantifiers of ψ bind the
variables y1 , . . . , yn . Once we have extracted this φ and checked that its free
variables match the variables bound by the universal quantifiers at the front
and ∀x, we go on to check that the antecedent of the conditional matches

φ(0, y1 , . . . , yn ) ∧ ∀x (φ(x, y1 , . . . , yn ) → φ(x′ , y1 , . . . , yn ))

Again, if it does, ψ is an instance of the induction schema, and if it doesn’t, ψ isn’t.
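
This test is easy to carry out on a structured representation of formulas. The following Python sketch (not part of the text; the tuple representation and all names are invented for illustration) checks whether a given formula is an instance of the induction schema, stripping the initial universal quantifiers and comparing the antecedent against the base and induction step built from the consequent:

    # Formulas as tuples: ('eq'/'lt', t1, t2), ('not', f), ('and'/'or'/'imp', f, g),
    # ('forall'/'exists', x, f). Terms as nested tuples, variables as strings.
    def subst_t(t, x, r):
        """Replace the variable x by the term r in the term t."""
        if t == x:
            return r
        if isinstance(t, tuple):
            return (t[0],) + tuple(subst_t(u, x, r) for u in t[1:])
        return t

    def subst_f(f, x, r):
        """Replace free occurrences of x by r in the formula f."""
        op = f[0]
        if op in ('eq', 'lt'):
            return (op, subst_t(f[1], x, r), subst_t(f[2], x, r))
        if op == 'not':
            return (op, subst_f(f[1], x, r))
        if op in ('and', 'or', 'imp'):
            return (op, subst_f(f[1], x, r), subst_f(f[2], x, r))
        if op in ('forall', 'exists'):
            return f if f[1] == x else (op, f[1], subst_f(f[2], x, r))

    def is_induction_instance(psi):
        while psi[0] == 'forall':          # strip the initial quantifiers
            psi = psi[2]
        if psi[0] != 'imp' or psi[2][0] != 'forall':
            return False
        x, phi = psi[2][1], psi[2][2]      # consequent must be (forall x) phi
        base = subst_f(phi, x, 'zero')                          # phi(0)
        step = ('forall', x, ('imp', phi, subst_f(phi, x, ('succ', x))))
        return psi[1] == ('and', base, step)   # does the antecedent match?

    phi = ('lt', 'zero', ('succ', 'x'))    # phi(x): 0 < x'
    psi = ('imp',
           ('and', ('lt', 'zero', ('succ', 'zero')),
            ('forall', 'x', ('imp', phi, ('lt', 'zero', ('succ', ('succ', 'x')))))),
           ('forall', 'x', phi))
    print(is_induction_instance(psi))      # True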

In answering this question—and the more general question of which theories


are complete or decidable—it will be useful to consider also the following defi-
nition. Recall that a set X is enumerable iff it is empty or if there is a surjective
function f : N → X. Such a function is called an enumeration of X.

Definition 33.11. A set X is called computably enumerable (c.e. for short)


iff it is empty or it has a computable enumeration.

In addition to axiomatizability, another condition on theories to which the


incompleteness theorems apply will be that they are strong enough to prove ba-
sic facts about computable functions and decidable relations. By “basic facts,”
we mean sentences which express what the values of computable functions are
for each of their arguments. And by “strong enough” we mean that the theories
in question count these sentences among their theorems. For instance, consider
a prototypical computable function: addition. The value of + for arguments
2 and 3 is 5, i.e., 2 + 3 = 5. A sentence in the language of arithmetic that
expresses that the value of + for arguments 2 and 3 is 5 is: (2 + 3) = 5. And,
e.g., Q proves this sentence. More generally, we would like there to be, for
each computable function f (x1 , x2 ) a formula φf (x1 , x2 , y) in LA such that
Q ⊢ φf (n1 , n2 , m) whenever f (n1 , n2 ) = m. In this way, Q proves that the
value of f for arguments n1 , n2 is m. In fact, we require that it proves a bit
more, namely that no other number is the value of f for arguments n1 , n2 . And
the same goes for decidable relations. This is made precise in the following two
definitions.

Definition 33.12. A formula φ(x1, . . . , xk, y) represents the function f : N^k → N in Γ iff whenever f(n1, . . . , nk) = m, then

1. Γ ⊢ φ(n1 , . . . , nk , m), and

2. Γ ⊢ ∀y(φ(n1 , . . . , nk , y) → y = m).

Definition 33.13. A formula φ(x1, . . . , xk) represents the relation R ⊆ N^k in Γ iff,

1. whenever R(n1 , . . . , nk ), Γ ⊢ φ(n1 , . . . , nk ), and

2. whenever not R(n1 , . . . , nk ), Γ ⊢ ¬φ(n1 , . . . , nk ).

A theory is “strong enough” for the incompleteness theorems to apply if


it represents all computable functions and all decidable relations. Q and its
extensions satisfy this condition, but it will take us a while to establish this—
it’s a non-trivial fact about the kinds of things Q can prove, and it’s hard to
show because Q has only a few axioms from which we’ll have to prove all these
facts. However, Q is a very weak theory. So although it’s hard to prove that
Q represents all computable functions, most interesting theories are stronger
than Q, i.e., prove more than Q does. And if Q proves something, any stronger
theory does; since Q represents all computable functions, every stronger theory
does. This means that many interesting theories meet this condition of the
incompleteness theorems. So our hard work will pay off, since it shows that
the incompleteness theorems apply to a wide range of theories. Certainly, any
theory aiming to formalize “all of mathematics” must prove everything that
Q proves, since it should at the very least be able to capture the results of elementary computations. So any theory that is a candidate for a theory of “all of mathematics” will be one to which the incompleteness theorems apply.


33.3 Overview of Incompleteness Results


Hilbert expected that mathematics could be formalized in an axiomatizable
theory which it would be possible to prove complete and decidable. Moreover,
he aimed to prove the consistency of this theory with very weak, “finitary,”
means, which would defend classical mathematics against the challenges of
intuitionism. Gödel’s incompleteness theorems showed that these goals cannot
be achieved.
Gödel’s first incompleteness theorem showed that a version of Russell and
Whitehead’s Principia Mathematica is not complete. But the proof was actu-
ally very general and applies to a wide variety of theories. This means that it
wasn’t just that Principia Mathematica did not manage to completely capture
mathematics, but that no acceptable theory does. It took a while to isolate
the features of theories that suffice for the incompleteness theorems to apply,
and to generalize Gödel’s proof so that it depends only on these fea-
tures. But we are now in a position to state a very general version of the first
incompleteness theorem for theories in the language LA of arithmetic.

Theorem 33.14. If Γ is a consistent and axiomatizable theory in LA which


represents all computable functions and decidable relations, then Γ is not com-
plete.

To say that Γ is not complete is to say that for at least one sentence φ,
Γ ⊬ φ and Γ ⊬ ¬φ. Such a sentence is called independent (of Γ ). We can in
fact relatively quickly prove that there must be independent sentences. But
the power of Gödel’s proof of the theorem lies in the fact that it exhibits a
specific example of such an independent sentence. The intriguing construction
produces a sentence γΓ , called a Gödel sentence for Γ , which is unprovable
because in Γ , γΓ is equivalent to the claim that γΓ is unprovable in Γ . It does
so constructively, i.e., given an axiomatization of Γ and a description of the
derivation system, the proof gives a method for actually writing down γΓ .
The construction in Gödel’s proof requires that we find a way to express
in LA the properties of and operations on terms and formulas of LA itself.
These include properties such as “φ is a sentence,” “δ is a derivation of φ,”
and operations such as φ[t/x]. This way must (a) express these properties and
relations via a “coding” of symbols and sequences thereof (which is what terms,
formulas, derivations, etc. are) as natural numbers (which is what LA can talk
about). It must (b) do this in such a way that Γ will prove the relevant facts,
so we must show that these properties are coded by decidable properties of
natural numbers and the operations correspond to computable functions on
natural numbers. This is called “arithmetization of syntax.”

Before we investigate how syntax can be arithmetized, however, we will consider the condition that Γ is “strong enough,” i.e., represents all computable
functions and decidable relations. This requires that we give a precise definition
of “computable.” This can be done in a number of ways, e.g., via the model
of Turing machines, or as those functions computable by programs in some
general-purpose programming language. Since our aim is to represent these
functions and relations in a theory in the language LA , however, it is best to
pick a simple definition of computability of just numerical functions. This is the
notion of recursive function. So we will first discuss the recursive functions. We
will then show that Q already represents all recursive functions and relations.
This will allow us to apply the incompleteness theorem to specific theories such
as Q and PA, since we will have established that these are examples of theories
that are “strong enough.”
The end result of the arithmetization of syntax is a formula ProvΓ (x) which,
via the coding of formulas as numbers, expresses provability from the axioms
of Γ . Specifically, if φ is coded by the number n, and Γ ⊢ φ, then Γ ⊢ ProvΓ (n).
This “provability predicate” for Γ allows us also to express, in a certain sense,
the consistency of Γ as a sentence of LA : let the “consistency statement”
for Γ be the sentence ¬ProvΓ (n), where we take n to be the code of a contra-
diction, e.g., of ⊥. The second incompleteness theorem states that consistent
axiomatizable theories also do not prove their own consistency statements. The
conditions required for this theorem to apply are a bit more stringent than just
that the theory represents all computable functions and decidable relations,
but we will show that PA satisfies them.


33.4 Undecidability and Incompleteness


Gödel’s proof of the incompleteness theorems requires the arithmetization of syntax.
But even without that we can obtain some nice results just on the assumption
that a theory represents all decidable relations. The proof is a diagonal argu-
ment similar to the proof of the undecidability of the halting problem.
Theorem 33.15. If Γ is a consistent theory that represents every decidable
relation, then Γ is not decidable.

Proof. Suppose Γ were decidable. We show that if Γ represents every decidable


relation, it must be inconsistent.
Decidable properties (one-place relations) are represented by formulas with
one free variable. Let φ0 (x), φ1 (x), . . . , be a computable enumeration of all
such formulas. Now consider the following set D ⊆ N:

D = {n : Γ ⊢ ¬φn (n)}

The set D is decidable, since we can test if n ∈ D by first computing φn (x), and
from this ¬φn (n). Obviously, substituting the term n for every free occurrence

of x in φn (x) and prefixing φn (n) by ¬ is a mechanical matter. By assumption, Γ is decidable, so we can test if ¬φn (n) ∈ Γ . If it is, n ∈ D, and if it isn’t, n ∉ D. So D is likewise decidable.
Since Γ represents all decidable properties, it represents D. And the for-
mulas which represent D in Γ are all among φ0 (x), φ1 (x), . . . . So let d be
a number such that φd (x) represents D in Γ . If d ∉ D, then, since φd (x)
represents D, Γ ⊢ ¬φd (d). But that means that d meets the defining condition
of D, and so d ∈ D. This contradicts d ∉ D. So by indirect proof, d ∈ D.
Since d ∈ D, by the definition of D, Γ ⊢ ¬φd (d). On the other hand, since
φd (x) represents D in Γ , Γ ⊢ φd (d). Hence, Γ is inconsistent.

The preceding theorem shows that no consistent theory that represents all decidable relations can be decidable. We will show that Q does represent all
decidable relations; this means that all theories that include Q, such as PA
and TA, also do, and hence also are not decidable. (Since all these theories
are true in the standard model, they are all consistent.)
We can also use this result to obtain a weak version of the first incomplete-
ness theorem. Any theory that is axiomatizable and complete is decidable.
Consistent theories that are axiomatizable and represent all decidable proper-
ties then cannot be complete.

Theorem 33.16. If Γ is axiomatizable and complete it is decidable.

Proof. Any inconsistent theory is decidable, since inconsistent theories contain


all sentences, so the answer to the question “is φ ∈ Γ ” is always “yes,” i.e., can
be decided.
So suppose Γ is consistent, and furthermore is axiomatizable, and complete.
Since Γ is axiomatizable, it is computably enumerable. For we can enumerate
all the correct derivations from the axioms of Γ by a computable function. From
a correct derivation we can compute the sentence it derives, and so together
there is a computable function that enumerates all theorems of Γ . A sentence φ is a theorem of Γ iff ¬φ is not a theorem of Γ , since Γ is consistent and complete.
We can therefore decide if φ ∈ Γ as follows. Enumerate all theorems of Γ .
When φ appears on this list, we know that Γ ⊢ φ. When ¬φ appears on this
list, we know that Γ ⊬ φ. Since Γ is complete, one of these cases eventually
obtains, so the procedure eventually produces an answer.
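
The search in this proof can be written down directly. Here is a minimal Python sketch (not from the text; the toy theory and all names are invented), using an abstract enumeration of theorems:

    from itertools import count

    def decide(phi, theorems, negate):
        """Decide phi ∈ Γ for a consistent, complete, axiomatizable Γ.
        theorems is any enumeration of the theorems of Γ; completeness
        guarantees that phi or its negation eventually shows up."""
        neg = negate(phi)
        for psi in theorems:
            if psi == phi:
                return True
            if psi == neg:
                return False

    def toy_theorems():
        """A toy complete, consistent 'theory': all true sentences of the
        forms n = m and not n = m, enumerated by a diagonal sweep."""
        for k in count():
            for n in range(k + 1):
                m = k - n
                yield ('eq', n, m) if n == m else ('not', ('eq', n, m))

    print(decide(('eq', 2, 2), toy_theorems(), lambda f: ('not', f)))  # True
    print(decide(('eq', 2, 3), toy_theorems(), lambda f: ('not', f)))  # False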

Corollary 33.17. If Γ is consistent, axiomatizable, and represents every decidable property, it is not complete.

Proof. If Γ were complete, it would be decidable by the previous theorem (since


it is axiomatizable and consistent). But since Γ represents every decidable
property, it is not decidable, by the first theorem.

Problem 33.1. Show that TA = {φ : N ⊨ φ} is not axiomatizable. You may


assume that TA represents all decidable properties.

Once we have established that, e.g., Q represents all decidable properties,
the corollary tells us that Q must be incomplete. However, its proof does
not provide an example of an independent sentence; it merely shows that such
a sentence must exist. For this, we have to arithmetize syntax and follow
Gödel’s original proof idea. And of course, we still have to show the first claim,
namely that Q does, in fact, represent all decidable properties.
It should be noted that not every interesting theory is incomplete or un-
decidable. There are many theories that are sufficiently strong to describe
interesting mathematical facts that do not satisfy the conditions of Gödel’s
result. For instance, Pres = {φ ∈ LA+ : N ⊨ φ}, the set of sentences of the
language of arithmetic without × true in the standard model, is both complete
and decidable. This theory is called Presburger arithmetic, and proves all the
truths about natural numbers that can be formulated just with 0, ′, and +.

Chapter 34

Arithmetization of Syntax

Note that arithmetization for signed tableaux is not yet available.


34.1 Introduction
In order to connect computability and logic, we need a way to talk about the ob-
jects of logic (symbols, terms, formulas, derivations), operations on them, and
their properties and relations, in a way amenable to computational treatment.
We can do this directly, by considering computable functions and relations on
symbols, sequences of symbols, and other objects built from them. Since the
objects of logical syntax are all finite and built from enumerable sets of
symbols, this is possible for some models of computation. But other models
of computation—such as the recursive functions—are restricted to numbers,
their relations and functions. Moreover, ultimately we also want to be able to
deal with syntax within certain theories, specifically, in theories formulated in the language of arithmetic. In these cases it is necessary to arithmetize syntax, i.e., to represent syntactic objects, operations on them, and their relations, as
numbers, arithmetical functions, and arithmetical relations, respectively. The
idea, which goes back to Leibniz, is to assign numbers to syntactic objects.
It is relatively straightforward to assign numbers to symbols as their “codes.”
Some symbols pose a bit of a challenge, since, e.g., there are infinitely many
variables, and even infinitely many function symbols of each arity n. But of
course it’s possible to assign numbers to symbols systematically in such a way
that, say, v2 and v3 are assigned different codes. Sequences of symbols (such as
terms and formulas) are a bigger challenge. But if we can deal with sequences
of numbers purely arithmetically (e.g., by the powers-of-primes coding of se-
quences), we can extend the coding of individual symbols to coding of sequences
of symbols, and then further to sequences or other arrangements of formulas,
such as derivations. This extended coding is called “Gödel numbering.” Every
term, formula, and derivation is assigned a Gödel number.
By coding sequences of symbols as sequences of their codes, and by choos-
ing a system of coding sequences that can be dealt with using computable
functions, we can then also deal with Gödel numbers using computable func-
tions. In practice, all the relevant functions will be primitive recursive. For
instance, computing the length of a sequence and computing the i-th element
of a sequence from the code of the sequence are both primitive recursive. If
the number coding the sequence is, e.g., the Gödel number of a formula φ, we
immediately see that the length of a formula and the (code of the) i-th symbol
in a formula can also be computed from the Gödel number of φ. It is a bit
harder to prove that, e.g., the property of being the Gödel number of a correctly
formed term or of a correct derivation is primitive recursive. It is nevertheless
possible, because the sequences of interest (terms, formulas, derivations) are
inductively defined.
As an example, consider the operation of substitution. If φ is a formula,
x a variable, and t a term, then φ[t/x] is the result of replacing every free
occurrence of x in φ by t. Now suppose we have assigned Gödel numbers to φ,
x, t—say, k, l, and m, respectively. The same scheme assigns a Gödel number
to φ[t/x], say, n. This mapping—of k, l, and m to n—is the arithmetical analog
of the substitution operation. When the substitution operation maps φ, x, t to
φ[t/x], the arithmetized substitution functions maps the Gödel numbers k, l,
m to the Gödel number n. We will see that this function is primitive recursive.
Arithmetization of syntax is not just of abstract interest, although it was
originally a non-trivial insight that languages like the language of arithmetic,
which do not come with mechanisms for “talking about” languages can, after
all, formalize complex properties of expressions. It is then just a small step to
ask what a theory in this language, such as Peano arithmetic, can prove about
its own language (including, e.g., whether sentences are provable or true). This
leads us to the famous limitative theorems of Gödel (about unprovability) and
Tarski (the undefinability of truth). But the trick of arithmetizing syntax is
also important in order to prove some important results in computability the-
ory, e.g., about the computational power of theories or the relationship between

Release : 6891b66 (2024-12-01) 505


CHAPTER 34. ARITHMETIZATION OF SYNTAX

different models of computability. The arithmetization of syntax serves as a


model for arithmetizing other objects and properties. For instance, it is sim-
ilarly possible to arithmetize configurations and computations (say, of Turing
machines). This makes it possible to simulate computations in one model (e.g.,
Turing machines) in another (e.g., recursive functions).


34.2 Coding Symbols


The basic language L of first-order logic makes use of the symbols

⊥ ¬ ∨ ∧ → ∀ ∃ = ( ) ,

together with enumerable sets of variables and constant symbols, and enumer-
able sets of function symbols and predicate symbols of arbitrary arity. We can
assign codes to each of these symbols in such a way that every symbol is as-
signed a unique number as its code, and no two different symbols are assigned
the same number. We know that this is possible since the set of all symbols is
enumerable and so there is a bijection between it and the set of natural num-
bers. But we want to make sure that we can recover the symbol (as well as
some information about it, e.g., the arity of a function symbol) from its code
in a computable way. There are many possible ways of doing this, of course.
Here is one such way, which uses primitive recursive functions. (Recall that
⟨n0 , . . . , nk ⟩ is the number coding the sequence of numbers n0 , . . . , nk .)

Definition 34.1. If s is a symbol of L, let the symbol code cs be defined as


follows:

1. If s is among the logical symbols, cs is given by the following table:

⊥ ¬ ∨ ∧ → ∀
⟨0, 0⟩ ⟨0, 1⟩ ⟨0, 2⟩ ⟨0, 3⟩ ⟨0, 4⟩ ⟨0, 5⟩
∃ = ( ) ,
⟨0, 6⟩ ⟨0, 7⟩ ⟨0, 8⟩ ⟨0, 9⟩ ⟨0, 10⟩

2. If s is the i-th variable vi , then cs = ⟨1, i⟩.

3. If s is the i-th constant symbol ci , then cs = ⟨2, i⟩.

4. If s is the i-th n-ary function symbol fin , then cs = ⟨3, n, i⟩.

5. If s is the i-th n-ary predicate symbol Pin , then cs = ⟨4, n, i⟩.

Proposition 34.2. The following relations are primitive recursive:

1. Fn(x, n) iff x is the code of fin for some i, i.e., x is the code of an n-ary
function symbol.


2. Pred(x, n) iff x is the code of Pin for some i or x is the code of = and
n = 2, i.e., x is the code of an n-ary predicate symbol.
Definition 34.3. If s0 , . . . , sn−1 is a sequence of symbols, its Gödel number is
⟨cs0 , . . . , csn−1 ⟩.
Note that codes and Gödel numbers are different things. For instance, the variable v5 has a code cv5 = ⟨1, 5⟩ = 2^2 · 3^6. But the variable v5 considered as a term is also a sequence of symbols (of length 1). The Gödel number # v5 # of the term v5 is ⟨cv5⟩ = 2^(cv5+1) = 2^(2^2·3^6+1).
Example 34.4. Recall that if k0, . . . , kn−1 is a sequence of numbers, then the code of the sequence ⟨k0, . . . , kn−1⟩ in the power-of-primes coding is

2^(k0+1) · 3^(k1+1) · . . . · p_{n−1}^(k_{n−1}+1),

where pi is the i-th prime (starting with p0 = 2). So for instance, the formula v0 = 0, or, more explicitly, =(v0, c0), has the Gödel number

⟨c=, c(, cv0, c,, cc0, c)⟩.

Here, c= is ⟨0, 7⟩ = 2^(0+1) · 3^(7+1), cv0 is ⟨1, 0⟩ = 2^(1+1) · 3^(0+1), etc. So # =(v0, c0)# is

2^(c=+1) · 3^(c(+1) · 5^(cv0+1) · 7^(c,+1) · 11^(cc0+1) · 13^(c)+1) =
2^(2^1·3^8+1) · 3^(2^1·3^9+1) · 5^(2^2·3^1+1) · 7^(2^1·3^11+1) · 11^(2^3·3^1+1) · 13^(2^1·3^10+1) =
2^13123 · 3^39367 · 5^13 · 7^354295 · 11^25 · 13^118099.
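
This computation is easy to check mechanically. Here is a minimal Python sketch (not part of the text; the helper names nth_prime and code_seq are invented) of the power-of-primes coding, which recomputes the six exponents above:

    def nth_prime(i):
        """The i-th prime, starting with p_0 = 2 (naive trial division)."""
        found, m = -1, 1
        while found < i:
            m += 1
            if all(m % d != 0 for d in range(2, int(m ** 0.5) + 1)):
                found += 1
        return m

    def code_seq(ks):
        """Code <k_0, ..., k_{n-1}> as 2^(k_0+1) * 3^(k_1+1) * ..."""
        n = 1
        for i, k in enumerate(ks):
            n *= nth_prime(i) ** (k + 1)
        return n

    # Symbol codes from Definition 34.1: c_= = <0,7>, c_( = <0,8>,
    # c_v0 = <1,0>, c_, = <0,10>, c_c0 = <2,0>, c_) = <0,9>.
    codes = [code_seq([0, 7]), code_seq([0, 8]), code_seq([1, 0]),
             code_seq([0, 10]), code_seq([2, 0]), code_seq([0, 9])]

    # Exponents of 2, 3, 5, 7, 11, 13 in the Gödel number of =(v0, c0):
    print([c + 1 for c in codes])
    # [13123, 39367, 13, 354295, 25, 118099]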


34.3 Coding Terms


A term is simply a certain kind of sequence of symbols: it is built up inductively
from constants and variables according to the formation rules for terms. Since
sequences of symbols can be coded as numbers—using a coding scheme for the
symbols plus a way to code sequences of numbers—assigning Gödel numbers
to terms is not difficult. The challenge is rather to show that the property a
number has if it is the Gödel number of a correctly formed term is computable,
or in fact primitive recursive.
Variables and constant symbols are the simplest terms, and testing whether
x is the Gödel number of such a term is easy: Var(x) holds if x is # vi # for
some i. In other words, x is a sequence of length 1 and its single element (x)0
is the code of some variable vi , i.e., x is ⟨⟨1, i⟩⟩ for some i. Similarly, Const(x)
holds if x is # ci # for some i. Both of these relations are primitive recursive,
since if such an i exists, it must be < x:
Var(x) ⇔ (∃i < x) x = ⟨⟨1, i⟩⟩
Const(x) ⇔ (∃i < x) x = ⟨⟨2, i⟩⟩
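
As a quick illustration, the bounded search behind Var(x) terminates in practice because the candidate codes grow strictly with i. A Python sketch (invented names; ⟨⟨1, i⟩⟩ computed directly from the definitions):

    def var_term_code(i):
        """<<1, i>> = 2^(<1, i> + 1), where <1, i> = 2^2 * 3^(i+1)."""
        return 2 ** (2 ** 2 * 3 ** (i + 1) + 1)

    def is_var_code(x):
        """Var(x): is x = <<1, i>> for some i? Stop once candidates exceed x."""
        i = 0
        while var_term_code(i) <= x:
            if var_term_code(i) == x:
                return True
            i += 1
        return False

    print(is_var_code(2 ** 13))   # True: <<1, 0>> = 2^(12 + 1), the term v0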


Proposition 34.5. The relations Term(x) and ClTerm(x) which hold iff x
is the Gödel number of a term or a closed term, respectively, are primitive
recursive.

Proof. A sequence of symbols s is a term iff there is a sequence s0 , . . . , sk−1 = s


of terms which records how the term s was formed from constant symbols and
variables according to the formation rules for terms. To express that such a
putative formation sequence follows the formation rules it has to be the case
that, for each i < k, either

1. si is a variable vj , or

2. si is a constant symbol cj , or

3. si is built from n terms t1 , . . . , tn occurring prior to place i using an


n-place function symbol fjn .

To show that the corresponding relation on Gödel numbers is primitive recursive, we have to express this condition primitive recursively, i.e., using primitive
recursive functions, relations, and bounded quantification.
Suppose y is the number that codes the sequence s0 , . . . , sk−1 , i.e., y =
⟨# s0 # , . . . , # sk−1 # ⟩. It codes a formation sequence for the term with Gödel
number x iff for all i < k:

1. Var((y)i ), or

2. Const((y)i ), or

3. there is an n and a number z = ⟨z1 , . . . , zn ⟩ such that each zl is equal to


some (y)i′ for i′ < i and

(y)i = # fjn (# ⌢ flatten(z) ⌢ # )# ,

and moreover (y)k−1 = x. (The function flatten(z) turns the sequence ⟨# t1 # , . . . , # tn # ⟩


into # t1 , . . . , tn # and is primitive recursive.)
The indices j, n, the Gödel numbers zl of the terms tl , and the code z of
the sequence ⟨z1 , . . . , zn ⟩, in (3) are all less than y. We can replace k above
with len(y). Hence we can express “y is the code of a formation sequence of the
term with Gödel number x” in a way that shows that this relation is primitive
recursive.
We now just have to convince ourselves that there is a primitive recursive
bound on y. But if x is the Gödel number of a term, it must have a formation
sequence with at most len(x) terms (since every term in the formation sequence
of s must start at some place in s, and no two subterms can start at the same
place). The Gödel number of each subterm of s is of course ≤ x. Hence, there always is a formation sequence with code ≤ p_{k−1}^(k(x+1)), where k = len(x).
For ClTerm, simply leave out the clause for variables.
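
The shape of this argument is easy to see on a toy representation. Here is a Python sketch (not from the text; terms are nested tuples, variables and constants plain strings) of the formation-sequence test:

    def is_formation_seq(seq):
        """Each entry must be a variable or constant (a string), or a term
        f(t1, ..., tn) (a tuple) all of whose arguments occur earlier."""
        for i, t in enumerate(seq):
            if isinstance(t, str):
                continue                       # a variable or constant symbol
            if not all(u in seq[:i] for u in t[1:]):
                return False                   # some argument not yet formed
        return True

    good = ['v0', 'c0', ('f', 'v0', 'c0'), ('g', ('f', 'v0', 'c0'))]
    print(is_formation_seq(good))                   # True
    print(is_formation_seq([('f', 'v0', 'c0')]))    # False: v0, c0 missing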


Problem 34.1. Show that the function flatten(z), which turns the sequence
⟨# t1 # , . . . , # tn # ⟩ into # t1 , . . . , tn # , is primitive recursive.

Proposition 34.6. The function num(n) = # n# is primitive recursive.

Proof. We define num(n) by primitive recursion:

num(0) = # 0#
num(n + 1) = # ′(# ⌢ num(n) ⌢ # )# .
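
In code, this recursion looks as follows (a Python sketch with invented names; we assume 0 is the constant symbol c0 with code ⟨2, 0⟩ and ′ the one-place function symbol with code ⟨3, 1, 0⟩, since the text leaves the indices open):

    def nth_prime(i):
        """The i-th prime, p_0 = 2 (naive trial division)."""
        found, m = -1, 1
        while found < i:
            m += 1
            if all(m % d != 0 for d in range(2, int(m ** 0.5) + 1)):
                found += 1
        return m

    def code_seq(ks):
        """Power-of-primes code of a sequence of numbers."""
        n = 1
        for i, k in enumerate(ks):
            n *= nth_prime(i) ** (k + 1)
        return n

    c_zero = code_seq([2, 0])       # assumed: 0 is the constant symbol c_0
    c_succ = code_seq([3, 1, 0])    # assumed: ' is the function symbol f_0^1
    c_lpar, c_rpar = code_seq([0, 8]), code_seq([0, 9])

    def num_symbols(n):
        """Symbol codes of the numeral for n: 0, '(0), '('(0)), ..."""
        if n == 0:
            return [c_zero]
        return [c_succ, c_lpar] + num_symbols(n - 1) + [c_rpar]

    def num(n):
        """The Gödel number of the numeral for n."""
        return code_seq(num_symbols(n))

    print(len(num_symbols(3)))      # 10 symbols: three ', three (, one 0, three )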


34.4 Coding Formulas


Proposition 34.7. The relation Atom(x) which holds iff x is the Gödel num-
ber of an atomic formula, is primitive recursive.

Proof. The number x is the Gödel number of an atomic formula iff one of the
following holds:
1. There are n, j < x, and z < x such that for each i < n, Term((z)i ) and x = # Pjn (# ⌢ flatten(z) ⌢ # )# .

2. There are z1 , z2 < x such that Term(z1 ), Term(z2 ), and x = # =(# ⌢ z1 ⌢ # ,# ⌢ z2 ⌢ # )# .

3. x = # ⊥# .
4. x = # ⊤# .

Proposition 34.8. The relation Frm(x) which holds iff x is the Gödel number
of a formula is primitive recursive.

Proof. A sequence of symbols s is a formula iff there is a formation sequence s0 , . . . , sk−1 = s of formulas which records how s was formed from atomic formulas according to the formation rules. The code of each si (and indeed the code of the sequence ⟨s0 , . . . , sk−1 ⟩) is less than the code x of s.

Problem 34.2. Give a detailed proof of Proposition 34.8 along the lines of
the first proof of Proposition 34.5.

Proposition 34.9. The relation FreeOcc(x, z, i), which holds iff the i-th sym-
bol of the formula with Gödel number x is a free occurrence of the variable with
Gödel number z, is primitive recursive.


Proof. Exercise.

Problem 34.3. Prove Proposition 34.9. You may make use of the fact that
any substring of a formula which is a formula is a sub-formula of it.

Proposition 34.10. The property Sent(x) which holds iff x is the Gödel num-
ber of a sentence is primitive recursive.

Proof. A sentence is a formula without free occurrences of variables. So Sent(x)


holds iff

(∀i < len(x)) (∀z < x)


((∃j < z) z = # vj # → ¬FreeOcc(x, z, i)).


34.5 Substitution
Recall that substitution is the operation of replacing all free occurrences of
a variable u in a formula φ by a term t, written φ[t/u]. This operation, when
carried out on Gödel numbers of variables, formulas, and terms, is primitive
recursive.
Proposition 34.11. There is a primitive recursive function Subst(x, y, z)
with the property that
Subst(# φ# , # t# , # u# ) = # φ[t/u]# .

Proof. We can define a function hSubst by primitive recursion as follows:

hSubst(x, y, z, 0) = Λ
hSubst(x, y, z, i + 1) =
   hSubst(x, y, z, i) ⌢ y                 if FreeOcc(x, z, i)
   append(hSubst(x, y, z, i), (x)i )      otherwise

Subst(x, y, z) can now be defined as hSubst(x, y, z, len(x)).
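
The recursion behind hSubst simply walks through φ symbol by symbol, copying symbols and splicing in the term at each free occurrence. A Python sketch (invented names; operating on lists of symbol codes, with FreeOcc supplied as a predicate):

    def subst_codes(phi, t, free_occ):
        """phi, t: lists of symbol codes; free_occ(i) says whether a free
        occurrence of the variable being replaced sits at place i of phi.
        Mirrors hSubst: copy symbols, splice in t at free occurrences."""
        out = []
        for i, sym in enumerate(phi):
            if free_occ(i):
                out.extend(t)        # hSubst(...) ⌢ y: append the whole term
            else:
                out.append(sym)      # append(..., (x)_i): keep the symbol
        return out

    # With abstract codes: replace the symbol at place 1 by a two-symbol term.
    print(subst_codes([1, 2, 3], [9, 9], lambda i: i == 1))   # [1, 9, 9, 3]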

Proposition 34.12. The relation FreeFor(x, y, z), which holds iff the term
with Gödel number y is free for the variable with Gödel number z in the formula
with Gödel number x, is primitive recursive.

Proof. Exercise.

Problem 34.4. Prove Proposition 34.12.


34.6 Derivations in LK
In order to arithmetize derivations, we must represent derivations as numbers.
Since derivations are trees of sequents where each inference carries also a la-
bel, a recursive representation is the most obvious approach: we represent a
derivation as a tuple, the components of which are the end-sequent, the label,
and the representations of the sub-derivations leading to the premises of the
last inference.

Definition 34.13. If Γ is a finite sequence of sentences, Γ = ⟨φ1 , . . . , φn ⟩,


then # Γ # = ⟨# φ1 # , . . . , # φn # ⟩.
If Γ ⇒ ∆ is a sequent, then a Gödel number of Γ ⇒ ∆ is

# Γ ⇒ ∆# = ⟨# Γ # , # ∆# ⟩

If π is a derivation in LK, then # π # is defined as follows:

1. If π consists only of the initial sequent Γ ⇒ ∆, then # π # is

⟨0, # Γ ⇒ ∆# ⟩.

2. If π ends in an inference with one or two premises, has Γ ⇒ ∆ as its conclusion, and π1 and π2 are the immediate subproofs ending in the premises of the last inference, then # π # is

⟨1, # π1 # , # Γ ⇒ ∆# , k⟩ or
⟨2, # π1 # , # π2 # , # Γ ⇒ ∆# , k⟩,

respectively, where k is given by the following table according to which


rule was used in the last inference:
Rule: WL WR CL CR XL XR
k: 1 2 3 4 5 6

Rule: ¬L ¬R ∧L ∧R ∨L ∨R
k: 7 8 9 10 11 12

Rule: →L →R ∀L ∀R ∃L ∃R
k: 13 14 15 16 17 18

Rule: Cut =
k: 19 20

Example 34.14. Consider the very simple derivation


φ ⇒ φ
∧L
φ∧ψ ⇒ φ
→R
⇒ (φ ∧ ψ) → φ


The Gödel number of the initial sequent would be p0 = ⟨0, # φ ⇒ φ# ⟩. The


Gödel number of the derivation ending in the conclusion of ∧L would be p1 =
⟨1, p0 , # φ ∧ ψ ⇒ φ# , 9⟩ (1 since ∧L has one premise, the Gödel number of the
conclusion φ ∧ ψ ⇒ φ, and 9 is the number coding ∧L). The Gödel number of
the entire derivation then is ⟨1, p1 , # ⇒ (φ ∧ ψ) → φ# , 14⟩, i.e.,

⟨1, ⟨1, ⟨0, # φ ⇒ φ# ⟩, # φ ∧ ψ ⇒ φ# , 9⟩, # ⇒ (φ ∧ ψ) → φ# , 14⟩.
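
Written as nested tuples (a Python sketch; the Gödel numbers of sequents are left as opaque strings for readability, and all names are invented):

    # <0, #Γ ⇒ ∆#> codes an initial sequent; <1, #π1#, #Γ ⇒ ∆#, k> one premise.
    p0 = (0, '#phi => phi#')                      # the initial sequent
    p1 = (1, p0, '#phi & psi => phi#', 9)         # ∧L is rule number 9
    p  = (1, p1, '#=> (phi & psi) -> phi#', 14)   # →R is rule number 14
    print(p)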

Having settled on a representation of derivations, we must also show that we can manipulate such derivations primitive recursively, and express their
essential properties and relations in this way. Some operations are simple: e.g., given a Gödel number p of a derivation, EndSequent(p) = (p)_{(p)_0+1} gives us the Gödel number of its end-sequent and LastRule(p) = (p)_{(p)_0+2} the code of its last rule.
The property Sequent(s) defined by

len(s) = 2 ∧ (∀i < len((s)0 ) + len((s)1 )) Sent(((s)0 ⌢ (s)1 )i )

holds of s iff s is the Gödel number of a sequent consisting of sentences. Some


are much harder. We’ll at least sketch how to do this. The goal is to show that
the relation “π is a derivation of φ from Γ ” is a primitive recursive relation of
the Gödel numbers of π and φ.

Proposition 34.15. The property Correct(p) which holds iff the last inference
in the derivation π with Gödel number p is correct, is primitive recursive.

Proof. Γ ⇒ ∆ is an initial sequent if either there is a sentence φ such that


Γ ⇒ ∆ is φ ⇒ φ, or there is a term t such that Γ ⇒ ∆ is ∅ ⇒ t = t. In terms
of Gödel numbers, InitSeq(s) holds iff

(∃x < s) (Sent(x) ∧ s = ⟨⟨x⟩, ⟨x⟩⟩) ∨


(∃t < s) (Term(t) ∧ s = ⟨0, ⟨# =(# ⌢ t ⌢ # ,# ⌢ t ⌢ # )# ⟩⟩).

We also have to show that for each rule of inference R the relation FollowsByR (p)
is primitive recursive, where FollowsByR (p) holds iff p is the Gödel number of
derivation π, and the end-sequent of π follows by a correct application of R
from the immediate sub-derivations of π.
A simple case is that of the ∧R rule. If π ends in a correct ∧R inference, it
looks like this:

π1 π2

Γ ⇒ ∆, φ Γ ⇒ ∆, ψ
∧R
Γ ⇒ ∆, φ ∧ ψ

So, the last inference in the derivation π is a correct application of ∧R iff there
are sequences of sentences Γ and ∆ as well as two sentences φ and ψ such
that the end-sequent of π1 is Γ ⇒ ∆, φ, the end-sequent of π2 is Γ ⇒ ∆, ψ, and the end-sequent of π is Γ ⇒ ∆, φ ∧ ψ. We just have to translate this into Gödel numbers. If s = # Γ ⇒ ∆# then (s)0 = # Γ # and (s)1 = # ∆# . So,
FollowsBy∧R (p) holds iff

(∃g < p) (∃d < p) (∃a < p) (∃b < p)


EndSequent(p) = ⟨g, d ⌢ ⟨# (# ⌢ a ⌢ # ∧# ⌢ b ⌢ # )# ⟩⟩ ∧
EndSequent((p)1 ) = ⟨g, d ⌢ ⟨a⟩⟩ ∧
EndSequent((p)2 ) = ⟨g, d ⌢ ⟨b⟩⟩ ∧
(p)0 = 2 ∧ LastRule(p) = 10.

The individual lines express, respectively, “there is a sequence (Γ ) with Gödel


number g, there is a sequence (∆) with Gödel number d, a formula (φ) with
Gödel number a, and a formula (ψ) with Gödel number b,” such that “the
end-sequent of π is Γ ⇒ ∆, φ ∧ ψ,” “the end-sequent of π1 is Γ ⇒ ∆, φ,” “the
end-sequent of π2 is Γ ⇒ ∆, ψ,” and “π has two immediate subderivations and
the last inference rule is ∧R (with number 10).”
The last inference in π is a correct application of ∃R iff there are sequences
Γ and ∆, a formula φ, a variable x, and a term t, such that the end-sequent
of π is Γ ⇒ ∆, ∃x φ and the end-sequent of π1 is Γ ⇒ ∆, φ[t/x]. So in terms
of Gödel numbers, we have FollowsBy∃R (p) iff

(∃g < p) (∃d < p) (∃a < p) (∃x < p) (∃t < p)
EndSequent(p) = ⟨g, d ⌢ ⟨# ∃# ⌢ x ⌢ a⟩⟩ ∧
EndSequent((p)1 ) = ⟨g, d ⌢ ⟨Subst(a, t, x)⟩⟩ ∧
(p)0 = 1 ∧ LastRule(p) = 18.

We then define Correct(p) as

Sequent(EndSequent(p)) ∧
[(LastRule(p) = 1 ∧ FollowsByWL (p)) ∨ · · · ∨
(LastRule(p) = 20 ∧ FollowsBy= (p)) ∨
(p)0 = 0 ∧ InitSeq(EndSequent(p))]

The first line ensures that the end-sequent of p is actually a sequent consisting
of sentences. The last line covers the case where p is just an initial sequent.

Problem 34.5. Define the following properties as in Proposition 34.15:

1. FollowsByCut (p),

2. FollowsBy→L (p),

3. FollowsBy= (p),

4. FollowsBy∀R (p).


For the last one, you will have to also show that you can test primitive recur-
sively if the last inference of the derivation with Gödel number p satisfies the
eigenvariable condition, i.e., the eigenvariable a of the ∀R does not occur in
the end-sequent.

Proposition 34.16. The relation Deriv(p) which holds if p is the Gödel num-
ber of a correct derivation π, is primitive recursive.

Proof. A derivation π is correct if every one of its inferences is a correct application of a rule, i.e., if every one of its sub-derivations ends in a correct
inference. So, Deriv(p) iff

(∀i < len(SubtreeSeq(p))) Correct((SubtreeSeq(p))i ).

Proposition 34.17. Suppose Γ is a primitive recursive set of sentences. Then


the relation Prf Γ (x, y) expressing “x is the code of a derivation π of Γ0 ⇒ φ
for some finite Γ0 ⊆ Γ and y is the Gödel number of φ” is primitive recursive.

Proof. Suppose “y ∈ Γ ” is given by the primitive recursive predicate RΓ (y).


We have to show that Prf Γ (x, y) which holds iff y is the Gödel number of a
sentence φ and x is the code of an LK-derivation with end-sequent Γ0 ⇒ φ is
primitive recursive.
By the previous proposition, the property Deriv(x) which holds iff x is the
code of a correct derivation π in LK is primitive recursive. If x is such a code,
then EndSequent(x) is the code of the end-sequent of π, and so (EndSequent(x))0
is the code of the left side of the end sequent and (EndSequent(x))1 the right
side. So we can express “the right side of the end-sequent of π is φ” as
len((EndSequent(x))1 ) = 1 ∧ ((EndSequent(x))1 )0 = y. The left side of the
end-sequent of π is of course automatically finite, we just have to express that
every sentence in it is in Γ . Thus we can define Prf Γ (x, y) by
Prf Γ (x, y) ⇔ Deriv(x) ∧
(∀i < len((EndSequent(x))0 )) RΓ (((EndSequent(x))0 )i ) ∧
len((EndSequent(x))1 ) = 1 ∧ ((EndSequent(x))1 )0 = y.


34.7 Derivations in Natural Deduction


In order to arithmetize derivations, we must represent derivations as numbers.
Since derivations are trees of formulas where each inference carries one or two
labels, a recursive representation is the most obvious approach: we represent
a derivation as a tuple, the components of which are the number of immediate
sub-derivations leading to the premises of the last inference, the representations
of these sub-derivations, and the end-formula, the discharge label of the last
inference, and a number indicating the type of the last inference.


Definition 34.18. If δ is a derivation in natural deduction, then # δ # is defined


inductively as follows:
1. If δ consists only of the assumption φ, then # δ # is ⟨0, # φ# , n⟩. The
number n is 0 if it is an undischarged assumption, and the numerical
label otherwise.
2. If δ ends in an inference with one, two, or three premises, then # δ # is

⟨1, # δ1 # , # φ# , n, k⟩,
⟨2, # δ1 # , # δ2 # , # φ# , n, k⟩, or
⟨3, # δ1 # , # δ2 # , # δ3 # , # φ# , n, k⟩,

respectively. Here δ1 , δ2 , δ3 are the sub-derivations ending in the premise(s)


of the last inference in δ, φ is the conclusion of the last inference in δ, n
is the discharge label of the last inference (0 if the inference does not dis-
charge any assumptions), and k is given by the following table according
to which rule was used in the last inference.
Rule: ∧Intro ∧Elim ∨Intro ∨Elim
k: 1 2 3 4
Rule: →Intro →Elim ¬Intro ¬Elim
k: 5 6 7 8
Rule: ⊥I ⊥C ∀Intro ∀Elim
k: 9 10 11 12
Rule: ∃Intro ∃Elim =Intro =Elim
k: 13 14 15 16

Example 34.19. Consider the very simple derivation


[φ ∧ ψ]1
φ ∧Elim
1 →Intro
(φ ∧ ψ) → φ
The Gödel number of the assumption would be d0 = ⟨0, # φ ∧ ψ # , 1⟩. The
Gödel number of the derivation ending in the conclusion of ∧Elim would
be d1 = ⟨1, d0 , # φ# , 0, 2⟩ (1 since ∧Elim has one premise, the Gödel num-
ber of conclusion φ, 0 because no assumption is discharged, and 2 is the
number coding ∧Elim). The Gödel number of the entire derivation then is
⟨1, d1 , # ((φ ∧ ψ) → φ)# , 1, 5⟩, i.e.,

⟨1, ⟨1, ⟨0, # (φ ∧ ψ)# , 1⟩, # φ# , 0, 2⟩, # ((φ ∧ ψ) → φ)# , 1, 5⟩.
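
As nested tuples, with formulas as opaque strings (a Python sketch with invented names), the same derivation is:

    d0 = (0, '#phi & psi#', 1)                    # assumption with label 1
    d1 = (1, d0, '#phi#', 0, 2)                   # ∧Elim = 2, nothing discharged
    d  = (1, d1, '#(phi & psi) -> phi#', 1, 5)    # →Intro = 5, discharges label 1
    print(d == (1, (1, (0, '#phi & psi#', 1), '#phi#', 0, 2),
                '#(phi & psi) -> phi#', 1, 5))    # True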

Having settled on a representation of derivations, we must also show that we


can manipulate Gödel numbers of such derivations primitive recursively, and
express their essential properties and relations. Some operations are simple:
e.g., given a Gödel number d of a derivation, EndFmla(d) = (d)(d)0 +1 gives

Release : 6891b66 (2024-12-01) 515


CHAPTER 34. ARITHMETIZATION OF SYNTAX

us the Gödel number of its end-formula, DischargeLabel(d) = (d)(d)0 +2 gives


us the discharge label and LastRule(d) = (d)(d)0 +3 the number indicating the
type of the last inference. Some are much harder. We’ll at least sketch how to
do this. The goal is to show that the relation “δ is a derivation of φ from Γ ”
is a primitive recursive relation of the Gödel numbers of δ and φ.

Proposition 34.20. The following relations are primitive recursive:

1. φ occurs as an assumption in δ with label n.

2. All assumptions in δ with label n are of the form φ (i.e., we can discharge
the assumption φ using label n in δ).

Proof. We have to show that the corresponding relations between Gödel num-
bers of formulas and Gödel numbers of derivations are primitive recursive.

1. We want to show that Assum(x, d, n), which holds if x is the Gödel num-
ber of an assumption of the derivation with Gödel number d labelled n,
is primitive recursive. This is the case if the derivation with Gödel num-
ber ⟨0, x, n⟩ is a sub-derivation of d. Note that the way we code deriva-
tions is a special case of the coding of trees introduced in section 29.12, so
the primitive recursive function SubtreeSeq(d) gives a sequence of Gödel
numbers of all sub-derivations of d (of length at most d). So we can define

Assum(x, d, n) ⇔ (∃i < d) (SubtreeSeq(d))i = ⟨0, x, n⟩.

2. We want to show that Discharge(x, d, n), which holds if all assumptions with label n in the derivation with Gödel number d are the formula with Gödel number x, is primitive recursive. This relation holds iff (∀y < d) (Assum(y, d, n) → y = x).

Proposition 34.21. The property Correct(d) which holds iff the last inference
in the derivation δ with Gödel number d is correct, is primitive recursive.

Proof. Here we have to show that for each rule of inference R the relation
FollowsByR (d) is primitive recursive, where FollowsByR (d) holds iff d is the
Gödel number of derivation δ, and the end-formula of δ follows by a correct
application of R from the immediate sub-derivations of δ.
A simple case is that of the ∧Intro rule. If δ ends in a correct ∧Intro
inference, it looks like this:

δ1 δ2

φ ψ
∧Intro
φ∧ψ


Then the Gödel number d of δ is ⟨2, d1 , d2 , # (φ ∧ ψ)# , 0, k⟩ where EndFmla(d1 ) = # φ# , EndFmla(d2 ) = # ψ # , n = 0, and k = 1. So we can define FollowsBy∧Intro (d) as

(d)0 = 2 ∧ DischargeLabel(d) = 0 ∧ LastRule(d) = 1 ∧


EndFmla(d) = # (# ⌢ EndFmla((d)1 ) ⌢ # ∧# ⌢ EndFmla((d)2 ) ⌢ # )# .

Another simple example is the =Intro rule. Here the premise is an empty
derivation, i.e., (d)1 = 0, and there is no discharge label, i.e., n = 0. However, φ must
be of the form t = t, for a closed term t. Here, a primitive recursive definition
is

(d)0 = 1 ∧ (d)1 = 0 ∧ DischargeLabel(d) = 0 ∧


(∃t < d) (ClTerm(t) ∧ EndFmla(d) = # =(# ⌢ t ⌢ # ,# ⌢ t ⌢ # )# )

For a more complicated example, FollowsBy→Intro (d) holds iff the end-
formula of δ is of the form (φ → ψ), where the end-formula of δ1 is ψ, and
any assumption in δ labelled n is of the form φ. We can express this primitive
recursively by

(d)0 = 1 ∧
(∃a < d) (Discharge(a, (d)1 , DischargeLabel(d)) ∧
EndFmla(d) = (# (# ⌢ a ⌢ # →# ⌢ EndFmla((d)1 ) ⌢ # )# ))

(Think of a as the Gödel number of φ).


For another example, consider ∃Intro. Here, the last inference in δ is correct
iff there is a formula φ, a closed term t and a variable x such that φ[t/x] is the
end-formula of the derivation δ1 and ∃x φ is the conclusion of the last inference.
So, FollowsBy∃Intro (d) holds iff

(d)0 = 1 ∧ DischargeLabel(d) = 0 ∧
(∃a < d) (∃x < d) (∃t < d) (ClTerm(t) ∧ Var(x) ∧
Subst(a, t, x) = EndFmla((d)1 ) ∧ EndFmla(d) = (# ∃# ⌢ x ⌢ a)).

We then define Correct(d) as

Sent(EndFmla(d)) ∧
(LastRule(d) = 1 ∧ FollowsBy∧Intro (d)) ∨ · · · ∨
(LastRule(d) = 16 ∧ FollowsBy=Elim (d)) ∨
(∃n < d) (∃x < d) (d = ⟨0, x, n⟩).

The first line ensures that the end-formula of d is a sentence. The last line
covers the case where d is just an assumption.

Problem 34.6. Define the following properties as in Proposition 34.21:


1. FollowsBy→Elim (d),
2. FollowsBy=Elim (d),
3. FollowsBy∨Elim (d),
4. FollowsBy∀Intro (d).
For the last one, you will have to also show that you can test primitive recur-
sively if the last inference of the derivation with Gödel number d satisfies the
eigenvariable condition, i.e., the eigenvariable a of the ∀Intro inference occurs
neither in the end-formula of d nor in an open assumption of d. You may use
the primitive recursive predicate OpenAssum from Proposition 34.23 for this.

Proposition 34.22. The relation Deriv(d) which holds if d is the Gödel num-
ber of a correct derivation δ, is primitive recursive.

Proof. A derivation δ is correct if every one of its inferences is a correct application of a rule, i.e., if every one of its sub-derivations ends in a correct
inference. So, Deriv(d) iff

(∀i < len(SubtreeSeq(d))) Correct((SubtreeSeq(d))i )

Proposition 34.23. The relation OpenAssum(z, d) that holds if z is the
Gödel number of an undischarged assumption φ of the derivation δ with Gödel
number d, is primitive recursive.

Proof. An occurrence of an assumption is discharged if it occurs with label n


in a sub-derivation of δ that ends in a rule with discharge label n. So φ
is an undischarged assumption of δ if at least one of its occurrences is not
discharged in δ. We must be careful: δ may contain both discharged and
undischarged occurrences of φ.
Consider a sequence δ0 , . . . , δk where δ0 = δ, δk is the assumption [φ]n
(for some n), and δi+1 is an immediate sub-derivation of δi . If such a sequence
exists in which no δi ends in an inference with discharge label n, then φ is
an undischarged assumption of δ.
The primitive recursive function SubtreeSeq(d) provides us with a sequence
of Gödel numbers of all sub-derivations of δ. Any sequence of Gödel numbers of
sub-derivations of δ is a subsequence of it. Being a subsequence of is a primitive
recursive relation: Subseq(s, s′ ) holds iff (∀i < len(s)) (∃j < len(s′ )) (s)i =
(s′ )j . Being an immediate sub-derivation is as well: Subderiv(d, d′ ) iff (∃j <
(d′ )0 ) d = (d′ )j . So we can define OpenAssum(z, d) by

(∃s < SubtreeSeq(d)) (Subseq(s, SubtreeSeq(d)) ∧ (s)0 = d ∧


(∃n < d) ((s)len(s)−̇1 = ⟨0, z, n⟩ ∧
(∀i < (len(s) −̇ 1)) (Subderiv((s)i+1 , (s)i ) ∧
DischargeLabel((s)i+1 ) ̸= n))).


Proposition 34.24. Suppose Γ is a primitive recursive set of sentences. Then


the relation Prf Γ (x, y) expressing “x is the code of a derivation δ of φ from
undischarged assumptions in Γ and y is the Gödel number of φ” is primitive
recursive.
Proof. Suppose “y ∈ Γ ” is given by the primitive recursive predicate RΓ (y).
We have to show that Prf Γ (x, y) which holds iff y is the Gödel number of
a sentence φ and x is the code of a natural deduction derivation with end
formula φ and all undischarged assumptions in Γ is primitive recursive.
By Proposition 34.22, the property Deriv(x) which holds iff x is the Gödel
number of a correct derivation δ in natural deduction is primitive recursive.
Thus we can define Prf Γ (x, y) by
Prf Γ (x, y) ⇔ Deriv(x) ∧ EndFmla(x) = y ∧
(∀z < x) (OpenAssum(z, x) → RΓ (z)).


34.8 Axiomatic Derivations


In order to arithmetize axiomatic derivations, we must represent derivations
as numbers. Since derivations are simply sequences of formulas, the obvious
approach is to code every derivation as the code of the sequence of codes of
formulas in it.
Definition 34.25. If δ is an axiomatic derivation consisting of formulas φ1 ,
. . . , φn , then # δ # is
⟨# φ1 # , . . . , # φn # ⟩.
Example 34.26. Consider the very simple derivation:
1. ψ → (ψ ∨ φ)
2. (ψ → (ψ ∨ φ)) → (φ → (ψ → (ψ ∨ φ)))
3. φ → (ψ → (ψ ∨ φ))
The Gödel number of this derivation would be
⟨ # ψ → (ψ ∨ φ)# ,
#
(ψ → (ψ ∨ φ)) → (φ → (ψ → (ψ ∨ φ)))# ,
#
φ → (ψ → (ψ ∨ φ))# ⟩.
Having settled on a representation of derivations, we must also show that we can manipulate such derivations primitive recursively, and express their essential properties and relations in this way. Some operations are simple: e.g., given a Gödel number d of a derivation, (d)_{len(d)−1} gives us the Gödel number of its
end-formula. Some are much harder. We’ll at least sketch how to do this. The
goal is to show that the relation “δ is a derivation of φ from Γ ” is primitive
recursive in the Gödel numbers of δ and φ.


Proposition 34.27. The following relations are primitive recursive:

1. φ is an axiom.
2. The i-th line in δ is justified by modus ponens.
3. The i-th line in δ is justified by qr.
4. δ is a correct derivation.

Proof. We have to show that the corresponding relations between Gödel num-
bers of formulas and Gödel numbers of derivations are primitive recursive.
1. We have a given list of axiom schemas, and φ is an axiom if it is of the
form given by one of these schemas. Since the list of schemas is finite,
it suffices to show that we can test primitive recursively, for each axiom
schema, if φ is of that form. For instance, consider the axiom schema

ψ → (χ → ψ).

φ is an instance of this axiom schema if there are formulas ψ and χ such


that we obtain φ when we concatenate ‘(’ with ψ with ‘→’ with ‘(’ with
χ with ‘→’ with ψ and with ‘))’. We can test the corresponding property
of the Gödel number n of φ, since concatenation of sequences is primitive
recursive and the Gödel numbers of ψ and χ must be smaller than the
Gödel number of φ, since when the relation holds, both ψ and χ are
sub-formulas of φ. Hence, we can define:

IsAxψ→(χ→ψ) (n) ⇔ (∃b < n) (∃c < n) (Sent(b) ∧ Sent(c) ∧


n = # (# ⌢ b ⌢ # →# ⌢ # (# ⌢ c ⌢ # →# ⌢ b ⌢ # ))# ).

If we have such a definition for each axiom schema, their disjunction


defines the property IsAx(n), “n is the Gödel number of an axiom.”
2. The i-th line in δ is justified by modus ponens iff there are lines j and
k < i where the sentence on line j is some formula φ, the sentence on
line k is φ → ψ, and the sentence on line i is ψ.

MP(d, i) ⇔ (∃j < i) (∃k < i)


(d)k = # (# ⌢ (d)j ⌢ # →# ⌢ (d)i ⌢ # )#

Since bounded quantification, concatenation, and = are primitive recursive, this defines a primitive recursive relation. (A code sketch of this test appears after this proof.)
3. A line in δ is justified by qr if it is of the form ψ → ∀x φ(x), a preceding
line is ψ → φ(c) for some constant symbol c, and c does not occur in ψ.
This is the case iff
a) there is a sentence ψ and


b) a formula φ(x) with a single variable x free so that

c) line i contains ψ → ∀x φ(x),

d) some line j < i contains ψ → φ[c/x] for a constant c

e) which does not occur in ψ.

All of these can be tested primitive recursively, since the Gödel numbers of ψ, φ(x), and x are less than the Gödel number of the formula on line i, and that of c is less than the Gödel number of the formula on line j:

QR1(d, i) ⇔ (∃j < i) (∃b < (d)i) (∃x < (d)i) (∃a < (d)i) (∃c < (d)j) (
    Var(x) ∧ Const(c) ∧
    (d)i = #(# ⌢ b ⌢ #→# ⌢ #∀# ⌢ x ⌢ a ⌢ #)# ∧
    (d)j = #(# ⌢ b ⌢ #→# ⌢ Subst(a, c, x) ⌢ #)# ∧
    Sent(b) ∧ Sent(Subst(a, c, x)) ∧ (∀k < len(b)) (b)k ≠ (c)0)

Here we assume that x and c are the Gödel numbers of the variable and the constant symbol, considered as terms (i.e., not their symbol codes). We test that x is the only free variable of φ(x) by testing if φ(x)[c/x] is a sentence, and ensure that c does not occur in ψ by requiring that every symbol of ψ is different from c.
We leave the other version of qr as an exercise.

4. d is the Gödel number of a correct derivation iff every line in it is an axiom, or justified by modus ponens or qr. Hence:

Deriv(d) ⇔ (∀i < len(d)) (IsAx((d)i) ∨ MP(d, i) ∨ QR(d, i))
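As an illustration of part (1), the following Python sketch performs the analogous instance test for the schema ψ → (χ → ψ) on formula strings rather than on Gödel numbers: the two loops mirror the bounded quantifiers (∃b < n)(∃c < n), and we omit the Sent checks. This is an analogy only, not the arithmetized definition itself.

    def is_instance_K(phi):
        # does phi have the shape (psi -> (chi -> psi)) for some psi, chi?
        n = len(phi)
        for i in range(1, n):              # candidate length of psi
            for j in range(1, n):          # candidate length of chi
                psi = phi[1:1 + i]
                chi = phi[1 + i + 3:1 + i + 3 + j]
                if phi == "(" + psi + "->(" + chi + "->" + psi + "))":
                    return True
        return False

    assert is_instance_K("(p->(q->p))")
    assert not is_instance_K("(p->(q->r))")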

Problem 34.7. Define the following relations as in Proposition 34.27:

1. IsAxφ→(ψ→(φ∧ψ)) (n),

2. IsAx∀x φ(x)→φ(t) (n),

3. QR2 (d, i) (for the other version of qr).

Proposition 34.28. Suppose Γ is a primitive recursive set of sentences. Then the relation PrfΓ(x, y) expressing "x is the code of a derivation δ of φ from Γ and y is the Gödel number of φ" is primitive recursive.

Proof. Suppose "y ∈ Γ" is given by the primitive recursive predicate RΓ(y). We have to show that the relation PrfΓ(x, y) is primitive recursive, where PrfΓ(x, y) holds iff y is the Gödel number of a sentence φ and x is the code of a derivation of φ from Γ.
By the previous proposition, the property Deriv(x) which holds iff x is the code of a correct derivation δ is primitive recursive. However, that definition
did not take into account the set Γ as an additional way to justify lines in the
derivation. Our primitive recursive test of whether a line is justified by qr also
left out of consideration the requirement that the constant c is not allowed to
occur in Γ . It is possible to amend our definition so that it takes into account Γ
directly, but it is easier to use Deriv and the deduction theorem. Γ ⊢ φ iff there
is some finite list of sentences ψ1 , . . . , ψn ∈ Γ such that {ψ1 , . . . , ψn } ⊢ φ. And
by the deduction theorem, this is the case if ⊢ (ψ1 → (ψ2 → · · · (ψn → φ) · · · )).
Whether a sentence with Gödel number z is of this form can be tested primitive
recursively. So, instead of considering x as the Gödel number of a derivation of
the sentence with Gödel number y from Γ , we consider x as the Gödel number
of a derivation of a nested conditional of the above form from ∅.
First, if we have a sequence of sentences, we can primitive recursively form the conditional with all these sentences as antecedents and a given sentence as consequent:
hCond(s, y, 0) = y
hCond(s, y, n + 1) = #(# ⌢ (s)n ⌢ #→# ⌢ hCond(s, y, n) ⌢ #)#
Cond(s, y) = hCond(s, y, len(s))

So we can define PrfΓ(x, y) by

PrfΓ(x, y) ⇔ (∃s < sequenceBound(x, x)) (
    (x)len(x)−1 = Cond(s, y) ∧
    (∀i < len(s)) (s)i ∈ Γ ∧
    Deriv(x)).

The bound on s is given by considering that each (s)i is the Gödel number of a sub-formula of the last line of the derivation, i.e., is less than (x)len(x)−1. The number of antecedents ψ ∈ Γ, i.e., the length of s, is less than the length of the last line of x.
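The primitive recursion behind Cond is easy to mirror concretely. Here is a sketch in Python that builds the nested conditional on formula strings, with ordinary string concatenation standing in for the primitive recursive ⌢ on Gödel numbers:

    def cond(antecedents, consequent):
        # hCond: each antecedent (s)_n wraps the conditional built so far,
        # so (s)_0 ends up innermost and the last antecedent outermost
        result = consequent
        for psi in antecedents:
            result = "(" + psi + "->" + result + ")"
        return result

    assert cond(["A", "B"], "C") == "(B->(A->C))"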
Chapter 35

Representability in Q
35.1 Introduction
The incompleteness theorems apply to theories in which basic facts about computable functions can be expressed and proved. We will describe a very minimal such theory called "Q" (or, sometimes, "Robinson's Q," after Raphael Robinson). We will say what it means for a function to be representable in Q, and then we will prove the following:
A function is representable in Q if and only if it is computable.
For one thing, this provides us with another model of computability. But we
will also use it to show that the set {φ : Q ⊢ φ} is not decidable, by reducing
the halting problem to it. By the time we are done, we will have proved much
stronger things than this.
The language of Q is the language of arithmetic; Q consists of the following
axioms (to be used in conjunction with the other axioms and rules of first-order
logic with identity predicate):

∀x ∀y (x′ = y′ → x = y) (Q1)
∀x 0 ≠ x′ (Q2)
∀x (x = 0 ∨ ∃y x = y′) (Q3)
∀x (x + 0) = x (Q4)
∀x ∀y (x + y′) = (x + y)′ (Q5)
∀x (x × 0) = 0 (Q6)
∀x ∀y (x × y′) = ((x × y) + x) (Q7)
∀x ∀y (x < y ↔ ∃z (z′ + x) = y) (Q8)

For each natural number n, define the numeral n to be the term 0′′...′ where
there are n tick marks in all. So, 0 is the constant symbol 0 by itself, 1 is 0′ , 2
is 0′′ , etc.
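Viewed as strings, numerals are trivial to generate; a quick sketch (writing ' for the successor stroke):

    def numeral(n):
        # the numeral for n: the constant symbol 0 followed by n strokes
        return "0" + "'" * n

    assert numeral(0) == "0"
    assert numeral(3) == "0'''"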
As a theory of arithmetic, Q is extremely weak; for example, you can’t even
prove very simple facts like ∀x x ̸= x′ or ∀x ∀y (x + y) = (y + x). But we will
see that much of the reason that Q is so interesting is because it is so weak.
In fact, it is just barely strong enough for the incompleteness theorem to hold.
Another reason Q is interesting is because it has a finite set of axioms.
A stronger theory than Q (called Peano arithmetic PA) is obtained by
adding a schema of induction to Q:

(φ(0) ∧ ∀x (φ(x) → φ(x′ ))) → ∀x φ(x)

where φ(x) is any formula. If φ(x) contains free variables other than x, we add
universal quantifiers to the front to bind all of them (so that the corresponding
instance of the induction schema is a sentence). For instance, if φ(x, y) also
contains the variable y free, the corresponding instance is

∀y ((φ(0) ∧ ∀x (φ(x) → φ(x′ ))) → ∀x φ(x))
Using instances of the induction schema, one can prove much more from the
axioms of PA than from those of Q. In fact, it takes a good deal of work to
find “natural” statements about the natural numbers that can’t be proved in
Peano arithmetic!

Definition 35.1. A function f(x0, . . . , xk) from the natural numbers to the natural numbers is said to be representable in Q if there is a formula φf(x0, . . . , xk, y) such that whenever f(n0, . . . , nk) = m, Q proves

1. φf(n0, . . . , nk, m)

2. ∀y (φf(n0, . . . , nk, y) → m = y).

There are other ways of stating the definition; for example, we could equivalently require that Q proves ∀y (φf(n0, . . . , nk, y) ↔ y = m).

Theorem 35.2. A function is representable in Q if and only if it is computable.

There are two directions to proving the theorem. The left-to-right direc-
tion is fairly straightforward once arithmetization of syntax is in place. The
other direction requires more work. Here is the basic idea: we pick “general
recursive” as a way of making “computable” precise, and show that every gen-
eral recursive function is representable in Q. Recall that a function is general
recursive if it can be defined from zero, the successor function succ, and the
projection functions Pin , using composition, primitive recursion, and regular
minimization. So one way of showing that every general recursive function is
representable in Q is to show that the basic functions are representable, and
whenever some functions are representable, then so are the functions defined
from them using composition, primitive recursion, and regular minimization.
In other words, we might show that the basic functions are representable, and
that the representable functions are “closed under” composition, primitive re-
cursion, and regular minimization. This guarantees that every general recursive
function is representable.
It turns out that the step where we would show that representable func-
tions are closed under primitive recursion is hard. In order to avoid this step,
we show first that in fact we can do without primitive recursion. That is, we
show that every general recursive function can be defined from basic functions
using composition and regular minimization alone. To do this, we show that
primitive recursion can actually be done by a specific regular minimization.
However, for this to work, we have to add some additional basic functions:
addition, multiplication, and the characteristic function of the identity rela-
tion χ= . Then, we can prove the theorem by showing that all of these basic
functions are representable in Q, and the representable functions are closed
under composition and regular minimization.
35.2 Functions Representable in Q are Computable
We'll prove that every function that is representable in Q is computable. We first have to establish a lemma about functions representable in Q.
Lemma 35.3. If f(x0, . . . , xk) is representable in Q, there is a formula φf(x0, . . . , xk, y) such that

Q ⊢ φf(n0, . . . , nk, m) iff m = f(n0, . . . , nk).

Proof. The "if" part is Definition 35.1(1). The "only if" part is seen as follows: Suppose Q ⊢ φf(n0, . . . , nk, m) but m ≠ f(n0, . . . , nk). Let l = f(n0, . . . , nk). By Definition 35.1(1), Q ⊢ φf(n0, . . . , nk, l). By Definition 35.1(2), Q ⊢ ∀y (φf(n0, . . . , nk, y) → l = y). Using logic and the assumption that Q ⊢ φf(n0, . . . , nk, m), we get that Q ⊢ l = m. On the other hand, by Lemma 35.14, Q ⊢ l ≠ m. So Q is inconsistent. But that is impossible, since Q is satisfied by the standard model (see Definition 33.2), N ⊨ Q, and satisfiable theories are always consistent by the Soundness Theorem (Corollaries 19.31, 20.29, 21.31 and 22.38).

Lemma 35.4. Every function that is representable in Q is computable.

Proof. Let’s first give the intuitive idea for why this is true. To compute f , we
do the following. List all the possible derivations δ in the language of arith-
metic. This is possible to do mechanically. For each one, check if it is a deriva-
tion of a formula of the form φf (n0 , . . . , nk , m) (the formula representing f in Q
from Lemma 35.3). If it is, m = f (n0 , . . . , nk ) by Lemma 35.3, and we’ve found
the value of f . The search terminates because Q ⊢ φf (n0 , . . . , nk , f (n0 , . . . , nk )),
so eventually we find a δ of the right sort.
This is not quite precise because our procedure operates on derivations
and formulas instead of just on numbers, and we haven’t explained exactly
why “listing all possible derivations” is mechanically possible. But as we’ve
seen, it is possible to code terms, formulas, and derivations by Gödel numbers.
We’ve also introduced a precise model of computation, the general recursive
functions. And we’ve seen that the relation Prf Q (d, y), which holds iff d is the
Gödel number of a derivation of the formula with Gödel number y from the
axioms of Q, is (primitive) recursive. Other primitive recursive functions we’ll
need are num (Proposition 34.6) and Subst (Proposition 34.11). From these,
it is possible to define f by minimization; thus, f is recursive.
First, define

A(n0, . . . , nk, m) =
    Subst(Subst(. . . Subst(#φf#, num(n0), #x0#),
        . . . ), num(nk), #xk#), num(m), #y#)

This looks complicated, but it's just the function A(n0, . . . , nk, m) = #φf(n0, . . . , nk, m)#. Now, consider the relation R(n0, . . . , nk, s) which holds if (s)0 is the Gödel number of a derivation from Q of φf(n0, . . . , nk, (s)1):

R(n0, . . . , nk, s) iff PrfQ((s)0, A(n0, . . . , nk, (s)1))
If we can find an s such that R(n0, . . . , nk, s) holds, we have found a pair of numbers—(s)0 and (s)1—such that (s)0 is the Gödel number of a derivation of φf(n0, . . . , nk, (s)1). So looking for s is like looking for the pair d and m in the informal proof. And a computable function that "looks for" such an s can be defined by regular minimization. Note that R is regular: for every n0, . . . , nk, there is a derivation δ of Q ⊢ φf(n0, . . . , nk, f(n0, . . . , nk)), so R(n0, . . . , nk, s) holds for s = ⟨#δ#, f(n0, . . . , nk)⟩. So, we can write f as

f (n0 , . . . , nk ) = (µs R(n0 , . . . , nk , s))1 .
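The shape of this unbounded search is easy to render in code. In the following Python sketch, prf_check and goal are hypothetical stand-ins for PrfQ and A (neither is defined here), and unpair inverts a pairing function (one is sketched in section 35.3 below):

    def represented_value(prf_check, goal, args):
        # mu-s search: try s = 0, 1, 2, ...; view s as the pair ((s)_0, (s)_1)
        # and return (s)_1 as soon as (s)_0 codes a derivation of the goal
        s = 0
        while True:
            d, m = unpair(s)
            if prf_check(d, goal(args, m)):
                return m
            s += 1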
35.3 The Beta Function Lemma
In order to show that we can carry out primitive recursion if addition, multiplication, and χ= are available, we need to develop functions that handle sequences. (If we had exponentiation as well, our task would be easier.) When we had primitive recursion, we could define things like the "n-th prime," and pick a fairly straightforward coding. But here we do not have primitive recursion—in fact we want to show that we can do primitive recursion using minimization—so we need to be more clever.

Lemma 35.5. There is a function β(d, i) such that for every sequence a0, . . . , an there is a number d, such that for every i ≤ n, β(d, i) = ai. Moreover, β can be defined from the basic functions using just composition and regular minimization.

Think of d as coding the sequence ⟨a0, . . . , an⟩, and β(d, i) returning the i-th element. (Note that this "coding" does not use the power-of-primes coding we're already familiar with!). The lemma is fairly minimal; it doesn't say we can
we’re already familiar with!). The lemma is fairly minimal; it doesn’t say we can
concatenate sequences or append elements, or even that we can compute d from
a0 , . . . , an using functions definable by composition and regular minimization.
All it says is that there is a “decoding” function such that every sequence is
“coded.”
The use of the notation β is Gödel’s. To repeat, the hard part of proving
the lemma is defining a suitable β using the seemingly restricted resources,
i.e., using just composition and minimization—however, we’re allowed to use
addition, multiplication, and χ= . There are various ways to prove this lemma,
but one of the cleanest is still Gödel’s original method, which used a number-
theoretic fact called Sunzi’s Theorem (traditionally, the “Chinese Remainder
Theorem”).

Definition 35.6. Two natural numbers a and b are relatively prime iff their greatest common divisor is 1; in other words, they have no divisor other than 1 in common.
Definition 35.7. Natural numbers a and b are congruent modulo c, a ≡ b mod c, iff c | (a − b), i.e., a and b have the same remainder when divided by c.

Here is Sunzi’s Theorem:

Theorem 35.8. Suppose x0, . . . , xn are (pairwise) relatively prime. Let y0, . . . , yn be any numbers. Then there is a number z such that

z ≡ y0 mod x0
z ≡ y1 mod x1
..
.
z ≡ yn mod xn .
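A tiny concrete instance, with the witness found by brute-force search (which is all that regular minimization provides):

    # Sunzi's Theorem for moduli 3, 5, 7 (pairwise relatively prime)
    # and residues 2, 3, 2: search for the least witness z
    moduli, residues = [3, 5, 7], [2, 3, 2]
    z = 0
    while any(z % m != r for m, r in zip(moduli, residues)):
        z += 1
    assert z == 23    # 23 is 2 mod 3, 3 mod 5, and 2 mod 7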

Here is how we will use Sunzi's Theorem: if x0, . . . , xn are bigger than y0, . . . , yn respectively, then we can take z to code the sequence ⟨y0, . . . , yn⟩. To recover yi, we need only divide z by xi and take the remainder. To use this coding, we will need to find suitable values for x0, . . . , xn.
A couple of observations will help us in this regard. Given y0, . . . , yn, let
A couple of observations will help us in this regard. Given y0 , . . . , yn , let

j = max(n, y0 , . . . , yn ) + 1,

and let

x0 = 1 + j !
x1 = 1 + 2 · j !
x2 = 1 + 3 · j !
..
.
xn = 1 + (n + 1) · j !

Then two things are true:

1. x0, . . . , xn are relatively prime.

2. For each i, yi < xi.

To see that (1) is true, note that if p is a prime number and p | xi and p | xk, then p | 1 + (i + 1)j! and p | 1 + (k + 1)j!. But then p divides their difference,

(1 + (i + 1)j!) − (1 + (k + 1)j!) = (i − k)j!.

Since p divides 1 + (i + 1)j!, it can't divide j! as well (otherwise, the first division would leave a remainder of 1). So p divides i − k, since p divides (i − k)j!. But |i − k| is at most n, and we have chosen j > n, so this implies that p | j!, again a contradiction. So there is no prime number dividing both xi and xk. Clause (2) is easy: we have yi < j < j! < xi.
Now let us prove the β function lemma. Remember that we can use 0,
successor, plus, times, χ= , projections, and any function defined from them
using composition and minimization applied to regular functions. We can also
use a relation if its characteristic function is so definable. As before we can
show that these relations are closed under Boolean combinations and bounded
quantification; for example:
not(x) = χ= (x, 0)
(min x ≤ z) R(x, y) = µx (R(x, y) ∨ x = z)
(∃x ≤ z) R(x, y) ⇔ R((min x ≤ z) R(x, y), y)
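In Python, bounded minimization and bounded quantification might be sketched like this (R is any decidable relation, passed in as a function):

    def bounded_min(R, z):
        # (min x <= z) R(x): least x <= z with R(x), and z if there is none,
        # matching  mu x (R(x) or x = z)
        for x in range(z + 1):
            if R(x):
                return x
        return z

    def bounded_exists(R, z):
        # (exists x <= z) R(x) holds iff R holds at the bounded minimum
        return R(bounded_min(R, z))

    assert bounded_min(lambda x: x * x > 10, 100) == 4
    assert bounded_exists(lambda x: x == 7, 10)
    assert not bounded_exists(lambda x: x == 7, 5)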
We can then show that all of the following are also definable without primitive
recursion:
1. The pairing function, J(x, y) = ½[(x + y)(x + y + 1)] + x;

2. the projection functions

K(z) = (min x ≤ z) (∃y ≤ z) z = J(x, y),
L(z) = (min y ≤ z) (∃x ≤ z) z = J(x, y);

3. the less-than relation x < y;

4. the divisibility relation x | y;

5. the function rem(x, y) which returns the remainder when y is divided by x.
Now define
β ∗ (d0 , d1 , i) = rem(1 + (i + 1)d1 , d0 ) and
β(d, i) = β ∗ (K(d), L(d), i).
This is the function we want. Given a0 , . . . , an as above, let
j = max(n, a0 , . . . , an ) + 1,
and let d1 = j !. By (1) above, we know that 1 + d1 , 1 + 2d1 , . . . , 1 + (n + 1)d1
are relatively prime, and by (2) that all are greater than a0 , . . . , an . By Sunzi’s
Theorem there is a value d0 such that for each i,
d0 ≡ a i mod (1 + (i + 1)d1 )
and so (because d1 is greater than ai ),
ai = rem(1 + (i + 1)d1 , d0 ).
Let d = J(d0 , d1 ). Then for each i ≤ n, we have
β(d, i) = β ∗ (d0 , d1 , i)
= rem(1 + (i + 1)d1 , d0 )
= ai
which is what we need. This completes the proof of the β-function lemma.
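Here is a runnable Python sketch of the whole construction. For speed, the projections invert J in closed form rather than by the bounded searches above, and the brute-force loop for d0 leans on the existence claim of Sunzi's Theorem; only small sequences are practical.

    from math import isqrt, factorial

    def J(x, y):
        # Cantor pairing: J(x, y) = 1/2[(x + y)(x + y + 1)] + x
        return (x + y) * (x + y + 1) // 2 + x

    def unpair(z):
        # inverse of J (closed form; the text defines K, L by bounded search)
        w = (isqrt(8 * z + 1) - 1) // 2
        x = z - w * (w + 1) // 2
        return x, w - x                      # (K(z), L(z))

    def beta(d, i):
        d0, d1 = unpair(d)
        return d0 % (1 + (i + 1) * d1)       # rem(1 + (i+1)d1, d0)

    def encode(seq):
        # find some d with beta(d, i) == seq[i] for all i
        j = max([len(seq) - 1] + list(seq)) + 1
        d1 = factorial(j)
        moduli = [1 + (i + 1) * d1 for i in range(len(seq))]
        d0 = 0
        while any(d0 % m != a for m, a in zip(moduli, seq)):
            d0 += 1
        return J(d0, d1)

    seq = [2, 0, 3]
    d = encode(seq)
    assert [beta(d, i) for i in range(len(seq))] == seq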
Problem 35.1. Show that the relations x < y, x | y, and the function rem(x, y)
can be defined without primitive recursion. You may use 0, successor, plus,
times, χ= , projections, and bounded minimization and quantification.
35.4 Simulating Primitive Recursion
Now we can show that definition by primitive recursion can be "simulated" by regular minimization using the beta function. Suppose we have f(⃗x) and g(⃗x, y, z). Then the function h(⃗x, y) defined from f and g by primitive recursion is

h(⃗x, 0) = f(⃗x)
h(⃗x, y + 1) = g(⃗x, y, h(⃗x, y)).

We need to show that h can be defined from f and g using just composition
and regular minimization, using the basic functions and functions defined from
them using composition and regular minimization (such as β).

Lemma 35.9. If h can be defined from f and g using primitive recursion, it can be defined from f, g, the functions zero, succ, Pin, add, mult, χ=, using composition and regular minimization.

Proof. First, define an auxiliary function ĥ(⃗x, y) which returns the least num-
ber d such that d codes a sequence which satisfies

1. (d)0 = f (⃗x), and

2. for each i < y, (d)i+1 = g(⃗x, i, (d)i ),

where now (d)i is short for β(d, i). In other words, ĥ returns (a code for) the sequence ⟨h(⃗x, 0), h(⃗x, 1), . . . , h(⃗x, y)⟩. We can write ĥ as

ĥ(⃗x, y) = µd (β(d, 0) = f(⃗x) ∧ (∀i < y) β(d, i + 1) = g(⃗x, i, β(d, i))).

Note: no primitive recursion is needed here, just minimization. The function we minimize is regular because of the beta function lemma (Lemma 35.5).
But now we have
h(⃗x, y) = β(ĥ(⃗x, y), y),
so h can be defined from the basic functions using just composition and regular
minimization.
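Using beta, unpair, and J from the sketch above, the defining search can even be run literally. It is hopelessly inefficient—the search for d is unbounded—but it mirrors the µd definition of ĥ exactly:

    def h_by_minimization(f, g, xs, y):
        # least d coding the course of values h(xs, 0), ..., h(xs, y);
        # then h(xs, y) = beta(d, y). Purely illustrative.
        d = 0
        while True:
            if beta(d, 0) == f(xs) and all(
                    beta(d, i + 1) == g(xs, i, beta(d, i)) for i in range(y)):
                return beta(d, y)
            d += 1

    # e.g., with f = lambda xs: 0 and g = lambda xs, i, prev: prev,
    # h_by_minimization(f, g, (), 3) returns 0 immediately (d = 0 works).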
35.5 Basic Functions are Representable in Q
First we have to show that all the basic functions are representable in Q. In the end, we need to show how to assign to each k-ary basic function f(x0, . . . , xk−1) a formula φf(x0, . . . , xk−1, y) that represents it.
We will be able to represent zero, successor, plus, times, the characteristic function for equality, and projections. In each case, the appropriate representing formula is entirely straightforward; for example, zero is represented by the formula y = 0, successor is represented by the formula x0′ = y, and addition is represented by the formula (x0 + x1) = y. The work involves showing that Q can prove the relevant sentences; for example, saying that addition is represented by the formula above involves showing that for every pair of natural numbers m and n, Q proves

n + m = n + m and
∀y ((n + m) = y → y = n + m).

Proposition 35.10. The zero function zero(x) = 0 is represented in Q by φzero(x, y) ≡ y = 0.

Proposition 35.11. The successor function succ(x) = x + 1 is represented in Q by φsucc(x, y) ≡ y = x′.

Proposition 35.12. The projection function Pin(x0, . . . , xn−1) = xi is represented in Q by

φPin(x0, . . . , xn−1, y) ≡ y = xi.

Problem 35.2. Prove that y = 0, y = x′, and y = xi represent zero, succ, and Pin, respectively.

Proposition 35.13. The characteristic function of =,

χ=(x0, x1) = 1 if x0 = x1, and 0 otherwise,

is represented in Q by

φχ=(x0, x1, y) ≡ (x0 = x1 ∧ y = 1) ∨ (x0 ≠ x1 ∧ y = 0).

The proof requires the following lemma.

Lemma 35.14. Given natural numbers n and m, if n ≠ m, then Q ⊢ n ≠ m.

Proof. Use induction on n to show that for every m, if n ≠ m, then Q ⊢ n ≠ m.
In the base case, n = 0. If m is not equal to 0, then m = k + 1 for some natural number k. We have an axiom that says ∀x 0 ≠ x′. By a quantifier axiom, replacing x by k, we can conclude 0 ≠ k′. But k′ is just m.
In the induction step, we can assume the claim is true for n, and consider n + 1. Let m be any natural number. There are two possibilities: either m = 0 or for some k we have m = k + 1. The first case is handled as above. In the second case, suppose n + 1 ≠ k + 1. Then n ≠ k. By the induction hypothesis for n we have Q ⊢ n ≠ k. We have an axiom that says ∀x ∀y (x′ = y′ → x = y). Using a quantifier axiom, we have n′ = k′ → n = k. Using propositional logic, we can conclude, in Q, n ≠ k → n′ ≠ k′. Using modus ponens, we can conclude n′ ≠ k′, which is what we want, since k′ is m.

Note that the lemma does not say much: in essence it says that Q can prove that different numerals denote different objects. For example, Q proves 0′′ ≠ 0′′′. But showing that this holds in general requires some care. Note also that although we are using induction, it is induction outside of Q.

Proof of Proposition 35.13. If n = m, then n and m are the same term, and χ=(n, m) = 1. But Q ⊢ (n = m ∧ 1 = 1), so it proves φχ=(n, m, 1). If n ≠ m, then χ=(n, m) = 0. By Lemma 35.14, Q ⊢ n ≠ m and so also (n ≠ m ∧ 0 = 0). Thus Q ⊢ φχ=(n, m, 0).
For the second part, we also have two cases. If n = m, we have to show that Q ⊢ ∀y (φχ=(n, m, y) → y = 1). Arguing informally, suppose φχ=(n, m, y), i.e.,

(n = n ∧ y = 1) ∨ (n ≠ n ∧ y = 0)

The left disjunct implies y = 1 by logic; the right contradicts n = n, which is provable by logic.
Suppose, on the other hand, that n ≠ m. Then φχ=(n, m, y) is

(n = m ∧ y = 1) ∨ (n ≠ m ∧ y = 0)

Here, the left disjunct contradicts n ≠ m, which is provable in Q by Lemma 35.14; the right disjunct entails y = 0.

Proposition 35.15. The addition function add(x0, x1) = x0 + x1 is represented in Q by

φadd(x0, x1, y) ≡ y = (x0 + x1).

Lemma 35.16. Q ⊢ (n + m) = n + m

Proof. We prove this by induction on m. If m = 0, the claim is that Q ⊢ (n + 0) = n. This follows by axiom Q4. Now suppose the claim for m; let's prove the claim for m + 1, i.e., prove that Q ⊢ (n + m + 1) = n + m + 1. Note that the numeral m + 1 is just m′, and the numeral n + m + 1 is just n + m′. By axiom Q5, Q ⊢ (n + m′) = (n + m)′. By induction hypothesis, Q ⊢ (n + m) = n + m. So Q ⊢ (n + m′) = n + m′.

Proof of Proposition 35.15. The formula φadd(x0, x1, y) representing add is y = (x0 + x1). First we show that if add(n, m) = k, then Q ⊢ φadd(n, m, k), i.e.,

Q ⊢ k = (n + m). But since k = n + m, k just is n + m, and we've shown in Lemma 35.16 that Q ⊢ (n + m) = n + m.
We also have to show that if add(n, m) = k, then

Q ⊢ ∀y (φadd (n, m, y) → y = k).

Suppose we have (n + m) = y. Since

Q ⊢ (n + m) = n + m,

we can replace the left side with n + m and get n + m = y, for arbitrary y.

Proposition 35.17. The multiplication function mult(x0, x1) = x0 · x1 is represented in Q by

φmult(x0, x1, y) ≡ y = (x0 × x1).

Proof. Exercise.

Lemma 35.18. Q ⊢ (n × m) = n · m

Proof. Exercise.

Problem 35.3. Prove Lemma 35.18.

Problem 35.4. Use Lemma 35.18 to prove Proposition 35.17.

Recall that we use × for the function symbol of the language of arithmetic, and · for the ordinary multiplication operation on numbers. So · can appear between expressions for numbers (such as in m · n) while × appears only between terms of the language of arithmetic (such as in (m × n)). Even more confusingly, + is used for both the function symbol and the addition operation. When it appears between terms—e.g., in (n + m)—it is the 2-place function symbol of the language of arithmetic, and when it appears between numbers—e.g., in n + m—it is the addition operation. This includes the case n + m: this is the standard numeral corresponding to the number n + m.
35.6 Composition is Representable in Q
Suppose h is defined by

h(x0, . . . , xl−1) = f(g0(x0, . . . , xl−1), . . . , gk−1(x0, . . . , xl−1)),

where we have already found formulas φf, φg0, . . . , φgk−1 representing the functions f, and g0, . . . , gk−1, respectively. We have to find a formula φh representing h.
Let’s start with a simple case, where all functions are 1-place, i.e., consider
h(x) = f (g(x)). If φf (y, z) represents f , and φg (x, y) represents g, we need
a formula φh (x, z) that represents h. Note that h(x) = z iff there is a y such
that both z = f (y) and y = g(x). (If h(x) = z, then g(x) is such a y; if such a
y exists, then since y = g(x) and z = f (y), z = f (g(x)).) This suggests that
∃y (φg (x, y) ∧ φf (y, z)) is a good candidate for φh (x, z). We just have to verify
that Q proves the relevant formulas.

Proposition 35.19. If h(n) = m, then Q ⊢ φh(n, m).

Proof. Suppose h(n) = m, i.e., f (g(n)) = m. Let k = g(n). Then

Q ⊢ φg (n, k)

since φg represents g, and

Q ⊢ φf (k, m)

since φf represents f . Thus,

Q ⊢ φg (n, k) ∧ φf (k, m)

and consequently also

Q ⊢ ∃y (φg (n, y) ∧ φf (y, m)),

i.e., Q ⊢ φh (n, m).

Proposition 35.20. If h(n) = m, then Q ⊢ ∀z (φh(n, z) → z = m).

Proof. Suppose h(n) = m, i.e., f (g(n)) = m. Let k = g(n). Then

Q ⊢ ∀y (φg (n, y) → y = k)

since φg represents g, and

Q ⊢ ∀z (φf (k, z) → z = m)

since φf represents f. Using just a little bit of logic, we can show that also

Q ⊢ ∀z (∃y (φg(n, y) ∧ φf(y, z)) → z = m),

i.e., Q ⊢ ∀z (φh(n, z) → z = m).

The same idea works in the more complex case where f and gi have arity
greater than 1.
Proposition 35.21. If φf(y0, . . . , yk−1, z) represents f(y0, . . . , yk−1) in Q, and φgi(x0, . . . , xl−1, y) represents gi(x0, . . . , xl−1) in Q, then

∃y0 . . . ∃yk−1 (φg0(x0, . . . , xl−1, y0) ∧ · · · ∧
    φgk−1(x0, . . . , xl−1, yk−1) ∧ φf(y0, . . . , yk−1, z))

represents

h(x0, . . . , xl−1) = f(g0(x0, . . . , xl−1), . . . , gk−1(x0, . . . , xl−1)).

Proof. Exercise.
Problem 35.5. Using the proofs of Proposition 35.19 and Proposition 35.20 as a guide, carry out the proof of Proposition 35.21 in detail.
35.7 Regular Minimization is Representable in Q
Let's consider unbounded search. Suppose g(x, z) is regular and representable in Q, say by the formula φg(x, z, y). Let f be defined by f(z) = µx [g(x, z) = 0]. We would like to find a formula φf(z, y) representing f. The value of f(z) is that number x which (a) satisfies g(x, z) = 0 and (b) is the least such, i.e., for any w < x, g(w, z) ≠ 0. So the following is a natural choice:

φf(z, y) ≡ φg(y, z, 0) ∧ ∀w (w < y → ¬φg(w, z, 0)).

In the general case, of course, we would have to replace z with z0, . . . , zk.
The proof, again, will involve some lemmas about things Q is strong enough
to prove.
Lemma 35.22. For every constant symbol a and every natural number n,

Q ⊢ (a′ + n) = (a + n)′.
Proof. The proof is, as usual, by induction on n. In the base case, n = 0, we need to show that Q proves (a′ + 0) = (a + 0)′. But we have:

Q ⊢ (a′ + 0) = a′ by axiom Q4 (35.1)
Q ⊢ (a + 0) = a by axiom Q4 (35.2)
Q ⊢ (a + 0)′ = a′ by eq. (35.2) (35.3)
Q ⊢ (a′ + 0) = (a + 0)′ by eq. (35.1) and eq. (35.3)

In the induction step, we can assume that we have shown that Q ⊢ (a′ + n) = (a + n)′. Since n + 1 is n′, we need to show that Q proves (a′ + n′) = (a + n′)′. We have:

Q ⊢ (a′ + n′) = (a′ + n)′ by axiom Q5 (35.4)
Q ⊢ (a′ + n)′ = (a + n′)′ inductive hypothesis (35.5)
Q ⊢ (a′ + n′) = (a + n′)′ by eq. (35.4) and eq. (35.5).
It is again worth mentioning that this is weaker than saying that Q proves
∀x ∀y (x′ + y) = (x + y)′ . Although this sentence is true in N, Q does not prove
it.

Lemma 35.23. Q ⊢ ∀x ¬x < 0.

Proof. We give the proof informally (i.e., only giving hints as to how to con-
struct the formal derivation).
We have to prove ¬a < 0 for an arbitrary a. By the definition of <, we
need to prove ¬∃y (y ′ + a) = 0 in Q. We’ll assume ∃y (y ′ + a) = 0 and prove a
contradiction. Suppose (b′ + a) = 0. Using Q3 , we have that a = 0 ∨ ∃y a = y ′ .
We distinguish cases.
Case 1: a = 0 holds. From (b′ + a) = 0, we have (b′ + 0) = 0. By axiom Q4
of Q, we have (b′ + 0) = b′ , and hence b′ = 0. But by axiom Q2 we also have
b′ ̸= 0, a contradiction.
Case 2: For some c, a = c′ . But then we have (b′ + c′ ) = 0. By axiom Q5 ,
we have (b′ + c)′ = 0, again contradicting axiom Q2 .

Lemma 35.24. For every natural number n,

Q ⊢ ∀x (x < n + 1 → (x = 0 ∨ · · · ∨ x = n)).

Proof. We use induction on n. Let us consider the base case, when n = 0. In that case, we need to show a < 1 → a = 0, for arbitrary a. Suppose a < 1. Then by the defining axiom for <, we have ∃y (y′ + a) = 0′ (since 1 ≡ 0′). Suppose b has that property, i.e., we have (b′ + a) = 0′. We need to show a = 0. By axiom Q3, we have either a = 0 or that there is a c such that a = c′. In the former case, there is nothing to show. So suppose a = c′. Then we have (b′ + c′) = 0′. By axiom Q5 of Q, we have (b′ + c)′ = 0′. By axiom Q1, we have (b′ + c) = 0. But this means, by axiom Q8, that c < 0, contradicting Lemma 35.23.
Now for the inductive step. We prove the case for n + 1, assuming the case for n. So suppose a < n + 2. Again using Q3 we can distinguish two cases: a = 0 and for some c, a = c′. In the first case, a = 0 ∨ · · · ∨ a = n + 1 follows trivially. In the second case, we have c′ < n + 2, i.e., c′ < n + 1′. By axiom Q8, for some d, (d′ + c′) = n + 1′. By axiom Q5, (d′ + c)′ = n + 1′. By axiom Q1, (d′ + c) = n + 1, and so c < n + 1 by axiom Q8. By inductive hypothesis, c = 0 ∨ · · · ∨ c = n. From this, we get c′ = 0′ ∨ · · · ∨ c′ = n′ by logic, and so a = 1 ∨ · · · ∨ a = n + 1 since a = c′.

Lemma 35.25. For every natural number m,

Q ⊢ ∀y ((y < m ∨ m < y) ∨ y = m).

Proof. By induction on m. First, consider the case m = 0. Q ⊢ ∀y (y = 0 ∨ ∃z y = z′) by Q3. Let a be arbitrary. Then either a = 0 or for some b, a = b′. In the former case, we also have (a < 0 ∨ 0 < a) ∨ a = 0. But if a = b′,
then (b′ + 0) = (a + 0) by the logic of =. By Q4, (a + 0) = a, so we have (b′ + 0) = a, and hence ∃z (z′ + 0) = a. By the definition of < in Q8, 0 < a. If 0 < a, then also (0 < a ∨ a < 0) ∨ a = 0.
Now suppose we have

Q ⊢ ∀y ((y < m ∨ m < y) ∨ y = m)

and we want to show

Q ⊢ ∀y ((y < m + 1 ∨ m + 1 < y) ∨ y = m + 1)

Let a be arbitrary. By Q3, either a = 0 or for some b, a = b′. In the first case, we have (m′ + a) = m + 1 by Q4, and so a < m + 1 by Q8.
Now consider the second case, a = b′ . By the induction hypothesis, (b <
m ∨ m < b) ∨ b = m.
The first disjunct b < m is equivalent (by Q8 ) to ∃z (z ′ + b) = m. Suppose
c has this property. If (c′ + b) = m, then also (c′ + b)′ = m′ . By Q5 , (c′ + b)′ =
(c′ + b′ ). Hence, (c′ + b′ ) = m′ . We get ∃u (u′ + b′ ) = m + 1 by existentially
generalizing on c′ and keeping in mind that m′ ≡ m + 1. Hence, if b < m then
b′ < m + 1 and so a < m + 1.
Now suppose m < b, i.e., ∃z (z′ + m) = b. Suppose c is such a z, i.e., (c′ + m) = b. By logic, (c′ + m)′ = b′. By Q5, (c′ + m′) = b′. Since a = b′ and m′ ≡ m + 1, (c′ + m + 1) = a. By Q8, m + 1 < a.
Finally, assume b = m. Then, by logic, b′ = m′, and so a = m + 1.
Hence, from each disjunct of the case for m and b, we can obtain the corresponding disjunct for m + 1 and a.

Proposition 35.26. If φg(x, z, y) represents g(x, z) in Q, then

φf(z, y) ≡ φg(y, z, 0) ∧ ∀w (w < y → ¬φg(w, z, 0))

represents f(z) = µx [g(x, z) = 0].

Proof. First we show that if f (n) = m, then Q ⊢ φf (n, m), i.e.,

Q ⊢ φg (m, n, 0) ∧ ∀w (w < m → ¬φg (w, n, 0)).

Since φg (x, z, y) represents g(x, z) and g(m, n) = 0 if f (n) = m, we have

Q ⊢ φg (m, n, 0).

If f (n) = m, then for every k < m, g(k, n) ̸= 0. So

Q ⊢ ¬φg (k, n, 0).

We get that

Q ⊢ ∀w (w < m → ¬φg(w, n, 0)) (35.6)
by Lemma 35.23 in case m = 0 and by Lemma 35.24 otherwise.
Now let's show that if f(n) = m, then Q ⊢ ∀y (φf(n, y) → y = m). We again sketch the argument informally, leaving the formalization to the reader.
Suppose φf(n, b). From this we get (a) φg(b, n, 0) and (b) ∀w (w < b → ¬φg(w, n, 0)). By Lemma 35.25, (b < m ∨ m < b) ∨ b = m. We'll show that both b < m and m < b lead to a contradiction.
If m < b, then ¬φg(m, n, 0) from (b). But m = f(n), so g(m, n) = 0, and so Q ⊢ φg(m, n, 0) since φg represents g. So we have a contradiction.
Now suppose b < m. Then since Q ⊢ ∀w (w < m → ¬φg(w, n, 0)) by eq. (35.6), we get ¬φg(b, n, 0). This again contradicts (a).
35.8 Computable Functions are Representable in Q
Theorem 35.27. Every computable function is representable in Q.

Proof. For definiteness, and using the Church–Turing Thesis, let's say that a function is computable iff it is general recursive. The general recursive functions are those which can be defined from the zero function zero, the successor function succ, and the projection function Pin using composition, primitive recursion, and regular minimization. By Lemma 35.9, any function h that can be defined from f and g using primitive recursion can also be defined using composition and regular minimization from f, g, and zero, succ, Pin, add, mult, χ=. Consequently, a function is general recursive iff it can be defined from zero, succ, Pin, add, mult, χ= using composition and regular minimization.
We’ve furthermore shown that the basic functions in question are repre-
sentable in Q (Propositions 35.10 to 35.13, 35.15 and 35.17), and that any
function defined from representable functions by composition or regular min-
imization (Proposition 35.21, Proposition 35.26) is also representable. Thus
every general recursive function is representable in Q.

We have shown that the set of computable functions can be characterized as the set of functions representable in Q. In fact, the proof is more general. From
the set of functions representable in Q. In fact, the proof is more general. From
the definition of representability, it is not hard to see that any theory extending
Q (or in which one can interpret Q) can represent the computable functions.
But, conversely, in any derivation system in which the notion of derivation
is computable, every representable function is computable. So, for example,
the set of computable functions can be characterized as the set of functions
representable in Peano arithmetic, or even Zermelo–Fraenkel set theory. As
Gödel noted, this is somewhat surprising. We will see that when it comes to
provability, questions are very sensitive to which theory you consider; roughly,
the stronger the axioms, the more you can prove. But across a wide range
of axiomatic theories, the representable functions are exactly the computable
ones; stronger theories do not represent more functions as long as they are
axiomatizable.
35.9 Representing Relations
Let us say what it means for a relation to be representable.

Definition 35.28. A relation R(x0, . . . , xk) on the natural numbers is representable in Q if there is a formula φR(x0, . . . , xk) such that whenever R(n0, . . . , nk) is true, Q proves φR(n0, . . . , nk), and whenever R(n0, . . . , nk) is false, Q proves ¬φR(n0, . . . , nk).

Theorem 35.29. A relation is representable in Q if and only if it is computable.

Proof. For the forwards direction, suppose R(x0, . . . , xk) is represented by the formula φR(x0, . . . , xk). Here is an algorithm for computing R: on input n0, . . . , nk, simultaneously search for a proof of φR(n0, . . . , nk) and a proof of ¬φR(n0, . . . , nk). By our hypothesis, the search is bound to find one or the other; if it is the first, report "yes," and otherwise, report "no."
In the other direction, suppose R(x0, . . . , xk) is computable. By definition, this means that the function χR(x0, . . . , xk) is computable. By Theorem 35.2, χR is represented by a formula, say φχR(x0, . . . , xk, y). Let φR(x0, . . . , xk) be the formula φχR(x0, . . . , xk, 1). Then for any n0, . . . , nk, if R(n0, . . . , nk) is true, then χR(n0, . . . , nk) = 1, in which case Q proves φχR(n0, . . . , nk, 1), and so Q proves φR(n0, . . . , nk). On the other hand, if R(n0, . . . , nk) is false, then χR(n0, . . . , nk) = 0. This means that Q proves

∀y (φχR(n0, . . . , nk, y) → y = 0).

Since Q proves 0 ≠ 1, Q proves ¬φχR(n0, . . . , nk, 1), and so it proves ¬φR(n0, . . . , nk).
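The forwards direction is the classic "two searches in parallel" argument; schematically, with proves a hypothetical test for "d codes a Q-derivation of the given sentence":

    def decide_R(proves, phi, not_phi):
        # dovetail two derivation searches; representability guarantees
        # that one of them succeeds for some code d
        d = 0
        while True:
            if proves(d, phi):
                return True
            if proves(d, not_phi):
                return False
            d += 1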

Problem 35.6. Show that if R is representable in Q, so is χR .
35.10 Undecidability
We call a theory T undecidable if there is no computational procedure which, after finitely many steps and unfailingly, provides a correct answer to the question "does T prove φ?" for any sentence φ in the language of T. So Q would be decidable iff there were a computational procedure which decides, given a sentence φ in the language of arithmetic, whether Q ⊢ φ or not. We can make this more precise by asking: Is the relation ProvQ(y), which holds of y iff y is the Gödel number of a sentence provable in Q, recursive? The answer is: no.

Theorem 35.30. Q is undecidable, i.e., the relation

ProvQ(y) ⇔ Sent(y) ∧ ∃x PrfQ(x, y)

is not recursive.
Proof. Suppose it were. Then we could solve the halting problem as follows: Given e and n, we know that φe(n) ↓ iff there is an s such that T(e, n, s), where T is Kleene's predicate from Theorem 29.28. Since T is primitive recursive it is representable in Q by a formula ψT, that is, Q ⊢ ψT(e, n, s) iff T(e, n, s). If Q ⊢ ψT(e, n, s) then also Q ⊢ ∃y ψT(e, n, y). If no such s exists, then Q ⊢ ¬ψT(e, n, s) for every s. But Q is ω-consistent, i.e., if Q ⊢ ¬φ(n) for every n ∈ N, then Q ⊬ ∃y φ(y). We know this because the axioms of Q are true in the standard model N. So, Q ⊬ ∃y ψT(e, n, y). In other words, Q ⊢ ∃y ψT(e, n, y) iff there is an s such that T(e, n, s), i.e., iff φe(n) ↓. From e and n we can compute #∃y ψT(e, n, y)#; let g(e, n) be the primitive recursive function which does that. So

h(e, n) = 1 if ProvQ(g(e, n)), and
h(e, n) = 0 otherwise.

This would show that h is recursive if ProvQ is. But h is not recursive, by Theorem 29.29, so ProvQ cannot be either.
Corollary 35.31. First-order logic is undecidable.
Proof. If first-order logic were decidable, provability in Q would be as well,
since Q ⊢ φ iff ⊢ ω → φ, where ω is the conjunction of the axioms of Q.

This chapter depends on material in the chapter on computability theory, but can be left out if that hasn't been covered. It's currently a basic conversion of Jeremy Avigad's notes, has not been revised, and is missing exercises.

Chapter 36

Theories and Computability


36.1 Introduction

This section should be rewritten.

We have the following:

1. A definition of what it means for a function to be representable in Q (Definition 35.1);

2. a definition of what it means for a relation to be representable in Q (Definition 35.28);

3. a theorem asserting that the representable functions of Q are exactly the computable ones (Theorem 35.2);

4. a theorem asserting that the representable relations of Q are exactly the computable ones (Theorem 35.29).

A theory is a set of sentences that is deductively closed, that is, with the property that whenever T proves φ then φ is in T. It is probably best to think of a theory as being a collection of sentences, together with all the things that these sentences imply. From now on, we will use Q to refer to the theory consisting of the set of sentences derivable from the eight axioms in section 35.1. Remember that we can code formulas of Q as numbers; if φ is such a formula, let #φ# denote the number coding φ. Modulo this coding, we can now ask whether various sets of formulas are computable or not.
36.2 Q is C.e.-Complete
Theorem 36.1. Q is c.e. but not decidable. In fact, it is a complete c.e. set.

Proof. It is not hard to see that Q is c.e., since it is the set of (codes for)
sentences y such that there is a proof x of y in Q:

Q = {y : ∃x Prf Q (x, y)}.

But we know that Prf Q (x, y) is computable (in fact, primitive recursive), and
any set that can be written in the above form is c.e.
Saying that it is a complete c.e. set is equivalent to saying that K ≤m Q,
where K = {x : φx (x) ↓}. So let us show that K is reducible to Q. Since
Kleene's predicate T(e, x, s) is primitive recursive, it is representable in Q, say, by φT. Then for every x, we have

x ∈ K → ∃s T (x, x, s)
→ ∃s (Q ⊢ φT (x, x, s))
→ Q ⊢ ∃s φT (x, x, s).

Conversely, if Q ⊢ ∃s φT (x, x, s), then, in fact, for some natural number n the
formula φT (x, x, n) must be true. Now, if T (x, x, n) were false, Q would prove
¬φT (x, x, n), since φT represents T . But then Q proves a false formula, which
is a contradiction. So T (x, x, n) must be true, which implies φx (x) ↓.
In short, we have that for every x, x is in K if and only if Q proves
∃s T (x, x, s). So the function f which takes x to (a code for) the sentence
∃s T (x, x, s) is a reduction of K to Q.
36.3 ω-Consistent Extensions of Q are Undecidable


The proof that Q is c.e.-complete relied on the fact that any sentence provable in Q is "true" of the natural numbers. The next definition and theorem strengthen this theorem, by pinpointing just those aspects of "truth" that were needed in the proof above. Don't dwell on this theorem too long, though, because we will soon strengthen it even further. We include it mainly for historical purposes: Gödel's original paper used the notion of ω-consistency, but his result was strengthened by replacing ω-consistency with ordinary consistency soon after.

Definition 36.2. A theory T is ω-consistent if the following holds: if ∃x φ(x) is any sentence and T proves ¬φ(0), ¬φ(1), ¬φ(2), . . . , then T does not prove ∃x φ(x).

Theorem 36.3. Let T be any ω-consistent theory that includes Q. Then T is not decidable.

Proof. If T includes Q, then T represents the computable functions and relations. We need only modify the previous proof. As above, if x ∈ K, then T proves ∃s φT(x, x, s). Conversely, suppose T proves ∃s φT(x, x, s). Then x must be in K: otherwise, there is no halting computation of machine x on input x; since φT represents Kleene's T relation, T proves ¬φT(x, x, 0), ¬φT(x, x, 1), . . . , making T ω-inconsistent.
36.4 Consistent Extensions of Q are Undecidable
Remember that a theory is consistent if it does not prove both φ and ¬φ for any formula φ. Since anything follows from a contradiction, an inconsistent theory is trivial: every sentence is provable. Clearly, if a theory is ω-consistent, then it is consistent. But being consistent is a weaker requirement (i.e., there are theories that are consistent but not ω-consistent). We can weaken the assumption in Definition 36.2 to simple consistency to obtain a stronger theorem.
Lemma 36.4. There is no “universal computable relation.” That is, there is
no binary computable relation R(x, y), with the following property: whenever
S(y) is a unary computable relation, there is some k such that for every y, S(y)
is true if and only if R(k, y) is true.
Proof. Suppose R(x, y) is a universal computable relation. Let S(y) be the
relation ¬R(y, y). Since S(y) is computable, for some k, S(y) is equivalent to
R(k, y). But then we have that S(k) is equivalent to both R(k, k) and ¬R(k, k),
which is a contradiction.
Theorem 36.5. Let T be any consistent theory that includes Q. Then T is
not decidable.
Proof. Suppose T is a consistent, decidable extension of Q. We will obtain a
contradiction by using T to define a universal computable relation.
Let R(x, y) hold if and only if
x codes a formula θ(u), and T proves θ(y).
Since we are assuming that T is decidable, R is computable. Let us show that
R is universal. If S(y) is any computable relation, then it is representable in
Q (and hence T) by a formula θS (u). Then for every n, we have
S(n) → T ⊢ θS (n)
→ R(# θS (u)# , n)
and
¬S(n) → T ⊢ ¬θS (n)
→ T ̸⊢ θS (n) (since T is consistent)
→ ¬R(#θS(u)#, n).
That is, for every y, S(y) is true if and only if R(# θS (u)# , y) is. So R is
universal, and we have the contradiction we were looking for.
Let “true arithmetic” be the theory {φ : N ⊨ φ}, that is, the set of sentences
in the language of arithmetic that are true in the standard interpretation.
Corollary 36.6. True arithmetic is not decidable.
36.5 Axiomatizable Theories
A theory T is said to be axiomatizable if it has a computable set of axioms A. (Saying that A is a set of axioms for T means T = {φ : A ⊢ φ}.) Any "reasonable" axiomatization of the natural numbers will have this property. In particular, any theory with a finite set of axioms is axiomatizable.
Lemma 36.7. Suppose T is axiomatizable. Then T is computably enumerable.

Proof. Suppose A is a computable set of axioms for T. To determine if φ ∈ T, just search for a derivation of φ from the axioms.
Put slightly differently, φ is in T if and only if there is a finite list of axioms
ψ1 , . . . , ψk in A and a derivation of (ψ1 ∧ · · · ∧ ψk ) → φ in first-order logic.
But we already know that any set with a definition of the form “there exists
. . . such that . . . ” is c.e., provided the second “. . . ” is computable.
36.6 Axiomatizable Complete Theories are Decidable
A theory is said to be complete if for every sentence φ, either φ or ¬φ is provable.
Lemma 36.8. Suppose a theory T is complete and axiomatizable. Then T is
decidable.

Proof. Suppose T is complete and A is a computable set of axioms. If T is inconsistent, it is clearly computable. (Algorithm: "just say yes.") So we can assume that T is also consistent.
To decide whether or not a sentence φ is in T, simultaneously search for
a derivation of φ from T and a derivation of ¬φ. Since T is complete, you are
bound to find one or the other; and since T is consistent, if you find a derivation
of ¬φ, there is no derivation of φ.
Put in different terms, we already know that T is c.e.; so by a theorem we
proved before, it suffices to show that the complement of T is c.e. also. But
a formula φ is in T̄ if and only if ¬φ is in T; so T̄ ≤m T.
36.7 Q has no Complete, Consistent, Axiomatizable Extensions

Theorem 36.9. There is no complete, consistent, axiomatizable extension of Q.
Proof. We already know that there is no consistent, decidable extension of Q. But if T is complete and axiomatizable, then it is decidable.

This theorem is not that far from Gödel's original 1931 formulation of the First Incompleteness Theorem. Aside from the more modern terminology, the key differences are these: Gödel has "ω-consistent" instead of "consistent"; and he could not say "axiomatizable" in full generality, since the formal notion of computability was not in place yet. (The formal models of computability were developed over the following decade, including by Gödel, and in large part to be able to characterize the kinds of theories that are susceptible to the Gödel phenomenon.)
The theorem says you can't have it all, namely, completeness, consistency, and axiomatizability. If you give up any one of these, though, you can have the other two: Q is consistent and computably axiomatized, but not complete; the inconsistent theory is complete, and computably axiomatized (say, by {0 ≠ 0}), but not consistent; and the set of true sentences of arithmetic is complete and consistent, but it is not computably axiomatized.
36.8 Sentences Provable and Refutable in Q are Computably Inseparable

Let Q̄ be the set of sentences whose negations are provable in Q, i.e., Q̄ = {φ : Q ⊢ ¬φ}. Remember that disjoint sets A and B are said to be computably inseparable if there is no computable set C such that A ⊆ C and B ⊆ C̄.
Lemma 36.10. Q and Q̄ are computably inseparable.

Proof. Suppose C is a computable set such that Q ⊆ C and Q̄ ⊆ C̄. Let R(x, y) be the relation
x codes a formula θ(u) and θ(y) is in C.
We will show that R(x, y) is a universal computable relation, yielding a con-
tradiction.
Suppose S(y) is computable, represented by θS (u) in Q. Then
S(n) → Q ⊢ θS (n)
→ θS (n) ∈ C
and
¬S(n) → Q ⊢ ¬θS (n)
→ θS (n) ∈ Q̄
→ θS (n) ̸∈ C
So S(y) is equivalent to R(#θS(u)#, y).
36.9 Theories Consistent with Q are Undecidable
The following theorem says that not only is Q undecidable, but, in fact, any theory that does not disagree with Q is undecidable.

Theorem 36.11. Let T be any theory in the language of arithmetic that is consistent with Q (i.e., T ∪ Q is consistent). Then T is undecidable.

Proof. Remember that Q has a finite set of axioms, Q1, . . . , Q8. We can even replace these by a single axiom, α = Q1 ∧ · · · ∧ Q8.
Suppose T is a decidable theory consistent with Q. Let

C = {φ : T ⊢ α → φ}.

We show that C would be a computable separation of Q and Q̄, a contradiction.
First, if φ is in Q, then φ is provable from the axioms of Q; by the deduction
theorem, there is a derivation of α → φ in first-order logic. So φ is in C.
On the other hand, if φ is in Q̄, then there is a proof of α → ¬φ in first-
order logic. If T also proves α → φ, then T proves ¬α, in which case T ∪ Q
is inconsistent. But we are assuming T ∪ Q is consistent, so T does not prove
α → φ, and so φ is not in C.
We've shown that if φ is in Q, then it is in C, and if φ is in Q̄, then it is not in C. So C is a computable separation, which is the contradiction we were looking for.

This theorem is very powerful. For example, it implies:

Corollary 36.12. First-order logic for the language of arithmetic (that is, the
set {φ : φ is provable in first-order logic}) is undecidable.

Proof. First-order logic is the set of consequences of ∅, which is consistent with Q.
36.10 Theories in which Q is Interpretable are Undecidable

We can strengthen these results even more. Informally, an interpretation of
a language L1 in another language L2 involves defining the universe, relation
symbols, and function symbols of L1 with formulas in L2 . Though we won’t
take the time to do this, one can make this definition precise.
Theorem 36.13. Suppose T is a theory in a language in which one can in-
terpret the language of arithmetic, in such a way that T is consistent with the
interpretation of Q. Then T is undecidable. If T proves the interpretation of
the axioms of Q, then no consistent extension of T is decidable.

The proof is just a small modification of the proof of the last theorem; a decision procedure for T would again yield a computable separation of Q and Q̄. One can
take ZFC, Zermelo–Fraenkel set theory with the axiom of choice, to be an
axiomatic foundation that is powerful enough to carry out a good deal of ordi-
nary mathematics. In ZFC one can define the natural numbers, and via this
interpretation, the axioms of Q are true. So we have
Corollary 36.14. There is no consistent, decidable extension of ZFC.

Corollary 36.15. There is no complete, consistent, computably axiomatizable extension of ZFC.

The language of ZFC has only a single binary relation symbol, ∈. (In fact, you don't even need equality.) So we have
Corollary 36.16. First-order logic for any language with a binary relation
symbol is undecidable.

This result extends to any language with two unary function symbols, since
one can use these to simulate a binary relation symbol. The results just cited
are tight: it turns out that first-order logic for a language with only unary
relation symbols and at most one unary function symbol is decidable.
One more bit of trivia. We know that the set of sentences in the language
0, ′ , +, ×, < true in the standard model is undecidable. In fact, one can
define < in terms of the other symbols, and then one can define + in terms of
× and ′ . So the set of true sentences in the language 0, ′ , × is undecidable.
On the other hand, Presburger has shown that the set of sentences in the language 0, ′, + true in the standard model is decidable. The procedure is computationally infeasible, however.

Chapter 37

Incompleteness and Provability


37.1 Introduction
Hilbert thought that a system of axioms for a mathematical structure, such
as the natural numbers, is inadequate unless it allows one to derive all true
statements about the structure. Combined with his later interest in formal
systems of deduction, this suggests that he thought that we should guarantee
that, say, the formal systems we are using to reason about the natural numbers
is not only consistent, but also complete, i.e., every statement in its language
is either derivable or its negation is. Gödel’s first incompleteness theorem
shows that no such system of axioms exists: there is no complete, consistent,
axiomatizable formal system for arithmetic. In fact, no “sufficiently strong,”
consistent, axiomatizable mathematical theory is complete.
A more important goal of Hilbert’s, the centerpiece of his program for the
justification of modern (“classical”) mathematics, was to find finitary consis-
tency proofs for formal systems representing classical reasoning. With regard
to Hilbert’s program, then, Gödel’s second incompleteness theorem was a much
bigger blow. The second incompleteness theorem can be stated in vague terms,
like the first incompleteness theorem. Roughly speaking, it says that no suffi-
ciently strong theory of arithmetic can prove its own consistency. We will have
to take “sufficiently strong” to include a little bit more than Q.
The idea behind Gödel’s original proof of the incompleteness theorem can
be found in the Epimenides paradox. Epimenides, a Cretan, asserted that all
Cretans are liars; a more direct form of the paradox is the assertion “this sen-
tence is false.” Essentially, by replacing truth with derivability, Gödel was able
to formalize a sentence which, in a roundabout way, asserts that it itself is not
derivable. If that sentence were derivable, the theory would then be inconsis-
tent. Gödel showed that the negation of that sentence is also not derivable from
the system of axioms he was considering. (For this second part, Gödel had to
assume that the theory T is what’s called “ω-consistent.” ω-Consistency is
related to consistency, but is a stronger property.1 A few years after Gödel,
Rosser showed that assuming simple consistency of T is enough.)
The first challenge is to understand how one can construct a sentence that
refers to itself. For every formula φ in the language of Q, let ⌜φ⌝ denote the
numeral corresponding to #φ#. Think about what this means: φ is a formula in
the language of Q, #φ# is a natural number, and ⌜φ⌝ is a term in the language
of Q. So every formula φ in the language of Q has a name, ⌜φ⌝, which is a
term in the language of Q; this provides us with a conceptual framework in
which formulas in the language of Q can “say” things about other formulas.
The following lemma is known as the fixed-point lemma.

1 That is, any ω-consistent theory is consistent, but not vice versa.


Lemma 37.1. Let T be any theory extending Q, and let ψ(x) be any formula
with only the variable x free. Then there is a sentence φ such that T ⊢ φ ↔
ψ(⌜φ⌝).

The lemma asserts that given any property ψ(x), there is a sentence φ that
asserts “ψ(x) is true of me,” and T “knows” this.
How can we construct such a sentence? Consider the following version of
the Epimenides paradox, due to Quine:
“Yields falsehood when preceded by its quotation” yields falsehood
when preceded by its quotation.
This sentence is not directly self-referential. It simply makes an assertion about
the syntactic objects between quotes, and, in doing so, it is on par with sen-
tences like
1. “Robert” is a nice name.
2. “I ran.” is a short sentence.
3. “Has three words” has three words.
But what happens when one takes the phrase “yields falsehood when preceded
by its quotation,” and precedes it with a quoted version of itself? Then one
has the original sentence! In short, the sentence asserts that it is false.

content/incompleteness/incompleteness-provability/fixed-point-lemma.tex

37.2 The Fixed-Point Lemma


The fixed-point lemma says that for any formula ψ(x), there is a sentence φ
such that T ⊢ φ ↔ ψ(⌜φ⌝), provided T extends Q. In the case of the liar sen-
tence, we’d want φ to be equivalent (provably in T) to “⌜φ⌝ is false,” i.e., the
statement that # φ# is the Gödel number of a false sentence. To understand
the idea of the proof, it will be useful to compare it with Quine’s informal
gloss of φ as, “‘yields a falsehood when preceded by its own quotation’ yields
a falsehood when preceded by its own quotation.” The operation of taking an
expression, and then forming a sentence by preceding this expression by its
own quotation may be called diagonalizing the expression, and the result its
diagonalization. So, the diagonalization of ‘yields a falsehood when preceded
by its own quotation’ is “‘yields a falsehood when preceded by its own quota-
tion’ yields a falsehood when preceded by its own quotation.” Now note that
Quine’s liar sentence is not the diagonalization of ‘yields a falsehood’ but of
‘yields a falsehood when preceded by its own quotation.’ So the property being
diagonalized to yield the liar sentence itself involves diagonalization!
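The quotation trick can be made tangible with a minimal Python sketch of the diagonalization operation on strings (the function name diagonalize is ours, purely for illustration):

def diagonalize(expr: str) -> str:
    """Form a sentence by preceding an expression with its own quotation."""
    return f"'{expr}' {expr}"

phrase = "yields a falsehood when preceded by its own quotation"
liar = diagonalize(phrase)
print(liar)
# 'yields a falsehood when preceded by its own quotation' yields a
# falsehood when preceded by its own quotation

The printed sentence is exactly the diagonalization of the phrase it quotes, which is why it ends up talking about itself.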
In the language of arithmetic, we form the quotation of a formula with one free
variable by computing its Gödel number and then substituting the standard
numeral for that Gödel number into the free variable. The diagonalization


of α(x) is α(n), where n = #α(x)#. (From now on, let’s abbreviate the numeral
for #α(x)# as ⌜α(x)⌝.) So if ψ(x) is “x is a falsehood,” then “yields a falsehood
when preceded by its own quotation” would be “yields a falsehood when applied
to the Gödel number of its diagonalization.” If we had a symbol diag for the
function diag(n) which computes the Gödel number of the diagonalization of
the formula with Gödel number n, we could write α(x) as ψ(diag(x)). And
Quine’s version of the liar sentence would then be the diagonalization of it,
i.e., α(⌜α(x)⌝) or ψ(diag(⌜ψ(diag(x))⌝)). Of course, ψ(x) could now be any
other property, and the same construction would work. For the incompleteness
theorem, we’ll take ψ(x) to be “x is not derivable in T.” Then α(x) would be
“yields a sentence not derivable in T when applied to the Gödel number of its
diagonalization.”
To formalize this in T, we have to find a way to formalize diag. The function
diag(n) is computable; in fact, it is primitive recursive: if n is the Gödel number
of a formula α(x), diag(n) returns the Gödel number of α(⌜α(x)⌝). (Recall,
⌜α(x)⌝ is the standard numeral of the Gödel number of α(x), i.e., #α(x)#.)
If diag were a function symbol in T representing the function diag, we could
take φ to be the formula ψ(diag(⌜ψ(diag(x))⌝)). Notice that

diag(#ψ(diag(x))#) = #ψ(diag(⌜ψ(diag(x))⌝))# = #φ#.

Assuming T can derive

diag(⌜ψ(diag(x))⌝) = ⌜φ⌝,

it can derive ψ(diag(⌜ψ(diag(x))⌝)) ↔ ψ(⌜φ⌝). But the left hand side is, by
definition, φ.

Of course, diag will in general not be a function symbol of T, and certainly
is not one of Q. But, since diag is computable, it is representable in Q
by some formula θdiag (x, y). So instead of writing ψ(diag(x)) we can write
∃y (θdiag (x, y) ∧ ψ(y)). Otherwise, the proof sketched above goes through, and
in fact, it goes through already in Q.
Lemma 37.2. Let ψ(x) be any formula with one free variable x. Then there
is a sentence φ such that Q ⊢ φ ↔ ψ(⌜φ⌝).

Proof. Given ψ(x), let α(x) be the formula ∃y (θdiag (x, y) ∧ ψ(y)) and let φ be
its diagonalization, i.e., the formula α(⌜α(x)⌝).
Since θdiag represents diag, and diag(#α(x)#) = #φ#, Q can derive

θdiag (⌜α(x)⌝, ⌜φ⌝)   (37.1)
∀y (θdiag (⌜α(x)⌝, y) → y = ⌜φ⌝).   (37.2)
Now we show that Q ⊢ φ ↔ ψ(⌜φ⌝). We argue informally, using just logic and
facts derivable in Q.
First, suppose φ, i.e., α(⌜α(x)⌝). Going back to the definition of α(x), we
see that α(⌜α(x)⌝) just is
∃y (θdiag (⌜α(x)⌝, y) ∧ ψ(y)).


Consider such a y. Since θdiag (⌜α(x)⌝, y), by eq. (37.2), y = ⌜φ⌝. So, from
ψ(y) we have ψ(⌜φ⌝).
Now suppose ψ(⌜φ⌝). By eq. (37.1), we have

θdiag (⌜α(x)⌝, ⌜φ⌝) ∧ ψ(⌜φ⌝).

It follows that

∃y (θdiag (⌜α(x)⌝, y) ∧ ψ(y)).

But that’s just α(⌜α(x)⌝), i.e., φ.

You should compare this to the proof of the fixed-point lemma in com-
putability theory. The difference is that here we want to define a statement in
terms of itself, whereas there we wanted to define a function in terms of itself;
this difference aside, it is really the same idea.
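The computability-theoretic analogue can be seen in a classic quine, a program that outputs its own source code by the very same diagonalization trick; a minimal sketch in Python:

# The template s contains a hole (%r); filling the hole with a quotation
# of s itself "diagonalizes" the template, yielding the program's source.
s = 's = %r\nprint(s %% s)'
print(s % s)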

Problem 37.1. A formula φ(x) is a truth definition if Q ⊢ ψ ↔ φ(⌜ψ⌝) for all
sentences ψ. Show that no formula is a truth definition by using the fixed-point
lemma.

content/incompleteness/incompleteness-provability/first-incompleteness-thm.tex

37.3 The First Incompleteness Theorem


We can now describe Gödel’s original proof of the first incompleteness theorem.
Let T be any computably axiomatized theory in a language extending the
language of arithmetic, such that T includes the axioms of Q. This means
that, in particular, T represents computable functions and relations.
We have argued that, given a reasonable coding of formulas and proofs
as numbers, the relation Prf T (x, y) is computable, where Prf T (x, y) holds if
and only if x is the Gödel number of a derivation of the formula with Gödel
number y in T. In fact, for the particular theory that Gödel had in mind, Gödel
was able to show that this relation is primitive recursive, using the list of 45
functions and relations in his paper. The 45th relation, xBy, is just Prf T (x, y)
for his particular choice of T. Remember that where Gödel uses the word
“recursive” in his paper, we would now use the phrase “primitive recursive.”
Since Prf T (x, y) is computable, it is representable in T. We will use
Prf T (x, y) to refer to the formula that represents it. Let ProvT (y) be the
formula ∃x Prf T (x, y). This describes the 46th relation, Bew(y), on Gödel’s
list. As Gödel notes, this is the only relation that “cannot be asserted to be
recursive.” What he probably meant is this: from the definition, it is not clear
that it is computable; and later developments, in fact, show that it isn’t.
Let T be an axiomatizable theory containing Q. Then Prf T (x, y) is decid-
able, hence representable in Q by a formula Prf T (x, y). Let ProvT (y) be the


formula we described above. By the fixed-point lemma, there is a formula γT
such that Q (and hence T) derives

γT ↔ ¬ProvT (⌜γT ⌝).   (37.3)

Note that γT says, in essence, “γT is not derivable in T.”
Lemma 37.3. If T is a consistent, axiomatizable theory extending Q, then
T ⊬ γT .

Proof. Suppose T derives γT . Then there is a derivation, and so, for some
number m, the relation Prf T (m, #γT #) holds. But then Q derives the sentence
Prf T (m, ⌜γT ⌝). So Q derives ∃x Prf T (x, ⌜γT ⌝), which is, by definition,
ProvT (⌜γT ⌝). By eq. (37.3), Q derives ¬γT , and since T extends Q, so does T.
We have shown that if T derives γT , then it also derives ¬γT , and hence it
would be inconsistent.

Definition 37.4. A theory T is ω-consistent if the following holds: if ∃x φ(x)
is any sentence and T derives ¬φ(0), ¬φ(1), ¬φ(2), . . . , then T does not derive
∃x φ(x).

Note that every ω-consistent theory is also consistent. This follows simply
from the fact that if T is inconsistent, then T ⊢ φ for every φ. In particular, if
T is inconsistent, it derives both ¬φ(n) for every n and also derives ∃x φ(x). So,
if T is inconsistent, it is ω-inconsistent. By contraposition, if T is ω-consistent,
it must be consistent.
Lemma 37.5. If T is an ω-consistent, axiomatizable theory extending Q,
then T ⊬ ¬γT .

Proof. We show that if T derives ¬γT , then it is ω-inconsistent. Suppose
T derives ¬γT . If T is inconsistent, it is ω-inconsistent, and we are done.
Otherwise, T is consistent, so it does not derive γT by Lemma 37.3. Since
there is no derivation of γT in T, Q derives
¬Prf T (0, ⌜γT ⌝), ¬Prf T (1, ⌜γT ⌝), ¬Prf T (2, ⌜γT ⌝), . . .
and so does T. On the other hand, by eq. (37.3), ¬γT is equivalent to
∃x Prf T (x, ⌜γT ⌝). So T is ω-inconsistent.

Problem 37.2. Every ω-consistent theory is consistent. Show that the con-
verse does not hold, i.e., that there are consistent but ω-inconsistent theories.
Do this by showing that Q ∪ {¬γQ } is consistent but ω-inconsistent.

Theorem 37.6. Let T be any ω-consistent, axiomatizable theory extending Q.
Then T is not complete.

Proof. If T is ω-consistent, it is consistent, so T ⊬ γT by Lemma 37.3. By
Lemma 37.5, T ⊬ ¬γT . This means that T is incomplete, since it derives
neither γT nor ¬γT .


content/incompleteness/incompleteness-provability/rosser-thm.tex

37.4 Rosser’s Theorem


Can we modify Gödel’s proof to get a stronger result, replacing “ω-consistent”
with simply “consistent”? The answer is “yes,” using a trick discovered by
Rosser. Rosser’s trick is to use a “modified” derivability predicate RProvT (y)
instead of ProvT (y).
Theorem 37.7. Let T be any consistent, axiomatizable theory extending Q.
Then T is not complete.
Proof. Recall that ProvT (y) is defined as ∃x Prf T (x, y), where Prf T (x, y) repre-
sents the decidable relation which holds iff x is the Gödel number of a deriva-
tion of the sentence with Gödel number y. The relation that holds between x
and y if x is the Gödel number of a refutation of the sentence with Gödel num-
ber y is also decidable. Let not(x) be the primitive recursive function which
does the following: if x is the code of a formula φ, not(x) is a code of ¬φ.
Then Ref T (x, y) holds iff Prf T (x, not(y)). Let Ref T (x, y) represent it. Then,
if T ⊢ ¬φ and δ is a corresponding derivation, Q ⊢ Ref T (⌜δ⌝, ⌜φ⌝). We define
RProvT (y) as
∃x (Prf T (x, y) ∧ ∀z (z < x → ¬Ref T (z, y))).
Roughly, RProvT (y) says “there is a proof of y in T, and there is no shorter
refutation of y.” Assuming T is consistent, RProvT (y) is true of the same
numbers as ProvT (y); but from the point of view of provability in T (and we
now know that there is a difference between truth and provability!) the two
have different properties. If T is inconsistent, then the two do not hold of
the same numbers! (RProvT (y) is often read as “y is Rosser provable.” Since,
as just discussed, Rosser provability is not some special kind of provability—
in inconsistent theories, there are sentences that are provable but not Rosser
provable—this may be confusing. To avoid the confusion, you could instead
read it as “y is shmovable.”)
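The definition of RProvT (y) can be mirrored by a small search procedure. The following Python sketch uses toy stand-ins for derivations (an illustrative encoding of ours, not the real Prf T and Ref T relations):

from typing import List, Tuple

# Toy stand-ins for an arithmetized proof system: a "derivation" here is
# just a labeled pair, not a real derivation.
Derivation = Tuple[str, str]

def proves(d: Derivation, s: str) -> bool:
    return d == ("proof", s)

def refutes(d: Derivation, s: str) -> bool:
    return d == ("refutation", s)

def rosser_provable(s: str, derivations: List[Derivation]) -> bool:
    """RProv: some derivation proves s, and no derivation earlier in the
    enumeration (i.e., with a smaller code) refutes s."""
    for i, d in enumerate(derivations):
        if proves(d, s):
            return not any(refutes(e, s) for e in derivations[:i])
    return False

# In an inconsistent "theory," whichever of proof and refutation comes
# first wins; here a refutation precedes the proof:
assert rosser_provable("A", [("refutation", "A"), ("proof", "A")]) is False
assert rosser_provable("A", [("proof", "A"), ("refutation", "A")]) is True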
By the fixed-point lemma, there is a formula ρT such that
Q ⊢ ρT ↔ ¬RProvT (⌜ρT ⌝).   (37.4)
In contrast to the proof of Theorem 37.6, here we claim that if T is consistent,
T doesn’t derive ρT , and T also doesn’t derive ¬ρT . (In other words, we don’t
need the assumption of ω-consistency.)
First, let’s show that T ⊬ ρT . Suppose it did, so there is a derivation of ρT
from T ; let n be its Gödel number. Then Q ⊢ Prf T (n, ⌜ρT ⌝), since Prf T repre-
sents Prf T in Q. Also, for each k < n, k is not the Gödel number of a deriva-
tion of ¬ρT , since T is consistent. So for each k < n, Q ⊢ ¬Ref T (k, ⌜ρT ⌝). By
Lemma 35.24, Q ⊢ ∀z (z < n → ¬Ref T (z, ⌜ρT ⌝)). Thus,
Q ⊢ ∃x (Prf T (x, ⌜ρT ⌝) ∧ ∀z (z < x → ¬Ref T (z, ⌜ρT ⌝))),


but that’s just RProvT (⌜ρT ⌝). By eq. (37.4), Q ⊢ ¬ρT . Since T extends Q, also
T ⊢ ¬ρT . We’ve assumed that T ⊢ ρT , so T would be inconsistent, contrary
to the assumption of the theorem.
Now, let’s show that T ⊬ ¬ρT . Again, suppose it did, and suppose n is the
Gödel number of a derivation of ¬ρT . Then Ref T (n, #ρT #) holds, and since
Ref T represents Ref T in Q, Q ⊢ Ref T (n, ⌜ρT ⌝). We’ll again show that T would
then be inconsistent because it would also derive ρT . Since

Q ⊢ ρT ↔ ¬RProvT (⌜ρT ⌝),

and since T extends Q, it suffices to show that

Q ⊢ ¬RProvT (⌜ρT ⌝).

The sentence ¬RProvT (⌜ρT ⌝), i.e.,

¬∃x (Prf T (x, ⌜ρT ⌝) ∧ ∀z (z < x → ¬Ref T (z, ⌜ρT ⌝))),

is logically equivalent to

∀x (Prf T (x, ⌜ρT ⌝) → ∃z (z < x ∧ Ref T (z, ⌜ρT ⌝))).

We argue informally using logic, making use of facts about what Q derives.
Suppose x is arbitrary and Prf T (x, ⌜ρT ⌝). We already know that T ⊬ ρT ,
and so for every k, Q ⊢ ¬Prf T (k, ⌜ρT ⌝). Thus, for every k it follows that
x ̸= k. In particular, we have (a) that x ̸= n. We also have ¬(x = 0 ∨ x =
1 ∨ · · · ∨ x = n − 1) and so by Lemma 35.24, (b) ¬(x < n). By Lemma 35.25,
n < x. Since Q ⊢ Ref T (n, ⌜ρT ⌝), we have n < x ∧ Ref T (n, ⌜ρT ⌝), and from
that ∃z (z < x ∧ Ref T (z, ⌜ρT ⌝)). Since x was arbitrary we get, as required, that

∀x (Prf T (x, ⌜ρT ⌝) → ∃z (z < x ∧ Ref T (z, ⌜ρT ⌝))).

Problem 37.3. Two sets A and B of natural numbers are said to be computably
inseparable if there is no decidable set X such that A ⊆ X and B ⊆ X̄
(where X̄ is the complement, N \ X, of X). Let T be a consistent axiomatizable
extension of Q. Suppose A is the set of Gödel numbers of sentences provable
in T and B the set of Gödel numbers of sentences refutable in T. Prove that
A and B are computably inseparable.

content/incompleteness/incompleteness-provability/godels-paper.tex

37.5 Comparison with Gödel’s Original Paper


It is worthwhile to spend some time with Gödel’s 1931 paper. The introduction
sketches the ideas we have just discussed. Even if you just skim through the


paper, it is easy to see what is going on at each stage: first Gödel describes the
formal system P (syntax, axioms, proof rules); then he defines the primitive
recursive functions and relations; then he shows that xBy is primitive recursive,
and argues that the primitive recursive functions and relations are represented
in P. He then goes on to prove the incompleteness theorem, as above. In
Section 3, he shows that one can take the unprovable assertion to be a sentence
in the language of arithmetic. This is the origin of the β-lemma, which is what
we also used to handle sequences in showing that the recursive functions are
representable in Q. Gödel doesn’t go so far as to isolate a minimal set of axioms
that suffice, but we now know that Q will do the trick. Finally, in Section 4,
he sketches a proof of the second incompleteness theorem.

content/incompleteness/incompleteness-provability/provability-conditions.tex

37.6 The Derivability Conditions for PA


Peano arithmetic, or PA, is the theory extending Q with induction axioms for
all formulas. In other words, one adds to Q axioms of the form

(φ(0) ∧ ∀x (φ(x) → φ(x′ ))) → ∀x φ(x)

for every formula φ. Notice that this is really a schema, which is to say,
infinitely many axioms (and it turns out that PA is not finitely axiomatizable).
But since one can effectively determine whether or not a string of symbols is
an instance of an induction axiom, the set of axioms for PA is computable.
PA is a much more robust theory than Q. For example, one can easily prove
that addition and multiplication are commutative, using induction in the usual
way. In fact, most finitary number-theoretic and combinatorial arguments can
be carried out in PA.
Since PA is computably axiomatized, the derivability predicate Prf PA (x, y)
is computable and hence represented in Q (and so, in PA). As before, we will
take Prf PA (x, y) to denote the formula representing the relation. Let ProvPA (y)
be the formula ∃x Prf PA (x, y), which, intuitively says, “y is derivable from the
axioms of PA.” The reason we need a little bit more than the axioms of Q
is that we need to know that the theory we are using is strong enough to derive a
few basic facts about this derivability predicate. In fact, what we need are the
following facts:
P1. If PA ⊢ φ, then PA ⊢ ProvPA (⌜φ⌝).
P2. For all formulas φ and ψ,

PA ⊢ ProvPA (⌜φ → ψ⌝) → (ProvPA (⌜φ⌝) → ProvPA (⌜ψ⌝)).

P3. For every formula φ,

PA ⊢ ProvPA (⌜φ⌝) → ProvPA (⌜ProvPA (⌜φ⌝)⌝).


The only way to verify that these three properties hold is to describe the
formula ProvPA (y) carefully and use the axioms of PA to describe the relevant
formal derivations. Conditions (1) and (2) are easy; it is really condition (3)
that requires work. (Think about what kind of work it entails . . . ) Carrying
out the details would be tedious and uninteresting, so here we will ask you to
take it on faith that PA has the three properties listed above. A reasonable
choice of ProvPA (y) will also satisfy

P4. If PA ⊢ ProvPA (⌜φ⌝), then PA ⊢ φ.

But we will not need this fact.


Incidentally, Gödel was lazy in the same way we are being now. At the end
of the 1931 paper, he sketches the proof of the second incompleteness theorem,
and promises the details in a later paper. He never got around to it; since
everyone who understood the argument believed that it could be carried out,
he did not need to fill in the details.

content/incompleteness/incompleteness-provability/second-incompleteness-thm.tex

37.7 The Second Incompleteness Theorem


How can we express the assertion that PA doesn’t prove its own consistency?
Saying PA is inconsistent amounts to saying that PA ⊢ 0 = 1. So we can take
the consistency statement ConPA to be the sentence ¬ProvPA (⌜0 = 1⌝), and
then the following theorem does the job:

Theorem 37.8. If PA is consistent, then PA does not derive ConPA .

It is important to note that the theorem depends on the particular representation
of ConPA (i.e., the particular representation of ProvPA (y)). All we will
use is that the representation of ProvPA (y) satisfies the three derivability con-
ditions, so the theorem generalizes to any theory with a derivability predicate
having these properties.
It is informative to read Gödel’s sketch of an argument, since the theorem
follows like a good punch line. It goes like this. Let γPA be the Gödel sentence
that we constructed in the proof of Theorem 37.6. We have shown “If PA is
consistent, then PA does not derive γPA .” If we formalize this in PA, we have
a proof of
ConPA → ¬ProvPA (⌜γPA ⌝).
Now suppose PA derives ConPA . Then it derives ¬ProvPA (⌜γPA ⌝). But since
γPA is a Gödel sentence, this is equivalent to γPA . So PA derives γPA .
But: we know that if PA is consistent, it doesn’t derive γPA ! So if PA is
consistent, it can’t derive ConPA .
To make the argument more precise, we will let γPA be the Gödel sentence
for PA and use the derivability conditions (P1)–(P3) to show that PA derives


ConPA → γPA . This will show that PA doesn’t derive ConPA . Here is a sketch
of the proof, in PA. (For simplicity, we drop the PA subscripts.)

γ ↔ ¬Prov(⌜γ⌝)   (37.5)   γ is a Gödel sentence
γ → ¬Prov(⌜γ⌝)   (37.6)   from eq. (37.5)
γ → (Prov(⌜γ⌝) → ⊥)   (37.7)   from eq. (37.6) by logic
Prov(⌜γ → (Prov(⌜γ⌝) → ⊥)⌝)   (37.8)   from eq. (37.7) by condition P1
Prov(⌜γ⌝) → Prov(⌜Prov(⌜γ⌝) → ⊥⌝)   (37.9)   from eq. (37.8) by condition P2
Prov(⌜γ⌝) → (Prov(⌜Prov(⌜γ⌝)⌝) → Prov(⌜⊥⌝))   (37.10)   from eq. (37.9) by condition P2 and logic
Prov(⌜γ⌝) → Prov(⌜Prov(⌜γ⌝)⌝)   (37.11)   by condition P3
Prov(⌜γ⌝) → Prov(⌜⊥⌝)   (37.12)   from eq. (37.10) and eq. (37.11) by logic
Con → ¬Prov(⌜γ⌝)   (37.13)   contraposition of eq. (37.12), using Con ≡ ¬Prov(⌜⊥⌝)
Con → γ   from eq. (37.5) and eq. (37.13) by logic

The use of logic in the above involves just elementary facts from propositional logic,
e.g., eq. (37.7) uses ⊢ ¬φ ↔ (φ → ⊥) and eq. (37.12) uses φ → (ψ → χ), φ → ψ ⊢
φ → χ. The use of condition P2 in eq. (37.9) and eq. (37.10) relies on instances
of P2, Prov(⌜φ → ψ⌝) → (Prov(⌜φ⌝) → Prov(⌜ψ⌝)). In the first one, φ ≡ γ and
ψ ≡ Prov(⌜γ⌝) → ⊥; in the second, φ ≡ Prov(⌜γ⌝) and ψ ≡ ⊥.
The more abstract version of the second incompleteness theorem is as fol-
lows:

Theorem 37.9. Let T be any consistent, axiomatized theory extending Q
and let ProvT (y) be any formula satisfying derivability conditions P1–P3 for T.
Then T does not derive ConT .

Problem 37.4. Show that PA derives γPA → ConPA .

The moral of the story is that no “reasonable” consistent theory for math-
ematics can derive its own consistency statement. Suppose T is a theory of
mathematics that includes Q and Hilbert’s “finitary” reasoning (whatever that
may be). Then, the whole of T cannot derive the consistency statement of T,


and so, a fortiori, the finitary fragment can’t derive the consistency statement
of T either. In that sense, there cannot be a finitary consistency proof for “all
of mathematics.”
There is some leeway in interpreting the term “finitary,” and Gödel, in the
1931 paper, grants the possibility that something we may consider “finitary”
may lie outside the kinds of mathematics Hilbert wanted to formalize. But
Gödel was being charitable; today, it is hard to see how we might find something
that can reasonably be called finitary but is not formalizable in, say, ZFC,
Zermelo–Fraenkel set theory with the axiom of choice.

content/incompleteness/incompleteness-provability/lob-thm.tex

37.8 Löb’s Theorem


The Gödel sentence for a theory T is a fixed point of ¬ProvT (y), i.e., a sen-
tence γ such that
T ⊢ ¬ProvT (⌜γ⌝) ↔ γ.
It is not derivable, because if T ⊢ γ, (a) by derivability condition (1), T ⊢
ProvT (⌜γ⌝), and (b) T ⊢ γ together with T ⊢ ¬ProvT (⌜γ⌝) ↔ γ gives T ⊢
¬ProvT (⌜γ⌝), and so T would be inconsistent. Now it is natural to ask about
the status of a fixed point of ProvT (y), i.e., a sentence δ such that
T ⊢ ProvT (⌜δ⌝) ↔ δ.
If it were derivable, T ⊢ ProvT (⌜δ⌝) by condition (1), but the same conclusion
follows if we apply modus ponens to the equivalence above. Hence, we don’t
get that T is inconsistent, at least not by the same argument as in the case of
the Gödel sentence. This of course does not show that T does derive δ.
We can make headway on this question if we generalize it a bit. The left-
to-right direction of the fixed point equivalence, ProvT (⌜δ⌝) → δ, is an instance
of a general schema called a reflection principle: ProvT (⌜φ⌝) → φ. It is called
that because it expresses, in a sense, that T can “reflect” about what it can
derive; basically it says, “If T can derive φ, then φ is true,” for any φ. This is
true for sound theories only, of course, and this suggests that theories will in
general not derive every instance of it. So which instances can a theory (strong
enough, and satisfying the derivability conditions) derive? Certainly all those
where φ itself is derivable. And that’s it, as the next result shows.
Theorem 37.10. Let T be an axiomatizable theory extending Q, and suppose
ProvT (y) is a formula satisfying conditions P1–P3 from section 37.7. If T
derives ProvT (⌜φ⌝) → φ, then in fact T derives φ.

Put differently, if T ⊬ φ, then T ⊬ ProvT (⌜φ⌝) → φ. This result is known
as Löb’s theorem.
The heuristic for the proof of Löb’s theorem is a clever proof that Santa
Claus exists. (If you don’t like that conclusion, you are free to substitute any
other conclusion you would like.) Here it is:


1. Let X be the sentence, “If X is true, then Santa Claus exists.”

2. Suppose X is true.

3. Then what it says holds; i.e., we have: if X is true, then Santa Claus
exists.

4. Since we are assuming X is true, we can conclude that Santa Claus exists,
by modus ponens from (2) and (3).

5. We have succeeded in deriving (4), “Santa Claus exists,” from the as-
sumption (2), “X is true.” By conditional proof, we have shown: “If X
is true, then Santa Claus exists.”

6. But this is just the sentence X. So we have shown that X is true.

7. But then, by the argument (2)–(4) above, Santa Claus exists.

A formalization of this idea, replacing “is true” with “is derivable,” and “Santa
Claus exists” with φ, yields the proof of Löb’s theorem. The trick is to apply
the fixed-point lemma to the formula ProvT (y) → φ. The fixed point of that
corresponds to the sentence X in the preceding sketch.

Proof of Theorem 37.10. Suppose φ is a sentence such that T derives ProvT (⌜φ⌝)→
φ. Let ψ(y) be the formula ProvT (y) → φ, and use the fixed-point lemma to
find a sentence θ such that T derives θ ↔ ψ(⌜θ⌝). Then each of the following


is derivable in T:

θ ↔ (ProvT (⌜θ⌝) → φ)   (37.14)   θ is a fixed point of ψ(y)
θ → (ProvT (⌜θ⌝) → φ)   (37.15)   from eq. (37.14)
ProvT (⌜θ → (ProvT (⌜θ⌝) → φ)⌝)   (37.16)   from eq. (37.15) by condition P1
ProvT (⌜θ⌝) → ProvT (⌜ProvT (⌜θ⌝) → φ⌝)   (37.17)   from eq. (37.16) using condition P2
ProvT (⌜θ⌝) → (ProvT (⌜ProvT (⌜θ⌝)⌝) → ProvT (⌜φ⌝))   (37.18)   from eq. (37.17) using P2 again
ProvT (⌜θ⌝) → ProvT (⌜ProvT (⌜θ⌝)⌝)   (37.19)   by derivability condition P3
ProvT (⌜θ⌝) → ProvT (⌜φ⌝)   (37.20)   from eq. (37.18) and eq. (37.19)
ProvT (⌜φ⌝) → φ   (37.21)   by assumption of the theorem
ProvT (⌜θ⌝) → φ   (37.22)   from eq. (37.20) and eq. (37.21)
(ProvT (⌜θ⌝) → φ) → θ   (37.23)   from eq. (37.14)
θ   (37.24)   from eq. (37.22) and eq. (37.23)
ProvT (⌜θ⌝)   (37.25)   from eq. (37.24) by condition P1
φ   from eq. (37.21) and eq. (37.25)

With Löb’s theorem in hand, there is a short proof of the second incomplete-
ness theorem (for theories having a derivability predicate satisfying conditions
P1–P3): if T ⊢ ProvT (⌜⊥⌝) → ⊥, then T ⊢ ⊥. If T is consistent, T ⊬ ⊥. So,
T ⊬ ProvT (⌜⊥⌝) → ⊥, i.e., T ⊬ ConT . We can also apply it to show that δ, the
fixed point of ProvT (x), is derivable. For since

T ⊢ ProvT (⌜δ⌝) ↔ δ

in particular

T ⊢ ProvT (⌜δ⌝) → δ

and so by Löb’s theorem, T ⊢ δ.


Problem 37.5. Let T be a computably axiomatized theory, and let ProvT be
a derivability predicate for T. Consider the following four statements:

1. If T ⊢ φ, then T ⊢ ProvT (⌜φ⌝).

2. T ⊢ φ → ProvT (⌜φ⌝).

3. If T ⊢ ProvT (⌜φ⌝), then T ⊢ φ.

4. T ⊢ ProvT (⌜φ⌝) → φ.

Under what conditions are each of these statements true?

content/incompleteness/incompleteness-provability/tarski-thm.tex

37.9 The Undefinability of Truth


The notion of definability depends on having a formal semantics for the lan-
guage of arithmetic. We have described a set of formulas and sentences in the
language of arithmetic. The “intended interpretation” is to read such sentences
as making assertions about the natural numbers, and such an assertion can be
true or false. Let N be the structure with domain N and the standard inter-
pretation for the symbols in the language of arithmetic. Then N ⊨ φ means
“φ is true in the standard interpretation.”

Definition 37.11. A relation R(x1 , . . . , xk ) of natural numbers is definable in
N if and only if there is a formula φ(x1 , . . . , xk ) in the language of arithmetic
such that for every n1 , . . . , nk , R(n1 , . . . , nk ) if and only if N ⊨ φ(n1 , . . . , nk ).

Put differently, a relation is definable in N if and only if it is representable
in the theory TA, where TA = {φ : N ⊨ φ} is the set of true sentences of
arithmetic. (If this is not immediately clear to you, you should go back and
check the definitions and convince yourself that this is the case.)

Lemma 37.12. Every computable relation is definable in N.

Proof. It is easy to check that the formula representing a relation in Q defines
the same relation in N.

Now one can ask, is the converse also true? That is, is every relation
definable in N computable? The answer is no. For example:

Lemma 37.13. The halting relation is definable in N.

Proof. Let H be the halting relation, i.e.,

H = {⟨e, x⟩ : ∃s T (e, x, s)}.


Let θT define T in N. Then

H = {⟨e, x⟩ : N ⊨ ∃s θT (e, x, s)},

so ∃s θT (e, x, s) defines H in N.

Problem 37.6. Show that Q(n) ⇔ n ∈ {#φ# : Q ⊢ φ} is definable in arithmetic.

What about TA itself? Is it definable in arithmetic? That is: is the set
{#φ# : N ⊨ φ} definable in arithmetic? Tarski’s theorem answers this in the
negative.

Theorem 37.14. The set of true sentences of arithmetic is not definable in
arithmetic.

Proof. Suppose θ(x) defined it, i.e., N ⊨ φ iff N ⊨ θ(⌜φ⌝). By the fixed-point
lemma, there is a formula φ such that Q ⊢ φ ↔ ¬θ(⌜φ⌝), and hence
N ⊨ φ ↔ ¬θ(⌜φ⌝). But then N ⊨ φ if and only if N ⊨ ¬θ(⌜φ⌝), which
contradicts the fact that θ(x) is supposed to define the set of true sentences
of arithmetic.

Tarski applied this analysis to a more general philosophical notion of truth.
Given any language L, Tarski argued that an adequate notion of truth for L
would have to satisfy, for each sentence X,
would have to satisfy, for each sentence X,

‘X’ is true if and only if X.

Tarski’s oft-quoted example, for English, is the sentence

‘Snow is white’ is true if and only if snow is white.


However, for any language strong enough to represent the diagonal function,
and any linguistic predicate T (x), we can construct a sentence X satisfying
“X if and only if not T (‘X’).” Given that we do not want a truth predicate
to declare some sentences to be both true and false, Tarski concluded that
one cannot specify a truth predicate for all sentences in a language without,
somehow, stepping outside the bounds of the language. In other words, the
truth predicate for a language cannot be defined in the language itself.



Part VIII

Second-order Logic
This is the beginning of a part on second-order logic.

Chapter 38

Syntax and Semantics

Basic syntax and semantics for SOL covered so far. As a chapter it’s
too short. Substitution for second-order variables has to be covered to be
able to talk about derivation systems for SOL, and there are some subtle
issues there.

content/second-order-logic/syntax-and-semantics/introduction.tex

38.1 Introduction
In first-order logic, we combine the non-logical symbols of a given language, i.e.,
its constant symbols, function symbols, and predicate symbols, with the logical
symbols to express things about first-order structures. This is done using the
notion of satisfaction, which relates a structure M, together with a variable
assignment s, and a formula φ: M, s ⊨ φ holds iff what φ expresses when its
constant symbols, function symbols, and predicate symbols are interpreted as
M says, and its free variables are interpreted as s says, is true. The interpre-
tation of the identity predicate = is built into the definition of M, s ⊨ φ, as is
the interpretation of ∀ and ∃. The former is always interpreted as the identity
relation on the domain |M| of the structure, and the quantifiers are always
interpreted as ranging over the entire domain. But, crucially, quantification


is only allowed over elements of the domain, and so only object variables are
allowed to follow a quantifier.
In second-order logic, both the language and the definition of satisfac-
tion are extended to include free and bound function and predicate variables,
and quantification over them. These variables are related to function sym-
bols and predicate symbols the same way that object variables are related
to constant symbols. They play the same role in the formation of terms
and formulas of second-order logic, and quantification over them is handled
in a similar way. In the standard semantics, the second-order quantifiers
range over all possible objects of the right type (n-place functions from |M|
to |M| for function variables, n-place relations for predicate variables). For
instance, while ∀v₀ (P₀¹(v₀) ∨ ¬P₀¹(v₀)) is a formula in both first- and second-order
logic, in the latter we can also consider ∀V₀¹ ∀v₀ (V₀¹(v₀) ∨ ¬V₀¹(v₀)) and
∃V₀¹ ∀v₀ (V₀¹(v₀) ∨ ¬V₀¹(v₀)). Since these contain no free variables, they are
sentences of second-order logic. Here, V₀¹ is a second-order 1-place predicate
variable. The allowable interpretations of V₀¹ are the same that we can assign
to a 1-place predicate symbol like P₀¹, i.e., subsets of |M|. Quantification over
them then amounts to saying that ∀v₀ (V₀¹(v₀) ∨ ¬V₀¹(v₀)) holds for all ways
of assigning a subset of |M| as the value of V₀¹, or for at least one. Since every
set either contains or fails to contain a given object, both are true in any
structure.

content/second-order-logic/syntax-and-semantics/terms-formulas.tex

38.2 Terms and Formulas


Like in first-order logic, expressions of second-order logic are built up from a
basic vocabulary containing variables, constant symbols, predicate symbols and
sometimes function symbols. From them, together with logical connectives,
quantifiers, and punctuation symbols such as parentheses and commas, terms
and formulas are formed. The difference is that in addition to variables for
objects, second-order logic also contains variables for relations and functions,
and allows quantification over them. So the logical symbols of second-order
logic are those of first-order logic, plus:

1. A denumerable set of second-order relation variables of every arity n: V₀ⁿ, V₁ⁿ, V₂ⁿ, . . .

2. A denumerable set of second-order function variables: u₀ⁿ, u₁ⁿ, u₂ⁿ, . . .

Just as we use x, y, z as metavariables for first-order variables vᵢ, we’ll use
X, Y , Z, etc., as metavariables for Vᵢⁿ, and u, v, etc., as metavariables for uᵢⁿ.
The non-logical symbols of a second-order language are specified the same

way a first-order language is: by listing its constant symbols, function symbols,
and predicate symbols.


In first-order logic, the identity predicate = is usually included. In first-order
logic, the non-logical symbols of a language L are crucial to allow us to
express anything interesting. There are of course sentences that use no non-
logical symbols, but with only = it is hard to say anything interesting. In
second-order logic, since we have an unlimited supply of relation and function
variables, we can say anything we can say in a first-order language even without
a special supply of non-logical symbols.

Definition 38.1 (Second-order Terms). The set of second-order terms of L,
Trm²(L), is defined by adding to Definition 15.4 the clause

1. If u is an n-place function variable and t1 , . . . , tn are terms, then u(t1 , . . . , tn )
is a term.

So, a second-order term looks just like a first-order term, except that where
a first-order term contains a function symbol fᵢⁿ, a second-order term may
contain a function variable uᵢⁿ in its place.

Definition 38.2 (Second-order formula). The set of second-order formulas
Frm²(L) of the language L is defined by adding to Definition 15.4 the
clauses

1. If X is an n-place predicate variable and t1 , . . . , tn are second-order
terms of L, then X(t1 , . . . , tn ) is an atomic formula.

2. If φ is a formula and u is a function variable, then ∀u φ is a formula.

3. If φ is a formula and X is a predicate variable, then ∀X φ is a formula.

4. If φ is a formula and u is a function variable, then ∃u φ is a formula.

5. If φ is a formula and X is a predicate variable, then ∃X φ is a formula.

content/second-order-logic/syntax-and-semantics/satisfaction.tex

38.3 Satisfaction
To define the satisfaction relation M, s ⊨ φ for second-order formulas, we have
to extend the definitions to cover second-order variables. The notion of a struc-
ture is the same for second-order logic as it is for first-order logic. There is
only a difference for variable assignments s: these now must not just provide
values for the first-order variables, but also for the second-order variables.

Definition 38.3 (Variable Assignment). A variable assignment s for a structure
M is a function which maps each

1. object variable vi to an element of |M|, i.e., s(vi ) ∈ |M|


2. n-place relation variable Vᵢⁿ to an n-place relation on |M|, i.e., s(Vᵢⁿ) ⊆ |M|ⁿ;

3. n-place function variable uᵢⁿ to an n-place function from |M| to |M|, i.e.,
s(uᵢⁿ) : |M|ⁿ → |M|.

A structure assigns a value to each constant symbol and function symbol,
and a second-order variable assignment assigns objects and functions to each
object and function variable. Together, they let us assign a value to every
term.
Definition 38.4 (Value of a Term). If t is a term of the language L, M is
a structure for L, and s is a variable assignment for M, the value Valₛᴹ(t) is
defined as for first-order terms, plus the following clause: t ≡ u(t1 , . . . , tn ):

Valₛᴹ(t) = s(u)(Valₛᴹ(t1 ), . . . , Valₛᴹ(tn )).

Definition 38.5 (x-Variant). If s is a variable assignment for a structure M,
then any variable assignment s′ for M which differs from s at most in what
it assigns to x is called an x-variant of s. If s′ is an x-variant of s we write
s′ ∼ₓ s. (Similarly for second-order variables X or u.)

Definition 38.6. If s is a variable assignment for a structure M and m ∈ |M|,
then the assignment s[m/x] is the variable assignment defined by

s[m/x](y) = m if y ≡ x, and s[m/x](y) = s(y) otherwise.

If X is an n-place relation variable and M ⊆ |M|ⁿ, then s[M/X] is the variable
assignment defined by

s[M/X](y) = M if y ≡ X, and s[M/X](y) = s(y) otherwise.

If u is an n-place function variable and f : |M|ⁿ → |M|, then s[f /u] is the
variable assignment defined by

s[f /u](y) = f if y ≡ u, and s[f /u](y) = s(y) otherwise.

In each case, y may be any first- or second-order variable.

Definition 38.7 (Satisfaction). For second-order formulas φ, the definition
of satisfaction is like Definition 16.11 with the addition of:

1. φ ≡ Xⁿ(t1 , . . . , tn ): M, s ⊨ φ iff ⟨Valₛᴹ(t1 ), . . . , Valₛᴹ(tn )⟩ ∈ s(Xⁿ).


2. φ ≡ ∀X ψ: M, s ⊨ φ iff for every M ⊆ |M|ⁿ, M, s[M/X] ⊨ ψ.

3. φ ≡ ∃X ψ: M, s ⊨ φ iff for at least one M ⊆ |M|ⁿ, M, s[M/X] ⊨ ψ.

4. φ ≡ ∀u ψ: M, s ⊨ φ iff for every f : |M|ⁿ → |M|, M, s[f /u] ⊨ ψ.

5. φ ≡ ∃u ψ: M, s ⊨ φ iff for at least one f : |M|ⁿ → |M|, M, s[f /u] ⊨ ψ.

Example 38.8. Consider the formula ∀z (X(z) ↔ ¬Y (z)). It contains no
second-order quantifiers, but does contain the second-order variables X and Y
(here understood to be one-place). The corresponding first-order sentence
∀z (P (z) ↔ ¬R(z)) says that whatever falls under the interpretation of P does
not fall under the interpretation of R and vice versa. In a structure, the inter-
pretation of a predicate symbol P is given by the interpretation P M . But for
second-order variables like X and Y , the interpretation is provided, not by the
structure itself, but by a variable assignment. Since the second-order formula
is not a sentence (it includes free variables X and Y ), it is only satisfied relative
to a structure M together with a variable assignment s.
M, s ⊨ ∀z (X(z) ↔ ¬Y (z)) whenever the elements of s(X) are not elements
of s(Y ), and vice versa, i.e., iff s(Y ) = |M| \ s(X). For instance, take |M| =
{1, 2, 3}. Since no predicate symbols, function symbols, or constant symbols
are involved, the domain of M is all that is relevant. Now for s1 (X) = {1, 2}
and s1 (Y ) = {3}, we have M, s1 ⊨ ∀z (X(z) ↔ ¬Y (z)).
By contrast, if we have s2 (X) = {1, 2} and s2 (Y ) = {2, 3}, M, s2 ⊭
∀z (X(z) ↔ ¬Y (z)). That’s because M, s2 [2/z] ⊨ X(z) (since 2 ∈ s2 [2/z](X))
but M, s2 [2/z] ⊭ ¬Y (z) (since also 2 ∈ s2 [2/z](Y )).

Example 38.9. M, s ⊨ ∃Y (∃y Y (y) ∧ ∀z (X(z) ↔ ¬Y (z))) if there is an N ⊆
|M| such that M, s[N/Y ] ⊨ (∃y Y (y) ∧ ∀z (X(z) ↔ ¬Y (z))). And that is the case
for any N ̸= ∅ (so that M, s[N/Y ] ⊨ ∃y Y (y)) with, as in the previous example,
N = |M| \ s(X). In other words, M, s ⊨ ∃Y (∃y Y (y) ∧ ∀z (X(z) ↔ ¬Y (z))) iff
|M| \ s(X) is non-empty, i.e., s(X) ̸= |M|. So, the formula is satisfied, e.g., if
|M| = {1, 2, 3} and s(X) = {1, 2}, but not if s(X) = {1, 2, 3} = |M|.
Since the formula is not satisfied whenever s(X) = |M|, the sentence

∀X ∃Y (∃y Y (y) ∧ ∀z (X(z) ↔ ¬Y (z)))

is never satisfied: For any structure M, the assignment s(X) = |M| will make
the sentence false. On the other hand, the sentence

∃X ∃Y (∃y Y (y) ∧ ∀z (X(z) ↔ ¬Y (z)))

is satisfied relative to any assignment s, since we can always find M ⊆ |M| but
M ̸= |M| (e.g., M = ∅).
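Since the second-order quantifiers range over all subsets of the domain, satisfaction in a finite structure can be checked by brute force. Here is a minimal Python sketch verifying the two examples above for |M| = {1, 2, 3} (the helper names are ours, purely for illustration):

from itertools import chain, combinations

dom = {1, 2, 3}

def subsets(s):
    """All subsets of s, as Python sets."""
    s = sorted(s)
    return [set(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

def sat_matrix(X, Y):
    """M, s ⊨ ∀z (X(z) ↔ ¬Y(z)) with s(X) = X and s(Y) = Y."""
    return all((z in X) == (z not in Y) for z in dom)

# Example 38.8: s1 satisfies the formula, s2 does not.
assert sat_matrix({1, 2}, {3})
assert not sat_matrix({1, 2}, {2, 3})

# Example 38.9: ∃Y (∃y Y(y) ∧ ∀z (X(z) ↔ ¬Y(z))) holds iff s(X) ≠ dom.
def sat_exists_Y(X):
    return any(Y and sat_matrix(X, Y) for Y in subsets(dom))

assert sat_exists_Y({1, 2})
assert not sat_exists_Y({1, 2, 3})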


Example 38.10. The second-order sentence ∀X ∀y X(y) says that every 1-place
relation, i.e., every property, holds of every object. That is clearly never
true, since in every M, for a variable assignment s with s(X) = ∅, and s(y) =
a ∈ |M| we have M, s ⊭ X(y). This means that φ → ∀X ∀y X(y) is equivalent
in second-order logic to ¬φ, that is: M ⊨ φ → ∀X ∀y X(y) iff M ⊨ ¬φ. In other
words, in second-order logic we can define ¬ using ∀ and →.

Problem 38.1. Show that in second-order logic ∀ and → can define the other
connectives:

1. Prove that in second-order logic φ ∧ ψ is equivalent to ∀X ((φ → (ψ →
∀x X(x))) → ∀x X(x)).

2. Find a second-order formula using only ∀ and → equivalent to φ ∨ ψ.

content/second-order-logic/syntax-and-semantics/semantic-notions.tex

38.4 Semantic Notions


The central logical notions of validity, entailment, and satisfiability are defined
the same way for second-order logic as they are for first-order logic, except that
the underlying satisfaction relation is now that for second-order formulas. A
second-order sentence, of course, is a formula in which all variables, including
predicate and function variables, are bound.

Definition 38.11 (Validity). A sentence φ is valid, ⊨ φ, iff M ⊨ φ for every
structure M.

Definition 38.12 (Entailment). A set of sentences Γ entails a sentence φ,
Γ ⊨ φ, iff for every structure M with M ⊨ Γ , M ⊨ φ.

Definition 38.13 (Satisfiability). A set of sentences Γ is satisfiable if M ⊨
Γ for some structure M. If Γ is not satisfiable it is called unsatisfiable.

content/second-order-logic/syntax-and-semantics/expressive-power.tex

38.5 Expressive Power


Quantification over second-order variables is responsible for an immense in-
crease in the expressive power of the language over that of first-order logic.
Second-order existential quantification lets us say that functions or relations
with certain properties exist. In first-order logic, the only way to do that is to
specify a non-logical symbol (i.e., a function symbol or predicate symbol) for
this purpose. Second-order universal quantification lets us say that all subsets
of, relations on, or functions from the domain to the domain have a property.


In first-order logic, we can only say that the subsets, relations, or functions
assigned to one of the non-logical symbols of the language have a property.
And when we say that subsets, relations, functions exist that have a property,
or that all of them have it, we can use second-order quantification in specifying
this property as well. This lets us define relations not definable in first-order
logic, and express properties of the domain not expressible in first-order logic.
Definition 38.14. If M is a structure for a language L, a relation R ⊆ |M|² is
definable in L if there is some formula φR (x, y) with only the variables x and y
free, such that R(a, b) holds (i.e., ⟨a, b⟩ ∈ R) iff M, s ⊨ φR (x, y) for s(x) = a
and s(y) = b.
Example 38.15. In first-order logic we can define the identity relation Id|M|
(i.e., {⟨a, a⟩ : a ∈ |M|}) by the formula x = y. In second-order logic, we can
define this relation without =. For if a and b are the same element of |M|, then
they are elements of the same subsets of |M| (since sets are determined by their
elements). Conversely, if a and b are different, then they are not elements of
the same subsets: e.g., a ∈ {a} but b ∉ {a} if a ̸= b. So “being elements of
the same subsets of |M|” is a relation that holds of a and b iff a = b. It is a
relation that can be expressed in second-order logic, since we can quantify over
all subsets of |M|. Hence, the following formula defines Id|M| :
∀X (X(x) ↔ X(y))
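On a finite domain, this characterization of identity can be checked by brute force; a minimal Python sketch (the helper name is ours):

from itertools import chain, combinations

dom = {1, 2, 3}

def subsets(s):
    s = sorted(s)
    return [set(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

# a = b iff a and b are elements of exactly the same subsets of the domain
for a in dom:
    for b in dom:
        same = all((a in X) == (b in X) for X in subsets(dom))
        assert same == (a == b)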
Problem 38.2. Show that ∀X (X(x)→X(y)) (note: → not ↔!) defines Id|M| .

Example 38.16. If R is a two-place predicate symbol, Rᴹ is a two-place
relation on |M|. Perhaps somewhat confusingly, we’ll use R both for the predicate
symbol and for the relation Rᴹ itself. The transitive closure R∗ of R
is the relation that holds between a and b iff for some c1 , . . . , ck , R(a, c1 ),
R(c1 , c2 ), . . . , R(ck , b) holds. This includes the case if k = 0, i.e., if R(a, b)
holds, so does R∗ (a, b). This means that R ⊆ R∗ . In fact, R∗ is the smallest
relation that includes R and that is transitive. We can say in second-order
logic that X is a transitive relation that includes R:

ψR (X) ≡ ∀x ∀y (R(x, y) → X(x, y)) ∧ ∀x ∀y ∀z ((X(x, y) ∧ X(y, z)) → X(x, z)).
The first conjunct says that R ⊆ X and the second that X is transitive.
To say that X is the smallest such relation is to say that it is itself included in
every relation that includes R and is transitive. So we can define the transitive
closure of R by the formula
R∗ (X) ≡ ψR (X) ∧ ∀Y (ψR (Y ) → ∀x ∀y (X(x, y) → Y (x, y))).
We have M, s ⊨ R∗ (X) iff s(X) = R∗ . The transitive closure of R cannot be
expressed in first-order logic.
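For a finite structure, both characterizations of R∗ can be computed and compared by brute force. Here is a minimal Python sketch (function names are ours): the first function iterates compositions until a fixed point, the second mirrors the second-order definition by intersecting all transitive relations on the domain that include R.

from itertools import chain, combinations

def closure_iterative(R):
    """Add compositions until nothing new appears."""
    X = set(R)
    while True:
        new = {(a, c) for (a, b) in X for (b2, c) in X if b == b2}
        if new <= X:
            return X
        X |= new

def closure_second_order(R, dom):
    """Intersect every transitive relation on dom that includes R."""
    pairs = [(a, b) for a in dom for b in dom]
    result = set(pairs)
    for Xt in chain.from_iterable(
            combinations(pairs, r) for r in range(len(pairs) + 1)):
        X = set(Xt)
        transitive = all((a, c) in X
                         for (a, b) in X for (b2, c) in X if b == b2)
        if set(R) <= X and transitive:
            result &= X
    return result

dom = {1, 2, 3}
R = {(1, 2), (2, 3)}
assert closure_iterative(R) == closure_second_order(R, dom) == {(1, 2), (2, 3), (1, 3)}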

content/second-order-logic/syntax-and-semantics/inf-count.tex


38.6 Describing Infinite and Enumerable Domains


A set M is (Dedekind) infinite iff there is an injective function f : M → M
which is not surjective, i.e., with ran(f ) ̸= M . In first-order logic, we can
consider a one-place function symbol f and say that the function f M assigned
to it in a structure M is injective and ran(f ) ̸= |M|:

∀x ∀y (f (x) = f (y) → x = y) ∧ ∃y ∀x y ̸= f (x).

If M satisfies this sentence, f ᴹ : |M| → |M| is injective, and so |M| must be
infinite. If |M| is infinite, and hence such a function exists, we can let f ᴹ be
that function and M will satisfy the sentence. However, this requires that our
language contains the non-logical symbol f we use for this purpose. In second-order
logic, we can simply say that such a function exists. This no longer
requires f , and we obtain the sentence in pure second-order logic

Inf ≡ ∃u (∀x ∀y (u(x) = u(y) → x = y) ∧ ∃y ∀x y ̸= u(x)).

M ⊨ Inf iff |M| is infinite. We can then define Fin ≡ ¬Inf; M ⊨ Fin iff |M| is
finite. No single sentence of pure first-order logic can express that the domain
is infinite although an infinite set of them can. There is no set of sentences of
pure first-order logic that is satisfied in a structure iff its domain is finite.

Proposition 38.17. M ⊨ Inf iff |M| is infinite.

Proof. M ⊨ Inf iff M, s ⊨ ∀x ∀y (u(x) = u(y) → x = y) ∧ ∃y ∀x y ̸= u(x) for
some s. If it does, s(u) is an injective function, and some y ∈ |M| is not in
the range of s(u). Conversely, if there is an injective f : |M| → |M| with
ran(f ) ̸= |M|, then s(u) = f is such a variable assignment.

A set M is enumerable if there is an enumeration

m0 , m1 , m2 , . . .

of its elements (without repetitions but possibly finite). Such an enumeration


exists iff there is an element z ∈ M and a function f : M → M such that z,
f (z), f (f (z)), . . . , are all the elements of M . For if the enumeration exists,
z = m0 and f (mk ) = mk+1 (or f (mk ) = mk if mk is the last element of the
enumeration) are the requisite element and function. On the other hand, if
such a z and f exist, then z, f (z), f (f (z)), . . . , is an enumeration of M , and
M is enumerable. We can express the existence of z and f in second-order
logic to produce a sentence true in a structure iff the structure is enumerable:

Count ≡ ∃z ∃u ∀X ((X(z) ∧ ∀x (X(x) → X(u(x)))) → ∀x X(x))

Proposition 38.18. M ⊨ Count iff |M| is enumerable.



Proof. Suppose |M| is enumerable, and let m0 , m1 , . . . , be an enumeration.
By removing repetitions we can guarantee that no mk appears twice. Define
f (mk ) = mk+1 (and f (mk ) = mk if mk is the last element) and let s(z) = m0 and s(u) = f . We show that

M, s ⊨ ∀X ((X(z) ∧ ∀x (X(x) → X(u(x)))) → ∀x X(x))

Suppose M ⊆ |M| is arbitrary. Suppose further that M, s[M/X] ⊨ (X(z) ∧
∀x (X(x) → X(u(x)))). Then s[M/X](z) ∈ M and whenever x ∈ M , also
(s[M/X](u))(x) ∈ M . In other words, since s[M/X] ∼X s, m0 ∈ M and if
x ∈ M then f (x) ∈ M , so m0 ∈ M , m1 = f (m0 ) ∈ M , m2 = f (f (m0 )) ∈ M ,
etc. Thus, M = |M|, and so M, s[M/X] ⊨ ∀x X(x). Since M ⊆ |M| was
arbitrary, we are done: M ⊨ Count.
Now assume that M ⊨ Count, i.e.,

M, s ⊨ ∀X ((X(z) ∧ ∀x (X(x) → X(u(x)))) → ∀x X(x))

for some s. Let m = s(z) and f = s(u) and consider M = {m, f (m), f (f (m)), . . . }.
M so defined is clearly enumerable. Then

M, s[M/X] ⊨ (X(z) ∧ ∀x (X(x) → X(u(x)))) → ∀x X(x)

by assumption. Also, M, s[M/X] ⊨ X(z) since M ∋ m = s[M/X](z), and also
M, s[M/X] ⊨ ∀x (X(x) → X(u(x))) since whenever x ∈ M also f (x) ∈ M .
So, since both antecedent and conditional are satisfied, the consequent must
also be: M, s[M/X] ⊨ ∀x X(x). But that means that M = |M|, and so |M| is
enumerable since M is, by definition.
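On a finite domain, M ⊨ Count can likewise be checked by brute force: search for a z and a function u such that the only subset of the domain containing z and closed under u is the whole domain. A minimal Python sketch (assuming a finite domain; the names are ours):

from itertools import chain, combinations, product

def satisfies_count(dom):
    """Brute-force check of M ⊨ Count on a finite domain."""
    elems = sorted(dom)
    all_subsets = list(chain.from_iterable(
        combinations(elems, r) for r in range(len(elems) + 1)))
    for z in elems:
        for values in product(elems, repeat=len(elems)):
            u = dict(zip(elems, values))  # a candidate function u
            if all(set(X) == set(dom)
                   for X in all_subsets
                   if z in X and all(u[x] in X for x in X)):
                return True
    return False

assert satisfies_count({0, 1, 2})  # every finite domain is enumerable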

Problem 38.3. The sentence Inf ∧ Count is true in all and only denumerable
domains. Adjust the definition of Count so that it becomes a different sentence
that directly expresses that the domain is denumerable, and prove that it does.

Chapter 39

Metatheory of Second-order
Logic

content/second-order-logic/metatheory/introduction.tex


39.1 Introduction
First-order logic has a number of nice properties. We know it is not decidable,
but at least it is axiomatizable. That is, there are proof systems for first-
order logic which are sound and complete, i.e., they give rise to a derivability
relation ⊢ with the property that for any set of sentences Γ and sentence φ,
Γ ⊨ φ iff Γ ⊢ φ. This means in particular that the validities of first-order logic
are computably enumerable. There is a computable function f : N → Sent(L)
such that the values of f are all and only the valid sentences of L. This is so
because derivations can be enumerated, and those that derive a single sentence
are then mapped to that sentence. Second-order logic is more expressive than
first-order logic, and so it is in general more complicated to capture its validities.
In fact, we’ll show that second-order logic is not only undecidable, but its
validities are not even computably enumerable. This means there can be no
sound and complete proof system for second-order logic (although sound, but
incomplete proof systems are available and in fact are important objects of
research).
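Here is a minimal Python sketch of that enumeration, assuming two hypothetical helpers standing in for a real arithmetization: decode(n) returns the n-th derivation (or None if n codes no derivation), and conclusion(d) returns the sentence derived by d.

from itertools import count, islice

def enumerate_derivable(decode, conclusion):
    """Yield all derivable sentences by running through all codes."""
    for n in count():
        d = decode(n)
        if d is not None:
            yield conclusion(d)

# Toy usage: pretend every even n codes a derivation of "sentence_n".
demo = enumerate_derivable(
    decode=lambda n: n if n % 2 == 0 else None,
    conclusion=lambda d: f"sentence_{d}")
print(list(islice(demo, 3)))  # ['sentence_0', 'sentence_2', 'sentence_4']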
First-order logic also has two more properties: it is compact (if every finite
subset of a set Γ of sentences is satisfiable, Γ itself is satisfiable) and the
Löwenheim–Skolem Theorem holds for it (if Γ has an infinite model it has
a denumerable model). Both of these results fail for second-order logic. Again,
the reason is that second-order logic can express facts about the size of domains
that first-order logic cannot.

content/second-order-logic/metatheory/second-order-arithmetic.tex

39.2 Second-order Arithmetic


Recall that the theory PA of Peano arithmetic includes the eight axioms of Q,

∀x x′ ̸= 0
∀x ∀y (x′ = y ′ → x = y)
∀x (x = 0 ∨ ∃y x = y ′ )
∀x (x + 0) = x
∀x ∀y (x + y ′ ) = (x + y)′
∀x (x × 0) = 0
∀x ∀y (x × y ′ ) = ((x × y) + x)
∀x ∀y (x < y ↔ ∃z (z ′ + x) = y)

plus all sentences of the form

(φ(0) ∧ ∀x (φ(x) → φ(x′ ))) → ∀x φ(x).


The latter is a “schema,” i.e., a pattern that generates infinitely many sen-
tences of the language of arithmetic, one for each formula φ(x). We call this


schema the (first-order) axiom schema of induction. In second-order Peano


arithmetic PA2 , induction can be stated as a single sentence. PA2 consists of
the first eight axioms above plus the (second-order) induction axiom:

∀X ((X(0) ∧ ∀x (X(x) → X(x′ ))) → ∀x X(x)).

It says that if a subset X of the domain contains 0^M and with any x ∈ |M|
also contains ′^M (x) (i.e., it is “closed under successor”) it contains everything
in the domain (i.e., X = |M|).
The induction axiom guarantees that any structure satisfying it contains
only those elements of |M| the axioms require to be there, i.e., the values
Val^M (n) of the numerals n for n ∈ N. A model of PA2 contains no non-standard numbers.

Theorem 39.1. If M ⊨ PA2 then |M| = {Val^M (n) : n ∈ N}.

Proof. Let N = {Val^M (n) : n ∈ N}, and suppose M ⊨ PA2 . Of course, for any
n ∈ N, Val^M (n) ∈ |M|, so N ⊆ |M|.
Now for inclusion in the other direction. Consider a variable assignment s
with s(X) = N . By assumption,

M ⊨ ∀X ((X(0) ∧ ∀x (X(x) → X(x′ ))) → ∀x X(x)), thus

M, s ⊨ (X(0) ∧ ∀x (X(x) → X(x′ ))) → ∀x X(x).

Consider the antecedent of this conditional. Val^M (0) ∈ N , and so M, s ⊨
X(0). The second conjunct, ∀x (X(x) → X(x′ )) is also satisfied. For suppose
x ∈ N . By definition of N , x = Val^M (n) for some n. That gives ′^M (x) =
Val^M (n + 1) ∈ N . So, ′^M (x) ∈ N .
We have that M, s ⊨ X(0) ∧ ∀x (X(x) → X(x′ )). Consequently, M, s ⊨
∀x X(x). But that means that for every x ∈ |M| we have x ∈ s(X) = N . So,
|M| ⊆ N .

Corollary 39.2. Any two models of PA2 are isomorphic.

Proof. By Theorem 39.1, the domain of any model of PA2 is exhausted by
the values Val^M (n) of the numerals. Any such model is also a model of Q.
By Proposition 26.3, any such model is standard, i.e., isomorphic to N.

Above we defined PA2 as the theory that contains the first eight arith-
metical axioms plus the second-order induction axiom. In fact, thanks to the
expressive power of second-order logic, only the first two of the arithmetical
axioms plus induction are needed for second-order Peano arithmetic.

Proposition 39.3. Let PA2† be the second-order theory containing the first
two arithmetical axioms (the successor axioms) and the second-order induction
axiom. Then ≤, +, and × are definable in PA2† .


Proof. To show that ≤ is definable, we have to find a formula φ≤ (x, y) such


that N ⊨ φ≤ (n, m) iff n ≤ m. Consider the formula

ψ(x, Y ) ≡ Y (x) ∧ ∀y (Y (y) → Y (y ′ ))

Clearly, ψ(n, Y ) is satisfied by a set Y ⊆ N iff {m : n ≤ m} ⊆ Y , so we can


take φ≤ (x, y) ≡ ∀Y (ψ(x, Y ) → Y (y)).
To see that addition is definable observe that k+l = m iff there is a function
u such that u(0) = k, u(n′ ) = u(n)′ for all n, and m = u(l). We can use this
equivalence to define addition in PA2† by the following formula:

φ+ (x, y, z) ≡ ∃u (u(0) = x ∧ ∀w u(w′ ) = u(w)′ ∧ u(y) = z)

It should be clear that N ⊨ φ+ (k, l, m) iff k + l = m.

Problem 39.1. Complete the proof of Proposition 39.3.


39.3 Second-order Logic is not Axiomatizable


Theorem 39.4. Second-order logic is undecidable.

Proof. A first-order sentence is valid in first-order logic iff it is valid in second-


order logic, and first-order logic is undecidable.

Theorem 39.5. There is no sound and complete derivation system for second-order logic.

Proof. Let φ be a sentence in the language of arithmetic. N ⊨ φ iff PA2 ⊨ φ.
Let P be the conjunction of the nine axioms of PA2 . PA2 ⊨ φ iff ⊨ P → φ, i.e.,
M ⊨ P → φ for every structure M. Now consider the sentence ∀z ∀u ∀u′ ∀u′′ ∀L (P ′ → φ′ ) resulting
by replacing 0 by z, ′ by the one-place function variable u, + and × by the
two-place function variables u′ and u′′ , respectively, and < by the two-place
relation variable L, and universally quantifying. It is a valid sentence of pure
second-order logic iff the original sentence was valid iff PA2 ⊨ φ iff N ⊨ φ.
Thus if there were a sound and complete proof system for second-order logic,
we could use it to define a computable enumeration f : N → Sent(LA ) of the
sentences true in N. This function would be representable in Q by some first-
order formula ψf (x, y). Then the formula ∃x ψf (x, y) would define the set of
true first-order sentences of N, contradicting Tarski’s Theorem.


39.4 Second-order Logic is not Compact


Call a set of sentences Γ finitely satisfiable if every one of its finite subsets is
satisfiable. First-order logic has the property that if a set of sentences Γ is
finitely satisfiable, it is satisfiable. This property is called compactness. It has
an equivalent version involving entailment: if Γ ⊨ φ, then already Γ0 ⊨ φ for
some finite subset Γ0 ⊆ Γ . In this version it is an immediate corollary of the
completeness theorem: for if Γ ⊨ φ, by completeness Γ ⊢ φ. But a derivation
can only make use of finitely many sentences of Γ .
Compactness is not true for second-order logic. There are sets of second-
order sentences that are finitely satisfiable but not satisfiable, and that entail
some φ without a finite subset entailing φ.
Theorem 39.6. Second-order logic is not compact.

Proof. Recall that


Inf ≡ ∃u (∀x ∀y (u(x) = u(y) → x = y) ∧ ∃y ∀x y ≠ u(x))
is satisfied in a structure iff its domain is infinite. Let φ≥n be a sentence that
asserts that the domain has at least n elements, e.g.,
φ≥n ≡ ∃x1 . . . ∃xn (x1 ≠ x2 ∧ x1 ≠ x3 ∧ · · · ∧ xn−1 ≠ xn ).
Consider the set of sentences
Γ = {¬Inf, φ≥1 , φ≥2 , φ≥3 , . . . }.
It is finitely satisfiable, since for any finite subset Γ0 ⊆ Γ there is some k so
that φ≥k ∈ Γ0 but no φ≥n ∈ Γ0 for n > k. If |M| has k elements, M ⊨ Γ0 .
But, Γ is not satisfiable: if M ⊨ ¬Inf, |M| must be finite, say, of size k. Then
M ⊭ φ≥k+1 .

Problem 39.2. Give an example of a set Γ and a sentence φ so that Γ ⊨ φ


but for every finite subset Γ0 ⊆ Γ , Γ0 ⊭ φ.


39.5 The Löwenheim–Skolem Theorem Fails for


Second-order Logic
The (Downward) Löwenheim–Skolem Theorem states that every set of sen-
tences with an infinite model has an enumerable model. It, too, is a con-
sequence of the completeness theorem: the proof of completeness generates a
model for any consistent set of sentences, and that model is enumerable. There
is also an Upward Löwenheim–Skolem Theorem, which guarantees that if a set
of sentences has a denumerable model it also has a non-enumerable model.
Both theorems fail in second-order logic.



Theorem 39.7. The Löwenheim–Skolem Theorem fails for second-order logic:
There are sentences with infinite models but no enumerable models.

Proof. Recall that

Count ≡ ∃z ∃u ∀X ((X(z) ∧ ∀x (X(x) → X(u(x)))) → ∀x X(x))

is true in a structure M iff |M| is enumerable, so ¬Count is true in M iff |M| is


non-enumerable. There are such structures—take any non-enumerable set as
the domain, e.g., ℘(N) or ℝ. So ¬Count has infinite models but no enumerable
models.

Theorem 39.8. There are sentences with denumerable but no non-enumerable


models.

Proof. Count ∧ Inf is true in N but not in any structure M with |M| non-
enumerable.

Chapter 40

Second-order Logic and Set


Theory

This section deals with coding powersets and the continuum in second-
order logic. The results are stated but proofs have yet to be filled in. There
are no problems yet—and the definitions and results themselves may have
problems. Use with caution and report anything that’s false or unclear.


40.1 Introduction


Since second-order logic can quantify over subsets of the domain as well as
functions, it is to be expected that some amount, at least, of set theory can be
carried out in second-order logic. By “carry out,” we mean that it is possible
to express set-theoretic properties and statements in second-order logic, and
that this is possible without any special, non-logical vocabulary for sets (e.g., the member-
ship predicate symbol of set theory). For instance, we can define unions and
intersections of sets and the subset relationship, but also compare the sizes of
sets, and state results such as Cantor’s Theorem.


40.2 Comparing Sets


Proposition 40.1. The formula ∀x (X(x)→Y (x)) defines the subset relation,
i.e., M, s ⊨ ∀x (X(x) → Y (x)) iff s(X) ⊆ s(Y ).

Proposition 40.2. The formula ∀x (X(x)↔Y (x)) defines the identity relation
on sets, i.e., M, s ⊨ ∀x (X(x) ↔ Y (x)) iff s(X) = s(Y ).

Proposition 40.3. The formula ∃x X(x) defines the property of being non-
empty, i.e., M, s ⊨ ∃x X(x) iff s(X) ≠ ∅.

A set X is no larger than a set Y , X ⪯ Y , iff there is an injective function


f : X → Y . Since we can express that a function is injective, and also that its
values for arguments in X are in Y , we can also define the relation of being no
larger than on subsets of the domain.

Proposition 40.4. The formula

∃u (∀x (X(x) → Y (u(x))) ∧ ∀x ∀y (u(x) = u(y) → x = y))

defines the relation of being no larger than.

Two sets are the same size, or “equinumerous,” X ≈ Y , iff there is a bijec-
tive function f : X → Y .

Proposition 40.5. The formula

∃u (∀x (X(x) → Y (u(x))) ∧


∀x ∀y (u(x) = u(y) → x = y) ∧
∀y (Y (y) → ∃x (X(x) ∧ y = u(x))))

defines the relation of being equinumerous with.


We will abbreviate these formulas, respectively, as X ⊆ Y , X = Y , X ≠ ∅,


X ⪯ Y , and X ≈ Y . (This may be slightly confusing, since we use the same
notation when we speak informally about sets X and Y —but here the notation
is an abbreviation for formulas in second-order logic involving one-place relation
variables X and Y .)

Proposition 40.6. The sentence ∀X ∀Y ((X ⪯ Y ∧ Y ⪯ X) → X ≈ Y ) is


valid.

Proof. The sentence is satisfied in a structure M if, for any subsets X ⊆ |M|
and Y ⊆ |M|, if X ⪯ Y and Y ⪯ X then X ≈ Y . But this holds for any sets
X and Y —it is the Schröder–Bernstein Theorem.


40.3 Cardinalities of Sets


Just as we can express that the domain is finite or infinite, enumerable or
non-enumerable, we can define the property of a subset of |M| being finite or
infinite, enumerable or non-enumerable.

Proposition 40.7. The formula Inf(X) ≡

∃u (∀x ∀y (u(x) = u(y) → x = y) ∧
∀x (X(x) → X(u(x))) ∧
∃y (X(y) ∧ ∀x (X(x) → y ≠ u(x))))

is satisfied with respect to a variable assignment s iff s(X) is infinite.

Proposition 40.8. The formula Count(X) ≡

∃z ∃u (X(z) ∧ ∀x (X(x) → X(u(x))) ∧
∀Y ((Y (z) ∧ ∀x (Y (x) → Y (u(x)))) → X ⊆ Y ))

is satisfied with respect to a variable assignment s iff s(X) is enumerable.

We know from Cantor’s Theorem that there are non-enumerable sets, and
in fact, that there are infinitely many different levels of infinite sizes. Set
theory develops an entire arithmetic of sizes of sets, and assigns infinite cardinal
numbers to sets. The natural numbers serve as the cardinal numbers measuring
the sizes of finite sets. The cardinality of denumerable sets is the first infinite
cardinality, called ℵ0 (“aleph-nought” or “aleph-zero”). The next infinite size
is ℵ1 . It is the smallest size a set can be without being enumerable (i.e.,
without being finite or of size ℵ0 ). We can define “X has size ℵ0 ” as
Aleph0 (X) ≡ Inf(X) ∧ Count(X). X has size ℵ1 iff it is infinite but not of
size ℵ0 , and all its subsets are either finite, of size ℵ0 , or equinumerous with
X itself. Hence we can express this by the formula Aleph1 (X) ≡ Inf(X) ∧
¬Aleph0 (X) ∧ ∀Y (Y ⊆ X → (¬Inf(Y ) ∨ Aleph0 (Y ) ∨ Y ≈ X)). Being of size ℵ2 is defined similarly,


etc.
There is one size of special interest, the so-called cardinality of the contin-
uum. It is the size of ℘(N), or, equivalently, the size of ℝ. That a set is the
size of the continuum can also be expressed in second-order logic, but requires
a bit more work.


40.4 The Power of the Continuum


In second-order logic we can quantify over subsets of the domain, but not over
sets of subsets of the domain. To do this directly, we would need third-order
logic. For instance, if we wanted to state Cantor’s Theorem that there is no
injective function from the power set of a set to the set itself, we might try
to formulate it as “for every set X, and every set P , if P is the power set
of X, then not P ⪯ X”. And to say that P is the power set of X would
require formalizing that the elements of P are all and only the subsets of X,
so something like ∀Y (P (Y ) ↔ Y ⊆ X). The problem lies in P (Y ): that is not
a formula of second-order logic, since only terms can be arguments to one-place
relation variables like P .
We can, however, simulate quantification over sets of sets, if the domain is
large enough. The idea is to make use of the fact that a two-place relation R
relates elements of the domain to elements of the domain. Given such an R, we
can collect all the elements to which some x is R-related: {y ∈ |M| : R(x, y)}
is the set “coded by” x. Conversely, if Z ⊆ ℘(|M|) is some collection of subsets
of |M|, and there are at least as many elements of |M| as there are sets in Z,
then there is also a relation R ⊆ |M|² such that every Y ∈ Z is coded by
some x using R.

Definition 40.9. If R ⊆ |M|², then x R-codes {y ∈ |M| : R(x, y)}.

Every element x ∈ |M| R-codes a set Z ⊆ |M|, so a set Y ⊆ |M| codes a
set of sets, namely the sets R-coded by the elements of Y . In particular, a set Y
can R-code ℘(X). It does so iff for every Z ⊆ X, some x ∈ Y R-codes Z, and every x ∈ Y
R-codes a Z ⊆ X.
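To make the coding device concrete, here is a minimal Python sketch over a finite domain; the relation R and the helper coded_by are our own illustrations, not part of the official definitions.

```python
# x R-codes the set {y : R(x, y)}; here R is a finite set of pairs.
def coded_by(x, R):
    return {y for (a, y) in R if a == x}

# Let the elements 0, 1, 2, 3 code the power set of X = {0, 1}:
R = {(1, 0), (2, 1), (3, 0), (3, 1)}          # 0 codes the empty set
print([coded_by(x, R) for x in range(4)])     # [set(), {0}, {1}, {0, 1}]
```

Here the four elements jointly R-code ℘({0, 1}), in the sense of Definition 40.9.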
Proposition 40.10. The formula

Codes(x, R, Z) ≡ ∀y (Z(y) ↔ R(x, y))

expresses that s(x) s(R)-codes s(Z). The formula

Pow(Y, R, X) ≡
∀Z (Z ⊆ X → ∃x (Y (x) ∧ Codes(x, R, Z))) ∧
∀x (Y (x) → ∀Z (Codes(x, R, Z) → Z ⊆ X))


expresses that s(Y ) s(R)-codes the power set of s(X), i.e., the elements of s(Y )
s(R)-code exactly the subsets of s(X).

With this trick, we can express statements about the power set by quantify-

ing over the codes of subsets rather than the subsets themselves. For instance,
Cantor’s Theorem can now be expressed by saying that there is no injective
function from the domain of any relation that codes the power set of X to X
itself.

Proposition 40.11. The sentence

∀X ∀Y ∀R (Pow(Y, R, X)→
¬∃u (∀x ∀y (u(x) = u(y) → x = y) ∧
∀x (Y (x) → X(u(x)))))

is valid.

The power set of a denumerable set is non-enumerable, and so its cardinality
is larger than that of any denumerable set (which is ℵ0 ). The size of ℘(N) is
called the “power of the continuum,” since it is the same size as the points on
the real number line, ℝ. If the domain is large enough to code the power set
of a denumerable set, we can express that a set is the size of the continuum
by saying that it is equinumerous with any set Y that codes the power set of
a set X of size ℵ0 . (If the domain is not large enough, i.e., it contains no subset
equinumerous with ℝ, then there can also be no relation that codes ℘(X).)

Proposition 40.12. If ℝ ⪯ |M|, then the formula

Cont(Y ) ≡ ∃X ∃R ((Aleph0 (X) ∧ Pow(Y, R, X)) ∧
∀x ∀y ((Y (x) ∧ Y (y) ∧ ∀z (R(x, z) ↔ R(y, z))) → x = y))

expresses that s(Y ) ≈ ℝ.

Proof. Pow(Y, R, X) expresses that s(Y ) s(R)-codes the power set of s(X),
which Aleph0 (X) says is denumerable. So s(Y ) is at least as large as the power
of the continuum, although it may be larger (if multiple elements of s(Y ) code
the same subset of s(X)). This is ruled out by the last conjunct, which requires
the association between elements of s(Y ) and subsets of s(X) via s(R) to be
injective.

Proposition 40.13. |M| ≈ ℝ iff

M ⊨ ∃X ∃Y ∃R (Aleph0 (X) ∧ Pow(Y, R, X)∧


∃u (∀x ∀y (u(x) = u(y) → x = y) ∧
∀y (Y (y) → ∃x y = u(x)))).


The Continuum Hypothesis is the statement that the size of the continuum
is the first non-enumerable cardinality, i.e., that ℘(N) has size ℵ1 .
Proposition 40.14. The Continuum Hypothesis is true iff

CH ≡ ∀X (Aleph1 (X) ↔ Cont(X))

is valid.

Note that it isn’t true that ¬CH is valid iff the Continuum Hypothesis is
false. In an enumerable domain, there are no subsets of size ℵ1 and also no
subsets of the size of the continuum, so CH is always true in an enumerable do-
main. However, we can give a different sentence that is valid iff the Continuum
Hypothesis is false:

Proposition 40.15. The Continuum Hypothesis is false iff

NCH ≡ ∀X (Cont(X) → ∃Y (Y ⊆ X ∧ ¬Count(Y ) ∧ ¬X ≈ Y ))

is valid.



Part IX

The Lambda Calculus


This part deals with the lambda calculus. The introduction chapter
is based on Jeremy Avigad’s notes; part of it is now redundant and cov-
ered in later chapters. The chapters on syntax, Church–Rosser property,
and lambda definability were produced by Zesen Qian during his Mitacs
summer internship. They still have to be reviewed and revised.

Chapter 41

Introduction

This chapter consists of Jeremy’s original concise notes on the lambda


calculus. The sections need to be combined, and the material on lambda
definability merged with the material in the separate, more detailed chap-
ter on lambda definability.


41.1 Overview
The lambda calculus was originally designed by Alonzo Church in the early
1930s as a basis for constructive logic, and not as a model of the computable
functions. But it was soon shown to be equivalent to other definitions of com-
putability, such as the Turing computable functions and the partial recursive
functions. The fact that this initially came as a small surprise makes the char-
acterization all the more interesting.
Lambda notation is a convenient way of referring to a function directly
by a symbolic expression which defines it, instead of defining a name for it.


Instead of saying “let f be the function defined by f (x) = x + 3,” one can
say, “let f be the function λx. (x + 3).” In other words, λx. (x + 3) is just a
name for the function that adds three to its argument. In this expression, x
is a dummy variable, or a placeholder: the same function can just as well be
denoted by λy. (y +3). The notation works even with other parameters around.
For example, suppose g(x, y) is a function of two variables, and k is a natural
number. Then λx. g(x, k) is the function which maps any x to g(x, k).
This way of defining a function from a symbolic expression is known as
lambda abstraction. The flip side of lambda abstraction is application: assum-
ing one has a function f (say, defined on the natural numbers), one can apply
it to any value, like 2. In conventional notation, of course, we write f (2) for
the result.
What happens when you combine lambda abstraction with application?
Then the resulting expression can be simplified, by “plugging” the applicand
in for the abstracted variable. For example,

(λx. (x + 3))(2)

can be simplified to 2 + 3.
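The same pair of operations exists in any programming language with first-class functions. A minimal Python sketch (the names add3, g, and h are ours, purely for illustration):

```python
# Lambda abstraction: a function given by an expression, not by a name.
add3 = lambda x: x + 3     # corresponds to λx. (x + 3)

# Application: plug the argument in for the abstracted variable.
print(add3(2))             # (λx. (x + 3))(2) simplifies to 2 + 3, i.e., 5

# Abstraction with a parameter k from the context, like λx. g(x, k):
def g(x, y):
    return 10 * x + y

k = 7
h = lambda x: g(x, k)      # the function which maps any x to g(x, k)
print(h(4))                # g(4, 7) = 47
```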
Up to this point, we have done nothing but introduce new notations for
conventional notions. The lambda calculus, however, represents a more radical
departure from the set-theoretic viewpoint. In this framework:
1. Everything denotes a function.
2. Functions can be defined using lambda abstraction.
3. Anything can be applied to anything else.
For example, if F is a term in the lambda calculus, F (F ) is always assumed
to be meaningful. This liberal framework is known as the untyped lambda
calculus, where “untyped” means “no restriction on what can be applied to
what.”
There is also a typed lambda calculus, which is an important variation on the
untyped version. Although in many ways the typed lambda calculus is similar
to the untyped one, it is much easier to reconcile with a classical set-theoretic
framework, and has some very different properties.
Research on the lambda calculus has proved to be central in theoretical
computer science, and in the design of programming languages. LISP, designed
by John McCarthy in the 1950s, is an early example of a language that was
influenced by these ideas.


41.2 The Syntax of the Lambda Calculus


One starts with a sequence of variables x, y, z, . . . and some constant symbols
a, b, c, . . . . The set of terms is defined inductively, as follows:


1. Each variable is a term.

2. Each constant is a term.

3. If M and N are terms, so is (M N ).

4. If M is a term and x is a variable, then (λx. M ) is a term.

The system without any constants at all is called the pure lambda calculus.
We’ll mainly be working in the pure λ-calculus, so all lowercase letters will stand
for variables. We use uppercase letters (M , N , etc.) to stand for terms of the
λ-calculus.
We will follow a few notational conventions:

Convention 1. 1. When parentheses are left out, application takes place


from left to right. For example, if M , N , P , and Q are terms, then
M N P Q abbreviates (((M N )P )Q).

2. Again, when parentheses are left out, lambda abstraction is to be given


the widest scope possible. For example, λx. M N P is read (λx. ((M N )P )).

3. A lambda can be used to abstract multiple variables. For example,


λxyz. M is short for λx. λy. λz. M .

For example,
λxy. xxyxλz. xz

abbreviates
λx. λy. ((((xx)y)x)(λz. (xz))).

You should memorize these conventions. They will drive you crazy at first, but
you will get used to them, and after a while they will drive you less crazy than
having to deal with a morass of parentheses.
Two terms that differ only in the names of the bound variables are called
α-equivalent; for example, λx. x and λy. y. It will be convenient to think of
these as being the “same” term; in other words, when we say that M and N are
the same, we also mean “up to renamings of the bound variables.” Variables
that are in the scope of a λ are called “bound”, while others are called “free.”
There are no free variables in the previous example; but in

(λz. yz)x

y and x are free, and z is bound.


41.3 Reduction of Lambda Terms


What can one do with lambda terms? Simplify them. If M and N are any
lambda terms and x is any variable, we can use M [N/x] to denote the result
of substituting N for x in M , after renaming any bound variables of M that
would interfere with the free variables of N after the substitution. For example,

(λw. xxw)[yyz/x] = λw. (yyz)(yyz)w.

Alternative notations for substitution are [N/x]M , [x/N ]M , and also M [x/N ].
Beware!
Intuitively, (λx. M )N and M [N/x] have the same meaning; the act of re-
placing the first term by the second is called β-contraction. (λx. M )N is called
a redex and M [N/x] its contractum. Generally, if it is possible to change a
term P to P ′ by β-contraction of some subterm, we say that P β-reduces to P ′
in one step, and write P → P ′ . If from P we can obtain P ′ with some number
of one-step reductions (possibly none), then P β-reduces to P ′ ; in symbols,
P ↠ P ′ . A term that cannot be β-reduced any further is called β-irreducible,
or β-normal. We will say “reduces” instead of “β-reduces,” etc., when the
context is clear.
Let us consider some examples.

1. We have

(λx. xxy)λz. z → (λz. z)(λz. z)y
→ (λz. z)y
→ y.

2. “Simplifying” a term can make it more complex:

(λx. xxy)(λx. xxy) → (λx. xxy)(λx. xxy)y
→ (λx. xxy)(λx. xxy)yy
→ ...

3. It can also leave a term unchanged:

(λx. xx)(λx. xx) → (λx. xx)(λx. xx).

4. Also, some terms can be reduced in more than one way; for example,

(λx. (λy. yx)z)v → (λy. yv)z

by contracting the outermost application; and

(λx. (λy. yx)z)v → (λx. zx)v

by contracting the innermost one. Note, in this case, however, that both
terms further reduce to the same term, zv.


The final outcome in the last example is not a coincidence, but rather
illustrates a deep and important property of the lambda calculus, known as
the “Church–Rosser property.”


41.4 The Church–Rosser Property


Theorem 41.1. Let M , N1 , and N2 be terms, such that M ↠ N1 and
M ↠ N2 . Then there is a term P such that N1 ↠ P and N2 ↠ P .

Corollary 41.2. Suppose M can be reduced to normal form. Then this normal
form is unique.

Proof. If M ↠ N1 and M ↠ N2 , by the previous theorem there is a term P
such that N1 and N2 both reduce to P . If N1 and N2 are both in normal form,
this can only happen if N1 ≡ P ≡ N2 .

Finally, we will say that two terms M and N are β-equivalent, or just
equivalent, if they reduce to a common term; in other words, if there is some P
such that M ↠ P and N ↠ P . This is written M =β N . Using Theorem 41.1,
you can check that =β is an equivalence relation, with the additional property
that for every M and N , if M ↠ N or N ↠ M , then M =β N . (In fact, one
can show that =β is the smallest equivalence relation having this property.)


41.5 Currying
A λ-abstract λx. M represents a function of one argument, which is quite a
limitation when we want to define functions accepting multiple arguments. One
way to do this would be by extending the λ-calculus to allow the formation
of pairs, triples, etc., in which case, say, a three-place function λx. M would
expect its argument to be a triple. However, it is more convenient to do this
by Currying.
Let’s consider an example. We’ll pretend for a moment that we have a
+ operation in the λ-calculus. The addition function is 2-place, i.e., it takes
two arguments. But a λ-abstract only gives us functions of one argument: the
syntax does not allow expressions like λ(x, y). (x+y). However, we can consider
the one-place function fx (y) given by λy. (x + y), which adds x to its single
argument y. Actually, this is not a single function, but a family of different
functions “add x,” one for each number x. Now we can define another one-place
function g as λx. fx . Applied to argument x, g(x) returns the function fx —so
its values are other functions. Now if we apply g to x, and then the result to y


we get: (g(x))(y) = fx (y) = x + y. In this way, the one-place function g can
do the same job as the two-place addition function. “Currying” simply refers
to this trick for turning two-place functions into one-place functions (whose
values are one-place functions).
Here is an example properly in the syntax of the λ-calculus. How do we
represent the function f (x, y) = x? If we want to define a function that accepts
two arguments and returns the first, we can write λx. λy. x, which literally is
a function that accepts an argument x and returns the function λy. x. The
function λy. x accepts another argument y, but drops it, and always returns x.
Let’s see what happens when we apply λx. λy. x to two arguments:

(λx. λy. x)M N → (λy. M )N → M

In general, to write a function with parameters x1 , . . . , xn defined by some
term N , we can write λx1 . λx2 . . . . λxn . N . If we apply n arguments to it we
get:

(λx1 . λx2 . . . . λxn . N )M1 . . . Mn
→ ((λx2 . . . . λxn . N )[M1 /x1 ])M2 . . . Mn
≡ (λx2 . . . . λxn . N [M1 /x1 ])M2 . . . Mn
...
↠ N [M1 /x1 ] . . . [Mn /xn ]

The last line literally means substituting Mi for xi in the body of the function
definition, which is exactly what we want when applying multiple arguments
to a function.
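Currying can be tried out directly in Python, whose lambda allows the same nesting; the names below are ours, purely for illustration.

```python
# Curried addition g = λx. λy. (x + y): g(x) is the one-place function f_x.
add = lambda x: lambda y: x + y
add2 = add(2)              # f_2 = λy. (2 + y)
print(add2(3))             # (add(2))(3) = 5

# The projection λx. λy. x applied to two arguments returns the first.
first = lambda x: lambda y: x
print(first("M")("N"))     # 'M': the second argument is dropped
```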


41.6 λ-Definable Arithmetical Functions


How can the lambda calculus serve as a model of computation? At first, it is not
even clear how to make sense of this statement. To talk about computability
on the natural numbers, we need to find a suitable representation for such
numbers. Here is one that works surprisingly well.

Definition 41.3. For each natural number n, define the Church numeral n to
be the lambda term λx. λy. (x(x(x(. . . x(y))))), where there are n x’s in all.

The terms n are “iterators”: on input f , n returns the function mapping y
to f^n (y). Note that each numeral is normal. We can now say what it means
for a lambda term to “compute” a function on the natural numbers.
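Purely as an illustration (the helpers church and unchurch are our own names, not part of the official development), the numerals and their iterator behavior can be transcribed into Python:

```python
def church(n):
    """The Church numeral n as a function: church(n)(f)(y) = f^n(y)."""
    return lambda f: lambda y: y if n == 0 else f(church(n - 1)(f)(y))

def unchurch(c):
    """Recover the integer by iterating +1 starting from 0."""
    return c(lambda k: k + 1)(0)

two = lambda f: lambda y: f(f(y))            # λx. λy. x(x(y)), written out
print(unchurch(two), unchurch(church(5)))    # 2 5
```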


Definition 41.4. Let f (x0 , . . . , xk−1 ) be a k-ary partial function from N to
N. We say a λ-term F λ-defines f iff for every sequence of natural numbers
n0 , . . . , nk−1 ,

F n0 n1 . . . nk−1 ↠ f (n0 , n1 , . . . , nk−1 )

if f (n0 , . . . , nk−1 ) is defined, and F n0 n1 . . . nk−1 has no normal form other-
wise.

Theorem 41.5. A function f is a partial computable function if and only if
it is λ-defined by a lambda term.

This theorem is somewhat striking. As a model of computation, the lambda


calculus is a rather simple calculus; the only operations are lambda abstrac-
tion and application! From these meager resources, however, it is possible to
implement any computational procedure.


41.7 λ-Definable Functions are Computable


Theorem 41.6. If a partial function f is λ-defined by a lambda term, it is
computable.

Proof. Suppose a function f is λ-defined by a lambda term X. Let us describe


an informal procedure to compute f . On input m0 , . . . , mn−1 , write down the
term Xm0 . . . mn−1 . Build a tree, first writing down all the one-step reductions
of the original term; below that, write all the one-step reductions of those (i.e.,
the two-step reductions of the original term); and keep going. If you ever reach
a numeral, return that as the answer; otherwise, the function is undefined.
An appeal to Church’s thesis tells us that this function is computable. A
better way to prove the theorem would be to give a recursive description of this
search procedure. For example, one could define a sequence of primitive recursive
functions and relations, “IsASubterm,” “Substitute,” “ReducesToInOneStep,”
“ReductionSequence,” “Numeral,” etc. The partial recursive procedure for
computing f (m0 , . . . , mn−1 ) is then to search for a sequence of one-step reduc-
tions starting with Xm0 . . . mn−1 and ending with a numeral, and return the
number corresponding to that numeral. The details are long and tedious but
otherwise routine.


41.8 Computable Functions are λ-Definable


Theorem 41.7. Every computable partial function is λ-definable.


Proof. We need to show that every partial computable function f is λ-defined


by a lambda term F . By Kleene’s normal form theorem, it suffices to show
that every primitive recursive function is λ-defined by a lambda term, and then
that the λ-definable functions are closed under suitable compositions and un-
bounded search. To show that every primitive recursive function is λ-defined by
a lambda term, it suffices to show that the initial functions are λ-definable, and
that the partial functions that are λ-definable are closed under composition
and primitive recursion.

We will use a more conventional notation to make the rest of the proof more
readable. For example, we will write M (x, y, z) instead of M xyz. While this
is suggestive, you should remember that terms in the untyped lambda calculus
do not have associated arities; so, for the same term M , it makes just as much
sense to write M (x, y) and M (x, y, z, w). But using this notation indicates
that we are treating M as a function of three variables, and helps make the
intentions behind the definitions clearer. In a similar way, we will say “define
M by M (x, y, z) = . . . ” instead of “define M by M = λx. λy. λz. . . ..”


41.9 The Basic Primitive Recursive Functions are


λ-Definable
Lemma 41.8. The functions zero, succ, and Proj^n_i are λ-definable.

Proof. zero is just λx. λy. y.
The successor function succ is defined by Succ(u) = λx. λy. x(uxy). You
should think about why this works; for each numeral n, thought of as an
iterator, and each function f , Succ(n, f ) is a function that, on input y, applies
f n times starting with y, and then applies it once more.
There is nothing to say about projections: Proj^n_i(x0 , . . . , xn−1 ) = xi . In
other words, by our conventions, Proj^n_i is the lambda term λx0 . . . . λxn−1 . xi .
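Transcribed into Python with the numerals-as-functions encoding from before (again just an illustration of the lemma, with our own helper names):

```python
zero = lambda x: lambda y: y                      # λx. λy. y
succ = lambda u: lambda x: lambda y: x(u(x)(y))   # Succ(u) = λx. λy. x(uxy)

unchurch = lambda c: c(lambda k: k + 1)(0)
print(unchurch(succ(succ(zero))))                 # 2

proj_3_1 = lambda x0: lambda x1: lambda x2: x1    # Proj^3_1 = λx0. λx1. λx2. x1
print(proj_3_1('a')('b')('c'))                    # 'b'
```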


41.10 The λ-Definable Functions are Closed under


Composition
Lemma 41.9. The λ-definable functions are closed under composition.

Proof. Suppose f is defined by composition from h, g0 , . . . , gk−1 . Assuming h,
g0 , . . . , gk−1 are λ-defined by H, G0 , . . . , Gk−1 , respectively, we need to find
a term F that λ-defines f . But we can simply define F by

F (x0 , . . . , xl−1 ) = H(G0 (x0 , . . . , xl−1 ), . . . , Gk−1 (x0 , . . . , xl−1 )).

In other words, the language of the lambda calculus is well suited to represent
composition.
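For one parameter, the definition of F can be sketched in Python as follows (compose, H, and the Gi are our own illustrative names; H is curried as in the text):

```python
# F(x) = H(G0(x), ..., G_{k-1}(x)): composition is nested application.
def compose(H, Gs):
    def F(x):
        val = H
        for G in Gs:            # apply H successively to G0(x), G1(x), ...
            val = val(G(x))
        return val
    return F

H  = lambda a: lambda b: a + b  # λ-defines h(a, b) = a + b
G0 = lambda x: 2 * x
G1 = lambda x: x + 1
F  = compose(H, [G0, G1])
print(F(3))                     # h(6, 4) = 10
```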


41.11 λ-Definable Functions are Closed under Primitive


Recursion
When it comes to primitive recursion, we finally need to do some work. We
will have to proceed in stages. As before, on the assumption that we already
have terms G′ and H ′ that λ-define functions g and h, respectively, we want a
term F that λ-defines the function f defined by

f (0, ⃗z) = g(⃗z)


f (x + 1, ⃗z) = h(x, f (x, ⃗z), ⃗z).

So, in general, given lambda terms G′ and H ′ , it suffices to find a term F such
that

F (0, ⃗z) ≡ G′ (⃗z)
F (n + 1, ⃗z) ≡ H ′ (n, F (n, ⃗z), ⃗z)

for every natural number n; the fact that G′ and H ′ λ-define g and h means
that whenever we plug in numerals m⃗ for ⃗z, F (n + 1, m⃗ ) will normalize to the
right answer.
But for this, it suffices to find a term F satisfying

F (0) ≡ G
F (n + 1) ≡ H(n, F (n))

for every natural number n, where

G = λ⃗z. G′ (⃗z) and


H(u, v) = λ⃗z. H ′ (u, v(⃗z), ⃗z).

In other words, with lambda trickery, we can avoid having to worry about the
extra parameters ⃗z—they just get absorbed in the lambda notation.
Before we define the term F , we need a mechanism for handling ordered
pairs. This is provided by the next lemma.


Lemma 41.10. There is a lambda term D such that for each pair of lambda
terms M and N , D(M, N )(0) ↠ M and D(M, N )(1) ↠ N .

Proof. First, define the lambda term K by

K(y) = λx. y.

In other words, K is the term λy. λx. y. Looking at it differently, for every M ,
K(M ) is a constant function that returns M on any input.
Now define D(x, y, z) by D(x, y, z) = z(K(y))x. Then we have

D(M, N, 0) ↠ 0(K(N ))M ↠ M and
D(M, N, 1) ↠ 1(K(N ))M ↠ K(N )M ↠ N ,

as required.
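The pairing term can be tried out in Python with the Church numerals 0 and 1 encoded as before (an illustration only):

```python
K = lambda y: lambda x: y                        # K(y) = λx. y
D = lambda x: lambda y: lambda z: z(K(y))(x)     # D(x, y, z) = z(K(y))x

zero = lambda f: lambda y: y
one  = lambda f: lambda y: f(y)

pair = D('M')('N')
print(pair(zero))   # 'M', the left projection (P)_0
print(pair(one))    # 'N', the right projection (P)_1
```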

The idea is that D(M, N ) represents the pair ⟨M, N ⟩, and if P is assumed
to represent such a pair, P (0) and P (1) represent the left and right projections,
(P )0 and (P )1 . We will use the latter notations.
Lemma 41.11. The λ-definable functions are closed under primitive recur-
sion.

Proof. We need to show that given any terms, G and H, we can find a term F
such that

F (0) ≡ G
F (n + 1) ≡ H(n, F (n))

for every natural number n. The idea is roughly to compute sequences of pairs

⟨0, F (0)⟩, ⟨1, F (1)⟩, . . . ,

using numerals as iterators. Notice that the first pair is just ⟨0, G⟩. Given a
pair ⟨n, F (n)⟩, the next pair, ⟨n + 1, F (n + 1)⟩ is supposed to be equivalent to
⟨n + 1, H(n, F (n))⟩. We will design a lambda term T that makes this one-step
transition.
The details are as follows. Define T (u) by

T (u) = ⟨S((u)0 ), H((u)0 , (u)1 )⟩.

Now it is easy to verify that for any number n,

T (⟨n, M ⟩) ↠ ⟨n + 1, H(n, M )⟩.

As suggested above, given G and H, define F (u) by

F (u) = (u(T, ⟨0, G⟩))1 .

In other words, on input n, F iterates T n times on ⟨0, G⟩, and then returns
the second component. To start with, we have


1. 0(T, ⟨0, G⟩) ≡ ⟨0, G⟩


2. F (0) ≡ G
By induction on n, we can show that for each natural number one has the
following:
1. n + 1(T, ⟨0, G⟩) ≡ ⟨n + 1, F (n + 1)⟩
2. F (n + 1) ≡ H(n, F (n))
For the second clause, we have
F (n + 1) ↠ (n + 1(T, ⟨0, G⟩))1
≡ (T (n(T, ⟨0, G⟩)))1
≡ (T (⟨n, F (n)⟩))1
≡ (⟨n + 1, H(n, F (n))⟩)1
≡ H(n, F (n)).
Here we have used the induction hypothesis on the second-to-last line. For the
first clause, we have
n + 1(T, ⟨0, G⟩) ≡ T (n(T, ⟨0, G⟩))
≡ T (⟨n, F (n)⟩)
≡ ⟨n + 1, H(n, F (n))⟩
≡ ⟨n + 1, F (n + 1)⟩.
Here we have used the second clause in the last line. So we have shown F (0) ≡
G and, for every n, F (n + 1) ≡ H(n, F (n)), which is exactly what we needed.
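The same construction, sketched in Python with ordinary tuples standing in for the pairs ⟨n, F (n)⟩ and a loop playing the role of the numeral-as-iterator (primrec and T are our own names):

```python
def primrec(G, H):
    """f(0) = G and f(n+1) = H(n, f(n)), via iterated pair transitions."""
    def T(p):                    # <n, F(n)>  to  <n+1, H(n, F(n))>
        n, fn = p
        return (n + 1, H(n, fn))
    def f(n):
        p = (0, G)
        for _ in range(n):       # iterate T n times on <0, G>
            p = T(p)
        return p[1]              # the second component, (u)_1
    return f

fact = primrec(1, lambda n, r: (n + 1) * r)   # factorial as a test case
print(fact(5))                                # 120
```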


41.12 Fixed-Point Combinators


Suppose you have a lambda term g, and you want another term k with the
property that k is β-equivalent to gk. Define terms
diag(x) = xx
and
l(x) = g(diag(x))
using our notational conventions; in other words, l is the term λx. g(xx). Let
k be the term ll. Then we have
k = (λx. g(xx))(λx. g(xx))
→ g((λx. g(xx))(λx. g(xx)))
= gk.


If one takes
Y = λg. ((λx. g(xx))(λx. g(xx)))
then Y g and g(Y g) reduce to a common term; so Y g =β g(Y g). This is known
as “Curry’s combinator.” If instead one takes

Y = (λxg. g(xxg))(λxg. g(xxg))

then in fact Y g reduces to g(Y g), which is a stronger statement. This latter
version of Y is known as “Turing’s combinator.”
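Since Python evaluates arguments eagerly, Curry's combinator loops forever there; the η-expanded variant, often called the Z combinator, exhibits the fixed-point behavior (a sketch, with our own names step and fact):

```python
# Z(g) reduces to g(λv. Z(g)(v)), a call-by-value fixed point.
Z = lambda g: (lambda x: g(lambda v: x(x)(v)))(lambda x: g(lambda v: x(x)(v)))

# A recursive function obtained as a fixed point: Z(step) = step(Z(step)).
step = lambda fact: lambda n: 1 if n == 0 else n * fact(n - 1)
print(Z(step)(5))    # 120
```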


41.13 The λ-Definable Functions are Closed under


Minimization
Lemma 41.12. Suppose f (x, y) is primitive recursive. Let g be defined by

g(x) ≃ µy f (x, y).

Then g is λ-definable.

Proof. The idea is roughly as follows. Given x, we will use the fixed-point
lambda term Y to define a function hx (n) which searches for a y starting at n;
then g(x) is just hx (0). The function hx can be expressed as the solution of a
fixed-point equation:
hx (n) ≃ n if f (x, n) = 0
hx (n) ≃ hx (n + 1) otherwise.

Here are the details. Since f is primitive recursive, it is λ-defined by


some term F . Remember that we also have a lambda term D, such that
D(M, N, 0̄) ↠ M and D(M, N, 1̄) ↠ N . Fixing x for the moment, to λ-define
−→ N . Fixing x for the moment, to λ-define
hx we want to find a term H (depending on x) satisfying

H(n) ≡ D(n, H(S(n)), F (x, n)).

We can do this using the fixed-point term Y . First, let U be the term

λh. λz. D(z, (h(Sz)), F (x, z)),

and then let H be the term Y U . Notice that the only free variable in H is x.
Let us show that H satisfies the equation above.
By the definition of Y , we have

H = Y U ≡ U (Y U ) = U (H).

In particular, for each natural number n, we have

H(n) ≡ U (H, n) ↠ D(n, H(S(n)), F (x, n)),

as required. Notice that if you substitute a numeral m for x in the last line,
the expression reduces to n if F (m, n) reduces to 0, and it reduces to H(S(n))
if F (m, n) reduces to any other numeral.
To finish off the proof, let G be λx. H(0). Then G λ-defines g; in other
words, for every m, G(m) reduces to g(m) if g(m) is defined, and
has no normal form otherwise.
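A hedged Python rendering of this search (mu and the sample f are our own names); exactly as in the proof, the computation diverges when no zero exists:

```python
Z = lambda g: (lambda x: g(lambda v: x(x)(v)))(lambda x: g(lambda v: x(x)(v)))

def mu(f, x):
    """The least n with f(x, n) = 0, computed as the fixed point h_x(0)."""
    h = Z(lambda h: lambda n: n if f(x, n) == 0 else h(n + 1))
    return h(0)   # loops forever if f(x, n) != 0 for every n

f = lambda x, n: n * n - x     # search for an integer square root of x
print(mu(f, 49))               # 7
```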

Chapter 42

Syntax


42.1 Terms
The terms of the lambda calculus are built up inductively from an infinite
supply of variables v0 , v1 , . . . , the symbol “λ”, and parentheses. We will use
x, y, z, . . . to designate variables, and M , N , P , . . . to designate terms.
Definition 42.1 (Terms). The set of terms of the lambda calculus is defined
inductively by:

1. If x is a variable, then x is a term.

2. If x is a variable and M is a term, then (λx. M ) is a term.

3. If both M and N are terms, then (M N ) is a term.

If a term (λx. M ) is formed according to (2) we say it is the result of an


abstraction, and the x in λx is called a parameter. A term (M N ) formed
according to (3) is the result of an application.
The terms defined above are fully parenthesized. This can get rather cumbersome,
as the term (λx. ((λx. x)(λx. (xx)))) demonstrates. We will introduce


conventions for avoiding parentheses. However, the official definition makes it


easy to determine how a term is constructed according to Definition 42.1. For
example, the last step of forming the term (λx. ((λx. x)(λx. (xx)))) must be
abstraction where the parameter is x. It results by abstraction from the term
((λx. x)(λx. (xx))), which is an application of two terms. Each of these two
terms is the result of an abstraction, and so on.

Problem 42.1. Describe the formation of (λg. (λx. (g(xx)))(λx. (g(xx)))).


42.2 Unique Readability


We may wonder if for each term there is a unique way of forming it, and there
is. For each lambda term there is only one way to construct and interpret it.
In the following discussion, a formation is the procedure of constructing a term
using the formation rules (one or several times) of Definition 42.1.

Lemma 42.2. A term starts with either a variable or a parenthesis.

Proof. Something counts as a term only if it is constructed according to Defi-


nition 42.1. If it is the result of (1), it must be a variable. If it is the result of
(2) or (3), it starts with a parenthesis.

Lemma 42.3. The result of an application starts with either two parentheses
or a parenthesis and a variable.

Proof. If M is the result of an application, it is of the form (P Q), so it begins


with a parenthesis. Since P is a term, by Lemma 42.2, it begins either with a
parenthesis or a variable.

Lemma 42.4. No proper initial part of a term is itself a term.

Problem 42.2. Prove Lemma 42.4 by induction on the length of terms.

Proposition 42.5 (Unique Readability). There is a unique formation for
each term. In other words, if a term M is formed by a formation, then it is
the only formation that can form this term.

Proof. We prove this by induction on the formation of terms.

1. M is of the form x, where x is some variable. Since the results of abstrac-


tions and applications always start with parentheses, they cannot have
been used to construct M . Thus, the formation of M must be a single
step of Definition 42.1(1).


2. M is of the form (λx. N ), where x is some variable and N is a term. It


could not have been constructed according to Definition 42.1(1), because
it is not a single variable. It is not the result of an application, by
Lemma 42.3. Thus M can only be the result of an abstraction on N . By
inductive hypothesis we know that formation of N is itself unique.
3. M is of the form (P Q), where P and Q are terms. Since it starts with
a parenthesis, it cannot also be constructed by Definition 42.1(1). By
Lemma 42.2, P cannot begin with λ, so (P Q) cannot be the result of an
abstraction. Now suppose there were another way of constructing M by
application, e.g., it is also of the form (P ′ Q′ ). Then P is a proper initial
segment of P ′ (or vice versa), and this is impossible by Lemma 42.4. So
P and Q are uniquely determined, and by inductive hypothesis we know
that the formations of P and Q are unique.

A more readable paraphrase of the above proposition is as follows:


Proposition 42.6. A term M can only be one of the following forms:
1. x, where x is a variable uniquely determined by M .
2. (λx. N ), where x is a variable and N is another term, both of which are
uniquely determined by M .
3. (P Q), where P and Q are two terms uniquely determined by M .
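Unique readability is what makes parsing deterministic: a parser for fully parenthesized terms never has to backtrack. A small Python sketch (assuming, for simplicity, that variables are single letters and no spaces occur):

```python
def parse(s, i=0):
    """Parse one term starting at s[i]; return (term, next position)."""
    if s[i] != '(':                    # clause (1): a variable
        assert s[i].isalpha()
        return s[i], i + 1
    if s[i + 1] == 'λ':                # clause (2): an abstraction
        x = s[i + 2]
        assert s[i + 3] == '.'
        body, j = parse(s, i + 4)
        assert s[j] == ')'
        return ('λ', x, body), j + 1
    left, j = parse(s, i + 1)          # clause (3): an application
    right, k = parse(s, j)
    assert s[k] == ')'
    return (left, right), k + 1

print(parse("(λx.((λx.x)(λx.(xx))))")[0])
# ('λ', 'x', (('λ', 'x', 'x'), ('λ', 'x', ('x', 'x'))))
```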


42.3 Abbreviated Syntax


Terms as defined in Definition 42.1 are sometimes cumbersome to write, so it
is useful to introduce a more concise syntax. We must of course be careful to
make sure that the terms in the concise notation also are uniquely readable.
One widely used version called abbreviated terms is as follows.

1. When parentheses are left out, application takes place from left to right.
For example, if M , N , P , and Q are terms, then M N P Q abbreviates
(((M N )P )Q).
2. Again, when parentheses are left out, lambda abstraction is given the
widest scope possible. For example, λx. M N P is read as (λx. ((M N )P )).
3. A lambda can be used to abstract multiple variables. For example,
λxyz. M is short for λx. λy. λz. M .

For example,
λxy. xxyxλz. xz
abbreviates
(λx. (λy. ((((xx)y)x)(λz. (xz))))).


Problem 42.3. Expand the abbreviated term λg. (λx. g(xx))λx. g(xx).


42.4 Free Variables


Lambda calculus is about functions, and lambda abstraction is how functions
arise. Intuitively, λx. M is the function with values given by M when the
argument to the function is assigned to x. But not every occurrence of x in M
is relevant: if M contains another abstract λx. N then the occurrences of x
in N are relevant to λx. N but not to λx. M . So, a lambda abstract λx inside
λx. M binds those occurrences of x in M that are not already bound by another
lambda abstract—the free occurrences of x in M .
Definition 42.7 (Scope). If λx. M occurs inside a term N , then the corre-
sponding occurrence of M is the scope of the λx.

Definition 42.8 (Free and bound occurrence). An occurrence of variable


x in a term M is free if it is not in the scope of a λx, and bound otherwise. An
occurrence of a variable x in λx. M is bound by the initial λx iff the occurrence
of x in M is free.

Example 42.9. In λx. xy, both x and y are in the scope of λx, so x is bound
by λx. Since y is not in the scope of any λy, it is free. In λx. xx, both
occurrences of x are bound by λx, since both are free in xx. In ((λx. xx)x), the
last occurrence of x is free, since it is not in the scope of a λx. In λx. (λx. x)x,
the scope of the first λx is (λx. x)x and the scope of the second λx is the
second-to-last occurrence of x. In (λx. x)x, the last occurrence of x is free,
and the second-to-last is bound. Thus, the second-to-last occurrence of x in
λx. (λx. x)x is bound by the second λx, and the last occurrence by the first λx.

For a term P , we can check all variable occurrences in it and get a set of free
variables. This set is denoted by FV(P ) with a natural definition as follows:
Definition 42.10 (Free variables of a term). The set of free variables of
a term is defined inductively by:

1. FV(x) = {x}

2. FV(λx. N ) = FV(N ) \ {x}

3. FV(P Q) = FV(P ) ∪ FV(Q)

Problem 42.4. 1. Identify the scopes of λg and the two λx in this term:
λg. (λx. g(xx))λx. g(xx).
2. In λg. (λx. g(xx))λx. g(xx), are all occurrences of variables bound? By
which abstractions are they bound respectively?


3. Give FV(λx. (λy. (λz. xy)z)y)

A free variable is like a reference to the outside world (the environment),

and a term containing free variables can be seen as a partially specified term,
since its behaviour depends on how we set up the environment. For example, in
the term λx. f x, which accepts an argument x and returns f of that argument,
the variable f is free. The value of the term depends on the environment
it is in, in particular on the value of f in that environment.
If we apply abstraction to this term, we get λf. λx. f x. This term is no
longer dependent on the environment variable f , because it now designates a
function that accepts two arguments and returns the result of applying the
first to the second. Changing f in the environment won’t have any effect on
the behavior of this term, as the term will only use whatever is passed as an
argument, and not the value of f in the environment.

Definition 42.11 (Closed term, combinator). A term with no free vari-


ables is called a closed term, or a combinator.

Lemma 42.12.

1. If y ≠ x, then y ∈ FV(λx. N ) iff y ∈ FV(N ).

2. y ∈ FV(P Q) iff y ∈ FV(P ) or y ∈ FV(Q).

Proof. Exercise.

Problem 42.5. Prove Lemma 42.12.


42.5 Substitution
Free variables are references to environment variables, thus it makes sense to
actually use a specific value in the place of a free variable. For example, we
may want to replace f in λx. f x with a specific term, like the identity function
λy. y. This results in λx. (λy. y)x. The process of replacing free variables with
lambda terms is called substitution.

Definition 42.13 (Substitution). The substitution of a term N for a vari-
able x in a term M , M [N/x], is defined inductively by:

1. x[N/x] = N .

2. y[N/x] = y if x ≠ y.

3. (P Q)[N/x] = (P [N/x])(Q[N/x]).

4. (λy. P )[N/x] = λy. P [N/x], if x ≠ y and y ∉ FV(N ), otherwise undefined.

In Definition 42.13(4), we require x ≠ y because we don’t want to replace
bound occurrences of the variable x in M by N . For example, if we compute
the substitution (λx. x)[y/x], the result should not be λx. y but simply λx. x.
When substituting N for x in λy. P , we also require that y ∉ FV(N ). For
example, we cannot substitute y for x in λy. x, i.e., compute (λy. x)[y/x], because it would
result in λy. y, a term that stands for the function that accepts an argument
and returns it directly. But the term λy. x stands for a function that always
returns the term x (or whatever x refers to). So the result we actually want
is a function that accepts an argument, drops it, and returns the environment
variable y. To do this properly, we would first have to “rename” the bound
variable y.
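The clauses of Definition 42.13, including its undefined cases, can be transcribed into Python using the tuple encoding of terms from the earlier sketch (None marks “undefined”):

```python
def fv(t):
    if isinstance(t, str):
        return {t}
    if t[0] == 'λ':
        return fv(t[2]) - {t[1]}
    return fv(t[0]) | fv(t[1])

def subst(M, N, x):
    """M[N/x]; returns None where the definition leaves it undefined."""
    if isinstance(M, str):                 # clauses (1) and (2)
        return N if M == x else M
    if M[0] != 'λ':                        # clause (3): (PQ)[N/x]
        P, Q = subst(M[0], N, x), subst(M[1], N, x)
        return None if P is None or Q is None else (P, Q)
    _, y, body = M                         # clause (4): (λy. P)[N/x]
    if x == y or y in fv(N):
        return None                        # undefined without renaming
    b = subst(body, N, x)
    return None if b is None else ('λ', y, b)

# (λw. xxw)[yyz/x] = λw. (yyz)(yyz)w:
yyz = (('y', 'y'), 'z')
print(subst(('λ', 'w', (('x', 'x'), 'w')), yyz, 'x'))
```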

Problem 42.6. What is the result of the following substitutions?

1. λy. x(λw. vwx)[(uv)/x]

2. λy. x(λx. x)[(λy. xy)/x]

3. y(λv. xv)[(λy. vy)/x]

Theorem 42.14. If x ∉ FV(M ), then FV(M [N/x]) = FV(M ), if the left-hand
side is defined.

Proof. By induction on the formation of M .

1. M is a variable: exercise.

2. M is of the form (P Q): exercise.

3. M is of the form λy. P , and since (λy. P )[N/x] is defined, it has to be
λy. P [N/x]. Then P [N/x] has to be defined; also, x ≠ y, and since
x ∉ FV(λy. P ), also x ∉ FV(P ). Then:

FV((λy. P )[N/x]) =
= FV(λy. P [N/x]) by Definition 42.13(4)
= FV(P [N/x]) \ {y} by Definition 42.10(2)
= FV(P ) \ {y} by inductive hypothesis
= FV(λy. P ) by Definition 42.10(2)

Problem 42.7. Complete the proof of Theorem 42.14.

Theorem 42.15. If x ∈ FV(M ), then FV(M [N/x]) = (FV(M ) \ {x}) ∪
FV(N ), provided the left-hand side is defined.


Proof. By induction on the formation of M .


1. M is a variable: exercise.
2. M is of the form P Q: Since (P Q)[N/x] is defined, it has to be (P [N/x])(Q[N/x])
with both substitutions defined. Also, since x ∈ FV(P Q), either x ∈
FV(P ) or x ∈ FV(Q) or both. The rest is left as an exercise.

3. M is of the form λy. P . Since (λy. P )[N/x] is defined, it has to be λy. P [N/x],
with P [N/x] defined, x ≠ y and y ∉ FV(N ); also, since x ∈ FV(λy. P ),
we have x ∈ FV(P ) too. Now:

FV((λy. P )[N/x]) =
= FV(λy. P [N/x])
= FV(P [N/x]) \ {y}
= ((FV(P ) \ {x}) ∪ FV(N )) \ {y} by inductive hypothesis
= (FV(P ) \ {x, y}) ∪ FV(N ) since y ∉ FV(N )
= (FV(λy. P ) \ {x}) ∪ FV(N )

Problem 42.8. Complete the proof of Theorem 42.15.

Theorem 42.16. x ∉ FV(M [N/x]), provided that M [N/x] is defined and
x ∉ FV(N ).

Proof. Exercise.

Problem 42.9. Prove Theorem 42.16.

Theorem 42.17. If M [y/x] is defined and y ∉ FV(M ), then M [y/x][x/y] =
M .

Proof. By induction on the formation of M .


1. M is a variable z: Exercise.
2. M is of the form (P Q). Then:
(P Q)[y/x][x/y] = ((P [y/x])(Q[y/x]))[x/y]
= (P [y/x][x/y])(Q[y/x][x/y])
= (P Q) by inductive hypothesis

3. M is of the form λz. N . Because (λz. N )[y/x] is defined, we know that
z ≠ x and z ≠ y. So:

z ̸= y. So:
(λz. N )[y/x][x/y]
= (λz. N [y/x])[x/y]
= λz. N [y/x][x/y]
= λz. N by inductive hypothesis


Problem 42.10. Complete the proof of Theorem 42.17.


42.6 α-Conversion
What is the relation between λx. x and λy. y? They both represent the identity
function. They are, of course, syntactically different terms. They differ only in
the name of the bound variable, and one is the result of “renaming” the bound
variable in the other. This is called α-conversion.
Definition 42.18 (Change of bound variable, →α). If a term M contains
an occurrence of λx. N , y ∉ FV(N ), and N [y/x] is defined, then replacing this
occurrence by

λy. N [y/x],

resulting in M ′ , is called a change of bound variable, written as M →α M ′ .

Definition 42.19 (Compatibility of a relation). A relation R on terms is
said to be compatible if it satisfies the following conditions:

1. If R N N ′ then R (λx. N ) (λx. N ′ )

2. If R P P ′ then R (P Q) (P ′ Q)

3. If R Q Q′ then R (P Q) (P Q′ )

Thus let’s rephrase the definition:


Definition 42.20 (Change of bound variable, →α). Change of bound vari-
able (→α) is the smallest compatible relation on terms satisfying the following
condition:

λx. N →α λy. N [y/x] if x ≠ y, y ∉ FV(N ),
and N [y/x] is defined.

“Smallest” here means the relation contains only pairs that are required by
compatibility and the additional condition, and nothing else. Thus this relation
can also be defined as follows:
Definition 42.21 (Change of bound variable, →α). Change of bound vari-
able (→α) is inductively defined as follows:

1. If N →α N ′ then λx. N →α λx. N ′

2. If P →α P ′ then (P Q) →α (P ′ Q)

3. If Q →α Q′ then (P Q) →α (P Q′ )

4. If x ≠ y, y ∉ FV(N ) and N [y/x] is defined, then λx. N →α λy. N [y/x].

The definitions are equivalent, but we leave the proof as an exercise. From
now on we will use the inductive definition.
Definition 42.22 (α-conversion, ↠α). α-conversion (↠α) is the smallest
reflexive and transitive relation on terms containing →α.

As above, “smallest” means the relation only contains pairs required by
reflexivity, transitivity, and →α, which leads to the following equivalent definition:
Definition 42.23 (α-conversion, ↠α). α-conversion (↠α) is inductively de-
fined as follows:

1. If P ↠α Q and Q ↠α R, then P ↠α R.

2. If P →α Q, then P ↠α Q.

3. P ↠α P .

Example 42.24. λx. f x α-converts to λy. f y, and conversely. Informally speak-
ing, they are both functions that accept an argument and return f of that
argument, referring to the environment variable f .
λx. f x does not α-convert to λx. gx. Informally speaking, they refer to
the environment variables f and g respectively, and this makes them different
functions: they behave differently in environments where f and g are different.

Problem 42.11. Are the following pairs of terms α-convertible?


1. λx. λy. x and λy. λx. y
2. λx. λy. x and λc. λb. a
α
Lemma 42.25. If P →α Q then FV(P) = FV(Q).

Proof. By induction on the derivation of P →α Q.

1. If the last rule is (4), then P is of the form λx. N and Q of the form λy. N[y/x], with x ≠ y, y ∉ FV(N), and N[y/x] defined. We distinguish cases according to whether x ∈ FV(N):
a) If x ∈ FV(N), then:

   FV(λy. N[y/x]) = FV(N[y/x]) \ {y}
   = ((FV(N) \ {x}) ∪ {y}) \ {y}   by Theorem 42.15
   = FV(N) \ {x}
   = FV(λx. N)


b) If x ∉ FV(N), then:

   FV(λy. N[y/x]) = FV(N[y/x]) \ {y}
   = FV(N) \ {x}   by Theorem 42.14
   = FV(λx. N).

2. The other three cases are left as exercises.

Problem 42.12. Complete the proof of Lemma 42.25.


Lemma 42.26. If P →α Q then Q →α P.
Proof. Induction on the derivation of P →α Q.

1. If the last rule is (4), then P is of the form λx. N and Q of the form λy. N[y/x], where x ≠ y, y ∉ FV(N), and N[y/x] defined. First, we have x ∉ FV(N[y/x]) by Theorem 42.16. By Theorem 42.17 we have that N[y/x][x/y] is not only defined, but also equal to N. Then by (4), we have λy. N[y/x] →α λx. N[y/x][x/y] = λx. N.

Problem 42.13. Complete the proof of Lemma 42.26.

Theorem 42.27. α-Conversion is an equivalence relation on terms, i.e., it is


reflexive, symmetric, and transitive.

Proof. 1. For each term M, M can be changed to M by zero changes of bound variables.

2. If P α-converts to Q by a series of changes of bound variables, then from Q we can invert these changes (by Lemma 42.26) in the opposite order to obtain P.

3. If P α-converts to Q by a series of changes of bound variables, and Q to R by another series, then we can change P to R by first applying the first series and then the second.
From now on we say that M and N are α-equivalent, M =α N, iff M α-converts to N (which, as we've just shown, is the case iff N α-converts to M).
Theorem 42.28. If M =α N, then FV(M) = FV(N).

Proof. Immediate from Lemma 42.25.


Lemma 42.29. If R =α R′ and M[R/y] is defined, then M[R′/y] is defined and α-equivalent to M[R/y].

Proof. Exercise.


Problem 42.14. Prove Lemma 42.29.

Recall that in section 42.5, substitution is undefined in some cases; however, using α-conversion on terms, we can make substitution always defined by renaming bound variables. The result preserves α-equivalence, as shown in this theorem:
Theorem 42.30. For any M, R, and y, there exists M′ such that M =α M′ and M′[R/y] is defined. Moreover, if there is another pair M′′ =α M and R′′ =α R where M′′[R′′/y] is defined, then M′[R/y] =α M′′[R′′/y].

Proof. By induction on the formation of M:

1. M is a variable z: Exercise.

2. Suppose M is of the form λx. N. Select a variable z other than x and y and such that z ∉ FV(N) and z ∉ FV(R). By inductive hypothesis, there is N′ such that N′ =α N and N′[z/x] is defined. Then λx. N =α λx. N′ too, by Definition 42.21(1). Now λx. N′ =α λz. N′[z/x] by Definition 42.21(4). We can do this because z ≠ x, z ∉ FV(N′), and N′[z/x] is defined. Finally, (λz. N′[z/x])[R/y] is defined, because z ≠ y and z ∉ FV(R).
Moreover, if there is another N′′ and R′′ satisfying the same conditions, then:

(λz. N′′[z/x])[R′′/y] = λz. N′′[z/x][R′′/y]
=α λz. N′′[z/x][R/y]   by Lemma 42.29
=α λz. N′[z/x][R/y]   by inductive hypothesis
= (λz. N′[z/x])[R/y]

3. M is of the form (P Q): Exercise.

Problem 42.15. Complete the proof of Theorem 42.30.

Corollary 42.31. For any M, R, and y, there exists a pair of M′ and R′ such that M′ =α M, R′ =α R, and M′[R′/y] is defined. Moreover, if there is another pair M′′ =α M and R′′ =α R with M′′[R′′/y] defined, then M′[R′/y] =α M′′[R′′/y].

Proof. Immediate from Theorem 42.30.


42.7 The De Bruijn Index


α-Equivalence is very natural, as terms that are α-equivalent “mean the same.” In fact, it is possible to give a syntax for lambda terms which does not distinguish terms that can be α-converted to each other. The best known replaces variables by their De Bruijn index.
When we write λx. M, we explicitly state that x is the parameter of the function, so that we can use x in M to refer to this parameter. With De Bruijn indices, however, parameters have no name, and a reference to a parameter in the function body is a number denoting the levels of abstraction between them. For example, consider the term λx. λy. yx: the outer abstraction binds the variable x; the inner abstraction binds the variable y; the sub-term yx lies in the scope of the inner abstraction: there is no abstraction between y and its abstract λy, but one abstraction between x and its abstract λx. Thus we write 0 1 for yx, and λ. λ. 0 1 for the entire term.
Definition 42.32. De Bruijn terms are inductively defined as follows:

1. n, where n is any natural number;

2. P Q, where P and Q are both De Bruijn terms;

3. λ. N, where N is a De Bruijn term.

A formalized translation from ordinary lambda terms to De Bruijn indexed terms is as follows:

Definition 42.33.

FΓ(x) = Γ(x)
FΓ(P Q) = FΓ(P) FΓ(Q)
FΓ(λx. N) = λ. Fx,Γ(N)

where Γ is a list of variables indexed from zero, and Γ(x) denotes the position of the variable x in Γ. For example, if Γ is x, y, z, then Γ(x) is 0 and Γ(z) is 2.
x, Γ denotes the list resulting from pushing x onto the head of Γ; for instance, continuing with the Γ of the last example, w, Γ is w, x, y, z.
Recovering a standard lambda term from a De Bruijn term is done as follows:

Definition 42.34.

GΓ(n) = Γ[n]
GΓ(P Q) = GΓ(P) GΓ(Q)
GΓ(λ. N) = λx. Gx,Γ(N)

where Γ is again a list of variables indexed from zero, and Γ[n] denotes the variable in position n. For example, if Γ is x, y, z, then Γ[1] is y. The variable x in the last equation is chosen to be any variable that is not in Γ.


Here we give some results without proving them:

Proposition 42.35. If M →α M′, and Γ is any list containing FV(M), then FΓ(M) ≡ FΓ(M′).
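To make the translations concrete, here is a minimal Python sketch (our own illustration, not part of the text). We represent λ-terms as nested tuples: a variable is a string, ('app', P, Q) is an application, and ('lam', x, N) is an abstraction λx. N. Since α-equivalent terms receive identical De Bruijn translations, comparing F-images gives a simple α-equivalence test.

    def to_de_bruijn(term, ctx=()):
        """F_Gamma: ctx is the list Γ of variables; it must contain FV(term)."""
        if isinstance(term, str):
            return ctx.index(term)                 # Γ(x): position of x in Γ
        if term[0] == 'app':
            return ('app', to_de_bruijn(term[1], ctx), to_de_bruijn(term[2], ctx))
        _, x, body = term                          # ('lam', x, body)
        return ('lam', to_de_bruijn(body, (x,) + ctx))   # push x onto the head of Γ

    def from_de_bruijn(term, ctx=()):
        """G_Gamma: pick a fresh variable name for each abstraction."""
        if isinstance(term, int):
            return ctx[term]                       # Γ[n]: the variable at position n
        if term[0] == 'app':
            return ('app', from_de_bruijn(term[1], ctx), from_de_bruijn(term[2], ctx))
        x = f'v{len(ctx)}'                         # fresh by construction: not in Γ
        return ('lam', x, from_de_bruijn(term[1], (x,) + ctx))

    # λx. λy. yx and λa. λb. ba are α-equivalent; both translate to λ. λ. 0 1.
    t1 = ('lam', 'x', ('lam', 'y', ('app', 'y', 'x')))
    t2 = ('lam', 'a', ('lam', 'b', ('app', 'b', 'a')))
    assert to_de_bruijn(t1) == to_de_bruijn(t2) == ('lam', ('lam', ('app', 0, 1)))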


42.8 Terms as α-Equivalence Classes


From now on, we will consider terms up to α-equivalence. That means that when we write a term, we mean the α-equivalence class it is in. For example, we write λa. λb. ac for the set of all terms α-equivalent to it, such as λa. λb. ac, λb. λa. bc, etc.
Also, while in previous sections letters such as N, Q were used to denote a term, from now on we use them to denote a class, and it is these classes instead of terms that will be our subjects of study in what follows. Letters such as x, y continue to denote variables.
We also adopt the notation M to denote an arbitrary element of the class M, and M0, M1, etc., if we need more than one.
We reuse the notation from terms to simplify our wording. We have the following definitions on classes:

Definition 42.36. 1. λx. N is defined as the class containing λx. N, where N is any representative of the class N.

2. P Q is defined to be the class containing P Q, for representatives P and Q.

It is not hard to see that they are well defined, because α-conversion is
compatible.

Definition 42.37. The free variables of an α-equivalence class M, or FV(M), are defined to be FV(M) for any representative M of the class.

This is well defined since FV(M0) = FV(M1) for any two representatives, as shown in Theorem 42.28.


We also reuse the notation for substitution on classes:

Definition 42.38. The substitution of R for y in M, or M[R/y], is defined to be the class of M[R/y], for any representatives M and R making the substitution defined.

This is also well defined, as shown in Corollary 42.31.


Note how this definition significantly simplifies our reasoning. For example:

(λx. x)[y/x]   (42.1)
= (λz. z)[y/x]   (42.2)
= λz. z[y/x]   (42.3)
= λz. z   (42.4)


eq. (42.1) is undefined if we still regard it as substitution on terms; but as mentioned earlier, we now consider it a substitution on classes, which is why eq. (42.2) can happen: we can replace λx. x with λz. z because they belong to the same class.
For the same reason, from now on we will assume that the representatives we choose always satisfy the conditions needed for substitution. For example, when we see (λx. N)[R/y], we will assume the representative λx. N is chosen so that x ≠ y and x ∉ FV(R).
Since it is a bit strange to call λx. x a “class,” let's call them Λ-terms (or simply “terms” in the rest of this part) from now on, to distinguish them from the λ-terms that we are familiar with.

We cannot say goodbye to terms yet: the whole definition of Λ-terms is based on λ-terms, and we haven't provided a method to define functions on Λ-terms. This means all such functions have to be first defined on λ-terms, and then “projected” to Λ-terms, as we did for substitution. However, we assume the reader can intuitively understand how we can define functions on Λ-terms.


42.9 β-reduction
When we see (λm. (λy. y)m), it is natural to conjecture that it has some connection with λm. m; namely, the second term should be the result of “simplifying” the first. The notion of β-reduction captures this intuition formally.
Definition 42.39 (β-contraction, →β). β-contraction (→β) is the smallest compatible relation on terms satisfying the following condition:

(λx. N)Q →β N[Q/x]

We say P β-contracts to Q if P →β Q. A term of the form (λx. N)Q is called a redex.

Problem 42.16. Spell out the equivalent inductive definition of β-contraction, as we did for change of bound variable in Definition 42.21.

Definition 42.40 (β-reduction, ↠β). β-reduction (↠β) is the smallest reflexive, transitive relation on terms containing →β. We say P β-reduces to Q if P ↠β Q.


We will write → instead of →β, and ↠ instead of ↠β, when the context is clear. Informally speaking, M ↠β N if and only if M can be changed to N by zero or more steps of β-contraction.

Definition 42.41 (β-normal). A term that cannot be β-contracted any further is said to be β-normal.

If M ↠β N and N is β-normal, then we say N is a normal form of M. One may ask if the normal form of a term is unique, and the answer is yes, as we will see later.
Let us consider some examples.

1. We have

   (λx. xxy)λz. z → (λz. z)(λz. z)y
   → (λz. z)y
   → y

2. “Simplifying” a term can actually make it more complex:

   (λx. xxy)(λx. xxy) → (λx. xxy)(λx. xxy)y
   → (λx. xxy)(λx. xxy)yy
   → ...

3. It can also leave a term unchanged:

   (λx. xx)(λx. xx) → (λx. xx)(λx. xx)

4. Also, some terms can be reduced in more than one way; for example,

   (λx. (λy. yx)z)v → (λy. yv)z

   by contracting the outermost application; and

   (λx. (λy. yx)z)v → (λx. zx)v

   by contracting the innermost one. Note, in this case, however, that both terms further reduce to the same term, zv.
The final outcome in the last example is not a coincidence, but rather illustrates a deep and important property of the lambda calculus, known as the Church–Rosser property.
In general, there is more than one way to β-reduce a term, and thus many reduction strategies have been invented, among which the most common is the natural strategy. The natural strategy always contracts the left-most redex, where the position of a redex is defined as its starting point in the term. The natural strategy has the useful property that a term can be reduced to a normal form by some strategy iff it can be reduced to normal form using the natural strategy. In what follows we will use the natural strategy unless otherwise specified.
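As an illustration (ours, not part of the text), here is one step of the natural strategy as a Python function, on the tuple representation of terms used in the De Bruijn sketch above. The substitution is naive: as the convention at the end of section 42.8 allows, it assumes representatives are chosen so that no variable capture can occur.

    def subst(term, x, q):
        """Naive N[Q/x]; assumes bound variables are chosen to avoid capture."""
        if isinstance(term, str):
            return q if term == x else term
        if term[0] == 'app':
            return ('app', subst(term[1], x, q), subst(term[2], x, q))
        _, y, body = term
        return term if y == x else ('lam', y, subst(body, x, q))

    def step(term):
        """Contract the leftmost redex; return None if the term is β-normal."""
        if isinstance(term, str):
            return None
        if term[0] == 'lam':
            s = step(term[2])
            return None if s is None else ('lam', term[1], s)
        p, q = term[1], term[2]
        if isinstance(p, tuple) and p[0] == 'lam':    # (λx. N)Q →β N[Q/x]
            return subst(p[2], p[1], q)
        s = step(p)                                   # otherwise, leftmost subterm first
        if s is not None:
            return ('app', s, q)
        s = step(q)
        return None if s is None else ('app', p, s)

    # (λx. (λy. yx)z)v → (λy. yv)z → zv, as in example (4) above.
    t = ('app', ('lam', 'x', ('app', ('lam', 'y', ('app', 'y', 'x')), 'z')), 'v')
    t = step(t)
    assert t == ('app', ('lam', 'y', ('app', 'y', 'v')), 'z')
    assert step(t) == ('app', 'z', 'v')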


Definition 42.42 (β-equivalence, =). β-equivalence (=) is the relation inductively defined as follows:

1. M = M .

2. If M = N , then N = M .

3. If M = N , N = O, then M = O.

4. If M = N , then P M = P N .

5. If M = N , then M Q = N Q.

6. If M = N , then λx. M = λx. N .

7. (λx. N )Q = N [Q/x].

The first three rules make the relation an equivalence relation; the next three make it compatible; the last ensures that it contains β-contraction.
Informally speaking, two terms are β-equivalent if and only if one of them can be changed to the other in zero or more steps of β-contraction, or “inverses” of β-contraction. The inverse of β-contraction is defined so that M inverse-β-contracts to N iff N β-contracts to M.
Besides the above rules, we will extend the relation with more rules, and denote the extended equivalence relation as =X, where X is the extending rule.


42.10 η-conversion
There is another relation on λ-terms. In section 42.4 we used the example λx. (f x), which accepts an argument and applies f to it. In other words, it is the same function as f: (λx. (f x))N and f N both reduce to f N. We use η-reduction (and η-expansion) to capture this idea.
Definition 42.43 (η-contraction, →η). η-contraction (→η) is the smallest compatible relation on terms satisfying the following condition:

λx. M x →η M   provided x ∉ FV(M)

Definition 42.44 (βη-reduction, ↠βη). βη-reduction (↠βη) is the smallest reflexive, transitive relation on terms containing →β and →η, i.e., the rules of reflexivity and transitivity plus the following two rules:

1. If M →β N then M ↠βη N.

2. If M →η N then M ↠βη N.



Definition 42.45. We extend the equivalence relation = with the η-conversion rule:

λx. f x = f   provided x ∉ FV(f)

and denote the extended relation as =η.

η-equivalence is important because it is related to the extensionality of lambda terms:

Definition 42.46 (Extensionality). We extend the equivalence relation = with the (ext) rule:

If M x = N x then M = N, provided x ∉ FV(M N)

and denote the extended relation as =ext.

Roughly speaking, the rule states that two terms, viewed as functions, should be considered equal if they behave the same on the same argument. We now prove that the η rule provides exactly extensionality, and nothing else.
Theorem 42.47. M =ext N if and only if M =η N.

Proof. First we prove that =η is closed under the extensionality rule; that is, the ext rule doesn't add anything to =η. It follows that =η contains =ext, and so if M =ext N, then M =η N.
To prove that =η is closed under ext, note that for any M = N derived by the ext rule, we have M x =η N x as premise. Then we have λx. M x =η λx. N x by a rule of =η, and applying η on both sides gives us M =η N.
Similarly, we prove that the η rule is contained in =ext. For any λx. M x and M with x ∉ FV(M), we have that (λx. M x)x =ext M x, giving us λx. M x =ext M by the ext rule.

Chapter 43

The Church–Rosser Property


43.1 Definition and Properties


In this chapter we introduce the Church–Rosser property and prove some general facts about it.

Definition 43.1 (Church–Rosser property, CR). A relation →X on terms is said to satisfy the Church–Rosser property iff, whenever M →X P and M →X Q, then there exists some N such that P →X N and Q →X N.

We can view the lambda calculus as a model of computation in which terms in normal form are “values” and a reducibility relation on terms provides the “calculation rules.” The Church–Rosser property states that when there is more than one way to proceed with a calculation, there is still only a single value of the expression.
To take an example from elementary algebra, there's more than one way to calculate 4 × (1 + 2) + 3. It can either be reduced to 4 × 3 + 3 (if we first reduce 1 + 2 to 3) or to 4 × 1 + 4 × 2 + 3 (if we first reduce 4 × (1 + 2) using distributivity). Both of these, however, can be further reduced to 12 + 3.
If we take →X to be β-reduction, we easily see that a consequence of the Church–Rosser property is that if a term has a normal form, then it is unique. For suppose M can be reduced to P and Q, both of which are normal forms. By the Church–Rosser property, there exists some N such that both P and Q reduce to it. Since by assumption P and Q are normal forms, the reduction of P and Q to N can only be the trivial reduction, i.e., P, Q, and N are identical. This justifies our speaking of the normal form of a term.
In viewing the lambda calculus as a model of computation, then, the normal form of a term can be thought of as the “final result” of the computation starting with that term. The above corollary means there's only one, if any, final result of a computation, just like there is only one result of computing 4 × (1 + 2) + 3, namely 15.

Theorem 43.2. If a relation →X satisfies the Church–Rosser property, and ↠X is the smallest transitive relation containing →X, then ↠X satisfies the Church–Rosser property too.

Proof. Suppose M ↠X P and M ↠X Q, i.e.,

M →X P1 →X . . . →X Pm ≡ P and
M →X Q1 →X . . . →X Qn ≡ Q.

We will prove the theorem by constructing a grid N of terms of height m + 1 and width n + 1. We use Ni,j to denote the term in the i-th row and j-th column.


We construct N in such a way that Ni,j →X Ni+1,j and Ni,j →X Ni,j+1. It is defined as follows:

N0,0 = M
Ni,0 = Pi   if 1 ≤ i ≤ m
N0,j = Qj   if 1 ≤ j ≤ n

and otherwise:

Ni,j = R

where R is a term such that Ni−1,j →X R and Ni,j−1 →X R. By the Church–Rosser property of →X, such a term always exists.
Now we have Nm,0 →X . . . →X Nm,n and N0,n →X . . . →X Nm,n. Note that Nm,0 is P and N0,n is Q. By the definition of ↠X the theorem follows.


43.2 Parallel β-reduction


We introduce the notion of parallel β-reduction, and prove that it has the Church–Rosser property.
Definition 43.3 (Parallel β-reduction, ⇒β). Parallel β-reduction (⇒β) on terms is inductively defined as follows:

1. x ⇒β x.

2. If N ⇒β N′ then λx. N ⇒β λx. N′.

3. If P ⇒β P′ and Q ⇒β Q′ then P Q ⇒β P′Q′.

4. If N ⇒β N′ and Q ⇒β Q′ then (λx. N)Q ⇒β N′[Q′/x].

Parallel β-reduction allows us to reduce any number of redices in a term in one step. It is different from β-reduction in the sense that we can only contract redices that occur in the original term, but not redices arising from parallel β-reduction. For example, the term (λf. f x)(λy. y) can only be parallel β-reduced to itself or to (λy. y)x, but not further to x, although it β-reduces to x, because this redex arises only after one step of parallel β-reduction. A second parallel β-reduction step yields x, though.

Theorem 43.4. M ⇒β M.


Proof. Exercise.

Problem 43.1. Prove Theorem 43.4.


Definition 43.5 (β-complete development). The β-complete development M∗β of M is defined inductively as follows:

x∗β = x   (43.1)
(λx. N)∗β = λx. N∗β   (43.2)
(P Q)∗β = P∗β Q∗β   if P is not a λ-abstract   (43.3)
((λx. N)Q)∗β = N∗β[Q∗β/x]   (43.4)

The β-complete development of a term, as its name suggests, is a “complete parallel reduction.” While for parallel β-reduction we can still choose not to contract a redex, for the complete development we have no choice but to contract all of them. Thus the complete development of (λf. f x)(λy. y) is (λy. y)x, not the term itself.
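On the same tuple representation, the clauses of Definition 43.5 transcribe directly; a sketch (ours, not part of the text), reusing subst from the earlier reduction sketch:

    def complete_dev(t):
        """The β-complete development t*β of Definition 43.5."""
        if isinstance(t, str):
            return t                                          # x*β = x
        if t[0] == 'lam':
            return ('lam', t[1], complete_dev(t[2]))          # (λx. N)*β = λx. N*β
        p, q = t[1], t[2]
        if isinstance(p, tuple) and p[0] == 'lam':            # ((λx. N)Q)*β = N*β[Q*β/x]
            return subst(complete_dev(p[2]), p[1], complete_dev(q))
        return ('app', complete_dev(p), complete_dev(q))      # (P Q)*β = P*β Q*β

    # The complete development of (λf. f x)(λy. y) is (λy. y)x, as noted above.
    t = ('app', ('lam', 'f', ('app', 'f', 'x')), ('lam', 'y', 'y'))
    assert complete_dev(t) == ('app', ('lam', 'y', 'y'), 'x')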

This definition has the problem that we haven’t introduced how to


define functions on (λ-)terms recursively. Will fix in future.

Lemma 43.6. If M ⇒β M′ and R ⇒β R′, then M[R/y] ⇒β M′[R′/y].

Proof. By induction on the derivation of M ⇒β M′.

1. The last step is (1): Exercise.

2. The last step is (2): Then M is λx. N and M′ is λx. N′, where N ⇒β N′. We want to prove that (λx. N)[R/y] ⇒β (λx. N′)[R′/y], i.e., λx. N[R/y] ⇒β λx. N′[R′/y]. This follows immediately by (2) and the induction hypothesis.

3. The last step is (3): Exercise.

4. The last step is (4): M is (λx. N)Q and M′ is N′[Q′/x]. We want to prove that ((λx. N)Q)[R/y] ⇒β N′[Q′/x][R′/y], i.e., (λx. N[R/y])Q[R/y] ⇒β N′[R′/y][Q′[R′/y]/x]. This follows by (4) and the induction hypothesis.
Problem 43.2. Complete the proof of Lemma 43.6.


Lemma 43.7. If M ⇒β M′ then M′ ⇒β M∗β.

Proof. By induction on the derivation of M ⇒β M′.

1. The last rule is (1): Exercise.

2. The last rule is (2): M is λx. N and M′ is λx. N′ with N ⇒β N′. We want to show that λx. N′ ⇒β (λx. N)∗β, i.e., λx. N′ ⇒β λx. N∗β by eq. (43.2). It follows by (2) and the induction hypothesis.

3. The last rule is (3): M is P Q and M′ is P′Q′ for some P, Q, P′ and Q′, with P ⇒β P′ and Q ⇒β Q′. By induction hypothesis, we have P′ ⇒β P∗β and Q′ ⇒β Q∗β.

   a) If P is λx. N for some x and N, then P′ must be λx. N′ for some N′ with N ⇒β N′. By induction hypothesis we have N′ ⇒β N∗β and Q′ ⇒β Q∗β. Then (λx. N′)Q′ ⇒β N∗β[Q∗β/x] by (4), and the right-hand side is ((λx. N)Q)∗β by eq. (43.4).

   b) If P is not a λ-abstract, then P′Q′ ⇒β P∗β Q∗β by (3), and the right-hand side is (P Q)∗β by eq. (43.3).

4. The last rule is (4): M is (λx. N)Q and M′ is N′[Q′/x] for some x, N, Q, N′, and Q′, with N ⇒β N′ and Q ⇒β Q′. By induction hypothesis we know N′ ⇒β N∗β and Q′ ⇒β Q∗β. By Lemma 43.6 we have N′[Q′/x] ⇒β N∗β[Q∗β/x], the right-hand side of which is exactly ((λx. N)Q)∗β.

Problem 43.3. Complete the proof of Lemma 43.7.


Theorem 43.8. ⇒β has the Church–Rosser property.

Proof. Immediate from Lemma 43.7.


43.3 β-reduction
Lemma 43.9. If M →β M′, then M ⇒β M′.

Proof. If the contracted redex is M itself, then M is (λx. N)Q and M′ is N[Q/x], for some x, N, and Q. Since N ⇒β N and Q ⇒β Q by Theorem 43.4, we immediately have (λx. N)Q ⇒β N[Q/x] by Definition 43.3(4). If the contracted redex occurs as a proper subterm of M, the claim follows by induction on the derivation of M →β M′, using (2) and (3) of Definition 43.3.
β β
Lemma 43.10. If M ⇒β M′, then M ↠β M′.

Proof. By induction on the derivation of M ⇒β M′.


1. The last rule is (1): Then M and M′ are just x, and x ↠β x.

2. The last rule is (2): M is λx. N and M′ is λx. N′ for some x, N, N′, where N ⇒β N′. By induction hypothesis we have N ↠β N′. Then λx. N ↠β λx. N′ (by the same series of →β contractions as N ↠β N′).

3. The last rule is (3): M is P Q and M′ is P′Q′ for some P, Q, P′, Q′, where P ⇒β P′ and Q ⇒β Q′. By induction hypothesis we have P ↠β P′ and Q ↠β Q′. So P Q ↠β P′Q′ by the reduction sequence P ↠β P′ followed by the reduction Q ↠β Q′.

4. The last rule is (4): M is (λx. N)Q and M′ is N′[Q′/x] for some x, N, N′, Q, Q′, where N ⇒β N′ and Q ⇒β Q′. By induction hypothesis we get Q ↠β Q′ and N ↠β N′. So (λx. N)Q ↠β N′[Q′/x] by N ↠β N′ followed by Q ↠β Q′ and finally contraction of (λx. N′)Q′ to N′[Q′/x].

Lemma 43.11. ↠β is the smallest transitive relation containing ⇒β.

Proof. Let ↠X be the smallest transitive relation containing ⇒β.
↠β ⊆ ↠X: Suppose M ↠β M′, i.e., M ≡ M1 →β . . . →β Mk ≡ M′. By Lemma 43.9, M ≡ M1 ⇒β . . . ⇒β Mk ≡ M′. Since ↠X contains ⇒β and is transitive, M ↠X M′.
↠X ⊆ ↠β: Suppose M ↠X M′, i.e., M ≡ M1 ⇒β . . . ⇒β Mk ≡ M′. By Lemma 43.10, M ≡ M1 ↠β . . . ↠β Mk ≡ M′. Since ↠β is transitive, M ↠β M′.
β
Theorem 43.12. −

→ satisfies the Church–Rosser property. lam:cr:b:
thm:cr

Proof. Immediate from Theorem 43.2, Theorem 43.8, and Lemma 43.11.


43.4 Parallel βη-reduction


In this section we prove the Church–Rosser property for parallel βη-reduction, the parallel reduction notion corresponding to βη-reduction.
Definition 43.13 (Parallel βη-reduction, ⇒βη). Parallel βη-reduction (⇒βη) on terms is inductively defined as follows:

1. x ⇒βη x.

2. If N ⇒βη N′ then λx. N ⇒βη λx. N′.

3. If P ⇒βη P′ and Q ⇒βη Q′ then P Q ⇒βη P′Q′.

4. If N ⇒βη N′ and Q ⇒βη Q′ then (λx. N)Q ⇒βη N′[Q′/x].

5. If N ⇒βη N′ then λx. N x ⇒βη N′, provided x ∉ FV(N).

Theorem 43.14. M ⇒βη M.

Proof. Exercise.

Problem 43.4. Prove Theorem 43.14.

Definition 43.15 (βη-complete development). The βη-complete development M∗βη of M is defined as follows:

x∗βη = x   (43.5)
(λx. N)∗βη = λx. N∗βη   (43.6)
(P Q)∗βη = P∗βη Q∗βη   if P is not a λ-abstract   (43.7)
((λx. N)Q)∗βη = N∗βη[Q∗βη/x]   (43.8)
(λx. N x)∗βη = N∗βη   if x ∉ FV(N)   (43.9)

Lemma 43.16. If M ⇒βη M′ and R ⇒βη R′, then M[R/y] ⇒βη M′[R′/y].

Proof. By induction on the derivation of M ⇒βη M′.
The first four cases are exactly like those in Lemma 43.6. If the last rule is (5), then M is λx. N x and M′ is N′ for some x, N, and N′ where x ∉ FV(N) and N ⇒βη N′. We want to show that (λx. N x)[R/y] ⇒βη N′[R′/y], i.e., λx. N[R/y]x ⇒βη N′[R′/y]. It follows by Definition 43.13(5) and the induction hypothesis.
Lemma 43.17. If M ⇒βη M′ then M′ ⇒βη M∗βη.

Proof. By induction on the derivation of M ⇒βη M′.
The first four cases are like those in Lemma 43.7. If the last rule is (5), then M is λx. N x and M′ is N′ for some x, N, N′ where x ∉ FV(N) and N ⇒βη N′. We want to show that N′ ⇒βη (λx. N x)∗βη, i.e., N′ ⇒βη N∗βη, which is immediate by the induction hypothesis.
Theorem 43.18. ⇒βη has the Church–Rosser property.

Proof. Immediate from Lemma 43.17.


43.5 βη-reduction
The Church–Rosser property holds for βη-reduction (↠βη).

Lemma 43.19. If M →βη M′, then M ⇒βη M′.

Proof. By induction on the derivation of M →βη M′. If M →βη M′ by an η-contraction (i.e., by Definition 42.43), we use Theorem 43.14 together with rule (5). The other cases are as in Lemma 43.9.

Lemma 43.20. If M ⇒βη M′, then M ↠βη M′.

Proof. Induction on the derivation of M ⇒βη M′.
If the last rule is (5), then M is λx. N x and M′ is N′ for some x, N, N′ where x ∉ FV(N) and N ⇒βη N′. Thus we can first reduce λx. N x to N by η-contraction, followed by the series of →βη steps that show that N ↠βη N′, which holds by the induction hypothesis.

Lemma 43.21. ↠βη is the smallest transitive relation containing ⇒βη.

Proof. As in Lemma 43.11.

Theorem 43.22. ↠βη satisfies the Church–Rosser property.

Proof. By Theorem 43.2, Theorem 43.18, and Lemma 43.21.
Chapter 44

Lambda Definability

This chapter is experimental. It needs more explanation, and the ma-


terial should be structured better into definitions and propositions with
proofs, and more examples.


44.1 Introduction
At first glance, the lambda calculus is just a very abstract calculus of expressions that represent functions and applications of them to others. Nothing in the syntax of the lambda calculus suggests that these are functions of particular kinds of objects; in particular, the syntax includes no mention of natural numbers. Its basic operations, application and lambda abstraction, are operations that apply to any function, not just functions on natural numbers.
Nevertheless, with some ingenuity, it is possible to define arithmetical functions, i.e., functions on the natural numbers, in the lambda calculus. To do this, we define, for each natural number n ∈ N, a special λ-term n, the Church numeral for n. (Church numerals are named for Alonzo Church.)

Definition 44.1. If n ∈ N, the corresponding Church numeral n represents n:

n ≡ λf x. fⁿ(x)

Here, fⁿ(x) stands for the result of applying f to x n times. For example, 0 is λf x. x, and 3 is λf x. f(f(f x)).

The Church numeral n is encoded as a lambda term which represents a function accepting two arguments f and x, and returning fⁿ(x). Church numerals are evidently in normal form.
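As an aside, the encoding is easy to experiment with in an ordinary programming language. Here is a minimal Python sketch (ours, not part of the text): a Church numeral becomes a curried function of f and x, and we decode by applying it to the genuine successor function and 0.

    def church(n):
        """The Church numeral for n: λf. λx. f applied n times to x."""
        return lambda f: lambda x: x if n == 0 else f(church(n - 1)(f)(x))

    def to_int(numeral):
        """Decode a Church numeral by iterating the successor on 0."""
        return numeral(lambda k: k + 1)(0)

    assert to_int(church(3)) == 3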


A representation of natural numbers in the lambda calculus is only useful, of course, if we can compute with them. Computing with Church numerals in the lambda calculus means applying a λ-term F to such a Church numeral, and reducing the combined term F n to a normal form. If it always reduces to a normal form, and the normal form is always a Church numeral m, we can think of the output of the computation as being the number m. We can then think of F as defining a function f : N → N, namely the function such that f(n) = m iff F n ↠ m. Because of the Church–Rosser property, normal forms are unique if they exist. So if F n ↠ m, there can be no other term in normal form, in particular no other Church numeral, that F n reduces to.
Conversely, given a function f : N → N, we can ask if there is a term F that defines f in this way. In that case we say that F λ-defines f, and that f is λ-definable. We can generalize this to many-place and partial functions.
Definition 44.2. Suppose f : Nᵏ → N. We say that a lambda term F λ-defines f if for all n0, . . . , nk−1,

F n0 n1 . . . nk−1 ↠ f(n0, n1, . . . , nk−1)

if f(n0, . . . , nk−1) is defined, and F n0 n1 . . . nk−1 has no normal form otherwise.

A very simple example is provided by the constant functions. The term Ck ≡ λx. k λ-defines the function ck : N → N such that ck(n) = k. For Ck n ≡ (λx. k)n → k for any n. The identity function is λ-defined by λx. x. More complex functions are of course harder to define, and often require a lot of ingenuity. So it is perhaps surprising that every computable function is λ-definable. The converse is also true: if a function is λ-definable, it is computable.


44.2 λ-Definable Arithmetical Functions


Proposition 44.3. The successor function succ is λ-definable.

Proof. A term that λ-defines the successor function is

Succ ≡ λa. λf x. f(af x).

Given our conventions, this is short for

Succ ≡ λa. λf. λx. (f((af)x)).

Succ is a function that accepts as argument a number a, and evaluates to another function, λf x. f(af x). That function is not itself a Church numeral. However, if the argument a is a Church numeral, it reduces to one. Consider:

(λa. λf x. f(af x)) n → λf x. f(nf x).


The embedded term nf x is a redex, since n is λf x. fⁿx. So nf x ↠ fⁿx and so, for the entire term we have

Succ n ↠ λf x. f(fⁿ(x)),

i.e., n + 1.

Example 44.4. Let's look at what happens when we apply Succ to 0, i.e., λf x. x. We'll spell the terms out in full:

Succ 0 ≡ (λa. λf. λx. (f((af)x)))(λf. λx. x)
→ λf. λx. (f(((λf. λx. x)f)x))
→ λf. λx. (f((λx. x)x))
→ λf. λx. (f x) ≡ 1

Problem 44.1. The term

Succ′ ≡ λn. λf x. nf (f x)

λ-defines the successor function. Explain why.

Proposition 44.5. The addition function add is λ-definable.

Proof. Addition is λ-defined by the term

Add ≡ λab. λf x. af(bf x)

or, alternatively,

Add′ ≡ λab. a Succ b.

The first addition works as follows: Add first accepts two numbers a and b. The result is a function that accepts f and x and returns af(bf x). If a and b are Church numerals n and m, this reduces to fⁿ⁺ᵐ(x), which is identical to fⁿ(fᵐ(x)). Or, slowly:

(λab. λf x. af(bf x)) n m ↠ λf x. n f(m f x)
↠ λf x. n f(fᵐ x)
↠ λf x. fⁿ(fᵐ x) ≡ n + m.

The second representation of addition, Add′, works differently: applied to two Church numerals n and m,

Add′ n m ↠ n Succ m.


But nf x always reduces to fⁿ(x). So,

n Succ m ↠ Succⁿ(m).

And since Succ λ-defines the successor function, and the successor function applied n times to m gives n + m, this in turn reduces to n + m.

Proposition 44.6. Multiplication is λ-definable by the term

Mult ≡ λab. λf x. a(bf)x

Proof. To see how this works, suppose we apply Mult to Church numerals n and m: Mult n m reduces to λf x. n(m f)x. The term m f defines a function which applies f to its argument m times. Consequently, n(m f)x applies the function “apply f m times” itself n times to x. In other words, we apply f to x, n · m times. But the resulting normal term is just the Church numeral nm.

We can actually simplify this term further by η-reduction (see section 42.10):

Mult ≡ λab. λf. a(bf).

Problem 44.2. Multiplication can be λ-defined by the term

Mult′ ≡ λab. a(Add b)0.

Explain why this works.

The definition of exponentiation as a λ-term is surprisingly simple:

Exp ≡ λbe. eb.

The first argument b is the base and the second e is the exponent. Intuitively, ef is fᵉ by our encoding of numbers. If you find this hard to understand, exponentiation can also be defined by iterated multiplication:

Exp′ ≡ λbe. e(Mult b)1.
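The arithmetical terms of this section transcribe directly into the Python sketch started above (ours, not part of the text); church and to_int are the helpers defined there.

    succ = lambda a: lambda f: lambda x: f(a(f)(x))              # Succ ≡ λa. λf x. f(af x)
    add = lambda a: lambda b: lambda f: lambda x: a(f)(b(f)(x))  # Add ≡ λab. λf x. af(bf x)
    mult = lambda a: lambda b: lambda f: lambda x: a(b(f))(x)    # Mult ≡ λab. λf x. a(bf)x
    exp = lambda b: lambda e: e(b)                               # Exp ≡ λbe. eb

    assert to_int(succ(church(4))) == 5
    assert to_int(add(church(2))(church(3))) == 5
    assert to_int(mult(church(2))(church(3))) == 6
    assert to_int(exp(church(2))(church(3))) == 8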

Predecessor and subtraction on Church numerals are not as simple as one might think: they require an encoding of pairs.


44.3 Pairs and Predecessor


Definition 44.7. The pair of M and N (written ⟨M, N ⟩) is defined as follows:

⟨M, N ⟩ ≡ λf. f M N.

Intuitively it is a function that accepts a function, and applies that function


to the two elements of the pair. Following this idea we have this constructor,
which takes two terms and returns the pair containing them:

Pair ≡ λmn. λf. f mn

Given a pair, we also want to recover its elements. For this we need two access functions, which accept a pair as argument and return the first or second element of it:

Fst ≡ λp. p(λmn. m)


Snd ≡ λp. p(λmn. n)

Problem 44.3. Explain why the access functions Fst and Snd work.

Now with pairs we can λ-define the predecessor function:

Pred ≡ λn. Fst(n(λp. ⟨Snd p, Succ(Snd p)⟩)⟨0, 0⟩)

Remember that n f x reduces to fⁿ(x); in this case f is a function that accepts a pair p and returns a new pair containing the second component of p and the successor of the second component; x is the pair ⟨0, 0⟩. Thus, the result is ⟨0, 0⟩ for n = 0, and ⟨n − 1, n⟩ otherwise. Pred then returns the first component of the result.
Subtraction can be defined as Pred applied to a, b times:

Sub ≡ λab. b Pred a.
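A sketch of the pair encoding, Pred, and Sub in the same Python style (ours, not part of the text; church, to_int, and succ are from the earlier sketches):

    pair = lambda m: lambda n: lambda f: f(m)(n)   # ⟨M, N⟩ ≡ λf. f M N
    fst = lambda p: p(lambda m: lambda n: m)       # Fst ≡ λp. p(λmn. m)
    snd = lambda p: p(lambda m: lambda n: n)       # Snd ≡ λp. p(λmn. n)

    zero = church(0)
    pred = lambda n: fst(n(lambda p: pair(snd(p))(succ(snd(p))))(pair(zero)(zero)))
    sub = lambda a: lambda b: b(pred)(a)           # Sub ≡ λab. b Pred a

    assert to_int(pred(church(5))) == 4
    assert to_int(pred(church(0))) == 0            # Pred 0 is 0
    assert to_int(sub(church(7))(church(3))) == 4  # subtraction is truncated at 0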


44.4 Truth Values and Relations


We can encode truth values in the pure lambda calculus as follows:

true ≡ λx. λy. x


false ≡ λx. λy. y

Truth values are represented as selectors, i.e., functions that accept two arguments and return one of them. The truth value true selects its first argument, and false its second. For example, true M N always reduces to M, while false M N always reduces to N.


Definition 44.8. We call a relation R ⊆ Nᵏ λ-definable if there is a term R such that

R n1 . . . nk ↠β true

whenever R(n1, . . . , nk), and

R n1 . . . nk ↠β false

otherwise.

For instance, the relation IsZero = {0}, which holds of 0 and only of 0, is λ-definable by

IsZero ≡ λn. n(λx. false) true.

How does it work? Since Church numerals are defined as iterators (functions which apply their first argument n times to the second), we set the initial value to be true, and for every step of iteration, we return false regardless of the result of the last iteration. This step will be applied to the initial value n times, and the result will be true if and only if the step is not applied at all, i.e., when n = 0.
On the basis of this representation of truth values, we can further define some truth functions. Here are two, the representations of negation and conjunction:

Not ≡ λx. x false true
And ≡ λx. λy. xy false

The function Not accepts one argument, and returns true if the argument is false, and false if the argument is true. The function And accepts two truth values as arguments, and should return true iff both arguments are true. Truth values are represented as selectors (described above), so when x is a truth value and is applied to two arguments, the result will be the first argument if x is true and the second argument otherwise. Now And takes its two arguments x and y, and in turn passes y and false to its first argument x. Assuming x is a truth value, the result will evaluate to y if x is true, and to false if x is false, which is just what is desired.
Note that we assume here that only truth values are used as arguments to And. If it is passed other terms, the result (i.e., the normal form, if it exists) may well not be a truth value.
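Continuing the Python sketch (ours, not part of the text), the selector encoding of truth values and the terms above behave as described; church is from the earlier sketch.

    true = lambda x: lambda y: x     # true ≡ λx. λy. x
    false = lambda x: lambda y: y    # false ≡ λx. λy. y

    is_zero = lambda n: n(lambda x: false)(true)   # IsZero ≡ λn. n(λx. false) true
    not_ = lambda x: x(false)(true)                # Not ≡ λx. x false true
    and_ = lambda x: lambda y: x(y)(false)         # And ≡ λx. λy. xy false

    assert is_zero(church(0)) is true and is_zero(church(2)) is false
    assert not_(true) is false
    assert and_(true)(false) is false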
Problem 44.4. Define the functions Or and Xor representing the truth func-
tions of inclusive and exclusive disjunction using the encoding of truth values
as λ-terms.


44.5 Primitive Recursive Functions are λ-Definable


Recall that the primitive recursive functions are those that can be defined from the basic functions zero, succ, and Pᵢⁿ by composition and primitive recursion.

Lemma 44.9. The basic primitive recursive functions zero, succ, and projections Pᵢⁿ are λ-definable.

Proof. They are λ-defined by the following terms:

Zero ≡ λa. λf x. x
Succ ≡ λa. λf x. f(af x)
Projⁿᵢ ≡ λx0 . . . xn−1. xi

Lemma 44.10. Suppose the k-ary function f and the n-ary functions g0, . . . , gk−1 are λ-definable by terms F, G0, . . . , Gk−1, and h is defined from them by composition. Then h is λ-definable.

Proof. h can be λ-defined by the term

H ≡ λx0 . . . xn−1. F(G0 x0 . . . xn−1) . . . (Gk−1 x0 . . . xn−1)

We leave verification of this fact as an exercise.

Problem 44.5. Complete the proof of Lemma 44.10 by showing that H n0 . . . nn−1 ↠ h(n0, . . . , nn−1).

Note that Lemma 44.10 did not require that f and g0 , . . . , gk−1 are primitive
recursive; it is only required that they are total and λ-definable.

Lemma 44.11. Suppose f is an n-ary function and g is an (n + 2)-ary function, both λ-definable by terms F and G, and the function h is defined from f and g by primitive recursion. Then h is also λ-definable.

Proof. Recall that h is defined by

h(x1, . . . , xn, 0) = f(x1, . . . , xn)
h(x1, . . . , xn, y + 1) = g(x1, . . . , xn, y, h(x1, . . . , xn, y)).

Informally speaking, the primitive recursive definition iterates a step function y times, starting with f(x1, . . . , xn). This is reminiscent of the definition of Church numerals, which are also defined as iterators.
For simplicity, we give the definition and proof for a single additional argument x. The function h is λ-defined by:

H ≡ λx. λy. Snd(yD⟨0, F x⟩)


where

D ≡ λp. ⟨Succ(Fst p), (Gx(Fst p)(Snd p))⟩

The iteration state we maintain is a pair, the first component of which is the current value of y and the second of which is the corresponding value of h. Every step of the iteration creates a pair of new values of y and h; after the iteration is done we return the second component of the pair, and that is the final value of h. We now prove that this is indeed a representation of primitive recursion.
We want to prove that for any n and m, H n m ↠ h(n, m). To do this we first show that if Dn ≡ D[n/x], then Dnᵐ⟨0, F n⟩ ↠ ⟨m, h(n, m)⟩. We proceed by induction on m.
If m = 0, we want Dn⁰⟨0, F n⟩ ↠ ⟨0, h(n, 0)⟩. But Dn⁰⟨0, F n⟩ just is ⟨0, F n⟩. Since F λ-defines f, this reduces to ⟨0, f(n)⟩, and since f(n) = h(n, 0), this is ⟨0, h(n, 0)⟩.
Now suppose that Dnᵐ⟨0, F n⟩ ↠ ⟨m, h(n, m)⟩. We want to show that Dnᵐ⁺¹⟨0, F n⟩ ↠ ⟨m + 1, h(n, m + 1)⟩.
Dnᵐ⁺¹⟨0, F n⟩ ≡ Dn(Dnᵐ⟨0, F n⟩)
↠ Dn⟨m, h(n, m)⟩   (by IH)
≡ (λp. ⟨Succ(Fst p), (G n(Fst p)(Snd p))⟩)⟨m, h(n, m)⟩
→ ⟨Succ(Fst ⟨m, h(n, m)⟩), (G n(Fst ⟨m, h(n, m)⟩)(Snd ⟨m, h(n, m)⟩))⟩
↠ ⟨Succ m, (G n m h(n, m))⟩
↠ ⟨m + 1, g(n, m, h(n, m))⟩

Since g(n, m, h(n, m)) = h(n, m + 1), we are done.
Finally, consider

H n m ≡ (λx. λy. Snd(y(λp. ⟨Succ(Fst p), (G x(Fst p)(Snd p))⟩)⟨0, F x⟩)) n m
↠ Snd(m (λp. ⟨Succ(Fst p), (G n(Fst p)(Snd p))⟩) ⟨0, F n⟩)
≡ Snd(m Dn⟨0, F n⟩)
↠ Snd(Dnᵐ⟨0, F n⟩)
↠ Snd ⟨m, h(n, m)⟩
↠ h(n, m).
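The pair trick of this proof can likewise be played out in the Python sketch (ours, not part of the text; it reuses church, to_int, pair, fst, snd, succ, and zero from above):

    def primrec(F, G):
        """H with H(n)(m) behaving like h(n, m), where h(x, 0) = f(x) and
        h(x, y + 1) = g(x, y, h(x, y)); F and G encode f and g."""
        step = lambda n: lambda p: pair(succ(fst(p)))(G(n)(fst(p))(snd(p)))
        return lambda n: lambda m: snd(m(step(n))(pair(zero)(F(n))))

    # Addition by primitive recursion: add(x, 0) = x, add(x, y + 1) = succ(add(x, y)).
    F = lambda n: n                                   # f(x) = x
    G = lambda x: lambda y: lambda acc: succ(acc)     # g(x, y, acc) = acc + 1
    add_pr = primrec(F, G)
    assert to_int(add_pr(church(3))(church(4))) == 7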

Proposition 44.12. Every primitive recursive function is λ-definable.

Proof. By Lemma 44.9, all basic functions are λ-definable, and by Lemma 44.10
and Lemma 44.11, the λ-definable functions are closed under composition and
primitive recursion.


44.6 Fixpoints
Suppose we wanted to define the factorial function by recursion as a term Fac with the following property:

Fac ≡ λn. IsZero n 1 (Mult n (Fac(Pred n)))

That is, the factorial of n is 1 if n = 0, and n times the factorial of n − 1


otherwise. Of course, we cannot define the term Fac this way since Fac itself
occurs in the right-hand side. Such recursive definitions involving self-reference
are not part of the lambda calculus. Defining a term, e.g., by

Mult ≡ λab. a(Add b)0

only involves previously defined terms in the right-hand side, such as Add. We
can always remove Add by replacing it with its defining term. This would give
the term Mult as a pure lambda term; if Add itself involved defined terms (as,
e.g., Add′ does), we could continue this process and finally arrive at a pure
lambda term.
However this is not true in the case of recursive definitions like the one of
Fac above. If we replace the occurrence of Fac on the right-hand side with the
definition of Fac itself, we get:

Fac ≡ λn. IsZero n 1


(Mult n((λn. IsZero n 1 (Mult n (Fac(Pred n))))(Pred n)))

and we still haven’t gotten rid of Fac on the right-hand side. Clearly, if we
repeat this process, the definition keeps growing longer and the process never
results in a pure lambda term. Thus this way of defining factorial (or more
generally recursive functions) is not feasible.
The recursive definition does tell us something, though: If f were a term
representing the factorial function, then the term

Fac′ ≡ λg. λn. IsZero n 1 (Mult n (g(Pred n)))

applied to the term f, i.e., Fac′ f, also represents the factorial function. That is, if we regard Fac′ as a function accepting a function and returning a function, the value of Fac′ f is just f, provided f is the factorial. A function f with the property that Fac′ f =β f is called a fixpoint of Fac′. So, the factorial is a fixpoint of Fac′.
There are terms in the lambda calculus that compute the fixpoints of a
given term, and these terms can then be used to turn a term like Fac′ into the
definition of the factorial.


Definition 44.13. The Y-combinator is the term:

Y ≡ (λux. x(uux))(λux. x(uux)).

Theorem 44.14. Y has the property that Y g ↠ g(Y g) for any term g. Thus, Y g is always a fixpoint of g.

Proof. Let’s abbreviate (λux. x(uux)) by U , so that Y ≡ U U . Then

Y g ≡ (λux. x(uux))U g


→ (λx. x(U U x))g


→ g(U U g) ≡ g(Y g).
β
Since g(Y g) and Y g both reduce to g(Y g), g(Y g) = Y g, so Y g is a fixpoint
of g.

Of course, since Y g is a redex, the reduction can continue indefinitely:

Y g ↠ g(Y g)
↠ g(g(Y g))
↠ g(g(g(Y g)))
. . .

So we can think of Y g as g applied to itself infinitely many times. If we apply g to it one additional time, we are, so to speak, not doing anything extra; g applied to g applied infinitely many times to Y g is still g applied to Y g infinitely many times.
Note that the above sequence of β-reduction steps starting with Y g is infinite. So if we apply Y g to some term, i.e., consider (Y g)N, that term will also reduce to infinitely many different terms, namely (g(Y g))N, (g(g(Y g)))N, . . . . It is nevertheless possible that some other sequence of reduction steps does terminate in a normal form.
Take the factorial for instance. Define Fac as Y Fac′ (i.e., a fixpoint of Fac′). Then:

Fac 3 ≡ Y Fac′ 3
↠ Fac′(Y Fac′) 3
≡ (λg. λn. IsZero n 1 (Mult n (g(Pred n)))) Fac 3
↠ IsZero 3 1 (Mult 3 (Fac(Pred 3)))
↠ Mult 3 (Fac 2).

Similarly,

Fac 2 ↠ Mult 2 (Fac 1)
Fac 1 ↠ Mult 1 (Fac 0)


but

Fac 0 ↠ Fac′(Y Fac′) 0
≡ (λg. λn. IsZero n 1 (Mult n (g(Pred n)))) Fac 0
↠ IsZero 0 1 (Mult 0 (Fac(Pred 0)))
↠ 1.

So together

Fac 3 ↠ Mult 3 (Mult 2 (Mult 1 1)).
What goes for Fac′ goes for any recursive definition. Suppose we have a recursive equation

g x1 . . . xn =β N

where N may contain g and x1, . . . , xn. Then there is always a term G ≡ Y(λg. λx1 . . . xn. N) such that

G x1 . . . xn =β N[G/g].

For by the fixpoint theorem,

G ≡ Y(λg. λx1 . . . xn. N) ↠ (λg. λx1 . . . xn. N)(Y(λg. λx1 . . . xn. N)) ≡ (λg. λx1 . . . xn. N) G

and consequently

G x1 . . . xn ↠ (λg. λx1 . . . xn. N) G x1 . . . xn
↠ (λx1 . . . xn. N[G/g]) x1 . . . xn
↠ N[G/g].
The Y combinator of Definition 44.13 is due to Alan Turing. Alonzo Church had proposed a different version which we'll call YC:

YC ≡ λg. (λx. g(xx))(λx. g(xx)).

Church's combinator is a bit weaker than Turing's in that YC g =β g(YC g) but not YC g ↠ g(YC g). Let V be the term λx. g(xx), so that YC ≡ λg. V V. Then

V V ≡ (λx. g(xx))V ↠ g(V V) and thus
YC g ≡ (λg. V V)g ↠ V V ↠ g(V V), but also
g(YC g) ≡ g((λg. V V)g) ↠ g(V V).

In other words, YC g and g(YC g) reduce to a common term g(V V); so YC g =β g(YC g). This is often enough for applications.
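In a call-by-value language like Python, both Y and YC loop forever, since the argument is evaluated before the function is ever applied. A standard workaround (our own addition, not in the text) is the η-expanded variant, often called the Z combinator:

    # Z ≡ λg. (λx. g(λv. xxv))(λx. g(λv. xxv)): the inner λv delays evaluation.
    Z = lambda g: (lambda x: g(lambda v: x(x)(v)))(lambda x: g(lambda v: x(x)(v)))

    # A recursive definition via its fixpoint, as with Fac′ above:
    fac = Z(lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1))
    assert fac(3) == 6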


44.7 Minimization
The general recursive functions are those that can be obtained from the basic functions zero, succ, Pᵢⁿ by composition, primitive recursion, and regular minimization. To show that all general recursive functions are λ-definable we have to show that any function defined by regular minimization from a λ-definable function is itself λ-definable.

Lemma 44.15. If f(x1, . . . , xk, y) is regular and λ-definable, then g, defined by

g(x1, . . . , xk) = µy [f(x1, . . . , xk, y) = 0],

is also λ-definable.

Proof. Suppose the lambda term F λ-defines the regular function f(⃗x, y). To λ-define g we use a search function and a fixpoint combinator:

Search ≡ λg. λf ⃗x y. IsZero(f ⃗x y) y (g f ⃗x(Succ y))
H ≡ λ⃗x. (Y Search)F ⃗x 0,

where Y is any fixpoint combinator. Informally speaking, Search is a self-referencing function: starting with y, test whether f ⃗x y is zero; if so, return y, otherwise call itself with Succ y. Thus (Y Search)F n1 . . . nk 0 returns the least m for which f(n1, . . . , nk, m) = 0.
Specifically, observe that

(Y Search)F n1 . . . nk m ↠ m

if f(n1, . . . , nk, m) = 0, or

(Y Search)F n1 . . . nk m ↠ (Y Search)F n1 . . . nk m + 1

otherwise. Since f is regular, f(n1, . . . , nk, y) = 0 for some y, and so

(Y Search)F n1 . . . nk 0 ↠ g(n1, . . . , nk).
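In the Python sketch (ours, not part of the text), using the Z combinator from the fixpoint section and plain integers for readability, the search looks like this; the vector of arguments ⃗x is assumed already fixed, so f is a one-place function of y.

    search = Z(lambda g: lambda f: lambda y: y if f(y) == 0 else g(f)(y + 1))

    # µy [f(y) = 0] for a regular f: the least such y is 3 here.
    f = lambda y: max(0, 3 - y)
    assert search(f)(0) == 3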

Proposition 44.16. Every general recursive function is λ-definable.

Proof. By Lemma 44.9, all basic functions are λ-definable, and by Lemma 44.10,
Lemma 44.11, and Lemma 44.15, the λ-definable functions are closed under
composition, primitive recursion, and regular minimization.


44.8 Partial Recursive Functions are λ-Definable


Partial recursive functions are those obtained from the basic functions by composition, primitive recursion, and unbounded minimization. They differ from general recursive functions in that the functions used in unbounded search are not required to be regular. Not requiring regularity means that functions defined by minimization may sometimes not be defined.
At first glance it might seem that the same methods used to show that the (total) general recursive functions are all λ-definable can be used to prove that all partial recursive functions are λ-definable. For instance, the composition of f with g is λ-defined by λx. F(Gx) if f and g are λ-defined by terms F and G, respectively. However, when the functions are partial, this is problematic. When g(x) is undefined, Gx has no normal form. In most cases this means that F(Gx) has no normal form either, which is what we want. But consider F ≡ λx. λy. y, in which case F(Gx) does have a normal form (λy. y).
This problem is not insurmountable, and there are ways to λ-define all partial recursive functions in such a way that undefined values are represented by terms without a normal form. These ways are, however, somewhat more complicated and less intuitive than the approach we have taken for general recursive functions. We record the theorem here without proof:
Theorem 44.17. All partial recursive functions are λ-definable.


44.9 λ-Definable Functions are Recursive


Not only are all partial recursive functions λ-definable, the converse is true, too. That is, all λ-definable functions are partial recursive.

Theorem 44.18. If a partial function f is λ-definable, it is partial recursive.

Proof. We only sketch the proof. First, we arithmetize λ-terms, i.e., systematically assign Gödel numbers to λ-terms, using the usual power-of-primes coding of sequences. Then we define a partial recursive function normalize(t), operating on the Gödel number t of a lambda term as argument, which returns the Gödel number of the normal form if it has one, and is undefined otherwise. Then define two partial recursive functions toChurch and fromChurch that map natural numbers to and from the Gödel numbers of the corresponding Church numerals.
Using these recursive functions, we can define the function f as a partial recursive function. There is a λ-term F that λ-defines f. To compute f(n1, . . . , nk), first obtain the Gödel numbers of the corresponding Church numerals using toChurch(ni), and append these to #F# to obtain the Gödel number of the term F n1 . . . nk. Now use normalize on this Gödel number. If f(n1, . . . , nk)


is defined, F n1 . . . nk has a normal form (which must be a Church numeral), and otherwise it has no normal form (and so

normalize(#F n1 . . . nk#)

is undefined). Finally, use fromChurch on the Gödel number of the normalized term.



Part X

Many-valued Logic
This part contains draft material on propositional many-valued logics.

Chapter 45

Syntax and Semantics


45.1 Introduction
In classical logic, we deal with formulas that are built from propositional variables using the propositional connectives ¬, ∧, ∨, →, and ↔. When we define a semantics for classical logic, we do so using the two truth values T and F. We interpret propositional variables in a valuation v, which assigns these truth values T, F to the propositional variables. Any valuation then determines a truth value v̄(φ) for any formula φ, and a formula is satisfied in a valuation v, v ⊨ φ, iff v̄(φ) = T.
Many-valued logics are generalizations of classical two-valued logic obtained by allowing more truth values than just T and F. So in many-valued logic, a valuation v is a function assigning to every propositional variable p one of a range of possible truth values. We'll generally call the set of allowed truth values V. Classical logic is a many-valued logic where V = {T, F}, and the truth value v̄(φ) is computed using the familiar characteristic truth tables for the connectives.
Once we add additional truth values, we have more than one natural option for how to compute v̄(φ) for the connectives we read as “and,” “or,” “not,” and “if-then.” So a many-valued logic is determined not just by the set of truth values, but also by the truth functions we decide to use for each connective. Once these are selected for a many-valued logic L, however, the truth value v̄L(φ) is uniquely determined by the valuation, just like in classical logic. Many-valued logics, like classical logic, are truth functional.
With these semantic building blocks in hand, we can go on to define the analogs of the semantic concepts of tautology, entailment, and satisfiability. In classical logic, a formula is a tautology if its truth value v̄(φ) = T for any v. In many-valued logic, we have to generalize this a bit as well. First of all, there is no requirement that the set of truth values V contains T. For instance, some many-valued logics use numbers, such as all rational numbers between 0 and 1, as their set of truth values. In such a case, 1 usually plays the role of T. In other logics, not just one but several truth values do. So, we require that every many-valued logic have a set V⁺ of designated values. We can then say that a formula is satisfied in a valuation v, v ⊨L φ, iff v̄L(φ) ∈ V⁺. A formula φ is a tautology of the logic, ⊨L φ, iff v̄L(φ) ∈ V⁺ for any v. And, finally, we say that φ is entailed by a set of formulas, Γ ⊨L φ, iff every valuation that satisfies all the formulas in Γ also satisfies φ.


45.2 Languages and Connectives


Classical propositional logic, and many other logics, use a fixed supply of propositional constants and connectives. For instance, we use the following as primitives:

1. The propositional constant for falsity ⊥.

2. The propositional constant for truth ⊤.

3. The logical connectives: ¬ (negation), ∧ (conjunction), ∨ (disjunction),


→ (conditional), ↔ (biconditional)

The same connectives are used in many-valued logics as well. However, it is


often useful to include different versions of, say, conjunction, in the same logic,
and that would require different symbols to keep the versions separate. Some
many-valued logics also include connectives that have no equivalent in classical
logic. So, we’ll be a bit more general than usual.

Definition 45.1. A propositional language consists of a set L of connectives.


Each connective ⋆ has an arity; a connective of arity n is said to be n-place.
Connectives of arity 0 are also called constants; connectives of arity 1 are called
unary, and connectives of arity 2, binary.

Example 45.2. The standard language of propositional logic L0 consists of the following connectives (with associated arities): ⊥ (0), ¬ (1), ∧ (2), ∨ (2), → (2). Most logics we consider will use this language. Some logics, by tradition and convention, use different symbols for some connectives. For instance, in product logic, the conjunction symbol is often ⊙ instead of ∧. Sometimes it is convenient to add a new operator, e.g., the determinateness operator △ (1-place).


45.3 Formulas

Definition 45.3 (Formula). The set Frm(L) of formulas of a propositional
language L is defined inductively as follows:

1. Every propositional variable pi is an atomic formula.

2. Every 0-place connective (propositional constant) of L is an atomic formula.

3. If ⋆ is an n-place connective of L, and φ1, . . . , φn are formulas, then ⋆(φ1, . . . , φn) is a formula.

4. Nothing else is a formula.

If ⋆ is 1-place, then ⋆(φ1) will often be written simply as ⋆φ1. If ⋆ is 2-place,
⋆(φ1, φ2) will often be written as (φ1 ⋆ φ2).

As usual, we will often silently leave out the outermost parentheses.

Example 45.4. In the standard language L0, p1 → (p1 ∧ ¬p2) is a formula. In
the language of product logic, it would be written instead as p1 → (p1 ⊙ ¬p2).
If we add the 1-place △ to the language, we would also have formulas such as
△(p1 ∧ p2 ) → (△p1 ∧ △p2 ).
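Definition 45.3 lends itself directly to implementation as an inductive datatype. The following sketch (in Python; the names Var and App are our own choices, not the book's) encodes formulas of an arbitrary propositional language:

from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    index: int            # the propositional variable p_index

@dataclass(frozen=True)
class App:
    conn: str             # name of an n-place connective, e.g. "not", "and"
    args: tuple = ()      # its n arguments, themselves formulas

# The formula p1 -> (p1 & ~p2) of Example 45.4:
p1, p2 = Var(1), Var(2)
phi = App("imp", (p1, App("and", (p1, App("not", (p2,))))))

Clause (4) of the definition ("nothing else is a formula") corresponds to the fact that Var and App are the only two constructors.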


45.4 Matrices

A many-valued logic is defined by its language, its set of truth values V, a subset
of designated truth values, and truth functions for its connectives. Together,
these elements are called a matrix.

Definition 45.5 (Matrix). A matrix for the logic L consists of:

1. a set of connectives making up a language L;

2. a set V ≠ ∅ of truth values;

3. a set V + ⊆ V of designated truth values;


4. for each n-place connective ⋆ in L, a truth function ⋆̃ : V^n → V. If n = 0, then ⋆̃ is just an element of V.

   ¬̃            ∧̃ | T F      ∨̃ | T F      →̃ | T F
   T | F        T | T F      T | T T      T | T F
   F | T        F | F F      F | T F      F | T T

Figure 45.1: Truth functions for classical logic C.

Example 45.6. The matrix for classical logic C consists of:

1. The standard propositional language L0 with ⊥, ¬, ∧, ∨, →.

2. The set of truth values V = {T, F}.

3. T is the only designated value, i.e., V + = {T}.

4. For ⊥, we have ⊥̃ = F. The other truth functions are given by the usual
truth tables (see Figure 45.1).
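A matrix can likewise be transcribed directly into code. Here is a sketch of Example 45.6, representing a matrix as a dictionary whose truth functions are ordinary Python functions (the string encodings "T" and "F" and the key names are our choices):

classical_matrix = {
    "values":     {"T", "F"},
    "designated": {"T"},
    "truth_functions": {
        "bot": lambda: "F",                                  # 0-place: an element of V
        "not": lambda x: "F" if x == "T" else "T",
        "and": lambda x, y: "T" if x == "T" and y == "T" else "F",
        "or":  lambda x, y: "T" if x == "T" or y == "T" else "F",
        "imp": lambda x, y: "T" if x == "F" or y == "T" else "F",
    },
}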


45.5 Valuations and Satisfaction

Definition 45.7 (Valuations). Let V be a set of truth values. A valuation
for L into V is a function v assigning an element of V to the propositional
variables of the language, i.e., v : At0 → V.

Definition 45.8. Given a valuation v into the set of truth values V of a many-valued logic L, define the evaluation function v̄ : Frm(L) → V inductively by:

1. v̄(pn) = v(pn);

2. If ⋆ is a 0-place connective, then v̄(⋆) = ⋆̃L;

3. If ⋆ is an n-place connective, then

   v̄(⋆(φ1, . . . , φn)) = ⋆̃L(v̄(φ1), . . . , v̄(φn)).

Definition 45.9 (Satisfaction). The formula φ is satisfied by a valuation v,
v ⊨L φ, iff v̄L(φ) ∈ V +, where V + is the set of designated truth values of L.

We write v ⊭L φ to mean "not v ⊨L φ." If Γ is a set of formulas, v ⊨L Γ
iff v ⊨L φ for every φ ∈ Γ.
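Definitions 45.8 and 45.9 can be read as a recursive algorithm. The following sketch computes v̄ and checks satisfaction, building on the Var/App encoding and the matrix dictionaries from the earlier sketches:

def evaluate(formula, v, matrix):
    """The evaluation function: compute the truth value of `formula`
    under the valuation `v` (a dict from variable indices to values)."""
    if isinstance(formula, Var):
        return v[formula.index]
    f = matrix["truth_functions"][formula.conn]
    return f(*(evaluate(arg, v, matrix) for arg in formula.args))

def satisfies(v, formula, matrix):
    """v satisfies the formula iff its value is designated."""
    return evaluate(formula, v, matrix) in matrix["designated"]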


45.6 Semantic Notions

Suppose a many-valued logic L is given by a matrix. Then we can define the
usual semantic notions for L.
Definition 45.10.

1. A formula φ is satisfiable if for some v, v ⊨ φ; it is unsatisfiable if for no v, v ⊨ φ.

2. A formula φ is a tautology if v ⊨ φ for all valuations v.

3. If Γ is a set of formulas, Γ ⊨ φ ("Γ entails φ") if and only if v ⊨ φ for every valuation v for which v ⊨ Γ.

4. If Γ is a set of formulas, Γ is satisfiable if there is a valuation v for which v ⊨ Γ, and Γ is unsatisfiable otherwise.

We have some of the same facts for these notions as we do for the case of
classical logic:
Proposition 45.11.

1. φ is a tautology if and only if ∅ ⊨ φ;

2. If Γ is satisfiable then every finite subset of Γ is also satisfiable;

3. Monotonicity: if Γ ⊆ ∆ and Γ ⊨ φ then also ∆ ⊨ φ;

4. Transitivity: if Γ ⊨ φ and ∆ ∪ {φ} ⊨ ψ then Γ ∪ ∆ ⊨ ψ.

Proof. Exercise.

Problem 45.1. Prove Proposition 45.11.

In classical logic we can connect entailment and the conditional. For in-
stance, we have the validity of modus ponens: If Γ ⊨ φ and Γ ⊨ φ → ψ then
Γ ⊨ ψ. Another important relationship between ⊨ and → in classical logic is
the semantic deduction theorem: Γ ⊨ φ → ψ if and only if Γ ∪ {φ} ⊨ ψ. These
results do not always hold in many-valued logics. Whether they do depends
on the truth function →̃.
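Since any given formula contains only finitely many propositional variables, the semantic notions of Definition 45.10 are decidable for finite V by enumerating valuations. Here is a brute-force sketch, building on the evaluate and satisfies functions above; it also makes it easy to test, for a concrete matrix, whether modus ponens or the deduction theorem holds:

from itertools import product

def variables(formula):
    if isinstance(formula, Var):
        return {formula.index}
    return set().union(set(), *(variables(a) for a in formula.args))

def all_valuations(formulas, matrix):
    idxs = sorted(set().union(set(), *(variables(f) for f in formulas)))
    for values in product(sorted(matrix["values"]), repeat=len(idxs)):
        yield dict(zip(idxs, values))

def is_tautology(formula, matrix):
    return all(satisfies(v, formula, matrix)
               for v in all_valuations([formula], matrix))

def entails(gamma, formula, matrix):
    """Check Γ ⊨ φ: every valuation satisfying all of Γ satisfies φ."""
    return all(satisfies(v, formula, matrix)
               for v in all_valuations(list(gamma) + [formula], matrix)
               if all(satisfies(v, g, matrix) for g in gamma))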


45.7 Many-valued logics as sublogics of C

The usual many-valued logics are all defined using matrices in which the value
of a truth function for arguments in {T, F} agrees with the classical truth functions.
Specifically, in these logics, if x ∈ {T, F}, then ¬̃L(x) = ¬̃C(x), and for ⋆
any one of ∧, ∨, →, if x, y ∈ {T, F}, then ⋆̃L(x, y) = ⋆̃C(x, y). In other words,
the truth functions for ¬, ∧, ∨, → restricted to {T, F} are exactly the classical
truth functions.


Proposition 45.12. Suppose that a many-valued logic L contains the connectives ¬, ∧, ∨, → in its language, T, F ∈ V, and its truth functions satisfy:

1. ¬̃L(x) = ¬̃C(x) if x = T or x = F;

2. ∧̃L(x, y) = ∧̃C(x, y),

3. ∨̃L(x, y) = ∨̃C(x, y),

4. →̃L(x, y) = →̃C(x, y), if x, y ∈ {T, F}.

Then, for any valuation v into V such that v(p) ∈ {T, F}, v̄L(φ) = v̄C(φ).

Proof. By induction on φ.

1. If φ ≡ p is atomic, we have v̄L(φ) = v(p) = v̄C(φ).

2. If φ ≡ ¬ψ, we have

   v̄L(φ) = ¬̃L(v̄L(ψ))   by Definition 45.8
         = ¬̃L(v̄C(ψ))   by inductive hypothesis
         = ¬̃C(v̄C(ψ))   by assumption (1), since v̄C(ψ) ∈ {T, F}
         = v̄C(φ)       by Definition 45.8.

3. If φ ≡ (ψ ∧ χ), we have

   v̄L(φ) = ∧̃L(v̄L(ψ), v̄L(χ))   by Definition 45.8
         = ∧̃L(v̄C(ψ), v̄C(χ))   by inductive hypothesis
         = ∧̃C(v̄C(ψ), v̄C(χ))   by assumption (2), since v̄C(ψ), v̄C(χ) ∈ {T, F}
         = v̄C(φ)              by Definition 45.8.

The cases where φ ≡ (ψ ∨ χ) and φ ≡ (ψ → χ) are similar.

Corollary 45.13. If a many-valued logic satisfies the conditions of Proposition 45.12, T ∈ V + and F ∉ V +, then ⊨L ⊆ ⊨C, i.e., if Γ ⊨L ψ then Γ ⊨C ψ. In particular, every tautology of L is also a classical tautology.

Proof. We prove the contrapositive. Suppose Γ ⊭C ψ. Then there is some
valuation v : At0 → {T, F} such that v̄C(φ) = T for all φ ∈ Γ and v̄C(ψ) = F.
Since T, F ∈ V, the valuation v is also a valuation for L. By Proposition 45.12,
v̄L(φ) = T for all φ ∈ Γ and v̄L(ψ) = F. Since T ∈ V + and F ∉ V +, that
means v ⊨L Γ and v ⊭L ψ, i.e., Γ ⊭L ψ.

Chapter 46

Three-valued Logics


46.1 Introduction

If we just add one more value U to T and F, we get a three-valued logic. Even
though there is only one more truth value, the possibilities for defining the
truth functions for ¬, ∧, ∨, and → are quite numerous. A logic might use
any combination of these truth functions, and there is also a choice of making
only T designated, or both T and U.

We present here a selection of the most well-known three-valued logics, their
motivations, and some of their properties.


46.2 Lukasiewicz logic

One of the first published, worked-out proposals for a many-valued logic is due
to the Polish philosopher Jan Lukasiewicz in 1921. Lukasiewicz was motivated
by Aristotle's sea battle problem: it seems that, today, the sentence "There
will be a sea battle tomorrow" is neither true nor false: its truth value is not
yet settled. Lukasiewicz proposed to introduce a third truth value, to apply to such
"future contingent" sentences.

I can assume without contradiction that my presence in Warsaw
at a certain moment of next year, e.g., at noon on 21 December,
is at the present time determined neither positively nor negatively.
Hence it is possible, but not necessary, that I shall be present in
Warsaw at the given time. On this assumption the proposition "I
shall be in Warsaw at noon on 21 December of next year," can at
the present time be neither true nor false. For if it were true now,
my future presence in Warsaw would have to be necessary, which is
contradictory to the assumption. If it were false now, on the other
hand, my future presence in Warsaw would have to be impossible,
which is also contradictory to the assumption. Therefore the proposition
considered is at the moment neither true nor false and must
possess a third value, different from "0" or falsity and "1" or truth.
This value we can designate by "1/2." It represents "the possible,"
and joins "the true" and "the false" as a third value.

We will use U for Lukasiewicz's third truth value.1

The truth functions for the connectives ¬, ∧, and ∨ are easy to determine on
this interpretation: the negation of a future contingent sentence is also a future
contingent sentence, so ¬̃(U) = U. If one conjunct of a conjunction is undetermined
and the other is true, the conjunction is also undetermined—after all,
depending on how the future contingent conjunct turns out, the conjunction
might turn out to be true, and it might turn out to be false. So

   ∧̃(T, U) = ∧̃(U, T) = U.

If the other conjunct is false, however, it cannot turn out true, so

   ∧̃(F, U) = ∧̃(U, F) = F.

The other values (if the arguments are settled truth values, T or F) are as in
classical logic.

For the conditional, the situation is a little trickier. Suppose q is a future
contingent statement. If p is false, then p → q will be true, regardless of how
q turns out, so we should set →̃(F, U) = T. And if p is true, then q → p will
be true, regardless of what q turns out to be, so →̃(U, T) = T. If p is true,
then p → q might turn out to be true or false, so →̃(T, U) = U. Similarly, if p
is false, then q → p might turn out to be true or false, so →̃(U, F) = U. This
leaves the case where p and q are both future contingents. On the basis of the
motivation, we should really assign U in this case. However, this would make
φ → φ not a tautology. Lukasiewicz had no trouble giving up φ ∨ ¬φ and
¬(φ ∧ ¬φ), but balked at giving up φ → φ. So he stipulated →̃(U, U) = T.

Definition 46.1. Three-valued Lukasiewicz logic L3 is defined using the matrix:

1. The standard propositional language L0 with ¬, ∧, ∨, →.

2. The set of truth values V = {T, U, F}.

3. T is the only designated value, i.e., V + = {T}.

4. Truth functions are given by the following tables:


1 Lukasiewicz here uses "possible" in a way that is uncommon today, namely to mean possible but not necessary.

   ¬̃            ∧̃L3 | T U F      ∨̃L3 | T U F      →̃L3 | T U F
   T | F        T   | T U F      T   | T T T      T   | T U F
   U | U        U   | U U F      U   | T U U      U   | T T U
   F | T        F   | F F F      F   | T U F      F   | T T T

As can easily be seen, any formula φ containing only ¬, ∧, and ∨ will
take the truth value U if all its propositional variables are assigned U. So for
instance, the classical tautologies p ∨ ¬p and ¬(p ∧ ¬p) are not tautologies in
L3, since v̄(φ) = U whenever v(p) = U.

On valuations where v(p) = T or F, v̄(φ) will coincide with its classical
truth value.

Proposition 46.2. If v(p) ∈ {T, F} for all p in φ, then v̄L3(φ) = v̄C(φ).

Problem 46.1. Suppose we define v̄(φ ↔ ψ) = v̄((φ → ψ) ∧ (ψ → φ)) in L3.
What truth table would ↔ have?

Many classical tautologies are also tautologies in L3, e.g., ¬p → (p → q). Just
like in classical logic, we can use truth tables to verify this:

   p q   ¬ p → (p → q)
   T T   F T T  T T T
   T U   F T T  T U U
   T F   F T T  T F F
   U T   U U T  U T T
   U U   U U T  U T U
   U F   U U T  U U F
   F T   T F T  F T T
   F U   T F T  F T U
   F F   T F T  F T F
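Truth-table checks like this one can also be mechanized. Here is a sketch of the L3 matrix in the dictionary format used in the earlier sketches; the conditional is transcribed from its table, while ∧ and ∨ are minimum and maximum in the order F < U < T:

L3_IMP = {("T","T"): "T", ("T","U"): "U", ("T","F"): "F",
          ("U","T"): "T", ("U","U"): "T", ("U","F"): "U",
          ("F","T"): "T", ("F","U"): "T", ("F","F"): "T"}

ORDER = "FUT"     # F < U < T

lukasiewicz3 = {
    "values":     {"T", "U", "F"},
    "designated": {"T"},
    "truth_functions": {
        "not": lambda x: {"T": "F", "U": "U", "F": "T"}[x],
        "and": lambda x, y: min(x, y, key=ORDER.index),
        "or":  lambda x, y: max(x, y, key=ORDER.index),
        "imp": lambda x, y: L3_IMP[(x, y)],
    },
}

# With the is_tautology sketch from before, this confirms the table above:
#   is_tautology(App("imp", (App("not", (p1,)), App("imp", (p1, p2)))),
#                lukasiewicz3)    # True: ¬p1 → (p1 → p2)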
Problem 46.2. Show that the following are tautologies in L3 :

1. p → (q → p)

2. ¬(p ∧ q) ↔ (¬p ∨ ¬q)

3. ¬(p ∨ q) ↔ (¬p ∧ ¬q)

(In (2) and (3), take φ ↔ ψ as an abbreviation for (φ → ψ) ∧ (ψ → φ), or refer
to your solution to Problem 46.1.)

Problem 46.3. Show that the following classical tautologies are not tautologies in L3:

1. (¬p ∧ p) → q

2. ((p → q) → p) → p

3. (p → (p → q)) → (p → q)

One might therefore think that although not all classical tautologies
are tautologies in L3, they should at least take either the value T or the value U
on every valuation. This is not the case. A counterexample is given by

   ¬(p → ¬p) ∨ ¬(¬p → p),

which is F if p is U.
Problem 46.4. Which of the following relations hold in Lukasiewicz logic?
Give a truth table for each.
1. p, p → q ⊨ q
2. ¬¬p ⊨ p
3. p ∧ q ⊨ p
4. p ⊨ p ∧ p
5. p ⊨ p ∨ q

Lukasiewicz hoped to build a logic of possibility on the basis of his three-valued
system, by introducing a one-place connective ♢φ (for "φ is possible")
and a corresponding □φ (for "φ is necessary"):

   ♢̃            □̃
   T | T        T | T
   U | T        U | F
   F | F        F | F

In other words, p is possible iff it is not already settled as false; and p is
necessary iff it is already settled as true.
Problem 46.5. Show that □p ↔ ¬♢¬p and ♢p ↔ ¬□¬p are tautologies in L3 ,
extended with the truth tables for □ and ♢.

However, the shortcomings of this proposed modal logic soon became evident:
however things turn out, p ∧ ¬p can never turn out to be true. So
even if it is not now settled (and therefore undetermined), it should count as
impossible, i.e., ¬♢(p ∧ ¬p) should be a tautology. However, if v(p) = U, then
v̄(¬♢(p ∧ ¬p)) = U. Although Lukasiewicz was correct that two truth values
will not be enough to accommodate modal distinctions such as possibility and
necessity, introducing a third truth value is also not enough.

46.3 Kleene logics

Stephen Kleene introduced two three-valued logics, motivated by a logic in
which truth values are thought of as the outcomes of computational procedures:
a procedure may yield T or F, but it may also fail to terminate. In that case
the corresponding truth value is undefined, represented by the truth value U.
To compute the negation of a proposition φ, you would first compute the
value of φ, and then return the opposite of the result. If the computation of φ
does not terminate, then the entire procedure does not either: so the negation
of U is U.
To compute a conjunction φ ∧ ψ, there are two options: one can first com-
pute φ, then ψ, and then the result would be T if the outcome of both is T,
and F otherwise. If either computation fails to halt, the entire procedure does
as well. So in this case, if one conjunct is undefined, the conjunction is as
well. The same goes for disjunction.
However, if we can evaluate φ and ψ in parallel, we can do better. Then, if
one of the two procedures halts and returns F, we can stop, as the answer must
be false. So in that case a conjunction with one false conjunct is false, even
if the other conjunct is undefined. Similarly, when computing a disjunction
in parallel, we can stop once the procedure for one of the two disjuncts has
returned true: then the disjunction must be true. So in this case we can know
what the outcome of a compound claim is, even if one of the components is
undefined. On this interpretation, we might read U as “unknown” rather than
“undefined.”
The two interpretations give rise to Kleene’s strong and weak logic. The
conditional is defined as equivalent to ¬φ ∨ ψ.

Definition 46.3. Strong Kleene logic Ks is defined using the matrix:

1. The standard propositional language L0 with ¬, ∧, ∨, →.

2. The set of truth values V = {T, U, F}.

3. T is the only designated value, i.e., V + = {T}.

4. Truth functions are given by the following tables:

   ¬̃            ∧̃Ks | T U F      ∨̃Ks | T U F      →̃Ks | T U F
   T | F        T   | T U F      T   | T T T      T   | T U F
   U | U        U   | U U F      U   | T U U      U   | T U U
   F | T        F   | F F F      F   | T U F      F   | T T T
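In the order F < U < T, the strong Kleene tables have a compact arithmetic description: negation flips the order, conjunction is minimum, disjunction is maximum, and the conditional is the value of ¬φ ∨ ψ, as noted above. A sketch:

ORDER = "FUT"     # F < U < T

def ks_not(x):
    return ORDER[2 - ORDER.index(x)]

def ks_and(x, y):
    return min(x, y, key=ORDER.index)

def ks_or(x, y):
    return max(x, y, key=ORDER.index)

def ks_imp(x, y):
    # the value of ¬φ ∨ ψ
    return ks_or(ks_not(x), y)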

Definition 46.4. Weak Kleene logic Kw is defined using the matrix:

1. The standard propositional language L0 with ¬, ∧, ∨, →.

2. The set of truth values V = {T, U, F}.

3. T is the only designated value, i.e., V + = {T}.

4. Truth functions are given by the following tables:

   ¬̃            ∧̃Kw | T U F      ∨̃Kw | T U F      →̃Kw | T U F
   T | F        T   | T U F      T   | T U T      T   | T U F
   U | U        U   | U U U      U   | U U U      U   | U U U
   F | T        F   | F U F      F   | T U F      F   | T U T

Proposition 46.5. Ks and Kw have no tautologies.

Proof. If v(p) = U for all propositional variables p, then any formula φ will
have truth value v̄(φ) = U, since

   ¬̃(U) = ∨̃(U, U) = ∧̃(U, U) = →̃(U, U) = U

in both logics. As U ∉ V + for either Ks or Kw, on this valuation, φ will not
be designated.

Although both weak and strong Kleene logic have no tautologies, they have
non-trivial consequence relations.

Problem 46.6. Which of the following relations hold in (a) strong and (b)
weak Kleene logic? Give a truth table for each.

1. p, p → q ⊨ q

2. p ∨ q, ¬p ⊨ q

3. p ∧ q ⊨ p

4. p ⊨ p ∧ p

5. p ⊨ p ∨ q

Dmitry Bochvar interpreted U as "meaningless" and attempted to use it
to solve paradoxes such as the Liar paradox by stipulating that paradoxical
sentences take the value U. He introduced a logic which is essentially weak
Kleene logic extended by additional connectives, two of which are "external
negation" ∼ and the "is undefined" operator +:

   ∼̃            +̃
   T | F        T | F
   U | T        U | T
   F | T        F | F

Problem 46.7. Can you define ∼ in Bochvar’s logic in terms of ¬ and +, i.e.,
find a formula with only the propositional variable p and not involving ∼ which
always takes the same truth value as ∼p? Give a truth table to show you’re
right.


46.4 Gödel logics

Kurt Gödel introduced a sequence of n-valued logics that each contain all
formulas valid in intuitionistic logic, and are contained in classical logic. Here
is the first interesting one:

Definition 46.6. 3-valued Gödel logic G3 is defined using the matrix:

1. The standard propositional language L0 with ⊥, ¬, ∧, ∨, →.

2. The set of truth values V = {T, U, F}.

3. T is the only designated value, i.e., V + = {T}.

4. For ⊥, we have ⊥̃ = F. Truth functions for the remaining connectives
are given by the following tables:

   ¬̃G           ∧̃G | T U F      ∨̃G | T U F      →̃G | T U F
   T | F        T  | T U F      T  | T T T      T  | T U F
   U | F        U  | U U F      U  | T U U      U  | T T F
   F | T        F  | F F F      F  | T U F      F  | T T T

You'll notice that the truth tables for ∧ and ∨ are the same as in Lukasiewicz
and strong Kleene logic, but the truth tables for ¬ and → differ for each. In
Gödel logic, ¬̃(U) = F. In contrast to Lukasiewicz logic and Kleene logic,
→̃(U, F) = F; in contrast to Kleene logic (but as in Lukasiewicz logic), →̃(U, U) = T.
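The Gödel truth functions also admit an arithmetic description in the order F < U < T; it is the 3-valued case of the general definition given later for infinite-valued Gödel logic (Definition 47.4). A sketch:

ORDER = "FUT"     # F < U < T

def g3_not(x):
    return "T" if x == "F" else "F"

def g3_and(x, y):
    return min(x, y, key=ORDER.index)

def g3_or(x, y):
    return max(x, y, key=ORDER.index)

def g3_imp(x, y):
    return "T" if ORDER.index(x) <= ORDER.index(y) else y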
As the connection to intuitionistic logic alluded to above suggests, G3 is
close to intuitionistic logic. All intuitionistic truths are tautologies in G3 , and


many classical tautologies that are not valid intuitionistically also fail to be
tautologies in G3 . For instance, the following are not tautologies:

p ∨ ¬p (p → q) → (¬p ∨ q)
¬¬p → p ¬(¬p ∧ ¬q) → (p ∨ q)
((p → q) → p) → p ¬(p → q) → (p ∧ ¬q)

However, not every tautology of G3 is also intuitionistically valid, e.g., ¬¬p ∨ ¬p
or (p → q) ∨ (q → p).

Problem 46.8. Give truth tables to show that the following are tautologies
of G3 :

¬¬p ∨ ¬p
(p → q) ∨ (q → p)
¬(p ∧ q) → (¬p ∨ ¬q)
(p → q) ∨ (q → r) ∨ (r → s)

Problem 46.9. Give truth tables that show that the following are not tautologies of G3:

(p → q) → (¬p ∨ q)
¬(¬p ∧ ¬q) → (p ∨ q)
((p → q) → p) → p
¬(p → q) → (p ∧ ¬q)

Problem 46.10. Which of the following relations hold in Gödel logic? Give
a truth table for each.

1. p, p → q ⊨ q

2. p ∨ q, ¬p ⊨ q

3. p ∧ q ⊨ p

4. p ⊨ p ∧ p

5. p ⊨ p ∨ q


46.5 Designating not just T

So far the logics we've seen all had the set of designated truth values V + = {T},
i.e., something counts as true iff its truth value is T. But one might also count
something as true if it's just not F. Then one would get a logic by stipulating
in the matrix, e.g., that V + = {T, U}.

Definition 46.7. The logic of paradox LP is defined using the matrix:

1. The standard propositional language L0 with ¬, ∧, ∨, →.

2. The set of truth values V = {T, U, F}.

3. T and U are designated, i.e., V + = {T, U}.

4. Truth functions are the same as in strong Kleene logic.

Definition 46.8. Halldén's logic of nonsense Hal is defined using the matrix:

1. The standard propositional language L0 with ¬, ∧, ∨, → and a 1-place connective +.

2. The set of truth values V = {T, U, F}.

3. T and U are designated, i.e., V + = {T, U}.

4. Truth functions are the same as in weak Kleene logic, plus the "is meaningless" operator:

   +̃
   T | F
   U | T
   F | F

By contrast to the Kleene logics with which they share truth tables, these
logics do have tautologies.

Proposition 46.9. The tautologies of LP are the same as the tautologies of
classical propositional logic.

Proof. By Proposition 45.12, if ⊨LP φ then ⊨C φ. To show the reverse, we
show that if there is a valuation v : At0 → {F, T, U} such that v̄Ks(φ) = F, then
there is a valuation v′ : At0 → {F, T} such that v̄′C(φ) = F. This establishes
the result for LP, since Ks and LP have the same characteristic truth functions,
and F is the only truth value of LP that is not designated (that is the
only difference between LP and Ks). Thus, if ⊭LP φ, for some valuation v,
v̄LP(φ) = v̄Ks(φ) = F. By the claim we're proving, v̄′C(φ) = F, i.e., ⊭C φ.

To establish the claim, we first define v′ as:

   v′(p) = T if v(p) ∈ {T, U}, and
   v′(p) = F otherwise.


We now show by induction on φ that (a) if v̄Ks(φ) = F then v̄′C(φ) = F, and
(b) if v̄Ks(φ) = T then v̄′C(φ) = T.

1. Induction basis: φ ≡ p. By Definition 45.8, v̄Ks(φ) = v(p) = v̄′C(φ),
which implies both (a) and (b).

For the induction step, consider the cases:

2. φ ≡ ¬ψ.

   a) Suppose v̄Ks(¬ψ) = F. By the definition of ¬̃Ks, v̄Ks(ψ) = T. By
      inductive hypothesis, case (b), we get v̄′C(ψ) = T, so v̄′C(¬ψ) = F.

   b) Suppose v̄Ks(¬ψ) = T. By the definition of ¬̃Ks, v̄Ks(ψ) = F. By
      inductive hypothesis, case (a), we get v̄′C(ψ) = F, so v̄′C(¬ψ) = T.

3. φ ≡ (ψ ∧ χ).

   a) Suppose v̄Ks(ψ ∧ χ) = F. By the definition of ∧̃Ks, v̄Ks(ψ) = F or
      v̄Ks(χ) = F. By inductive hypothesis, case (a), we get v̄′C(ψ) = F
      or v̄′C(χ) = F, so v̄′C(ψ ∧ χ) = F.

   b) Suppose v̄Ks(ψ ∧ χ) = T. By the definition of ∧̃Ks, v̄Ks(ψ) = T and
      v̄Ks(χ) = T. By inductive hypothesis, case (b), we get v̄′C(ψ) = T
      and v̄′C(χ) = T, so v̄′C(ψ ∧ χ) = T.

The other two cases are similar, and left as exercises. Alternatively, the
proof above establishes the result for all formulas only containing ¬ and ∧.
One may now appeal to the facts that in both Ks and C, for any v, v̄(ψ ∨ χ) =
v̄(¬(¬ψ ∧ ¬χ)) and v̄(ψ → χ) = v̄(¬(ψ ∧ ¬χ)).

Problem 46.11. Complete the proof of Proposition 46.9, i.e., establish (a) and (b)
for the cases where φ ≡ (ψ ∨ χ) and φ ≡ (ψ → χ).

Problem 46.12. Prove that every classical tautology is a tautology in Hal.

Although they have the same tautologies as classical logic, their consequence
relations are different. LP, for instance, is paraconsistent in that ¬p, p ⊭ q,
and so the principle of explosion ¬φ, φ ⊨ ψ does not hold in general. (It holds
for some cases of φ and ψ, e.g., if ψ is a tautology.)
Problem 46.13. Which of the following relations hold in (a) LP and in
(b) Hal? Give a truth table for each.
1. p, p → q ⊨ q
2. ¬q, p → q ⊨ ¬p
3. p ∨ q, ¬p ⊨ q
4. ¬p, p ⊨ q
5. p ⊨ p ∨ q


6. p → q, q → r ⊨ p → r

What if you make U designated in L3 ?

Definition 46.10. The logic 3-valued R-Mingle RM3 is defined using the matrix:

1. The standard propositional language L0 with ⊥, ¬, ∧, ∨, →.

2. The set of truth values V = {T, U, F}.

3. T and U are designated, i.e., V + = {T, U}.

4. Truth functions are the same as Lukasiewicz logic L3 .

Problem 46.14. Which of the following relations hold in RM3 ?

1. p, p → q ⊨ q

2. p ∨ q, ¬p ⊨ q

3. ¬p, p ⊨ q

4. p ⊨ p ∨ q

Different truth tables can sometimes generate the same logic (entailment
relation) just by changing the designated values. E.g., this happens if in Gödel
logic we take V + = {T, U} instead of {T}.

Proposition 46.11. The matrix with V = {F, U, T}, V + = {T, U}, and the
truth functions of 3-valued Gödel logic defines classical logic.

Proof. Exercise.

Problem 46.15. Prove Proposition 46.11 by showing that for the logic L defined
just like Gödel logic but with V + = {T, U}, if Γ ⊭L ψ then Γ ⊭C ψ. Use
the ideas of Proposition 46.9, except instead of proving properties (a) and (b),
show that v̄G(φ) = F iff v̄′C(φ) = F (and hence that v̄G(φ) ∈ {T, U} iff
v̄′C(φ) = T). Explain why this establishes the proposition.


Chapter 47

Infinite-valued Logics


47.1 Introduction

The number of truth values of a matrix need not be finite. An obvious choice
for a set of infinitely many truth values is the set of rational numbers between
0 and 1, V∞ = [0, 1] ∩ Q, i.e.,

   V∞ = {n/m : n, m ∈ N, m ≠ 0, and n ≤ m}.

When considering this infinite truth value set, it is often useful to also consider
the subsets

   Vm = {n/(m − 1) : n ∈ N and n ≤ m − 1}.

For instance, V5 is the set with 5 evenly spaced truth values,

   V5 = {0, 1/4, 1/2, 3/4, 1}.

In logics based on these truth value sets, usually only 1 is designated, i.e.,
V + = {1}. In other words, we let 1 play the role of (absolute) truth and 0
that of absolute falsity, but formulas may take any intermediate value in V.

One can also consider the set V[0,1] = [0, 1] of all real numbers between 0
and 1, or other infinite subsets of [0, 1]. Logics with such truth value
sets are often called fuzzy.
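The sets Vm (and finite samples of V∞) can be generated with exact rational arithmetic; a sketch using Python's fractions module:

from fractions import Fraction

def V(m):
    """The m evenly spaced truth values 0, 1/(m-1), ..., 1."""
    return [Fraction(n, m - 1) for n in range(m)]

print(V(5))   # [Fraction(0, 1), Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(1, 1)]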


47.2 Lukasiewicz logic

This is a short "stub" of a section on infinite-valued Lukasiewicz logic.

Definition 47.1. Infinite-valued Lukasiewicz logic L∞ is defined using the matrix:

1. The standard propositional language L0 with ¬, ∧, ∨, →.

2. The set of truth values V∞.

3. 1 is the only designated value, i.e., V + = {1}.

4. Truth functions are given by the following functions:

   ¬̃L(x)    = 1 − x
   ∧̃L(x, y) = min(x, y)
   ∨̃L(x, y) = max(x, y)
   →̃L(x, y) = min(1, 1 − (x − y)) = 1 if x ≤ y, and 1 − (x − y) otherwise.

m-valued Lukasiewicz logic is defined the same, except V = Vm.
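The truth functions of Definition 47.1 transcribe directly into code; a sketch on exact rationals:

from fractions import Fraction

ONE = Fraction(1)

def l_not(x):    return ONE - x
def l_and(x, y): return min(x, y)
def l_or(x, y):  return max(x, y)
def l_imp(x, y): return min(ONE, ONE - (x - y))

# Restricted to {0, 1/2, 1} these reproduce the L3 tables, e.g.:
assert l_imp(Fraction(1, 2), Fraction(1, 2)) == ONE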

Proposition 47.2. The logic L3 defined by Definition 46.1 is the same as L3
defined by Definition 47.1.

Proof. This can be seen by comparing the truth tables for the connectives
given in Definition 46.1 with the truth tables determined by the equations in
Definition 47.1:

   ¬̃               ∧̃L3 | 1   1/2 0       ∨̃L3 | 1  1/2  0       →̃L3 | 1  1/2  0
   1   | 0         1   | 1   1/2 0       1   | 1  1    1       1   | 1  1/2  0
   1/2 | 1/2       1/2 | 1/2 1/2 0       1/2 | 1  1/2  1/2     1/2 | 1  1    1/2
   0   | 1         0   | 0   0   0       0   | 1  1/2  0       0   | 1  1    1

Proposition 47.3. If Γ ⊨L∞ ψ then Γ ⊨Lm ψ for all m ≥ 2.

Proof. Exercise.


Problem 47.1. Prove Proposition 47.3.

In fact, the converse holds as well.


Infinite-valued Lukasiewicz logic is the most popular fuzzy logic. In the
fuzzy logic literature, the conditional is often defined as ¬φ ∨ ψ. The result
would be an infinite-valued strong Kleene logic.

Problem 47.2. Show that (p → q) ∨ (q → p) is a tautology of L∞ .


47.3 Gödel logics

This is a short "stub" of a section on infinite-valued Gödel logic.

Definition 47.4. Infinite-valued Gödel logic G∞ is defined using the matrix:

1. The standard propositional language L0 with ⊥, ¬, ∧, ∨, →.

2. The set of truth values V∞.

3. 1 is the only designated value, i.e., V + = {1}.

4. Truth functions are given by the following functions:

   ⊥̃        = 0
   ¬̃G(x)    = 1 if x = 0, and 0 otherwise
   ∧̃G(x, y) = min(x, y)
   ∨̃G(x, y) = max(x, y)
   →̃G(x, y) = 1 if x ≤ y, and y otherwise.

m-valued Gödel logic is defined the same, except V = Vm.
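Again the functions of Definition 47.4 transcribe directly; a sketch:

from fractions import Fraction

ZERO, ONE = Fraction(0), Fraction(1)

def g_not(x):    return ONE if x == ZERO else ZERO
def g_and(x, y): return min(x, y)
def g_or(x, y):  return max(x, y)
def g_imp(x, y): return ONE if x <= y else y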

Proposition 47.5. The logic G3 defined by Definition 46.6 is the same as G3
defined by Definition 47.4.

Proof. This can be seen by comparing the truth tables for the connectives
given in Definition 46.6 with the truth tables determined by the equations in
Definition 47.4:

   ¬̃G3             ∧̃G | 1   1/2 0       ∨̃G | 1  1/2  0       →̃G | 1  1/2  0
   1   | 0         1  | 1   1/2 0       1  | 1  1    1       1  | 1  1/2  0
   1/2 | 0         1/2| 1/2 1/2 0       1/2| 1  1/2  1/2     1/2| 1  1    0
   0   | 1         0  | 0   0   0       0  | 1  1/2  0       0  | 1  1    1

Proposition 47.6. If Γ ⊨G∞ ψ then Γ ⊨Gm ψ for all m ≥ 2.

Proof. Exercise.

Problem 47.3. Prove Proposition 47.6.

In fact, the converse holds as well.
Like G3 , G∞ has all intuitionistically valid formulas as tautologies, and the
same examples of non-tautologies are non-tautologies of G∞ :
p ∨ ¬p (p → q) → (¬p ∨ q)
¬¬p → p ¬(¬p ∧ ¬q) → (p ∨ q)
((p → q) → p) → p ¬(p → q) → (p ∧ ¬q)
The example of an intuitionistically invalid formula that is nevertheless a tau-
tology of G3 , (p → q) ∨ (q → p), is also a tautology in G∞ . In fact, G∞ can be
characterized as intuitionistic logic to which the schema (φ → ψ) ∨ (ψ → φ) is
added. This was shown by Michael Dummett, and so G∞ is often referred to
as Gödel–Dummett logic LC.
Problem 47.4. Show that (p → q) ∨ (q → p) is a tautology of G∞ .
Problem 47.5. Show that (p → q) ∨ (q → r) ∨ (r → s), which is a tautology of
G3 , is not a tautology of G∞ .

Chapter 48

Sequent Calculus



48.1 Introduction

The sequent calculus for classical logic is an efficient and simple derivation
system. If a many-valued logic is defined by a matrix with finitely many truth
values, i.e., V is finite, it is possible to provide a sequent calculus for it. The
idea for how to do this comes from considering the meanings of sequents and
the form of inference rules in the classical case.

Now recall that a sequent

   φ1, . . . , φm ⇒ ψ1, . . . , ψn

can be interpreted as the formula

   (φ1 ∧ · · · ∧ φm) → (ψ1 ∨ · · · ∨ ψn).

In other words, a valuation v satisfies a sequent Γ ⇒ ∆ iff either v(φ) = F
for some φ ∈ Γ or v(φ) = T for some φ ∈ ∆. On this interpretation, initial
sequents φ ⇒ φ are always satisfied, because either v(φ) = T or v(φ) = F.
Here are the inference rules for the conditional in LK, with side formulas
Γ, ∆ left out:

    ⇒ φ    ψ ⇒                 φ ⇒ ψ
   ---------------- →L        ------------ →R
      φ → ψ ⇒                  ⇒ φ → ψ

If we apply the above semantic interpretation of a sequent, we can read the
→L rule as saying that if v(φ) = T and v(ψ) = F, then v(φ → ψ) = F. Similarly,
the →R rule says that if either v(φ) = F or v(ψ) = T, then v(φ → ψ) = T.
And in fact, these conditionals are actually biconditionals. In the case of the
∧L and ∨R rules in their standard formulation, the corresponding conditionals
would not be biconditionals. But there are alternative versions of these rules
where they are:

   φ, ψ, Γ ⇒ ∆               Γ ⇒ ∆, φ, ψ
   -------------- ∧L         -------------- ∨R
   φ ∧ ψ, Γ ⇒ ∆              Γ ⇒ ∆, φ ∨ ψ

This basic idea, applied to an n-valued logic, then results in a sequent


calculus with n instead of two places, one for each truth value. For a three-
valued logic with V = {F, U, T}, a sequent is an expression Γ | Π | ∆. It
is satisfied in a valuation v iff either v(φ) = F for some φ ∈ Γ or v(φ) = T
for some φ ∈ ∆ or v(φ) = U for some φ ∈ Π. Consequently, initial sequents
φ | φ | φ are always satisfied.
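This satisfaction condition for 3-sided sequents can be checked mechanically, given an evaluation function like the evaluate sketch from section 45.5 (truth values encoded as "F", "U", "T"):

def satisfies_sequent(v, gamma, pi, delta, matrix):
    """A valuation v satisfies Γ | Π | ∆ iff some formula in Γ takes the
    value F, some formula in Π takes U, or some formula in ∆ takes T."""
    return (any(evaluate(f, v, matrix) == "F" for f in gamma)
            or any(evaluate(f, v, matrix) == "U" for f in pi)
            or any(evaluate(f, v, matrix) == "T" for f in delta))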


48.2 Rules and Derivations

For the following, let Γ, ∆, Π, Λ represent finite sequences of sentences.

Definition 48.1 (Sequent). An n-sided sequent is an expression of the form

   Γ1 | . . . | Γn

where each Γi is a finite (possibly empty) sequence of sentences of the language L.

Definition 48.2 (Initial Sequent). An n-sided initial sequent is an n-sided
sequent of the form φ | . . . | φ for any sentence φ in the language.

If the language contains a 0-place connective ⋆, i.e., a propositional constant,
then we also take as initial sequents those of the form . . . | ⋆ | . . . , where ⋆
appears in the position corresponding to the truth value ⋆̃ ∈ V, and all other
positions are empty.

For each connective of an n-valued logic L, there is a logical rule for each
truth value that this connective can take in L. Derivations in an n-sided
sequent calculus for L are trees of sequents, where the topmost sequents are
initial sequents, and if a sequent stands below one or more other sequents, it
must follow correctly by a rule of inference for the connectives of L.

Definition 48.3 (Theorems). A sentence φ is a theorem of an n-valued
logic L if there is a derivation of the n-sided sequent containing φ in each position
corresponding to a designated truth value of L. We write ⊢L φ if φ is a
theorem and ⊬L φ if it is not.

Definition 48.4 (Derivability). A sentence φ is derivable from a set of sentences
Γ in an n-valued logic L, Γ ⊢L φ, iff there is a finite subset Γ0 ⊆ Γ
and a sequence Γ0′ of the sentences in Γ0 such that the following sequent has
a derivation:

   Λ1 | . . . | Λn

where Λi is φ if position i corresponds to a designated truth value, and Γ0′ otherwise.
If φ is not derivable from Γ we write Γ ⊬ φ.

For instance, 3-valued Lukasiewicz logic has a 3-sided sequent calculus. In
a 3-sided sequent Γ | Π | ∆, Γ corresponds to F, ∆ to T, and Π to U. Axioms
are φ | φ | φ. Since only T is designated, Γ ⊢L3 φ iff the sequent Γ | Γ | φ
has a derivation. (If U were also designated, we would need a derivation of
Γ | φ | φ.)


48.3 Structural Rules

The structural rules for the n-sided sequent calculus operate as in the classical case,
except for each position i:

   Γ1 | . . . | Γi | . . . | Γn
   --------------------------------- Wi
   Γ1 | . . . | φ, Γi | . . . | Γn

   Γ1 | . . . | φ, φ, Γi | . . . | Γn
   --------------------------------- Ci
   Γ1 | . . . | φ, Γi | . . . | Γn

   Γ1 | . . . | Γi, φ, ψ, Γi′ | . . . | Γn
   --------------------------------------- Xi
   Γ1 | . . . | Γi, ψ, φ, Γi′ | . . . | Γn

A series of weakening, contraction, and exchange inferences will often be
indicated by double inference lines.

The Cut rule comes in several forms, one for every combination of distinct
positions in the sequent i ≠ j:

   Γ1 | . . . | φ, Γi | . . . | Γn      ∆1 | . . . | φ, ∆j | . . . | ∆n
   -------------------------------------------------------------------- Cut i,j
                        Γ1, ∆1 | . . . | Γn, ∆n


48.4 Propositional Rules for Selected Logics

The inference rules for a connective in an n-sided sequent calculus only depend
on the characteristic truth function for the connective. Thus, if some connective
is defined by the same truth function in different logics, the n-sided sequent
rules for the connective are the same in those logics.

Rules for ¬

The following rules for ¬ apply to Lukasiewicz and Kleene logics, and their
variants.

   Γ | Π | ∆, φ
   -------------- ¬F
   ¬φ, Γ | Π | ∆

   Γ | φ, Π | ∆
   -------------- ¬U
   Γ | ¬φ, Π | ∆

   φ, Γ | Π | ∆
   -------------- ¬T
   Γ | Π | ∆, ¬φ

The following rules for ¬ apply to Gödel logic.

   Γ | φ, Π | ∆, φ                φ, Γ | Π | ∆
   ----------------- ¬G F        -------------- ¬G T
    ¬φ, Γ | Π | ∆                 Γ | Π | ∆, ¬φ

(In Gödel logic, ¬φ can never take the value U, so there is no rule for the
middle position.)

Rules for ∧

These are the rules for ∧ in Lukasiewicz, strong Kleene, and Gödel logic.

   φ, ψ, Γ | Π | ∆
   ----------------- ∧F
   φ ∧ ψ, Γ | Π | ∆

   Γ | φ, Π | φ, ∆      Γ | ψ, Π | ψ, ∆      Γ | φ, ψ, Π | ∆
   ----------------------------------------------------------- ∧U
                      Γ | φ ∧ ψ, Π | ∆

   Γ | Π | ∆, φ      Γ | Π | ∆, ψ
   -------------------------------- ∧T
          Γ | Π | ∆, φ ∧ ψ

Rules for ∨

These are the rules for ∨ in Lukasiewicz, strong Kleene, and Gödel logic.

   φ, Γ | Π | ∆      ψ, Γ | Π | ∆
   -------------------------------- ∨F
          φ ∨ ψ, Γ | Π | ∆

   φ, Γ | φ, Π | ∆      ψ, Γ | ψ, Π | ∆      Γ | φ, ψ, Π | ∆
   ----------------------------------------------------------- ∨U
                      Γ | φ ∨ ψ, Π | ∆

   Γ | Π | ∆, φ, ψ
   ----------------- ∨T
   Γ | Π | ∆, φ ∨ ψ


Rules for →

These are the rules for → in Lukasiewicz logic.

   Γ | Π | ∆, φ      ψ, Γ | Π | ∆
   -------------------------------- →L3 F
          φ → ψ, Γ | Π | ∆

   Γ | φ, ψ, Π | ∆      ψ, Γ | Π | ∆, φ
   -------------------------------------- →L3 U
             Γ | φ → ψ, Π | ∆

   φ, Γ | ψ, Π | ∆, ψ      φ, Γ | φ, Π | ∆, ψ
   -------------------------------------------- →L3 T
                Γ | Π | ∆, φ → ψ

These are the rules for → in strong Kleene logic.

   Γ | Π | ∆, φ      ψ, Γ | Π | ∆
   -------------------------------- →Ks F
          φ → ψ, Γ | Π | ∆

   ψ, Γ | ψ, Π | ∆      Γ | φ, ψ, Π | ∆      Γ | φ, Π | ∆, φ
   ----------------------------------------------------------- →Ks U
                      Γ | φ → ψ, Π | ∆

   φ, Γ | Π | ∆, ψ
   ----------------- →Ks T
   Γ | Π | ∆, φ → ψ

These are the rules for → in Gödel logic.

   Γ | φ, Π | ∆, φ      ψ, Γ | Π | ∆
   ----------------------------------- →G3 F
            φ → ψ, Γ | Π | ∆

   Γ | ψ, Π | ∆      Γ | Π | ∆, φ
   -------------------------------- →G3 U
          Γ | φ → ψ, Π | ∆

   φ, Γ | ψ, Π | ∆, ψ      φ, Γ | φ, Π | ∆, ψ
   -------------------------------------------- →G3 T
                Γ | Π | ∆, φ → ψ

[Figure 48.1: Example derivation in L3. The derivation tree, which cannot be faithfully reproduced in this rendering, derives the end-sequent A → B, A | A → B, A | B from initial sequents by weakening, exchange, →U, and →F inferences; by Definition 48.4, it shows that A → B, A ⊢L3 B.]
Part XI

Normal Modal Logics


This part covers the metatheory of normal modal logics. It currently
consists of Aldo Antonelli’s notes on classical correspondence theory for
basic modal logic.

Chapter 49

Syntax and Semantics


49.1 Introduction

Modal logic deals with modal propositions and the entailment relations among
them. Examples of modal propositions are the following:

1. It is necessary that 2 + 2 = 4.

2. It is necessarily possible that it will rain tomorrow.

3. If it is necessarily possible that φ then it is possible that φ.

Possibility and necessity are not the only modalities: other unary connectives
are also classified as modalities, for instance, “it ought to be the case that φ,”
“It will be the case that φ,” “Dana knows that φ,” or “Dana believes that φ.”
Modal logic makes its first appearance in Aristotle's De Interpretatione: he
was the first to notice that necessity implies possibility, but not vice versa; that
possibility and necessity are inter-definable; that if φ ∧ ψ is possibly true then
φ is possibly true and ψ is possibly true, but not conversely; and that if φ → ψ
is necessary, then if φ is necessary, so is ψ.


The first modern approach to modal logic was the work of C. I. Lewis, cul-
minating with Lewis and Langford, Symbolic Logic (1932). Lewis & Langford
were unhappy with the representation of implication by means of the mate-
rial conditional: φ → ψ is a poor substitute for “φ implies ψ.” Instead, they
proposed to characterize implication as “Necessarily, if φ then ψ,” symbolized
as φ J ψ. In trying to sort out the different properties, Lewis identified five
different modal systems, S1, . . . , S4, S5, the last two of which are still in use.
The approach of Lewis and Langford was purely syntactical: they identified
reasonable axioms and rules and investigated what was provable with those
means. A semantic approach remained elusive for a long time, until a first
attempt was made by Rudolf Carnap in Meaning and Necessity (1947) using
the notion of a state description, i.e., a collection of atomic sentences (those
that are “true” in that state description). After lifting the truth definition to
arbitrary sentences φ, Carnap defines φ to be necessarily true if it is true in all
state descriptions. Carnap’s approach could not handle iterated modalities, in
that sentences of the form “Possibly necessarily . . . possibly φ” always reduce
to the innermost modality.
The major breakthrough in modal semantics came with Saul Kripke’s article
“A Completeness Theorem in Modal Logic” (JSL 1959). Kripke based his
work on Leibniz’s idea that a statement is necessarily true if it is true “at
all possible worlds.” This idea, though, suffers from the same drawbacks as
Carnap's, in that the truth of a statement at a world w (or a state description
s) does not depend on w at all. So Kripke assumed that worlds are related by
an accessibility relation R, and that a statement of the form “Necessarily φ”
is true at a world w if and only if φ is true at all worlds w′ accessible from
w. Semantics that provide some version of this approach are called Kripke
semantics and made possible the tumultuous development of modal logics (in
the plural).
When interpreted by the Kripke semantics, modal logic shows us what
relational structures look like “from the inside.” A relational structure is just
a set equipped with a binary relation (for instance, the set of students in
the class ordered by their social security number is a relational structure).
But in fact relational structures come in all sorts of domains: besides relative
possibility of states of the world, we can have epistemic states of some agent
related by epistemic possibility, or states of a dynamical system with their state
transitions, etc. Modal logic can be used to model all of these: the first gives
us ordinary, alethic, modal logic; the others give us epistemic logic, dynamic
logic, etc.
We focus on one particular angle, known to modal logicians as “correspon-
dence theory.” One of the most significant early discoveries of Kripke’s is that
many properties of the accessibility relation R (whether it is transitive, sym-
metric, etc.) can be characterized in the modal language itself by means of
appropriate “modal schemas.” Modal logicians say, for instance, that the re-
flexivity of R “corresponds” to the schema “If necessarily φ, then φ”. We
explore mainly the correspondence theory of a number of classical systems of
modal logic (e.g., S4 and S5) obtained by a combination of the schemas D, T,
B, 4, and 5.


49.2 The Language of Basic Modal Logic

Definition 49.1. The basic language of modal logic contains

1. The propositional constant for falsity ⊥.

2. The propositional constant for truth ⊤.

3. A denumerable set of propositional variables: p0 , p1 , p2 , . . .

4. The propositional connectives: ¬ (negation), ∧ (conjunction), ∨ (disjunction), → (conditional), ↔ (biconditional).

5. The modal operator □.

6. The modal operator ♢.

Definition 49.2. Formulas of the basic modal language are inductively de-
fined as follows:

1. ⊥ is an atomic formula.

2. ⊤ is an atomic formula.

3. Every propositional variable pi is an (atomic) formula.

4. If φ is a formula, then ¬φ is a formula.

5. If φ and ψ are formulas, then (φ ∧ ψ) is a formula.

6. If φ and ψ are formulas, then (φ ∨ ψ) is a formula.

7. If φ and ψ are formulas, then (φ → ψ) is a formula.

8. If φ and ψ are formulas, then (φ ↔ ψ) is a formula.

9. If φ is a formula, then □φ is a formula.

10. If φ is a formula, then ♢φ is a formula.

11. Nothing else is a formula.

If a formula φ does not contain □ or ♢, we say it is modal-free.


49.3 Simultaneous Substitution

An instance of a formula φ is the result of replacing all occurrences of a propositional
variable in φ by some other formula. We will refer to instances of
formulas often, both when discussing validity and when discussing derivability.
It therefore is useful to define the notion precisely.

Definition 49.3. Where φ is a modal formula all of whose propositional
variables are among p1, . . . , pn, and θ1, . . . , θn are also modal formulas, we
define φ[θ1/p1, . . . , θn/pn] as the result of simultaneously substituting each θi
for pi in φ. Formally, this is a definition by induction on φ:

1. φ ≡ ⊥: φ[θ1 /p1 , . . . , θn /pn ] is ⊥.

2. φ ≡ ⊤: φ[θ1 /p1 , . . . , θn /pn ] is ⊤.

3. φ ≡ q: φ[θ1 /p1 , . . . , θn /pn ] is q, provided q ̸≡ pi for i = 1, . . . , n.

4. φ ≡ pi : φ[θ1 /p1 , . . . , θn /pn ] is θi .

5. φ ≡ ¬ψ: φ[θ1 /p1 , . . . , θn /pn ] is ¬ψ[θ1 /p1 , . . . , θn /pn ].

6. φ ≡ (ψ ∧ χ): φ[θ1 /p1 , . . . , θn /pn ] is

(ψ[θ1 /p1 , . . . , θn /pn ] ∧ χ[θ1 /p1 , . . . , θn /pn ]).

7. φ ≡ (ψ ∨ χ): φ[θ1 /p1 , . . . , θn /pn ] is

(ψ[θ1 /p1 , . . . , θn /pn ] ∨ χ[θ1 /p1 , . . . , θn /pn ]).

8. φ ≡ (ψ → χ): φ[θ1 /p1 , . . . , θn /pn ] is

(ψ[θ1 /p1 , . . . , θn /pn ] → χ[θ1 /p1 , . . . , θn /pn ]).

9. φ ≡ (ψ ↔ χ): φ[θ1 /p1 , . . . , θn /pn ] is

(ψ[θ1 /p1 , . . . , θn /pn ] ↔ χ[θ1 /p1 , . . . , θn /pn ]).

10. φ ≡ □ψ: φ[θ1 /p1 , . . . , θn /pn ] is □ψ[θ1 /p1 , . . . , θn /pn ].

11. φ ≡ ♢ψ: φ[θ1 /p1 , . . . , θn /pn ] is ♢ψ[θ1 /p1 , . . . , θn /pn ].

The formula φ[θ1 /p1 , . . . , θn /pn ] is called a substitution instance of φ.

Example 49.4. Suppose φ is p1 → □(p1 ∧ p2 ), θ1 is ♢(p2 → p3 ) and θ2 is ¬□p1 .


Then φ[θ1 /p1 , θ2 /p2 ] is

♢(p2 → p3 ) → □(♢(p2 → p3 ) ∧ ¬□p1 )


while φ[θ2 /p1 , θ1 /p2 ] is

¬□p1 → □(¬□p1 ∧ ♢(p2 → p3 ))

Note that simultaneous substitution is in general not the same as iterated
substitution, e.g., compare φ[θ1/p1, θ2/p2] above with (φ[θ1/p1])[θ2/p2], which
is:

   ♢(p2 → p3) → □(♢(p2 → p3) ∧ p2)[¬□p1/p2], i.e.,
   ♢(¬□p1 → p3) → □(♢(¬□p1 → p3) ∧ ¬□p1)

and with (φ[θ2/p2])[θ1/p1]:

   p1 → □(p1 ∧ ¬□p1)[♢(p2 → p3)/p1], i.e.,
   ♢(p2 → p3) → □(♢(p2 → p3) ∧ ¬□♢(p2 → p3)).
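Simultaneous substitution per Definition 49.3 is a straightforward structural recursion. A sketch (the Var/App encoding of modal formulas is our own, not the book's):

from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class App:
    conn: str            # "not", "and", "or", "imp", "iff", "box", "dia", ...
    args: tuple = ()

def subst(phi, sigma):
    """Simultaneously replace each variable in sigma's domain by its image.
    Because all replacements happen at once, variables occurring inside
    the substituted formulas are left alone; this is exactly why the
    result can differ from iterated substitution, as in the example above."""
    if isinstance(phi, Var):
        return sigma.get(phi.name, phi)
    return App(phi.conn, tuple(subst(a, sigma) for a in phi.args))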


49.4 Relational Models

The basic concept of semantics for normal modal logics is that of a relational
model. It consists of a set of worlds, which are related by a binary "accessibility
relation," together with an assignment which determines which propositional
variables count as "true" at which worlds.
Definition 49.5. A model for the basic modal language is a triple M = ⟨W, R, V⟩, where

1. W is a nonempty set of "worlds,"

2. R is a binary accessibility relation on W, and

3. V is a function assigning to each propositional variable p a set V(p) of possible worlds.

When Rww′ holds, we say that w′ is accessible from w. When w ∈ V(p) we
say p is true at w.

The great advantage of relational semantics is that models can be represented
by means of simple diagrams, such as the one in Figure 49.1. Worlds are
represented by nodes, and world w′ is accessible from w precisely when there is
an arrow from w to w′. Moreover, we label a node (world) by p when w ∈ V(p),
and otherwise by ¬p. Figure 49.1 represents the model with W = {w1, w2, w3},
R = {⟨w1, w2⟩, ⟨w1, w3⟩}, V(p) = {w1, w2}, and V(q) = {w2}.


[Figure 49.1: A simple model. Three worlds: w1 labeled p, ¬q; w2 labeled p, q; w3 labeled ¬p, ¬q; with arrows from w1 to w2 and from w1 to w3.]
49.5 Truth at a World
Every modal model determines which modal formulas count as true at which
worlds in it. The relation "model M makes formula φ true at world w" is the
basic notion of relational semantics. The relation is defined inductively and
coincides with the usual characterization using truth tables for the non-modal
operators.

Definition 49.6. Truth of a formula φ at w in a model M, in symbols M, w ⊩ φ,
is defined inductively as follows:
1. φ ≡ ⊥: Never M, w ⊩ ⊥.
2. φ ≡ ⊤: Always M, w ⊩ ⊤.
3. M, w ⊩ p iff w ∈ V (p).
4. φ ≡ ¬ψ: M, w ⊩ φ iff M, w ⊮ ψ.
5. φ ≡ (ψ ∧ χ): M, w ⊩ φ iff M, w ⊩ ψ and M, w ⊩ χ.
6. φ ≡ (ψ ∨ χ): M, w ⊩ φ iff M, w ⊩ ψ or M, w ⊩ χ (or both).
7. φ ≡ (ψ → χ): M, w ⊩ φ iff M, w ⊮ ψ or M, w ⊩ χ.
8. φ ≡ (ψ ↔ χ): M, w ⊩ φ iff either both M, w ⊩ ψ and M, w ⊩ χ or
neither M, w ⊩ ψ nor M, w ⊩ χ.
9. φ ≡ □ψ: M, w ⊩ φ iff M, w′ ⊩ ψ for all w′ ∈ W with Rww′.

10. φ ≡ ♢ψ: M, w ⊩ φ iff M, w′ ⊩ ψ for at least one w′ ∈ W with Rww′.
Note that by clause (9), a formula □ψ is true at w whenever there are
no w′ with Rww′ . In such a case □ψ is vacuously true at w. Also, □ψ may
be satisfied at w even if ψ is not. The truth of ψ at w does not guarantee the
truth of ♢ψ at w. This holds, however, if Rww, e.g., if R is reflexive. If there
is no w′ such that Rww′ , then M, w ⊮ ♢φ, for any φ.
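Definitions 49.5 and 49.6 again read as a recursive algorithm. The following sketch builds on the Var/App encoding above; the triple-of-sets representation of a model and the names "box"/"dia" are our own choices:

def holds(M, w, phi):
    W, R, V = M                   # worlds, accessibility relation, assignment
    if isinstance(phi, Var):
        return w in V.get(phi.name, set())
    c, args = phi.conn, phi.args
    if c == "bot": return False
    if c == "top": return True
    if c == "not": return not holds(M, w, args[0])
    if c == "and": return holds(M, w, args[0]) and holds(M, w, args[1])
    if c == "or":  return holds(M, w, args[0]) or holds(M, w, args[1])
    if c == "imp": return (not holds(M, w, args[0])) or holds(M, w, args[1])
    if c == "iff": return holds(M, w, args[0]) == holds(M, w, args[1])
    # box is vacuously true at a world with no accessible worlds:
    if c == "box": return all(holds(M, u, args[0]) for u in W if (w, u) in R)
    if c == "dia": return any(holds(M, u, args[0]) for u in W if (w, u) in R)
    raise ValueError(f"unknown connective {c}")

# The model of Figure 49.1:
M = ({"w1", "w2", "w3"},
     {("w1", "w2"), ("w1", "w3")},
     {"p": {"w1", "w2"}, "q": {"w2"}})
print(holds(M, "w1", App("dia", (Var("p"),))))    # True: p holds at w2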


Problem 49.1. Consider the model of Figure 49.1. Which of the following
hold?
1. M, w1 ⊩ q;
2. M, w3 ⊩ ¬q;
3. M, w1 ⊩ p ∨ q;
4. M, w1 ⊩ □(p ∨ q);
5. M, w3 ⊩ □q;
6. M, w3 ⊩ □⊥;
7. M, w1 ⊩ ♢q;
8. M, w1 ⊩ □q;
9. M, w1 ⊩ ¬□□¬q.

Proposition 49.7.
1. M, w ⊩ □φ iff M, w ⊩ ¬♢¬φ.
2. M, w ⊩ ♢φ iff M, w ⊩ ¬□¬φ.

Proof. 1. M, w ⊩ ¬♢¬φ iff M, w ⊮ ♢¬φ by definition of M, w ⊩. M, w ⊩
♢¬φ iff for some w′ with Rww′, M, w′ ⊩ ¬φ. Hence, M, w ⊮ ♢¬φ iff for
all w′ with Rww′, M, w′ ⊮ ¬φ. We also have M, w′ ⊮ ¬φ iff M, w′ ⊩ φ.
Together we have M, w ⊩ ¬♢¬φ iff for all w′ with Rww′, M, w′ ⊩ φ.
Again by definition of M, w ⊩, that is the case iff M, w ⊩ □φ.

2. M, w ⊩ ¬□¬φ iff M, w ⊮ □¬φ. M, w ⊩ □¬φ iff for all w′ with Rww′,
M, w′ ⊩ ¬φ. Hence, M, w ⊮ □¬φ iff for some w′ with Rww′, M, w′ ⊮ ¬φ.
We also have M, w′ ⊮ ¬φ iff M, w′ ⊩ φ. Together we have M, w ⊩ ¬□¬φ
iff for some w′ with Rww′, M, w′ ⊩ φ. Again by definition of M, w ⊩,
that is the case iff M, w ⊩ ♢φ.

Problem 49.2. Complete the proof of Proposition 49.7.

Problem 49.3. Let M = ⟨W, R, V ⟩ be a model, and suppose w1 , w2 ∈ W are


such that:
1. w1 ∈ V (p) if and only if w2 ∈ V (p) (for every propositional variable p);
and
2. for all w ∈ W : Rw1 w if and only if Rw2 w.
Using induction on formulas, show that for all formulas φ: M, w1 ⊩ φ if and
only if M, w2 ⊩ φ.


Problem 49.4. Let M = ⟨W, R, V⟩. Show that M, w ⊩ ¬♢φ if and only if
M, w ⊩ □¬φ.


49.6 Truth in a Model

Sometimes we are interested in which formulas are true at every world in a
given model. Let's introduce a notation for this.

Definition 49.8. A formula φ is true in a model M = ⟨W, R, V⟩, written
M ⊩ φ, if and only if M, w ⊩ φ for every w ∈ W.

Proposition 49.9.

1. If M ⊩ φ then M ⊮ ¬φ, but not vice-versa.

2. If M ⊩ φ → ψ then M ⊩ φ only if M ⊩ ψ, but not vice-versa.

Proof. 1. If M ⊩ φ then φ is true at all worlds in W, and since W ≠ ∅, it
can't be that M ⊩ ¬φ, or else φ would have to be both true and false at
some world.

On the other hand, if M ⊮ ¬φ then φ is true at some world w ∈ W. It
does not follow that M, w ⊩ φ for every w ∈ W. For instance, in the
model of Figure 49.1, M ⊮ ¬p, and also M ⊮ p.

2. Assume M ⊩ φ → ψ and M ⊩ φ; to show M ⊩ ψ let w ∈ W be an
arbitrary world. Then M, w ⊩ φ → ψ and M, w ⊩ φ, so M, w ⊩ ψ, and
since w was arbitrary, M ⊩ ψ.

To show that the converse fails, we need to find a model M such that
M ⊩ φ only if M ⊩ ψ, but M ⊮ φ → ψ. Consider again the model
of Figure 49.1: M ⊮ p and hence (vacuously) M ⊩ p only if M ⊩ q.
However, M ⊮ p → q, as p is true but q false at w1.

Problem 49.5. Consider the following model M for the language comprising
p1, p2, p3 as the only propositional variables:

[Diagram: a model with worlds w1 (p1, ¬p2, ¬p3), w2 (p1, p2, ¬p3), and w3 (p1, p2, p3); the accessibility arrows are not recoverable from this rendering.]


Are the following formulas and schemas true in the model M, i.e., true at every
world in M? Explain.

1. p → ♢p (for p atomic);

2. φ → ♢φ (for φ arbitrary);

3. □p → p (for p atomic);

4. ¬p → ♢□p (for p atomic);

5. ♢□φ (for φ arbitrary);

6. □♢p (for p atomic).


49.7 Validity

Formulas that are true in all models, i.e., true at every world in every model,
are particularly interesting. They represent those modal propositions which are
true regardless of how □ and ♢ are interpreted, as long as the interpretation
is “normal” in the sense that it is generated by some accessibility relation on
possible worlds. We call such formulas valid. For instance, □(p ∧ q) → □p is
valid. Some formulas one might expect to be valid on the basis of the alethic
interpretation of □, such as □p → p, are not valid, however. Part of the interest
of relational models is that different interpretations of □ and ♢ can be captured
by different kinds of accessibility relations. This suggests that we should define
validity not just relative to all models, but relative to all models of a certain
kind. It will turn out, e.g., that □p → p is true in all models where every world
is accessible from itself, i.e., R is reflexive. Defining validity relative to classes
of models enables us to formulate this succinctly: □p → p is valid in the class
of reflexive models.

Definition 49.10. A formula φ is valid in a class C of models if it is true in
every model in C (i.e., true at every world in every model in C). If φ is valid
in C, we write C ⊨ φ, and we write ⊨ φ if φ is valid in the class of all models.

Proposition 49.11. If φ is valid in C it is also valid in each class C′ ⊆ C.

Proposition 49.12. If φ is valid, then so is □φ.

Proof. Assume ⊨ φ. To show ⊨ □φ, let M = ⟨W, R, V⟩ be a model and w ∈ W.
If Rww′ then M, w′ ⊩ φ, since φ is valid, and so also M, w ⊩ □φ. Since M
and w were arbitrary, ⊨ □φ.

Problem 49.6. Show that the following are valid:


1. □p → □(q → p);

2. □¬⊥;

3. □p → (□q → □p).

Problem 49.7. Show that φ → □φ is valid in the class C of models M =
⟨W, R, V⟩ where W = {w}. Similarly, show that ψ → □φ and ♢φ → ψ are valid
in the class of models M = ⟨W, R, V⟩ where R = ∅.


49.8 Tautological Instances

A modal-free formula is a tautology if it is true under every truth-value assignment.
Clearly, every tautology is true at every world in every model. But for
formulas involving □ and ♢, the notion of tautology is not defined. Is it the
case, e.g., that □p ∨ ¬□p—an instance of the principle of excluded middle—is
valid? The notion of a tautological instance helps: a tautological instance is a
formula that is a substitution instance of a (non-modal) tautology. It is not
surprising, but still requires proof, that every tautological instance is valid.

Definition 49.13. A modal formula ψ is a tautological instance if and only if
there is a modal-free tautology φ with propositional variables p1, . . . , pn and
formulas θ1, . . . , θn such that ψ ≡ φ[θ1/p1, . . . , θn/pn].

Lemma 49.14. Suppose φ is a modal-free formula whose propositional variables
are p1, . . . , pn, and let θ1, . . . , θn be modal formulas. Then for any
assignment v, any model M = ⟨W, R, V⟩, and any w ∈ W such that v(pi) = T if
and only if M, w ⊩ θi, we have that v ⊨ φ if and only if M, w ⊩ φ[θ1/p1, . . . , θn/pn].

Proof. By induction on φ.

1. φ ≡ ⊥: Both v ⊭ ⊥ and M, w ⊮ ⊥.

2. φ ≡ ⊤: Both v ⊨ ⊤ and M, w ⊩ ⊤.

3. φ ≡ pi :

v ⊨ pi ⇔ v(pi ) = T
by definition of v ⊨ pi
⇔ M, w ⊩ θi
by assumption
⇔ M, w ⊩ pi [θ1 /p1 , . . . , θn /pn ]
since pi [θ1 /p1 , . . . , θn /pn ] ≡ θi .


4. φ ≡ ¬ψ:

   v ⊨ ¬ψ ⇔ v ⊭ ψ                                 by definition of v ⊨
          ⇔ M, w ⊮ ψ[θ1/p1, . . . , θn/pn]        by induction hypothesis
          ⇔ M, w ⊩ ¬ψ[θ1/p1, . . . , θn/pn]       by definition of M, w ⊩.

5. φ ≡ (ψ ∧ χ):

v ⊨ ψ ∧ χ ⇔ v ⊨ ψ and v ⊨ χ
by definition of v ⊨
⇔ M, w ⊩ ψ[θ1 /p1 , . . . , θn /pn ] and
M, w ⊩ χ[θ1 /p1 , . . . , θn /pn ]
by induction hypothesis
⇔ M, w ⊩ (ψ ∧ χ)[θ1 /p1 , . . . , θn /pn ]
by definition of M, w ⊩.

6. φ ≡ (ψ ∨ χ):

v ⊨ ψ ∨ χ ⇔ v ⊨ ψ or v ⊨ χ
by definition of v ⊨;
⇔ M, w ⊩ ψ[θ1 /p1 , . . . , θn /pn ] or
M, w ⊩ χ[θ1 /p1 , . . . , θn /pn ]
by induction hypothesis
⇔ M, w ⊩ (ψ ∨ χ)[θ1 /p1 , . . . , θn /pn ]
by definition of M, w ⊩.

7. φ ≡ (ψ → χ):

v ⊨ ψ → χ ⇔ v ⊭ ψ or v ⊨ χ
by definition of v ⊨
⇔ M, w ⊮ ψ[θ1 /p1 , . . . , θn /pn ] or
M, w ⊩ χ[θ1 /p1 , . . . , θn /pn ]
by induction hypothesis
⇔ M, w ⊩ (ψ → χ)[θ1 /p1 , . . . , θn /pn ]
by definition of M, w ⊩.




8. φ ≡ (ψ ↔ χ):

v ⊨ ψ ↔ χ ⇔ either v ⊨ ψ and v ⊨ χ
or v ⊭ ψ and v ⊭ χ
by definition of v ⊨
⇔ either M, w ⊩ ψ[θ1 /p1 , . . . , θn /pn ] and
M, w ⊩ χ[θ1 /p1 , . . . , θn /pn ]
or M, w ⊮ ψ[θ1 /p1 , . . . , θn /pn ] and
M, w ⊮ χ[θ1 /p1 , . . . , θn /pn ]
by induction hypothesis
⇔ M, w ⊩ (ψ ↔ χ)[θ1 /p1 , . . . , θn /pn ]
by definition of M, w ⊩.

Proposition 49.15. All tautological instances are valid.

Proof. Contrapositively, suppose φ is such that M, w ⊮ φ[θ1/p1, . . . , θn/pn], for some model M and world w. Define an assignment v such that v(pi) = T if and only if M, w ⊩ θi (and v assigns arbitrary values to q ∉ {p1, . . . , pn}). Then by Lemma 49.14, v ⊭ φ, so φ is not a tautology.


49.9 Schemas and Validity


Definition 49.16. A schema is a set of formulas comprising all and only the
substitution instances of some modal formula χ, i.e.,

{ψ : ∃θ1 , . . . , ∃θn (ψ = χ[θ1 /p1 , . . . , θn /pn ])}.

The formula χ is called the characteristic formula of the schema, and it is unique up to a renaming of the propositional variables. A formula φ is an instance of a schema if it is a member of the set.

It is convenient to denote a schema by the meta-linguistic expression obtained by substituting ‘φ’, ‘ψ’, . . . , for the atomic components of χ. So, for
instance, the following denote schemas: ‘φ’, ‘φ → □φ’, ‘φ → (ψ → φ)’. They
correspond to the characteristic formulas p, p → □p, p → (q → p). The schema
‘φ’ denotes the set of all formulas.

Definition 49.17. A schema is true in a model if and only if all of its instances
are; and a schema is valid if and only if it is true in every model.




Proposition 49.18. The following schema K is valid:

□(φ → ψ) → (□φ → □ψ). (K)

Proof. We need to show that all instances of the schema are true at every world
in every model. So let M = ⟨W, R, V ⟩ and w ∈ W be arbitrary. To show that a
conditional is true at a world we assume the antecedent is true to show that the
consequent is true as well. In this case, let M, w ⊩ □(φ → ψ) and M, w ⊩ □φ.
We need to show M, w ⊩ □ψ. So let w′ be arbitrary such that Rww′ . Then by
the first assumption M, w′ ⊩ φ → ψ and by the second assumption M, w′ ⊩ φ.
It follows that M, w′ ⊩ ψ. Since w′ was arbitrary, M, w ⊩ □ψ.
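The proof can be spot-checked on a small model using the illustrative evaluator sketched in section 49.7 (this assumes that sketch's holds function and tuple encoding; the particular model is an arbitrary choice):

    p, q = ('var', 'p'), ('var', 'q')
    K_inst = ('imp', ('box', ('imp', p, q)),
              ('imp', ('box', p), ('box', q)))
    W, R, V = {1, 2}, {(1, 2)}, {'p': {2}, 'q': set()}
    print(all(holds(W, R, V, w, K_inst) for w in W))  # True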

Proposition 49.19. The following schema dual is valid:

♢φ ↔ ¬□¬φ. (dual)

Proof. Exercise.

Problem 49.8. Prove Proposition 49.19.

Proposition 49.20. If φ and φ → ψ are true at a world in a model then so is ψ. Hence, the valid formulas are closed under modus ponens.

Proposition 49.21. A formula φ is valid iff all its substitution instances are. In other words, a schema is valid iff its characteristic formula is.

Proof. The “if” direction is obvious, since φ is a substitution instance of itself.
To prove the “only if” direction, we show the following: Suppose M = ⟨W, R, V⟩ is a modal model, and ψ ≡ φ[θ1/p1, . . . , θn/pn] is a substitution instance of φ. Define M′ = ⟨W, R, V′⟩ by V′(pi) = {w : M, w ⊩ θi}. Then M, w ⊩ ψ iff M′, w ⊩ φ, for any w ∈ W. (We leave the proof as an exercise.) Now suppose that φ was valid, but some substitution instance ψ of φ was not valid. Then for some M = ⟨W, R, V⟩ and some w ∈ W, M, w ⊮ ψ. But then M′, w ⊮ φ by the claim, and φ is not valid, a contradiction.

Problem 49.9. Prove the claim in the “only if” part of the proof of Proposi-
tion 49.21. (Hint: use induction on φ.)

Note, however, that it is not true that a schema is true in a model iff its
characteristic formula is. Of course, the “only if” direction holds: if every
instance of φ is true in M, φ itself is true in M. But it may happen that φ is
true in M but some instance of φ is false at some world in M. For a very simple
counterexample consider p in a model with only one world w and V (p) = {w},
so that p is true at w. But ⊥ is an instance of p, and not true at w.

Problem 49.10. Show that none of the following formulas are valid:

D: □p → ♢p;




Valid Schemas                          Invalid Schemas

□(φ → ψ) → (♢φ → ♢ψ)                   □(φ ∨ ψ) → (□φ ∨ □ψ)
♢(φ → ψ) → (□φ → ♢ψ)                   (♢φ ∧ ♢ψ) → ♢(φ ∧ ψ)
□(φ ∧ ψ) ↔ (□φ ∧ □ψ)                   φ → □φ
□φ → □(ψ → φ)                          □♢φ → ψ
¬♢φ → □(φ → ψ)                         □□φ → □φ
♢(φ ∨ ψ) ↔ (♢φ ∨ ♢ψ)                   □♢φ → ♢□φ

Table 49.1: Valid and invalid schemas.
T: □p → p;
B: p → □♢p;
4: □p → □□p;
5: ♢p → □♢p.

Problem 49.11. Prove that the schemas in the first column of Table 49.1 are
valid and those in the second column are not valid.

Problem 49.12. Decide whether the following schemas are valid or invalid:
1. (♢φ → □ψ) → (□φ → □ψ);
2. ♢(φ → ψ) ∨ □(ψ → φ).

Problem 49.13. For each of the following schemas find a model M such that
every instance of the formula is true in M:
1. p → ♢♢p;
2. ♢p → □p.


49.10 Entailment

With the definition of truth at a world, we can define an entailment relation between formulas. A formula ψ entails φ iff, whenever ψ is true, φ is true as well. Here, “whenever” means both “whichever model we consider” as well as “whichever world in that model we consider.”
Definition 49.22. If Γ is a set of formulas and φ a formula, then Γ entails φ,
in symbols: Γ ⊨ φ, if and only if for every model M = ⟨W, R, V ⟩ and world
w ∈ W , if M, w ⊩ ψ for every ψ ∈ Γ , then M, w ⊩ φ. If Γ contains a single
formula ψ, then we write ψ ⊨ φ.



Figure 49.2: Counterexample to p → ♢p ⊨ □p → p. [Diagram: world w1, where ¬p, with arrows to worlds w2 and w3, where p holds.]
Example 49.23. To show that a formula entails another, we have to reason
about all models, using the definition of M, w ⊩. For instance, to show p→♢p ⊨
□¬p → ¬p, we might argue as follows: Consider a model M = ⟨W, R, V ⟩ and
w ∈ W , and suppose M, w ⊩ p → ♢p. We have to show that M, w ⊩ □¬p → ¬p.
Suppose not. Then M, w ⊩ □¬p and M, w ⊮ ¬p. Since M, w ⊮ ¬p, M, w ⊩ p.
By assumption, M, w ⊩ p → ♢p, hence M, w ⊩ ♢p. By definition of M, w ⊩ ♢p,
there is some w′ with Rww′ such that M, w′ ⊩ p. Since also M, w ⊩ □¬p,
M, w′ ⊩ ¬p, a contradiction.
To show that a formula ψ does not entail another φ, we have to give a
counterexample, i.e., a model M = ⟨W, R, V ⟩ where we show that at some
world w ∈ W , M, w ⊩ ψ but M, w ⊮ φ. Let’s show that p → ♢p ⊭ □p → p.
Consider the model in Figure 49.2. We have M, w1 ⊩ ♢p and hence M, w1 ⊩
p → ♢p. However, since M, w1 ⊩ □p but M, w1 ⊮ p, we have M, w1 ⊮ □p → p.
Often very simple counterexamples suffice. The model M′ = ⟨W ′ , R′ , V ′ ⟩
with W ′ = {w}, R′ = ∅, and V ′ (p) = ∅ is also a counterexample: Since
M′ , w ⊮ p, M′ , w ⊩ p → ♢p. As no worlds are accessible from w, we have
M′ , w ⊩ □p, and so M′ , w ⊮ □p → p.
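Both halves of this simple counterexample can be verified mechanically with the illustrative evaluator from section 49.7 (assuming that sketch):

    W, R, V = {'w'}, set(), {'p': set()}
    p = ('var', 'p')
    print(holds(W, R, V, 'w', ('imp', p, ('dia', p))))  # True: p fails at w
    print(holds(W, R, V, 'w', ('imp', ('box', p), p)))  # False: box p holds vacuously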

Problem 49.14. Show that □(φ ∧ ψ) ⊨ □φ.

Problem 49.15. Show that □(p → q) ⊭ p → □q and p → □q ⊭ □(p → q).

Chapter 50

Frame Definability



50.1 Introduction
One question that interests modal logicians is the relationship between the
accessibility relation and the truth of certain formulas in models with that ac-
cessibility relation. For instance, suppose the accessibility relation is reflexive,
i.e., for every w ∈ W , Rww. In other words, every world is accessible from
itself. That means that when □φ is true at a world w, w itself is among the
accessible worlds at which φ must therefore be true. So, if the accessibility
relation R of M is reflexive, then whatever world w and formula φ we take,
□φ → φ will be true there (in other words, the schema □p → p and all its
substitution instances are true in M).
The converse, however, is false. It’s not the case, e.g., that if □p → p is
true in M, then R is reflexive. For we can easily find a non-reflexive model M
where □p → p is true at all worlds: take the model with a single world w,
not accessible from itself, but with w ∈ V (p). By picking the truth value of p
suitably, we can make □φ → φ true in a model that is not reflexive.
The solution is to remove the variable assignment V from the equation. If
we require that □p→p is true at all worlds in M, regardless of which worlds are
in V (p), then it is necessary that R is reflexive. For in any non-reflexive model,
there will be at least one world w such that not Rww. If we set V (p) = W \{w},
then p will be true at all worlds other than w, and so at all worlds accessible
from w (since w is guaranteed not to be accessible from w, and w is the only
world where p is false). On the other hand, p is false at w, so □p → p is false
at w.
This suggests that we should introduce a notation for model structures with-
out a valuation: we call these frames. A frame F is simply a pair ⟨W, R⟩ con-
sisting of a set of worlds with an accessibility relation. Every model ⟨W, R, V ⟩
is then, as we say, based on the frame ⟨W, R⟩. Conversely, a frame determines
the class of models based on it; and a class of frames determines the class of
models which are based on any frame in the class. And we can define F ⊨ φ,
the notion of a formula being valid in a frame as: M ⊩ φ for all M based on F.
With this notation, we can establish correspondence relations between for-
mulas and classes of frames: e.g., F ⊨ □p → p if, and only if, F is reflexive.
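For a finite frame, F ⊨ φ can be checked by brute force, quantifying over all valuations. The following sketch is illustrative only: it assumes the true_in_model function from section 49.7 and considers a formula in the single variable p.

    from itertools import chain, combinations

    def frame_valid(W, R, phi, var='p'):
        # phi (in the single variable var) is valid in <W, R> iff it is
        # true in every model based on the frame, i.e., for every V(var).
        worlds = list(W)
        subsets = chain.from_iterable(
            combinations(worlds, k) for k in range(len(worlds) + 1))
        return all(true_in_model(W, R, {var: set(S)}, phi) for S in subsets)

    T = ('imp', ('box', ('var', 'p')), ('var', 'p'))
    print(frame_valid({1, 2}, {(1, 1), (2, 2)}, T))  # True: reflexive frame
    print(frame_valid({1, 2}, {(1, 2), (2, 1)}, T))  # False: irreflexive frame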


50.2 Properties of Accessibility Relations


Many modal formulas turn out to be characteristic of simple, and even familiar,
properties of the accessibility relation. In one direction, that means that any
model that has a given property makes a corresponding formula (and all its
substitution instances) true. We begin with five classical examples of kinds of
accessibility relations and the formulas the truth of which they guarantee.




If R is . . .                                      then . . . is true in M:

serial: ∀u∃v Ruv                                   □p → ♢p   (D)
reflexive: ∀w Rww                                  □p → p    (T)
symmetric: ∀u∀v(Ruv → Rvu)                         p → □♢p   (B)
transitive: ∀u∀v∀w((Ruv ∧ Rvw) → Ruw)              □p → □□p  (4)
euclidean: ∀w∀u∀v((Rwu ∧ Rwv) → Ruv)               ♢p → □♢p  (5)

Table 50.1: Five correspondence facts.
Figure 50.1: The argument from symmetry. [Diagram: worlds w and w′ with Rww′ and, by symmetry, Rw′w; w ⊩ φ and w ⊩ □♢φ, while w′ ⊩ ♢φ.]
Theorem 50.1. Let M = ⟨W, R, V⟩ be a model. If R has the property on the left side of Table 50.1, every instance of the formula on the right side is true in M.

Proof. Here is the case for B: to show that the schema is true in a model we
need to show that all of its instances are true at all worlds in the model. So
let φ → □♢φ be a given instance of B, and let w ∈ W be an arbitrary world.
Suppose the antecedent φ is true at w, in order to show that □♢φ is true at
w. So we need to show that ♢φ is true at all w′ accessible from w. Now, for
any w′ such that Rww′ we have, using the hypothesis of symmetry, that also
Rw′ w (see Figure 50.1). Since M, w ⊩ φ, we have M, w′ ⊩ ♢φ. Since w′ was
an arbitrary world such that Rww′ , we have M, w ⊩ □♢φ.
We leave the other cases as exercises.

Problem 50.1. Complete the proof of Theorem 50.1.

Notice that the converse implications of Theorem 50.1 do not hold: it’s not
true that if a model verifies a schema, then the accessibility relation of that
model has the corresponding property. In the case of T and reflexive models, it
is easy to give an example of a model in which T itself fails: let W = {w}, R = ∅, and V (p) = ∅. Then R is not reflexive, but M, w ⊩ □p and M, w ⊮ p. But here we
have just a single instance of T that fails in M; other instances, e.g., □¬p → ¬p
are true. It is harder to give examples where every substitution instance of T
is true in M and M is not reflexive. But there are such models, too:




If R is . . .                                          then . . . is true in M:

partially functional: ∀w∀u∀v((Rwu ∧ Rwv) → u = v)      ♢p → □p
functional: ∀w∃v∀u(Rwu ↔ u = v)                        ♢p ↔ □p
weakly dense: ∀u∀v(Ruv → ∃w(Ruw ∧ Rwv))                □□p → □p
weakly connected: ∀w∀u∀v((Rwu ∧ Rwv) →                 □((p ∧ □p) → q) ∨
  (Ruv ∨ u = v ∨ Rvu))                                 □((q ∧ □q) → p)   (L)
weakly directed: ∀w∀u∀v((Rwu ∧ Rwv) →                  ♢□p → □♢p   (G)
  ∃t(Rut ∧ Rvt))

Table 50.2: Five more correspondence facts.
Proposition 50.2. Let M = ⟨W, R, V⟩ be a model such that W = {u, v}, where worlds u and v are related by R: i.e., both Ruv and Rvu. Suppose that for all p: u ∈ V(p) ⇔ v ∈ V(p). Then:

1. For all φ: M, u ⊩ φ if and only if M, v ⊩ φ (use induction on φ).

2. Every instance of T is true in M.

Since M is not reflexive (it is, in fact, irreflexive), the converse of Theorem 50.1
fails in the case of T (similar arguments can be given for some—though not
all—the other schemas mentioned in Theorem 50.1).

Problem 50.2. Prove the claims in Proposition 50.2.
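While claim (1) concerns all formulas and so cannot be checked exhaustively, sampled instances of T can be tested with the illustrative evaluator of section 49.7 (assuming that sketch; the sample formulas are arbitrary choices):

    W, R = {'u', 'v'}, {('u', 'v'), ('v', 'u')}
    V = {'p': {'u', 'v'}, 'q': set()}  # u and v agree on all variables
    samples = [('var', 'p'), ('var', 'q'), ('dia', ('var', 'p')),
               ('box', ('not', ('var', 'q'))),
               ('and', ('var', 'p'), ('box', ('var', 'p')))]
    for phi in samples:
        # each instance of T, box phi -> phi, is true in M
        assert true_in_model(W, R, V, ('imp', ('box', phi), phi))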

Although we will focus on the five classical formulas D, T, B, 4, and 5, we record in Table 50.2 a few more properties of accessibility relations. The
accessibility relation R is partially functional, if from every world at most
one world is accessible. If it is the case that from every world exactly one
world is accessible, we call it functional. (Thus the functional relations are
precisely those that are both serial and partially functional). They are called
“functional” because the accessibility relation operates like a (partial) function.
A relation is weakly dense if whenever Ruv, there is a w “between” u and v.
So weakly dense relations are in a sense the opposite of transitive relations: in
a transitive relation, whenever you can reach v from u by a detour via w, you
can reach v from u directly; in a weakly dense relation, whenever you can reach
v from u directly, you can also reach it by a detour via some w. A relation is
weakly directed if whenever you can reach worlds u and v from some world w,
you can reach a single world t from both u and v—this is sometimes called the
“diamond property” or “confluence.”
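For finite frames the left-hand conditions of Table 50.1 (and similarly those of Table 50.2) can be written as executable predicates; the following sketch is illustrative, with names chosen here:

    def serial(W, R):     return all(any((u, v) in R for v in W) for u in W)
    def reflexive(W, R):  return all((w, w) in R for w in W)
    def symmetric(W, R):  return all((v, u) in R for (u, v) in R)
    def transitive(W, R): return all((u, w) in R
                                     for (u, v) in R for (x, w) in R if v == x)
    def euclidean(W, R):  return all((u, v) in R
                                     for (w, u) in R for (x, v) in R if w == x)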




Problem 50.3. Let M = ⟨W, R, V⟩ be a model. Show that if R satisfies the left-hand properties of Table 50.2, every instance of the corresponding right-hand formula is true in M.


50.3 Frames
Definition 50.3. A frame is a pair F = ⟨W, R⟩ where W is a non-empty set
of worlds and R a binary relation on W . A model M is based on a frame
F = ⟨W, R⟩ if and only if M = ⟨W, R, V ⟩ for some valuation V .

Definition 50.4. If F is a frame, we say that φ is valid in F, F ⊨ φ, if M ⊩ φ for every model M based on F.
If F is a class of frames, we say φ is valid in F, F ⊨ φ, iff F ⊨ φ for every frame F ∈ F.

The reason frames are interesting is that correspondence between schemas and properties of the accessibility relation R is at the level of frames, not of models. For instance, although T is true in all reflexive models, not every model in which T is true is reflexive. However, it is true that not only is T valid on all reflexive frames, but also every frame in which T is valid is reflexive.

Remark 6. Validity in a class of frames is a special case of the notion of validity in a class of models: F ⊨ φ iff C ⊨ φ where C is the class of all models based on a frame in F.
Obviously, if a formula or a schema is valid, i.e., valid with respect to the class of all models, it is also valid with respect to any class F of frames.


50.4 Frame Definability


Even though the converse implications of Theorem 50.1 fail, they hold if we
replace “model” by “frame”: for the properties considered in Theorem 50.1, it
is true that if a formula is valid in a frame then the accessibility relation of
that frame has the corresponding property. So, the formulas considered define
the classes of frames that have the corresponding property.

Definition 50.5. If F is a class of frames, we say φ defines F iff F ⊨ φ for all and only frames F ∈ F.

We now proceed to establish the full definability results for frames.




Theorem 50.6. If the formula on the right side of Table 50.1 is valid in a frame F, then F has the property on the left side.

Proof. 1. Suppose D is valid in F = ⟨W, R⟩, i.e., F ⊨ □p → ♢p. Let M = ⟨W, R, V⟩ be a model based on F, and w ∈ W. We have to show that there is a v such that Rwv. Suppose not: then both M, w ⊩ □φ and M, w ⊮ ♢φ for any φ, including p. But then M, w ⊮ □p → ♢p, contradicting the assumption that F ⊨ □p → ♢p.

2. Suppose T is valid in F, i.e., F ⊨ □p → p. Let w ∈ W be an arbitrary world; we need to show Rww. Let u ∈ V(p) if and only if Rwu (when q is other than p, V(q) is arbitrary, say V(q) = ∅). Let M = ⟨W, R, V⟩. By construction, for all u such that Rwu: M, u ⊩ p, and hence M, w ⊩ □p. But by hypothesis □p → p is true at w, so that M, w ⊩ p, but by definition of V this is possible only if Rww.

3. We prove the contrapositive: Suppose F is not symmetric; we show that B, i.e., p → □♢p, is not valid in F = ⟨W, R⟩. If F is not symmetric, there are u, v ∈ W such that Ruv but not Rvu. Define V such that w ∈ V(p) if and only if not Rvw (and V is arbitrary otherwise). Let M = ⟨W, R, V⟩. Now, by definition of V, M, w ⊩ p for all w such that not Rvw; in particular, M, u ⊩ p since not Rvu. Also, since Rvw iff w ∉ V(p), there is no w such that Rvw and M, w ⊩ p, and hence M, v ⊮ ♢p. Since Ruv, also M, u ⊮ □♢p. It follows that M, u ⊮ p → □♢p, and so B is not valid in F.

4. Suppose 4 is valid in F = ⟨W, R⟩, i.e., F ⊨ □p → □□p, and let u, v, w ∈ W be arbitrary worlds such that Ruv and Rvw; we need to show that Ruw. Define V such that z ∈ V(p) if and only if Ruz (and V is arbitrary otherwise). Let M = ⟨W, R, V⟩. By definition of V, M, z ⊩ p for all z such that Ruz, and hence M, u ⊩ □p. But by hypothesis 4, □p → □□p, is true at u, so that M, u ⊩ □□p. Since Ruv and Rvw, we have M, w ⊩ p, but by definition of V this is possible only if Ruw, as desired.

5. We proceed contrapositively, assuming that the frame F = ⟨W, R⟩ is not euclidean, and show that it falsifies 5, i.e., F ⊭ ♢p → □♢p. Suppose there are worlds u, v, w ∈ W such that Rwu and Rwv but not Ruv. Define V such that for all worlds z, z ∈ V(p) if and only if it is not the case that Ruz. Let M = ⟨W, R, V⟩. Then by hypothesis M, v ⊩ p and since Rwv also M, w ⊩ ♢p. However, there is no world y such that Ruy and M, y ⊩ p, so M, u ⊮ ♢p. Since Rwu, it follows that M, w ⊮ □♢p, so that 5, ♢p → □♢p, fails at w.

You’ll notice a difference between the proof for D and the other cases: no
mention was made of the valuation V . In effect, we proved that if M ⊩ D then
M is serial. So D defines the class of serial models, not just frames.




Corollary 50.7. Any model where D is true is serial.

Corollary 50.8. Each formula on the right side of Table 50.1 defines the class
of frames which have the property on the left side.

Proof. In Theorem 50.1, we proved that if a model has the property on the
left, the formula on the right is true in it. Thus, if a frame F has the property
on the left, the formula on the right is valid in F. In Theorem 50.6, we proved
the converse implications: if a formula on the right is valid in F, F has the
property on the left.

Problem 50.4. Show that if the formula on the right side of Table 50.2 is valid
in a frame F, then F has the property on the left side. To do this, consider a
frame that does not satisfy the property on the left, and define a suitable V
such that the formula on the right is false at some world.

Theorem 50.6 also shows that the properties can be combined: for instance
if both B and 4 are valid in F then the frame is both symmetric and transitive,
etc. Many important modal logics are characterized as the set of formulas valid
in all frames that combine some frame properties, and so we can characterize
them as the set of formulas valid in all frames in which the corresponding
defining formulas are valid. For instance, the classical system S4 is the set of
all formulas valid in all reflexive and transitive frames, i.e., in all those where
both T and 4 are valid. S5 is the set of all formulas valid in all reflexive,
symmetric, and euclidean frames, i.e., all those where all of T, B, and 5 are
valid.
Logical relationships between properties of R in general correspond to re-
lationships between the corresponding defining formulas. For instance, every
reflexive relation is serial; hence, whenever T is valid in a frame, so is D. (Note
that this relationship is not that of entailment. It is not the case that whenever
M, w ⊩ T then M, w ⊩ D.) We record some such relationships.
Proposition 50.9. Let R be a binary relation on a set W; then:
1. If R is reflexive, then it is serial.
2. If R is symmetric, then it is transitive if and only if it is euclidean.
3. If R is symmetric or euclidean then it is weakly directed (it has the “di-
amond property”).
4. If R is euclidean then it is weakly connected.
5. If R is functional then it is serial.
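For small W such relationships can be sanity-checked by brute force; this illustrative sketch (assuming the property predicates sketched in section 50.2) confirms part (2) over all 512 relations on a three-element set:

    from itertools import product

    W = {0, 1, 2}
    pairs = [(u, v) for u in W for v in W]
    for bits in product([False, True], repeat=len(pairs)):
        R = {pr for pr, b in zip(pairs, bits) if b}
        if symmetric(W, R):
            # for symmetric R, transitivity and euclideanness coincide
            assert transitive(W, R) == euclidean(W, R)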

Problem 50.5. Prove Proposition 50.9.





50.5 First-order Definability


nml:frd:fol: We’ve seen that a number of properties of accessibility relations of frames
sec
can be defined by modal formulas. For instance, symmetry of frames can be
defined by the formula B, p → □♢p. The conditions we’ve encountered so far
can all be expressed by first-order formulas in a language involving a single two-
place predicate symbol. For instance, symmetry is defined by ∀x ∀y (Q(x, y) →
Q(y, x)) in the sense that a first-order structure M with |M| = W and Q^M = R
satisfies the preceding formula iff R is symmetric. This suggests the following
definition:

Definition 50.10. A class F of frames is first-order definable if there is a sentence φ in the first-order language with a single two-place predicate symbol Q such that F = ⟨W, R⟩ ∈ F iff M ⊨ φ in the first-order structure M with |M| = W and Q^M = R.

It turns out that the properties and modal formulas that define them con-
sidered so far are exceptional. Not every formula defines a first-order definable
class of frames, and not every first-order definable class of frames is definable
by a modal formula.
A counterexample to the first is given by the Löb formula:

□(□p → p) → □p. (W)

W defines the class of transitive and converse well-founded frames. A relation is well-founded if there is no infinite sequence w1, w2, . . . such that Rw2w1, Rw3w2, . . . . For instance, the relation < on N is well-founded, whereas the relation < on Z is not. A relation is converse well-founded iff its converse is well-founded. So converse well-founded relations are those where there is no infinite sequence w1, w2, . . . such that Rw1w2, Rw2w3, . . . .
There is, however, no first-order sentence defining the transitive converse well-founded relations. For suppose β were such that M ⊨ β iff R = Q^M is transitive and converse well-founded. Let φn be the formula

(Q(a1, a2) ∧ · · · ∧ Q(an−1, an))

Now consider the set of formulas

Γ = {β, φ1 , φ2 , . . . }.

Every finite subset of Γ is satisfiable: let k be the largest index such that φk is in the subset, and let Mk be the structure with |Mk| = {1, . . . , k}, ai^Mk = i for i ≤ k (and, say, ai^Mk = k for i > k), and Q^Mk = <. Since < on {1, . . . , k} is transitive and converse well-founded, Mk ⊨ β, and Mk ⊨ φi by construction, for all i ≤ k. By the Compactness Theorem for first-order logic, Γ is satisfiable in some structure M. By hypothesis, since M ⊨ β, the relation Q^M is converse well-founded. But clearly, a1^M, a2^M, . . . would form an infinite sequence of the kind ruled out by converse well-foundedness, a contradiction.




A counterexample to the second claim is given by the property of universality: for every u and v, Ruv. Universal frames are first-order definable by
the formula ∀x ∀y Q(x, y). However, no modal formula is valid in all and only
the universal frames. This is a consequence of a result that is independently
interesting: the formulas valid in universal frames are exactly the same as those
valid in reflexive, symmetric, and transitive frames. There are reflexive, sym-
metric, and transitive frames that are not universal, hence every formula valid
in all universal frames is also valid in some non-universal frames.


50.6 Equivalence Relations and S5


The modal logic S5 is characterized as the set of formulas valid on all universal
frames, i.e., every world is accessible from every world, including itself. In such
a scenario, □ corresponds to necessity and ♢ to possibility: □φ is true if φ is
true at every world, and ♢φ is true if φ is true at some world. It turns out that
S5 can also be characterized as the formulas valid on all reflexive, symmetric,
and transitive frames, i.e., on all equivalence relations.

Definition 50.11. A binary relation R on W is an equivalence relation if and only if it is reflexive, symmetric and transitive. A relation R on W is universal if and only if Ruv for all u, v ∈ W.

Since T, B, and 4 characterize the reflexive, symmetric, and transitive frames, the frames where the accessibility relation is an equivalence relation are exactly those in which all three formulas are valid. It turns out that the equivalence relations can also be characterized by other combinations of formulas, since the conditions with which we've defined equivalence relations are equivalent to combinations of other familiar conditions on R.

Proposition 50.12. The following are equivalent:
1. R is an equivalence relation;

2. R is reflexive and euclidean;

3. R is serial, symmetric, and euclidean;

4. R is serial, symmetric, and transitive.

Proof. Exercise.

Problem 50.6. Prove Proposition 50.12 by showing:

1. If R is symmetric and transitive, it is euclidean.

2. If R is reflexive, it is serial.




3. If R is reflexive and euclidean, it is symmetric.


4. If R is symmetric and euclidean, it is transitive.
5. If R is serial, symmetric, and transitive, it is reflexive.
Explain why this suffices for the proof that the conditions are equivalent.

Proposition 50.12 is the semantic counterpart to Proposition 51.30, in that it gives an equivalent characterization of the modal logic of frames over which R is an equivalence relation (the logic traditionally referred to as S5).
What is the relationship between universal and equivalence relations? Al-
though every universal relation is an equivalence relation, clearly not every
equivalence relation is universal. However, the formulas valid on all universal
relations are exactly the same as those valid on all equivalence relations.
Proposition 50.13. Let R be an equivalence relation, and for each w ∈ W
define the equivalence class of w as the set [w] = {w′ ∈ W : Rww′ }. Then:
1. w ∈ [w];
2. R is universal on each equivalence class [w];
3. The collection of equivalence classes partitions W into mutually exclusive
and jointly exhaustive subsets.
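For finite W the partition is directly computable; an illustrative sketch (names chosen here):

    def equivalence_classes(W, R):
        # The equivalence class of w is {w' in W : Rww'}; collecting the
        # classes as frozensets removes duplicates, yielding the partition.
        return {frozenset(v for v in W if (w, v) in R) for w in W}

    W = {1, 2, 3, 4}
    R = {(u, v) for u in W for v in W if (u <= 2) == (v <= 2)}
    print(equivalence_classes(W, R))  # {frozenset({1, 2}), frozenset({3, 4})}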

Proposition 50.14. A formula φ is valid in all frames F = ⟨W, R⟩ where R is an equivalence relation, if and only if it is valid in all frames F = ⟨W, R⟩ where R is universal. Hence, the logic of universal frames is just S5.

Proof. It’s immediate to verify that a universal relation R on W is an equiva-


lence. Hence, if φ is valid in all frames where R is an equivalence it is valid in
all universal frames. For the other direction, we argue contrapositively: sup-
pose ψ is a formula that fails at a world w in a model M = ⟨W, R, V ⟩ based
on a frame ⟨W, R⟩, where R is an equivalence on W . So M, w ⊮ ψ. Define a
model M′ = ⟨W ′ , R′ , V ′ ⟩ as follows:
1. W ′ = [w];
2. R′ is universal on W ′ ;
3. V ′ (p) = V (p) ∩ W ′ .
(So the set W ′ of worlds in M′ is represented by the shaded area in Figure 50.2.)
It is easy to see that R and R′ agree on W ′ . Then one can show by induction
on formulas that for all w′ ∈ W ′ : M′ , w′ ⊩ φ if and only if M, w′ ⊩ φ for each
φ (this makes sense since W ′ ⊆ W ). In particular, M′ , w ⊮ ψ, and ψ fails in a
model based on a universal frame.





Figure 50.2: A partition of W in equivalence classes. [Diagram: W divided into classes [w], [z], [u], and [v], with [w] shaded.]
50.7 Second-order Definability
Not every frame property definable by modal formulas is first-order definable.
However, if we allow quantification over one-place predicates (i.e., monadic
second-order quantification), we can define all modally definable frame properties.
The trick is to exploit a systematic way in which the conditions under which a
modal formula is true at a world are related to first-order formulas. This is the
so-called standard translation of modal formulas into first-order formulas in a
language containing not just a two-place predicate symbol Q for the accessi-
bility relation, but also a one-place predicate symbol Pi for the propositional
variables pi occurring in φ.

Definition 50.15. The standard translation STx(φ) is inductively defined as follows:

1. φ ≡ ⊥: STx (φ) = ⊥.

2. φ ≡ ⊤: STx (φ) = ⊤.

3. φ ≡ pi : STx (φ) = Pi (x).

4. φ ≡ ¬ψ: STx (φ) = ¬STx (ψ).

5. φ ≡ (ψ ∧ χ): STx (φ) = (STx (ψ) ∧ STx (χ)).

6. φ ≡ (ψ ∨ χ): STx (φ) = (STx (ψ) ∨ STx (χ)).

7. φ ≡ (ψ → χ): STx (φ) = (STx (ψ) → STx (χ)).

8. φ ≡ (ψ ↔ χ): STx (φ) = (STx (ψ) ↔ STx (χ)).

9. φ ≡ □ψ: STx (φ) = ∀y (Q(x, y) → STy (ψ)).

10. φ ≡ ♢ψ: STx (φ) = ∃y (Q(x, y) ∧ STy (ψ)).
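The clauses above translate directly into a recursive function. The following sketch is illustrative (the string output format and the fresh-variable scheme y0, y1, . . . are choices made here, and it reuses the tuple encoding of formulas from chapter 49):

    def ST(x, phi, n=0):
        # Standard translation ST_x(phi); n indexes the fresh bound
        # variables introduced for box and diamond.
        op = phi[0]
        if op == 'bot': return 'False'
        if op == 'top': return 'True'
        if op == 'var': return f'P{phi[1]}({x})'
        if op == 'not': return f'~{ST(x, phi[1], n)}'
        if op in ('and', 'or', 'imp', 'iff'):
            sym = {'and': ' & ', 'or': ' | ', 'imp': ' -> ', 'iff': ' <-> '}[op]
            return f'({ST(x, phi[1], n)}{sym}{ST(x, phi[2], n)})'
        if op == 'box':
            y = f'y{n}'
            return f'forall {y} (Q({x},{y}) -> {ST(y, phi[1], n + 1)})'
        if op == 'dia':
            y = f'y{n}'
            return f'exists {y} (Q({x},{y}) & {ST(y, phi[1], n + 1)})'

    print(ST('x', ('imp', ('box', ('var', 'p')), ('var', 'p'))))
    # (forall y0 (Q(x,y0) -> Pp(y0)) -> Pp(x))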




For instance, STx (□p → p) is ∀y (Q(x, y) → P (y)) → P (x). Any structure for
the language of STx (φ) requires a domain, a two-place relation assigned to Q,
and subsets of the domain assigned to the one-place predicate symbols Pi .
In other words, the components of such a structure are exactly those of a
model for φ: the domain is the set of worlds, the two-place relation assigned
to Q is the accessibility relation, and the subsets assigned to Pi are just the
assignments V (pi ). It won’t surprise that satisfaction of φ in a modal model
and of STx (φ) in the corresponding structure agree:
Proposition 50.16. Let M = ⟨W, R, V⟩, let M′ be the first-order structure with |M′| = W, Q^M′ = R, and Pi^M′ = V(pi), and let s(x) = w. Then

M, w ⊩ φ iff M′, s ⊨ STx(φ)

Proof. By induction on φ.

Proposition 50.17. Suppose φ is a modal formula and F = ⟨W, R⟩ is a frame. Let F′ be the first-order structure with |F′| = W and Q^F′ = R, and let φ′ be the second-order formula

∀X1 . . . ∀Xn ∀x STx(φ)[X1/P1, . . . , Xn/Pn],

where P1 , . . . , Pn are all one-place predicate symbols in STx (φ). Then

F ⊨ φ iff F′ ⊨ φ′

Proof. F′ ⊨ φ′ iff for every structure M′ where Pi^M′ ⊆ W for i = 1, . . . , n, and
for every s with s(x) ∈ W , M′ , s ⊨ STx (φ). By Proposition 50.16, that is the
case iff for all models M based on F and every world w ∈ W , M, w ⊩ φ, i.e.,
F ⊨ φ.

Definition 50.18. A class F of frames is second-order definable if there is a sentence φ in the second-order language with a single two-place predicate symbol Q and quantifiers only over monadic set variables such that F = ⟨W, R⟩ ∈ F iff M ⊨ φ in the structure M with |M| = W and Q^M = R.

Corollary 50.19. If a class of frames is definable by a formula φ, the corresponding class of accessibility relations is definable by a monadic second-order sentence.

Proof. The monadic second-order sentence φ′ of the preceding proof has the
required property.

As an example, consider again the formula □p → p. It defines reflexivity. Reflexivity is of course first-order definable by the sentence ∀x Q(x, x). But it is also definable by the monadic second-order sentence
is also definable by the monadic second-order sentence

∀X ∀x (∀y (Q(x, y) → X(y)) → X(x)).



This means, of course, that the two sentences are equivalent. Here’s how
you might convince yourself of this directly: First suppose the second-order
sentence is true in a structure M. Since x and X are universally quantified,
the remainder must hold for any x ∈ W and set X ⊆ W , e.g., the set {z : Rxz}
where R = QM . So, for any s with s(x) ∈ W and s(X) = {z : Rxz} we have
M ⊨ ∀y (Q(x, y) → X(y)) → X(x). But by the way we’ve picked s(X) that
means M, s ⊨ ∀y (Q(x, y) → Q(x, y)) → Q(x, x), which is equivalent to Q(x, x)
since the antecedent is valid. Since s(x) is arbitrary, we have M ⊨ ∀x Q(x, x).
Now suppose that M ⊨ ∀x Q(x, x) and show that M ⊨ ∀X ∀x (∀y (Q(x, y)→
X(y))→X(x)). Pick any assignment s, and assume M, s ⊨ ∀y (Q(x, y)→X(y)).
Let s′ be the y-variant of s with s′ (y) = s(x); we have M, s′ ⊨ Q(x, y) → X(y),
i.e., M, s ⊨ Q(x, x) → X(x). Since M ⊨ ∀x Q(x, x), the antecedent is true, and
we have M, s ⊨ X(x), which is what we needed to show.
Since some definable classes of frames are not first-order definable, not
every monadic second-order sentence of the form φ′ is equivalent to a first-
order sentence. There is no effective method to decide which ones are.

Chapter 51

Axiomatic Derivations


51.1 Introduction
We have a semantics for the basic modal language in terms of modal models,
and a notion of a formula being valid—true at all worlds in all models—or valid
with respect to some class of models or frames—true at all worlds in all models
in the class, or based on the frame. Logic usually connects such semantic
characterizations of validity with a proof-theoretic notion of derivability. The
aim is to define a notion of derivability in some system such that a formula is
derivable iff it is valid.
The simplest and historically oldest derivation systems are so-called Hilbert-
type or axiomatic derivation systems. Hilbert-type derivation systems for many
modal logics are relatively easy to construct: they are simple as objects of
metatheoretical study (e.g., to prove soundness and completeness). However,


they are much harder to use to prove formulas in than, say, natural deduction
systems.
In Hilbert-type derivation systems, a derivation of a formula is a sequence
of formulas leading from certain axioms, via a handful of inference rules, to
the formula in question. Since we want the derivation system to match the
semantics, we have to guarantee that the set of derivable formulas are true
in all models (or true in all models in which all axioms are true). We’ll first
isolate some properties of modal logics that are necessary for this to work: the
“normal” modal logics. For normal modal logics, there are only two inference
rules that need to be assumed: modus ponens and necessitation. As axioms we
take all (substitution instances) of tautologies, and, depending on the modal
logic we deal with, a number of modal axioms. Even if we are just interested
in the class of all models, we must also count all substitution instances of K
and Dual as axioms. This alone generates the minimal normal modal logic K.

Definition 51.1. The rule of modus ponens is the inference schema

    φ    φ → ψ
    ----------- mp
         ψ

We say a formula ψ follows from formulas φ, χ by modus ponens iff χ ≡ φ → ψ.

Definition 51.2. The rule of necessitation is the inference schema

     φ
    ---- nec
    □φ

We say the formula ψ follows from the formula φ by necessitation iff ψ ≡ □φ.

Definition 51.3. A derivation from a set of axioms Σ is a sequence of formulas ψ1, ψ2, . . . , ψn, where each ψi is either

1. a substitution instance of a tautology, or

2. a substitution instance of a formula in Σ, or

3. follows from two formulas ψj, ψk with j, k < i by modus ponens, or

4. follows from a formula ψj with j < i by necessitation.

If there is such a derivation with ψn ≡ φ, we say that φ is derivable from Σ, in symbols Σ ⊢ φ.
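The conditions on mp and nec steps are purely local, so checking them mechanically is straightforward. The sketch below is illustrative (names chosen here): it reuses the tuple encoding of formulas from chapter 49, takes axiom steps on trust (recognizing tautological instances is a separate matter), and verifies only the inference links; the example run checks the shape of the K-derivation of □p → □(q → p) given in section 51.4 below.

    def check_links(derivation):
        # derivation: list of (formula, justification) pairs, where a
        # justification is ('axiom',), ('mp', j, k), or ('nec', j) with
        # j, k indices of earlier lines; for mp, line k must be the
        # conditional with line j as antecedent and this line as consequent.
        for i, (phi, just) in enumerate(derivation):
            if just[0] == 'mp':
                j, k = just[1], just[2]
                if not (j < i and k < i and
                        derivation[k][0] == ('imp', derivation[j][0], phi)):
                    return False
            elif just[0] == 'nec':
                j = just[1]
                if not (j < i and phi == ('box', derivation[j][0])):
                    return False
        return True

    p, q = ('var', 'p'), ('var', 'q')
    l0 = ('imp', p, ('imp', q, p))                       # tautology
    l1 = ('box', l0)                                     # nec, 0
    l3 = ('imp', ('box', p), ('box', ('imp', q, p)))     # goal
    l2 = ('imp', l1, l3)                                 # instance of K
    print(check_links([(l0, ('axiom',)), (l1, ('nec', 0)),
                       (l2, ('axiom',)), (l3, ('mp', 1, 2))]))  # True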

With this definition, it will turn out that the set of derivable formulas forms
a normal modal logic, and that any derivable formula is true in every model
in which every axiom is true. This property of derivations is called soundness.
The converse, completeness, is harder to prove.





51.2 Normal Modal Logics


Not every set of modal formulas can easily be characterized as those formulas
derivable from a set of axioms. We want modal logics to be well-behaved. First
of all, everything we can derive in classical propositional logic should still be
derivable, of course taking into account that the formulas may now contain also
□ and ♢. To this end, we require that a modal logic contain all tautological
instances and be closed under modus ponens.

Definition 51.4. A modal logic is a set Σ of modal formulas which

1. contains all tautologies,

2. is closed under substitution, i.e., if φ ∈ Σ, and θ1, . . . , θn are formulas, then φ[θ1/p1, . . . , θn/pn] ∈ Σ, and

3. is closed under modus ponens, i.e., if φ ∈ Σ and φ → ψ ∈ Σ, then ψ ∈ Σ.

In order to use the relational semantics for modal logics, we also have to
require that all formulas valid in all modal models are included. It turns
out that this requirement is met as soon as all instances of K and dual are
derivable, and whenever a formula φ is derivable, so is □φ. A modal logic that
satisfies these conditions is called normal. (Of course, there are also non-normal
modal logics, but the usual relational models are not adequate for them.)

Definition 51.5. A modal logic Σ is normal if it contains

□(p → q) → (□p → □q),   (K)
♢p ↔ ¬□¬p,   (dual)

and is closed under necessitation, i.e., if φ ∈ Σ, then □φ ∈ Σ.

Observe that while tautological implication is “fine-grained” enough to preserve truth at a world, the rule nec only preserves truth in a model (and hence also validity in a frame or in a class of frames).

Proposition 51.6. Every normal modal logic is closed under rule rk:

    φ1 → (φ2 → · · · (φn−1 → φn) · · · )
    ------------------------------------------ rk
    □φ1 → (□φ2 → · · · (□φn−1 → □φn) · · · )

Proof. By induction on n: If n = 1, then the rule is just nec, and every normal
modal logic is closed under nec.
Now suppose the result holds for n − 1; we show it holds for n.
Assume

φ1 → (φ2 → · · · (φn−1 → φn ) · · · ) ∈ Σ




By the induction hypothesis, we have

□φ1 → (□φ2 → · · · □(φn−1 → φn ) · · · ) ∈ Σ

Since Σ is a normal modal logic, it contains all instances of K, in particular

□(φn−1 → φn ) → (□φn−1 → □φn ) ∈ Σ

Using modus ponens and suitable tautological instances we get

□φ1 → (□φ2 → · · · (□φn−1 → □φn ) · · · ) ∈ Σ.

Proposition 51.7. Every normal modal logic Σ contains ¬♢⊥.

Problem 51.1. Prove Proposition 51.7.

Proposition 51.8. Let φ1, . . . , φn be formulas. Then there is a smallest normal modal logic Σ containing all instances of φ1, . . . , φn.

Proof. Given φ1, . . . , φn, define Σ as the intersection of all normal modal logics containing all instances of φ1, . . . , φn. The collection of logics intersected is non-empty, as Frm(L), the set of all formulas, is such a normal modal logic; and an intersection of normal modal logics is again one.

Definition 51.9. The smallest normal modal logic containing φ1, . . . , φn is called a modal system and denoted by Kφ1 . . . φn. The smallest normal modal logic is denoted by K.


51.3 Derivations and Modal Systems


We first define what a derivation is for normal modal logics. Roughly, a deriva-
tion is a sequence of formulas in which every element is either (a substitution
instance of) one of a number of axioms, or follows from previous elements by
one of a few inference rules. For normal modal logics, all instances of tau-
tologies, K, and dual count as axioms. This results in the modal system K,
the smallest normal modal logic. We may wish to add additional axioms to
obtain other systems, however. The rules are always modus ponens mp and
necessitation nec.
Definition 51.10. Given a modal system Kφ1 . . . φn and a formula ψ we say
that ψ is derivable in Kφ1 . . . φn , written Kφ1 . . . φn ⊢ ψ, if and only if there
are formulas χ1 , . . . , χk such that χk = ψ and each χi is either a tautological
instance, or an instance of one of K, dual, φ1 , . . . , φn , or it follows from
previous formulas by means of the rules mp or nec.




The following proposition allows us to show that ψ ∈ Σ by exhibiting a Σ-derivation of ψ.

Proposition 51.11. Kφ1 . . . φn = {ψ : Kφ1 . . . φn ⊢ ψ}.

Proof. We use induction on the length of derivations to show that {ψ : Kφ1 . . . φn ⊢ ψ} ⊆ Kφ1 . . . φn.
If the derivation of ψ has length 1, it contains a single formula. That formula cannot follow from previous formulas by mp or nec, so must be a tautological instance, an instance of K, dual, or an instance of one of φ1, . . . , φn. But Kφ1 . . . φn contains these as well, so ψ ∈ Kφ1 . . . φn.
If the derivation of ψ has length > 1, then ψ may in addition be obtained by mp or nec from formulas occurring earlier in the derivation. If ψ follows from χ and χ → ψ (by mp), then χ and χ → ψ ∈ Kφ1 . . . φn by induction hypothesis. But every modal logic is closed under modus ponens, so ψ ∈ Kφ1 . . . φn. If ψ ≡ □χ follows from χ by nec, then χ ∈ Kφ1 . . . φn by induction hypothesis. But every normal modal logic is closed under nec, so ψ ∈ Kφ1 . . . φn.
The converse inclusion follows by showing that Σ = {ψ : Kφ1 . . . φn ⊢ ψ} is a normal modal logic containing all the instances of φ1, . . . , φn, and the observation that Kφ1 . . . φn is, by definition, the smallest such logic.

1. Every tautology ψ is a tautological instance, so Kφ1 . . . φn ⊢ ψ, so Σ contains all tautologies.

2. If Kφ1 . . . φn ⊢ χ and Kφ1 . . . φn ⊢ χ → ψ, then Kφ1 . . . φn ⊢ ψ: Combine the derivation of χ with that of χ → ψ, and add the line ψ. The last line is justified by mp. So Σ is closed under modus ponens.

3. If ψ has a derivation, then every substitution instance of ψ also has a derivation: apply the substitution to every formula in the derivation. (Exercise: prove by induction on the length of derivations that the result is also a correct derivation.) So Σ is closed under uniform substitution. (We have now established that Σ satisfies all conditions of a modal logic.)

4. We have Kφ1 . . . φn ⊢ K, so K ∈ Σ.

5. We have Kφ1 . . . φn ⊢ dual, so dual ∈ Σ.

6. If Kφ1 . . . φn ⊢ χ, then extending a derivation of χ by the additional line □χ, justified by nec, yields a derivation of □χ. Consequently, Σ is closed under nec. Thus, Σ is normal.





51.4 Proofs in K
In order to practice proofs in the smallest modal system, we show that the valid formulas on the left-hand side of Table 49.1 can all be given K-proofs.

Proposition 51.12. K ⊢ □φ → □(ψ → φ)

Proof.

1. φ → (ψ → φ) taut
2. □(φ → (ψ → φ)) nec, 1
3. □(φ → (ψ → φ)) → (□φ → □(ψ → φ)) K
4. □φ → □(ψ → φ) mp, 2, 3

Proposition 51.13. K ⊢ □(φ ∧ ψ) → (□φ ∧ □ψ)

Proof.

1. (φ ∧ ψ) → φ taut
2. □((φ ∧ ψ) → φ) nec, 1
3. □((φ ∧ ψ) → φ) → (□(φ ∧ ψ) → □φ) K
4. □(φ ∧ ψ) → □φ mp, 2, 3
5. (φ ∧ ψ) → ψ taut
6. □((φ ∧ ψ) → ψ) nec, 5
7. □((φ ∧ ψ) → ψ) → (□(φ ∧ ψ) → □ψ) K
8. □(φ ∧ ψ) → □ψ mp, 6, 7
9. (□(φ ∧ ψ) → □φ) →
((□(φ ∧ ψ) → □ψ) →
(□(φ ∧ ψ) → (□φ ∧ □ψ))) taut
10. (□(φ ∧ ψ) → □ψ) →
(□(φ ∧ ψ) → (□φ ∧ □ψ)) mp, 4, 9
11. □(φ ∧ ψ) → (□φ ∧ □ψ) mp, 8, 10.

Note that the formula on line 9 is an instance of the tautology

(p → q) → ((p → r) → (p → (q ∧ r))).

Proposition 51.14. K ⊢ (□φ ∧ □ψ) → □(φ ∧ ψ)

Proof.




1. φ → (ψ → (φ ∧ ψ)) taut
2. □(φ → (ψ → (φ ∧ ψ))) nec, 1
3. □(φ → (ψ → (φ ∧ ψ))) → (□φ → □(ψ → (φ ∧ ψ))) K
4. □φ → □(ψ → (φ ∧ ψ)) mp, 2, 3
5. □(ψ → (φ ∧ ψ)) → (□ψ → □(φ ∧ ψ)) K
6. (□φ → □(ψ → (φ ∧ ψ))) →
((□(ψ → (φ ∧ ψ)) → (□ψ → □(φ ∧ ψ))) →
(□φ → (□ψ → □(φ ∧ ψ)))) taut
7. (□(ψ → (φ ∧ ψ)) → (□ψ → □(φ ∧ ψ))) →
(□φ → (□ψ → □(φ ∧ ψ))) mp, 4, 6
8. □φ → (□ψ → □(φ ∧ ψ)) mp, 5, 7
9. (□φ → (□ψ → □(φ ∧ ψ))) →
((□φ ∧ □ψ) → □(φ ∧ ψ)) taut
10. (□φ ∧ □ψ) → □(φ ∧ ψ) mp, 8, 9
The formulas on lines 6 and 9 are instances of the tautologies

(p → q) → ((q → r) → (p → r))
(p → (q → r)) → ((p ∧ q) → r)

Proposition 51.15. K ⊢ ¬□p → ♢¬p

Proof.
1. ♢¬p ↔ ¬□¬¬p dual
2. (♢¬p ↔ ¬□¬¬p) →
(¬□¬¬p → ♢¬p) taut
3. ¬□¬¬p → ♢¬p mp, 1, 2
4. ¬¬p → p taut
5. □(¬¬p → p) nec, 4
6. □(¬¬p → p) → (□¬¬p → □p) K
7. (□¬¬p → □p) mp, 5, 6
8. (□¬¬p → □p) → (¬□p → ¬□¬¬p) taut
9. ¬□p → ¬□¬¬p mp, 7, 8
10. (¬□p → ¬□¬¬p) →
((¬□¬¬p → ♢¬p) → (¬□p → ♢¬p)) taut
11. (¬□¬¬p → ♢¬p) → (¬□p → ♢¬p) mp, 9, 10
12. ¬□p → ♢¬p mp, 3, 11
The formulas on lines 8 and 10 are instances of the tautologies

(p → q) → (¬q → ¬p)
(p → q) → ((q → r) → (p → r)).

Problem 51.2. Find derivations in K for the following formulas:


1. □¬p → □(p → q)




2. (□p ∨ □q) → □(p ∨ q)


3. ♢p → ♢(p ∨ q)


51.5 Derived Rules


Finding and writing derivations is obviously difficult, cumbersome, and repet-
itive. For instance, very often we want to pass from φ → ψ to □φ → □ψ, i.e.,
apply rule rk. That requires an application of nec, then recording the proper
instance of K, then applying mp. Passing from φ → ψ and ψ → χ to φ → χ
requires recording the (long) tautological instance

(φ → ψ) → ((ψ → χ) → (φ → χ))

and applying mp twice. Often we want to replace a sub-formula by a formula we know to be equivalent, e.g., ♢φ by ¬□¬φ, or ¬¬φ by φ. So rather than
write out the actual derivation, it is more convenient to simply record why the
intermediate steps are derivable. For this purpose, let us collect some facts
about derivability.
Proposition 51.16. If K ⊢ φ1 , . . . , K ⊢ φn , and ψ follows from φ1 , . . . , φn
by propositional logic, then K ⊢ ψ.

Proof. If ψ follows from φ1 , . . . , φn by propositional logic, then

φ1 → (φ2 → · · · (φn → ψ) . . . )

is a tautological instance. Applying mp n times gives a derivation of ψ.

We will indicate use of this proposition by pl.


Proposition 51.17. If K ⊢ φ1 → (φ2 → · · · (φn−1 → φn ) . . . ) then K ⊢ □φ1 →
(□φ2 → · · · (□φn−1 → □φn ) . . . ).

Proof. By induction on n, just as in the proof of Proposition 51.6.

We will indicate use of this proposition by rk. Let’s illustrate how these
results help establishing derivability results more easily.
Proposition 51.18. K ⊢ (□φ ∧ □ψ) → □(φ ∧ ψ)

Proof.
1. K ⊢ φ → (ψ → (φ ∧ ψ)) taut
2. K ⊢ □φ → (□ψ → □(φ ∧ ψ)) rk, 1
3. K ⊢ (□φ ∧ □ψ) → □(φ ∧ ψ) pl, 2




Proposition 51.19. If K ⊢ φ ↔ ψ and K ⊢ χ[φ/q], then K ⊢ χ[ψ/q].

Proof. Exercise.

Problem 51.3. Prove Proposition 51.19 by proving, by induction on the complexity of χ, that if K ⊢ φ ↔ ψ then K ⊢ χ[φ/q] ↔ χ[ψ/q].

This proposition comes in handy especially when we want to convert ♢ into □ (or vice versa), or remove double negations inside a formula. In what follows, we will mark applications of Proposition 51.19 by “φ for ψ” whenever we pass from a derived formula χ(ψ) to χ(φ). In other words, “φ for ψ” abbreviates:

⊢ χ(ψ)
⊢ φ ↔ ψ
⊢ χ(φ)   by Proposition 51.19

For instance:
Proposition 51.20. K ⊢ ¬□p → ♢¬p

Proof.
1. K ⊢ ♢¬p ↔ ¬□¬¬p dual
2. K ⊢ ¬□¬¬p → ♢¬p pl, 1
3. K ⊢ ¬□p → ♢¬p p for ¬¬p

In the above derivation, the final step “p for ¬¬p” is short for
K ⊢ ¬□¬¬p → ♢¬p
K ⊢ ¬¬p ↔ p taut
K ⊢ ¬□p → ♢¬p by Proposition 51.19
The roles of χ(q), φ, and ψ in Proposition 51.19 are played here, respectively,
by ¬□q → ♢¬p, ¬¬p, and p.
When a formula contains a sub-formula ¬♢φ, we can replace it by □¬φ
using Proposition 51.19, since K ⊢ ¬♢φ ↔ □¬φ. We’ll indicate this and similar
replacements simply by “□¬ for ¬♢.”
The following proposition justifies that we can establish derivability results
schematically. E.g., the previous proposition does not just establish that K ⊢
¬□p → ♢¬p, but K ⊢ ¬□φ → ♢¬φ for arbitrary φ.
Proposition 51.21. If φ is a substitution instance of ψ and K ⊢ ψ, then
K ⊢ φ.

Proof. It is tedious but routine to verify (by induction on the length of the
derivation of ψ) that applying a substitution to an entire derivation also re-
sults in a correct derivation. Specifically, substitution instances of tautolog-
ical instances are themselves tautological instances, substitution instances of
instances of dual and K are themselves instances of dual and K, and appli-
cations of mp and nec remain correct when substituting formulas for proposi-
tional variables in both premise(s) and conclusion.





51.6 More Proofs in K


nml:prf:mpr: Let’s see some more examples of derivability in K, now using the simplified
sec
method introduced in section 51.5.
Proposition 51.22. K ⊢ □(φ → ψ) → (♢φ → ♢ψ)

Proof.
1. K ⊢ (φ → ψ) → (¬ψ → ¬φ) pl
2. K ⊢ □(φ → ψ) → (□¬ψ → □¬φ) rk, 1
3. K ⊢ (□¬ψ → □¬φ) → (¬□¬φ → ¬□¬ψ) taut
4. K ⊢ □(φ → ψ) → (¬□¬φ → ¬□¬ψ) pl, 2, 3
5. K ⊢ □(φ → ψ) → (♢φ → ♢ψ) ♢ for ¬□¬.

Proposition 51.23. K ⊢ □φ → (♢(φ → ψ) → ♢ψ)

Proof.
1. K ⊢ φ → (¬ψ → ¬(φ → ψ)) taut
2. K ⊢ □φ → (□¬ψ → □¬(φ → ψ)) rk, 1
3. K ⊢ □φ → (¬□¬(φ → ψ) → ¬□¬ψ) pl, 2
4. K ⊢ □φ → (♢(φ → ψ) → ♢ψ) ♢ for ¬□¬.

Proposition 51.24. K ⊢ (♢φ ∨ ♢ψ) → ♢(φ ∨ ψ)

Proof.
1. K ⊢ ¬(φ ∨ ψ) → ¬φ taut
2. K ⊢ □¬(φ ∨ ψ) → □¬φ rk, 1
3. K ⊢ ¬□¬φ → ¬□¬(φ ∨ ψ) pl, 2
4. K ⊢ ♢φ → ♢(φ ∨ ψ) ♢ for ¬□¬
5. K ⊢ ♢ψ → ♢(φ ∨ ψ) similarly
6. K ⊢ (♢φ ∨ ♢ψ) → ♢(φ ∨ ψ) pl, 4, 5.

Proposition 51.25. K ⊢ ♢(φ ∨ ψ) → (♢φ ∨ ♢ψ)

Proof.
1. K ⊢ ¬φ → (¬ψ → ¬(φ ∨ ψ)) taut
2. K ⊢ □¬φ → (□¬ψ → □¬(φ ∨ ψ)) rk, 1
3. K ⊢ □¬φ → (¬□¬(φ ∨ ψ) → ¬□¬ψ) pl, 2
4. K ⊢ ¬□¬(φ ∨ ψ) → (□¬φ → ¬□¬ψ) pl, 3
5. K ⊢ ¬□¬(φ ∨ ψ) → (¬¬□¬ψ → ¬□¬φ) pl, 4
6. K ⊢ ♢(φ ∨ ψ) → (¬♢ψ → ♢φ) ♢ for ¬□¬
7. K ⊢ ♢(φ ∨ ψ) → (♢ψ ∨ ♢φ) pl, 6.




Problem 51.4. Show that the following derivability claims hold:

1. K ⊢ ♢¬⊥ → (□φ → ♢φ);

2. K ⊢ □(φ ∨ ψ) → (♢φ ∨ □ψ);

3. K ⊢ (♢φ → □ψ) → □(φ → ψ).


51.7 Dual Formulas


Definition 51.26. Each of the formulas T, B, 4, and 5 has a dual, denoted by a subscripted diamond, as follows:

p → ♢p (T♢ )
♢□p → p (B♢ )
♢♢p → ♢p (4♢ )
♢□p → □p (5♢ )

Each of the above dual formulas is obtained from the corresponding formula
by substituting ¬p for p, contraposing, replacing ¬□¬ by ♢, and replacing ¬♢¬
by □. D, i.e., □φ → ♢φ is its own dual in that sense.

Proposition 51.27. For each formula φ in Definition 51.26: Kφ = Kφ♢.

Proof. Exercise.

Problem 51.5. Prove Proposition 51.27.


51.8 Proofs in Modal Systems


We now come to proofs in systems of modal logic other than K.

Proposition 51.28. The following provability results obtain:
1. KT5 ⊢ B;

2. KT5 ⊢ 4;

3. KDB4 ⊢ T;

4. KB4 ⊢ 5;




5. KB5 ⊢ 4;

6. KT ⊢ D.

Proof. We exhibit proofs for each.

1. KT5 ⊢ B:

1. KT5 ⊢ ♢φ → □♢φ 5
2. KT5 ⊢ φ → ♢φ T♢
3. KT5 ⊢ φ → □♢φ pl.

2. KT5 ⊢ 4:

1. KT5 ⊢ ♢□φ → □♢□φ 5 with □φ for p


2. KT5 ⊢ □φ → ♢□φ T♢ with □φ for p
3. KT5 ⊢ □φ → □♢□φ pl, 1, 2
4. KT5 ⊢ ♢□φ → □φ 5♢
5. KT5 ⊢ □♢□φ → □□φ rk, 4
6. KT5 ⊢ □φ → □□φ pl, 3, 5.

3. KDB4 ⊢ T:

1. KDB4 ⊢ ♢□φ → φ B♢
2. KDB4 ⊢ □□φ → ♢□φ D with □φ for p
3. KDB4 ⊢ □□φ → φ pl, 1, 2
4. KDB4 ⊢ □φ → □□φ 4
5. KDB4 ⊢ □φ → φ pl, 1, 4.

4. KB4 ⊢ 5:

1. KB4 ⊢ ♢φ → □♢♢φ B with ♢φ for p


2. KB4 ⊢ ♢♢φ → ♢φ 4♢
3. KB4 ⊢ □♢♢φ → □♢φ rk, 2
4. KB4 ⊢ ♢φ → □♢φ pl, 1, 3.

5. KB5 ⊢ 4:

1. KB5 ⊢ □φ → □♢□φ B with □φ for p


2. KB5 ⊢ ♢□φ → □φ 5♢
3. KB5 ⊢ □♢□φ → □□φ rk, 2
4. KB5 ⊢ □φ → □□φ pl, 1, 3.

6. KT ⊢ D:

1. KT ⊢ □φ → φ T
2. KT ⊢ φ → ♢φ T♢
3. KT ⊢ □φ → ♢φ pl, 1, 2




Definition 51.29. Following tradition, we define S4 to be the system KT4, and S5 the system KTB4.

The following proposition shows that the classical system S5 has several
equivalent axiomatizations. This should not surprise, as the various combina-
tions of axioms all characterize equivalence relations (see Proposition 50.12).
Proposition 51.30. KTB4 = KT5 = KDB4 = KDB5.

Proof. Exercise.

Problem 51.6. Prove Proposition 51.30.


51.9 Soundness

A derivation system is called sound if everything that can be derived is valid. When considering modal systems, i.e., derivations where in addition to K we can use instances of some formulas φ1, . . . , φn, we want every derivable formula to be true in any model in which φ1, . . . , φn are true.

Theorem 51.31 (Soundness Theorem). If every instance of φ1, . . . , φn is valid in the classes of models C1, . . . , Cn, respectively, then Kφ1 . . . φn ⊢ ψ implies that ψ is valid in the class of models C1 ∩ · · · ∩ Cn.

Proof. By induction on the length of proofs. For brevity, put C = C1 ∩ · · · ∩ Cn.

1. Induction Basis: If ψ has a proof of length 1, then it is either a tautological instance, an instance of K, or of dual, or an instance of one of φ1, . . . , φn. In the first case, ψ is valid in C, since tautological instances are valid in any class of models, by Proposition 49.15. Similarly in the second case, by Proposition 49.18 and Proposition 49.19. Finally in the third case, since ψ is valid in Ci and C ⊆ Ci, we have that ψ is valid in C as well by Proposition 49.11.
2. Inductive step: Suppose ψ has a proof of length k > 1. If ψ is a tauto-
logical instance or an instance of one of φ1 , . . . , φn , we proceed as in the
previous step. So suppose ψ is obtained by mp from previous formulas
χ → ψ and χ. Then χ → ψ and χ have proofs of length < k, and by induc-
tive hypothesis they are valid in C. By Proposition 49.20, ψ is valid in C
as well. Finally suppose ψ is obtained by nec from χ (so that ψ = □χ).
By inductive hypothesis, χ is valid in C, and by Proposition 49.12 so is ψ.





51.10 Showing Systems are Distinct


In section 51.8 we saw how to prove that two systems of modal logic are in fact the same system. Theorem 51.31 allows us to show that two modal systems Σ and Σ′ are distinct, by finding a formula φ such that Σ′ ⊢ φ but φ fails in a model of Σ.
Proposition 51.32. KD ⊊ KT

Proof. This is the syntactic counterpart to the semantic fact that all reflexive
relations are serial. To show KD ⊆ KT we need to see that KD ⊢ ψ implies
KT ⊢ ψ, which follows from KT ⊢ D, as shown in Proposition 51.28(6). To
show that the inclusion is proper, by Soundness (Theorem 51.31), it suffices
to exhibit a model of KD where T, i.e., □p → p, fails (an easy task left as an
exercise), for then by Soundness KD ⊬ □p → p.

Proposition 51.33. KB ≠ K4.

Proof. We construct a symmetric model where some instance of 4 fails; since obviously the instance is derivable in K4 but not in KB, it will follow that K4 ⊈ KB. Consider the symmetric model M of Figure 51.1. Since the model is symmetric, K and B are true in M (by Proposition 49.18 and Theorem 50.1, respectively). However, M, w1 ⊮ □p → □□p.
Figure 51.1: A symmetric model falsifying an instance of 4. [Diagram: worlds w1 (where ¬p) and w2 (where p) with arrows both ways; w1 ⊩ □p but w1 ⊮ □□p, and w2 ⊮ □p.]
Theorem 51.34. KTB ⊬ 4 and KTB ⊬ 5.

Proof. By Theorem 50.1 we know that all instances of T and B are true in
every reflexive symmetric model (respectively). So by soundness, it suffices to
find a reflexive symmetric model containing a world at which some instance
of 4 fails, and similarly for 5. We use the same model for both claims. Consider
the symmetric, reflexive model in Figure 51.2. Then M, w1 ⊮ □p → □□p, so 4
fails at w1 . Similarly, M, w2 ⊮ ♢¬p → □♢¬p, so the instance of 5 with φ = ¬p
fails at w2 .

Theorem 51.35. KD5 ≠ KT4 = S4.

Proof. By Theorem 50.1 we know that all instances of D and 5 are true in all
serial euclidean models. So it suffices to find a serial euclidean model containing
a world at which some instance of 4 fails. Consider the model of Figure 51.3,
and notice that M, w1 ⊮ □p → □□p.




Figure 51.2: The model for Theorem 51.34. [Diagram: a reflexive, symmetric
chain of worlds w1, w2, w3, with p at w1 and w2 and ¬p at w3; w1 ⊩ □p but
⊮ □□p and ⊮ ♢¬p, while w2 ⊩ ♢¬p but ⊮ □♢¬p.]

Figure 51.3: The model for Theorem 51.35. [Diagram: a serial euclidean model
with worlds w1 (¬p), w2 (p), w3 (p), and w4 (¬p); w1 ⊩ □p but ⊮ □□p.]
Problem 51.7. Give an alternative proof of Theorem 51.35 using a model
with 3 worlds.

Problem 51.8. Provide a single reflexive transitive model showing that both
KT4 ⊬ B and KT4 ⊬ 5.


51.11 Derivability from a Set of Formulas


In section 51.8 we defined a notion of provability of a formula in a system Σ.
We now extend this notion to provability in Σ from formulas in a set Γ.

Definition 51.36. A formula φ is derivable in a system Σ from a set of
formulas Γ, written Γ ⊢Σ φ, if and only if there are ψ1, . . . , ψn ∈ Γ such that

Σ ⊢ ψ1 → (ψ2 → · · · (ψn → φ) · · · ).
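For instance, {□p, □(p → q)} ⊢K □q, since □p → (□(p → q) → □q) is derivable
in K: it follows by propositional logic from the instance □(p → q) → (□p → □q)
of the schema K.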





51.12 Properties of Derivability


Proposition 51.37. Let Σ be a modal system and Γ a set of modal formulas.
The following properties hold:

1. Monotonicity: If Γ ⊢Σ φ and Γ ⊆ ∆ then ∆ ⊢Σ φ;

2. Reflexivity: If φ ∈ Γ then Γ ⊢Σ φ;

3. Cut: If Γ ⊢Σ φ and ∆ ∪ {φ} ⊢Σ ψ then Γ ∪ ∆ ⊢Σ ψ;

4. Deduction theorem: Γ ∪ {ψ} ⊢Σ φ if and only if Γ ⊢Σ ψ → φ;

5. If Γ ⊢Σ φ1 and . . . and Γ ⊢Σ φn, and φ1 → (φ2 → · · · (φn → ψ) · · · ) is a
tautological instance, then Γ ⊢Σ ψ.

The proof is an easy exercise. Part (5) of Proposition 51.37 gives us that,
for instance, if Γ ⊢Σ φ ∨ ψ and Γ ⊢Σ ¬φ, then Γ ⊢Σ ψ. Also, in what follows,
we write Γ, φ ⊢Σ ψ instead of Γ ∪ {φ} ⊢Σ ψ.
Definition 51.38. A set Γ is deductively closed relative to a system Σ if
and only if Γ ⊢Σ φ implies φ ∈ Γ.


51.13 Consistency
Consistency is an important property of sets of formulas. A set of formulas is
inconsistent if a contradiction, such as ⊥, is derivable from it; and otherwise
consistent. If a set is inconsistent, its formulas cannot all be true in a model at
a world. For the completeness theorem we prove the converse: every consistent
set is true at a world in a model, namely in the “canonical model.”

Definition 51.39. A set Γ is consistent relative to a system Σ or, as we
will say, Σ-consistent, if and only if Γ ⊬Σ ⊥.

So for instance, the set {□(p → q), □p, ¬□q} is consistent relative to
propositional logic, but not K-consistent. Similarly, the set {♢p, □♢p → q, ¬q} is not
K5-consistent.
Proposition 51.40. Let Γ be a set of formulas. Then:

1. Γ is Σ-consistent if and only if there is some formula φ such that Γ ⊬Σ φ.

2. Γ ⊢Σ φ if and only if Γ ∪ {¬φ} is not Σ-consistent.

3. If Γ is Σ-consistent, then for any formula φ, either Γ ∪ {φ} is
Σ-consistent or Γ ∪ {¬φ} is Σ-consistent.



Proof. These facts follow easily using classical propositional logic. We give the
argument for (3). Proceed contrapositively and suppose neither Γ ∪ {φ} nor
Γ ∪ {¬φ} is Σ-consistent. Then by (2), both Γ, φ ⊢Σ ⊥ and Γ, ¬φ ⊢Σ ⊥. By
the deduction theorem Γ ⊢Σ φ → ⊥ and Γ ⊢Σ ¬φ→⊥. But (φ→⊥)→((¬φ→
⊥) → ⊥) is a tautological instance, hence by Proposition 51.37(5), Γ ⊢Σ ⊥.

Chapter 52

Completeness and Canonical Models

52.1 Introduction
If Σ is a modal system, then the soundness theorem establishes that if Σ ⊢ φ,
then φ is valid in any class C of models in which all instances of all formulas
in Σ are valid. In particular that means that if K ⊢ φ then φ is true in all
models; if KT ⊢ φ then φ is true in all reflexive models; if KD ⊢ φ then φ is
true in all serial models, etc.
Completeness is the converse of soundness: that K is complete means, for
instance, that if a formula φ is valid, then K ⊢ φ. Proving completeness is a lot
harder to do than proving soundness. It is useful, first, to consider the contrapositive: K
is complete iff whenever ⊬ φ, there is a countermodel, i.e., a model M such that
M ⊮ φ. Equivalently (negating φ), we could prove that whenever ⊬ ¬φ, there
is a model of φ. In the construction of such a model, we can use information
contained in φ. When we find models for specific formulas we often do the
same: e.g., if we want to find a countermodel to p → □q, we know that it has to
contain a world where p is true and □q is false. And a world where □q is false
means there has to be a world accessible from it where q is false. And that’s
all we need to know: which worlds make the propositional variables true, and
which worlds are accessible from which worlds.
In the case of proving completeness, however, we don’t have a specific for-
mula φ for which we are constructing a model. We want to establish that a


model exists for every φ such that ⊬Σ ¬φ. This is a minimal requirement, since
if ⊢Σ ¬φ, by soundness, there is no model for φ (in which Σ is true). Now
note that ⊬Σ ¬φ iff φ is Σ-consistent. (Recall that ⊬Σ ¬φ and φ ⊬Σ ⊥ are
equivalent.) So our task is to construct a model for every Σ-consistent formula.
The trick we’ll use is to find a Σ-consistent set of formulas that contains φ,
but also other formulas which tell us what the world that makes φ true has
to look like. Such sets are complete Σ-consistent sets. It’s not enough to
construct a model with a single world to make φ true, it will have to contain
multiple worlds and an accessibility relation. The complete Σ-consistent set
containing φ will also contain other formulas of the form □ψ and ♢χ. In all
accessible worlds, ψ has to be true; in at least one, χ has to be true. In order
to accomplish this, we’ll simply take all possible complete Σ-consistent sets
as the basis for the set of worlds. A tricky part will be to figure out when a
complete Σ-consistent set should count as being accessible from another in our
model.
We’ll show that in the model so defined, φ is true at a world—which is
also a complete Σ-consistent set—iff φ is an element of that set. If φ is Σ-
consistent, it will be an element of at least one complete Σ-consistent set (a
fact we’ll prove), and so there will be a world where φ is true. So we will have
a single model where every Σ-consistent formula φ is true at some world. This
single model is the canonical model for Σ.


52.2 Complete Σ-Consistent Sets


Suppose Σ is a set of modal formulas—think of them as the axioms or defining
principles of a normal modal logic. A set Γ is Σ-consistent iff Γ ⊬Σ ⊥, i.e.,
if there is no derivation of φ1 → (φ2 → · · · (φn → ⊥) . . . ) from Σ, where each
φi ∈ Γ . We will construct a “canonical” model in which each world is taken
to be a special kind of Σ-consistent set: one which is not just Σ-consistent,
but maximally so, in the sense that it settles the truth value of every modal
formula: for every φ, either φ ∈ Γ or ¬φ ∈ Γ :

Definition 52.1. A set Γ is complete Σ-consistent if and only if it is
Σ-consistent and for every φ, either φ ∈ Γ or ¬φ ∈ Γ.

Complete Σ-consistent sets Γ have a number of useful properties. For one,


they are deductively closed, i.e., if Γ ⊢Σ φ then φ ∈ Γ . This means in particular
that every instance of a formula φ ∈ Σ is also ∈ Γ . Moreover, membership in
Γ mirrors the truth conditions for the propositional connectives. This will be
important when we define the “canonical model.”

Proposition 52.2. Suppose Γ is complete Σ-consistent. Then:

1. Γ is deductively closed in Σ.




2. Σ ⊆ Γ.

3. ⊥ ∉ Γ.

4. ⊤ ∈ Γ.

5. ¬φ ∈ Γ if and only if φ ∉ Γ.

6. φ ∧ ψ ∈ Γ iff φ ∈ Γ and ψ ∈ Γ.

7. φ ∨ ψ ∈ Γ iff φ ∈ Γ or ψ ∈ Γ.

8. φ → ψ ∈ Γ iff φ ∉ Γ or ψ ∈ Γ.

9. φ ↔ ψ ∈ Γ iff either φ ∈ Γ and ψ ∈ Γ, or φ ∉ Γ and ψ ∉ Γ.

Proof. 1. Suppose Γ ⊢Σ φ but φ ∉ Γ. Then since Γ is complete Σ-consistent,
¬φ ∈ Γ. This would make Γ inconsistent, since φ, ¬φ ⊢Σ ⊥.

2. If φ ∈ Σ then Γ ⊢Σ φ, and φ ∈ Γ by deductive closure, i.e., case (1).

3. If ⊥ ∈ Γ, then Γ ⊢Σ ⊥, so Γ would be Σ-inconsistent.

4. Γ ⊢Σ ⊤, so ⊤ ∈ Γ by deductive closure, i.e., case (1).

5. If ¬φ ∈ Γ, then by consistency φ ∉ Γ; and if φ ∉ Γ then ¬φ ∈ Γ since Γ
is complete Σ-consistent.

6. Suppose φ ∧ ψ ∈ Γ. Since (φ ∧ ψ) → φ is a tautological instance, φ ∈ Γ
by deductive closure, i.e., case (1). Similarly for ψ ∈ Γ. On the other
hand, suppose both φ ∈ Γ and ψ ∈ Γ. Then deductive closure implies
(φ ∧ ψ) ∈ Γ, since φ → (ψ → (φ ∧ ψ)) is a tautological instance.

7. Suppose φ ∨ ψ ∈ Γ, and φ ∉ Γ and ψ ∉ Γ. Since Γ is complete
Σ-consistent, ¬φ ∈ Γ and ¬ψ ∈ Γ. Then ¬(φ ∨ ψ) ∈ Γ since ¬φ →
(¬ψ → ¬(φ ∨ ψ)) is a tautological instance. This would mean that Γ is
Σ-inconsistent, a contradiction.

8. Suppose φ → ψ ∈ Γ and φ ∈ Γ; then Γ ⊢Σ ψ, whence ψ ∈ Γ by deductive
closure. Conversely, if φ → ψ ∉ Γ then since Γ is complete Σ-consistent,
¬(φ → ψ) ∈ Γ. Since ¬(φ → ψ) → φ is a tautological instance, φ ∈ Γ
by deductive closure. Since ¬(φ → ψ) → ¬ψ is a tautological instance,
¬ψ ∈ Γ. Then ψ ∉ Γ since Γ is Σ-consistent.

9. Suppose φ ↔ ψ ∈ Γ. If φ ∈ Γ, then ψ ∈ Γ, since (φ ↔ ψ) → (φ → ψ) is
a tautological instance. Similarly, if ψ ∈ Γ, then φ ∈ Γ. So either both
φ ∈ Γ and ψ ∈ Γ, or neither φ ∈ Γ nor ψ ∈ Γ.
Conversely, suppose φ ↔ ψ ∉ Γ. Since Γ is complete Σ-consistent,
¬(φ ↔ ψ) ∈ Γ. Since ¬(φ ↔ ψ) → (φ → ¬ψ) is a tautological instance, if
φ ∈ Γ then ¬ψ ∈ Γ, and since Γ is Σ-consistent, ψ ∉ Γ; so φ and ψ are
not both in Γ. And since ¬(φ ↔ ψ) → (¬φ → ψ) is also a tautological
instance, if φ ∉ Γ (so that ¬φ ∈ Γ) then ψ ∈ Γ; so it is not the case that
φ ∉ Γ and ψ ∉ Γ either.




Problem 52.1. Complete the proof of Proposition 52.2.


52.3 Lindenbaum’s Lemma


Lindenbaum's Lemma establishes that every Σ-consistent set of formulas is
contained in at least one complete Σ-consistent set. Our construction of the
canonical model will show that for each complete Σ-consistent set ∆, there is a
world in the canonical model where all and only the formulas in ∆ are true. So
Lindenbaum's Lemma guarantees that every Σ-consistent set is true at some
world in the canonical model.

Theorem 52.3 (Lindenbaum's Lemma). If Γ is Σ-consistent then there
is a complete Σ-consistent set ∆ extending Γ.

Proof. Let φ0, φ1, . . . be an exhaustive listing of all formulas of the language
(repetitions are allowed). For instance, start by listing p0, and at each stage
n ≥ 1 list the finitely many formulas of length n using only variables among
p0, . . . , pn. We define sets of formulas ∆n by induction on n, and we then set
∆ = ⋃n ∆n. We first put ∆0 = Γ. Supposing that ∆n has been defined, we
define ∆n+1 by:

    ∆n+1 = ∆n ∪ {φn},  if ∆n ∪ {φn} is Σ-consistent;
    ∆n+1 = ∆n ∪ {¬φn}, otherwise.

Now let ∆ = ⋃n ∆n.
We have to show that this definition actually yields a set ∆ with the required
properties, i.e., Γ ⊆ ∆ and ∆ is complete Σ-consistent.
It’s obvious that Γ ⊆ ∆, since ∆0 ⊆ ∆ by construction, and ∆0 = Γ . In
fact, ∆n ⊆ ∆ for all n, since ∆ is the union of all ∆n . (Since in each step of the
construction, we add a formula to the set already constructed, ∆n ⊆ ∆n+1 ,
so since ⊆ is transitive, ∆n ⊆ ∆m whenever n ≤ m.) At each stage of the
construction, we either add φn or ¬φn , and every formula appears (at least
once) in the list of all φn . So, for every φ either φ ∈ ∆ or ¬φ ∈ ∆, so ∆ is
complete by definition.
Finally, we have to show, that ∆ is Σ-consistent. To do this, we show that
(a) if ∆ were Σ-inconsistent, then some ∆n would be Σ-inconsistent, and (b)
all ∆n are Σ-consistent.
So suppose ∆ were Σ-inconsistent. Then ∆ ⊢Σ ⊥, i.e., there are φ1,
. . . , φk ∈ ∆ such that Σ ⊢ φ1 → (φ2 → · · · (φk → ⊥) · · · ). Since ∆ = ⋃n ∆n,
each φi ∈ ∆ni for some ni . Let n be the largest of these. Since ni ≤ n,
∆ni ⊆ ∆n . So, all φi are in some ∆n . This would mean ∆n ⊢Σ ⊥, i.e., ∆n is
Σ-inconsistent.




To show that each ∆n is Σ-consistent, we use a simple induction on n.


∆0 = Γ , and we assumed Γ was Σ-consistent. So the claim holds for n = 0.
Now suppose it holds for n, i.e., ∆n is Σ-consistent. ∆n+1 is either ∆n ∪ {φn }
if that is Σ-consistent, otherwise it is ∆n ∪ {¬φn }. In the first case, ∆n+1 is
clearly Σ-consistent. However, by Proposition 51.40(3), either ∆n ∪ {φn } or
∆n ∪ {¬φn } is consistent, so ∆n+1 is consistent in the other case as well.
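Viewed computationally, the construction is effective relative to a consistency
test. Here is a minimal Python sketch of the finite analogue of the construction
(our own illustration, not part of the text), where consistent is a hypothetical
oracle deciding Σ-consistency of a finite set, and formulas are represented as
nested tuples such as ('not', phi):

    def lindenbaum(Gamma, formulas, consistent):
        # Extend Gamma stage by stage through the given list of formulas,
        # adding phi when that keeps the set Sigma-consistent, else ¬phi.
        Delta = set(Gamma)
        for phi in formulas:
            if consistent(Delta | {phi}):
                Delta.add(phi)
            else:
                # By Proposition 51.40(3) this branch preserves consistency.
                Delta.add(('not', phi))
        return Delta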

Corollary 52.4. Γ ⊢Σ φ if and only if φ ∈ ∆ for each complete Σ-consistent
set ∆ extending Γ (including when Γ = ∅, in which case we get another
characterization of the modal system Σ).

Proof. Suppose Γ ⊢Σ φ, and let ∆ be any complete Σ-consistent set extending
Γ. If φ ∉ ∆ then by maximality ¬φ ∈ ∆ and so ∆ ⊢Σ φ (by monotonicity) and
∆ ⊢Σ ¬φ (by reflexivity), and so ∆ would be inconsistent; hence φ ∈ ∆.
Conversely, if Γ ⊬Σ φ, then Γ ∪ {¬φ} is Σ-consistent, and by Lindenbaum's
Lemma there is a complete Σ-consistent set ∆ extending Γ ∪ {¬φ}. By
consistency, φ ∉ ∆.


52.4 Modalities and Complete Consistent Sets


When we construct a model MΣ whose set of worlds is given by the complete
Σ-consistent sets ∆ in some normal modal logic Σ, we will also need to define
an accessibility relation RΣ between such “worlds.” We want it to be the case
that the accessibility relation (and the assignment V Σ ) are defined in such a
way that MΣ , ∆ ⊩ φ iff φ ∈ ∆. How should we do this?
Once the accessibility relation is defined, the definition of truth at a world
ensures that MΣ , ∆ ⊩ □φ iff MΣ , ∆′ ⊩ φ for all ∆′ such that RΣ ∆∆′ . The
proof that MΣ , ∆ ⊩ φ iff φ ∈ ∆ requires that this is true in particular for
formulas starting with a modal operator, i.e., MΣ , ∆ ⊩ □φ iff □φ ∈ ∆. Com-
bining this requirement with the definition of truth at a world for □φ yields:

□φ ∈ ∆ iff φ ∈ ∆′ for all ∆′ with RΣ ∆∆′

Consider the left-to-right direction: it says that if □φ ∈ ∆, then φ ∈ ∆′ for


any φ and any ∆′ with RΣ ∆∆′ . If we stipulate that RΣ ∆∆′ iff φ ∈ ∆′ for all
□φ ∈ ∆, then this holds. We can write the condition on the right of the “iff”
more compactly as: {φ : □φ ∈ ∆} ⊆ ∆′ .
So the question is: does this definition of RΣ in fact guarantee that □φ ∈ ∆
iff MΣ , ∆ ⊩ □φ? Does it also guarantee that ♢φ ∈ ∆ iff MΣ , ∆ ⊩ ♢φ? The
next few results will establish this.
Definition 52.5. If Γ is a set of formulas, let

□Γ = {□ψ : ψ ∈ Γ }
♢Γ = {♢ψ : ψ ∈ Γ }




and

□−1 Γ = {ψ : □ψ ∈ Γ }
♢−1 Γ = {ψ : ♢ψ ∈ Γ }

In other words, □Γ is Γ with □ in front of every formula in Γ ; □−1 Γ is


all the □’ed formulas of Γ with the initial □’s removed. This definition is not
terribly important on its own, but will simplify the notation considerably.
Note that □□−1 Γ ⊆ Γ :
□□−1 Γ = {□ψ : □ψ ∈ Γ }
i.e., it’s just the set of all those formulas of Γ that start with □.
Lemma 52.6. If Γ ⊢Σ φ then □Γ ⊢Σ □φ.

Proof. If Γ ⊢Σ φ then there are ψ1, . . . , ψk ∈ Γ such that Σ ⊢ ψ1 → (ψ2 →
· · · (ψk → φ) · · · ). Since Σ is normal, by rule rk, Σ ⊢ □ψ1 → (□ψ2 → · · · (□ψk →
□φ) · · · ), where obviously □ψ1, . . . , □ψk ∈ □Γ. Hence, by definition, □Γ ⊢Σ
□φ.
Lemma 52.7. If □−1 Γ ⊢Σ φ then Γ ⊢Σ □φ.

Proof. Suppose □−1 Γ ⊢Σ φ; then by Lemma 52.6, □□−1 Γ ⊢Σ □φ. But since
□□−1 Γ ⊆ Γ, also Γ ⊢Σ □φ by monotonicity.
Proposition 52.8. If Γ is complete Σ-consistent, then □φ ∈ Γ if and only if
for every complete Σ-consistent ∆ such that □−1 Γ ⊆ ∆, it holds that φ ∈ ∆.

Proof. Suppose Γ is complete Σ-consistent. The “only if” direction is easy:
Suppose □φ ∈ Γ and that □−1 Γ ⊆ ∆. Since □φ ∈ Γ, φ ∈ □−1 Γ ⊆ ∆, so
φ ∈ ∆.

For the “if” direction, we prove the contrapositive: Suppose □φ ∉ Γ. Since
Γ is complete Σ-consistent, it is deductively closed, and hence Γ ⊬Σ □φ.
By Lemma 52.7, □−1 Γ ⊬Σ φ. By Proposition 51.40(2), □−1 Γ ∪ {¬φ} is
Σ-consistent. By Lindenbaum's Lemma, there is a complete Σ-consistent set ∆
such that □−1 Γ ∪ {¬φ} ⊆ ∆. By consistency, φ ∉ ∆.
Lemma 52.9. Suppose Γ and ∆ are complete Σ-consistent. Then □−1 Γ ⊆ ∆
if and only if ♢∆ ⊆ Γ.

Proof. “Only if” direction: Assume □−1 Γ ⊆ ∆ and suppose ♢φ ∈ ♢∆ (i.e.,
φ ∈ ∆). In order to show ♢φ ∈ Γ, it suffices to show □¬φ ∉ Γ, for then by
maximality, ¬□¬φ ∈ Γ. Now, if □¬φ ∈ Γ then by hypothesis ¬φ ∈ ∆, against
the consistency of ∆ (since φ ∈ ∆). Hence □¬φ ∉ Γ, as required.

“If” direction: Assume ♢∆ ⊆ Γ. We argue contrapositively: suppose φ ∉ ∆
in order to show □φ ∉ Γ. If φ ∉ ∆ then by maximality ¬φ ∈ ∆ and so by
hypothesis ♢¬φ ∈ Γ. But in a normal modal logic ♢¬φ is equivalent to ¬□φ,
and if the latter is in Γ, by consistency □φ ∉ Γ, as required.




Proposition 52.10. If Γ is complete Σ-consistent, then ♢φ ∈ Γ if and only
if for some complete Σ-consistent ∆ such that ♢∆ ⊆ Γ, it holds that φ ∈ ∆.

Proof. Suppose Γ is complete Σ-consistent. ♢φ ∈ Γ iff ¬□¬φ ∈ Γ by dual
and closure. ¬□¬φ ∈ Γ iff □¬φ ∉ Γ by Proposition 52.2(5) since Γ is
complete Σ-consistent. By Proposition 52.8, □¬φ ∉ Γ iff, for some complete
Σ-consistent ∆ with □−1 Γ ⊆ ∆, ¬φ ∉ ∆. Now consider any such ∆. By
Lemma 52.9, □−1 Γ ⊆ ∆ iff ♢∆ ⊆ Γ. Also, ¬φ ∉ ∆ iff φ ∈ ∆ by
Proposition 52.2(5). So ♢φ ∈ Γ iff, for some complete Σ-consistent ∆ with ♢∆ ⊆ Γ,
φ ∈ ∆.

Problem 52.2. Show that if Γ is complete Σ-consistent, then ♢φ ∈ Γ if and


only if there is a complete Σ-consistent ∆ such that □−1 Γ ⊆ ∆ and φ ∈ ∆.
Do this without using Lemma 52.9.


52.5 Canonical Models


The canonical model for a modal system Σ is a specific model MΣ in which
the worlds are all complete Σ-consistent sets. Its accessibility relation RΣ and
valuation V Σ are defined so as to guarantee that the formulas true at a world ∆
are exactly the formulas making up ∆.
Definition 52.11. Let Σ be a normal modal logic. The canonical model for
Σ is MΣ = ⟨W Σ , RΣ , V Σ ⟩, where:
1. W Σ = {∆ : ∆ is complete Σ-consistent}.
2. RΣ ∆∆′ holds if and only if □−1 ∆ ⊆ ∆′ .
3. V Σ (p) = {∆ : p ∈ ∆}.


52.6 The Truth Lemma


The canonical model MΣ is defined in such a way that MΣ, ∆ ⊩ φ iff φ ∈ ∆.
For propositional variables, the definition of V Σ yields this directly. We have
to verify that the equivalence holds for all formulas, however. We do this by
induction. The inductive step involves proving the equivalence for formulas
involving propositional operators (where we have to use Proposition 52.2) and
the modal operators (where we invoke the results of section 52.4).
Proposition 52.12 (Truth Lemma). For every formula φ, MΣ, ∆ ⊩ φ if
and only if φ ∈ ∆.




Proof. By induction on φ.
1. φ ≡ ⊥: MΣ, ∆ ⊮ ⊥ by Definition 49.6, and ⊥ ∉ ∆ by Proposition 52.2(3).
2. φ ≡ ⊤: MΣ , ∆ ⊩ ⊤ by Definition 49.6, and ⊤ ∈ ∆ by Proposi-
tion 52.2(4).
3. φ ≡ p: MΣ , ∆ ⊩ p iff ∆ ∈ V Σ (p) by Definition 49.6. Also, ∆ ∈ V Σ (p)
iff p ∈ ∆ by definition of V Σ .
4. φ ≡ ¬ψ: MΣ, ∆ ⊩ ¬ψ iff MΣ, ∆ ⊮ ψ (Definition 49.6) iff ψ ∉ ∆ (by
inductive hypothesis) iff ¬ψ ∈ ∆ (by Proposition 52.2(5)).
5. φ ≡ ψ ∧ χ: MΣ , ∆ ⊩ ψ ∧ χ iff MΣ , ∆ ⊩ ψ and MΣ , ∆ ⊩ χ (by
Definition 49.6) iff ψ ∈ ∆ and χ ∈ ∆ (by inductive hypothesis) iff ψ ∧ χ ∈
∆ (by Proposition 52.2(6)).
6. φ ≡ ψ ∨ χ: MΣ , ∆ ⊩ ψ ∨ χ iff MΣ , ∆ ⊩ ψ or MΣ , ∆ ⊩ χ (by Defini-
tion 49.6) iff ψ ∈ ∆ or χ ∈ ∆ (by inductive hypothesis) iff ψ ∨ χ ∈ ∆ (by
Proposition 52.2(7)).
7. φ ≡ ψ → χ: MΣ, ∆ ⊩ ψ → χ iff MΣ, ∆ ⊮ ψ or MΣ, ∆ ⊩ χ (by
Definition 49.6) iff ψ ∉ ∆ or χ ∈ ∆ (by inductive hypothesis) iff ψ → χ ∈ ∆
(by Proposition 52.2(8)).

8. φ ≡ ψ ↔ χ: MΣ, ∆ ⊩ ψ ↔ χ iff either MΣ, ∆ ⊩ ψ and MΣ, ∆ ⊩ χ, or
MΣ, ∆ ⊮ ψ and MΣ, ∆ ⊮ χ (by Definition 49.6) iff either ψ ∈ ∆ and
χ ∈ ∆, or ψ ∉ ∆ and χ ∉ ∆ (by inductive hypothesis) iff ψ ↔ χ ∈ ∆ (by
Proposition 52.2(9)).
9. φ ≡ □ψ: First suppose that MΣ , ∆ ⊩ □ψ. By Definition 49.6, for every
∆′ such that RΣ ∆∆′ , MΣ , ∆′ ⊩ ψ. By inductive hypothesis, for every
∆′ such that RΣ ∆∆′ , ψ ∈ ∆′ . By definition of RΣ , for every ∆′ such
that □−1 ∆ ⊆ ∆′ , ψ ∈ ∆′ . By Proposition 52.8, □ψ ∈ ∆.
Now assume □ψ ∈ ∆. Let ∆′ ∈ W Σ be such that RΣ ∆∆′ , i.e., □−1 ∆ ⊆
∆′ . Since □ψ ∈ ∆, ψ ∈ □−1 ∆. Consequently, ψ ∈ ∆′ . By inductive
hypothesis, MΣ , ∆′ ⊩ ψ. Since ∆′ is arbitrary with RΣ ∆∆′ , for all ∆′ ∈
W Σ such that RΣ ∆∆′ , MΣ , ∆′ ⊩ ψ. By Definition 49.6, MΣ , ∆ ⊩ □ψ.
10. φ ≡ ♢ψ: First suppose that MΣ , ∆ ⊩ ♢ψ. By Definition 49.6, for some
∆′ such that RΣ ∆∆′ , MΣ , ∆′ ⊩ ψ. By inductive hypothesis, for some
∆′ such that RΣ ∆∆′ , ψ ∈ ∆′ . By definition of RΣ , for some ∆′ such
that □−1 ∆ ⊆ ∆′ , ψ ∈ ∆′ . By Proposition 52.10, for some ∆′ such that
♢∆′ ⊆ ∆, ψ ∈ ∆′ . Since ψ ∈ ∆′ , ♢ψ ∈ ♢∆′ , so ♢ψ ∈ ∆.
Now assume ♢ψ ∈ ∆. By Proposition 52.10, there is a complete Σ-
consistent ∆′ ∈ W Σ such that ♢∆′ ⊆ ∆ and ψ ∈ ∆′ . By Lemma 52.9,
there is a ∆′ ∈ W Σ such that □−1 ∆ ⊆ ∆′ , and ψ ∈ ∆′ . By definition of
RΣ , RΣ ∆∆′ , so there is a ∆′ ∈ W Σ such that RΣ ∆∆′ and ψ ∈ ∆′ . By
Definition 49.6, MΣ , ∆ ⊩ ♢ψ.




Problem 52.3. Complete the proof of Proposition 52.12.


52.7 Determination and Completeness for K


We are now prepared to use the canonical model to establish completeness.
Completeness follows from the fact that the formulas true in the canonical
model for Σ are exactly the Σ-derivable ones. Models with this property are
said to determine Σ.

Definition 52.13. A model M determines a normal modal logic Σ precisely


when M ⊩ φ if and only if Σ ⊢ φ, for all formulas φ.

Theorem 52.14 (Determination). MΣ ⊩ φ if and only if Σ ⊢ φ.

Proof. If MΣ ⊩ φ, then for every complete Σ-consistent ∆, we have MΣ , ∆ ⊩


φ. Hence, by the Truth Lemma, φ ∈ ∆ for every complete Σ-consistent ∆,
whence by Corollary 52.4 (with Γ = ∅), Σ ⊢ φ.
Conversely, if Σ ⊢ φ then by Proposition 52.2(1), every complete Σ-
consistent ∆ contains φ, and hence by the Truth Lemma, MΣ , ∆ ⊩ φ for
every ∆ ∈ W Σ , i.e., MΣ ⊩ φ.

Since the canonical model for K determines K, we immediately have com-


pleteness of K as a corollary:

Corollary 52.15. The basic modal logic K is complete with respect to the
class of all models, i.e., if ⊨ φ then K ⊢ φ.

Proof. Contrapositively, if K ⊬ φ then by Determination MK ⊮ φ and hence


φ is not valid.

For the general case of completeness of a system Σ with respect to a class


of models, e.g., of KTB4 with respect to the class of reflexive, symmetric,
transitive models, determination alone is not enough. We must also show that
the canonical model for the system Σ is a member of the class, which does not
follow obviously from the canonical model construction—nor is it always true!


52.8 Frame Completeness


The completeness theorem for K can be extended to other modal systems, once
we show that the canonical model for a given logic has the corresponding frame
property.




Theorem 52.16. If a normal modal logic Σ contains one of the formulas
on the left-hand side of Table 52.1, then the canonical model for Σ has the
corresponding property on the right-hand side.

If Σ contains . . .      . . . the canonical model for Σ is:
D: □φ → ♢φ               serial;
T: □φ → φ                reflexive;
B: φ → □♢φ               symmetric;
4: □φ → □□φ              transitive;
5: ♢φ → □♢φ              euclidean.

Table 52.1: Basic correspondence facts.

Proof. We take each of these up in turn.
Suppose Σ contains D, and let ∆ ∈ W Σ ; we need to show that there is a ∆′
such that RΣ ∆∆′ . It suffices to show that □−1 ∆ is Σ-consistent, for then by
Lindenbaum’s Lemma, there is a complete Σ-consistent set ∆′ ⊇ □−1 ∆, and
by definition of RΣ we have RΣ ∆∆′ . So, suppose for contradiction that □−1 ∆
is not Σ-consistent, i.e., □−1 ∆ ⊢Σ ⊥. By Lemma 52.7, ∆ ⊢Σ □⊥, and since Σ
contains D, also ∆ ⊢Σ ♢⊥. But Σ is normal, so Σ ⊢ ¬♢⊥ (Proposition 51.7),
whence also ∆ ⊢Σ ¬♢⊥, against the consistency of ∆.
Now suppose Σ contains T, and let ∆ ∈ W Σ . We want to show RΣ ∆∆,
i.e., □−1 ∆ ⊆ ∆. But if □φ ∈ ∆ then by T also φ ∈ ∆, as desired.
Now suppose Σ contains B, and suppose RΣ ∆∆′ for ∆, ∆′ ∈ W Σ . We need
to show that RΣ ∆′ ∆, i.e., □−1 ∆′ ⊆ ∆. By Lemma 52.9, this is equivalent to
♢∆ ⊆ ∆′ . So suppose φ ∈ ∆. By B, also □♢φ ∈ ∆. By the hypothesis that
RΣ ∆∆′ , we have that □−1 ∆ ⊆ ∆′ , and hence ♢φ ∈ ∆′ , as required.
Now suppose Σ contains 4, and suppose RΣ ∆1 ∆2 and RΣ ∆2 ∆3 . We need
to show RΣ ∆1 ∆3 . From the hypothesis we have both □−1 ∆1 ⊆ ∆2 and
□−1 ∆2 ⊆ ∆3 . In order to show RΣ ∆1 ∆3 it suffices to show □−1 ∆1 ⊆ ∆3 . So
let ψ ∈ □−1 ∆1 , i.e., □ψ ∈ ∆1 . By 4, also □□ψ ∈ ∆1 and by hypothesis we
get, first, that □ψ ∈ ∆2 and, second, that ψ ∈ ∆3 , as desired.
Now suppose Σ contains 5, suppose RΣ ∆1 ∆2 and RΣ ∆1 ∆3. We need
to show RΣ ∆2 ∆3. The first hypothesis gives □−1 ∆1 ⊆ ∆2, and the second
hypothesis is equivalent to ♢∆3 ⊆ ∆1, by Lemma 52.9. To show RΣ ∆2 ∆3, by
Lemma 52.9, it suffices to show ♢∆3 ⊆ ∆2. So let ♢φ ∈ ♢∆3, i.e., φ ∈ ∆3. By
the second hypothesis ♢φ ∈ ∆1 and by 5, □♢φ ∈ ∆1 as well. But now the first
hypothesis gives ♢φ ∈ ∆2, as desired.

As a corollary we obtain completeness results for a number of systems. For


instance, we know that S5 = KT5 = KTB4 is complete with respect to the
class of all reflexive euclidean models, which is the same as the class of all
reflexive, symmetric and transitive models.

Theorem 52.17. Let CD, CT, CB, C4, and C5 be the class of all serial,
reflexive, symmetric, transitive, and euclidean models (respectively). Then for
any schemas φ1, . . . , φn among D, T, B, 4, and 5, the system Kφ1 . . . φn is
determined by the class of models C = Cφ1 ∩ · · · ∩ Cφn.

Proposition 52.18. Let Σ be a normal modal logic; then:

1. If Σ contains the schema ♢φ → □φ then the canonical model for Σ is
partially functional.

2. If Σ contains the schema ♢φ ↔ □φ then the canonical model for Σ is


functional.

3. If Σ contains the schema □□φ → □φ then the canonical model for Σ is


weakly dense.

(see Table 50.2 for definitions of these frame properties).

Proof. 1. Suppose that Σ contains the schema ♢φ → □φ. To show that
RΣ is partially functional, we need to prove that for any ∆1, ∆2, ∆3 ∈
W Σ , if RΣ ∆1 ∆2 and RΣ ∆1 ∆3 then ∆2 = ∆3 . Since RΣ ∆1 ∆2 we have
□−1 ∆1 ⊆ ∆2 and since RΣ ∆1 ∆3 also □−1 ∆1 ⊆ ∆3 . The identity ∆2 =
∆3 will follow if we can establish the two inclusions ∆2 ⊆ ∆3 and ∆3 ⊆
∆2 . For the first inclusion, let φ ∈ ∆2 ; then ♢φ ∈ ∆1 , and by the schema
and deductive closure of ∆1 also □φ ∈ ∆1 , whence by the hypothesis
that RΣ ∆1 ∆3 , φ ∈ ∆3 . The second inclusion is similar.

2. This follows immediately from part (1) and the seriality proof in Theo-
rem 52.16.

3. Suppose Σ contains the schema □□φ → □φ. To show that RΣ is
weakly dense, let RΣ ∆1 ∆2. We need to show that there is a complete
Σ-consistent set ∆3 such that RΣ ∆1 ∆3 and RΣ ∆3 ∆2 . Let:

Γ = □−1 ∆1 ∪ ♢∆2 .

It suffices to show that Γ is Σ-consistent, for then by Lindenbaum’s


Lemma it can be extended to a complete Σ-consistent set ∆3 such that
□−1 ∆1 ⊆ ∆3 and ♢∆2 ⊆ ∆3 , i.e., RΣ ∆1 ∆3 and RΣ ∆3 ∆2 (by Lemma 52.9).
Suppose for contradiction that Γ is not consistent. Then there are for-
mulas □φ1 , . . . , □φn ∈ ∆1 and ψ1 , . . . , ψm ∈ ∆2 such that

φ1 , . . . , φn , ♢ψ1 , . . . , ♢ψm ⊢Σ ⊥.



Since ♢(ψ1 ∧ · · · ∧ ψm) → (♢ψ1 ∧ · · · ∧ ♢ψm) is derivable in every normal
modal logic, we argue as follows, contradicting the consistency of ∆2:

φ1, . . . , φn, ♢ψ1, . . . , ♢ψm ⊢Σ ⊥
φ1, . . . , φn ⊢Σ (♢ψ1 ∧ · · · ∧ ♢ψm) → ⊥     by the deduction theorem,
                                            Proposition 51.37(4), and taut
φ1, . . . , φn ⊢Σ ♢(ψ1 ∧ · · · ∧ ψm) → ⊥     since Σ is normal
φ1, . . . , φn ⊢Σ ¬♢(ψ1 ∧ · · · ∧ ψm)        by pl
φ1, . . . , φn ⊢Σ □¬(ψ1 ∧ · · · ∧ ψm)        by □¬ for ¬♢
□φ1, . . . , □φn ⊢Σ □□¬(ψ1 ∧ · · · ∧ ψm)     by Lemma 52.6
□φ1, . . . , □φn ⊢Σ □¬(ψ1 ∧ · · · ∧ ψm)      by schema □□φ → □φ
∆1 ⊢Σ □¬(ψ1 ∧ · · · ∧ ψm)                    by monotonicity, Proposition 51.37(1)
□¬(ψ1 ∧ · · · ∧ ψm) ∈ ∆1                     by deductive closure
¬(ψ1 ∧ · · · ∧ ψm) ∈ ∆2                      since RΣ ∆1 ∆2.
On the strength of these examples, one might think that every system Σ of
modal logic is complete, in the sense that it proves every formula which is valid
in every frame in which every theorem of Σ is valid. Unfortunately, there are
many systems that are not complete in this sense.

Chapter 53

Filtrations and Decidability



53.1 Introduction
One important question about a logic is always whether it is decidable, i.e., if
there is an effective procedure which will answer the question “is this formula
valid?” Propositional logic is decidable: we can effectively test if a formula is
a tautology by constructing a truth table, and for a given formula, the truth
table is finite. But we can't obviously test if a modal formula is true in all
models, for there are infinitely many of them. We can list all the finite models
relevant to a given formula, since only the assignments of subsets of worlds to
propositional variables which actually occur in the formula are relevant. If the
accessibility relation is fixed, the possible different assignments V(p) are just
all the subsets of W, and if |W| = n there are 2^n of those. If our formula φ
contains m propositional variables there are then 2^(nm) different models with n
worlds. For each one, we can test if φ is true at all worlds, simply by computing
the truth value of φ in each. Of course, we also have to check all possible
accessibility relations, but there are only finitely many relations on n worlds
as well (specifically, the number of subsets of W × W, i.e., 2^(n^2)).
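For instance, for a formula containing m = 2 propositional variables and models
with n = 2 worlds, there are 2^(2·2) = 16 possible valuations and 2^(2^2) = 16
possible accessibility relations, so 16 · 16 = 256 models to survey—a large but
perfectly finite search space.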
If we are not interested in the logic K, but a logic defined by some class of
models (e.g., the reflexive transitive models), we also have to be able to test
if the accessibility relation is of the right kind. We can do that whenever the
frames we are interested in are definable by modal formulas (e.g., by testing if
T and 4 are valid in the frame). So, the idea would be to run through all the finite
frames, test each one if it is a frame in the class we’re interested in, then list
all the possible models on that frame and test if φ is true in each. If not, stop:
φ is not valid in the class of models of interest.
There is a problem with this idea: we don’t know when, if ever, we can stop
looking. If the formula has a finite countermodel, our procedure will find it.
But if it has no finite countermodel, we won’t get an answer. The formula may
be valid (no countermodels at all), or it may have only an infinite countermodel,
which we’ll never look at. This problem can be overcome if we can show that
every formula that has a countermodel has a finite countermodel. If this is the
case we say the logic has the finite model property.
But how would we show that a logic has the finite model property? One
way of doing this would be to find a way to turn an infinite (counter)model
of φ into a finite one. If that can be done, then whenever there is a model in
which φ is not true, then the resulting finite model also makes φ not true. That
finite model will show up on our list of all finite models, and we will eventually
determine, for every formula that is not valid, that it isn’t. Our procedure
won’t terminate if the formula is valid. If we can show in addition that there
is some maximum size that the finite model our procedure provides can have,
and that this maximum size depends only on the formula φ, we will have a size
up to which we have to test finite models in our search for countermodels. If
we haven’t found a countermodel by then, there are none. Then our procedure

712 Release : 6891b66 (2024-12-01)


53.1. INTRODUCTION

will, in fact, decide the question “is φ valid?” for any formula φ.
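The basic subroutine in all of these procedures is computing the truth value of
a formula at a world of a finite model. The following Python sketch (our own
illustration, not part of the text) does this for models given by a set of worlds,
an accessibility relation as a set of pairs, and a valuation; formulas are
represented as nested tuples:

    # A minimal model checker for finite Kripke models.

    def holds(M, w, phi):
        """Truth of formula phi at world w of the model M = (W, R, V)."""
        W, R, V = M
        op = phi[0]
        if op == 'bot':
            return False
        if op == 'var':                  # ('var', 'p')
            return w in V[phi[1]]
        if op == 'not':
            return not holds(M, w, phi[1])
        if op == 'and':
            return holds(M, w, phi[1]) and holds(M, w, phi[2])
        if op == 'or':
            return holds(M, w, phi[1]) or holds(M, w, phi[2])
        if op == 'imp':
            return (not holds(M, w, phi[1])) or holds(M, w, phi[2])
        if op == 'box':      # true at w iff the argument holds at all accessible v
            return all(holds(M, v, phi[1]) for v in W if (w, v) in R)
        if op == 'dia':      # true at w iff the argument holds at some accessible v
            return any(holds(M, v, phi[1]) for v in W if (w, v) in R)
        raise ValueError(op)

    # The model of Figure 51.1, where an instance of 4 fails at world 1:
    M = ({1, 2}, {(1, 2), (2, 1)}, {'p': {2}})
    four = ('imp', ('box', ('var', 'p')), ('box', ('box', ('var', 'p'))))
    print(holds(M, 1, four))             # False: □p → □□p fails at w1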
A strategy that often works for turning infinite structures into finite struc-
tures is that of “identifying” elements of the structure which behave the same
way in relevant respects. If there are infinitely many worlds in M that behave
the same in relevant respects, then we may be able to collect all worlds in
finitely many (possibly infinite) “classes” of such worlds. In other words, we
should partition the set of worlds in the right way, i.e., in such a way that each
partition contains infinitely many worlds, but there are only finitely many par-
titions. Then we define a new model M∗ where the worlds are the partitions.
Finitely many partitions in the old model give us finitely many worlds in the
new model, i.e., a finite model. Let’s call the partition a world w is in [w].
We’ll want it to be the case that M, w ⊩ φ iff M∗ , [w] ⊩ φ, since we want the
new model to be a countermodel to φ if the old one was. This requires that
we define the partition, as well as the accessibility relation of M∗ in the right
way.
To see how this would go, first imagine we have no accessibility relation, i.e., □
behaves as a quantifier over all worlds: M, w ⊩ □ψ iff for every v ∈ W, M, v ⊩ ψ,
and the same for M∗, except with [w] and [v]. As a first idea, let's say that two
worlds u and v are equivalent
(belong to the same partition) if they agree on all propositional variables in M,
i.e., M, u ⊩ p iff M, v ⊩ p. Let V ∗ (p) = {[w] : M, w ⊩ p}. Our aim is to show
that M, w ⊩ φ iff M∗, [w] ⊩ φ. Obviously, we'd prove this by induction: The
base case would be φ ≡ p. First suppose M, w ⊩ p. Then [w] ∈ V∗(p) by
definition, so M∗, [w] ⊩ p. Now suppose that M∗, [w] ⊩ p. That means that
[w] ∈ V ∗ (p), i.e., for some v equivalent to w, M, v ⊩ p. But “w equivalent to v”
means “w and v make all the same propositional variables true,” so M, w ⊩ p.
Now for the inductive step, e.g., φ ≡ ¬ψ. Then M, w ⊩ ¬ψ iff M, w ⊮ ψ iff
M∗ , [w] ⊮ ψ (by inductive hypothesis) iff M∗ , [w] ⊩ ¬ψ. Similarly for the other
non-modal operators. It also works for □: suppose M∗ , [w] ⊩ □ψ. That means
that for every [u], M∗ , [u] ⊩ ψ. By inductive hypothesis, for every u, M, u ⊩ ψ.
Consequently, M, w ⊩ □ψ.
In the general case, where we have to also define the accessibility relation
for M∗ , things are more complicated. We’ll call a model M∗ a filtration if its
accessibility relation R∗ satisfies the conditions required to make the inductive
proof above go through. Then any filtration M∗ will make φ true at [w] iff
M makes φ true at w. However, now we also have to show that there are
filtrations, i.e., we can define R∗ so that it satisfies the required conditions.
In order for this to work, however, we have to require that worlds u, v count
as equivalent not just when they agree on all propositional variables, but on
all sub-formulas of φ. Since φ has only finitely many sub-formulas, this will
still guarantee that the filtration is finite. There is not just one way to define
a filtration, and in order to make sure that the accessibility relation of the
filtration satisfies the required properties (e.g., reflexive, transitive, etc.) we
have to be inventive with the definition of R∗ .





53.2 Preliminaries
Filtrations allow us to establish the decidability of our systems of modal logic
by showing that they have the finite model property, i.e., that any formula that
is true (false) in a model is also true (false) in a finite model. Filtrations are
defined relative to sets of formulas which are closed under subformulas.

Definition 53.1. A set Γ of formulas is closed under subformulas if it contains
every subformula of a formula in Γ. Further, Γ is modally closed if it is closed
under subformulas and moreover φ ∈ Γ implies □φ, ♢φ ∈ Γ.

For instance, given a formula φ, the set of all its sub-formulas is closed
under sub-formulas. When we’re defining a filtration of a model through the
set of sub-formulas of φ, it will have the property we’re after: it makes φ true
(false) iff the original model does.
The set of worlds of a filtration of M through Γ is defined as the set of all
equivalence classes of the following equivalence relation.

Definition 53.2. Let M = ⟨W, R, V ⟩ and suppose Γ is closed under sub-


formulas. Define a relation ≡ on W to hold of any two worlds that make the
same formulas from Γ true, i.e.:

u≡v if and only if ∀φ ∈ Γ : M, u ⊩ φ ⇔ M, v ⊩ φ.

The equivalence class [w]≡ of a world w, or [w] for short, is the set of all worlds
≡-equivalent to w:
[w] = {v : v ≡ w}.

Proposition 53.3. Given M and Γ , ≡ as defined above is an equivalence


relation, i.e., it is reflexive, symmetric, and transitive.

Proof. The relation ≡ is reflexive, since w makes exactly the same formulas
from Γ true as itself. It is symmetric since if u makes the same formulas
from Γ true as v, the same holds for v and u. It is also transitive, since if u
makes the same formulas from Γ true as v, and v as w, then u makes the same
formulas from Γ true as w.

The relation ≡, like any equivalence relation, divides W into partitions, i.e.,
subsets of W which are pairwise disjoint, and together cover all of W . Every
w ∈ W is an element of one of the partitions, namely of [w], since w ≡ w. So
the partitions [w] cover all of W . They are pairwise disjoint, for if u ∈ [w] and
u ∈ [v], then u ≡ w and u ≡ v, and by symmetry and transitivity, w ≡ v, and
so [w] = [v].
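For a finite model and a finite Γ, this equivalence relation is directly
computable: the class of w is determined by which formulas of Γ are true at w. A
sketch (again our own illustration, reusing the holds function sketched in
section 53.1):

    def classes(M, Gamma):
        # Partition W by agreement on all formulas in Gamma: two worlds land
        # in the same class iff they make the same formulas of Gamma true.
        W, R, V = M
        def profile(w):
            return frozenset(phi for phi in Gamma if holds(M, w, phi))
        partition = {}
        for w in W:
            partition.setdefault(profile(w), set()).add(w)
        return list(partition.values())   # each set is one class [w]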





53.3 Filtrations
Rather than define “the” filtration of M through Γ, we define when a model M∗
counts as a filtration of M. All filtrations have the same set of worlds W∗ and
the same valuation V∗. But different filtrations may have different accessibility
relations R∗. To count as a filtration, R∗ has to satisfy a number of conditions,
however. These conditions are exactly what we'll require to prove the main
result, namely that M, w ⊩ φ iff M∗, [w] ⊩ φ, provided φ ∈ Γ.

Definition 53.4. Let Γ be closed under subformulas and M = ⟨W, R, V⟩. A
filtration of M through Γ is any model M∗ = ⟨W∗, R∗, V∗⟩, where:

1. W∗ = {[w] : w ∈ W};

2. For any u, v ∈ W:

   a) If Ruv then R∗[u][v];

   b) If R∗[u][v] then for any □φ ∈ Γ, if M, u ⊩ □φ then M, v ⊩ φ;

   c) If R∗[u][v] then for any ♢φ ∈ Γ, if M, v ⊩ φ then M, u ⊩ ♢φ.

3. V∗(p) = {[u] : u ∈ V(p)}.

It’s worthwhile thinking about what V ∗ (p) is: the set consisting of the
equivalence classes [w] of all worlds w where p is true in M. On the one hand,
if w ∈ V (p), then [w] ∈ V ∗ (p) by that definition. However, it is not necessarily
the case that if [w] ∈ V ∗ (p), then w ∈ V (p). If [w] ∈ V ∗ (p) we are only
guaranteed that [w] = [u] for some u ∈ V (p). Of course, [w] = [u] means that
w ≡ u. So, when [w] ∈ V ∗ (p) we can (only) conclude that w ≡ u for some
u ∈ V (p).
Theorem 53.5. If M∗ is a filtration of M through Γ, then for every φ ∈ Γ
and w ∈ W, we have M, w ⊩ φ if and only if M∗, [w] ⊩ φ.

Proof. By induction on φ, using the fact that Γ is closed under subformulas.


Since φ ∈ Γ and Γ is closed under sub-formulas, all sub-formulas of φ are also
∈ Γ . Hence in each inductive step, the induction hypothesis applies to the
sub-formulas of φ.
1. φ ≡ ⊥: Neither M, w ⊩ φ nor M∗ , [w] ⊩ φ.
2. φ ≡ ⊤: Both M, w ⊩ φ and M∗ , [w] ⊩ φ.
3. φ ≡ p: The left-to-right direction is immediate, as M, w ⊩ φ only if w ∈
V (p), which implies [w] ∈ V ∗ (p), i.e., M∗ , [w] ⊩ φ. Conversely, suppose
M∗ , [w] ⊩ φ, i.e., [w] ∈ V ∗ (p). Then for some v ∈ V (p), w ≡ v. Of
course then also M, v ⊩ p. Since w ≡ v, w and v make the same formulas
from Γ true. Since by assumption p ∈ Γ and M, v ⊩ p, M, w ⊩ φ.
4. φ ≡ ¬ψ: M, w ⊩ φ iff M, w ⊮ ψ. By induction hypothesis, M, w ⊮ ψ iff
M∗ , [w] ⊮ ψ. Finally, M∗ , [w] ⊮ ψ iff M∗ , [w] ⊩ φ.




5. φ ≡ (ψ ∧ χ): M, w ⊩ φ iff M, w ⊩ ψ and M, w ⊩ χ. By induction


hypothesis, M, w ⊩ ψ iff M∗ , [w] ⊩ ψ, and M, w ⊩ χ iff M∗ , [w] ⊩ χ.
And M∗ , [w] ⊩ φ iff M∗ , [w] ⊩ ψ and M∗ , [w] ⊩ χ.
6. φ ≡ (ψ ∨ χ): M, w ⊩ φ iff M, w ⊩ ψ or M, w ⊩ χ. By induction
hypothesis, M, w ⊩ ψ iff M∗ , [w] ⊩ ψ, and M, w ⊩ χ iff M∗ , [w] ⊩ χ.
And M∗ , [w] ⊩ φ iff M∗ , [w] ⊩ ψ or M∗ , [w] ⊩ χ.
7. φ ≡ (ψ → χ): M, w ⊩ φ iff M, w ⊮ ψ or M, w ⊩ χ. By induction
hypothesis, M, w ⊩ ψ iff M∗ , [w] ⊩ ψ, and M, w ⊩ χ iff M∗ , [w] ⊩ χ.
And M∗ , [w] ⊩ φ iff M∗ , [w] ⊮ ψ or M∗ , [w] ⊩ χ.
8. φ ≡ (ψ ↔ χ): M, w ⊩ φ iff M, w ⊩ ψ and M, w ⊩ χ, or M, w ⊮ ψ
and M, w ⊮ χ. By induction hypothesis, M, w ⊩ ψ iff M∗ , [w] ⊩ ψ,
and M, w ⊩ χ iff M∗ , [w] ⊩ χ. And M∗ , [w] ⊩ φ iff M∗ , [w] ⊩ ψ and
M∗ , [w] ⊩ χ, or M∗ , [w] ⊮ ψ and M∗ , [w] ⊮ χ.
9. φ ≡ □ψ: Suppose M, w ⊩ φ; to show that M∗ , [w] ⊩ φ, let v be such
that R∗ [w][v]. From Definition 53.4(2b), we have that M, v ⊩ ψ, and
by inductive hypothesis M∗ , [v] ⊩ ψ. Since v was arbitrary, M∗ , [w] ⊩ φ
follows.
Conversely, suppose M∗ , [w] ⊩ φ and let v be arbitrary such that Rwv.
From Definition 53.4(2a), we have R∗ [w][v], so that M∗ , [v] ⊩ ψ; by
inductive hypothesis M, v ⊩ ψ, and since v was arbitrary, M, w ⊩ φ.
10. φ ≡ ♢ψ: Suppose M, w ⊩ φ. Then for some v ∈ W , Rwv and M, v ⊩ ψ.
By inductive hypothesis M∗ , [v] ⊩ ψ, and by Definition 53.4(2a), we have
R∗ [w][v]. Thus, M∗ , [w] ⊩ φ.
Now suppose M∗ , [w] ⊩ φ. Then for some [v] ∈ W ∗ with R∗ [w][v],
M∗ , [v] ⊩ ψ. By inductive hypothesis M, v ⊩ ψ. By Definition 53.4(2c),
we have that M, w ⊩ φ.

Problem 53.1. Complete the proof of Theorem 53.5.

What holds for truth at worlds in a model also holds for truth in a model
and validity in a class of models.
Corollary 53.6. Let Γ be closed under subformulas. Then:
1. If M∗ is a filtration of M through Γ then for any φ ∈ Γ : M ⊩ φ if and
only if M∗ ⊩ φ.
2. If C is a class of models and Γ (C) is the class of Γ -filtrations of models
in C, then any formula φ ∈ Γ is valid in C if and only if it is valid in
Γ (C).





53.4 Examples of Filtrations


We have not yet shown that there are any filtrations. But indeed, for any model
M, there are many filtrations of M through Γ. We identify two in particular:
the finest and the coarsest filtration. Filtrations of the same model will differ
in their accessibility relation (as Definition 53.4 stipulates directly what W∗
and V∗ should be). The finest filtration will have as few related worlds as
possible, whereas the coarsest will have as many as possible.

Definition 53.7. Where Γ is closed under subformulas, the finest filtration
M∗ of a model M is defined by putting:

    R∗[u][v] if and only if ∃u′ ∈ [u] ∃v′ ∈ [v]: Ru′v′.

Proposition 53.8. The finest filtration M∗ is indeed a filtration.

Proof. We need to check that R∗ , so defined, satisfies Definition 53.4(2). We


check the three conditions in turn.
If Ruv then since u ∈ [u] and v ∈ [v], also R∗ [u][v], so (2a) is satisfied.
For (2b), suppose □φ ∈ Γ , R∗ [u][v], and M, u ⊩ □φ. By definition of R∗ ,
there are u′ ≡ u and v ′ ≡ v such that Ru′ v ′ . Since u and u′ agree on Γ , also
M, u′ ⊩ □φ, so that M, v ′ ⊩ φ. By closure of Γ under sub-formulas, v and v ′
agree on φ, so M, v ⊩ φ, as desired.
To verify (2c), suppose ♢φ ∈ Γ , R∗ [u][v], and M, v ⊩ φ. By definition of
R∗, there are u′ ≡ u and v′ ≡ v such that Ru′v′. Since v and v′ agree on Γ,

and Γ is closed under sub-formulas, also M, v ′ ⊩ φ, so that M, u′ ⊩ ♢φ. Since


u and u′ also agree on Γ , M, u ⊩ ♢φ.

Problem 53.2. Complete the proof of Proposition 53.8.

Definition 53.9. Where Γ is closed under subformulas, the coarsest filtration
M∗ of a model M is defined by putting R∗[u][v] if and only if both of the
following conditions are met:

1. If □φ ∈ Γ and M, u ⊩ □φ then M, v ⊩ φ;

2. If ♢φ ∈ Γ and M, v ⊩ φ then M, u ⊩ ♢φ.

Proposition 53.10. The coarsest filtration M∗ is indeed a filtration.

Proof. Given the definition of R∗ , the only condition that is left to verify is
the implication from Ruv to R∗ [u][v]. So assume Ruv. Suppose □φ ∈ Γ and
M, u ⊩ □φ; then obviously M, v ⊩ φ, and (1) is satisfied. Suppose ♢φ ∈ Γ
and M, v ⊩ φ. Then M, u ⊩ ♢φ since Ruv, and (2) is satisfied.
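For finite models, both definitions are directly computable. A sketch (our own
illustration, building on the holds and classes sketches above); classes are
represented by their index in the list returned by classes:

    def finest_R(M, part):
        # Finest filtration: [u] R* [v] iff some u' in [u], v' in [v] with Ru'v'.
        W, R, V = M
        return {(i, j)
                for i, U in enumerate(part) for j, X in enumerate(part)
                if any((u, v) in R for u in U for v in X)}

    def coarsest_R(M, Gamma, part):
        # Coarsest filtration: [u] R* [v] iff conditions (1) and (2) of
        # Definition 53.9 hold. Both depend only on which Gamma-formulas
        # are true at u and v, so one representative per class suffices.
        boxes = [phi for phi in Gamma if phi[0] == 'box']
        dias = [phi for phi in Gamma if phi[0] == 'dia']
        def ok(u, v):
            return (all(holds(M, v, phi[1])
                        for phi in boxes if holds(M, u, phi)) and
                    all(holds(M, u, phi)
                        for phi in dias if holds(M, v, phi[1])))
        reps = [next(iter(U)) for U in part]
        return {(i, j) for i, u in enumerate(reps)
                for j, v in enumerate(reps) if ok(u, v)}

In Example 53.11 below, the first of the two possible accessibility relations is
the finest filtration and the second the coarsest.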

Example 53.11. Let W = Z+ , Rnm iff m = n + 1, and V (p) = {2n : n ∈ N}.


The model M = ⟨W, R, V ⟩ is depicted in Figure 53.1. The worlds are 1, 2, etc.;
each world can access exactly one other world—its successor—and p is true at
all and only the even numbers.




Figure 53.1: An infinite model and its filtrations. [Diagram: worlds 1, 2, 3,
4, . . . , each accessing its successor, with ¬p at the odd and p at the even
numbers; below, the filtrations with the two worlds [1] (¬p) and [2] (p).]

Now let Γ be the set of sub-formulas of □p→p, i.e., {p, □p, □p→p}. p is true
at all and only the even numbers, □p is true at all and only the odd numbers,
so □p → p is true at all and only the even numbers. In other words, every odd
number makes □p true but p and □p → p false; every even number makes p
and □p → p true, but □p false. So W∗ = {[1], [2]}, where [1] = {1, 3, 5, . . . } and
[2] = {2, 4, 6, . . . }. Since 2 ∈ V(p), [2] ∈ V∗(p); since 1 ∉ V(p), [1] ∉ V∗(p). So
V∗(p) = {[2]}.

Any filtration based on W ∗ must have an accessibility relation that includes


⟨[1], [2]⟩, ⟨[2], [1]⟩: since R12, we must have R∗ [1][2] by Definition 53.4(2a), and
since R23 we must have R∗ [2][3], and [3] = [1]. It cannot include ⟨[1], [1]⟩: if it
did, we’d have R∗ [1][1], M, 1 ⊩ □p but M, 1 ⊮ p, contradicting (2b). Nothing
requires or rules out that R∗ [2][2]. So, there are two possible filtrations of M,
corresponding to the two accessibility relations

{⟨[1], [2]⟩, ⟨[2], [1]⟩} and {⟨[1], [2]⟩, ⟨[2], [1]⟩, ⟨[2], [2]⟩}.

In either case, p and □p → p are false and □p is true at [1]; p and □p → p are
true and □p is false at [2].

Problem 53.3. Consider the following model M = ⟨W, R, V ⟩ where W =


{0σ : σ ∈ B∗ }, the set of sequences of 0s and 1s starting with 0, with Rσσ ′ iff
σ ′ = σ0 or σ ′ = σ1, and V (p) = {σ0 : σ ∈ B∗ } and V (q) = {σ1 : σ ∈ B∗ \ {1}}.
Here’s a picture:

[Picture: the binary tree of sequences 0, 00, 01, 000, 001, 010, 011, . . . , where
each σ accesses σ0 and σ1; p holds at the sequences ending in 0, and q at those
ending in 1.]
We have M, w ⊮ □(p ∨ q) → (□p ∨ □q) for every w.


Let Γ be the set of sub-formulas of □(p ∨ q) → (□p ∨ □q). What are W ∗
and V ∗ ? What is the accessibility relation of the finest filtration of M? Of the
coarsest?


53.5 Filtrations are Finite


We've defined filtrations for any set Γ that is closed under sub-formulas.
Nothing in the definition itself guarantees that filtrations are finite. In fact, when
Γ is infinite (e.g., the set of all formulas), a filtration may well be infinite.
However, if Γ is finite (e.g., when it is the set of sub-formulas of a given
formula φ), so is any filtration through Γ.

Proposition 53.12. If Γ is finite then any filtration M∗ of a model M
through Γ is also finite.

Proof. The size of W∗ is the number of different classes [w] under the
equivalence relation ≡. Any two worlds u, v in such a class—that is, any u and v such
that u ≡ v—agree on all formulas in Γ: for each φ ∈ Γ, either φ is true at both
u and v, or at neither. So each class [w] corresponds to a subset of Γ, namely
the set of all φ ∈ Γ such that φ is true at the worlds in [w]. No two different
classes [u] and [v] correspond to the same subset of Γ. For if the set of formulas
true at u and that of formulas true at v are the same, then u and v agree on all
formulas in Γ, i.e., u ≡ v. But then [u] = [v]. So, there is an injective function
from W∗ to ℘(Γ), and hence |W∗| ≤ |℘(Γ)|. Hence if Γ contains n sentences,
the cardinality of W∗ is no greater than 2^n.
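For example, in Example 53.11 we had Γ = {p, □p, □p → p}, so n = 3 and any
filtration through Γ has at most 2^3 = 8 worlds. In fact only two of the eight
subsets of Γ occur as the set of Γ-formulas true at some world—{□p} at the
odd numbers and {p, □p → p} at the even numbers—which is why W∗ there
has exactly the two elements [1] and [2].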





53.6 K and S5 have the Finite Model Property


Definition 53.13. A system Σ of modal logic is said to have the finite model
property if whenever a formula φ is true at a world in a model of Σ then φ is
true at a world in a finite model of Σ.

Proposition 53.14. K has the finite model property.

Proof. K is the set of valid formulas, i.e., any model is a model of K. By
Theorem 53.5, if M, w ⊩ φ, then M∗, [w] ⊩ φ for any filtration of M through the
set Γ of sub-formulas of φ. Any formula only has finitely many sub-formulas, so
Γ is finite. By Proposition 53.12, |W∗| ≤ 2^n, where n is the number of formulas
in Γ. And since K imposes no restriction on models, M∗ is a K-model.

To show that a logic L has the finite model property via filtrations it is
essential that the filtration of an L-model is itself an L-model. Often this
requires a fair bit of work, and not every filtration yields an L-model. However, for
universal models, this still holds.

Proposition 53.15. Let U be the class of universal models (see
Proposition 50.14) and UFin the class of all finite universal models. Then any
formula φ is valid in U if and only if it is valid in UFin.

Proof. Finite universal models are universal models, so the left-to-right
direction is trivial. For the right-to-left direction, suppose that φ is false at some
world w in a universal model M. Let Γ contain φ as well as all of its
sub-formulas; clearly Γ is finite. Take a filtration M∗ of M through Γ; then M∗ is
finite by Proposition 53.12, and by Theorem 53.5, φ is false at [w] in M∗. It
remains to observe that M∗ is also universal: given u and v, by hypothesis Ruv
and by Definition 53.4(2), also R∗[u][v].

Corollary 53.16. S5 has the finite model property.

Proof. By Proposition 50.14, if φ is true at a world in some reflexive and


euclidean model then it is true at a world in a universal model. By Proposi-
tion 53.15, it is true at a world in a finite universal model (namely the filtration
of the model through the set of sub-formulas of φ). Every universal model is
also reflexive and euclidean; so φ is true at a world in a finite reflexive euclidean
model.

Problem 53.4. Show that any filtration of a serial or reflexive model is also
serial or reflexive (respectively).




Problem 53.5. Find a non-symmetric (non-transitive, non-euclidean) filtra-


tion of a symmetric (transitive, euclidean) model.


53.7 S5 is Decidable
The finite model property gives us an easy way to show that systems of modal
logic given by schemas are decidable (i.e., that there is a computable procedure
to determine whether a formula is derivable in the system or not).

Theorem 53.17. S5 is decidable.

Proof. Let φ be given, and suppose the propositional variables occurring in φ


are among p1 , . . . , pk . Since for each n there are only finitely many models
with n worlds assigning a value to p1, . . . , pk, we can enumerate, in parallel,
all the theorems of S5 (by generating derivations in some systematic way) and
all the models containing 1, 2, . . . worlds, checking whether φ fails at a world
in some such model. Eventually one of the two parallel processes will give an
answer, as by Theorem 52.17 and Corollary 53.16, either φ is derivable or it
fails in a finite universal model.

The above proof works for S5 because filtrations of universal models are
automatically universal. The same holds for reflexivity and seriality, but more
work is needed for other properties.
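In the special case of S5 the search can even be organized as a single bounded
check: by Proposition 50.14, a reflexive euclidean countermodel can be traded
for a universal one, and by Proposition 53.12 its filtration has at most 2^n
worlds, where n is the number of sub-formulas of φ. A sketch (our own
illustration, reusing the holds function sketched in section 53.1); in a universal
model R = W × W, so only the valuations vary:

    from itertools import product

    def subformulas(phi):
        # All subformulas of phi (formulas as nested tuples, as before).
        subs = {phi}
        for arg in phi[1:]:
            if isinstance(arg, tuple):
                subs |= subformulas(arg)
        return subs

    def s5_valid(phi, variables):
        # Check phi in every universal model with at most 2^n worlds,
        # where n = number of subformulas of phi.
        bound = 2 ** len(subformulas(phi))
        for size in range(1, bound + 1):
            W = set(range(size))
            R = {(u, v) for u in W for v in W}   # universal relation
            # Each variable gets an arbitrary subset of W, coded as a bitmask.
            for choice in product(range(2 ** size), repeat=len(variables)):
                V = {p: {w for w in W if (bits >> w) & 1}
                     for p, bits in zip(variables, choice)}
                if any(not holds((W, R, V), w, phi) for w in W):
                    return False   # found a finite universal countermodel
        return True

The procedure is hopelessly inefficient—the bound is exponential in the size
of φ, and the number of valuations exponential again—but it is effective, which
is all decidability requires.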


53.8 Filtrations and Properties of Accessibility


As noted, filtrations of universal, serial, and reflexive models are always also
universal, serial, or reflexive. But not every filtration of a symmetric or tran-
sitive model is symmetric or transitive, respectively. In some cases, however,
it is possible to define filtrations so that this does hold. In order to do so, we
proceed as in the definition of the coarsest filtration, but add additional condi-
tions to the definition of R∗ . Let Γ be closed under sub-formulas. Consider the
relations Ci (u, v) in Table 53.1 between worlds u, v in a model M = ⟨W, R, V ⟩.
We can define R∗ [u][v] on the basis of combinations of these conditions. For
instance, if we stipulate that R∗ [u][v] iff the condition C1 (u, v) holds, we get
exactly the coarsest filtration. If we stipulate R∗ [u][v] iff both C1 (u, v) and
C2 (u, v) hold, we get a different filtration. It is “finer” than the coarsest since
fewer pairs of worlds satisfy C1 (u, v) and C2 (u, v) than C1 (u, v) alone.

Theorem 53.18. Let M = ⟨W, R, V⟩ be a model, Γ closed under sub-formulas.
Let W∗ and V∗ be defined as in Definition 53.4. Then:




C1(u, v): if □φ ∈ Γ and M, u ⊩ □φ then M, v ⊩ φ; and
          if ♢φ ∈ Γ and M, v ⊩ φ then M, u ⊩ ♢φ.
C2(u, v): if □φ ∈ Γ and M, v ⊩ □φ then M, u ⊩ φ; and
          if ♢φ ∈ Γ and M, u ⊩ φ then M, v ⊩ ♢φ.
C3(u, v): if □φ ∈ Γ and M, u ⊩ □φ then M, v ⊩ □φ; and
          if ♢φ ∈ Γ and M, v ⊩ ♢φ then M, u ⊩ ♢φ.
C4(u, v): if □φ ∈ Γ and M, v ⊩ □φ then M, u ⊩ □φ; and
          if ♢φ ∈ Γ and M, u ⊩ ♢φ then M, v ⊩ ♢φ.

Table 53.1: Conditions on possible worlds for defining filtrations.
1. Suppose R∗[u][v] if and only if C1(u, v) ∧ C2(u, v). Then R∗ is symmetric,
and M∗ = ⟨W∗, R∗, V∗⟩ is a filtration if M is symmetric.

2. Suppose R∗[u][v] if and only if C1(u, v) ∧ C3(u, v). Then R∗ is transitive,
and M∗ = ⟨W∗, R∗, V∗⟩ is a filtration if M is transitive.

3. Suppose R∗[u][v] if and only if C1(u, v) ∧ C2(u, v) ∧ C3(u, v) ∧ C4(u, v).
Then R∗ is symmetric and transitive, and M∗ = ⟨W∗, R∗, V∗⟩ is a filtration
if M is symmetric and transitive.

4. Suppose R∗[u][v] if and only if C1(u, v) ∧ C3(u, v) ∧ C4(u, v). Then R∗
is transitive and euclidean, and M∗ = ⟨W∗, R∗, V∗⟩ is a filtration if M is
transitive and euclidean.

Proof. 1. It’s immediate that R∗ is symmetric, since C1 (u, v) ⇔ C2 (v, u)


and C2 (u, v) ⇔ C1 (v, u). So it’s left to show that if M is symmetric
then M∗ is a filtration through Γ . Condition C1 (u, v) guarantees that
(2b) and (2c) of Definition 53.4 are satisfied. So we just have to verify
Definition 53.4(2a), i.e., that Ruv implies R∗ [u][v].
So suppose Ruv. To show R∗ [u][v] we need to establish that C1 (u, v) and
C2 (u, v). For C1 : if □φ ∈ Γ and M, u ⊩ □φ then also M, v ⊩ φ (since
Ruv). Similarly, if ♢φ ∈ Γ and M, v ⊩ φ then M, u ⊩ ♢φ since Ruv. For
C2 : if □φ ∈ Γ and M, v ⊩ □φ then Ruv implies Rvu by symmetry, so
that M, u ⊩ φ. Similarly, if ♢φ ∈ Γ and M, u ⊩ φ then M, v ⊩ ♢φ (since
Rvu by symmetry).
2. Exercise.
3. Exercise.
4. Exercise.

Problem 53.6. Complete the proof of Theorem 53.18.


53.9 Filtrations of Euclidean Models


The approach of section 53.8 does not work in the case of models that are euclidean or serial and euclidean. Consider the model at the top of Figure 53.2, which is both euclidean and serial. Let Γ = {p, □p}. When taking a filtration through Γ , then [w1 ] = [w3 ], since w1 and w3 are the only worlds that agree on Γ . Any filtration will also have the arrows inherited from M, as depicted in Figure 53.3. That model isn't euclidean. Moreover, we cannot add arrows to that model in order to make it euclidean. We would have to add double arrows between [w2 ] and [w4 ], and then also between [w2 ] and [w5 ]. But □p is supposed to be true at [w2 ], while p is false at [w5 ].
[Figure 53.2: A serial and euclidean model, with worlds w1 (¬p, ⊩ □p), w2 (p, ⊩ □p), w3 (¬p, ⊩ □p), w4 (p, ⊮ □p), and w5 (¬p, ⊮ □p).]

[Figure 53.3: The filtration of the model in Figure 53.2, with worlds [w1 ] = [w3 ] (¬p, ⊩ □p), [w2 ] (p, ⊩ □p), [w4 ] (p, ⊮ □p), and [w5 ] (¬p, ⊮ □p).]
In particular, to obtain a euclidean filtration it is not enough to consider filtrations through arbitrary Γ 's closed under sub-formulas. Instead we need to consider sets Γ that are modally closed (see Definition 53.1). Such sets of sentences are infinite, and therefore do not immediately yield a finite model property or the decidability of the corresponding system.
Theorem 53.19. Let Γ be modally closed, M = ⟨W, R, V ⟩, and M∗ = ⟨W ∗ , R∗ , V ∗ ⟩ be a coarsest filtration of M.

1. If M is symmetric, so is M∗ .
2. If M is transitive, so is M∗ .
3. If M is euclidean, so is M∗ .


Proof. Since M∗ is a coarsest filtration, by definition R∗ [u][v] holds if and only if C1 (u, v).

1. Exercise. Use the fact that B and B♢ are valid in all symmetric models.

2. For transitivity, suppose C1 (u, v) and C1 (v, w); we have to show C1 (u, w). Suppose M, u ⊩ □φ; then M, u ⊩ □□φ, since 4 is valid in all transitive models. Since □□φ ∈ Γ by modal closure, C1 (u, v) yields M, v ⊩ □φ, and C1 (v, w) yields M, w ⊩ φ. Now suppose M, w ⊩ φ; then M, v ⊩ ♢φ by C1 (v, w), since ♢φ ∈ Γ by modal closure. By C1 (u, v), we get M, u ⊩ ♢♢φ, since ♢♢φ ∈ Γ by modal closure. Since 4♢ is valid in all transitive models, M, u ⊩ ♢φ.

3. Exercise. Use the fact that both 5 and 5♢ are valid in all euclidean models.

Problem 53.7. Complete the proof of Theorem 53.19.

Chapter 54

Modal Tableaux

Draft chapter on prefixed tableaux for modal logic. Needs more examples, completeness proofs, and discussion of how one can find countermodels from unsuccessful searches for closed tableaux.


54.1 Introduction
Tableaux are certain (downward-branching) trees of signed formulas, i.e., pairs consisting of a truth value sign (T or F) and a sentence:

T φ or F φ.


A tableau begins with a number of assumptions. Each further signed formula


is generated by applying one of the inference rules. Some inference rules add
one or more signed formulas to a tip of the tree; others add two new tips,
resulting in two branches. Rules result in signed formulas where the formula is
less complex than that of the signed formula to which it was applied. When a
branch contains both T φ and F φ, we say the branch is closed. If every branch
in a tableau is closed, the entire tableau is closed. A closed tableau constitutes
a derivation that shows that the set of signed formulas which were used to
begin the tableau are unsatisfiable. This can be used to define a ⊢ relation:
Γ ⊢ φ iff there is some finite set Γ0 = {ψ1 , . . . , ψn } ⊆ Γ such that there is a
closed tableau for the assumptions

{F φ, T ψ1 , . . . , T ψn }.

For modal logics, we have to both extend the notion of signed formula and add rules that cover □ and ♢. In addition to a sign (T or F), formulas in modal tableaux also have prefixes σ. The prefixes are non-empty sequences of positive integers, i.e., σ ∈ (Z+ )∗ \ {Λ}. We write such prefixes without the surrounding ⟨ ⟩, and separate the individual elements by .'s instead of ,'s.
If σ is a prefix, then σ.n is σ ⌢ ⟨n⟩; e.g., if σ = 1.2.1, then σ.3 is 1.2.1.3. So
for instance,
1.2 T □φ → φ
is a prefixed signed formula (or just a prefixed formula for short).
Intuitively, the prefix names a world in a model that might satisfy the
formulas on a branch of a tableau, and if σ names some world, then σ.n names
a world accessible from (the world named by) σ.
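Concretely, prefixes can be represented as non-empty tuples of positive integers. The following lines are a sketch of ours (not from the text) that just fixes this representation:

    def extend(sigma, n):
        """The prefix sigma.n, i.e., sigma with n appended."""
        return sigma + (n,)

    def pretty(sigma):
        return '.'.join(str(i) for i in sigma)

    print(pretty(extend((1, 2, 1), 3)))   # prints 1.2.1.3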


54.2 Rules for K


The rules for the propositional connectives are the same as for ordinary propositional signed tableaux, just with prefixes added. In each case, the rule
applied to a signed formula σ S φ produces new formulas that are also prefixed
by σ. This should be intuitively clear: e.g., if φ ∧ ψ is true at (a world named
by) σ, then φ and ψ are true at σ (and not at any other world). We collect the
propositional rules in Table 54.1.
The closure condition is the same as for ordinary tableaux, although we
require that not just the formulas but also the prefixes must match. So a
branch is closed if it contains both

σ T φ and σ F φ

for some prefix σ and formula φ.


The rules for setting up assumptions are also as for ordinary tableaux, except that for assumptions we always use the prefix 1. (It does not matter which


    σ T ¬φ                 σ F ¬φ
    ------ ¬T              ------ ¬F
    σ F φ                  σ T φ

    σ T φ ∧ ψ              σ F φ ∧ ψ
    --------- ∧T           ------------- ∧F
    σ T φ                  σ F φ | σ F ψ
    σ T ψ

    σ T φ ∨ ψ              σ F φ ∨ ψ
    ------------- ∨T       --------- ∨F
    σ T φ | σ T ψ          σ F φ
                           σ F ψ

    σ T φ → ψ              σ F φ → ψ
    ------------- →T       --------- →F
    σ F φ | σ T ψ          σ T φ
                           σ F ψ

Table 54.1: Prefixed tableau rules for the propositional connectives.

prefix we use, as long as it’s the same for all assumptions.) So, e.g., we say
that

ψ1 , . . . , ψn ⊢ φ

iff there is a closed tableau for the assumptions

1 T ψ1 , . . . , 1 T ψn , 1 F φ.

For the modal operators □ and ♢, the prefix of the conclusion of the rule
applied to a formula with prefix σ is σ.n. However, which n is allowed depends
on whether the sign is T or F.
The □T rule extends a branch containing σ T □φ by σ.n T φ. Similarly,
the ♢F rule extends a branch containing σ F ♢φ by σ.n F φ. They can only
be applied for a prefix σ.n which already occurs on the branch in which it is
applied. Let’s call such a prefix “used” (on the branch).
The □F rule extends a branch containing σ F □φ by σ.n F φ. Similarly, the
♢T rule extends a branch containing σ T ♢φ by σ.n T φ. These rules, however,
can only be applied for a prefix σ.n which does not already occur on the branch
in which it is applied. We call such prefixes “new” (to the branch).
The rules are given in Table 54.2.
The restriction that the prefix for □T must be used is necessary, as otherwise we would count the following as a closed tableau:


    σ T □φ                 σ F □φ
    ------- □T             ------- □F
    σ.n T φ                σ.n F φ

    (σ.n is used)          (σ.n is new)

    σ T ♢φ                 σ F ♢φ
    ------- ♢T             ------- ♢F
    σ.n T φ                σ.n F φ

    (σ.n is new)           (σ.n is used)

Table 54.2: The modal rules for K.

1. 1 T □φ Assumption
2. 1 F ♢φ Assumption
3. 1.1 T φ □T 1
4. 1.1 F φ ♢F 2

But □φ ⊭ ♢φ, so our proof system would be unsound. Likewise, ♢φ ⊭ □φ,


but without the restriction that the prefix for □F must be new, this would be
a closed tableau:

1. 1 T ♢φ Assumption
2. 1 F □φ Assumption
3. 1.1 T φ ♢T 1
4. 1.1 F φ □F 2


54.3 Tableaux for K


Example 54.1. We give a closed tableau that shows ⊢ (□φ ∧ □ψ) → □(φ ∧ ψ).


1. 1 F (□φ ∧ □ψ) → □(φ ∧ ψ) Assumption


2. 1 T □φ ∧ □ψ →F 1
3. 1 F □(φ ∧ ψ) →F 1
4. 1 T □φ ∧T 2
5. 1 T □ψ ∧T 2
6. 1.1 F φ ∧ ψ □F 3

7. 1.1 F φ 1.1 F ψ ∧F 6
8. 1.1 T φ 1.1 T ψ □T 4; □T 5
⊗ ⊗

Example 54.2. We give a closed tableau that shows ⊢ ♢(φ ∨ ψ) → (♢φ ∨ ♢ψ):

1. 1 F ♢(φ ∨ ψ) → (♢φ ∨ ♢ψ) Assumption


2. 1 T ♢(φ ∨ ψ) →F 1
3. 1 F ♢φ ∨ ♢ψ →F 1
4. 1 F ♢φ ∨F 3
5. 1 F ♢ψ ∨F 3
6. 1.1 T φ ∨ ψ ♢T 2

7. 1.1 T φ 1.1 T ψ ∨T 6
8. 1.1 F φ 1.1 F ψ ♢F 4; ♢F 5
⊗ ⊗

Problem 54.1. Find closed tableaux in K for the following formulas:

1. □¬p → □(p → q)

2. (□p ∨ □q) → □(p ∨ q)

3. ♢p → ♢(p ∨ q)

4. □(p ∧ q) → □p


54.4 Soundness for K



This soundness proof does not reuse the soundness proof for classical propositional logic; it proves everything from scratch. That's ok if you want a self-contained soundness proof. If you have already seen soundness for ordinary tableaux, this will be repetitive. It's planned to make it possible to switch between the self-contained version and a version building on the non-modal case.

In order to show that prefixed tableaux are sound, we have to show that if

1 T ψ1 , . . . , 1 T ψn , 1 F φ

has a closed tableau then ψ1 , . . . , ψn ⊨ φ. It is easier to prove the contrapositive: if for some M and world w, M, w ⊩ ψi for all i = 1, . . . , n but M, w ⊮ φ, then no tableau can close. Such a countermodel shows that the initial assumptions of the tableau are satisfiable. The strategy of the proof is to show that whenever all the prefixed formulas on a tableau branch are satisfiable, any application of a rule results in at least one extended branch that is also satisfiable. Since closed branches are unsatisfiable, any tableau for a satisfiable set of prefixed formulas must have at least one open branch.
In order to apply this strategy in the modal case, we have to extend our definition of "satisfiable" to models and prefixes. With that in hand, however, the proof is straightforward.

Definition 54.3. Let P be some set of prefixes, i.e., P ⊆ (Z+ )∗ \ {Λ} and
let M be a model. A function f : P → W is an interpretation of P in M if,
whenever σ and σ.n are both in P , then Rf (σ)f (σ.n).
Relative to an interpretation of prefixes P we can define:

1. M satisfies σ T φ iff M, f (σ) ⊩ φ.

2. M satisfies σ F φ iff M, f (σ) ⊮ φ.

Definition 54.4. Let Γ be a set of prefixed formulas, and let P (Γ ) be the


set of prefixes that occur in it. If f is an interpretation of P (Γ ) in M, we say
that M satisfies Γ with respect to f , M, f ⊩ Γ , if M satisfies every prefixed
formula in Γ with respect to f . Γ is satisfiable iff there is a model M and
interpretation f of P (Γ ) such that M, f ⊩ Γ .

Proposition 54.5. If Γ contains both σ T φ and σ F φ, for some formula φ


and prefix σ, then Γ is unsatisfiable.

Proof. There cannot be a model M and interpretation f of P (Γ ) such that


both M, f (σ) ⊩ φ and M, f (σ) ⊮ φ.

Theorem 54.6 (Soundness). If Γ has a closed tableau, Γ is unsatisfiable.

Proof. We call a branch of a tableau satisfiable iff the set of signed formulas
on it is satisfiable, and let’s call a tableau satisfiable if it contains at least one
satisfiable branch.
We show the following: Extending a satisfiable tableau by one of the rules
of inference always results in a satisfiable tableau. This will prove the theo-
rem: any closed tableau results by applying rules of inference to the tableau
consisting only of assumptions from Γ . So if Γ were satisfiable, any tableau
for it would be satisfiable. A closed tableau, however, is clearly not satisfiable,
since all its branches are closed and closed branches are unsatisfiable.
Suppose we have a satisfiable tableau, i.e., a tableau with at least one
satisfiable branch. Applying a rule of inference either adds signed formulas
to a branch, or splits a branch in two. If the tableau has a satisfiable branch
which is not extended by the rule application in question, it remains a satisfiable
branch in the extended tableau, so the extended tableau is satisfiable. So we
only have to consider the case where a rule is applied to a satisfiable branch.
Let Γ be the set of signed formulas on that branch, and let σ S φ ∈ Γ be
the signed formula to which the rule is applied. If the rule does not result in a
split branch, we have to show that the extended branch, i.e., Γ together with
the conclusions of the rule, is still satisfiable. If the rule results in split branch,
we have to show that at least one of the two resulting branches is satisfiable.
First, we consider the rules that do not split the branch.

1. The branch is expanded by applying ¬T to σ T ¬ψ ∈ Γ . Then the


extended branch contains the signed formulas Γ ∪ {σ F ψ}. Suppose
M, f ⊩ Γ . In particular, M, f (σ) ⊩ ¬ψ. Thus, M, f (σ) ⊮ ψ, i.e., M
satisfies σ F ψ with respect to f .

2. The branch is expanded by applying ¬F to σ F ¬ψ ∈ Γ : Exercise.

3. The branch is expanded by applying ∧T to σ T ψ ∧ χ ∈ Γ , which results


in two new signed formulas on the branch: σ T ψ and σ T χ. Suppose
M, f ⊩ Γ , in particular M, f (σ) ⊩ ψ ∧ χ. Then M, f (σ) ⊩ ψ and
M, f (σ) ⊩ χ. This means that M satisfies both σ T ψ and σ T χ with
respect to f .

4. The branch is expanded by applying ∨F to σ F ψ ∨ χ ∈ Γ : Exercise.

5. The branch is expanded by applying →F to σ F ψ → χ ∈ Γ : This results


in two new signed formulas on the branch: σ T ψ and σ F χ. Suppose
M, f ⊩ Γ , in particular M, f (σ) ⊮ ψ → χ. Then M, f (σ) ⊩ ψ and
M, f (σ) ⊮ χ. This means that M, f satisfies both σ T ψ and σ F χ.

6. The branch is expanded by applying □T to σ T □ψ ∈ Γ : This results in


a new signed formula σ.n T ψ on the branch, for some σ.n ∈ P (Γ ) (since
σ.n must be used). Suppose M, f ⊩ Γ , in particular, M, f (σ) ⊩ □ψ.
Since f is an interpretation of prefixes and both σ, σ.n ∈ P (Γ ), we know
that Rf (σ)f (σ.n). Hence, M, f (σ.n) ⊩ ψ, i.e., M, f satisfies σ.n T ψ.


7. The branch is expanded by applying □F to σ F □ψ ∈ Γ : This results in a new signed formula σ.n F ψ, where σ.n is a new prefix on the branch, i.e., σ.n ∉ P (Γ ). Since Γ is satisfiable, there is a M and interpretation f of P (Γ ) such that M, f ⊩ Γ , in particular M, f (σ) ⊮ □ψ. We have to show that Γ ∪ {σ.n F ψ} is satisfiable. To do this, we define an interpretation of P (Γ ) ∪ {σ.n} as follows:
Since M, f (σ) ⊮ □ψ, there is a w ∈ W such that Rf (σ)w and M, w ⊮ ψ. Let f ′ be like f , except that f ′ (σ.n) = w. Since f ′ (σ) = f (σ) and Rf (σ)w, we have Rf ′ (σ)f ′ (σ.n), so f ′ is an interpretation of P (Γ ) ∪ {σ.n}. Obviously M, f ′ (σ.n) ⊮ ψ. Since f (σ ′ ) = f ′ (σ ′ ) for all prefixes σ ′ ∈ P (Γ ), M, f ′ ⊩ Γ . So, M, f ′ satisfies Γ ∪ {σ.n F ψ}.
Now let’s consider the possible inferences with two premises.
1. The branch is expanded by applying ∧F to σ F ψ ∧ χ ∈ Γ , which results
in two branches, a left one continuing through σ F ψ and a right one
through σ F χ. Suppose M, f ⊩ Γ , in particular M, f (σ) ⊮ ψ ∧ χ. Then
M, f (σ) ⊮ ψ or M, f (σ) ⊮ χ. In the former case, M, f satisfies σ F ψ,
i.e., the left branch is satisfiable. In the latter, M, f satisfies σ F χ, i.e.,
the right branch is satisfiable.
2. The branch is expanded by applying ∨T to σ T ψ ∨ χ ∈ Γ : Exercise.
3. The branch is expanded by applying →T to σ T ψ → χ ∈ Γ : Exercise.

Problem 54.2. Complete the proof of Theorem 54.6.

Corollary 54.7. If Γ ⊢ φ then Γ ⊨ φ.

Proof. If Γ ⊢ φ then for some ψ1 , . . . , ψn ∈ Γ , ∆ = {1 F φ, 1 T ψ1 , . . . , 1 T ψn } has a closed tableau. We want to show that Γ ⊨ φ. Suppose not: then for some M and w, M, w ⊩ ψi for i = 1, . . . , n, but M, w ⊮ φ. Let f (1) = w; then f is an interpretation of P (∆) into M, and M satisfies ∆ with respect to f . But by Theorem 54.6, ∆ is unsatisfiable since it has a closed tableau, a contradiction. So we must have Γ ⊨ φ after all.

Corollary 54.8. If ⊢ φ then φ is true in all models.


54.5 Rules for Other Accessibility Relations


In order to deal with logics determined by special accessibility relations, we consider the additional rules in Table 54.3.
Adding these rules results in systems that are sound and complete for the
logics given in Table 54.4.


    σ T □φ              σ F ♢φ
    ------ T□           ------ T♢
    σ T φ               σ F φ

    σ T □φ              σ F ♢φ
    ------- D□          ------- D♢
    σ T ♢φ              σ F □φ

    σ.n T □φ            σ.n F ♢φ
    -------- B□         -------- B♢
    σ T φ               σ F φ

    σ T □φ              σ F ♢φ
    -------- 4□         -------- 4♢
    σ.n T □φ            σ.n F ♢φ

    (σ.n is used)       (σ.n is used)

    σ.n T □φ            σ.n F ♢φ
    -------- 4r□        -------- 4r♢
    σ T □φ              σ F ♢φ

Table 54.3: More modal rules.

Logic        R is . . .                 Rules
T = KT       reflexive                  T□, T♢
D = KD       serial                     D□, D♢
K4           transitive                 4□, 4♢
B = KTB      reflexive, symmetric       T□, T♢, B□, B♢
S4 = KT4     reflexive, transitive      T□, T♢, 4□, 4♢
S5 = KT4B    reflexive, transitive,     T□, T♢, 4□, 4♢,
             euclidean                  4r□, 4r♢

Table 54.4: Tableau rules for various modal logics.


Example 54.9. We give a closed tableau that shows S5 ⊢ □φ → □♢φ.

1. 1 F □φ → □♢φ Assumption
2. 1 T □φ →F 1
3. 1 F □♢φ →F 1
4. 1.1 F ♢φ □F 3
5. 1 F ♢φ 4r♢ 4
6. 1.1 F φ ♢F 5
7. 1.1 T φ □T 2

Problem 54.3. Give closed tableaux that show the following:

1. KT5 ⊢ B;

2. KT5 ⊢ 4;

3. KDB4 ⊢ T;

4. KB4 ⊢ 5;

5. KB5 ⊢ 4;

6. KT ⊢ D.


54.6 Soundness for Additional Rules


We say a rule is sound for a class of models if, whenever a branch in a tableau
is satisfiable in a model from that class, the branch resulting from applying the
rule is also satisfiable in a model from that class.

Proposition 54.10. T□ and T♢ are sound for reflexive models.

Proof. 1. The branch is expanded by applying T□ to σ T □ψ ∈ Γ : This


results in a new signed formula σ T ψ on the branch. Suppose M, f ⊩
Γ , in particular, M, f (σ) ⊩ □ψ. Since R is reflexive, we know that
Rf (σ)f (σ). Hence, M, f (σ) ⊩ ψ, i.e., M, f satisfies σ T ψ.

2. The branch is expanded by applying T♢ to σ F ♢ψ ∈ Γ : This results


in a new signed formula σ F ψ on the branch. Suppose M, f ⊩ Γ , in
particular, M, f (σ) ⊮ ♢ψ. Since R is reflexive, we know that Rf (σ)f (σ).
Hence, M, f (σ) ⊮ ψ, i.e., M, f satisfies σ F ψ.

Proposition 54.11. D□ and D♢ are sound for serial models.


Proof. 1. The branch is expanded by applying D□ to σ T □ψ ∈ Γ : This


results in a new signed formula σ T ♢ψ on the branch. Suppose M, f ⊩ Γ ,
in particular, M, f (σ) ⊩ □ψ. Since R is serial, there is a w ∈ W such that
Rf (σ)w. Then M, w ⊩ ψ, and hence M, f (σ) ⊩ ♢ψ. So, M, f satisfies
σ T ♢ψ.

2. The branch is expanded by applying D♢ to σ F ♢ψ ∈ Γ : This results


in a new signed formula σ F □ψ on the branch. Suppose M, f ⊩ Γ , in
particular, M, f (σ) ⊮ ♢ψ. Since R is serial, there is a w ∈ W such that
Rf (σ)w. Then M, w ⊮ ψ, and hence M, f (σ) ⊮ □ψ. So, M, f satisfies
σ F □ψ.

Proposition 54.12. B□ and B♢ are sound for symmetric models.

Proof. 1. The branch is expanded by applying B□ to σ.n T □ψ ∈ Γ : This


results in a new signed formula σ T ψ on the branch. Suppose M, f ⊩ Γ ,
in particular, M, f (σ.n) ⊩ □ψ. Since f is an interpretation of prefixes on
the branch into M, we know that Rf (σ)f (σ.n). Since R is symmetric,
Rf (σ.n)f (σ). Since M, f (σ.n) ⊩ □ψ, M, f (σ) ⊩ ψ. Hence, M, f satisfies
σ T ψ.

2. The branch is expanded by applying B♢ to σ.n F ♢ψ ∈ Γ : This results


in a new signed formula σ F ψ on the branch. Suppose M, f ⊩ Γ , in
particular, M, f (σ.n) ⊮ ♢ψ. Since f is an interpretation of prefixes on
the branch into M, we know that Rf (σ)f (σ.n). Since R is symmetric,
Rf (σ.n)f (σ). Since M, f (σ.n) ⊮ ♢ψ, M, f (σ) ⊮ ψ. Hence, M, f satisfies
σ F ψ.

Proposition 54.13. 4□ and 4♢ are sound for transitive models.

Proof. 1. The branch is expanded by applying 4□ to σ T □ψ ∈ Γ : This


results in a new signed formula σ.n T □ψ on the branch. Suppose M, f ⊩
Γ , in particular, M, f (σ) ⊩ □ψ. Since f is an interpretation of prefixes
on the branch into M and σ.n must be used, we know that Rf (σ)f (σ.n).
Now let w be any world such that Rf (σ.n)w. Since R is transitive,
Rf (σ)w. Since M, f (σ) ⊩ □ψ, M, w ⊩ ψ. Hence, M, f (σ.n) ⊩ □ψ, and
M, f satisfies σ.n T □ψ.

2. The branch is expanded by applying 4♢ to σ F ♢ψ ∈ Γ : This results in


a new signed formula σ.n F ♢ψ on the branch. Suppose M, f ⊩ Γ , in
particular, M, f (σ) ⊮ ♢ψ. Since f is an interpretation of prefixes on the
branch into M and σ.n must be used, we know that Rf (σ)f (σ.n). Now
let w be any world such that Rf (σ.n)w. Since R is transitive, Rf (σ)w.
Since M, f (σ) ⊮ ♢ψ, M, w ⊮ ψ. Hence, M, f (σ.n) ⊮ ♢ψ, and M, f
satisfies σ.n F ♢ψ.


Proposition 54.14. 4r□ and 4r♢ are sound for euclidean models.

Proof. 1. The branch is expanded by applying 4r□ to σ.n T □ψ ∈ Γ : This results in a new signed formula σ T □ψ on the branch. Suppose M, f ⊩ Γ , in particular, M, f (σ.n) ⊩ □ψ. Since f is an interpretation of prefixes on the branch into M, we know that Rf (σ)f (σ.n). Now let w be any world such that Rf (σ)w. Since R is euclidean, Rf (σ.n)w. Since M, f (σ.n) ⊩ □ψ, M, w ⊩ ψ. Hence, M, f (σ) ⊩ □ψ, and M, f satisfies σ T □ψ.

2. The branch is expanded by applying 4r♢ to σ.n F ♢ψ ∈ Γ : This results in a new signed formula σ F ♢ψ on the branch. Suppose M, f ⊩ Γ , in particular, M, f (σ.n) ⊮ ♢ψ. Since f is an interpretation of prefixes on the branch into M, we know that Rf (σ)f (σ.n). Now let w be any world such that Rf (σ)w. Since R is euclidean, Rf (σ.n)w. Since M, f (σ.n) ⊮ ♢ψ, M, w ⊮ ψ. Hence, M, f (σ) ⊮ ♢ψ, and M, f satisfies σ F ♢ψ.

Corollary 54.15. The tableau systems given in Table 54.4 are sound for the respective classes of models.


54.7 Simple Tableaux for S5


S5 is sound and complete with respect to the class of universal models, i.e., models where every world is accessible from every world. In universal models the accessibility relation doesn't matter: "there is a world w where M, w ⊩ φ" is true if and only if there is such a w that's accessible from any given world u. So in S5, we can define models simply as a set of worlds and a valuation V . This suggests that we should be able to simplify the tableau rules as well. In the general case, we take as prefixes sequences of positive integers, so that we can keep track of which such prefixes name worlds which are accessible from others: σ.n names a world accessible from σ. But in S5 any world is accessible from any world, so there is no need to keep track in this way. Instead, we can use single positive integers as prefixes. The simplified rules are given in Table 54.5.
Example 54.16. We give a simplified closed tableau that shows S5 ⊢ 5, i.e.,
♢φ → □♢φ.

1. 1 F ♢φ → □♢φ Assumption
2. 1 T ♢φ →F 1
3. 1 F □♢φ →F 1
4. 2 F ♢φ □F 3
5. 3 T φ ♢T 2
6. 3 F φ ♢F 4


    n T □φ            n F □φ
    ------ □T         ------ □F
    m T φ             m F φ

    (m is used)       (m is new)

    n T ♢φ            n F ♢φ
    ------ ♢T         ------ ♢F
    m T φ             m F φ

    (m is new)        (m is used)

Table 54.5: Simplified rules for S5.


54.8 Completeness for K


To show that the method of tableaux is complete, we have to show that whenever there is no closed tableau to show Γ ⊢ φ, then Γ ⊭ φ, i.e., there is a countermodel. But "there is no closed tableau" means that every way we could try to construct one has to fail to close. The trick is to see that if every such way fails to close, then a specific, systematic and exhaustive way also fails to close. And this systematic and exhaustive way would close if a closed tableau existed at all. The single tableau will contain, among its open branches, all the information required to define a countermodel. The countermodel given by an open branch in this tableau will contain all the prefixes used on that branch as the worlds, and a propositional variable p is true at σ iff σ T p occurs on the branch.
Definition 54.17. A branch in a tableau is called complete if, whenever it
contains a prefixed formula σ S φ to which a rule can be applied, it also contains
1. the prefixed formulas that are the corresponding conclusions of the rule,
in the case of propositional stacking rules;
2. one of the corresponding conclusion formulas in the case of propositional
branching rules;
3. at least one possible conclusion in the case of modal rules that require a
new prefix;
4. the corresponding conclusion for every prefix occurring on the branch in
the case of modal rules that require a used prefix.


For instance, a complete branch contains σ T ψ and σ T χ whenever it contains σ T ψ ∧ χ. If it contains σ T ψ ∨ χ, it contains at least one of σ T ψ and σ T χ. If it contains σ F □ψ, it also contains σ.n F ψ for at least one n. And whenever it contains σ T □ψ, it also contains σ.n T ψ for every n such that σ.n is used on the branch.

Proposition 54.18. Every finite Γ has a tableau in which every branch is complete.

Proof. Consider an open branch in a tableau for Γ . There are finitely many prefixed formulas in the branch to which a rule could be applied. In some fixed order (say, top to bottom), for each of these prefixed formulas for which the conditions (1)–(4) do not already hold, apply the rules that can be applied to it to extend the branch. In some cases this will result in branching; apply the rule at the tip of each resulting branch for all remaining prefixed formulas. Since the number of prefixed formulas is finite, and the number of used prefixes on the branch is finite, this procedure eventually results in (possibly many) branches extending the original branch. Apply the procedure to each, and repeat. By construction, every branch of the resulting tableau is complete.

Theorem 54.19 (Completeness). If Γ has no closed tableau, Γ is satisfiable.

Proof. By the proposition, Γ has a tableau in which every branch is complete. Since it has no closed tableau, it has a tableau in which at least one branch is open and complete. Let ∆ be the set of prefixed formulas on the branch, and P (∆) the set of prefixes occurring in it.
and P (∆) the set of prefixes occurring in it.
We define a model M(∆) = ⟨P (∆), R, V ⟩ where the worlds are the prefixes
occurring in ∆, the accessibility relation is given by:

Rσσ ′ iff σ ′ = σ.n for some n

and
V (p) = {σ : σ T p ∈ ∆}.
We show by induction on φ that if σ T φ ∈ ∆ then M(∆), σ ⊩ φ, and if
σ F φ ∈ ∆ then M(∆), σ ⊮ φ.

1. φ ≡ p: If σ T φ ∈ ∆ then σ ∈ V (p) (by definition of V ) and so M(∆), σ ⊩


φ.
If σ F φ ∈ ∆ then σ T φ ∉ ∆, since the branch would otherwise be closed. So σ ∉ V (p) and thus M(∆), σ ⊮ φ.

2. φ ≡ ¬ψ: If σ T φ ∈ ∆, then σ F ψ ∈ ∆ since the branch is complete. By


induction hypothesis, M(∆), σ ⊮ ψ and thus M(∆), σ ⊩ φ.
If σ F φ ∈ ∆, then σ T ψ ∈ ∆ since the branch is complete. By induction
hypothesis, M(∆), σ ⊩ ψ and thus M(∆), σ ⊮ φ.


3. φ ≡ ψ ∧ χ: If σ T φ ∈ ∆, then both σ T ψ ∈ ∆ and σ T χ ∈ ∆ since


the branch is complete. By induction hypothesis, M(∆), σ ⊩ ψ and
M(∆), σ ⊩ χ. Thus M(∆), σ ⊩ φ.
If σ F φ ∈ ∆, then either σ F ψ ∈ ∆ or σ F χ ∈ ∆ since the branch is
complete. By induction hypothesis, either M(∆), σ ⊮ ψ or M(∆), σ ⊮ χ.
Thus M(∆), σ ⊮ φ.
4. φ ≡ ψ ∨ χ: If σ T φ ∈ ∆, then either σ T ψ ∈ ∆ or σ T χ ∈ ∆ since
the branch is complete. By induction hypothesis, either M(∆), σ ⊩ ψ or
M(∆), σ ⊩ χ. Thus M(∆), σ ⊩ φ.
If σ F φ ∈ ∆, then both σ F ψ ∈ ∆ and σ F χ ∈ ∆ since the branch is
complete. By induction hypothesis, both M(∆), σ ⊮ ψ and M(∆), σ ⊮ χ.
Thus M(∆), σ ⊮ φ.
5. φ ≡ ψ → χ: If σ T φ ∈ ∆, then either σ F ψ ∈ ∆ or σ T χ ∈ ∆ since
the branch is complete. By induction hypothesis, either M(∆), σ ⊮ ψ or
M(∆), σ ⊩ χ. Thus M(∆), σ ⊩ φ.
If σ F φ ∈ ∆, then both σ T ψ ∈ ∆ and σ F χ ∈ ∆ since the branch is
complete. By induction hypothesis, both M(∆), σ ⊩ ψ and M(∆), σ ⊮ χ.
Thus M(∆), σ ⊮ φ.
6. φ ≡ □ψ: If σ T φ ∈ ∆, then, since the branch is complete, σ.n T ψ ∈ ∆
for every σ.n used on the branch, i.e., for every σ ′ ∈ P (∆) such that
Rσσ ′ . By induction hypothesis, M(∆), σ ′ ⊩ ψ for every σ ′ such that
Rσσ ′ . Therefore, M(∆), σ ⊩ φ.
If σ F φ ∈ ∆, then for some σ.n, σ.n F ψ ∈ ∆ since the branch is complete.
By induction hypothesis, M(∆), σ.n ⊮ ψ. Since Rσ(σ.n), there is a σ ′
such that M(∆), σ ′ ⊮ ψ. Thus M(∆), σ ⊮ φ.
7. φ ≡ ♢ψ: If σ T φ ∈ ∆, then for some σ.n, σ.n T ψ ∈ ∆ since the branch
is complete. By induction hypothesis, M(∆), σ.n ⊩ ψ. Since Rσ(σ.n),
there is a σ ′ such that M(∆), σ ′ ⊩ ψ. Thus M(∆), σ ⊩ φ.
If σ F φ ∈ ∆, then, since the branch is complete, σ.n F ψ ∈ ∆ for every σ.n
used on the branch, i.e., for every σ ′ ∈ P (∆) such that Rσσ ′ . By induc-
tion hypothesis, M(∆), σ ′ ⊮ ψ for every σ ′ such that Rσσ ′ . Therefore,
M(∆), σ ⊮ φ.

Since Γ ⊆ ∆, M(∆) ⊩ Γ .

Problem 54.4. Complete the proof of Theorem 54.19.

Corollary 54.20. If Γ ⊨ φ then Γ ⊢ φ.

Corollary 54.21. If φ is true in all models, then ⊢ φ.


54.9 Countermodels from Tableaux


The proof of the completeness theorem doesn't just show that if ⊨ φ then ⊢ φ; it also gives us a method for constructing countermodels to φ if ⊭ φ. In the case of K, this method constitutes a decision procedure. For suppose ⊭ φ. Then the proof of Proposition 54.18 gives a method for constructing a complete tableau. The method in fact always terminates. The propositional rules for K only add prefixed formulas of lower complexity, i.e., each propositional rule need only be applied once on a branch for any signed formula σ S φ. New prefixes are only generated by the □F and ♢T rules, and these rules also only have to be applied once (and produce a single new prefix). □T and ♢F have to be applied potentially multiple times, but only once per prefix, and only finitely many new prefixes are generated. So the construction results in either a closed branch or a complete branch after finitely many stages.
Once a tableau with an open complete branch is constructed, the proof of Theorem 54.19 gives us an explicit model that satisfies the original set of prefixed formulas. So not only does a closed tableau exist whenever Γ ⊨ φ (so that Γ ⊢ φ); if we look for a closed tableau in the right way and instead end up with a complete tableau, we not only know that Γ ⊭ φ but can actually construct a countermodel.

Example 54.22. We know that ⊬ □(p ∨ q) → (□p ∨ □q). The construction of


a tableau begins with:

1. 1 F □(p ∨ q) → (□p ∨ □q) ✓ Assumption


2. 1 T □(p ∨ q) →F 1
3. 1 F □p ∨ □q ✓ →F 1
4. 1 F □p ✓ ∨F 3
5. 1 F □q ✓ ∨F 3
6. 1.1 F p ✓ □F 4
7. 1.2 F q ✓ □F 5

The tableau is of course not finished yet. In the next step, we consider the
only line without a checkmark: the prefixed formula 1 T □(p ∨ q) on line 2. The construction of a complete tableau requires us to apply the □T rule for every prefix used on the branch, i.e., for both 1.1 and 1.2:

1. 1 F □(p ∨ q) → (□p ∨ □q) ✓ Assumption


2. 1 T □(p ∨ q) →F 1
3. 1 F □p ∨ □q ✓ →F 1
4. 1 F □p ✓ ∨F 3
5. 1 F □q ✓ ∨F 3
6. 1.1 F p ✓ □F 4
7. 1.2 F q ✓ □F 5
8. 1.1 T p ∨ q □T 2
9. 1.2 T p ∨ q □T 2


[Figure 54.1: A countermodel to □(p ∨ q) → (□p ∨ □q): world 1 (¬p, ¬q) with arrows to worlds 1.1 (¬p, q) and 1.2 (p, ¬q).]
Now lines 2, 8, and 9, don’t have checkmarks. But no new prefix has been
added, so we apply ∨T to lines 8 and 9, on all resulting branches (as long as
they don’t close):

1. 1 F □(p ∨ q) → (□p ∨ □q) ✓ Assumption


2. 1 T □(p ∨ q) →F 1
3. 1 F □p ∨ □q ✓ →F 1
4. 1 F □p ✓ ∨F 3
5. 1 F □q ✓ ∨F 3
6. 1.1 F p ✓ □F 4
7. 1.2 F q ✓ □F 5
8. 1.1 T p ∨ q ✓ □T 2
9. 1.2 T p ∨ q ✓ □T 2

10. 1.1 T p ✓ 1.1 T q ✓ ∨T 8



11. 1.2 T p ✓ 1.2 T q ✓ ∨T 9

There is one remaining open branch, and it is complete. From it we define the
model with worlds W = {1, 1.1, 1.2} (the only prefixes appearing on the open
branch), the accessibility relation R = {⟨1, 1.1⟩, ⟨1, 1.2⟩}, and the assignment
V (p) = {1.2} (because line 11 contains 1.2 T p) and V (q) = {1.1} (because
line 10 contains 1.1 T q). The model is pictured in Figure 54.1, and you can
verify that it is a countermodel to □(p ∨ q) → (□p ∨ □q).
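Such a verification can also be done mechanically. Here is a small Python sketch (ours, not part of the text; the formula encoding and the function name forces are made up), which implements the truth conditions at a world with the accessibility relation given explicitly as a set of pairs:

    def forces(R, V, w, phi):
        """Truth of phi at world w; V maps variables to the set of
        worlds at which they are true."""
        op = phi[0]
        if op == 'var':
            return w in V[phi[1]]
        if op == 'or':
            return forces(R, V, w, phi[1]) or forces(R, V, w, phi[2])
        if op == 'imp':
            return not forces(R, V, w, phi[1]) or forces(R, V, w, phi[2])
        if op == 'box':
            return all(forces(R, V, v, phi[1]) for (u, v) in R if u == w)
        raise ValueError(phi)

    # The model of Figure 54.1:
    R = {('1', '1.1'), ('1', '1.2')}
    V = {'p': {'1.2'}, 'q': {'1.1'}}
    p, q = ('var', 'p'), ('var', 'q')
    phi = ('imp', ('box', ('or', p, q)), ('or', ('box', p), ('box', q)))
    print(forces(R, V, '1', phi))   # False: the formula fails at world 1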



Part XII

Intuitionistic Logic
This is a brief introduction to intuitionistic logic produced by Zesen
Qian and revised by RZ. It is not yet well integrated with the rest of the
text and needs examples and motivations.

Chapter 55

Introduction


55.1 Constructive Reasoning


In contrast to extensions of classical logic by modal operators or second-order
quantifiers, intuitionistic logic is “non-classical” in that it restricts classical
logic. Classical logic is non-constructive in various ways. Intuitionistic logic
is intended to capture a more “constructive” kind of reasoning characteristic
of a kind of constructive mathematics. The following examples may serve to
illustrate some of the underlying motivations.
Suppose someone claimed that they had determined a natural number n
with the property that if n is even, the Riemann hypothesis is true, and if n
is odd, the Riemann hypothesis is false. Great news! Whether the Riemann
hypothesis is true or not is one of the big open questions of mathematics, and
they seem to have reduced the problem to one of calculation, that is, to the
determination of whether a specific number is even or not.
What is the magic value of n? They describe it as follows: n is the natural
number that is equal to 2 if the Riemann hypothesis is true, and 3 otherwise.
Angrily, you demand your money back. From a classical point of view, the
description above does in fact determine a unique value of n; but what you
really want is a value of n that is given explicitly.


To take another, perhaps less contrived example, consider the following question. We know that it is possible to raise an irrational number to a rational power, and get a rational result. For example, (√2)² = 2. What is less clear is whether or not it is possible to raise an irrational number to an irrational power, and get a rational result. The following theorem answers this in the affirmative:

Theorem 55.1. There are irrational numbers a and b such that a^b is rational.

Proof. Consider √2^√2. If this is rational, we are done: we can let a = b = √2. Otherwise, it is irrational. Then we have

(√2^√2)^√2 = √2^(√2·√2) = √2² = 2,

which is rational. So, in this case, let a be √2^√2, and let b be √2.

Does this constitute a valid proof? Most mathematicians feel that it does. But again, there is something a little bit unsatisfying here: we have proved the existence of a pair of real numbers with a certain property, without being able to say which pair of numbers it is. It is possible to prove the same result, but in such a way that the pair a, b is given in the proof: take a = √3 and b = log₃ 4. Then

a^b = √3^(log₃ 4) = 3^((1/2)·log₃ 4) = (3^(log₃ 4))^(1/2) = 4^(1/2) = 2,

since 3^(log₃ x) = x.
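As a quick numeric sanity check of this explicit witness (a floating-point approximation of ours, not a proof):

    import math

    a = math.sqrt(3)
    b = math.log(4, 3)    # log base 3 of 4
    print(a ** b)         # 2.0000000000000004, i.e., 2 up to rounding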
Intuitionistic logic is designed to capture a kind of reasoning where moves
like the one in the first proof are disallowed. Proving the existence of an x
satisfying φ(x) means that you have to give a specific x, and a proof that it
satisfies φ, like in the second proof. Proving that φ or ψ holds requires that
you can prove one or the other.
Formally speaking, intuitionistic logic is what you get if you restrict a deriva-
tion system for classical logic in a certain way. From the mathematical point
of view, these are just formal deductive systems, but, as already noted, they
are intended to capture a kind of mathematical reasoning. One can take this
to be the kind of reasoning that is justified on a certain philosophical view of
mathematics (such as Brouwer’s intuitionism); one can take it to be a kind
of mathematical reasoning which is more “concrete” and satisfying (along the
lines of Bishop’s constructivism); and one can argue about whether or not
the formal description captures the informal motivation. But whatever philo-
sophical positions we may hold, we can study intuitionistic logic as a formally
presented logic; and for whatever reasons, many mathematical logicians find it
interesting to do so.


55.2 Syntax of Intuitionistic Logic


The syntax of intuitionistic logic is the same as that for propositional logic. In
classical propositional logic it is possible to define connectives by others, e.g.,
one can define φ → ψ by ¬φ ∨ ψ, or φ ∨ ψ by ¬(¬φ ∧ ¬ψ). Thus, presentations
of classical logic often introduce some connectives as abbreviations for these
definitions. This is not so in intuitionistic logic, with two exceptions: ¬φ can
be—and often is—defined as an abbreviation for φ → ⊥. Then, of course, ⊥
must not itself be defined! Also, φ ↔ ψ can be defined, as in classical logic, as
(φ → ψ) ∧ (ψ → φ).
Formulas of propositional intuitionistic logic are built up from propositional
variables and the propositional constant ⊥ using logical connectives. We have:

1. A denumerable set At0 of propositional variables p0 , p1 , . . .

2. The propositional constant for falsity ⊥.

3. The logical connectives: ∧ (conjunction), ∨ (disjunction), → (condi-


tional)

4. Punctuation marks: (, ), and the comma.

Definition 55.2 (Formula). The set Frm(L0 ) of formulas of propositional intuitionistic logic is defined inductively as follows:

1. ⊥ is an atomic formula.

2. Every propositional variable pi is an atomic formula.

3. If φ and ψ are formulas, then (φ ∧ ψ) is a formula.

4. If φ and ψ are formulas, then (φ ∨ ψ) is a formula.

5. If φ and ψ are formulas, then (φ → ψ) is a formula.

6. Nothing else is a formula.

In addition to the primitive connectives introduced above, we also use the


following defined symbols: ¬ (negation) and ↔ (biconditional). Formulas con-
structed using the defined operators are to be understood as follows:

1. ¬φ abbreviates φ → ⊥.

2. φ ↔ ψ abbreviates (φ → ψ) ∧ (ψ → φ).

Although ¬ is officially treated as an abbreviation, we will sometimes give


explicit rules and clauses in definitions for ¬ as if it were primitive. This is
mostly so we can state practice problems.


55.3 The Brouwer–Heyting–Kolmogorov Interpretation



Proofs of validity of intuitionistic propositions using the BHK interpretation are confusing; they have to be explained better.

There is an informal constructive interpretation of the intuitionist connectives,


usually known as the Brouwer–Heyting–Kolmogorov interpretation. It uses
the notion of a “construction,” which you may think of as a constructive proof.
(We don’t use “proof” in the BHK interpretation so as not to get confused with
the notion of a derivation in a formal derivation system.) Based on this intu-
itive notion, the BHK interpretation explains the meanings of the intuitionistic
connectives.

1. We assume that we know what constitutes a construction of an atomic


statement.

2. A construction of φ1 ∧ φ2 is a pair ⟨M1 , M2 ⟩ where M1 is a construction


of φ1 and M2 is a construction of φ2 .

3. A construction of φ1 ∨ φ2 is a pair ⟨s, M ⟩ where s is 1 and M is a


construction of φ1 , or s is 2 and M is a construction of φ2 .

4. A construction of φ → ψ is a function that converts a construction of φ


into a construction of ψ.

5. There is no construction for ⊥ (absurdity).

6. ¬φ is defined as synonym for φ → ⊥. That is, a construction of ¬φ is a


function converting a construction of φ into a construction of ⊥.

Example 55.3. Take ¬⊥ for example. A construction of it is a function which,


given any construction of ⊥ as input, provides a construction of ⊥ as output.
Obviously, the identity function Id is such a construction: given a construc-
tion M of ⊥, Id(M ) = M yields a construction of ⊥.

Generally speaking, ¬φ means “A construction of φ is impossible”.

Example 55.4. Let us prove φ → ¬¬φ for any proposition φ, which is φ →


((φ → ⊥) → ⊥). The construction should be a function f that, given a con-
struction M of φ, returns a construction f (M ) of (φ → ⊥) → ⊥. Here is how f
constructs the construction of (φ → ⊥) → ⊥: We have to define a function g
which, when given a construction h of φ → ⊥ as input, outputs a construction
of ⊥. We can define g as follows: apply the input h to the construction M of
φ (that we received earlier). Since the output h(M ) of h is a construction of
⊥, f (M )(h) = h(M ) is a construction of ⊥ if M is a construction of φ.


Example 55.5. Let us give a construction for ¬(φ∧¬φ), i.e., (φ∧(φ→⊥))→⊥.


This is a function f which, given as input a construction M of φ ∧ (φ → ⊥),
yields a construction of ⊥. A construction of a conjunction ψ1 ∧ ψ2 is a pair
⟨N1 , N2 ⟩ where N1 is a construction of ψ1 and N2 is a construction of ψ2 . We
can define functions p1 and p2 which recover from a construction of ψ1 ∧ ψ2
the constructions of ψ1 and ψ2 , respectively:

p1 (⟨N1 , N2 ⟩) = N1
p2 (⟨N1 , N2 ⟩) = N2

Here is what f does: First it applies p1 to its input M . That yields a construc-
tion of φ. Then it applies p2 to M , yielding a construction of φ → ⊥. Such a
construction, in turn, is a function p2 (M ) which, if given as input a construction
of φ, yields a construction of ⊥. In other words, if we apply p2 (M ) to p1 (M ),
we get a construction of ⊥. Thus, we can define f (M ) = p2 (M )(p1 (M )).

Example 55.6. Let us give a construction of ((φ ∧ ψ) → χ) → (φ → (ψ → χ)),


i.e., a function f which turns a construction g of (φ∧ψ)→χ into a construction
of (φ → (ψ → χ)). The construction g is itself a function (from constructions
of φ ∧ ψ to constructions of χ). And the output f (g) is a function hg from
constructions of φ to functions from constructions of ψ to constructions of χ.
Ok, this is confusing. We have to construct a certain function hg , which will be the output of f for input g. The input of hg is a construction M of φ. The output hg (M ) should be a function kg,M from constructions N of ψ to constructions of χ. Let kg,M (N ) = g(⟨M, N ⟩). Remember that ⟨M, N ⟩ is a construction of φ ∧ ψ. So kg,M is a construction of ψ → χ: it maps constructions N of ψ to constructions of χ. Now let hg (M ) = kg,M . That's a function that maps constructions M of φ to constructions kg,M of ψ → χ. Now let f (g) = hg . That's a function that maps constructions g of (φ ∧ ψ) → χ to constructions of φ → (ψ → χ). Whew!
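Readers familiar with functional programming may find these constructions clearer as code: under the Curry–Howard reading, BHK constructions are just functions and pairs. The following Python transcription of the three examples above is a sketch of ours (all names are made up):

    # Example 55.4: a construction of phi -> ((phi -> bot) -> bot)
    def not_not(m):
        return lambda h: h(m)       # apply h : phi -> bot to m : phi

    # Example 55.5: a construction of (phi and (phi -> bot)) -> bot
    def no_contradiction(m):
        n1, n2 = m                  # p1(m), p2(m): unpack the pair
        return n2(n1)

    # Example 55.6: a construction of
    # ((phi and psi) -> chi) -> (phi -> (psi -> chi))
    def curry(g):
        return lambda m: (lambda n: g((m, n)))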

The statement φ ∨ ¬φ is called the Law of Excluded Middle. We can prove


it for some specific φ (e.g., ⊥ ∨ ¬⊥), but not in general. This is because the
intuitionistic disjunction requires a construction of one of the disjuncts, but
there are statements which currently can neither be proved nor refuted (say,
Goldbach’s conjecture). However, you can’t refute the law of excluded middle
either: that is, ¬¬(φ ∨ ¬φ) holds.
Example 55.7. To prove ¬¬(φ ∨ ¬φ), we need a function f that transforms a
construction of ¬(φ ∨ ¬φ), i.e., of (φ ∨ (φ → ⊥)) → ⊥, into a construction of ⊥.
In other words, we need a function f such that f (g) is a construction of ⊥ if g
is a construction of ¬(φ ∨ ¬φ).
Suppose g is a construction of ¬(φ ∨ ¬φ), i.e., a function that transforms a
construction of φ ∨ ¬φ into a construction of ⊥. A construction of φ ∨ ¬φ is a
pair ⟨s, M ⟩ where either s = 1 and M is a construction of φ, or s = 2 and M is
a construction of ¬φ. Let h1 be the function mapping a construction M1 of φ
to a construction of φ ∨ ¬φ: it maps M1 to ⟨1, M1 ⟩. And let h2 be the function


mapping a construction M2 of ¬φ to a construction of φ ∨ ¬φ: it maps M2 to


⟨2, M2 ⟩.
Let k be g ◦ h1 : it is a function which, if given a construction of φ, returns a
construction of ⊥, i.e., it is a construction of φ → ⊥ or ¬φ. Now let l be g ◦ h2 .
It is a function which, given a construction of ¬φ, provides a construction of ⊥.
Since k is a construction of ¬φ, l(k) is a construction of ⊥.
Together, what we’ve done is describe how we can turn a construction g of
¬(φ∨¬φ) into a construction of ⊥, i.e., the function f mapping a construction g
of ¬(φ ∨ ¬φ) to the construction l(k) of ⊥ is a construction of ¬¬(φ ∨ ¬φ).
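In the same style as before, the whole construction is a two-liner (again a sketch of ours):

    # A construction of ((phi or (phi -> bot)) -> bot) -> bot
    def not_not_lem(g):
        k = lambda m: g((1, m))    # k = g ∘ h1, a construction of phi -> bot
        return g((2, k))           # l(k) = (g ∘ h2)(k), a construction of bot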

As you can see, using the BHK interpretation to show the intuitionistic
validity of formulas quickly becomes cumbersome and confusing. Luckily, there
are better derivation systems for intuitionistic logic, and more precise semantic
interpretations.


55.4 Natural Deduction


Natural deduction without the ⊥C rules is a standard derivation system for
intuitionistic logic. We repeat the rules here and indicate the motivation using
the BHK interpretation. In each case, we can think of a rule which allows us
to conclude that if the premises have constructions, so does the conclusion.
Since natural deduction derivations have undischarged assumptions, we
should consider such a derivation, say, of φ from undischarged assumptions Γ ,
as a function that turns constructions of all ψ ∈ Γ into a construction of φ. If
there is a derivation of φ from no undischarged assumptions, then there is a
construction of φ in the sense of the BHK interpretation. For the purpose of
the discussion, however, we’ll suppress the Γ when not needed.
An assumption φ by itself is a derivation of φ from the undischarged as-
sumption φ. This agrees with the BHK-interpretation: the identity function
on constructions turns any construction of φ into a construction of φ.

Conjunction

     φ   ψ               φ ∧ ψ              φ ∧ ψ
    -------- ∧Intro      ------ ∧Elim       ------ ∧Elim
     φ ∧ ψ                 φ                  ψ

Suppose we have constructions N1 , N2 of φ1 and φ2 , respectively. Then we also have a construction of φ1 ∧ φ2 , namely the pair ⟨N1 , N2 ⟩.


A construction of φ1 ∧ φ2 on the BHK interpretation is a pair ⟨N1 , N2 ⟩. So assume we have such a pair. Then we also have a construction of each conjunct: N1 is a construction of φ1 and N2 is a construction of φ2 .

Conditional

    [φ]u
     ⋮
     ψ                     φ → ψ   φ
    ------ u →Intro       ----------- →Elim
    φ → ψ                      ψ

If we have a derivation of ψ from undischarged assumption φ, then there is a


function f that turns constructions of φ into constructions of ψ. That same
function is a construction of φ→ψ. So, if the premise of →Intro has a construc-
tion conditional on a construction of φ, the conclusion φ→ψ has a construction.
On the other hand, suppose there are constructions N of φ and f of φ →
ψ. A construction of φ → ψ is a function that turns constructions of φ into
constructions of ψ. So, f (N ) is a construction of ψ, i.e., the conclusion of
→Elim has a construction.

Disjunction

                                                 [φ]n   [ψ]n
      φ                  ψ                        ⋮      ⋮
    ------ ∨Intro      ------ ∨Intro     φ ∨ ψ    χ      χ
    φ ∨ ψ              φ ∨ ψ            --------------------- n ∨Elim
                                                  χ

If we have a construction Ni of φi we can turn it into a construction ⟨i, Ni ⟩


of φ1 ∨ φ2 . On the other hand, suppose we have a construction of φ1 ∨ φ2 , i.e.,
a pair ⟨i, Ni ⟩ where Ni is a construction of φi , and also functions f1 , f2 , which
turn constructions of φ1 , φ2 , respectively, into constructions of χ. Then fi (Ni )
is a construction of χ, the conclusion of ∨Elim.

Absurdity

     ⊥
    --- ⊥I
     φ


If we have a derivation of ⊥ from undischarged assumptions ψ1 , . . . , ψn , then


there is a function f (M1 , . . . , Mn ) that turns constructions of ψ1 , . . . , ψn into
a construction of ⊥. Since ⊥ has no construction, there cannot be any con-
structions of all of ψ1 , . . . , ψn either. Hence, f also has the property that if
M1 , . . . , Mn are constructions of ψ1 , . . . , ψn , respectively, then f (M1 , . . . , Mn )
is a construction of φ.

Rules for ¬

Since ¬φ is defined as φ → ⊥, we strictly speaking do not need rules for ¬. But if we did, this is what they'd look like:

    [φ]n
     ⋮
     ⊥                    ¬φ   φ
    ---- n ¬Intro        -------- ¬Elim
     ¬φ                     ⊥

Examples of Derivations

1. ⊢ φ → (¬φ → ⊥), i.e., ⊢ φ → ((φ → ⊥) → ⊥)

        [φ]²   [φ → ⊥]¹
        --------------- →Elim
               ⊥
        --------------- 1 →Intro
         (φ → ⊥) → ⊥
    ---------------------- 2 →Intro
     φ → ((φ → ⊥) → ⊥)

2. ⊢ ((φ ∧ ψ) → χ) → (φ → (ψ → χ))

                          [φ]²   [ψ]¹
                          ----------- ∧Intro
        [(φ ∧ ψ) → χ]³      φ ∧ ψ
        ----------------------------- →Elim
                     χ
        ----------------------------- 1 →Intro
                   ψ → χ
        ----------------------------- 2 →Intro
               φ → (ψ → χ)
    ------------------------------------- 3 →Intro
     ((φ ∧ ψ) → χ) → (φ → (ψ → χ))

3. ⊢ ¬(φ ∧ ¬φ), i.e., ⊢ (φ ∧ (φ → ⊥)) → ⊥

        [φ ∧ (φ → ⊥)]¹         [φ ∧ (φ → ⊥)]¹
        -------------- ∧Elim   -------------- ∧Elim
            φ → ⊥                    φ
        ------------------------------------- →Elim
                           ⊥
        ------------------------------------- 1 →Intro
                 (φ ∧ (φ → ⊥)) → ⊥

4. ⊢ ¬¬(φ ∨ ¬φ), i.e., ⊢ ((φ ∨ (φ → ⊥)) → ⊥) → ⊥

                                                 [φ]¹
                                            ------------ ∨Intro
        [(φ ∨ (φ → ⊥)) → ⊥]²                φ ∨ (φ → ⊥)
        ------------------------------------------------ →Elim
                               ⊥
        ------------------------------------------------ 1 →Intro
                             φ → ⊥
                                            ------------ ∨Intro
        [(φ ∨ (φ → ⊥)) → ⊥]²                φ ∨ (φ → ⊥)
        ------------------------------------------------ →Elim
                               ⊥
        ------------------------------------------------ 2 →Intro
                  ((φ ∨ (φ → ⊥)) → ⊥) → ⊥

Proposition 55.8. If Γ ⊢ φ in intuitionistic logic, Γ ⊢ φ in classical logic.


In particular, if φ is an intuitionistic theorem, it is also a classical theorem.

Proof. Every natural deduction rule is also a rule in classical natural deduction,
so every derivation in intuitionistic logic is also a derivation in classical logic.

Problem 55.1. Give derivations in intuitionistic logic of the following formu-


las:

1. (¬φ ∨ ψ) → (φ → ψ)

2. ¬¬¬φ → ¬φ

3. ¬¬(φ ∧ ψ) ↔ (¬¬φ ∧ ¬¬ψ)

4. ¬(φ ∨ ψ) ↔ (¬φ ∧ ¬ψ)

5. (¬φ ∨ ¬ψ) → ¬(φ ∧ ψ)

6. ¬¬(φ ∧ ψ) → (¬¬φ ∨ ¬¬ψ)


55.5 Axiomatic Derivations


Axiomatic derivations for intuitionistic propositional logic are the conceptually simplest, and historically first, derivation systems. They work just as in classical propositional logic.

Definition 55.9 (Derivability). If Γ is a set of formulas of L then a deriva-


tion from Γ is a finite sequence φ1 , . . . , φn of formulas where for each i ≤ n
one of the following holds:

1. φi ∈ Γ ; or

2. φi is an axiom; or

3. φi follows from some φj and φk with j < i and k < i by modus ponens,
i.e., φk ≡ φj → φi .
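This definition is directly algorithmic: a sequence is a derivation iff every entry is licensed by one of the three clauses, which a program can check. A Python sketch of ours (formulas are nested tuples with ('imp', A, B) for A → B, and the axioms are supplied as a predicate):

    def is_derivation(seq, gamma, is_axiom):
        """Check clauses (1)-(3) of Definition 55.9 for each entry."""
        for i, phi in enumerate(seq):
            if phi in gamma or is_axiom(phi):
                continue
            # modus ponens: earlier phi_j and phi_k with phi_k = phi_j -> phi
            if any(seq[k] == ('imp', seq[j], phi)
                   for j in range(i) for k in range(i)):
                continue
            return False
        return True

    p, q = ('var', 'p'), ('var', 'q')
    print(is_derivation([p, ('imp', p, q), q],          # p, p -> q, q
                        gamma={p, ('imp', p, q)},
                        is_axiom=lambda f: False))      # True, by modus ponens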


Definition 55.10 (Axioms). The set Ax0 of axioms for intuitionistic propositional logic consists of all formulas of the following forms:

(φ ∧ ψ) → φ (55.1)
(φ ∧ ψ) → ψ (55.2)
φ → (ψ → (φ ∧ ψ)) (55.3)
φ → (φ ∨ ψ) (55.4)
φ → (ψ ∨ φ) (55.5)
(φ → χ) → ((ψ → χ) → ((φ ∨ ψ) → χ)) (55.6)
φ → (ψ → φ) (55.7)
(φ → (ψ → χ)) → ((φ → ψ) → (φ → χ)) (55.8)
⊥ → φ (55.9)

Definition 55.11 (Derivability). A formula φ is derivable from Γ , written


Γ ⊢ φ, if there is a derivation from Γ ending in φ.

Definition 55.12 (Theorems). A formula φ is a theorem if there is a deriva-


tion of φ from the empty set. We write ⊢ φ if φ is a theorem and ⊬ φ if it is
not.

Proposition 55.13. If Γ ⊢ φ in intuitionistic logic, Γ ⊢ φ in classical logic.


In particular, if φ is an intuitionistic theorem, it is also a classical theorem.

Proof. Every intuitionistic axiom is also a classical axiom, so every derivation


in intuitionistic logic is also a derivation in classical logic.

Chapter 56

Semantics

This chapter collects definitions for semantics for intuitionistic logic.


So far only Kripke and topological semantics are covered. There are no
examples yet, either of how models make formulas true or of proofs that
formulas are valid.


56.1 Introduction
No logic is satisfactorily described without a semantics, and intuitionistic logic
is no exception. Whereas for classical logic, the semantics based on valuations is
canonical, there are several competing semantics for intuitionistic logic. None of
them are completely satisfactory in the sense that they give an intuitionistically
acceptable account of the meanings of the connectives.
The semantics based on relational models, similar to the semantics for
modal logics, is perhaps the most popular one. In this semantics, proposi-
tional variables are assigned to worlds, and these worlds are related by an
accessibility relation. That relation is always a partial order, i.e., it is reflexive,
antisymmetric, and transitive.
Intuitively, you might think of these worlds as states of knowledge or “evi-
dentiary situations.” A state w′ is accessible from w iff, for all we know, w′ is
a possible (future) state of knowledge, i.e., one that is compatible with what’s
known at w. Once a proposition is known, it can’t become un-known, i.e.,
whenever φ is known at w and Rww′ , φ is known at w′ as well. So “knowl-
edge” is monotonic with respect to the accessibility relation.
If we define “φ is known” as in epistemic logic as “true in all epistemic
alternatives,” then φ ∧ ψ is known at w if in all epistemic alternatives, both φ
and ψ are known. But since knowledge is monotonic and R is reflexive, that
means that φ ∧ ψ is known at w iff φ and ψ are known at w. For the same
reason, φ ∨ ψ is known at w iff at least one of them is known. So for ∧ and ∨,
the truth conditions of the connectives coincide with those in classical logic.
The truth conditions for the conditional, however, differ from classical logic.
φ → ψ is known at w iff at no w′ with Rww′ , φ is known without ψ also being
known. This is not the same as the condition that φ is unknown or ψ is known
at w. For if we know neither φ nor ψ at w, there might be a future epistemic
state w′ with Rww′ such that at w′ , φ is known without also coming to know ψ.
We know ¬φ only if there is no possible future epistemic state in which
we know φ. Here the idea is that if φ were knowable, then in some possible
future epistemic state φ becomes known. Since we can’t know ⊥, in that future
epistemic state, we would know φ but not know ⊥.
On this interpretation the principle of excluded middle fails. For there are
some φ which we don’t yet know, but which we might come to know. For such
a formula φ, both φ and ¬φ are unknown, so φ ∨ ¬φ is not known. But we do
know, e.g., that ¬(φ ∧ ¬φ). For no future state in which we know both φ and
¬φ is possible, and we know this independently of whether or not we know φ
or ¬φ.
Relational models are not the only available semantics for intuitionistic
logic. The topological semantics is another: here propositions are interpreted


as open sets in a topological space, and the connectives are interpreted as


operations on these sets (e.g., ∧ corresponds to intersection).


56.2 Relational Models


In order to give a precise semantics for intuitionistic propositional logic, we
have to give a definition of what counts as a model relative to which we can
evaluate formulas. On the basis of such a definition it is then also possible to
define semantics notions such as validity and entailment. One such semantics
is given by relational models.
Definition 56.1. A relational model for intuitionistic propositional logic is a
triple M = ⟨W, R, V ⟩, where
1. W is a non-empty set,
2. R is a partial order (i.e., a reflexive, antisymmetric, and transitive binary
relation) on W , and
3. V is a function assigning to each propositional variable p a subset of W ,
such that
4. V is monotone with respect to R, i.e., if w ∈ V (p) and Rww′ , then
w′ ∈ V (p).

Definition 56.2. We define the notion of φ being true at w in M, M, w ⊩ φ, int:sem:rel:


defn:true-at-w
inductively as follows:
1. φ ≡ p: M, w ⊩ φ iff w ∈ V (p).
2. φ ≡ ⊥: not M, w ⊩ φ.
3. φ ≡ ¬ψ: M, w ⊩ φ iff for no w′ such that Rww′ , M, w′ ⊩ ψ.
4. φ ≡ ψ ∧ χ: M, w ⊩ φ iff M, w ⊩ ψ and M, w ⊩ χ.
5. φ ≡ ψ ∨ χ: M, w ⊩ φ iff M, w ⊩ ψ or M, w ⊩ χ (or both).
6. φ ≡ ψ → χ: M, w ⊩ φ iff for every w′ such that Rww′ , not M, w′ ⊩ ψ
or M, w′ ⊩ χ (or both).
We write M, w ⊮ φ if not M, w ⊩ φ. If Γ is a set of formulas, M, w ⊩ Γ means
M, w ⊩ ψ for all ψ ∈ Γ .

Problem 56.1. Show that according to Definition 56.2, M, w ⊩ ¬φ iff M, w ⊩


φ → ⊥.

Proposition 56.3. Truth at worlds is monotonic with respect to R, i.e., if int:sem:rel:


M, w ⊩ φ and Rww′ , then M, w′ ⊩ φ. prop:true-monotonic

752 Release : 6891b66 (2024-12-01)


56.3. SEMANTIC NOTIONS

Proof. Exercise.

Problem 56.2. Prove Proposition 56.3.

content/intuitionistic-logic/semantics/semantic-notions.tex

56.3 Semantic Notions


int:sem:sem:
sec
Definition 56.4. We say φ is true in the model M = ⟨W, R, V ⟩, M ⊩ φ, iff
M, w ⊩ φ for all w ∈ W . φ is valid, ⊨ φ, iff it is true in all models. We say
a set of formulas Γ entails φ, Γ ⊨ φ, iff for every model M and every w such
that M, w ⊩ Γ , M, w ⊩ φ.

int:sem:sem: Proposition 56.5.


prop:sat-entails
int:sem:sem: 1. If M, w ⊩ Γ and Γ ⊨ φ, then M, w ⊩ φ.
prop:sat-entails1
int:sem:sem: 2. If M ⊩ Γ and Γ ⊨ φ, then M ⊩ φ.
prop:sat-entails2

Proof. 1. Suppose M ⊩ Γ . Since Γ ⊨ φ, we know that if M, w ⊩ Γ , then


M, w ⊩ φ. Since M, u ⊩ Γ for all every u ∈ W , M, w ⊩ Γ . Hence
M, w ⊩ φ.
2. Follows immediately from (1).

int:sem:sem: Definition 56.6. Suppose M is a relational model and w ∈ W . The restric-


defn:restrict
tion Mw = ⟨Ww , Rw , Vw ⟩ of M to w is given by:

Ww = {u ∈ W : Rwu},
Rw = R ∩ (Ww )2 , and
Vw (p) = V (p) ∩ Ww .

int:sem:sem: Proposition 56.7. M, w ⊩ φ iff Mw ⊩ φ.


prop:restrict

Problem 56.3. Prove Proposition 56.7.

Proposition 56.8. Suppose for every model M such that M ⊩ Γ , M ⊩ φ.


Then Γ ⊨ φ.

Proof. Suppose that M, w ⊩ Γ . By the Proposition 56.7 applied to every ψ ∈


Γ , we have Mw ⊩ Γ . By the assumption, we have Mw ⊩ φ. By Proposi-
tion 56.7 again, we get M, w ⊩ φ.

content/intuitionistic-logic/semantics/topological-semantics.tex

Release : 6891b66 (2024-12-01) 753


CHAPTER 56. SEMANTICS

56.4 Topological Semantics


Another way to provide a semantics for intuitionistic logic is using the mathe- int:sem:top:
sec
matical concept of a topology.

Definition 56.9. Let X be a set. A topology on X is a set O ⊆ ℘(X) that


satisfies the properties below. The elements of O are called the open sets of
the topology. The set X together with O is called a topological space.

1. The empty set and the entire space are open: ∅, X ∈ O.

2. Open sets are closed under finite intersections: if U , V ∈ O then U ∩ V ∈


O

Open sets are closed under arbitrary unions: if Ui ∈ O for all i ∈ I, then
3. S
{Ui : i ∈ I} ∈ O.

We may write X for a topology if the collection of open sets can be inferred
from the context; note that, still, only after X is endowed with open sets can
it be called a topology.

Definition 56.10. A topological model of intuitionistic propositional logic is a


triple X = ⟨X, O, V ⟩ where O is a topology on X and V is a function assigning
an open set in O to each propositional variable.
Given a topological model X, we can define [φ]]X inductively as follows:

1. [⊥]]X = ∅

2. [p]]X = V (p)

3. [φ ∧ ψ]]X = [φ]]X ∩ [ψ]]X

4. [φ ∨ ψ]]X = [φ]]X ∪ [ψ]]X

5. [φ → ψ]]X = Int((X \ [φ]]X ) ∪ [ψ]]X )

Here, Int(V ) is the function that maps a set V ⊆ X to its interior, that is, the
union of all open sets it contains. In other words,
[
Int(V ) = {U : U ⊆ V and U ∈ O}.

Note that the interior of any set is always open, since it is a union of open
sets. Thus, [φ]]X is always an open set.
Although topological semantics is highly abstract, there are ways to think
about it that might motivate it. Suppose that the elements, or “points,” of X
are points at which statements can be evaluated. The set of all points where φ
is true is the proposition expressed by φ. Not every set of points is a potential
proposition; only the elements of O are. φ ⊨ ψ iff ψ is true at every point at
which φ is true, i.e., [φ]]X ⊆ [ψ]]X , for all X. The absurd statement ⊥ is never
true, so [⊥]]X = ∅.

754 Release : 6891b66 (2024-12-01)


How must the propositions expressed by ψ ∧ χ, ψ ∨ χ, and ψ → χ be related
to those expressed by ψ and χ for the intuitionistically valid laws to hold, i.e.,
so that φ ⊢ ψ iff [φ]]X ⊂ [ψ]]X ? We require ⊥ ⊢ φ for any φ, which is satisfied
because ∅ ⊆ U for all U . Since ψ ∧ χ ⊢ ψ, we require that [ψ ∧ χ]]X ⊆ [ψ]]X ,
and similarly [ψ ∧ χ]]X ⊆ [χ]]X . The largest set satisfying W ⊆ U and W ⊆ V
is U ∩ V . Conversely, ψ ⊢ ψ ∨ χ and χ ⊢ ψ ∨ χ, and so we require that
[ψ]]X ⊆ [ψ ∨ χ]]X and [χ]]X ⊆ [ψ ∨ χ]]X . The smallest set W such that U ⊆ W and
V ⊆ W is U ∪ V .
The definition for → is tricky: φ → ψ expresses the weakest proposition
that, combined with φ, entails ψ. That φ → ψ combined with φ entails ψ is
clear from (φ → ψ) ∧ φ ⊢ ψ. So [φ → ψ]]X should be the greatest open set such
that [φ → ψ]]X ∩ [φ]]X ⊂ [ψ]]X , leading to our definition.

Chapter 57

Soundness and Completeness

This chapter collects soundness and completeness results for propo-


sitional intuitionistic logic. It needs an introduction. The completeness
proof makes use of facts about provability that should be stated and proved
explicitly somewhere.

content/intuitionistic-logic/soundness-completeness/soundness-axd.tex

57.1 Soundness of Axiomatic Derivations


int:sc:sax:
sec

The soundness proof relies on the fact that all axioms are intuitionisti-
cally valid; this still needs to be proved, e.g., in the Semantics chapter.

int:sc:sax: Theorem 57.1 (Soundness). If Γ ⊢ φ, then Γ ⊨ φ.


thm:soundness

755
CHAPTER 57. SOUNDNESS AND COMPLETENESS

Proof. We prove that if Γ ⊢ φ, then Γ ⊨ φ. The proof is by induction on


the number n of formulas in the derivation of φ from Γ . We show that if φ1 ,
. . . , φn = φ is a derivation from Γ , then Γ ⊨ φn . Note that if φ1 , . . . , φn is
a derivation, so is φ1 , . . . , φk for any k < n.
There are no derivations of length 0, so for n = 0 the claim holds vacuously.
So the claim holds for all derivations of length < n. We distinguish cases
according to the justification of φn .

1. φn is an axiom. All axioms are valid, so Γ ⊨ φn for any Γ .

2. φn ∈ Γ . Then for any M and w, if M, w ⊩ Γ , obviously M ⊩ Γ φn [w],


i.e., Γ ⊨ φ.

3. φn follows by mp from φi and φj ≡ φi → φn . φ1 , . . . , φi and φ1 ,


. . . , φj are derivations from Γ , so by inductive hypothesis, Γ ⊨ φi and
Γ ⊨ φi → φn .
Suppose M, w ⊩ Γ . Since M, w ⊩ Γ and Γ ⊨ φi → φn , M, w ⊩ φi → φn .
By definition, this means that for all w′ such that Rww′ , if M, w′ ⊩ φi
then M, w′ ⊩ φn . Since R is reflexive, w is among the w′ such that Rww′ ,
i.e., we have that if M, w ⊩ φi then M, w ⊩ φn . Since Γ ⊨ φi , M, w ⊩ φi .
So, M, w ⊩ φn , as we wanted to show.

content/intuitionistic-logic/soundness-completeness/soundness-nd.tex

57.2 Soundness of Natural Deduction


We will now prove soundness of natural deduction with regards to the rela- int:sc:snd:
sec
tional semantics, that is, showing that if a formula is derivable from a set of
assumptions then the set of assumptions entails the formula.

Theorem 57.2 (Soundness). If Γ ⊢ φ, then Γ ⊨ φ. int:sc:snd:


thm:soundness

Proof. We prove that if Γ ⊢ φ, then Γ ⊨ φ. The proof is by induction on the


derivation of φ from Γ .

1. If the derivation consists of just the assumption φ, we have φ ⊢ φ, and


want to show that φ ⊨ φ. Suppose that M, w ⊩ φ. Then trivially
M, w ⊩ φ.

2. The derivation ends in ∧Intro: The derivations of the premises ψ from


undischarged assumptions Γ and of χ from undischarged assumptions ∆
show that Γ ⊢ ψ and ∆ ⊢ χ. By induction hypothesis we have that Γ ⊨ ψ
and ∆ ⊨ χ. We have to show that Γ ∪ ∆ ⊨ φ ∧ ψ, since the undischarged
assumptions of the entire derivation are Γ together with ∆. So suppose
M, w ⊩ Γ ∪ ∆. Then also M, w ⊩ Γ . Since Γ ⊨ ψ, M, w ⊩ ψ. Similarly,
M, w ⊩ χ. So M, w ⊩ ψ ∧ χ.

756 Release : 6891b66 (2024-12-01)


57.2. SOUNDNESS OF NATURAL DEDUCTION

3. The derivation ends in ∧Elim: The derivation of the premise ψ ∧ χ from


undischarged assumptions Γ shows that Γ ⊢ ψ ∧ χ. By induction hy-
pothesis, Γ ⊨ ψ ∧ χ. We have to show that Γ ⊨ ψ. So suppose M, w ⊩ Γ .
Since Γ ⊨ ψ ∧ χ, M, w ⊩ ψ ∧ χ. Then also M, w ⊩ ψ. Similarly if ∧Elim
ends in χ, then Γ ⊨ χ.
4. The derivation ends in ∨Intro: Suppose the premise is ψ, and the undis-
charged assumptions of the derivation ending in ψ are Γ . Then we have
Γ ⊢ ψ and by inductive hypothesis, Γ ⊨ ψ. We have to show that
Γ ⊨ ψ ∨ χ. Suppose M, w ⊩ Γ . Since Γ ⊨ ψ, M, w ⊩ ψ. But then also
M, w ⊩ ψ ∨ χ. Similarly, if the premise is χ, we have that Γ ⊨ χ.
5. The derivation ends in ∨Elim: The derivations ending in the premises
are of ψ ∨ χ from undischarged assumptions Γ , of θ from undischarged
assumptions ∆1 ∪{ψ}, and of θ from undischarged assumptions ∆2 ∪{χ}.
So we have Γ ⊢ ψ ∨ χ, ∆1 ∪ {ψ} ⊢ θ, and ∆2 ∪ {χ} ⊢ θ. By induction
hypothesis, Γ ⊨ ψ ∨ χ, ∆1 ∪ {ψ} ⊨ θ, and ∆2 ∪ {χ} ⊨ θ. We have to prove
that Γ ∪ ∆1 ∪ ∆2 ⊨ θ.
Suppose M, w ⊩ Γ ∪ ∆1 ∪ ∆2 . Then M, w ⊩ Γ and since Γ ⊨ ψ ∨ χ,
M, w ⊩ ψ ∨ χ. By definition of M ⊩, either M, w ⊩ ψ or M, w ⊩ χ.
So we distinguish cases: (a) M ⊩ ψ[w]. Then M, w ⊩ ∆1 ∪ {ψ}. Since
∆1 ∪ ψ ⊨ θ, we have M, w ⊩ θ. (b) M, w ⊩ χ. Then M, w ⊩ ∆2 ∪ {χ}.
Since ∆2 ∪ χ ⊨ θ, we have M, w ⊩ θ. So in either case, M, w ⊩ θ, as we
wanted to show.
6. The derivation ends with →Intro concluding ψ → χ. Then the premise
is χ, and the derivation ending in the premise has undischarged assump-
tions Γ ∪ {ψ}. So we have that Γ ∪ {ψ} ⊢ χ, and by induction hypothesis
that Γ ∪ {ψ} ⊨ χ. We have to show that Γ ⊨ ψ → χ.
Suppose M, w ⊩ Γ . We want to show that for all w′ such that Rww′ , if
M, w′ ⊩ ψ, then M, w′ ⊩ χ. So assume that Rww′ and M, w′ ⊩ ψ. By
Proposition 56.3, M, w′ ⊩ Γ . Since Γ ∪ {ψ} ⊨ χ, M, w′ ⊩ χ, which is
what we wanted to show.
7. The derivation ends in →Elim and conclusion χ. The premises are ψ → χ
and ψ, with derivations from undischarged assumptions Γ , ∆. So we
have Γ ⊢ ψ → χ and ∆ ⊢ ψ. By inductive hypothesis, Γ ⊨ ψ → χ and
∆ ⊨ ψ. We have to show that Γ ∪ ∆ ⊨ χ.
Suppose M, w ⊩ Γ ∪ ∆. Since M, w ⊩ Γ and Γ ⊨ ψ → χ, M, w ⊩ ψ → χ.
By definition, this means that for all w′ such that Rww′ , if M, w′ ⊩ ψ
then M, w′ ⊩ χ. Since R is reflexive, w is among the w′ such that Rww′ ,
i.e., we have that if M, w ⊩ ψ then M, w ⊩ χ. Since M, w ⊩ ∆ and
∆ ⊨ ψ, M, w ⊩ ψ. So, M, w ⊩ χ, as we wanted to show.
8. The derivation ends in ⊥I , concluding φ. The premise is ⊥ and the
undischarged assumptions of the derivation of the premise are Γ . Then
Γ ⊢ ⊥. By inductive hypothesis, Γ ⊨ ⊥. We have to show Γ ⊨ φ.

Release : 6891b66 (2024-12-01) 757


CHAPTER 57. SOUNDNESS AND COMPLETENESS

We proceed indirectly. If Γ ⊭ φ there is a model M and world w such


that M, w ⊩ Γ and M, w ⊮ φ. Since Γ ⊨ ⊥, M, w ⊩ ⊥. But that’s
impossible, since by definition, M, w ⊮ ⊥. So Γ ⊨ φ.

9. The derivation ends in ¬Intro: Exercise.

10. The derivation ends in ¬Elim: Exercise.

Problem 57.1. Complete the proof of Theorem 57.2. For the cases for ¬Intro
and ¬Elim, use the definition of M, w ⊩ ¬φ in Definition 56.2, i.e., don’t treat
¬φ as defined by φ → ⊥.

Problem 57.2. Show that the following formulas are not derivable in intu-
itionistic logic:

1. (φ → ψ) ∨ (ψ → φ)

2. (¬¬φ → φ) → (φ ∨ ¬φ)

3. (φ → ψ ∨ χ) → (φ → ψ) ∨ (φ → χ)

content/intuitionistic-logic/soundness-completeness/lindenbaum.tex

57.3 Lindenbaum’s Lemma


The completeness theorem for intuitionistic logic is proved by assuming Γ ⊬ φ int:sc:lin:
sec
and constructing a model M ⊩ Γ and M ⊮ φ.
In classical logic the relation of derivability can be reduced to the notion
of consistency since a formula φ is derivable from a set of formulas iff the
set together with the negation of φ is inconsistent. This is not possible in
intuitionistic logic. In intuitionistic logic, if ¬φ is inconsistent, we only get
that ⊢ ¬¬φ. Since ¬¬φ → φ does not hold intuitionistically in general, we
cannot conclude that ⊢ φ.
Thus, when constructing the model M, we will need to keep track of the
non-derivability of the formula φ and thus we will not be able to use a complete
set Γ ∗ ⊇ Γ to build the model M, as in every complete set Γ ∗ , we have
Γ ∗ ⊢ φ ∨ ¬φ.
Instead of using a complete set Γ ∗ , we will us the notion of a prime set of
formulas:

Definition 57.3. A set of formulas Γ is prime iff int:sc:lin:


defn:prime
1. Γ is consistent, i.e., Γ ⊬ ⊥; int:sc:lin:
defn:prime1
2. if Γ ⊢ φ then φ ∈ Γ ; and int:sc:lin:
defn:prime2
3. if φ ∨ ψ ∈ Γ then φ ∈ Γ or ψ ∈ Γ . int:sc:lin:
defn:prime3

758 Release : 6891b66 (2024-12-01)


57.3. LINDENBAUM’S LEMMA

int:sc:lin: Lemma 57.4 (Lindenbaum’s Lemma). If Γ ⊬ φ, there is a Γ ∗ ⊇ Γ such


lem:lindenbaum
that Γ ∗ is prime and Γ ∗ ⊬ φ.

Proof. Let ψ1 ∨ χ1 , ψ2 ∨ χ2 , . . . , be an enumeration of all formulas of the


form ψ ∨ χ. We’ll define an increasing sequence of sets of formulas Γn , where
each Γn+1 is defined as Γn together with one new formula. Γ ∗ will be the
union of all Γn . The new formulas are selected so as to ensure that Γ ∗ is
prime and still Γ ∗ ⊬ φ. This means that at each step we should find the first
disjunction ψi ∨ χi such that:

int:sc:lin: 1. Γn ⊢ ψi ∨ χi
gamma-1
int:sc:lin: 2. ψi ∈
/ Γn and χi ∈
/ Γn
gamma-2
We add to Γn either ψi if Γn ∪ {ψi } ⊬ φ, or χi otherwise. We’ll have to show
that this works. For now, let’s define i(n) as the least i such that (1) and (2)
hold.
Define Γ0 = Γ and
(
Γn ∪ {ψi(n) } if Γn ∪ {ψi(n) } ⊬ φ
Γn+1 =
Γn ∪ {χi(n) } otherwise

If i(n) is undefined, i.e., whenever


S∞ Γn ⊢ ψ ∨ χ, either ψ ∈ Γn or χ ∈ Γn , we let
Γn+1 = Γn . Now let Γ ∗ = n=0 Γn
First we show that for all n, Γn ⊬ φ. We proceed by induction on n. For
n = 0 the claim holds by the hypothesis of the theorem, i.e., Γ ⊬ φ. If n > 0,
we have to show that if Γn ⊬ φ then Γn+1 ⊬ φ. If i(n) is undefined, Γn+1 = Γn
and there is nothing to prove. So suppose i(n) is defined. For simplicity, let
i = i(n).
We’ll prove the contrapositive of the claim. Suppose Γn+1 ⊢ φ. By con-
struction, Γn+1 = Γn ∪ {ψi } if Γn ∪ {ψi } ⊬ φ, or else Γn+1 = Γn ∪ {χi }. It
clearly can’t be the first, since then Γn+1 ⊬ φ. Hence, Γn ∪ {ψi } ⊢ φ and
Γn+1 = Γn ∪ {χi }. By definition of i(n), we have that Γn ⊢ ψi ∨ χi . We have
Γn ∪ {ψi } ⊢ φ. We also have Γn+1 = Γn ∪ {χi } ⊢ φ. Hence, Γn ⊢ φ, which is
what we wanted to show.
If Γ ∗ ⊢ φ, there would be some finite subset Γ ′ ⊆ Γ ∗ such that Γ ′ ⊢ φ.
Each θ ∈ Γ ′ must be in Γi for some i. Let n be the largest of these. Since
Γi ⊆ Γn if i ≤ n, Γ ′ ⊆ Γn . But then Γn ⊢ φ, contrary to our proof above that
Γn ⊬ φ.
Lastly, we show that Γ ∗ is prime, i.e., satisfies conditions (1), (2), and (3)
of Definition 57.3.
First, Γ ∗ ⊬ φ, so Γ ∗ is consistent, so (1) holds.
We now show that if Γ ∗ ⊢ ψ ∨ χ, then either ψ ∈ Γ ∗ or χ ∈ Γ ∗ . This
proves (3), since if ψ ∨ χ ∈ Γ ∗ then also Γ ∗ ⊢ ψ ∨ χ. So assume Γ ∗ ⊢ ψ ∨ χ but
ψ∈ / Γ ∗ and χ ∈ / Γ ∗ . Since Γ ∗ ⊢ ψ ∨ χ, Γn ⊢ ψ ∨ χ for some n. ψ ∨ χ appears
on the enumeration of all disjunctions, say, as ψj ∨ χj . ψj ∨ χj satisfies the
properties in the definition of i(n), namely we have Γn ⊢ ψj ∨χj , while ψj ∈ / Γn

Release : 6891b66 (2024-12-01) 759


CHAPTER 57. SOUNDNESS AND COMPLETENESS

and χj ∈/ Γn . At each stage, at least one fewer disjunction ψi ∨ χi satisfies the


conditions (since at each stage we add either ψi or χi ), so at some stage m we
will have j = i(m). But then either ψ ∈ Γm+1 or χ ∈ Γm+1 , contrary to the
assumption that ψ ∈ / Γ ∗ and χ ∈/ Γ ∗.
Now suppose Γ ⊢ ψ. Then Γ ∗ ⊢ ψ ∨ ψ. But we’ve just proved that if

Γ ⊢ ψ ∨ ψ then ψ ∈ Γ ∗ . Hence, Γ ∗ satisfies (2) of Definition 57.3.


Problem 57.3. Show that if Γ ⊬ ⊥ then Γ is consistent in classical logic, i.e.,


there is a valuation making all formulas in Γ true.

content/intuitionistic-logic/soundness-completeness/canonical-model.tex

57.4 The Canonical Model


The worlds in our model will be finite sequences σ of natural numbers, i.e., int:sc:mod:
sec
σ ∈ N∗ . Note that N∗ is inductively defined by:
1. Λ ∈ N∗ .
2. If σ ∈ N∗ and n ∈ N, then σ.n ∈ N∗ (where σ.n is σ ⌢ ⟨n⟩ and σ ⌢ σ ′ is
the concatenation if σ and σ ′ ).
3. Nothing else is in N∗ .
So we can use N∗ to give inductive definitions.
Let ⟨ψ1 , χ1 ⟩, ⟨ψ2 , χ2 ⟩, . . . , be an enumeration of all pairs of formulas. Given
a set of formulas ∆, define ∆(σ) by induction as follows:
1. ∆(Λ) = ∆
2. ∆(σ.n) = (
(∆(σ) ∪ {ψn })∗ if ∆(σ) ∪ {ψn } ⊬ χn
∆(σ) otherwise

Here by (∆(σ) ∪ {ψn })∗ we mean the prime set of formulas which exists by
Lemma 57.4 applied to the set ∆(σ) ∪ {ψn } and the formula χn . Note that by
this definition, if ∆(σ) ∪ {ψn } ⊬ χn , then ∆(σ.n) ⊢ ψn and ∆(σ.n) ⊬ χn . Note
also that ∆(σ) ⊆ ∆(σ.n) for any n. If ∆ is prime, then ∆(σ) is prime for all σ.
Definition 57.5. Suppose ∆ is prime. Then the canonical model M(∆) for int:sc:mod:
defn:canonical-model
∆ is defined by:
1. W = N∗ , the set of finite sequences of natural numbers.
2. R is the partial order according to which Rσσ ′ iff σ is an initial segment
of σ ′ (i.e., σ ′ = σ ⌢ σ ′′ for some sequence σ ′′ ).
3. V (p) = {σ : p ∈ ∆(σ)}.

760 Release : 6891b66 (2024-12-01)


57.5. THE TRUTH LEMMA

It is easy to verify that R is indeed a partial order. Also, the monotonic-


ity condition on V is satisfied. Since ∆(σ) ⊆ ∆(σ.n) we get ∆(σ) ⊆ ∆(σ ′ )
whenever Rσσ ′ by induction on σ.

content/intuitionistic-logic/soundness-completeness/truth-lemma.tex

57.5 The Truth Lemma


int:sc:tru:
sec
int:sc:tru: Lemma 57.6. If ∆ is prime, then M(∆), σ ⊩ φ iff ∆(σ) ⊢ φ.
lem:truth

Proof. By induction on φ.

1. φ ≡ ⊥: Since ∆(σ) is prime, it is consistent, so ∆(σ) ⊬ φ. By definition,


M(∆), σ ⊮ φ.

2. φ ≡ p: By definition of ⊩, M(∆), σ ⊩ φ iff σ ∈ V (p), i.e., ∆(σ) ⊢ φ.

3. φ ≡ ¬ψ: exercise.

4. φ ≡ ψ ∧ χ: M(∆), σ ⊩ φ iff M(∆), σ ⊩ ψ and M(∆), σ ⊩ χ. By


induction hypothesis, M(∆), σ ⊩ ψ iff ∆(σ) ⊢ ψ, and similarly for χ.
But ∆(σ) ⊢ ψ and ∆(σ) ⊢ χ iff ∆(σ) ⊢ φ.

5. φ ≡ ψ ∨ χ: M(∆), σ ⊩ φ iff M(∆), σ ⊩ ψ or M(∆), σ ⊩ χ. By induction


hypothesis, this holds iff ∆(σ) ⊢ ψ or ∆(σ) ⊢ χ. We have to show that
this in turn holds iff ∆(σ) ⊢ φ. The left-to-right direction is clear. The
right-to-left direction follows since ∆(σ) is prime.

6. φ ≡ ψ→χ: First the contrapositive of the left-to-right direction: Assume


∆(σ) ⊬ ψ → χ. Then also ∆(σ) ∪ {ψ} ⊬ χ. Since ⟨ψ, χ⟩ is ⟨ψn , χn ⟩ for
some n, we have ∆(σ.n) = (∆(σ)∪{ψ})∗ , and ∆(σ.n) ⊢ ψ but ∆(σ.n) ⊬ χ.
By inductive hypothesis, M(∆), σ.n ⊩ ψ and M(∆), σ.n ⊮ χ. Since
Rσ(σ.n), this means that M(∆), σ ⊮ φ.
Now assume ∆(σ) ⊢ ψ → χ, and let Rσσ ′ . Since ∆(σ) ⊆ ∆(σ ′ ), we
have: if ∆(σ ′ ) ⊢ ψ, then ∆(σ ′ ) ⊢ χ. In other words, for every σ ′ such
that Rσσ ′ , either ∆(σ ′ ) ⊬ ψ or ∆(σ ′ ) ⊢ χ. By induction hypothesis, this
means that whenever Rσσ ′ , either M(∆), σ ′ ⊮ ψ or M(∆), σ ′ ⊩ χ, i.e.,
M(∆), σ ⊩ φ.

content/intuitionistic-logic/soundness-completeness/completeness-thm.tex

57.6 The Completeness Theorem


int:sc:cpl:
sec
int:sc:cpl: Theorem 57.7. If Γ ⊨ φ then Γ ⊢ φ.
thm:completeness

Release : 6891b66 (2024-12-01) 761


Proof. We prove the contrapositive: Suppose Γ ⊬ φ. Then by Lemma 57.4,
there is a prime set Γ ∗ ⊇ Γ such that Γ ∗ ⊬ φ. Consider the canonical
model M(Γ ∗ ) for Γ ∗ as defined in Definition 57.5. For any ψ ∈ Γ , Γ ∗ ⊢ ψ. Note
that Γ ∗ (Λ) = Γ ∗ . By the Truth Lemma (Lemma 57.6), we have M(Γ ∗ ), Λ ⊩ ψ
for all ψ ∈ Γ and M(Γ ∗ ), Λ ⊮ φ. This shows that Γ ⊭ φ.

Problem 57.4. Show that if φ only contains propositional variables, ∨, and ∧,


then ⊭ φ. Use this to conclude that → is not definable in intuitionistic logic
from ∨ and ∧.

Problem 57.5. By using the completeness theorem prove that if ⊢ φ ∨ ψ then


⊢ φ or ⊢ ψ. (Hint: Assume M1 ⊮ φ and M2 ⊮ ψ and construct a new model
M such that M ⊮ φ ∨ ψ.)

Problem 57.6. Show that if M is a relational model using a linear order then
M ⊩ (φ → ψ) ∨ (ψ → φ).

content/intuitionistic-logic/soundness-completeness/decidability.tex

57.7 Decidability
Observe that the proof of the completeness theorem gives us for every Γ ⊬ φ a int:sc:dec:
sec
model with an infinite number of worlds witnessing the fact that Γ ⊭ φ. The
following proposition shows that to prove ⊨ φ it is enough to prove that M ⊩ φ
for all finite models (i.e., models with a finite set of worlds).

Theorem 57.8. If ⊭ φ then there is a finite model M′ ⊮ φ. int:sc:dec:


thm:decidability

Proof. Assume M = ⟨W, R, V ⟩ is such that M ⊮ φ and P is the set of


propositional variables occurring in φ. Define M′ = ⟨W ′ , R′ , V ′ ⟩ by letting
W ′ = {[w] : w ∈ W } where [w] = {p ∈ P : w ∈ V (p)}, R′ be the subset
relation, and V ′ (p) = {[w] : p ∈ [w]}. It should be clear that W ′ is a finite set
and that M′ is a relational model.
It can be shown, by induction on φ, that

M, w ⊩ φ iff M′ , [w] ⊩ φ

for all formulas φ with only propositional variables from P . This is left as an
exercise for the reader.

Problem 57.7. Finish the proof of Theorem 57.8 by showing that M, w ⊩ φ


iff M′ , [w] ⊩ φ for all formulas φ with only propositional variables from P .

762
From Theorem 57.8 it follows that there is an algorithm to decide whether ⊨
φ.

Chapter 58

Propositions as Types

This is a very experimental draft of a chapter on the Curry–Howard


correspondence. It needs more explanation and motivation, and there
are probably errors and omissions. The proof of normalization should be
reviewed and expanded. There are no examples for the product type.
Permutation and simplification conversions are not covered. It will make a
lot more sense once there is also material on the (typed) lambda calculus
which is basically presupposed here. Use with extreme caution.

content/intuitionistic-logic/propositions-as-types/introduction.tex

58.1 Introduction
int:pty:int:
sec
Historically the lambda calculus and intuitionistic logic were developed sepa-
rately. Haskell Curry and William Howard independently discovered a close
similarity: types in a typed lambda calculus correspond to formulas in intu-
itionistic logic in such a way that a derivation of a formula corresponds directly
to a typed lambda term with that formula as its type. Moreover, beta reduc-
tion in the typed lambda calculus corresponds to certain transformations of
derivations.
For instance, a derivation of φ→ψ corresponds to a term λxφ . N ψ , which has
the function type φ → ψ. The inference rules of natural deduction correspond
to typing rules in the typed lambda calculus, e.g.,

Release : 6891b66 (2024-12-01) 763


CHAPTER 58. PROPOSITIONS AS TYPES

[φ]x

ψ x:φ ⇒ N :ψ
x →Intro λ
φ→ψ corresponds to ⇒ λxφ . N ψ : φ → ψ

where the rule on the right means that if x is of type φ and N is of type ψ,
then λxφ . N is of type φ → ψ.
The →Elim rule corresponds to the typing rule for composition terms, i.e.,
φ→ψ φ
→Elim
ψ corresponds to
⇒ P :φ→ψ ⇒ Q:φ
app
⇒ P φ→ψ Qφ : ψ
If a →Intro rule is followed immediately by a →Elim rule, the derivation
can be simplified:

[φ]x

φ


ψ
x →Intro
φ→ψ φ
→Elim
ψ ψ

which corresponds to the beta reduction of lambda terms

(λxφ . P ψ )Q →
− P [Q/x].

Similar correspondences hold between the rules for ∧ and “product” types,
and between the rules for ∨ and “sum” types.
This correspondence between terms in the simply typed lambda calculus
and natural deduction derivations is called the “Curry–Howard”, or “proposi-
tions as types” correspondence. In addition to formulas (propositions) corre-
sponding to types, and proofs to terms, we can summarize the correspondences
as follows:
logic program
proposition type
proof term
assumption variable
discharged assumption bind variable
not discharged assumption free variable
implication function type
conjunction product type
disjunction sum type
absurdity bottom type

764 Release : 6891b66 (2024-12-01)


58.2. SEQUENT NATURAL DEDUCTION

The Curry–Howard correspondence is one of the cornerstones of automated


proof assistants and type checkers for programs, since checking a proof witness-
ing a proposition (as we did above) amounts to checking if a program (term)
has the declared type.

content/intuitionistic-logic/propositions-as-types/sequent-natural-deduction.tex

58.2 Sequent Natural Deduction


int:pty:snd: Let us write Γ ⇒ φ if there is a natural deduction derivation with Γ as undis-
sec
charged assumptions and φ as conclusion; or ⇒ φ if Γ is empty.
We write Γ, φ1 , . . . , φn for Γ ∪ {φ1 , . . . , φn }, and Γ, ∆ for Γ ∪ ∆.
Observe that when we have Γ ⇒ φ ∧ φ, meaning we have a derivation with
Γ as undischarged assumptions and φ ∧ φ as end-formula, then by applying
∧Elim at the bottom, we can get a derivation with the same undischarged
assumptions and φ as conclusion. In other words, if Γ ⇒ φ ∧ ψ, then Γ ⇒ φ.
Γ ⇒ φ∧ψ Γ ⇒ φ∧ψ
∧Elim ∧Elim
Γ ⇒ φ Γ ⇒ ψ

The label ∧Elim hints at the relation with the rule of the same name in natural
deduction.
Likewise, suppose we have Γ, φ ⇒ ψ, meaning we have a derivation with
undischarged assumptions Γ, φ and end-formula ψ. If we apply the →Intro
rule, we have a derivation with Γ as undischarged assumptions and φ → ψ as
the end-formula, i.e., Γ ⇒ φ → ψ. Note how this has made the discharge of
assumptions more explicit.
Γ, φ ⇒ ψ
→Intro
Γ ⇒ φ→ψ

We can draw conclusions from other rules in the same fashion, which is
spelled out as follows:

Γ ⇒ φ ∆ ⇒ ψ
∧Intro
Γ, ∆ ⇒ φ ∧ ψ
Γ ⇒ φ∧ψ Γ ⇒ φ∧ψ
∧Elim1 ∧Elim2
Γ ⇒ φ Γ ⇒ ψ
Γ ⇒ φ Γ ⇒ ψ
∨Intro1 ∨Intro2
Γ ⇒ φ∨ψ Γ ⇒ φ∨ψ
Γ ⇒ φ∨ψ ∆, φ ⇒ χ ∆′ , ψ ⇒ χ
∨Elim
Γ, ∆, ∆′ ⇒ χ
Γ, φ ⇒ ψ ∆ ⇒ φ→ψ Γ ⇒ φ
→Intro →Elim
Γ ⇒ φ→ψ Γ, ∆ ⇒ ψ
Γ ⇒ ⊥ ⊥
I
Γ ⇒ φ

Release : 6891b66 (2024-12-01) 765


CHAPTER 58. PROPOSITIONS AS TYPES

Any assumption by itself is a derivation of φ from φ, i.e., we always have


φ ⇒ φ.

φ ⇒ φ

Together, these rules can be taken as a calculus about what natural de-
duction derivations exist. They can also be taken as a notational variant of
natural deduction, in which each step records not only the formula derived but
also the undischarged assumptions from which it was derived.
φ ⇒ φ
φ ⇒ φ ∨ (φ → ⊥) ψ ⇒ ψ
φ, ψ→ ⇒ ⊥
(ψ ⇒ φ → ⊥
(ψ ⇒ φ ∨ (φ → ⊥) (ψ ⇒ ψ
(ψ ⇒ ⊥
⇒ ψ→⊥

where ψ is short for (φ ∨ (φ → ⊥)) → ⊥.

content/intuitionistic-logic/propositions-as-types/proof-terms.tex

58.3 Proof Terms


We give the definition of proof terms, and then establish its relation with int:pty:ter:
sec
natural deduction derivations.

Definition 58.1 (Proof terms). Proof terms are inductively generated by


the following rules:

1. A single variable x is a proof term.

2. If P and Q are proof terms, then P Q is also a proof term.

3. If x is a variable, φ is a formula, and N is a proof term, then λxφ . N is


also a proof term.

4. If P and Q are proof terms, then ⟨P, Q⟩ is a proof term.

5. If M is a proof term, then pi (M ) is also a proof term, where i is 1 or 2.

6. If M is a proof term, and φ is a formula, then inφ


i (M ) is a proof term,
where i is 1 or 2.

7. If M, N1 , N2 is proof terms, and x1 , x2 are variables, then case(M, x1 .N1 , x2 .N2 )


is a proof term.

8. If M is a proof term and φ is a formula, then contrφ (M ) is proof term.

766 Release : 6891b66 (2024-12-01)


58.4. CONVERTING DERIVATIONS TO PROOF TERMS

Each of the above rules corresponds to an inference rule in natural deduc-


tion. Thus we can inductively assign proof terms to the formulas in a deriva-
tion. To make this assignment unique, we must distinguish between the two
versions of ∧Elim and of ∨Intro. For instance, the proof terms assigned to the
conclusion of ∨Intro must carry the information whether φ ∨ ψ is inferred from
φ or from ψ. Suppose M is the term assigned to φfrom which φ ∨ ψ is inferred.
Then the proof term assigned to φ ∨ ψ is inφ 1 (M ). If we instead infer ψ ∨ φ
then the proof term assigned is inφ2 (M ).
The term λxφ . N is assigned to the conclusion of →Intro. The φ represents
the assumption being discharged; only have we included it can we infer the
formula of λxφ . N based on the formula of N .
Definition 58.2 (Typing context). A typing context is a mapping from vari-
ables to formulas. We will call it simply the “context” if there is no confusion.
We write a context Γ as a set of pairs ⟨x, φ⟩.

A pair Γ ⇒ M where M is a proof term represents a derivation of a formula


with context Γ .
Definition 58.3 (Typing pair). A typing pair is a pair ⟨Γ, M ⟩, where Γ is
a typing context and M is a proof term.

Since in general terms only make sense with specific contexts, we will speak
simply of “terms” from now on instead of “typing pair”; and it will be apparent
when we are talking about the literal term M .

content/intuitionistic-logic/propositions-as-types/proofs-to-terms.tex

58.4 Converting Derivations to Proof Terms


int:pty:pt: We will describe the process of converting natural deduction derivations to
sec
pairs. We will write a proof term to the left of each formula in the derivation,
resulting in expressions of the form M : φ. We’ll then say that, M witnesses φ.
Let’s call such an expression a judgment.
First let us assign to each assumption a variable, with the following con-
straints:
1. Assumptions discharged in the same step (that is, with the same number
on the square bracket) must be assigned the same variable.
2. For assumptions not discharged, assumptions of different formulas should
be assigned different variables.
Such an assignment translates all assumptions of the form
φ into x : φ.
With assumptions all associated with variables (which are terms), we can now
inductively translate the rest of the deduction tree. The modified natural

Release : 6891b66 (2024-12-01) 767


CHAPTER 58. PROPOSITIONS AS TYPES

deduction rules taking into account context and proof terms are given below.
Given the proof terms for the premise(s), we obtain the corresponding proof
term for conclusion.

M1 : φ 1 M2 : φ2
∧Intro
⟨M1 , M2 ⟩ : φ1 ∧ φ2
M : φ1 ∧ φ2 M : φ1 ∧ φ2
∧Elim1 ∧Elim2
pi (M ) : φ1 pi (M ) : φ2
In ∧Intro we assume we have φ1 witnessed by term M1 and φ2 witnessed
by term M2 . We pack up the two terms into a pair ⟨M1 , M2 ⟩ which witnesses
φ1 ∧ φ2 .
In ∧Elimi we assume that M witnesses φ1 ∧ φ2 . The term witnessing φi
is pi (M ). Note that M is not necessary of the form ⟨M1 , M2 ⟩, so we cannot
simply assign M1 to the conclusion φi .
Note how this coincides with the BHK interpretation. What the BHK
interpretation does not specify is how the function used as proof for φ → ψ is
supposed to be obtained. If we think of proof terms as proofs or functions of
proofs, we can be more explicit.
[x : φ]

P :φ→ψ Q:φ
→Elim
PQ : ψ
N :ψ
→Intro
λxφ . N : φ → ψ
The λ notation should be understood as the same as in the lambda calculus,
and P Q means applying P to Q.

M1 : φ1 M2 : φ2
φ1 ∨Intro1 φ2 ∨Intro2
in1 (M1 ) : φ1 ∨ φ2 in2 (M2 ) : φ1 ∨ φ2
[x1 : φ1 ] [x2 : φ2 ]

M : A1 ∨ φ2 N1 : χ N2 : χ
∨Elim
case(M, x1 .N1 , x2 .N2 ) : χ
The proof term inφ1 (M1 ) is a term witnessing φ1 ∨ φ2 , where M1 witnesses φ1 .
1

The term case(M, x1 .N1 , x2 .N2 ) mimics the case clause in programming
languages: we already have the derivation of φ∨ψ, a derivation of χ assuming φ,
and a derivation of χ assuming ψ. The case operator thus select the appropriate
proof depending on M ; either way it’s a proof of χ.

N :⊥ ⊥I
contrφ (N ) : φ

768 Release : 6891b66 (2024-12-01)


58.4. CONVERTING DERIVATIONS TO PROOF TERMS

contrφ (N ) is a term witnessing φ, whenever N is a term witnessing ⊥.


Now we have a natural deduction derivation with all formulas associated
with a term. At each step, the relevant typing context Γ is given by the list of
assumptions remaining undischarged at that step. Note that Γ is well defined:
since we have forbidden assumptions of different undischarged assumptions to
be assigned the same variable, there won’t be any disagreement about the
formulas mapped to which a variable is mapped.
We now give some examples of such translations:
Consider the derivation of ¬¬(φ ∨ ¬φ), i.e., ((φ ∨ (φ → ⊥)) → ⊥) → ⊥. Its
translation is:

[x : φ]1
[y : (φ ∨ (φ → ⊥)) → ⊥]2 inφ→⊥
1 (x) : φ ∨ (φ → ⊥)
y(inφ→⊥
1 (x)) : ⊥
1
λxφ . y(inφ→⊥
1 (x)) : φ → ⊥
[y : (φ ∨ (φ → ⊥)) → ⊥]2 inφ φ φ→⊥
2 (λx . y(in1 (x))) : φ ∨ (φ → ⊥)
y(inφ φ φ→⊥
2 (λx . yin1 (x))) : ⊥
2
λy (φ∨(φ→⊥))→⊥ . y(inφ φ φ→⊥
2 (λx . yin1 (x))) : ((φ ∨ (φ → ⊥)) → ⊥) → ⊥
The tree has no assumptions, so the context is empty; we get:
⊢ λy (φ∨(φ→⊥))→⊥ . y(inφ φ φ→⊥
2 (λx . yin1 (x))) : ((φ ∨ (φ → ⊥)) → ⊥) → ⊥
If we leave out the last →Intro, the assumptions denoted by y would be in the
context and we would get:
y : ((φ ∨ (φ → ⊥)) → ⊥) ⊢ y(inφ φ φ→⊥
2 (λx . yin1 (x))) : ⊥
Another example: ⊢ φ → (φ → ⊥) → ⊥

[x : φ]2 [y : φ → ⊥]1
yx : ⊥
1
φ→⊥
λy . yx : (φ → ⊥) → ⊥
2
λxφ . λy φ→⊥ . yx : φ → (φ → ⊥) → ⊥
Again all assumptions are discharged and thus the context is empty, the re-
sulting term is
⊢ λxφ . λy φ→⊥ . yx : φ → (φ → ⊥) → ⊥
If we leave out the last two →Intro inferences, the assumptions denoted by
both x and y would be in context and we would get
x : φ, y : φ → ⊥ ⊢ yx : ⊥

content/intuitionistic-logic/propositions-as-types/terms-to-proofs.tex

Release : 6891b66 (2024-12-01) 769


CHAPTER 58. PROPOSITIONS AS TYPES

58.5 Recovering Derivations from Proof Terms


Now let us consider the other direction: translating terms back to natural int:pty:tp:
sec
deduction trees. We will use still use the double refutation of the excluded
middle as example, and let S denote this term, i.e.,

λy (φ∨(φ→⊥))→⊥ . y(inφ φ φ→⊥


2 (λx . yin1 (x))) : ((φ ∨ (φ → ⊥)) → ⊥) → ⊥

For each natural deduction rule, the term in the conclusion is always formed
by wrapping some operator around the terms assigned to the premise(s). Rules
correspond uniquely to such operators. For example, from the structure of the
S we infer that the last rule applied must be →Intro, since it is of the form
λy ... . . . ., and the λ operator corresponds to →Intro. In general we can recover
the skeleton of the derivation solely by the structure of the term, e.g.,

[x]1
∨Intro1
[y :]2 inφ→⊥
1 (x) :
→Elim
y(inφ→⊥
1 (x)) :
1 →Intro
λxφ . y(inφ→⊥
1 (x)) :
∨Intro2
[y :]2 inφ φ φ→⊥
2 (λx . yin1 (x)) :
→Elim
y(inφ φ φ→⊥
2 (λx . yin1 (x))) :
2 →Intro
λy (φ∨(φ→⊥))→⊥ . y(inφ φ φ→⊥
2 (λx . y(in1 (x)))) :
Our next step is to recover the formulas these terms witness. We define a
function F (Γ, M ) which denotes the formula witnessed by M in context Γ , by
induction on M as follows:

F (Γ, x) = Γ (x)
F (Γ, ⟨N1 , N2 ⟩ = F (Γ, N1 ) ∧ F (Γ, N2 )
F (Γ, pi (N ) = φi if F (Γ, N ) = φ1 ∧ φ2
(
φ F (N ) ∨ φ if i = 1
F (Γ, ini (N ) =
φ ∨ F (N ) if i = 2
F (Γ, case(M, x1 .N1 , x2 .N2 )) = F (Γ ∪ {xi : F (Γ, M )}, Ni )
F (Γ, λxφ . N ) = φ → F (Γ ∪ {x : φ}, N )
F (Γ, N M ) = ψ if F (Γ, N ) = φ → ψ

where Γ (x) means the formula mapped to by x in Γ and Γ ∪ {x : φ} is a


context exactly as Γ except mapping x to φ, whether or not x is already in Γ .
Note there are cases where F (Γ, M ) is not defined, for example:
1. In the first line, it is possible that x is not in Γ .
2. In recursive cases, the inner invocation may be undefined, making the
outer one undefined too.

770 Release : 6891b66 (2024-12-01)


58.5. RECOVERING DERIVATIONS FROM PROOF TERMS

3. In the third line, its only defined when F (Γ, M ) is of the form φ1 ∨ φ2 ,
and the right hand is independent on i.

As we recursively compute F (Γ, M ), we work our way up the natural deduc-


tion derivation. The every step in the computation of F (Γ, M ) corresponds to
a term in the derivation to which the derivation-to-term translation assigns M ,
and the formula computed is the end-formula of the derivation. However, the
result may not be defined for some choices of Γ . We say that such pairs ⟨Γ, M ⟩
are ill-typed, and otherwise well-typed. However, if the term M results from
translating a derivation, and the formulas in Γ correspond to the undischarged
assumptions of the derivation, the pair ⟨Γ, M ⟩ will be well-typed.

Proposition 58.4. If D is a derivation with undischarged assumptions φ1 ,


. . . , φn , M is the proof term associated with D and Γ = {x1 : φ1 , . . . , xn : φn },
then the result of recovering derivation from M in context Γ is D.

In the other direction, if we first translate a typing pair to natural deduction


and then translate it back, we won’t get the same pair back since the choice of
variables for the undischarged assumptions is underdetermined. For example,
consider the pair ⟨{x : φ, y : φ → ψ}, yx⟩. The corresponding derivation is

φ→ψ φ
→Elim
ψ

By assigning different variables to the undischarged assumptions, say, u to


φ → ψ and v to φ, we would get the term uv rather than yx. There is a
connection, though: the terms will be the same up to renaming of variables.
Now we have established the correspondence between typing pairs and nat-
ural deduction, we can prove theorems for typing pairs and transfer the result
to natural deduction derivations.
Similar to what we did in the natural deduction section, we can make some
observations here too. Let Γ ⊢ M : φ denote that there is a pair (Γ, M )
witnessing the formula φ. Then always Γ ⊢ x : φ if x : φ ∈ Γ , and the
following rules are valid:

Γ ⊢ M1 : φ1 ∆ ⊢ M2 : φ2 Γ ⊢ M : φ1 ∧ φ2
∧Intro ∧Elimi
Γ, ∆ ⊢ ⟨M1 , M2 ⟩ : φ1 ∧ φ2 Γ ⊢ pi (M ) : φi
Γ ⊢ M1 : φ1 Γ ⊢ M2 : φ 2
∨Intro1 ∨Intro2
Γ ⊢ inφ1
2
(M ) : φ 1 ∨ φ 2 Γ ⊢ inφ1
2 (M ) : φ1 ∨ φ2
Γ ⊢M :φ∨ψ ∆1 , x1 : φ1 ⊢ N1 : χ ∆2 , x2 : φ2 ⊢ N2 : χ
′ ∨Elim
Γ, ∆, ∆ ⊢ case(M, x1 .N1 , x2 .N2 ) : χ
Γ, x : φ ⊢ N : ψ Γ ⊢Q:φ ∆⊢P :φ→ψ
φ →Intro →Elim
Γ ⊢ λx . N : φ → ψ Γ, ∆ ⊢ P Q : ψ
Γ ⊢M :⊥
⊥Elim
Γ ⊢ contrφ (M ) : φ

Release : 6891b66 (2024-12-01) 771


CHAPTER 58. PROPOSITIONS AS TYPES

These are the typing rules of the simply typed lambda calculus extended
with product, sum and bottom.
In addition, the F (Γ, M ) is actually a type checking algorithm; it returns
the type of the term with respect to the context, or is undefined if the term is
ill-typed with respect to the context.

content/intuitionistic-logic/propositions-as-types/reduction.tex

58.6 Reduction
In natural deduction derivations, an introduction rule that is followed by an int:pty:red:
sec
elimination rule is redundant. For instance, the derivation

φ φ→ψ
→Elim
ψ [χ]
∧Intro
ψ∧χ
∧Elim
ψ
→Intro
χ→ψ

can be replaced with the simpler derivation:

φ φ→ψ
→Elim
ψ
→Intro
χ→ψ

As we see, an ∧Intro followed by ∧Elim “cancel out.” In general, we see that


the conclusion of ∧Elim is always the formula on one side of the conjunction,
and the premises of ∧Intro requires both sides of the conjunction, thus if we
need a derivation of either side, we can simply use that derivation without
introducing the conjunction followed by eliminating it.
Thus in general we have

D1 D2
φ1 φ2 Di
φ1 ∧ φ2 ∧Intro
φi ∧Elimi →
− φi

The →− symbol has a similar meaning as in the lambda calculus, i.e., a


single step of a reduction. In the proof term syntax for derivations, the above
reduction rule thus becomes:

(Γ, pi ⟨M1φ1 , M2φ2 ⟩) →


− (Γ, Mi )

In the typed lambda calculus, this is the beta reduction rule for the product
type.

772 Release : 6891b66 (2024-12-01)


58.6. REDUCTION

Note the type annotation on M1 and M2 : while in the standard term syntax
only λxφ . N has such notion, we reuse the notation here to remind us of the
formula the term is associated with in the corresponding natural deduction
derivation, to reveal the correspondence between the two kinds of syntax.
In natural deduction, a pair of inferences such as those on the left, i.e., a
pair that is subject to cancelling is called a cut. In the typed lambda calculus
the term on the left of →
− is called a redex, and the term to the right is called the
reductum. Unlike untyped lambda calculus, where only (λx. N )Q is considered
to be redex, in the typed lambda calculus the syntax is extended to terms
involving ⟨N, M ⟩, pi (N ), inφ
i (N ), case(N, x1 .M1 , x2 .M2 ), and contrN (), with
corresponding redexes.
Similarly we have reduction for disjunction:

D
[φ1 ]u [φ2 ]u
D φi
D1 D2
φi Di
φ1 ∨ φ2 ∨Intro χ χ
u
χ ∨Elim →
− χ
This corresponds to a reduction on proof terms:
(Γ, case(inφi φi φ1 χ φ2 χ
− (Γ, Niχ [M φi /xφ
i (M ), x1 .N1 , x2 .N2 )) →
i
i ])

This is the beta reduction rule of for sum types. Here, M [N/x] means replacing
all assumptions denoted by variable x in M with N ,
It would be nice if we pass the context Γ to the substitution function so
that it can check if the substitution makes sense. For example, xy[ab/y] does
not make sense under the context {x : φ → θ, y : φ, a : ψ → χ, b : ψ} since then
we would be substituting a derivation of χ where a derivation of φ is expected.
However, as long as our usage of substitution is careful enough to avoid such
errors, we won’t have to worry about such conflicts. Thus we can define it
recursively as we did for untyped lambda calculus as if we are dealing with
untyped terms.
Finally, the reduction of the function type corresponds to removal of a
detour of a →Intro followed by a →Elim.

[φ]u
D′
D φ
ψ D′
u →Intro D
φ→ψ φ
→Elim
ψ →
− ψ
For proof terms, this amounts to ordinary beta reduction:
(Γ, (λxφ . N ψ )Qφ ) →
− (Γ, N ψ [Qφ /xφ ])

Release : 6891b66 (2024-12-01) 773


CHAPTER 58. PROPOSITIONS AS TYPES

Absurdity has only an elimination rule and no introduction rule, thus there
is no such reduction for it.
Note that the above notion of reduction concerns only deductions with a cut
at the end of a derivation. We would of course like to extend it to reduction
of cuts anywhere in a derivation, or reductions of subterms of proof terms
which constitute redexes. Note that, however, the conclusion of the reduction
does not change after reduction, thus we are free to continue applying rules to
both sides of →− . The resulting pairs of trees constitutes an extended notion of
reduction; it is analogous to compatibility in the untyped lambda calculus.
It’s easy to see that the context Γ does not change during the reduction
(both the original and the extended version), thus it’s unnecessary to mention
the context when we are discussing reductions. In what follows we will assume
that every term is accompanied by a context which does no change during
reduction. We then say “proof term” when we mean a proof term accompanied
by a context which makes it well-typed.
As in lambda calculus, the notion of normal-form term and normal deduc-
tion is given:

Definition 58.5. A proof term with no redex is said to be in normal form;


likewise, a derivation without cuts is a normal derivation. A proof term is in
normal form if and only if its counterpart derivation is normal.

content/intuitionistic-logic/propositions-as-types/normalization.tex

58.7 Normalization
In this section we prove that, via some reduction order, any deduction can int:pty:nor:
sec
be reduced to a normal deduction, which is called the normalization property.
We will make use of the propositions-as-types correspondence: we show that
every proof term can be reduced to a normal form; normalization for natural
deduction derivations then follows.
Firstly we define some functions that measure the complexity of terms. The
length len(φ) of a formulas is defined by

len(p) = 0
len(φ ∧ ψ) = len(φ) + len(ψ) + 1
len(φ ∨ ψ) = len(φ) + len(ψ) + 1
len(φ → ψ) = len(φ) + len(ψ) + 1.

The complexity of a redex M is measured by its cut rank cr(M ):

cr((λxφ . N ψ )Q) = len(φ) + len(ψ) + 1


cr(pi (⟨M φ , N ψ ⟩)) = len(φ) + len(ψ) + 1
cr(case(ini (M φi ), xφ
φi 1 χ φ2 χ
1 .N1 , x2 .N2 )) = len(φ) + len(ψ) + 1

774 Release : 6891b66 (2024-12-01)


58.7. NORMALIZATION

The complexity of a proof term is measured by the most complex redex in it,
and 0 if it is normal:

mr(M ) = max{cr(N )|N is a sub term of M and is redex}

int:pty:nor: Lemma 58.6. If M [N φ /xφ ] is a redex and M ̸≡ x, then one of the following
lem:subst
cases holds:

1. M is itself a redex, or

2. M is of the form pi (x), and N is of the form ⟨P1 , P2 ⟩

3. M is of the form case(i, x1 .P1 , x2 .P2 ), and N is of the form ini (Q)

4. M is of the form xQ, and N is of the form λx. P

In the first case, cr(M [N/x]) = cr(M ); in the other cases, cr(M [N/x]) =
len(φ)).

Proof. Proof by induction on M .

1. If M is a single variable y and y ̸≡ x, then y[N/x] is y, hence not a redex.

2. If M is of the form ⟨N1 , N2 ⟩, or λx. N , or inφ φ φ


i (N ), then M [N /x ] is also
of that form, and so is not a redex.

3. If M is of the form pi (P ), we consider two cases.

a) If P is of the form ⟨P1 , P2 ⟩, then M ≡ pi (⟨P1 , P2 ⟩) is a redex, and


clearly
M [N/x] ≡ pi (⟨P1 [N/x], P2 [N/x]⟩)
is also a redex. The cut ranks are equal.
b) If P is a single variable, it must be x to make the substitution a
redex, and N must be of the form ⟨P1 , P2 ⟩. Now consider

M [N/x] ≡ pi (x)[⟨P1 , P2 ⟩/x],

which is pi (⟨P1 , P2 ⟩). Its cut rank is equal to cr(x), which is len(φ).

The cases of case(N, x1 .N1 , x2 .N2 ) and P Q are similar.

Lemma 58.7. If M contracts to M ′ , and cr(M ) > cr(N ) for all proper redex
sub-terms N of M , then cr(M ) > mr(M ′ ).

Proof. Proof by cases.

1. If M is of the form pi (⟨M1 , M2 ⟩), then M ′ is Mi ; since any sub-term of


Mi is also proper sub-term of M , the claim holds.

Release : 6891b66 (2024-12-01) 775


CHAPTER 58. PROPOSITIONS AS TYPES

2. If M is of the form (λxφ . N )Qφ , then M ′ is N [Qφ /xφ ]. Consider a redex


in M ′ . Either there is corresponding redex in N with equal cut rank,
which is less than cr(M ) by assumption, or the cut rank equals len(φ),
which by definition is less than cr((λxφ . N )Q).
3. If M is of the form

case(ini (N φi ), xφ1 χ φ2 χ
1 .N1 , x2 .N2 ),

then M ′ ≡ Ni [N/xφ i ′
i ]. Consider a redex in M . Either there is corre-
sponding redex in Ni with equal cut rank, which is less than cr(M ) by
assumption; or the cut rank equals len(φi ), which by definition is less
than cr(case(ini (N φi ), xφ1 χ φ2 χ
1 .N1 , x2 .N2 )).

Theorem 58.8. All proof terms reduce to normal form; all derivations reduce
to normal derivations.

Proof. The second follows from the first. We prove the first by complete in-
duction on m = mr(M ), where M is a proof term.

1. If m = 0, M is already normal.
2. Otherwise, we proceed by induction on n, the number of redexes in M
with cut rank equal to m.
a) If n = 1, select any redex N such that m = cr(N ) > cr(P ) for any
proper sub-term P which is also a redex of course. Such a redex
must exist, since any term only has finitely many subterms.
Let N ′ denote the reductum of N . Now by the lemma mr(N ′ ) <
mr(N ), thus we can see that n, the number of redexes with cr(=)m
is decreased. So m is decreased (by 1 or more), and we can apply
the inductive hypothesis for m.
b) For the induction step, assume n > 1. the process is similar, except
that n is only decreased to a positive number and thus m does not
change. We simply apply the induction hypothesis for n.

The normalization of terms is actually not specific to the reduction order


we chose. In fact, one can prove that regardless of the order in which redexes
are reduced, the term always reduces to a normal form. This property is called
strong normalization.

776 Release : 6891b66 (2024-12-01)


Part XIII

Counterfactuals

Chapter 59

Introduction

content/counterfactuals/introduction/material-conditional.tex

59.1 The Material Conditional


cnt:int:mat: In its simplest form in English, a conditional is a sentence of the form “If
sec
. . . then . . . ,” where the . . . are themselves sentences, such as “If the butler
did it, then the gardener is innocent.” In introductory logic courses, we earn to
symbolize conditionals using the → connective: symbolize the parts indicated
by . . . , e.g., by formulas φ and ψ, and the entire conditional is symbolized by
φ → ψ.
The connective → is truth-functional, i.e., the truth value—T or F—of φ→ψ
is determined by the truth values of φ and ψ: φ → ψ is true iff φ is false or ψ is
true, and false otherwise. Relative to a truth value assignment v, we define
v ⊨ φ → ψ iff v ⊭ φ or v ⊨ ψ. The connective → with this semantics is called
the material conditional.
This definition results in a number of elementary logical facts. First of all,
the deduction theorem holds for the material conditional:

If Γ, φ ⊨ ψ then Γ ⊨ φ → ψ (59.1)

It is truth-functional: φ → ψ and ¬φ ∨ ψ are equivalent:

φ → ψ ⊨ ¬φ ∨ ψ (59.2)
¬φ ∨ ψ ⊨ φ → ψ (59.3)

777
CHAPTER 59. INTRODUCTION

A material conditional is entailed by its consequent and by the negation of its


antecedent:

ψ ⊨φ→ψ (59.4)
¬φ ⊨ φ → ψ (59.5)

A false material conditional is equivalent to the conjunction of its antecedent


and the negation of its consequent: if φ → ψ is false, φ ∧ ¬ψ is true, and vice
versa:

¬(φ → ψ) ⊨ φ ∧ ¬ψ (59.6)
φ ∧ ¬ψ ⊨ ¬(φ → ψ) (59.7)

The material conditional supports modus ponens:

φ, φ → ψ ⊨ ψ (59.8)

The material conditional agglomerates:

φ → ψ, φ → χ ⊨ φ → (ψ ∧ χ) (59.9)

We can always strengthen the antecedent, i.e., the conditional is monotonic:

φ → ψ ⊨ (φ ∧ χ) → ψ (59.10)

The material conditional is transitive, i.e., the chain rule is valid:

φ → ψ, ψ → χ ⊨ φ → χ (59.11)

The material conditional is equivalent to its contrapositive:

φ → ψ ⊨ ¬ψ → ¬φ (59.12)
¬ψ → ¬φ ⊨ φ → ψ (59.13)

These are all useful and unproblematic inferences in mathematical rea-


soning. However, the philosophical and linguistic literature is replete with
purported counterexamples to the equivalent inferences in non-mathematical
contexts. These suggest that the material conditional → is not—or at least
not always—the appropriate connective to use when symbolizing English “if
. . . then . . . ” statements.

content/counterfactuals/introduction/paradoxes-material.tex

778 Release : 6891b66 (2024-12-01)


59.2. PARADOXES OF THE MATERIAL CONDITIONAL

59.2 Paradoxes of the Material Conditional


cnt:int:par: One of the first to criticize the use of φ → ψ as a way to symbolize “if . . . then
sec
. . . ” statements of English was C. I. Lewis. Lewis was criticizing the use
of the material condition in Whitehead and Russell’s Principia Mathematica,
who pronounced → as “implies.” Lewis rightly complained that if → meant
“implies,” then any false proposition p implies that p implies q, since p→(p→q)
is true if p is false, and that any true proposition q implies that p implies q,
since q → (p → q) is true if q is true.
Logicians of course know that implication, i.e., logical entailment, is not a
connective but a relation between formulas or statements. So we should just
not read → as “implies” to avoid confusion.1 As long as we don’t, the particular
worry that Lewis had simply does not arise: p does not “imply” q even if we
think of p as standing for a false English sentence. To determine if p ⊨ q we
must consider all valuations, and p ⊭ q even when we use p to symbolize a
sentence which happens to be false.
But there is still something odd about “if . . . then. . . ” statements such as
Lewis’s
If the moon is made of green cheese, then 2 + 2 = 4.
and about the inferences
The moon is not made of green cheese. Therefore, if the moon is
made of green cheese, then 2 + 2 = 4.
2 + 2 = 4. Therefore, if the moon is made of green cheese, then
2 + 2 = 4.
Yet, if “if . . . then . . . ” were just →, the sentence would be unproblematically
true, and the inferences unproblematically valid.
Another example of concerns the tautology (φ → ψ) ∨ (ψ → φ). This would
suggest that if you take two indicative sentences S and T from the newspaper
at random, the sentence “If S then T , or if T then S” should be true.

content/counterfactuals/introduction/strict-conditional.tex

59.3 The Strict Conditional


cnt:int:str: Lewis introduced the strict conditional J and argued that it, not the material
sec
conditional, corresponds to implication. In alethic modal logic, φ J ψ can be
defined as □(φ → ψ). A strict conditional is thus true (at a world) iff the
corresponding material conditional is necessary.
How does the strict conditional fare vis-a-vis the paradoxes of the material
conditional? A strict conditional with a false antecedent and one with a true
1 Reading “→” as “implies” is still widely practised by mathematicians and computer

scientists, although philosophers try to avoid the confusions Lewis highlighted by pronouncing
it as “only if.”

Release : 6891b66 (2024-12-01) 779


CHAPTER 59. INTRODUCTION

consequent, may be true, or it may be false. Moreover, (φ J ψ) ∨ (ψ J φ) is


not valid. The strict conditional φ J ψ is also not equivalent to ¬φ ∨ ψ, so it
is not truth functional.
We have:

φ J ψ ⊨ ¬φ ∨ ψ but: (59.14)
¬φ ∨ ψ ⊭ φ J ψ (59.15)
ψ⊭φJψ (59.16)
¬φ ⊭ φ J ψ (59.17)
¬(φ → ψ) ⊭ φ ∧ ¬ψ but: (59.18)
φ ∧ ¬ψ ⊨ ¬(φ J ψ) (59.19)

However, the strict conditional still supports modus ponens:

φ, φ J ψ ⊨ ψ (59.20)

The strict conditional agglomerates:

φ J ψ, φ J χ ⊨ φ J (ψ ∧ χ) (59.21)

Antecedent strengthening holds for the strict conditional:

φ J ψ ⊨ (φ ∧ χ) J ψ (59.22)

The strict conditional is also transitive:

φ J ψ, ψ J χ ⊨ φ J χ (59.23)

Finally, the strict conditional is equivalent to its contrapositive:

φ J ψ ⊨ ¬ψ J ¬φ (59.24)
¬ψ J ¬φ ⊨ φ J ψ (59.25)

Problem 59.1. Give S5-counterexamples to the entailment relations which


do not hold for the strict conditional, i.e., for:
1. ¬p ⊭ □(p → q)
2. q ⊭ □(p → q)
3. ¬□(p → q) ⊭ p ∧ ¬q
4. ⊭ □(p → q) ∨ □(q → p)

Problem 59.2. Show that the valid entailment relations hold for the strict
conditional by giving S5-proofs of:

780 Release : 6891b66 (2024-12-01)


59.3. THE STRICT CONDITIONAL

1. □(φ → ψ) ⊨ ¬φ ∨ ψ

2. φ ∧ ¬ψ ⊨ ¬□(φ → ψ)

3. φ, □(φ → ψ) ⊨ ψ

4. □(φ → ψ), □(φ → χ) ⊨ □(φ → (ψ ∧ χ))

5. □(φ → ψ) ⊨ □((φ ∧ χ) → ψ)

6. □(φ → ψ), □(ψ → χ) ⊨ □(φ → χ)

7. □(φ → ψ) ⊨ □(¬ψ → ¬φ)

8. □(¬ψ → ¬φ) ⊨ □(φ → ψ)

However, the strict conditional still has its own “paradoxes.” Just as a
material conditional with a false antecedent or a true consequent is true, a strict
conditional with a necessarily false antecedent or a necessarily true consequent
is true. Moreover, any true strict conditional is necessarily true, and any false
strict conditional is necessarily false. In other words, we have

□¬φ ⊨ φ J ψ (59.26)
□ψ ⊨ φ J ψ (59.27)
φ J ψ ⊨ □(φ J ψ) (59.28)
¬(φ J ψ) ⊨ □¬(φ J ψ) (59.29)

These are not problems if you think of J as “implies.” Logical entailment


relationships are, after all, mathematical facts and so can’t be contingent. But
they do raise issues if you want to use J as a logical connective that is supposed
to capture “if . . . then . . . ,” especially the last two. For surely there are “if
. . . then . . . ” statements that are contingently true or contingently false—in
fact, they generally are neither necessary nor impossible.

Problem 59.3. Give proofs in S5 of:

1. □¬φ ⊨ φ J ψ

2. φ J ψ ⊨ □(φ J ψ)

3. ¬(φ J ψ) ⊨ □¬(φ J ψ)

Use the definition of J to do so.

content/counterfactuals/introduction/counterfactuals.tex

Release : 6891b66 (2024-12-01) 781


CHAPTER 59. INTRODUCTION

59.4 Counterfactuals
A very common and important form of “if . . . then . . . ” constructions in En- cnt:int:cnt:
sec
glish are built using the past subjunctive form of to be: “if it were the case that
. . . then it would be the case that . . . ” Because usually the antecedent of such
a conditional is false, i.e., counter to fact, they are called counterfactual con-
ditionals (and because they use the subjunctive form of to be, also subjunctive
conditionals. They are distinguished from indicative conditionals which take
the form of “if it is the case that . . . then it is the case that . . . ” Counterfac-
tual and indicative conditionals differ in truth conditions. Consider Adams’s
famous example:

If Oswald didn’t kill Kennedy, someone else did.


If Oswald hadn’t killed Kennedy, someone else would have.

The first is indicative, the second counterfactual. The first is clearly true: we
know President John F. Kennedy was killed by someone, and if that someone
wasn’t (contrary to the Warren Report) Lee Harvey Oswald, then someone
else killed Kennedy. The second one says something different. It claims that
if Oswald hadn’t killed Kennedy, i.e., if the Dallas shooting had been avoided
or had been unsuccessful, history would have subsequently unfolded in such
a way that another assassination would have been successful. In order for it
to be true, it would have to be the case that powerful forces had conspired to
ensure JFK’s death (as many JFK conspiracy theorists believe).
It is a live debate whether the indicative conditional is correctly captured
by the material conditional, in particular, whether the paradoxes of the mate-
rial conditional can be “explained” in a way that is compatible with it giving
the truth conditions for English indicative conditionals. By contrast, it is un-
controversial that counterfactual conditionals cannot be symbolized correctly
by the material conditionals. That is clear because, even though generally the
antecedents of counterfactuals are false, not all counterfactuals with false an-
tecedents are true—for instance, if you believe the Warren Report, and there
was no conspiracy to assassinate JFK, then Adams’s counterfactual conditional
is an example of a false counterfactual with a false antecedent.
Counterfactual conditionals play an important role in causal reasoning: a
prime example of the use of counterfactuals is to express causal relationships.
E.g., striking a match causes it to light, and you can express this by saying
“if this match were struck, it would light.” Material, and generally indicative
conditionals, cannot be used to express this: “the match is struck → the match
lights” is true if the match is never struck, regardless of what would happen if
it were. Even worse, “the match is struck → the match turns into a bouquet
of flowers” is also true if it is never struck, but the match would certainly not
turn into a bouquet of flowers if it were struck.
It is still debated what exactly the correct logic of counterfactuals is. An
influential analysis of counterfactuals was given by Stalnaker and Lewis. Ac-
cording to them, a counterfactual “if it were the case that S then it would be
the case that T ” is true iff T is true in the counterfactual situation (“possible
world”) that is closest to the way the actual world is and where S is true.
This is called an “ontic” analysis, since it makes reference to an ontology of
possible worlds. Other analyses make use of conditional probabilities or the-
ories of belief revision. There is a proliferation of different proposed logics of
counterfactuals. There isn’t even a single Lewis–Stalnaker logic of counterfac-
tuals: even though Stalnaker and Lewis proposed accounts along similar lines
with reference to closest possible worlds, the assumptions they made result in
different valid inferences.

Chapter 60

Minimal Change Semantics


60.1 Introduction
Stalnaker and Lewis proposed accounts of counterfactual conditionals such as
“If the match were struck, it would light.” Their accounts were proposals for
how to properly understand the truth conditions for such sentences. The idea
behind both proposals is this: to evaluate whether a counterfactual conditional
is true, we have to consider those possible worlds which are minimally differ-
ent from the way the world actually is to make the antecedent true. If the
consequent is true in these possible worlds, then the counterfactual is true.
For instance, suppose I hold a match and a matchbook in my hand. In the
actual world I only look at them and ponder what would happen if I were to
strike the match. The minimal change from the actual world where I strike the
match is that where I decide to act and strike the match. It is minimal in that
nothing else changes: I don’t also jump in the air, striking the match doesn’t
also light my hair on fire, I don’t suddenly lose all strength in my fingers, I am
not simultaneously doused with water in a SuperSoaker ambush, etc. In that
alternative possibility, the match lights. Hence, it’s true that if I were to strike
the match, it would light.
This intuitive account can be paired with formal semantics for logics of
counterfactuals. Lewis introduced the symbol “□→” for the counterfactual


while Stalnaker used the symbol “>”. We’ll use □→, and add it as a binary
connective to propositional logic. So, we have, in addition to formulas of the
form φ → ψ, also formulas of the form φ □→ ψ. The formal semantics, like the
relational semantics for modal logic, is based on models in which formulas are
evaluated at worlds, and the satisfaction condition defining M, w ⊩ φ □→ ψ is
given in terms of M, w′ ⊩ φ and M, w′ ⊩ ψ for some (other) worlds w′ . Which
w′ ? Intuitively, the one(s) closest to w for which it holds that M, w′ ⊩ φ. This
requires that a relation of “closeness” has to be included in the model as well.
Lewis introduced an instructive way of representing counterfactual situa-
tions graphically. Each possible world is at the center of a set of nested spheres
containing other worlds—we draw these spheres as concentric circles. The
worlds between two spheres are equally close to the world at the center as each
other, those contained in a nested sphere are closer, and those in a surrounding
sphere further away.

[Diagram: nested spheres around the world w, with the φ-worlds as a region overlapping the outer spheres and the closest φ-worlds shaded gray]

The closest φ-worlds are those worlds w′ where φ is satisfied which lie in the
smallest sphere around the center world w (the gray area). Intuitively, φ □→ ψ
is satisfied at w if ψ is true at all closest φ-worlds.


60.2 Sphere Models


One way of providing a formal semantics for counterfactuals is to turn Lewis’s
informal account into a mathematical structure. The spheres around a world w
then are sets of worlds. Since the spheres are nested, the sets of worlds around w
have to be linearly ordered by the subset relation.

Definition 60.1. A sphere model is a triple M = ⟨W, O, V ⟩ where W is a non-
empty set of worlds, V : At0 → ℘(W ) is a valuation, and O : W → ℘(℘(W ))
assigns to each world w a system of spheres Ow . For each w, Ow is a set of
sets of worlds, and must satisfy:

1. Ow is centered on w: {w} ∈ Ow .

2. Ow is nested: whenever S1 , S2 ∈ Ow , S1 ⊆ S2 or S2 ⊆ S1 , i.e., Ow is
linearly ordered by ⊆.


[Figure 60.1: Diagram of a sphere model. Three nested spheres around w: the innermost contains only w; the next adds w1 , w2 , w3 ; the outermost adds w4 , w5 , w6 . The p-region contains w5 , w6 , w7 ; the world w7 lies outside every sphere around w.]

3. Ow is closed under non-empty unions.

4. Ow is closed under non-empty intersections.

The intuition behind Ow is that the worlds “around” w are stratified ac-
cording to how far away they are from w. The innermost sphere is just w by
itself, i.e., the set {w}: w is closer to w than the worlds in any other sphere. If
S ⊊ S ′ , then the worlds in S ′ \ S are further away from w than the worlds in S:
S ′ \ S is the “layer” between S and the worlds outside of S ′ . In particular,
we have to think of the spheres as containing all the worlds within their outer
surface; they are not just the individual layers.
The diagram in Figure 60.1 corresponds to the sphere model with W =
{w, w1 , . . . , w7 }, V (p) = {w5 , w6 , w7 }. The innermost sphere S1 = {w}. The
closest worlds to w are w1 , w2 , w3 , so the next larger sphere is S2 = {w, w1 , w2 , w3 }.
The worlds further out are w4 , w5 , w6 , so the outermost sphere is S3 =
{w, w1 , . . . , w6 }. The system of spheres around w is Ow = {S1 , S2 , S3 }. The
world w7 is not in any sphere around w. The closest worlds in which p is true
are w5 and w6 , and so the smallest p-admitting sphere is S3 .
To define satisfaction of a formula φ at world w in a sphere model M,
M, w ⊩ φ, we expand the definition for modal formulas to include a clause for
ψ □→ χ:

Definition 60.2. M, w ⊩ ψ □→ χ iff either

1. for all u ∈ ⋃Ow , M, u ⊮ ψ, or

2. for some S ∈ Ow ,

   a) M, u ⊩ ψ for some u ∈ S, and

   b) for all v ∈ S, either M, v ⊮ ψ or M, v ⊩ χ.


[Figure 60.2: Non-vacuously true counterfactual]

According to this definition, M, w ⊩ ψ □→ χ iff either the antecedent ψ
is false everywhere in the spheres around w, or there is a sphere S where ψ
is true, and the material conditional ψ → χ is true at all worlds in that “ψ-
admitting” sphere. Note that we didn’t require in the definition that S is the
innermost ψ-admitting sphere, contrary to what one might expect from the
intuitive explanation. But if the condition in (2) is satisfied for some sphere S,
then it is also satisfied for every ψ-admitting sphere that S contains, and hence
in particular for the innermost one.
Note also that the definition of sphere models does not require that there
is an innermost ψ-admitting sphere: we may have an infinite sequence S1 ⊋
S2 ⊋ · · · ⊋ {w} of ψ-admitting spheres, and hence no innermost ψ-admitting
spheres. In that case, M, w ⊩ ψ □→ χ iff ψ → χ holds throughout the spheres
Si , Si+1 , . . . , for some i.
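
Definition 60.2 is completely effective on finite models, so it can be checked mechanically. Here is a minimal sketch in Python (the formula encoding and the names holds, O, V are our own, not the text’s); it renders the two clauses of Definition 60.2 literally and checks them against the model of Figure 60.1.

def holds(w, formula, O, V):
    """Evaluate M, w ⊩ formula in a sphere model (a literal rendering of
    Definition 60.2). An atom is a string; compound formulas are tuples
    ('not', A), ('and', A, B), ('to', A, B) for A → B, and
    ('cf', A, B) for A □→ B."""
    if isinstance(formula, str):                 # atomic formula
        return w in V.get(formula, set())
    op = formula[0]
    if op == 'not':
        return not holds(w, formula[1], O, V)
    if op == 'and':
        return holds(w, formula[1], O, V) and holds(w, formula[2], O, V)
    if op == 'to':                               # material conditional
        return (not holds(w, formula[1], O, V)) or holds(w, formula[2], O, V)
    if op == 'cf':
        A, B = formula[1], formula[2]
        everywhere = set().union(*O[w])          # all worlds in any sphere
        if not any(holds(u, A, O, V) for u in everywhere):
            return True                          # clause (1): vacuously true
        return any(                              # clause (2): some sphere S ...
            any(holds(u, A, O, V) for u in S)    # ... that admits A, where
            and all((not holds(v, A, O, V)) or holds(v, B, O, V)
                    for v in S)                  # ... A → B holds throughout
            for S in O[w])
    raise ValueError(f"unknown formula: {formula!r}")

# The model of Figure 60.1: V(p) = {w5, w6, w7}, spheres S1 ⊆ S2 ⊆ S3.
O = {'w': [{'w'},
           {'w', 'w1', 'w2', 'w3'},
           {'w', 'w1', 'w2', 'w3', 'w4', 'w5', 'w6'}]}
V = {'p': {'w5', 'w6', 'w7'}}
assert holds('w', ('cf', 'p', 'p'), O, V)            # trivially, p □→ p
assert not holds('w', ('cf', 'p', ('not', 'p')), O, V)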


60.3 Truth and Falsity of Counterfactuals


A counterfactual φ □→ ψ is (non-vacuously) true if the closest φ-worlds are all
ψ-worlds, as depicted in Figure 60.2. A counterfactual is also true at w if the
system of spheres around w has no φ-admitting spheres at all. In that case it
is vacuously true (see Figure 60.3).
It can be false in two ways. One way is if the closest φ-worlds are not
all ψ-worlds, but some of them are. In this case, φ □→ ¬ψ is also false (see
Figure 60.4). If the closest φ-worlds do not overlap with the ψ-worlds at all,
then φ □→ ψ is false. But, in this case all the closest φ-worlds are ¬ψ-worlds,
and so φ □→ ¬ψ is true (see Figure 60.5).
In contrast to the strict conditional, counterfactuals may be contingent.
Consider the sphere model in Figure 60.6. The φ-worlds closest to u are all
ψ-worlds, so M, u ⊩ φ □→ ψ. But there are φ-worlds closest to v which are not
ψ-worlds, so M, v ⊮ φ □→ ψ.


[Figure 60.3: Vacuously true counterfactual]

[Figure 60.4: False counterfactual, false opposite]

[Figure 60.5: False counterfactual, true opposite]


[Figure 60.6: Contingent counterfactual, with two worlds u and v and their respective systems of spheres]


60.4 Antecedent Strengthening


“Strengthening the antecedent” refers to the inference φ → χ ⊨ (φ ∧ ψ) → χ. It
is valid for the material conditional, but invalid for counterfactuals. Suppose
it is true that if I were to strike this match, it would light. (That means, there
is nothing wrong with the match or the matchbook surface, I will not break
the match, etc.) But it is not true that if I were to strike this match in outer
space, it would light. So the following inference is invalid:

If the match were struck, it would light.


Therefore, if the match were struck in outer space, it would light.

The Lewis–Stalnaker account of conditionals explains this: the closest world
where I strike the match and do so in outer space is much further removed
from the actual world than the closest world where I strike the match is. So
although it’s true that the match lights in the latter, it does not in the former.
And that is as it should be.

Example 60.3. The sphere semantics invalidates the inference, i.e., we have
p □→ r ⊭ (p ∧ q) □→ r. Consider the model M = ⟨W, O, V ⟩ where W =
{w, w1 , w2 }, Ow = {{w}, {w, w1 }, {w, w1 , w2 }}, V (p) = {w1 , w2 }, V (q) =
{w2 }, and V (r) = {w1 }. There is a p-admitting sphere S = {w, w1 } and p → r
is true at all worlds in it, so M, w ⊩ p □→ r. There is also a (p ∧ q)-admitting
sphere S ′ = {w, w1 , w2 } but M, w2 ⊮ (p ∧ q) → r, so M, w ⊮ (p ∧ q) □→ r (see
Figure 60.7).
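
As a usage example (assuming the holds sketch from section 60.2 is in scope), this counterexample can be checked directly:

# Example 60.3's model, in the encoding used by the sketch in section 60.2:
O = {'w': [{'w'}, {'w', 'w1'}, {'w', 'w1', 'w2'}]}
V = {'p': {'w1', 'w2'}, 'q': {'w2'}, 'r': {'w1'}}

assert holds('w', ('cf', 'p', 'r'), O, V)                    # p □→ r holds
assert not holds('w', ('cf', ('and', 'p', 'q'), 'r'), O, V)  # (p ∧ q) □→ r fails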



[Figure 60.7: Counterexample to antecedent strengthening. Spheres {w}, {w, w1 }, {w, w1 , w2 } around w, with w1 , w2 in the p-region and w2 in the q-region.]

60.5 Transitivity
For the material conditional, the chain rule holds: φ → ψ, ψ → χ ⊨ φ → χ.
In other words, the material conditional is transitive. Is the same true for
counterfactuals? Consider the following example due to Stalnaker.
If J. Edgar Hoover had been born a Russian, he would have been a
Communist.
If J. Edgar Hoover were a Communist, he would have been a traitor.
Therefore, if J. Edgar Hoover had been born a Russian, he would
have been a traitor.
If Hoover had been born (at the same time he actually did), not in the United
States, but in Russia, he would have grown up in the Soviet Union and become
a Communist (let’s assume). So the first premise is true. Likewise, the second
premise, considered in isolation, is true. The conclusion, however, is false: in all
likelihood, Hoover would have been a fervent Communist if he had been born in
the USSR, and not been a traitor (to his country). The intuitive assignment of
truth values is borne out by the Stalnaker–Lewis account. The closest possible
world to ours with the only change being Hoover’s place of birth is the one
where Hoover grows up to be a good citizen of the USSR. This is the closest
possible world where the antecedent of the first premise and of the conclusion is
true, and in that world Hoover is a loyal member of the Communist party, and
so not a traitor. To evaluate the second premise, we have to look at a different
world, however: the closest world where Hoover is a Communist, which is one
where he was born in the United States, turned Communist, and thus became a traitor.1
1 Of course, to appreciate the force of the example we have to take on board some metaphysical and political assumptions, e.g., that it is possible that Hoover could have been born to Russian parents, or that Communists in the US of the 1950s were traitors to their country.


Problem 60.1. Find a convincing, intuitive example for the failure of transi-
tivity of counterfactuals.

Example 60.4. The sphere semantics invalidates the inference, i.e., we have
p □→ q, q □→ r ⊭ p □→ r. Consider the model M = ⟨W, O, V ⟩ where W =
{w, w1 , w2 }, Ow = {{w}, {w, w1 }, {w, w1 , w2 }}, V (p) = {w2 }, V (q) = {w1 , w2 },
and V (r) = {w1 }. There is a p-admitting sphere S = {w, w1 , w2 } and p → q is
true at all worlds in it, so M, w ⊩ p □→ q. There is also a q-admitting sphere
S ′ = {w, w1 } and q → r is true at all worlds in it, so M, w ⊩ q □→ r. How-
ever, the p-admitting sphere {w, w1 , w2 } contains a world, namely w2 , where
M, w2 ⊮ p → r.

Problem 60.2. Draw the sphere diagram corresponding to the counterexample in Example 60.4.

Problem 60.3. In Example 60.4, world w2 is where Hoover is born in Russia,
is a communist, and not a traitor, and w1 is the world where Hoover is born
in the US, is a communist, and a traitor. In this model, w1 is closer to w than
w2 is. Is this necessary? Can you give a counterexample that does not assume
that Hoover’s being born in Russia is a more remote possibility than him being
a Communist?


60.6 Contraposition
Material and strict conditionals are equivalent to their contrapositives. Counterfactuals are not. Here is an example due to Kratzer:

If Goethe hadn’t died in 1832, he would (still) be dead now.


If Goethe weren’t dead now, he would have died in 1832.

The first sentence is true: humans don’t live hundreds of years. The second
is clearly false: if Goethe weren’t dead now, he would be still alive, and so
couldn’t have died in 1832.

Example 60.5. The sphere semantics invalidates contraposition, i.e., we have
p □→ q ⊭ ¬q □→ ¬p. Think of p as “Goethe didn’t die in 1832” and q as
“Goethe is dead now.” We can capture this in a model M = ⟨W, O, V ⟩
with W = {w, w1 , w2 }, Ow = {{w}, {w, w1 }, {w, w1 , w2 }}, V (p) = {w1 , w2 } and
V (q) = {w, w1 }. So w is the actual world where Goethe died in 1832 and is still
dead; w1 is the (close) world where Goethe died in, say, 1833, and is still dead;
and w2 is a (remote) world where Goethe is still alive. There is a p-admitting
sphere S = {w, w1 } and p → q is true at all worlds in it, so M, w ⊩ p □→ q.


[Figure 60.8: Counterexample to contraposition. Spheres {w}, {w, w1 }, {w, w1 , w2 } around w, with w, w1 in the q-region, w1 , w2 in the p-region, and w2 in the ¬q-region.]

However, the ¬q-admitting sphere {w, w1 , w2 } contains a world, namely w2 ,
where q is false and p is true, so M, w2 ⊮ ¬q → ¬p.



Part XIV

Set Theory

Chapter 61

The Iterative Conception


61.1 Extensionality
The very first thing to say is that sets are individuated by their elements. More precisely:
Axiom (Extensionality). If sets A and B have the same elements, then A
and B are the same set.
∀A ∀B (∀x (x ∈ A ↔ x ∈ B) → A = B)
We assumed this throughout part I. But it bears repeating. The Axiom of
Extensionality expresses the basic idea that a set is determined by its elements.
(So sets might be contrasted with concepts, where precisely the same objects
might fall under many different concepts.)
Why embrace this principle? Well, it is plausible to say that any denial of
Extensionality is a decision to abandon anything which might even be called set
theory. Set theory is no more nor less than the theory of extensional collections.
The real challenge in part XIV, though, is to lay down principles which tell
us which sets exist. And it turns out that the only truly “obvious” answer to
this question is provably wrong.


61.2 Russell’s Paradox (again)




In part I, we worked with a naı̈ve set theory. But according to a very naı̈ve
conception, sets are just the extensions of predicates. This naı̈ve thought would
mandate the following principle:

Naı̈ve Comprehension. {x : φ(x)} exists for any formula φ.

Tempting as this principle is, it is provably inconsistent. We saw this in


section 1.6, but the result is so important, and so straightforward, that it’s
worth repeating. Verbatim.
Theorem 61.1 (Russell’s Paradox). There is no set R = {x : x ∉ x}.

Proof. If R = {x : x ∉ x} exists, then R ∈ R iff R ∉ R, which is a contradiction.

Russell discovered this result in June 1901. (He did not, though, put the
paradox in quite the form we just presented it, since he was considering Frege’s
set theory, as outlined in Grundgesetze. We will return to this in section 61.6.)
Russell wrote to Frege on June 16, 1902, explaining the inconsistency in Frege’s
system. For the correspondence, and a bit of background, see Heijenoort (1967,
pp. 124–8).
It is worth emphasising that this two-line proof is a result of pure logic.
Granted, we implicitly used a (non-logical?) axiom, Extensionality, in our no-
tation {x : x ∉ x}; for {x : φ(x)} is to be the unique (by Extensionality) set
of the φs, if one exists. But we can avoid even the hint of Extensionality, just
by stating the result as follows: there is no set whose members are exactly the
non-self-membered sets. And this has nothing much to do with sets. As Russell
himself observed, exactly similar reasoning will lead you to conclude: no man
shaves exactly the men who do not shave themselves. Or: no pug sniffs exactly
the pugs which don’t sniff themselves. And so on. Schematically, the shape of
the result is just:
¬∃x∀z(Rzx ↔ ¬Rzz).
And that’s just a theorem (scheme) of first-order logic. Consequently, we can’t
avoid Russell’s Paradox just by tinkering with our set theory; it arises before
we even get to set theory. If we’re going to use (classical) first-order logic, we
simply have to accept that there is no set R = {x : x ∉ x}.
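
Here is a quick sketch of that first-order derivation (our gloss): suppose, for reductio, ∃x∀z(Rzx ↔ ¬Rzz). Let a witness this, so ∀z(Rza ↔ ¬Rzz). Instantiating z with a yields Raa ↔ ¬Raa, a contradiction. Discharging the assumption gives ¬∃x∀z(Rzx ↔ ¬Rzz).
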
The upshot is this. If you want to accept Naı̈ve Comprehension whilst
avoiding inconsistency, you cannot just tinker with the set theory. Instead,
you would have to overhaul your logic.
Of course, set theories with non-classical logics have been presented. But
they are—to say the least—non-standard. The standard approach to Russell’s
Paradox is to treat it as a straightforward non-existence proof, and then to try
to learn how to live with it. That is the approach we will follow.


61.3 Predicative and Impredicative


The Russell set, R, was defined via {x : x ∉ x}. Spelled out more fully, R
would be the set which contains all and only those sets which are not self-
membered. So in defining R, we quantify over the domain which would contain
R (if it existed).
This is an impredicative definition. More generally, we might say that a
definition is impredicative iff it quantifies over a domain which contains the
object that is being defined.
In the wake of the paradoxes, Whitehead, Russell, Poincaré and Weyl re-
jected such impredicative definitions as “viciously circular”:

An analysis of the paradoxes to be avoided shows that they all result
from a kind of vicious circle. The vicious circles in question arise
from supposing that a collection of objects may contain members
which can only be defined by means of the collection as a whole[. . . .
¶]
The principle which enables us to avoid illegitimate totalities may
be stated as follows: ‘Whatever involves all of a collection must not
be one of the collection’; or, conversely: ‘If, provided a certain col-
lection had a total, it would have members only definable in terms
of that total, then the said collection has no total.’ We shall call
this the ‘vicious-circle principle,’ because it enables us to avoid the
vicious circles involved in the assumption of illegitimate totalities.
(Whitehead and Russell, 1910, p. 37)

If we follow them in rejecting such impredicative definitions, then we might
attempt to replace the disastrous Naïve Comprehension Scheme (of section 61.2)
with something like this:

Predicative Comprehension. For every formula φ quantifying only over
sets: the set′ {x : φ(x)} exists.

So long as sets′ are not sets, no contradiction will ensue.


Unfortunately, Predicative Comprehension is not very comprehensive. After
all, it introduces us to new entities, sets′ . So we will have to consider formulas
which quantify over sets′ . If they always yield a set′ , then Russell’s paradox
will arise again, just by considering the set′ of all non-self-membered sets′ . So,
pursuing the same thought, we must say that a formula quantifying over sets′
yields a corresponding set′′ . And then we will need sets′′′ , sets′′′′ , etc. To
prevent a rash of primes, it will be easier to think of these as sets0 , sets1 , sets2 ,
sets3 , sets4 ,. . . . And this would give us a way into the (simple) theory of types.
There are a few obvious objections against such a theory (though it is not
obvious that they are overwhelming objections). In brief: the resulting theory
is cumbersome to use; it is profligate in postulating different kinds of objects;


and it is not clear, in the end, that impredicative definitions are even all that
bad.
To bring out the last point, consider this remark from Ramsey:

we may refer to a man as the tallest in a group, thus identifying
him by means of a totality of which he is himself a member without
there being any vicious circle. (Ramsey, 1925)

Ramsey’s point is that “the tallest man in the group” is an impredicative
definition; but it is obviously perfectly kosher.
One might respond that, in this case, we could pick out the tallest person by
predicative means. For example, maybe we could just point at the man in ques-
tion. The objection against impredicative definitions, then, would clearly need
to be limited to entities which can only be picked out impredicatively. But even
then, we would need to hear more, about why such “essential impredicativity”
would be so bad.1
Admittedly, impredicative definitions are extremely bad news, if we want
our definitions to provide us with something like a recipe for creating an object.
For, given an impredicative definition, one would genuinely be caught in a
vicious circle: to create the impredicatively specified object, one would first
need to create all the objects (including the impredicatively specified object),
since the impredicatively specified object is specified in terms of all the objects;
so one would need to create the impredicatively specified object before one had
created it itself. But again, this is only a serious objection against “essentially
impredicatively” specified sets, if we think of sets as things that we create. And
we (probably) don’t.
As such—for better or worse—the approach which became common does
not involve taking a hard line concerning (im)predicativity. Rather, it involves
what is now regarded as the cumulative-iterative approach. In the end, this will
allow us to stratify our sets into “stages”—a bit like the predicative approach
stratifies entities into sets0 , sets1 , sets2 , . . . —but we will not postulate any
difference in kind between them.


61.4 The Cumulative-Iterative Approach


Here is a slightly fuller statement of how we will stratify sets into stages:

Sets are formed in stages. For each stage S, there are certain stages
which are before S. At stage S, each collection consisting of sets
formed at stages before S is formed into a set. There are no sets
other than the sets which are formed at stages. (Shoenfield, 1977,
p. 323)
1 For more, see Linnebo (2010).


This is a sketch of the cumulative-iterative conception of set. It will underpin
the formal set theory that we present in part XIV.
Let’s explore this in a little more detail. As Shoenfield describes the process,
at every stage, we form new sets from the sets which were available to us from
earlier stages. So, on Shoenfield’s picture, at the initial stage, stage 0, there
are no earlier stages, and so a fortiori there are no sets available to us from
earlier stages.2 So we form only one set: the set with no elements ∅. At stage
1, exactly one set is available to us from earlier stages, so only one new set is
{∅}. At stage 2, two sets are available to us from earlier stages, and we form
two new sets {{∅}} and {∅, {∅}}. At stage 3, four sets are available to us from
earlier stages, so we form twelve new sets. . . . As such, the cumulative-iterative
picture of the sets will look a bit like this (with numbers indicating stages):

[Diagram: the cumulative hierarchy, fanning out upward through stages 0, 1, 2, 3, 4, 5, 6, . . . ]

So: why should we embrace this story?


One reason is that it is a nice, tractable story. Given the demise of the most
obvious story, i.e., Naı̈ve Comprehension, we are in want of something nice.
But the story is not just nice. We have a good reason to believe that any
set theory based on this story will be consistent. Here is why.
Given the cumulative-iterative conception of set, we form sets at stages;
and their elements must be objects which were available already. So, for any
stage S, we can form the set

RS = {x : x ∉ x and x was available before S}

The reasoning involved in proving Russell’s Paradox will now establish that RS
itself is not available before stage S. And that’s not a contradiction. Moreover,
if we embrace the cumulative-iterative conception of set, then we shouldn’t even
have expected to be able to form the Russell set itself. For that would be the
set of all non-self-membered sets that “will ever be available”. In short: the
fact that we (provably) can’t form the Russell set isn’t surprising, given the
cumulative-iterative story; it’s what we would predict.
2 Why should we assume that there is a first stage? See the footnote to Stages-are-ordered in section 62.1.



61.5 Urelements or Not?


In the next few chapters, we will try to extract axioms from the cumulative-
iterative conception of set. But, before going any further, we need to say
something more about urelements.
The picture of section 61.4 allowed us only to form new sets from old sets.
However, we might want to allow that certain non-sets—cows, pigs, grains of
sand, or whatever—can be elements of sets. In that case, we would start with
certain basic elements, urelements, and then say that at each stage S we would
form “all possible” sets consisting of urelements or sets formed at stages before
S (in any combination). The resulting picture would look more like this:

[Diagram: the cumulative hierarchy built over urelements, fanning out upward through stages 0, 1, 2, 3, 4, 5, 6, . . . ]

So now we have a decision to take: Should we allow urelements?


Philosophically, it makes sense to include urelements in our theorising. The
main reason for this is to make our set theory applicable. To illustrate the point,
recall from chapter 4 that we say that two sets A and B have the same size, i.e.,
A ≈ B, iff there is a bijection between them. Now, if the cows in the field and
the pigs in the sty both form sets, we can offer a set-theoretical treatment of
the claim “there are as many cows as pigs”. But if we ban urelements, so that
the cows and the pigs do not form sets, then that set-theoretical treatment will
be unavailable. Indeed, we will have no straightforward ability to apply set
theory to anything other than sets themselves. (For more reasons to include
urelements, see Potter 2004, pp. vi, 24, 50–1.)
Mathematically, however, it is quite rare to allow urelements. In part, this
is because it is very slightly easier to formulate set theory without urelements.
But, occasionally, one finds more interesting justifications for excluding urelements from set theory:

In accordance with the belief that set theory is the foundation of
mathematics, we should be able to capture all of mathematics by
just talking about sets, so our variable should not range over objects
like cows and pigs. (Kunen, 1980, p. 8)

So: a focus on applicability would suggest including urelements; a focus on a
reductive foundational goal (reducing mathematics to pure set theory) might
suggest excluding them. Mild laziness, too, points in the direction of excluding
urelements.
We will follow the laziest path. Partly, though, there is a pedagogical
justification. Our aim is to introduce you to the elements of set theory that
you would need in order to get started on the philosophy of set theory. And
most of that philosophical literature discusses set theories formulated without
urelements. So this book will, perhaps, be of more use, if it hews fairly closely
to that literature.


61.6 Appendix: Frege’s Basic Law V


In section 61.2, we explained that Russell formulated his paradox as a problem
for the system Frege outlined in his Grundgesetze. Frege’s system did not
include a direct formulation of Naı̈ve Comprehension. So, in this appendix, we
will very briefly explain what Frege’s system did include, and how it relates to
Naı̈ve Comprehension and how it relates to Russell’s Paradox.
Frege’s system is second-order, and was designed to formulate the notion
of an extension of a concept.3 Using notation inspired by Frege, we will write
ϵx F (x) for the extension of the concept F . This is a device which takes a
predicate, “F ”, and turns it into a (first-order) term, “ϵx F (x)”. Using this
device, Frege offered the following definition of membership:

a ∈ b =df ∃G(b = ϵx G(x) ∧ Ga)

roughly: a ∈ b iff a falls under a concept whose extension is b. (Note that the
quantifier “∃G” is second-order.) Frege also maintained the following principle,
known as Basic Law V :

ϵx F (x) = ϵx G(x) ↔ ∀x(F x ↔ Gx)

roughly: concepts have identical extensions iff they are coextensive. (Again,
both “F ” and “G” are in predicate position.) Now a simple principle connects
membership with property-satisfaction:

Lemma 61.2 (in Grundgesetze). ∀F ∀a(a ∈ ϵx F (x) ↔ F a)
3 Strictly speaking, Frege attempts to formalize a more general notion: the “value-range” of a function. Extensions of concepts are a special case of the more general notion. See Heck (2012, pp. 8–9) for the details.



Proof. Fix F and a. Now a ∈ ϵx F (x) iff ∃G(ϵx F (x) = ϵx G(x) ∧ Ga) (by the
definition of membership) iff ∃G(∀x(F x ↔ Gx) ∧ Ga) (by Basic Law V) iff F a
(by elementary second-order logic).

And this yields Naı̈ve Comprehension almost immediately:

Lemma 61.3 (in Grundgesetze). ∀F ∃s∀a(a ∈ s ↔ F a)

Proof. Fix F ; now Lemma 61.2 yields ∀a(a ∈ ϵx F (x) ↔ F a); so ∃s∀a(a ∈
s ↔ F a) by existential generalisation. The result follows since F was arbitrary.

Russell’s Paradox follows by taking F as given by ∀x(F x ↔ x ∉ x).
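
Spelling out that last step (our gloss): with F so given, Lemma 61.3 yields a set s such that ∀a(a ∈ s ↔ a ∉ a). Instantiating a with s gives s ∈ s ↔ s ∉ s, the same contradiction as in Theorem 61.1.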

Chapter 62

Steps towards Z


62.1 The Story in More Detail


In section 61.4, we quoted Shoenfield’s description of the process of set-
formation. We now want to write down a few more principles, to make this
story a bit more precise. Here they are:

Stages-are-key. Every set is formed at some stage.

Stages-are-ordered. Stages are ordered: some come before others.1

Stages-accumulate. For any stage S, and for any sets which were formed
before stage S: a set is formed at stage S whose members are exactly
those sets. Nothing else is formed at stage S.
1 We will actually assume—tacitly—that the stages are well-ordered. What this amounts to is explained in chapter 63. This is a substantial assumption. In fact, using a very clever technique due to Scott (1974), this assumption can be avoided and then derived. (This will also explain why we should think that there is an initial stage.) We cannot go into that here; for more, see Button (2021).


These are informal principles, but we will be able to use them to vindicate
several of the axioms of Zermelo’s set theory.
(We should offer a word of caution. Although we will be presenting some
completely standard axioms, with completely standard names, the italicized
principles we have just presented have no particular names in the literature.
We have simply chosen monikers which we hope are helpful.)


62.2 Separation
We start with a principle to replace Naïve Comprehension:

Axiom (Scheme of Separation). For every formula φ(x), this is an axiom:
for any A, the set {x ∈ A : φ(x)} exists.

Note that this is not a single axiom. It is a scheme of axioms. There are
infinitely many Separation axioms; one for every formula φ(x). The scheme
can equally well be (and normally is) written down as follows:

For any formula φ(x) which does not contain “S”, this is an axiom:

∀A∃S∀x(x ∈ S ↔ (φ(x) ∧ x ∈ A)).

In keeping with the convention noted at the start of part XIV, the formu-
las φ in the Separation axioms may have parameters.2
Separation is immediately justified by our cumulative-iterative conception
of sets we have been telling. To see why, let A be a set. So A is formed
by some stage S (by Stages-are-key). Since A was formed at stage S, all
of A’s members were formed before stage S (by Stages-accumulate). Now in
particular, consider all the sets which are members of A and which also satisfy
φ; clearly all of these sets, too, were formed before stage S. So they are formed
into a set {x ∈ A : φ(x)} at stage S too (by Stages-accumulate).
Unlike Naïve Comprehension, this avoids Russell’s Paradox. For we cannot
simply assert the existence of the set {x : x ∉ x}. Rather, given some set A, we
can assert the existence of the set RA = {x ∈ A : x ∉ x}. But all this proves
is that RA ∉ RA and RA ∉ A, none of which is very worrying.
However, Separation has an immediate and striking consequence:
Theorem 62.1. There is no universal set, i.e., {x : x = x} does not exist.

Proof. For reductio, suppose V is a universal set. Then by Separation, R =
{x ∈ V : x ∉ x} = {x : x ∉ x} exists, contradicting Russell’s Paradox.
2 For an explanation of what this means, see the discussion immediately after Corollary 6.7.


The absence of a universal set—indeed, the open-endedness of the hierarchy
of sets—is one of the most fundamental ideas behind the cumulative-iterative
conception. So it is worth seeing that, intuitively, we could reach it via a
different route. A universal set must be an element of itself. But, on our
cumulative-iterative conception, every set appears (for the first time) in the
hierarchy at the first stage immediately after all of its elements. But this
entails that no set is self-membered. For any self-membered set would have
to first occur immediately after the stage at which it first occurred, which is
absurd. (We will see in Definition 64.15 how to make this explanation more
rigorous, by using the notion of the “rank” of a set. However, we will need to
have a few more axioms in place to do this.)
Here are a few more consequences of Separation and Extensionality.
Proposition 62.2. If any set exists, then ∅ exists.

Proof. If A is a set, ∅ = {x ∈ A : x ≠ x} exists by Separation.

Proposition 62.3. A \ B exists for any sets A and B.

Proof. A \ B = {x ∈ A : x ∉ B} exists by Separation.

It also turns out that (almost) arbitrary intersections exist:

Proposition 62.4. If A ≠ ∅, then ⋂A = {x : (∀y ∈ A)x ∈ y} exists.

Proof. Let A ≠ ∅, so there is some c ∈ A. Then ⋂A = {x : (∀y ∈ A)x ∈ y} =
{x ∈ c : (∀y ∈ A)x ∈ y}, which exists by Separation.

Note the condition that A ≠ ∅, though; for ⋂∅ would be the universal set,
vacuously, contradicting Theorem 62.1.


62.3 Union
Proposition 62.4 gave us intersections. But if we want arbitrary unions to exist,
we need to lay down another axiom:

Axiom (Union). For any set A, the set ⋃A = {x : (∃b ∈ A)x ∈ b} exists.

∀A∃U ∀x(x ∈ U ↔ (∃b ∈ A)x ∈ b)

This axiom is also justified by the cumulative-iterative conception. Let A
be a set, so A is formed at some stage S (by Stages-are-key). Every member of
A was formed before S (by Stages-accumulate); so, reasoning similarly, every
member of every member of A was formed before S. Thus all of those sets are
available before S, to be formed into a set at S. And that set is just ⋃A.


62.4 Pairs
The next axiom to consider is the following:

Axiom (Pairs). For any sets a, b, the set {a, b} exists.

∀a∀b∃P ∀x(x ∈ P ↔ (x = a ∨ x = b))

Here is how to justify this axiom, using the iterative conception. Suppose
a is available at stage S, and b is available at stage T . Let M be whichever of
stages S and T comes later. Then since a and b are both available at stage M ,
the set {a, b} is a possible collection available at any stage after M .
But hold on! Why assume that there are any stages after M ? If there are
none, then our justification will fail. So, to justify Pairs, we will have to add
another principle to the story we told in section 62.1, namely:

Stages-keep-going. There is no last stage.

Is this principle justified? Nothing in Shoenfield’s story stated explicitly that
there is no last stage. Still, even if it is (strictly speaking) an extra addition to
our story, it fits well with the basic idea that sets are formed in stages. We will
simply accept it in what follows. And so, we will accept the Axiom of Pairs
too.
Armed with this new Axiom, we can prove the existence of plenty more
sets. For example:

Proposition 62.5. For any sets a and b, the following sets exist:

1. {a}

2. a ∪ b

3. ⟨a, b⟩

Proof. (1). By Pairs, {a, a} exists, which is {a} by Extensionality.
(2). By Pairs, {a, b} exists. Now a ∪ b = ⋃{a, b} exists by Union.
(3). By (1), {a} exists. By Pairs, {a, b} exists. Now {{a}, {a, b}} = ⟨a, b⟩
exists, by Pairs again.

Problem 62.1. Show that, for any sets a, b, c, the set {a, b, c} exists.

Problem 62.2. Show that, for any sets a1 , . . . , an , the set {a1 , . . . , an } exists.


62.5 Powersets
We will proceed with another axiom:

Axiom (Powersets). For any set A, the set ℘(A) = {x : x ⊆ A} exists.

∀A∃P ∀x(x ∈ P ↔ (∀z ∈ x)z ∈ A)

Our justification for this is pretty straightforward. Suppose A is formed
at stage S. Then all of A’s members were available before S (by Stages-
accumulate). So, reasoning as in our justification for Separation, every subset
of A is formed by stage S. So they are all available, to be formed into a single
set, at any stage after S. And we know that there is some such stage, since S
is not the last stage (by Stages-keep-going). So ℘(A) exists.
Here is a nice consequence of Powersets:

Proposition 62.6. Given any sets A, B, their Cartesian product A×B exists.

Proof. The set ℘(℘(A ∪ B)) exists by Powersets and Proposition 62.5. So by
Separation, this set exists:

C = {z ∈ ℘(℘(A ∪ B)) : (∃x ∈ A)(∃y ∈ B)z = ⟨x, y⟩}.

Now, for any x ∈ A and y ∈ B, the set ⟨x, y⟩ exists by Proposition 62.5.
Moreover, since x, y ∈ A ∪ B, we have that {x}, {x, y} ∈ ℘(A ∪ B), and ⟨x, y⟩ ∈
℘(℘(A ∪ B)). So A × B = C.

In this proof, Powerset interacts with Separation. And that is no surprise.
Without Separation, Powersets wouldn’t be a very powerful principle. After
all, Separation tells us which subsets of a set exist, and hence determines just
how “fat” each Powerset is.

Problem 62.3. Show that, for any sets A, B: (i) the set of all relations with
domain A and range B exists; and (ii) the set of all functions from A to B
exists.

Problem 62.4. Let A be a set, and let ∼ be an equivalence relation on A.
Prove that the set of equivalence classes under ∼ on A, i.e., A/∼ , exists.


62.6 Infinity
We already have enough axioms to ensure that there are infinitely many sets
(if there are any). For suppose some set exists, and so ∅ exists (by Proposi-
tion 62.2). Now for any set x, the set x ∪ {x} exists by Proposition 62.5. So,
applying this a few times, we will get sets as follows:


0. ∅

1. {∅}

2. {∅, {∅}}

3. {∅, {∅}, {∅, {∅}}}

4. {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}}

and we can check that each of these sets is distinct.
We have started the numbering from 0, for a few reasons. But one of them
is this. It is not that hard to check that the set we have labelled “n” has exactly
n members, and (intuitively) is formed at the nth stage.
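
That check is easy to carry out mechanically. Here is a small Python sketch (our own illustration, not from the text: frozensets stand in for hereditarily finite sets, and succ is the map x ↦ x ∪ {x} from the list above):

def succ(x):
    """The successor operation of the main text: s(x) = x ∪ {x}."""
    return x | frozenset([x])

numerals = [frozenset()]            # the set labelled 0, i.e., ∅
for _ in range(4):
    numerals.append(succ(numerals[-1]))

# The set labelled n has exactly n members ...
for n, x in enumerate(numerals):
    assert len(x) == n
# ... and the five sets are pairwise distinct.
assert len(set(numerals)) == len(numerals)
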
But. This gives us infinitely many sets, but it does not guarantee that
there is an infinite set, i.e., a set with infinitely many members. And this really
matters: unless we can find a (Dedekind) infinite set, we cannot construct a
Dedekind algebra. But we want a Dedekind algebra, so that we can treat it as
the set of natural numbers. (Compare section 6.4.)
Importantly, the axioms we have laid down so far do not guarantee the
existence of any infinite set. So we have to lay down a new axiom:

Axiom (Infinity). There is a set, I, such that ∅ ∈ I and x ∪ {x} ∈ I whenever
x ∈ I.

∃I((∃o ∈ I)∀x x ∉ o ∧ (∀x ∈ I)(∃s ∈ I)∀z(z ∈ s ↔ (z ∈ x ∨ z = x)))

It is easy to see that the set I given to us by the Axiom of Infinity is
Dedekind infinite. Its distinguished element is ∅, and the injection on I is
given by s(x) = x ∪ {x}. Now, Theorem 6.5 showed how to extract a Dedekind
Algebra from a Dedekind infinite set; and we will treat this as our set of natural
numbers. More precisely:

Definition 62.7. Let I be any set given to us by the Axiom of Infinity. Let
s be the function s(x) = x ∪ {x}. Let ω = clo_s (∅). We call the members of ω
the natural numbers, and say that n is the result of n-many applications of s
to ∅.

You can now look back and check that the set labelled “n”, a few paragraphs
earlier, will be treated as the number n.
We will discuss the significance of this stipulation in section 62.8. For now,
it enables us to prove an intuitive result:

Proposition 62.8. No natural number is Dedekind infinite.

Proof. The proof is by induction, i.e., Theorem 6.6. Clearly 0 = ∅ is not
Dedekind infinite. For the induction step, we will establish the contrapositive:
if (absurdly) s(n) is Dedekind infinite, then n is Dedekind infinite.


So suppose that s(n) is Dedekind infinite, i.e., there is some injection f
with ran(f ) ⊊ dom(f ) = s(n) = n ∪ {n}. There are two cases to consider.
Case 1: n ∉ ran(f ). So ran(f ) ⊆ n, and f (n) ∈ n. Let g = f ↾n ; now
ran(g) = ran(f ) \ {f (n)} ⊊ n = dom(g). Hence n is Dedekind infinite.
Case 2: n ∈ ran(f ). Fix m ∈ dom(f ) \ ran(f ), and define a function h with
domain s(n) = n ∪ {n}:

h(x) = f (x) if f (x) ≠ n,
h(x) = m    if f (x) = n.

So h and f agree everywhere, except that h(f⁻¹(n)) = m ≠ n = f (f⁻¹(n)).
Since f is an injection, n ∉ ran(h); and ran(h) ⊊ dom(h) = s(n). Now n is
Dedekind infinite, using the argument of Case 1.

The question remains, though, of how we might justify the Axiom of Infinity.
The short answer is that we will need to add another principle to the story we
have been telling. That principle is as follows:
Stages-hit-infinity. There is an infinite stage. That is, there is a stage
which (a) is not the first stage, and which (b) has some stages before it,
but which (c) has no immediate predecessor.
The Axiom of Infinity follows straightforwardly from this principle. We know
that natural number n is formed at stage n. So the set ω is formed at the first
infinite stage. And ω itself witnesses the Axiom of Infinity.
This, however, simply pushes us back to the question of how we might
justify Stages-hit-infinity. As with Stages-keep-going, it was not an explicit
part of the story we told about the cumulative-iterative hierarchy. But more
than that: nothing in the very idea of an iterative hierarchy, in which sets are
formed stage by stage, forces us to think that the process involves an infinite
stage. It seems perfectly coherent to think that the stages are ordered like the
natural numbers.
This, however, gives rise to an obvious problem. In section 6.4, we con-
sidered Dedekind’s “proof” that there is a Dedekind infinite set (of thoughts).
This may not have struck you as very satisfying. But if Stages-hit-infinity is
not “forced upon us” by the iterative conception of set (or by “the laws of
thought”), then we are still left without an intrinsic justification for the claim
that there is a Dedekind infinite set.
There is much more to say here, of course. But hopefully you are now at
a point to start thinking about what it might take to justify an axiom (or
principle). In what follows we will simply take Stages-hit-infinity for granted.


62.7 Z−: a Milestone


We will revisit Stages-hit-infinity in the next section. However, with the Axiom
of Infinity, we have reached an important milestone. We now have all the
axioms required for the theory Z− . In detail:

Definition 62.9. The theory Z− has these axioms: Extensionality, Union,
Pairs, Powersets, Infinity, and all instances of the Separation scheme.

The name stands for Zermelo set theory (minus something which we will
come to later). Zermelo deserves the honour, since he essentially formulated
this theory in his 1908a.3
This theory is powerful enough to allow us to do an enormous amount of
mathematics. In particular, you should look back through part I, and con-
vince yourself that everything we did, naı̈vely, could be done more formally
within Z− . (Once you have done that for a bit, you might want to skip ahead
and read section 62.9.) So, henceforth, and without any further comment, we
will take ourselves to be working in Z− (at least).


62.8 Selecting our Natural Numbers


In Definition 62.7, we explicitly defined the expression “natural numbers”.
How should you understand this stipulation? It is not a metaphysical claim,
but just a decision to treat certain sets as the natural numbers. We touched
upon reasons for thinking this in section 2.2, section 5.5 and section 6.4. But
we can make these reasons even more pointed.
Our Axiom of Infinity follows von Neumann (1925). But here is another
axiom, which we could have adopted instead:

Zermelo’s 1908a Axiom of Infinity. There is a set A such that ∅ ∈ A and
(∀x ∈ A){x} ∈ A.

Had we used Zermelo’s axiom, instead of our (von Neumann-inspired) Axiom
of Infinity, we would equally well have been given a Dedekind infinite set,
and so a Dedekind algebra. On Zermelo’s approach, the distinguished element
of our algebra would again have been ∅ (our surrogate for 0), but the injection
would have been given by the map x 7→ {x}, rather than x 7→ x ∪ {x}. The
simplest upshot of this is that Zermelo treats 2 as {{∅}}, whereas we (with von
Neumann) treat 2 as {∅, {∅}}.
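
The divergence at 2 can be seen concretely with a small sketch (ours, in the same frozenset style as the earlier illustration; the function names are our own):

def zermelo_succ(x):
    return frozenset([x])           # Zermelo: x ↦ {x}

def von_neumann_succ(x):
    return x | frozenset([x])       # von Neumann: x ↦ x ∪ {x}

zero = frozenset()                  # ∅, the surrogate for 0 on both approaches
z2 = zermelo_succ(zermelo_succ(zero))               # {{∅}}
v2 = von_neumann_succ(von_neumann_succ(zero))       # {∅, {∅}}
assert z2 != v2 and len(z2) == 1 and len(v2) == 2   # 0 and 1 agree; 2 does not
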
Why choose one axiom of Infinity rather than the other? The main prac-
tical reason is that von Neumann’s approach “scales up” to handle transfinite
numbers rather well. We will explore this from chapter 63 onwards. However,
3 For interesting comments on the history and technicalities, see Potter (2004, Appendix A).

from the simple perspective of doing arithmetic, both approaches would do
equally well. So if someone tells you that the natural numbers are sets, the
obvious question is: Which sets are they?
This precise question was made famous by Benacerraf (1965). But it is
worth emphasising that it is just the most famous example of a phenomenon
that we have encountered many times already. The basic point is this. Set
theory gives us a way to simulate a bunch of “intuitive” kinds of entities: the
reals, rationals, integers, and naturals, yes; but also ordered pairs, functions,
and relations. However, set theory never provides us with a unique choice
of simulation. There are always alternatives which—straightforwardly—would
have served us just as well.

content/set-theory/z/arbintersections.tex

62.9 Appendix: Closure, Comprehension, and Intersection
In section 62.7, we suggested that you should look back through the naïve
work of part I and check that it can be carried out in Z− . If you followed
that advice, one point might have tripped you up: the use of intersection in
Dedekind’s treatment of closures.
Recall from Definition 6.2 that

clo_f (o) = ⋂{X : o ∈ X and X is f -closed}.

The general shape of this is a definition of the form:

C = ⋂{X : φ(X)}.

But this should ring alarm bells: since Naı̈ve Comprehension fails, there is
no guarantee that {X : φ(X)} exists. It looks dangerously, then, like such
definitions are cheating.
Fortunately, they are not cheating; or rather, if they are cheating as they
stand, then we can engage in some honest toil to render them kosher. That
honest toil was foreshadowed in Proposition 62.4, when we explained why ⋂A
exists for any A ≠ ∅. But we will spell it out explicitly.
Given Extensionality, if we attempt to define C as ⋂{X : φ(X)}, all we are
really asking is for an object C which obeys the following:

∀x(x ∈ C ↔ ∀X(φ(X) → x ∈ X)) (*)

Now, suppose there is some set, S, such that φ(S). Then to deliver eq. (*), we
can simply define C using Separation, as follows:

C = {x ∈ S : ∀X(φ(X) → x ∈ X)}.

Release : 6891b66 (2024-12-01) 807


We leave it as an exercise to check that this definition yields eq. (*), as desired.
And this general strategy will allow us to circumvent any apparent use of Naı̈ve
Comprehension in defining intersections. In the particular case which got us
started on this line of thought, namely that of clo_f (o), here is how that would
work. We began the proof of Lemma 6.3 by noting that o ∈ ran(f ) ∪ {o} and
that ran(f ) ∪ {o} is f -closed. So, we can define what we want thus:

clo_f (o) = {x ∈ ran(f ) ∪ {o} : (∀X ∋ o)(X is f -closed → x ∈ X)}.

Chapter 63

Ordinals


63.1 Introduction
In chapter 62, we postulated that there is an infinite-th stage of the hierarchy,
in the form of Stages-hit-infinity (see also our axiom of Infinity). However,
given Stages-keep-going, we can’t stop at the infinite-th stage; we have to keep
going. So: at the next stage after the first infinite stage, we form all possible
collections of sets that were available at the first infinite stage; and repeat; and
repeat; and repeat; . . .
Implicitly what has happened here is that we have started to invoke an
“intuitive” notion of number, according to which there can be numbers after all
the natural numbers. In particular, the notion involved is that of a transfinite
ordinal. The aim of this chapter is to make this idea more rigorous. We will
explore the general notion of an ordinal, and then explicitly define certain sets
to be our ordinals.


63.2 The General Idea of an Ordinal


Consider the natural numbers, in their usual order:


0 < 1 < 2 < 3 < 4 < 5 < ...

We call this, in the jargon, an ω-sequence. And indeed, this general ordering
is mirrored in our initial construction of the stages of the set hierarchy. But,
now suppose we move 0 to the end of this sequence, so that it comes after all
the other numbers:
1 < 2 < 3 < 4 < 5 < ... < 0

We have the same entities here, but ordered in a fundamentally different way:
our first ordering had no last element; our new ordering does. Indeed, our
new ordering consists of an ω-sequence of entities (1, 2, 3, 4, 5, . . .), followed by
another entity. It will be an ω + 1-sequence.
We can generate even more types of ordering, using just these entities. For
example, consider all the even numbers (in their natural order) followed by all
the odd numbers (in their natural order):

0 < 2 < 4 < ... < 1 < 3 < ...

This is an ω-sequence followed by another ω-sequence; an ω + ω-sequence.


Well, we can keep going. But what we would like is a general way to
understand this talk about orderings.
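
Before that, it may help to see these order types rendered concretely. A small Python sketch (our own illustration, not part of the text): each ordering above can be encoded as a sort key on the natural numbers, so that comparing keys compares positions in the ordering.

# ω:      0 < 1 < 2 < 3 < ...
key_omega = lambda n: (0, n)
# ω + 1:  1 < 2 < 3 < ... < 0       (0 moved past everything else)
key_omega_plus_one = lambda n: (1, 0) if n == 0 else (0, n)
# ω + ω:  0 < 2 < 4 < ... < 1 < 3 < ...   (evens first, then odds)
key_omega_plus_omega = lambda n: (n % 2, n)

sample = range(8)
print(sorted(sample, key=key_omega))             # [0, 1, 2, 3, 4, 5, 6, 7]
print(sorted(sample, key=key_omega_plus_one))    # [1, 2, 3, 4, 5, 6, 7, 0]
print(sorted(sample, key=key_omega_plus_omega))  # [0, 2, 4, 6, 1, 3, 5, 7]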


63.3 Well-Orderings
The fundamental notion is as follows:

Definition 63.1. The relation < well-orders A iff it meets these two condi-
tions:

1. < is connected, i.e., for all a, b ∈ A, either a < b or a = b or b < a;

2. every non-empty subset of A has a <-minimal element, i.e., if ∅ ≠ X ⊆ A
then (∃m ∈ X)(∀z ∈ X)z ≮ m

It is easy to see that the three examples we just considered were indeed well-
ordering relations.

Problem 63.1. Section 63.2 presented three example orderings on the natural
numbers. Check that each is a well-ordering.

Here are some elementary but extremely important observations concerning well-ordering.

Proposition 63.2. If < well-orders A, then every non-empty subset of A has
a unique <-least member, and < is irreflexive, asymmetric and transitive.


Proof. If X is a non-empty subset of A, it has a <-minimal element m, i.e.,
(∀z ∈ X)z ≮ m. Since < is connected, (∀z ∈ X)m ≤ z. So m is the <-least
element of X.
For irreflexivity, fix a ∈ A; the <-least element of {a} is a, so a ≮ a. For
transitivity, if a < b < c, then since {a, b, c} has a <-least element, a < c.
Asymmetry follows from irreflexivity and transitivity.

Proposition 63.3. If < well-orders A, then for any formula φ(x):

if (∀a ∈ A)((∀b < a)φ(b) → φ(a)), then (∀a ∈ A)φ(a).

Proof. We will prove the contrapositive. Suppose ¬(∀a ∈ A)φ(a), i.e., that
X = {x ∈ A : ¬φ(x)} ≠ ∅. Then X has a <-minimal element, a. So
(∀b < a)φ(b) but ¬φ(a).

This last property should remind you of the principle of strong induction on
the naturals, i.e.: if (∀n ∈ ω)((∀m < n)φ(m) → φ(n)), then (∀n ∈ ω)φ(n). And
this property makes well-ordering into a very robust notion.1


63.4 Order-Isomorphisms
To explain how robust well-ordering is, we will start by introducing a method
for comparing well-orderings.

Definition 63.4. A well-ordering is a pair ⟨A, <⟩, such that < well-orders A.
The well-orderings ⟨A, <⟩ and ⟨B, ⋖⟩ are order-isomorphic iff there is a bi-
jection f : A → B such that: x < y iff f (x) ⋖ f (y). In this case, we write
⟨A, <⟩ ≅ ⟨B, ⋖⟩, and say that f is an order-isomorphism.

In what follows, for brevity, we will speak of “isomorphisms” rather than
“order-isomorphisms”. Intuitively, isomorphisms are structure-preserving bi-
jections. Here are some simple facts about isomorphisms.

Lemma 63.5. Compositions of isomorphisms are isomorphisms, i.e.: if f : A → B and g : B → C are isomorphisms, then (g ◦ f) : A → C is an isomorphism.

Problem 63.2. Prove Lemma 63.5.

Proof. Left as an exercise.

Corollary 63.6. X ≅ Y is an equivalence relation.

Proposition 63.7. If ⟨A, <⟩ and ⟨B, ⋖⟩ are isomorphic well-orderings, then the isomorphism between them is unique.
1. A reminder: all formulas can have parameters (unless explicitly stated otherwise).


Proof. Let f and g be isomorphisms A → B. We will prove the result by induction, i.e. using Proposition 63.3. Fix a ∈ A, and suppose (for induction) that (∀b < a)f(b) = g(b). Fix x ∈ B.
If x ⋖ f (a), then f −1 (x) < a, so g(f −1 (x)) ⋖ g(a), invoking the fact that
f and g are isomorphisms. But since f −1 (x) < a, by our supposition x =
f (f −1 (x)) = g(f −1 (x)). So x ⋖ g(a). Similarly, if x ⋖ g(a) then x ⋖ f (a).
Generalising, (∀x ∈ B)(x ⋖ f (a) ↔ x ⋖ g(a)). It follows that f (a) = g(a)
by Proposition 2.26. So (∀a ∈ A)f (a) = g(a) by Proposition 63.3.

This gives some sense that well-orderings are robust. But to continue explaining
this, it will help to introduce some more notation.

Definition 63.8. When ⟨A, <⟩ is a well-ordering with a ∈ A, let Aa = {x ∈ A : x < a}. We say that Aa is a proper initial segment of A (and allow that A itself is an improper initial segment of A). Let <a be the restriction of < to this initial segment, i.e., <a = {⟨x, y⟩ ∈ Aa × Aa : x < y}.

Using this notation, we can state and prove that no well-ordering is isomorphic
to any of its proper initial segments.

Lemma 63.9. If ⟨A, <⟩ is a well-ordering with a ∈ A, then ⟨A, <⟩ ≇ ⟨Aa, <a⟩.

Proof. For reductio, suppose f : A → Aa is an isomorphism. Since f is a bijection and Aa ⊊ A, using Proposition 63.2 let b ∈ A be the <-least element of A such that b ≠ f(b). We'll show that (∀x ∈ A)(x < b ↔ x < f(b)), from which it will follow by Proposition 2.26 that b = f(b), completing the reductio.

Suppose x < b. So x = f(x), by the choice of b. And f(x) < f(b), as f is an isomorphism. So x < f(b).

Suppose x < f(b). So f−1(x) < b, since f is an isomorphism, and so f−1(x) = x by the choice of b. So x < b.

Our next result shows, roughly put, that an “initial segment” of an isomor-
phism is an isomorphism:

Lemma 63.10. Let ⟨A, <⟩ and ⟨B, ⋖⟩ be well-orderings. If f : A → B is an isomorphism and a ∈ A, then f↾Aa : Aa → Bf(a) is an isomorphism.

Proof. Since f is an isomorphism:

f[Aa] = f[{x ∈ A : x < a}]
= f[{f−1(y) ∈ A : f−1(y) < a}]
= {y ∈ B : y ⋖ f(a)}
= Bf(a)

And f↾Aa preserves order because f does.

Our next two results establish that well-orderings are always comparable:

Lemma 63.11. Let ⟨A, <⟩ and ⟨B, ⋖⟩ be well-orderings. If ⟨Aa1, <a1⟩ ≅ ⟨Bb1, ⋖b1⟩ and ⟨Aa2, <a2⟩ ≅ ⟨Bb2, ⋖b2⟩, then a1 < a2 iff b1 ⋖ b2.

Proof. We will prove left to right; the other direction is similar. Suppose both ⟨Aa1, <a1⟩ ≅ ⟨Bb1, ⋖b1⟩ and ⟨Aa2, <a2⟩ ≅ ⟨Bb2, ⋖b2⟩, with f : Aa2 → Bb2 our isomorphism. Let a1 < a2; then ⟨Aa1, <a1⟩ ≅ ⟨Bf(a1), ⋖f(a1)⟩ by Lemma 63.10. So ⟨Bb1, ⋖b1⟩ ≅ ⟨Bf(a1), ⋖f(a1)⟩, and so b1 = f(a1) by Lemma 63.9. Now b1 ⋖ b2 as f's range is Bb2.

Theorem 63.12. Given any two well-orderings, one is isomorphic to an initial segment (not necessarily proper) of the other.

Proof. Let ⟨A, <⟩ and ⟨B, ⋖⟩ be well-orderings. Using Separation, let

f = {⟨a, b⟩ ∈ A × B : ⟨Aa, <a⟩ ≅ ⟨Bb, ⋖b⟩}.

By Lemma 63.11, a1 < a2 iff b1 ⋖ b2 for all ⟨a1, b1⟩, ⟨a2, b2⟩ ∈ f. So f : dom(f) → ran(f) is an isomorphism.

If a2 ∈ dom(f) and a1 < a2, then a1 ∈ dom(f) by Lemma 63.10; so dom(f) is an initial segment of A. Similarly, ran(f) is an initial segment of B. For reductio, suppose both are proper initial segments. Then let a be the <-least element of A \ dom(f), so that dom(f) = Aa, and let b be the ⋖-least element of B \ ran(f), so that ran(f) = Bb. So f : Aa → Bb is an isomorphism, and hence ⟨a, b⟩ ∈ f, a contradiction.


63.5 Von Neumann's Construction of the Ordinals

Theorem 63.12 gives rise to a thought. We could introduce certain objects, called order types, to go proxy for the well-orderings. Writing ord(A, <) for the order type of the well-ordering ⟨A, <⟩, we would hope to secure the following two principles:

ord(A, <) = ord(B, ⋖) iff ⟨A, <⟩ ≅ ⟨B, ⋖⟩
ord(A, <) < ord(B, ⋖) iff ⟨A, <⟩ ≅ ⟨Bb, ⋖b⟩ for some b ∈ B

Moreover, we might hope to introduce order-types as certain sets, just as we can introduce the natural numbers as certain sets.

The most common way to do this—and the approach we will follow—is to define these order-types via certain canonical well-ordered sets. These canonical sets were first introduced by von Neumann:

Definition 63.13. The set A is transitive iff (∀x ∈ A)x ⊆ A. Then A is an ordinal iff A is transitive and well-ordered by ∈.


In what follows, we will use Greek letters for ordinals. It follows immediately
from the definition that, if α is an ordinal, then ⟨α, ∈α ⟩ is a well-ordering,
where ∈α = {⟨x, y⟩ ∈ α2 : x ∈ y}. So, abusing notation a little, we can just say
that α itself is a well-ordering.
Here are our first few ordinals:
∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}, . . .
You will note that these are the first few ordinals that we encountered in
our Axiom of Infinity, i.e., in von Neumann’s definition of ω (see section 62.6).
This is no coincidence. Von Neumann’s definition of the ordinals treats natural
numbers as ordinals, but allows for transfinite ordinals too.
As always, we can now ask: are these the ordinals? Or has von Neumann
simply given us some sets that we can treat as the ordinals? The kinds of
discussions one might have about this question are similar to the discussions
we had in section 2.2, section 5.5, section 6.4, and section 62.8, so we will not
belabour the point. Instead, in what follows, we will simply use “the ordinals”
to speak of “the von Neumann ordinals”.
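Since hereditarily finite sets can be represented directly as nested frozensets, the finite von Neumann ordinals are easy to compute. A minimal Python sketch, illustrative only (the helper names are ours):

def successor(a):
    # von Neumann successor: a+ = a ∪ {a}
    return a | frozenset([a])

def ordinal(n):
    # the n-th finite von Neumann ordinal
    a = frozenset()                    # 0 = ∅
    for _ in range(n):
        a = successor(a)
    return a

def is_transitive(A):
    # (∀x ∈ A) x ⊆ A
    return all(x <= A for x in A)

three = ordinal(3)
print(len(three), is_transitive(three))   # 3 True: 3 = {0, 1, 2}
print(ordinal(2) in three)                # True: 2 ∈ 3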


63.6 Basic Properties of the Ordinals

We observed that the first few ordinals are the natural numbers. The main reason for developing a theory of ordinals is to extend the principle of induction which holds on the natural numbers. We will build up to this via a sequence of elementary results.

Lemma 63.14. Every element of an ordinal is an ordinal.

Proof. Let α be an ordinal with b ∈ α. Since α is transitive, b ⊆ α. So ∈ well-orders b as ∈ well-orders α.

To see that b is transitive, suppose x ∈ c ∈ b. So c ∈ α as b ⊆ α. Again, as α is transitive, c ⊆ α, so that x ∈ α. So x, c, b ∈ α. But ∈ well-orders α, so that ∈ is a transitive relation on α by Proposition 63.2. So since x ∈ c ∈ b, we have x ∈ b. Generalising, c ⊆ b.

Corollary 63.15. α = {β ∈ α : β is an ordinal}, for any ordinal α.

Proof. Immediate from Lemma 63.14.

The rough gist of the next two main results, Theorem 63.16 and Theorem 63.17, is that the ordinals themselves are well-ordered by membership:

Theorem 63.16 (Transfinite Induction). For any formula φ(x):

if ∃αφ(α), then ∃α(φ(α) ∧ (∀β ∈ α)¬φ(β))

where the displayed quantifiers are implicitly restricted to ordinals.


Proof. Suppose φ(α), for some ordinal α. If (∀β ∈ α)¬φ(β), then we are done.
Otherwise, as α is an ordinal, it has some ∈-least element which is φ, and this
is an ordinal by Lemma 63.14.

Note that we can equally express Theorem 63.16 as the scheme:

if ∀α((∀β ∈ α)φ(β) → φ(α)), then ∀αφ(α)

just by taking ¬φ(α) in Theorem 63.16, and then performing elementary logical
manipulations.
Theorem 63.17 (Trichotomy). α ∈ β ∨ α = β ∨ β ∈ α, for any ordinals α and β.

Proof. The proof is by double induction, i.e., using Theorem 63.16 twice. Say that x is comparable with y iff x ∈ y ∨ x = y ∨ y ∈ x.

For induction, suppose that every ordinal in α is comparable with every ordinal. For further induction, suppose that α is comparable with every ordinal in β. We will show that α is comparable with β. By induction on β, it will follow that α is comparable with every ordinal; and so by induction on α, every ordinal is comparable with every ordinal, as required. It suffices to assume that α ∉ β and β ∉ α, and show that α = β.

To show that α ⊆ β, fix γ ∈ α; this is an ordinal by Lemma 63.14. So by the first induction hypothesis, γ is comparable with β. But if either γ = β or β ∈ γ then β ∈ α (invoking the fact that α is transitive if necessary), contrary to our assumption; so γ ∈ β. Generalising, α ⊆ β.

Exactly similar reasoning, using the second induction hypothesis, shows that β ⊆ α. So α = β.

As such, we will sometimes write α < β rather than α ∈ β, since ∈ is behaving as an ordering relation. There are no deep reasons for this, beyond familiarity, and because it is easier to write α ≤ β than α ∈ β ∨ α = β.2

Here are two quick consequences of our last results, the first of which puts our new notation into action:

Corollary 63.18. If ∃αφ(α), then ∃α(φ(α) ∧ ∀β(φ(β) → α ≤ β)). Moreover, for any ordinals α, β, γ, both α ∉ α and α ∈ β ∈ γ → α ∈ γ.

Proof. Just like Proposition 63.2.

Problem 63.3. Complete the “exactly similar reasoning” in the proof of The-
orem 63.17.

Corollary 63.19. A is an ordinal iff A is a transitive set of ordinals.

Proof. Left-to-right. By Lemma 63.14. Right-to-left. If A is a transitive set of ordinals, then ∈ well-orders A by Theorem 63.16 and Theorem 63.17.

2. We could write α ∈ β; but that would be wholly non-standard.


Now, we glossed Theorem 63.16 and Theorem 63.17 as telling us that ∈ well-orders the ordinals. However, we have to be very cautious about this sort of claim, thanks to the following result:

Theorem 63.20 (Burali-Forti Paradox). There is no set of all the ordinals.

Proof. For reductio, suppose O is the set of all ordinals. If α ∈ β ∈ O, then α is an ordinal, by Lemma 63.14, so α ∈ O. So O is transitive, and hence O is an ordinal by Corollary 63.19. Hence O ∈ O, contradicting Corollary 63.18.

This result is named after Burali-Forti. But, it was Cantor in 1899—in a letter
to Dedekind—who first saw clearly the contradiction in supposing that there
is a set of all the ordinals. As van Heijenoort explains:

Burali-Forti himself considered the contradiction as establishing, by reductio ad absurdum, the result that the natural ordering of the ordinals is just a partial ordering. (Heijenoort, 1967, p. 105)

Setting Burali-Forti's mistake to one side, we can summarize the foregoing as follows. Ordinals are sets which are individually well-ordered by membership, and collectively well-ordered by membership (without collectively constituting a set).
Rounding this off, here are some more basic properties about the ordinals
which follow from Theorem 63.16 and Theorem 63.17.

Proposition 63.21. Any strictly descending sequence of ordinals is finite.

Proof. Any infinite strictly descending sequence of ordinals α0 > α1 > α2 > . . .
has no <-minimal member, contradicting Theorem 63.16.

Proposition 63.22. α ⊆ β ∨ β ⊆ α, for any ordinals α, β.

Proof. If α ∈ β, then α ⊆ β as β is transitive. Similarly, if β ∈ α, then β ⊆ α. And if α = β, then α ⊆ β and β ⊆ α. So by Theorem 63.17 we are done.

Proposition 63.23. α = β iff α ≅ β, for any ordinals α, β.

Proof. The ordinals are well-orders; so this is immediate from Trichotomy (Theorem 63.17) and Lemma 63.9.

Problem 63.4. Prove that, if every member of X is an ordinal, then ⋃X is an ordinal.


63.7 Replacement

In section 63.5, we motivated the introduction of ordinals by suggesting that we could treat them as order-types, i.e., canonical proxies for well-orderings. In order for that to work, we would need to prove that every well-ordering is isomorphic to some ordinal. This would allow us to define ord(A, <) as the ordinal α such that ⟨A, <⟩ ≅ α.

Unfortunately, we cannot prove the desired result using only the Axioms we have introduced so far. (We will see why in section 65.2, but for now the point is: we can't.) We need a new thought, and here it is:

Axiom (Scheme of Replacement). For any formula φ(x, y), the following
is an axiom:

for any A, if (∀x ∈ A)∃!y φ(x, y), then {y : (∃x ∈ A)φ(x, y)} exists.

As with Separation, this is a scheme: it yields infinitely many axioms, for each
of the infinitely many different φ’s. And it can equally well be (and normally
is) written down thus:

For any formula φ(x, y) which does not contain “B”, the following is an
axiom:

∀A[(∀x ∈ A)∃!y φ(x, y) → ∃B∀y(y ∈ B ↔ (∃x ∈ A)φ(x, y))]

On first encounter, however, this is quite a tangled formula. The following quick consequence of Replacement probably gives a clearer expression to the intuitive idea we are working with:

Corollary 63.24. For any term τ (x), and any set A, this set exists:

{τ (x) : x ∈ A} = {y : (∃x ∈ A)y = τ (x)}.

Proof. Since τ is a term, ∀x∃!y τ(x) = y. A fortiori, (∀x ∈ A)∃!y τ(x) = y. So {y : (∃x ∈ A)τ(x) = y} exists by Replacement.

This suggests that “Replacement” is a good name for the Axiom: given a set
A, you can form a new set, {τ (x) : x ∈ A}, by replacing every member of A
with its image under τ . Indeed, following the notation for the image of a set
under a function, we might write τ [A] for {τ (x) : x ∈ A}.
Crucially, however, τ is a term. It need not be (a name for) a function, in
the sense of section 3.3, i.e., a certain set of ordered pairs. After all, if f is a
function (in that sense), then the set f [A] = {f (x) : x ∈ A} is just a particular
subset of ran(f ), and that is already guaranteed to exist, just using the axioms

of Z−.3 Replacement, by contrast, is a powerful addition to our axioms, as we will see in chapter 65.

3. Just consider {y ∈ ⋃⋃f : (∃x ∈ A)y = f(x)}.
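In programming terms, the set {τ(x) : x ∈ A} delivered by Corollary 63.24 is simply the image of A under the operation τ, as a set-comprehension makes vivid. A minimal sketch, with τ chosen arbitrarily for illustration (continuing the frozenset encoding from section 63.5):

def tau(x):
    # an arbitrary illustrative "term": x ↦ {x}
    return frozenset([x])

zero = frozenset()
one = frozenset([zero])
A = frozenset([zero, one])                 # A = {0, 1} = 2

tau_image = frozenset(tau(x) for x in A)   # τ[A] = {τ(x) : x ∈ A}
print(len(tau_image) <= len(A))            # True: τ[A] is never bigger than A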


63.8 ZF−: a milestone

The question of how to justify Replacement (if at all) is not straightforward. As such, we will reserve that for chapter 65. However, with the addition of Replacement, we have reached another important milestone. We now have all the axioms required for the theory ZF−. In detail:

Definition 63.25. The theory ZF− has these axioms: Extensionality, Union,
Pairs, Powersets, Infinity, and all instances of the Separation and Replacement
schemes. Otherwise put, ZF− adds Replacement to Z− .

This stands for Zermelo–Fraenkel set theory (minus something which we will
come to later). Fraenkel gets the honour, since he is credited with the formu-
lation of Replacement in 1922, although the first precise formulation was due
to Skolem (1922).


63.9 Ordinals as Order-Types

Armed with Replacement, and so now working in ZF−, we can finally prove the result we have been aiming for:

Theorem 63.26. Every well-ordering is isomorphic to a unique ordinal.

Proof. Let ⟨A, <⟩ be a well-order. By Proposition 63.23, it is isomorphic to at most one ordinal. So, for reductio, suppose ⟨A, <⟩ is not isomorphic to any ordinal. We will first "make ⟨A, <⟩ as small as possible". In detail: if some proper initial segment ⟨Aa, <a⟩ is not isomorphic to any ordinal, there is a least a ∈ A with that property; then let B = Aa and ⋖ = <a. Otherwise, let B = A and ⋖ = <.

By definition, every proper initial segment of B is isomorphic to some ordinal, which is unique as above. So by Replacement, the following set exists, and is a function:

f = {⟨β, b⟩ : b ∈ B and β ≅ ⟨Bb, ⋖b⟩}

To complete the reductio, we'll show that f is an isomorphism α → B, for some ordinal α.

It is obvious that ran(f) = B. And by Lemma 63.11, f preserves ordering, i.e., γ ∈ β iff f(γ) ⋖ f(β). To show that dom(f) is an ordinal, by Corollary 63.19 it suffices to show that dom(f) is transitive. So fix β ∈ dom(f), i.e., β ≅ ⟨Bb, ⋖b⟩ for some b. If γ ∈ β, then γ ∈ dom(f) by Lemma 63.10; generalising, β ⊆ dom(f).

This result licenses the following definition, which we have wanted to offer since section 63.5:

Definition 63.27. If ⟨A, <⟩ is a well-ordering, then its order type, ord(A, <), is the unique ordinal α such that ⟨A, <⟩ ≅ α.

Moreover, this definition licenses two nice principles:

Corollary 63.28. Where ⟨A, <⟩ and ⟨B, ⋖⟩ are well-orderings:

ord(A, <) = ord(B, ⋖) iff ⟨A, <⟩ ≅ ⟨B, ⋖⟩
ord(A, <) ∈ ord(B, ⋖) iff ⟨A, <⟩ ≅ ⟨Bb, ⋖b⟩ for some b ∈ B

Proof. The identity holds by Proposition 63.23. To prove the second claim, let ord(A, <) = α and ord(B, ⋖) = β, and let f : β → ⟨B, ⋖⟩ be our isomorphism. Then:

α ∈ β iff f↾α : α → Bf(α) is an isomorphism
iff ⟨A, <⟩ ≅ ⟨Bf(α), ⋖f(α)⟩
iff ⟨A, <⟩ ≅ ⟨Bb, ⋖b⟩ for some b ∈ B

by Proposition 63.7, Lemma 63.10, and Corollary 63.15.


63.10 Successor and Limit Ordinals

In the next few chapters, we will use ordinals a great deal. So it will help if we introduce some simple notions.

Definition 63.29. For any ordinal α, its successor is α+ = α ∪ {α}. We say that α is a successor ordinal if β+ = α for some ordinal β. We say that α is a limit ordinal iff α is neither empty nor a successor ordinal.

The following result shows that this is the right notion of successor:

Proposition 63.30. For any ordinal α:

1. α ∈ α+;

2. α+ is an ordinal;

3. there is no ordinal β such that α ∈ β ∈ α+.

Proof. Trivially, α ∈ α ∪ {α} = α+. Equally, α+ is a transitive set of ordinals, and hence an ordinal by Corollary 63.19. And it is impossible that α ∈ β ∈ α+, since then either β ∈ α or β = α, contradicting Corollary 63.18.

This also licenses a variant of proof by transfinite induction:

Theorem 63.31 (Simple Transfinite Induction). Let φ(x) be a formula such that:

1. φ(∅); and

2. for any ordinal α, if φ(α) then φ(α+); and

3. if α is a limit ordinal and (∀β ∈ α)φ(β), then φ(α).

Then ∀αφ(α).

Proof. We prove the contrapositive. So, suppose there is some ordinal which is ¬φ; let γ be the least such ordinal. Then either γ = ∅, or γ = α+ for some α such that φ(α), or γ is a limit ordinal and (∀β ∈ γ)φ(β). In each case, one of our three conditions is contradicted.

A final bit of notation will prove helpful later on:

Definition 63.32. If X is a set of ordinals, then lsub(X) = ⋃α∈X α+.

Here, "lsub" stands for "least strict upper bound".4 The following result explains this:

Proposition 63.33. If X is a set of ordinals, lsub(X) is the least ordinal greater than every ordinal in X.

Proof. Let Y = {α+ : α ∈ X}, so that lsub(X) = ⋃Y. Since ordinals are transitive and every member of an ordinal is an ordinal, lsub(X) is a transitive set of ordinals, and so is an ordinal by Corollary 63.19.

If α ∈ X, then α+ ∈ Y, so α+ ⊆ ⋃Y = lsub(X), and hence α ∈ lsub(X). So lsub(X) is strictly greater than every ordinal in X.

Conversely, if α ∈ lsub(X), then α ∈ β+ ∈ Y for some β ∈ X, so that α ≤ β ∈ X. So lsub(X) is the least strict upper bound on X.
4. Some books use "sup(X)" for this. But other books use "sup(X)" for the least non-strict upper bound, i.e., simply ⋃X. If X has a greatest element, α, these notions come apart: the least strict upper bound is α+, whereas the least non-strict upper bound is just α.
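For hereditarily finite sets, lsub is directly computable. A minimal Python sketch (reusing the illustrative "successor" and "ordinal" helpers from section 63.5), which also exhibits the gap between lsub(X) and ⋃X noted in the footnote:

def lsub(X):
    # lsub(X) = ⋃{α+ : α ∈ X}
    out = frozenset()
    for a in X:
        out |= successor(a)
    return out

X = {ordinal(0), ordinal(1), ordinal(2)}
print(lsub(X) == ordinal(3))                 # True: X's greatest element is 2, so lsub(X) = 3
print(frozenset().union(*X) == ordinal(2))   # True: the non-strict upper bound ⋃X is just 2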


Chapter 64

Stages and Ranks


64.1 Defining the Stages as the Vα s

In chapter 63, we defined well-orderings and the (von Neumann) ordinals. In this chapter, we will use these to characterise the hierarchy of sets itself. To do this, recall that in section 63.10, we defined the idea of successor and limit ordinals. We use these ideas in the following definition:

Definition 64.1.

V∅ = ∅
Vα+ = ℘(Vα) for any ordinal α
Vα = ⋃γ<α Vγ when α is a limit ordinal

This will be a definition by transfinite recursion on the ordinals. In this regard, we should compare this with recursive definitions of functions on the natural numbers.1 As when dealing with natural numbers, one defines a base case and successor cases; but when dealing with ordinals, we also need to describe the behaviour of limit cases.

This definition of the Vα s will be an important milestone. We have informally motivated our hierarchy of sets as forming sets by stages. The Vα s are, in effect, just those stages. Importantly, though, this is an internal characterisation of the stages. Rather than suggesting a possible model of the theory, we will have defined the stages within our set theory.

1. Cf. the definitions of addition, multiplication, and exponentiation in section 6.2.
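For finite indices, Definition 64.1 is directly computable, since only powersets are involved. A minimal Python sketch of the first few stages, with sizes 0, 1, 2, 4, 16 (and 65536 next, which is why we stop):

from itertools import combinations

def powerset(s):
    # ℘(s) as a frozenset of frozensets
    elems = list(s)
    return frozenset(frozenset(c) for r in range(len(elems) + 1)
                     for c in combinations(elems, r))

V = [frozenset()]                  # V_0 = ∅
for n in range(4):
    V.append(powerset(V[-1]))      # V_{n+1} = ℘(V_n)

print([len(v) for v in V])         # [0, 1, 2, 4, 16]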

64.2 The Transfinite Recursion Theorem(s)

The first thing we must do, though, is confirm that Definition 64.1 is a successful definition. More generally, we need to prove that any attempt to offer a definition by (transfinite) recursion will succeed. That is the aim of this section.

Warning: this is tricky material. The overarching moral, though, is quite simple: Transfinite Induction plus Replacement guarantee the legitimacy of (several versions of) transfinite recursion.2

Definition 64.2. Let τ(x) be a term; let f be a function; let α be an ordinal. We say that f is an α-approximation for τ iff both dom(f) = α and (∀β ∈ α)f(β) = τ(f↾β).

Lemma 64.3 (Bounded Recursion). For any term τ(x) and any ordinal α, there is a unique α-approximation for τ.

Proof. We will show that, for any γ ≤ α, there is a unique γ-approximation.

We first establish uniqueness. Let g and h (respectively) be γ- and δ-approximations. A transfinite induction on their arguments shows that g(β) = h(β) for any β ∈ dom(g) ∩ dom(h) = γ ∩ δ = min(γ, δ). So our approximations are unique (if they exist), and agree on all values.

To establish existence, we now use a simple transfinite induction (Theorem 63.31) on ordinals δ ≤ α.

The empty function is trivially an ∅-approximation.

If g is a γ-approximation, then g ∪ {⟨γ, τ(g)⟩} is a γ+-approximation.

If γ is a limit ordinal and gδ is a δ-approximation for all δ < γ, let g = ⋃δ∈γ gδ. This is a function, since our various gδ s agree on all values. And if δ ∈ γ then g(δ) = gδ+(δ) = τ(gδ+↾δ) = τ(g↾δ).

This completes the proof by transfinite induction.
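To get a feel for Lemma 64.3 at finite levels, here is a minimal Python sketch that builds the (unique) n-approximation for a given τ, representing an approximation as a dict with domain {0, . . . , n−1}. The particular τ is an arbitrary illustrative choice: it sends a partial function to the frozenset of its values.

def tau(f):
    # illustrative term: the set of values of the partial function f
    return frozenset(f.values())

def approximation(n):
    f = {}                 # the empty function is the ∅-approximation
    for k in range(n):
        f[k] = tau(f)      # extend by f(k) = τ(f↾k)
    return f

# For this τ, f(k) turns out to be the k-th von Neumann ordinal:
print(approximation(3)[2] ==
      frozenset([frozenset(), frozenset([frozenset()])]))   # True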

If we allow ourselves to define a term rather than a function, then we can remove the bound α from the previous result. In the statement and proof of the following result, when σ is a term, we let σ↾α = {⟨β, σ(β)⟩ : β ∈ α}.

Theorem 64.4 (General Recursion). For any term τ(x), we can explicitly define a term σ(x), such that σ(α) = τ(σ↾α) for any ordinal α.

2. A reminder: all formulas and terms can have parameters (unless explicitly stated otherwise).


Proof. For each α, by Lemma 64.3 there is a unique α-approximation, fα, for τ. Define σ(α) as fα+(α). Now:

σ(α) = fα+(α)
= τ(fα+↾α)
= τ({⟨β, fα+(β)⟩ : β ∈ α})
= τ({⟨β, fβ+(β)⟩ : β ∈ α})
= τ(σ↾α)

noting that fβ+(β) = fα+(β) for all β < α, as in Lemma 64.3.

Note that Theorem 64.4 is a schema. Crucially, we cannot expect σ to define a function, i.e., a certain kind of set, since then dom(σ) would be the set of all ordinals, contradicting the Burali-Forti Paradox (Theorem 63.20).

It still remains to show, though, that Theorem 64.4 vindicates our definition of the Vα s. This may not be immediately obvious; but it will become apparent with a last, simple, version of transfinite recursion.

Theorem 64.5 (Simple Recursion). For any terms τ(x) and θ(x) and any set A, we can explicitly define a term σ(x) such that:

σ(∅) = A
σ(α+) = τ(σ(α)) for any ordinal α
σ(α) = θ(ran(σ↾α)) when α is a limit ordinal

Proof. We start by defining a term, ξ(x), as follows:

ξ(x) = A, if x is not a function whose domain is an ordinal; otherwise:
ξ(x) = τ(x(α)), if dom(x) = α+;
ξ(x) = θ(ran(x)), if dom(x) is a limit ordinal.

By Theorem 64.4, there is a term σ(x) such that σ(α) = ξ(σ↾α) for every ordinal α; moreover, σ↾α is a function with domain α. We show that σ has the required properties, by simple transfinite induction (Theorem 63.31).

First, σ(∅) = ξ(∅) = A.
Next, σ(α+) = ξ(σ↾α+) = τ(σ↾α+(α)) = τ(σ(α)).
Last, σ(α) = ξ(σ↾α) = θ(ran(σ↾α)), when α is a limit.

Now, to vindicate Definition 64.1, just take A = ∅ and τ(x) = ℘(x) and θ(x) = ⋃x. At long last, this vindicates the definition of the Vα s!


64.3 Basic Properties of Stages

To bring out the foundational importance of the definition of the Vα s, we will present a few basic results about them. We start with a definition:3

Definition 64.6. The set A is potent iff ∀x((∃y ∈ A)x ⊆ y → x ∈ A).

Lemma 64.7. For each ordinal α:

1. Each Vα is transitive.

2. Each Vα is potent.

3. If γ ∈ α, then Vγ ∈ Vα (and hence also Vγ ⊆ Vα by (1)).

Proof. We prove this by a (simultaneous) transfinite induction. For induction, suppose that (1)–(3) holds for each ordinal β < α.

The case of α = ∅ is trivial.

Suppose α = β+. To show (3), if γ ∈ α then Vγ ⊆ Vβ by hypothesis, so Vγ ∈ ℘(Vβ) = Vα. To show (2), suppose A ⊆ B ∈ Vα, i.e., A ⊆ B ⊆ Vβ; then A ⊆ Vβ so A ∈ Vα. To show (1), note that if x ∈ A ∈ Vα we have A ⊆ Vβ, so x ∈ Vβ, so x ⊆ Vβ as Vβ is transitive by hypothesis, and so x ∈ Vα.

Suppose α is a limit ordinal. To show (3), if γ ∈ α then γ ∈ γ+ ∈ α, so that Vγ ∈ Vγ+ by assumption, hence Vγ ∈ ⋃β∈α Vβ = Vα. To show (1) and (2), just observe that a union of transitive (respectively, potent) sets is transitive (respectively, potent).

Lemma 64.8. For each ordinal α, Vα ∉ Vα.

Proof. By transfinite induction. Evidently V∅ ∉ V∅.

If Vα+ ∈ Vα+ = ℘(Vα), then Vα+ ⊆ Vα; and since Vα ∈ Vα+ by Lemma 64.7, we have Vα ∈ Vα. Conversely: if Vα ∉ Vα then Vα+ ∉ Vα+.

If α is a limit and Vα ∈ Vα = ⋃β∈α Vβ, then Vα ∈ Vβ for some β ∈ α; but then also Vβ ∈ Vα so that Vβ ∈ Vβ by Lemma 64.7 (twice). Conversely, if Vβ ∉ Vβ for all β ∈ α, then Vα ∉ Vα.

Corollary 64.9. For any ordinals α, β: α ∈ β iff Vα ∈ Vβ.

Proof. Lemma 64.7 gives one direction. Conversely, suppose Vα ∈ Vβ. Then α ≠ β by Lemma 64.8; and β ∉ α, for otherwise we would have Vβ ∈ Vα and hence Vβ ∈ Vβ by Lemma 64.7 (twice), contradicting Lemma 64.8. So α ∈ β by Trichotomy.

All of this allows us to think of each Vα as the αth stage of the hierarchy. Here
is why.
Certainly our Vα s can be thought of as being formed in an iterative process,
for our use of ordinals tracks the notion of iteration. Moreover, if one stage
3. There's no standard terminology for "potent"; this is the name used by Button (2021).


is formed before the other, i.e., Vβ ∈ Vα, i.e., β ∈ α, then our process of formation is cumulative, since Vβ ⊆ Vα. Finally, we are indeed forming all possible collections of sets that were available at any earlier stage, since any successor stage Vα+ is the power-set of its predecessor Vα.

In short: with ZF−, we are almost done, in articulating our vision of the cumulative-iterative hierarchy of sets. (Though, of course, we still need to justify Replacement.)


64.4 Foundation

We are only almost done—and not quite finished—because nothing in ZF− guarantees that every set is in some Vα, i.e., that every set is formed at some stage.
Now, there is a fairly straightforward (mathematical) sense in which we
don’t care whether there are sets outside the hierarchy. (If there are any
there, we can simply ignore them.) But we have motivated our concept of set
with the thought that every set is formed at some stage (see Stages-are-key
in section 62.1). So we will want to preclude the possibility of sets which fall
outside of the hierarchy. Accordingly, we must add a new axiom, which ensures
that every set occurs somewhere in the hierarchy.
Since the Vα s are our stages, we might simply consider adding the following
as an axiom:

Regularity. ∀A∃α A ⊆ Vα

This would be a perfectly reasonable approach. However, for reasons that will be explained in the next section, we will instead adopt an alternative axiom:

Axiom (Foundation). (∀A ̸= ∅)(∃B ∈ A)A ∩ B = ∅.

With some effort, we can show (in ZF− ) that Foundation entails Regularity:

Definition 64.10. For each set A, let:

cl0(A) = A,
cln+1(A) = ⋃cln(A),
trcl(A) = ⋃n<ω cln(A).

We call trcl(A) the transitive closure of A.

The name “transitive closure” is apt:

Proposition 64.11. A ⊆ trcl(A) and trcl(A) is a transitive set.

Proof. Evidently A = cl0(A) ⊆ trcl(A). And if x ∈ b ∈ trcl(A), then b ∈ cln(A) for some n, so x ∈ cln+1(A) ⊆ trcl(A).
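On hereditarily finite sets the recursion in Definition 64.10 bottoms out after finitely many steps, so the transitive closure can be computed directly. A minimal Python sketch (again with frozensets):

def trcl(A):
    # accumulate A, ⋃A, ⋃⋃A, ... until the layers run out
    result, layer = frozenset(A), frozenset(A)
    while layer:
        layer = frozenset(x for y in layer for x in y)   # ⋃ of the current layer
        result |= layer
    return result

# trcl({2}) collects 2 together with 2's hereditary members 1 and 0:
two = frozenset([frozenset(), frozenset([frozenset()])])
print(len(trcl(frozenset([two]))))   # 3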

Lemma 64.12. If A is a transitive set, then there is some α such that A ⊆ Vα.

Proof. Recalling the definition of "lsub(X)" from Definition 63.32, define two sets:

D = {x ∈ A : ∀δ x ⊈ Vδ}
α = lsub{δ : (∃x ∈ A)(x ⊆ Vδ ∧ (∀γ ∈ δ)x ⊈ Vγ)}

Suppose D = ∅. So if x ∈ A, then there is some δ such that x ⊆ Vδ and, by the well-ordering of the ordinals, (∀γ ∈ δ)x ⊈ Vγ; hence δ ∈ α and so x ∈ Vα by Lemma 64.7. Hence A ⊆ Vα, as required.

So it suffices to show that D = ∅. For reductio, suppose otherwise. By Foundation, there is some B ∈ D ⊆ A such that D ∩ B = ∅. If x ∈ B then x ∈ A, since A is transitive, and since x ∉ D, it follows that ∃δ x ⊆ Vδ. So now let

β = lsub{δ : (∃x ∈ B)(x ⊆ Vδ ∧ (∀γ < δ)x ⊈ Vγ)}.

As before, B ⊆ Vβ, contradicting the claim that B ∈ D.

Theorem 64.13. Regularity holds.

Proof. Fix A; now A ⊆ trcl(A) by Proposition 64.11, which is transitive. So there is some α such that A ⊆ trcl(A) ⊆ Vα by Lemma 64.12.

These results show that ZF− proves the conditional Foundation ⇒ Regularity.
In Proposition 64.22, we will show that ZF− proves Regularity ⇒ Foundation.
As such, Foundation and Regularity are equivalent (modulo ZF− ). But this
means that, given ZF− , we can justify Foundation by noting that it is equiva-
lent to Regularity. And we can justify Regularity immediately on the basis of
Stages-are-key.

64.5 Z and ZF: A Milestone

With Foundation, we reach another important milestone. We have considered theories Z− and ZF−, which we said were certain theories "minus" a certain something. That certain something is Foundation. So:
Definition 64.14. The theory Z adds Foundation to Z− . So its axioms are
Extensionality, Union, Pairs, Powersets, Infinity, Foundation, and all instances
of the Separation scheme.
The theory ZF adds Foundation to ZF− . Otherwise put, ZF adds all
instances of Replacement to Z.


Still, one question might have occurred to you. If Regularity is equivalent over ZF− to Foundation, and Regularity's justification is clear, why bother to go around the houses, and take Foundation as our basic axiom, rather than Regularity?
Setting aside historical reasons (to do with who formulated what and when),
the basic reason is that Foundation can be presented without employing the
definition of the Vα s. That definition relied upon all of the work of section 64.2:
we needed to prove Transfinite Recursion, to show that it was justified. But our
proof of Transfinite Recursion employed Replacement. So, whilst Foundation
and Regularity are equivalent modulo ZF− , they are not equivalent modulo
Z− .
Indeed, the matter is more drastic than this simple remark suggests. Though
it goes well beyond this book’s remit, it turns out that both Z− and Z are too
weak to define the Vα s. So, if you are working only in Z, then Regularity (as we
have formulated it) does not even make sense. This is why our official axiom
is Foundation, rather than Regularity.
From now on, we will work in ZF (unless otherwise stated), without any
further comment.

64.6 Rank

Now that we have defined the stages as the Vα's, and we know that every set is a subset of some stage, we can define the rank of a set. Intuitively, the rank of A is the first moment at which A is formed. More precisely:

Definition 64.15. For each set A, rank(A) is the least ordinal α such that A ⊆ Vα.

Proposition 64.16. rank(A) exists, for any A.

Proof. Left as an exercise.

Problem 64.1. Prove Proposition 64.16.
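For hereditarily finite sets, rank admits a simple recursive computation, anticipating Proposition 64.20 below: the rank of A is the least strict upper bound of its members' ranks. A minimal Python sketch, returning ranks as Python integers:

def rank(A):
    # rank(A) = lsub of the ranks of A's members (cf. Proposition 64.20)
    return max((rank(x) + 1 for x in A), default=0)

zero = frozenset()
one = frozenset([zero])
two = frozenset([zero, one])
print(rank(zero), rank(one), rank(two))   # 0 1 2, matching Corollary 64.21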

The well-ordering of ranks allows us to prove some important results:

Proposition 64.17. For any ordinal α, Vα = {x : rank(x) ∈ α}.

Proof. If rank(x) ∈ α then x ⊆ Vrank(x) ∈ Vα, so x ∈ Vα as Vα is potent (invoking Lemma 64.7 multiple times). Conversely, if x ∈ Vα then x ⊆ Vα, so rank(x) ≤ α; now a simple transfinite induction shows that x ∉ Vrank(x), so that rank(x) ≠ α.

Problem 64.2. Complete the simple transfinite induction mentioned in Proposition 64.17.

Proposition 64.18. If B ∈ A, then rank(B) ∈ rank(A).

Proof. A ⊆ Vrank(A) = {x : rank(x) ∈ rank(A)} by Proposition 64.17.

Using this fact, we can establish a result which allows us to prove things about
all sets by a form of induction:

Theorem 64.19 (∈-Induction Scheme). For any formula φ:

∀A((∀x ∈ A)φ(x) → φ(A)) → ∀Aφ(A).

Proof. We will prove the contrapositive. So, suppose ¬∀Aφ(A). By Transfinite Induction (Theorem 63.16), there is some non-φ of least possible rank; i.e. some A such that ¬φ(A) and ∀x(rank(x) ∈ rank(A) → φ(x)). Now if x ∈ A then rank(x) ∈ rank(A), by Proposition 64.18, so that φ(x); i.e. (∀x ∈ A)φ(x) ∧ ¬φ(A).

Here is an informal way to gloss this powerful result. Say that φ is hereditary
iff whenever every element of a set is φ, the set itself is φ. Then ∈-Induction
tells you the following: if φ is hereditary, every set is φ.
To wrap up the discussion of ranks (for now), we’ll prove a few claims which
we have foreshadowed a few times.

Proposition 64.20. rank(A) = lsubx∈A rank(x).

Proof. Let α = lsubx∈A rank(x). By Proposition 64.18, α ≤ rank(A). But if x ∈ A then rank(x) ∈ α, so that x ∈ Vα by Proposition 64.17, and hence A ⊆ Vα, i.e., rank(A) ≤ α. Hence rank(A) = α.

Corollary 64.21. For any ordinal α, rank(α) = α.

Proof. Suppose for transfinite induction that rank(β) = β for all β ∈ α. Now rank(α) = lsubβ∈α rank(β) = lsubβ∈α β = α by Proposition 64.20.

Finally, here is a quick proof of the result promised at the end of section 64.4,
that ZF− proves the conditional Regularity ⇒ Foundation. (Note that the
notion of “rank” and Proposition 64.18 are available for use in this proof since—
as mentioned at the start of this section—they can be presented using ZF− +
Regularity.)

Proposition 64.22 (working in ZF− + Regularity). Foundation holds.

Proof. Fix A ≠ ∅, and some B ∈ A of least possible rank. If c ∈ B then rank(c) ∈ rank(B) by Proposition 64.18, so that c ∉ A by choice of B.


Chapter 65

Replacement


65.1 Introduction

Replacement is the axiom scheme which makes the difference between ZF and Z. We helped ourselves to it throughout chapters 63 to 64. In this chapter, we will finally consider the question: is Replacement justified?

To make the question sharp, it is worth observing that Replacement is really rather strong. We will get a sense of just how strong it is, during this chapter (and again in section 68.5). But this will suggest that justification really is required.
We will discuss two kinds of justification. Roughly: an extrinsic justification
is an attempt to justify an axiom by its fruits; an intrinsic justification is
an attempt to justify an axiom by suggesting that it is vindicated by the
mathematical concepts in question. We will get a greater sense of what this
means during this chapter, but it is just the tip of an iceberg. For more, see in
particular Maddy (1988a and 1988b).

65.2 The Strength of Replacement

We begin with a simple observation about the strength of Replacement: unless we go beyond Z, we cannot prove the existence of any von Neumann ordinal greater than or equal to ω + ω.

Here is a sketch of why. Working in ZF, consider the set Vω+ω. This set acts as the domain for a model for Z. To see this, we introduce some notation for the relativization of a formula:

Definition 65.1. For any set M, and any formula φ, let φM be the formula which results by restricting all of φ's quantifiers to M. That is, replace "∃x" with "(∃x ∈ M)", and replace "∀x" with "(∀x ∈ M)".
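For instance, taking φ to be ∃x∀y y ∉ x (i.e., "there is an empty set"), φM is (∃x ∈ M)(∀y ∈ M) y ∉ x, which asserts that some element of M has no elements in M.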
It can be shown that, for every axiom φ of Z, we have that ZF ⊢ φVω+ω .
But ω + ω is not in Vω+ω , by Corollary 64.21. So Z is consistent with the
non-existence of ω + ω.
This is why we said, in section 63.7, that Theorem 63.26 cannot be proved
without Replacement. For it is easy, within Z, to define an explicit well-
ordering which intuitively should have order-type ω + ω. Indeed, we gave an
informal example of this in section 63.2, when we presented the ordering on
the natural numbers given by:

n ⋖ m iff either n < m and m − n is even,
or n is even and m is odd.

But if ω + ω does not exist, this well-ordering is not isomorphic to any ordinal.
So Z does not prove Theorem 63.26.
Flipping things around: Replacement allows us to prove the existence of
ω + ω, and hence must allow us to prove the existence of Vω+ω . And not just
that. For any well-ordering we can define, Theorem 63.26 tells us that there
is some α isomorphic with that well-ordering, and hence that Vα exists. In a
straightforward way, then, Replacement guarantees that the hierarchy of sets
must be very tall.
Over the next few sections, and then again in section 68.5, we'll get a better sense of just how tall Replacement forces the hierarchy to be. The simple point, for now, is that Replacement really does stand in need of justification!

content/set-theory/replacement/extrinsic.tex

65.3 Extrinsic Considerations about Replacement

We start by considering an extrinsic attempt to justify Replacement. Boolos suggests one, as follows.
[. . . ] the reason for adopting the axioms of replacement is quite
simple: they have many desirable consequences and (apparently)
no undesirable ones. In addition to theorems about the iterative
conception, the consequences include a satisfactory if not ideal the-
ory of infinite numbers, and a highly desirable result that justifies
inductive definitions on well-founded relations. (Boolos, 1971, 229)
The gist of Boolos’s idea is that we should justify Replacement by its fruits.
And the specific fruits he mentions are the things we have discussed in the
past few chapters. Replacement allowed us to prove that the von Neumann
ordinals were excellent surrogates for the idea of a well-ordering type (this
is our “satisfactory if not ideal theory of infinite numbers”). Replacement


also allowed us to define the Vα s, establish the notion of rank, and prove ∈-
Induction (this amounts to our “theorems about the iterative conception”).
Finally, Replacement allows us to prove the Transfinite Recursion Theorem
(this is the “inductive definitions on well-founded relations”).
These are, indeed, desirable consequences. But do these desirable conse-
quences suffice to justify Replacement? No. Or at least, not straightforwardly.
Here is a simple problem. Whilst we have stated some desirable conse-
quences of Replacement, we could have obtained many of them via other means.
This is not as well known as it ought to be, though, so we should pause to ex-
plain the situation.
There is a simple theory of sets, Level Theory, or LT for short.1 LT’s axioms
are just Extensionality, Separation, and the claim that every set is a subset of
some level, where “level” is cunningly defined so that the levels behave like our
friends, the Vα s. So ZF proves LT; but LT is much weaker than ZF. In fact,
LT does not give you Pairs, Powersets, Infinity, or Replacement. Let Zr be
the result of adding Infinity and Powersets to LT; this delivers Pairs too, so,
Zr is at least as strong as Z. But, in fact, Zr is strictly stronger than Z, since
it adds the claim that every set has a rank (hence my suggestion that we call
it Zr). Indeed, Zr delivers: a perfectly satisfactory theory of ordinals; results
which stratify the hierarchy into well-ordered stages; a proof of ∈-Induction;
and a version of Transfinite Recursion.
In short: although Boolos didn’t know this, all of the desirable consequences
which he mentions could have been arrived at without Replacement; he simply
needed to use Zr rather than Z.
(Given all of this, why did we follow the conventional route, of teaching
you ZF, rather than LT and Zr? There are two reasons. First: for purely
historical reasons, starting with LT is rather nonstandard; we wanted to equip
you to be able to read more standard discussions of set theory. Second: when
you are ready to appreciate LT and Zr, you can simply read Potter 2004 and
Button 2021.)
Of course, since Zr is strictly weaker than ZF, there are results which
ZF proves which Zr leaves open. So one could try to justify Replacement on
extrinsic grounds by pointing to one of these results. But, once you know how
to use Zr, it is quite hard to find many examples of things that are (a) settled
by Replacement but not otherwise, and (b) are intuitively true. (For more on
this, see Potter 2004, §13.2.)
The bottom line is this. To provide a compelling extrinsic justification for
Replacement, one would need to find a result which cannot be achieved without
Replacement. And that’s not an easy enterprise.
1. The first versions of LT are offered by Montague (1965) and Scott (1974); this was simplified, and given a book-length treatment, by Potter (2004); and Button (2021) has recently simplified LT further.

Let's consider a further problem which arises for any attempt to offer a purely extrinsic justification for Replacement. (This problem is perhaps more fundamental than the first.) Boolos does not just point out that Replacement has many desirable consequences. He also states that Replacement has
“(apparently) no undesirable” consequences. But this parenthetical caveat,
“apparently,” is surely absolutely crucial.
Recall how we ended up here: Naïve Comprehension ran into inconsistency,
and we responded to this inconsistency by embracing the cumulative-iterative
conception of set. This conception comes equipped with a story which, we hope,
assures us of its consistency. But if we cannot justify Replacement from within
that story, then we have (as yet) no reason to believe that ZF is consistent.
Or rather: we have no reason to believe that ZF is consistent, apart from the
(perhaps merely contingent) fact that no one has discovered a contradiction
yet. In exactly that sense, Boolos’s comment seems to come down to this:
“(apparently) ZF is consistent”. We should demand greater reassurance of
consistency than this.
This issue will affect any purely extrinsic attempt to justify Replacement,
i.e., any justification which is couched solely in terms of the (known) conse-
quences of ZF. As such, we will want to look for an intrinsic justification of
Replacement, i.e., a justification which suggests that the story which we told
about sets somehow “already” commits us to Replacement.

65.4 Limitation-of-size

Perhaps the most common attempt to offer an "intrinsic" justification of Replacement comes via the following notion:

Limitation-of-size. Any things form a set, provided that there are not too many of them.

This principle will immediately vindicate Replacement. After all, any set
formed by Replacement cannot be any larger than any set from which it was
formed. Stated precisely: suppose you form a set τ [A] = {τ (x) : x ∈ A} using
Replacement; then τ [A] ⪯ A; so if the elements of A were not too numerous to
form a set, their images are not too numerous to form τ [A].
The obvious difficulty with invoking Limitation-of-size to justify Replace-
ment is that we have not yet laid down any principle like Limitation-of-size.
Moreover, when we told our story about the cumulative-iterative conception of
set in chapters 61 to 62, nothing ever hinted in the direction of Limitation-of-
size. This, indeed, is precisely why Boolos at one point wrote: “Perhaps one
may conclude that there are at least two thoughts ‘behind’ set theory” (1989,
p. 19). On the one hand, the ideas surrounding the cumulative-iterative con-
ception of set are meant to vindicate Z. On the other hand, Limitation-of-size
is meant to vindicate Replacement.
But the issue is not just that we have thus far been silent about Limitation-of-size. Rather, the issue is that Limitation-of-size (as just formulated) seems


to sit quite badly with the cumulative-iterative notion of set. After all, it
mentions nothing about the idea of sets as formed in stages.
This is really not much of a surprise, given the history of these “two
thoughts” (i.e., the cumulative-iterative conception of set, and Limitation-of-
size). These “two thoughts” ultimately amount to two rather different projects
for blocking the set-theoretic paradoxes. The cumulative-iterative notion of set
blocks Russell’s paradox by saying, roughly: we should never have expected a
Russell set to exist, because it would not be “formed” at any stage. By contrast,
Limitation-of-size is meant to rule out the Russell set, by saying, roughly: we
should never have expected a Russell set to exist, because it would have been
too big.
Put like this, then, let’s be blunt: considered as a reply to the paradoxes,
Limitation-of-size stands in need of much more justification. Consider, for
example, this version of Russell’s Paradox: no pug sniffs exactly the pugs which
don’t sniff themselves (see section 61.2). If you ask “why is there no such pug?”,
it is not a good answer to be told that such a pug would have to sniff too many
pugs. So why would it be a good intuitive explanation, of the non-existence of
a Russell set, that it would have to be “too big” to exist?
In short, it’s forgivable if you are a bit mystified concerning the “intuitive”
motivation for Limitation-of-size.

65.5 Replacement and "Absolute Infinity"

We will now put Limitation-of-size behind us, and explore a different family of (intrinsic) attempts to justify Replacement, which do take seriously the idea of the sets as formed in stages.
of the sets as formed in stages.
When we first outlined the iterative process, we offered some principles which explained what happens at each stage. These were Stages-are-key, Stages-are-ordered, and Stages-accumulate. Later, we added some principles which told us something about the number of stages: Stages-keep-going told us that the process of set-formation never ends, and Stages-hit-infinity told us that the process goes through an infinite-th stage.

It is reasonable to suggest that these two latter principles fall out of some broader principle, like:

Stages-are-inexhaustible. There are absolutely infinitely many stages; the hierarchy is as tall as it could possibly be.

Obviously this is an informal principle. But even if it is not immediately entailed by the cumulative-iterative conception of set, it certainly seems consonant with it. At the very least, and unlike Limitation-of-size, it retains the idea that sets are formed stage-by-stage.

The hope, now, is to leverage Stages-are-inexhaustible into a justification of Replacement. So let us see how this might be done.


In section 63.2, we saw that it is easy to construct a well-ordering which (morally) should be isomorphic to ω + ω. Otherwise put, we can easily imagine a stage-by-stage iterative process, whose order-type (morally) is ω + ω. As such, if we have accepted Stages-are-inexhaustible, then we should surely accept that there is at least an ω + ω-th stage of the hierarchy, i.e., Vω+ω, for the hierarchy surely could continue thus far.
surely could continue thus far.
This thought generalizes as follows: for any well-ordering, the process of
building the iterative hierarchy should run at least as far as that well-ordering.
And we could guarantee this, just by treating Theorem 63.26 as an axiom. This
would tell us that any well-ordering is isomorphic to a von Neumann ordinal.
Since each von Neumann ordinal will be equal to its own rank, Theorem 63.26
will then tell us that, whenever we can describe a well-ordering in our set
theory, the iterative process of set building must outrun that well-ordering.
This idea certainly seems like a corollary of Stages-are-inexhaustible. Un-
fortunately, if our aim is to extract Replacement from this idea, then we face a
simple, technical, barrier: Replacement is strictly stronger than Theorem 63.26.
(This observation is made by Potter (2004, §13.2); we will prove it in sec-
tion 65.8.)
The upshot is that, if we are going to understand Stages-are-inexhaustible
in such a way as to yield Replacement, then it cannot merely say that the hier-
archy outruns any well-ordering. It must make a stronger claim than that. To
this end, Shoenfield (1977) proposed a very natural strengthening of the idea,
as follows: the hierarchy is not cofinal with any set.2 In slightly more detail:
if τ is a mapping which sends sets to stages of the hierarchy, the image of any
set A under τ does not exhaust the hierarchy. Otherwise put (schematically):
Stages-are-super-cofinal. If A is a set and τ (x) is a stage for every x ∈ A,
then there is a stage which comes after each τ (x) for x ∈ A.
It is obvious that ZF proves a suitably formalised version of Stages-are-super-
cofinal. Conversely, we can informally argue that Stages-are-super-cofinal jus-
tifies Replacement.3 For suppose (∀x ∈ A)∃!y φ(x, y). Then for each x ∈ A,
let σ(x) be the y such that φ(x, y), and let τ (x) be the stage at which σ(x)
is first formed. By Stages-are-super-cofinal, there is a stage V such that
(∀x ∈ A)τ (x) ∈ V . Now since each τ (x) ∈ V and σ(x) ⊆ τ (x), by Sepa-
ration we can obtain {y ∈ V : (∃x ∈ A)σ(x) = y} = {y : (∃x ∈ A)φ(x, y)}.
Problem 65.1. Formalize Stages-are-super-cofinal within ZF.

So Stages-are-super-cofinal vindicates Replacement. And it is at least plausible that Stages-are-inexhaustible vindicates Stages-are-super-cofinal. For suppose Stages-are-super-cofinal fails. So the hierarchy is cofinal with some set A,
2. Gödel seems to have proposed a similar thought; see Potter (2004, p. 223). For discussion of Gödel and Shoenfield, see Incurvati (2020, 90–5).

3. It would be harder to prove Replacement using some formalisation of Stages-are-super-cofinal, since Z on its own is not strong enough to define the stages, so it is not clear how one would formalise Stages-are-super-cofinal. One option, though, is to work in some extension of LT, as discussed in section 65.3.


i.e., we have a map τ such that for any stage S there is some x ∈ A such that
S ∈ τ (x). In that case, we do have a way to get a handle on the supposed “ab-
solute infinity” of the hierarchy: it is exhausted by the range of τ applied to A.
And that compromises the thought that the hierarchy is “absolutely infinite”.
Contraposing: Stages-are-inexhaustible entails Stages-are-super-cofinal, which
in turn justifies Replacement.
This represents a genuinely promising attempt to provide an intrinsic jus-
tification for Replacement. But whether it ultimately works, or not, we will
have to leave to you to decide.

65.6 Replacement and Reflection

Our last attempt to justify Replacement, via Stages-are-inexhaustible, begins with a deep and lovely result:4

Theorem 65.2 (Reflection Schema). For any formula φ:

∀α∃β > α(∀x1, . . . , xn ∈ Vβ)(φ(x1, . . . , xn) ↔ φVβ(x1, . . . , xn))

As in Definition 65.1, φVβ is the result of restricting every quantifier in φ to the set Vβ. So, intuitively, Reflection says this: if φ is true in the entire hierarchy, then φ is true in arbitrarily many initial segments of the hierarchy.
Montague (1961) and Lévy (1960) showed that (suitable formulations of)
Replacement and Reflection are equivalent, modulo Z, so that adding either
gives you ZF. (We prove these results in section 65.7.) Given this equiva-
lence, one might hope to justify Reflection and Replacement via Stages-are-
inexhaustible as follows: given Stages-are-inexhaustible, the hierarchy should
be very, very tall; so tall, in fact, that nothing we can say about it is sufficient
to bound its height. And we can understand this as the thought that, if any
sentence φ is true in the entire hierarchy, then it is true in arbitrarily many
initial segments of the hierarchy. And that is just Reflection.
Again, this seems like a genuinely promising attempt to provide an intrinsic
justification for Replacement. But there is much too much to say about it here.
You must now decide for yourself whether it succeeds.5

4. A reminder: all formulas can have parameters (unless explicitly stated otherwise).

5. Though you might like to continue by reading Incurvati (2020, 95–100).

65.7 Appendix: Results surrounding Replacement

In this section, we will prove Reflection within ZF. We will also prove a sense in which Reflection is equivalent to Replacement. And we will prove an interesting

consequence of all this, concerning the strength of Reflection/Replacement. Warning: this is easily the most advanced bit of mathematics in this textbook.

We'll start with a lemma which, for brevity, employs the notational device of overlining to deal with sequences of variables or objects. So: "āk" abbreviates "ak1, . . . , akn", where n is determined by context.
Lemma 65.3. For each 1 ≤ i ≤ k, let φi(v̄i, x) be a formula. Then for each α there is some β > α such that, for any ā1, . . . , āk ∈ Vβ and each 1 ≤ i ≤ k:

∃xφi(āi, x) → (∃x ∈ Vβ)φi(āi, x)

Proof. We define a term µ as follows: µ(ā1, . . . , āk) is the least stage, V, which satisfies all of the following conditionals, for 1 ≤ i ≤ k:

∃xφi(āi, x) → (∃x ∈ V)φi(āi, x)

It is easy to confirm that µ(ā1, . . . , āk) exists for all ā1, . . . , āk. Now, using Replacement and our recursion theorem, define:

S0 = Vα+1
Sn+1 = Sn ∪ ⋃{µ(ā1, . . . , āk) : ā1, . . . , āk ∈ Sn}
S = ⋃n<ω Sn.

Each Sn, and hence S itself, is a stage after Vα. Now fix ā1, . . . , āk ∈ S; so there is some n < ω such that ā1, . . . , āk ∈ Sn. Fix some 1 ≤ i ≤ k, and suppose that ∃xφi(āi, x). So (∃x ∈ µ(ā1, . . . , āk))φi(āi, x) by construction, so (∃x ∈ Sn+1)φi(āi, x) and hence (∃x ∈ S)φi(āi, x). So S is our Vβ.

We can now prove Theorem 65.2 quite straightforwardly:


Proof. Fix α. Without loss of generality, we can assume φ’s only connectives are ∃, ¬ and ∧ (since these are expressively adequate). Let ψ1, . . . , ψk enumerate each of φ’s subformulas according to complexity, so that ψk = φ. By Lemma 65.3, there is a β > α such that, for any āi ∈ Vβ and each 1 ≤ i ≤ k:

∃xψi(āi, x) → (∃x ∈ Vβ)ψi(āi, x)    (*)

By induction on complexity of ψi, we will show that ψi(āi) ↔ ψi^Vβ(āi), for any āi ∈ Vβ. If ψi is atomic, this is trivial. The induction hypothesis also establishes that, when ψi is a negation or conjunction of subformulas satisfying this property, ψi itself satisfies this property. So the only interesting case concerns quantification. Fix āi ∈ Vβ; then:

(∃xψi(āi, x))^Vβ  iff  (∃x ∈ Vβ)ψi^Vβ(āi, x)   by definition
                  iff  (∃x ∈ Vβ)ψi(āi, x)       by hypothesis
                  iff  ∃xψi(āi, x)               by (*)

This completes the induction; the result follows as ψk = φ.


We have proved Reflection in ZF. Our proof essentially followed Montague


(1961). We now want to prove in Z that Reflection entails Replacement. The
proof follows Lévy (1960), but with a simplification.
Since we are working in Z, we cannot present Reflection in exactly the form
given above. After all, we formulated Reflection using the “Vα ” notation, and
that cannot be defined in Z (see section 64.5). So instead we will offer an
apparently weaker formulation of Reflection, as follows:

Weak-Reflection. For any formula φ, there is a transitive set S such that


0, 1, and any parameters to φ are elements of S, and (∀x ∈ S)(φ ↔ φ^S).

To use this to prove Replacement, we will first follow Lévy (1960, first part
of Theorem 2) and show that we can “reflect” two formulas at once:

Lemma 65.4 (in Z + Weak-Reflection). For any formulas ψ, χ, there is a transitive set S such that 0 and 1 (and any parameters to the formulas) are elements of S, and (∀x ∈ S)((ψ ↔ ψ^S) ∧ (χ ↔ χ^S)).

Proof. Let φ be the formula (z = 0 ∧ ψ) ∨ (z = 1 ∧ χ).


Here we use an abbreviation; we should spell out “z = 0” as “∀t t ∉ z” and “z = 1” as “∀s(s ∈ z ↔ ∀t t ∉ s)”. But since 0, 1 ∈ S and S is transitive, these formulas are absolute for S; that is, they will apply to the same objects whether or not we restrict their quantifiers to S.6
By Weak-Reflection, we have some appropriate S such that:

(∀z, x ∈ S)(φ ↔ φ^S)
i.e. (∀z, x ∈ S)(((z = 0 ∧ ψ) ∨ (z = 1 ∧ χ)) ↔ ((z = 0 ∧ ψ) ∨ (z = 1 ∧ χ))^S)
i.e. (∀z, x ∈ S)(((z = 0 ∧ ψ) ∨ (z = 1 ∧ χ)) ↔ ((z = 0 ∧ ψ^S) ∨ (z = 1 ∧ χ^S)))
i.e. (∀x ∈ S)((ψ ↔ ψ^S) ∧ (χ ↔ χ^S))

The second claim entails the third because “z = 0” and “z = 1” are absolute for S; the fourth claim follows since 0 ≠ 1.

We can now obtain Replacement, just by following and simplifying Lévy (1960,
Theorem 6):

Theorem 65.5 (in Z + Weak-Reflection). For any formula φ(v, w), and any A, if (∀x ∈ A)∃!y φ(x, y), then {y : (∃x ∈ A)φ(x, y)} exists.

6 More formally, letting ξ be either of these formulas, ξ(z) ↔ ξ^S(z).


Proof. Fix A such that (∀x ∈ A)∃!y φ(x, y), and define formulas:

ψ is (φ(x, z) ∧ A = A)
χ is ∃y φ(x, y)

Using Lemma 65.4, since A is a parameter to ψ, there is a transitive S such


that 0, 1, A ∈ S (along with any other parameters), and such that:

(∀x, z ∈ S)((ψ ↔ ψ^S) ∧ (χ ↔ χ^S))

So in particular:

(∀x, z ∈ S)(φ(x, z) ↔ φ^S(x, z))
(∀x ∈ S)(∃yφ(x, y) ↔ (∃y ∈ S)φ^S(x, y))

Combining these, and observing that A ⊆ S since A ∈ S and S is transitive:

(∀x ∈ A)(∃yφ(x, y) ↔ (∃y ∈ S)φ(x, y))

Now (∀x ∈ A)(∃!y ∈ S)φ(x, y), because (∀x ∈ A)∃!y φ(x, y). Now Separation
yields {y ∈ S : (∃x ∈ A)φ(x, y)} = {y : (∃x ∈ A)φ(x, y)}.


65.8 Appendix: Finite axiomatizability


We close this chapter by extracting some results from Replacement. The first result is due to Montague (1961); note that it is not a proof within ZF, but a proof about ZF:

Theorem 65.6. ZF is not finitely axiomatizable. More generally: if T is finite and T ⊢ ZF, then T is inconsistent.

(Here, we tacitly restrict ourselves to first-order sentences whose only non-logical primitive is ∈, and we write T ⊢ ZF to indicate that T ⊢ φ for all φ ∈ ZF.)

Proof. Fix finite T such that T ⊢ ZF. So, T proves Reflection, i.e. Theorem 65.2. Since T is finite, we can rewrite it as a single conjunction, θ. Reflecting with this formula, T ⊢ ∃β(θ ↔ θ^Vβ). Since trivially T ⊢ θ, we find that T ⊢ ∃β θ^Vβ.

Now, let ψ(X) abbreviate:

θ^X ∧ X is transitive ∧ (∀Y ∈ X)(Y is transitive → ¬θ^Y)

roughly this says: X is a transitive model of θ, and ∈-minimal in this regard.


Now, recalling that T ⊢ ∃β θ^Vβ, by basic facts about ranks within ZF and hence within T, we have:

T ⊢ ∃M ψ(M).    (*)

Using the first conjunct of ψ(X), whenever T ⊢ σ, we have that T ⊢ ∀X(ψ(X) → σ^X). So, by (*):

T ⊢ ∀X(ψ(X) → (∃N ψ(N))^X)

Using this, and (*) again:

T ⊢ ∃M(ψ(M) ∧ (∃N ψ(N))^M)

In particular, then:

T ⊢ ∃M(ψ(M) ∧ (∃N ∈ M)((N is transitive)^M ∧ (θ^N)^M))

So, by elementary reasoning concerning transitivity:

T ⊢ ∃M(ψ(M) ∧ (∃N ∈ M)(N is transitive ∧ θ^N))

But the existence of such an N contradicts the ∈-minimality conjunct of ψ(M). So T is inconsistent.7

Here is a similar result, noted by Potter (2004, 223):

Proposition 65.7. Let T extend Z with finitely many new axioms. If T ⊢ ZF, then T is inconsistent. (Here we use the same tacit restrictions as for Theorem 65.6.)

Proof. Use θ for the conjunction of all of T’s axioms except for the (infinitely
many) instances of Separation. Defining ψ from θ as in Theorem 65.6, we can
show that T ⊢ ∃M ψ(M ).
As in Theorem 65.6, we can establish the schema that, whenever T ⊢ σ,
we have that T ⊢ ∀X(ψ(X) → σ^X). We then finish our proof, exactly as in
Theorem 65.6.
However, establishing the schema involves a little more work than in The-
orem 65.6. After all, the Separation-instances are in T, but they are not
conjuncts of θ. However, we can overcome this obstacle by proving that
T ⊢ ∀X(X is transitive → σ^X), for every Separation-instance σ. We leave
this to the reader.

Problem 65.2. Show that, for every Separation-instance σ, we have: Z ⊢ ∀X(X is transitive → σ^X). (We used this schema in Proposition 65.7.)

Problem 65.3. Show that, for every φ ∈ Z, we have ZF ⊢ φ^Vω+ω.

Problem 65.4. Confirm the remaining schematic results invoked in the proofs
of Theorem 65.6 and Proposition 65.7.
7 This “elementary reasoning” involves proving certain “absoluteness facts” for transitive sets.

As remarked in section 65.5, this shows that Replacement is strictly stronger
than Theorem 63.26. Or, put more precisely: if Z + “every well-ordering
is isomorphic to a unique ordinal” is consistent, then it fails to prove some
Replacement-instance.

Chapter 66

Ordinal Arithmetic


66.1 Introduction
In chapter 63, we developed a theory of ordinal numbers. We saw in chapter 64
that we can think of the ordinals as a spine around which the remainder of the
hierarchy is constructed. But that is not the only role for the ordinals. There
is also the task of performing ordinal arithmetic.
We already gestured at this, back in section 63.2, when we spoke of ω, ω + 1
and ω + ω. At the time, we spoke informally; the time has come to spell it out
properly. However, we should mention that there is not much philosophy in
this chapter; just technical developments, coupled with a (mildly) interesting
observation that we can do the same thing in two different ways.


66.2 Ordinal Addition


Suppose we want to add α and β. We can simply put a copy of β immediately
after a copy of α. (We need to take copies, since we know from Proposi-
tion 63.22 that either α ⊆ β or β ⊆ α.) The intuitive effect of this is to run
through an α-sequence of steps, and then to run through a β-sequence. The
resulting sequence will be well-ordered; so by Theorem 63.26 it is isomorphic
to a (unique) ordinal. That ordinal can be regarded as the sum of α and β.
That is the intuitive idea behind ordinal addition. To define it rigorously,
we start with the idea of taking copies of sets. The idea here is to use arbitrary
tags, 0 and 1, to keep track of which object came from where:


Definition 66.1. The disjoint sum of A and B is A ⊔ B = (A × {0}) ∪ (B × {1}).

We next define an ordering on pairs of ordinals:

Definition 66.2. For any ordinals α1 , α2 , β1 , β2 , say that:

⟨α1, α2⟩ ∢ ⟨β1, β2⟩ iff either α2 ∈ β2, or both α2 = β2 and α1 ∈ β1

This is a reverse lexicographic ordering, since you order by the second ele-
ment, then by the first. Now recall that we wanted to define α + β as the order
type of a copy of α followed by a copy of β. To achieve that, we say:

Definition 66.3. For any ordinals α, β, their sum is α + β = ord(α ⊔ β, ∢).

Note that we slightly abused notation here; strictly we should write “{⟨x, y⟩ ∈
α ⊔ β : x ∢ y}” in place of “∢”. For brevity, though, we will continue to abuse
notation in this way in what follows.
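To illustrate, consider ω ⊔ ω. Under ∢, every pair ⟨n, 0⟩ from the first copy precedes every pair ⟨m, 1⟩ from the second copy, since we compare second coordinates first and 0 ∈ 1. So ∢ arranges ω ⊔ ω as:

⟨0, 0⟩ ∢ ⟨1, 0⟩ ∢ ⟨2, 0⟩ ∢ . . . ∢ ⟨0, 1⟩ ∢ ⟨1, 1⟩ ∢ ⟨2, 1⟩ ∢ . . .

i.e., as a copy of ω followed by another copy of ω. Its order type is what we are about to call ω + ω.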
The following result, together with Theorem 63.26, confirms that our defi-
nition is well-formed:

Lemma 66.4. ⟨α ⊔ β, ∢⟩ is a well-order, for any ordinals α and β.

Proof. Obviously ∢ is connected on α ⊔ β. To show it is well-founded, fix a


non-empty X ⊆ α ⊔ β. Let Y be the subset of X whose second coordinate is
as small as possible, i.e. Y = {⟨γ, i⟩ ∈ X : (∀⟨δ, j⟩ ∈ X)i ≤ j}. Now choose the
element of Y with smallest first coordinate.

So we have a nice, explicit definition of ordinal addition. Here is an unsurprising


fact (recall that 1 = {0}, by Definition 62.7):

Proposition 66.5. α + 1 = α⁺, for any ordinal α.

Proof. Consider the isomorphism f from α⁺ = α ∪ {α} to α ⊔ 1 = (α × {0}) ∪ ({0} × {1}) given by f(γ) = ⟨γ, 0⟩ for γ ∈ α, and f(α) = ⟨0, 1⟩.

Moreover, it is easy to show that addition obeys certain recursive conditions:

Lemma 66.6. For any ordinals α, β, we have:

α + 0 = α
α + (β + 1) = (α + β) + 1
α + β = lsubδ<β (α + δ)    if β is a limit ordinal

840 Release : 6891b66 (2024-12-01)


66.2. ORDINAL ADDITION

Proof. We check case-by-case; first:

α + 0 = ord((α × {0}) ∪ (0 × {1}), ∢)
      = ord(α × {0}, ∢)
      = α

α + (β + 1) = ord((α × {0}) ∪ (β⁺ × {1}), ∢)
            = ord((α × {0}) ∪ (β × {1}), ∢) + 1
            = (α + β) + 1

Now let β ≠ ∅ be a limit. If δ < β then also δ + 1 < β, so α + δ is a proper initial segment of α + β. So α + β is a strict upper bound on X = {α + δ : δ < β}. Moreover, if α ≤ γ < α + β, then clearly γ = α + δ for some δ < β. So α + β = lsubδ<β (α + δ).
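These recursion equations allow us to compute simple sums without building any explicit orderings. For instance:

ω + 2 = (ω + 1) + 1 = (ω⁺)⁺
2 + ω = lsubn<ω (2 + n) = ω

The first computation uses the successor clause twice; the second uses the limit clause, since the ordinals 2 + n are cofinal in ω. The second computation already hints at the failure of commutativity, recorded below as Proposition 66.8.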
But here is a striking fact. To define ordinal addition, we could instead have
simply used the Transfinite Recursion Theorem, and laid down the recursion
equations, exactly as given in Lemma 66.6 (though using “β + ” rather than
“β + 1”).
There are, then, two different ways to define operations on the ordinals. We
can define them synthetically, by explicitly constructing a well-ordered set and
considering its order type. Or we can define them recursively, just by laying
down the recursion equations. Done correctly, though, the outcome is identical.
For Theorem 63.26 guarantees that these recursion equations pin down unique
ordinals.
In many ways, ordinal arithmetic behaves just like addition of the natural
numbers. For example, we can prove the following:
Lemma 66.7. If α, β, γ are ordinals, then:

1. if β < γ, then α + β < α + γ
2. if α + β = α + γ, then β = γ
3. α + (β + γ) = (α + β) + γ, i.e., addition is associative
4. if α ≤ β, then α + γ ≤ β + γ
Proof. We prove (3), leaving the rest as an exercise. The proof is by Simple
Transfinite Induction on γ, using Lemma 66.6. When γ = 0:
(α + β) + 0 = α + β = α + (β + 0)
When γ = δ + 1, suppose for induction that (α + β) + δ = α + (β + δ); now
using Lemma 66.6 three times:
(α + β) + (δ + 1) = ((α + β) + δ) + 1
= (α + (β + δ)) + 1
= α + ((β + δ) + 1)
= α + (β + (δ + 1))


When γ is a limit ordinal, suppose for induction that if δ ∈ γ then (α + β) + δ = α + (β + δ); now:

(α + β) + γ = lsubδ<γ ((α + β) + δ)
            = lsubδ<γ (α + (β + δ))
            = α + lsubδ<γ (β + δ)
            = α + (β + γ)

Problem 66.1. Prove the remainder of Lemma 66.7.

In these ways, ordinal addition should be very familiar. But, there is a cru-
cial way in which ordinal addition is not like addition on the natural numbers.
Proposition 66.8. Ordinal addition is not commutative; 1 + ω = ω < ω + 1.

Proof. Note that 1 + ω = lsubn<ω (1 + n) = ω ∈ ω ∪ {ω} = ω⁺ = ω + 1.

Whilst this may initially come as a surprise, it shouldn’t. On the one hand,
when you consider 1 + ω, you are thinking about the order type you get by
putting an extra element before all the natural numbers. Reasoning as we did
with Hilbert’s Hotel in section 6.1, intuitively, this extra first element shouldn’t
make any difference to the overall order type. On the other hand, when you
consider ω + 1, you are thinking about the order type you get by putting an
extra element after all the natural numbers. And that’s a radically different
beast!


66.3 Using Ordinal Addition


Using addition on the ordinals, we can explicitly calculate the ranks of various sets, in the sense of Definition 64.15:
Lemma 66.9. If rank(A) = α and rank(B) = β, then:

1. rank(℘(A)) = α + 1
2. rank({A, B}) = max(α, β) + 1
3. rank(A ∪ B) = max(α, β)
4. rank(⟨A, B⟩) = max(α, β) + 2
5. rank(A × B) ≤ max(α, β) + 2
6. rank(⋃A) = α when α is empty or a limit; rank(⋃A) = γ when α = γ + 1


Proof. Throughout, we invoke Proposition 64.20 repeatedly.
(1). If x ⊆ A then rank(x) ≤ rank(A). So rank(℘(A)) ≤ α + 1. Since A ∈ ℘(A) in particular, rank(℘(A)) = α + 1.
(2). By Proposition 64.20.
(3). By Proposition 64.20.
(4). By (2), twice.
(5). Note that A × B ⊆ ℘(℘(A ∪ B)), and invoke (4).
(6). If α = γ + 1, there is some c ∈ A with rank(c) = γ, and no element of A has higher rank; so rank(⋃A) = γ. If α is a limit ordinal, then A has elements with rank arbitrarily close to (but strictly less than) α, so that ⋃A also has elements with rank arbitrarily close to (but strictly less than) α, so that rank(⋃A) = α.
We leave it as an exercise to show why (5) involves an inequality.
Problem 66.2. Produce sets A and B such that rank(A × B) = max(rank(A), rank(B)). Produce sets A and B such that rank(A × B) = max(rank(A), rank(B)) + 2. Are any other ranks possible?
We are also now in a position to show that several reasonable notions of what it might mean to describe an ordinal as “finite” or “infinite” coincide:

Lemma 66.10. For any ordinal α, the following are equivalent:

1. α ∉ ω, i.e., α is not a natural number
2. ω ≤ α
3. 1 + α = α
4. α ≈ α + 1, i.e., α and α + 1 are equinumerous
5. α is Dedekind infinite
So we have five provably equivalent ways to understand what it takes for an
ordinal to be (in)finite.
Proof. (1) ⇒ (2). By Trichotomy.
(2) ⇒ (3). Fix α ≥ ω. By Transfinite Induction, there is some least ordinal
γ (possibly 0) such that there is a limit ordinal β with α = β + γ. Now:
1 + α = 1 + (β + γ) = (1 + β) + γ = (lsubδ<β (1 + δ)) + γ = β + γ = α.

(3) ⇒ (4). There is clearly a bijection f : (α ⊔ 1) → (1 ⊔ α). If 1 + α = α, there


is an isomorphism g : (1 ⊔ α) → α. Now consider g ◦ f .
(4) ⇒ (5). If α ≈ α + 1, there is a bijection f : (α ⊔ 1) → α. Define
g(γ) = f (γ, 0) for each γ < α; this injection witnesses that α is Dedekind
infinite, since f (0, 1) ∈ α \ ran(g).
(5) ⇒ (1). This is Proposition 62.8.


66.4 Ordinal Multiplication


We now turn to ordinal multiplication, and we approach this much like ordinal addition. So, suppose we want to multiply α by β. To do this, you
might imagine a rectangular grid, with width α and height β; the product of
α and β is now the result of moving along each row, then moving through the
next row. . . until you have moved through the entire grid. Otherwise put, the
product of α and β arises by replacing each element in β with a copy of α.
To make this formal, we simply use the reverse lexicographic ordering on
the Cartesian product of α and β:

Definition 66.11. For any ordinals α, β, their product α · β = ord(α × β, ∢).

We must again confirm that this is a well-formed definition:

Lemma 66.12. ⟨α × β, ∢⟩ is a well-order, for any ordinals α and β.

Proof. Exactly as for Lemma 66.4.

And it is not hard to prove that multiplication behaves thus:

Lemma 66.13. For any ordinals α, β:

α · 0 = 0
α · (β + 1) = (α · β) + α
α · β = lsubδ<β (α · δ)    when β is a limit ordinal.

Proof. Left as an exercise.

Indeed, just as in the case of addition, we could have defined ordinal multi-
plication via these recursion equations, rather than offering a direct definition.
Equally, as with addition, certain behaviour is familiar:

Lemma 66.14. If α, β, γ are ordinals, then:

1. if α ≠ 0 and β < γ, then α · β < α · γ;
2. if α ≠ 0 and α · β = α · γ, then β = γ;
3. α · (β · γ) = (α · β) · γ;
4. if α ≤ β, then α · γ ≤ β · γ;
5. α · (β + γ) = (α · β) + (α · γ).

Proof. Left as an exercise.

You can prove (or look up) other results, to your heart’s content. But,
given Proposition 66.8, the following should not come as a surprise:


Proposition 66.15. Ordinal multiplication is not commutative: 2 · ω = ω < ω · 2.

Proof. 2 · ω = lsubn<ω (2 · n) = ω ∈ lsubn<ω (ω + n) = ω + ω = ω · 2.

Again, the intuitive rationale is quite straightforward. To compute 2 · ω, you


replace each natural number with two entities. You would get the same order
type if you simply inserted all the “half” numbers into the natural numbers,
i.e., you considered the natural ordering on {n/2 : n ∈ ω}. And, put like that,
the order type is plainly the same as that of ω itself. But, to compute ω · 2,
you place down two copies of ω, one after the other.

Problem 66.3. Prove Lemma 66.12, Lemma 66.13, and Lemma 66.14.


66.5 Ordinal Exponentiation


We now move to ordinal exponentiation. Sadly, there is no nice synthetic definition for ordinal exponentiation.
Sure, there are explicit synthetic definitions. Here is one. Let finfun(α, β) be the set of all functions f : β → α such that {γ ∈ β : f(γ) ≠ 0} is equinumerous with some natural number. Define a well-ordering on finfun(α, β) by f ⊏ g iff f ≠ g and f(γ0) < g(γ0), where γ0 = max{γ ∈ β : f(γ) ≠ g(γ)}. Then we can define α^(β) as ord(finfun(α, β), ⊏). Potter employs this explicit definition, and then immediately explains:

The choice of this ordering is determined purely by our desire to


obtain a definition of ordinal exponentiation which obeys the ap-
propriate recursive condition. . . , and it is much harder to picture
than either the ordered sum or the ordered product. (Potter, 2004,
p. 199)

Quite. We explained addition as “a copy of α followed by a copy of β”, and


multiplication as “a β-sequence of copies of α”. But we have nothing pithy to
say about finfun(α, β). So instead, we’ll offer the definition of ordinal exponen-
tiation just by transfinite recursion, i.e.:

Definition 66.16.

α^(0) = 1
α^(β+1) = α^(β) · α
α^(β) = ⋃δ<β α^(δ)    when β is a limit ordinal



If we were working as set theorists, we might want to explore some of the properties of ordinal exponentiation. But we have nothing much more to add, except to note the unsurprising fact that ordinal exponentiation does not commute. Thus 2^(ω) = ⋃δ<ω 2^(δ) = ω, whereas ω^(2) = ω · ω. But then, we should not expect exponentiation to commute, since it does not commute with natural numbers: 2^(3) = 8 < 9 = 3^(2).

Problem 66.4. Using Transfinite Induction, prove that, if we define α^(β) = ord(finfun(α, β), ⊏), we obtain the recursion equations of Definition 66.16.

Chapter 67

Cardinals


67.1 Cantor’s Principle


Cast your mind back to section 63.5. We were discussing well-ordered sets, and
suggested that it would be nice to have objects which go proxy for well-orders.
With this is mind, we introduced ordinals, and then showed in Corollary 63.28
that these behave as we would want them to, i.e.:

ord(A, <) = ord(B, ⋖) iff ⟨A, <⟩ ≅ ⟨B, ⋖⟩.

Cast your mind back even further, to section 4.8. There, working naı̈vely, we
introduced the notion of the “size” of a set. Specifically, we said that two
sets are equinumerous, A ≈ B, just in case there is a bijection f : A → B.
This is an intrinsically simpler notion than that of a well-ordering: we are
only interested in bijections, and not (as with order-isomorphisms) whether
the bijections “preserve any structure”.
This all gives rise to an obvious thought. Just as we introduced certain
objects, ordinals, to calibrate well-orders, we can introduce certain objects,
cardinals, to calibrate size. That is the aim of this chapter.
Before we say what these cardinals will be, we should lay down a principle
which they ought to satisfy. Writing |X| for the cardinality of the set X, we


would want them to obey:

|A| = |B| iff A ≈ B.

We’ll call this Cantor’s Principle, since Cantor was probably the first to have it
very clearly in mind. (We’ll say more about its relationship to Hume’s Principle
in section 67.5.) So our aim is to define |X|, for each X, in such a way that it
delivers Cantor’s Principle.


67.2 Cardinals as Ordinals


In fact, our theory of cardinals will just make (shameless) use of our theory of
ordinals. That is: we will just define cardinals as certain specific ordinals. In
particular, we will offer the following:

Definition 67.1. If A can be well-ordered, then |A| is the least ordinal γ such
that A ≈ γ. For any ordinal γ, we say that γ is a cardinal iff γ = |γ|.

We just used the phrase “A can be well-ordered”. As is almost always the


case in mathematics, the modal locution here is just a hand-waving gloss on
an existential claim: to say “A can be well-ordered” is just to say “there is a
relation which well-orders A”.
But there is a snag with Definition 67.1. We would like it to be the case
that every set has a size, i.e., that |A| exists for every A. The definition we
just gave, though, begins with a conditional: “If A can be well-ordered. . . ”.
If there is some set A which cannot be well-ordered, then our definition will
simply fail to define an object |A|.
So, to use Definition 67.1, we need a guarantee that every set can be well-
ordered. Sadly, though, this guarantee is unavailable in ZF. So, if we want to
use Definition 67.1, there is no alternative but to add a new axiom, such as:

Axiom (Well-Ordering). Every set can be well-ordered.

We will discuss whether the Well-Ordering Axiom is acceptable in chapter 69.


From now on, though, we will simply help ourselves to it. And, using it, it
is quite straightforward to prove that cardinals (as defined in Definition 67.1)
exist and behave nicely:

Lemma 67.2. For every set A:

1. |A| exists and is unique;
2. |A| ≈ A;
3. |A| is a cardinal, i.e., |A| = ||A||.


Proof. Fix A. By Well-Ordering, there is a well-ordering ⟨A, R⟩. By Theo-


rem 63.26, ⟨A, R⟩ is isomorphic to a unique ordinal, β. So A ≈ β. By Transfi-
nite Induction, there is a uniquely least ordinal, γ, such that A ≈ γ. So |A| = γ,
establishing (1) and (2). To establish (3), note that if δ ∈ γ then δ ≺ A, by our
choice of γ, so that also δ ≺ γ since equinumerosity is an equivalence relation
(Proposition 4.20). So γ = |γ|.

The next result guarantees Cantor’s Principle, and more besides. (Note
that cardinals inherit their ordering from the ordinals, i.e., a < b iff a ∈ b. In
formulating this, we will use Fraktur letters for objects we know to be cardinals.
This is fairly standard. A common alternative is to use Greek letters, since
cardinals are ordinals, but to choose them from the middle of the alphabet,
e.g.: κ, λ.):

Lemma 67.3. For any sets A and B:

A ≈ B iff |A| = |B|
A ⪯ B iff |A| ≤ |B|
A ≺ B iff |A| < |B|

Proof. We will prove the left-to-right direction of the second claim (the other cases are similar, and left as an exercise). So, consider the following diagram:

 A  ————→  B
 ↕          ↕
|A| - - → |B|

The double-headed arrows indicate bijections, whose existence is guaranteed by Lemma 67.2. In assuming that A ⪯ B, there is an injection A → B. Now, chasing the arrows around from |A| to A to B to |B|, we obtain an injection |A| → |B| (the dashed arrow).

We can also use Lemma 67.3 to re-prove Schröder–Bernstein. This is the claim
that if A ⪯ B and B ⪯ A then A ≈ B. We stated this as Theorem 4.25, but
first proved it—with some effort—in section 6.5. Now consider:

Re-proof of Schröder-Bernstein. If A ⪯ B and B ⪯ A, then |A| ≤ |B| and


|B| ≤ |A| by Lemma 67.3. So |A| = |B| and A ≈ B by Trichotomy and
Lemma 67.3.

Whilst this is a very simple proof, it implicitly relies on both Replacement (to
secure Theorem 63.26) and on Well-Ordering (to guarantee Lemma 67.3). By
contrast, the proof of section 6.5 was much more self-standing (indeed, it can
be carried out in Z− ).


67.3 ZFC: A Milestone


With the addition of Well-Ordering, we have reached the final theoretical mile-
stone. We now have all the axioms required for ZFC. In detail:

Definition 67.4. The theory ZFC has these axioms: Extensionality, Union,
Pairs, Powersets, Infinity, Foundation, Well-Ordering and all instances of the
Separation and Replacement schemes. Otherwise put, ZFC adds Well-Ordering
to ZF.

ZFC stands for Zermelo–Fraenkel set theory with Choice. Now this might
seem slightly odd, since the axiom we added was called “Well-Ordering”, not
“Choice”. But, when we later formulate Choice, it will turn out that Well-
Ordering is equivalent (modulo ZF) to Choice (see Theorem 69.6). So which
to take as our “basic” axiom is a matter of indifference. And the name “ZFC”
is entirely standard in the literature.


67.4 Finite, Enumerable, Non-enumerable


Now that we have been introduced to cardinals, it is worth spending a little time
talking about different varieties of cardinals; specifically, finite, enumerable,
and non-enumerable cardinals.
Our first two results entail that the finite cardinals will be exactly the finite
ordinals, which we defined as our natural numbers back in Definition 62.7:

Proposition 67.5. Let n, m ∈ ω. Then n = m iff n ≈ m.

Proof. Left-to-right is trivial. To prove right-to-left, suppose n ≈ m although n ≠ m. By Trichotomy, either n ∈ m or m ∈ n; suppose n ∈ m without loss of generality. Then n ⊊ m and there is a bijection f : m → n, so that m is Dedekind infinite, contradicting Proposition 62.8.

Corollary 67.6. If n ∈ ω, then n is a cardinal.

Proof. Immediate.

It also follows that several reasonable notions of what it might mean to describe
a cardinal as “finite” or “infinite” coincide:

Theorem 67.7. For any set A, the following are equivalent:

1. |A| ∉ ω, i.e., |A| is not a natural number;
2. ω ≤ |A|;
3. A is Dedekind infinite.


Proof. From Lemma 66.10, Lemma 67.3, and Corollary 67.6.

This licenses the following definition of some notions which we used rather
informally in part I:

Definition 67.8. We say that A is finite iff |A| is a natural number, i.e., |A| ∈ ω. Otherwise, we say that A is infinite.
|A| ∈ ω. Otherwise, we say that A is infinite. defnfinite

But note that this definition is presented against the background of ZFC. After
all, we needed Well-Ordering to guarantee that every set has a cardinality. And
indeed, without Well-Ordering, there can be a set which is neither finite nor
Dedekind infinite. We will return to this sort of issue in chapter 69. For now,
we continue to rely upon Well-Ordering.
Let us now turn from the finite cardinals to the infinite cardinals. Here are
two elementary points:

Corollary 67.9. ω is the least infinite cardinal.

Proof. ω is a cardinal, since ω is Dedekind infinite and if ω ≈ n for any n ∈ ω


then n would be Dedekind infinite, contradicting Proposition 62.8. Now ω is
the least infinite cardinal by definition.

Corollary 67.10. Every infinite cardinal is a limit ordinal.

Proof. Let α be an infinite successor ordinal, so α = β + 1 for some β. By Proposition 67.5, β is also infinite, so β ≈ β + 1 by Lemma 66.10. Now |β| = |β + 1| = |α| by Lemma 67.3, so that α ≠ |α|.

Now, as early as Definition 4.27, we flagged that we can distinguish between


enumerable and non-enumerable infinite sets. That definition naturally leads
to the following:

Proposition 67.11. A is enumerable iff |A| ≤ ω, and A is non-enumerable


iff ω < |A|.

Proof. By Trichotomy, the two claims are equivalent, so it suffices to prove


that A is enumerable iff |A| ≤ ω. For right-to-left: if |A| ≤ ω, then A ⪯ ω by
Lemma 67.3 and Corollary 67.9. For left-to-right: suppose A is enumerable;
then by Definition 4.27 there are three possible cases:

1. if A = ∅, then |A| = 0 ∈ ω, by Corollary 67.6 and Lemma 67.3.

2. if n ≈ A, then |A| = n ∈ ω, by Corollary 67.6 and Lemma 67.3.

3. if ω ≈ A, then |A| = ω, by Corollary 67.9.

So in all cases, |A| ≤ ω.

Indeed, ω has a special place. Whilst there are many countable ordinals:


Corollary 67.12. ω is the only enumerable infinite cardinal.

Proof. Let a be an enumerable infinite cardinal. Since a is infinite, ω ≤ a.


Since a is an enumerable cardinal, a = |a| ≤ ω. So a = ω by Trichotomy.

Of course, there are infinitely many cardinals. So we might ask: How


many cardinals are there? The following results show that we might want to
reconsider that question.
Proposition 67.13. If every member of X is a cardinal, then ⋃X is a cardinal.
Proof. It is easy to check that ⋃X is an ordinal. Let α ∈ ⋃X be an ordinal; then α ∈ b ∈ X for some cardinal b. Since b is a cardinal, α ≺ b. Since b ⊆ ⋃X, we have b ⪯ ⋃X, and so α ≉ ⋃X. Generalising, ⋃X is a cardinal.

Theorem 67.14. There is no largest cardinal.

Proof. For any cardinal a, Cantor’s Theorem (Theorem 4.24) and Lemma 67.2
entail that a < |℘(a)|.

Theorem 67.15. The set of all cardinals does not exist.

Proof. For reductio, suppose C = {a : a is a cardinal}. Now ⋃C is a cardinal by Proposition 67.13, so by Theorem 67.14 there is a cardinal b > ⋃C. By definition b ∈ C, so b ⊆ ⋃C, so that b ≤ ⋃C, a contradiction.

You should compare this with both Russell’s Paradox and Burali-Forti.


67.5 Appendix: Hume’s Principle


In section 67.1, we described Cantor’s Principle. This was:

|A| = |B| iff A ≈ B.

This is very similar to what is now called Hume’s Principle, which says:

#x F (x) = #x G(x) iff F ∼ G

where ‘F ∼ G’ abbreviates that there are exactly as many F s as Gs, i.e., the
F s can be put into a bijection with the Gs, i.e.:

∃R(∀v∀y(Rvy → (F v ∧ Gy)) ∧
∀v(F v → ∃!y Rvy) ∧
∀y(Gy → ∃!v Rvy))


But there is a type-difference between Hume’s Principle and Cantor’s Principle.


In the statement of Cantor’s Principle, the variables “A” and “B” are first-
order terms which stand for sets. In the statement of Hume’s Principle, “F ”,
“G” and “R” are not first-order terms; rather, they are in predicate position.
(Maybe they stand for properties.) So we might gloss Hume’s Principle in
English as: the number of F s is the number of Gs iff the F s are bijective with
the Gs. This is called Hume’s Principle, because Hume once wrote this:

When two numbers are so combined as that the one has always an
unit answering to every unit of the other, we pronounce them equal.
(Hume, 1740, Pt.III Bk.1 §1)

And Hume’s Principle was brought to contemporary mathematico-logical promi-


nence by Frege (1884, §63), who quoted this passage from Hume, before (in
effect) sketching (what we have called) Hume’s Principle.
You should note the structural similarity between Hume’s Principle and
Basic Law V. We formulated this in section 61.6 as follows:

ϵx F (x) = ϵx G(x) iff ∀x (F (x) ↔ G(x)).

And, at this point, some commentary and comparison might help.


There are two ways to take a principle like Hume’s Principle or Basic Law V:
predicatively or impredicatively (recall section 61.3). On the impredicative read-
ing of Basic Law V, for each F , the object ϵx F (x) falls within the domain of
quantification that we used in formulating Basic Law V itself. Similarly, on
the impredicative reading of Hume’s Principle, for each F , the object #x F (x)
falls within the domain of quantification that we used in formulating Hume’s
Principle. By contrast, on the predicative understanding, the objects ϵx F (x)
and #x F (x) would be entities from some different domain.
Now, if we read Basic Law V impredicatively, it leads to inconsistency,
via Naı̈ve Comprehension (for the details, see section 61.6). Much like Naı̈ve
Comprehension, it can be rendered consistent by reading it predicatively. But
it probably will not do everything that we wanted it to.
Hume’s Principle, however, can consistently be read impredicatively. And,
read thus, it is quite powerful.
To illustrate: consider the predicate “x ≠ x”, which obviously nothing
satisfies. Hume’s Principle now yields an object #x(x ̸= x). We might treat
this as the number 0. Now, on the impredicative understanding—but only on
the impredicative understanding—this entity 0 falls within our original domain
of quantification. So we can sensibly apply Hume’s Principle with the predicate
“x = 0” to obtain an object #x(x = 0). We might treat this as the number 1.
Moreover, Hume’s Principle entails that 0 ≠ 1, since there cannot be a bijection
from the non-self-identical objects to the objects identical with 0 (there are
none of the former, but one of the latter). Now, working impredicatively again,
1 falls within our original domain of quantification. So we can sensibly apply
Hume’s Principle with the predicate “(x = 0 ∨ x = 1)” to obtain an object

#x(x = 0 ∨ x = 1). We might treat this as the number 2, and we can show
that 0 ≠ 2 and 1 ≠ 2 and so on.
In short, taken impredicatively, Hume’s Principle entails that there are
infinitely many objects. And this has encouraged neo-Fregean logicists to take
Hume’s Principle as the foundation for arithmetic.
Frege himself, though, did not take Hume’s Principle as his foundation for
arithmetic. Instead, Frege proved Hume’s Principle from an explicit definition:
#x F (x) is defined as the extension of the concept F ∼ Φ. In modern terms,
we might attempt to render this as #x F (x) = {G : F ∼ G}; but this will pull
us back into the problems of Naı̈ve Comprehension.

Chapter 68

Cardinal Arithmetic


68.1 Defining the Basic Operations


Since we do not need to keep track of order, cardinal arithmetic is rather easier
to define than ordinal arithmetic. We will define addition, multiplication, and
exponentiation simultaneously.

Definition 68.1. When a and b are cardinals:

a ⊕ b = |a ⊔ b|
a ⊗ b = |a × b|
a^b = |^b a|

where ^X Y = {f : f is a function X → Y}. (It is easy to show that ^X Y exists for any sets X and Y; we leave this as an exercise.)
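Computed on finite cardinals, these operations agree with ordinary arithmetic. For instance:

2 ⊕ 3 = |2 ⊔ 3| = 5, since 2 ⊔ 3 has exactly 2 + 3 = 5 elements;
2 ⊗ 3 = |2 × 3| = 6, since 2 × 3 has exactly 2 · 3 = 6 ordered pairs;
2^3 = |^3 2| = 8, since there are 2 · 2 · 2 functions from 3 = {0, 1, 2} to 2 = {0, 1}.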

Problem 68.1. Prove in Z− that ^X Y exists for any sets X and Y. Working in ZF, compute rank(^X Y) from rank(X) and rank(Y), in the manner of Lemma 66.9.


It might help to explain this definition. Concerning addition: this uses the
notion of disjoint sum, ⊔, as defined in Definition 66.1; and it is easy to see that
this definition gives the right verdict for finite cases. Concerning multiplica-
tion: Proposition 1.27 tells us that if A has n members and B has m members
then A × B has n · m members, so our definition simply generalises the idea
to transfinite multiplication. Exponentiation is similar: we are simply gener-
alising the thought from the finite to the transfinite. Indeed, in certain ways,
transfinite cardinal arithmetic looks much more like “ordinary” arithmetic than
does transfinite ordinal arithmetic:

Proposition 68.2. ⊕ and ⊗ are commutative and associative.

Proof. For commutativity, by Lemma 67.3 it suffices to observe that (a ⊔ b) ≈


(b ⊔ a) and (a × b) ≈ (b × a). We leave associativity as an exercise.

Problem 68.2. Prove that ⊕ and ⊗ are associative.

Proposition 68.3. A is infinite iff |A| ⊕ 1 = 1 ⊕ |A| = |A|.

Proof. As in Theorem 67.7, from Lemma 66.10 and Lemma 67.3.

This explains why we need to use different symbols for ordinal versus car-
dinal addition/multiplication: these are genuinely different operations. This
next pair of results shows that ordinal versus cardinal exponentiation are also
different operations. (Recall that Definition 62.7 entails that 2 = {0, 1}):

Lemma 68.4. |℘(A)| = 2^|A|, for any A.

Proof. For each subset B ⊆ A, let χB ∈ ^A 2 be given by:

χB(x) = 1 if x ∈ B, and χB(x) = 0 otherwise.

Now let f(B) = χB; this defines a bijection f : ℘(A) → ^A 2. So ℘(A) ≈ ^A 2. Hence ℘(A) ≈ ^|A| 2, so that |℘(A)| = |^|A| 2| = 2^|A|.

This snappy proof essentially subsumes the discussion of section 4.13. There, we showed how to “reduce” the uncountability of ℘(ω) to the uncountability of the set of infinite binary strings, Bω. In effect, Bω is just ^ω 2; and the preceding proof showed that the reasoning we went through in section 4.13 will go through using any set A in place of ω. The result also yields a quick fact about cardinal exponentiation:

Corollary 68.5. a < 2^a for any cardinal a.

Proof. From Cantor’s Theorem (Theorem 4.24) and Lemma 68.4.


So ω < 2^ω. But note: this is a result about cardinal exponentiation. It should be contrasted with ordinal exponentiation, since in the latter case ω = 2^(ω) (see section 66.5).
Whilst we are on the topic of cardinal exponentiation, we can also be a bit
more precise about the “way” in which R is non-enumerable.
Theorem 68.6. |R| = 2^ω.

Proof skeleton. There are plenty of ways to prove this. The most straightforward is to argue that ℘(ω) ⪯ R and R ⪯ ℘(ω), and then use Schröder-Bernstein to infer that R ≈ ℘(ω), and Lemma 68.4 to infer that |R| = 2^ω. We leave it as an (illuminating) exercise to define injections f : ℘(ω) → R and g : R → ℘(ω).

Problem 68.3. Complete the proof of Theorem 68.6, by showing that ℘(ω) ⪯
R and R ⪯ ℘(ω).


68.2 Simplifying Addition and Multiplication


It turns out that transfinite cardinal addition and multiplication are extremely
easy. This follows from the fact that cardinals are (certain) ordinals, and so
well-ordered, and so can be manipulated in a certain way. Showing this, though,
is not so easy. To start, we need a tricksy definition:
Definition 68.7. We define a canonical ordering, ◁, on pairs of ordinals, by
stipulating that ⟨α1 , α2 ⟩ ◁ ⟨β1 , β2 ⟩ iff either:
1. max(α1 , α2 ) < max(β1 , β2 ); or
2. max(α1 , α2 ) = max(β1 , β2 ) and α1 < β1 ; or
3. max(α1 , α2 ) = max(β1 , β2 ) and α1 = β1 and α2 < β2
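To get a feel for ◁, here are the first few elements of ω × ω in ◁-increasing order. We first exhaust the pairs whose maximum is 0, then those whose maximum is 1, then those whose maximum is 2, and so on:

⟨0, 0⟩ ◁ ⟨0, 1⟩ ◁ ⟨1, 0⟩ ◁ ⟨1, 1⟩ ◁ ⟨0, 2⟩ ◁ ⟨1, 2⟩ ◁ ⟨2, 0⟩ ◁ ⟨2, 1⟩ ◁ ⟨2, 2⟩ ◁ ⟨0, 3⟩ ◁ . . .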

Lemma 68.8. ⟨α × α, ◁⟩ is a well-order, for any ordinal α.

Proof. Evidently ◁ is connected on α × α. For suppose that neither ⟨α1 , α2 ⟩


nor ⟨β1 , β2 ⟩ is ◁-less than the other. Then max(α1 , α2 ) = max(β1 , β2 ) and
α1 = β1 and α2 = β2 , so that ⟨α1 , α2 ⟩ = ⟨β1 , β2 ⟩.
To show well-ordering, let X ⊆ α × α be non-empty. Since α is an ordinal,
some δ is the least member of {max(γ1 , γ2 ) : ⟨γ1 , γ2 ⟩ ∈ X}. Now discard
all pairs from {⟨γ1 , γ2 ⟩ ∈ X : max(γ1 , γ2 ) = δ} except those with least first
coordinate; from among these, the pair with least second coordinate is the
◁-least element of X.

Now for a teensy, simple observation:


Proposition 68.9. If α ≈ β, then α × α ≈ β × β.


Proof. Just let f : α → β induce ⟨γ1, γ2⟩ ↦ ⟨f(γ1), f(γ2)⟩.

And now we will put all this to work, in proving a crucial lemma:

Lemma 68.10. α ≈ α × α, for any infinite ordinal α.

Proof. For reductio, let α be the least infinite ordinal for which this is false.
Proposition 4.12 shows that ω ≈ ω × ω, so ω ∈ α. Moreover, α is a cardinal:
suppose otherwise, for reductio; then |α| ∈ α, so that |α| ≈ |α| × |α|, by
hypothesis; and |α| ≈ α by definition; so that α ≈ α × α by Proposition 68.9, contradicting our original assumption about α.
Now, for each ⟨γ1 , γ2 ⟩ ∈ α × α, consider the segment:

Seg(γ1 , γ2 ) = {⟨δ1 , δ2 ⟩ ∈ α × α : ⟨δ1 , δ2 ⟩ ◁ ⟨γ1 , γ2 ⟩}

Letting γ = max(γ1, γ2), note that ⟨γ1, γ2⟩ ◁ ⟨γ + 1, γ + 1⟩. So, when γ is infinite, observe:

Seg(γ1, γ2) ⪯ ((γ + 1) × (γ + 1))
           ≈ (γ × γ), by Lemma 66.10 and Proposition 68.9
           ≈ γ, by the induction hypothesis
           ≺ α, since α is a cardinal

So ord(α × α, ◁) ≤ α, and hence α × α ⪯ α. Since of course α ⪯ α × α, the


result follows by Schröder-Bernstein.

Finally, we get to our simplifying result:

Theorem 68.11. If a, b are infinite cardinals, then:

a ⊗ b = a ⊕ b = max(a, b).

Proof. Without loss of generality, suppose a = max(a, b). Then invoking


Lemma 68.10, a ⊗ a = a ≤ a ⊕ b ≤ a ⊕ a ≤ a ⊗ a.
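For instance, ω ⊕ ω = ω ⊗ ω = ω. Contrast this with the ordinal operations of chapter 66, where ω < ω + ω < ω · ω: cardinal addition and multiplication ignore exactly the order-theoretic structure that the ordinal operations were designed to track.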

Similarly, if a is infinite, an a-sized union of ≤ a-sized sets has size ≤ a:

Proposition 68.12. Let a be an infinite cardinal. For each ordinal β ∈ a, let Xβ be a set with |Xβ| ≤ a. Then |⋃β∈a Xβ| ≤ a.

Proof. For each β ∈ a, fix an injection fβ : Xβ → a.1 Define an injection g : ⋃β∈a Xβ → a × a by g(v) = ⟨β, fβ(v)⟩, where v ∈ Xβ and v ∉ Xγ for any γ ∈ β. Now ⋃β∈a Xβ ⪯ a × a ≈ a by Theorem 68.11.

1 How are these “fixed”? See section 69.5.
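A noteworthy special case of Proposition 68.12 is a = ω: an enumerable union of enumerable sets is enumerable. Note that even this special case relies upon “fixing” the injections fβ, and so upon the choice principles discussed in section 69.5.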


68.3 Some Simplification with Cardinal Exponentiation


Whilst defining ◁ was a little involved, the upshot is a useful result concerning
cardinal addition and multiplication, Theorem 68.11. Transfinite exponentia-
tion, however, cannot be simplified so straightforwardly. To explain why, we
start with a result which extends a familiar pattern from the finitary case
(though its proof is at a high level of abstraction):

Proposition 68.13. a^(b⊕c) = a^b ⊗ a^c and (a^b)^c = a^(b⊗c), for any cardinals a, b, c.

Proof. For the first claim, consider a function f : (b ⊔ c) → a. Now “split this”, by defining fb(β) = f(β, 0) for each β ∈ b, and fc(γ) = f(γ, 1) for each γ ∈ c. The map f ↦ ⟨fb, fc⟩ is a bijection ^(b⊔c) a → (^b a × ^c a).
For the second claim, consider a function f : c → (^b a); so for each γ ∈ c we have some function f(γ) : b → a. Now define f*(β, γ) = (f(γ))(β) for each ⟨β, γ⟩ ∈ b × c. The map f ↦ f* is a bijection ^c(^b a) → ^(b⊗c) a.

Now, what we would like is an easy way to compute a^b when we are dealing
with infinite cardinals. Here is a nice step in this direction:

Proposition 68.14. If 2 ≤ a ≤ b and b is infinite, then a^b = 2^b.

Proof.

2^b ≤ a^b, as 2 ≤ a
    ≤ (2^a)^b, by Lemma 68.4
    = 2^(a⊗b), by Proposition 68.13
    = 2^b, by Theorem 68.11
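For instance, taking a = b = ω: the cardinal power ω^ω equals 2^ω, which by Theorem 68.6 is the cardinality of the continuum. So there are exactly as many functions ω → ω as there are sets of natural numbers.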

We should not really expect to be able to simplify this any further, since b < 2^b by Lemma 68.4. However, this does not tell us what to say about a^b when b < a. Of course, if b is finite, we know what to do.

Proposition 68.15. If a is infinite and n ∈ ω with n > 0, then a^n = a.

Proof. a^n = a ⊗ a ⊗ . . . ⊗ a = a, by Theorem 68.11.

Additionally, in some other cases, we can control the size of a^b:

Proposition 68.16. If 2 ≤ b < a ≤ 2^b and b is infinite, then a^b = 2^b.

Proof. 2^b ≤ a^b ≤ (2^b)^b = 2^(b⊗b) = 2^b, reasoning as in Proposition 68.14.

But, beyond this point, things become rather more subtle.


68.4 The Continuum Hypothesis


The previous result hints (correctly) that cardinal exponentiation would be quite easy, if infinite cardinals are guaranteed to “play straightforwardly” with powers of 2, i.e., (by Lemma 68.4) with taking powersets. But we cannot assume that infinite cardinals do play straightforwardly with powersets.
To start unpacking this, we introduce some nice notation.
Definition 68.17. Where a⊕ is the least cardinal strictly greater than a, we define two infinite sequences:

ℵ0 = ω                ℶ0 = ω
ℵα+1 = (ℵα)⊕          ℶα+1 = 2^ℶα
ℵα = ⋃β<α ℵβ          ℶα = ⋃β<α ℶβ        when α is a limit ordinal.

The definition of a⊕ is in order, since Theorem 67.14 tells us that, for each cardinal a, there is some cardinal greater than a, and Transfinite Induction guarantees that there is a least cardinal greater than a. The rest of the definition is provided by transfinite recursion.
Cantor introduced this “ℵ” notation; this is aleph, the first letter in the
Hebrew alphabet and the first letter in the Hebrew word for “infinite”. Peirce
introduced the “ℶ” notation; this is beth, which is the second letter in the
Hebrew alphabet.2 Now, these notations provide us with infinite cardinals.
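To get a concrete feel for the two sequences: ℵ1 is the least cardinal strictly greater than ℵ0, i.e., the least non-enumerable cardinal; whereas ℶ1 = 2^ℶ0 = 2^ω, which by Theorem 68.6 is exactly |R|, and ℶ2 = 2^ℶ1 = |℘(R)| by Lemma 68.4. The ℵ-sequence climbs by taking successor cardinals, the ℶ-sequence by taking powersets; how the two sequences relate is precisely the issue of this section.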
Proposition 68.18. ℵα and ℶα are cardinals, for every ordinal α.

Proof. Both results hold by a simple transfinite induction. ℵ0 = ℶ0 = ω is a


cardinal by Corollary 67.9. Assuming ℵα and ℶα are both cardinals, ℵα+1 and
ℶα+1 are explicitly defined as cardinals. And the union of a set of cardinals is
a cardinal, by Proposition 67.13.

Moreover, every infinite cardinal is an ℵ:


Proposition 68.19. If a is an infinite cardinal, then a = ℵγ for some unique
γ.

Proof. By transfinite induction on cardinals. For induction, suppose that if b < a then b = ℵγb. If a = b⊕ for some b, then a = (ℵγb)⊕ = ℵγb+1. If a is not the successor of any cardinal, then since cardinals are ordinals a = ⋃b<a b = ⋃b<a ℵγb, so a = ℵγ where γ = ⋃b<a γb.

Since every infinite cardinal is an ℵ, this prompts us to ask: is every infinite


cardinal a ℶ? Certainly if that were the case, then the infinite cardinals would
“play straightforwardly” with the operation of taking powersets. Indeed, we
would have the following:
2 Peirce used this notation in a letter to Cantor of December 1900. Unfortunately, Peirce also gave a bad argument there that ℶα does not exist for α ≥ ω.


Generalized Continuum Hypothesis (GCH). ℵα = ℶα , for all α.

Moreover, if GCH held, then we could make some considerable simplifications with cardinal exponentiation. In particular, we could show that when b < a, the value of a^b is trapped by a ≤ a^b ≤ a⊕. We could then go on to give precise conditions which determine which of the two possibilities obtains (i.e., whether a = a^b or a^b = a⊕).3
But GCH is a hypothesis, not a theorem. In fact, Gödel (1938) proved that
if ZFC is consistent, then so is ZFC + GCH. But it later turned out that
we can equally add ¬GCH to ZFC. Indeed, consider the simplest non-trivial
instance of GCH, namely:

Continuum Hypothesis (CH). ℵ1 = ℶ1 .

Cohen (1963) proved that if ZFC is consistent then so is ZFC + ¬CH. So


the Continuum Hypothesis is independent from ZFC.
The Continuum Hypothesis is so-called, since “the continuum” is another
name for the real line, R. Theorem 68.6 tells us that |R| = ℶ1 . So the Contin-
uum Hypothesis states that there is no cardinal between the cardinality of the
natural numbers, ℵ0 = ℶ0 , and the cardinality of the continuum, ℶ1 .
Given the independence of (G)CH from ZFC, what should we say about their truth? Well, there is much to say. Indeed, much fertile recent work in
set theory has been directed at investigating these issues. But two very quick
points are certainly worth emphasising.
First: it does not immediately follow from these formal independence results
that either GCH or CH is indeterminate in truth value. After all, maybe we
just need to add more axioms, which strike us as natural, and which will settle
the question one way or another. Gödel himself suggested that this was the
right response.
Second: the independence of CH from ZFC is certainly striking, but it is
certainly not incredible (in the literal sense). The point is simply that, for
all ZFC tells us, moving from cardinals to their successors may involve a less
blunt tool than simply taking powersets.
With those two observations made, if you want to know more, you will now
have to turn to the various philosophers and mathematicians with horses in
the race.4


3 The condition is dictated by cofinality.


4 Though you might want to start by reading Potter (2004, §15.6).


68.5 ℵ-Fixed Points


In chapter 64, we suggested that Replacement stands in need of justification,
because it forces the hierarchy to be rather tall. Having done some cardinal
arithmetic, we can give a little illustration of the height of the hierarchy.
Evidently 0 < ℵ0 , and 1 < ℵ1 , and 2 < ℵ2 . . . and, indeed, the difference
in size only gets bigger with every step. So it is tempting to conjecture that
κ < ℵκ for every ordinal κ.
But this conjecture is false, given ZFC. In fact, we can prove that there
are ℵ-fixed-points, i.e., cardinals κ such that κ = ℵκ .

Proposition 68.20. There is an ℵ-fixed-point.

Proof. Using recursion, define:

κ0 = 0
κn+1 = ℵκn
κ = ⋃n<ω κn

Now κ is a cardinal by Proposition 67.13. But now:

κ = ⋃n<ω κn+1 = ⋃n<ω ℵκn = ⋃α<κ ℵα = ℵκ

(The third identity holds because each α < κ has α ≤ κn for some n and the ℵ-operation is monotone; the last holds by the limit clause of Definition 68.17, since κ is a limit.)

Boolos once wrote an article about exactly the ℵ-fixed-point we just con-
structed. After noting the existence of κ, at the start of his article, he said:

[κ is] a pretty big number, by the lights of those with no previous


exposure to set theory, so big, it seems to me, that it calls into
question the truth of any theory, one of whose assertions is the
claim that there are at least κ objects. (Boolos, 2000, p. 257)

And he ultimately concluded his paper by asking:

[do] we suspect that, however it may have been at the beginning of


the story, by the time we have come thus far the wheels are spinning
and we are no longer listening to a description of anything that is
the case? (Boolos, 2000, p. 268)

If we have, indeed, outrun “anything that is the case”, then we must point the
finger of blame directly at Replacement. For it is this axiom which allows our
proof to work. In which case, one assumes, Boolos would need to revisit the
claim he made, a few decades earlier, that Replacement has “no undesirable”
consequences (see section 65.3).
But is the existence of κ so bad? It might help, here, to consider Russell’s
Tristram Shandy paradox. Tristram Shandy documents his life in his diary, but
it takes him a year to record a single day. With every passing year, Tristram


falls further and further behind: after one year, he has recorded only one day,
and has lived 364 unrecorded days; after two years, he has only recorded
two days, and has lived 728 unrecorded days; after three years, he has only
recorded three days, and lived 1092 unrecorded days . . . 5 Still, if Tristram is
immortal, Tristram will manage to record every day, for he will record the nth
day on the nth year of his life. And so, “at the end of time”, Tristram will
have a complete diary.
Now: why is this so different from the thought that α is smaller than ℵα —
and indeed, increasingly, desperately smaller—up until κ, at which point, we
catch up, and κ = ℵκ ?
Setting that aside, and assuming we accept ZFC, let’s close with a little
more fun concerning fixed-point constructions. The next three results establish,
intuitively, that there is a (non-trivial) point at which the hierarchy is as wide
as it is tall:
Proposition 68.21. There is a ℶ-fixed-point, i.e., a κ such that κ = ℶκ.

Proof. As in Proposition 68.20, using “ℶ” in place of “ℵ”.

Proposition 68.22. |Vω+α| = ℶα. If ω · ω ≤ α, then |Vα| = ℶα.

Proof. The first claim holds by a simple transfinite induction. The second
claim follows, since if ω · ω ≤ α then ω + α = α. To establish this, we use facts
about ordinal arithmetic from chapter 66. First note that ω · ω = ω · (1 + ω) =
(ω · 1) + (ω · ω) = ω + (ω · ω). Now if ω · ω ≤ α, i.e., α = (ω · ω) + β for some
β, then ω + α = ω + ((ω · ω) + β) = (ω + (ω · ω)) + β = (ω · ω) + β = α.
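For instance, |Vω| = ℶ0 = ω and |Vω+1| = ℶ1 = 2^ω: the first infinite stage is enumerable, but the very next stage already has the cardinality of the continuum. And once we reach Vω·ω, the subscripts themselves line up: |Vω·ω| = ℶω·ω.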

Corollary 68.23. There is a κ such that |Vκ | = κ.

Proof. Let κ be a ℶ-fixed point, as given by Proposition 68.21. Clearly ω·ω < κ.
So |Vκ | = ℶκ = κ by Proposition 68.22.

There are as many stages beneath Vκ as there are elements of Vκ . Intuitively,


then, Vκ is as wide as it is tall. This is very Tristram-Shandy-esque: we move
from one stage to the next by taking powersets, thereby making our hierarchy
much bigger with each step. But, “in the end”, i.e., at stage κ, the hierarchy’s
width catches up with its height.
One might ask: How often does the hierarchy’s width match its height? The
answer is: As often as there are ordinals. But this needs a little explanation.
We define a term τ as follows. For any A, let:

τ0 (A) = |A|
τn+1 (A) = ℶτn (A)
τ (A) = ⋃_{n<ω} τn (A)
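
For instance, starting from A = ∅: τ0 (∅) = 0, τ1 (∅) = ℶ0 = ℵ0 , τ2 (∅) = ℶℵ0 , τ3 (∅) = ℶℶℵ0 , and so on; τ (∅) is the supremum of this sequence, and is in fact the least ℶ-fixed point.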

5 Forgetting about leap years.



As in Proposition 68.21, τ (A) is a ℶ-fixed point for any A, and trivially |A| <
τ (A). So now consider this recursive definition:

W0 = 0
Wα+1 = τ (Wα )
[
Wα = Wβ , when α is a limit
β<α

The construction is defined for all ordinals. Intuitively, then, W is “an injec-
tion” from the ordinals to ℶ-fixed points. And, exactly as before, VWα is as
wide as it is tall, for any α.

Chapter 69

Choice


69.1 Introduction
In chapters 67 to 68, we developed a theory of cardinals by treating cardinals
as ordinals. That approach depends upon the Axiom of Well-Ordering. It
turns out that Well-Ordering is equivalent to another principle—the Axiom of
Choice—and there has been serious philosophical discussion of its acceptability.
Our questions for this chapter are: How is the Axiom used, and can it be
justified?


69.2 The Tarski–Scott Trick


In Definition 67.1, we defined cardinals as ordinals. To do this, we assumed
the Axiom of Well-Ordering. We did this for no other reason than that it is
the “industry standard”.
Before we discuss any of the philosophical issues surrounding Well-Ordering,
then, it is important to be clear that we can depart from the industry standard,
and develop a theory of cardinals without assuming Well-Ordering. We can
still employ the definitions of A ≈ B, A ⪯ B and A ≺ B, as they appeared in
chapter 4. We will just need a new notion of cardinal.
A naı̈ve thought would be to attempt to define A’s cardinality thus:

{x : A ≈ x}.

You might want to compare this with Frege’s definition of #xF x, sketched
at the very end of section 67.5. And, for reasons we gestured at there, this
definition fails. Any singleton set is equinumerous with {∅}. But new singleton
sets are formed at every successor stage of the hierarchy (just consider the
singleton of the previous stage). So {x : A ≈ x} does not exist, since it cannot
have a rank.
To get around this problem, we use a trick due to Tarski and Scott:1
Definition 69.1 (Tarski–Scott). For any formula φ(x), let [x : φ(x)] be the
set of all x, of least possible rank, such that φ(x) (or ∅, if there are no φs).

We should check that this definition is legitimate. Working in ZF, Theo-
rem 64.13 guarantees that rank(x) exists for every x. Now, if there are any en-
tities satisfying φ, then we can let α be the least rank such that (∃x ⊆ Vα )φ(x),
i.e., (∀β ∈ α)(∀x ⊆ Vβ )¬φ(x). We can then define [x : φ(x)] by Separation as
{x ∈ Vα+1 : φ(x)}.
Having justified the Tarski–Scott trick, we can now use it to define a notion
of cardinality:
Definition 69.2. The ts-cardinality of A is tsc(A) = [x : A ≈ x].
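
To get a feel for this definition: if A is any singleton, then tsc(A) = {{∅}}. For the only set of rank 0 is ∅, which is not a singleton; so the sets equinumerous with A of least possible rank are those of rank 1; and {∅} is the only set of rank 1.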

The definition of a ts-cardinal does not use Well-Ordering. But, even
without that Axiom, we can show that ts-cardinals behave rather like cardinals
as defined in Definition 67.1. For example, if we restate Lemma 67.3 and
Lemma 68.4 in terms of ts-cardinals, the proofs go through just fine in ZF,
without assuming Well-Ordering.
Whilst we are on the topic, it is worth noting that we can also develop a
theory of ordinals using the Tarski–Scott trick. Where ⟨A, <⟩ is a well-ordering,
let tso(A, <) = [⟨X, R⟩ : ⟨A, <⟩ ∼ = ⟨X, R⟩]. For more on this treatment of
cardinals and ordinals, see Potter (2004, chs. 9–12).

content/set-theory/choice/hartogs.tex

69.3 Comparability and Hartogs’ Lemma


That’s the plus side. Here’s the minus side. Without Choice, things get messy.
To see why, here is a nice result due to Hartogs (1915):

Lemma 69.3 (in ZF). For any set A, there is an ordinal α such that α ⪯̸ A.
1 A reminder: all formulas may have parameters (unless explicitly stated otherwise).


Proof. If B ⊆ A and R ⊆ B², then ⟨B, R⟩ ⊆ Vrank(A)+4 by Lemma 66.9. So,
using Separation, consider:

C = {⟨B, R⟩ ∈ Vrank(A)+5 : B ⊆ A and ⟨B, R⟩ is a well-ordering}

Using Replacement and Theorem 63.26, form the set:

α = {ord(B, R) : ⟨B, R⟩ ∈ C}.

By Corollary 63.19, α is an ordinal, since it is a transitive set of ordinals.
After all, if γ ∈ β ∈ α, then β = ord(B, R) for some ⟨B, R⟩ ∈ C, whereupon
γ = ord(Bb , Rb ) for some b ∈ B by Lemma 63.10, so that γ ∈ α.
For reductio, suppose there is an injection f : α → A. Then, where:

B = ran(f )
R = {⟨f (α), f (β)⟩ ∈ A × A : α ∈ β}.

Clearly α = ord(B, R) and ⟨B, R⟩ ∈ C. So α ∈ α, which is a contradiction.

This entails a deep result:

Theorem 69.4 (in ZF). The following claims are equivalent:

1. The Axiom of Well-Ordering

2. Either A ⪯ B or B ⪯ A, for any sets A and B

Proof. (1) ⇒ (2). Fix A and B. Invoking (1), there are well-orderings ⟨A, R⟩
and ⟨B, S⟩. Invoking Theorem 63.26, let f : α → ⟨A, R⟩ and g : β → ⟨B, S⟩ be
isomorphisms. By Proposition 63.22, either α ⊆ β or β ⊆ α. If α ⊆ β, then
g ◦ f −1 : A → B is an injection, and hence A ⪯ B; similarly, if β ⊆ α then
B ⪯ A.
(2) ⇒ (1). Fix A; by Lemma 69.3 there is some ordinal β such that β ⪯̸ A.
Invoking (2), we have A ⪯ β. So there is some injection f : A → β, and we
can use this injection to well-order the elements of A, by defining an order
{⟨a, b⟩ ∈ A × A : f (a) ∈ f (b)}.

As an immediate consequence: if Well-Ordering fails, then some sets are lit-
erally incomparable with regard to their size. So, if Well-Ordering fails, then
transfinite cardinal arithmetic will be messy. For example, we will have to
abandon the idea that if A and B are infinite then A ⊔ B ≈ A × B ≈ M , where
M is the larger of A and B (see Theorem 68.11). The problem is simple: if
we cannot compare the size of A and B, then it is nonsensical to ask which is
larger.

69.4 The Well-Ordering Problem


Evidently rather a lot hangs on whether we accept Well-Ordering. But the
discussion of this principle has tended to focus on an equivalent principle, the
Axiom of Choice. So we will now turn our attention to that (and prove the
equivalence).
In 1883, Cantor expressed his support for the Axiom of Well-Ordering,
calling it “a law of thought which appears to me to be fundamental, rich in
its consequences, and particularly remarkable for its general validity” (cited
in Potter 2004, p. 243). But Cantor ultimately became convinced that the
“Axiom” was in need of proof. So did the mathematical community.
The problem was “solved” by Zermelo in 1904. To explain his solution, we
need some definitions.

Definition 69.5. A function f is a choice function iff f (x) ∈ x for all x ∈
dom(f ). We say that f is a choice function for A iff f is a choice function with
dom(f ) = A \ {∅}.

Intuitively, for every (non-empty) set x ∈ A, a choice function for A chooses
a particular element, f (x), from x. The Axiom of Choice is then:

Axiom (Choice). Every set has a choice function.
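
For instance, suppose A = {{0, 1}, {7}, ∅}. Then the function f with f ({0, 1}) = 0 and f ({7}) = 7 is a choice function for A: its domain is A \ {∅} = {{0, 1}, {7}}, and it picks out a member of each set in its domain. (So is the function with f ({0, 1}) = 1 instead; choice functions are rarely unique.)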

Zermelo showed that Choice entails Well-Ordering, and vice versa:

Theorem 69.6 (in ZF). Well-Ordering and Choice are equivalent.
Proof. Left-to-right. Let A be a set of sets. Then ⋃A exists by the Axiom of
Union, and so by Well-Ordering there is some < which well-orders ⋃A. Now
let f (x) = the <-least member of x. This is a choice function for A.
Right-to-left. Fix A. By Choice, there is a choice function, f , for ℘(A)\{∅}.
Using Transfinite Recursion, define a function:

g(0) = f (A)
g(α) = stop!                if A = g[α]
g(α) = f (A \ g[α])         otherwise

The indication to “stop!” is just a shorthand for what would otherwise be a
more long-winded definition. That is, when A = g[α] for the first time, let
g(δ) = A for all δ ≥ α. Now, in the first instance, we can only be sure that this
defines a term (see the remarks after Theorem 64.4); but we will show that we
indeed have a function.
Since f is a choice function, for each α (when defined) we have g(α) =
f (A \ g[α]) ∈ A \ g[α]; i.e., g(α) ∉ g[α]. So if g(α) = g(β) then g(β) ∉ g[α], i.e.,
β ∉ α, and similarly α ∉ β. So α = β, by Trichotomy. So g is injective.
Next, observe that we do stop!, i.e. that there is some (least) ordinal α such
that A = g[α]. For suppose otherwise; then as g is injective we would have
α ≺ ℘(A) \ {∅} for every ordinal α, contradicting Lemma 69.3. Hence also
ran(g) = A.
Assembling these facts, g is a bijection from some ordinal to A. Now g can
be used to well-order A.
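(Explicitly: where g : α → A is this bijection, stipulate that a < b iff g−1 (a) ∈ g−1 (b), for a, b ∈ A; this is a well-ordering of A, of order type α.)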

So Well-Ordering and Choice stand or fall together. But the question re-
mains: do they stand or fall?


69.5 Countable Choice


It is easy to prove, without any use of Choice/Well-Ordering, that:
Lemma 69.7 (in Z− ). Every finite set has a choice function.

Proof. Let a = {b1 , . . . , bn }. Suppose for simplicity that each bi ̸= ∅. So there
are objects c1 , . . . , cn such that c1 ∈ b1 , . . . , cn ∈ bn . Now by Proposition 62.5,
the set {⟨b1 , c1 ⟩, . . . , ⟨bn , cn ⟩} exists; and this is a choice function for a.

But matters get murkier as soon as we consider infinite sets. For example,
consider this “minimal” extension to the above:

Countable Choice. Every countable set has a choice function.

This is a special case of Choice. And it transpires that this principle was
invoked fairly frequently, without an obvious awareness of its use. Here are two
nice examples.2
Example 69.8. Here is a natural thought: for any set A, either ω ⪯ A, or
A ≈ n for some n ∈ ω. This is one way to state the intuitive idea, that every set
is either finite or infinite. Cantor, and many other mathematicians, made this
claim without proving it. Cautious as we are, we proved this in Theorem 67.7.
But in that proof we were working in ZFC, since we were assuming that any
set A can be well-ordered, and hence that |A| is guaranteed to exist. That is:
we explicitly assumed Choice.
In fact, Dedekind (1888) offered his own proof of this claim, as follows:
Theorem 69.9 (in Z− + Countable Choice). For any A, either ω ⪯ A or
A ≈ n for some n ∈ ω.

Proof. Suppose A ̸≈ n for all n ∈ ω. Then in particular for each n < ω there is
a subset An ⊆ A with exactly 2ⁿ elements. Using this sequence A0 , A1 , A2 , . . .,
we define for each n:

Bn = An \ ⋃_{i<n} Ai .
2 Due to Potter (2004, §9.4) and Luca Incurvati.

Now note the following:

|⋃_{i<n} Ai | ≤ |A0 | + |A1 | + . . . + |An−1 |
= 1 + 2 + . . . + 2ⁿ⁻¹
= 2ⁿ − 1
< 2ⁿ = |An |

Hence each Bn has at least one member, cn . Moreover, the Bn s are pairwise
disjoint; so if cn = cm then n = m. But every cn ∈ A. So the function
f (n) = cn is an injection ω → A.

Dedekind did not flag that he had used Countable Choice. But, did you spot
its use? Look again. (Really: look again.)
The proof used Countable Choice twice. We used it once, to obtain our
sequence of sets A0 , A1 , A2 , . . . We then used it again to select our elements
cn from each Bn . Moreover, this use of Choice is ineliminable. Cohen (1966,
p. 138) proved that the result fails if we have no version of Choice. That is: it
is consistent with ZF that there are sets which are incomparable with ω.

Example 69.10. In 1878, Cantor stated that a countable union of countable
sets is countable. He did not present a proof, perhaps indicating that he took
the proof to be obvious. Now, cautious as we are, we proved a more general
version of this result in Proposition 68.12. But our proof explicitly assumed
Choice. And even the proof of the less general result requires Countable Choice.
Theorem 69.11 (in Z− + Countable Choice). If An is countable for each
n ∈ ω, then ⋃_{n<ω} An is countable.

Proof. Without loss of generality, suppose that each An ̸= ∅. So for each n ∈ ω
there is a surjection fn : ω → An . Define f : ω × ω → ⋃_{n<ω} An by f (m, n) =
fn (m). The result follows because ω × ω is countable (Proposition 4.12) and f
is a surjection.

Did you spot the use of Countable Choice? It is used to choose our sequence
of functions f0 , f1 , f2 , . . . 3 And again, the result fails in the absence of any
Choice principle. Specifically, Feferman and Levy (1963) proved that it is
consistent with ZF that a countable union of countable sets has cardinality ℶ1 .
But here is a much funnier statement of the point, from Russell:
This is illustrated by the millionaire who bought a pair of socks
whenever he bought a pair of boots, and never at any other time,
and who had such a passion for buying both that at last he had
ℵ0 pairs of boots and ℵ0 pairs of socks. . . Among boots we can
distinguish right and left, and therefore we can make a selection of
one out of each pair, namely, we can choose all the right boots or all
the left boots; but with socks no such principle of selection suggests
itself, and we cannot be sure, unless we assume the multiplicative
axiom [i.e., in effect Choice], that there is any class consisting of
one sock out of each pair. (Russell, 1919, p. 126)

3 A similar use of Choice occurred in Proposition 68.12, when we gave the instruction “For each β ∈ a, fix an injection fβ ”.

In short, some form of Choice is needed to prove the following: If you have
countably many pairs of socks, then you have (only) countably many socks.
And in fact, without Countable Choice (or something equivalent), a countable
union of countable sets can fail to be countable.

The moral is that Countable Choice was used repeatedly, without its users
being much aware of it. The philosophical question is: How could we justify
Countable Choice?
An attempt at an intuitive justification might invoke an appeal to a super-
task. Suppose we make the first choice in 1/2 a minute, our second choice in
1/4 a minute, . . . , our n-th choice in 1/2ⁿ a minute, . . . Then within 1 minute,
we will have made an ω-sequence of choices, and defined a choice function.


But what, really, could such a thought-experiment tell us? For a start, it
relies upon taking this idea of “choosing” rather literally. For another, it seems
to bind up mathematics in metaphysical possibility.
More important: it is not going to give us any justification for Choice tout
court, rather than mere Countable Choice. For if we need every set to have a
choice function, then we’ll need to be able to perform a “supertask of arbitrary
ordinal length.” Bluntly, that idea is laughable.


69.6 Intrinsic Considerations about Choice


The broader question, then, is whether Well-Ordering, or Choice, or indeed the
comparability of all sets as regards their size—it doesn’t matter which—can be
justified.
Here is an attempted intrinsic justification. Back in section 62.1, we intro-
duced several principles about the hierarchy. One of these is worth restating:

Stages-accumulate. For any stage S, and for any sets which were formed
before stage S: a set is formed at stage S whose members are exactly
those sets. Nothing else is formed at stage S.

In fact, many authors have suggested that the Axiom of Choice can be justified
via (something like) this principle. We will briefly provide a gloss on that
approach.
We will start with a simple little result, which offers yet another equivalent
for Choice:


Theorem 69.12 (in ZF). Choice is equivalent to the following principle. If
the elements of A are disjoint and non-empty, then there is some C such that
C ∩ x is a singleton for every x ∈ A. (We call such a C a choice set for A.)
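
For instance, if A = {{0, 1}, {2, 3}}, then C = {0, 2} is a choice set for A, and so is {1, 3}: each meets each element of A in exactly one point.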

The proof of this result is straightforward, and we leave it as an exercise
for the reader.

Problem 69.1. Prove Theorem 69.12. If you struggle, you can find a proof
in (Potter, 2004, pp. 242–3).

The essential point is that a choice set for A is just the range of a choice
function for A. So, to justify Choice, we can simply try to justify its equivalent
formulation, in terms of the existence of choice sets. And we will now try to
do exactly that.
Let A’s elements be disjoint and non-empty. By Stages-are-key (see sec-
tion 62.1), A is formed at some stage S. Note that all the elements of ⋃A
are available before stage S. Now, by Stages-accumulate, for any sets which
were formed before S, a set is formed whose members are exactly those sets.
Otherwise put: every possible collection of earlier-available sets will exist at S.
But it is certainly possible to select objects which could be formed into a choice
set for A; that is just some very specific subset of ⋃A. So: some such choice
set exists, as required.
Well, that’s a very quick attempt to offer a justification of Choice on intrin-
sic grounds. But, to pursue this idea further, you should read Potter’s (2004,
§14.8) neat development of it.


69.7 The Banach–Tarski Paradox


We might also attempt to justify Choice, as Boolos attempted to justify Re-
placement, by appealing to extrinsic considerations (see section 65.3). After
all, adopting Choice has many desirable consequences: the ability to compare
every cardinal; the ability to well-order every set; the ability to treat cardinals
as a particular kind of ordinal; etc.
Sometimes, however, it is claimed that Choice has undesirable consequences.
Mostly, this is due to a result by Banach and Tarski (1924).

Theorem 69.13 (Banach–Tarski Paradox (in ZFC)). Any ball can be de-
composed into finitely many pieces, which can be reassembled (by rotation and
transportation) to form two copies of that ball.

At first glance, this is a bit amazing. Clearly the two balls have twice the
volume of the original ball. But rigid motions—rotation and transportation—
do not change volume. So it looks as if Banach–Tarski allows us to magick new
matter into existence.


It gets worse.4 Similar reasoning shows that a pea can be cut into finitely
many pieces, which can then be reassembled (by rotation and transportation)
to form an entity the shape and size of Big Ben.
None of this, however, holds in ZF on its own.5 So we face a decision:
reject Choice, or learn to live with the “paradox”.
We’re going to suggest that we should learn to live with the “paradox”.
Indeed, we don’t think it’s much of a paradox at all. In particular, we don’t
see why it is any more or less paradoxical than any of the following results:6

1. There are as many points in the interval (0, 1) as in R. Proof: consider tan(π(r − 1/2)).

2. There are as many points in a line as in a square. See section 73.3 and section 73.5.

3. There are space-filling curves. See section 73.3 and section 73.6.

None of these three results require Choice. Indeed, we now just regard them
as surprising, lovely, bits of mathematics. Maybe we should adopt the same
attitude to the Banach–Tarski Paradox.
To be sure, a technical observation is required here; but it only requires
keeping a level head. Rigid motions preserve volume. Consequently, the five7
pieces into which the ball is decomposed cannot all be measurable. Roughly
put, then, it makes no sense to assign a volume to these individual pieces. You
should think of these as unpicturable, “infinite scatterings” of points. Now,
maybe it is “weird” to conceive of such “infinitely scattered” sets. But their
existence seems to fall out from the injunction, embodied in Stages-accumulate,
that you should form all possible collections of earlier-available sets.
If none of that convinces, here is a final (extrinsic) argument in favour of
embracing the Banach–Tarski Paradox. It immediately entails the best math
joke of all time:

Question. What’s an anagram of “Banach–Tarski”?

Answer. “Banach–Tarski Banach–Tarski”.

4 See Tomkowicz and Wagon (2016, Theorem 3.12).
5 Though Banach–Tarski can be proved with principles which are strictly weaker than
Choice; see Tomkowicz and Wagon (2016, 303).
6 Potter (2004, 276–7), Weston (2003, 16), Tomkowicz and Wagon (2016, 31, 308–9),
make similar points, using other examples.


7 We stated the Paradox in terms of “finitely many pieces”. In fact, Robinson (1947)
proved that the decomposition can be achieved with five pieces (but no fewer). For a proof,
see Tomkowicz and Wagon (2016, pp. 66–7).


69.8 Appendix: Vitali’s Paradox


To get a real sense of whether the Banach-Tarski construction is acceptable or
not, we should examine its proof. Unfortunately, that would require much more
algebra than we can present here. However, we can offer some quick remarks
which might shed some light on the proof of Banach-Tarski,8 by focussing
on the following result:

Theorem 69.14 (Vitali’s Paradox (in ZFC)). Any circle can be decom-
posed into countably many pieces, which can be reassembled (by rotation and
transportation) to form two copies of that circle.

Vitali’s Paradox is much easier to prove than the Banach–Tarski Paradox.
We have called it “Vitali’s Paradox”, since it follows from Vitali’s 1905 con-
struction of an unmeasurable set. But the set-theoretic aspects of the proof of
Vitali’s Paradox and the Banach-Tarski Paradox are very similar. The essen-
tial difference between the results is just that Banach-Tarski considers a finite
decomposition, whereas Vitali’s Paradox considers a countably infinite decom-
position. As Weston (2003) puts it, Vitali’s Paradox “is certainly not nearly
as striking as the Banach–Tarski paradox, but it does illustrate that geometric
paradoxes can happen even in ‘simple’ situations.”
Vitali’s Paradox concerns a two-dimensional figure, a circle. So we will work
on the plane, R2 . Let R be the set of (clockwise) rotations of points around
the origin by rational multiples of 2π, i.e., by angles 2πq radians for rational q ∈ [0, 1). Here are some algebraic
facts about R (if you don’t understand the statement of the result, the proof
will make its meaning clear):

Lemma 69.15. R forms an abelian group under composition of functions.

Proof. First, R is closed under composition: the rotations by 2πq and 2πq′ compose to the rotation by 2π(q + q′ ), or by 2π(q + q′ − 1) if q + q′ ≥ 1; either way, the angle is a rational multiple of 2π.
Writing 0R for the rotation by 0 radians, this is an identity element for
R, since ρ ◦ 0R = 0R ◦ ρ = ρ for any ρ ∈ R.
Every element has an inverse. Where ρ ∈ R rotates by r radians, ρ−1 ∈ R
rotates by 2π − r radians, so that ρ ◦ ρ−1 = 0R .
Composition is associative: (τ ◦ σ) ◦ ρ = τ ◦ (σ ◦ ρ) for any ρ, σ, τ ∈ R.
Composition is commutative: σ ◦ ρ = ρ ◦ σ for any ρ, σ ∈ R.

In fact, we can split our group R in half, and then use either half to recover
the whole group:

Lemma 69.16. There is a partition of R into two disjoint sets, R1 and R2 ,
both of which are a basis for R.

Proof. Let R1 consist of the rotations in R by angles in [0, π); let R2 = R \ R1 .
By elementary algebra, {ρ ◦ ρ : ρ ∈ R1 } = R: the rotation by 2πq, for rational
q ∈ [0, 1), is ρ ◦ ρ, where ρ ∈ R1 is the rotation by πq. A similar result can be
obtained for R2 .
8 For a much fuller treatment, see Weston (2003) or Tomkowicz and Wagon (2016).


We will use this fact about groups to establish Theorem 69.14. Let S be
the unit circle, i.e., the set of points exactly 1 unit away from the origin of
the plane, i.e., {⟨r, s⟩ ∈ R² : √(r² + s²) = 1}. We will split S into parts by
considering the following relation on S:

r ∼ s iff (∃ρ ∈ R)ρ(r) = s.

That is, the points of S are linked by this relation iff you can get from one to
the other by a rotation in R about the origin. Unsurprisingly:

Lemma 69.17. ∼ is an equivalence relation.

Proof. Trivial, using Lemma 69.15.
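
For orientation: ⟨1, 0⟩ ∼ ⟨0, 1⟩, since the clockwise rotation by (3/4) · 2π takes the one point to the other; but ⟨1, 0⟩ ̸∼ ⟨cos 1, − sin 1⟩, since the clockwise rotations taking ⟨1, 0⟩ to ⟨cos 1, − sin 1⟩ are those by 1 + 2πk radians, and no such value is a rational multiple of 2π.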

We now invoke Choice to obtain a set, C, containing exactly one member
from each equivalence class of S under ∼. That is, we consider a choice function
f on the set of equivalence classes,9

E = {[r]∼ : r ∈ S},

and let C = ran(f ). For each rotation ρ ∈ R, the set ρ[C] consists of the points
obtained by applying the rotation ρ to each point in C. These next two results
show that these sets cover the circle completely and without overlap:
Lemma 69.18. S = ⋃_{ρ∈R} ρ[C].

Proof. Fix s ∈ S; there is some r ∈ C such that r ∈ [s]∼ , i.e., r ∼ s, i.e.,
ρ(r) = s for some ρ ∈ R.

Lemma 69.19. If ρ1 ̸= ρ2 then ρ1 [C] ∩ ρ2 [C] = ∅.

Proof. Suppose s ∈ ρ1 [C] ∩ ρ2 [C]. So s = ρ1 (r1 ) = ρ2 (r2 ) for some r1 , r2 ∈ C.
Hence ρ2⁻¹(ρ1 (r1 )) = r2 , and ρ2⁻¹ ◦ ρ1 ∈ R, so r1 ∼ r2 . So r1 = r2 , as C selects
exactly one member from each equivalence class under ∼. So s = ρ1 (r1 ) =
ρ2 (r1 ), and hence ρ1 = ρ2 .

We now apply our earlier algebraic facts to our circle:

Lemma 69.20. There is a partition of S into two disjoint sets, D1 and D2 ,
such that D1 can be partitioned into countably many sets which can be rotated
to form a copy of S (and similarly for D2 ).
9 Since R is enumerable, each element of E is enumerable. Since S is non-enumerable, it
follows from Lemma 69.18 and Proposition 68.12 that E is non-enumerable. So this is a use
of uncountable Choice.


Proof. Using R1 and R2 from Lemma 69.16, let:

D1 = ⋃_{ρ∈R1} ρ[C]        D2 = ⋃_{ρ∈R2} ρ[C]

This is a partition of S, by Lemma 69.18, and D1 and D2 are disjoint by
Lemma 69.19. By construction, D1 can be partitioned into countably many
sets, ρ[C] for each ρ ∈ R1 . And these can be rotated to form a copy of S, since
S = ⋃_{ρ∈R} ρ[C] = ⋃_{ρ∈R1} (ρ ◦ ρ)[C] by Lemma 69.16 and Lemma 69.18. The
same reasoning applies to D2 .

This immediately entails Vitali’s Paradox. For we can generate two copies
of S from S, just by splitting it up into countably many pieces (the various
ρ[C]’s) and then rigidly moving them (simply rotate each piece of D1 , and first
transport and then rotate each piece of D2 ).
Let’s recap the proof-strategy. We started with some algebraic facts about
the group of rotations on the plane. We used this group to partition S into
equivalence classes. We then arrived at a “paradox”, by using Choice to select
elements from each class.
We use exactly the same strategy to prove Banach–Tarski. The main dif-
ference is that the algebraic facts used to prove Banach–Tarski are significantly
more complicated than those used to prove Vitali’s Paradox. But those alge-
braic facts have nothing to do with Choice. We will summarise them quickly.
To prove Banach–Tarski, we start by establishing an analogue of Lemma 69.16:
any free group can be split into four pieces, which intuitively we can “move
around” to recover two copies of the whole group.10 We then show that we can
use two particular rotations around the origin of R3 to generate a free group of
rotations, F .11 (No Choice yet.) We now regard points on the surface of the
sphere as “similar” iff one can be obtained from the other by a rotation in F .
We then use Choice to select exactly one point from each equivalence class of
“similar” points. Applying our division of F to the surface of the sphere, as
in Lemma 69.20, we split that surface into four pieces, which we can “move
around” to obtain two copies of the surface of the sphere. And this establishes
(Hausdorff, 1914):

Theorem 69.21 (Hausdorff ’s Paradox (in ZFC)). The surface of any sphere
can be decomposed into finitely many pieces, which can be reassembled (by ro-
tation and transportation) to form two disjoint copies of that sphere.

A couple of further algebraic tricks are needed to obtain the full Banach-
Tarski Theorem (which concerns not just the sphere’s surface, but its interior
too). Frankly, however, this is just icing on the algebraic cake. Hence Weston
writes:
10 The fact that we can use four pieces is due to Robinson (1947). For a recent proof, see
Tomkowicz and Wagon (2016, Theorem 5.2). We follow Weston (2003, p. 3) in describing
this as “moving” the pieces of the group.
11 See Tomkowicz and Wagon (2016, Theorem 2.1).


[. . . ] the result on free groups is the key step in the proof of the
Banach-Tarski paradox. From this point of view, the Banach-Tarski
paradox is not a statement about R3 so much as it is a statement
about the complexity of the group [of translations and rotations in
R3 ]. (Weston, 2003, p. 16)

That is: whether we can offer a finite decomposition (as in Banach–Tarski) or a
countably infinite decomposition (as in Vitali’s Paradox) comes down to certain
group-theoretic facts about working in two or three dimensions.
Admittedly, this last observation slightly spoils the joke at the end of sec-
tion 69.7. Since it is two dimensional, “Banach-Tarski” must be divided into
a countable infinity of pieces, if one wants to rearrange those pieces to form
“Banach-Tarski Banach-Tarski”. To repair the joke, one must write in three
dimensions. We leave this as an exercise for the reader.
One final comment. In section 69.7, we mentioned that the “pieces” of the
sphere one obtains cannot be measurable, but must be unpicturable “infinite
scatterings”. The same is true of our use of Choice in obtaining Lemma 69.20.
And this is all worth explaining.
Again, we must sketch some background (but this is just a sketch; you may
want to consult a textbook entry on measure). To define a measure for a set
X is to assign a value µ(E) ∈ R for each E in some “σ-algebra” on X. Details
here are not essential, except that the function µ must obey the principle of
countable additivity: the measure of a countable union of disjoint sets is the
sum of their individual measures, i.e., µ(⋃_{n<ω} Xn ) = Σ_{n<ω} µ(Xn ) whenever
the Xn s are disjoint. To say that a set is “unmeasurable” is to say that no
measure can be suitably assigned. Now, using our R from before:

Corollary 69.22 (Vitali). Let µ be a measure such that µ(S) = 1, and such
that µ(X) = µ(Y ) if X and Y are congruent. Then ρ[C] is unmeasurable for
all ρ ∈ R.

Proof. For reductio, suppose otherwise. So let µ(σ[C]) = r for some σ ∈ R
and some r ∈ ℝ. For any ρ ∈ R, ρ[C] and σ[C] are congruent, and hence
µ(ρ[C]) = r for any ρ ∈ R. By Lemma 69.18 and Lemma 69.19, S = ⋃_{ρ∈R} ρ[C]
is a countable union of pairwise disjoint sets. So countable additivity dictates
that µ(S) = 1 is the sum of the measures of each ρ[C], i.e.,

1 = µ(S) = Σ_{ρ∈R} µ(ρ[C]) = Σ_{ρ∈R} r

But if r = 0 then Σ_{ρ∈R} r = 0, and if r > 0 then Σ_{ρ∈R} r = ∞.



Part XV

Methods
This part covers general and methodological material, especially ex-
planations of various proof methods a non-mathematics student may be
unfamiliar with. It currently contains a chapter on how to write proofs,
and a chapter on induction, but additional sections for those, exercises,
and a chapter on mathematical terminology are also planned.

Chapter 70

Proofs


70.1 Introduction
Based on your experiences in introductory logic, you might be comfortable with
a derivation system—probably a natural deduction or Fitch style derivation
system, or perhaps a proof-tree system. You probably remember doing proofs
in these systems, either proving a formula or showing that a given argument is
valid. In order to do this, you applied the rules of the system until you got
the desired end result. In reasoning about logic, we also prove things, but in
most cases we are not using a derivation system. In fact, most of the proofs we
consider are done in English (perhaps, with some symbolic language thrown
in) rather than entirely in the language of first-order logic. When constructing
such proofs, you might at first be at a loss—how do I prove something without
a derivation system? How do I start? How do I know if my proof is correct?
Before attempting a proof, it’s important to know what a proof is and how
to construct one. As implied by the name, a proof is meant to show that
something is true. You might think of this in terms of a dialogue—someone
asks you if something is true, say, if every prime other than two is an odd
number. To answer “yes” is not enough; they might want to know why. In this
case, you’d give them a proof.
In everyday discourse, it might be enough to gesture at an answer, or give
an incomplete answer. In logic and mathematics, however, we want rigorous
proof—we want to show that something is true beyond any doubt. This means
that every step in our proof must be justified, and the justification must be
cogent (i.e., the assumption you’re using is actually assumed in the statement
of the theorem you’re proving, the definitions you apply must be correctly
applied, the justifications appealed to must be correct inferences, etc.).
Usually, we’re proving some statement. We call the statements we’re prov-
ing by various names: propositions, theorems, lemmas, or corollaries. A propo-
sition is a basic proof-worthy statement: important enough to record, but
perhaps not particularly deep nor applied often. A theorem is a significant,
important proposition. Its proof often is broken into several steps, and some-
times it is named after the person who first proved it (e.g., Cantor’s Theorem,
the Löwenheim–Skolem theorem) or after the fact it concerns (e.g., the com-
pleteness theorem). A lemma is a proposition or theorem that is used in the
proof of a more important result. Confusingly, sometimes lemmas are impor-
tant results in themselves, and also named after the person who introduced
them (e.g., Zorn’s Lemma). A corollary is a result that easily follows from
another one.
A statement to be proved often contains assumptions that clarify which
kinds of things we’re proving something about. It might begin with “Let φ be
a formula of the form ψ → χ” or “Suppose Γ ⊢ φ” or something of the sort.
These are hypotheses of the proposition, theorem, or lemma, and you may
assume these to be true in your proof. They restrict what we’re proving, and
also introduce some names for the objects we’re talking about. For instance,
if your proposition begins with “Let φ be a formula of the form ψ → χ,”
you’re proving something about all formulas of a certain sort only (namely,
conditionals), and it’s understood that ψ → χ is an arbitrary conditional that
your proof will talk about.


70.2 Starting a Proof


But where do you even start?
You’ve been given something to prove, so this should be the last thing that
is mentioned in the proof (you can, obviously, announce that you’re going to
prove it at the beginning, but you don’t want to use it as an assumption). Write
what you are trying to prove at the bottom of a fresh sheet of paper—this way
you don’t lose sight of your goal.
Next, you may have some assumptions that you are able to use (this will be
made clearer when we talk about the type of proof you are doing in the next
section). Write these at the top of the page and make sure to flag that they are
assumptions (i.e., if you are assuming p, write “assume that p,” or “suppose
that p”). Finally, there might be some definitions in the question that you
need to know. You might be told to use a specific definition, or there might
be various definitions in the assumptions or conclusion that you are working
towards. Write these down and ensure that you understand what they mean.
How you set up your proof will also be dependent upon the form of the
question. The next section provides details on how to set up your proof based
on the type of sentence.


70.3 Using Definitions


We mentioned that you must be familiar with all definitions that may be used
in the proof, and that you can properly apply them. This is a really important
point, and it is worth looking at in a bit more detail. Definitions are used to
abbreviate properties and relations so we can talk about them more succinctly.
The introduced abbreviation is called the definiendum, and what it abbreviates
is the definiens. In proofs, we often have to go back to how the definiendum
was introduced, because we have to exploit the logical structure of the definiens
(the long version of which the defined term is the abbreviation) to get through
our proof. By unpacking definitions, you’re ensuring that you’re getting to the
heart of where the logical action is.
We’ll start with an example. Suppose you want to prove the following:
Proposition 70.1. For any sets A and B, A ∪ B = B ∪ A.

In order to even start the proof, we need to know what it means for two sets
to be identical; i.e., we need to know what the “=” in that equation means for
sets. Sets are defined to be identical whenever they have the same elements.
So the definition we have to unpack is:
Definition 70.2. Sets A and B are identical, A = B, iff every element of A is
an element of B, and vice versa.

This definition uses A and B as placeholders for arbitrary sets. What it
defines—the definiendum—is the expression “A = B” by giving the condition
under which A = B is true. This condition—“every element of A is an element
of B, and vice versa”—is the definiens.1 The definition specifies that A = B
is true if, and only if (we abbreviate this to “iff”) the condition holds.
When you apply the definition, you have to match the A and B in the
definition to the case you’re dealing with. In our case, it means that in order
1 In this particular case—and very confusingly!—when A = B, the sets A and B are just
one and the same set, even though we use different letters for it on the left and the right side.
But the ways in which that set is picked out may be different, and that makes the definition
non-trivial.

for A ∪ B = B ∪ A to be true, each z ∈ A ∪ B must also be in B ∪ A, and
vice versa. The expression A ∪ B in the proposition plays the role of A in the
definition, and B ∪ A that of B. Since A and B are used both in the definition
and in the statement of the proposition we’re proving, but in different uses,
you have to be careful to make sure you don’t mix up the two. For instance, it
would be a mistake to think that you could prove the proposition by showing
that every element of A is an element of B, and vice versa—that would show
that A = B, not that A ∪ B = B ∪ A. (Also, since A and B may be any two
sets, you won’t get very far, because if nothing is assumed about A and B they
may well be different sets.)
Within the proof we are dealing with set-theoretic notions such as union,
and so we must also know the meanings of the symbol ∪ in order to understand
how the proof should proceed. And sometimes, unpacking the definition gives
rise to further definitions to unpack. For instance, A ∪ B is defined as {z : z ∈
A or z ∈ B}. So if you want to prove that x ∈ A ∪ B, unpacking the definition
of ∪ tells you that you have to prove x ∈ {z : z ∈ A or z ∈ B}. Now you also
have to remember that x ∈ {z : . . . z . . .} iff . . . x . . . . So, further unpacking the
definition of the {z : . . . z . . .} notation, what you have to show is: x ∈ A or
x ∈ B. So, “every element of A ∪ B is also an element of B ∪ A” really means:
“for every x, if x ∈ A or x ∈ B, then x ∈ B or x ∈ A.” If we fully unpack the
definitions in the proposition, we see that what we have to show is this:

Proposition 70.3. For any sets A and B: (a) for every x, if x ∈ A or x ∈ B,
then x ∈ B or x ∈ A, and (b) for every x, if x ∈ B or x ∈ A, then x ∈ A or
x ∈ B.

What’s important is that unpacking definitions is a necessary part of con-
structing a proof. Properly doing it is sometimes difficult: you must be careful
to distinguish and match the variables in the definition and the terms in the
claim you’re proving. In order to be successful, you must know what the ques-
tion is asking and what all the terms used in the question mean—you will often
need to unpack more than one definition. In simple proofs such as the ones
below, the solution follows almost immediately from the definitions themselves.
Of course, it won’t always be this simple.

Problem 70.1. Suppose you are asked to prove that A ∩ B ̸= ∅. Unpack all
the definitions occurring here, i.e., restate this in a way that does not mention
“∩”, “=”, or “∅”.


70.4 Inference Patterns


Proofs are composed of individual inferences. When we make an inference,
we typically indicate that by using a word like “so,” “thus,” or “therefore.”
The inference often relies on one or two facts we already have available in
our proof—it may be something we have assumed, or something that we’ve
concluded by an inference already. To be clear, we may label these things, and
in the inference we indicate what other statements we’re using in the inference.
An inference will often also contain an explanation of why our new conclusion
follows from the things that come before it. There are some common patterns
of inference that are used very often in proofs; we’ll go through some below.
Some patterns of inference, like proofs by induction, are more involved (and
will be discussed later).
We’ve already discussed one pattern of inference: unpacking, or applying,
a definition. When we unpack a definition, we just restate something that
involves the definiendum by using the definiens. For instance, suppose that we
have already established in the course of a proof that D = E (a). Then we
may apply the definition of = for sets and infer: “Thus, by definition from (a),
every element of D is an element of E and vice versa.”
Somewhat confusingly, we often do not write the justification of an inference
when we actually make it, but before. Suppose we haven’t already proved that
D = E, but we want to. If D = E is the conclusion we aim for, then we
can restate this aim also by applying the definition: to prove D = E we have
to prove that every element of D is an element of E and vice versa. So our
proof will have the form: (a) prove that every element of D is an element of E;
(b) every element of E is an element of D; (c) therefore, from (a) and (b) by
definition of =, D = E. But we would usually not write it this way. Instead
we might write something like,
We want to show D = E. By definition of =, this amounts to
showing that every element of D is an element of E and vice versa.
(a) . . . (a proof that every element of D is an element of E) . . .
(b) . . . (a proof that every element of E is an element of D) . . .

Using a Conjunction
Perhaps the simplest inference pattern is that of drawing as conclusion one of
the conjuncts of a conjunction. In other words: if we have assumed or already
proved that p and q, then we’re entitled to infer that p (and also that q). This
is such a basic inference that it is often not mentioned. For instance, once
we’ve unpacked the definition of D = E we’ve established that every element
of D is an element of E and vice versa. From this we can conclude that every
element of E is an element of D (that’s the “vice versa” part).

Proving a Conjunction
Sometimes what you’ll be asked to prove will have the form of a conjunction;
you will be asked to “prove p and q.” In this case, you simply have to do
two things: prove p, and then prove q. You could divide your proof into two
sections, and for clarity, label them. When you’re making your first notes, you

Release : 6891b66 (2024-12-01) 879


CHAPTER 70. PROOFS

might write “(1) Prove p” at the top of the page, and “(2) Prove q” in the
middle of the page. (Of course, you might not be explicitly asked to prove a
conjunction but find that your proof requires that you prove a conjunction. For
instance, if you’re asked to prove that D = E you will find that, after unpacking
the definition of =, you have to prove: every element of D is an element of E
and every element of E is an element of D).

Proving a Disjunction
When what you are proving takes the form of a disjunction (i.e., it is a state-
ment of the form “p or q”), it is enough to show that one of the disjuncts is
true. However, it basically never happens that either disjunct just follows from
the assumptions of your theorem. More often, the assumptions of your theorem
are themselves disjunctive, or you’re showing that all things of a certain kind
have one of two properties, but some of the things have the one and others
have the other property. This is where proof by cases is useful (see below).

Conditional Proof
Many theorems you will encounter are in conditional form (i.e., show that
if p holds, then q is also true). These cases are nice and easy to set up—
simply assume the antecedent of the conditional (in this case, p) and prove the
conclusion q from it. So if your theorem reads, “If p then q,” you start your
proof with “assume p” and at the end you should have proved q.
Conditionals may be stated in different ways. So instead of “If p then q,”
a theorem may state that “p only if q,” “q if p,” or “q, provided p.” These all
mean the same and require assuming p and proving q from that assumption.
Recall that a biconditional (“p if and only if (iff) q”) is really two conditionals
put together: if p then q, and if q then p. All you have to do, then, is two
instances of conditional proof: one for the first conditional and another one for
the second. Sometimes, however, it is possible to prove an “iff” statement by
chaining together a bunch of other “iff” statements so that you start with “p”
and end with “q”—but in that case you have to make sure that each step really
is an “iff.”
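
Here is a simple illustration. Suppose you are asked to prove that if A ⊆ B, then A ∪ B ⊆ B. You would begin your proof with “Assume A ⊆ B,” then argue—unpacking the definitions of ⊆ and ∪—that any x ∈ A ∪ B must be an element of B. The assumption is in force throughout, so what you have proved at the end is the conditional, not A ∪ B ⊆ B outright.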

Universal Claims
Using a universal claim is simple: if something is true for anything, it’s true for
each particular thing. So if, say, the hypothesis of your proof is A ⊆ B, that
means (unpacking the definition of ⊆) that, for every x ∈ A, x ∈ B. Thus, if
you already know that z ∈ A, you can conclude z ∈ B.
Proving a universal claim may seem a little bit tricky. Usually these state-
ments take the following form: “If x has P , then it has Q” or “All P s are Qs.”
Of course, it might not fit this form perfectly, and it takes a bit of practice
to figure out what you’re asked to prove exactly. But: we often have to prove
that all objects with some property have a certain other property.

The way to prove a universal claim is to introduce names or variables for
the things that have the one property and then show that they also have the
other property. We might put this by saying that to prove something for all P s
you have to prove it for an arbitrary P . And the name introduced is a name
for an arbitrary P . We typically use single letters as these names for arbitrary
things, and the letters usually follow conventions: e.g., we use n for natural
numbers, φ for formulas, A for sets, f for functions, etc.
The trick is to maintain generality throughout the proof. You start by as-
suming that an arbitrary object (“x”) has the property P , and show (based
only on definitions or what you are allowed to assume) that x has the prop-
erty Q. Because you have not stipulated what x is specifically, other than that it has
the property P , then you can assert that everything with P has the property Q.
In short, x is a stand-in for all things with property P .

Proposition 70.4. For all sets A and B, A ⊆ A ∪ B.

Proof. Let A and B be arbitrary sets. We want to show that A ⊆ A ∪ B. By
definition of ⊆, this amounts to: for every x, if x ∈ A then x ∈ A ∪ B. So let
x ∈ A be an arbitrary element of A. We have to show that x ∈ A ∪ B. Since
x ∈ A, x ∈ A or x ∈ B. Thus, x ∈ {x : x ∈ A ∨ x ∈ B}. But that, by definition
of ∪, means x ∈ A ∪ B.

Proof by Cases
Suppose you have a disjunction as an assumption or as an already established
conclusion—you have assumed or proved that p or q is true. You want to prove
r. You do this in two steps: first you assume that p is true, and prove r, then
you assume that q is true and prove r again. This works because we assume
or know that one of the two alternatives holds. The two steps establish that
either one is sufficient for the truth of r. (If both are true, we have not one
but two reasons for why r is true. It is not necessary to separately prove that
r is true assuming both p and q.) To indicate what we’re doing, we announce
that we “distinguish cases.” For instance, suppose we know that x ∈ B ∪ C.
B ∪ C is defined as {x : x ∈ B or x ∈ C}. In other words, by definition,
x ∈ B or x ∈ C. We would prove that x ∈ A from this by first assuming that
x ∈ B, and proving x ∈ A from this assumption, and then assume x ∈ C, and
again prove x ∈ A from this. You would write “We distinguish cases” under
the assumption, then “Case (1): x ∈ B” underneath, and “Case (2): x ∈ C”
halfway down the page. Then you’d proceed to fill in the top half and the
bottom half of the page.
Proof by cases is especially useful if what you’re proving is itself disjunctive.
Here’s a simple example:

Proposition 70.5. Suppose B ⊆ D and C ⊆ E. Then B ∪ C ⊆ D ∪ E.

Proof. Assume (a) that B ⊆ D and (b) C ⊆ E. By definition, any x ∈ B is
also ∈ D (c) and any x ∈ C is also ∈ E (d). To show that B ∪ C ⊆ D ∪ E, we
have to show that if x ∈ B ∪ C then x ∈ D ∪ E (by definition of ⊆). x ∈ B ∪ C
iff x ∈ B or x ∈ C (by definition of ∪). Similarly, x ∈ D ∪ E iff x ∈ D or x ∈ E.
So, we have to show: for any x, if x ∈ B or x ∈ C, then x ∈ D or x ∈ E.

So far we’ve only unpacked definitions! We’ve reformulated our
proposition without ⊆ and ∪ and are left with trying to prove a
universal conditional claim. By what we’ve discussed above, this is
done by assuming that x is something about which we assume the
“if” part is true, and we’ll go on to show that the “then” part is
true as well. In other words, we’ll assume that x ∈ B or x ∈ C and
show that x ∈ D or x ∈ E.2

Suppose that x ∈ B or x ∈ C. We have to show that x ∈ D or x ∈ E. We
distinguish cases.
Case 1: x ∈ B. By (c), x ∈ D. Thus, x ∈ D or x ∈ E. (Here we’ve made
the inference discussed in the preceding subsection!)
Case 2: x ∈ C. By (d), x ∈ E. Thus, x ∈ D or x ∈ E.

Proving an Existence Claim


When asked to prove an existence claim, the question will usually be of the form
“prove that there is an x such that . . . x . . . ”, i.e., that there is some object that has
the property described by “. . . x . . . ”. In this case you’ll have to identify a suitable
object and show that it has the required property. This sounds straightforward, but
a proof of this kind can be tricky. Typically it involves constructing or defining
an object and proving that the object so defined has the required property.
Finding the right object may be hard, proving that it has the required property
may be hard, and sometimes it’s even tricky to show that you’ve succeeded in
defining an object at all!
Generally, you’d write this out by specifying the object, e.g., “let x be . . . ”
(where . . . specifies which object you have in mind), possibly proving that . . .
in fact describes an object that exists, and then go on to show that x has the
property Q. Here’s a simple example.
Proposition 70.6. Suppose that x ∈ B. Then there is an A such that A ⊆ B
and A ̸= ∅.

Proof. Assume x ∈ B. Let A = {x}.


Here we’ve defined the set A by enumerating its elements. Since
we assume that x is an object, and we can always form a set by
enumerating its elements, we don’t have to show that we’ve suc-
ceeded in defining a set A here. However, we still have to show that
A has the properties required by the proposition. The proof isn’t
complete without that!
2 This paragraph just explains what we’re doing—it’s not part of the proof, and you don’t
have to go into all this detail when you write down your own proofs.


Since x ∈ A, A ̸= ∅.

This relies on the definition of A as {x} and the obvious facts that
x ∈ {x} and x ∉ ∅.

Since x is the only element of {x}, and x ∈ B, every element of A is also
an element of B. By definition of ⊆, A ⊆ B.

Using Existence Claims


Suppose you know that some existence claim is true (you’ve proved it, or it’s
a hypothesis you can use), say, “for some x, x ∈ A” or “there is an x ∈ A.” If
you want to use it in your proof, you can just pretend that you have a name
for one of the things which your hypothesis says exist. Since A contains at
least one thing, there are things to which that name might refer. You might of
course not be able to pick one out or describe it further (other than that it is
∈ A). But for the purpose of the proof, you can pretend that you have picked
it out and give a name to it. It’s important to pick a name that you haven’t
already used (or that appears in your hypotheses), otherwise things can go
wrong. In your proof, you indicate this by going from “for some x, x ∈ A” to
“Let a ∈ A.” Now you can reason about a, use some other hypotheses, etc.,
until you come to a conclusion, p. If p no longer mentions a, p is independent
of the assumption that a ∈ A, and you’ve shown that it follows just from the
assumption “for some x, x ∈ A.”

Proposition 70.7. If A ̸= ∅, then A ∪ B ̸= ∅.

Proof. Suppose A ̸= ∅. So for some x, x ∈ A.

Here we first just restated the hypothesis of the proposition. This
hypothesis, i.e., A ̸= ∅, hides an existential claim, which you get
to only by unpacking a few definitions. The definition of = tells us
that A = ∅ iff every x ∈ A is also ∈ ∅ and every x ∈ ∅ is also ∈ A.
Negating both sides, we get: A ̸= ∅ iff either some x ∈ A is ∉ ∅ or
some x ∈ ∅ is ∉ A. Since nothing is ∈ ∅, the second disjunct can
never be true, and “x ∈ A and x ∉ ∅” reduces to just x ∈ A. So
A ̸= ∅ iff for some x, x ∈ A. That’s an existence claim. Now we use
that existence claim by introducing a name for one of the elements
of A:

Let a ∈ A.

Now we’ve introduced a name for one of the things ∈ A. We’ll
continue to argue about a, but we’ll be careful to only assume that
a ∈ A and nothing else:

Since a ∈ A, a ∈ A ∪ B, by definition of ∪. So for some x, x ∈ A ∪ B, i.e.,
A ∪ B ̸= ∅.


In that last step, we went from “a ∈ A ∪ B” to “for some x, x ∈
A ∪ B.” That doesn’t mention a anymore, so we know that “for
some x, x ∈ A ∪ B” follows from “for some x, x ∈ A” alone. But
that means that A ∪ B ̸= ∅.

It’s maybe good practice to keep bound variables like “x” separate from
hypothetical names like a, like we did. In practice, however, we often don’t
and just use x, like so:
Suppose A ̸= ∅, i.e., there is an x ∈ A. By definition of ∪, x ∈ A∪B.
So A ∪ B ̸= ∅.
However, when you do this, you have to be extra careful that you use different
x’s and y’s for different existential claims. For instance, the following is not a
correct proof of “If A ̸= ∅ and B ̸= ∅ then A ∩ B ̸= ∅” (which is not true).
Suppose A ̸= ∅ and B ̸= ∅. So for some x, x ∈ A and also for some
x, x ∈ B. Since x ∈ A and x ∈ B, x ∈ A ∩ B, by definition of ∩.
So A ∩ B ̸= ∅.
Can you spot where the incorrect step occurs and explain why the result does
not hold?


70.5 An Example
Our first example is the following simple fact about unions and intersections of
sets. It will illustrate unpacking definitions, proofs of conjunctions, of universal
claims, and proof by cases.
Proposition 70.8. For any sets A, B, and C, A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).

Let’s prove it!


Proof. We want to show that for any sets A, B, and C, A ∪ (B ∩ C) = (A ∪
B) ∩ (A ∪ C).
First we unpack the definition of “=” in the statement of the propo-
sition. Recall that proving sets identical means showing that the
sets have the same elements. That is, all elements of A ∪ (B ∩ C)
are also elements of (A ∪ B) ∩ (A ∪ C), and vice versa. The “vice
versa” means that also every element of (A ∪ B) ∩ (A ∪ C) must be
an element of A ∪ (B ∩ C). So in unpacking the definition, we see
that we have to prove a conjunction. Let’s record this:
By definition, A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C) iff every element of A ∪ (B ∩ C)
is also an element of (A ∪ B) ∩ (A ∪ C), and every element of (A ∪ B) ∩ (A ∪ C)
is an element of A ∪ (B ∩ C).


Since this is a conjunction, we must prove each conjunct separately.


Let’s start with the first: let’s prove that every element of A ∪ (B ∩ C)
is also an element of (A ∪ B) ∩ (A ∪ C).
This is a universal claim, and so we consider an arbitrary element
of A ∪ (B ∩ C) and show that it must also be an element of (A ∪
B) ∩ (A ∪ C). We’ll pick a variable to call this arbitrary element
by, say, z. Our proof continues:

First, we prove that every element of A ∪ (B ∩ C) is also an element of
(A ∪ B) ∩ (A ∪ C). Let z ∈ A ∪ (B ∩ C). We have to show that z ∈ (A ∪ B) ∩ (A ∪ C).

Now it is time to unpack the definition of ∪ and ∩. For instance,
the definition of ∪ is: A ∪ B = {z : z ∈ A or z ∈ B}. When we
apply the definition to “A ∪ (B ∩ C),” the role of the “B” in the
definition is now played by “B ∩ C,” so A ∪ (B ∩ C) = {z : z ∈
A or z ∈ B ∩ C}. So our assumption that z ∈ A ∪ (B ∩ C) amounts
to: z ∈ {z : z ∈ A or z ∈ B ∩ C}. And z ∈ {z : . . . z . . .} iff . . . z
. . . , i.e., in this case, z ∈ A or z ∈ B ∩ C.

By the definition of ∪, either z ∈ A or z ∈ B ∩ C.

Since this is a disjunction, it will be useful to apply proof by cases.


We take the two cases, and show that in each one, the conclusion
we’re aiming for (namely, “z ∈ (A ∪ B) ∩ (A ∪ C)”) obtains.

Case 1: Suppose that z ∈ A.

There’s not much more to work from based on our assumptions. So
let’s look at what we have to work with in the conclusion. We want
to show that z ∈ (A ∪ B) ∩ (A ∪ C). Based on the definition of ∩, if
we want to show that z ∈ (A ∪ B) ∩ (A ∪ C), we have to show that
it’s in both (A ∪ B) and (A ∪ C). But z ∈ A ∪ B iff z ∈ A or z ∈ B,
and we already have (as the assumption of case 1) that z ∈ A. By
the same reasoning—switching C for B—z ∈ A∪C. This argument
went in the reverse direction, so let’s record our reasoning in the
direction needed in our proof.

Since z ∈ A, z ∈ A or z ∈ B, and hence, by definition of ∪, z ∈ A ∪ B.
Similarly, z ∈ A ∪ C. But this means that z ∈ (A ∪ B) ∩ (A ∪ C), by definition
of ∩.

This completes the first case of the proof by cases. Now we want
to derive the conclusion in the second case, where z ∈ B ∩ C.

Case 2: Suppose that z ∈ B ∩ C.

Again, we are working with the intersection of two sets. Let’s apply
the definition of ∩:


Since z ∈ B ∩ C, z must be an element of both B and C, by definition of ∩.

It’s time to look at our conclusion again. We have to show that z is
in both (A ∪ B) and (A ∪ C). And again, the solution is immediate.

Since z ∈ B, z ∈ (A∪B). Since z ∈ C, also z ∈ (A∪C). So, z ∈ (A∪B)∩(A∪C).

Here we applied the definitions of ∪ and ∩ again, but since we’ve
already recalled those definitions, and already showed that if z is in
one of two sets it is in their union, we don’t have to be as explicit
in what we’ve done.
We’ve completed the second case of the proof by cases, so now we
can assert our first conclusion.

So, if z ∈ A ∪ (B ∩ C) then z ∈ (A ∪ B) ∩ (A ∪ C).

Now we just want to show the other direction, that every element of
(A ∪ B) ∩ (A ∪ C) is an element of A ∪ (B ∩ C). As before, we prove
this universal claim by assuming we have an arbitrary element of
the first set and show it must be in the second set. Let’s state what
we’re about to do.

Now, assume that z ∈ (A∪B)∩(A∪C). We want to show that z ∈ A∪(B ∩C).

We are now working from the hypothesis that z ∈ (A∪B)∩(A∪C).


It hopefully isn’t too confusing that we’re using the same z here as
in the first part of the proof. When we finished that part, all the
assumptions we’ve made there are no longer in effect, so now we
can make new assumptions about what z is. If that is confusing to
you, just replace z with a different variable in what follows.
We know that z is in both A ∪ B and A ∪ C, by definition of ∩.
And by the definition of ∪, we can further unpack this to: either
z ∈ A or z ∈ B, and also either z ∈ A or z ∈ C. This looks like
a proof by cases again—except the “and” makes it confusing. You
might think that this amounts to there being three possibilities: z
is either in A, B or C. But that would be a mistake. We have to
be careful, so let’s consider each disjunction in turn.

By definition of ∩, z ∈ A ∪ B and z ∈ A ∪ C. By definition of ∪, z ∈ A or
z ∈ B. We distinguish cases.

Since we’re focusing on the first disjunction, we haven’t gotten our
second disjunction (from unpacking A ∪ C) yet. In fact, we don’t
need it yet. The first case is z ∈ A, and an element of a set is also
an element of the union of that set with any other. So case 1 is
easy:

Case 1: Suppose that z ∈ A. It follows that z ∈ A ∪ (B ∩ C).


Now for the second case, z ∈ B. Here we’ll unpack the second ∪
and do another proof-by-cases:

Case 2: Suppose that z ∈ B. Since z ∈ A ∪ C, either z ∈ A or z ∈ C. We
distinguish cases further:
Case 2a: z ∈ A. Then, again, z ∈ A ∪ (B ∩ C).

Ok, this was a bit weird. We didn’t actually need the assumption
that z ∈ B for this case, but that’s ok.

Case 2b: z ∈ C. Then z ∈ B and z ∈ C, so z ∈ B ∩ C, and consequently,
z ∈ A ∪ (B ∩ C).

This concludes both proofs-by-cases and so we’re done with the
second half.

So, if z ∈ (A ∪ B) ∩ (A ∪ C) then z ∈ A ∪ (B ∩ C).
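
Before relying on an identity like this, it can also be reassuring to spot-check it mechanically on small finite sets. Here is a minimal sketch in Python; the helper name check_distributivity and the example sets are our own illustration, not part of the proposition:

    # Spot-check A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C) on a few finite sets.
    # Python's set operators: | is union, & is intersection.
    def check_distributivity(A, B, C):
        return A | (B & C) == (A | B) & (A | C)

    examples = [
        (set(), set(), set()),
        ({1, 2}, {2, 3}, {3, 4}),
        ({1}, set(), {1, 2, 3}),
    ]
    for A, B, C in examples:
        assert check_distributivity(A, B, C)

A failing assert would signal a counterexample; of course, no number of passing checks establishes the general claim, which is exactly what the proof above supplies.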


70.6 Another Example


Proposition 70.9. If A ⊆ C, then A ∪ (C \ A) = C.

Proof. Suppose that A ⊆ C. We want to show that A ∪ (C \ A) = C.

We begin by observing that this is a conditional statement. It is
tacitly universally quantified: the proposition holds for all sets A
and C. So A and C are variables for arbitrary sets. To prove such
a statement, we assume the antecedent and prove the consequent.
We continue by using the assumption that A ⊆ C. Let’s unpack
the definition of ⊆: the assumption means that all elements of A
are also elements of C. Let’s write this down—it’s an important
fact that we’ll use throughout the proof.

By the definition of ⊆, since A ⊆ C, for all z, if z ∈ A, then z ∈ C.

We’ve unpacked all the definitions that are given to us in the as-
sumption. Now we can move onto the conclusion. We want to show
that A ∪ (C \ A) = C, and so we set up a proof similarly to the
last example: we show that every element of A ∪ (C \ A) is also
an element of C and, conversely, every element of C is an element
of A ∪ (C \ A). We can shorten this to: A ∪ (C \ A) ⊆ C and
C ⊆ A ∪ (C \ A). (Here we’re doing the opposite of unpacking a
definition, but it makes the proof a bit easier to read.) Since this is
a conjunction, we have to prove both parts. To show the first part,


i.e., that every element of A ∪ (C \ A) is also an element of C, we
assume that z ∈ A ∪ (C \ A) for an arbitrary z and show that z ∈ C.
By the definition of ∪, we can conclude that z ∈ A or z ∈ C \ A
from z ∈ A ∪ (C \ A). You should now be getting the hang of this.

A ∪ (C \ A) = C iff A ∪ (C \ A) ⊆ C and C ⊆ A ∪ (C \ A). First we prove that
A ∪ (C \ A) ⊆ C. Let z ∈ A ∪ (C \ A). So, either z ∈ A or z ∈ (C \ A).

We’ve arrived at a disjunction, and from it we want to prove that
z ∈ C. We do this using proof by cases.

Case 1: z ∈ A. Since for all z, if z ∈ A, z ∈ C, we have that z ∈ C.

Here we’ve used the fact recorded earlier which followed from the
hypothesis of the proposition that A ⊆ C. The first case is com-
plete, and we turn to the second case, z ∈ (C \A). Recall that C \A
denotes the difference of the two sets, i.e., the set of all elements
of C which are not elements of A. But any element of C not in A
is in particular an element of C.

Case 2: z ∈ (C \ A). This means that z ∈ C and z ∉ A. So, in particular,
z ∈ C.

Great, we’ve proved the first direction. Now for the second direc-
tion. Here we prove that C ⊆ A ∪ (C \ A). So we assume that z ∈ C
and prove that z ∈ A ∪ (C \ A).

Now let z ∈ C. We want to show that z ∈ A or z ∈ C \ A.

Since all elements of A are also elements of C, and C \ A is the
set of all things that are elements of C but not A, it follows that
z is either in A or in C \ A. This may be a bit unclear if you
don’t already know why the result is true. It would be better to
prove it step-by-step. It will help to use a simple fact which we can
state without proof: z ∈ A or z ∉ A. This is called the “principle
of excluded middle:” for any statement p, either p is true or its
negation is true. (Here, p is the statement that z ∈ A.) Since this
is a disjunction, we can again use proof-by-cases.

Either z ∈ A or z ∉ A. In the former case, z ∈ A ∪ (C \ A). In the latter case,
z ∈ C and z ∉ A, so z ∈ C \ A. But then z ∈ A ∪ (C \ A).

Our proof is complete: we have shown that A ∪ (C \ A) = C.
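
Here too a quick mechanical spot-check of the identity on a finite instance can be reassuring (a minimal sketch; the particular sets A and C are arbitrary choices of ours):

    # Spot-check: if A ⊆ C, then A ∪ (C \ A) = C.
    # In Python, <= tests the subset relation and - is set difference.
    A = {1, 2}
    C = {1, 2, 3, 4}
    assert A <= C               # the hypothesis A ⊆ C
    assert A | (C - A) == C     # the conclusion A ∪ (C \ A) = C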



70.7 Proof by Contradiction


In the first instance, proof by contradiction is an inference pattern that is used
to prove negative claims. Suppose you want to show that some claim p is false,
i.e., you want to show ¬p. The most promising strategy is to (a) suppose that
p is true, and (b) show that this assumption leads to something you know
to be false. “Something known to be false” may be a result that conflicts
with—contradicts—p itself, or some other hypothesis of the overall claim you
are considering. For instance, a proof of “if q then ¬p” involves assuming that
q is true and proving ¬p from it. If you prove ¬p by contradiction, that means
assuming p in addition to q. If you can prove ¬q from p, you have shown that
the assumption p leads to something that contradicts your other assumption q,
since q and ¬q cannot both be true. Of course, you have to use other inference
patterns in your proof of the contradiction, as well as unpacking definitions.
Let’s consider an example.

Proposition 70.10. If A ⊆ B and B = ∅, then A has no elements.

Proof. Suppose A ⊆ B and B = ∅. We want to show that A has no elements.

Since this is a conditional claim, we assume the antecedent and want
to prove the consequent. The consequent is: A has no elements. We
can make that a bit more explicit: it’s not the case that there is
an x ∈ A.

A has no elements iff it’s not the case that there is an x such that x ∈ A.

So we’ve determined that what we want to prove is really a negative
claim ¬p, namely: it’s not the case that there is an x ∈ A. To
use proof by contradiction, we have to assume the corresponding
positive claim p, i.e., there is an x ∈ A, and prove a contradiction
from it. We indicate that we’re doing a proof by contradiction by
writing “by way of contradiction, assume” or even just “suppose
not,” and then state the assumption p.

Suppose not: there is an x ∈ A.

This is now the new assumption we’ll use to obtain a contradiction.


We have two more assumptions: that A ⊆ B and that B = ∅. The
first gives us that x ∈ B:

Since A ⊆ B, x ∈ B.

But since B = ∅, every element of B (e.g., x) must also be an element of ∅.

Since B = ∅, x ∈ ∅. This is a contradiction, since by definition ∅ has no
elements.


This already completes the proof: we’ve arrived at what we need (a
contradiction) from the assumptions we’ve set up, and this means
that the assumptions can’t all be true. Since the first two assumptions
(A ⊆ B and B = ∅) are not contested, it must be the last
assumption introduced (there is an x ∈ A) that is false. But
if we want to be thorough, we can spell this out.

Thus, our assumption that there is an x ∈ A must be false, hence, A has no
elements by proof by contradiction.

Every positive claim is trivially equivalent to a negative claim: p iff ¬¬p.


So proofs by contradiction can also be used to establish positive claims “indi-
rectly,” as follows: To prove p, read it as the negative claim ¬¬p. If we can
prove a contradiction from ¬p, we’ve established ¬¬p by proof by contradiction,
and hence p.
In the last example, we aimed to prove a negative claim, namely that A
has no elements, and so the assumption we made for the purpose of proof
by contradiction (i.e., that there is an x ∈ A) was a positive claim. It gave
us something to work with, namely the hypothetical x ∈ A about which we
continued to reason until we got to x ∈ ∅.
When proving a positive claim indirectly, the assumption you’d make for
the purpose of proof by contradiction would be negative. But very often you
can easily reformulate a positive claim as a negative claim, and a negative
claim as a positive claim. Our previous proof would have been essentially the
same had we proved “A = ∅” instead of the negative consequent “A has no
elements.” (By definition of =, “A = ∅” is a general claim, since it unpacks to
“every element of A is an element of ∅ and vice versa”.) But it is easily seen
to be equivalent to the negative claim “not: there is an x ∈ A.”
So it is sometimes easier to work with ¬p as an assumption than it is to
prove p directly. Even when a direct proof is just as simple or even simpler (as
in the next examples), some people prefer to proceed indirectly. If the double
negation confuses you, think of a proof by contradiction of some claim as a proof
of a contradiction from the opposite claim. So, a proof by contradiction of ¬p
is a proof of a contradiction from the assumption p; and proof by contradiction
of p is a proof of a contradiction from ¬p.

Proposition 70.11. A ⊆ A ∪ B.

Proof. We want to show that A ⊆ A ∪ B.

On the face of it, this is a positive claim: every x ∈ A is also in
A ∪ B. The negation of that is: some x ∈ A is ∉ A ∪ B. So we
can prove the claim indirectly by assuming this negated claim, and
showing that it leads to a contradiction.

Suppose not, i.e., A ⊈ A ∪ B.


We have a definition of A ⊆ A ∪ B: every x ∈ A is also ∈ A ∪ B. To
understand what A ⊈ A ∪ B means, we have to use some elementary
logical manipulation on the unpacked definition: it’s false that every
x ∈ A is also ∈ A ∪ B iff there is some x ∈ A that is ∉ A ∪ B. (This is a
place where you want to be very careful: many students’ attempted
proofs by contradiction fail because they analyze the negation of a
claim like “all As are Bs” incorrectly.) In other words, A ⊈ A ∪ B
iff there is an x such that x ∈ A and x ∉ A ∪ B. From then on, it’s
easy.

So, there is an x ∈ A such that x ∉ A ∪ B. By definition of ∪, x ∈ A ∪ B
iff x ∈ A or x ∈ B. Since x ∈ A, we have x ∈ A ∪ B. This contradicts the
assumption that x ∉ A ∪ B.

Problem 70.2. Prove indirectly that A ∩ B ⊆ A.

Proposition 70.12. If A ⊆ B and B ⊆ C then A ⊆ C.

Proof. Suppose A ⊆ B and B ⊆ C. We want to show A ⊆ C.

Let’s proceed indirectly: we assume the negation of what we want
to establish.

Suppose not, i.e., A ⊈ C.

As before, we reason that A ⊈ C iff not every x ∈ A is also ∈ C,
i.e., some x ∈ A is ∉ C. Don’t worry, with practice you won’t have
to think hard anymore to unpack negations like this.

In other words, there is an x such that x ∈ A and x ∉ C.

Now we can use this to get to our contradiction. Of course, we’ll
have to use the other two assumptions to do it.

Since A ⊆ B, x ∈ B. Since B ⊆ C, x ∈ C. But this contradicts x ∉ C.

Proposition 70.13. If A ∪ B = A ∩ B then A = B.

Proof. Suppose A ∪ B = A ∩ B. We want to show that A = B.

The beginning is now routine:

Assume, by way of contradiction, that A ̸= B.

Our assumption for the proof by contradiction is that A ̸= B. Since
A = B iff A ⊆ B and B ⊆ A, we get that A ̸= B iff A ⊈ B or
B ⊈ A. (Note how important it is to be careful when manipulating
negations!) To prove a contradiction from this disjunction, we use
a proof by cases and show that in each case, a contradiction follows.


A ̸= B iff A ⊈ B or B ⊈ A. We distinguish cases.

In the first case, we assume A ⊈ B, i.e., for some x, x ∈ A but x ∉ B.
A ∩ B is defined as those elements that A and B have in common,
so if something isn’t in one of them, it’s not in the intersection.
A ∪ B is A together with B, so anything in either is also in the
union. This tells us that x ∈ A ∪ B but x ∉ A ∩ B, and hence that
A ∩ B ̸= A ∪ B.

Case 1: A ⊈ B. Then for some x, x ∈ A but x ∉ B. Since x ∉ B,
x ∉ A ∩ B. Since x ∈ A, x ∈ A ∪ B. So, A ∩ B ̸= A ∪ B, contradicting the
assumption that A ∩ B = A ∪ B.
Case 2: B ⊈ A. Then for some y, y ∈ B but y ∉ A. As before, we
have y ∈ A ∪ B but y ∉ A ∩ B, and so A ∩ B ̸= A ∪ B, again contradicting
A ∩ B = A ∪ B.


70.8 Reading Proofs


Proofs you find in textbooks and articles very seldom give all the details we
have so far included in our examples. Authors often do not draw attention to
when they distinguish cases, when they give an indirect proof, or don’t mention
that they use a definition. So when you read a proof in a textbook, you will
often have to fill in those details for yourself in order to understand the proof.
Doing this is also good practice to get the hang of the various moves you have
to make in a proof. Let’s look at an example.

Proposition 70.14 (Absorption). For all sets A, B,

A ∩ (A ∪ B) = A

Proof. If z ∈ A ∩ (A ∪ B), then z ∈ A, so A ∩ (A ∪ B) ⊆ A. Now suppose z ∈ A.
Then also z ∈ A ∪ B, and therefore also z ∈ A ∩ (A ∪ B).

The preceding proof of the absorption law is very condensed. There is no
mention of any definitions used, no “we have to prove that” before we prove
it, etc. Let’s unpack it. The proposition proved is a general claim about any
sets A and B, and when the proof mentions A or B, these are variables for
arbitrary sets. The general claim the proof establishes is what’s required to
prove identity of sets, i.e., that every element of the left side of the identity is
an element of the right and vice versa.

“If z ∈ A ∩ (A ∪ B), then z ∈ A, so A ∩ (A ∪ B) ⊆ A.”


This is the first half of the proof of the identity: it establishes that if an
arbitrary z is an element of the left side, it is also an element of the right, i.e.,
A ∩ (A ∪ B) ⊆ A. Assume that z ∈ A ∩ (A ∪ B). Since z is an element of the
intersection of two sets iff it is an element of both sets, we can conclude that
z ∈ A and also z ∈ A ∪ B. In particular, z ∈ A, which is what we wanted
to show. Since that’s all that has to be done for the first half, we know that
the rest of the proof must be a proof of the second half, i.e., a proof that
A ⊆ A ∩ (A ∪ B).

“Now suppose z ∈ A. Then also z ∈ A ∪ B, and therefore also
z ∈ A ∩ (A ∪ B).”

We start by assuming that z ∈ A, since we are showing that, for any z, if
z ∈ A then z ∈ A ∩ (A ∪ B). To show that z ∈ A ∩ (A ∪ B), we have to show
(by definition of “∩”) that (i) z ∈ A and also (ii) z ∈ A ∪ B. Here (i) is just
our assumption, so there is nothing further to prove, and that’s why the proof
does not mention it again. For (ii), recall that z is an element of a union of
sets iff it is an element of at least one of those sets. Since z ∈ A, and A ∪ B is
the union of A and B, this is the case here. So z ∈ A ∪ B. We’ve shown both
(i) z ∈ A and (ii) z ∈ A ∪ B, hence, by definition of “∩,” z ∈ A ∩ (A ∪ B). The
proof doesn’t mention those definitions; it’s assumed the reader has already
internalized them. If you haven’t, you’ll have to go back and remind yourself
what they are. Then you’ll also have to recognize why it follows from z ∈ A
that z ∈ A ∪ B, and from z ∈ A and z ∈ A ∪ B that z ∈ A ∩ (A ∪ B).
Here’s another version of the proof above, with everything made explicit:

Proof. [By definition of = for sets, to prove A ∩ (A ∪ B) = A we have to show (a)
A ∩ (A ∪ B) ⊆ A and (b) A ⊆ A ∩ (A ∪ B). (a): By definition of ⊆, we have
to show that if z ∈ A ∩ (A ∪ B), then z ∈ A.] If z ∈ A ∩ (A ∪ B), then
z ∈ A [since by definition of ∩, z ∈ A ∩ (A ∪ B) iff z ∈ A and z ∈ A ∪ B], so
A ∩ (A ∪ B) ⊆ A. [(b): By definition of ⊆, we have to show that if z ∈ A, then
z ∈ A ∩ (A ∪ B).] Now suppose [(1)] z ∈ A. Then also [(2)] z ∈ A ∪ B [since by
(1) z ∈ A or z ∈ B, which by definition of ∪ means z ∈ A ∪ B], and therefore
also z ∈ A ∩ (A ∪ B) [since the definition of ∩ requires that z ∈ A, i.e., (1), and
z ∈ A ∪ B, i.e., (2)].

Problem 70.3. Expand the following proof of A ∪ (A ∩ B) = A, where you
mention all the inference patterns used, why each step follows from assump-
tions or claims established before it, and where we have to appeal to which
definitions.

Proof. If z ∈ A ∪ (A ∩ B) then z ∈ A or z ∈ A ∩ B. If z ∈ A ∩ B, z ∈ A. Any
z ∈ A is also ∈ A ∪ (A ∩ B).



70.9 I Can’t Do It!


We all get to a point where we feel like giving up. But you can do it. Your
instructor and teaching assistant, as well as your fellow students, can help. Ask
them for help! Here are a few tips to help you avoid a crisis, and what to do if
you feel like giving up.
To make sure you can solve problems successfully, do the following:
1. Start as far in advance as possible. We get busy throughout the semester
and many of us struggle with procrastination; one of the best things you
can do is to start your homework assignments early. That way, if you’re
stuck, you have time to look for a solution (that isn’t crying).
2. Talk to your classmates. You are not alone. Others in the class may also
struggle—but they may struggle with different things. Talking it out
with your peers can give you a different perspective on the problem that
might lead to a breakthrough. Of course, don’t just copy their solution:
ask them for a hint, or explain where you get stuck and ask them for the
next step. And when you do get it, reciprocate. Helping someone else
along, and explaining things will help you understand better, too.
3. Ask for help. You have many resources available to you—your instructor
and teaching assistant are there for you and want you to succeed. They
should be able to help you work out a problem and identify where in the
process you’re struggling.
4. Take a break. If you’re stuck, it might be because you’ve been staring
at the problem for too long. Take a short break, have a cup of tea, or
work on a different problem for a while, then return to the problem with
a fresh mind. Sleep on it.
Notice how these strategies require that you’ve started to work on the proof
well in advance? If you’ve started the proof at 2am the day before it’s due,
these might not be so helpful.
This might sound like doom and gloom, but solving a proof is a challenge
that pays off in the end. Some people do this as a career—so there must be
something to enjoy about it. Like basically everything, solving problems and
doing proofs is something that requires practice. You might see classmates who
find this easy: they’ve probably just had lots of practice already. Try not to
give in too easily.
If you do run out of time (or patience) on a particular problem: that’s
ok. It doesn’t mean you’re stupid or that you will never get it. Find out (from
your instructor or another student) how it is done, and identify where you went
wrong or got stuck, so you can avoid doing that the next time you encounter
a similar issue. Then try to do it without looking at the solution. And next
time, start (and ask for help) earlier.




70.10 Other Resources
There are many books on how to do proofs in mathematics which may be
useful. Check out How to Read and do Proofs: An Introduction to Mathe-
matical Thought Processes (Solow, 2013) and How to Prove It: A Structured
Approach (Velleman, 2019) in particular. The Book of Proof (Hammack, 2013)
and Mathematical Reasoning (Sandstrum, 2019) are books on proof that are
freely available online. Philosophers might find More Precisely: The Math you
need to do Philosophy (Steinhart, 2018) to be a good primer on mathematical
reasoning.
There are also various shorter guides to proofs available on the internet;
e.g., “Introduction to Mathematical Arguments” (Hutchings, 2003) and “How
to write proofs” (Cheng, 2004).

Motivational Videos
Feel like you have no motivation to do your homework? Feeling down? These
videos might help!

• https://www.youtube.com/watch?v=ZXsQAXx_ao0

• https://www.youtube.com/watch?v=BQ4yd2W50No

• https://www.youtube.com/watch?v=StTqXEQ2l-Y

Chapter 71

Induction


71.1 Introduction
Induction is an important proof technique which is used, in different forms, in
almost all areas of logic, theoretical computer science, and mathematics. It is
needed to prove many of the results in logic.


Induction is often contrasted with deduction, and characterized as the
inference from the particular to the general. For instance, if we observe many
green emeralds, and nothing that we would call an emerald that’s not green, we
might conclude that all emeralds are green. This is an inductive inference, in
that it proceeds from many particular cases (this emerald is green, that emer-
ald is green, etc.) to a general claim (all emeralds are green). Mathematical
induction is also an inference that concludes a general claim, but it is of a very
different kind than this “simple induction.”
Very roughly, an inductive proof in mathematics concludes that all mathe-
matical objects of a certain sort have a certain property. In the simplest case,
the mathematical objects an inductive proof is concerned with are natural
numbers. In that case an inductive proof is used to establish that all natural
numbers have some property, and it does this by showing that
1. 0 has the property, and
2. whenever a number k has the property, so does k + 1.
Induction on natural numbers can then also often be used to prove general
claims about mathematical objects that can be assigned numbers. For instance,
finite sets each have a finite number n of elements, and if we can use induction
to show that every number n has the property “all finite sets of size n are . . . ”
then we will have shown something about all finite sets.
Induction can also be generalized to mathematical objects that are induc-
tively defined. For instance, expressions of a formal language such as those of
first-order logic are defined inductively. Structural induction is a way to prove
results about all such expressions. Structural induction, in particular, is very
useful—and widely used—in logic.


71.2 Induction on N
In its simplest form, induction is a technique used to prove results for all natural
numbers. It uses the fact that by starting from 0 and repeatedly adding 1 we
eventually reach every natural number. So to prove that something is true
for every number, we can (1) establish that it is true for 0 and (2) show that
whenever it is true for a number n, it is also true for the next number n + 1.
If we abbreviate “number n has property P ” by P (n) (and “number k has
property P ” by P (k), etc.), then a proof by induction that P (n) for all n ∈ N
consists of:
1. a proof of P (0), and
2. a proof that, for any k, if P (k) then P (k + 1).
To make this crystal clear, suppose we have both (1) and (2). Then (1) tells us
that P (0) is true. If we also have (2), we know in particular that if P (0) then


P (0 + 1), i.e., P (1). This follows from the general statement “for any k, if P (k)
then P (k + 1)” by putting 0 for k. So by modus ponens, we have that P (1).
From (2) again, now taking 1 for k, we have: if P (1) then P (2). Since we’ve
just established P (1), by modus ponens, we have P (2). And so on. For any
number n, after doing this n times, we eventually arrive at P (n). So (1) and (2)
together establish P (n) for any n ∈ N.
Let’s look at an example. Suppose we want to find out how many different
sums we can throw with n dice. Although it might seem silly, let’s start with
0 dice. If you have no dice there’s only one possible sum you can “throw”:
no dots at all, which sums to 0. So the number of different possible throws
is 1. If you have only one die, i.e., n = 1, there are six possible values, 1
through 6. With two dice, we can throw any sum from 2 through 12, that’s
11 possibilities. With three dice, we can throw any number from 3 to 18, i.e.,
16 different possibilities. 1, 6, 11, 16: looks like a pattern: maybe the answer
is 5n + 1? Of course, 5n + 1 is the maximum possible, because there are only
5n + 1 numbers between n, the lowest value you can throw with n dice (all 1’s)
and 6n, the highest you can throw (all 6’s).

Theorem 71.1. With n dice one can throw all 5n + 1 possible values between
n and 6n.

Proof. Let P (n) be the claim: “It is possible to throw any number between n
and 6n using n dice.” To use induction, we prove:

1. The induction basis P (1), i.e., with just one die, you can throw any
number between 1 and 6.

2. The induction step, for all k, if P (k) then P (k + 1).

(1) is proved by inspecting a 6-sided die. It has all 6 sides, and every
number between 1 and 6 shows up on one of the sides. So it is possible to
throw any number between 1 and 6 using a single die.
To prove (2), we assume the antecedent of the conditional, i.e., P (k). This
assumption is called the inductive hypothesis. We use it to prove P (k + 1). The
hard part is to find a way of thinking about the possible values of a throw of
k + 1 dice in terms of the possible values of throws of k dice plus a throw of
the extra (k + 1)-st die—this is what we have to do, though, if we want to use
the inductive hypothesis.
The inductive hypothesis says we can get any number between k and 6k
using k dice. If we throw a 1 with our (k + 1)-st die, this adds 1 to the total.
So we can throw any value between k + 1 and 6k + 1 by throwing k dice and
then rolling a 1 with the (k + 1)-st die. What’s left? The values 6k + 2 through
6k + 6. We can get these by rolling k 6s and then a number between 2 and 6
with our (k + 1)-st die. Together, this means that with k + 1 dice we can throw
any of the numbers between k + 1 and 6(k + 1), i.e., we’ve proved P (k + 1)
using the assumption P (k), the inductive hypothesis.
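
For small numbers of dice, the theorem can also be confirmed by brute-force enumeration. The following Python sketch (our own check, not part of the proof) lists every possible throw of n dice and compares the set of sums with the predicted set {n, n + 1, . . . , 6n}:

    from itertools import product

    # For n dice, product(range(1, 7), repeat=n) enumerates all throws.
    for n in range(1, 5):
        sums = {sum(throw) for throw in product(range(1, 7), repeat=n)}
        assert sums == set(range(n, 6 * n + 1))
        assert len(sums) == 5 * n + 1   # the count claimed by the theorem

Note that the loop stops at n = 4 only to keep the enumeration small; the induction in the proof is what covers all n.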


Very often we use induction when we want to prove something about a
series of objects (numbers, sets, etc.) that is itself defined “inductively,” i.e.,
by defining the (n + 1)-st object in terms of the n-th. For instance, we can
define the sum sn of the natural numbers up to n by

s0 = 0
sn+1 = sn + (n + 1)

This definition gives:

s0 = 0,
s1 = s0 + 1 = 1,
s2 = s1 + 2 = 1 + 2 = 3,
s3 = s2 + 3 = 1 + 2 + 3 = 6, etc.

Now we can prove, by induction, that sn = n(n + 1)/2.

Proposition 71.2. sn = n(n + 1)/2.

Proof. We have to prove (1) that s0 = 0 · (0 + 1)/2 and (2) if sk = k(k + 1)/2
then sk+1 = (k + 1)(k + 2)/2. (1) is obvious. To prove (2), we assume the
inductive hypothesis: sk = k(k + 1)/2. Using it, we have to show that sk+1 =
(k + 1)(k + 2)/2.
What is sk+1 ? By the definition, sk+1 = sk + (k + 1). By inductive
hypothesis, sk = k(k + 1)/2. We can substitute this into the previous equation,
and then just need a bit of arithmetic of fractions:

sk+1 = k(k + 1)/2 + (k + 1)
     = k(k + 1)/2 + 2(k + 1)/2
     = (k(k + 1) + 2(k + 1))/2
     = (k + 2)(k + 1)/2.
The important lesson here is that if you’re proving something about some
inductively defined sequence an , induction is the obvious way to go. And even if
it isn’t (as in the case of the possibilities of dice throws), you can use induction
if you can somehow relate the case for k + 1 to the case for k.
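
As with the dice example, the recursive definition of sn translates directly into a loop, and we can compare it against the closed form n(n + 1)/2 for as many values as we like (a minimal sketch; the bound 100 is an arbitrary choice of ours):

    # s_0 = 0 and s_{n+1} = s_n + (n + 1); compare with n(n + 1)/2.
    s = 0
    for n in range(101):
        assert s == n * (n + 1) // 2   # the closed form proved above
        s += n + 1                     # the step s_{n+1} = s_n + (n + 1)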


71.3 Strong Induction




In the principle of induction discussed above, we prove P (0) and also if P (k),
then P (k + 1). In the second part, we assume that P (k) is true and use this
assumption to prove P (k+1). Equivalently, of course, we could assume P (k−1)
and use it to prove P (k)—the important part is that we be able to carry out
the inference from any number to its successor; that we can prove the claim in
question for any number under the assumption it holds for its predecessor.
There is a variant of the principle of induction in which we don’t just assume
that the claim holds for the predecessor k − 1 of k, but for all numbers smaller
than k, and use this assumption to establish the claim for k. This also gives
us the claim P (n) for all n ∈ N. For once we have established P (0), we have
thereby established that P holds for all numbers less than 1. And if we know
that if P (l) for all l < k, then P (k), we know this in particular for k = 1. So
we can conclude P (1). With this we have proved P (0) and P (1), i.e., P (l) for
all l < 2, and since we have also the conditional, if P (l) for all l < 2, then P (2),
we can conclude P (2), and so on.
In fact, if we can establish the general conditional “for all k, if P (l) for all
l < k, then P (k),” we do not have to establish P (0) anymore, since it follows
from it. For remember that a general claim like “for all l < k, P (l)” is true if
there are no l < k. This is a case of vacuous quantification: “all As are Bs” is
true if there are no As, ∀x (φ(x) → ψ(x)) is true if no x satisfies φ(x). In this
case, the formalized version would be “∀l (l < k → P (l))”—and that is true if
there are no l < k. And if k = 0 that’s exactly the case: no l < 0, hence “for
all l < 0, P (0)” is true, whatever P is. A proof of “if P (l) for all l < k, then
P (k)” thus automatically establishes P (0).
This variant is useful if establishing the claim for k can’t be made to just
rely on the claim for k − 1 but may require the assumption that it is true for
one or more l < k.


71.4 Inductive Definitions


In logic we very often define kinds of objects inductively, i.e., by specifying rules
for what counts as an object of the kind to be defined which explain how to get
new objects of that kind from old objects of that kind. For instance, we often
define special kinds of sequences of symbols, such as the terms and formulas of
a language, by induction. For a simple example, consider strings of consisting
of letters a, b, c, d, the symbol ◦, and brackets [ and ], such as “[[c ◦ d][”,
“[a[]◦]”, “a” or “[[a ◦ b] ◦ d]”. You probably feel that there’s something “wrong”
with the first two strings: the brackets don’t “balance” at all in the first, and
you might feel that the “◦” should “connect” expressions that themselves make
sense. The third and fourth string look better: for every “[” there’s a closing
“]” (if there are any at all), and for any ◦ we can find “nice” expressions on
either side, surrounded by a pair of parentheses.


We would like to precisely specify what counts as a “nice term.” First of all,
every letter by itself is nice. Anything that’s not just a letter by itself should
be of the form “[t ◦ s]” where s and t are themselves nice. Conversely, if t and
s are nice, then we can form a new nice term by putting a ◦ between them and
surround them by a pair of brackets. We might use these operations to define
the set of nice terms. This is an inductive definition.
Definition 71.3 (Nice terms). The set of nice terms is inductively defined
as follows:
1. Any letter a, b, c, d is a nice term.
2. If s1 and s2 are nice terms, then so is [s1 ◦ s2 ].
3. Nothing else is a nice term.

This definition tells us that something counts as a nice term iff it can be
constructed according to the two conditions (1) and (2) in some finite number
of steps. In the first step, we construct all nice terms just consisting of letters
by themselves, i.e.,
a, b, c, d
In the second step, we apply (2) to the terms we’ve constructed. We’ll get

[a ◦ a], [a ◦ b], [b ◦ a], . . . , [d ◦ d]

for all combinations of two letters. In the third step, we apply (2) again, to
any two nice terms we’ve constructed so far. We get new nice terms such as
[a◦[a◦a]]—where t is a from step 1 and s is [a◦a] from step 2—and [[b◦c]◦[d◦b]]
constructed out of the two terms [b ◦ c] and [d ◦ b] from step 2. And so on.
Clause (3) rules out that anything not constructed in this way sneaks into the
set of nice terms.
Note that we have not yet proved that every sequence of symbols that
“feels” nice is nice according to this definition. However, it should be clear that
everything we can construct does in fact “feel nice”: brackets are balanced, and
◦ connects parts that are themselves nice.
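
The inductive definition also tells us how to recognize nice terms mechanically: a string is nice iff it is a single letter, or it starts with [, ends with ], and its inside splits at some ◦ into two nice terms. A minimal sketch in Python (the function name is_nice is our own, and terms are written without spaces):

    # Decide whether a string is a nice term (Definition 71.3).
    def is_nice(t):
        if t in ("a", "b", "c", "d"):          # clause (1): a letter
            return True
        if len(t) < 5 or t[0] != "[" or t[-1] != "]":
            return False
        inner = t[1:-1]
        for i in range(len(inner)):            # clause (2): try each ◦
            if inner[i] == "◦":
                s1, s2 = inner[:i], inner[i + 1:]
                if is_nice(s1) and is_nice(s2):
                    return True
        return False                           # clause (3): nothing else

    assert is_nice("a")
    assert is_nice("[[a◦b]◦d]")
    assert not is_nice("[[c◦d][")
    assert not is_nice("[a[]◦]")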
The key feature of inductive definitions is that if you want to prove some-
thing about all nice terms, the definition tells you which cases you must con-
sider. For instance, if you are told that t is a nice term, the inductive definition
tells you what t can look like: t can be a letter, or it can be [s1 ◦s2 ] for some pair
of nice terms s1 and s2 . Because of clause (3), those are the only possibilities.
When proving claims about all of an inductively defined set, the strong form
of induction becomes particularly important. For instance, suppose we want
to prove that for every nice term of length n, the number of [ in it is < n/2.
This can be seen as a claim about all n: for every n, the number of [ in any
nice term of length n is < n/2.
Proposition 71.4. For any n, the number of [ in a nice term of length n is
< n/2.


Proof. To prove this result by (strong) induction, we have to show that the
following conditional claim is true:

If for every l < k, any nice term of length l has < l/2 [’s, then any
nice term of length k has < k/2 [’s.

To show this conditional, assume that its antecedent is true, i.e., assume that
for any l < k, nice terms of length l contain < l/2 [’s. We call this assumption
the inductive hypothesis. We want to show the same is true for nice terms of
length k.
So suppose t is a nice term of length k. Because nice terms are inductively
defined, we have two cases: (1) t is a letter by itself, or (2) t is [s1 ◦ s2 ] for some
nice terms s1 and s2 .

1. t is a letter. Then k = 1, and the number of [ in t is 0. Since 0 < 1/2,
the claim holds.

2. t is [s1 ◦ s2 ] for some nice terms s1 and s2 . Let’s let l1 be the length
of s1 and l2 be the length of s2 . Then the length k of t is l1 + l2 + 3 (the
lengths of s1 and s2 plus three symbols [, ◦, ]). Since l1 + l2 + 3 is always
greater than l1 , l1 < k. Similarly, l2 < k. That means that the induction
hypothesis applies to the terms s1 and s2 : the number m1 of [ in s1 is
< l1 /2, and the number m2 of [ in s2 is < l2 /2.
The number of [ in t is the number of [ in s1 , plus the number of [ in s2 ,
plus 1, i.e., it is m1 + m2 + 1. Since m1 < l1 /2 and m2 < l2 /2 we have:

m1 + m2 + 1 < l1 /2 + l2 /2 + 1 = (l1 + l2 + 2)/2 < (l1 + l2 + 3)/2 = k/2.

In each case, we’ve shown that the number of [ in t is < k/2 (on the basis of
the inductive hypothesis). By strong induction, the proposition follows.

Problem 71.1. Define the set of supernice terms by

1. Any letter a, b, c, d is a supernice term.

2. If s is a supernice term, then so is [s].

3. If s1 and s2 are supernice terms, then so is [s1 ◦ s2 ].

4. Nothing else is a supernice term.

Show that the number of [ in a supernice term t of length n is ≤ n/2 + 1.



71.5 Structural Induction


So far we have used induction to establish results about all natural numbers.
But a corresponding principle can be used directly to prove results about all
elements of an inductively defined set. This is often called structural induction,
because it depends on the structure of the inductively defined objects.
Generally, an inductive definition is given by (a) a list of “initial” elements
of the set and (b) a list of operations which produce new elements of the set
from old ones. In the case of nice terms, for instance, the initial objects are
the letters. We only have one operation:

o(s1 , s2 ) = [s1 ◦ s2 ]
You can even think of the natural numbers N themselves as being given by an
inductive definition: the initial object is 0, and the operation is the successor
function x + 1.
In order to prove something about all elements of an inductively defined
set, i.e., that every element of the set has a property P , we must:
1. Prove that the initial objects have P
2. Prove that for each operation o, if the arguments have P , so does the
result.
For instance, in order to prove something about all nice terms, we would prove
that it is true about all letters, and that it is true about [s1 ◦ s2 ] provided it is
true of s1 and s2 individually.
Proposition 71.5. The number of [ equals the number of ] in any nice term t.

Proof. We use structural induction. Nice terms are inductively defined, with
letters as initial objects and the operation o for constructing new nice terms
out of old ones.
1. The claim is true for every letter, since the number of [ in a letter by
itself is 0 and the number of ] in it is also 0.
2. Suppose the number of [ in s1 equals the number of ], and the same is
true for s2 . The number of [ in o(s1 , s2 ), i.e., in [s1 ◦ s2 ], is the sum of
the number of [ in s1 and s2 plus one. The number of ] in o(s1 , s2 ) is the
sum of the number of ] in s1 and s2 plus one. Thus, the number of [ in
o(s1 , s2 ) equals the number of ] in o(s1 , s2 ).
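
The bracket-counting claim can also be checked by generating nice terms in rounds, exactly as the inductive definition describes. A sketch (generation is cut off after two rounds to keep the set finite; the variable names are ours):

    # Build nice terms bottom-up and check Proposition 71.5 for each.
    terms = {"a", "b", "c", "d"}               # the initial objects
    for _ in range(2):                         # two rounds of o(s1, s2)
        terms |= {"[" + s1 + "◦" + s2 + "]"
                  for s1 in terms for s2 in terms}
    for t in terms:
        assert t.count("[") == t.count("]")

Such a check only covers the terms generated so far; the structural induction above is what covers all of them.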

Problem 71.2. Prove by structural induction that no nice term starts with ].

Let’s give another proof by structural induction: a proper initial segment
of a string t of symbols is any string s that agrees with t symbol by symbol,
read from the left, but t is longer. So, e.g., [a ◦ is a proper initial segment of
[a ◦ b], but neither are [b ◦ (they disagree at the second symbol) nor [a ◦ b] (they
are the same length).


Proposition 71.6. Every proper initial segment of a nice term t has more
[’s than ]’s.

Proof. By induction on t:

1. t is a letter by itself: Then t has no proper initial segments.

2. t = [s1 ◦ s2 ] for some nice terms s1 and s2 . If r is a proper initial segment
of t, there are a number of possibilities:

a) r is just [: Then r has one more [ than it does ].


b) r is [r1 where r1 is a proper initial segment of s1 : Since s1 is a nice
term, by induction hypothesis, r1 has more [ than ] and the same is
true for [r1 .
c) r is [s1 or [s1 ◦ : By the previous result, the number of [ and ] in s1
are equal; so the number of [ in [s1 or [s1 ◦ is one more than the
number of ].
d) r is [s1 ◦ r2 where r2 is a proper initial segment of s2 : By induction
hypothesis, r2 contains more [ than ]. By the previous result, the
number of [ and of ] in s1 are equal. So the number of [ in [s1 ◦ r2
is greater than the number of ].
e) r is [s1 ◦ s2 : By the previous result, the number of [ and ] in s1 are
equal, and the same for s2 . So there is one more [ in [s1 ◦ s2 than
there are ].


71.6 Relations and Functions


When we have defined a set of objects (such as the natural numbers or the nice
terms) inductively, we can also define relations on these objects by induction.
For instance, consider the following idea: a nice term t1 is a subterm of a nice
term t2 if it occurs as a part of it. Let’s use a symbol for it: t1 ⊑ t2 . Every
nice term is a subterm of itself, of course: t ⊑ t. We can give an inductive
definition of this relation as follows:

Definition 71.7. The relation of a nice term t1 being a subterm of t2 , t1 ⊑ t2 ,
is defined by induction on t2 as follows:

1. If t2 is a letter, then t1 ⊑ t2 iff t1 = t2 .

2. If t2 is [s1 ◦ s2 ], then t1 ⊑ t2 iff t1 = t2 , t1 ⊑ s1 , or t1 ⊑ s2 .

This definition, for instance, will tell us that a ⊑ [b ◦ a]. For (2) says that
a ⊑ [b ◦ a] iff a = [b ◦ a], or a ⊑ b, or a ⊑ a. The first two are false: a clearly


isn’t identical to [b◦a], and by (1), a ⊑ b iff a = b, which is also false. However,
also by (1), a ⊑ a iff a = a, which is true.
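
Since Definition 71.7 proceeds by induction on t2 , it translates into a recursive function once we can take a compound nice term apart into its two parts. In the sketch below (the names split_term and subterm are ours, and terms are written without spaces), split_term finds the top-level ◦ by tracking bracket depth; that this occurrence is unique is exactly the unique readability property discussed next:

    # t1 ⊑ t2, following the two clauses of Definition 71.7.
    def split_term(t):
        # Return (s1, s2) if t = [s1◦s2], and None if t is a letter.
        if len(t) < 5 or t[0] != "[" or t[-1] != "]":
            return None
        inner, depth = t[1:-1], 0
        for i, c in enumerate(inner):
            if c == "[":
                depth += 1
            elif c == "]":
                depth -= 1
            elif c == "◦" and depth == 0:      # the top-level ◦
                return inner[:i], inner[i + 1:]
        return None

    def subterm(t1, t2):
        parts = split_term(t2)
        if parts is None:                      # clause (1): t2 is a letter
            return t1 == t2
        s1, s2 = parts
        return t1 == t2 or subterm(t1, s1) or subterm(t1, s2)

    assert subterm("a", "[b◦a]")
    assert not subterm("[a◦b]", "[b◦a]")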
It’s important to note that the success of this definition depends on a fact
that we haven’t proved yet: every nice term t is either a letter by itself, or there
are uniquely determined nice terms s1 and s2 such that t = [s1 ◦ s2 ]. “Uniquely
determined” here means that if t = [s1 ◦ s2 ] it isn’t also = [r1 ◦ r2 ] with s1 ̸= r1
or s2 ̸= r2 . If this were the case, then clause (2) may come in conflict with
itself: reading t2 as [s1 ◦ s2 ] we might get t1 ⊑ t2 , but if we read t2 as [r1 ◦ r2 ]
we might get not t1 ⊑ t2 . Before we prove that this can’t happen, let’s look at
an example where it can happen.
Definition 71.8. Define bracketless terms inductively by
1. Every letter is a bracketless term.
2. If s1 and s2 are bracketless terms, then s1 ◦ s2 is a bracketless term.
3. Nothing else is a bracketless term.

Bracketless terms are, e.g., a, b ◦ d, b ◦ a ◦ b. Now if we defined “subterm”
for bracketless terms the way we did above, the second clause would read
If t2 = s1 ◦ s2 , then t1 ⊑ t2 iff t1 = t2 , t1 ⊑ s1 , or t1 ⊑ s2 .
Now b ◦ a ◦ b is of the form s1 ◦ s2 with

s1 = b and s2 = a ◦ b.

It is also of the form r1 ◦ r2 with

r1 = b ◦ a and r2 = b.

Now is a ◦ b a subterm of b ◦ a ◦ b? The answer is yes if we go by the first
reading, and no if we go by the second.
The property that the way a nice term is built up from other nice terms is
unique is called unique readability. Since inductive definitions of relations for
such inductively defined objects are important, we have to prove that it holds.
Proposition 71.9. Suppose t is a nice term. Then either t is a letter by itself,
or there are uniquely determined nice terms s1 , s2 such that t = [s1 ◦ s2 ].

Proof. If t is a letter by itself, the condition is satisfied. So assume t isn’t a
letter by itself. We can tell from the inductive definition that then t must be
of the form [s1 ◦ s2 ] for some nice terms s1 and s2 . It remains to show that
these are uniquely determined, i.e., if t = [r1 ◦ r2 ], then s1 = r1 and s2 = r2 .
So suppose t = [s1 ◦ s2 ] and also t = [r1 ◦ r2 ] for nice terms s1 , s2 , r1 , r2 . We
have to show that s1 = r1 and s2 = r2 . First, s1 and r1 must be identical, for
otherwise one is a proper initial segment of the other. But by Proposition 71.6,
that is impossible if s1 and r1 are both nice terms. But if s1 = r1 , then clearly
also s2 = r2 .


We can also define functions inductively: e.g., we can define the function f
that maps any nice term to the maximum depth of nested [. . . ] in it as follows:
Definition 71.10. The depth of a nice term, f (t), is defined inductively as
follows:

f (t) = 0 if t is a letter
f (t) = max(f (s1 ), f (s2 )) + 1 if t = [s1 ◦ s2 ].

For instance,

f ([a ◦ b]) = max(f (a), f (b)) + 1 = max(0, 0) + 1 = 1, and
f ([[a ◦ b] ◦ c]) = max(f ([a ◦ b]), f (c)) + 1 = max(1, 0) + 1 = 2.
Here, of course, we assume that s1 and s2 are nice terms, and make use of
the fact that every nice term is either a letter or of the form [s1 ◦ s2 ]. It is again
important that it can be of this form in only one way. To see why, consider
again the bracketless terms we defined earlier. The corresponding “definition”
would be:

g(t) = 0 if t is a letter
g(t) = max(g(s1 ), g(s2 )) + 1 if t = s1 ◦ s2 .
Now consider the bracketless term a ◦ b ◦ c ◦ d. It can be read in more than one
way, e.g., as s1 ◦ s2 with
s1 = a and s2 = b ◦ c ◦ d,

or as r1 ◦ r2 with

r1 = a ◦ b and r2 = c ◦ d.
Calculating g according to the first way of reading it would give

g(s1 ◦ s2 ) = max(g(a), g(b ◦ c ◦ d)) + 1 = max(0, 2) + 1 = 3

while according to the other reading we get

g(r1 ◦ r2 ) = max(g(a ◦ b), g(c ◦ d)) + 1 = max(1, 1) + 1 = 2
But a function must always yield a unique value; so our “definition” of g doesn’t
define a function at all.
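
For nice terms, by contrast, the decomposition is unique, so f is a perfectly good function, and it can be computed by recursion just as Definition 71.10 states. A minimal sketch in Python (the name depth is ours, terms are written without spaces, and the inner loop finds the unique top-level ◦ by tracking bracket nesting):

    # f(t) from Definition 71.10: the maximum depth of nested [ ].
    def depth(t):
        if len(t) == 1:                        # a letter
            return 0
        inner, d = t[1:-1], 0                  # t = [s1◦s2]
        for i, c in enumerate(inner):
            if c == "[":
                d += 1
            elif c == "]":
                d -= 1
            elif c == "◦" and d == 0:          # the unique top-level ◦
                s1, s2 = inner[:i], inner[i + 1:]
                return max(depth(s1), depth(s2)) + 1

    assert depth("[a◦b]") == 1
    assert depth("[[a◦b]◦c]") == 2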
Problem 71.3. Give an inductive definition of the function l, where l(t) is
the number of symbols in the nice term t.
Problem 71.4. Prove by structural induction on nice terms t that f (t) < l(t)
(where l(t) is the number of symbols in t and f (t) is the depth of t as defined
in Definition 71.10).



Part XVI

History

Chapter 72

Biographies


72.1 Georg Cantor


An early biography of Georg Cantor
(gay-org kahn-tor) claimed that he was
born and found on a ship that was sailing
for Saint Petersburg, Russia, and that his
parents were unknown. This, however, is
not true; although he was born in Saint
Petersburg in 1845.
Cantor received his doctorate in
mathematics at the University of Berlin
in 1867. He is known for his work in set
theory, and is credited with founding set
theory as a distinctive research discipline.
He was the first to prove that there are
infinite sets of different sizes. His theo-
ries, and especially his theory of infini-
ties, caused much debate among mathe-
maticians at the time, and his work was
controversial.
Figure 72.1: Georg Cantor

Cantor’s religious beliefs and his
mathematical work were inextricably
tied; he even claimed that the theory of
transfinite numbers had been communicated to him directly by God. In later
life, Cantor suffered from mental illness. Beginning in 1894, and more fre-
quently towards his later years, Cantor was hospitalized. The heavy criticism
of his work, including a falling out with the mathematician Leopold Kronecker,
led to depression and a lack of interest in mathematics. During depressive
episodes, Cantor would turn to philosophy and literature, and even published
a theory that Francis Bacon was the author of Shakespeare’s plays.
Cantor died on January 6, 1918, in a sanatorium in Halle.

Further Reading For full biographies of Cantor, see Dauben (1990) and
Grattan-Guinness (1971). Cantor’s radical views are also described in the
BBC Radio 4 program A Brief History of Mathematics (du Sautoy, 2014). If
you’d like to hear about Cantor’s theories in rap form, see Rose (2012).


72.2 Alonzo Church


Alonzo Church was born in Washington,
DC on June 14, 1903. In early childhood,
an air gun incident left Church blind in
one eye. He finished preparatory school
in Connecticut in 1920 and began his uni-
versity education at Princeton that same
year. He completed his doctoral stud-
ies in 1927. After a couple years abroad,
Church returned to Princeton. Church
was known to be exceedingly polite and
careful. His blackboard writing was im-
maculate, and he would preserve impor-
tant papers by carefully covering them
in Duco cement (a clear glue). Outside
of his academic pursuits, he enjoyed reading
science fiction magazines and was not
afraid to write to the editors if he spotted
any inaccuracies in the writing.

Figure 72.2: Alonzo Church
Church’s academic achievements were great. Together with his students
Stephen Kleene and Barkley Rosser, he developed a theory of effective calcula-
bility, the lambda calculus, independently of Alan Turing’s development of the
Turing machine. The two definitions of computability are equivalent, and give
rise to what is now known as the Church–Turing Thesis, that a function of the
natural numbers is effectively computable if and only if it is computable via
Turing machine (or lambda calculus). He also proved what is now known as
Church’s Theorem: The decision problem for the validity of first-order formulas
is unsolvable.


Church continued his work into old age. In 1967 he left Princeton for UCLA,
where he was professor until his retirement in 1990. Church passed away on
August 1, 1995 at the age of 92.

Further Reading For a brief biography of Church, see Enderton (2019).


Church’s original writings on the lambda calculus and the Entscheidungsprob-
lem (Church’s Thesis) are Church (1936a,b). Aspray (1984) records an inter-
view with Church about the Princeton mathematics community in the 1930s.
Church wrote a series of book reviews of the Journal of Symbolic Logic from
1936 until 1979. They are all archived on John MacFarlane’s website (MacFar-
lane, 2015).


72.3 Gerhard Gentzen


Gerhard Gentzen is known primarily
as the creator of structural proof the-
ory, and specifically the creation of the
natural deduction and sequent calculus
derivation systems. He was born on
November 24, 1909 in Greifswald, Ger-
many. Gerhard was homeschooled for
three years before attending preparatory
school, where he was behind most of his
classmates in terms of education. Despite
this, he was a brilliant student and
showed a strong aptitude for mathematics.
His interests were varied, and he, for instance, also wrote poems for his
mother and plays for the school theatre.

Figure 72.3: Gerhard Gentzen
Gentzen began his university studies at the University of Greifswald, but
moved around to Göttingen, Munich, and Berlin. He received his doctorate in
1933 from the University of Göttingen under Hermann Weyl. (Paul Bernays
supervised most of his work, but was dismissed from the university by the
Nazis.) In 1934, Gentzen began work as an assistant to David Hilbert. That
same year he developed the sequent calculus and natural deduction derivation
systems, in his papers Untersuchungen über das logische Schließen I–II [Inves-
tigations Into Logical Deduction I–II]. He proved the consistency of the Peano
axioms in 1936.
Gentzen’s relationship with the Nazis is complicated. At the same time his
mentor Bernays was forced to leave Germany, Gentzen joined the university
branch of the SA, the Nazi paramilitary organization. Like many Germans, he
was a member of the Nazi party. During the war, he served as a telecommu-
nications officer for the air intelligence unit. However, in 1942 he was released
from duty due to a nervous breakdown. It is unclear whether or not Gentzen’s


loyalties lay with the Nazi party, or whether he joined the party in order to
ensure academic success.
In 1943, Gentzen was offered an academic position at the Mathematical
Institute of the German University of Prague, which he accepted. However, in
1945 the citizens of Prague revolted against German occupation. Soviet forces
arrived in the city and arrested all the professors at the university. Because of
his membership in Nazi organizations, Gentzen was taken to a forced labour
camp. He died of malnutrition while in his cell on August 4, 1945 at the age
of 35.

Further Reading For a full biography of Gentzen, see Menzler-Trott (2007).


An interesting read about mathematicians under Nazi rule, which gives a brief
note about Gentzen’s life, is given by Segal (2014). Gentzen’s papers on logical
deduction are available in the original German (Gentzen, 1935a,b). English
translations of Gentzen’s papers have been collected in a single volume by
Szabo (1969), which also includes a biographical sketch.


72.4 Kurt Gödel


Kurt Gödel (ger-dle) was born on
April 28, 1906 in Brünn in the Austro-
Hungarian empire (now Brno in the
Czech Republic). Due to his inquisitive
and bright nature, young Kurtele was
often called “Der kleine Herr Warum”
(Little Mr. Why) by his family. He ex-
celled in academics from primary school
onward, where he got less than the high-
est grade only in mathematics. Gödel
was often absent from school due to
poor health and was exempt from phys-
ical education. He was diagnosed with
rheumatic fever during his childhood.
Throughout his life, he believed this per-
manently affected his heart despite med-
ical assessment saying otherwise.
Figure 72.4: Kurt Gödel

Gödel began studying at the University
of Vienna in 1924 and completed his
doctoral studies in 1929. He first intended
to study physics, but his interests soon moved to mathematics and
especially logic, in part due to the influence of the philosopher Rudolf Car-
nap. His dissertation, written under the supervision of Hans Hahn, proved
the completeness theorem of first-order predicate logic with identity (Gödel,


1929). Only a year later, he obtained his most famous results—the first and
second incompleteness theorems (published in Gödel 1931). During his time
in Vienna, Gödel was heavily involved with the Vienna Circle, a group of
scientifically-minded philosophers that included Carnap, whose work was espe-
cially influenced by Gödel’s results.
In 1938, Gödel married Adele Nimbursky. His parents were not pleased: not
only was she six years older than him and already divorced, but she worked as
a dancer in a nightclub. Social pressures did not affect Gödel, however, and
they remained happily married until his death.
After Nazi Germany annexed Austria in 1938, Gödel and Adele emigrated
to the United States, where he took up a position at the Institute for Advanced
Study in Princeton, New Jersey. Despite his introversion and eccentric nature,
Gödel’s time at Princeton was collaborative and fruitful. He published essays
in set theory, philosophy and physics. Notably, he struck up a particularly
strong friendship with his colleague at the IAS, Albert Einstein.
In his later years, Gödel’s mental health deteriorated. His wife’s hospi-
talization in 1977 meant she was no longer able to cook his meals for him.
Having suffered from mental health issues throughout his life, he succumbed
to paranoia. Deathly afraid of being poisoned, Gödel refused to eat. He died
of starvation on January 14, 1978, in Princeton.

Further Reading For a complete biography of Gödel’s life, see John Dawson
(1997). For further biographical pieces, as well as essays about
Gödel’s contributions to logic and philosophy, see Wang (1990), Baaz et al.
(2011), Takeuti et al. (2003), and Sigmund et al. (2007).
Gödel’s PhD thesis is available in the original German (Gödel, 1929). The
original text of the incompleteness theorems is (Gödel, 1931). All of Gödel’s
published and unpublished writings, as well as a selection of correspondence,
are available in English in his Collected Papers Feferman et al. (1986, 1990).
For a detailed treatment of Gödel’s incompleteness theorems, see Smith
(2013). For an informal, philosophical discussion of Gödel’s theorems, see Mark
Linsenmayer’s podcast (Linsenmayer, 2014).


72.5 Emmy Noether


Emmy Noether (ner-ter) was born in Erlangen, Germany, on March 23, 1882,
to an upper-middle class scholarly family. Hailed as the “mother of modern
algebra,” Noether made groundbreaking contributions to both mathematics
and physics, despite significant barriers to women’s education. In Germany
at the time, young girls were meant to be educated in arts and were not al-
lowed to attend college preparatory schools. However, after auditing classes at
the Universities of Göttingen and Erlangen (where her father was professor of
mathematics), Noether was eventually able to enroll as a student at Erlangen
in 1904, when its policy was updated to allow female students. She received
her doctorate in mathematics in 1907.
Despite her qualifications, Noether
experienced much resistance during her
career. From 1908–1915, she taught
at Erlangen without pay. During this
time, she caught the attention of David
Hilbert, one of the world’s foremost
mathematicians of the time, who invited
her to Göttingen. However, women were
prohibited from obtaining professorships,
and she was only able to lecture under
Hilbert’s name, again without pay. Dur-
ing this time she proved what is now
known as Noether’s theorem, which is
still used in theoretical physics today.
Noether was finally granted the right to
teach in 1919. Hilbert’s response to con-
tinued resistance of his university col-
leagues reportedly was: “Gentlemen, the
faculty senate is not a bathhouse.”
In the later 1920s, she concentrated
on work in abstract algebra, and her contributions revolutionized the field.
In her proofs she often made use of the so-called ascending chain condition,
which states that there is no infinite strictly increasing chain of certain sets.
For instance, certain algebraic structures now known as Noetherian rings have
the property that there are no infinite sequences of ideals I₁ ⊊ I₂ ⊊ ⋯. The
condition can be generalized to any partial order (in algebra, it concerns the
special case of ideals ordered by the subset relation), and we can also consider
the dual descending chain condition, where every strictly decreasing sequence
in a partial order eventually ends. If a partial order satisfies the descending
chain condition, it is possible to use induction along this order in a similar way
in which we can use induction along the < order on N. Such orders are called
well-founded or Noetherian, and the corresponding proof principle Noetherian
induction.
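
To make the proof principle explicit (the formulation below is ours, not a quotation): if a partial order ⟨X, <⟩ satisfies the descending chain condition, then to show that a property P holds of every element of X, it suffices to show that P holds of an element whenever it holds of everything strictly below it:

$$\forall x\,\bigl(\forall y\,(y < x \rightarrow P(y)) \rightarrow P(x)\bigr) \;\Rightarrow\; \forall x\, P(x).$$
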
Noether was Jewish, and when the Nazis came to power in 1933, she was
dismissed from her position. Luckily, Noether was able to emigrate to the
United States for a temporary position at Bryn Mawr, Pennsylvania. During
her time there she also lectured at Princeton, although she found the university
to be unwelcoming to women (Dick, 1981, 81). In 1935, Noether underwent an
operation to remove a uterine tumour. She died from an infection as a result
of the surgery, and was buried at Bryn Mawr.

Further Reading For a biography of Noether, see Dick (1981). The Perime-
ter Institute for Theoretical Physics has their lectures on Noether’s life and
influence available online (Institute, 2015). If you’re tired of reading, Stuff You
Missed in History Class has a podcast on Noether’s life and influence (Frey
and Wilson, 2015). The collected works of Noether are available in the original
German (Jacobson, 1983).


72.6 Rózsa Péter


Rózsa Péter was born Rózsa Politzer,
in Budapest, Hungary, on February 17,
1905. She is best known for her work on
recursive functions, which was essential
for the creation of the field of recursion
theory.
Péter was raised during harsh polit-
ical times—WWI raged when she was
a teenager—but was able to attend the
affluent Maria Terezia Girls’ School in
Budapest, from where she graduated in
1922. She then studied at Pázmány
Péter University (later renamed Loránd
Eötvös University) in Budapest. She be-
gan studying chemistry at the insistence
of her father, but later switched to math-
ematics, and graduated in 1927. Al-
though she had the credentials to teach
high school mathematics, the economic situation at the time was dire as the
Great Depression affected the world economy. During this time, Péter took odd
jobs as a tutor and private teacher of mathematics. She eventually returned
to university to take up graduate studies in mathematics. She had originally
planned to work in number theory, but after finding out that her results had
already been proven, she almost gave up on mathematics altogether. She was
encouraged to work on Gödel’s incompleteness theorems, and unknowingly
proved several of his results in different ways. This restored her confidence,
and Péter went on to write her first papers on recursion theory, inspired by
David Hilbert’s foundational program. She received her PhD in 1935, and in
1937 she became an editor for the Journal of Symbolic Logic.
Péter’s early papers are widely credited as founding contributions to the
field of recursive function theory. In Péter (1935a), she investigated the re-
lationship between different kinds of recursion. In Péter (1935b), she showed
that a certain recursively defined function is not primitive recursive. This
simplified an earlier result due to Wilhelm Ackermann. Péter’s simplified func-
tion is what’s now often called the Ackermann function—and sometimes, more
properly, the Ackermann–Péter function. She wrote the first book on recursive
function theory (Péter, 1951).
Despite the importance and influence of her work, Péter did not obtain a
full-time teaching position until 1945. During the Nazi occupation of Hungary
during World War II, Péter was not allowed to teach due to anti-Semitic laws.
In 1944 the government created a Jewish ghetto in Budapest; the ghetto was
cut off from the rest of the city and attended by armed guards. Péter was
forced to live in the ghetto until 1945 when it was liberated. She then went on
to teach at the Budapest Teachers Training College, and from 1955 onward at
Eötvös Loránd University. She was the first female Hungarian mathematician
to become an Academic Doctor of Mathematics, and the first woman to be
elected to the Hungarian Academy of Sciences.
Péter was known as a passionate teacher of mathematics, who preferred
to explore the nature and beauty of mathematical problems with her students
rather than to merely lecture. As a result, she was affectionately called “Aunt
Rosa” by her students. Péter died in 1977 at the age of 71.

Further Reading For more biographical reading, see (O’Connor and Robert-
son, 2014) and (Andrásfai, 1986). Tamassy (1994) conducted a brief interview
with Péter. For a fun read about mathematics, see Péter’s book Playing With
Infinity (Péter, 2010).


72.7 Julia Robinson


Julia Bowman Robinson was an Ameri-
can mathematician. She is known mainly
for her work on decision problems, and
most famously for her contributions to
the solution of Hilbert’s tenth problem.
Robinson was born in St. Louis, Mis-
souri, on December 8, 1919. Robinson re-
calls being intrigued by numbers already
as a child (Reid, 1986, 4). At age nine
she contracted scarlet fever and suffered
from several recurrent bouts of rheumatic
fever. This forced her to spend much of
her time in bed, putting her behind in
her education. Although she was able to
catch up with the help of private tutors,
the physical effects of her illness had a
lasting impact on her life.
Despite her childhood struggles, Robin-
son graduated high school with several
awards in mathematics and the sciences. She started her university career
at San Diego State College, and transferred to the University of California,
Berkeley, as a senior. There she was influenced by the mathematician Raphael
Robinson. They became good friends, and married in 1941. As a spouse of a
faculty member, Robinson was barred from teaching in the mathematics de-
partment at Berkeley. Although she continued to audit mathematics classes,
she hoped to leave university and start a family. Not long after her wedding,
however, Robinson contracted pneumonia. She was told that there was sub-
stantial scar tissue build up on her heart due to the rheumatic fever she suffered
as a child. Due to the severity of the scar tissue, the doctor predicted that she
would not live past forty and she was advised not to have children (Reid, 1986,
13).
Robinson was depressed for a long time, but eventually decided to continue
studying mathematics. She returned to Berkeley and completed her PhD in
1948 under the supervision of Alfred Tarski. The first-order theory of the real
numbers had been shown to be decidable by Tarski, and from Gödel’s work
it followed that the first-order theory of the natural numbers is undecidable.
It was a major open problem whether the first-order theory of the rationals is
decidable or not. In her thesis (1949), Robinson proved that it was not.
Interested in decision problems, Robinson next attempted to find a solu-
tion to Hilbert’s tenth problem. This problem was one of a famous list of
23 mathematical problems posed by David Hilbert in 1900. The tenth prob-
lem asks whether there is an algorithm that will answer, in a finite amount of
time, whether or not a polynomial equation with integer coefficients, such as
3x² − 2y + 3 = 0, has a solution in the integers. Such questions are known
as Diophantine problems. After some initial successes, Robinson joined forces
with Martin Davis and Hilary Putnam, who were also working on the problem.
They succeeded in showing that exponential Diophantine problems (where the
unknowns may also appear as exponents) are undecidable, and showed that
a certain conjecture (later called “J.R.”) implies that Hilbert’s tenth problem
is undecidable (Davis et al., 1961). Robinson continued to work on the prob-
lem throughout the 1960s. In 1970, the young Russian mathematician Yuri
Matijasevich finally proved the J.R. hypothesis. The combined result is now
called the Matijasevich–Robinson–Davis–Putnam theorem, or MRDP theorem
for short. Matijasevich and Robinson became friends and collaborated on sev-
eral papers. In a letter to Matijasevich, Robinson once wrote that “actually I
am very pleased that working together (thousands of miles apart) we are obvi-
ously making more progress than either one of us could alone” (Matijasevich,
1992, 45).
Robinson was the first female president of the American Mathematical So-
ciety, and the first woman to be elected to the National Academy of Science.
She died on July 30, 1985 at the age of 65 after being diagnosed with leukemia.

Further Reading Robinson’s mathematical papers are available in her Col-
lected Works (Robinson, 1996), which also includes a reprint of her National
Academy of Sciences biographical memoir (Feferman, 1994). Robinson’s older
sister Constance Reid published an “Autobiography of Julia,” based on inter-
views (Reid, 1986), as well as a full memoir (Reid, 1996). A short documentary
about Robinson and Hilbert’s tenth problem was directed by George Csicsery
(Csicsery, 2016). For a brief memoir about Yuri Matijasevich’s collaborations
with Robinson, and her influence on his work, see (Matijasevich, 1992).


72.8 Bertrand Russell


Bertrand Russell is hailed as one of the
founders of modern analytic philosophy.
Born May 18, 1872, Russell was not only
known for his work in philosophy and
logic, but wrote many popular books in
various subject areas. He was also an ar-
dent political activist throughout his life.
Russell was born in Trellech, Mon-
mouthshire, Wales. His parents were
members of the British nobility. They
were free-thinkers, and even made friends
with the radicals in Boston at the
time. Unfortunately, Russell’s parents
died when he was young, and Russell
was sent to live with his grandparents.
There, he was given a religious upbring-
ing (something his parents had wanted
to avoid at all costs). His grandmother
was very strict in all matters of morality.
During adolescence he was mostly home-
schooled by private tutors.
Russell’s influence in analytic philosophy, and especially logic, is tremen-
dous. He studied mathematics and philosophy at Trinity College, Cambridge,
where he was influenced by the mathematician and philosopher Alfred North
Whitehead. In 1910, Russell and Whitehead published the first volume of
Principia Mathematica, where they championed the view that mathematics
is reducible to logic. He went on to publish hundreds of books, essays and
political pamphlets. In 1950, he won the Nobel Prize for literature.
Russell was deeply entrenched in politics and social activism. During
World War I he was arrested and sent to prison for six months due to pacifist
activities and protest. While in prison, he was able to write and read, and
claims to have found the experience “quite agreeable.” He remained a pacifist
throughout his life, and was again incarcerated for attending a nuclear disar-
mament rally in 1961. He also survived a plane crash in 1948, where the only
survivors were those sitting in the smoking section. As such, Russell claimed
that he owed his life to smoking. Russell was married four times, but had a
reputation for carrying on extra-marital affairs. He died on February 2, 1970
at the age of 97 in Penrhyndeudraeth, Wales.

Further Reading Russell wrote an autobiography in three parts, spanning
his life from 1872–1967 (Russell, 1967, 1968, 1969). The Bertrand Russell Re-
search Centre at McMaster University is home of the Bertrand Russell archives.
See their website at Duncan (2015), for information on the volumes of his col-
lected works (including searchable indexes), and archival projects. Russell’s
paper On Denoting (Russell, 1905) is a classic of 20th century analytic philos-
ophy.
The Stanford Encyclopedia of Philosophy entry on Russell (Irvine, 2015)
has sound clips of Russell speaking on Desire and Political theory. Many video
interviews with Russell are available online. To see him talk about smoking
and being involved in a plane crash, e.g., see Russell (n.d.). Some of Russell’s
works, including his Introduction to Mathematical Philosophy are available as
free audiobooks on LibriVox (n.d.).


72.9 Alfred Tarski


Alfred Tarski was born on January 14,
1901 in Warsaw, Poland (then part
of the Russian Empire). Described
as “Napoleonic,” Tarski was boisterous,
talkative, and intense. His energy was
often reflected in his lectures—he once
set fire to a wastebasket while disposing
of a cigarette during a lecture, and was
forbidden from lecturing in that building
again.
Tarski had a thirst for knowledge
from a young age. Although later in
life he would tell students that he stud-
ied logic because it was the only class in
which he got a B, his high school records
show that he got A’s across the board—
even in logic. He studied at the Univer-
sity of Warsaw from 1918 to 1924. Tarski
first intended to study biology, but be-
came interested in mathematics, philosophy, and logic, as the university was
the center of the Warsaw School of Logic and Philosophy. Tarski earned his
doctorate in 1924 under the supervision of Stanisław Leśniewski.

Before emigrating to the United States in 1939, Tarski completed some of his
most important work while working as a secondary school teacher in Warsaw.
His works on logical consequence and logical truth were written during this
time. In 1939, Tarski was visiting the United States for a lecture tour. During
his visit, Germany invaded Poland, and because of his Jewish heritage, Tarski
could not return. His wife and children remained in Poland until the end of the
war, but were then able to emigrate to the United States as well. Tarski taught
at Harvard, the College of the City of New York, and the Institute for Advanced
Study at Princeton, and finally the University of California, Berkeley. There
he founded the multidisciplinary program in Logic and the Methodology of
Science. Tarski died on October 26, 1983 at the age of 82.

Further Reading For more on Tarski’s life, see the biography Alfred Tarski:
Life and Logic (Feferman and Feferman, 2004). Tarski’s seminal works on
logical consequence and truth are available in English in (Corcoran, 1983). All
of Tarski’s original works have been collected into a four volume series, (Tarski,
1981).


72.10 Alan Turing


Alan Turing was born in Maida Vale, London, on June 23, 1912. He is consid-
ered the father of theoretical computer science. Turing’s interest in the physical
sciences and mathematics started at a young age. However, as a boy his in-
terests were not represented well in his schools, where emphasis was placed on
literature and classics. Consequently, he did poorly in school and was repri-
manded by many of his teachers.
Turing attended King’s College, Cambridge as an undergraduate, where
he studied mathematics. In 1936 Turing developed (what is now called) the
Turing machine as an attempt to precisely define the notion of a computable
function and to prove the undecidability of the decision problem. He was
beaten to the result by Alonzo Church, who proved the result via his own
lambda calculus. Turing’s paper was still published with reference to Church’s
result. Church invited Turing to Princeton, where he spent 1936–1938, and
obtained a doctorate under Church.
Despite his interest in logic, Turing’s earlier interests in physical sciences
remained prevalent. His practical skills were put to work during his service
with the British cryptanalytic department at Bletchley Park during World
War II. Turing was a central figure in cracking the cypher used by German
Naval communications—the Enigma code. Turing’s expertise in statistics and
cryptography, together with the introduction of electronic machinery, gave the
team the ability to crack the code by creating a de-crypting machine called
a “bombe.” His ideas also helped in the creation of the world’s first pro-
grammable electronic computer, the Colossus, also used at Bletchley Park to
break the German Lorenz cypher.
Turing was gay. Nevertheless, in 1942
he proposed to Joan Clarke, one of his
teammates at Bletchley Park, but later
broke off the engagement and confessed
to her that he was homosexual. He had
several lovers throughout his lifetime, al-
though homosexual acts were then crimi-
nal offences in the UK. In 1952, Turing’s
house was burgled by a friend of his lover
at the time, and when filing a police re-
port, Turing admitted to having a homo-
sexual relationship, under the impression
that the government was on its way to
legalizing homosexual acts. This was not
true, and he was charged with gross inde-
cency. Instead of going to prison, Turing
opted for a hormone treatment that re-
duced libido. Turing was found dead on
June 8, 1954, of a cyanide overdose—most likely suicide. He was given a royal
pardon by Queen Elizabeth II in 2013.

Further Reading For a comprehensive biography of Alan Turing, see Hodges
(2014). Turing’s life and work inspired a play, Breaking the Code, which was
produced in 1996 for TV starring Derek Jacobi as Turing. The Imitation Game,
an Academy Award nominated film starring Benedict Cumberbatch and Keira
Knightley, is also loosely based on Alan Turing’s life and time at Bletchley
Park (Tyldum, 2014).
Radiolab (2012) has several podcasts on Turing’s life and work. BBC Hori-
zon’s documentary The Strange Life and Death of Dr. Turing is available to
watch online (Sykes, 1992). (Theelen, 2012) is a short video of a working LEGO
Turing Machine—made to honour Turing’s centenary in 2012.
Turing’s original paper on Turing machines and the decision problem is
Turing (1937).


72.11 Ernst Zermelo


Ernst Zermelo was born on July 27, 1871 in Berlin, Germany. He had five
sisters, though his family suffered from poor health and only three survived to
adulthood. His parents also passed away when he was young, leaving him and
his siblings orphans when he was seventeen. Zermelo had a deep interest in the
arts, and especially in poetry. He was known for being sharp, witty, and critical.
His most celebrated mathematical achievements include the introduction of the
axiom of choice (in 1904), and his axiomatization of set theory (in 1908).

Zermelo’s interests at university were
varied. He took courses in physics, math-
ematics, and philosophy. Under the su-
pervision of Hermann Schwarz, Zermelo
completed his dissertation Investigations
in the Calculus of Variations in 1894 at
the University of Berlin. In 1897, he de-
cided to pursue more studies at the Uni-
versity of Göttingen, where he was heav-
ily influenced by the foundational work of
David Hilbert. In 1899 he became eligible
for professorship, but did not get one un-
til eleven years later—possibly due to his
strange demeanour and “nervous haste.”

Zermelo finally received a paid pro-
fessorship at the University of Zurich in
1910, but was forced to retire in 1916
due to tuberculosis. After his recovery,
he was given an honorary professorship
at the University of Freiburg in 1921. During this time he worked on foun-
dational mathematics. He became irritated with the works of Thoralf Skolem
and Kurt Gödel, and publicly criticized their approaches in his papers. He was
dismissed from his position at Freiburg in 1935, due to his unpopularity and
his opposition to Hitler’s rise to power in Germany.

The later years of Zermelo’s life were marked by isolation. After his dis-
missal in 1935, he abandoned mathematics. He moved to the country where
he lived modestly. He married in 1944, and became completely dependent on
his wife as he was going blind. Zermelo lost his sight completely by 1951. He
passed away in Günterstal, Germany, on May 21, 1953.

Further Reading For a full biography of Zermelo, see Ebbinghaus (2015).
Zermelo’s seminal 1904 and 1908 papers are available to read in the original
German (Zermelo, 1904, 1908b). Zermelo’s collected works, including his writ-
ing on physics, are available in English translation in (Ebbinghaus et al., 2010;
Ebbinghaus and Kanamori, 2013).

Chapter 73

History and Mythology of Set Theory

This chapter includes the historical prelude from Tim Button’s Open
Set Theory text.


73.1 Infinitesimals and Differentiation


Newton and Leibniz discovered the calculus (independently) at the end of the
17th century. A particularly important application of the calculus was differ-
entiation. Roughly speaking, differentiation aims to give a notion of the “rate
of change”, or gradient, of a function at a point.

Here is a vivid way to illustrate the idea. Consider the function f (x) =
x²/4 + 1/2, depicted in black below:

[Figure: graph of f(x) = x²/4 + 1/2, with the approximating triangles described below]

Suppose we want to find the gradient of the function at c = 1/2. We start by
drawing a triangle whose hypotenuse approximates the gradient at that point,
perhaps the red triangle above. When β is the base length of our triangle, its
height is f(1/2 + β) − f(1/2), so that the gradient of the hypotenuse is:

$$\frac{f(1/2 + \beta) - f(1/2)}{\beta}.$$

So the gradient of our red triangle, with base length 3, is exactly 1. The
hypotenuse of a smaller triangle, the blue triangle with base length 2, gives
a better approximation; its gradient is 3/4. A yet smaller triangle, the green
triangle with base length 1, gives a yet better approximation; with gradient
1/2.
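
These shrinking approximations are easy to check numerically. Here is a minimal sketch in Python (ours, not part of the original text) that evaluates the gradient of the hypotenuse for ever-smaller base lengths β; the values pass through 1, 3/4, and 1/2 as above, and tend to 1/4, the gradient at c = 1/2:

def f(x):
    # the function from the running example
    return x**2 / 4 + 1/2

c = 1/2
for beta in [3, 2, 1, 0.1, 0.01, 0.001]:
    # gradient of the hypotenuse of a triangle with base length beta
    print(beta, (f(c + beta) - f(c)) / beta)
# prints 1.0, 0.75, 0.5, and then values tending to 0.25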

Ever-smaller triangles give us ever-better approximations. So we might say
something like this: the hypotenuse of a triangle with an infinitesimal base
length gives us the gradient at c = 1/2 itself. In this way, we would obtain a
formula for the (first) derivative of the function f at the point c:

$$f'(c) = \frac{f(c + \beta) - f(c)}{\beta} \quad \text{where } \beta \text{ is infinitesimal.}$$

And, roughly, this is what Newton and Leibniz said.


However, since they have said this, we must ask them: what is an infinites-
imal ? A serious dilemma arises. If β = 0, then f ′ is ill-defined, for it involves
dividing by 0. But if β > 0, then we just get an approximation to the gradient,
and not the gradient itself.
This is not an anachronistic concern. Here is Berkeley, criticizing Newton’s
followers:

I admit that signs may be made to denote either any thing or noth-
ing: and consequently that in the original notation c + β, β might
have signified either an increment or nothing. But then which
of these soever you make it signify, you must argue consistently
with such its signification, and not proceed upon a double mean-
ing: Which to do were a manifest sophism. (Berkeley 1734, §XIII,
variables changed to match preceding text)
To defend the infinitesimal calculus against Berkeley, one might reply that the
talk of “infinitesimals” is merely figurative. One might say that, so long as we
take a really small triangle, we will get a good enough approximation to the
tangent. Berkeley had a reply to this too: whilst that might be good enough
for engineering, it undermines the status of mathematics, for
we are told that in rebus mathematicis errores quàm minimi non
sunt contemnendi. [In the case of mathematics, the smallest errors
are not to be neglected.] (Berkeley, 1734, §IX)
The italicised passage is a near-verbatim quote from Newton’s own Quadrature
of Curves (1704).
Berkeley’s philosophical objections are deeply incisive. Nevertheless, the
calculus was a massively successful enterprise, and mathematicians continued
to use it without falling into error.


73.2 Rigorous Definition of Limits


These days, the standard solution to the foregoing problem is to get rid of the
infinitesimals. Here is how.
We saw that, as β gets smaller, we get better approximations of the gradient.
Indeed, as β gets arbitrarily close to 0, the value of f ′ (c) “tends without limit”
to the gradient we want. So, instead of considering what happens at β = 0, we
need only consider the trend of f ′ (c) as β approaches 0.
Put like this, the general challenge is to make sense of claims of this shape:
As x approaches c, g(x) tends without limit to ℓ.
which we can write more compactly as follows:
$$\lim_{x \to c} g(x) = \ell.$$

In the 19th century, building upon earlier work by Cauchy, Weierstrass offered
a perfectly rigorous definition of this expression. The idea is indeed that we
can make g(x) as close as we like to ℓ, by making x suitably close to c. More
precisely, we stipulate that limx→c g(x) = ℓ will mean:
(∀ε > 0)(∃δ > 0)∀x (|x − c| < δ → |g(x) − ℓ| < ε) .
The vertical bars here indicate absolute magnitude. That is, |x| = x when
x ≥ 0, and |x| = −x when x < 0; you can depict that function as follows:

[Figure: graph of |x|]

So the definition says roughly this: you can make your “error” less than ε (i.e.,
|g(x) − ℓ| < ε) by choosing arguments which are no more than δ away from c
(i.e., |x − c| < δ).
Having defined the notion of a limit, we can use it to avoid infinitesimals
altogether, stipulating that the gradient of f at c is given by:
 
$$f'(c) = \lim_{x \to 0} \frac{f(c + x) - f(c)}{x} \quad \text{where a limit exists.}$$

It is important, though, to realise why our definition needs the caveat “where
a limit exists”. To take a simple example, consider f (x) = |x|, whose graph we
just saw. Evidently, f ′ (0) is ill-defined: if we approach 0 “from the right”, the
gradient is always 1; if we approach 0 “from the left”, the gradient is always
−1; so the limit is undefined. As such, we might add that a function f is
differentiable at x iff such a limit exists.
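
A quick numerical illustration in Python (ours, not from the original text): the two one-sided difference quotients of |x| at 0 stay at +1 and −1 no matter how small we make the base length, so no single limit exists:

def f(x):
    return abs(x)

for h in [0.1, 0.01, 0.001]:
    from_right = (f(0 + h) - f(0)) / h    # always +1.0
    from_left = (f(0 - h) - f(0)) / (-h)  # always -1.0
    print(h, from_right, from_left)
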
We have seen how to handle differentiation using the notion of a limit. We
can use the same notion to define the idea of a continuous function. (Bolzano
had, in effect, realised this by 1817.) The Cauchy–Weierstrass treatment of
continuity is as follows. Roughly: a function f is continuous (at a point)
provided that, if you demand a certain amount of precision concerning the
output of the function, you can guarantee this by insisting upon a certain
amount of precision concerning the input of the function. More precisely: f
is continuous at c provided that, as x tends to zero, the difference between
f (c + x) and f (c) itself tends to 0. Otherwise put: f is continuous at c iff
$f(c) = \lim_{x \to c} f(x)$.
To go any further would just lead us off into real analysis, when our subject
matter is set theory. So now we should pause, and state the moral. During
the 19th century, mathematicians learnt how to do without infinitesimals, by
invoking a rigorously defined notion of a limit.


73.3 Pathologies
However, the definition of a limit turned out to allow for some rather “patho-
logical” constructions.

Around the 1830s, Bolzano discovered a function which was continuous
everywhere, but differentiable nowhere. (Unfortunately, Bolzano never pub-
lished this; the idea was first encountered by mathematicians in 1872, thanks
to Weierstrass’s independent discovery of the same idea.)1 This was, to say
the least, rather surprising. It is easy to find functions, such as |x|, which
are continuous everywhere but not differentiable at a particular point. But a
function which is continuous everywhere but differentiable nowhere is a very
different beast. Consider, for a moment, how you might try to draw such a
function. To ensure it is continuous, you must be able to draw it without ever
removing your pen from the page; but to ensure it is differentiable nowhere,
you would have to abruptly change the direction of your pen, constantly.
Further “pathologies” followed. On January 5, 1874, Cantor wrote a letter
to Dedekind, posing the problem:

Can a surface (say a square including its boundary) be one-to-one
correlated to a line (say a straight line including its endpoints) so
that to every point of the surface there corresponds a point of the
line, and conversely to every point of the line there corresponds a
point of the surface?

It still seems to me at the moment that the answer to this question
is very difficult—although here too one is so impelled to say no that
one would like to hold the proof to be almost superfluous. [Quoted
in Gouvêa 2011]

But, in 1877, Cantor proved that he had been wrong. In fact, a line and a square
have exactly the same number of points. He wrote on 29 June 1877 to Dedekind
“je le vois, mais je ne le crois pas”; that is, “I see it, but I don’t believe it”.
In the “received history” of mathematics, this is often taken to indicate just
how literally incredible these new results were to the mathematicians of the
time. (The correspondence is presented in Gouvêa (2011), and we return to it
in section 73.4. Cantor’s proof is outlined in section 73.5.)
Inspired by Cantor’s result, Peano started to consider whether it might be
possible to map a line smoothly onto a plane. This would be a curve which
fills space. In 1890, Peano constructed just such a curve. This is truly counter-
intuitive: Euclid had defined a line as “breadthless length” (Book I, Definition
2), but Peano had shown that, by curling up a line appropriately, its length
can be turned into breadth. In 1891, Hilbert described a slightly more intuitive
space-filling curve, together with some pictures illustrating it. The curve is
constructed in sequence, and here are the first six stages of the construction:

[Figure: the first six stages of the construction of Hilbert’s curve]

1 The history is documented in extremely thorough footnotes to the Wikipedia article on
the Weierstrass function.

In the limit—a notion which had, by now, received rigorous definition—the
entire square is filled in solid red. And, in passing, Hilbert’s curve is continuous
everywhere but differentiable nowhere; intuitively because, in the infinite limit,
the function abruptly changes direction at every moment. (We will outline
Hilbert’s construction in more detail in section 73.6.)
For better or worse, these “pathological” geometric constructions were treated
as a reason to doubt appeals to geometric intuition. They became something
approaching propaganda for a new way of doing mathematics, which would
culminate in set theory. In the later myth-building of the subject, it was re-
peated, often, that these results were both perfectly rigorous and perfectly
shocking. They therefore served a dual purpose: as a warning against relying
upon geometric intuition, and as a demonstration of the fertility of new ways
of thinking.


73.4 More Myth than History?


Looking back on these events with more than a century of hindsight, we must
be careful not to take these verdicts on trust. The results were certainly novel,
exciting, and surprising. But how truly shocking were they? And did they
really demonstrate that we should not rely on geometric intuition?
On the question of shock, Gouvêa (2011) points out that Cantor’s famous
note to Dedekind, “je le vois, mais je ne le crois pas” is taken rather out of
context. Here is more of that context (quoted from Gouvêa):

Please excuse my zeal for the subject if I make so many demands
upon your kindness and patience; the communications which I
lately sent you are even for me so unexpected, so new, that I can
have no peace of mind until I obtain from you, honoured friend, a
decision about their correctness. So long as you have not agreed
with me, I can only say: je le vois, mais je ne le crois pas.

Cantor knew his result was “so unexpected, so new”. But it is doubtful that he
ever found his result unbelievable. As Gouvêa points out, he was simply asking
Dedekind to check the proof he had offered.
On the question of geometric intuition: Peano published his space-filling
curve without including any diagrams. But when Hilbert published his curve,
he explained his purpose: he would provide readers with a clear way to un-
derstand Peano’s result, if they “help themselves to the following geometric
intuition”; whereupon he included a series of diagrams just like those provided
in section 73.3.
More generally: whilst diagrams have fallen rather out of fashion in pub-
lished proofs, there is no getting round the fact that mathematicians frequently
use diagrams when proving things. (Roughly put: good mathematicians know
when they can rely upon geometric intuition.)
In short: don’t believe the hype; or at least, don’t just take it on trust. For
more on this, you could read Giaquinto (2007).


73.5 Cantor on the Line and the Plane


Some of the circumstances surrounding the proof of Schröder-Bernstein tie in
with the history we discussed in section 73.3. Recall that, in 1877, Cantor
proved that there are exactly as many points on a square as on one of its sides.
Here, we will present his (first attempted) proof.
Let L be the unit line, i.e., the set of points [0, 1]. Let S be the unit square,
i.e., the set of points L × L. In these terms, Cantor proved that L ≈ S. He
wrote a note to Dedekind, essentially containing the following argument.

Theorem 73.1. L ≈ S

Proof, first part. Fix a, b ∈ L. Write them in binary notation, so that we have
infinite sequences of 0s and 1s, a₁, a₂, …, and b₁, b₂, …, such that:

$$a = 0.a_1 a_2 a_3 a_4 \ldots$$
$$b = 0.b_1 b_2 b_3 b_4 \ldots$$

Now consider the function f : S → L given by

$$f(a, b) = 0.a_1 b_1 a_2 b_2 a_3 b_3 a_4 b_4 \ldots$$

Now f is an injection, since if f(a, b) = f(c, d), then aₙ = cₙ and bₙ = dₙ for
all n ∈ N, so that a = c and b = d.
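
To make the interleaving concrete, here is a small Python sketch (ours, not part of Cantor’s argument) that computes finitely many digits of f(a, b) from initial segments of the binary expansions of a and b:

def interleave(a_digits, b_digits):
    # digits of f(a, b) = 0.a1 b1 a2 b2 a3 b3 ...
    out = []
    for a_n, b_n in zip(a_digits, b_digits):
        out.append(a_n)
        out.append(b_n)
    return "0." + "".join(out)

# first eight binary digits of a = 0.0101... and b = 0.1111...
print(interleave("01010101", "11111111"))  # 0.0111011101110111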

Unfortunately, as Dedekind pointed out to Cantor, this does not answer the
original question. Consider 0.1̇0̇ = 0.1010101010 . . .. We need that f (a, b) =
0.1̇0̇, where:

a = 0.1̇1̇ = 0.111111 . . .
b = 0

But a = 0.1̇1̇ = 1. So, when we say “write a and b in binary notation”, we


have to choose which notation to use; and, since f is to be a function, we can
use only one of the two possible notations. But if, for example, we use the
simple notation, and write a as “1.000 . . .”, then we have no pair ⟨a, b⟩ such
that f (a, b) = 0.1̇0̇.
To summarise: Dedekind pointed out that, given the possibility of certain
recurring decimal expansions, Cantor’s function f is an injection but not a sur-
jection. So Cantor has shown only that S ⪯ L and not that S ≈ L.
Cantor wrote back to Dedekind almost immediately, essentially suggesting
that the proof could be completed as follows:

Proof, completed. So, we have shown that S ⪯ L. But there is obviously
an injection from L to S: just lay the line flat along one side of the square. So
L ⪯ S and S ⪯ L. By Schröder–Bernstein (Theorem 4.25), L ≈ S.

But of course, Cantor could not complete the last line in these terms, for
the Schröder-Bernstein Theorem was not yet proved. Indeed, although Cantor
would subsequently formulate this as a general conjecture, it was not satisfac-
torily proved until 1897. (And so, later in 1877, Cantor offered a different proof
of Theorem 73.1, which did not go via Schröder–Bernstein.)


73.6 Appendix: Hilbert’s Space-filling Curves


In section 73.3, we mentioned that Cantor’s proof that a line and
a square have exactly the same number of points (Theorem 73.1) prompted
Peano to ask whether there might be a space-filling curve. He obtained a
positive answer in 1890. In this section, we explain (in a hand-wavy way) how
to construct Hilbert’s space-filling curve (with a tiny tweak).2
We must define a function, h, as the limit of a sequence of functions h₁,
h₂, h₃, … We first describe the construction. Then we show it is space-filling.
Then we show it is a curve.
We will take h’s range to be the unit square, S. Here is our first approxi-
mation to h, i.e., h₁:

[Figure: the first approximation h₁]

2 For a more rigorous explanation, see Rose (2010). The tweak amounts to the inclusion
of the red parts of the curves below. This makes it slightly easier to check that the curve is
continuous.

To keep track of things, we have imposed a 2 × 2 grid on the square. We can


think of the curve starting in the bottom left quarter, moving to the top left,
then to the top right, then finally to the bottom right. Here is the second stage
in the construction, i.e., h₂:

[Figure: the second approximation h₂]

The different colours will help explain how h₂ was constructed. We first place
scaled-down copies of the non-red bit of h₁ into the bottom left, top left, top
right, and bottom right of our square (drawn in black). We then connect these
four figures (with green lines). Finally, we connect our figure to the boundary
of the square (with red lines).
Now to h₃. Just as h₂ was made from four connected, scaled-down copies
of the non-red bit of h₁, so h₃ is made up of four scaled-down copies of the
non-red bit of h₂ (drawn in black), which are then joined together (with green
lines) and finally connected to the boundary of the square (with red lines).

[Figure: the third approximation h₃]

And now we see the general pattern for defining hₙ₊₁ from hₙ. At last we define
the curve h itself by considering the point-by-point limit of these successive
functions h₁, h₂, … That is, for each x ∈ L:

$$h(x) = \lim_{n \to \infty} h_n(x)$$
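
For readers who want to experiment, here is a recursive Python sketch (ours; the particular quadrant transformations are one standard choice, and the red and green connecting segments are omitted) that computes the points visited by the n-th approximation, following the quadrant order described above:

def hilbert(n):
    # points (x, y) in the unit square visited by the n-th approximation
    if n == 0:
        return [(0.5, 0.5)]
    prev = hilbert(n - 1)
    points = []
    # bottom left: previous stage, reflected in the diagonal, scaled by 1/2
    points += [(y / 2, x / 2) for (x, y) in prev]
    # top left: previous stage, scaled and shifted up
    points += [(x / 2, y / 2 + 0.5) for (x, y) in prev]
    # top right: previous stage, scaled and shifted up and to the right
    points += [(x / 2 + 0.5, y / 2 + 0.5) for (x, y) in prev]
    # bottom right: previous stage, reflected in the anti-diagonal, shifted right
    points += [(1 - y / 2, (1 - x) / 2) for (x, y) in prev]
    return points

print(hilbert(1))        # [(0.25, 0.25), (0.25, 0.75), (0.75, 0.75), (0.75, 0.25)]
print(len(hilbert(5)))   # 4**5 = 1024 points, one per grid-location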

We now show that this curve fills space. When we draw the curve hₙ, we
impose a 2ⁿ × 2ⁿ grid onto S. By Pythagoras’s Theorem, the diagonal of each
grid-location is of length:

$$\sqrt{(1/2^n)^2 + (1/2^n)^2} = 2^{\frac{1}{2} - n}$$

and evidently hₙ passes through every grid-location. So each point in S is at
most $2^{\frac{1}{2} - n}$ distance away from some point on hₙ. Now, h is defined as the
limit of the functions h₁, h₂, h₃, … So the maximum distance of any point
from h is given by:

$$\lim_{n \to \infty} 2^{\frac{1}{2} - n} = 0.$$

That is: every point in S is 0 distance from h. In other words, every point of
S lies on the curve. So h fills space!
It remains to show that h is, indeed, a curve. To show this, we must define
the notion. The modern definition builds on one given by Jordan in 1887 (i.e.,
only a few years before the first space-filling curve was provided):

Definition 73.2. A curve is a continuous map from L to R².

This is fairly intuitive: a curve is, intuitively, a “smooth” map which takes
a canonical line onto the plane R². Our function, h, is indeed a map from L
to R². So, we just need to show that h is continuous. We defined continuity
in section 73.2 using ε/δ notation. In the vernacular, we want to establish
the following: If you specify a point p in S, together with any desired level of
precision ε, we can find an open section of L such that, given any x in that
open section, h(x) is within ε of p.
So: assume that you have specified p and ε. This is, in effect, to draw a
circle with centre p and radius ε on S. (The circle might spill off the edge of S,
but that doesn’t matter.) Now, recall that, when describing the function hₙ,
we drew a 2ⁿ × 2ⁿ grid upon S. It is obvious that, no matter how small ε is,
there is some n such that some individual grid-location of the 2ⁿ × 2ⁿ grid on
S lies wholly within the circle with centre p and radius ε.
So, take that n, and let I be the largest open part of L which hₙ maps
wholly into the relevant grid location. (It is clear that I exists, since we
already noted that hₙ passes through every grid-location in the 2ⁿ × 2ⁿ grid.)
It now suffices to show that, whenever x ∈ I, the point h(x) lies in that
same grid-location. And to do this, it suffices to show that hₘ(x) lies in that
same grid location, for any m > n. But this is obvious. If we consider what
happens with hₘ for m > n, we see that exactly the “same part” of the unit
interval is mapped into the same grid-location; we just map it into that region
in an increasingly stretched-out, wiggly fashion.

Part XVII

Reference

This part collects various lists of alphabets, notations, definitions, rules,
etc., for inclusion as appendices in a textbook.

Chapter 74

The Greek Alphabet

Alpha α A Nu ν N
Beta β B Xi ξ Ξ
Gamma γ Γ Omicron o O
Delta δ ∆ Pi π Π
Epsilon ε E Rho ρ P
Zeta ζ Z Sigma σ Σ
Eta η H Tau τ T
Theta θ Θ Upsilon υ Υ
Iota ι I Phi φ Φ
Kappa κ K Chi χ X
Lambda λ Λ Psi ψ Ψ
Mu µ M Omega ω Ω

Chapter 75

The Fraktur Alphabet

𝔄 A 𝔞 a 𝔑 N 𝔫 n
𝔅 B 𝔟 b 𝔒 O 𝔬 o
ℭ C 𝔠 c 𝔓 P 𝔭 p
𝔇 D 𝔡 d 𝔔 Q 𝔮 q
𝔈 E 𝔢 e ℜ R 𝔯 r
𝔉 F 𝔣 f 𝔖 S 𝔰 s
𝔊 G 𝔤 g 𝔗 T 𝔱 t
ℌ H 𝔥 h 𝔘 U 𝔲 u
ℑ I 𝔦 i 𝔙 V 𝔳 v
𝔍 J 𝔧 j 𝔚 W 𝔴 w
𝔎 K 𝔨 k 𝔛 X 𝔵 x
𝔏 L 𝔩 l 𝔜 Y 𝔶 y
𝔐 M 𝔪 m ℨ Z 𝔷 z

Photo Credits

Georg Cantor, p. 906: Portrait of Georg Cantor by Otto Zeth courtesy of
the Universitätsarchiv, Martin-Luther Universität Halle–Wittenberg. UAHW
Rep. 40-VI, Nr. 3 Bild 102.
Alonzo Church, p. 907: Portrait of Alonzo Church, undated, photogra-
pher unknown. Alonzo Church Papers; 1924–1995, (C0948) Box 60, Folder 3.
Manuscripts Division, Department of Rare Books and Special Collections,
Princeton University Library. © Princeton University. The Open Logic
Project has obtained permission to use this image for inclusion in non-commercial
OLP-derived materials. Permission from Princeton University is required for
any other use.
Gerhard Gentzen, p. 908: Portrait of Gerhard Gentzen playing ping-pong
courtesy of Eckart Menzler-Trott.
Kurt Gödel, p. 909: Portrait of Kurt Gödel, ca. 1925, photographer un-
known. From the Shelby White and Leon Levy Archives Center, Institute for
Advanced Study, Princeton, NJ, USA, on deposit at Princeton University Li-
brary, Manuscript Division, Department of Rare Books and Special Collections,
Kurt Gödel Papers, (C0282), Box 14b, #110000. The Open Logic Project has
obtained permission from the Institute’s Archives Center to use this image
for inclusion in non-commercial OLP-derived materials. Permission from the
Archives Center is required for any other use.
Emmy Noether, p. 911: Portrait of Emmy Noether, ca. 1922, courtesy of
the Abteilung für Handschriften und Seltene Drucke, Niedersächsische Staats-
und Universitätsbibliothek Göttingen, Cod. Ms. D. Hilbert 754, Bl. 14 Nr. 73.
Restored from an original scan by Joel Fuller.
Rózsa Péter, p. 912: Portrait of Rózsa Péter, undated, photographer un-
known. Courtesy of Béla Andrásfai.
Julia Robinson, p. 913: Portrait of Julia Robinson, unknown photographer,
courtesy of Neil D. Reid. The Open Logic Project has obtained permission to
use this image for inclusion in non-commercial OLP-derived materials. Per-
mission is required for any other use.
Bertrand Russell, p. 915: Portrait of Bertrand Russell, ca. 1907, courtesy of
the William Ready Division of Archives and Research Collections, McMaster
University Library. Bertrand Russell Archives, Box 2, f. 4.
Alfred Tarski, p. 916: Passport photo of Alfred Tarski, 1939. Cropped and
restored from a scan of Tarski’s passport by Joel Fuller. Original courtesy
of Bancroft Library, University of California, Berkeley. Alfred Tarski Papers,
Banc MSS 84/49. The Open Logic Project has obtained permission to use
this image for inclusion in non-commercial OLP-derived materials. Permission
from Bancroft Library is required for any other use.
Alan Turing, p. 918: Portrait of Alan Mathison Turing by Elliott & Fry, 29
March 1951, NPG x82217, © National Portrait Gallery, London. Used under
a Creative Commons BY-NC-ND 3.0 license.
Ernst Zermelo, p. 919: Portrait of Ernst Zermelo, ca. 1922, courtesy of the
Abteilung für Handschriften und Seltene Drucke, Niedersächsische Staats- und
Universitätsbibliothek Göttingen, Cod. Ms. D. Hilbert 754, Bl. 6 Nr. 25.

Bibliography

Andrásfai, Béla. 1986. Rózsa (Rosa) Péter. Periodica Polytechnica Electrical
Engineering 30(2-3): 139–145. URL http://www.pp.bme.hu/ee/article/
view/4651.

Aspray, William. 1984. The Princeton mathematics community in the 1930s:
Alonzo Church. URL http://www.princeton.edu/mudd/finding_aids/
mathoral/pmc05.htm. Interview.

Baaz, Matthias, Christos H. Papadimitriou, Hilary W. Putnam, Dana S. Scott,
and Charles L. Harper Jr. 2011. Kurt Gödel and the Foundations of Mathe-
matics: Horizons of Truth. Cambridge: Cambridge University Press.

Banach, Stefan and Alfred Tarski. 1924. Sur la décomposition des ensembles
de points en parties respectivement congruentes. Fundamenta Mathematicae
6: 244–77.

Benacerraf, Paul. 1965. What numbers could not be. The Philosophical Review
74(1): 47–73.

Berkeley, George. 1734. The Analyst; or, a Discourse Addressed to an Infidel
Mathematician.

Boolos, George. 1971. The iterative conception of set. The Journal of Philos-
ophy 68(8): 215–31.

Boolos, George. 1989. Iteration again. Philosophical Topics 17(2): 5–21.

Boolos, George. 2000. Must we believe in set theory? In Between Logic and
Intuition: Essays in Honor of Charles Parsons, eds. Gila Sher and Richard
Tieszen, 257–68. Cambridge: Cambridge University Press.

Burali-Forti, Cesare. 1897. Una questione sui numeri transfiniti. Rendiconti
del Circolo Matematico di Palermo 11: 154–64.

Button, Tim. 2021. Level theory, part 1: Axiomatizing the bare idea of a
cumulative hierarchy of sets. The Bulletin of Symbolic Logic 27(4): 436–460.

Cantor, Georg. 1878. Ein Beitrag zur Mannigfaltigkeitslehre. Journal für die
reine und angewandte Mathematik 84: 242–58.

Cantor, Georg. 1883. Grundlagen einer allgemeinen Mannigfaltigkeitslehre.
Ein mathematisch-philosophischer Versuch in der Lehre des Unendlichen.
Leipzig: Teubner.
Cantor, Georg. 1892. Über eine elementare Frage der Mannigfaltigkeitslehre.
Jahresbericht der deutschen Mathematiker-Vereinigung 1: 75–8.

Cheng, Eugenia. 2004. How to write proofs: A quick guide.
URL https://eugeniacheng.com/wp-content/uploads/2017/02/
cheng-proofguide.pdf.
Church, Alonzo. 1936a. A note on the Entscheidungsproblem. The Journal of
Symbolic Logic 1: 40–41.

Church, Alonzo. 1936b. An unsolvable problem of elementary number theory.
American Journal of Mathematics 58: 345–363.
Cohen, Paul J. 1963. The independence of the continuum hypothesis. Proceed-
ings of the National Academy of Sciences of the United States of America
50: 1143–1148.
Cohen, Paul J. 1966. Set Theory and the Continuum Hypothesis. Reading,
MA: Benjamin.
Conway, John. 2006. The power of mathematics. In Power, eds. Alan Blackwell
and David MacKay, Darwin College Lectures. Cambridge: Cambridge Uni-
versity Press. URL http://www.cs.toronto.edu/~mackay/conway.pdf.
Corcoran, John. 1983. Logic, Semantics, Metamathematics. Indianapolis:
Hackett, 2nd ed.
Csicsery, George. 2016. Zala films: Julia Robinson and Hilbert’s tenth problem.
URL http://www.zalafilms.com/films/juliarobinson.html.
Dauben, Joseph. 1990. Georg Cantor: His Mathematics and Philosophy of the
Infinite. Princeton: Princeton University Press.
Davis, Martin, Hilary Putnam, and Julia Robinson. 1961. The decision problem
for exponential Diophantine equations. Annals of Mathematics 74(3): 425–
436. URL http://www.jstor.org/stable/1970289.
Dedekind, Richard. 1888. Was sind und was sollen die Zahlen? Braunschweig:
Vieweg.
Dick, Auguste. 1981. Emmy Noether 1882–1935. Boston: Birkhäuser.

du Sautoy, Marcus. 2014. A brief history of mathematics: Georg Cantor. URL
http://www.bbc.co.uk/programmes/b00ss1j0. Audio Recording.
Duncan, Arlene. 2015. The Bertrand Russell Research Centre. URL http:
//russell.mcmaster.ca/.

Ebbinghaus, Heinz-Dieter. 2015. Ernst Zermelo: An Approach to his Life and
Work. Berlin: Springer-Verlag.
Ebbinghaus, Heinz-Dieter, Craig G. Fraser, and Akihiro Kanamori. 2010. Ernst
Zermelo. Collected Works, vol. 1. Berlin: Springer-Verlag.

Ebbinghaus, Heinz-Dieter and Akihiro Kanamori. 2013. Ernst Zermelo: Col-
lected Works, vol. 2. Berlin: Springer-Verlag.
Enderton, Herbert B. 2019. Alonzo Church: Life and Work. In The Col-
lected Works of Alonzo Church, eds. Tyler Burge and Herbert B. Enderton.
Cambridge, MA: MIT Press.

Feferman, Anita and Solomon Feferman. 2004. Alfred Tarski: Life and Logic.
Cambridge: Cambridge University Press.
Feferman, Solomon. 1994. Julia Bowman Robinson 1919–1985. Biograph-
ical Memoirs of the National Academy of Sciences 63: 1–28. URL
http://www.nasonline.org/publications/biographical-memoirs/
memoir-pdfs/robinson-julia.pdf.
Feferman, Solomon, John W. Dawson Jr., Stephen C. Kleene, Gregory H.
Moore, Robert M. Solovay, and Jean van Heijenoort. 1986. Kurt Gödel:
Collected Works. Vol. 1: Publications 1929–1936. Oxford: Oxford Univer-
sity Press.

Feferman, Solomon, John W. Dawson Jr., Stephen C. Kleene, Gregory H.
Moore, Robert M. Solovay, and Jean van Heijenoort. 1990. Kurt Gödel:
Collected Works. Vol. 2: Publications 1938–1974. Oxford: Oxford Univer-
sity Press.
Feferman, Solomon and Azriel Levy. 1963. Independence results in set theory
by Cohen’s method II. Notices of the American Mathematical Society 10:
593.
Fraenkel, Abraham. 1922. Über den Begriff ‘definit’ und die Unabhängigkeit
des Auswahlaxioms. Sitzungsberichte der Preussischen Akadademie der Wis-
senschaften, Physikalisch-mathematische Klasse 253–257.

Frege, Gottlob. 1884. Die Grundlagen der Arithmetik: Eine logisch mathema-
tische Untersuchung über den Begriff der Zahl. Breslau: Wilhelm Koebner.
Translation in Frege (1953).
Frege, Gottlob. 1953. Foundations of Arithmetic, ed. J. L. Austin. Oxford:
Basil Blackwell & Mott, 2nd ed.

Frey, Holly and Tracy V. Wilson. 2015. Stuff you missed in history class:
Emmy Noether, mathematics trailblazer. URL https://www.iheart.
com/podcast/stuff-you-missed-in-history-cl-21124503/episode/
emmy-noether-mathematics-trailblazer-30207491/. Podcast audio.

Gentzen, Gerhard. 1935a. Untersuchungen über das logische Schließen I.
Mathematische Zeitschrift 39: 176–210. English translation in Szabo (1969),
pp. 68–131.

Gentzen, Gerhard. 1935b. Untersuchungen über das logische Schließen II.
Mathematische Zeitschrift 39: 405–431. English translation in Szabo (1969),
pp. 68–131.

Giaquinto, Marcus. 2007. Visual Thinking in Mathematics. Oxford: Oxford
University Press.

Gödel, Kurt. 1929. Über die Vollständigkeit des Logikkalküls [On the com-
pleteness of the calculus of logic]. Dissertation, Universität Wien. Reprinted
and translated in Feferman et al. (1986), pp. 60–101.

Gödel, Kurt. 1931. Über formal unentscheidbare Sätze der Principia Mathe-
matica und verwandter Systeme I [On formally undecidable propositions of
Principia Mathematica and related systems I]. Monatshefte für Mathematik
und Physik 38: 173–198. Reprinted and translated in Feferman et al. (1986),
pp. 144–195.

Gödel, Kurt. 1938. The consistency of the axiom of choice and the generalized
continuum hypothesis. Proceedings of the National Academy of Sciences of
the United States of America 24: 556–557.

Gouvêa, Fernando Q. 2011. Was Cantor surprised? American Mathematical
Monthly 118(3): 198–209.

Grattan-Guinness, Ivor. 1971. Towards a biography of Georg Cantor. Annals
of Science 27(4): 345–391.

Hammack, Richard. 2013. Book of Proof. Richmond, VA: Virginia Com-
monwealth University. URL http://www.people.vcu.edu/~rhammack/
BookOfProof/BookOfProof.pdf.

Hartogs, Friedrich. 1915. Über das Problem der Wohlordnung. Mathematische
Annalen 76: 438–43.

Hausdorff, Felix. 1914. Bemerkung über den Inhalt von Punktmengen. Math-
ematische Annalen 75: 428–34.

Heck, Richard Kimberly. 2012. Reading Frege’s Grundgesetze. Oxford: Oxford University Press.

Heijenoort, Jean van. 1967. From Frege to Gödel: A Source Book in Mathe-
matical Logic, 1879–1931. Cambridge, MA: Harvard University Press.

Hilbert, David. 1891. Über die stetige Abbildung einer Linie auf ein
Flächenstück. Mathematische Annalen 38(3): 459–460.

Hilbert, David. 2013. David Hilbert’s Lectures on the Foundations of Arithmetic and Logic 1917–1933, eds. William Bragg Ewald and Wilfried Sieg. Heidelberg: Springer.
Hodges, Andrew. 2014. Alan Turing: The Enigma. London: Vintage.
Hume, David. 1740. A Treatise of Human Nature. London.
Hutchings, Michael. 2003. Introduction to mathematical arguments. URL
https://math.berkeley.edu/~hutching/teach/proofs.pdf.
Incurvati, Luca. 2020. Conceptions of Set and the Foundations of Mathematics.
Cambridge: Cambridge University Press.
Perimeter Institute. 2015. Emmy Noether: Her life, work, and influence. URL https://www.youtube.com/watch?v=tNNyAyMRsgE. Video Lecture.
Irvine, Andrew David. 2015. Sound clips of Bertrand Russell speaking. URL
http://plato.stanford.edu/entries/russell/russell-soundclips.
html.
Jacobson, Nathan. 1983. Emmy Noether: Gesammelte Abhandlungen—
Collected Papers. Berlin: Springer-Verlag.
Dawson, John W., Jr. 1997. Logical Dilemmas: The Life and Work of Kurt Gödel. Boca Raton: CRC Press.
Katz, Karin Usadi and Mikhail G. Katz. 2012. Stevin numbers and reality.
Foundations of Science 17(2): 109–23.
Kunen, Kenneth. 1980. Set Theory: An Introduction to Independence Proofs.
New York: North Holland.
Lévy, Azriel. 1960. Axiom schemata of strong infinity in axiomatic set theory.
Pacific Journal of Mathematics 10(1): 223–38.
LibriVox. n.d. Bertrand Russell. URL https://librivox.org/author/1508?
primary_key=1508&search_category=author&search_page=1&search_
form=get_results. Collection of public domain audiobooks.
Linnebo, Øystein. 2010. Predicative and impredicative definitions. Internet
Encyclopedia of Philosophy URL http://www.iep.utm.edu/predicat/.
Linsenmayer, Mark. 2014. The partially examined life: Gödel on
math. URL http://www.partiallyexaminedlife.com/2014/06/16/
ep95-godel/. Podcast audio.
MacFarlane, John. 2015. Alonzo Church’s JSL reviews. URL http://
johnmacfarlane.net/church.html.
Maddy, Penelope. 1988a. Believing the axioms I. The Journal of Symbolic
Logic 53(2): 481–511.

Maddy, Penelope. 1988b. Believing the axioms II. The Journal of Symbolic
Logic 53(3): 736–64.

Magnus, P. D., Tim Button, J. Robert Loftis, Aaron Thomas-Bolduc, Robert Trueman, and Richard Zach. 2021. Forall x: Calgary. An Introduction to Formal Logic. Calgary: Open Logic Project, f21 ed. URL https://forallx.openlogicproject.org/.

Matijasevich, Yuri. 1992. My collaboration with Julia Robinson. The Mathematical Intelligencer 14(4): 38–45.

Menzler-Trott, Eckart. 2007. Logic’s Lost Genius: The Life of Gerhard Gentzen. Providence: American Mathematical Society.

Montague, Richard. 1961. Semantic closure and non-finite axiomatizability I. In Infinitistic Methods: Proceedings of the Symposium on Foundations of Mathematics (Warsaw 1959), 45–69. New York: Pergamon.

Montague, Richard. 1965. Set theory and higher-order logic. In Formal systems and recursive functions, eds. John Crossley and Michael Dummett, 131–48. Amsterdam: North-Holland. Proceedings of the Eighth Logic Colloquium, July 1963.

O’Connor, John J. and Edmund F. Robertson. 2005. The real numbers: Stevin to Hilbert. URL http://www-history.mcs.st-and.ac.uk/HistTopics/Real_numbers_2.html.

O’Connor, John J. and Edmund F. Robertson. 2014. Rózsa Péter. URL http:
//www-groups.dcs.st-and.ac.uk/~history/Biographies/Peter.html.

Peano, Giuseppe. 1890. Sur une courbe, qui remplit toute une aire plane.
Mathematische Annalen 36(1): 157–60.

Péter, Rózsa. 1935a. Über den Zusammenhang der verschiedenen Begriffe der
rekursiven Funktion. Mathematische Annalen 110: 612–632.

Péter, Rózsa. 1935b. Konstruktion nichtrekursiver Funktionen. Mathematische Annalen 111: 42–60.

Péter, Rózsa. 1951. Rekursive Funktionen. Budapest: Akademiai Kiado. English translation in (Péter, 1967).

Péter, Rózsa. 1967. Recursive Functions. New York: Academic Press.

Péter, Rózsa. 2010. Playing with Infinity. New York: Dover. URL https://books.google.ca/books?id=6V3wNs4uv_4C&lpg=PP1&ots=BkQZaHcR99&lr&pg=PP1#v=onepage&q&f=false.

Potter, Michael. 2004. Set Theory and its Philosophy. Oxford: Oxford Univer-
sity Press.

Radiolab. 2012. The Turing problem. URL http://www.radiolab.org/story/193037-turing-problem/. Podcast audio.

Ramsey, Frank Plumpton. 1925. The foundations of mathematics. Proceedings of the London Mathematical Society 25: 338–384.

Reid, Constance. 1986. The autobiography of Julia Robinson. The College Mathematics Journal 17: 3–21.

Reid, Constance. 1996. Julia: A Life in Mathematics. Cambridge: Cambridge University Press. URL https://books.google.ca/books?id=lRtSzQyHf9UC&lpg=PP1&pg=PP1#v=onepage&q&f=false.

Robinson, Julia. 1949. Definability and decision problems in arithmetic. The Journal of Symbolic Logic 14(2): 98–114. URL http://www.jstor.org/stable/2266510.

Robinson, Julia. 1996. The Collected Works of Julia Robinson. Providence: American Mathematical Society.

Robinson, Raphael. 1947. On the decomposition of spheres. Fundamenta Mathematicae 34(1): 246–60.

Rose, Daniel. 2012. A song about Georg Cantor. URL https://www.youtube.com/watch?v=QUP5Z4Fb5k4. Audio Recording.

Rose, Nicholas J. 2010. Hilbert-type space-filling curves. URL https://web.archive.org/web/20151010184939/http://www4.ncsu.edu/~njrose/pdfFiles/HilbertCurve.pdf.

Russell, Bertrand. 1905. On denoting. Mind 14: 479–493.

Russell, Bertrand. 1919. Introduction to Mathematical Philosophy. London: Allen & Unwin.

Russell, Bertrand. 1967. The Autobiography of Bertrand Russell, vol. 1. London: Allen and Unwin.

Russell, Bertrand. 1968. The Autobiography of Bertrand Russell, vol. 2. London: Allen and Unwin.

Russell, Bertrand. 1969. The Autobiography of Bertrand Russell, vol. 3. London: Allen and Unwin.

Russell, Bertrand. n.d. Bertrand Russell on smoking. URL https://www.youtube.com/watch?v=80oLTiVW_lc. Video Interview.

Sandstrum, Ted. 2019. Mathematical Reasoning: Writing and Proof. Allendale, MI: Grand Valley State University. URL https://scholarworks.gvsu.edu/books/7/.

Scott, Dana. 1974. Axiomatizing set theory. In Axiomatic Set Theory II, ed.
Thomas Jech, 207–14. American Mathematical Society. Proceedings of the
Symposium in Pure Mathematics of the American Mathematical Society,
July–August 1967.
Segal, Sanford L. 2014. Mathematicians under the Nazis. Princeton: Princeton
University Press.
Shoenfield, Joseph R. 1977. Axioms of set theory. In Handbook of Mathematical
Logic, ed. Jon Barwise, 321–44. London: North-Holland.
Sigmund, Karl, John Dawson, Kurt Mühlberger, Hans Magnus Enzensberger,
and Juliette Kennedy. 2007. Kurt Gödel: Das Album–The Album. The
Mathematical Intelligencer 29(3): 73–76.
Skolem, Thoralf. 1922. Einige Bemerkungen zur axiomatischen Begründung der Mengenlehre. In Wissenschaftliche Vorträge gehalten auf dem fünften Kongress der skandinavischen Mathematiker in Helsingfors vom 4. bis zum 7. Juli 1922, 137–52. Akademiska Bokhandeln.

Smith, Peter. 2013. An Introduction to Gödel’s Theorems. Cambridge: Cambridge University Press.
Smullyan, Raymond M. 1968. First-Order Logic. New York, NY: Springer.
Corrected reprint, New York, NY: Dover, 1995.

Solow, Daniel. 2013. How to Read and Do Proofs. Hoboken, NJ: Wiley.
Steinhart, Eric. 2018. More Precisely: The Math You Need to Do Philosophy.
Peterborough, ON: Broadview, 2nd ed.
Sykes, Christopher. 1992. BBC Horizon: The strange life and death of Dr. Tur-
ing. URL https://www.youtube.com/watch?v=gyusnGbBSHE.
Szabo, Manfred E. 1969. The Collected Papers of Gerhard Gentzen. Amster-
dam: North-Holland.
Takeuti, Gaisi, Nicholas Passell, and Mariko Yasugi. 2003. Memoirs of a Proof
Theorist: Gödel and Other Logicians. Singapore: World Scientific.

Tamassy, Istvan. 1994. Interview with Róza Péter. Modern Logic 4(3): 277–
280.
Tarski, Alfred. 1981. The Collected Works of Alfred Tarski, vol. I–IV. Basel:
Birkhäuser.

Theelen, Andre. 2012. Lego Turing machine. URL https://www.youtube.com/watch?v=FTSAiF9AHN4.
Tomkowicz, Grzegorz and Stan Wagon. 2016. The Banach-Tarski Paradox.
Cambridge: Cambridge University Press.

Turing, Alan M. 1937. On computable numbers, with an application to the “Entscheidungsproblem”. Proceedings of the London Mathematical Society, 2nd Series 42: 230–265.
Tyldum, Morten. 2014. The imitation game. Motion picture.
Velleman, Daniel J. 2019. How to Prove It: A Structured Approach. Cambridge:
Cambridge University Press, 3rd ed.
Vitali, Giuseppe. 1905. Sul problema della misura dei gruppi di punti di una
retta. Bologna: Gamberini e Parmeggiani.
von Neumann, John. 1925. Eine Axiomatisierung der Mengenlehre. Journal
für die reine und angewandte Mathematik 154: 219–40.
Wang, Hao. 1990. Reflections on Kurt Gödel. Cambridge: MIT Press.
Weston, Tom. 2003. The Banach-Tarski paradox. URL http://people.math.umass.edu/~weston/oldpapers/banach.pdf.
Whitehead, Alfred North and Bertrand Russell. 1910. Principia Mathematica,
vol. 1. Cambridge: Cambridge University Press.
Zermelo, Ernst. 1904. Beweis, daß jede Menge wohlgeordnet werden kann.
Mathematische Annalen 59: 514–516. English translation in (Ebbinghaus
et al., 2010, pp. 115–119).
Zermelo, Ernst. 1908a. Neuer Beweis für die Möglichkeit einer Wohlordnung. Mathematische Annalen 65(1): 107–128. English translation in (Ebbinghaus et al., 2010).
Zermelo, Ernst. 1908b. Untersuchungen über die Grundlagen der Mengenlehre I. Mathematische Annalen 65(2): 261–281. English translation in (Ebbinghaus et al., 2010, pp. 189–229).
Zuckerman, Martin M. 1973. Formation sequences for propositional formulas.
Notre Dame Journal of Formal Logic 14(1): 134–138.
