An introduction to graph theory

(Text for Math 530 in Spring 2022 at Drexel University)

Darij Grinberg*
Spring 2023 edition, August 2, 2023
arXiv:2308.04512v1 [math.HO] 2 Aug 2023

Abstract. This is a graduate-level introduction to graph theory,
corresponding to a quarter-long course. It covers simple graphs,
multigraphs as well as their directed analogues, and more restrictive
classes such as tournaments, trees and arborescences. Among the
features discussed are Eulerian circuits, Hamiltonian cycles, spanning
trees, the matrix-tree and BEST theorems, proper colorings,
Turán’s theorem, bipartite matching and the Menger and Gallai–Milgram
theorems. The basics of network flows are introduced in
order to prove Hall’s marriage theorem.
Around a hundred exercises are included (without solutions).

Contents
1. Preface
1.1. What is this?
1.1.1. Remarks
1.2. Notations

2. Simple graphs
2.1. Definitions
2.2. Drawing graphs
2.3. A first fact: The Ramsey number R (3, 3) = 6
2.4. Degrees
2.5. Graph isomorphism
2.6. Some families of graphs
2.6.1. Complete and empty graphs
* Drexel University, Korman Center, 15 S 33rd Street, Office #263, Philadelphia, PA 19104
(USA). // darijgrinberg@gmail.com // http://www.cip.ifi.lmu.de/~grinberg/

An introduction to graph theory, version August 2, 2023 page 2

2.6.2. Path and cycle graphs
2.6.3. Kneser graphs
2.7. Subgraphs
2.8. Disjoint unions
2.9. Walks and paths
2.9.1. Definitions
2.9.2. Composing/concatenating and reversing walks
2.9.3. Reducing walks to paths
2.9.4. Remark on algorithms
2.9.5. The equivalence relation “path-connected”
2.9.6. Connected components and connectedness
2.9.7. Induced subgraphs on components
2.9.8. Some exercises on connectedness
2.10. Closed walks and cycles
2.11. The longest path trick
2.12. Bridges
2.13. Dominating sets
2.13.1. Definition and basic facts
2.13.2. The number of dominating sets
2.14. Hamiltonian paths and cycles
2.14.1. Basics
2.14.2. Sufficient criteria: Ore and Dirac
2.14.3. A necessary criterion
2.14.4. Hypercubes
2.14.5. Cartesian products
2.14.6. Subset graphs

3. Multigraphs
3.1. Definitions
3.2. Conversions
3.3. Generalizing from simple graphs to multigraphs
3.3.1. The Ramsey number R (3, 3)
3.3.2. Degrees
3.3.3. Graph isomorphisms
3.3.4. Complete graphs, paths, cycles
3.3.5. Induced submultigraphs
3.3.6. Disjoint unions
3.3.7. Walks
3.3.8. Path-connectedness
3.3.9. G \ e, bridges and cut-edges
3.3.10. Dominating sets
3.3.11. Hamiltonian paths and cycles
3.3.12. Exercises
3.4. Eulerian circuits and walks
3.4.1. Definitions
3.4.2. The Euler–Hierholzer theorem

4. Digraphs and multidigraphs
4.1. Definitions
4.2. Outdegrees and indegrees
4.3. Subdigraphs
4.4. Conversions
4.4.1. Multidigraphs to multigraphs
4.4.2. Multigraphs to multidigraphs
4.4.3. Simple digraphs to multidigraphs
4.4.4. Multidigraphs to simple digraphs
4.4.5. Multidigraphs as a big tent
4.5. Walks, paths, closed walks, cycles
4.5.1. Definitions
4.5.2. Basic properties
4.5.3. Exercises
4.5.4. The adjacency matrix
4.6. Connectedness strong and weak
4.7. Eulerian walks and circuits
4.8. Hamiltonian cycles and paths
4.9. The reverse and complement digraphs
4.10. Tournaments
4.10.1. Definition
4.10.2. The Rédei theorems
4.10.3. Hamiltonian cycles in tournaments
4.10.4. Application of tournaments to the Vandermonde determinant
4.11. Exercises on tournaments

5. Trees and arborescences
5.1. Some general properties of components and cycles
5.1.1. Backtrack-free walks revisited
5.1.2. Counting components
5.2. Forests and trees
5.2.1. Definitions
5.2.2. The tree equivalence theorem
5.2.3. Summary
5.3. Leaves
5.4. Spanning trees
5.4.1. Spanning subgraphs
5.4.2. Spanning trees
5.4.3. Spanning forests
5.4.4. Existence and construction of a spanning tree
5.4.5. Applications
5.4.6. Exercises
5.4.7. Existence and construction of a spanning forest
5.5. Centers of graphs and trees
5.5.1. Distances
5.5.2. Eccentricity and centers
5.5.3. The centers of a tree
5.6. Arborescences
5.6.1. Definitions
5.6.2. Arborescences vs. trees: statement
5.6.3. The arborescence equivalence theorem
5.7. Arborescences vs. trees
5.8. Spanning arborescences
5.9. The BEST theorem: statement
5.10. Arborescences rooted to r
5.11. The BEST theorem: proof
5.12. A corollary about spanning arborescences
5.13. Spanning arborescences vs. spanning trees
5.14. The matrix-tree theorem
5.14.1. Introduction
5.14.2. Notations
5.14.3. The Laplacian of a multidigraph
5.14.4. The Matrix-Tree Theorem: statement
5.14.5. Application: Counting the spanning trees of K_n
5.14.6. Preparations for the proof
5.14.7. The Matrix-Tree Theorem: proof
5.14.8. Further exercises on the Laplacian
5.14.9. Application: Counting Eulerian circuits of K_n^bidir
5.15. The undirected Matrix-Tree Theorem
5.15.1. The theorem
5.15.2. Application: counting spanning trees of K_{n,m}
5.16. de Bruijn sequences
5.16.1. Definition
5.16.2. Existence of de Bruijn sequences
5.16.3. Counting de Bruijn sequences
5.17. More on Laplacians
5.18. On the left nullspace of the Laplacian
5.19. A weighted Matrix-Tree Theorem
5.19.1. Definitions
5.19.2. The weighted Matrix-Tree Theorem
5.19.3. The polynomial identity trick
5.19.4. Proof of the weighted MTT
5.19.5. Application: Counting trees by their degrees
5.19.6. The weighted harmonic vector theorem

6. Colorings
6.1. Definition
6.2. 2-colorings
6.3. The Brooks theorems
6.4. Exercises on proper colorings
6.5. The chromatic polynomial
6.6. Vizing’s theorem
6.7. Further exercises

7. Independent sets
7.1. Definition and the Caro–Wei theorem
7.2. A weaker (but simpler) lower bound
7.3. A proof of Turán’s theorem

8. Matchings
8.1. Introduction
8.2. Bipartite graphs
8.3. Hall’s marriage theorem
8.4. König and Hall–König
8.5. Systems of representatives
8.6. Regular bipartite graphs
8.7. Latin squares
8.8. Magic matrices and the Birkhoff–von Neumann theorem
8.9. Further uses of Hall’s marriage theorem
8.10. Further exercises on matchings

9. Networks and flows
9.1. Definitions
9.1.1. Networks
9.1.2. The notations S, [P, Q] and d (P, Q)
9.1.3. Flows
9.1.4. Inflow, outflow and value of a flow
9.2. The maximum flow problem and bipartite graphs
9.3. Basic properties of flows
9.4. The max-flow-min-cut theorem
9.4.1. Cuts and their capacities
9.4.2. The max-flow-min-cut theorem: statement
9.4.3. How to augment a flow
9.4.4. The residual digraph
9.4.5. The augmenting path lemma
9.4.6. Proof of max-flow-min-cut
9.5. Application: Deriving Hall–König
9.6. Other applications

10. More about paths
10.1. Menger’s theorems
10.1.1. The arc-Menger theorem for directed graphs
10.1.2. The edge-Menger theorem for undirected graphs
10.1.3. The vertex-Menger theorem for directed graphs
10.1.4. The vertex-Menger theorem for undirected graphs
10.2. The Gallai–Milgram theorem
10.2.1. Definitions
10.2.2. The Gallai–Milgram theorem
10.2.3. Applications
10.3. Path-missing sets
10.4. Elser’s sums

This work is licensed under a Creative Commons “CC0 1.0 Universal” license.

1. Preface
1.1. What is this?
This is a course on graphs – a rather elementary concept (actually a cluster of
closely related concepts) that can be seen all over mathematics. We will discuss
several kinds of graphs (simple graphs, multigraphs, directed graphs, etc.) and
study their features and properties. In particular, we will encounter walks on
graphs, matchings of graphs, flows on networks (networks are graphs with
extra data), and take a closer look at certain types of graphs such as trees and
tournaments.
The theory of graphs goes back at least to Leonhard Euler, who in a 1736
paper [Euler36] (see [Euler53] for an English translation) solved a puzzle about
an optimal tour of the town of Königsberg. It saw some more developments in
the 19th century and straight-up exploded in the 20th; now it is one of the most
active fields of mathematics. There are now dozens (if not hundreds) of textbooks
available on the subject, such as

• the comprehensive works [BonMur08], [Berge91], [Ore74], [Bollob98],
[Dieste17], [ChLeZh16], [Jungni13]

• or the more introductory [Ore96], [BenWil06, Chapters 5–6], [Bollob71],
[Griffi21], [Galvin21], [Guicha16, Chapter 5], [Harary69], [Harju14],
[HaHiMo08, Chapter 1], [Wilson10], [Tait21], [LeLeMe18, Chapters 10–13],
[Ruohon13], [KelTro17], [LoPeVe03], [West01], [Verstr21], [HarRin03].

These texts are written at different levels of sophistication, rigor and detail, are
tailored to different audiences, and (beyond the absolute basics) often cover
different ground (for instance, [Dieste17] distinguishes itself by treating infinite
and random graphs, whereas [Griffi21] is strong on applications).
The present notes are self-contained and do not follow any existing book.
Nevertheless, I recommend skimming the texts cited above to gain a wider
perspective on graph theory (far beyond what we can cover in an introductory
course), and perhaps marking one or another of these books for later reading. Our
focus in these notes is on the more discrete and algebraic sides of graph theory
(finite graphs of various kinds, existential results, counting formulas), and they
are limited both by the time constraints (being written for a quarter-long course)
and the limits of my own knowledge.

1.1.1. Remarks
Prerequisites. These notes target a graduate-level (or advanced undergraduate)
reader. A certain mathematical sophistication and willingness to think along
(as well as invent one’s own examples) is expected. Beyond that, the main
prerequisites are the basic properties of determinants, polynomials and finite
sums. Rings and fields are occasionally mentioned, but the reader can make do
with just the most basic examples thereof (Q, R, polynomial rings and matrix
rings; also the finite field F2 in a few places). No analysis (or even calculus) is
required anywhere in this text.
Course websites. These notes were written for my Math 530 course at Drexel
University in Spring 2022. The website of this course can be found at

https://www.cip.ifi.lmu.de/~grinberg/t/22s .

An older, but similarly structured course is my Spring 2017 course at the
University of Minnesota. Its website is available at

https://www.cip.ifi.lmu.de/~grinberg/t/17s ,

and contains some additional materials (such as solutions to some selected
exercises, a few more detailed topics, and a stub of a text [17s] that covers parts
of our Chapter 2 in more depth). If you are reading the present notes on the
arXiv, then said additional materials can also be found as ancillary files to this
arXiv submission.
Exercises. These notes include exercises of varying difficulty and significance.
Almost all of the exercises are optional (i.e., they are not used anywhere
in the text, except perhaps in other exercises), but they often provide practice,
context and additional inspiration. Naturally, one person’s inspiration is another’s
distraction, so I do not recommend assigning too much importance to
any specific exercise; it is usually better to read on than to dwell for hours.
However, a dozen minutes of thought per exercise will likely not be a waste of
time.
Acknowledgments. I have learned a lot from conversations with Joel Brewster
Lewis, Lukas Katthän and Victor Reiner. Chiara Libera Carnevale and
Amanda Johnson corrected errors in previous versions of these notes. I am
indebted to all of the above, and would appreciate any further input – please
contact darijgrinberg@gmail.com about any corrections (however small) and
suggestions.

1.2. Notations
The following notations will be used throughout these notes:

• We let N = {0, 1, 2, . . .}. Thus, 0 ∈ N.

• The size (i.e., cardinality) of a finite set S is denoted by |S|.

• If S is a set, then the powerset of S means the set of all subsets of S. This
powerset will be denoted by P (S).
Moreover, if S is a set, and k is an integer, then Pk (S) will mean the set of
all k-element subsets of S. For instance,

P2 ({1, 2, 3}) = {{1, 2} , {1, 3} , {2, 3}} .

• For any number n and any k ∈ N, we define the binomial coefficient $\binom{n}{k}$
to be the number

$$\frac{n (n-1) (n-2) \cdots (n-k+1)}{k!} = \frac{\prod_{i=0}^{k-1} (n-i)}{k!} .$$

These binomial coefficients have many interesting properties, which can
often be found in textbooks on enumerative combinatorics (e.g., [19fco,
Chapter 2]). Some of the most important ones are the following:

– The factorial formula: If n, k ∈ N and n ≥ k, then $\binom{n}{k} = \frac{n!}{k! \cdot (n-k)!}$.

– The combinatorial interpretation: If n, k ∈ N, and if S is an n-element
set, then $\binom{n}{k}$ is the number of all k-element subsets of S (in other
words, $|P_k (S)| = \binom{n}{k}$).

– Pascal’s recursion: For any number n and any positive integer k, we
have $\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$.
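
The notations in this list can be tried out on a computer. The following Python sketch is one possible encoding, under the assumption that "any number n" is restricted to rational numbers so that the arithmetic stays exact; the helper names binom and k_subsets are hypothetical and do not come from these notes. It computes the binomial coefficient through the product formula and spot-checks the three properties for small values.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def binom(n, k):
    """Binomial coefficient n (n-1) ... (n-k+1) / k! for a rational n and k in N."""
    numerator = Fraction(1)
    for i in range(k):
        numerator *= Fraction(n) - i
    return numerator / factorial(k)


def k_subsets(S, k):
    """The set of all k-element subsets of S, each stored as a frozenset."""
    return {frozenset(c) for c in combinations(S, k)}


# The example from the text: all 2-element subsets of {1, 2, 3}.
p2 = k_subsets({1, 2, 3}, 2)

# Spot-check the three listed properties for small n and k.
for n in range(7):
    for k in range(7):
        if n >= k:  # factorial formula
            assert binom(n, k) == Fraction(factorial(n), factorial(k) * factorial(n - k))
        # combinatorial interpretation, using an n-element set
        assert len(k_subsets(range(n), k)) == binom(n, k)
        if k >= 1:  # Pascal's recursion
            assert binom(n, k) == binom(n - 1, k - 1) + binom(n - 1, k)
```

Note that the product formula also makes sense when n is negative or not an integer, which is why Pascal's recursion can be checked for n = 0 here (it then involves the binomial coefficients of −1).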

2. Simple graphs
2.1. Definitions
The first type of graphs that we will consider are the “simple graphs”, named
so because of their very simple definition:

Definition 2.1.1. A simple graph is a pair (V, E), where V is a finite set, and
where E is a subset of P2 (V ).

Recall that P2 (V ) is the set of all 2-element subsets of V. Thus, a simple
graph is a pair (V, E), where V is a finite set, and E is a set consisting of 2-element
subsets of V. We will abbreviate the term “simple graph” as “graph”
in this chapter, but later (in Chapter 3) we will learn some more advanced and
general notions of “graphs”.

Example 2.1.2. Here is a simple graph:

({1, 2, 3, 4} , {{1, 3} , {1, 4} , {3, 4}}) .

Example 2.1.3. For any n ∈ N, we can define a simple graph Cop_n to be the
pair (V, E), where V = {1, 2, . . . , n} and

E = {{u, v} ∈ P2 (V ) | gcd (u, v) = 1} .

We call this the n-th coprimality graph.
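
To see Definition 2.1.1 and the coprimality graph in action, here is a minimal Python sketch. It assumes one particular encoding that the notes do not prescribe: a graph is a pair of Python sets, and each edge is a frozenset of size 2. The function names coprimality_graph and is_simple_graph are hypothetical.

```python
from itertools import combinations
from math import gcd


def coprimality_graph(n):
    """Return the n-th coprimality graph as a pair (V, E)."""
    V = set(range(1, n + 1))
    # Keep exactly those 2-element subsets {u, v} of V with gcd(u, v) = 1.
    E = {frozenset((u, v)) for u, v in combinations(V, 2) if gcd(u, v) == 1}
    return V, E


def is_simple_graph(V, E):
    """Check Definition 2.1.1: every element of E is a 2-element subset of V."""
    return all(len(e) == 2 and e <= V for e in E)


V, E = coprimality_graph(5)
print(sorted(tuple(sorted(e)) for e in E))
# prints [(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 5), (3, 4), (3, 5), (4, 5)]
```

Using frozensets rather than ordered pairs mirrors the fact that an edge {u, v} has no direction.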

(Some authors do not require V to be finite in Definition 2.1.1; this leads to
infinite graphs. But I shall leave this can of worms closed for this quarter.)
The purpose of simple graphs is to encode relations on a finite set – specifically
the kind of relations that are binary (i.e., relate pairs of elements), symmetric
(i.e., mutual) and irreflexive (i.e., an element cannot be related to itself).
For example, the graph Cop_n in Example 2.1.3 encodes the coprimality (aka
coprimeness) relation on the set {1, 2, . . . , n}, except that the latter relation is
not irreflexive (1 is coprime to 1, but {1, 1} is not in E; thus, the graph Cop_n
“forgets” that 1 is coprime to 1). For another example, if V is a set of people,
and E is the set of {u, v} ∈ P2 (V ) such that u has been married to v at some
point, then (V, E) is a simple graph. Even in 2022, marriage to oneself is not a
thing, so all marriages can be encoded as 2-element subsets.¹

¹ The more standard example for a social graph would be a “friendship graph”; here, V is
again a set of people, but E is now the set of {u, v} ∈ P2 (V ) such that u and v are friends.
Of course, this only works if you think of friendship as being automatically mutual (true
for Facebook friendship, questionable for the actual thing).

The following notations provide a quick way to reference the elements of V
and E when given a graph (V, E):

Definition 2.1.4. Let G = (V, E) be a simple graph.

(a) The set V is called the vertex set of G; it is denoted by V (G). (Notice
that the letter “V” in “V (G)” is upright, as opposed to the letter “V”
in “(V, E)”, which is italic. These are two different symbols, and have
different meanings: The italic letter V stands for the specific set V which is
the first component of the pair G, whereas the upright letter V is part of the
notation V (G) for the vertex set of any graph. Thus, if H = (W, F) is
another graph, then V (H) is W, not V.)
The elements of V are called the vertices (or the nodes) of G.

(b) The set E is called the edge set of G; it is denoted by E (G). (Again, the
letter “E” in “E (G)” is upright, and stands for a different thing than
the italic “E” in “(V, E)”.)
The elements of E are called the edges of G. When u and v are two
elements of V, we shall often use the notation uv for {u, v}; thus, each
edge of G has the form uv for two distinct elements u and v of V. Of
course, we always have uv = vu.
Notice that each simple graph G satisfies G = (V ( G ) , E ( G )).

(c) Two vertices u and v of G are said to be adjacent (to each other) if
uv ∈ E (that is, if uv is an edge of G). In this case, the edge uv is said
to join u with v (or connect u and v); the vertices u and v are called
the endpoints of this edge. When the graph G is not obvious from the
context, we shall often say “adjacent in G” instead of just “adjacent”.
Two vertices u and v of G are said to be non-adjacent (to each other) if
they are not adjacent (i.e., if uv ∉ E).

(d) Let v be a vertex of G (that is, v ∈ V). Then, the neighbors of v (in
G) are the vertices u of G that satisfy vu ∈ E. In other words, the
neighbors of v are the vertices of G that are adjacent to v.

Example 2.1.5. Let G be the simple graph

({1, 2, 3, 4} , {{1, 3} , {1, 4} , {3, 4}})

from Example 2.1.2. Then, its vertex set and its edge set are

V (G ) = {1, 2, 3, 4} and E ( G ) = {{1, 3} , {1, 4} , {3, 4}} = {13, 14, 34}

(using our notation uv for {u, v}). The vertices 1 and 3 are adjacent (since
13 ∈ E (G)), but the vertices 1 and 2 are not (since 12 ∉ E (G)). The neighbors
of 1 are 3 and 4. The endpoints of the edge 34 are 3 and 4.
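
The notions of Definition 2.1.4 translate directly into code. The sketch below redoes Example 2.1.5 in Python, assuming (as one possible choice, not fixed by the notes) that a graph is stored as a pair of sets with frozenset edges; the helper names edge, adjacent and neighbors are hypothetical.

```python
def edge(u, v):
    """The edge uv, i.e., the set {u, v}; note that edge(u, v) == edge(v, u)."""
    return frozenset((u, v))


def adjacent(G, u, v):
    """Return True if the vertices u and v are adjacent in G = (V, E)."""
    V, E = G
    return edge(u, v) in E


def neighbors(G, v):
    """Return the set of all vertices of G = (V, E) that are adjacent to v."""
    V, E = G
    return {u for u in V if edge(v, u) in E}


# The graph from Example 2.1.2.
G = ({1, 2, 3, 4}, {edge(1, 3), edge(1, 4), edge(3, 4)})

print(adjacent(G, 1, 3))   # prints True
print(adjacent(G, 1, 2))   # prints False
print(neighbors(G, 1))     # prints {3, 4}
```

A vertex is never its own neighbor in this encoding, since edge(v, v) is a 1-element set and therefore cannot belong to E.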

2.2. Drawing graphs


There is a common method to represent graphs visually: Namely, a graph can
be drawn as a set of points in the plane and a set of curves connecting some of
these points with each other.
More precisely:
Definition 2.2.1. A simple graph G can be visually represented by drawing
it on the plane. To do so, we represent each vertex of G by a point (at which
we put the name of the vertex), and then, for each edge uv of G, we draw a
curve that connects the point representing u with the point representing v.
The positions of the points and the shapes of the curves can be chosen freely,
as long as they allow the reader to unambiguously reconstruct the graph G
from the picture. (Thus, for example, the curves should not pass through
any points other than the ones they mean to connect.)

Example 2.2.2. Let us draw some simple graphs.


(a) The simple graph ({1, 2, 3} , {12, 23}) (where we are again using the
shorthand notation uv for {u, v}) can be drawn as follows:

[Drawing omitted: the points 1, 2, 3 placed in a row and joined by straight lines.]
This is (in a sense) the simplest way to draw this graph: The edges are
represented by straight lines. But we can draw it in several other ways as
well – e.g., as follows:

[Drawing omitted: the points placed in a row in the order 1, 3, 2.]
Here, we have placed the points representing the vertices 1, 2, 3 differently.
As a consequence, we were not able to draw the edge 12 as a straight line,
because it would then have overlapped with the vertex 3, which would make
the graph ambiguous (the edge 12 could be mistaken for two edges 13 and
32).
Here are three further drawings of the same graph ({1, 2, 3} , {12, 23}):

[Three drawings omitted.]

(b) Consider the 5-th coprimality graph Cop_5 defined in Example 2.1.3.
Here is one way to draw it:

[Drawing omitted.]

Here is another way to draw the same graph Cop_5, with fewer intersections
between edges:

[Drawing omitted.]

By appropriately repositioning the points corresponding to the five vertices
of Cop_5, we can actually get rid of all intersections and make all the edges
straight (as opposed to curved). Can you find out how?
(c) Let us draw one further graph: the simple graph
({1, 2, 3, 4, 5} , P2 ({1, 2, 3, 4, 5})). This is the simple graph whose vertices
are 1, 2, 3, 4, 5, and whose edges are all possible two-element sets
consisting of its vertices (i.e., each pair of two distinct vertices is adjacent).
We shall later call this graph the “complete graph K_5”. Here is a simple way
to draw this graph:

[Drawing omitted.]

This drawing is useful for many purposes; for example, it makes the abstract
symmetry of this graph (i.e., the fact that, roughly speaking, its vertices
1, 2, 3, 4, 5 are “equal in rights”) obvious. But sometimes, you might want to
draw it differently, to minimize the number of intersecting curves. Here is a
drawing with fewer intersections:

[Drawing omitted.]

In this drawing, we have only one intersection between two curves left. Can
we get rid of all intersections?
This is a question of topology, not of combinatorics, since it really is about
curves in the plane rather than about finite sets and graphs. The answer is
“no”. (That is, no matter how you draw this graph in the plane, you will
always have at least one pair of curves intersect.) This is a classical result
(one of the first theorems in the theory of planar graphs), and proofs of it
can be found in various textbooks (e.g., [FriFri98, Theorem 4.1.2], which is
generally a good introduction to planar graph theory even if it uses terminology
somewhat different from ours). Note that any proof must use some
analysis or topology, since the result relies on the notion of a (continuous)
curve in the plane (if curves were allowed to be non-continuous, then they
could “jump over” one another, so they could easily avoid intersecting!).

2.3. A first fact: The Ramsey number R (3, 3) = 6
Enough definitions; let’s state a first result:

Proposition 2.3.1. Let G be a simple graph with |V (G)| ≥ 6 (that is, G has at
least 6 vertices). Then, at least one of the following two statements holds:

• Statement 1: There exist three distinct vertices a, b and c of G such that
ab, bc and ca are edges of G.

• Statement 2: There exist three distinct vertices a, b and c of G such that
none of ab, bc and ca is an edge of G.

In other words, Proposition 2.3.1 says that if a graph G has at least 6 vertices,
then we can either find three distinct vertices that are mutually adjacent² or find
three distinct vertices that are mutually non-adjacent (i.e., no two of them are
adjacent), or both. Often, this is restated as follows: “In any group of at least
six people, you can always find three that are (pairwise) friends with each other,
or three no two of whom are friends” (provided that friendship is a symmetric
relation).
We will give some examples in a moment, but first let us introduce some
convenient terminology:
Definition 2.3.2. Let G be a simple graph.

(a) A set {a, b, c} of three distinct vertices of G is said to be a triangle (of
G) if every two distinct vertices in this set are adjacent (i.e., if ab, bc and
ca are edges of G).

(b) A set {a, b, c} of three distinct vertices of G is said to be an anti-triangle
(of G) if no two distinct vertices in this set are adjacent (i.e., if none of
ab, bc and ca is an edge of G).

Thus, Proposition 2.3.1 says that every simple graph with at least 6 vertices
contains a triangle or an anti-triangle (or both).
Example 2.3.3. Let us show two examples of graphs G to which Proposi-
tion 2.3.1 applies, as well as an example to which it does not:

(a) Let G be the graph (V, E), where

V = {1, 2, 3, 4, 5, 6} and
E = {{1, 2} , {2, 3} , {3, 4} , {4, 5} , {5, 6} , {6, 1}} .

(This graph can be drawn in such a way as to look like a hexagon.
[Figure: hexagon with the vertices 1, 2, 3, 4, 5, 6 in cyclic order.])
This graph satisfies Proposition 2.3.1, since {1, 3, 5} is an anti-triangle
(or since {2, 4, 6} is an anti-triangle).
2 by which we mean (of course) that any two distinct ones among these three vertices are
adjacent

(b) Let G be the graph (V, E), where

V = {1, 2, 3, 4, 5, 6} and
E = {{1, 2} , {2, 3} , {3, 4} , {4, 5} , {5, 6} , {6, 1} , {1, 3} , {4, 6}} .

(This graph can be drawn in such a way as to look like a hexagon with
two extra diagonals.
[Figure: hexagon with the vertices 1, 2, 3, 4, 5, 6 in cyclic order and the two extra diagonals 13 and 46.])
This graph satisfies Proposition 2.3.1, since {1, 2, 3} is a triangle.

(c) Let G be the graph (V, E), where

V = {1, 2, 3, 4, 5} and
E = {{1, 2} , {2, 3} , {3, 4} , {4, 5} , {5, 1}} .

(This graph can be drawn to look like a pentagon.
[Figure: pentagon with the vertices 1, 2, 3, 4, 5 in cyclic order.])
Proposition 2.3.1 says nothing about this graph, since this graph does
not satisfy the assumption of Proposition 2.3.1 (in fact, its number of
vertices |V (G )| fails to be ≥ 6). By itself, this does not yield that the
claim of Proposition 2.3.1 is false for this graph. However, it is easy
to check that the claim actually is false for this graph: It has neither a
triangle nor an anti-triangle.
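The three graphs of Example 2.3.3 are small enough to check by machine. The following Python sketch is not part of the original notes; the function name `triangles_and_anti_triangles` and the variable names are chosen here purely for illustration. It lists all triangles and anti-triangles of a simple graph given by its vertex set and edge list.

```python
from itertools import combinations

def triangles_and_anti_triangles(vertices, edges):
    """Return (triangles, anti_triangles) of the simple graph (vertices, edges)."""
    edge_set = {frozenset(e) for e in edges}
    triangles, anti_triangles = [], []
    for triple in combinations(sorted(vertices), 3):
        # Count how many of the three pairs ab, bc, ca are edges.
        present = sum(frozenset(pair) in edge_set
                      for pair in combinations(triple, 2))
        if present == 3:
            triangles.append(set(triple))
        elif present == 0:
            anti_triangles.append(set(triple))
    return triangles, anti_triangles

# The graphs from parts (a), (b) and (c) of Example 2.3.3.
hexagon = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)]
hexagon_plus = hexagon + [(1, 3), (4, 6)]
pentagon = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]

print(triangles_and_anti_triangles(range(1, 7), hexagon))
print(triangles_and_anti_triangles(range(1, 7), hexagon_plus))
print(triangles_and_anti_triangles(range(1, 6), pentagon))
```

For the pentagon, both lists come out empty, in line with part (c).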

Proof of Proposition 2.3.1. We need to prove that G has a triangle or an anti-triangle
(or both).
Choose any vertex u ∈ V (G). (This is clearly possible, since |V (G)| ≥ 6 ≥ 1.)
Then, there are at least 5 vertices distinct from u (since G has at least 6 vertices).
We are in one of the following two cases:
Case 1: The vertex u has at least 3 neighbors.
Case 2: The vertex u has at most 2 neighbors.
Let us consider Case 1 first. In this case, the vertex u has at least 3 neighbors.
Hence, we can find three distinct neighbors p, q and r of u. Consider these p, q
and r. If one (or more) of pq, qr and rp is an edge of G, then G has a triangle
(for example, if pq is an edge of G, then { u, p, q} is a triangle). If not, then G has
an anti-triangle (namely, { p, q, r }). Thus, in either case, our proof is complete
in Case 1.
Let us now consider Case 2. In this case, the vertex u has at most 2 neighbors.
Hence, the vertex u has at least 3 non-neighbors³ (since there are at least 5
vertices distinct from u in total). Thus, we can find three distinct non-neighbors
p, q and r of u. Consider these p, q and r. If all of pq, qr and rp are edges of G,
then G has a triangle (namely, { p, q, r }). If not, then G has an anti-triangle (for
example, if pq is not an edge of G, then { u, p, q} is an anti-triangle). In either
case, we are thus done with the proof in Case 2. Thus, both cases are resolved,
and the proof is complete.
Notice the symmetry between Case 1 and Case 2 in our above proof: the ar-
guments used were almost the same, except that neighbors and non-neighbors
swapped roles.

Remark 2.3.4. Proposition 2.3.1 could also be proved by brute force
(using a computer). Indeed, it clearly suffices to prove it for all simple graphs
with 6 vertices (as opposed to ≥ 6 vertices), because if a graph has more than
6 vertices, then we can just throw away some of them until we have only 6
left. However, there are only finitely many simple graphs with 6 vertices (up
to relabeling of their vertices), and the validity of Proposition 2.3.1 can be
checked for each of them. This is, of course, cumbersome (even a computer
would take a moment checking all the 2¹⁵ = 32768 possible graphs for triangles and
anti-triangles) and unenlightening.
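For the curious, here is what such a brute-force check might look like in Python. This is only a sketch under my own conventions (vertices are 0, 1, ..., n − 1, and the function names are made up for this illustration); it runs through all subsets of the 15 possible edges on 6 vertices.

```python
from itertools import combinations

def has_triangle_or_anti_triangle(n, edge_set):
    """Check whether some 3 of the vertices 0..n-1 span 3 edges or 0 edges."""
    for triple in combinations(range(n), 3):
        present = sum(pair in edge_set for pair in combinations(triple, 2))
        if present in (0, 3):
            return True
    return False

def every_graph_has_one(n):
    """Check all simple graphs with vertex set {0, ..., n-1}."""
    pairs = list(combinations(range(n), 2))
    for mask in range(1 << len(pairs)):
        # The bits of mask decide which of the possible edges are present.
        edge_set = {pair for i, pair in enumerate(pairs) if mask >> i & 1}
        if not has_triangle_or_anti_triangle(n, edge_set):
            return False
    return True

print(every_graph_has_one(6))
print(every_graph_has_one(5))
```

The first call confirms the claim for 6 vertices, while the second call fails, since the pentagon has 5 vertices and neither a triangle nor an anti-triangle.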

Proposition 2.3.1 is the first result in a field of graph theory known as Ramsey
theory. I shall not dwell on this field in this course, but let me make a few more
remarks. The first step beyond Proposition 2.3.1 is the following generalization:

Proposition 2.3.5. Let r and s be two positive integers. Let G be a simple
graph with $|V(G)| \geq \binom{r+s-2}{r-1}$. Then, at least one of the following two
statements holds:
3 The word “non-neighbor” shall here mean a vertex that is not adjacent to u and distinct from
u. Thus, u does not count as a non-neighbor of u.

• Statement 1: There exist r distinct vertices of G that are mutually adjacent
(i.e., each two distinct ones among these r vertices are adjacent).

• Statement 2: There exist s distinct vertices of G that are mutually non-adjacent
(i.e., no two distinct ones among these s vertices are adjacent).

Applying Proposition 2.3.5 to r = 3 and s = 3, we can recover Proposition 2.3.1.
One might wonder whether the number $\binom{r+s-2}{r-1}$ in Proposition 2.3.5 can
be improved – i.e., whether we can replace it by a smaller number without
making Proposition 2.3.5 false. In the case of r = 3 and s = 3, this is impossible,
because the number 6 in Proposition 2.3.1 cannot be made smaller⁴.
However, for some other values of r and s, the value $\binom{r+s-2}{r-1}$ can be improved.
(For example, for r = 4 and s = 4, the best possible value is 18 rather
than $\binom{4+4-2}{4-1} = 20$.) The smallest possible value that could stand in place
of $\binom{r+s-2}{r-1}$ in Proposition 2.3.5 is called the Ramsey number R (r, s); thus,
we have just shown that R (3, 3) = 6. Finding R (r, s) for higher values of r
and s is a hard computational challenge; here are some values that have been
found with the help of computers:

R (3, 4) = 9; R (3, 5) = 14; R (3, 6) = 18; R (3, 7) = 23;


R (3, 8) = 28; R (3, 9) = 36; R (4, 4) = 18; R (4, 5) = 25.

(We are only considering the cases r ≤ s, since it is easy to see that R (r, s) =
R (s, r ) for all r and s. Also, the trivial values R (1, s) = 1 and R (2, s) = s
for s ≥ 2 are omitted.) The Ramsey number R (5, 5) is still unknown (although
it is known that 43 ≤ R (5, 5) ≤ 48).
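To get a feeling for how far the binomial bound of Proposition 2.3.5 is from the truth, one can tabulate it against the Ramsey numbers listed above. The following Python sketch does this; the dictionary `known` merely copies the values quoted in the text, and its name is my own.

```python
from math import comb

# The Ramsey numbers R(r, s) quoted in the text.
known = {(3, 3): 6, (3, 4): 9, (3, 5): 14, (3, 6): 18, (3, 7): 23,
         (3, 8): 28, (3, 9): 36, (4, 4): 18, (4, 5): 25}

for (r, s), value in known.items():
    bound = comb(r + s - 2, r - 1)  # the bound from Proposition 2.3.5
    print(f"R({r},{s}) = {value} <= {bound}")
```

Among the listed pairs, the bound is attained only for r = s = 3.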
Proposition 2.3.5 can be further generalized to a result called Ramsey’s theo-
rem. The idea behind the generalization is to slightly change the point of view,
and replace the simple graph G by a complete graph (i.e., a simple graph in
which every two distinct vertices are adjacent) whose edges are colored in two
colors (say, blue and red). This is a completely equivalent concept, because
the concepts of “adjacent” and “non-adjacent” in G can be identified with the
concepts of “adjacent through a blue edge” (i.e., the edge connecting them is
colored blue) and “adjacent through a red edge”, respectively. Statements 1
and 2 then turn into “there exist r distinct vertices that are mutually adjacent
through blue edges” and “there exist s distinct vertices that are mutually adja-
cent through red edges”, respectively. From this point of view, it is only logical

4 Indeed, we saw in Example 2.3.3 (c) that 5 vertices would not suffice.
to generalize Proposition 2.3.5 further to the case when the edges of a complete
graph are colored in k (rather than two) colors. The corresponding generaliza-
tion is known as Ramsey’s theorem. We refer to the well-written Wikipedia
page https://en.wikipedia.org/wiki/Ramsey's_theorem for a treatment of
this generalization with proof, as well as a table of known Ramsey numbers
R (r, s) and a self-contained (if somewhat terse) proof of Proposition 2.3.5. Ram-
sey’s theorem can be generalized and varied further; this usually goes under
the name “Ramsey theory”. For elementary introductions, see the Cut-the-knot
page http://www.cut-the-knot.org/Curriculum/Combinatorics/ThreeOrThree.shtml,
the above-mentioned Wikipedia article, as well as the texts by Harju [Harju14],
Bollobas [Bollob98] and West [West01].
There is one more direction in which Proposition 2.3.1 can be improved a bit:
A graph G with at least 6 vertices has not only one triangle or anti-triangle, but
at least two of them (this can include having one triangle and one anti-triangle).
Proving this makes for a nice exercise:

Exercise 2.1. Let G be a simple graph. A triangle-or-anti-triangle in G means
a set that is either a triangle or an anti-triangle.

(a) Assume that |V (G)| ≥ 6. Prove that G has at least two triangle-or-anti-triangles.
(For comparison: Proposition 2.3.1 shows that G has at least
one triangle-or-anti-triangle.)

(b) Assume that |V (G)| = m + 6 for some m ∈ N. Prove that G has at least
m + 1 triangle-or-anti-triangles.

[Solution: This is Exercise 1 on homework set #1 from my Spring 2017
course; see the course page for solutions.]

2.4. Degrees
The degree of a vertex in a simple graph just counts how many edges contain
this vertex:

Definition 2.4.1. Let G = (V, E) be a simple graph. Let v ∈ V be a vertex.


Then, the degree of v (with respect to G) is defined to be

deg v := (the number of edges e ∈ E that contain v)
       = (the number of neighbors of v)
       = |{u ∈ V | uv ∈ E}|
       = |{e ∈ E | v ∈ e}| .

(These equalities are pretty easy to check: Each edge e ∈ E that contains v
contains exactly one neighbor of v, and conversely, each neighbor of v belongs
to exactly one edge that contains v. However, these equalities are specific to
simple graphs, and won’t hold any more once we move on to multigraphs.)
For example, in the graph

[Figure: a simple graph on the vertices 1, 2, 3, 4, 5.]

the vertices have degrees

deg 1 = 3, deg 2 = 2, deg 3 = 3, deg 4 = 2, deg 5 = 0.

Here are some basic properties of degrees in simple graphs:

Proposition 2.4.2. Let G be a simple graph with n vertices. Let v be a vertex


of G. Then,
deg v ∈ {0, 1, . . . , n − 1} .

Proof. All neighbors of v belong to the (n − 1)-element set V ( G ) \ {v}. Thus,


their number is ≤ n − 1.

Proposition 2.4.3 (Euler 1736). Let G be a simple graph. Then, the sum of
the degrees of all vertices of G equals twice the number of edges of G. In
other words,
$\sum_{v \in V(G)} \deg v = 2 \cdot |E(G)|$.

Proof. Write the simple graph G as G = (V, E); thus, V ( G ) = V and E ( G ) = E.


Now, let N be the number of all pairs (v, e) ∈ V × E such that v ∈ e. We
compute N in two different ways (this is called “double-counting”):

1. We can obtain N by computing, for each v ∈ V, the number of all e ∈ E


that satisfy v ∈ e, and then summing these numbers over all v. Since these
numbers are just the degrees deg v, the result will be $\sum_{v \in V} \deg v$.

2. On the other hand, we can obtain N by computing, for each e ∈ E, the


number of all v ∈ V that satisfy v ∈ e, and summing these numbers over
all e. Since each e ∈ E contains exactly 2 vertices v ∈ V, this result will be
$\sum_{e \in E} 2 = |E| \cdot 2 = 2 \cdot |E|$.

Since these two results must be equal (because they both equal N), we thus
see that $\sum_{v \in V} \deg v = 2 \cdot |E|$. But this is the claim of Proposition 2.4.3.
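The double-counting argument can be replayed numerically. The Python sketch below is only an illustration (the helper `degrees` and the variable names are my own); it uses the graph from Example 2.3.3 (b) and computes the number N of pairs (v, e) with v ∈ e in both ways.

```python
def degrees(vertices, edges):
    """Return a dictionary sending each vertex to its degree."""
    deg = {v: 0 for v in vertices}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

V = range(1, 7)
E = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (1, 3), (4, 6)]
deg = degrees(V, E)

# N counted vertex by vertex (way 1) and edge by edge (way 2).
N_by_vertices = sum(deg.values())
N_by_edges = sum(len(set(e)) for e in E)

# The vertices of odd degree.
odd = [v for v in V if deg[v] % 2 == 1]
print(deg, N_by_vertices, N_by_edges, odd)
```

Both counts agree with twice the number of edges, and the list of odd-degree vertices has even length.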

Corollary 2.4.4 (handshake lemma). Let G be a simple graph. Then, the


number of vertices v of G whose degree deg v is odd is even.

Proof. Proposition 2.4.3 yields that $\sum_{v \in V(G)} \deg v = 2 \cdot |E(G)|$. Hence,
$\sum_{v \in V(G)} \deg v$ is even. However, if a sum of integers is even, then it must
have an even number of odd addends. Thus, the sum $\sum_{v \in V(G)} \deg v$ must have
an even number of odd addends. In other words, the number of vertices v of G
whose degree deg v is odd is even.
Corollary 2.4.4 is often stated as follows: In a group of people, the number of
persons with an odd number of friends (in the group) is even. It is also known
as the handshake lemma.
Here is another property of degrees in a simple graph:

Proposition 2.4.5. Let G be a simple graph with at least two vertices. Then,
there exist two distinct vertices v and w of G that have the same degree.

Proof. Assume the contrary. So the degrees of all n vertices of G are distinct,
where n = |V (G )|.
In other words, the map

deg : V ( G ) → {0, 1, . . . , n − 1} ,
v ↦ deg v

is injective. But this is a map between two finite sets of the same size (n). When
such a map is injective, it has to be bijective (by the pigeonhole principle).
Therefore, in particular, it takes both 0 and n − 1 as values.
In other words, there are a vertex u with degree 0 and a vertex v with degree
n − 1. Are these two vertices adjacent or not? Yes because of deg v = n − 1; no
because of deg u = 0. Contradiction!
(Fine print: The two vertices u and v must be distinct, since 0 ≠ n − 1. It is
here that we are using the “at least two vertices” assumption!)
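As a sanity check, Proposition 2.4.5 can be verified exhaustively for small n. The Python sketch below is my own illustration (vertices are 0, 1, ..., n − 1, and the function names are hypothetical); it also shows that the claim fails for n = 1, where the "at least two vertices" assumption is violated.

```python
from itertools import combinations

def degree_list(n, edges):
    """Degrees of the vertices 0..n-1 in the graph with the given edges."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def all_graphs(n):
    """Yield the edge list of every simple graph on the vertices 0..n-1."""
    pairs = list(combinations(range(n), 2))
    for mask in range(1 << len(pairs)):
        yield [p for i, p in enumerate(pairs) if mask >> i & 1]

def some_degree_repeats(n):
    """True if every graph on n vertices has two vertices of equal degree."""
    return all(len(set(degree_list(n, edges))) < n for edges in all_graphs(n))

print([some_degree_repeats(n) for n in range(1, 6)])
```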

Here is an application of counting neighbors to proving a fact about graphs.


This is known as Mantel’s theorem:

Theorem 2.4.6 (Mantel’s theorem). Let G be a simple graph with n vertices
and e edges. Assume that e > n²/4. Then, G has a triangle (i.e., three distinct
vertices that are pairwise adjacent).

Example 2.4.7. Let G be the graph (V, E), where

V = {1, 2, 3, 4, 5, 6} ;
E = {12, 23, 34, 45, 56, 61, 14, 25, 36} .

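Here is a drawing:
[Figure: hexagon with the vertices 1, 2, 3, 4, 5, 6 in cyclic order, together with the three diagonals 14, 25 and 36.]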
This graph has no triangle (which, by the way, is easy to verify without
checking all possibilities: just observe that every edge of G joins two vertices
of different parity, but a triangle would necessarily have two vertices of equal
parity). Thus, by the contrapositive of Mantel’s theorem, it satisfies e ≤ n²/4
with n = 6 and e = 9. This is indeed true because 9 = 6²/4. But this also
entails that if we add any further edge to G, then we obtain a triangle.

Proof of Mantel’s theorem. We will prove the theorem by strong induction on n.


Thus, we assume (as the induction hypothesis) that the theorem holds for all
graphs with fewer than n vertices. We must now prove it for our graph G with
n vertices. Let V = V ( G ) and E = E ( G ), so that G = (V, E).
We must prove that G has a triangle. Assume the contrary. Thus, G has no
triangle.
From e > n²/4 ≥ 0, we see that G has an edge. Pick any such edge, and call
it vw. Thus, v ≠ w.
Let us now color each edge of G with one of three colors, as follows:

• The edge vw is colored black.

• Each edge that contains exactly one of v and w is colored red.

• All other edges are colored blue.



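The following picture shows an example of this coloring:

[Figure: a graph with the black edge vw, the red edges containing exactly one of v and w, and the remaining edges in blue.]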

We now count the edges of each color:

• There is exactly 1 black edge – namely, vw.

• How many red edges can there be? I claim that there are at most n − 2.
Indeed, each vertex other than v and w is connected to at most one of v
and w by a red edge, since otherwise it would form a triangle with v and
w.

• How many blue edges can there be? The vertices other than v and w,
along with the blue edges that join them, form a graph with n − 2 vertices;
this graph has no triangles (since G has no triangles). By the induction
hypothesis, however, if this graph had more than (n − 2)²/4 edges, then
it would have a triangle. Thus, it has ≤ (n − 2)²/4 edges. In other words,
there are ≤ (n − 2)²/4 blue edges.

In total, the number of edges is therefore

≤ 1 + (n − 2) + (n − 2)²/4 = n²/4.

In other words, e ≤ n²/4. This contradicts e > n²/4. This is the contradiction
we were looking for, so the induction is complete.
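For readers who want to see the arithmetic behind the equality 1 + (n − 2) + (n − 2)²/4 = n²/4 used above, here is one way to verify it (a routine computation, spelled out for convenience):

```latex
\begin{align*}
1 + (n-2) + \frac{(n-2)^2}{4}
  &= \frac{4 + 4(n-2) + (n-2)^2}{4} \\
  &= \frac{\bigl((n-2) + 2\bigr)^2}{4}
   = \frac{n^2}{4}.
\end{align*}
```

The middle step uses the binomial formula (a + 2)² = a² + 4a + 4 with a = n − 2.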
Quick question: What about equality? Can a graph with n vertices and
exactly n²/4 edges have no triangles? Yes (for even n). Indeed, for any even n,
we can take the graph

({1, 2, . . . , n} , {ij | i ≢ j mod 2})

(keep in mind that ij means the 2-element set {i, j} here, not the product i · j).
We can also do this for odd n, and obtain a graph with (n² − 1)/4 edges (which
is as close to n²/4 as we can get when n is odd – after all, the number of edges
has to be an integer). So the bound in Mantel’s theorem is optimal (as far as
integers are concerned).
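The extremal graph just described is easy to build and test. The following Python sketch is an illustration only (the names `parity_graph` and `has_triangle` are mine); for each n from 2 to 8, it prints the number of edges, whether the graph has a triangle, and whether adding any single missing edge creates one.

```python
from itertools import combinations

def parity_graph(n):
    """Edges ij on {1, ..., n} joining vertices of different parity."""
    return {frozenset((i, j))
            for i, j in combinations(range(1, n + 1), 2)
            if (i - j) % 2 != 0}

def has_triangle(n, edges):
    """Check whether three of the vertices 1..n are pairwise adjacent."""
    return any(all(frozenset(p) in edges for p in combinations(t, 2))
               for t in combinations(range(1, n + 1), 3))

for n in range(2, 9):
    edges = parity_graph(n)
    missing = [frozenset(p) for p in combinations(range(1, n + 1), 2)
               if frozenset(p) not in edges]
    print(n, len(edges), has_triangle(n, edges),
          all(has_triangle(n, edges | {e}) for e in missing))
```

In integer arithmetic, the edge count is `n * n // 4`, which equals n²/4 for even n and (n² − 1)/4 for odd n.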
The following exercise can be regarded as a “mirror version” of Mantel’s
theorem:

Exercise 2.2. Let G be a simple graph with n vertices and e edges. Assume
that e < n (n − 2) /4. Prove that G has an anti-triangle (i.e., three distinct
vertices that are pairwise non-adjacent).
[Solution: This is Exercise 2 on homework set #1 from my Spring 2017
course; see the course page for solutions.]

Mantel’s theorem can be generalized:

Theorem 2.4.8 (Turan’s theorem). Let r be a positive integer. Let G be a
simple graph with n vertices and e edges. Assume that

$e > \frac{r-1}{r} \cdot \frac{n^2}{2}$.

Then, there exist r + 1 distinct vertices of G that are mutually adjacent.

Mantel’s theorem is the particular case for r = 2. We will see a proof of
Turan’s theorem later (Theorem 7.3.1). Mantel’s and Turan’s theorems are two
of the simplest results of extremal graph theory – the study of how inequalities
between some graph parameters (in our case: the numbers of vertices and
edges) imply the existence of certain substructures (in our case: of a triangle or
of r + 1 mutually adjacent vertices). Deeper introductions to this subject can be
found in [Zhao23, Chapters 1 and 5] and [Jukna11].

Exercise 2.3. Let G = (V, E) be a simple graph. Set n = |V|. Prove that we
can find some edges e1, e2, . . . , ek of G and some triangles t1, t2, . . . , tℓ of G
such that k + ℓ ≤ n²/4 and such that each edge e ∈ E \ {e1, e2, . . . , ek} is a
subset of (at least) one of the triangles t1, t2, . . . , tℓ.
[Remark: In other words, this exercise is claiming that all edges of G can be
covered by at most n²/4 edge-or-triangles. Here, an edge-or-triangle means
either an edge or a triangle of G, and the word “covers” means that each
edge of G is a subset of the chosen edge-or-triangles.]
[Hint: Imitate the above proof of Mantel’s theorem.]

Remark 2.4.9. Exercise 2.3 is a generalization of Mantel’s theorem. Indeed, if
the simple graph G = (V, E) has no triangles, then the number ℓ in Exercise
2.3 must be 0, and thus the edges e1, e2, . . . , ek must be all edges of G, so that
we conclude that |E| = k ≤ k + ℓ ≤ n²/4.

Exercise 2.4. Let G be a simple graph with n vertices and k edges, where
n > 0. Prove that G has at least $\frac{k}{3n} \left(4k - n^2\right)$ triangles.
[Hint: First argue that for any edge vw of G, the total number of triangles
that contain v and w is at least deg v + deg w − n. Then, use the inequality
$n \left(a_1^2 + a_2^2 + \cdots + a_n^2\right) \geq \left(a_1 + a_2 + \cdots + a_n\right)^2$,
which holds for any n real numbers $a_1, a_2, \ldots, a_n$. (This is a particular
case of the Cauchy–Schwarz inequality or the Chebyshev inequality or the
Jensen inequality – pick your favorite!)]

Remark 2.4.10. Exercise 2.4 is known as the Moon–Moser inequality for
triangles. It, too, generalizes Mantel’s theorem: If k > n²/4, then
$\frac{k}{3n} \left(4k - n^2\right) > 0$, and therefore Exercise 2.4 entails that G has
at least one triangle.

Exercise 2.5. Let G = (V, E) be a simple graph.


An edge e = {u, v} of G will be called odd if the number deg u + deg v is
odd.
Prove that the number of odd edges of G is even.
[Hint: There are several solutions. One uses modular arithmetic and (in
particular) the congruence m² ≡ m mod 2 for every integer m. Other solutions
use nothing but common sense.]

Exercise 2.6. Let G = (V, E) be a simple graph. Let S be a subset of V, and
let k = |S|. Prove that

$\sum_{v \in S} \deg v \leq k (k-1) + \sum_{v \in V \setminus S} \min \{\deg v, k\}$.

Remark 2.4.11. Exercise 2.6 has a converse (the so-called Erdös–Gallai theorem):
If $d_1, d_2, \ldots, d_n$ are n nonnegative integers such that $d_1 + d_2 + \cdots + d_n$
is even and such that $d_1 \geq d_2 \geq \cdots \geq d_n$ and such that each k ∈ {1, 2, . . . , n}
satisfies

$\sum_{i=1}^{k} d_i \leq k (k-1) + \sum_{i=k+1}^{n} \min \{d_i, k\}$,

then there exists a simple graph with vertex set {1, 2, . . . , n} whose vertices
have degrees $d_1, d_2, \ldots, d_n$.
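Together with Exercise 2.6 and the handshake lemma, this gives a purely numerical test for whether a list of nonnegative integers is the list of degrees of some simple graph. Here is a minimal Python sketch of this test; the function name `is_graphic` is my own choice, and the sketch merely checks the conditions stated above without constructing a graph.

```python
def is_graphic(degrees):
    """Check the Erdos-Gallai conditions for a list of integers."""
    d = sorted(degrees, reverse=True)  # ensures d[0] >= d[1] >= ...
    n = len(d)
    if any(x < 0 for x in d) or sum(d) % 2 != 0:
        return False
    for k in range(1, n + 1):
        lhs = sum(d[:k])
        rhs = k * (k - 1) + sum(min(x, k) for x in d[k:])
        if lhs > rhs:
            return False
    return True

print(is_graphic([3, 2, 3, 3, 2, 3]))  # degrees of the graph in Example 2.3.3 (b)
print(is_graphic([3, 3, 1, 1]))
print(is_graphic([0, 1, 2, 3]))
```

The last list consists of four distinct degrees on four vertices, so it must be rejected in view of Proposition 2.4.5.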

2.5. Graph isomorphism


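Two graphs can be distinct and yet “the same up to the names of their vertices”:
for instance, the two graphs in the following figure.

[Figure: two graphs on the vertices 1, 2, 3, drawn with the vertices in the orders 1, 2, 3 and 1, 3, 2.]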

Let us formalize this:

Definition 2.5.1. Let G and H be two simple graphs.

(a) A graph isomorphism (or isomorphism) from G to H means a bijection
φ : V (G) → V (H) that “preserves edges”, i.e., that has the following
property: For any two vertices u and v of G, we have

(uv ∈ E (G)) ⇐⇒ (φ (u) φ (v) ∈ E (H)) .

(b) We say that G and H are isomorphic (this is written G ≅ H) if there
exists a graph isomorphism from G to H.

Here are two examples:

• The two graphs

[Figure: two graphs on the vertices 1, 2, 3, drawn with the vertices in the orders 1, 2, 3 and 1, 3, 2.]

are isomorphic, because the bijection between their vertex sets that sends
1, 2, 3 to 1, 3, 2 is an isomorphism. Another isomorphism between the
same two graphs sends 1, 2, 3 to 2, 3, 1.

• The two graphs

[Figure: a graph on the vertices 1, 2, 3, 4, 5, 6 and a graph on the vertices 1, 2, 3, A, B, C.]

are isomorphic, because the bijection between their vertex sets that sends
1, 2, 3, 4, 5, 6 to 1, B, 3, A, 2, C is an isomorphism.
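For small graphs, isomorphisms can be found by simply trying all bijections. The Python sketch below is an illustration under assumptions of my own: the function name `find_isomorphisms` is hypothetical, and the two sample graphs are assumed to be the paths with edges 12, 23 and with edges 13, 32 (which is consistent with the two isomorphisms named in the first example).

```python
from itertools import permutations

def find_isomorphisms(V_G, E_G, V_H, E_H):
    """Return all graph isomorphisms from (V_G, E_G) to (V_H, E_H) as dicts."""
    V_G, V_H = list(V_G), list(V_H)
    E_G = {frozenset(e) for e in E_G}
    E_H = {frozenset(e) for e in E_H}
    if len(V_G) != len(V_H) or len(E_G) != len(E_H):
        return []
    result = []
    for image in permutations(V_H):
        phi = dict(zip(V_G, image))
        # Since phi is a bijection and both graphs have equally many edges,
        # it suffices to check that every edge of G is sent to an edge of H.
        if all(frozenset((phi[u], phi[v])) in E_H
               for u, v in map(tuple, E_G)):
            result.append(phi)
    return result

isos = find_isomorphisms([1, 2, 3], [(1, 2), (2, 3)],
                         [1, 2, 3], [(1, 3), (3, 2)])
print(isos)
```

This approach tries all n! bijections, so it is only practical for very small graphs.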

Here are some basic properties of isomorphisms (the proofs are straightfor-
ward):

Proposition 2.5.2. Let G and H be two graphs. The inverse of a graph iso-
morphism φ from G to H is a graph isomorphism from H to G.

Proposition 2.5.3. Let G, H and I be three graphs. If φ is a graph isomorphism
from G to H, and ψ is a graph isomorphism from H to I, then ψ ◦ φ is
a graph isomorphism from G to I.

As a consequence of these two propositions, it is easy to see that the relation
≅ (on the class of all graphs) is an equivalence relation.
Graph isomorphisms preserve all “intrinsic” properties of a graph. For ex-
ample:

Proposition 2.5.4. Let G and H be two simple graphs, and φ a graph isomor-
phism from G to H. Then:

(a) For every v ∈ V (G), we have deg_G v = deg_H (φ (v)). Here, deg_G v
means the degree of v as a vertex of G, whereas deg_H (φ (v)) means the
degree of φ (v) as a vertex of H.

(b) We have |E ( H )| = |E ( G )|.

(c) We have |V ( H )| = |V ( G )|.

One use of graph isomorphisms is to relabel the vertices of a graph. For


example, we can relabel the vertices of an n-vertex graph as 1, 2, . . . , n, or as
any other n distinct objects:

Proposition 2.5.5. Let G be a simple graph. Let S be a finite set such that
|S| = |V ( G)|. Then, there exists a simple graph H that is isomorphic to G
and has vertex set V ( H ) = S.

Proof. Straightforward.

2.6. Some families of graphs


We will now define some particularly significant families of graphs.

2.6.1. Complete and empty graphs


The simplest families of graphs are the complete graphs and the empty graphs:

Definition 2.6.1. Let V be a finite set.

(a) The complete graph on V means the simple graph (V, P2 (V )). It is
the simple graph with vertex set V in which every two distinct vertices
are adjacent.
If V = {1, 2, . . . , n} for some n ∈ N, then the complete graph on V is
denoted Kn .

(b) The empty graph on V means the simple graph (V, ∅). It is the simple
graph with vertex set V and no edges.

The following pictures show the complete graph and the empty graph on the
set {1, 2, 3, 4, 5}:

[Figure: two panels, “complete graph” and “empty graph”, each drawn on the vertices 1, 2, 3, 4, 5.]

The complete one is called K5 .


Here are the complete graphs K0 , K1 , K2 , K3 , K4 :

[Figure: drawings of K0, K1, K2, K3 and K4.]

Note that a simple graph G is isomorphic to the complete graph Kn if and


only if it has n vertices and is a complete graph (i.e., every two distinct vertices
are adjacent).

Question: Given two finite sets V and W, what are the isomorphisms from
the complete graph on V to the complete graph on W ?
Answer: If |V| ≠ |W|, then there are none. If |V| = |W|, then any bijection
from V to W is an isomorphism. The same holds for empty graphs.

2.6.2. Path and cycle graphs


Next come two families of graphs with fairly simple shapes:
Definition 2.6.2. For each n ∈ N, we define the n-th path graph Pn to be the
simple graph

({1, 2, . . . , n} , {{i, i + 1} | 1 ≤ i < n})


= ({1, 2, . . . , n} , {12, 23, 34, . . . , (n − 1) n}) .

This graph has n vertices and n − 1 edges (unless n = 0, in which case it has
0 edges).

Definition 2.6.3. For each n > 1, we define the n-th cycle graph Cn to be the
simple graph

({1, 2, . . . , n} , {{i, i + 1} | 1 ≤ i < n} ∪ {{n, 1}})


= ({1, 2, . . . , n} , {12, 23, 34, . . . , (n − 1) n, n1}) .

This graph has n vertices and n edges (unless n = 2, in which case it has 1
edge only). (We will later modify the definition of the 2-nd cycle graph C2
somewhat, in order to force it to have 2 edges. But we cannot do this yet,
since a simple graph with 2 vertices cannot have 2 edges.)
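The following pictures show the path graph P5 and the cycle graph C5 :

[Figure: two panels, “path graph” and “cycle graph”, each drawn on the vertices 1, 2, 3, 4, 5.]

Of course, it is more common to draw the path graph stretched out horizontally:

[Figure: the path graph P5 drawn as a horizontal line through the vertices 1, 2, 3, 4, 5.]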

Note that the cycle graph C3 is identical with the complete graph K3 .
Question: What are the graph isomorphisms from Pn to itself?
Answer: One such isomorphism is the identity map id : {1, 2, . . . , n} →
{1, 2, . . . , n}. Another is the “reversal” map

{1, 2, . . . , n} → {1, 2, . . . , n} ,
i ↦ n + 1 − i.

There are no others.


Question: What are the graph isomorphisms from Cn to itself?
Answer: For any k ∈ Z, we can define a “rotation by k vertices”, which is the
map

{1, 2, . . . , n} → {1, 2, . . . , n} ,
i ↦ (i + k reduced modulo n to an element of {1, 2, . . . , n}) .

Thus we get n rotations (one for each k ∈ {1, 2, . . . , n}); all of them are graph
isomorphisms.
There are also the reflections, which are the maps

{1, 2, . . . , n} → {1, 2, . . . , n} ,
i ↦ (k − i reduced modulo n to an element of {1, 2, . . . , n})

for k ∈ Z. There are n of them, too, and they are isomorphisms as well.
Altogether we obtain 2n isomorphisms (for n > 2), and there are no others.
(The group they form is the n-th dihedral group.)
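The counts of isomorphisms from Pn and Cn to themselves can be confirmed by brute force for small n. The following Python sketch is my own illustration (the function names are hypothetical); it tries all permutations of {1, 2, . . . , n} and counts those that map the edge set onto itself.

```python
from itertools import permutations

def path_edges(n):
    """Edge set of the path graph on {1, ..., n}."""
    return {frozenset((i, i + 1)) for i in range(1, n)}

def cycle_edges(n):
    """Edge set of the cycle graph on {1, ..., n} (for n > 1)."""
    return path_edges(n) | {frozenset((n, 1))}

def count_automorphisms(n, edges):
    """Count the isomorphisms from ({1, ..., n}, edges) to itself."""
    count = 0
    for image in permutations(range(1, n + 1)):
        phi = dict(zip(range(1, n + 1), image))
        # phi is an isomorphism if and only if it maps the edge set onto itself.
        if {frozenset(phi[v] for v in e) for e in edges} == edges:
            count += 1
    return count

for n in range(3, 7):
    print(n, count_automorphisms(n, path_edges(n)),
          count_automorphisms(n, cycle_edges(n)))
```

For each n from 3 to 6, the path graph yields 2 and the cycle graph yields 2n, as claimed.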

2.6.3. Kneser graphs


Here is a more exotic family of graphs:

Example 2.6.4. If S is a finite set, and if k ∈ N, then we define the k-th Kneser
graph of S to be the simple graph

K_{S,k} := (P_k (S) , { I J | I, J ∈ P_k (S) and I ∩ J = ∅}) .

The vertices of K_{S,k} are the k-element subsets of S, and two such subsets are
adjacent if they are disjoint.

The graph K_{{1,2,...,5},2} is called the Petersen graph; here is what it looks like:

[Figure: the Petersen graph, with its ten vertices labeled by the 2-element subsets of {1, 2, 3, 4, 5}.]
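The definition of a Kneser graph translates directly into code. Here is a small Python sketch (the function name `kneser_graph` is chosen just for this illustration) that builds the vertex and edge sets and inspects the Petersen graph.

```python
from itertools import combinations

def kneser_graph(S, k):
    """Return (vertices, edges) of the k-th Kneser graph of the finite set S."""
    vertices = [frozenset(c) for c in combinations(sorted(S), k)]
    # Two k-element subsets are adjacent if and only if they are disjoint.
    edges = {frozenset((I, J))
             for I, J in combinations(vertices, 2) if not (I & J)}
    return vertices, edges

vertices, edges = kneser_graph({1, 2, 3, 4, 5}, 2)
degree = {I: sum(I in e for e in edges) for I in vertices}
print(len(vertices), len(edges), set(degree.values()))
```

The Petersen graph thus has 10 vertices and 15 edges, and each of its vertices has degree 3.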

2.7. Subgraphs
Definition 2.7.1. Let G = (V, E) be a simple graph.

(a) A subgraph of G means a simple graph of the form H = (W, F), where
W ⊆ V and F ⊆ E. In other words, a subgraph of G means a simple
graph whose vertices are vertices of G and whose edges are edges of G.

(b) Let S be a subset of V. The induced subgraph of G on the set S denotes
the subgraph
(S, E ∩ P2 (S))
of G. In other words, it denotes the subgraph of G whose vertices
are the elements of S, and whose edges are precisely those edges of G
both of whose endpoints belong to S.

(c) An induced subgraph of G means a subgraph of G that is the induced


subgraph of G on S for some S ⊆ V.

Thus, a subgraph of a graph G is obtained by throwing away some vertices and
some edges of G (in such a way, of course, that no edges remain “dangling”
– i.e., if you throw away a vertex, then you must throw away all edges that
contain this vertex). Such a subgraph is an induced subgraph if no edges are
removed without need – i.e., if you removed only those edges that lost some of
their endpoints. Thus, induced subgraphs can be characterized as follows:

Proposition 2.7.2. Let H be a subgraph of a simple graph G. Then, H is an


induced subgraph of G if and only if each edge uv of G whose endpoints u
and v belong to V ( H ) is an edge of H.

Proof. This is a matter of understanding the definition.

Example 2.7.3. Let n > 1 be an integer.

(a) The path graph Pn is a subgraph of the cycle graph Cn . It is not an


induced subgraph (for n > 2), because it contains the two vertices n
and 1 of Cn but does not contain the edge n1.

(b) The path graph Pn−1 is an induced subgraph of Pn . (Namely, it is the


induced subgraph of Pn on the set {1, 2, . . . , n − 1}.)

(c) Assume that n > 3. Is Cn−1 a subgraph of Cn ? No, because the edge
(n − 1) 1 belongs to Cn−1 but not to Cn .
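The three claims of Example 2.7.3 can be checked mechanically for a specific n. The Python sketch below does so for n = 5; the function names are my own, and graphs are represented as pairs (vertex set, edge set) with edges stored as 2-element frozensets.

```python
def path_graph(n):
    """The path graph P_n as a pair (vertex set, edge set)."""
    return set(range(1, n + 1)), {frozenset((i, i + 1)) for i in range(1, n)}

def cycle_graph(n):
    """The cycle graph C_n (for n > 1) as a pair (vertex set, edge set)."""
    V, E = path_graph(n)
    return V, E | {frozenset((n, 1))}

def is_subgraph(H, G):
    (W, F), (V, E) = H, G
    return W <= V and F <= E

def is_induced_subgraph(H, G):
    (W, F), (V, E) = H, G
    # Proposition 2.7.2: H must contain every edge of G with both endpoints in W.
    return is_subgraph(H, G) and F == {e for e in E if e <= W}

n = 5
print(is_subgraph(path_graph(n), cycle_graph(n)))             # part (a)
print(is_induced_subgraph(path_graph(n), cycle_graph(n)))     # part (a)
print(is_induced_subgraph(path_graph(n - 1), path_graph(n)))  # part (b)
print(is_subgraph(cycle_graph(n - 1), cycle_graph(n)))        # part (c)
```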

The following is easy:

Proposition 2.7.4. Let G be a simple graph, and let H be a subgraph of G.


Assume that H is a complete graph. Then, H is automatically an induced
subgraph of G.

Proof. This follows from Proposition 2.7.2, since the completeness of H means
that each 2-element subset {u, v} of the vertex set of H is an edge of H.
We note that triangles in a graph can be characterized in terms of complete
subgraphs. Namely, a triangle “is” the same as a complete subgraph (or, equiv-
alently, induced complete subgraph) with three vertices:

Remark 2.7.5. Let G be a simple graph. Let u, v, w be three distinct vertices


of G. The following are equivalent:

1. The set {u, v, w} is a triangle of G.

2. The induced subgraph of G on {u, v, w} is isomorphic to K3 .

3. The induced subgraph of G on {u, v, w} is isomorphic to C3 .



Thus, instead of saying “triangle of G”, one often says “a K3 in G” or “a C3 in


G”. Generally, “an H in G” (where H and G are two graphs) means a subgraph
of G that is isomorphic to H. (In the case when H = K3 = C3 , it does not
matter whether we require it to be a subgraph or an induced subgraph, since a
complete subgraph has to be induced automatically.)

Exercise 2.7. Let n be a positive integer. Let S be a simple graph with 2n


vertices. Prove that S has two distinct vertices that have an even number of
common neighbors.

Exercise 2.8. Let n ≥ 2 be an integer. Let G be a simple graph with n vertices.

(a) Describe G if the degrees of the vertices of G are 1, 1, . . . , 1, n − 1.

(b) Let a and b be two positive integers such that a + b = n. Describe G if


the degrees of the vertices of G are 1, 1, . . . , 1, a, b.

Here, to “describe” G means to explicitly determine (with proof) a graph


that is isomorphic to G.

Remark 2.7.6. The situations in Exercise 2.8 are, in a sense, exceptional. Typically,
the degrees of the vertices of a graph do not uniquely determine the
graph up to isomorphism. For example, the two graphs

[Figure: two graphs on the vertices 1, 2, 3, 4, 5, 6, in each of which every vertex has degree 3.]

are not isomorphic⁵, but have the same degrees (namely, each vertex of either
graph has degree 3).

5 The easiest way to see this is to observe that the second graph has a triangle (i.e., three
distinct vertices that are mutually adjacent), while the first graph does not.
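To see this phenomenon on concrete data, here is a Python sketch with two graphs that fit the description in Remark 2.7.6. The specific edge lists are assumptions of mine (the graph from Example 2.4.7 and a "prism" made of two triangles joined by three edges), not necessarily the graphs drawn in the remark; the function names are likewise hypothetical.

```python
from itertools import combinations

def degree_sequence(edges):
    """Sorted list of the degrees of all vertices that appear in the edges."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return sorted(deg.values())

def has_triangle(edges):
    """Check whether some three vertices are pairwise adjacent."""
    E = {frozenset(e) for e in edges}
    V = sorted({v for e in E for v in e})
    return any(all(frozenset(p) in E for p in combinations(t, 2))
               for t in combinations(V, 3))

# Assumed examples: hexagon with long diagonals, and a prism.
first = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (1, 4), (2, 5), (3, 6)]
second = [(1, 2), (2, 3), (3, 1), (4, 5), (5, 6), (6, 4), (1, 4), (2, 5), (3, 6)]

print(degree_sequence(first), degree_sequence(second))
print(has_triangle(first), has_triangle(second))
```

The degree sequences agree, but only the second graph has a triangle, so the two graphs cannot be isomorphic.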

2.8. Disjoint unions


Another way of constructing new graphs from old is the disjoint union. The
idea is simple: Taking the disjoint union G1 ⊔ G2 ⊔ · · · ⊔ Gk of several simple
graphs G1 , G2 , . . . , Gk means putting the graphs alongside each other and treat-
ing the result as one big graph. To make this formally watertight, we have to
relabel each vertex v of each graph Gi as the pair (i, v), so that vertices coming
from different graphs appear as different even if they were equal. For example,
the disjoint union C3 ⊔ C4 of the two cycle graphs C3 and C4 should not be

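[Figure: the drawings of C3 and C4 placed side by side, each keeping its original vertex labels 1, 2, 3 and 1, 2, 3, 4]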

(which makes no sense, because there are two points labelled 1 in this picture,
but a graph can have only one vertex 1), but rather should be

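[Figure: the same drawing, with the vertices of C3 relabelled (1, 1), (1, 2), (1, 3) and the vertices of C4 relabelled (2, 1), (2, 2), (2, 3), (2, 4)].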
So here is the formal definition:
Definition 2.8.1. Let G1 , G2 , . . . , Gk be simple graphs, where Gi = (Vi , Ei ) for
each i ∈ {1, 2, . . . , k}. The disjoint union of these k graphs G1 , G2 , . . . , Gk is
defined to be the simple graph (V, E), where

V = {(i, v) | i ∈ {1, 2, . . . , k} and v ∈ Vi } and


E = {{(i, v1 ) , (i, v2 )} | i ∈ {1, 2, . . . , k} and {v1 , v2 } ∈ Ei } .

This disjoint union is denoted by G1 ⊔ G2 ⊔ · · · ⊔ Gk .


Note: If G and H are two graphs, then the two graphs G ⊔ H and H ⊔ G are
isomorphic, but not the same graph (unless G = H). For example, C3 ⊔ C4 has
a vertex (2, 4), but C4 ⊔ C3 does not.
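
Definition 2.8.1 can be tried out on a computer. The following is a minimal Python sketch, not part of the course material: the representation of a simple graph as a pair (V, E) of Python sets (with each edge a 2-element frozenset) and the helper names disjoint_union and cycle_graph are assumptions made for this illustration.

```python
def disjoint_union(*graphs):
    """Disjoint union of simple graphs, each given as a pair (V, E).

    Each E is a set of 2-element frozensets of vertices from V.
    As in Definition 2.8.1, the vertex v of the i-th graph becomes the
    pair (i, v), where i is counted from 1.
    """
    V = set()
    E = set()
    for i, (Vi, Ei) in enumerate(graphs, start=1):
        V |= {(i, v) for v in Vi}
        E |= {frozenset((i, v) for v in e) for e in Ei}
    return V, E


def cycle_graph(n):
    """Cycle graph C_n on the vertices 1, 2, ..., n (assumes n >= 3)."""
    V = set(range(1, n + 1))
    E = {frozenset({i, i % n + 1}) for i in V}
    return V, E


C3, C4 = cycle_graph(3), cycle_graph(4)
V34, E34 = disjoint_union(C3, C4)   # C3 ⊔ C4
V43, E43 = disjoint_union(C4, C3)   # C4 ⊔ C3

print((2, 4) in V34, (2, 4) in V43)
```

The final line checks the observation from the Note above: the pair (2, 4) is a vertex of C3 ⊔ C4 but not of C4 ⊔ C3.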

2.9. Walks and paths


We now come to the definitions of walks and paths – two of the most funda-
mental features that graphs can have. In particular, Euler’s 1736 paper, where
graphs were first studied, is about certain kinds of walks.

2.9.1. Definitions
Imagine a graph as a road network, where each vertex is a town and each edge
is a (bidirectional) road. By successively walking along several edges, you can
often get from a town to another even if they are not adjacent. This is made
formal in the concept of a “walk”:
Definition 2.9.1. Let G be a simple graph. Then:

(a) A walk (in G) means a finite sequence (v0 , v1 , . . . , vk ) of vertices of G


(with k ≥ 0) such that all of v0 v1 , v1 v2 , v2 v3 , . . . , vk−1 vk are edges of
G. (The latter condition is vacuously true if k = 0.)

(b) If w = (v0 , v1 , . . . , vk ) is a walk in G, then:


• The vertices of w are defined to be v0 , v1 , . . . , vk .
• The edges of w are defined to be v0 v1 , v1 v2 , v2 v3 , . . . , vk−1 vk .
• The nonnegative integer k is called the length of w. (This is the
number of all edges of w, counted with multiplicity. It is 1 smaller
than the number of all vertices of w, counted with multiplicity.)
• The vertex v0 is called the starting point of w. We say that w starts
(or begins) at v0 .
• The vertex vk is called the ending point of w. We say that w ends
at vk .

(c) A path (in G) means a walk (in G) whose vertices are distinct. In other
words, a path means a walk (v0 , v1 , . . . , vk ) such that v0 , v1 , . . . , vk are
distinct.

(d) Let p and q be two vertices of G. A walk from p to q means a walk that
starts at p and ends at q. A path from p to q means a path that starts at
p and ends at q.

(e) We often say “walk of G” and “path of G” instead of “walk in G” and


“path in G”, respectively.

Example 2.9.2. Let G be the graph

({1, 2, 3, 4, 5, 6} , {12, 23, 34, 45, 56, 61, 13}) .



This graph looks as follows:

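[Figure: a hexagon with the vertices 1, 2, 3, 4, 5, 6 in cyclic order, together with the diagonal joining 1 and 3]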

Then:

• The sequence (1, 3, 4, 5, 6, 1, 3, 2) of vertices of G is a walk in G. This


walk is a walk from 1 to 2. It is not a path. The length of this walk is 7.

• The sequence (1, 2, 4, 3) of vertices of G is not a walk, since 24 is not an


edge of G. Hence, it is not a path either.

• The sequence (1, 3, 2, 1) is a walk from 1 to 1. It has length 3. It is not a


path.

• The sequence (1, 2, 1) is a walk from 1 to 1. It has length 2. It is not a


path.

• The sequence (5) is a walk from 5 to 5. It has length 0. It is a path.


More generally, each vertex v of G produces a length-0 path (v).

• The sequence (5, 4) is a walk from 5 to 4. It has length 1. It is a path.


More generally, each edge uv of G produces a length-1 path (u, v).
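
The claims of Example 2.9.2 can be checked mechanically. Below is a small Python sketch under the assumption that the graph is stored as a vertex set G_V and a set G_E of 2-element frozensets; the names is_walk and is_path are chosen for this illustration and do not come from the notes.

```python
G_V = {1, 2, 3, 4, 5, 6}
G_E = {frozenset(e) for e in
       [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (1, 3)]}


def is_walk(V, E, seq):
    """True if seq is a nonempty sequence of vertices in V whose
    consecutive entries are joined by edges in E (Definition 2.9.1 (a))."""
    if len(seq) == 0 or any(v not in V for v in seq):
        return False
    # frozenset({a, a}) has one element, so it is never an edge.
    return all(frozenset({a, b}) in E for a, b in zip(seq, seq[1:]))


def is_path(V, E, seq):
    """True if seq is a walk with pairwise distinct vertices."""
    return is_walk(V, E, seq) and len(set(seq)) == len(seq)


for seq in [(1, 3, 4, 5, 6, 1, 3, 2), (1, 2, 4, 3), (1, 2, 1), (5,), (5, 4)]:
    print(seq, is_walk(G_V, G_E, seq), is_path(G_V, G_E, seq))
```

The length of a walk stored this way is len(seq) - 1.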

Intuitively, we can think of walks and paths as follows:

• A walk of a graph is a way of walking from one vertex to another (or to


the same vertex) by following a sequence of edges.

• A path is a walk whose vertices are distinct (i.e., each vertex appears at
most once in the walk).

Exercise 2.9. Let G be a simple graph. Let w be a path in G. Prove that the
edges of w are distinct. (This may look obvious when you can point to a
picture; but we ask you to give a rigorous proof!)
[Solution: This is Exercise 3 on homework set #1 from my Spring 2017
course; see the course page for solutions.]

2.9.2. Composing/concatenating and reversing walks


Here are some simple things we can do with walks and paths.
First, we can “splice” two walks together if the ending point of the first is the
starting point of the second:
Proposition 2.9.3. Let G be a simple graph. Let u, v and w be three vertices
of G. Let a = (a0 , a1 , . . . , ak ) be a walk from u to v. Let b = (b0 , b1 , . . . , bℓ ) be
a walk from v to w. Then,

( a0 , a1 , . . . , ak , b1 , b2 , . . . , bℓ ) = (a0 , a1 , . . . , ak−1 , b0 , b1 , . . . , bℓ )
= (a0 , a1 , . . . , ak−1 , v, b1 , b2 , . . . , bℓ )

is a walk from u to w. This walk shall be denoted a ∗ b.


Proof. Intuitively clear and straightforward to verify.
Proposition 2.9.4. Let G be a simple graph. Let u and v be two vertices of G.
Let a = ( a0 , a1 , . . . , ak ) be a walk from u to v. Then:

(a) The list ( ak , ak−1 , . . . , a0 ) is a walk from v to u. We denote this walk by


rev a and call it the reversal of a.

(b) If a is a path, then rev a is a path again.

Proof. Intuitively clear and straightforward to verify.
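
For readers who like to experiment, the two operations of Propositions 2.9.3 and 2.9.4 are easy to express in code. This is a sketch under the assumption that walks are stored as Python tuples of vertices; the function names concat and rev are illustrative choices.

```python
def concat(a, b):
    """The walk a * b of Proposition 2.9.3.

    Requires the ending point of a to equal the starting point of b;
    that shared vertex is kept only once.
    """
    if a[-1] != b[0]:
        raise ValueError("a must end where b starts")
    return a + b[1:]


def rev(a):
    """The reversal of the walk a (Proposition 2.9.4)."""
    return a[::-1]


a = (1, 3, 4)
b = (4, 5, 6)
print(concat(a, b), rev(a))
```

Note that the length of a ∗ b is the sum of the lengths of a and b, since the shared vertex is not repeated.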

2.9.3. Reducing walks to paths


A path is just a walk without repeated vertices. If you have a walk, you can
turn it into a path by removing “loops” (or “digressions”):
Proposition 2.9.5. Let G be a simple graph. Let u and v be two vertices of G.
Let a = (a0 , a1 , . . . , ak ) be a walk from u to v. Assume that a is not a path.
Then, there exists a walk from u to v whose length is smaller than k.

Proof. Since a is not a path, two of its vertices are equal. In other words, there
exist i < j such that ai = aj . Consider these i and j. Now, consider the tuple

(a0 , a1 , . . . , ai , aj+1 , aj+2 , . . . , ak ) ,

which consists of the first i + 1 vertices of a followed by the last k − j vertices of a
(this is just a with the part between ai and aj cut out). This tuple is a walk from u
to v, and its length is i + (k − j) < j + (k − j) = k (since i < j). So we have found a walk
from u to v whose length is smaller than k. This proves the proposition.

Example 2.9.6. Consider the walk (1, 3, 4, 5, 6, 1, 3, 2) from Example 2.9.2.


Then, Proposition 2.9.5 tells us that there is a walk from 1 to 2 that has
smaller length. You can find this walk by removing the part between the two
3’s. You get the walk (1, 3, 2). This is actually a path.

Corollary 2.9.7 (When there is a walk, there is a path). Let G be a simple


graph. Let u and v be two vertices of G. Assume that there is a walk from u
to v of length k for some k ∈ N. Then, there is a path from u to v of length
≤ k.

Proof. Proposition 2.9.5 says that if there is a walk from u to v that is not a path,
then there is a walk from u to v having shorter length. Apply this repeatedly,
until you get a path. (You will eventually get a path, because the length cannot
keep decreasing forever.)
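
The proofs of Proposition 2.9.5 and Corollary 2.9.7 are constructive, and can be sketched in Python as follows. Walks are assumed to be stored as tuples of vertices, and the names shorten_once and walk_to_path are made up for this illustration. The sketch cuts at the first repeated vertex it encounters, which is one valid choice of i and j among possibly several.

```python
def shorten_once(walk):
    """One step of Proposition 2.9.5.

    If walk[i] == walk[j] for some i < j, cut out the part between
    them and return the shorter walk. Return None if walk is a path.
    """
    first_seen = {}
    for j, v in enumerate(walk):
        if v in first_seen:
            i = first_seen[v]
            return walk[:i + 1] + walk[j + 1:]
        first_seen[v] = j
    return None


def walk_to_path(walk):
    """Apply shorten_once repeatedly until a path remains (Corollary 2.9.7)."""
    while True:
        shorter = shorten_once(walk)
        if shorter is None:
            return walk
        walk = shorter


print(walk_to_path((1, 3, 4, 5, 6, 1, 3, 2)))
```

The loop terminates because each step strictly decreases the length, exactly as in the proof of the corollary.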

2.9.4. Remark on algorithms


We take a little break from proving structural theorems in order to address
some important computational questions. As always in these notes, we will
only scratch the surface and content ourselves with simple but not quite optimal
algorithms.
Given a simple graph G and two vertices u and v of G, we can ask ourselves
the following questions:

Question 1: Does G have a walk from u to v ?


Question 2: Does G have a path from u to v ?
Question 3: Find a shortest path from u to v (that is, a path from u
to v having the smallest possible length), or determine that no such
path exists.
Question 4: Given a number k ∈ N, find a walk from u to v having
length k, or determine that no such walk exists.
Question 5: Given a number k ∈ N, find a path from u to v having
length k, or determine that no such path exists.

Corollary 2.9.7 reveals that Questions 1 and 2 are equivalent (indeed, the
existence of a walk from u to v entails the existence of a path from u to v
by Corollary 2.9.7, whereas the converse is obvious). Question 3 is clearly a
stronger version of Question 2 (in the sense that any answer to Question 3 will
automatically answer Question 2 as well).
With a bit more thought, it is easily seen that Question 4 is a stronger version
of Question 3. Indeed, Corollary 2.9.7 shows that a shortest walk from u to v (if
it exists) must also be a shortest path from u to v. However, any path from u to
v must have length ≤ n − 1, where n is the number of vertices of G (since a path
of length k has k + 1 distinct vertices, but G has only n vertices to spare). Hence,
if there is no walk of length ≤ n − 1 from u to v, then there is no path from u to
v whatsoever. Thus, if we answer Question 4 for all values k ∈ {0, 1, . . . , n − 1},
then we obtain either a shortest path from u to v (by taking the smallest k for
which the answer is positive, and then picking the resulting walk, which must
be a shortest path by what we previously said), or proof positive that no path
from u to v exists (if the answer for each k ∈ {0, 1, . . . , n − 1} is negative).
Thus, answering Question 4 will yield answers to Questions 1, 2 and 3.
Let us now outline a way how Question 4 can be answered using a recursive
algorithm. Specifically, we recurse on k. The base case (k = 0) is easy: A walk
from u to v having length 0 exists if u = v and does not exist otherwise. The
interesting part is the recursion step: Assume that the integer k is positive, and
that we already know how to answer Question 4 for k − 1 instead of k. Now,
let us answer it for k. To do so, we observe that any walk from u to v having
length k must have the form (u, . . . , w, v), where the penultimate vertex w is
some neighbor of v. Moreover, if we remove the last vertex v from our walk
(u, . . . , w, v), then we obtain a walk (u, . . . , w) of length k − 1. Hence, we can
find a walk from u to v having length k as follows:

• We make a list of all neighbors of v. We go through this list in some


arbitrary order.
• For each neighbor w in this list, we try to find a walk from u to w having
length k − 1 (this is a matter of answering Question 4 for k − 1 instead of
k, so we supposedly already know how to do this). If such a walk exists,
then we simply insert v at its end, and thus obtain a walk from u to v
having length k. Thus we obtain a positive answer to our question.
• If we have gone through our whole list of neighbors of v without finding
a walk from u to v having length k, then no such walk exists, and thus we
have found a negative answer.
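
The recursion just described can be sketched in Python as follows. This is only an illustration under stated assumptions: the graph is given as a dictionary adj mapping each vertex to a tuple of its neighbors, the names find_walk and shortest_path are invented here, and the caching via functools.lru_cache is an addition of this sketch (it ensures that each pair of a vertex and a length is handled only once).

```python
from functools import lru_cache


def find_walk(adj, u, v, k):
    """Return a walk from u to v of length exactly k (as a tuple),
    or None if no such walk exists."""

    @lru_cache(maxsize=None)
    def walk_to(target, length):
        # Base case: the only walk of length 0 starting at u is (u).
        if length == 0:
            return (u,) if target == u else None
        # Recursion step: the penultimate vertex w is a neighbor of target.
        for w in adj[target]:
            shorter = walk_to(w, length - 1)
            if shorter is not None:
                return shorter + (target,)
        return None

    return walk_to(v, k)


def shortest_path(adj, u, v):
    """Answer Question 3 by answering Question 4 for k = 0, 1, ..., n - 1."""
    for k in range(len(adj)):
        walk = find_walk(adj, u, v, k)
        if walk is not None:
            return walk  # a shortest walk is a shortest path
    return None


# The graph from Example 2.9.2, as neighbor lists.
adj = {1: (2, 3, 6), 2: (1, 3), 3: (1, 2, 4), 4: (3, 5), 5: (4, 6), 6: (1, 5)}
print(find_walk(adj, 1, 2, 7), shortest_path(adj, 1, 5))
```

With the cache, the number of recursive calls that do real work is at most n · (k + 1), where n is the number of vertices; without it, the same subproblems may be solved over and over.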

This recursive algorithm answers Question 4, and is fast enough to be practically
viable if implemented well (in particular, the answer for each vertex w and each
length should be computed only once and then reused). (In the language of
complexity theory, it is a polynomial time algorithm⁶.) Much more efficient
algorithms exist, however.
In applications, a generalized version of Question 3 often appears, asking for
a path that is shortest not in the sense of smallest length, but in the sense of
smallest “weighted length” (i.e., different edges contribute differently to this
“length”). This generalized question is one of the most fundamental algo-
rithmic problems in computer science, known as the shortest path problem,
and various algorithms can be found on its Wikipedia page and in algorithm-
focussed texts such as [Griffi21, §3.5], [KelTro17, §12.3], [Schrij17, Chapter 1] or
(for a royal treatment) [Schrij03, Chapters 6–8].
⁶ To be specific: Its running time can be bounded by a polynomial in n and k, where n is the
number of vertices of G.

Question 5 looks superficially similar to Question 4, yet it differs in the most


important way: There is no efficient algorithm known for answering it! In the
language of complexity theory, it is an NP-hard problem, which means that
a polynomial-time algorithm for it is not expected to exist (although this is
the kind of negative that appears near-impossible to prove at the current stage
of the discipline). It is still technically a finite problem (there are only finitely
many possible paths in G, and thus one can theoretically try them all), and there
is even a polynomial-time algorithm for any fixed value of k (again, a trivial one:
check all the n^(k+1) possible (k + 1)-tuples of vertices of G for whether they are
paths from u to v), but the complexity of this algorithm grows exponentially in
k, which makes it useless in practice.
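
To make the trivial algorithm for Question 5 concrete, here is a hedged Python sketch. It assumes the graph is given as a dictionary adj of neighbor tuples, and the name find_path_of_length is invented for this illustration. Instead of all n^(k+1) tuples, it only enumerates tuples with distinct entries (via itertools.permutations), which is a mild pruning but does not change the exponential growth in k.

```python
from itertools import permutations


def find_path_of_length(adj, u, v, k):
    """Brute force: return a path from u to v of length exactly k,
    or None if there is none."""
    for cand in permutations(adj, k + 1):  # (k+1)-tuples of distinct vertices
        if cand[0] != u or cand[-1] != v:
            continue
        if all(b in adj[a] for a, b in zip(cand, cand[1:])):
            return cand
    return None


# The graph from Example 2.9.2, as neighbor lists.
adj = {1: (2, 3, 6), 2: (1, 3), 3: (1, 2, 4), 4: (3, 5), 5: (4, 6), 6: (1, 5)}
for k in range(6):
    print(k, find_path_of_length(adj, 1, 2, k))
```

In this graph there is no path of length 3 from 1 to 2, although (1, 2, 1, 2) is a walk of length 3 from 1 to 2; this illustrates how Questions 4 and 5 can have different answers.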

2.9.5. The equivalence relation “path-connected”


We can use the concepts of walks and paths to define a certain equivalence
relation on the vertex set V (G ) of any graph G:
Definition 2.9.8. Let G be a simple graph. We define a binary relation ≃ G on
the set V ( G ) as follows: For two vertices u and v of G, we shall have u ≃ G v
if and only if there exists a walk from u to v in G.
This binary relation ≃ G is called “path-connectedness” or just
“connectedness”. When two vertices u and v satisfy u ≃ G v, we say that
“u and v are path-connected”.

Proposition 2.9.9. Let G be a simple graph. Then, the relation ≃ G is an


equivalence relation.
Proof. We need to show that ≃ G is symmetric, reflexive and transitive.
• Symmetry: If u ≃ G v, then v ≃ G u, because we can take a walk from u to
v and reverse it.
• Reflexivity: We always have u ≃ G u, since the trivial walk (u) is a walk
from u to u.
• Transitivity: If u ≃ G v and v ≃ G w, then u ≃ G w, because (as we know
from Proposition 2.9.3) we can take a walk a from u to v and a walk b from
v to w and combine them to form the walk a ∗ b defined in Proposition
2.9.3.

Proposition 2.9.10. Let G be a simple graph. Let u and v be two vertices of


G. Then, u ≃ G v if and only if there exists a path from u to v.
Proof. ⇐=: Clear, since any path is a walk.
=⇒: This is just saying that if there is a walk from u to v, then there is a path
from u to v. But this follows from Corollary 2.9.7.

2.9.6. Connected components and connectedness


The equivalence relation ≃ G introduced in Definition 2.9.8 allows us to define
two important concepts:

Definition 2.9.11. Let G be a simple graph. The equivalence classes of the


equivalence relation ≃ G are called the connected components (or, for short,
components) of G.

Definition 2.9.12. Let G be a simple graph. We say that G is connected if G


has exactly one component.

Thus, a simple graph G is connected if and only if it has at least one compo-
nent (i.e., it has at least one vertex) and it has at most one component (i.e., each
two of its vertices are path-connected).

Example 2.9.13. Let G be the graph with vertex set {1, 2, . . . , 9} and such
that two vertices i and j are adjacent if and only if |i − j| = 3. What are the
components of G ?
The graph G looks like this:

[Figure: drawing of G on the vertices 1, 2, . . . , 9, with several edges crossing each other].

This looks like a jumbled mess, so you might think that all vertices are mu-
tually path-connected. But this is not the case, because edges that cross in
a drawing do not necessarily have endpoints in common. Walks can only
move from one edge to another at a common endpoint. Thus, there are
far fewer walks than the picture might suggest. We have 1 ≃ G 4 ≃ G 7
and 2 ≃ G 5 ≃ G 8 and 3 ≃ G 6 ≃ G 9, but there are no further ≃ G -relations. In
fact, two vertices of G are adjacent only if they are congruent modulo 3 (as
numbers), and therefore you cannot move from one modulo-3 congruence
class to another by walking along edges of G. So the components of G are
{1, 4, 7} and {2, 5, 8} and {3, 6, 9}. The graph G is not connected.

Example 2.9.14. Let G be the graph with vertex set {1, 2, . . . , 9} and such that
two vertices i and j are adjacent if and only if |i − j| = 6. This graph looks
like this:

[Figure: drawing of G on the vertices 1, 2, . . . , 9].
What are the components of G ? They are {1, 7} and {2, 8} and {3, 9} and
{4} and {5} and {6}. Note that three of these six components are singleton
sets. The graph G is not connected.

Example 2.9.15. Let G be the graph with vertex set {1, 2, . . . , 9} and such that
two vertices i and j are adjacent if and only if |i − j| = 3 or |i − j| = 4. This
graph looks like this:

[Figure: drawing of G on the vertices 1, 2, . . . , 9].

We can take a long walk through G:

(1, 4, 7, 3, 6, 9, 5, 2, 5, 8) .

This walk traverses every vertex of G; thus, any two vertices of G are path-
connected. Hence, G has only one component, namely {1, 2, . . . , 9}. Thus, G
is connected.

Example 2.9.16. The complete graph on a nonempty set is connected. The


complete graph on the empty set is not connected, since it has 0 (not 1)
components.

Example 2.9.17. The empty graph on a finite set V has |V | many components
(those are the singleton sets {v} for v ∈ V). Thus, it is connected if and only
if |V | = 1.
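
The components in Examples 2.9.13 to 2.9.17 can be recomputed with a few lines of Python. The sketch below assumes that a graph is stored as a dictionary mapping each vertex to a list of its neighbors; the helper names difference_graph, components and is_connected are illustrative and not taken from the notes. Components are found by a standard graph search: starting from a vertex, keep adding neighbors until nothing new appears.

```python
def difference_graph(n, diffs):
    """Graph on 1, ..., n in which i and j are adjacent iff |i - j| is in diffs."""
    return {i: [j for j in range(1, n + 1) if abs(i - j) in diffs]
            for i in range(1, n + 1)}


def components(adj):
    """List of the components (as sets of vertices) of the graph adj."""
    seen = set()
    comps = []
    for s in adj:
        if s in seen:
            continue
        comp = {s}
        stack = [s]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in comp:
                    comp.add(y)
                    stack.append(y)
        seen |= comp
        comps.append(comp)
    return comps


def is_connected(adj):
    """A graph is connected iff it has exactly one component."""
    return len(components(adj)) == 1


print(components(difference_graph(9, {3})))
print(components(difference_graph(9, {6})))
print(is_connected(difference_graph(9, {3, 4})))
```

Note that is_connected({}) is False, in line with Example 2.9.16: a graph with no vertices has 0 components.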

Exercise 2.10. Let k ∈ N. Let S be a finite set.


Recall that the Kneser graph KS,k is the simple graph whose vertices are
the k-element subsets of S, and whose edges are the unordered pairs { A, B}
consisting of two such subsets A and B that satisfy A ∩ B = ∅.
Prove that this Kneser graph KS,k is connected if |S| ≥ 2k + 1.
[Remark: Can the “if” here be replaced by an “if and only if”? Not quite,
because the graph KS,k is also connected if |S| = 2 and k = 1 (in which case
it has two vertices and one edge), or if |S| = k (in which case it has only one
vertex), or if k = 0 (in which case it has only one vertex). But these are the
only “exceptions”.]

2.9.7. Induced subgraphs on components


The following is not hard to see:

Proposition 2.9.18. Let G be a simple graph. Let C be a component of G.


Then, the induced subgraph of G on the set C is connected.

Proof. Let G [C] be this induced subgraph. We need to show that G [C] is con-
nected. In other words, we need to show that G [ C] has exactly 1 component.
Clearly, G [ C] has at least one vertex (since C is a component, i.e., an equiv-
alence class of ≃ G , but equivalence classes are always nonempty), thus has at
least 1 component. So we only need to show that G [C] has no more than 1
component. In other words, we need to show that any two vertices of G [ C] are
path-connected in G [C].
So let u and v be two vertices of G [C]. Then, u, v ∈ C, and therefore u ≃ G
v (since C is a component of G). In other words, there exists a walk w =
(w0 , w1 , . . . , wk ) from u to v in G. We shall now prove that this walk w is
actually a walk of G [C]. In other words, we shall prove that all vertices of w
belong to C.
But this is easy: If wi is a vertex of w, then (w0 , w1 , . . . , wi ) is a walk from
u to wi in G, and therefore we have u ≃ G wi , so that wi belongs to the same
component of G as u; but that component is C. Thus, we have shown that
each vertex wi of w belongs to C. Therefore, w is a walk of the graph G [C].
Consequently, it shows that u ≃ G [C ] v.
We have now proved that u ≃ G [C ] v for any two vertices u and v of G [C].
Hence, the relation ≃ G [C ] has no more than 1 equivalence class. In other words,
the graph G [C] has no more than 1 component. This completes our proof.
In the following proposition, we are using the notation G [S] for the induced
subgraph of a simple graph G on a subset S of its vertex set.

Proposition 2.9.19. Let G be a simple graph. Let C1 , C2 , . . . , Ck be all components
of G (listed without repetition).
Then, G is isomorphic to the disjoint union G [C1 ] ⊔ G [C2 ] ⊔ · · · ⊔ G [Ck ].

Proof. Consider the bijection from V ( G [C1 ] ⊔ G [C2 ] ⊔ · · · ⊔ G [Ck ]) to V (G ) that


sends each vertex (i, v) of G [ C1 ] ⊔ G [C2 ] ⊔ · · · ⊔ G [Ck ] to the vertex v of G. We
claim that this bijection is a graph isomorphism. In order to prove this, we
need to check that there are no edges of G that join vertices in different com-
ponents. But this is easy: If two vertices in different components of G were
adjacent, then they would be path-connected, and thus would actually belong
to the same component.
The upshot of these results is that every simple graph can be decomposed
into a disjoint union of its components (or, more precisely, of the induced sub-
graphs on its components). Each of these components is a connected graph.
Moreover, this is easily seen to be the only way to decompose the graph into a
disjoint union of connected graphs.
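
The decomposition can be observed on a concrete graph. The following Python sketch uses the graph of Example 2.9.13 and assumes a simple graph is stored as a pair (V, E) with E a set of 2-element frozensets; the function names components and induced_subgraph are chosen for this illustration. It checks the key point of the proof of Proposition 2.9.19 (every edge of G lies inside a single component) and the statement of Proposition 2.9.18 (each induced subgraph G [C] is connected).

```python
def components(V, E):
    """Components of the simple graph (V, E), as a list of frozensets."""
    adj = {v: set() for v in V}
    for e in E:
        a, b = e
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for s in V:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            x = stack.pop()
            for y in adj[x] - comp:
                comp.add(y)
                stack.append(y)
        seen |= comp
        comps.append(frozenset(comp))
    return comps


def induced_subgraph(V, E, S):
    """The induced subgraph G[S]: keep the edges with both endpoints in S."""
    return set(S), {e for e in E if e <= S}


V = set(range(1, 10))
E = {frozenset({i, i + 3}) for i in range(1, 7)}  # |i - j| = 3
parts = [induced_subgraph(V, E, C) for C in components(V, E)]

all_vertices = set().union(*(Vi for Vi, _ in parts))
all_edges = set().union(*(Ei for _, Ei in parts))
print(all_vertices == V, all_edges == E)
```

If some edge of G joined two different components, it would be missing from all_edges, and the second comparison would fail.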

2.9.8. Some exercises on connectedness

Exercise 2.11. Let G be a simple graph with V (G ) ≠ ∅. Show that the
following two statements are equivalent:

• Statement 1: The graph G is connected.

• Statement 2: For every two nonempty subsets A and B of V ( G ) satisfy-


ing A ∩ B = ∅ and A ∪ B = V ( G ), there exist a ∈ A and b ∈ B such
that ab ∈ E ( G ). (In other words: Whenever we subdivide the vertex set
V ( G ) of G into two nonempty subsets, there will be at least one edge
of G connecting a vertex in one subset to a vertex in another.)

[Solution: This is Exercise 7 on homework set #1 from my Spring 2017


course; see the course page for solutions.]

Exercise 2.12. Let V be a nonempty finite set. Let G and H be two simple
graphs such that V ( G ) = V ( H ) = V. Assume that for each u ∈ V and
v ∈ V, there exists a path from u to v in G or a path from u to v in H. Prove
that at least one of the graphs G and H is connected.
[Solution: This is Exercise 8 on homework set #1 from my Spring 2017
course; see the course page for solutions.]

Exercise 2.13. Let G = (V, E) be a simple graph. The complement graph G̅
of G is defined to be the simple graph (V, P2 (V ) \ E). (Thus, two distinct
vertices u and v in V are adjacent in G̅ if and only if they are not adjacent in
G.)
Prove that at least one of the following two statements holds:

• Statement 1: For each u ∈ V and v ∈ V, there exists a path from u to v
in G of length ≤ 3.

• Statement 2: For each u ∈ V and v ∈ V, there exists a path from u to v
in G̅ of length ≤ 2.

[Solution: This is Exercise 9 on homework set #1 from my Spring 2017


course; see the course page for solutions.]

Exercise 2.14. Let n ≥ 2 be an integer. Let G be a connected simple graph


with n vertices.

(a) Describe G if the degrees of the vertices of G are 1, 1, 2, 2, . . . , 2 (exactly


two 1’s and n − 2 many 2’s).

(b) Describe G if the degrees of the vertices of G are 1, 1, . . . , 1, n − 1.

(c) Describe G if the degrees of the vertices of G are 2, 2, . . . , 2.

Here, to “describe” G means to explicitly determine (with proof) a graph


that is isomorphic to G.

The following exercise is not explicitly concerned with connectedness and com-
ponents, but it might help to think about components to solve it (although there
are solutions that do not use them):

Exercise 2.15. Let G be a simple graph with n vertices. Assume that each
vertex of G has at least one neighbor.
A matching of G shall mean a set F of edges of G such that no two edges
in F have a vertex in common. Let m be the largest size of a matching of G.
An edge cover of G shall mean a set F of edges of G such that each vertex
of G is contained in at least one edge e ∈ F. Let c be the smallest size of an
edge cover of G.
Prove that c + m = n.

Remark 2.9.20. Let G be the cycle graph C5 shown in Example 2.13.2.
Then, {12, 34} is a matching of G of largest possible size (why?), whereas
{12, 34, 45} is an edge cover of G of smallest possible size (why?). Thus,
Exercise 2.15 says that 3 + 2 = 5 here, which is indeed true.
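
For small graphs, the numbers m and c from Exercise 2.15 can be found by exhaustive search. The following Python sketch does this for the cycle graph C5 (with edges 12, 23, 34, 45, 51); the function names and the representation of edges as pairs are assumptions of this illustration, and the search is exponential in the number of edges, so it is only meant for tiny examples.

```python
from itertools import combinations


def max_matching_size(E):
    """Largest size of a set of edges no two of which share a vertex."""
    edges = list(E)
    for r in range(len(edges), -1, -1):
        for F in combinations(edges, r):
            endpoints = [v for e in F for v in e]
            if len(endpoints) == len(set(endpoints)):
                return r


def min_edge_cover_size(V, E):
    """Smallest size of a set of edges that touches every vertex,
    or None if some vertex has no neighbor."""
    edges = list(E)
    for r in range(len(edges) + 1):
        for F in combinations(edges, r):
            if set().union(*F) == set(V):
                return r
    return None


V = {1, 2, 3, 4, 5}
E = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]  # the cycle graph C5
m = max_matching_size(E)
c = min_edge_cover_size(V, E)
print(m, c, c + m == len(V))
```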

2.10. Closed walks and cycles


Here are two further kinds of walks:

Definition 2.10.1. Let G be a simple graph.

(a) A closed walk of G means a walk whose first vertex is identical with
its last vertex. In other words, it means a walk (w0 , w1 , . . . , wk ) with
w0 = wk . Sometimes, closed walks are also known as circuits (but
many authors use this latter word for something slightly different).

(b) A cycle of G means a closed walk (w0 , w1 , . . . , wk ) such that k ≥ 3 and


such that the vertices w0 , w1 , . . . , wk−1 are distinct.

Example 2.10.2. Let G be the simple graph

({1, 2, 3, 4, 5, 6} , {12, 23, 34, 45, 56, 61, 13}) .

This graph looks as follows (we have already seen it in Example 2.9.2):

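[Figure: a hexagon with the vertices 1, 2, 3, 4, 5, 6 in cyclic order, together with the diagonal joining 1 and 3]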

Then:

• The sequence (1, 3, 2, 1, 6, 5, 6, 1) is a closed walk of G. But it is very


much not a cycle.

• The sequences (1, 2, 3, 1) and (1, 3, 4, 5, 6, 1) and (1, 2, 3, 4, 5, 6, 1) are cy-


cles of G. You can get further cycles by rotating these sequences (in a
proper sense of this word – e.g., rotating (1, 2, 3, 1) gives (2, 3, 1, 2) and
(3, 1, 2, 3)) and by reversing them. Every cycle of G can be obtained in
this way.

• The sequences (1) and (1, 2, 1) are closed walks, but not cycles of G
(since they fail the k ≥ 3 condition).

• The sequence (1, 2, 3) is a walk, but not a closed walk, since 1 ≠ 3.
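
The claims of Example 2.10.2 can be verified with a direct translation of Definition 2.10.1 into Python. As before, this is only a sketch: the edge set E is stored as a set of 2-element frozensets, and the function names are chosen for this illustration.

```python
E = {frozenset(e) for e in
     [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (1, 3)]}


def is_walk(E, seq):
    """True if consecutive entries of the nonempty sequence seq are adjacent.
    (Vertices are assumed to be vertices of the graph.)"""
    return len(seq) > 0 and all(
        frozenset({a, b}) in E for a, b in zip(seq, seq[1:]))


def is_closed_walk(E, seq):
    """A closed walk is a walk whose first and last vertices coincide."""
    return is_walk(E, seq) and seq[0] == seq[-1]


def is_cycle(E, seq):
    """A cycle is a closed walk (w0, ..., wk) with k >= 3 and
    w0, ..., w(k-1) distinct."""
    k = len(seq) - 1
    return is_closed_walk(E, seq) and k >= 3 and len(set(seq[:-1])) == k


for seq in [(1, 3, 2, 1, 6, 5, 6, 1), (1, 2, 3, 1), (1, 3, 4, 5, 6, 1),
            (1, 2, 3, 4, 5, 6, 1), (1,), (1, 2, 1), (1, 2, 3)]:
    print(seq, is_closed_walk(E, seq), is_cycle(E, seq))
```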



Authors have different opinions about whether (1, 2, 3, 1) and (1, 3, 2, 1) count
as different cycles. Fortunately, this matters only if you want to count cycles,
but not for the existence or non-existence of cycles.
We have now defined paths (in an arbitrary graph) and also path graphs Pn ;
we have also defined cycles (in an arbitrary graph) and also cycle graphs Cn .
Besides their similar names, are they related? The answer is “yes”:

Proposition 2.10.3. Let G be a simple graph.

(a) If ( p0 , p1 , . . . , pk ) is a path of G, then there is a subgraph of


G isomorphic to the path graph Pk+1 , namely the subgraph
({ p0 , p1 , . . . , pk } , { pi pi +1 | 0 ≤ i < k}). (If this subgraph is actually
an induced subgraph of G, then the path ( p0 , p1 , . . . , pk ) is called an
“induced path”.)
Conversely, any subgraph of G isomorphic to Pk+1 gives a path of G.

(b) Now, assume that k ≥ 3. If (c0 , c1 , . . . , ck ) is a cycle of G, then there is a


subgraph of G isomorphic to the cycle graph Ck , namely the subgraph
({c0 , c1 , . . . , ck } , {ci ci +1 | 0 ≤ i < k}). (If this subgraph is actually an
induced subgraph of G, then the cycle (c0 , c1 , . . . , ck ) is called an “in-
duced cycle”.)
Conversely, any subgraph of G isomorphic to Ck gives a cycle of G.

Proof. Straightforward.
Certain graphs contain cycles; other graphs don’t. For instance, the complete
graph Kn contains a lot of cycles (when n ≥ 3), whereas the path graph Pn
contains none. Let us try to find some criteria for when a graph can and when
it cannot have cycles⁷:

Proposition 2.10.4. Let G be a simple graph. Let w be a walk of G such


that no two adjacent edges of w are identical. (By “adjacent edges”, we
mean edges of the form wi −1 wi and wi wi +1 , where wi −1 , wi , wi +1 are three
consecutive vertices of w.)
Then, w either is a path or contains a cycle (i.e., there exists a cycle of G
whose edges are edges of w).

Example 2.10.5. Let G be as in Example 2.10.2. Then, (2, 1, 3, 2, 1, 6) is a walk


w of G such that no two adjacent edges of w are identical (even though the
edge 21 appears twice in this walk). On the other hand, (2, 1, 3, 1, 6) is not
such a walk (since its two adjacent edges 13 and 31 are identical).
⁷ Mantel’s theorem already gives such a criterion for cycles of length 3 (because a cycle of
length 3 is the same as a triangle).

Proof of Proposition 2.10.4. We assume that w is not a path. We must then show
that w contains a cycle.
Write w as w = (w0 , w1 , . . . , wk ). Since w is not a path, two of the vertices
w0 , w1 , . . . , wk must be equal. In other words, there exists a pair (i, j) of integers
i and j with i < j and wi = wj . Among all such pairs, we pick one with
minimum difference j − i. We shall show that the walk (wi , wi+1 , . . . , wj ) is a
cycle.
First, this walk is clearly a closed walk (since wi = wj ). It thus remains to
show that j − i ≥ 3 and that the vertices wi , wi+1 , . . . , wj−1 are distinct. The
distinctness of wi , wi+1 , . . . , wj−1 follows from the minimality of j − i. To show
that j − i ≥ 3, we assume the contrary. Thus, j − i is either 1 or 2 (since i < j).
But j − i cannot be 1, since the endpoints of an edge cannot be equal (since our
graph is a simple graph). So j − i must be 2. Thus, wi = wi+2 . Therefore, the
two edges wi wi+1 and wi+1 wi+2 are identical. But this contradicts the fact that
no two adjacent edges of w are identical. Contradiction, qed.
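
The proof above is effectively an algorithm for extracting a cycle. A minimal Python sketch follows; it assumes that the input w is a tuple of vertices forming a walk in a simple graph with no two adjacent edges identical (the function does not verify this), and the name cycle_in_walk is invented for this illustration.

```python
def cycle_in_walk(w):
    """Return the closed walk (w_i, ..., w_j) for a pair i < j with
    w_i == w_j and j - i minimum, or None if w is a path.

    Under the hypotheses of Proposition 2.10.4, the returned closed
    walk is a cycle.
    """
    best = None
    for i in range(len(w)):
        for j in range(i + 1, len(w)):
            if w[i] == w[j] and (best is None or j - i < best[1] - best[0]):
                best = (i, j)
    if best is None:
        return None  # all vertices are distinct, so w is a path
    i, j = best
    return tuple(w[i:j + 1])


print(cycle_in_walk((2, 1, 3, 2, 1, 6)))
```

For the walk (2, 1, 3, 2, 1, 6) from Example 2.10.5, two pairs (i, j) attain the minimum difference 3; the sketch returns the cycle for the first of them.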

Corollary 2.10.6. Let G be a simple graph. Assume that G has a closed walk
w of length > 0 such that no two adjacent edges of w are identical. Then, G
has a cycle.

Proof. This follows from Proposition 2.10.4, since w is not a path.

Theorem 2.10.7. Let G be a simple graph. Let u and v be two vertices in G.


Assume that there are two distinct paths from u to v. Then, G has a cycle.

Proof. More generally, we shall prove this theorem with the word “path” re-
placed by “backtrack-free walk”, where a “backtrack-free walk” means a walk
w such that no two adjacent edges of w are identical. This is a generalization
of the theorem, since every path is a backtrack-free walk (why?).
So we claim the following:

Claim 1: Let p and q be two distinct backtrack-free walks that start


at the same vertex and end at the same vertex. Then, G has a cycle.

We shall prove Claim 1 by induction on the length of p. So we fix an integer


N, and we assume that Claim 1 is proved in the case when the length of p is
N − 1. We must now show that it is also true when the length of p is N.
So let p = ( p0 , p1 , . . . , p a ) and q = (q0 , q1 , . . . , qb ) be two distinct backtrack-
free walks that start at the same vertex and end at the same vertex and satisfy
a = N. We must find a cycle.
The walks p and q are distinct but start at the same vertex, so they cannot
both be trivial⁸. If one of them is trivial, then the other is a closed walk (because
a trivial walk is a closed walk), and then our goal follows from Corollary 2.10.6
in this case (because we have a nontrivial closed backtrack-free walk). Hence,
from now on, we WLOG assume that neither of the two walks p and q is
trivial. Thus, each of these two walks has a last edge. The last edge of p is
p a−1 p a , whereas the last edge of q is qb−1 qb .
⁸ We say that a walk is trivial if it has length 0.
Two cases are possible:
Case 1: We have p a−1 p a = qb−1 qb .
Case 2: We have p a−1 p a ≠ qb−1 qb .
Let us consider Case 1 first. In this case, the last edges p a−1 p a and qb−1 qb of
the two walks p and q are identical, so the second-to-last vertices of these two
walks must also be identical. Thus, if we remove these last edges from both
walks, then we obtain two shorter backtrack-free walks ( p0 , p1 , . . . , p a−1 ) and
(q0 , q1 , . . . , qb−1 ) that again start at the same vertex and end at the same vertex,
but the length of the first of them is a − 1 = N − 1. Hence, by the induction
hypothesis, we can apply Claim 1 to these two shorter walks (instead of p and
q), and we conclude that G has a cycle. So we are done in Case 1.
Let us now consider Case 2. In this case, we combine the two walks p and q
(more precisely, p and the reversal of q) to obtain the closed walk

( p0 , p1 , . . . , p a−1 , p a = qb , qb−1 , . . . , q0 ) .

This closed walk is backtrack-free (since ( p0 , p1 , . . . , p a ) and (q0 , q1 , . . . , qb ) are
backtrack-free, and since p a−1 p a ≠ qb−1 qb ) and has length > 0 (since it contains
at least the edge p a−1 p a ). Hence, Corollary 2.10.6 entails that G has a cycle.
We have thus found a cycle in both Cases 1 and 2. This completes the induc-
tion step. Thus, we have proved Claim 1. As we said, Theorem 2.10.7 follows
from it.

Exercise 2.16. Let G be a simple graph.

(a) Prove that if G has a closed walk of odd length, then G has a cycle of
odd length.

(b) Is it true that if G has a closed walk of length not divisible by 3, then G
has a cycle of length not divisible by 3?

(c) Does the answer to part (b) change if we replace “walk” by “non-backtracking
walk”? (A walk w with edges e_1, e_2, ..., e_k (in this order)
is said to be non-backtracking if each i ∈ {1, 2, ..., k − 1} satisfies
e_i ≠ e_{i+1}.)

(d) A trail (in a graph) means a walk whose edges are distinct (but whose
vertices are not necessarily distinct). Does the answer to part (b) change
if we replace “walk” by “trail”?

(Proofs and counterexamples should be given.)



2.11. The longest path trick


Here is another proposition that guarantees the existence of cycles in a graph
under certain circumstances. More importantly, its proof illustrates a useful
tactic in dealing with graphs:

Proposition 2.11.1. Let G be a simple graph with at least one vertex. Let
d > 1 be an integer. Assume that each vertex of G has degree ≥ d. Then, G
has a cycle of length ≥ d + 1.

Proof. Let p = (v_0, v_1, ..., v_m) be a longest path of G. (Why does G have a
longest path? Let’s see: Any path of G has length ≤ |V| − 1 (where V denotes
the vertex set of G), since its vertices
have to be distinct. Moreover, G has at least one vertex and thus has at least
one path. A finite nonempty set of integers has a largest element. Thus, G has
a longest path.)
The vertex v_0 has degree ≥ d (by assumption), and thus has ≥ d neighbors
(since the degree of a vertex is the number of its neighbors).
If all neighbors of v_0 belonged to the set {v_1, v_2, ..., v_{d−1}}⁹, then the
number of neighbors of v_0 would be at most d − 1, which would contradict the
previous sentence. Thus, there exists at least one neighbor u of v_0 that does
not belong to this set {v_1, v_2, ..., v_{d−1}}. Consider this u. Then, u ≠ v_0 (since a
vertex cannot be its own neighbor).
Attaching the vertex u to the front of the path p, we obtain a walk

p′ := (u, v_0, v_1, ..., v_m).

If we had u ∉ {v_0, v_1, ..., v_m}, then this walk p′ would be a path; but this
would contradict the fact that p is a longest path of G. Thus, we must have
u ∈ {v_0, v_1, ..., v_m}. In other words, u = v_i for some i ∈ {0, 1, ..., m}. Consider
this i. Since u ≠ v_0 and u ∉ {v_1, v_2, ..., v_{d−1}}, we thus have i ≥ d. Here is a
picture:

[Figure: the path p drawn as v_0, v_1, v_2, ..., v_i = u, ...]

Now, consider the walk

c := (u, v_0, v_1, ..., v_i).

This is a closed walk (since u = v_i) and has length i + 1 ≥ d + 1 (since i ≥ d). If
we can show that c is a cycle, then we have thus found a cycle of length ≥ d + 1,
so we will be done.

⁹ If d − 1 > m, then this set should be understood to mean {v_1, v_2, ..., v_m}.

It thus remains to prove that c is a cycle. Let us do this. We need to check
that the vertices u, v_0, v_1, ..., v_{i−1} are distinct, and that the length of c is ≥ 3.
The latter claim is clear: The length of c is i + 1 ≥ d + 1 ≥ 3 (since d > 1
and d ∈ Z). The former claim is not much harder: Since u = v_i, the vertices
u, v_0, v_1, ..., v_{i−1} are just the vertices v_i, v_0, v_1, ..., v_{i−1}, and thus are distinct
because they are distinct vertices of the path p. The proof of Proposition 2.11.1
is thus complete.
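
The proof above is constructive, and can be tried out on small graphs. The following Python sketch is only an illustration under some assumptions of mine: the graph is given as a dictionary `adj` sending each vertex to the set of its neighbors, the longest path is found by brute force (so this is only feasible for tiny graphs), and the helper names `longest_path` and `long_cycle` are hypothetical. Instead of an arbitrary neighbor u of v_0 outside {v_1, ..., v_{d−1}}, the sketch picks the neighbor of v_0 that lies farthest along the path, which is one valid choice of u.

```python
def longest_path(adj):
    """Return a longest path of the graph as a list of distinct vertices (brute force)."""
    best = []

    def extend(path, seen):
        nonlocal best
        if len(path) > len(best):
            best = list(path)
        for w in adj[path[-1]]:
            if w not in seen:
                path.append(w)
                seen.add(w)
                extend(path, seen)
                seen.remove(w)
                path.pop()

    for start in adj:
        extend([start], {start})
    return best


def long_cycle(adj):
    """Close a longest path at the neighbor of its first vertex that lies farthest along it."""
    p = longest_path(adj)
    # Every neighbor of p[0] lies on p, since otherwise p could be prolonged.
    i = max(p.index(u) for u in adj[p[0]])
    return [p[i]] + p[: i + 1]  # the closed walk (u, v_0, v_1, ..., v_i) with u = v_i


# The cube graph: vertices 0..7, two vertices adjacent when they differ in one bit.
# Every vertex has degree d = 3, so the cycle found must have length >= 4.
cube = {v: {v ^ 1, v ^ 2, v ^ 4} for v in range(8)}
cycle = long_cycle(cube)
print(cycle, "has length", len(cycle) - 1)
```

Since v_0 has at least d neighbors, all lying on the path at distinct positions ≥ 1, the farthest one sits at a position i ≥ d, exactly as in the proof.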

2.12. Bridges
One question that will later prove crucial is: What happens to a graph if we
remove a single edge from it? Let us first define a notation for this:
Definition 2.12.1. Let G = (V, E) be a simple graph. Let e be an edge of G.
Then, G \ e will mean the graph obtained from G by removing this edge e.
In other words,
G \ e := (V, E \ {e}) .

Some authors write G − e for G \ e.


Theorem 2.12.2. Let G be a simple graph. Let e be an edge of G. Then:

(a) If e is an edge of some cycle of G, then the components of G \ e are
precisely the components of G. (Keep in mind that the components are
sets of vertices. It is these sets that we are talking about here, not the
induced subgraphs on these sets.)

(b) If e appears in no cycle of G (in other words, there exists no cycle of G
such that e is an edge of this cycle), then the graph G \ e has one more
component than G.

Example 2.12.3. Let G be the graph shown in the following picture:

[Figure (1): a drawing of the graph G.]

(where we have labeled the edges a and b for further reference). This graph
has 4 components. The edge a is an edge of a cycle of G, whereas the edge
b appears in no cycle of G. Thus, if we set e = a, then Theorem 2.12.2 (a)
shows that the components of G \ e are precisely the components of G. This
graph G \ e for e = a looks as follows:

[Figure: a drawing of the graph G \ a.]

and visibly has 4 components. On the other hand, if we set e = b, then
Theorem 2.12.2 (b) shows that the graph G \ e has one more component than
G. This graph G \ e for e = b looks as follows:

[Figure: a drawing of the graph G \ b.]

and visibly has 5 components.

Proof of Theorem 2.12.2. We will only sketch the proof. For details, see [21f6,
§6.7].
Let u and v be the endpoints of e, so that e = uv. Note that (u, v) is a path of
G, and thus we have u ≃_G v.
(a) Assume that e is an edge of some cycle of G. Then, if you remove e from
this cycle, you still have a path from u to v left (as the remaining edges of
the cycle function as a detour), and this path is a path of G \ e. Thus, u ≃_{G \ e} v.
Now, we must show that the components of G \ e are precisely the components
of G. This will clearly follow if we can show that the relation ≃_{G \ e} is
precisely the relation ≃_G (because the components of a graph are the equivalence
classes of its ≃ relation). So let us prove the latter fact.
We must show that two vertices x and y of G satisfy x ≃_{G \ e} y if and only if
they satisfy x ≃_G y. The “only if” part is obvious (since a walk of G \ e is always
a walk of G). It thus remains to prove the “if” part. So we assume that x and y
are two vertices of G satisfying x ≃_G y, and we want to show that x ≃_{G \ e} y.
From x ≃_G y, we conclude that G has a path from x to y (by Proposition
2.9.10). If this path does not use¹⁰ the edge e, then it is a path from x to y in
G \ e, and thus we have x ≃_{G \ e} y, which is what we wanted to prove. So we
WLOG assume that this path does use the edge e. Thus, this path contains the
endpoints u and v of this edge e. We WLOG assume that u appears before v on
this path (otherwise, just swap u with v). Thus, this path looks as follows:

(x, ..., u, v, ..., y).

If we remove the edge e = uv, then this path breaks into two smaller paths

(x, ..., u) and (v, ..., y)

(since the edges of a path are distinct, so e appears only once in it). Both of
these two smaller paths are paths of G \ e. Thus, x ≃_{G \ e} u and v ≃_{G \ e} y.
Now, recalling that ≃_{G \ e} is an equivalence relation, we combine these results
(together with u ≃_{G \ e} v) to obtain

x ≃_{G \ e} u ≃_{G \ e} v ≃_{G \ e} y.

Hence, x ≃_{G \ e} y. This completes the proof of Theorem 2.12.2 (a).
(b) Assume that e appears in no cycle of G. We must prove that the graph G \ e
has one more component than G. To do so, it suffices to show the following:

Claim 1: The component of G that contains u and v (this component
exists, since u ≃_G v) breaks into two components of G \ e when the
edge e is removed.

Claim 2: All other components of G remain components of G \ e.

Claim 2 is pretty clear: The components of G that don’t contain u and v do
not change at all when e is removed (since they contain neither endpoint of e).
Thus, they remain components of G \ e. (Making this rigorous is a nice
exercise; see [21f6, §6.7].)
It remains to prove Claim 1. We introduce some notations:

• Let C be the component of G that contains u and v.

• Let A be the component of G \ e that contains u.

• Let B be the component of G \ e that contains v.

Then, we must show that A ∪ B = C and A ∩ B = ∅.

To see that A ∩ B = ∅, we need to show that u ≃_{G \ e} v does not hold (since A
and B are the equivalence classes of u and v with respect to the relation ≃_{G \ e}).
So let us do this. Assume the contrary. Thus, u ≃_{G \ e} v. Hence, there exists a
path from u to v in G \ e. Since e = uv, we can “close” this path by appending
the vertex u to its end; the result is a cycle of the graph G that contains the
edge e. (Its length is ≥ 3, because a path from u to v in G \ e cannot consist
of the single edge uv = e and thus has length ≥ 2.) But this contradicts our
assumption that no cycle of G contains e. This
contradiction shows that our assumption was wrong. Thus, we conclude that
u ≃_{G \ e} v does not hold. Hence, as we said, A ∩ B = ∅.

¹⁰ We say that a walk w uses an edge f if f is an edge of w.
It remains to show that A ∪ B = C. Since A and B are clearly subsets of C
(because each walk of G \ e is a walk of G, and thus each component of G \ e
is a subset of a component of G), we have A ∪ B ⊆ C, and therefore we only
need to show that C ⊆ A ∪ B. In other words, we need to show that each c ∈ C
belongs to A ∪ B.
Let us show this. Let c ∈ C be a vertex. Then, c ≃_G u (since C is the
component of G containing u). Therefore, G has a path p from c to u. Consider
this path p. Two cases are possible:

• Case 1: This path p does not use the edge e. In this case, p is a path of
G \ e, and thus we obtain c ≃_{G \ e} u. In other words, c ∈ A (since A is the
component of G \ e containing u).

• Case 2: This path p does use the edge e. In this case, the edge e must be
the last edge of p (since the path p would otherwise contain the vertex u
twice¹¹; but a path cannot contain a vertex twice), and the last two vertices
of p must be v and u in this order. Thus, by removing the last vertex from
p, we obtain a path from c to v, and this latter path is a path of G \ e (since
it no longer contains u and therefore does not use e). This yields c ≃_{G \ e} v.
In other words, c ∈ B (since B is the component of G \ e containing v).

In either of these two cases, we have shown that c belongs to one of A and B.
In other words, c ∈ A ∪ B. This is precisely what we wanted to show. This
completes the proof of Theorem 2.12.2 (b).
We introduce some fairly standard terminology:

Definition 2.12.4. Let e be an edge of a simple graph G.

(a) We say that e is a bridge (of G) if e appears in no cycle of G.

(b) We say that e is a cut-edge (of G) if the graph G \ e has more compo-
nents than G.

Corollary 2.12.5. Let e be an edge of a simple graph G. Then, e is a bridge if
and only if e is a cut-edge.

Proof. Follows from Theorem 2.12.2.
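
Theorem 2.12.2 is easy to observe experimentally. The following Python sketch is an illustration only; the representation (a list of vertices and a list of edges given as pairs) and the helper names `components` and `remove_edge` are my own choices, and the example graph is made up for this purpose. It computes components by breadth-first search and compares them before and after removing an edge.

```python
from collections import deque


def components(vertices, edges):
    # Components of the simple graph (vertices, edges), each as a frozenset of vertices.
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    found, seen = set(), set()
    for s in vertices:
        if s in seen:
            continue
        comp, queue = {s}, deque([s])
        while queue:
            x = queue.popleft()
            for y in adj[x] - comp:
                comp.add(y)
                queue.append(y)
        seen |= comp
        found.add(frozenset(comp))
    return found


def remove_edge(edges, e):
    # The edge list of G minus e (an edge uv is the same edge as vu).
    return [f for f in edges if set(f) != set(e)]


# A triangle 1-2-3 with a pendant edge 34 and an isolated vertex 5.
V = [1, 2, 3, 4, 5]
E = [(1, 2), (2, 3), (1, 3), (3, 4)]

before = components(V, E)
after_cycle_edge = components(V, remove_edge(E, (1, 2)))  # 12 lies on a cycle
after_bridge = components(V, remove_edge(E, (3, 4)))      # 34 lies on no cycle
print(len(before), len(after_cycle_edge), len(after_bridge))
```

Removing the cycle edge 12 leaves the components unchanged as sets, whereas removing the bridge 34 splits one component into two.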


¹¹ Indeed, the path p already ends in u. If it contained e anywhere other than at the very
end, then it would thus contain the vertex u twice (since u is an endpoint of e).

We can also define “cut-vertices”: A vertex v of a graph G is said to be a cut-vertex
if the graph G \ v (that is, the graph G with the vertex v removed¹²) has
more components than G. Unfortunately, there doesn’t seem to be an analogue
of Corollary 2.12.5 for cut-vertices. Note also that removing a vertex (unlike
removing an edge) can add more than one component to the graph (or it can
also subtract 1 component if this vertex had degree 0). For example, removing
the vertex 0 from the graph

[Figure: the star graph with center 0 and leaves 1, 2, 3, 4.]

results in an empty graph on the set {1, 2, 3, 4}, so the number of components
has increased from 1 to 4.

2.13. Dominating sets


2.13.1. Definition and basic facts
Here is another concept we can define for a graph:

Definition 2.13.1. Let G = (V, E) be a simple graph.


A subset U of V is said to be dominating (for G) if it has the following
property: Each vertex v ∈ V \ U has at least one neighbor in U.
A dominating set for G (or dominating set of G) will mean a subset of V
that is dominating.

Example 2.13.2. Consider the cycle graph

C_5 = ({1, 2, 3, 4, 5}, {12, 23, 34, 45, 51}).

[Figure: C_5 drawn as a pentagon with vertices 1, 2, 3, 4, 5.]

The set {1, 3} is a dominating set for C_5, since all three vertices 2, 4, 5 that
don’t belong to {1, 3} have neighbors in {1, 3}. The set {1, 5} is not a dominating
set for C_5, since the vertex 3 has no neighbor in {1, 5}. There is no
dominating set for C_5 that has size 0 or 1, but there are several of size 2, and
every subset of size ≥ 3 is dominating.

¹² When we remove a vertex, we must of course also remove all edges that contain this vertex.

Here are some more examples:

• If G = (V, E) is a simple graph, then the whole vertex set V is always
dominating, whereas the empty set ∅ is dominating only when V = ∅.

• If G = (V, E) is a complete graph, then any nonempty subset of V is
dominating.

• If G = (V, E) is an empty graph, then only V is dominating.
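
The definition of a dominating set translates directly into a test that can be run on the examples above. The following Python sketch is an illustration; the dictionary representation (each vertex mapped to the set of its neighbors) and the name `is_dominating` are assumptions of mine.

```python
from itertools import combinations


def is_dominating(adj, U):
    # U is dominating if every vertex outside U has at least one neighbor in U.
    U = set(U)
    return all(v in U or adj[v] & U for v in adj)


# The cycle graph C_5 with vertices 1, ..., 5 and edges 12, 23, 34, 45, 51.
c5 = {i: {i % 5 + 1, (i - 2) % 5 + 1} for i in range(1, 6)}

print(is_dominating(c5, {1, 3}))
print(is_dominating(c5, {1, 5}))

# Sizes of all dominating sets of C_5, found by trying every subset.
sizes = [len(U) for r in range(6) for U in combinations(c5, r) if is_dominating(c5, U)]
print(min(sizes), sizes.count(2))
```

The last line reports the smallest size of a dominating set of C_5 and the number of dominating sets of that size, in agreement with Example 2.13.2.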

Clearly, the “denser” a graph is (i.e., the more edges it has), the “easier” it
is for a set to be dominating. Often, a graph is given, and one is interested in
finding a dominating set of the smallest possible size¹³. As the case of an empty
graph reveals, sometimes the only choice is the whole vertex set. However, in
many cases, we can do better. To this end, we need to require that the graph has
no isolated vertices:

Definition 2.13.3. Let G be a simple graph. A vertex v of G is said to be
isolated if it has no neighbors (i.e., if deg v = 0).

An isolated vertex has to belong to every dominating set (since otherwise,
it would need a neighbor in that set, but it has no neighbors). Thus, isolated
vertices do not contribute much to the study of dominating sets, other than
inflating their size. Therefore, when we look for dominating sets, we can restrict
ourselves to graphs with no isolated vertices. There, we have the following
result:

Proposition 2.13.4. Let G = (V, E) be a simple graph that has no isolated
vertices. Then:

(a) There exists a dominating subset of V that has size ≤ |V | /2.

(b) There exist two disjoint dominating subsets A and B of V such that
A ∪ B = V.

¹³ Supposedly, this has applications in mobile networking: For example, you might want to
choose a set of routers in a given network so that each node is either a router or directly
connected (i.e., adjacent) to one.

One proof of this proposition will be given in Exercise 2.19 below (homework
set #2 exercise 4). Another appears in [17s, §3.6].
For specific graphs, the bound |V | /2 in Proposition 2.13.4 (a) can often be
improved. Here is an example:

Exercise 2.17. Let n ≥ 3 be an integer. Find a formula for the smallest size
of a dominating set of the cycle graph C_n. You can use the ceiling function
x ↦ ⌈x⌉, which sends a real number x to the smallest integer that is ≥ x.

Exercise 2.18. Let n and k be positive integers such that n ≥ k(k + 1) and
k > 1. Recall (from Subsection 2.6.3) the Kneser graph KG_{n,k}, whose vertices
are the k-element subsets of {1, 2, ..., n}, and whose edges are the unordered
pairs {A, B} of such subsets with A ∩ B = ∅.
Prove that the minimum size of a dominating set of KG_{n,k} is k + 1.

Exercise 2.19. Let G = (V, E) be a connected simple graph with at least two
vertices.
The distance d (v, w) between two vertices v and w of G is defined to be
the smallest length of a path from v to w. (In particular, d (v, v) = 0 for each
v ∈ V.)
Fix a vertex v ∈ V. Define two subsets

A = { w ∈ V | d (v, w) is even} and B = {w ∈ V | d (v, w) is odd}

of V.

(a) Prove that A is dominating.

(b) Prove that B is dominating.

(c) Prove that there exists a dominating set of G that has size ≤ |V | /2.

(d) Prove that the claim of part (c) holds even if we don’t assume that G is
connected, as long as we assume that each vertex of G has at least one
neighbor. (In other words, prove Proposition 2.13.4 (a).)
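
The sets A and B of Exercise 2.19 can be computed by breadth-first search, which makes the claims of the exercise easy to check on examples (this is a sanity check, not a proof). The following Python sketch assumes a connected graph given as a dictionary from vertices to neighbor sets; the helper names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    # dist[w] = smallest length of a path from source to w (graph assumed connected).
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def parity_classes(adj, v):
    dist = bfs_distances(adj, v)
    A = {w for w in adj if dist[w] % 2 == 0}
    B = {w for w in adj if dist[w] % 2 == 1}
    return A, B


def is_dominating(adj, U):
    return all(w in U or adj[w] & U for w in adj)


# The path graph with vertices 1, ..., 5 and edges 12, 23, 34, 45.
p5 = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
A, B = parity_classes(p5, 1)
print(A, B, is_dominating(p5, A), is_dominating(p5, B))
```

Since A and B are disjoint with A ∪ B = V, the smaller of the two has size ≤ |V| / 2.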

2.13.2. The number of dominating sets


Next, we state a rather surprising recent result about the number of dominating
sets of a graph:

Theorem 2.13.5 (Brouwer’s dominating set theorem). Let G be a simple
graph. Then, the number of dominating sets of G is odd.

Three proofs of this theorem are given in Brouwer’s note [Brouwe09].¹⁴ Let
me show the one I like the most. We first need a notation:

Definition 2.13.6. Let G = (V, E) be a simple graph. A detached pair will
mean a pair (A, B) of two disjoint subsets A and B of V such that there exists
no edge ab ∈ E with a ∈ A and b ∈ B.

Example 2.13.7. Consider the cycle graph

C_6 = ({1, 2, 3, 4, 5, 6}, {12, 23, 34, 45, 56, 61}).

[Figure: C_6 drawn as a hexagon with vertices 1, 2, 3, 4, 5, 6.]

Then, ({1, 2} , {4, 5}) is a detached pair, whereas ({1, 2} , {3, 4}) is not (since
23 is an edge). Of course, there are many other detached pairs; in particular,
any pair of the form (∅, B) or ( A, ∅) is detached.

Let me stress that the word “pair” always means “ordered pair” unless I say
otherwise. So, if ( A, B) is a detached pair, then ( B, A) is a different detached
pair, unless A = B = ∅.
Here is an attempt at a proof of Theorem 2.13.5. It is a nice example of how
to apply known results to new graphs to obtain new results. The only problem
is, it shows a result that is a bit at odds with the claim of the theorem...
Proof of Theorem 2.13.5, attempt 1. Write the graph G as (V, E).
Recall that P (V ) denotes the set of all subsets of V.
Construct a new graph H with the vertex set P(V) as follows: Two subsets
A and B of V are adjacent as vertices of H if and only if (A, B) is a detached
pair. (Note that if the original graph G has n vertices, then this graph H has 2^n
vertices. It is huge!)
I claim that the vertices of H that have odd degree are precisely the subsets
of V that are dominating. In other words:

Claim 1: Let A be a subset of V. Then, the vertex A of H has odd
degree if and only if A is a dominating set of G.

¹⁴ Other proofs can be found in the AoPS thread https://artofproblemsolving.com/community/c6h358772p1960068
. (This thread is concerned with a superficially different contest problem, but the latter
problem is quickly revealed to be Theorem 2.13.5 in a number-theoretical disguise.)

[Proof of Claim 1: We let N(A) denote the set of all vertices of G that have a
neighbor in A. (This may or may not be disjoint from A.)
The neighbors of A (as a vertex in H) are precisely the subsets B of V such
that (A, B) is a detached pair (by the definition of H). In other words, they are
the subsets B of V that are disjoint from A and also have no neighbors in A (by
the definition of a “detached pair”). In other words, they are the subsets B of
V that are disjoint from A and also disjoint from N(A). In other words, they
are the subsets of the set V \ (A ∪ N(A)). Hence, the number of such subsets
B is 2^{|V \ (A ∪ N(A))|}.
The degree of A (as a vertex of H) is the number of neighbors of A in H.
Thus, this degree is 2^{|V \ (A ∪ N(A))|} (because we have just shown that the number
of neighbors of A is 2^{|V \ (A ∪ N(A))|}). But 2^k is odd if and only if k = 0.
Thus, we conclude that the degree of A (as a vertex of H) is odd if and only if
|V \ (A ∪ N(A))| = 0. The condition |V \ (A ∪ N(A))| = 0 can be rewritten as
follows:

(|V \ ( A ∪ N ( A))| = 0)
⇐⇒ (V \ ( A ∪ N ( A)) = ∅)
⇐⇒ (V ⊆ A ∪ N ( A))
⇐⇒ (V \ A ⊆ N ( A))
⇐⇒ (each vertex v ∈ V \ A belongs to N ( A))
⇐⇒ (each vertex v ∈ V \ A has a neighbor in A)
⇐⇒ ( A is dominating) (by the definition of “dominating”) .

Thus, what we have just shown is that the degree of A (as a vertex of H) is odd
if and only if A is dominating. This proves Claim 1.]
Claim 1 shows that the vertices of H that have odd degree are precisely the
dominating sets of G. But the handshake lemma (Corollary 2.4.4) tells us that
any simple graph has an even number of vertices of odd degree. Applying this
to H, we conclude that there is an even number of dominating sets of G.
Huh? We want to show that there is an odd number of dominating sets of G,
not an even number! Why did we just get the opposite result?
Puzzle: Find the mistake in our above reasoning! The answer will be revealed
on the next page.
So what was the mistake in our reasoning?
The mistake is that our definition of H requires the vertex ∅ of H to be
adjacent to itself (since (∅, ∅) is a detached pair); but a vertex of a simple
graph cannot be adjacent to itself. So we need to tweak the definition of H
somewhat:
Correction of the above proof of Theorem 2.13.5. Define the graph H as above, but
do not try to have ∅ adjacent to itself. (This is the only vertex that creates any
trouble, because a detached pair ( A, B) cannot satisfy A = B unless both A and
B are ∅.)
We WLOG assume that V ≠ ∅ (otherwise, the claim is obvious). Thus, the
empty set ∅ is not dominating.
Our Claim 1 needs to be modified as follows:

Claim 1’: Let A be a subset of V. Then, the vertex A of H has odd
degree if and only if A is empty or a dominating set of G.

This can be proved in the same way as we “proved” Claim 1 above; we just
need to treat the A = ∅ case separately now (but this case is easy: ∅ is adjacent
to all other vertices of H, and thus has degree 2^{|V|} − 1, which is odd).
So we conclude (using the handshake lemma) that the number of empty or
dominating sets is even. Subtracting 1 for the empty set, we conclude that the
number of dominating sets is odd (since the empty set is not dominating). This
proves Brouwer’s theorem (Theorem 2.13.5).
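
For small graphs, both Claim 1’ and Brouwer’s theorem can be confirmed by brute force. The following Python sketch is only a sanity check under assumptions of mine (the graph is a dictionary from vertices to neighbor sets, and all helper names are hypothetical). It builds the corrected graph H implicitly, by counting for each subset A the subsets B ≠ A such that (A, B) is a detached pair.

```python
from itertools import combinations


def subsets(V):
    V = list(V)
    return [frozenset(c) for r in range(len(V) + 1) for c in combinations(V, r)]


def is_dominating(adj, U):
    return all(v in U or adj[v] & U for v in adj)


def is_detached(adj, A, B):
    # A and B are disjoint, and no edge joins a vertex of A to a vertex of B.
    return not (A & B) and all(not (adj[a] & B) for a in A)


def degree_in_H(adj, A):
    # Degree of A in the corrected graph H (the empty set is not adjacent to itself).
    return sum(1 for B in subsets(adj) if B != A and is_detached(adj, A, B))


# The cycle graph C_5.
c5 = {i: {i % 5 + 1, (i - 2) % 5 + 1} for i in range(1, 6)}
dominating = [A for A in subsets(c5) if is_dominating(c5, A)]
odd_degree = [A for A in subsets(c5) if degree_in_H(c5, A) % 2 == 1]
print(len(dominating), len(odd_degree))
```

The vertices of odd degree in H are the dominating sets together with ∅, so their number exceeds the number of dominating sets by exactly 1.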
There are other ways to prove Brouwer’s theorem as well. A particularly
nice one was found by Irene Heinrich and Peter Tittmann in 2017; they gave
an “explicit” formula for the number of dominating sets that shows that this
number is odd ([HeiTit17, Theorem 8], restated using the language of detached
pairs):

Theorem 2.13.8 (Heinrich–Tittmann formula). Let G = (V, E) be a simple
graph with n vertices. Assume that n > 0.
Let α be the number of all detached pairs (A, B) such that both numbers
|A| and |B| are even and positive.
Let β be the number of all detached pairs (A, B) such that both numbers
|A| and |B| are odd.
Then:

(a) The numbers α and β are even.

(b) The number of dominating sets of G is 2^n − 1 + α − β.

Part (a) of this theorem is obvious (recall that if ( A, B) is a detached pair, then
so is ( B, A)). Part (b) is the interesting part. In [17s, §3.3–§3.4], I give a long but
elementary proof.
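
The Heinrich–Tittmann formula can likewise be tested numerically on small graphs. The following Python sketch is a brute-force check, not a proof; the function name `heinrich_tittmann_check` and the dictionary representation of the graph are assumptions of mine. It returns the actual number of dominating sets, the numbers α and β, and the value 2^n − 1 + α − β predicted by Theorem 2.13.8 (b).

```python
from itertools import combinations


def subsets(V):
    V = list(V)
    return [frozenset(c) for r in range(len(V) + 1) for c in combinations(V, r)]


def heinrich_tittmann_check(adj):
    n = len(adj)
    all_subsets = subsets(adj)
    # Count the dominating sets directly from the definition.
    dominating = sum(1 for U in all_subsets if all(v in U or adj[v] & U for v in adj))
    # List all detached pairs (A, B): disjoint, with no edge between A and B.
    detached = [(A, B) for A in all_subsets for B in all_subsets
                if not A & B and all(not adj[a] & B for a in A)]
    alpha = sum(1 for A, B in detached
                if A and B and len(A) % 2 == 0 and len(B) % 2 == 0)
    beta = sum(1 for A, B in detached if len(A) % 2 == 1 and len(B) % 2 == 1)
    return dominating, alpha, beta, 2 ** n - 1 + alpha - beta


c5 = {i: {i % 5 + 1, (i - 2) % 5 + 1} for i in range(1, 6)}
print(heinrich_tittmann_check(c5))
```

For C_5, the only detached pairs counted by β are the 10 ordered pairs of distinct non-adjacent vertices, and no pair is counted by α, which matches the count 2^5 − 1 − 10 = 21.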

More recently ([HeiTit18]), Heinrich and Tittmann have refined their formula
to allow counting dominating sets of a given size. Their main result is the
following formula (exercise 5 on homework set #2):

Exercise 2.20. Let G = (V, E) be a simple graph with at least one vertex. Let
n = |V|. A detached pair means a pair (A, B) of two disjoint subsets A and
B of V such that there exists no edge ab ∈ E with a ∈ A and b ∈ B.
Prove the following generalization of the Heinrich–Tittmann formula:

∑_{S is a dominating set of G} x^{|S|} = (1 + x)^n − 1 + ∑_{(A,B) is a detached pair; A ≠ ∅; B ≠ ∅} (−1)^{|A|} x^{|B|}.

(Here, both sides are polynomials in a single indeterminate x with coefficients
in Z.)
[Hint: This is a generalization of the Heinrich–Tittmann formula for the
number of dominating sets. (The latter formula can be obtained fairly easily
by substituting x = 1 into the above and subsequently cancelling the addends
with |A| ≢ |B| mod 2 against each other.) You are free to copy arguments
from [17s] and change whatever needs to be changed. (Some lemmas can
even be used without any changes – they can then be cited without proof.)]

The following exercise gives a generalization of Theorem 2.13.5 (to recover
Theorem 2.13.5 from it, set k = 1):

Exercise 2.21. Let k be a positive integer. Let G = (V, E) be a simple graph.
A subset U of V will be called k-path-dominating if for every v ∈ V, there
exists a path of length ≤ k from v to some element of U.
Prove that the number of all k-path-dominating subsets of V is odd.
[Hint: This is not as substantial a generalization as it may look. The short-
est proof is very short.]
[Solution: This is Exercise 6 on homework set #1 from my Spring 2017
course; see the course page for solutions.]

2.14. Hamiltonian paths and cycles


2.14.1. Basics
Now to something different. Here is a quick question: Given a simple graph G,
when is there a closed walk that contains each vertex of G?
The answer is easy: When G is connected. Indeed, if a simple graph G is
connected, then we can label its vertices by v_1, v_2, ..., v_n arbitrarily, and we
then get a closed walk by composing a walk from v_1 to v_2 with a walk from
v_2 to v_3 with a walk from v_3 to v_4 and so on, ending with a walk from v_n to
v_1. This closed walk will certainly contain each vertex. Conversely, such a walk
cannot exist if G is not connected.
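
The construction just described can be sketched in a few lines of Python. This is an illustration under assumptions of mine: the graph is a connected, nonempty dictionary from vertices to neighbor sets, the walk from each vertex to the next is chosen to be a shortest path (any walk would do), and the names `shortest_path` and `covering_closed_walk` are hypothetical.

```python
from collections import deque


def shortest_path(adj, s, t):
    # Breadth-first search from s; assumes that t can be reached from s.
    parent = {s: None}
    queue = deque([s])
    while queue:
        x = queue.popleft()
        if x == t:
            break
        for y in adj[x]:
            if y not in parent:
                parent[y] = x
                queue.append(y)
    path = [t]
    while path[-1] != s:
        path.append(parent[path[-1]])
    return path[::-1]


def covering_closed_walk(adj):
    # Compose walks from v_1 to v_2, from v_2 to v_3, ..., from v_n to v_1.
    order = list(adj)
    walk = [order[0]]
    for s, t in zip(order, order[1:] + order[:1]):
        walk += shortest_path(adj, s, t)[1:]
    return walk


# The star graph with center 0 and leaves 1, 2, 3, 4.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(covering_closed_walk(star))
```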
The question becomes a lot more interesting if we replace “closed walk” by
“path” or “cycle”. The resulting objects have a name:

Definition 2.14.1. Let G = (V, E) be a simple graph.

(a) A Hamiltonian path in G means a walk of G that contains each vertex
of G exactly once. Obviously, it is a path.

(b) A Hamiltonian cycle in G means a cycle (v_0, v_1, ..., v_k) of G such that
each vertex of G appears exactly once among v_0, v_1, ..., v_{k−1}.

Some graphs have Hamiltonian paths; some don’t. Having a Hamiltonian
cycle is even stronger than having a Hamiltonian path, because if (v_0, v_1, ..., v_k)
is a Hamiltonian cycle of G, then (v_0, v_1, ..., v_{k−1}) is a Hamiltonian path of G.

Convention 2.14.2. In the following, we will abbreviate:

• “Hamiltonian path” as “hamp”;

• “Hamiltonian cycle” as “hamc”.



Example 2.14.3. Which of the following eight graphs have hamps? Which
have hamcs?

[Figure: drawings of the eight graphs A, B, C, D, E, F, G and H.]

Answers:

• The graph A has a hamc (1, 2, 3, 4, 5, 6, 1), and thus a hamp
(1, 2, 3, 4, 5, 6). (Recall that a graph that has a hamc always has a hamp,
since we can simply remove the last vertex from a hamc to obtain a
hamp.)

• The graph B has a hamp (2, 3, 1, 4, 5, 6), but no hamc. The easiest way
to see that B has no hamc is the following: The edge 14 is a cut-edge
(i.e., removing it renders the graph disconnected), thus a bridge (i.e., an
edge that appears in no cycle); therefore, any cycle must stay entirely
“on one side” of this edge.

• The graph C has a hamp (0, 1, 2, 3), but no hamc. The argument for the
non-existence of a hamc is the same as for B: The edge 01 is a bridge.

• The graph D has neither a hamp nor a hamc, because it is not con-
nected. Only a connected graph can have a hamp.

• The graph E has a hamp (0, 3, 2, 1, 6, 5, 4), but no hamc (checking this
requires some work, though).

• The graph F has a hamc (1, 2, 3, 4, 8, 7, 6, 5, 1), thus also a hamp.

• The graph G has a hamc (1, 2, 3, 4, 5, 5′, 4′, 3′, 2′, 1′, 1), thus also a hamp.

• The graph H (which, by the way, is isomorphic to the Petersen graph
from Subsection 2.6.3) has a hamp (1, 3, 5, 2, 4, 4′, 3′, 2′, 1′, 5′), but no
hamc (but this is not obvious! see the Wikipedia article for an argument).

In general, finding a hamp or a hamc, or proving that none exists, is a hard
problem. It can always be solved by brute force (i.e., by trying all lists of distinct
vertices and checking if there is a hamp among them, and likewise for hamcs),
but this quickly becomes forbiddingly laborious as the size of the graph increases.
Some faster algorithms exist (in particular, there is one of running time
O(n^2 2^n), where n is the number of vertices), but no polynomial-time algorithm
is known. The problem (both in its hamp version and in its hamc version)
is known to be NP-hard (in the language of complexity theory). In practice,
hamps and hamcs can often be found with some wit and perseverance; proofs
of their non-existence can often be obtained with some logic and case analysis
(see the above example for some sample arguments). See the Wikipedia page
for “Hamiltonian path problem” for more information.
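
To give an idea of how an exponential-time (rather than factorial-time) search can work, here is a Python sketch of a dynamic programming approach over subsets of vertices. It is a sketch under assumptions of mine, not necessarily the algorithm alluded to above: the vertices are assumed to be 0, 1, ..., n − 1, subsets are encoded as bitmasks, and the names `path_ends`, `has_hamp` and `has_hamc` are hypothetical. For each set S of vertices, it records the possible endpoints of paths that visit exactly the vertices in S.

```python
def path_ends(n, edges, start=None):
    # ends[mask] has bit v set when some path visits exactly the vertices in `mask`
    # and ends at v (and starts at `start`, if a start vertex is prescribed).
    adj = [0] * n
    for u, v in edges:
        adj[u] |= 1 << v
        adj[v] |= 1 << u
    ends = [0] * (1 << n)
    for mask in range(1, 1 << n):
        for v in range(n):
            bit = 1 << v
            if not mask & bit:
                continue
            rest = mask ^ bit
            if rest == 0:
                ok = start is None or v == start
            else:
                # Some path through `rest` must end at a neighbor of v.
                ok = (ends[rest] & adj[v]) != 0
            if ok:
                ends[mask] |= bit
    return ends, adj


def has_hamp(n, edges):
    ends, _ = path_ends(n, edges)
    return n > 0 and ends[-1] != 0


def has_hamc(n, edges):
    if n < 3:
        return False
    # A hamc is a hamp starting at vertex 0 whose last vertex is adjacent to 0.
    ends, adj = path_ends(n, edges, start=0)
    return (ends[-1] & adj[0]) != 0


# The Petersen graph: outer 5-cycle, inner pentagram, and five spokes.
petersen = ([(i, (i + 1) % 5) for i in range(5)]
            + [(i, i + 5) for i in range(5)]
            + [(5 + i, 5 + (i + 2) % 5) for i in range(5)])
print(has_hamp(10, petersen), has_hamc(10, petersen))
```

The table has 2^n entries and each is filled by looking at n candidate endpoints, so the work grows like n · 2^n bitmask operations, which is far better than trying all n! listings but still exponential.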
The problem of finding hamps is related to the so-called “traveling salesman
problem” (TSP), which asks for a hamp with “minimum weight” in a weighted
graph (each edge has a number assigned to it, which is called its “weight”, and
the weight of a hamp is the sum of the weights of the edges it uses). There is a
lot of computer-science literature about this problem.

2.14.2. Sufficient criteria: Ore and Dirac


We shall now show some necessary criteria and some sufficient criteria (but no
necessary-and-sufficient criteria) for the existence of hamps and hamcs. Here
is the most famous sufficient criterion:

Theorem 2.14.4 (Ore’s theorem). Let G = (V, E) be a simple graph with n
vertices, where n ≥ 3.
Assume that deg x + deg y ≥ n for any two non-adjacent vertices x and y.
Then, G has a hamc.

There are various proofs of this theorem scattered around; see [Harju14, The-
orem 3.6] or [Guicha16, Theorem 5.3.2]. We shall give another proof (following
the “Algorithm” section on the Wikipedia page for “Ore’s theorem”):
Proof of Theorem 2.14.4. A listing (of V) shall mean a list of elements of V that
contains each element exactly once. It must clearly be an n-tuple.
The hamness of a listing (v_1, v_2, ..., v_n) will mean the number of all i ∈
{1, 2, ..., n} such that v_i v_{i+1} ∈ E. Here, we set v_{n+1} = v_1. (Visually, it is best
to represent a listing (v_1, v_2, ..., v_n) by drawing the vertices v_1, v_2, ..., v_n on a
circle in this order. Its hamness then counts how often two successive vertices
on the circle are adjacent in the graph G.) Note that the hamness of a listing
(v_1, v_2, ..., v_n) does not change if we cyclically rotate the listing (i.e., transform
it into (v_2, v_3, ..., v_n, v_1)).
Clearly, if we can find a listing (v_1, v_2, ..., v_n) of hamness ≥ n, then all of
v_1 v_2, v_2 v_3, ..., v_n v_1 are edges of G, and thus (v_1, v_2, ..., v_n, v_1) is a hamc of G.
Thus, we need to find a listing of hamness ≥ n.
To do so, I will show that if you have a listing of hamness < n, then you can
slightly modify it to get a listing of larger hamness. In other words, I will show
the following:

Claim 1: Let (v_1, v_2, ..., v_n) be a listing of hamness k < n. Then,
there exists a listing of hamness larger than k.

[Proof of Claim 1: Since the listing (v_1, v_2, ..., v_n) has hamness k < n, there
exists some i ∈ {1, 2, ..., n} such that v_i v_{i+1} ∉ E. Pick such an i. Thus, the
vertices v_i and v_{i+1} of G are non-adjacent. The “deg x + deg y ≥ n” assumption
of the theorem thus yields deg(v_i) + deg(v_{i+1}) ≥ n.
However,

deg(v_i) = |{w ∈ V | v_i w ∈ E}|
         = |{j ∈ {1, 2, ..., n} | v_i v_j ∈ E}|
         = |{j ∈ {1, 2, ..., n} \ {i} | v_i v_j ∈ E}|

(because j = i could not satisfy v_i v_j ∈ E anyway) and

deg(v_{i+1}) = |{w ∈ V | v_{i+1} w ∈ E}|
             = |{j ∈ {1, 2, ..., n} | v_{i+1} v_{j+1} ∈ E}|
               (since (v_2, v_3, ..., v_{n+1}) is a listing of V (because v_{n+1} = v_1))
             = |{j ∈ {1, 2, ..., n} \ {i} | v_{i+1} v_{j+1} ∈ E}|

(because j = i could not satisfy v_{i+1} v_{j+1} ∈ E anyway). In light of these two
equalities, we can rewrite the inequality deg(v_i) + deg(v_{i+1}) ≥ n as

|{j ∈ {1, 2, ..., n} \ {i} | v_i v_j ∈ E}| + |{j ∈ {1, 2, ..., n} \ {i} | v_{i+1} v_{j+1} ∈ E}| ≥ n.

Thus, the two subsets {j ∈ {1, 2, ..., n} \ {i} | v_i v_j ∈ E} and
{j ∈ {1, 2, ..., n} \ {i} | v_{i+1} v_{j+1} ∈ E} of the (n − 1)-element set {1, 2, ..., n} \ {i}
have total size ≥ n (that is, the sum of their sizes is ≥ n). Hence, these two
subsets must overlap (i.e., have an element in common). In other words, there
exists a j ∈ {1, 2, ..., n} \ {i} that satisfies both v_i v_j ∈ E and v_{i+1} v_{j+1} ∈ E. Pick
such a j.
Now, consider a new listing obtained from the old listing (v1 , v2 , . . . , vn ) as
follows:

• First, cyclically rotate the old listing so that it begins with vi +1 . Thus, you
get the listing (vi +1, vi +2 , . . . , vn , v1 , v2 , . . . , vi ).

• Then, reverse the part of the listing starting at $v_{i+1}$ and ending at $v_j$. Thus,
you get the new listing
\[
\Big( \underbrace{v_j, v_{j-1}, \ldots, v_{i+1}}_{\text{the reversed part}}, \ \underbrace{v_{j+1}, v_{j+2}, \ldots, v_i}_{\text{the part that was not reversed}} \Big).
\]
(The reversed part may or may not “wrap around”, i.e., contain $\ldots, v_1, v_n, \ldots$ somewhere.)

This is the new listing we want.

I claim that this new listing has hamness larger than k. Indeed, rotating the
old listing clearly did not change its hamness. But reversing the part from vi +1
to v j clearly did: After the reversal, the edges vi vi +1 and v j v j+1 no longer count
towards the hamness (if they were edges to begin with), but the edges vi v j and
vi +1 v j+1 started counting towards the hamness. This is a good bargain, because
it means that the hamness gained +2 from the newly-counted edges vi v j and
vi +1 v j+1 (which, as we know, both exist), while only losing 0 or 1 (since the
edge vi vi +1 did not exist, whereas the edge v j v j+1 may or may not have been
lost). Thus, the hamness of the new listing is larger than the hamness of the
old listing by either 1 or 2. In other words, it is larger than k.
This proves Claim 1.]
Now, we can start with any listing of V and keep modifying it using Claim 1,
increasing its hamness each time, until its hamness becomes ≥ n. But once its
hamness is ≥ n, we have found a hamc (as explained above). Theorem 2.14.4 is
thus proven.
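
The proof above is effectively an algorithm: keep applying the "rotate, then reverse" step of Claim 1 until the hamness reaches n. Here is a minimal Python sketch of that procedure. The function names (`hamness`, `improve`, `ore_hamiltonian_cycle`) and the adjacency-dictionary representation are illustrative choices, not notation from the text, and indices are 0-based and taken modulo n.

```python
def hamness(listing, adj):
    """Number of cyclically consecutive pairs of the listing that are edges."""
    n = len(listing)
    return sum(1 for i in range(n) if listing[(i + 1) % n] in adj[listing[i]])


def improve(listing, adj):
    """One step of Claim 1: return a listing of larger hamness.

    Assumes the listing has hamness < n and that the graph satisfies
    deg x + deg y >= n for all non-adjacent x != y (Ore's condition).
    """
    n = len(listing)
    for i in range(n):
        a, b = listing[i], listing[(i + 1) % n]
        if b in adj[a]:
            continue  # v_i v_{i+1} is an edge; look for a non-edge instead
        for j in range(n):
            if j == i:
                continue
            # we need v_i v_j in E and v_{i+1} v_{j+1} in E
            if listing[j] in adj[a] and listing[(j + 1) % n] in adj[b]:
                s = (i + 1) % n
                rotated = listing[s:] + listing[:s]  # now begins with v_{i+1}
                p = (j - s) % n                      # position of v_j after rotating
                # reverse the part from v_{i+1} to v_j, keep the rest as it is
                return rotated[:p + 1][::-1] + rotated[p + 1:]
        raise ValueError("no suitable j found: Ore's condition is violated")
    return listing  # hamness was already n


def ore_hamiltonian_cycle(vertices, adj):
    """Repeat Claim 1 until the listing has hamness n."""
    listing = list(vertices)
    while hamness(listing, adj) < len(listing):
        listing = improve(listing, adj)
    return listing
```

The returned listing (v1, v2, ..., vn) corresponds to the hamc (v1, v2, ..., vn, v1). Since each call to `improve` raises the hamness by at least 1, the loop runs at most n times on a graph satisfying the assumption of the theorem.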

Corollary 2.14.5 (Dirac’s theorem). Let G = (V, E) be a simple graph with n
vertices, where n ≥ 3.
Assume that $\deg x \geq \frac{n}{2}$ for each vertex x ∈ V.
Then, G has a hamc.

Proof. Follows from Ore’s theorem, since any two vertices x and y of G satisfy
\[
\underbrace{\deg x}_{\geq \frac{n}{2}} + \underbrace{\deg y}_{\geq \frac{n}{2}} \geq \frac{n}{2} + \frac{n}{2} = n.
\]

Exercise 2.22.

(a) Let G = (V, E) be a simple graph, and let u and v be two distinct
vertices of G that are not adjacent. Let n = |V |. Assume that deg u +
deg v ≥ n. Let G ′ = (V, E ∪ {uv}) be the simple graph obtained from
G by adding a new edge uv. Assume that G ′ has a hamc. Prove that G
has a hamc.

(b) Does this remain true if we replace “hamc” by “hamp”?

2.14.3. A necessary criterion


So much for sufficient criteria. What about necessary criteria?

Proposition 2.14.6. Let G = (V, E) be a simple graph.


For each subset S of V, we let G \ S be the induced subgraph of G on the
set V \ S. (In other words, this is the graph obtained from G by removing all
vertices in S and removing all edges that have at least one endpoint in S.)
(For example, if G is the graph

[Figure: a simple graph G with vertices 0, 1, 2, 3, 4, 5, 6.]

and S = {3, 6}, then G \ S is the graph

[Figure: the induced subgraph G \ S, with vertices 0, 1, 2, 4, 5.]

.)
Also, we let b0 ( H ) denote the number of connected components of a sim-
ple graph H.

(a) If G has a hamc, then every nonempty S ⊆ V satisfies b0 (G \ S) ≤ |S|.

(b) If G has a hamp, then every S ⊆ V satisfies b0 ( G \ S) ≤ |S| + 1.

For example, part (a) of this proposition shows that the graph E from Exam-
ple 2.14.3 has no hamc, because if we take S to be {3, 6}, then b0 ( G \ S) = 3
whereas |S| = 2. Thus, the proposition can be used to rule out the existence of
hamps and hamcs in some cases.
Proof of Proposition 2.14.6. (a) Let S ⊆ V be a nonempty set. If we cut |S| many
vertices out of a cycle, then the cycle splits into at most |S| paths:

[Figure: a cycle with several vertices marked with daggers (†); removing the
vertices marked with daggers splits the cycle into paths.]

Of course, our graph G itself may not be a cycle, but if it has a hamc, then the
removal of the vertices in S will split the hamc into at most | S| paths (according
to the preceding sentence), and thus the graph G \ S will have ≤ |S| many
components (just using the surviving edges of the hamc alone). Taking into
account all the other edges of G can only decrease the number of components.

(b) This is analogous to part (a).


This proposition often (but not always) gives a quick way of convincing your-
self that a graph has no hamc or hamp. Alas, its converse is false. Case in point:
The Petersen graph (defined in Subsection 2.6.3) has no hamc, but it does satisfy
the “every nonempty S ⊆ V satisfies b0 ( G \ S) ≤ |S|” condition of Proposition
2.14.6 (a).
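
For small graphs, the criterion of Proposition 2.14.6 can be tested by brute force over all subsets S. The following Python sketch does this; the helper names (`count_components`, `remove_vertices`, `violating_subsets`) are made up for this illustration, and a simple graph is assumed to be given as a set of vertices together with a set of 2-element frozensets.

```python
from itertools import combinations


def count_components(vertices, edges):
    """b0: the number of connected components, found by depth-first search."""
    adj = {v: set() for v in vertices}
    for e in edges:
        u, w = tuple(e)
        adj[u].add(w)
        adj[w].add(u)
    seen, count = set(), 0
    for v in adj:
        if v in seen:
            continue
        count += 1
        seen.add(v)
        stack = [v]
        while stack:
            x = stack.pop()
            for y in adj[x] - seen:
                seen.add(y)
                stack.append(y)
    return count


def remove_vertices(vertices, edges, S):
    """The induced subgraph G \\ S on the vertex set V \\ S."""
    S = set(S)
    return set(vertices) - S, {e for e in edges if not (set(e) & S)}


def violating_subsets(vertices, edges, slack=0, min_size=1):
    """All S with |S| >= min_size and b0(G \\ S) > |S| + slack.

    slack=0, min_size=1 tests part (a); slack=1, min_size=0 tests part (b).
    """
    vs = sorted(vertices)
    bad = []
    for r in range(min_size, len(vs) + 1):
        for S in combinations(vs, r):
            rest, kept = remove_vertices(vertices, edges, S)
            if count_components(rest, kept) > len(S) + slack:
                bad.append(set(S))
    return bad
```

A nonempty result rules out a hamc (resp. hamp); an empty result proves nothing, since the converse of the proposition is false. The search is exponential in the number of vertices, so this is only meant for toy examples.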

2.14.4. Hypercubes
Now, let us move on to a concrete example of a graph that has a hamc.

Definition 2.14.7. Let n ∈ N. The n-hypercube Qn (more precisely, the n-th


hypercube graph) is the simple graph with vertex set

{0, 1}n = {( a1 , a2 , . . . , an ) | each ai belongs to {0, 1}}

and edge set defined as follows: A vertex ( a1 , a2 , . . . , an ) ∈ {0, 1}n is adjacent


to a vertex (b1 , b2 , . . . , bn ) ∈ {0, 1}n if and only if there exists exactly one
i ∈ {1, 2, . . . , n} such that ai ≠ bi . (For example, in Q4 , the vertex (0, 1, 1, 0) is
adjacent to (0, 1, 0, 0).)
The elements of {0, 1}n are often called bitstrings (or binary words), and
their entries are called their bits (or letters). So two bitstrings are adjacent in
Qn if and only if they differ in exactly one bit.
We often write a bitstring ( a1 , a2 , . . . , an ) as a1 a2 · · · an . (For example, we
write (0, 1, 1, 0) as 0110.)

Example 2.14.8. Here is what the n-hypercubes Qn look like for n = 1, 2, 3:

[Figure: drawings of Q1 (the two vertices 0 and 1 joined by an edge), of Q2 (a
square with vertices 00, 01, 11, 10) and of Q3 (a cube with vertices 000, 001,
010, 011, 100, 101, 110, 111).]

This should explain the name “hypercube”. The 0-hypercube Q0 is a graph
with just one vertex (namely, the empty bitstring ()).

Theorem 2.14.9 (Gray). Let n ≥ 2. Then, the graph Qn has a hamc.

Such hamcs are known as Gray codes. They are circular lists of bitstrings of
length n such that two consecutive bitstrings in the list always differ in exactly
one bit. See the Wikipedia article on “Gray codes” for applications.
Proof of Theorem 2.14.9. We will show something stronger:

Claim 1: For each n ≥ 1, the n-hypercube Qn has a hamp from
00 · · · 0 to 100 · · · 0.
(Keep in mind that 00 · · · 0 and 100 · · · 0 are bitstrings, not numbers:
\[
00\cdots 0 = \big(\underbrace{0, 0, \ldots, 0}_{n \text{ zeroes}}\big); \qquad
100\cdots 0 = \big(1, \underbrace{0, 0, \ldots, 0}_{n-1 \text{ zeroes}}\big).
\]
)

[Proof of Claim 1: We induct on n.
Induction base: A look at Q1 reveals a hamp from 0 to 1.
Induction step: Fix N ≥ 2. We assume that Claim 1 holds for n = N − 1. In
other words, $Q_{N-1}$ has a hamp from $\underbrace{00\cdots 0}_{N-1 \text{ zeroes}}$ to
$1\underbrace{00\cdots 0}_{N-2 \text{ zeroes}}$. Let p be such a hamp.
By attaching a 0 to the front of each bitstring (= vertex) in p, we obtain a path
q from $\underbrace{00\cdots 0}_{N \text{ zeroes}}$ to $01\underbrace{00\cdots 0}_{N-2 \text{ zeroes}}$ in $Q_N$.
By attaching a 1 to the front of each bitstring (= vertex) in p, we obtain a path
r from $1\underbrace{00\cdots 0}_{N-1 \text{ zeroes}}$ to $11\underbrace{00\cdots 0}_{N-2 \text{ zeroes}}$ in $Q_N$.
Now, we assemble a hamp from $\underbrace{00\cdots 0}_{N \text{ zeroes}}$ to
$1\underbrace{00\cdots 0}_{N-1 \text{ zeroes}}$ in $Q_N$ as follows:

• Start at $\underbrace{00\cdots 0}_{N \text{ zeroes}}$, and follow the path q to its end (i.e., to
$01\underbrace{00\cdots 0}_{N-2 \text{ zeroes}}$).

• Then, move to the adjacent vertex $11\underbrace{00\cdots 0}_{N-2 \text{ zeroes}}$.

• Then, follow the path r backwards, ending up at $1\underbrace{00\cdots 0}_{N-1 \text{ zeroes}}$.

This shows that Claim 1 holds for n = N, too.]


Claim 1 tells us that the n-hypercube Qn has a hamp from 00 · · · 0 to 100 · · · 0.
Since its starting point 00 · · · 0 and its ending point 100 · · · 0 are adjacent, we
can turn this hamp into a hamc by appending the starting point 00 · · · 0 again
at the end. This proves Theorem 2.14.9.
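
The induction in the proof of Claim 1 translates directly into a recursion. Here is a short Python sketch of it, with bitstrings written as Python strings; the function name `gray_path` is an illustrative choice, not notation from the text.

```python
def gray_path(n):
    """A hamp of Q_n from 00...0 to 100...0 (for n >= 1), built as in Claim 1."""
    if n == 1:
        return ["0", "1"]            # induction base: the hamp (0, 1) of Q_1
    p = gray_path(n - 1)             # hamp of Q_{n-1} from 0...0 to 10...0
    q = ["0" + s for s in p]         # path q: attach a 0 to the front
    r = ["1" + s for s in p]         # path r: attach a 1 to the front
    return q + r[::-1]               # follow q, then walk r backwards


if __name__ == "__main__":
    print(gray_path(3))
```

For n ≥ 2, appending the first bitstring once more to the end of the returned list gives the hamc of Theorem 2.14.9, since the last and the first bitstring differ only in the first bit.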

2.14.5. Cartesian products


Theorem 2.14.9 can in fact be generalized. To state the generalization, we define
the Cartesian product of two graphs:

Definition 2.14.10. Let G = (V, E) and H = (W, F) be two simple graphs.


The Cartesian product G × H of these two graphs is defined to be the simple
graph (V × W, E′ ∪ F′ ), where

E′ := {(v1 , w) (v2 , w) | v1 v2 ∈ E and w ∈ W } and

F′ := {(v, w1 ) (v, w2 ) | w1 w2 ∈ F and v ∈ V } .

In other words, it is the graph whose vertices are pairs (v, w) ∈ V × W


consisting of a vertex of G and a vertex of H, and whose edges are of the
forms
( v1 , w ) ( v2 , w ) where v1 v2 ∈ E and w ∈ W
and
(v, w1 ) (v, w2 ) where w1 w2 ∈ F and v ∈ V.

For example, the Cartesian product G × P2 of a simple graph G with the


2-path graph P2 can be constructed by overlaying two copies of G and addi-
tionally joining each vertex of the first copy to the corresponding vertex of the
second copy by an edge. (The vertices of the first copy are the (v, 1), whereas
the vertices of the second copy are the (v, 2).) For a specific example, here is
the 5-cycle graph C5 and the Cartesian product C5 × P2 :

[Figure: the 5-cycle graph C5 (with vertices 1, 2, 3, 4, 5) next to the Cartesian
product C5 × P2 (with vertices (v, 1) and (v, 2) for v ∈ {1, 2, 3, 4, 5}).]
As another instance of the above description of G × P2 , it is easy to see the


following:

Proposition 2.14.11. We have Qn ≅ Qn−1 × P2 for each n ≥ 1. (See Definition


2.14.7 for the definitions of Qn and Qn−1.)

Proof. This is Exercise 1 (a) on homework set #2 from my Spring 2017 course;
see the course page for solutions.
Now, we claim the following:

Theorem 2.14.12. Let G and H be two simple graphs. Assume that each of
the two graphs G and H has a hamp. Then:

(a) The Cartesian product G × H has a hamp.

(b) Now assume furthermore that at least one of the two numbers |V ( G )|
and |V ( H )| is even, and that both numbers |V ( G )| and |V ( H )| are
larger than 1. Then, the Cartesian product G × H has a hamc.

Proof. This is Exercise 1 on homework set #2 from my Spring 2017 course
(specifically, its parts (b) and (c)). Its solution can be found on the course page.
Now, Theorem 2.14.9 can be reproved (again by inducting on n) using Theo-
rem 2.14.12 (b) and Proposition 2.14.11, since P2 has a hamp and since |V ( P2 )| =
2 is even. (Convince yourself that this works!)

2.14.6. Subset graphs


The n-hypercube Qn can be reinterpreted in terms of subsets of {1, 2, . . . , n}.
Namely: Let n ∈ N. Let Gn be the simple graph whose vertex set is the
powerset P ({1, 2, . . . , n}) of {1, 2, . . . , n} (that is, the vertices are all $2^n$ subsets
of {1, 2, . . . , n}), and whose edges are determined as follows: Two vertices S
and T are adjacent if and only if one of the two sets S and T is obtained from
the other by inserting an extra element (i.e., we have either S = T ∪ {s} for
some s ∉ T, or T = S ∪ {t} for some t ∉ S). Then, Gn ≅ Qn , since the map
\[
\{0, 1\}^n \to \mathcal{P}(\{1, 2, \ldots, n\}), \qquad
(a_1, a_2, \ldots, a_n) \mapsto \{i \in \{1, 2, \ldots, n\} \mid a_i = 1\}
\]
is a graph isomorphism from Qn to Gn .
Thus, Theorem 2.14.9 shows that for each n ≥ 2, the graph Gn has a hamc. In
other words, for each n ≥ 2, we can list all subsets of {1, 2, . . . , n} in a circular
list in such a way that each subset on this list is obtained from the previous one
by inserting or removing a single element. For example, for n = 3, here is such
a list:
∅, {1} , {1, 2} , {2} , {2, 3} , {1, 2, 3} , {1, 3} , {3} .
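
As a quick sanity check, the map from bitstrings to subsets and the circular list above can be verified mechanically. The following Python sketch does this; the names `bitstring_to_subset` and `is_circular_single_change_list` are illustrative and do not appear in the text.

```python
def bitstring_to_subset(bits):
    """The map (a_1, ..., a_n) -> {i | a_i = 1} from Q_n to G_n."""
    return frozenset(i for i, a in enumerate(bits, start=1) if a == 1)


def is_circular_single_change_list(subsets):
    """Check that each subset (cyclically) arises from the previous one
    by inserting or removing a single element."""
    m = len(subsets)
    return all(len(subsets[t] ^ subsets[(t + 1) % m]) == 1 for t in range(m))


# The circular list for n = 3 given in the text.
example = [frozenset(s) for s in
           [(), (1,), (1, 2), (2,), (2, 3), (1, 2, 3), (1, 3), (3,)]]
```

Two sets are adjacent in Gn exactly when their symmetric difference has size 1, which is what the `^` operator on frozensets computes here.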
A long-standing question only resolved a few years ago asked whether the
same can be done with the subsets of {1, 2, . . . , n} having size $\frac{n \pm 1}{2}$ when n is
odd. For example, for n = 3, we can do it as follows:
{1} , {1, 2} , {2} , {2, 3} , {3} , {1, 3} .
In other words, if n ≥ 3 is odd, and if $G_n'$ is the induced subgraph of $G_n$ on the
set of all subsets J of {1, 2, . . . , n} that satisfy $|J| \in \left\{\frac{n-1}{2}, \frac{n+1}{2}\right\}$, then does
$G_n'$ have a hamc?

Since $G_n \cong Q_n$, we can restate this question equivalently as follows: If n ≥ 3
is odd, and if $Q_n'$ is the induced subgraph of $Q_n$ on the set
\[
\left\{ a_1 a_2 \cdots a_n \in \{0, 1\}^n \ \middle|\ \sum_{i=1}^{n} a_i \in \left\{ \frac{n-1}{2}, \frac{n+1}{2} \right\} \right\},
\]
then does $Q_n'$ have a hamc?


In 2014, Torsten Mütze proved that the answer is “yes”. See [Mutze14] for his
truly nontrivial proof, and [Mutze22] for a recent survey of similar questions.
(Cf. also change ringing.)
The following exercise provides another generalization of Theorem 2.14.9:

Exercise 2.23. Let n and k be two integers such that n > k > 0. Define the
simple graph Qn,k as follows: Its vertices are the bitstrings ( a1 , a2 , . . . , an ) ∈
{0, 1}n ; two such bitstrings are adjacent if and only if they differ in exactly k
bits (in other words: two vertices ( a1 , a2 , . . . , an ) and (b1 , b2 , . . . , bn ) are adjacent
if and only if the number of i ∈ {1, 2, . . . , n} satisfying ai ≠ bi equals k).
(Thus, Qn,1 is the n-hypercube graph Qn .)

(a) Does Qn,k have a hamc when k is even? (Recall that “hamc” is short for
“Hamiltonian cycle”.)

(b) Does Qn,k have a hamc when k is odd?

[Hint: One way to approach part (b) is by identifying the set {0, 1} with
the field $\mathbb{F}_2$ with two elements. The bitstrings $(a_1, a_2, \ldots, a_n) \in \{0, 1\}^n$ thus
become the size-n row vectors in the $\mathbb{F}_2$-vector space $\mathbb{F}_2^n$. Let $e_1, e_2, \ldots, e_n$ be
the standard basis vectors of $\mathbb{F}_2^n$ (so that $e_i$ has a 1 in its i-th position and
zeroes everywhere else). Then, two vectors are adjacent in the n-hypercube
graph Qn (resp. in the graph Qn,k ) if and only if their difference is one of the
standard basis vectors (resp., a sum of k distinct standard basis vectors). Try
to use this to find a graph isomorphism from Qn to a subgraph of Qn,k .]

The next exercise extends the idea of our proof of Theorem 2.14.9:

Exercise 2.24. Let n ≥ 1. Let Qn be the n-hypercube graph, as in Definition


2.14.7. Recall that “hamp” is short for “Hamiltonian path”.
At what vertices can a hamp of Qn end if it starts at the vertex 00 · · · 0 ?
(Find all possibilities, and prove that they are possible and all other vertices
are impossible.)

3. Multigraphs
3.1. Definitions
So far, we have been working with simple graphs. We shall now introduce
several other kinds of graphs, starting with the multigraphs.

Definition 3.1.1. Let V be a set. Then, P1,2 (V ) shall mean the set of all
1-element or 2-element subsets of V. In other words,

P1,2 (V ) := {S ⊆ V | | S| ∈ {1, 2}}


= {{u, v} | u, v ∈ V not necessarily distinct} .

For instance,

P1,2 ({1, 2, 3}) = {{1} , {2} , {3} , {1, 2} , {1, 3} , {2, 3}} .
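
For readers who like to experiment, here is a small Python sketch that computes P1,2 (V) for a finite set V. The function name `P12` and the use of frozensets to model subsets are choices made for this illustration.

```python
from itertools import combinations


def P12(V):
    """The set of all 1-element or 2-element subsets of V."""
    V = list(V)
    singletons = {frozenset({u}) for u in V}
    pairs = {frozenset(p) for p in combinations(V, 2)}
    return singletons | pairs
```

For a set V with n elements, this yields n + n(n − 1)/2 subsets; for V = {1, 2, 3}, these are the six sets listed above.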

We can now define multigraphs:

Definition 3.1.2. A multigraph is a triple (V, E, ϕ), where V and E are two
finite sets, and ϕ : E → P1,2 (V ) is a map.

Example 3.1.3. Here is a multigraph:

[Figure: a drawing of a multigraph with vertices 1, 2, 3, 4, 5 and edges
α, β, γ, δ, ε, κ, λ.]

Formally speaking, this multigraph is the triple (V, E, ϕ), where

V = {1, 2, 3, 4, 5} , E = {α, β, γ, δ, ε, κ, λ} ,

and where ϕ : E → P1,2 (V ) is the map that sends α, β, γ, δ, ε, κ, λ to


{1, 2} , {2, 3} , {2, 3} , {4, 5} , {4, 5} , {4, 5} , {1}, respectively. (Of course, you
can write {1} as {1, 1}.)

This suggests the following terminology (most of which is a calque of our


previously defined terminology for simple graphs):

Definition 3.1.4. Let G = (V, E, ϕ) be a multigraph. Then:

(a) The elements of V are called the vertices of G.


The set V is called the vertex set of G, and is denoted V ( G ).

(b) The elements of E are called the edges of G.


The set E is called the edge set of G, and is denoted E ( G ).

(c) If e is an edge of G, then the elements of ϕ (e) are called the endpoints
of e.

(d) We say that an edge e contains a vertex v if v ∈ ϕ (e) (in other words, if
v is an endpoint of e).

(e) Two vertices u and v are said to be adjacent if there exists an edge e ∈ E
whose endpoints are u and v.

(f) Two edges e and f are said to be parallel if ϕ (e) = ϕ ( f ). (In the above
example, any two of the edges δ, ε, κ are parallel.)

(g) We say that G has no parallel edges if G has no two distinct edges that
are parallel.

(h) An edge e is called a loop (or self-loop) if ϕ (e) is a 1-element set (i.e.,
if e has only one endpoint). (In Example 3.1.3, the edge λ is a loop.)

(i) We say that G is loopless if G has no loops (among its edges).

(j) The degree deg v (also written degG v) of a vertex v of G is defined to


be the number of edges that contain v, where loops are counted twice.
In other words,

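\[
\deg v = \deg_G v
:= \underbrace{\left|\{e \in E \mid v \in \varphi(e)\}\right|}_{\substack{\text{this counts all edges} \\ \text{that contain } v}}
+ \underbrace{\left|\{e \in E \mid \varphi(e) = \{v\}\}\right|}_{\substack{\text{this counts all loops} \\ \text{that contain } v \text{ once again}}} .
\]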

(Note that, unlike in the case of a simple graph, deg v is not the number
of neighbors of v, unless it happens that v is not contained in any loops
or parallel edges.)
(For example, in Example 3.1.3, we have deg 1 = 3 and deg 2 = 3 and
deg 3 = 2 and deg 4 = 3 and deg 5 = 3.)

(k) A walk in G means a list of the form

( v0 , e1 , v1 , e2 , v2 , . . . , e k , v k ) (with k ≥ 0) ,
where v0 , v1 , . . . , vk are vertices of G, where e1 , e2, . . . , ek are edges of G,


and where each i ∈ {1, 2, . . . , k} satisfies

ϕ ( ei ) = { vi −1 , vi }

(that is, the endpoints of each edge ei are vi −1 and vi ). Note that we
have to record both the vertices and the edges in our walk, since we
want the walk to “know” which edges it traverses. (For instance, in
Example 3.1.3, the two walks (1, α, 2, β, 3) and (1, α, 2, γ, 3) are distinct.)
The vertices of a walk (v0 , e1 , v1 , e2 , v2 , . . . , ek , vk ) are v0 , v1 , . . . , vk ; the
edges of this walk are e1 , e2 , . . . , ek . This walk is said to start at v0 and
end at vk ; it is also said to be a walk from v0 to vk . Its starting point is
v0 , and its ending point is vk . Its length is k.

(l) A path means a walk whose vertices are distinct.

(m) The notions of “path-connected” and “connected” and “component”


are defined exactly as for simple graphs. The symbol ≃ G still means
“path-connected”.

(n) A closed walk (or circuit) means a walk (v0 , e1, v1 , e2, v2 , . . . , ek , vk ) with
v k = v0 .

(o) A cycle means a closed walk (v0 , e1 , v1 , e2 , v2 , . . . , ek , vk ) such that


• the vertices v0 , v1 , . . . , vk−1 are distinct;
• the edges e1 , e2 , . . . , ek are distinct;
• we have k ≥ 1.
(Note that we are not requiring k ≥ 3 any more, as we did for sim-
ple graphs. Thus, in Example 3.1.3, both (2, β, 3, γ, 2) and (1, λ, 1) are
cycles, but (2, β, 3, β, 2) is not. The purpose of the “k ≥ 3” require-
ment for cycles in simple graphs was to disallow closed walks such as
(2, β, 3, β, 2) from being cycles; but they are now excluded by the “the
edges e1 , e2 , . . . , ek are distinct” condition.)

(p) Hamiltonian paths and cycles are defined as for simple graphs.

(q) We draw a multigraph by drawing each vertex as a point, each edge as


a curve, and labeling both the vertices and the edges (or not, if we don’t
care about what they are). An example of such a drawing appeared in
Example 3.1.3.
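
To make these definitions concrete, here is a minimal Python sketch that encodes the multigraph of Example 3.1.3 and implements the notions of degree, walk and cycle from parts (j), (k) and (o). It assumes that the map ϕ is stored as a dictionary from edge names to frozensets of endpoints (so that E is the set of keys) and that a walk is a tuple alternating between vertices and edges; the names `degree`, `is_walk` and `is_cycle` are illustrative.

```python
# The multigraph of Example 3.1.3: V and the map phi (E is the set of keys).
V = {1, 2, 3, 4, 5}
phi = {
    "alpha": frozenset({1, 2}),
    "beta": frozenset({2, 3}),
    "gamma": frozenset({2, 3}),
    "delta": frozenset({4, 5}),
    "epsilon": frozenset({4, 5}),
    "kappa": frozenset({4, 5}),
    "lambda": frozenset({1}),   # a loop at the vertex 1
}


def degree(v, phi):
    """deg v: edges containing v, with loops counted twice."""
    containing = sum(1 for ends in phi.values() if v in ends)
    loops = sum(1 for ends in phi.values() if ends == frozenset({v}))
    return containing + loops


def is_walk(w, V, phi):
    """Is w = (v0, e1, v1, ..., ek, vk) a walk?"""
    if len(w) % 2 == 0:
        return False
    vs, es = w[0::2], w[1::2]
    if any(v not in V for v in vs):
        return False
    return all(e in phi and phi[e] == frozenset({vs[i], vs[i + 1]})
               for i, e in enumerate(es))


def is_cycle(w, V, phi):
    """Is w a cycle in the sense of part (o)?"""
    if not is_walk(w, V, phi):
        return False
    vs, es = w[0::2], w[1::2]
    k = len(es)
    return (k >= 1 and vs[0] == vs[-1]
            and len(set(vs[:-1])) == k and len(set(es)) == k)
```

With this encoding, `is_cycle((2, "beta", 3, "gamma", 2), V, phi)` and `is_cycle((1, "lambda", 1), V, phi)` hold, whereas `(2, "beta", 3, "beta", 2)` is a closed walk but not a cycle, in agreement with part (o).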

So there are two differences between simple graphs and multigraphs:


1. A multigraph can have loops, whereas a simple graph cannot.

2. In a simple graph, an edge e is a set of two vertices, whereas in a multi-


graph, an edge e has a set of two vertices (possibly two equal ones, if e is a
loop) assigned to it by the map ϕ. This not only allows for parallel edges,
but also lets us store some information in the “identities” of the edges.

Nevertheless, the two notions have much in common; thus, they are both
called “graphs”:

Convention 3.1.5. The word “graph” means either “simple graph” or “multi-
graph”. The precise meaning should usually be understood from the context.
(I will try not to use it when it could cause confusion.)

Fortunately, simple graphs and multigraphs have many properties in com-


mon, and often it is not hard to derive a result about multigraphs from the
analogous result about simple graphs or vice versa. We will soon explore how
some of the properties we have seen in the previous chapter can be adapted
to multigraphs. First, however, let us explain how to convert multigraphs into
simple graphs and vice versa.

3.2. Conversions
We can turn each multigraph into a simple graph, but at a cost of losing some
information:

Definition 3.2.1. Let G = (V, E, ϕ) be a multigraph. Then, the underlying


simple graph Gsimp of G means the simple graph

(V, { ϕ (e) | e ∈ E is not a loop}) .

In other words, it is the simple graph with vertex set V in which two distinct
vertices u and v are adjacent if and only if u and v are adjacent in G. Thus,
Gsimp is obtained from G by removing loops and “collapsing” parallel edges
to a single edge.

For example, the underlying simple graph of the multigraph G in Example
3.1.3 is

[Figure: the simple graph with vertices 1, 2, 3, 4, 5 and edges 12, 23, 45.]

Conversely, each simple graph can be viewed as a multigraph:



Definition 3.2.2. Let G = (V, E) be a simple graph. Then, the corresponding


multigraph Gmult is defined to be the multigraph

(V, E, ι) ,

where ι : E → P1,2 (V ) is the map sending each e ∈ E to e itself.

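Example 3.2.3. If G is the simple graph

[Figure: a simple graph with vertices 1, 2, 3, 4 and edges 12, 23, 13, 34.]

then Gmult is the multigraph

[Figure: the same drawing, with the four edges labeled {1, 2}, {2, 3}, {1, 3}
and {3, 4}.]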

As we said, the “underlying simple graph” construction G ↦ Gsimp destroys
information, so it is irreversible. This being said, the two constructions G ↦
Gsimp and G ↦ Gmult come fairly close to undoing one another:¹⁵
Proposition 3.2.4.
(a) If G is a simple graph, then $\left(G^{\text{mult}}\right)^{\text{simp}} = G$.
(b) If G is a loopless multigraph that has no parallel edges, then
$\left(G^{\text{simp}}\right)^{\text{mult}} \cong G$. (This is just an isomorphism, not an equality, since the
“identities” of the edges of G have been forgotten in $G^{\text{simp}}$ and cannot
be recovered.)
(c) If G is a multigraph that has loops or (distinct) parallel edges, then
the multigraph $\left(G^{\text{simp}}\right)^{\text{mult}}$ has fewer edges than G and thus is not
isomorphic to G.

¹⁵ In the following proposition, we will use the notion of an “isomorphism of multigraphs”. A


rigorous definition of this notion is given in Definition 3.3.4 further below (but it is more
or less what you would expect: it is a way to relabel the vertices and the edges of one
multigraph to obtain those of another).

Proof. A matter of understanding the definitions.
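
The two conversions are easy to experiment with. In the following Python sketch, a multigraph is assumed to be stored as a pair (V, phi), where phi is a dictionary from edge names to frozensets of endpoints, and a simple graph as a pair (V, E) with E a set of 2-element frozensets. The function names `simp` and `mult` mirror the notation of the text, but the encoding itself is only an illustration.

```python
def simp(V, phi):
    """The underlying simple graph: drop loops, collapse parallel edges."""
    return set(V), {ends for ends in phi.values() if len(ends) == 2}


def mult(V, E):
    """The corresponding multigraph: each edge e is sent to e itself."""
    return set(V), {e: e for e in E}


# The multigraph of Example 3.1.3.
V313 = {1, 2, 3, 4, 5}
phi313 = {
    "alpha": frozenset({1, 2}),
    "beta": frozenset({2, 3}),
    "gamma": frozenset({2, 3}),
    "delta": frozenset({4, 5}),
    "epsilon": frozenset({4, 5}),
    "kappa": frozenset({4, 5}),
    "lambda": frozenset({1}),
}
```

Here `simp(V313, phi313)` has the three edges 12, 23 and 45, so applying `mult` to it gives a multigraph with 3 edges instead of the original 7, as part (c) of Proposition 3.2.4 predicts.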


We will often identify a simple graph G with the corresponding multigraph
Gmult . This may be dangerous, because we have defined notions such as ad-
jacency, walks, paths, cycles, etc. both for simple graphs and for multigraphs;
thus, when we identify a simple graph G with the multigraph Gmult , we are
potentially inviting ambiguity (for example, does “cycle of G” mean a cycle of
the simple graph G or of the multigraph Gmult ?). Fortunately, this ambiguity is
harmless, because whenever G is a simple graph, any of the notions we defined
for G is equivalent to the corresponding notion for the multigraph Gmult . For
example, for the notions of a cycle, we have the following:

Proposition 3.2.5. Let G be a simple graph. Then:

(a) If (v0 , e1 , v1 , e2 , v2 , . . . , ek , vk ) is a cycle of the multigraph Gmult , then


(v0 , v1 , . . . , vk ) is a cycle of the simple graph G.
(b) Conversely, if (v0 , v1 , . . . , vk ) is a cycle of the simple graph G, then
(v0 , {v0 , v1 } , v1 , {v1 , v2 } , v2 , . . . , vk−1 , {vk−1 , vk } , vk ) is a cycle of the
multigraph Gmult .

Proof. This is not completely obvious, since our definitions of a cycle of a simple
graph and of a cycle of a multigraph were somewhat different. The proof boils
down to checking the following two statements:

1. If (v0 , v1 , . . . , vk ) is a cycle of the simple graph G, then its edges


{v0 , v1 } , {v1 , v2 } , . . . , {vk−1 , vk } are distinct.
2. If (v0 , e1, v1 , e2 , v2 , . . . , ek , vk ) is a cycle of the multigraph Gmult , then k ≥ 3.

Checking statement 2 is easy (we cannot have k = 1 since Gmult has no loops,
and we cannot have k = 2 since this would lead to e1 = e2 ). Statement 1
is also clear, since the distinctness of the k vertices v0 , v1 , . . . , vk−1 forces the 2-
element sets formed from these k vertices to also be distinct (and since the edges
{v0 , v1 } , {v1 , v2 } , . . . , {vk−1 , vk } = {vk−1 , v0 } are such 2-element sets).
For all other notions discussed above, it is even more obvious that there is no
ambiguity.

3.3. Generalizing from simple graphs to multigraphs


Now, as promised, we shall revisit the results of Chapter 2, and see which of
them also hold for multigraphs instead of simple graphs.

3.3.1. The Ramsey number R (3, 3)


One of the first properties of simple graphs that we proved is the following
(Proposition 2.3.1):

Proposition 3.3.1. Let G be a simple graph with |V (G )| ≥ 6 (that is, G has at


least 6 vertices). Then, at least one of the following two statements holds:

• Statement 1: There exist three distinct vertices a, b and c of G such that


ab, bc and ca are edges of G.

• Statement 2: There exist three distinct vertices a, b and c of G such that


none of ab, bc and ca is an edge of G.

This is still true for multigraphs¹⁶, because replacing a multigraph G by the


underlying simple graph Gsimp does not change the meaning of the statement.

3.3.2. Degrees
In Definition 2.4.1, we defined the degree of a vertex v in a simple graph G =
(V, E) by

deg v := (the number of edges e ∈ E that contain v)


= (the number of neighbors of v)
= |{u ∈ V | uv ∈ E}|
= |{e ∈ E | v ∈ e}| .

These equalities no longer hold when G is a multigraph. Parallel edges corre-


spond to the same neighbor, so the number of neighbors of v is only a lower
bound on deg v.

Proposition 2.4.2 (which says that if G is a simple graph with n vertices, then
any vertex v of G has degree deg v ∈ {0, 1, . . . , n − 1}) also no longer holds
for multigraphs, because you can have arbitrarily many edges in a multigraph
with just 1 or 2 vertices. (You can even have parallel loops!)

Is Proposition 2.4.3 true for multigraphs? Yes, because we have said that
loops should count twice in the definition of the degree. The proof needs some
tweaking, though. Let me give a slightly different proof; but first, let me state
the claim for multigraphs as a proposition of its own:

¹⁶ Of course, we should understand it appropriately: i.e., we should read “ab is an edge” as


“there is an edge with endpoints a and b”.

Proposition 3.3.2 (Euler 1736 for multigraphs). Let G be a multigraph. Then,


the sum of the degrees of all vertices of G equals twice the number of edges
of G. In other words,
\[
\sum_{v \in V(G)} \deg v = 2 \cdot |E(G)| .
\]

Proof. Write G as G = (V, E, ϕ); thus, V ( G ) = V and E ( G ) = E.


For each edge e, let us (arbitrarily) choose one endpoint of e and denote it
by α (e). The other endpoint will be called β ( e). If e is a loop, then we set
β (e) = α (e). Then, for each vertex v, we have
deg v = (the number of e ∈ E such that v = α (e))
+ (the number of e ∈ E such that v = β (e))
(note how loops get counted twice on the right hand side, because if e ∈ E is a
loop, then v is both α (e) and β (e) at the same time). Summing up this equality
over all v ∈ V, we obtain

\[
\sum_{v \in V} \deg v = \sum_{v \in V} (\text{the number of } e \in E \text{ such that } v = \alpha(e))
+ \sum_{v \in V} (\text{the number of } e \in E \text{ such that } v = \beta(e)) .
\]

However,
\[
\sum_{v \in V} (\text{the number of } e \in E \text{ such that } v = \alpha(e)) = |E| ,
\]
since each edge e ∈ E is counted in exactly one addend of this sum. Similarly,
\[
\sum_{v \in V} (\text{the number of } e \in E \text{ such that } v = \beta(e)) = |E| .
\]
Thus, the above equality becomes
\[
\sum_{v \in V} \deg v
= \underbrace{\sum_{v \in V} (\text{the number of } e \in E \text{ such that } v = \alpha(e))}_{= |E|}
+ \underbrace{\sum_{v \in V} (\text{the number of } e \in E \text{ such that } v = \beta(e))}_{= |E|}
= |E| + |E| = 2 \cdot |E| .
\]
This proves Proposition 3.3.2.
This is a good motivation for counting loops twice in the definition of a
degree.
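
The double counting in this proof can be replayed on a small example. The Python sketch below picks α(e) and β(e) for each edge (taking the smaller and the larger endpoint, which is one arbitrary choice among many) and counts how often each vertex occurs; the function name `degrees_by_endpoints` and the dictionary encoding of ϕ are assumptions of this illustration.

```python
def degrees_by_endpoints(V, phi):
    """Compute deg v as (#e with v = alpha(e)) + (#e with v = beta(e))."""
    count = {v: 0 for v in V}
    for ends in phi.values():
        a = min(ends)   # alpha(e): one endpoint, chosen arbitrarily
        b = max(ends)   # beta(e): the other endpoint (equal to a for a loop)
        count[a] += 1
        count[b] += 1
    return count


# The multigraph of Example 3.1.3 (edge names are just labels).
V = {1, 2, 3, 4, 5}
phi = {
    "alpha": frozenset({1, 2}), "beta": frozenset({2, 3}),
    "gamma": frozenset({2, 3}), "delta": frozenset({4, 5}),
    "epsilon": frozenset({4, 5}), "kappa": frozenset({4, 5}),
    "lambda": frozenset({1}),
}
deg = degrees_by_endpoints(V, phi)
```

Each edge contributes exactly 2 to the total (a loop contributes 2 to its only endpoint), so the degrees 3, 3, 2, 3, 3 sum to 14 = 2 · 7.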
The handshake lemma (Corollary 2.4.4) still holds for multigraphs. In other
words, we have the following:

Corollary 3.3.3 (handshake lemma). Let G be a multigraph. Then, the num-


ber of vertices v of G whose degree deg v is odd is even.
Proof. This follows from Proposition 3.3.2 in the same way as for simple graphs.

Proposition 2.4.5 fails for multigraphs. For example, the multigraph

[Figure: a multigraph with vertices 1, 2, 3.]

has three vertices with degrees 1, 2, 3. Fortunately,
Proposition 2.4.5 was more of a curiosity than a useful fact.
Mantel’s theorem (Theorem 2.4.6) also fails for multigraphs, because we can
join two vertices with a lot of parallel edges and thus satisfy $e > n^2/4$ for stupid
reasons without ever creating a triangle. Thus, Turán’s theorem (Theorem 2.4.8)
also fails for multigraphs.

3.3.3. Graph isomorphisms


Graph isomorphy (and isomorphisms) can still be defined for multigraphs, but
the definition is not the same as for simple graphs. Graph isomorphisms can
no longer be defined merely as bijections between the vertex sets, since we also
need to specify what they do to the edges. Instead, we define them as follows:
Definition 3.3.4. Let G = (V, E, ϕ) and H = (W, F, ψ) be two multigraphs.

(a) A graph isomorphism (or isomorphism) from G to H means a pair


(α, β) of bijections

α:V→W and β:E→F

with the property that if e ∈ E, then the endpoints of β (e) are the im-
ages under α of the endpoints of e. (This property can also be restated
as a commutative diagram

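\[
\begin{array}{ccc}
E & \xrightarrow{\ \beta\ } & F \\
{\scriptstyle \varphi} \big\downarrow & & \big\downarrow {\scriptstyle \psi} \\
\mathcal{P}_{1,2}(V) & \xrightarrow{\ \mathcal{P}(\alpha)\ } & \mathcal{P}_{1,2}(W)
\end{array} ,
\]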

where P (α) is the map from P1,2 (V ) to P1,2 (W ) that sends each sub-
set {u, v} ∈ P1,2 (V ) to {α (u) , α (v)} ∈ P1,2 (W ). If you are used to
category theory, this restatement may look more natural to you.)

(b) We say that G and H are isomorphic (this is written G ≅ H) if there
exists a graph isomorphism from G to H.

Again, isomorphy of multigraphs is an equivalence relation.
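
Definition 3.3.4 (a) can be checked mechanically for a given pair (α, β). Here is a hedged Python sketch, assuming that multigraphs are stored as pairs (V, phi) with phi a dictionary from edges to frozensets of endpoints, and that α and β are given as dictionaries; all names are illustrative.

```python
def is_bijection(f, domain, codomain):
    """Is the dictionary f a bijection from domain to codomain?"""
    return (set(f) == set(domain)
            and set(f.values()) == set(codomain)
            and len(set(f.values())) == len(f))


def is_multigraph_isomorphism(alpha, beta, G, H):
    """Is (alpha, beta) a graph isomorphism from G = (V, phi) to H = (W, psi)?"""
    (V, phi), (W, psi) = G, H
    if not is_bijection(alpha, V, W) or not is_bijection(beta, phi, psi):
        return False
    # the endpoints of beta(e) must be the images under alpha of the endpoints of e
    return all(psi[beta[e]] == frozenset(alpha[v] for v in phi[e]) for e in phi)


# Two small multigraphs: two parallel edges plus a loop each.
G = ({1, 2}, {"a": frozenset({1, 2}), "b": frozenset({1, 2}), "c": frozenset({1})})
H = ({"x", "y"}, {"p": frozenset({"x", "y"}), "q": frozenset({"x", "y"}),
                  "r": frozenset({"y"})})
```

The final check is exactly the commutativity of the diagram: applying ϕ and then P (α) to an edge e gives the same set as applying β and then ψ.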

3.3.4. Complete graphs, paths, cycles


In Definition 2.6.1, Definition 2.6.2 and Definition 2.6.3, we defined the complete
graphs Kn , the path graphs Pn and the cycle graphs Cn as simple graphs. Thus,
all of them can be viewed as multigraphs if one so desires (since each simple
graph G gives rise to a multigraph Gmult ).
However, using multigraphs, we can extend our definition of n-th cycle
graphs Cn to the case n = 1 and also tweak it in the case n = 2 to make it
more natural. We do this as follows:

Definition 3.3.5. We modify the definition of cycle graphs (Definition 2.6.3)


as follows:

(a) We redefine the 2-nd cycle graph C2 to be the multigraph with two
vertices 1 and 2 and two parallel edges with endpoints 1 and 2. (We
don’t care what the edges are, only that there are two of them and each
has endpoints 1 and 2.) Thus, it looks as follows:
[Figure: two vertices 1 and 2 joined by two parallel edges.]

(b) We define the 1-st cycle graph C1 to be the multigraph with one vertex
1 and one edge (which is necessarily a loop). Thus, it looks as follows:
[Figure: a single vertex 1 with a loop.]

This has the effect that the n-th cycle graph Cn has exactly n edges for each
n ≥ 1 (rather than having 1 edge for n = 2, as it did back when it was a simple
graph).

3.3.5. Induced submultigraphs


In Definition 2.7.1, we defined subgraphs and induced subgraphs of a simple
graph. The corresponding notions for multigraphs are defined as follows:

Definition 3.3.6. Let G = (V, E, ϕ) be a multigraph.

(a) A submultigraph of G means a multigraph of the form H = (W, F, ψ),


where W ⊆ V and F ⊆ E and ψ = ϕ | F . In other words, a submulti-
graph of G means a multigraph H whose vertices are vertices of G and
whose edges are edges of G and whose edges have the same endpoints
in H as they do in G.
We often abbreviate “submultigraph” as “subgraph”.

(b) Let S be a subset of V. The induced submultigraph of G on the set S
denotes the submultigraph

(S, E′, ϕ |_{E′})

of G, where

E′ := {e ∈ E | all endpoints of e belong to S}.

In other words, it denotes the submultigraph of G whose vertices are
the elements of S, and whose edges are precisely those edges of G
both of whose endpoints belong to S. We denote this induced submultigraph
by G [S].

(c) An induced submultigraph of G means a submultigraph of G that is


the induced submultigraph of G on S for some S ⊆ V.

The infix “multi” is often omitted. So we often speak of “subgraphs”


instead of “submultigraphs”.
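The definition of G [S] translates directly into a filter on the edge set. The following Python sketch assumes (this is not notation from the text) that the map ϕ is stored as a dictionary sending each edge name to a pair of endpoints, with a pair (v, v) standing for a loop; the function name and the sample multigraph are hypothetical.

```python
def induced_submultigraph(vertices, edges, subset):
    """Return G[S] as a pair (S, E') for the multigraph G = (vertices, edges)."""
    S = set(subset)
    if not S <= set(vertices):
        raise ValueError("S must be a subset of the vertex set")
    # E' consists of exactly those edges all of whose endpoints belong to S;
    # the endpoints of the kept edges are the same as in G.
    kept = {e: (u, v) for e, (u, v) in edges.items() if u in S and v in S}
    return S, kept


# A made-up multigraph with two parallel edges a, b and a loop l.
V = {1, 2, 3}
E = {"a": (1, 2), "b": (1, 2), "c": (2, 3), "l": (3, 3)}

print(induced_submultigraph(V, E, {1, 2}))  # keeps both parallel edges a and b
print(induced_submultigraph(V, E, {3}))     # keeps only the loop l
```

Note that parallel edges and loops survive whenever their endpoints lie in S; nothing is merged or discarded beyond what the definition demands.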

With these definitions, we can now identify cycles in a multigraph with sub-
graphs isomorphic to a cycle graph: A cycle of length n in a multigraph G is
“the same as” a submultigraph of G isomorphic to Cn . (We leave the details to
the reader.)

3.3.6. Disjoint unions


In Section 2.8, we defined the disjoint union of two or more simple graphs. The
analogous definition for multigraphs is straightforward and left to the reader.

3.3.7. Walks
We already defined walks, paths, closed walks and cycles for multigraphs back
in Section 3.1. The length of a walk is still defined to be its number of edges.
Now, let’s see which of their basic properties (seen in Section 2.9) still hold for
multigraphs.
First of all, the edges of a path are still always distinct. This is just as easy to
prove as for simple graphs.
Next, let us see how two walks can be “spliced” together:

Proposition 3.3.7. Let G be a multigraph. Let u, v and w be three vertices
of G. Let a = (a_0, e_1, a_1, . . . , e_k, a_k) be a walk from u to v. Let
b = (b_0, f_1, b_1, . . . , f_ℓ, b_ℓ) be a walk from v to w. Then,

(a_0, e_1, a_1, . . . , e_k, a_k, f_1, b_1, f_2, b_2, . . . , f_ℓ, b_ℓ)
= (a_0, e_1, a_1, . . . , a_{k−1}, e_k, b_0, f_1, b_1, . . . , f_ℓ, b_ℓ)
= (a_0, e_1, a_1, . . . , a_{k−1}, e_k, v, f_1, b_1, . . . , f_ℓ, b_ℓ)

is a walk from u to w. This walk shall be denoted a ∗ b.


Walks can be reversed (i.e., walked in backwards direction):

Proposition 3.3.8. Let G be a multigraph. Let u and v be two vertices of G.


Let a = ( a0 , e1 , a1 , . . . , ek , ak ) be a walk from u to v. Then:

(a) The list ( ak , ek , ak−1 , ek−1 , . . . , e1 , a0 ) is a walk from v to u. We denote


this walk by rev a and call it the reversal of a.

(b) If a is a path, then rev a is a path again.

Walks that are not paths contain smaller walks between the same vertices:

Proposition 3.3.9. Let G be a multigraph. Let u and v be two vertices of G.


Let a = (a0 , e1 , a1 , . . . , ek , ak ) be a walk from u to v. Assume that a is not a
path. Then, there exists a walk from u to v whose length is smaller than k.

Corollary 3.3.10 (When there is a walk, there is a path). Let G be a multi-


graph. Let u and v be two vertices of G. Assume that there is a walk from u
to v of length k for some k ∈ N. Then, there is a path from u to v of length
≤ k.
All these results can be proved in the same way as their counterparts for
simple graphs; the only change needed is to record the edges in the walk.
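The operations behind Propositions 3.3.7, 3.3.8 and 3.3.9 and Corollary 3.3.10 can be sketched in Python. The encoding is an assumption made for this example: a walk (a_0, e_1, a_1, . . . , e_k, a_k) is stored as a list that alternates vertices and edge names, and the function names are hypothetical. The code does not check that its input is a walk of some given multigraph; it only manipulates the lists.

```python
def splice(a, b):
    """Return a * b for a walk a ending at the vertex where the walk b starts."""
    if a[-1] != b[0]:
        raise ValueError("the first walk must end where the second one starts")
    return a + b[1:]  # the shared vertex is recorded only once


def reverse(a):
    """Return rev a, i.e., the walk a traversed backwards."""
    return a[::-1]


def shorten_to_path(walk):
    """Cut out closed stretches until no vertex repeats (Corollary 3.3.10)."""
    walk = list(walk)
    while True:
        seen = {}
        for pos in range(0, len(walk), 2):  # even positions hold the vertices
            v = walk[pos]
            if v in seen:
                # Drop everything strictly between the two visits of v.
                walk = walk[:seen[v]] + walk[pos:]
                break
            seen[v] = pos
        else:
            return walk  # no vertex repeats, so this is a path


a = [1, "a", 2, "b", 3]
b = [3, "c", 2, "d", 4]
print(splice(a, b))
print(reverse(a))
print(shorten_to_path(splice(a, b)))
```

Each pass of shorten_to_path removes at least one edge, which mirrors the induction on the length of the walk used in the proof.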
Given a multigraph G and two vertices u and v of G, we can ask ourselves
the same five Questions 1, 2, 3, 4 and 5 that we asked for a simple graph G
in Subsection 2.9.4. The answers we gave in that subsection still apply without
requiring substantial changes; the only necessary modification is that we now
have to keep track of the edges in a path or walk. (The reader can easily fill in
the details here.)

3.3.8. Path-connectedness
The relation “path-connected” is defined for multigraphs just as it is for simple
graphs (Definition 2.9.8), and is still denoted ≃ G . It is still an equivalence
relation (and the proof is the same as for simple graphs). The following also
holds (with the same proof as for simple graphs):

Proposition 3.3.11. Let G be a multigraph. Let u and v be two vertices of G.


Then, u ≃ G v if and only if there exists a path from u to v.

The definitions of “components” and “connected” for multigraphs are the


same as for simple graphs (Definition 2.9.11 and Definition 2.9.12). The follow-
ing propositions can be proved in the same way as we proved their analogues
for simple graphs:

Proposition 3.3.12. Let G be a multigraph. Let C be a component of G.


Then, the induced subgraph (= submultigraph) of G on the set C is con-
nected.

Proposition 3.3.13. Let G be a multigraph. Let C1 , C2 , . . . , Ck be all compo-


nents of G (listed without repetition).
Thus, G is isomorphic to the disjoint union G [C1 ] ⊔ G [C2 ] ⊔ · · · ⊔ G [Ck ].
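For concreteness, here is a minimal Python sketch that computes the components of a multigraph by a depth-first search. It assumes a hypothetical encoding in which the edges are a dictionary from edge names to endpoint pairs; parallel edges and loops are harmless here, since only the question "which vertices are joined by some edge" matters for path-connectedness.

```python
def components(vertices, edges):
    """Return the components of the multigraph as a list of vertex sets."""
    neighbors = {v: set() for v in vertices}
    for u, v in edges.values():
        neighbors[u].add(v)
        neighbors[v].add(u)
    result, visited = [], set()
    for start in vertices:
        if start in visited:
            continue
        # Collect all vertices path-connected to `start`.
        component, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for y in neighbors[x]:
                if y not in component:
                    component.add(y)
                    stack.append(y)
        visited |= component
        result.append(component)
    return result


V = {1, 2, 3, 4, 5}
E = {"a": (1, 2), "b": (1, 2), "l": (3, 3)}
print(sorted(sorted(c) for c in components(V, E)))
```

The induced submultigraphs on the sets returned here are the G [C_i] appearing in Proposition 3.3.13.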

The following proposition is an analogue of Proposition 2.10.4 for multi-


graphs:

Proposition 3.3.14. Let G be a multigraph. Let w be a walk of G such that no


two adjacent edges of w are identical. (By “adjacent edges”, we mean edges
of the form e_{i−1} and e_i, where e_1, e_2, . . . , e_k are the edges of w from first to
last.)
Then, w either is a path or contains a cycle (i.e., there exists a cycle of G
whose edges are edges of w).

Proof. The proof of this proposition for multigraphs is more or less the same as
it was for simple graphs (i.e., as the proof of Proposition 2.10.4), with a mild
difference in how we prove that the walk (w_i, w_{i+1}, . . . , w_j) is a cycle (of course,
this walk is no longer (w_i, w_{i+1}, . . . , w_j) now, but rather (w_i, e_{i+1}, w_{i+1}, . . . , e_j, w_j),
because the edges need to be included).17

17 Here are some details:
We assume that w is not a path, and we write the walk w as (w_0, e_1, w_1, e_2, w_2, . . . , e_k, w_k).
Then, there exists a pair (i, j) of integers i and j with i < j and w_i = w_j. Among all such
pairs, we pick one with minimum difference j − i. Then, (w_i, e_{i+1}, w_{i+1}, . . . , e_j, w_j) is a closed
walk. We claim that this closed walk is a cycle.
To do so, we need to show that

1. the vertices w_i, w_{i+1}, . . . , w_{j−1} are distinct;
2. the edges e_{i+1}, e_{i+2}, . . . , e_j are distinct;
3. we have j − i ≥ 1.

The first of these claims follows from the minimality of j − i. The third follows from i < j.
It remains to prove the second claim. In other words, it remains to prove that the edges
e_{i+1}, e_{i+2}, . . . , e_j are distinct, i.e., that we have e_a ≠ e_b for any two integers a and b satisfying
i < a < b ≤ j. Let us do this. Let a and b be two integers satisfying i < a < b ≤ j. We must
show that e_a ≠ e_b. We distinguish two cases: the case a = b − 1 and the case a ≠ b − 1.
• If a = b − 1, then e_a and e_b are two adjacent edges of w and thus distinct (since we
assumed that no two adjacent edges of w are identical). Thus, e_a ≠ e_b is proved in
the case when a = b − 1.
• Now, consider the case when a ≠ b − 1. In this case, we must have a < b − 1 (since
a < b entails a ≤ b − 1). Also, i ≤ a − 1 (since i < a). Hence, i ≤ a − 1 < a < b − 1 ≤
j − 1 (since b ≤ j). Therefore, b − 1, a − 1 and a are three distinct elements of the
set {i, i + 1, . . . , j − 1}. Consequently, w_{b−1}, w_{a−1}, w_a are three distinct vertices (since
the vertices w_i, w_{i+1}, . . . , w_{j−1} are distinct). Therefore, w_{b−1} ∉ {w_{a−1}, w_a} = ϕ (e_a)
(since w is a walk, so that the edge e_a has endpoints w_{a−1} and w_a). However, ϕ (e_b) =
{w_{b−1}, w_b} (since w is a walk, so that the edge e_b has endpoints w_{b−1} and w_b). Now,
comparing w_{b−1} ∈ {w_{b−1}, w_b} = ϕ (e_b) with w_{b−1} ∉ ϕ (e_a), we see that the sets ϕ (e_b)
and ϕ (e_a) must be distinct (since ϕ (e_b) contains w_{b−1} but ϕ (e_a) does not). In other
words, ϕ (e_b) ≠ ϕ (e_a). Hence, e_b ≠ e_a. In other words, e_a ≠ e_b. Thus, e_a ≠ e_b is
proved in the case when a ≠ b − 1.
We have now proved e_a ≠ e_b in both cases, so we are done.

Just as for simple graphs, we get the following corollary:

Corollary 3.3.15. Let G be a multigraph. Assume that G has a closed walk
w of length > 0 such that no two adjacent edges of w are identical. Then, G
has a cycle.

The analogue of Theorem 2.10.7 for multigraphs is true as well:

Theorem 3.3.16. Let G be a multigraph. Let u and v be two vertices in G.
Assume that there are two distinct paths from u to v. Then, G has a cycle.

Proof. For simple graphs, this was proved as Theorem 2.10.7 above. The same
proof applies to multigraphs, once the obvious changes are made (e.g., instead
of p_{a−1} p_a and q_{b−1} q_b, we need to take the last edges of the two walks p and
q).

In contrast, Proposition 2.11.1 is false for multigraphs. In fact, we can take
a multigraph with a single vertex and lots of loops around it. In that case, its
degree can be very large, but it has no cycles of length > 1.

3.3.9. G \ e, bridges and cut-edges

Next, we extend the definition of G \ e (Definition 2.12.1) to multigraphs:

Definition 3.3.17. Let G = (V, E, ϕ) be a multigraph. Let e be an edge of G.
Then, G \ e will mean the graph obtained from G by removing this edge e.
In other words,
G \ e := (V, E \ {e}, ϕ |_{E\{e}}).

Some authors write G − e for G \ e.


The analogue of Theorem 2.12.2 for multigraphs holds (and can be proved in
the same way as Theorem 2.12.2):
Theorem 3.3.18. Let G be a multigraph. Let e be an edge of G. Then:

(a) If e is an edge of some cycle of G, then the components of G \ e are


precisely the components of G. (Keep in mind that the components are
sets of vertices. It is these sets that we are talking about here, not the
induced subgraphs on these sets.)

(b) If e appears in no cycle of G (in other words, there exists no cycle of G


such that e is an edge of this cycle), then the graph G \ e has one more
component than G.

Note that an edge e that is a loop always is an edge of a cycle (indeed, it


creates a cycle of length 1), and can never appear on any path; thus, removing
such an edge e obviously does not change the path-connectedness relation.
Defining cut-edges and bridges just as we did for simple graphs (Definition
2.12.4), we equally recover the following corollary:
Corollary 3.3.19. Let e be an edge of a multigraph G. Then, e is a bridge if
and only if e is a cut-edge.
Proof. Just like the proof of Corollary 2.12.5.
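Theorem 3.3.18 suggests a naive computational test: delete each edge in turn and see whether the number of components grows. The Python sketch below does exactly this. It is an illustration under assumptions, not part of the text: the edges are a dictionary from hypothetical edge names to endpoint pairs, the component count uses a simple union-find structure, and no attempt is made at efficiency.

```python
def conn(vertices, edges):
    """Number of components of the multigraph (vertices, edges)."""
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges.values():
        parent[find(u)] = find(v)  # merge the classes of the two endpoints
    return len({find(v) for v in vertices})


def delete_edge(edges, e):
    """The edge dictionary of G \\ e; the vertex set stays the same."""
    return {f: ends for f, ends in edges.items() if f != e}


def cut_edges(vertices, edges):
    """Edges whose removal increases the number of components."""
    base = conn(vertices, edges)
    return {e for e in edges if conn(vertices, delete_edge(edges, e)) > base}


V = {1, 2, 3, 4}
E = {"a": (1, 2), "b": (1, 2), "c": (2, 3), "l": (3, 3), "d": (3, 4)}
print(sorted(cut_edges(V, E)))  # ['c', 'd']
```

In this made-up example, the parallel edges a and b lie on a cycle of length 2 and the loop l is a cycle of length 1, so only c and d are reported.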

3.3.10. Dominating sets


We defined and studied dominating sets in Section 2.13. We could define domi-
nating sets for multigraphs in the same way as for simple graphs, but we would
not get anything new this way. Indeed, if G is a multigraph, then the dominat-
ing sets of G are precisely the dominating sets of Gsimp . Thus, we can reduce
any claims about dominating sets of multigraphs to analogous claims about
simple graphs.

3.3.11. Hamiltonian paths and cycles


As we said before, a multigraph G has a Hamiltonian path or Hamiltonian
cycle if and only if the corresponding simple graph Gsimp has one. This does
not mean, however, that everything we proved about Hamiltonian paths still
applies to multigraphs. For instance, neither Ore’s theorem (Theorem 2.14.4)
nor Dirac’s theorem (Corollary 2.14.5) holds for multigraphs, because we could
duplicate edges to make degrees arbitrarily large, without necessarily creating
a Hamiltonian cycle.
Proposition 2.14.6 still holds for multigraphs, but this is clear because it can
be derived from the corresponding property of Gsimp .

3.3.12. Exercises

Exercise 3.1. Which of the Exercises 2.3, 2.4, 2.7, 2.14, 2.15, 2.5 and 2.8 remain
true if “simple graph” is replaced by “multigraph”?
(For each exercise that becomes false, provide a counterexample. For each
exercise that remains true, either provide a new solution that works for multi-
graphs, or argue that the solution we have seen applies verbatim to multi-
graphs, or derive the multigraph case from the simple graph case.)

Exercise 3.2. Let G be a multigraph with at least one edge. Assume that each
vertex of G has even degree. Prove that G has a cycle.
[Solution: This is Exercise 4 on midterm #1 from my Spring 2017 course;
see the course page for solutions.]

Exercise 3.3. Let G be a multigraph. Let d > 2 be an integer. Assume that


deg v > 2 for each vertex v of G. Prove that G has a cycle whose length is
not divisible by d.

Exercise 3.4. Let G be a multigraph. Assume that G has exactly two vertices
of odd degree. Prove that these two vertices are path-connected.

Exercise 3.5. Let G = (V, E, ϕ) be a multigraph that has no loops.


If e ∈ E is an edge that contains a vertex v ∈ V, then we let e/v denote the
endpoint of e distinct from v. (If e is a loop, then this is understood to mean
v itself.)
For each v ∈ V, we define a rational number qv by

\[
q_v = \sum_{\substack{e \in E; \\ v \in \varphi(e)}} \frac{\deg(e/v)}{\deg v}.
\]

(Note that the denominator deg v on the right hand side is nonzero whenever
the sum is nonempty!)
(Thus, qv is the average degree of the neighbors of v, weighted with the
number of edges that join v to the respective neighbors. If v has no neighbors,
then qv = 0.)
Prove that
\[
\sum_{v \in V} q_v \geq \sum_{v \in V} \deg v.
\]
(In other words, in a social network, your average friend has, on average,
more friends than you do!)
[Hint: Any positive reals x and y satisfy x/y + y/x ≥ 2. Why, and how does
this help?]

Exercise 3.6. Let F be any field. (For instance, F can be Q or R or C.)


Let G = (V, E, ϕ) be a multigraph, where V = {1, 2, . . . , n} for some n ∈
N.
For each edge e ∈ E, we construct a column vector χe ∈ Fn (that is, a
column vector with n entries) as follows:
• If e is a loop, then we let χe be the zero vector.
• Otherwise, we let u and v be the two endpoints of e, and we let χe be the
column vector that has a 1 in its u-th position, a −1 in its v-th position,
and 0s in all other positions. (This depends on which endpoint we call
u and which endpoint we call v, but we just make some choice and
stick with it. The result will be true no matter how we choose.)
Let M be the n × | E|-matrix over F whose columns are the column vectors
χe for all e ∈ E (we order them in some way; the exact order doesn’t matter).
Prove that
rank M = |V | − conn G,
where conn G denotes the number of components of G.
[Example: Here is an example: Let G be the multigraph

[Figure: a multigraph with vertices 1, 2, 3, 4, 5 and edges a, b, c, d, e, f, g, h, where h is a loop.]

(so that n = 5). Then, if we choose the endpoints of b to be 2 and 5 in this
order, then we have χ_b = (0, 1, 0, 0, −1)^T. (Choosing them to be 5 and 2 instead, we
would obtain χ_b = (0, −1, 0, 0, 1)^T.) If we do the same for all edges of G (that is,
we choose the smaller endpoint as u and the larger endpoint as v), and if we
order the columns so that they correspond to the edges a, b, c, d, e, f, g, h from
left to right, then the matrix M comes out as follows:

\[
M = \begin{pmatrix}
1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
-1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & -1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & -1 & 0 \\
0 & -1 & -1 & -1 & 0 & -1 & 0 & 0
\end{pmatrix}.
\]

It is easy to see that rank M = 4, which is precisely |V | − conn G.]


[Remark: The claim of the exercise can be restated as follows: The span of
the vectors χe for all e ∈ E has dimension |V | − conn G.
Topologists will recognize the matrix M as (a matrix that represents) the
boundary operator ∂ : C1 ( G ) → C0 ( G ), where G is viewed as a CW-
complex.]
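The example in this exercise can be checked numerically. The following Python sketch builds the matrix M from a list of endpoint pairs and computes its rank over Q by Gaussian elimination with exact fractions. Some things here are assumptions: the endpoint pairs are read off from the columns of the matrix M displayed in the example, the loop h is placed at vertex 4 only as a guess (its column is zero wherever it sits), and the function names are hypothetical. This verifies one instance of the claim and is in no way a solution of the exercise.

```python
from fractions import Fraction


def incidence_matrix(n, edges):
    """Rows are indexed by the vertices 1..n, columns by the edges in the given order."""
    columns = []
    for u, v in edges:
        column = [0] * n
        if u != v:            # a loop contributes the zero vector
            column[u - 1] = 1
            column[v - 1] = -1
        columns.append(column)
    return [[column[i] for column in columns] for i in range(n)]


def rank(matrix):
    """Rank over Q, computed by row reduction to echelon form."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0  # number of pivots found so far
    for c in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


# Edges a, b, c, d, e, f, g, h with the smaller endpoint listed first.
edges = [(1, 2), (2, 5), (1, 5), (3, 5), (2, 3), (4, 5), (3, 4), (4, 4)]
M = incidence_matrix(5, edges)
print(rank(M))  # 4, which equals |V| - conn G = 5 - 1
```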

Exercise 3.7. If G is a multigraph, then conn G shall denote the number of


connected components of G. (Note that this is 0 when G has no vertices, and
1 if G is connected.)
Let (V, H, ϕ) be a multigraph. Let E and F be two subsets of H.

(a) Prove that

conn (V, E, ϕ | E ) + conn (V, F, ϕ | F )


≤ conn (V, E ∪ F, ϕ | E∪ F ) + conn (V, E ∩ F, ϕ | E∩ F ) . (2)

[Hint: Feel free to restrict yourself to the case of a simple graph; in this
case, E and F are two subsets of P2 (V ), and you have to show that

conn (V, E) + conn (V, F) ≤ conn (V, E ∪ F) + conn (V, E ∩ F) .

This isn’t any easier than the general case, but saves you the hassle of
carrying the map ϕ around.]

(b) Give an example where the inequality (2) does not become an equality.

[Solution: This is Exercise 3 on homework set #3 from my Spring 2017


course; see the course page for solutions.]

Exercise 3.8. Let G = (V, E, ϕ) be a connected multigraph with 2m edges,


where m ∈ N. A set {e, f } of two distinct edges will be called a friendly
couple if e and f have at least one endpoint in common. Prove that the
edge set of G can be decomposed into m disjoint friendly couples (i.e., there
exist m disjoint friendly couples {e1 , f 1 } , {e2 , f 2 } , . . . , {em , f m } such that E =
{e1 , f 1 , e2 , f 2 , . . . , em , f m }). (“Disjoint” means “disjoint as sets” – i.e., having
no edges in common.)

[Example: Here is a graph with an even number of edges:

[Figure: a graph with six edges a, b, c, x, y, z.]

One possible decomposition of its edge set into disjoint friendly couples is
{a, y} , {b, z} , {c, x }.]
[Hint: Induct on | E| . Pick a vertex v of degree > 1 and consider the
components of G \ v.]

Exercise 3.9. Let n ≥ 0. Let d1 , d2 , . . . , dn be n nonnegative integers such that


d1 + d2 + · · · + dn is even.

(a) Prove that there exists a multigraph G with vertex set {1, 2, . . . , n} such
that all i ∈ {1, 2, . . . , n} satisfy deg i = di .

(b) Prove that there exists a loopless multigraph G with vertex set
{1, 2, . . . , n} such that all i ∈ {1, 2, . . . , n} satisfy deg i = di if and only
if each i ∈ {1, 2, . . . , n} satisfies the inequality

\[
\sum_{\substack{j \in \{1, 2, \ldots, n\}; \\ j \neq i}} d_j \geq d_i. \tag{3}
\]

[Remark: The inequality (3) is the “n-gon inequality”: It is equivalent to


the existence of a (possibly degenerate) n-gon with sidelengths d1 , d2 , . . . , dn .]

Exercise 3.10. Let G be a loopless multigraph. Recall that a trail (in G) means
a walk whose edges are distinct (but whose vertices are not necessarily dis-
tinct). Let u and v be two vertices of G. As usual, “trail from u to v” means
“trail that starts at u and ends at v”. Prove that

(the number of trails from u to v in G)


≡ (the number of paths from u to v in G) mod 2.

[Hint: Try to pair up the non-path trails into pairs. Make sure to prove that
this pairing is well-defined (i.e., each non-path trail t has exactly one partner,
which is not itself, and that t is the designated partner of its partner!).]

Exercise 3.11. Let G be a multigraph such that every vertex of G has even
degree. Let u and v be two distinct vertices of G. Prove that the number of
paths from u to v is even.
[Hint: When you add an edge joining u to v, the graph G becomes a graph
with exactly two odd-degree vertices u and v, and the claim becomes “the
number of paths from u to v is odd” (why?). In this form, the claim turns
out to be easier to prove. Indeed, any path must start with some edge...
Keep in mind that paths can be replaced by trails, by Exercise 3.10.]

Exercise 3.12. Let G = (V, E, ϕ) be a multigraph such that |E| > |V|. Prove
that G has a cycle of length ≤ (2n + 2)/3, where n = |V|.
[Solution: This is Exercise 8 on midterm #3 from my Spring 2017 course
(except that the simple graph was replaced by a multigraph); see the course
page for solutions.]

3.4. Eulerian circuits and walks


3.4.1. Definitions
Let us now move on to a new feature of multigraphs, one that we have not yet
studied (even for simple graphs).
Recall that a Hamiltonian path or cycle is a path or cycle that contains all
vertices of the graph. Being a path or cycle, it has to contain each of them
exactly once (except, in the case of a cycle, for its starting point).
What about a walk or closed walk that contains all edges exactly once in-
stead? These are called “Eulerian” walks or circuits; here is the formal defini-
tion:

Definition 3.4.1. Let G be a multigraph.

(a) A walk of G is said to be Eulerian if each edge of G appears exactly


once in this walk.
(In other words: A walk (v0 , e1 , v1 , e2 , v2 , . . . , ek , vk ) of G is said to be
Eulerian if for each edge e of G, there exists exactly one i ∈ {1, 2, . . . , k}
such that e = ei .)

(b) An Eulerian circuit of G means a circuit (i.e., closed walk) of G that is


Eulerian. (Strictly speaking, the preceding sentence is redundant, but
we still said it to stress the notion of an Eulerian circuit.)

Unlike for Hamiltonian paths and cycles, an Eulerian walk or circuit is usu-
ally not a path or cycle. Also, finding an Eulerian walk in a multigraph G is not
the same as finding an Eulerian walk in the simple graph Gsimp . (Nevertheless,
some authors call Eulerian walks “Eulerian paths” and call Eulerian circuits
“Eulerian cycles”. This is rather confusing.)
Example 3.4.2. Consider the following multigraphs:

[Figure: drawings of the eight multigraphs A, B, C, D, E, F, G and H.]

• The multigraph A has an Eulerian walk
(3, d, 5, b, 2, e, 3, g, 4, f, 5, c, 1, a, 2). But A has no Eulerian circuit.


The easiest way to see this is by observing that A has a vertex of odd
degree (e.g., the vertex 2). If an Eulerian circuit were to exist, then it
would have to enter this vertex as often as it exited it; but this would
mean that the degree of this vertex would be even (because each edge
containing this vertex would be used exactly once either to enter or
to exit it, except for loops, which would be used twice). So, more
generally, any multigraph that has a vertex of odd degree cannot have
an Eulerian circuit.

• The multigraph B has an Eulerian circuit (1, a, 2, b, 3, c, 4, d, 1), and thus


of course an Eulerian walk (since any Eulerian circuit is an Eulerian
walk).

• The multigraph C has an Eulerian circuit
(1, g, 1, b, 2, c, 3, d, 2, e, 4, f, 2, a, 1).

• The multigraph D has no Eulerian walk. Indeed, it has four vertices of
odd degree. If v is a vertex of odd degree, then any Eulerian walk has to
either start or end at v (since otherwise, the walk would enter and leave
v equally often, but then the degree of v would be even). But a walk
can only have one starting point and one ending point. This allows for
two vertices of odd degree, but not more than two. So, more generally,
any multigraph that has more than two vertices of odd degree cannot
have an Eulerian walk.

• The multigraph E has no Eulerian walk. The reason is the same as for
D. Note that E is the famous multigraph of bridges in Königsberg, as
studied by Euler in 1736 (see the Wikipedia page for “Seven bridges of
Königsberg” for the backstory).

• The multigraph F has no Eulerian walk, since it has two components,


each containing at least one edge. (An Eulerian walk would have to
contain both edges b and c, but there is no way to walk between them,
since they belong to different components.)

• The multigraph G has an Eulerian walk, namely
(3, b, 2, h, 5, g, 1, a, 2, f, 4, d, 1, e, 3, c, 4). It has no Eulerian circuit,
since it has two vertices of odd degree.

• The multigraph H has an Eulerian circuit, namely (1).
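Definition 3.4.1 is easy to check mechanically. The Python sketch below tests whether a given list is an Eulerian walk or an Eulerian circuit of a multigraph. The encoding is an assumption for this example: walks are lists alternating vertices and edge names, and edges are a dictionary from edge names to endpoint pairs. The endpoint pairs used for A and B are the ones consistent with the walks listed above, since the drawings are not reproduced here.

```python
def is_walk(walk, edges):
    """Check that [v0, e1, v1, ..., ek, vk] uses edges with matching endpoints."""
    if len(walk) % 2 == 0:
        return False
    return all(
        walk[i] in edges and set(edges[walk[i]]) == {walk[i - 1], walk[i + 1]}
        for i in range(1, len(walk), 2)
    )


def is_eulerian_walk(walk, edges):
    """Each edge of the multigraph must appear exactly once in the walk."""
    return is_walk(walk, edges) and sorted(walk[1::2]) == sorted(edges)


def is_eulerian_circuit(walk, edges):
    return is_eulerian_walk(walk, edges) and walk[0] == walk[-1]


# Assumed endpoints of the edges of A and B (consistent with the walks in the text).
A = {"a": (1, 2), "b": (2, 5), "c": (1, 5), "d": (3, 5),
     "e": (2, 3), "f": (4, 5), "g": (3, 4)}
B = {"a": (1, 2), "b": (2, 3), "c": (3, 4), "d": (4, 1)}

walk_A = [3, "d", 5, "b", 2, "e", 3, "g", 4, "f", 5, "c", 1, "a", 2]
circuit_B = [1, "a", 2, "b", 3, "c", 4, "d", 1]

print(is_eulerian_walk(walk_A, A), is_eulerian_circuit(walk_A, A))
print(is_eulerian_circuit(circuit_B, B))
```

The walk of A passes the first test but not the second, because it starts at 3 and ends at 2.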

Remark 3.4.3. For the pedants: A multigraph can have an Eulerian circuit
even if it is not connected, as long as all its edges belong to the same component
(i.e., all but one of its components are just singletons with no edges). Here is
an example:

[Figure: a multigraph on the vertices 1, 2, . . . , 6 with edges a, b, c, d, in which the vertices 5 and 6 are isolated.]

Exercise 3.13. Let n be a positive integer. Recall from Definition 2.6.1 (a) that
Kn denotes the complete graph on n vertices. This is the graph with vertex
set V = {1, 2, . . . , n} and edge set P2 (V ) (so each two distinct vertices are
adjacent).
Find Eulerian circuits for the graphs K3 , K5 , and K7 .
[Solution: This is Exercise 2 on homework set #2 from my Spring 2017
course; see the course page for solutions.]

3.4.2. The Euler–Hierholzer theorem


How hard is it to find an Eulerian walk or circuit in a multigraph, or to check
if there is any? Surprisingly, this is a lot easier than the same questions for
Hamiltonian paths or cycles. The second question in particular is answered
(for connected multigraphs) by the Euler–Hierholzer theorem:

Theorem 3.4.4 (Euler, Hierholzer). Let G be a connected multigraph. Then:

(a) The multigraph G has an Eulerian circuit if and only if each vertex of
G has even degree.

(b) The multigraph G has an Eulerian walk if and only if all but at most
two vertices of G have even degree.
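The criterion of Theorem 3.4.4 only requires counting degrees. Here is a minimal Python sketch; it assumes that the multigraph passed in is connected (the code does not check this) and that edges are given as a dictionary from hypothetical edge names to endpoint pairs. The sample data is the usual rendering of the Königsberg bridges with vertex names N, S, I, O chosen for this example, not the labelling used for the multigraph E in Example 3.4.2.

```python
def degrees(vertices, edges):
    """Degree of each vertex; a loop counts twice towards its endpoint."""
    deg = {v: 0 for v in vertices}
    for u, v in edges.values():
        deg[u] += 1
        deg[v] += 1
    return deg


def euler_status(vertices, edges):
    """Apply Theorem 3.4.4 to a multigraph that is assumed to be connected."""
    odd = [v for v, d in degrees(vertices, edges).items() if d % 2 == 1]
    return {"circuit": len(odd) == 0, "walk": len(odd) <= 2}


# Seven bridges joining the two banks (N, S), the island (I) and the eastern land mass (O).
V = {"N", "S", "I", "O"}
bridges = {1: ("N", "I"), 2: ("N", "I"), 3: ("S", "I"), 4: ("S", "I"),
           5: ("N", "O"), 6: ("S", "O"), 7: ("I", "O")}
print(degrees(V, bridges))
print(euler_status(V, bridges))
```

All four degrees are odd here, so neither an Eulerian circuit nor an Eulerian walk exists.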

We already proved the “=⇒” directions of both parts (a) and (b) in Example
3.4.2. It remains to prove the “⇐=” directions. I don’t think that Euler actually
proved them in his 1736 paper, but Hierholzer did in 1873. The “standard”
proof can be found in many texts, such as [Guicha16, Theorem 5.2.2 and The-
orem 5.2.3]. I will sketch a different proof, which I learnt from [LeLeMe18,
Problem 12.35]. We begin with the following definition:

Definition 3.4.5. Let G be a multigraph. A trail of G means a walk of G


whose edges are distinct.

So a trail can repeat vertices, but cannot repeat edges.


Thus, an Eulerian walk has to be a trail. A trail cannot be any longer than
an Eulerian walk. So a reasonable way to try constructing an Eulerian walk
is to start with some trail, and make it progressively longer until it becomes
Eulerian (hopefully).
This suggests the following approach to proving the “⇐=” directions of The-
orem 3.4.4: We pick the longest trail of G and argue that (under the right
assumptions) it has to be Eulerian, since otherwise there would be a way to
make it longer. Of course, we need to find such a way. Here is the first step:

Lemma 3.4.6. Let G be a multigraph with at least one vertex. Then, G has a
longest trail.

Proof. Clearly, G has at least one trail (e.g., a length-0 trail from a vertex to
itself). Moreover, G has only finitely many trails (since each edge of G can only
be used once in a trail, and there are only finitely many edges). Hence, the
maximum principle proves the lemma.
Our goal now is to show that under appropriate conditions, such a longest
trail will be Eulerian. This will require two further lemmas.
First, one more piece of notation: We say that an edge e of a multigraph G
intersects a walk w if at least one endpoint of e is a vertex of w. Here is what
this can look like:

[Figures: three drawings of an edge e intersecting a walk w, in which the edges of w are marked with a “w” underneath them. In the first, one endpoint of e is a vertex of w; in the second, the endpoint of e that is a vertex of w happens to be the starting point of w; in the third, both endpoints of e happen to be vertices of w.]

Be careful with such pictures, though: A walk doesn’t have to be a path; it can
visit a vertex any number of times!

Lemma 3.4.7. Let G be a connected multigraph. Let w be a walk of G. As-


sume that there exists an edge of G that is not an edge of w.
Then, there exists an edge of G that is not an edge of w but intersects w.

Proof. We assumed that there exists an edge of G that is not an edge of w. Pick
such an edge, and call it f .
A “w- f -path” will mean a path from a vertex of w to an endpoint of f . Such
a path clearly exists, since G is connected. Thus, we can pick a shortest such
path. If this shortest path has length 0, then we are done (since f intersects w in
this case). If not, we consider the first edge of this path. This first edge cannot
be an edge of w, because otherwise we could remove it from the path and get
an even shorter w- f -path. But it clearly intersects w. So we have found an edge
of G that is not an edge of w but intersects w. This proves the lemma.

Lemma 3.4.8. Let G be a multigraph such that each vertex of G has even
degree. Let w be a longest trail of G. Then, w is a closed walk.

Proof. Assume the contrary. Let u be the starting point and v the ending point
of w. Since we assumed that w is not a closed walk, we thus have u ≠ v.
Consider the edges of w that contain v. Such edges are of two kinds: those
by which w enters v (this means that v comes immediately after this edge in
w), and those by which w leaves v (this means that v comes immediately before
this edge in w). 18 Except for the very last edge of w, each edge of the former
kind is immediately followed by an edge of the latter kind; conversely, each
edge of the latter kind is immediately preceded by an edge of the former kind
(since w starts at the vertex u, which is distinct from v). Hence, the walk w has
exactly one more edge entering v than it has edges leaving v. Thus, the number
of edges of w that contain v (with loops counting twice) is odd. However, the
total number of edges of G that contain v (with loops counting twice) is even
(because it is the degree of v, but we assumed that each vertex of G has even
degree). So these two numbers are distinct. Thus, there is at least one edge of
G that contains v but is not an edge of w.
Fix such an edge and call it f . Now, append f to the trail w at the end. The
result will be a trail (since f is not an edge of w) that is longer than w. But this
contradicts the fact that w is a longest trail. Thus, the lemma is proved.
We can now finish the proof of the Euler–Hierholzer theorem:
Proof of Theorem 3.4.4. (a) =⇒: We proved this back in Example 3.4.2.
⇐=: Assume that each vertex of G has even degree.
By Lemma 3.4.6, we know that G has a longest trail. Fix such a longest trail,
and call it w. Then, Lemma 3.4.8 shows that w is a closed walk.

18 Loops whose only endpoint is v count as both.



We claim that w is Eulerian. Indeed, assume the contrary. Then, there exists
an edge of G that is not an edge of w. Hence, Lemma 3.4.7 shows that there
exists an edge of G that is not an edge of w but intersects w. Fix such an edge,
and call it f .
Since f intersects w, there exists an endpoint v of f that is a vertex of w.
Consider this v. Since w is a closed trail, we can WLOG assume that w starts
and ends at v (since we can otherwise achieve this by rotating19 w). Then, we
can append the edge f to the trail w. This results in a new trail (since f is not
an edge of w) that is longer than w. And this contradicts the fact that w is a
longest trail of G.
This contradiction proves that w is Eulerian. Hence, w is an Eulerian circuit
(since w is a closed walk). Thus, the “⇐=” direction of Theorem 3.4.4 (a) is
proven.
(b) =⇒: Already proved in Example 3.4.2.
⇐=: Assume that all but at most two vertices of G have even degree. We
must prove that G has an Eulerian walk.
If each vertex of G has even degree, then this follows from Theorem 3.4.4 (a),
since every Eulerian circuit is an Eulerian walk. Thus, we WLOG assume that
not each vertex of G has even degree. In other words, the number of vertices of
G having odd degree is positive.
The handshake lemma for multigraphs (i.e., Corollary 3.3.3) shows that the
number of vertices of G having odd degree is even. Furthermore, this number
is at most 2 (since all but at most two vertices of G have even degree). So this
number is even, positive and at most 2. Thus, this number is 2. In other words,
the multigraph G has exactly two vertices having odd degree. Let u and v be
these two vertices.
Add a new edge e that has endpoints u and v to the multigraph G (do this
even if there already is such an edge!20 ). Let G ′ denote the resulting multi-
graph. Then, in G ′ , each vertex has even degree (since the newly added edge
e has increased the degrees of u and v by 1, thus turning them from odd to
even). Moreover, G ′ is still connected (since G was connected, and the newly
added edge e can hardly take that away). Thus, we can apply Theorem 3.4.4
(a) to G ′ instead of G. As a result, we conclude that G ′ has an Eulerian circuit.
Cutting the newly added edge e out of this Eulerian circuit21 , we obtain an Eulerian
walk of G. Hence, G has an Eulerian walk. Thus, the “⇐=” direction of
Theorem 3.4.4 (b) is proven.

19 Rotating a closed walk (w0 , e1 , w1 , e2 , w2 , . . . , ek , wk ) means moving its first vertex and its first
edge to the end, i.e., replacing the walk by (w1 , e2 , w2 , e3 , w3 , . . . , ek , wk , e1 , w1 ). This always
results in a closed walk again. For example, if (1, a, 2, b, 3, c, 1) is a closed walk, then we can
rotate it to obtain (2, b, 3, c, 1, a, 2); then, rotating it one more time, we obtain (3, c, 1, a, 2, b, 3).
Clearly, by rotating a closed walk several times, we can make it start at any of its vertices.
Moreover, if we rotate a closed trail, then we obtain a closed trail.
20 This is a time to be grateful for the notion of a multigraph. We could not do this with simple
graphs!
21 More precisely: We rotate this circuit until e becomes its last edge, and then we remove this
last edge to obtain a walk.

An introduction to graph theory, version August 2, 2023 page 100

Note: If you look closely at the above proof, you will see hidden in it an
algorithm for finding Eulerian circuits and walks.22
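The algorithmic idea hinted at in this note can be illustrated in code. The following is a minimal Python sketch of Hierholzer's algorithm, a standard method for finding Eulerian circuits; it is not a literal transcription of the trail-extension procedure in the proof. The function name `eulerian_circuit`, the dictionary encoding of the multigraph (edge name mapped to its two endpoints) and the sample multigraph are assumptions made for illustration. The sketch assumes that the multigraph is connected and that every vertex has even degree.

```python
from collections import defaultdict

def eulerian_circuit(edges, start):
    """Return an Eulerian circuit [v0, e1, v1, ..., ek, vk] with v0 == vk == start.

    `edges` maps each edge name to a pair (u, v) of endpoints (u == v for a loop).
    Assumes the multigraph is connected and all degrees are even.
    """
    incident = defaultdict(list)          # vertex -> list of (edge, other endpoint)
    for e, (u, v) in edges.items():
        incident[u].append((e, v))
        if u != v:
            incident[v].append((e, u))
    used = set()                          # edges already traversed
    stack = [(start, None)]               # (vertex, edge used to reach it)
    popped = []
    while stack:
        v, _ = stack[-1]
        # Discard edges at v that were already traversed from their other end.
        while incident[v] and incident[v][-1][0] in used:
            incident[v].pop()
        if incident[v]:
            e, w = incident[v].pop()
            used.add(e)
            stack.append((w, e))          # extend the current trail
        else:
            popped.append(stack.pop())    # dead end: this vertex is finished
    # The popped entries, read in order, form the circuit.
    circuit = []
    for v, e in popped:
        circuit.append(v)
        if e is not None:
            circuit.append(e)
    return circuit

# Hypothetical connected multigraph in which every vertex has even degree.
sample_edges = {"a": (1, 2), "b": (2, 3), "c": (3, 1), "d": (3, 4), "e": (4, 3)}
sample_circuit = eulerian_circuit(sample_edges, 1)
print(sample_circuit)
```

To obtain an Eulerian walk in the situation of part (b), one could add the extra edge e between the two odd-degree vertices, run the sketch, and then cut e out of the resulting circuit, exactly as in the proof.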

Exercise 3.14. Let G be a connected multigraph. Let m be the number of
vertices of G that have odd degree. Prove that we can add m/2 new edges to
G in such a way that the resulting multigraph will have an Eulerian circuit.
(It is allowed to add an edge even if there is already an edge between the
same two vertices.)
[Solution: This exercise is Exercise 6 on midterm #1 from my Spring 2017
course; see the course page for solutions.]

Exercise 3.15. Let G = (V, E, ϕ) be a multigraph. The line graph L ( G ) is
defined as the simple graph ( E, F), where
F = {{e1 , e2 } ∈ P2 ( E) | ϕ (e1 ) ∩ ϕ (e2 ) ≠ ∅} .
(In other words, L ( G ) is the graph whose vertices are the edges of G, and in
which two vertices e1 and e2 are adjacent if and only if the edges e1 and e2 of
G share a common endpoint.)
[Example: Here is a multigraph G along with its line graph L ( G ):

[Figure: drawing of a multigraph G with vertices 1, 2, 3, 4 and edges a, b, c, d, next to its line graph L ( G ), whose vertices are a, b, c, d.]

Note that L ( G ) does not always determine G uniquely.]


Assume that |V | > 1. Prove the following:
22 You might be skeptical about this. After all, in order to apply Lemma 3.4.8, we need a longest
trail, so you might wonder how we can find a longest trail to begin with.
Fortunately, we don’t need to take Lemma 3.4.8 this literally. Our above proof of Lemma
3.4.8 can be used even if w is not a longest trail. In this case, however, instead of showing
that w is a closed walk, this proof may show us a way how to make w longer. In other
words, by following this proof, we may discover a trail longer than w. In this case, we can
replace w by this longer trail, and then apply Lemma 3.4.8 again. We can repeat this over
and over again, until we do end up with a closed walk. (This will eventually happen, since
we know that a trail cannot be longer than the total number of edges of G.)

(a) If G has a Hamiltonian path, then L ( G ) has a Hamiltonian path.

(b) If G has an Eulerian walk, then L ( G ) has a Hamiltonian path.

[Solution: This exercise is Exercise 2 on midterm #1 from my Spring 2017
course (generalized from simple graphs to multigraphs); see the course page
for solutions.]

4. Digraphs and multidigraphs


4.1. Definitions
We have so far seen two concepts of graphs: simple graphs and multigraphs.
For all their differences, these two concepts have one thing in common: The
two endpoints of an edge are equal in rights. Thus, when defining walks, each
edge serves as a “two-way road”. Hence, such graphs are good at modelling
symmetric relations between things.
We shall now introduce two analogous versions of “graphs” in which the
edges have directions. These versions are known as directed graphs (short:
digraphs). In such directed graphs, each edge will have a specified starting
point (its “source”) and a specified ending point (its “target”). Correspondingly,
we will draw these edges as arrows, and we will only allow using them in one
direction (viz., from source to target) when we walk down the graph. Here are
the definitions in detail:

Definition 4.1.1. A simple digraph is a pair (V, A), where V is a finite set,
and where A is a subset of V × V.

Definition 4.1.2. Let D = (V, A) be a simple digraph.

(a) The set V is called the vertex set of D; it is denoted by V ( D ).


Its elements are called the vertices (or nodes) of D.

(b) The set A is called the arc set of D; it is denoted by A ( D ).


Its elements are called the arcs (or directed edges) of D.
When u and v are two elements of V, we will occasionally use uv as
a shorthand for the pair (u, v). Note that this means an ordered pair
now!

(c) If (u, v) is an arc of D (or, more generally, a pair in V × V), then u is
called the source of this arc, and v is called the target of this arc.

(d) We draw D as follows: We represent each vertex of D by a point, and
each arc uv by an arrow that goes from the point representing u to the
point representing v.

(e) An arc (u, v) is called a loop (or self-loop) if u = v. (In other words, an
arc is a loop if and only if its source is its target.)

Example 4.1.3. For each n ∈ N, we define the divisibility digraph on
{1, 2, . . . , n} to be the simple digraph (V, A), where V = {1, 2, . . . , n} and

A = {(i, j) ∈ V × V | i divides j} .

For example, for n = 6, this digraph looks as follows:

[Figure: drawing of the divisibility digraph on {1, 2, 3, 4, 5, 6}.]   (4)

Note that simple digraphs (unlike simple graphs) are allowed to have loops
(i.e., arcs of the form (v, v)).
Definition 4.1.4. A multidigraph is a triple (V, A, ψ), where V and A are
two finite sets, and ψ : A → V × V is a map.

Definition 4.1.5. Let D = (V, A, ψ) be a multidigraph.

(a) The set V is called the vertex set of D; it is denoted by V ( D ).


Its elements are called the vertices (or nodes) of D.

(b) The set A is called the arc set of D; it is denoted by A ( D ).


Its elements are called the arcs (or directed edges) of D.

(c) If a is an arc of D, and if ψ ( a) = (u, v), then the vertex u is called the
source of a, and the vertex v is called the target of a.

(d) We draw D as follows: We represent each vertex of D by a point, and
each arc a by an arrow that goes from the point representing u to the
point representing v, where (u, v) = ψ (a).

Example 4.1.6. Here is a multidigraph:

D = [Figure: drawing of a multidigraph D with vertices 1, 2, 3, 4, 5 and arcs a, b, c, d, e, f, g, h.]   (5)

Formally speaking, this multidigraph is the triple (V, A, ψ), where V =
{1, 2, 3, 4, 5} and A = {a, b, c, d, e, f , g, h} and ψ ( a) = (1, 2) and ψ (b) = (2, 5)
and so on.
Thus, simple digraphs and multidigraphs are analogues of simple graphs
and multigraphs, respectively, in which the edges have been replaced by arcs
(“edges endowed with a direction”). The analogy is perfect except for one point:
simple graphs forbid loops, whereas simple digraphs allow them (although different
authors have different opinions on this).

Convention 4.1.7. The word “digraph” means either “simple digraph” or
“multidigraph”, depending on the context.

The word “digraph” was originally a shorthand for “directed graph”, but
by now it is a technical term that is perfectly understood by everyone in the
subject. (It is also understood by linguists, but in a rather different way.)

4.2. Outdegrees and indegrees


What can we do with digraphs? Many of the things we have done with graphs
can be modified to work with digraphs (although not all their properties will
still hold). For example, the notion of the degree of a vertex in a graph has the
following two counterpart notions for digraphs:

Definition 4.2.1. Let D be a digraph with vertex set V. (This can be either a
simple digraph or a multidigraph.) Let v ∈ V be any vertex. Then:

(a) The outdegree of v denotes the number of arcs of D whose source is v.


This outdegree is denoted deg+ v.

(b) The indegree of v denotes the number of arcs of D whose target is v.


This indegree is denoted deg− v.

Example 4.2.2. In the divisibility digraph on {1, 2, 3, 4, 5, 6} (see (4) for a
drawing), we have

deg+ 1 = 6, deg− 1 = 1, deg+ 2 = 3, deg− 2 = 2,
deg+ 3 = 2, deg− 3 = 2, deg+ 4 = 1, deg− 4 = 3,
deg+ 5 = 1, deg− 5 = 2, deg+ 6 = 1, deg− 6 = 4.
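
The degrees listed in this example can be recomputed mechanically. The following minimal Python sketch builds the divisibility digraph from its definition and counts outdegrees and indegrees; the function names are illustrative choices, not notation from the text.

```python
def divisibility_digraph(n):
    """Return (V, A) for the divisibility digraph on {1, 2, ..., n}."""
    V = list(range(1, n + 1))
    A = {(i, j) for i in V for j in V if j % i == 0}   # arc (i, j) iff i divides j
    return V, A

def outdegree(A, v):
    """Number of arcs whose source is v."""
    return sum(1 for (s, t) in A if s == v)

def indegree(A, v):
    """Number of arcs whose target is v."""
    return sum(1 for (s, t) in A if t == v)

V6, A6 = divisibility_digraph(6)
for v in V6:
    print(v, outdegree(A6, v), indegree(A6, v))
```

Each loop (v, v) contributes 1 to both the outdegree and the indegree of v, which is why deg− 1 = 1 rather than 0.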

Recall Euler’s result (Proposition 3.3.2) saying that in a graph, the sum of all
degrees is twice the number of edges. Here is an analogue of this result for
digraphs:

Proposition 4.2.3 (diEuler). Let D be a digraph with vertex set V and arc set
A. Then,

\[ \sum_{v \in V} \deg^+ v = \sum_{v \in V} \deg^- v = |A| . \]

Proof. By the definition of an outdegree, we have

deg+ v = (the number of arcs of D whose source is v)

for each v ∈ V. Thus,

\[
\begin{aligned}
\sum_{v \in V} \deg^+ v
&= \sum_{v \in V} (\text{the number of arcs of } D \text{ whose source is } v) \\
&= (\text{the number of all arcs of } D) \\
&= |A|
\end{aligned}
\]

(where the second equality holds since each arc of D has exactly one source,
and thus is counted exactly once in the sum).

Similarly, $\sum_{v \in V} \deg^- v = |A|$.

(“diEuler” is not a real mathematician; I just gave that moniker to Proposition
4.2.3 in order to stress its analogy with Euler’s 1736 result.)
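
Proposition 4.2.3 is easy to check on a concrete multidigraph. In the following minimal Python sketch, a multidigraph is encoded (as an assumption of this sketch) by a dictionary `psi_demo` sending each arc name to its pair (source, target); the specific arcs are made up for illustration.

```python
from collections import Counter

def degree_tables(psi):
    """Return (outdeg, indeg) as Counters for a multidigraph given by arc -> (source, target)."""
    outdeg = Counter(s for (s, t) in psi.values())
    indeg = Counter(t for (s, t) in psi.values())
    return outdeg, indeg

# Hypothetical multidigraph on the vertices 1, 2, 3, 4 with parallel arcs e, f and a loop g.
psi_demo = {"a": (1, 2), "b": (2, 3), "c": (1, 3), "d": (3, 1),
            "e": (3, 4), "f": (3, 4), "g": (4, 4)}
outdeg_demo, indeg_demo = degree_tables(psi_demo)

# Each arc is counted once by its source and once by its target.
print(sum(outdeg_demo.values()), sum(indeg_demo.values()), len(psi_demo))
```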

4.3. Subdigraphs
Just as we defined subgraphs of a multigraph, we can define subdigraphs (or
“submultidigraphs”, to be very precise) of a digraph:

Definition 4.3.1. Let D = (V, A, ψ) be a multidigraph.

(a) A submultidigraph (or, for short, subdigraph) of D means a multidigraph
of the form E = (W, B, χ), where W ⊆ V and B ⊆ A and
χ = ψ | B . In other words, a submultidigraph of D means a multidigraph
E whose vertices are vertices of D and whose arcs are arcs of D
and whose arcs have the same sources and targets in E as they have in
D.

(b) Let S be a subset of V. The induced subdigraph of D on the set S
denotes the subdigraph

(S, A′ , ψ | A′ )

of D, where

A′ := { a ∈ A | both the source and the target of a belong to S} .

In other words, it denotes the subdigraph of D whose vertices are the el-
ements of S, and whose arcs are precisely those arcs of D whose sources
and targets both belong to S. We denote this induced subdigraph by
D [ S] .

(c) An induced subdigraph of D means a subdigraph of D that is the
induced subdigraph of D on S for some S ⊆ V.
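
As an illustration of part (b), here is a minimal Python sketch that computes D [ S]. The encoding of a multidigraph as a vertex collection plus a dictionary from arc names to (source, target) pairs, the function name `induced_subdigraph`, and the sample multidigraph are assumptions made for this sketch.

```python
def induced_subdigraph(V, psi, S):
    """Return (S, psi restricted to A'), where A' consists of the arcs with source and target in S."""
    S = set(S)
    if not S <= set(V):
        raise ValueError("S must be a subset of the vertex set V")
    restricted = {a: (s, t) for a, (s, t) in psi.items() if s in S and t in S}
    return S, restricted

# Hypothetical multidigraph on the vertices 1, 2, 3, 4.
V_ind = {1, 2, 3, 4}
psi_ind = {"a": (1, 2), "b": (2, 3), "c": (1, 3), "d": (3, 1),
           "e": (3, 4), "f": (3, 4), "g": (4, 4)}

print(induced_subdigraph(V_ind, psi_ind, {1, 2, 3}))
print(induced_subdigraph(V_ind, psi_ind, {3, 4}))
```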

4.4. Conversions
4.4.1. Multidigraphs to multigraphs
Any multidigraph D can be turned into an (undirected) graph G by “removing
the arrowheads” (aka “forgetting the directions of the arcs”):

Definition 4.4.1. Let D be a multidigraph. Then, Dund will denote the multi-
graph obtained from D by replacing each arc with an edge whose endpoints
are the source and the target of this arc. Formally, this is defined as follows:
If D = (V, A, ψ), then Dund = (V, A, ϕ), where the map ϕ : A → P1,2 (V )
sends each arc a ∈ A to the set of the entries of ψ ( a) (that is, to the set
consisting of the source of a and the target of a).

For example, if D is the multidigraph from (5), then Dund is the following
multigraph:

Dund = [Figure: drawing of the multigraph Dund, with the same vertices 1, 2, 3, 4, 5 and the same edge names a, b, c, d, e, f, g, h as D, but without arrowheads.]

4.4.2. Multigraphs to multidigraphs


We have just seen how to turn any multidigraph D into a multigraph Dund by
forgetting the directions of the arcs.
Conversely, we can turn a multigraph G into a multidigraph Gbidir by “du-
plicating” each edge (more precisely: turning each edge into two arcs with
opposite orientations). Here is a formal definition:

Definition 4.4.2. Let G = (V, E, ϕ) be a multigraph. For each edge e ∈ E,
let us choose one of the endpoints of e and call it se ; the other endpoint will
then be called te . (If e is a loop, then we understand te to mean se .)
We then define Gbidir to be the multidigraph (V, E × {1, 2} , ψ), where
the map ψ : E × {1, 2} → V × V is defined as follows: For each edge e ∈ E,
we set
ψ (e, 1) = (se , te ) and ψ (e, 2) = (te , se ) .
We call Gbidir the bidirectionalized multidigraph of G.

Note that the map ψ depends on our choice of se ’s (that is, it depends on
which endpoint of an edge e we choose to be se ). This makes the definition of
Gbidir non-canonical; I don’t know if there is a good way to fix this. Fortunately,
all choices of se ’s will lead to mutually isomorphic multidigraphs Gbidir . (The
notion of isomorphism for multidigraphs is exactly the one that you expect.)

Example 4.4.3. If

G = [Figure: drawing of a multigraph G with vertices 1, 2, 3, 4 and edges a, b, c, g.]

then

Gbidir = [Figure: drawing of the multidigraph Gbidir, with vertices 1, 2, 3, 4 and arcs ( a, 1), ( a, 2), (b, 1), (b, 2), (c, 1), (c, 2), ( g, 1), ( g, 2).]

(Here, for example, we have chosen s a to be 2, so that ta = 3 and ψ ( a, 1) =
(2, 3) and ψ ( a, 2) = (3, 2).) Yes, even the loops of G are duplicated in Gbidir !

The operation that assigns a multidigraph Gbidir to a multigraph G is injective,
i.e., the original graph G can be uniquely reconstructed from Gbidir . This is
in stark contrast to the operation D ↦ Dund , which destroys information (the
directions of the arcs). Note that the multigraph (Gbidir)und is not isomorphic
to G, since each edge of G is doubled in (Gbidir)und .
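
The two conversions can be sketched in a few lines of Python. The encoding is an assumption of this sketch: a multigraph is given by a dictionary sending each edge to its chosen pair (se , te ), and a multidigraph by a dictionary sending each arc to (source, target). Only the endpoints of the edge a are taken from Example 4.4.3; the endpoints of b, c and g are hypothetical.

```python
def bidir(chosen_endpoints):
    """G -> G^bidir: each edge e becomes the two arcs (e, 1) and (e, 2) with opposite orientations."""
    psi = {}
    for e, (s, t) in chosen_endpoints.items():
        psi[(e, 1)] = (s, t)
        psi[(e, 2)] = (t, s)
    return psi

def und(psi):
    """D -> D^und: each arc becomes an edge whose endpoint set forgets the direction."""
    return {a: frozenset(pair) for a, pair in psi.items()}

# s_a = 2 and t_a = 3 as in Example 4.4.3; the other endpoints are made up for illustration.
G_demo = {"a": (2, 3), "b": (1, 4), "c": (1, 2), "g": (1, 1)}
G_bidir = bidir(G_demo)
G_bidir_und = und(G_bidir)

print(len(G_demo), len(G_bidir), len(G_bidir_und))
```

The edge counts printed at the end show the doubling: (Gbidir)und has twice as many edges as G.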

4.4.3. Simple digraphs to multidigraphs


Next, we introduce another operation: one that turns simple digraphs into
multidigraphs. This is very similar to the operation G ↦ Gmult that turns
simple graphs into multigraphs, so we will even use the same notation for it.
Its definition is as follows:

Definition 4.4.4. Let D = (V, A) be a simple digraph. Then, the corresponding
multidigraph Dmult is defined to be the multidigraph

(V, A, ι) ,

where ι : A → V × V is the map sending each a ∈ A to a itself.

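Example 4.4.5. If

D = [Figure: drawing of a simple digraph D with vertices 1, 2, 3, 4 and arcs 12, 23, 13, 34.]

then

Dmult = [Figure: drawing of the multidigraph Dmult with the same vertices, whose arcs are labeled (1, 2), (2, 3), (1, 3), (3, 4).]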

4.4.4. Multidigraphs to simple digraphs


There is also an operation D ↦ Dsimp that turns multidigraphs into simple
digraphs:23

Definition 4.4.6. Let D = (V, A, ψ) be a multidigraph. Then, the underlying
simple digraph Dsimp of D means the simple digraph

(V, {ψ ( a) | a ∈ A}) .

In other words, it is the simple digraph with vertex set V in which there is an
arc from u to v if and only if there exists an arc from u to v in D. Thus, Dsimp is obtained
from D by “collapsing” parallel arcs (i.e., arcs having the same source and
the same target) to a single arc.

23 I will use a notation that I probably should have introduced before: If u and v are two vertices
of a digraph, then an “arc from u to v” means an arc with source u and target v.

Example 4.4.7. If

D = [Figure: drawing of a multidigraph D with vertices 1, 2, 3, 4 and arcs a, b, c, d, e, f, g.]

then

Dsimp = [Figure: drawing of the simple digraph Dsimp on the vertices 1, 2, 3, 4.]

Note that the arcs c and d have not been “collapsed” into one arc, since they
do not have the same source and the same target. Likewise, the loop g has
been preserved (unlike for undirected graphs).
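
The operations D ↦ Dmult and D ↦ Dsimp are short enough to sketch in Python. The encodings (a simple digraph as a set of ordered pairs, a multidigraph as a dictionary from arc names to (source, target) pairs) and the function names are assumptions of this sketch. The endpoints assigned to the arcs a, ..., g below are likewise an assumption, chosen to be consistent with the cycles listed in Example 4.5.5 and with the adjacency matrix in Remark 4.5.11.

```python
def mult(A):
    """D -> D^mult: every arc of the simple digraph becomes an arc named by itself."""
    return {arc: arc for arc in A}

def simp(psi):
    """D -> D^simp: collapse parallel arcs by keeping only the set of (source, target) pairs."""
    return set(psi.values())

# Assumed endpoints: e and f are parallel arcs, g is a loop, c and d are opposite arcs.
psi_447 = {"a": (1, 2), "b": (2, 3), "c": (1, 3), "d": (3, 1),
           "e": (3, 4), "f": (3, 4), "g": (4, 4)}
A_simp = simp(psi_447)
print(sorted(A_simp))

# Going from a simple digraph to a multidigraph and back loses nothing.
A_445 = {(1, 2), (2, 3), (1, 3), (3, 4)}
print(simp(mult(A_445)) == A_445)
```

In this sketch, only the parallel arcs e and f are merged; the opposite arcs c and d and the loop g survive, matching the remark after Example 4.4.7.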

4.4.5. Multidigraphs as a big tent


A takeaway from this all is that multidigraphs are the “most general” notion of
graphs we have introduced so far. Indeed, using the operations we have seen
so far, we can convert every notion of graphs into a multidigraph:

• Each simple graph becomes a multigraph via the G ↦ Gmult operation.

• Each multigraph, in turn, becomes a multidigraph via the G ↦ Gbidir
operation.

• Each simple digraph becomes a multidigraph via the D ↦ Dmult operation.

Since all three of these operations are injective (i.e., lose no information), we
thus can encode each of our four notions of graphs as a multidigraph. Con-
sequently, any theorem about multidigraphs can be specialized to the other
three types of graphs. This doesn’t mean that any theorem on any other type
of graphs can be generalized to multidigraphs, though (e.g., Mantel’s theorem
holds only for simple graphs) – but when it can, we will try to state it at the
most general level possible, to avoid doing the same work twice.

4.5. Walks, paths, closed walks, cycles


4.5.1. Definitions
Let us now define various kinds of walks for simple digraphs and for multidi-
graphs.
For simple digraphs, we imitate the definitions from Sections 2.9 and 2.10 as
best as we can, making sure to require all arcs to be traversed in the correct
direction:

Definition 4.5.1. Let D be a simple digraph. Then:

(a) A walk (in D) means a finite sequence (v0 , v1 , . . . , vk ) of vertices of D
(with k ≥ 0) such that all of the pairs v0 v1 , v1 v2 , v2 v3 , . . . , vk−1 vk are
arcs of D. (The latter condition is vacuously true if k = 0.)

(b) If w = (v0 , v1 , . . . , vk ) is a walk in D, then:

• The vertices of w are defined to be v0 , v1 , . . . , vk .
• The arcs of w are defined to be the pairs
v0 v1 , v1 v2 , v2 v3 , . . . , vk−1 vk .
• The nonnegative integer k is called the length of w. (This is the
number of all arcs of w, counted with multiplicity. It is 1 smaller
than the number of all vertices of w, counted with multiplicity.)
• The vertex v0 is called the starting point of w. We say that w starts
(or begins) at v0 .
• The vertex vk is called the ending point of w. We say that w ends
at vk .

(c) A path (in D) means a walk (in D) whose vertices are distinct. In other
words, a path means a walk (v0 , v1 , . . . , vk ) such that v0 , v1 , . . . , vk are
distinct.

(d) Let p and q be two vertices of D. A walk from p to q means a walk that
starts at p and ends at q. A path from p to q means a path that starts at
p and ends at q.

(e) A closed walk of D means a walk whose first vertex is identical with
its last vertex. In other words, it means a walk (w0 , w1 , . . . , wk ) with
w0 = wk . Sometimes, closed walks are also known as circuits (but
many authors use this latter word for something slightly different).

(f) A cycle of D means a closed walk (w0 , w1 , . . . , wk ) such that k ≥ 1 and
such that the vertices w0 , w1 , . . . , wk−1 are distinct.

Note that we replaced the condition k ≥ 3 by k ≥ 1 in the definition of a
cycle, since simple digraphs can have loops. Fortunately, with the arcs being
directed, we no longer have to worry about the same arc being traversed back
and forth, so we need no extra condition to rule this out.

Example 4.5.2. Consider the simple digraph

D = [Figure: drawing of a simple digraph D with vertices 1, 2, 3, 4.]

Then, (1, 2, 3, 4) and (1, 3, 4) are two walks of D, and these walks are paths.
But (2, 3, 1) is not a walk (since you cannot use the arc 13 to get from 3 to 1).
This digraph D has no cycles, and its only closed walks have length 0.

Example 4.5.3. Consider the simple digraph

D = [Figure: drawing of a simple digraph D with vertices 1, 2, 3, 4.]

Then, (1, 2, 3, 1) and (3, 4, 3) and (4, 4) are cycles of D. Moreover,
(1, 2, 3, 4, 3, 1) is a closed walk but not a cycle.
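
The notions from Definition 4.5.1 translate directly into predicates. In the following minimal Python sketch, a simple digraph is encoded as a set of ordered pairs (an assumption of the sketch), and the arc sets `A_452` and `A_453` are assumptions too: they contain the arcs that the walks named in Examples 4.5.2 and 4.5.3 traverse, which need not be all arcs of the digraphs drawn there.

```python
def is_walk(A, vs):
    """A nonempty vertex sequence in which each consecutive pair is an arc."""
    return len(vs) >= 1 and all((vs[i], vs[i + 1]) in A for i in range(len(vs) - 1))

def is_path(A, vs):
    """A walk whose vertices are distinct."""
    return is_walk(A, vs) and len(set(vs)) == len(vs)

def is_closed_walk(A, vs):
    """A walk whose first vertex equals its last vertex."""
    return is_walk(A, vs) and vs[0] == vs[-1]

def is_cycle(A, vs):
    """A closed walk (w0, ..., wk) with k >= 1 and w0, ..., w_{k-1} distinct."""
    k = len(vs) - 1
    return is_closed_walk(A, vs) and k >= 1 and len(set(vs[:-1])) == k

A_452 = {(1, 2), (2, 3), (1, 3), (3, 4)}                    # assumed arcs for Example 4.5.2
A_453 = {(1, 2), (2, 3), (3, 1), (3, 4), (4, 3), (4, 4)}    # assumed arcs for Example 4.5.3

print(is_path(A_452, (1, 2, 3, 4)), is_walk(A_452, (2, 3, 1)))
print(is_cycle(A_453, (4, 4)), is_cycle(A_453, (1, 2, 3, 4, 3, 1)))
```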
Now let’s define the same concepts for multidigraphs, by modifying the anal-
ogous definitions for multigraphs we saw in Definition 3.1.4:
Definition 4.5.4. Let D = (V, A, ψ) be a multidigraph. Then:

(a) A walk in D means a list of the form

(v0 , a1 , v1 , a2 , v2 , . . . , ak , vk )   (with k ≥ 0) ,

where v0 , v1 , . . . , vk are vertices of D, where a1 , a2 , . . . , ak are arcs of D,
and where each i ∈ {1, 2, . . . , k} satisfies

ψ (ai ) = (vi−1 , vi )

(that is, each arc ai has source vi−1 and target vi ). Note that we have
to record both the vertices and the arcs in our walk, since we want the
walk to “know” which arcs it traverses.
The vertices of a walk (v0 , a1 , v1 , a2 , v2 , . . . , ak , vk ) are v0 , v1 , . . . , vk ; the
arcs of this walk are a1 , a2 , . . . , ak . This walk is said to start at v0 and
end at vk ; it is also said to be a walk from v0 to vk . Its starting point is
v0 , and its ending point is vk . Its length is k.

(b) A path means a walk whose vertices are distinct.

(c) A closed walk (or circuit) means a walk (v0 , a1 , v1 , a2 , v2 , . . . , ak , vk ) with
vk = v0 .

(d) A cycle means a closed walk (v0 , a1 , v1 , a2 , v2 , . . . , ak , vk ) such that

• the vertices v0 , v1 , . . . , vk−1 are distinct;
• we have k ≥ 1.
(This automatically implies that the arcs a1 , a2 , . . . , ak are distinct, since
each arc ai has source vi−1 .)

Example 4.5.5. Consider the multidigraph

D = [Figure: drawing of a multidigraph D with vertices 1, 2, 3, 4 and arcs a, b, c, d, e, f, g.]

Then, (1, a, 2, b, 3, d, 1) and (3, d, 1, c, 3) and (4, g, 4) are three cycles of D,
whereas (3, d, 1, a, 2, b, 3, d, 1, c, 3) is a circuit but not a cycle.

4.5.2. Basic properties


Now, let us see which properties of walks, paths, closed walks and cycles re-
main valid for digraphs.
In Proposition 2.9.3, we saw how two walks in a simple graph could be com-
bined (“spliced together”) if the ending point of the first is the starting point of
the second. In Proposition 3.3.7, we generalized this to multigraphs. The same
holds for multidigraphs:

Proposition 4.5.6. Let D be a multidigraph. Let u, v and w be three vertices
of D. Let a = ( a0 , e1 , a1 , . . . , ek , ak ) be a walk from u to v. Let
b = (b0 , f 1 , b1 , . . . , f ℓ , bℓ ) be a walk from v to w. Then,

(a0 , e1 , a1 , . . . , ek , ak , f 1 , b1 , f 2 , b2 , . . . , f ℓ , bℓ )
= (a0 , e1 , a1 , . . . , ak−1 , ek , b0 , f 1 , b1 , . . . , f ℓ , bℓ )
= (a0 , e1 , a1 , . . . , ak−1 , ek , v, f 1 , b1 , . . . , f ℓ , bℓ )

is a walk from u to w. This walk shall be denoted a ∗ b.

Proof. The same (trivial) argument as for undirected graphs works here.
However, unlike for undirected graphs, we can no longer reverse walks or
paths in digraphs. Thus, it often happens that there is a walk from u to v, but
no walk from v to u.
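
The splicing operation a ∗ b of Proposition 4.5.6 can be sketched as follows. The encoding of a walk as an alternating Python list [v0, a1, v1, ..., ak, vk], the dictionary `psi_456` and the function names are assumptions of this sketch.

```python
def is_walk(psi, w):
    """Check that w = [v0, a1, v1, ..., ak, vk] satisfies psi(a_i) = (v_{i-1}, v_i) for each i.

    (For simplicity, a length-0 walk [v0] is accepted without checking that v0 is a vertex.)
    """
    if len(w) % 2 == 0:
        return False
    return all(psi.get(w[i]) == (w[i - 1], w[i + 1]) for i in range(1, len(w), 2))

def splice(a, b):
    """Return a * b, provided that a ends where b starts."""
    if a[-1] != b[0]:
        raise ValueError("the ending point of a must be the starting point of b")
    return a + b[1:]          # drop the duplicated middle vertex

# Hypothetical multidigraph given by arc -> (source, target).
psi_456 = {"a": (1, 2), "b": (2, 3), "c": (1, 3), "d": (3, 1),
           "e": (3, 4), "f": (3, 4), "g": (4, 4)}
walk_a = [1, "a", 2, "b", 3]
walk_b = [3, "e", 4, "g", 4]
print(splice(walk_a, walk_b))

# Walks cannot be reversed in general: the reversed list need not be a walk.
print(is_walk(psi_456, walk_a), is_walk(psi_456, walk_a[::-1]))
```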
Reducing a walk to a path (as we did in Proposition 2.9.5 for simple graphs
and in Proposition 3.3.9 for multigraphs) still works for multidigraphs:

Proposition 4.5.7. Let D be a multidigraph. Let u and v be two vertices of D.
Let a be a walk from u to v. Let k be the length of a. Assume that a is not a
path. Then, there exists a walk from u to v whose length is smaller than k.

Corollary 4.5.8 (When there is a walk, there is a path). Let D be a multidigraph.
Let u and v be two vertices of D. Assume that there is a walk from u
to v of length k for some k ∈ N. Then, there is a path from u to v of length
≤ k.
The proofs of these facts are the same as for multigraphs.
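
One way to make Corollary 4.5.8 constructive is to cut out a closed sub-walk whenever a vertex repeats. The following minimal Python sketch does this in a single pass; the list encoding of walks and the function name `walk_to_path` are assumptions of the sketch, and the sample walk uses arc names consistent with Example 4.5.5 plus an assumed arc e from 3 to 4.

```python
def walk_to_path(w):
    """Turn a walk [v0, a1, v1, ..., ak, vk] into a path from v0 to vk of length <= k."""
    path = [w[0]]
    position = {w[0]: 0}                   # vertex -> its index in `path`
    for i in range(1, len(w), 2):
        arc, v = w[i], w[i + 1]
        if v in position:
            # v was visited before: cut out the closed sub-walk from v back to v.
            cut = position[v]
            for dropped in path[cut + 2::2]:
                del position[dropped]
            path = path[:cut + 1]
        else:
            path.extend([arc, v])
            position[v] = len(path) - 1
    return path

sample_walk = [3, "d", 1, "a", 2, "b", 3, "d", 1, "c", 3, "e", 4]
print(walk_to_path(sample_walk))
```

Each cut removes at least one arc, so the result is never longer than the original walk.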
The following proposition is an analogue of Proposition 2.10.4 for multidi-
graphs:
Proposition 4.5.9. Let D be a multidigraph. Let w be a walk of D. Then, w
either is a path or contains a cycle (i.e., there exists a cycle of D whose arcs
are arcs of w).
Proof. This follows by the same argument as Proposition 2.10.4.
Given a multidigraph D and two vertices u and v of D, we can pose the
same five algorithmic questions (Questions 1, 2, 3, 4 and 5) that we posed for
a simple graph G in Subsection 2.9.4. As with multigraphs, the same answers
that we gave back then are still valid in our new setting, as long as we replace
“neighbors of v” by “in-neighbors of v” (that is, vertices w such that D has an
arc from w to v), and as long as we keep track of the arcs in our paths or walks.

4.5.3. Exercises
Exercise 4.1. Let D be a multidigraph with at least one vertex. Prove the
following:

(a) If each vertex v of D satisfies deg+ v > 0, then D has a cycle.

(b) If each vertex v of D satisfies deg+ v = deg− v = 1, then each vertex of
D belongs to exactly one cycle of D. Here, two cycles are considered to
be identical if one can be obtained from the other by cyclic rotation.

Exercise 4.2. Let p be a prime number. Let $(a_1, a_2, a_3, \ldots)$ be a sequence of
integers that is periodic with period p (that is, that satisfies $a_i = a_{i+p}$ for each
i > 0). Assume that $a_1 + a_2 + \cdots + a_p$ is not divisible by p. Prove that there
exists an i ∈ {1, 2, . . . , p} such that none of the p numbers

\[ a_i, \quad a_i + a_{i+1}, \quad a_i + a_{i+1} + a_{i+2}, \quad \ldots, \quad a_i + a_{i+1} + \cdots + a_{i+p-1} \]

(that is, of the p sums $a_i + a_{i+1} + \cdots + a_j$ for i ≤ j < i + p) is divisible by p.
[Remark: This would be false if p were not prime. For instance, for p = 4,
the sequence (0, 2, 2, 2, 0, 2, 2, 2, . . .) would be a counterexample.]
[Hint: Use Exercise 4.1 (a). What is the digraph, and why does it have a
cycle?]
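
The counterexample in the remark can be verified by brute force. The following minimal Python sketch (which checks the remark, not the exercise itself) tests every starting index; the function name `has_good_start` is an illustrative choice, and indices are 0-based in the code, unlike in the text.

```python
def has_good_start(period):
    """Is there a start i such that none of the p consecutive partial sums is divisible by p?

    `period` lists one full period (a_1, ..., a_p) of the sequence, so p = len(period).
    """
    p = len(period)
    for i in range(p):
        total = 0
        good = True
        for j in range(p):
            total += period[(i + j) % p]   # periodicity: indices wrap around
            if total % p == 0:
                good = False
                break
        if good:
            return True
    return False

print(has_good_start([0, 2, 2, 2]))   # the p = 4 sequence from the remark
print(has_good_start([1, 1, 2]))      # a sample sequence with prime period p = 3
```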

Exercise 4.3. Let D = (V, A, ψ) be a multidigraph.
For two vertices u and v of D, we shall write $u \overset{*}{\to} v$ if there exists a path
from u to v.
A root of D means a vertex u ∈ V such that each vertex v ∈ V satisfies
$u \overset{*}{\to} v$.
A common ancestor of two vertices u and v means a vertex w ∈ V such
that $w \overset{*}{\to} u$ and $w \overset{*}{\to} v$.
Assume that D has at least one vertex. Prove that D has a root if and only
if every two vertices in D have a common ancestor.

The following exercise is both a directed analogue and a generalization of
Mantel’s theorem (Theorem 2.4.6):

Exercise 4.4. Let D be a simple digraph with n vertices and a arcs. Assume
that D has no loops, and that we have $a > n^2/2$. Prove the following:

(a) The digraph D has a cycle of length 3.

(b) We define an enhanced 3-cycle to be a triple (u, v, w) of distinct vertices
of D such that all four pairs (u, v), (v, w), (w, u) and (u, w) are arcs of
D. Then, the digraph D has an enhanced 3-cycle.

Exercise 4.5. Let D = (V, A) be a simple digraph that has no cycles.
If v = (v1 , v2 , . . . , vn ) is a list of vertices of D (not necessarily a walk!), then
a back-cut of v shall mean an arc a ∈ A whose source is vi and whose target
is v j for some i, j ∈ {1, 2, . . . , n} satisfying i > j. (Colloquially speaking, a
back-cut of v is an arc of D that leads from some vertex of v to some earlier
vertex of v.)
A list v = (v1 , v2 , . . . , vn ) of vertices of D is said to be a toposort24 of D if
it contains each vertex of D exactly once and has no back-cuts.
Prove the following:

(a) The digraph D has at least one toposort.

(b) If D has only one toposort, then this toposort is a Hamiltonian path of
D.
Here, a Hamiltonian path in D means a walk of D that contains each
vertex of D exactly once.

[Example: The digraph

[Figure: drawing of a digraph with vertices 1, 2, 3, 4.]

has two toposorts: (3, 2, 1, 4) and (3, 2, 4, 1).]
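
The definitions of back-cuts and toposorts can be turned into a small checker. The following minimal Python sketch is not a solution to the exercise; it only tests the definition by brute force. The digraph used below is hypothetical: it is one digraph that happens to have exactly the two toposorts named in the example, and it need not be the digraph drawn there.

```python
from itertools import permutations

def back_cuts(A, order):
    """Arcs leading from some vertex of `order` to an earlier vertex of `order`."""
    pos = {v: i for i, v in enumerate(order)}
    return [(s, t) for (s, t) in A if pos[s] > pos[t]]

def is_toposort(V, A, order):
    """`order` contains each vertex exactly once and has no back-cuts."""
    return sorted(order) == sorted(V) and not back_cuts(A, order)

def all_toposorts(V, A):
    """Brute-force enumeration; only sensible for very small digraphs."""
    return [p for p in permutations(V) if is_toposort(V, A, p)]

# Hypothetical acyclic digraph with arcs 3 -> 2, 2 -> 1, 2 -> 4.
V_topo = [1, 2, 3, 4]
A_topo = {(3, 2), (2, 1), (2, 4)}
print(all_toposorts(V_topo, A_topo))
```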

Exercise 4.6. Let n be a positive integer. Let D be a digraph that has no cycles
of length ≤ 2. Assume that D has at least $2^{n-1}$ vertices. Prove that D has an
induced subdigraph that has n vertices and has no cycles.

4.5.4. The adjacency matrix


A simple way to find the number of walks from a given vertex to a given vertex
in a multidigraph is provided by matrix algebra:

Theorem 4.5.10. Let D = (V, A, ψ) be a multidigraph, where V =
{1, 2, . . . , n} for some n ∈ N.
If M is any matrix, and if i and j are two positive integers, then $M_{i,j}$ shall
denote the (i, j)-th entry of M (that is, the entry of M in the i-th row and the
j-th column).
Let C be the n × n-matrix (with real entries) defined by

\[ C_{i,j} = (\text{the number of all arcs } a \in A \text{ with source } i \text{ and target } j) \qquad \text{for all } i, j \in V. \]

Let k ∈ N, and let i, j ∈ V. Then, $(C^k)_{i,j}$ equals the number of all walks of
D having starting point i, ending point j and length k.

24 This is short for “topological sorting”. I don’t know where this name comes from.

Remark 4.5.11. The matrix C in Theorem 4.5.10 is known as the adjacency
matrix of D. For example, if the multidigraph is

D = [Figure: drawing of a multidigraph D with vertices 1, 2, 3, 4 and arcs a, b, c, d, e, f, g.]

then its adjacency matrix is

\[
C = \begin{pmatrix}
0 & 1 & 1 & 0 \\
0 & 0 & 1 & 0 \\
1 & 0 & 0 & 2 \\
0 & 0 & 0 & 1
\end{pmatrix} ,
\]

and thus Theorem 4.5.10 yields (among other things) that the (1, 3)-rd entry
$(C^k)_{1,3}$ of its k-th power $C^k$ equals the number of all walks of D having
starting point 1, ending point 3 and length k.
The adjacency matrix of a multidigraph D determines D up to the iden-
tities of the arcs, and thus is often used as a convenient way to encode a
multidigraph.
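
Theorem 4.5.10 can be checked numerically on this example. The following minimal Python sketch computes powers of the adjacency matrix and compares their entries with a direct count of walks. The assignment of endpoints to the arcs a, ..., g is an assumption (chosen to reproduce the matrix C above), and all function names are illustrative.

```python
def adjacency_matrix(n, psi):
    """C[i-1][j-1] = number of arcs with source i and target j (vertices are 1, ..., n)."""
    C = [[0] * n for _ in range(n)]
    for (s, t) in psi.values():
        C[s - 1][t - 1] += 1
    return C

def mat_mul(M, N):
    """Product of two square matrices, following the rule (MN)_{i,j} = sum_w M_{i,w} N_{w,j}."""
    n = len(M)
    return [[sum(M[i][w] * N[w][j] for w in range(n)) for j in range(n)] for i in range(n)]

def mat_pow(C, k):
    """k-th power of C, with the 0-th power being the identity matrix."""
    n = len(C)
    P = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        P = mat_mul(P, C)
    return P

def count_walks(psi, i, j, k):
    """Count the walks of length k from i to j directly, by choosing the first arc."""
    if k == 0:
        return 1 if i == j else 0
    return sum(count_walks(psi, t, j, k - 1) for (s, t) in psi.values() if s == i)

# Assumed endpoints of the arcs; they reproduce the matrix C of Remark 4.5.11.
psi_adj = {"a": (1, 2), "b": (2, 3), "c": (1, 3), "d": (3, 1),
           "e": (3, 4), "f": (3, 4), "g": (4, 4)}
C_adj = adjacency_matrix(4, psi_adj)
print(C_adj)
print(mat_pow(C_adj, 2)[0][2], count_walks(psi_adj, 1, 3, 2))
```

For matrices of realistic size, a numerical library would be the more practical choice; plain lists are used here only to keep the sketch self-contained.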

Proof of Theorem 4.5.10. Forget that we fixed i, j and k. We want to prove the
following claim:

Claim 1: Let i ∈ V and j ∈ V and k ∈ N. Then,

\[ (C^k)_{i,j} = (\text{the number of walks from } i \text{ to } j \text{ that have length } k) . \]

Before we prove this claim, let us recall that C is the adjacency matrix of D.
Thus, for each i ∈ V and j ∈ V, we have

Ci,j = (the number of all arcs a ∈ A with source i and target j )

(by the definition of the adjacency matrix). In other words, for each i ∈ V and
j ∈ V, we have
Ci,j = (the number of arcs from i to j) ,
where we agree that an “arc from i to j” means an arc a ∈ A with source i and
target j.
Renaming i as w in this statement, we obtain the following: For each w ∈ V
and j ∈ V, we have

Cw,j = (the number of arcs from w to j) . (6)



Let us also recall that any two n × n-matrices M and N satisfy

\[ (MN)_{i,j} = \sum_{w=1}^{n} M_{i,w} N_{w,j} \tag{7} \]

for any i ∈ V and j ∈ V. (Indeed, this is just the rule for how matrices are
multiplied.)
We can now prove Claim 1:
[Proof of Claim 1: We shall prove Claim 1 by induction on k:
Induction base: We shall first prove Claim 1 for k = 0.
Indeed, let i ∈ V and j ∈ V. The 0-th power of any n × n-matrix is defined to
be the n × n identity matrix $I_n$; thus, $C^0 = I_n$. Hence,

\[
(C^0)_{i,j} = (I_n)_{i,j} =
\begin{cases} 1, & \text{if } i = j; \\ 0, & \text{if } i \neq j \end{cases}
\tag{8}
\]

(by the definition of the identity matrix).

On the other hand, how many walks from i to j have length 0? A walk
that has length 0 must consist of a single vertex, which is simultaneously the
starting point and the ending point of this walk. Thus, a walk from i to j that
has length 0 exists only when i = j, and in this case there is exactly one such
walk (namely, the walk (i )). Hence,

\[
(\text{the number of walks from } i \text{ to } j \text{ that have length } 0) =
\begin{cases} 1, & \text{if } i = j; \\ 0, & \text{if } i \neq j. \end{cases}
\]

Comparing this with (8), we conclude that

\[ (C^0)_{i,j} = (\text{the number of walks from } i \text{ to } j \text{ that have length } 0) . \tag{9} \]

Now, forget that we fixed i and j. We thus have proven (9) for any i ∈ V and
j ∈ V. In other words, Claim 1 holds for k = 0. Thus, the induction base is
complete.
Induction step: Let g be a positive integer. Assume that Claim 1 holds for
k = g − 1. We must show that Claim 1 holds for k = g as well.
We have assumed that Claim 1 holds for k = g − 1. In other words, for any
i ∈ V and j ∈ V, we have

\[ (C^{g-1})_{i,j} = (\text{the number of walks from } i \text{ to } j \text{ that have length } g - 1) . \]

Renaming j as w in this statement, we obtain the following: For any i ∈ V and
w ∈ V, we have

\[ (C^{g-1})_{i,w} = (\text{the number of walks from } i \text{ to } w \text{ that have length } g - 1) . \tag{10} \]

Each walk from i to j that has length g has the form

\[ \mathbf{w} = (v_0, a_1, v_1, a_2, v_2, \ldots, a_{g-1}, v_{g-1}, a_g, v_g) \]

for some vertices $v_0, v_1, \ldots, v_g$ of D and some arcs $a_1, a_2, \ldots, a_g$ of D satisfying
$v_0 = i$ and $v_g = j$ and ($\psi(a_h) = (v_{h-1}, v_h)$ for all h ∈ {1, 2, . . . , g}). Thus, each
such walk $\mathbf{w}$ can be constructed by the following algorithm:

• First, we choose a vertex w of D to serve as the vertex $v_{g-1}$ (that is, as the
penultimate vertex of the walk $\mathbf{w}$). This vertex w must belong to V.

• Now, we choose the vertices $v_0, v_1, \ldots, v_{g-1}$ (that is, all vertices of our
walk except for the last one) and the arcs $a_1, a_2, \ldots, a_{g-1}$ (that is, all arcs
of our walk except for the last one) in such a way that $v_{g-1} = w$. This is
tantamount to choosing a walk $(v_0, a_1, v_1, a_2, v_2, \ldots, a_{g-1}, v_{g-1})$ from i to
w that has length g − 1. This choice can be made in $(C^{g-1})_{i,w}$ many ways
(because (10) shows that the number of walks from i to w that have length
g − 1 is $(C^{g-1})_{i,w}$).

• We have now determined all but the last vertex and all but the last arc of
our walk w. We set the last vertex v g of our walk to be j. (This is the only
possible option, since our walk w has to be a walk from i to j.)

• We choose the last arc a g of our walk w. This arc a g must have source v g−1
and target v g ; in other words, it must have source w and target j (since
v g−1 = w and v g = j). In other words, it must be an arc from w to j. Thus,
it can be chosen in Cw,j many ways (because (6) shows that the number of
arcs from w to j is Cw,j ).

Conversely, of course, this algorithm always constructs a walk from i to j that has length g, and different choices in the algorithm lead to distinct walks. Thus, the total number of walks from i to j that have length g equals the total number of choices in the algorithm. But the latter number is $\sum_{w \in V} \left(C^{g-1}\right)_{i,w} C_{w,j}$ (since the algorithm first chooses a w ∈ V, then involves a step with $\left(C^{g-1}\right)_{i,w}$ choices, and then involves a step with $C_{w,j}$ choices). Hence, the total number of walks from i to j that have length g is $\sum_{w \in V} \left(C^{g-1}\right)_{i,w} C_{w,j}$. In other words,
\[
(\text{the number of walks from } i \text{ to } j \text{ that have length } g) = \sum_{w \in V} \left(C^{g-1}\right)_{i,w} C_{w,j}.
\]
Comparing this with
\begin{align*}
\left(C^{g}\right)_{i,j}
&= \left(C^{g-1} C\right)_{i,j} \qquad (\text{since } C^g = C^{g-1} C) \\
&= \sum_{w=1}^{n} \left(C^{g-1}\right)_{i,w} C_{w,j} \qquad (\text{by (7), applied to } M = C^{g-1} \text{ and } N = C) \\
&= \sum_{w \in V} \left(C^{g-1}\right)_{i,w} C_{w,j} \qquad (\text{since } \{1, 2, \ldots, n\} = V),
\end{align*}
we obtain
\[
\left(C^{g}\right)_{i,j} = (\text{the number of walks from } i \text{ to } j \text{ that have length } g). \tag{11}
\]

Now, forget that we fixed i and j. We thus have proven (11) for any i ∈ V
and j ∈ V. In other words, Claim 1 holds for k = g. Thus, the induction step is
complete. Hence, Claim 1 is proven by induction.]
Theorem 4.5.10 follows immediately from Claim 1.
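
The statement just proved can be checked numerically on small examples. The following is a minimal Python sketch, not part of the original notes: it compares the entries of $C^k$ with a brute-force count of walks. The multidigraph `ARCS` is a hypothetical example chosen for illustration, and all function names are made up here.

```python
from itertools import product

def adjacency_matrix(n, arcs):
    """C[i][j] = number of arcs from vertex i+1 to vertex j+1."""
    C = [[0] * n for _ in range(n)]
    for s, t in arcs:
        C[s - 1][t - 1] += 1
    return C

def mat_mul(M, N):
    """Product of two square matrices of the same size."""
    n = len(M)
    return [[sum(M[i][w] * N[w][j] for w in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(C, k):
    """k-th power of C, starting from the identity matrix (k = 0)."""
    n = len(C)
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        P = mat_mul(P, C)
    return P

def count_walks_brute(arcs, i, j, k):
    """Count walks of length k from i to j by trying every sequence of k arcs."""
    count = 0
    for seq in product(range(len(arcs)), repeat=k):
        v = i
        valid = True
        for a in seq:
            s, t = arcs[a]
            if s != v:          # arc does not start where we currently are
                valid = False
                break
            v = t
        if valid and v == j:
            count += 1
    return count

# Hypothetical multidigraph on the vertices 1, 2, 3
# (two parallel arcs from 1 to 2, and a loop at 2).
ARCS = [(1, 2), (1, 2), (2, 3), (3, 1), (2, 2)]
N_VERTICES = 3
C = adjacency_matrix(N_VERTICES, ARCS)

for k in range(5):
    P = mat_pow(C, k)
    for i in range(1, N_VERTICES + 1):
        for j in range(1, N_VERTICES + 1):
            assert P[i - 1][j - 1] == count_walks_brute(ARCS, i, j, k)
```

The brute-force count mirrors the algorithm in the induction step: a walk is determined by its starting vertex and its sequence of arcs.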

Exercise 4.7. Let E be the following multidigraph:

[Figure: the multidigraph E]

Let n ∈ N. Compute the number of walks from 1 to 1 having length n.

4.6. Connectedness strong and weak


We defined the “path-connected” relation for undirected graphs using the ex-
istence of paths (see Definition 2.9.8). For a digraph, however, the relations
“there is a walk from u to v” and “there is a walk from v to u” are (in general)
distinct and non-symmetric, so I prefer not to give them a symmetric-looking
symbol such as ≃ D . Instead, we define strong path-connectedness to mean the
existence of both walks:

Definition 4.6.1. Let D be a multidigraph. We define a binary relation ≃ D on


the set V ( D ) as follows: For two vertices u and v of D, we shall have u ≃ D v
if and only if there exists a walk from u to v in D and there exists a walk
from v to u in D.
This binary relation ≃ D is called “strong path-connectedness”. When two
vertices u and v satisfy u ≃ D v, we say that “u and v are strongly path-
connected”.

Example 4.6.2. Let D be as in Example 4.5.5. Then, 1 ≃ D 2, because there


exists a walk from 1 to 2 in D (for instance, (1, a, 2)) and there also exists a
walk from 2 to 1 in D (for instance, (2, b, 3, d, 1)). However, we don’t have
3 ≃ D 4. Indeed, while there exists a walk from 3 to 4 in D, there exists no
walk from 4 to 3 in D.

Proposition 4.6.3. Let D be a multidigraph. Then, the relation ≃ D is an


equivalence relation.

Proof. Easy, like for simple graphs.


Again, we can replace “walk” by “path” in the definition of the relation ≃ D :
Proposition 4.6.4. Let D be a multidigraph. Let u and v be two vertices of D.
Then, u ≃ D v if and only if there exist a path from u to v and a path from v
to u.
Proof. Easy, like for simple graphs.
Definition 4.6.5. Let D be a multidigraph. The equivalence classes of the
equivalence relation ≃ D are called the strong components of D.

Definition 4.6.6. Let D be a multidigraph. We say that D is strongly con-


nected if D has exactly one strong component.

Thus, a multidigraph D is strongly connected if and only if it has at least one vertex and there is a path from any vertex to any vertex.²⁵

²⁵ Some authors use the word “diconnected” for “strongly connected”. As this word is just a single letter away from “disconnected”, I cannot recommend it.

In comparison, here is a weaker notion of connected components and connectedness:

Definition 4.6.7. Let D be a multidigraph. Consider its underlying undirected multigraph $D^{\mathrm{und}}$. The components of this undirected multigraph $D^{\mathrm{und}}$ (that is, the equivalence classes of the equivalence relation $\simeq_{D^{\mathrm{und}}}$) are called the weak components of D. We say that D is weakly connected if D has exactly one weak component (i.e., if $D^{\mathrm{und}}$ is connected).

Example 4.6.8. Let D be the following simple digraph:

[Figure: the simple digraph D on the vertices 1, 2, 3, 4, 5, 6, 7]

We treat D as a multidigraph (namely, Dmult ).


The weak components of D are {1, 2, 3, 4, 5} and {6, 7}.
The strong components of D are {1}, {2}, {3, 4, 5}, {6} and {7}. (Indeed, for example, we have $1 \not\simeq_D 2 \not\simeq_D 3$ but $3 \simeq_D 4 \simeq_D 5$.)
So D is neither strongly nor weakly connected, but has more strong than
weak components.
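
Strong and weak components can be computed directly from Definitions 4.6.5 and 4.6.7 by testing reachability. Here is a minimal Python sketch; the arc list `ARCS` is a hypothetical digraph chosen so that its components agree with the ones listed in this example (it is an assumption, not a transcription of the figure), and the function names are made up for illustration.

```python
def reachable(arcs, start):
    """Set of all vertices w such that there is a walk from start to w."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for s, t in arcs:
            if s == u and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def strong_components(vertices, arcs):
    """Equivalence classes of 'walk from u to v and walk from v to u'."""
    reach = {v: reachable(arcs, v) for v in vertices}
    return {frozenset(v for v in vertices if v in reach[u] and u in reach[v])
            for u in vertices}

def weak_components(vertices, arcs):
    """Components of the underlying undirected graph (directions forgotten)."""
    symmetric = list(arcs) + [(t, s) for s, t in arcs]
    return {frozenset(reachable(symmetric, v)) for v in vertices}

# Hypothetical digraph on the vertices 1, ..., 7.
VERTICES = [1, 2, 3, 4, 5, 6, 7]
ARCS = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 3), (6, 7)]

print(sorted(sorted(c) for c in strong_components(VERTICES, ARCS)))
print(sorted(sorted(c) for c in weak_components(VERTICES, ARCS)))
```

This quadratic approach is meant to follow the definitions literally; for large digraphs, a linear-time method such as Tarjan's algorithm would be the usual choice.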

Example 4.6.9. The digraph from Example 4.5.2 is weakly connected, but not
at all strongly connected (indeed, each of its strong components has size 1).
The digraph from Example 4.5.3, on the other hand, is strongly connected.

Proposition 4.6.10. Any strongly connected digraph is weakly connected.

Proof. Let D be a multidigraph. Then, any walk of D is (or, more precisely,


gives rise to) a walk of Dund . Hence, if two vertices u and v of D are strongly
path-connected in D, then they are path-connected in Dund . Therefore, if D is
strongly connected, then Dund is connected, but this means that D is weakly
connected.

Exercise 4.8. Let D be a multidigraph. Prove that the strong components of


D are the weak components of D if and only if each arc of D is contained in
at least one cycle.

Let us take a look at what bidirectionalization (i.e., the operation $G \mapsto G^{\mathrm{bidir}}$ that sends a multigraph G to the multidigraph $G^{\mathrm{bidir}}$) does to walks, paths, closed walks and cycles:

Proposition 4.6.11. Let G be a multigraph. Then:

(a) The walks of G are “more or less the same as” the walks of the multi-
digraph Gbidir . More precisely, each walk of G gives rise to a walk of
Gbidir (with the same starting point and the same ending point), and
conversely, each walk of Gbidir gives rise to a walk of G. If G has no
loops, then this is a one-to-one correspondence (i.e., a bijection) be-
tween the walks of G and the walks of Gbidir .

(b) The paths of G are “more or less the same as” the paths of the multi-
digraph Gbidir . This is always a one-to-one correspondence, since paths
cannot contain loops.

(c) The closed walks of G are “more or less the same as” the closed walks
of the multidigraph Gbidir .

(d) The cycles of G are not quite the same as the cycles of $G^{\mathrm{bidir}}$. In fact, if e is an edge of G with two distinct endpoints u and v, then (u, e, v, e, u) is not a cycle of G, but either (u, (e, 1) , v, (e, 2) , u) or (u, (e, 2) , v, (e, 1) , u) is a cycle of $G^{\mathrm{bidir}}$ (this is best seen on a picture: G has the single edge e joining u and v, whereas $G^{\mathrm{bidir}}$ has the arc-pair (e, 1), (e, 2) joining u and v in opposite directions), so $G^{\mathrm{bidir}}$ usually has more cycles than G has. But it is true that each cycle of G gives rise to a cycle of $G^{\mathrm{bidir}}$.

Exercise 4.9. Let D = (V, E, ψ) be a multidigraph.


Let A, B and C be three subsets of V such that the induced subdigraphs
D [ A], D [ B] and D [ C] are strongly connected.
A cycle of D will be called eclectic if it contains at least one arc of D [ A], at
least one arc of D [ B] and at least one arc of D [C] (although these three arcs
are not required to be distinct).
Prove the following:

(a) If the sets B ∩ C, C ∩ A and A ∩ B are nonempty, but A ∩ B ∩ C is empty,


then D has an eclectic cycle.

(b) If the induced subdigraphs D [ B ∩ C], D [ C ∩ A] and D [ A ∩ B] are


strongly connected, but the induced subdigraph D [ A ∩ B ∩ C] is not
strongly connected, then D has an eclectic cycle.

[Note: Keep in mind that the multidigraph with 0 vertices does not count
as strongly connected.]
[Solution: This is a generalization of Exercise 7 on midterm #2 from my
Spring 2017 course; see the course page for solutions.]

4.7. Eulerian walks and circuits


We have studied Eulerian walks and circuits for (undirected) multigraphs in
Section 3.4. Let us now define analogous concepts for multidigraphs:

Definition 4.7.1. Let D be a multidigraph.

(a) A walk of D is said to be Eulerian if each arc of D appears exactly once


in this walk.
(In other words: A walk (v0 , a1 , v1 , a2 , v2 , . . . , ak , vk ) of D is said to be
Eulerian if for each arc a of D, there exists exactly one i ∈ {1, 2, . . . , k}
such that a = ai .)

(b) An Eulerian circuit of D means a circuit (i.e., closed walk) of D that is


Eulerian.

The Euler–Hierholzer theorem gives a necessary and sufficient criterion for a


multigraph to have an Eulerian circuit or walk. For multidigraphs, there is an
analogous result:

Theorem 4.7.2 (diEuler, diHierholzer). Let D be a weakly connected multidi-


graph. Then:

(a) The multidigraph D has an Eulerian circuit if and only if each vertex v of D satisfies $\deg^+ v = \deg^- v$.

(b) The multidigraph D has an Eulerian walk if and only if all but two vertices v of D satisfy $\deg^+ v = \deg^- v$, and the remaining two vertices v satisfy $\left|\deg^+ v - \deg^- v\right| \leq 1$.

Exercise 4.10. Prove Theorem 4.7.2.

Incidentally, the “each vertex v of D satisfies deg+ v = deg− v” condition has


a name:

Definition 4.7.3. A multidigraph D is said to be balanced if each vertex v of


D satisfies deg+ v = deg− v.

So balancedness is necessary and sufficient for the existence of an Eulerian


circuit in a weakly connected multidigraph.
The following proposition is obvious:

Proposition 4.7.4. Let G be a multigraph. Then, the multidigraph Gbidir is


balanced.

Proof. The definition of Gbidir yields that each vertex v of Gbidir satisfies deg+ v =
deg v and deg− v = deg v, where deg v denotes the degree of v as a vertex of
G. Hence, each vertex v of Gbidir satisfies deg+ v = deg v = deg− v. In other
words, Gbidir is balanced.
Combining this proposition with Theorem 4.7.2 (a), we can obtain a curious
fact about undirected(!) multigraphs:

Theorem 4.7.5. Let G be a connected multigraph. Then, the multidigraph


Gbidir has an Eulerian circuit. In other words, there is a circuit of G that
contains each edge exactly twice, and uses it once in each direction.

Proof. The multidigraph Gbidir is balanced (by Proposition 4.7.4) and weakly
connected (this follows easily from the connectedness of G). Hence, Theorem
4.7.2 (a) can be applied to D = Gbidir . Thus, Gbidir has an Eulerian circuit.
Reinterpreting this circuit as a circuit of G, we obtain a circuit of G that con-
tains each edge exactly twice, and uses it once in each direction. This proves
Theorem 4.7.5.
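
To see Theorem 4.7.5 in action, one can build $G^{\mathrm{bidir}}$ for a small connected multigraph and construct an Eulerian circuit explicitly. The sketch below uses Hierholzer's algorithm in its standard stack-based form; this is one common way to find such a circuit and is not taken from the notes. The multigraph `EDGES` and all function names are hypothetical.

```python
def bidirectionalize(edges):
    """Arcs of G^bidir: an edge e with endpoints u, v yields
    the arc (e, 1) from u to v and the arc (e, 2) from v to u."""
    arcs = {}
    for e, (u, v) in edges.items():
        arcs[(e, 1)] = (u, v)
        arcs[(e, 2)] = (v, u)
    return arcs

def is_balanced(vertices, arcs):
    """Check that outdegree equals indegree at every vertex."""
    outdeg = {v: 0 for v in vertices}
    indeg = {v: 0 for v in vertices}
    for s, t in arcs.values():
        outdeg[s] += 1
        indeg[t] += 1
    return all(outdeg[v] == indeg[v] for v in vertices)

def eulerian_circuit(arcs, start):
    """Hierholzer's algorithm. Assumes the multidigraph is balanced and
    weakly connected. Returns the list [v0, a1, v1, ..., ak, vk]."""
    out = {}
    for name, (s, t) in arcs.items():
        out.setdefault(s, []).append((name, t))
    stack = [(start, None)]      # pairs (vertex, arc used to arrive there)
    circuit = []
    while stack:
        v, arrived_by = stack[-1]
        if out.get(v):           # an unused arc leaves v: follow it
            name, t = out[v].pop()
            stack.append((t, name))
        else:                    # stuck at v: emit it and backtrack
            stack.pop()
            circuit.append(v)
            if arrived_by is not None:
                circuit.append(arrived_by)
    circuit.reverse()
    return circuit

# Hypothetical connected multigraph with two parallel edges f and g.
VERTICES = [1, 2, 3]
EDGES = {"e": (1, 2), "f": (2, 3), "g": (2, 3), "h": (3, 1)}

bidir = bidirectionalize(EDGES)
assert is_balanced(VERTICES, bidir)
print(eulerian_circuit(bidir, 1))
```

Reading the arc names (e, 1) and (e, 2) in the output as "edge e traversed in one direction or the other" gives the circuit of G that uses each edge once in each direction.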

4.8. Hamiltonian cycles and paths


We can define Hamiltonian paths and cycles for simple digraphs in the same
way as we defined them for simple graphs:

Definition 4.8.1. Let D = (V, A) be a simple digraph.

(a) A Hamiltonian path in D means a walk of D that contains each vertex


of D exactly once. Obviously, it is a path.

(b) A Hamiltonian cycle in D means a cycle (v0 , v1 , . . . , vk ) of D such that


each vertex of D appears exactly once among v0 , v1 , . . . , vk−1 .

Convention 4.8.2. In the following, we will abbreviate:

• “Hamiltonian path” as “hamp”;

• “Hamiltonian cycle” as “hamc”.

We might wonder what can be said about hamps and hamcs for digraphs. Is
there an analogue of Ore’s theorem? The answer is “yes”, but it is significantly
harder to prove:

Theorem 4.8.3 (Meyniel). Let D = (V, A) be a strongly connected loopless simple digraph with n vertices. Assume that for each pair (u, v) ∈ V × V of two vertices u and v satisfying u ≠ v and (u, v) ∉ A and (v, u) ∉ A, we have deg u + deg v ≥ 2n − 1. Here, deg w means $\deg^+ w + \deg^- w$. Then, D has a hamc.

For the (rather complicated) proof of this, see [BonTho77] or [Berge91, §10.3, Theorem 7]. Note that the “strongly connected” condition is needed.

4.9. The reverse and complement digraphs


We take a break from studying hamps (Hamiltonian paths) in order to intro-
duce two more operations on simple digraphs.

Definition 4.9.1. Let D = (V, A) be a simple digraph. Then:

(a) The elements of (V × V ) \ A will be called the non-arcs of D.

(b) The reversal of a pair (i, j) ∈ V × V means the pair ( j, i ).

(c) We define Drev as the simple digraph (V, Arev ), where

Arev = {( j, i ) | (i, j) ∈ A} .

Thus, Drev is the digraph obtained from D by reversing each arc (i.e.,
swapping its source and its target). This is called the reversal of D.

(d) We define $\overline{D}$ as the simple digraph (V, (V × V ) \ A). This is the digraph that has the same vertices as D, but whose arcs are precisely the non-arcs of D. This digraph $\overline{D}$ is called the complement of D.

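Example 4.9.2. Let D be the following digraph:

[Figure: the digraph D on the vertices 1, 2, 3, 4]

Then, $D^{\mathrm{rev}}$ and $\overline{D}$ look as follows:

[Figures: the digraphs $D^{\mathrm{rev}}$ and $\overline{D}$, each on the vertices 1, 2, 3, 4]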

Convention 4.9.3. In the following, the symbol # means “number”. For ex-
ample,
(# of subsets of {1, 2, 3}) = 8.

We now shall try to count hamps in simple digraphs.²⁶ As a warmup, here is a particularly simple case:

²⁶ See [17s-lec7] for a more detailed treatment of this topic.


Proposition 4.9.4. Let D be the simple digraph (V, A), where

V = {1, 2, . . . , n} for some n ∈ N,

and where
A = {(i, j) | i < j} .
Then, (# of hamps of D ) = 1.

Proof. It is easy to see that the only hamp of D is (1, 2, . . . , n).


The following is easy, too:

Proposition 4.9.5. Let D be a simple digraph. Then,

(# of hamps of Drev ) = (# of hamps of D ) .

Proof. The hamps of Drev are obtained from the hamps of D by walking back-
wards.
So far, so boring. What about this:

Theorem 4.9.6 (Berge’s theorem). Let D be a simple digraph. Then,
\[
(\#\text{ of hamps of } \overline{D}) \equiv (\#\text{ of hamps of } D) \mod 2.
\]

This is much less obvious or even expected. We first give an example:

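Example 4.9.7. Let D be the following digraph:

[Figure: the digraph D on the vertices 1, 2, 3]

This digraph has 3 hamps: (1, 2, 3) and (2, 3, 1) and (3, 1, 2).
Its complement $\overline{D}$ looks as follows:

[Figure: the digraph $\overline{D}$ on the vertices 1, 2, 3]

It has only 1 hamp: (1, 3, 2).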


Thus, in this case, Theorem 4.9.6 says that 1 ≡ 3 mod 2.
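
For very small vertex sets, Berge's theorem and Proposition 4.9.5 can be verified exhaustively. The following Python sketch (an illustration, not part of the original notes) enumerates all 512 simple digraphs on the vertex set {1, 2, 3}, loops allowed, and compares hamp counts. All function names are hypothetical.

```python
from itertools import permutations, product

def hamps(vertices, arcs):
    """All Hamiltonian paths of the simple digraph (vertices, arcs)."""
    return [p for p in permutations(vertices)
            if all((p[i], p[i + 1]) in arcs for i in range(len(p) - 1))]

def reversal(arcs):
    """Arc set of D^rev: every arc (i, j) becomes (j, i)."""
    return {(j, i) for i, j in arcs}

def complement(vertices, arcs):
    """Arc set of the complement: all pairs in V x V that are non-arcs of D."""
    return {(i, j) for i in vertices for j in vertices} - set(arcs)

def all_digraphs(vertices):
    """Yield the arc set of every simple digraph on the given vertices."""
    pairs = [(i, j) for i in vertices for j in vertices]
    for mask in product((False, True), repeat=len(pairs)):
        yield {p for p, keep in zip(pairs, mask) if keep}

V = (1, 2, 3)
for A in all_digraphs(V):
    n_D = len(hamps(V, A))
    # Proposition 4.9.5: reversal preserves the number of hamps.
    assert len(hamps(V, reversal(A))) == n_D
    # Theorem 4.9.6: complementation preserves its parity.
    assert (len(hamps(V, complement(V, A))) - n_D) % 2 == 0
```

The same loop runs for four vertices as well (65536 digraphs), at the cost of a noticeably longer running time.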

Proof of Theorem 4.9.6. (This is an outline; see [17s-lec7, proof of Theorem 1.3.6]
for more details.)
Write the simple digraph D as D = (V, A), and assume WLOG that V ≠ ∅. Set n = |V |.
A V-listing will mean a list of elements of V that contains each element of V exactly once. (Thus, each V-listing is an n-tuple, and there are n! many V-listings.) Note that a V-listing is the same as a hamp of the “complete” digraph (V, V × V ). Any hamp of D or of $\overline{D}$ is therefore a V-listing, but not every V-listing is a hamp of D or $\overline{D}$.
If σ = (σ1 , σ2 , . . . , σn ) is a V-listing, then we define a set

P (σ) := {σ1 σ2 , σ2 σ3 , . . . , σn−1 σn } .

We call this set P (σ) the arc set of σ. When we regard σ as a hamp of
(V, V × V ), this set P (σ) is just the set of all arcs of σ. Note that this is an
(n − 1)-element set. We make a few easy observations (prove them!):

Observation 1: We can reconstruct a V-listing σ from its arc set P (σ). In other words, the map $\sigma \mapsto P(\sigma)$ is injective.

Observation 2: Let σ be a V-listing. Then, σ is a hamp of D if and only if P (σ) ⊆ A.

Observation 3: Let σ be a V-listing. Then, σ is a hamp of $\overline{D}$ if and only if P (σ) ⊆ (V × V ) \ A.

Now, let N be the # of pairs (σ, B), where σ is a V-listing and B is a subset of A satisfying B ⊆ P (σ). Thus,
\[
N = \sum_{\sigma \text{ is a } V\text{-listing}} N_\sigma,
\]
where
\[
N_\sigma = (\#\text{ of subsets } B \text{ of } A \text{ satisfying } B \subseteq P(\sigma)).
\]
But we also have
\[
N = \sum_{B \text{ is a subset of } A} N^B,
\]
where
\[
N^B = (\#\text{ of } V\text{-listings } \sigma \text{ satisfying } B \subseteq P(\sigma)).
\]
Let us now relate these two sums to hamps. We begin with $\sum_{\sigma \text{ is a } V\text{-listing}} N_\sigma$.
We shall use the Iverson bracket notation: i.e., the notation $[\mathcal{A}]$ for the truth value of a statement $\mathcal{A}$. This truth value is defined to be the number 1 if $\mathcal{A}$ is true, and 0 if $\mathcal{A}$ is false. For instance,
\[
[2 + 2 = 4] = 1 \qquad \text{and} \qquad [2 + 2 = 5] = 0.
\]

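For any V-listing σ, we have
\begin{align*}
N_\sigma &= (\#\text{ of subsets } B \text{ of } A \text{ satisfying } B \subseteq P(\sigma)) \\
&= (\#\text{ of subsets } B \text{ of } A \cap P(\sigma)) \\
&= 2^{|A \cap P(\sigma)|} \\
&\equiv [\,|A \cap P(\sigma)| = 0\,] \qquad (\text{since } 2^m \equiv [m = 0] \mod 2 \text{ for each } m \in \mathbb{N}) \\
&= [A \cap P(\sigma) = \varnothing] \qquad (\text{since equivalent statements have the same truth value}) \\
&= [P(\sigma) \subseteq (V \times V) \setminus A] \qquad (\text{since } P(\sigma) \text{ is always a subset of } V \times V) \\
&= [\sigma \text{ is a hamp of } \overline{D}] \mod 2 \qquad (\text{by Observation 3}).
\end{align*}
So
\begin{align*}
N &= \sum_{\sigma \text{ is a } V\text{-listing}} N_\sigma \\
&\equiv \sum_{\sigma \text{ is a } V\text{-listing}} [\sigma \text{ is a hamp of } \overline{D}] \qquad (\text{since } N_\sigma \equiv [\sigma \text{ is a hamp of } \overline{D}] \mod 2) \\
&= (\#\text{ of } V\text{-listings } \sigma \text{ that are hamps of } \overline{D}) \\
&= (\#\text{ of hamps of } \overline{D}) \mod 2.
\end{align*}
(The second-to-last equality holds because $\sum_{\sigma \text{ is a } V\text{-listing}} [\sigma \text{ is a hamp of } \overline{D}]$ is a sum of several 1’s and several 0’s, and the 1’s in this sum correspond precisely to the V-listings σ that are hamps of $\overline{D}$.)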
What about the other expression for N? Recall that
\[
N = \sum_{B \text{ is a subset of } A} N^B,
\]
where
\[
N^B = (\#\text{ of } V\text{-listings } \sigma \text{ satisfying } B \subseteq P(\sigma)).
\]
We want to prove that this sum equals (# of hamps of D ), at least modulo 2.
So let B be a subset of A. We want to know $N^B$ mod 2. In other words, we want to know when $N^B$ is odd.
Let us first assume that $N^B$ is odd, and see what follows from this.
Since $N^B$ is odd, we have $N^B > 0$. Thus, there exists at least one V-listing σ satisfying B ⊆ P (σ). We shall now draw some conclusions from this.
First, a definition: A path cover of V means a set of paths in the “complete”
digraph (V, V × V ) such that each vertex v ∈ V is contained in exactly one of
these paths. The set of arcs of such a path cover is simply the set of all arcs of
all its paths. For example, if V = {1, 2, 3, 4, 5, 6, 7}, then
{(1, 3, 5) , (2) , (6) , (7, 4)}

is a path cover of V, and its set of arcs is {13, 35, 74}.


Now, ponder the following: If we remove an arc $v_i v_{i+1}$ from a path $(v_1, v_2, \ldots, v_k)$, then this path breaks up into two paths $(v_1, v_2, \ldots, v_i)$ and $(v_{i+1}, v_{i+2}, \ldots, v_k)$.
Thus, if we remove some arcs from the arc set P (σ) of a V-listing σ, then we
obtain the set of arcs of a path cover of V. (For instance, removing the arcs
52, 26 and 67 from the arc set P (σ) of the V-listing σ = (1, 3, 5, 2, 6, 7, 4) yields
precisely the path cover {(1, 3, 5) , (2) , (6) , (7, 4)} that we just showed as an
example.)
Now, recall that there exists at least one V-listing σ satisfying B ⊆ P (σ).
Hence, B is obtained by removing some arcs from the arc set P (σ) of this V-
listing σ. Therefore, B is the set of arcs of a path cover of V (by the claim of
the preceding paragraph). Let us say that this path cover consists of exactly r
paths. Then,
(# of V-listings σ satisfying B ⊆ P (σ)) = r!,
because any such V-listing σ can be constructed by concatenating the r paths
in our path cover in some order (and there are r! possible orders).
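
The count r! can be confirmed by brute force on the path cover used as an example above. In the Python sketch below (function and variable names are hypothetical), B = {13, 35, 74} is the set of arcs of the path cover {(1, 3, 5), (2), (6), (7, 4)} of V = {1, 2, . . . , 7}, which consists of r = 4 paths.

```python
from itertools import permutations
from math import factorial

def arc_set(sigma):
    """P(sigma): the set of arcs between consecutive entries of sigma."""
    return {(sigma[i], sigma[i + 1]) for i in range(len(sigma) - 1)}

def count_listings_containing(V, B):
    """Number of V-listings sigma whose arc set P(sigma) contains B."""
    return sum(1 for sigma in permutations(V) if set(B) <= arc_set(sigma))

V = (1, 2, 3, 4, 5, 6, 7)
B = {(1, 3), (3, 5), (7, 4)}   # arcs of the path cover {(1,3,5), (2), (6), (7,4)}

# Each arc of a path cover merges two paths into one, so a path cover
# with |B| arcs on |V| vertices consists of |V| - |B| paths.
r = len(V) - len(B)

assert r == 4
assert count_listings_containing(V, B) == factorial(r)
```

The check runs through all 7! = 5040 listings, so it is only practical for small V.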
Thus, $N^B$ = (# of V-listings σ satisfying B ⊆ P (σ)) = r!. But we have assumed that $N^B$ is odd. So r! is odd. Since r is positive (because V ≠ ∅, so our path cover must contain at least one path), this entails that r = 1. So our path cover is just a single path; this path is a path of D (since its set of arcs B is a subset of A) and therefore is a hamp of D (since it constitutes a path cover of V all by itself). If we denote it by σ, then we have B = P (σ) (since B is the set of arcs of the path cover that consists of σ alone).
Forget our assumption that $N^B$ is odd. We have thus shown that if $N^B$ is odd, then B = P (σ) for some hamp σ of D.
Conversely, it is easy to see that if B = P (σ) for some hamp σ of D, then $N^B$ is odd (and actually equals 1).
Combining these two results, we see that $N^B$ is odd if and only if B = P (σ) for some hamp σ of D. Therefore,
\[
[N^B \text{ is odd}] = [B = P(\sigma) \text{ for some hamp } \sigma \text{ of } D].
\]
However,
\begin{align*}
N^B &\equiv [N^B \text{ is odd}] \qquad (\text{since } m \equiv [m \text{ is odd}] \mod 2 \text{ for any } m \in \mathbb{Z}) \\
&= [B = P(\sigma) \text{ for some hamp } \sigma \text{ of } D] \mod 2.
\end{align*}

We have proved this congruence for every subset B of A. Thus,
\begin{align*}
N &= \sum_{B \text{ is a subset of } A} N^B \\
&\equiv \sum_{B \text{ is a subset of } A} [B = P(\sigma) \text{ for some hamp } \sigma \text{ of } D] \\
&= (\#\text{ of subsets } B \text{ of } A \text{ such that } B = P(\sigma) \text{ for some hamp } \sigma \text{ of } D) \\
&= (\#\text{ of sets of the form } P(\sigma) \text{ for some hamp } \sigma \text{ of } D) \\
&\qquad (\text{because each set of the form } P(\sigma) \text{ for some hamp } \sigma \text{ of } D \text{ is a subset of } A \text{, by Observation 2}) \\
&= (\#\text{ of hamps of } D) \mod 2
\end{align*}
(indeed, Observation 1 shows that different hamps σ have different sets P (σ), so counting the sets P (σ) for all hamps σ is equivalent to counting the hamps σ themselves).
Now we have proved that $N \equiv (\#\text{ of hamps of } \overline{D}) \mod 2$ and $N \equiv (\#\text{ of hamps of } D) \mod 2$. Comparing these two congruences, we obtain
\[
(\#\text{ of hamps of } \overline{D}) \equiv (\#\text{ of hamps of } D) \mod 2.
\]
This proves Berge’s theorem.

4.10. Tournaments
4.10.1. Definition
We now introduce a special class of simple digraphs.

Definition 4.10.1. A digraph D is said to be loopless if it has no loops.

Definition 4.10.2. A tournament is defined to be a loopless simple digraph


D that satisfies the

• Tournament axiom: For any two distinct vertices u and v of D, exactly


one of (u, v) and (v, u) is an arc of D.

Example 4.10.3. The following digraph is a tournament:

[Figure: a digraph]

The following digraph is a tournament as well:

[Figure: a digraph]

However, the following digraph is not a tournament:

[Figure: a digraph]

because the tournament axiom is not satisfied for u = 1 and v = 3. Nor is the following digraph a tournament:

[Figure: a digraph]

because the tournament axiom is not satisfied for u = 1 and v = 2. Finally, the digraph

[Figure: a digraph with a loop]

is not a tournament either, since it is not loopless.

The digraph D in Proposition 4.9.4 always is a tournament.

Example 4.10.4. Here is a tournament with 5 vertices:

[Figure: a tournament on the vertices 1, 2, 3, 4, 5]

A tournament can also be viewed as a complete graph, each of whose edges has been given a direction.
Using Definition 4.9.1, we can restate the definition of a tournament as fol-
lows:

Proposition 4.10.5. Let D = (V, A) be a loopless simple digraph. Then, D is a tournament if and only if the non-loop arcs of $\overline{D}$ are precisely the arcs of $D^{\mathrm{rev}}$.

Proof. Easy consequence of definitions.

Exercise 4.11. Let D be a tournament with at least one vertex.


We say that a vertex u of D directly owns a vertex w of D if (u, w) is an
arc of D.
We say that a vertex u of D indirectly owns a vertex w of D if there exists
a vertex v of D such that both (u, v) and (v, w) are arcs of D.
Prove that D has a vertex that (directly or indirectly) owns all other ver-
tices.
[Solution: This exercise appears in [20f, Exercise 6.3.1] (restated in the
language of players and matches) and in [Maurer80, Theorem 1] (restated
in the language of chickens and pecking orders). It originates in a study of
pecking orders by Landau [Landau53].]

4.10.2. The Rédei theorems


Which tournaments have hamps? The answer is surprisingly simple:²⁷

²⁷ Here we agree to consider the empty list () to be a hamp of the digraph (∅, ∅).

Theorem 4.10.6 (Easy Rédei theorem). A tournament always has at least one
hamp.

Even better, and perhaps even more surprisingly:

Theorem 4.10.7 (Hard Rédei theorem). Let D be a tournament. Then,

(# of hamps of D ) is odd.

Our goal now is to prove these two theorems. Clearly, the Easy Rédei theorem follows from the Hard one, since an odd number cannot be 0. Thus, it will suffice to prove the Hard one.
The proof of the Hard Rédei theorem will rely on the following crucial lemma:

Lemma 4.10.8. Let D = (V, A) be a tournament, and let vw ∈ A be an arc of D.
Let $D'$ be the digraph obtained from D by reversing the arc vw. In other words, let
\[
D' := (V, (A \setminus \{vw\}) \cup \{wv\}).
\]
Then, $D'$ is again a tournament, and satisfies
\[
(\#\text{ of hamps of } D) \equiv (\#\text{ of hamps of } D') \mod 2.
\]

Here is a visualization of the setup of Lemma 4.10.8:

D: v → w ;    $D'$: v ← w .

(Here, we are only showing the arcs joining v with w, since D and $D'$ agree in all other arcs.)
Proof of Lemma 4.10.8. (This is an outline; see [17s-lec7, proof of Lemma 1.6.2]
for more details.)
First of all, D ′ is clearly a tournament. It remains to prove the congruence.
We introduce two more digraphs: Let

$D_0$ := (the digraph D with the arc vw removed) and

$D_2$ := (the digraph D with the arc wv added) .

Note that these are not tournaments any more. Here is a comparative illustration of all four digraphs D, $D'$, $D_0$ and $D_2$ (again showing only the arcs joining v with w, since there are no differences in the other arcs):

D: v → w ;    $D'$: v ← w ;

$D_0$: v   w (no arc between v and w) ;    $D_2$: v ⇄ w (both arcs vw and wv) .

The digraph $D_0$ is $D'$ with the arc wv removed. Therefore, a hamp of $D_0$ is the same as a hamp of $D'$ that does not use the arc wv. Hence,

\begin{align*}
(\#\text{ of hamps of } D_0)
&= (\#\text{ of hamps of } D' \text{ that do not use the arc } wv) \\
&= (\#\text{ of hamps of } D') - (\#\text{ of hamps of } D' \text{ that use the arc } wv).
\end{align*}
Similarly, since D is $D_2$ with the arc wv removed, we have
\begin{align*}
(\#\text{ of hamps of } D)
&= (\#\text{ of hamps of } D_2) - (\#\text{ of hamps of } D_2 \text{ that use the arc } wv) \\
&= (\#\text{ of hamps of } D_2) - (\#\text{ of hamps of } D' \text{ that use the arc } wv)
\end{align*}
(the last equality is because a hamp of $D_2$ that uses the arc wv cannot use the arc vw, and therefore is automatically a hamp of $D'$ as well, and of course the converse is obviously true).
However, from the previously proved equality
\[
(\#\text{ of hamps of } D_0) = (\#\text{ of hamps of } D') - (\#\text{ of hamps of } D' \text{ that use the arc } wv),
\]
we obtain
\begin{align*}
(\#\text{ of hamps of } D')
&= (\#\text{ of hamps of } D_0) + (\#\text{ of hamps of } D' \text{ that use the arc } wv) \\
&\equiv (\#\text{ of hamps of } D_0) - (\#\text{ of hamps of } D' \text{ that use the arc } wv) \mod 2
\end{align*}
(since x + y ≡ x − y mod 2 for any integers x and y). Thus, if we can show that
\[
(\#\text{ of hamps of } D_2) \equiv (\#\text{ of hamps of } D_0) \mod 2,
\]
then we will be able to conclude that
\begin{align*}
(\#\text{ of hamps of } D)
&= (\#\text{ of hamps of } D_2) - (\#\text{ of hamps of } D' \text{ that use the arc } wv) \\
&\equiv (\#\text{ of hamps of } D_0) - (\#\text{ of hamps of } D' \text{ that use the arc } wv) \\
&\equiv (\#\text{ of hamps of } D') \mod 2,
\end{align*}
and the proof of the lemma will be complete.



So let us show this. Recall that D is a tournament. Thus, the non-loop arcs of $\overline{D}$ are precisely the arcs of $D^{\mathrm{rev}}$ (by Proposition 4.10.5). Hence, the non-loop arcs of $\overline{D_0}$ are precisely the arcs of $D_2^{\mathrm{rev}}$ (since $\overline{D_0}$ is just $\overline{D}$ with the extra arc vw added, and since $D_2^{\mathrm{rev}}$ is just $D^{\mathrm{rev}}$ with the extra arc vw added). Therefore, the digraphs $\overline{D_0}$ and $D_2^{\mathrm{rev}}$ are equal “up to loops” (i.e., they have the same vertices and the same non-loop arcs). Since loops don’t matter for hamps, these two digraphs thus have the same hamps. Hence,
\[
(\#\text{ of hamps in } \overline{D_0}) = (\#\text{ of hamps in } D_2^{\mathrm{rev}}) = (\#\text{ of hamps in } D_2)
\]
(by Proposition 4.9.5), and therefore
\[
(\#\text{ of hamps in } D_2) = (\#\text{ of hamps in } \overline{D_0}) \equiv (\#\text{ of hamps in } D_0) \mod 2
\]
(by Theorem 4.9.6). As explained above, this completes the proof of Lemma 4.10.8.
Now, the Hard Rédei theorem has become easy:
Proof of Theorem 4.10.7. (This is an outline; see [17s-lec7, proof of Theorem 1.6.1]
for more details.)
We need to prove that the # of hamps of D is odd. Lemma 4.10.8 tells us that
the parity of this # does not change when we reverse a single arc of D. Thus, of
course, if we reverse several arcs of D, then this parity does not change either.
However, we can WLOG assume that the vertices of D are 1, 2, . . . , n for some n ∈ N, and then, by reversing the appropriate arcs, we can ensure that the arcs of D are

12, 13, 14, . . . , 1n,
23, 24, . . . , 2n,
· · · ,
(n − 1) n

(i.e., each arc of D has the form ij with i < j). But at this point, the tournament
D has only one hamp: namely, (1, 2, . . . , n). So (# of hamps of D ) = 1 is odd
at this point. Since the parity of the # of hamps of D has not changed as we
reversed our arcs, we thus conclude that it has always been odd. This proves
the Hard Rédei theorem (Theorem 4.10.7).
As we already mentioned, the Easy Rédei theorem follows from the Hard
Rédei theorem. But it also has a short self-contained proof ([17s-lec7, Theorem
1.4.9]).
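
Both Lemma 4.10.8 and the Hard Rédei theorem lend themselves to an exhaustive check on small tournaments. The Python sketch below is an illustration with hypothetical function names: it enumerates every tournament on the vertex set {1, 2, . . . , n} for n ≤ 5 by choosing a direction for each pair i < j, then counts hamps by brute force.

```python
from itertools import combinations, permutations, product

def count_hamps(n, arcs):
    """Number of Hamiltonian paths of the digraph ({1, ..., n}, arcs)."""
    return sum(1 for p in permutations(range(1, n + 1))
               if all((p[i], p[i + 1]) in arcs for i in range(n - 1)))

def all_tournaments(n):
    """Yield the arc set of every tournament on the vertices 1, ..., n."""
    pairs = list(combinations(range(1, n + 1), 2))
    for flips in product((False, True), repeat=len(pairs)):
        yield {(j, i) if flip else (i, j) for (i, j), flip in zip(pairs, flips)}

def reverse_arc(arcs, arc):
    """The digraph D' of Lemma 4.10.8: replace the arc vw by wv."""
    v, w = arc
    return (set(arcs) - {(v, w)}) | {(w, v)}

# Hard Redei theorem: the number of hamps is odd for every tournament.
for n in range(1, 6):
    for A in all_tournaments(n):
        assert count_hamps(n, A) % 2 == 1

# Lemma 4.10.8: reversing a single arc does not change the parity.
for A in all_tournaments(4):
    for arc in A:
        assert (count_hamps(4, A) - count_hamps(4, reverse_arc(A, arc))) % 2 == 0
```

For n = 5 this covers 1024 tournaments; the number of tournaments grows as 2^(n(n−1)/2), so the approach does not scale much further.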

Remark 4.10.9. Theorem 4.10.7 shows that the # of hamps in a tournament


is an odd positive integer. Can it be any odd positive integer, or are certain
odd positive integers impossible?
Surprisingly, 7 and 21 are impossible. All other odd numbers between 1
and 80555 are possible. For higher numbers, the answer is not known so far.
See MathOverflow question #232751 ([MO232751]) for more details.

4.10.3. Hamiltonian cycles in tournaments


By the Easy Rédei theorem, every tournament has a hamp. But of course, not every tournament has a hamc.²⁸ One obstruction is clear:

²⁸ Recall that “hamc” is our shorthand for “Hamiltonian cycle”.

Proposition 4.10.10. If a digraph D has a hamc, then D is strongly connected.

In general, this is only a necessary criterion for a hamc, not a sufficient one. Not every strongly connected digraph has a hamc. However, it turns out that for tournaments, it is also sufficient, as long as the tournament has enough vertices:

Theorem 4.10.11 (Camion’s theorem). If a tournament D is strongly connected and has at least two vertices, then D has a hamc.

Proof sketch. A detailed proof can be found in [17s-lec7, Theorem 1.5.5]; here is just a very rough sketch.
Let D = (V, A) be a strongly connected tournament with at least two vertices.²⁹ We must show that D has a hamc.

²⁹ By the way, a tournament with exactly two vertices cannot be strongly connected (as it has only 1 arc). Thus, by requiring D to have at least two vertices, we have actually guaranteed that D has at least three vertices.

It is easy to see that D has a cycle. Let $c = (v_1, v_2, \ldots, v_k, v_1)$ be a cycle of maximum length. We shall show that c is a hamc.
Let C be the set $\{v_1, v_2, \ldots, v_k\}$ of all vertices of this cycle c.
A vertex w ∈ V \ C will be called a to-vertex if there exists an arc from some $v_i$ to w.
A vertex w ∈ V \ C will be called a from-vertex if there exists an arc from w to some $v_i$.
Since D is a tournament, each vertex in V \ C is a to-vertex or a from-vertex. In theory, a vertex could be both (having an arc from some $v_i$ and also an arc to some other $v_j$). However, this does not actually happen. To see why, argue as follows:

• If a to-vertex w has an arc from some $v_i$, then it must also have an arc from $v_{i+1}$ ³⁰ (because otherwise there would be an arc from w to $v_{i+1}$, and then we could make our cycle c longer by interjecting w between $v_i$ and $v_{i+1}$; but this would contradict the fact that c is a cycle of maximum length).

• Iterating this argument, we see that if a to-vertex w has an arc from some $v_i$, then it must also have an arc from $v_{i+1}$, an arc from $v_{i+2}$, an arc from $v_{i+3}$, and so on; i.e., it must have an arc from each vertex of c. Consequently, w cannot be a from-vertex. This shows that a to-vertex cannot be a from-vertex.

Let F be the set of all from-vertices, and let T be the set of all to-vertices.
Then, as we have just shown, F and T are disjoint. Moreover, F ∪ T = V \ C.
Since a to-vertex cannot be a from-vertex, we furthermore conclude that any to-
vertex has an arc from each vertex of c (otherwise, it would be a from-vertex),
and that any from-vertex has an arc to each vertex of c (otherwise, it would be
a to-vertex).
Next, we argue that there cannot be an arc from a to-vertex t to a from-vertex
f . Indeed, if there was such an arc, then we could make the cycle c longer by
interjecting t and f between (say) v1 and v2 .
In total, we now know that every vertex of D belongs to one of the three
disjoint sets C, F and T, and furthermore there is no arc from T to F, no arc
from T to C, and no arc from C to F. Thus, there exists no walk from a vertex
in T to a vertex in C (because there is no way out of T). This would contradict
the fact that D is strongly connected, unless the set T is empty. Hence, T must
be empty. Similarly, F must be empty. Since F ∪ T = V \ C, this entails that
V \ C is empty, so that V = C. In other words, each vertex of D is on our cycle
c. Therefore, c is a hamc. This proves Camion’s theorem.
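Camion's theorem can be sanity-checked by brute force on small vertex sets. The following is only an illustrative sketch, not part of the proof: the function names are made up for this example, the vertices are taken to be 0, 1, ..., n − 1, and a tournament is stored as a set of arcs (i, j).

```python
from itertools import combinations, permutations, product


def all_tournaments(n):
    """Yield every tournament on the vertices 0..n-1 as a set of arcs (i, j)."""
    pairs = list(combinations(range(n), 2))
    for choice in product([False, True], repeat=len(pairs)):
        yield {(j, i) if flip else (i, j) for (i, j), flip in zip(pairs, choice)}


def is_strongly_connected(n, arcs):
    """Check that every vertex can reach every other vertex along arcs."""
    def reachable(start):
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for (a, b) in arcs:
                if a == u and b not in seen:
                    seen.add(b)
                    stack.append(b)
        return seen
    return all(len(reachable(s)) == n for s in range(n))


def has_hamiltonian_cycle(n, arcs):
    """Brute force: try every cyclic order of the vertices that starts at 0."""
    for rest in permutations(range(1, n)):
        order = (0,) + rest
        if all((order[i], order[(i + 1) % n]) in arcs for i in range(n)):
            return True
    return False


def camion_holds(n):
    """True if each strongly connected tournament on n vertices has a hamc."""
    return all(has_hamiltonian_cycle(n, t)
               for t in all_tournaments(n)
               if is_strongly_connected(n, t))
```

For instance, `camion_holds(4)` goes through all 64 tournaments on 4 vertices. The check is only meaningful for n ≥ 2, matching the assumption of the theorem.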

4.10.4. Application of tournaments to the Vandermonde determinant


To wrap up the topic of tournaments, let me briefly discuss a curious appli-
cation of their theory: a combinatorial proof of the Vandermonde determinant
formula. See [17s-lec8] for the many details I’ll be omitting.
Recall the Vandermonde determinant formula:

Footnote 30: Here, indices are periodic modulo k, so that $v_{k+1}$ means $v_1$.



Theorem 4.10.12 (Vandermonde determinant formula). Let $x_1, x_2, \ldots, x_n$ be
n numbers (or, more generally, elements of a commutative ring). Consider
the n × n-matrix
\[
V := \begin{pmatrix}
1 & 1 & 1 & \cdots & 1 \\
x_1 & x_2 & x_3 & \cdots & x_n \\
x_1^2 & x_2^2 & x_3^2 & \cdots & x_n^2 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
x_1^{n-1} & x_2^{n-1} & x_3^{n-1} & \cdots & x_n^{n-1}
\end{pmatrix}
= \left( x_j^{i-1} \right)_{1 \le i \le n,\ 1 \le j \le n} .
\]
Then, its determinant is
\[
\det V = \prod_{1 \le i < j \le n} \left( x_j - x_i \right) .
\]

There are many simple proofs of this theorem (e.g., a few on its ProofWiki
page, which works with the transpose matrix). I will now outline a combina-
torial one, using tournaments. This proof goes back to Ira Gessel’s 1979 paper
[Gessel79]. 
First, how do $\det V$ and $\prod_{1 \le i < j \le n} (x_j - x_i)$ relate to tournaments?
As a warmup, let’s assume that we have some number $y_{(i,j)}$ given for each
pair (i, j) of integers, and let’s expand the product
\[
\left( y_{(1,2)} + y_{(2,1)} \right) \left( y_{(1,3)} + y_{(3,1)} \right) \left( y_{(2,3)} + y_{(3,2)} \right) .
\]

The result is a sum of 8 products, one for each way to pluck an addend out of
each of the three little sums:
\begin{align*}
& \left( y_{(1,2)} + y_{(2,1)} \right) \left( y_{(1,3)} + y_{(3,1)} \right) \left( y_{(2,3)} + y_{(3,2)} \right) \\
&= y_{(1,2)} y_{(1,3)} y_{(2,3)} + y_{(1,2)} y_{(1,3)} y_{(3,2)} + y_{(1,2)} y_{(3,1)} y_{(2,3)} + y_{(1,2)} y_{(3,1)} y_{(3,2)} \\
&\quad + y_{(2,1)} y_{(1,3)} y_{(2,3)} + y_{(2,1)} y_{(1,3)} y_{(3,2)} + y_{(2,1)} y_{(3,1)} y_{(2,3)} + y_{(2,1)} y_{(3,1)} y_{(3,2)} .
\end{align*}

Note that each of the 8 products obtained has the form $y_a y_b y_c$, where

• a is one of the pairs (1, 2) and (2, 1),

• b is one of the pairs (1, 3) and (3, 1), and

• c is one of the pairs (2, 3) and (3, 2).

We can view these pairs a, b and c as the arcs of a tournament with vertex
set {1, 2, 3}. Thus, our above expansion can be rewritten more compactly as

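follows:
\[
\left( y_{(1,2)} + y_{(2,1)} \right) \left( y_{(1,3)} + y_{(3,1)} \right) \left( y_{(2,3)} + y_{(3,2)} \right)
= \sum_{\substack{D \text{ is a tournament} \\ \text{with vertex set } \{1,2,3\}}}
\ \prod_{(i,j) \text{ is an arc of } D} y_{(i,j)} .
\]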

For reference, here are all the 8 tournaments with vertex set {1, 2, 3}:

[Figure: drawings of the 8 tournaments with vertex set {1, 2, 3}.]

Here, for convenience, we are drawing an arc ij in blue if i < j and in red
otherwise.
This expansion can be generalized: We have
\[
\prod_{1 \le i < j \le n} \left( y_{(i,j)} + y_{(j,i)} \right)
= \sum_{\substack{D \text{ is a tournament} \\ \text{with vertex set } \{1,2,\ldots,n\}}}
\ \prod_{(i,j) \text{ is an arc of } D} y_{(i,j)} .
\]
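The general expansion can be checked numerically for small n. The sketch below is merely an illustration under my own conventions: the dictionary `y` maps each pair (i, j) with i ≠ j to a number, and the function names are hypothetical.

```python
from itertools import combinations, product
from math import prod


def expand_by_tournaments(n, y):
    """Sum, over all tournaments D on {1..n}, of the product of y[(i, j)]
    over the arcs (i, j) of D."""
    pairs = list(combinations(range(1, n + 1), 2))
    total = 0
    for flips in product([False, True], repeat=len(pairs)):
        # choosing an orientation for each pair {i, j} gives one tournament
        arcs = [(j, i) if f else (i, j) for (i, j), f in zip(pairs, flips)]
        total += prod(y[a] for a in arcs)
    return total


def product_side(n, y):
    """The product of (y[(i, j)] + y[(j, i)]) over all pairs i < j."""
    return prod(y[(i, j)] + y[(j, i)]
                for i, j in combinations(range(1, n + 1), 2))
```

With, say, `y[(i, j)] = 10 * i + j`, both functions return the same integer, because each tournament corresponds to exactly one way of picking an addend from each factor.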

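Substituting
\[
y_{(i,j)} = \begin{cases} x_j, & \text{if } i < j; \\ -x_j, & \text{if } i \ge j \end{cases}
\]
in this equality, we obtain
\begin{align*}
\prod_{1 \le i < j \le n} \left( x_j - x_i \right)
&= \sum_{\substack{D \text{ is a tournament} \\ \text{with vertex set } \{1,2,\ldots,n\}}}
\ \underbrace{\prod_{(i,j) \text{ is an arc of } D}
\begin{cases} x_j, & \text{if } i < j; \\ -x_j, & \text{if } i \ge j \end{cases}}_{
= (-1)^{(\text{\# of red arcs of } D)} \prod_{j=1}^{n} x_j^{\deg^- j}} \\
&= \sum_{\substack{D \text{ is a tournament} \\ \text{with vertex set } \{1,2,\ldots,n\}}}
(-1)^{(\text{\# of red arcs of } D)} \prod_{j=1}^{n} x_j^{\deg^- j}
\end{align*}
(where $\deg^- j$ means the indegree of j in D, and where the “red arcs” are
the arcs ij with i > j).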

We shall refer to this sum as the “big sum”.


On the other hand, if we let $S_n$ be the group of permutations of {1, 2, . . . , n},
and if we denote the sign of a permutation σ by sign σ, then we have
\[
\det V = \det \left( V^T \right) = \sum_{\sigma \in S_n} \operatorname{sign} \sigma \cdot \prod_{j=1}^{n} x_j^{\sigma(j) - 1}
\]

(by the definition of a determinant). We shall refer to this sum as the “small
sum”.
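Before proving that the two sums agree, one can at least test the claim on concrete numbers. Here is a rough sketch (the function names are my own, and the entries of the list `x` play the role of $x_1, x_2, \ldots, x_n$) that evaluates the big sum, the small sum and the product $\prod_{i<j} (x_j - x_i)$ by brute force.

```python
from itertools import combinations, permutations, product
from math import prod


def tournaments(n):
    """Yield every tournament on the vertices 1..n as a list of arcs (i, j)."""
    pairs = list(combinations(range(1, n + 1), 2))
    for flips in product([False, True], repeat=len(pairs)):
        yield [(j, i) if f else (i, j) for (i, j), f in zip(pairs, flips)]


def big_sum(x):
    """Sum over tournaments D of (-1)^(# red arcs) * prod_j x_j^(indegree of j)."""
    total = 0
    for arcs in tournaments(len(x)):
        red = sum(1 for (i, j) in arcs if i > j)
        # each arc (i, j) contributes one factor x_j, so x_j appears
        # (indegree of j) many times in this product
        total += (-1) ** red * prod(x[j - 1] for (_, j) in arcs)
    return total


def sign(perm):
    """Sign of a permutation (given as a tuple), computed by counting inversions."""
    inversions = sum(1 for a, b in combinations(range(len(perm)), 2)
                     if perm[a] > perm[b])
    return (-1) ** inversions


def small_sum(x):
    """Sum over permutations s of sign(s) * prod_j x_j^(s(j) - 1)."""
    n = len(x)
    return sum(sign(s) * prod(x[j] ** (s[j] - 1) for j in range(n))
               for s in permutations(range(1, n + 1)))


def vandermonde_product(x):
    """The product of (x_j - x_i) over all i < j."""
    return prod(x[j] - x[i] for i, j in combinations(range(len(x)), 2))
```

For `x = [2, 3, 5]`, all three functions return 6. Of course, agreement on a few inputs is no proof; it merely makes the goal plausible.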
Our goal is to prove that the big sum equals the small sum. To prove this, we
must verify the following:

1. Each addend of the small sum is an addend of the big sum. Indeed, for
each permutation $\sigma \in S_n$, there is a certain tournament $T_\sigma$ that has
\[
(-1)^{(\text{\# of red arcs of } T_\sigma)} \prod_{j=1}^{n} x_j^{\deg^- j}
= \operatorname{sign} \sigma \cdot \prod_{j=1}^{n} x_j^{\sigma(j) - 1} .
\]

Can you find this $T_\sigma$?

2. All the addends of the big sum that are not addends of the small sum
cancel each other out. Why?
The basic idea is to argue that if a tournament D appears in the big sum
but not in the small sum, then D has a 3-cycle (i.e., a cycle of length
3). When we reverse such a 3-cycle (i.e., we reverse each of its arcs), the
indegrees of all vertices are preserved, but the sign
$(-1)^{(\text{\# of red arcs of } D)}$ is flipped (since three arcs change
their orientation).
This suffices to show that for each addend that appears in the big sum but
not in the small sum, there is another addend with the same magnitude
but with opposite sign. Unfortunately, this in itself does not suffice to

ensure that all these addends cancel out; for example, the sum 1 + 1 + 1 +
(−1) has the same property but does not equal 0. We need to show that
the # of addends with positive sign (i.e., with $(-1)^{(\text{\# of red arcs of } D)} = 1$)
and a given magnitude equals the # of addends with negative sign (i.e.,
with $(-1)^{(\text{\# of red arcs of } D)} = -1$) and the same magnitude.
One way to achieve this would be by constructing a bijection (aka “perfect
matching”) between the “positive” and the “negative” addends. This is
tricky here: We would have to decide which 3-cycle to reverse (as there
are usually many of them), and this has to be done in a bijective way
(so that two “positive” addends don’t get assigned the same “negative”
partner).
A less direct, but easier way is the following: Fix a positive integer k,
and consider only the tournaments with exactly k many 3-cycles. For
each such tournament, we can reverse any of its k many 3-cycles. It can
be shown (nice exercise!) that reversing the arcs of a 3-cycle does not
change the # of all 3-cycles; thus, we don’t accidentally change our k in the
process. Thus, we find a “k-to-k” correspondence between the “positive”
addends of a given magnitude and the “negative” addends of the same
magnitude. As one can easily see, this entails that the former and the
latter are equinumerous, and thus really cancel out. The addends that
remain are exactly those in the small sum.
As already mentioned, this is only a rough summary of the proof; the details
can be found in [17s-lec8].
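The three facts about reversing a 3-cycle that are used above (indegrees are preserved, the sign flips, and the number of 3-cycles does not change) can be observed on a concrete tournament. The following sketch uses hypothetical helper names and stores a tournament as a set of arcs (i, j); it illustrates the claims but proves nothing.

```python
from itertools import permutations


def three_cycles(arcs):
    """All triples (u, v, w) with arcs u->v, v->w, w->u (rotations count separately)."""
    vertices = {v for arc in arcs for v in arc}
    return [(u, v, w) for u, v, w in permutations(vertices, 3)
            if (u, v) in arcs and (v, w) in arcs and (w, u) in arcs]


def reverse_three_cycle(arcs, cycle):
    """Reverse the three arcs of the given 3-cycle (u, v, w)."""
    u, v, w = cycle
    old = {(u, v), (v, w), (w, u)}
    new = {(v, u), (w, v), (u, w)}
    return (set(arcs) - old) | new


def indegrees(arcs):
    """Map each vertex to its indegree."""
    vertices = {v for arc in arcs for v in arc}
    return {v: sum(1 for (_, b) in arcs if b == v) for v in vertices}


def red_arc_sign(arcs):
    """(-1)^(# of red arcs), where an arc (i, j) is red if i > j."""
    return (-1) ** sum(1 for (i, j) in arcs if i > j)


# A sample tournament on {1, 2, 3, 4} (chosen for illustration only).
D = {(1, 2), (2, 3), (3, 1), (1, 4), (2, 4), (3, 4)}
D_reversed = reverse_three_cycle(D, (1, 2, 3))
```

Here, `indegrees(D)` and `indegrees(D_reversed)` agree, while `red_arc_sign(D)` and `red_arc_sign(D_reversed)` are −1 and 1, respectively. Both tournaments have three 3-cycles (counting rotations separately).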

4.11. Exercises on tournaments


There is, of course, much more to say about tournaments. See [Moon13] for a
selection of topics. Let us merely hint at some possible directions by giving a
few exercises.
The next three exercises use the notion of a “3-cycle”:
Definition 4.11.1. A 3-cycle in a tournament D = (V, A) means a triple
(u, v, w) of vertices in V such that all three pairs (u, v), (v, w) and (w, u)
belong to A.

For example, the tournament shown in Example 4.10.4 has the nine different
3-cycles
(1, 4, 3) , (1, 5, 3) , (2, 5, 3) , (3, 1, 4) ,
(3, 1, 5) , (3, 2, 5) , (4, 3, 1) , (5, 3, 1) ,
(5, 3, 2) .
(Yes, we are counting a 3-cycle (u, v, w) as being distinct from (v, w, u) and
(w, u, v).)

Exercise 4.12. Let D = (V, A) be a tournament. Set n = |V| and
\[
m = \sum_{v \in V} \binom{\deg^- (v)}{2} .
\]

(a) Show that $m = \sum_{v \in V} \binom{\deg^+ (v)}{2}$.

(b) Show that the number of 3-cycles in D is $3 \left( \binom{n}{3} - m \right)$.

[Solution: This is Exercise 5 on homework set #2 from my Spring 2017
course; see the course page for solutions.]
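The two formulas in Exercise 4.12 are easy to test experimentally before attempting a proof. The sketch below (all names are my own; the vertices are 0, 1, ..., n − 1) generates random tournaments and compares the brute-force count of 3-cycles with the claimed expression. It is a plausibility check, not a solution.

```python
import random
from itertools import combinations, permutations
from math import comb


def random_tournament(n, rng):
    """A random tournament on the vertices 0..n-1, as a set of arcs."""
    return {(i, j) if rng.random() < 0.5 else (j, i)
            for i, j in combinations(range(n), 2)}


def count_three_cycles(n, arcs):
    """Number of triples (u, v, w) with arcs u->v, v->w, w->u."""
    return sum(1 for u, v, w in permutations(range(n), 3)
               if (u, v) in arcs and (v, w) in arcs and (w, u) in arcs)


def m_from_indegrees(n, arcs):
    """The number m: the sum of binom(indegree of v, 2) over all vertices v."""
    indeg = [sum(1 for (_, b) in arcs if b == v) for v in range(n)]
    return sum(comb(d, 2) for d in indeg)


def m_from_outdegrees(n, arcs):
    """The analogous sum with outdegrees instead of indegrees."""
    outdeg = [sum(1 for (a, _) in arcs if a == v) for v in range(n)]
    return sum(comb(d, 2) for d in outdeg)
```

Note that the count treats (u, v, w), (v, w, u) and (w, u, v) as three different 3-cycles, in line with Definition 4.11.1.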
The next exercise uses the notation $\deg^-_D v$ for the indegree of a vertex v in a
digraph D. (We usually denote this by $\deg^- v$, but sometimes it is important to
stress the dependence on D, since v can be a vertex of two different digraphs.)
Exercise 4.13. If a tournament D has a 3-cycle (u, v, w), then we can define a
new tournament $D'_{u,v,w}$ as follows: The vertices of $D'_{u,v,w}$ shall be the
same as those of D. The arcs of $D'_{u,v,w}$ shall be the same as those of D,
except that the three arcs (u, v), (v, w) and (w, u) are replaced by the three
new arcs (v, u), (w, v) and (u, w). (Visually speaking, $D'_{u,v,w}$ is obtained
from D by turning the arrows on the arcs (u, v), (v, w) and (w, u) around.) We
say that the new tournament $D'_{u,v,w}$ is obtained from the old tournament D
by a 3-cycle reversal operation.
Now, let V be a finite set, and let E and F be two tournaments with vertex
set V. Prove that F can be obtained from E by a sequence of 3-cycle reversal
operations if and only if each v ∈ V satisfies $\deg^-_E (v) = \deg^-_F (v)$.
(Note that a sequence may be empty, which allows handling the case E = F even
if E has no 3-cycles to reverse.)
[Solution: This is Exercise 6 on homework set #2 from my Spring 2017
course; see the course page for solutions.]

Exercise 4.14. A tournament D = (V, A) is called transitive if it has no
3-cycles.
If a tournament D = (V, A) has three distinct vertices u, v and w satisfying
(u, v) ∈ A and (v, w) ∈ A, then we can define a new tournament $D''_{u,v,w}$ as
follows: The vertices of $D''_{u,v,w}$ shall be the same as those of D. The arcs
of $D''_{u,v,w}$ shall be the same as those of D, except that the two arcs (u, v)
and (v, w) are replaced by the two new arcs (v, u) and (w, v). We say that the
new tournament $D''_{u,v,w}$ is obtained from the old tournament D by a 2-path
reversal operation.
Let D be any tournament. Prove that there is a sequence of 2-path reversal
operations that transforms D into a transitive tournament.

[Solution: This is Exercise 7 on homework set #2 from my Spring 2017
course; see the course page for solutions.]

5. Trees and arborescences


Trees are particularly nice graphs. Among other things, they can be character-
ized as
• the minimal connected graphs on a given set of vertices, or
• the maximal acyclic (= having no cycles) graphs on a given set of vertices,
or
• in many other ways.
Arborescences are their closest analogue for digraphs.
In this chapter, we will discuss the theory of trees and some of their ap-
plications. Further applications are usually covered in courses in theoretical
computer science, but their notion of a tree is somewhat different from ours.

5.1. Some general properties of components and cycles


5.1.1. Backtrack-free walks revisited
Before we start with trees, let us recall and prove some more facts about general
multigraphs. Recall the notion of a “backtrack-free walk” that already had a
brief appearance in the proof of Theorem 2.10.7:

Definition 5.1.1. Let G be a multigraph. A backtrack-free walk of G means
a walk w such that no two adjacent edges of w are identical.

Here are a few properties of this notion:


Proposition 5.1.2. Let G be a multigraph. Let w be a backtrack-free walk of
G. Then, w either is a path or contains a cycle.

Proof. We have already proved this for simple graphs (in Proposition 2.10.4).
More or less the same argument works for multigraphs. (“More or less” be-
cause the definition of a cycle in a multigraph is slightly different from that in
a simple graph; but the proof is easy to adapt.)

Theorem 5.1.3. Let G be a multigraph. Let u and v be two vertices of G.
Assume that there are two distinct backtrack-free walks from u to v in G.
Then, G has a cycle.

Proof. We have already proved this for simple graphs (Claim 1 in the proof of
Theorem 2.10.7). More or less the same argument works for multigraphs.

5.1.2. Counting components


Next, we shall derive a few properties of the number of components of a graph.
Again, we have already done most of the hard work, and we can now derive
corollaries. First, we give this number a name:

Definition 5.1.4. Let G be a multigraph. Then, conn G means the number of
components of G. (Some authors also call this number $b_0 (G)$. This notation
comes from algebraic topology, where it stands for the 0-th Betti number.
This makes sense, because we can regard a multigraph G as a topological
space. But we won’t need this.)

So a multigraph G satisfies conn G = 1 if and only if G is connected.
Moreover, conn G = 0 if and only if G has no vertices.
Let us next recall Definition 3.3.17 and Theorem 3.3.18 (which is an analogue
of Theorem 2.12.2 and can be proved in more or less the same way). As a
consequence of the latter theorem, we obtain the following:

Corollary 5.1.5. Let G be a multigraph. Let e be an edge of G. Then:

(a) If e is an edge of some cycle of G, then conn ( G \ e) = conn G.

(b) If e appears in no cycle of G, then conn ( G \ e) = conn G + 1.

(c) In either case, we have conn ( G \ e) ≤ conn G + 1.

Proof. Part (a) follows from Theorem 3.3.18 (a). Part (b) follows from Theorem
3.3.18 (b). Part (c) follows by combining parts (a) and (b).

Corollary 5.1.6. Let G = (V, E, ϕ) be a multigraph. Then, conn G ≥ |V | − | E|.

Proof. We induct on | E|:


Base case: If | E| = 0, then conn G = |V | (since | E| = 0 means that the graph
G has no edges, and thus no two distinct vertices are path-connected); but this
rewrites as conn G = |V | − | E| (since | E| = 0). Thus, Corollary 5.1.6 is proved
for | E| = 0.
Induction step: Let k ∈ N. Assume (as the induction hypothesis) that Corol-
lary 5.1.6 holds for | E| = k. We must now show that it also holds for | E| = k + 1.
So let us consider a multigraph G = (V, E, ϕ) with | E| = k + 1. Thus, | E| −
1 = k. Pick any edge e ∈ E (such an edge exists, since | E| = k + 1 ≥ 1 > 0).
Then, the multigraph G \ e has edge set E \ {e} and therefore has | E \ { e}| =
| E| − 1 = k many edges. Hence, by the induction hypothesis, we have

conn ( G \ e) ≥ |V | − | E \ { e}|

(since G \ e is a multigraph with vertex set V and edge set E \ {e}). However,
Corollary 5.1.5 (c) yields conn (G \ e) ≤ conn G + 1. Thus,
\[
\operatorname{conn} G
\ge \underbrace{\operatorname{conn} (G \setminus e)}_{\ge |V| - |E \setminus \{e\}|} - 1
\ge |V| - \underbrace{|E \setminus \{e\}|}_{= |E| - 1} - 1
= |V| - (|E| - 1) - 1 = |V| - |E| .
\]

This completes the induction step. Thus, Corollary 5.1.6 is proven.

Corollary 5.1.7. Let G = (V, E, ϕ) be a multigraph that has no cycles. Then,
conn G = |V| − |E|.

Proof. Replay the proof of Corollary 5.1.6, with just a few changes: Instead of
applying Corollary 5.1.5 (c), apply Corollary 5.1.5 (b) (this is allowed because
G has no cycles and thus e appears in no cycle of G). The induction hypothesis
can be used because when G has no cycles, G \ e has no cycles either. All ≤ and
≥ signs in the above proof now can be replaced by = signs (since Corollary
5.1.5 (b) claims an equality, not an inequality). The result is therefore conn G =
|V | − | E | .

Corollary 5.1.8. Let G = (V, E, ϕ) be a multigraph that has at least one cycle.
Then, conn G ≥ |V | − | E| + 1.

Proof. Pick an edge e ∈ E that belongs to some cycle (such an edge exists, since
G has at least one cycle). Then, Corollary 5.1.5 (a) yields conn ( G \ e) = conn G.
However, Corollary 5.1.6 (applied to G \ e and E \ {e} instead of G and E) yields
\[
\operatorname{conn} (G \setminus e) \ge |V| - \underbrace{|E \setminus \{e\}|}_{= |E| - 1}
= |V| - (|E| - 1) = |V| - |E| + 1 .
\]

Since conn (G \ e) = conn G, this rewrites as conn G ≥ |V | − | E| + 1.


We summarize what we have proved into one convenient theorem:

Theorem 5.1.9. Let G = (V, E, ϕ) be a multigraph. Then:

(a) We always have conn G ≥ |V | − | E|.

(b) We have conn G = |V | − | E| if and only if G has no cycles.

Proof. (a) This is Corollary 5.1.6.


(b) ⇐=: This is Corollary 5.1.7.
=⇒: Assume that conn G = |V | − | E|. If G had any cycles, then Corollary
5.1.8 would yield conn G ≥ |V | − | E| + 1 > |V | − | E|, which would contradict
conn G = |V | − | E|. So G has no cycles. This proves the “=⇒” direction of
Theorem 5.1.9.
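Theorem 5.1.9 translates directly into a small computation. The sketch below is an illustration with made-up names; it assumes that a multigraph is given as a list of vertices and a list of endpoint pairs (u, v), where repeated pairs are parallel edges and u = v is a loop. It adds the edges one at a time and uses a union-find structure: an edge whose endpoints are already connected closes a cycle and leaves the number of components unchanged, whereas any other edge merges two components.

```python
def components_and_acyclic(vertices, edges):
    """Return (conn G, whether G has no cycles) for a multigraph."""
    parent = {v: v for v in vertices}

    def find(v):
        # follow parent pointers up to the representative of v's component
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    conn = len(parent)
    acyclic = True
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            acyclic = False   # u and v were already connected: a cycle appears
        else:
            parent[ru] = rv   # the edge merges two components
            conn -= 1
    return conn, acyclic


def check_theorem_5_1_9(vertices, edges):
    """Check both parts of Theorem 5.1.9 on one multigraph."""
    conn, acyclic = components_and_acyclic(vertices, edges)
    bound = len(vertices) - len(edges)
    return conn >= bound and ((conn == bound) == acyclic)
```

For example, a path on three vertices gives `(1, True)`, a triangle gives `(1, False)`, and two parallel edges next to an isolated vertex give `(2, False)`.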

Remark 5.1.10. Let G = (V, E, ϕ) be a multigraph. Does the number

conn G − (|V| − |E|)

have anything to do with how many cycles G has? We know that it is 0 if G
has no cycles. More generally, could it just be the number of cycles of G?
(Let’s say we count reversals and cyclic rotations of a cycle as being the same
cycle.)
Unfortunately, the answer is still no. For example, for n ≥ 4, a complete graph
$K_n$ has many more than $1 - \left( n - \binom{n}{2} \right)$ many cycles.
However, there is still some subtler connection. The number
conn G − (|V| − |E|) is known as the circuit rank or the cyclomatic number of G,
and is the dimension of a certain vector space that, in some way, consists of
cycles.
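To see how far apart the two numbers are for complete graphs, here is a brute-force sketch (with hypothetical function names). It counts the cycles of $K_n$ with rotations and reversals identified, and compares the result with the circuit rank.

```python
from itertools import combinations, permutations
from math import comb


def count_cycles_complete_graph(n):
    """Count the cycles of K_n, identifying rotations and reversals of a cycle."""
    count = 0
    for k in range(3, n + 1):
        for subset in combinations(range(n), k):
            rest = subset[1:]
            # Start each cycle at its smallest vertex (this removes rotations)
            # and require second vertex < last vertex (this removes reversals).
            count += sum(1 for p in permutations(rest) if p[0] < p[-1])
    return count


def circuit_rank_complete_graph(n):
    """conn K_n - (|V| - |E|) for n >= 1, using conn K_n = 1."""
    return 1 - (n - comb(n, 2))
```

For n = 3, both numbers are 1. For n = 4, there are 7 cycles but the circuit rank is 3, and for n = 5, there are 37 cycles but the circuit rank is 6.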

5.2. Forests and trees


5.2.1. Definitions
We now introduce two of the heroes of this chapter:

Definition 5.2.1. A forest is a multigraph with no cycles.


(In particular, a forest therefore cannot contain two distinct parallel edges.
It also cannot contain loops.)

Definition 5.2.2. A tree is a connected forest.



Example 5.2.3. Consider the following multigraphs:

[Figure: drawings of the eight multigraphs A, B, C, D, E, F, G and H.]

(Yes, G is an empty graph with no vertices.) Which of them are forests, and
which are trees?
• The graph A is not a forest, since it has a cycle (actually, several cycles).
Thus, A is not a tree either.
• The graph B is a tree.
• The graph C is a forest, but not a tree, since it is not connected.
• The graph D is a tree.
• The graph E is a forest, but not a tree.

• The graph F is not a forest, since it has cycles.

• The graph G (which has no vertices and no edges) is a forest, but not
a tree, since it is not connected (recall: a graph is connected if it has 1
component; but G has 0 components).

• The graph H is a tree.
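Deciding whether a given multigraph is a forest or a tree can be automated. The sketch below is one possible approach and not the only one: it counts components by depth-first search and then uses Theorem 5.1.9 (b) to detect cycles. The function names and the input format (a list of vertices and a list of endpoint pairs) are my own choices, and the sample inputs in the usage note are not the graphs A, ..., H drawn above.

```python
def count_components(vertices, edges):
    """conn G: the number of components, found by repeated depth-first search."""
    neighbors = {v: [] for v in vertices}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    seen, conn = set(), 0
    for start in vertices:
        if start in seen:
            continue
        conn += 1                 # a new component has been discovered
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            for w in neighbors[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return conn


def is_forest(vertices, edges):
    # Theorem 5.1.9 (b): G has no cycles if and only if conn G = |V| - |E|
    return count_components(vertices, edges) == len(vertices) - len(edges)


def is_tree(vertices, edges):
    # a tree is a connected forest; "connected" means exactly 1 component
    return is_forest(vertices, edges) and count_components(vertices, edges) == 1
```

For instance, `is_forest([], [])` is true but `is_tree([], [])` is false (the graph with no vertices has 0 components), whereas `is_tree([1], [])` is true. Two parallel edges, as in `is_forest([1, 2], [(1, 2), (1, 2)])`, or a loop, as in `is_forest([1], [(1, 1)])`, make the test fail.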

5.2.2. The tree equivalence theorem


Trees can be described in many ways:

Theorem 5.2.4 (The tree equivalence theorem). Let G = (V, E, ϕ) be a
multigraph. Then, the following eight statements are equivalent:

• Statement T1: The multigraph G is a tree.

• Statement T2: The multigraph G has no loops, and we have V ≠ ∅,
and for each u ∈ V and v ∈ V, there is a unique path from u to v.

• Statement T3: We have V ≠ ∅, and for each u ∈ V and v ∈ V, there is
a unique backtrack-free walk from u to v.

• Statement T4: The multigraph G is connected, and we have |E| = |V| − 1.

• Statement T5: The multigraph G is connected, and we have |E| < |V|.

• Statement T6: We have V ≠ ∅, and the graph G is a forest, but adding
any new edge to G creates a cycle.

• Statement T7: The multigraph G is connected, but removing any edge
from G yields a disconnected (i.e., non-connected) graph.

• Statement T8: The multigraph G is a forest, and we have |E| ≥ |V| − 1
and V ≠ ∅.

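Proof. We shall prove the following implications:

[Figure: a digraph on the vertices T1, T2, . . ., T8 with the arcs T1→T3,
T3→T2, T2→T7, T7→T1, T1→T6, T6→T1, T1→T8, T8→T1, T1→T4, T4→T5 and T5→T1.]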

In this digraph, an arc from Ti to Tj stands for the implication Ti =⇒Tj. Since
this digraph is strongly connected (i.e., you can travel from Statement Ti to
Statement Tj along its arcs for any i, j), this will prove the theorem. So let us
prove the implications.
Proof of T1=⇒T3: Assume that Statement T1 holds. Thus, G is a tree.
Therefore, G is connected, so that V ≠ ∅. We must prove that for each u ∈ V and
v ∈ V, there is a unique backtrack-free walk from u to v. The existence of such
a walk is clear (since G is connected, so there is a path from u to v). Thus, we
only need to show that it is unique. But this is easy: If there were two distinct
backtrack-free walks from u to v (for some u ∈ V and v ∈ V), then Theorem
5.1.3 would show that G has a cycle, and thus G could not be a forest, let alone
a tree. Thus, the backtrack-free walk from u to v is unique. So we have proved
Statement T3. The implication T1=⇒T3 is thus proved.
Proof of T3=⇒T2: Assume that Statement T3 holds. We must prove that
Statement T2 holds. First, G has no loops, because if there was a loop e with
endpoint u, then the two walks (u) and (u, e, u) would be two distinct
backtrack-free walks from u to u. It remains to prove that for each u ∈ V and
v ∈ V, there is a unique path from u to v. However, the existence of a walk
from u to v always implies the existence of a path from u to v (by Corollary
3.3.10). Moreover, the uniqueness of a backtrack-free walk from u to v implies
the uniqueness of a path from u to v (since any path is a backtrack-free walk).
Thus, Statement T2 follows from Statement T3.
Proof of T2=⇒T7: Assume that Statement T2 holds. Then, G is connected.
Now, let us remove any edge e from G. Let u and v be the endpoints of e. Then,
u ≠ v (since G has no loops). There cannot be a path from u to v in the graph
G \ e (because if there was such a path, then it would also be a path from u to v
in the graph G, and this path would be distinct from the path (u, e, v); thus, the
graph G would have at least two paths from u to v; but this would contradict
the uniqueness part of Statement T2). Hence, the graph G \ e is disconnected.
So we have shown that G is connected, but removing any edge from G yields a
disconnected graph. In other words, Statement T7 holds.

Proof of T7=⇒T1: Assume that Statement T7 holds. We must show that G
is a tree. Since G is connected (by Statement T7), it suffices to show that G
is a forest, i.e., that G has no cycles. However, if G had any cycle, then we
could pick any edge e of this cycle, and then we would know that G \ e is still
connected (since Corollary 5.1.5 (a) would yield conn ( G \ e) = conn G = 1),
and this would contradict Statement T7. Thus, G has no cycles, hence is a
forest. This proves Statement T1.
Proof of T1=⇒T6: Assume that Statement T1 holds. Thus, G is a tree. We
must show that adding any new edge to G creates a cycle (since all other parts
of Statement T6 are clear).
Indeed, let us add a new edge f to G. Let u and v be the endpoints of f . The
graph G is connected, so there is already a path from u to v in G. Combining
this path with the edge f , we obtain a cycle. Thus, the graph obtained from G
by adding the new edge f has a cycle. This completes our proof that Statement
T6 holds.
Proof of T6=⇒T1: Assume that Statement T6 holds. Thus, G is a forest. We
must only show that G is connected.
Assume the contrary. Thus, there exist two vertices u and v of G that are not
path-connected in G. Hence, adding a new edge f with endpoints u and v to
the graph G cannot create a new cycle (because any such cycle would have to
contain f (otherwise, it would already be a cycle of G, but G has no cycles), and
then we could remove f from it to obtain a path from u to v in G; but such a
path cannot exist, since u and v are not path-connected in G). This contradicts
Statement T6.
So we have shown that G is connected, and thus G is a tree. This proves
Statement T1.
Proof of T1=⇒T8: Assume that Statement T1 holds. So G is a tree. Clearly, G
is then a forest. We must show that | E| ≥ |V | − 1.
Theorem 5.1.9 (a) yields conn G ≥ |V | − | E|. But we have conn G = 1 because
G is connected. Thus, 1 = conn G ≥ |V | − | E|. In other words, | E| ≥ |V | − 1.
This proves Statement T8.
Proof of T8=⇒T1: Assume that Statement T8 holds. Thus, G is a forest. We
must only show that G is connected. However, G is a forest, and thus has
no cycles. Hence, Theorem 5.1.9 (b) yields conn G = |V | − | E| ≤ 1 (since
Statement T8 yields |E| ≥ |V| − 1). On the other hand, conn G ≥ 1 (since V ≠ ∅).
Combining these two inequalities, we obtain conn G = 1. In other words, G is
connected. This yields Statement T1 (since G is a forest).
Proof of T1=⇒T4: Assume that Statement T1 holds. Then, G is a tree, hence a
connected forest. Therefore, G has no cycles (by the definition of a forest). Theo-
rem 5.1.9 (b) therefore yields conn G = |V | − | E| . Thus, |V | − | E| = conn G = 1
(since G is connected), so that | E| = |V | − 1. Thus, Statement T4 is proved.
Proof of T4=⇒T5: The implication T4=⇒T5 is obvious.

Proof of T5=⇒T1: Assume that Statement T5 holds. Thus, the multigraph G
is connected, and we have |E| < |V|. Thus, |E| ≤ |V| − 1. In other words,
1 ≤ |V | − | E|. Since G is connected, we have conn G = 1 ≤ |V | − | E| . However,
Theorem 5.1.9 (a) yields conn G ≥ |V | − | E|. Combining these two inequalities,
we obtain conn G = |V | − | E|. Thus, Theorem 5.1.9 (b) shows that G has no
cycles. In other words, G is a forest. Hence, G is a tree (since G is connected).
This proves Statement T1.
We have now proved all necessary implications to conclude that all eight
statements T1, T2, . . ., T8 are equivalent. Theorem 5.2.4 is thus proved.
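As a sanity check, some of the equivalences in Theorem 5.2.4 can be confirmed by exhaustive search over small graphs. The sketch below restricts itself to simple graphs on the vertices 0, 1, ..., n − 1 (so that "no loops" is automatic) and to Statements T2, T4, T5 and T7, which can be evaluated without any cycle detection. All function names are made up for this illustration.

```python
from itertools import combinations


def adjacency(vertices, edges):
    """Adjacency sets of a simple graph."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def is_connected(vertices, adj):
    """True if the graph has exactly 1 component."""
    if not vertices:
        return False              # no vertices means 0 components, not 1
    seen, stack = {vertices[0]}, [vertices[0]]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vertices)


def count_paths(adj, u, v, visited=None):
    """Number of paths (walks without repeated vertices) from u to v."""
    if visited is None:
        visited = {u}
    if u == v:
        return 1
    return sum(count_paths(adj, w, v, visited | {w})
               for w in adj[u] if w not in visited)


def statements(vertices, edges):
    """Truth values of Statements T2, T4, T5, T7 for a simple graph."""
    adj = adjacency(vertices, edges)
    connected = is_connected(vertices, adj)
    t2 = bool(vertices) and all(count_paths(adj, u, v) == 1
                                for u in vertices for v in vertices)
    t4 = connected and len(edges) == len(vertices) - 1
    t5 = connected and len(edges) < len(vertices)
    t7 = connected and all(
        not is_connected(vertices, adjacency(vertices, edges[:i] + edges[i + 1:]))
        for i in range(len(edges)))
    return t2, t4, t5, t7


def all_simple_graphs(n):
    """Yield (vertices, edges) for every simple graph on 0..n-1."""
    vertices = list(range(n))
    pairs = list(combinations(vertices, 2))
    for mask in range(2 ** len(pairs)):
        yield vertices, [p for i, p in enumerate(pairs) if mask >> i & 1]
```

For each of the graphs produced by `all_simple_graphs(n)` with n ≤ 4, the four truth values returned by `statements` coincide, as the theorem predicts.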
We also observe the following connection between trees and forests:

Proposition 5.2.5. Let G be a multigraph, and let C1, C2, . . . , Ck be its
components. Then, G is a forest if and only if all the induced subgraphs
G [C1], G [C2], . . . , G [Ck] are trees.

Proof. =⇒: Assume that G is a forest. Thus, G has no cycles. Hence, the induced
subgraphs G [C1 ] , G [C2 ] , . . . , G [Ck ] have no cycles either (since a cycle in any
of them would be a cycle of G); in other words, they are forests. But they are
furthermore connected (since the induced subgraph on a component is always
connected). Hence, they are connected forests, i.e., trees.
⇐=: Assume that the induced subgraphs G [C1], G [C2], . . . , G [Ck] are trees.
Hence, none of them has a cycle. Thus, G has no cycles either, since a cycle of
G would have to be fully contained in one of these induced subgraphs. (Indeed,
if it wasn’t, then it would contain vertices from different components. But this
is impossible, since there are no walks between vertices in different
components.) In other words, G is a forest.

5.2.3. Summary
Let us briefly summarize some properties of trees:
If T = (V, E, ϕ) is a tree, then...

• T is a connected forest. (This is how trees were defined.) Thus, T has no
cycles. (This is how forests were defined.)

• we have |E| = |V| − 1. (This follows from the implication T1=⇒T4 in
Theorem 5.2.4.)

• adding any new edge to T creates a cycle. (This follows from the
implication T1=⇒T6 in Theorem 5.2.4.)

• removing any edge from T yields a disconnected (i.e., non-connected)
graph. (This follows from the implication T1=⇒T7 in Theorem 5.2.4.)

• for each u ∈ V and v ∈ V, there is a unique backtrack-free walk from
u to v. (This follows from the implication T1=⇒T3 in Theorem 5.2.4.)
Moreover, this backtrack-free walk is a path (since any walk from u to v
contains a path from u to v).
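The last property has a practical side: in a tree, the path from u to v can be found by any graph search, and there is no choice involved. Here is a small sketch using breadth-first search. The function name and the input format (a list of endpoint pairs) are my own, and the input is assumed to be a tree.

```python
from collections import deque


def tree_path(edges, u, v):
    """Return the vertices of the unique path from u to v in a tree
    that is given by its list of edges (as endpoint pairs)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    adj.setdefault(u, [])          # so that a one-vertex tree also works
    pred = {u: None}               # pred[y] = the vertex from which y was reached
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in pred:
                pred[y] = x
                queue.append(y)
    path = [v]                     # walk back from v to u along the predecessors
    while path[-1] != u:
        path.append(pred[path[-1]])
    return path[::-1]
```

For the tree with the edges (1, 2), (2, 3), (2, 4), (4, 5), the call `tree_path(edges, 1, 5)` returns `[1, 2, 4, 5]`.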

Remark 5.2.6. Computer scientists use some notions of “trees” that are sim-
ilar to ours, but not quite the same. In particular, their trees often have roots
(i.e., one vertex is chosen to be called “the root” of the tree), which leads to
a parent/child relationship on each edge (namely: the endpoint closer to the
root is called the “parent” of the endpoint further away from the root). Of-
ten, they also impose a total order on the children of each given vertex. With
these extra data, a tree can be used for addressing objects, since each vertex
has a unique “path description” from the root leading to it (e.g., “the second
child of the fourth child of the root”). But this all is going too far afield for
us here; we are mainly interested in trees as graphs, and won’t impose any
extra structure unless we need it for something.
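For readers who want to see this extra structure spelled out, the following sketch turns a tree (in our sense) into a rooted tree with ordered children. Everything here is an assumption made for illustration: the function names, the choice to order the children by their labels, and the representation of a "path description" as a list of 1-based child positions.

```python
from collections import deque


def root_tree(edges, root):
    """Orient a tree away from `root`; return the dictionaries (parent, children)."""
    adj = {root: []}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    parent = {root: None}
    children = {v: [] for v in adj}
    queue = deque([root])
    while queue:
        x = queue.popleft()
        for y in sorted(adj[x]):   # sorting imposes a total order on the children
            if y not in parent:
                parent[y] = x
                children[x].append(y)
                queue.append(y)
    return parent, children


def address(parent, children, v):
    """Path description of v: the child positions (1-based) on the way
    from the root down to v."""
    steps = []
    while parent[v] is not None:
        steps.append(children[parent[v]].index(v) + 1)
        v = parent[v]
    return steps[::-1]
```

For the tree with the edges (1, 2), (1, 3), (3, 4), (3, 5) and the root 1, the vertex 5 gets the address `[2, 2]`, i.e., "the second child of the second child of the root".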

Exercise 5.1. Let G be a multigraph that has no loops. Assume that there
exists a vertex u of G such that

for each vertex v of G, there is a unique path from u to v in G.

Prove that G is a tree.


[Remark: Pay attention to the quantifiers used here: ∃u∀v. This differs
from the ∀u∀v in Statement T2 of the tree equivalence theorem (Theorem
5.2.4).]

5.3. Leaves
Continuing with our faux-botanical terminology, we define leaves in a tree:

Definition 5.3.1. Let T be a tree. A vertex of T is said to be a leaf if its degree
is 1.

For example, the tree

[Figure: a tree with vertices 1, 2, 3, 4, 5 and edges b, c, d, g.]

has three leaves: 1, 2 and 4.



How to find a tree with as many leaves as possible (for a given number of
vertices)? For any n ≥ 3, the simple graph

({0, 1, . . . , n − 1} , {0i | i > 0})


is a tree (when considered as a multigraph), and has n − 1 leaves (namely, all
of 1, 2, . . . , n − 1). This tree is called an n-star graph, as it looks as follows:

[Figure: the 8-star graph, with the center 0 joined to each of the vertices
1, 2, . . . , 7]
(for n = 8).
It is easy to see that no tree with n ≥ 3 vertices can have more than n − 1 leaves,
so the n-star graph is optimal in this sense. Note that for n = 2, the n-star graph
has 2 leaves, not 1.
How to find a tree with as few leaves as possible? For any n ≥ 2, the n-path
graph

$P_n$ = 1 -- 2 -- 3 -- ··· -- n

is a tree with only 2 leaves (viz., the vertices 1 and n). Can we find a tree with
fewer leaves? For n = 1, yes, because the 1-path graph P1 (this is simply the
graph with 1 vertex and no edges) has no leaves at all. However, for n ≥ 2, the
n-path graph is the best we can do:

Theorem 5.3.2. Let T be a tree with at least 2 vertices. Then:

(a) The tree T has at least 2 leaves.

(b) Let v be a vertex of T. Then, there exist two distinct leaves p and q of T
such that v lies on the path from p to q.

Note that I’m saying “the path” rather than “a path” here. This is allowed,
because in a tree, for any two vertices p and q, there is a unique path from p
to q. This follows from Statement T2 in the tree equivalence theorem (Theorem
5.2.4).
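The leaf counts of the two extreme examples (stars and paths) are easy to compute. In the sketch below, the names `leaves`, `star` and `path` are my own, and a graph is again a list of vertices together with a list of endpoint pairs.

```python
def leaves(vertices, edges):
    """The vertices of degree 1 (the graph is assumed to have no loops)."""
    deg = {v: 0 for v in vertices}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return [v for v in vertices if deg[v] == 1]


def star(n):
    """The n-star graph: the center 0 is joined to each of 1, ..., n-1."""
    return list(range(n)), [(0, i) for i in range(1, n)]


def path(n):
    """The n-path graph on the vertices 1, ..., n."""
    return list(range(1, n + 1)), [(i, i + 1) for i in range(1, n)]
```

For example, `leaves(*star(8))` returns the seven vertices 1, 2, ..., 7, whereas `leaves(*path(5))` returns `[1, 5]`. The degenerate cases behave as described above: the 2-star graph has 2 leaves, and the 1-path graph has none.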
Proof of Theorem 5.3.2. (b) We apply a variant of the “longest path trick”: Among
all paths that contain the vertex v, let w be a longest one. Let p be the starting

point of w, and let q be the ending point of w. We shall show that p and q are
two distinct leaves.
[Here is a picture of w, for what it’s worth:

p ··· v ··· q
.

Of course, the tree T can have other edges as well, not just those of w.]
First, we observe that T is connected (since T is a tree), and has at least one
vertex u distinct from v (since T has at least 2 vertices). Hence, T has a path r
that connects v to u. This path r must contain at least one edge (since u ≠ v).
Thus, we have found a path r of T that contains v and contains at least one
edge. Hence, the path w must contain at least one edge as well (since w is a
longest path that contains v, and thus cannot be shorter than r). Since w is a
path from p to q, we thus conclude that p ≠ q (because if a path contains at
least one edge, then its starting point is distinct from its ending point).
Now, assume (for the sake of contradiction) that p is not a leaf. Then, deg p ≠
1. The path w already contains one edge that contains p (namely, the first edge
of w). Since deg p ≠ 1, there must be another edge f of T that contains p.
Consider this f. Let p′ be its endpoint distinct from p (if f is a loop, then we
set p′ = p). Appending this edge f (and its endpoint) to the beginning of the
path w, we obtain a backtrack-free walk
\[
\left( p', f, \underbrace{p, \ldots, v, \ldots, q}_{\text{This is } w} \right)
\]

(this is backtrack-free since f is not the first edge of w). According to Proposi-
tion 5.1.2, this backtrack-free walk either is a path or contains a cycle. Since T
has no cycle (because T is a forest), we thus conclude that this backtrack-free
walk is a path. It is furthermore a path that contains v and is longer than w
(longer by 1, in fact). But this contradicts the fact that w is a longest path that
contains v. This contradiction shows that our assumption (that p is not a leaf)
was wrong.
Hence, p is a leaf. A similar argument shows that q is a leaf (here, we need
to append the new edge at the end of w rather than at the beginning). Thus,
p and q are two distinct leaves of T (distinct because p ≠ q) such that v lies on
the path from p to q (since v lies on the path w, which is a path from p to q).
This proves Theorem 5.3.2 (b).
(a) Pick any vertex v of T. Then, Theorem 5.3.2 (b) shows that there exist two
distinct leaves p and q of T such that v lies on the path from p to q. Thus, in
particular, there exist two distinct leaves p and q of T. In other words, T has at
least two leaves. This proves Theorem 5.3.2 (a).
An introduction to graph theory, version August 2, 2023 page 155

[Remark: Another way to prove part (a) is to write the tree T as T = (V, E, ϕ),
and recall the handshake lemma, which yields

    ∑_{v∈V} deg v = 2 · |E| = 2 · (|V | − 1) = 2 · |V | − 2

(since |E| = |V | − 1 in a tree). Since each v ∈ V satisfies deg v ≥ 1 (why?), this equality entails that at least
two vertices v ∈ V must satisfy deg v ≤ 1 (since otherwise, the sum ∑_{v∈V} deg v
would be ≥ 2 · |V | − 1), and therefore these two vertices are leaves.]
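
The degree count in the remark above can be checked on a concrete example. The following Python sketch does this for one small tree; the tree and the helper name degrees are illustrative assumptions, not taken from the text.

```python
from collections import Counter

def degrees(vertices, edges):
    # Degree of each vertex; a loop (u, u) would be counted twice.
    deg = Counter({v: 0 for v in vertices})
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

# A hypothetical tree on the vertices 1..6 (chosen only for illustration).
V = [1, 2, 3, 4, 5, 6]
E = [(1, 2), (2, 3), (2, 4), (4, 5), (4, 6)]

deg = degrees(V, E)
leaves = [v for v in V if deg[v] == 1]

# Handshake lemma for a tree: the degrees sum to 2|E| = 2|V| - 2.
print(sum(deg.values()), 2 * len(E), 2 * len(V) - 2)
print("leaves:", leaves)
```

Since every degree is at least 1 and the degrees sum to 2 · |V | − 2, at least two of them must equal 1; the list of leaves printed by the sketch reflects this.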

Leaves are particularly helpful for performing induction on trees. The formal
reason for this is the following theorem:

Theorem 5.3.3 (induction principle for trees). Let T be a tree with at least 2
vertices. Let v be a leaf of T. Let T \ v be the multigraph obtained from T
by removing v and all edges that contain v (note that there is only one such
edge, since v is a leaf). Then, T \ v is again a tree.

Here is an example of a tree T and of the smaller tree T \ v obtained by
removing a leaf v (namely, v = 3):

[Figure: the tree T with vertices 1, 2, . . . , 8 (left) and the tree T \ v with vertices 1, 2, 4, 5, 6, 7, 8 (right).]

Proof of Theorem 5.3.3. Write T as T = (V, E, ϕ). Thus, T \ v is the induced
subgraph T [V \ {v}].
The graph T is a tree, thus a forest; hence, it has no cycles. Thus, the graph
T \ v has no cycles either. Hence, it is a forest.
Furthermore, this forest T \ v has at least 1 vertex (since T has at least 2
vertices).
We shall now show that any two vertices p and q of T \ v are path-connected
in T \ v.

Indeed, let p and q be two vertices of T \ v. Then, p and q are path-connected


in T (since T is connected). Hence, there exists a path w from p to q in T.
Consider this path w. Note that v is neither the starting point nor the ending
point of this path w (since p and q are vertices of T \ v, and thus distinct from
v). Hence, if v was a vertex of w, then w would contain two distinct edges that
contain v (namely, the edge just before v and the edge just after v). But this is
impossible, since there is only one edge available that contains v (because v is
a leaf). Thus, v cannot be a vertex of w. Hence, the path w does not use the
vertex v, and thus is a path in the graph T \ v as well. So the vertices p and q
are path-connected in T \ v.
We have now shown that any two vertices p and q of T \ v are path-connected
in T \ v. This shows that T \ v is connected (since T \ v has at least 1 vertex).
Hence, T \ v is a tree (since T \ v is a forest).
Theorem 5.3.3 has a converse as well:

Theorem 5.3.4. Let G be a multigraph. Let v be a vertex of G such that


deg v = 1 and such that G \ v is a tree. (Here, G \ v means the multigraph
obtained from G by removing the vertex v and all edges that contain v.)
Then, G is a tree.

Proof. Left to the reader. (The main step is to show that a cycle of G cannot
contain v.)
Theorem 5.3.3 helps prove many properties of trees by induction on the num-
ber of vertices. In the induction step, remove a leaf v and apply the induction
hypothesis to T \ v.
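
To see the leaf-removal step in action, here is a minimal Python sketch that repeatedly removes a leaf from a small tree and checks that a tree remains each time. The example tree and the names is_tree and remove_leaf are assumptions made for this illustration; the check in is_tree uses the standard fact that a connected graph with |V | − 1 edges is a tree.

```python
def is_tree(vertices, edges):
    # A multigraph is a tree iff it is connected and has |V| - 1 edges.
    vertices = set(vertices)
    if not vertices or len(edges) != len(vertices) - 1:
        return False
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen == vertices

def remove_leaf(vertices, edges):
    # Find some vertex of degree 1 and delete it with its unique edge.
    for v in vertices:
        if sum((a == v) + (b == v) for a, b in edges) == 1:
            rest = [x for x in vertices if x != v]
            return v, rest, [(a, b) for a, b in edges if v not in (a, b)]
    raise ValueError("no leaf found")

# A hypothetical tree, chosen only for illustration.
V, E = [1, 2, 3, 4, 5, 6], [(1, 2), (2, 3), (2, 4), (4, 5), (4, 6)]
removed = []
while len(V) >= 2:
    v, V, E = remove_leaf(V, E)
    assert is_tree(V, E)  # Theorem 5.3.3: removing a leaf leaves a tree
    removed.append(v)
print("leaves removed in order:", removed)
```

An induction on the number of vertices follows the same pattern in reverse: the claim is established for the smaller tree first and then extended across the one edge that was removed.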
The following exercise is essentially a generalization of Theorem 5.3.2 (a):

Exercise 5.2. Let T be a tree. Let w be any vertex of T. Prove that T has at
least deg w many leaves.

Exercise 5.3. A dominating set of a multigraph G is defined to be a dominat-


ing set of its underlying simple graph Gsimp .
Let G be a forest. Prove that

∑ (−1)|D| = ±1.
D is a dominating set of G

Exercise 5.4. Let T be a tree having more than 1 vertex. Let L be the set of
leaves of T. Prove that it is possible to add | L| − 1 new edges to T in such a
way that the resulting multigraph has a Hamiltonian cycle. [Solution: This is
Exercise 4 on homework set #3 from my Spring 2017 course; see the course
page for solutions.]

5.4. Spanning trees


5.4.1. Spanning subgraphs
We now proceed to a crucial application of trees. First we define a concept that
makes sense for any multigraph:

Definition 5.4.1. A spanning subgraph of a multigraph G = (V, E, ϕ) means


a multigraph of the form (V, F, ϕ | F ), where F is a subset of E.
In other words, it means a submultigraph of G with the same vertex set as
G.
In other words, it means a multigraph obtained from G by removing some
edges, but leaving all vertices undisturbed.

Compare this to the notion of an induced subgraph:

• To build an induced subgraph, we throw away some vertices but keep all
the edges that we can keep. (As usual in mathematics, the words “some
vertices” include “no vertices” and “all vertices”.)

• In contrast, to build a spanning subgraph, we keep all vertices but throw


away some edges.

5.4.2. Spanning trees


Spanning subgraphs are particularly useful when they are trees:

Definition 5.4.2. A spanning tree of a multigraph G means a spanning sub-


graph of G that is a tree.

Example 5.4.3. Let G be the following multigraph:

[Figure: the multigraph G, with vertices 1, 2, 3, 4, 5 and edges α, β, γ, δ, ε, κ, λ, µ, ν.]

Here is a spanning tree of G:

[Figure: a spanning tree of G with edges α, δ, ε, ν.]

Here is another:

[Figure: a spanning tree of G with edges β, δ, ε, ν.]

(Yes, this is a different one, because α ≠ β.) And here is yet another spanning
tree of G:
[Figure: a spanning tree of G with edges β, ε, λ, ν.]

Example 5.4.4. Let n be a positive integer. Consider the cycle graph Cn . (We
defined this graph Cn in Definition 2.6.3 for all n ≥ 2, but we later redefined
C2 and defined C1 in Definition 3.3.5. Here, we are using the latter modified
definition.)
The graph Cn has exactly n spanning trees. Indeed, any graph obtained
from Cn by removing a single edge is a spanning tree of Cn .

Proof. A tree with n vertices must have exactly n − 1 edges (by the implication
T1=⇒T4 in Theorem 5.2.4). Thus, a spanning subgraph of Cn can be a tree only
if it has n − 1 edges, i.e., only if it is obtained from Cn by removing a single edge
(since Cn has n edges in total). Thus, Cn has at most n spanning trees (since
Cn has n edges that can be removed). It remains to check that any subgraph

obtained from Cn by removing a single edge is indeed a spanning tree. But
this is easy, since all such subgraphs are isomorphic to the path graph Pn . This
proves Example 5.4.4.
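
The count in Example 5.4.4 can also be confirmed by brute force for small n. The following Python sketch is an illustration only: it labels the vertices 0, 1, . . . , n − 1 (an assumption of this sketch, not the text's labeling), models C1 as a single loop and C2 as two parallel edges, and tries every set of n − 1 edges.

```python
from itertools import combinations

def is_spanning_tree(n, edges):
    # n - 1 edges without a cycle on n vertices form a tree on all of them.
    if len(edges) != n - 1:
        return False
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:  # this edge would close a cycle (or is a loop)
            return False
        parent[ru] = rv
    return True

def count_spanning_trees(n, edges):
    # Try every choice of n - 1 edge positions (parallel edges stay distinct).
    return sum(
        is_spanning_tree(n, [edges[i] for i in idx])
        for idx in combinations(range(len(edges)), n - 1)
    )

def cycle_graph(n):
    # Edges i -- (i + 1 mod n); gives a loop for n = 1, a double edge for n = 2.
    return [(i, (i + 1) % n) for i in range(n)]

for n in range(1, 8):
    print(n, count_spanning_trees(n, cycle_graph(n)))
```

The same brute-force counter can be applied to other small graphs, although the number of edge subsets grows quickly, so it is only practical for experiments.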

Exercise 5.5. Fix m ≥ 1. Let G be the simple graph with 3m + 2 vertices

a, b, x1 , x2 , . . . , xm , y1 , y2 , . . . , ym , z1 , z2 , . . . , zm

and the following 3m + 3 edges:

    a x_1 , a y_1 , a z_1 ,
    x_i x_{i+1} , y_i y_{i+1} , z_i z_{i+1}    for all i ∈ {1, 2, . . . , m − 1} ,
    x_m b, y_m b, z_m b.

(Thus, the graph consists of two vertices a and b connected by three paths,
each of length m + 1, with no overlaps between the paths except for their
starting and ending points. Here is a picture for m = 3:

x1 x2 x3

a y1 y2 y3 b

z1 z2 z3

) Compute the number of spanning trees of G.


[To argue why your number is correct, a sketch of the argument in 1-2
sentences should be enough; a fully rigorous proof is not required.]
[Solution: This is Exercise 2 (c) on homework set #3 from my Spring 2017
course; see the course page for solutions.]

5.4.3. Spanning forests


A spanning tree of a graph G can be regarded as a minimum “backbone” of G
– that is, a way to keep G connected using as few edges as possible. Of course,
if G is not connected, then this is not possible at all, so G has no spanning trees
in this case. The best one can hope for is a spanning subgraph that keeps each
component of G connected using as few edges as possible. This is known as a
“spanning forest”:

Definition 5.4.5. A spanning forest of a multigraph G means a spanning


subgraph H of G that is a forest and satisfies conn H = conn G.

When G is a connected multigraph, a spanning forest of G means the same


as a spanning tree of G.

5.4.4. Existence and construction of a spanning tree


The following theorem is crucial, which is why we will outline four different
proofs:

Theorem 5.4.6. Each connected multigraph G has at least one spanning tree.

First proof. Let G be a connected multigraph. We want to construct a spanning


tree of G. We try to achieve this by removing edges from G one by one, until
G becomes a tree. When doing so, we must be careful not to disconnect the
graph (i.e., not to destroy its connectedness). According to Theorem 3.3.18, this
can be achieved by making sure that we never remove a bridge (i.e., an edge
that appears in no cycle). Thus, we keep removing non-bridges (i.e., edges that
are not bridges) as long as we can (i.e., until we end up with a graph in which
every edge is a bridge).
So here is the algorithm: We start with G, and we successively remove
non-bridges one by one until we no longer have any non-bridges left (see
footnote 32). This procedure cannot go on forever, since G has only finitely many edges. Thus, after
finitely many steps, we will end up with a graph that has no non-bridges any
more. This resulting graph therefore has no cycles (since any cycle would have
at least one edge, and this edge would be a non-bridge), but is still connected
(since G was connected, and we never lost connectedness as we removed only
non-bridges). Thus, this resulting graph is a tree. Since it is also a spanning
subgraph of G (by construction), it is therefore a spanning tree of G. This proves
Theorem 5.4.6.
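
The algorithm from this first proof can be sketched in Python as follows. This is only an illustration under stated assumptions: the multigraph is given as a list of vertices and a list of endpoint pairs (so parallel edges and loops are allowed), and the names is_connected and spanning_tree_by_deletion are made up for this sketch. An edge is recognized as a non-bridge by the naive test that the graph stays connected without it.

```python
def is_connected(vertices, edges):
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    start = vertices[0]
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == len(vertices)

def spanning_tree_by_deletion(vertices, edges):
    # kept maps each surviving edge id (its list position) to its endpoints.
    kept = dict(enumerate(edges))
    # One pass is enough: an edge that is a bridge now remains a bridge
    # after further edges are removed, so it never needs to be re-examined.
    for e in list(kept):
        trial = [uv for f, uv in kept.items() if f != e]
        if is_connected(vertices, trial):  # e is a non-bridge: remove it
            del kept[e]
    return kept

# Hypothetical connected multigraph with a double edge and a loop.
V = [1, 2, 3, 4]
E = [(1, 2), (1, 2), (2, 3), (3, 1), (3, 4), (4, 4)]
T = spanning_tree_by_deletion(V, E)
print(T)
```

Note that the edges are removed strictly one at a time, and each test is made against the current graph rather than the original G; this matches the warning in footnote 32.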
Second proof (sketched). In the above first proof, we constructed a spanning tree
of G by starting with G and successively removing edges until we got a tree.
Now let us take the opposite strategy: Start with an empty graph on the same
vertex set as G, and successively add edges (from G) until we get a connected
graph.
Here are some details: We start with a graph L that has the same vertex set
as G, but has no edges. Now, we inspect all edges e of G one by one (in some
order). For each such edge e, we add it to L, but only if it does not create
a cycle in L; otherwise, we discard this edge. Notice that adding an edge e
with endpoints u and v to L creates a cycle if and only if u and v lie in the
same component of L (before we add e). Thus, we only add an edge to L if
its endpoints lie in different components of L; otherwise, we discard it. This
way, at the end of the procedure, our graph L will still have no cycles (since we
never create any cycles). In other words, it will be a forest.

32 Warning: We cannot remove several non-bridges at once! We have to remove them one by
one. Indeed, if e and f are two non-bridges of G, then there is no guarantee that f remains a
non-bridge in G \ e. So we cannot remove both e and f simultaneously; we have to remove
one of them and check whether the other is still a non-bridge.
Let me denote this forest by H. (Thus, H is the L at the end of the procedure.)
I claim that this forest H is a spanning tree of G. Why? Since we know that H is
a forest, we only need to show that H is connected. Assume the contrary. Thus,
there is at least one edge e of G whose endpoints lie in different components of
H (why?). This edge e is therefore not an edge of H. Therefore, at some point
during our construction of H, we must have discarded this edge e (instead of
adding it to L). As we know, this means that the endpoints of e used to lie
in the same component of L at the point at which we discarded e. But this
entails that these two endpoints lie in the same component of L at the end of
the procedure as well (because the graph L never loses any edges during the
procedure, so that any two vertices that used to lie in the same component of
L at some point will still lie in the same component of L ever after). In other
words, the endpoints of e lie in the same component of H. This contradicts
our assumption that the endpoints of e lie in different components of H. This
contradiction completes our proof that H is connected. Hence, H is a spanning
tree of G, and we have proved Theorem 5.4.6 again.
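
The edge-adding procedure of this second proof is easy to sketch in code if the components of L are tracked with a union-find (disjoint-set) structure. That data structure is a choice made for this sketch, not something the proof prescribes, and the function name spanning_tree_by_insertion and the example graph are likewise assumptions.

```python
def spanning_tree_by_insertion(vertices, edges):
    # parent encodes the components of the growing graph L.
    parent = {v: v for v in vertices}

    def find(x):
        # Representative of the component of x (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []  # ids (list positions) of the edges added to L
    for e, (u, v) in enumerate(edges):
        ru, rv = find(u), find(v)
        if ru != rv:          # endpoints in different components: add e
            parent[ru] = rv
            chosen.append(e)
        # otherwise e would create a cycle in L, so it is discarded
    return chosen

# Hypothetical connected multigraph with a double edge and a loop.
V = [1, 2, 3, 4]
E = [(1, 2), (1, 2), (2, 3), (3, 1), (3, 4), (4, 4)]
print(spanning_tree_by_insertion(V, E))
```

If the input graph is not connected, the same procedure still returns a forest, with one tree per component of the graph.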
Third proof. This proof takes yet another approach to constructing a spanning
tree of G: We choose an arbitrary vertex r of G, and then progressively “spread
a rumor” from r. The rumor starts at vertex r. On day 0, only r has heard
the rumor. Every day, every vertex that knows the rumor spreads it to all its
neighbors (i.e., all vertices adjacent to it). Since G is connected, the rumor
will eventually spread to every vertex of G. Now, each vertex v (other than r)
remembers which other vertex v′ it has first heard the rumor from (if it heard
it from several vertices at the same time, it just picks one of them), and picks
some edge ev that has endpoints v and v′ (such an edge must exist, since v must
have heard the rumor from a neighbor). The edges ev for all v ∈ V \ {r } (where
V is the vertex set of G) then form a spanning tree of G (that is, the graph with
vertex set V and edge set {ev | v ∈ V \ {r }} is a spanning tree). Why?
Intuitively, this is quite convincing: This graph cannot have cycles (because
that would require a time loop) and must be connected (because for any ver-
tex v, we can trace back the path of the rumor from r to v by following the
edges ev backwards). To obtain a rigorous proof, we formalize this construction
mathematically:
Write G as G = (V, E, ϕ). Choose any vertex r of G.
We shall recursively construct a sequence of subgraphs
(V0 , E0 , ϕ0 ) , (V1 , E1 , ϕ1 ) , (V2 , E2 , ϕ2 ) , ...
of G. The idea behind these subgraphs is that for each i ∈ N, the set Vi will
consist of all vertices v that have heard the rumor by day i, and the set Ei will
consist of the corresponding edges ev . The map ϕi will be the restriction of ϕ
to Ei , of course.
Here is the exact construction of this sequence of subgraphs:

• Recursion base: Set V0 := {r } and E0 := ∅. Let ϕ0 be the restriction of ϕ to
the (empty) set E0 .

• Recursion step: Let i ∈ N. Assume that the subgraph (Vi , Ei , ϕi ) of G has
already been defined. Now, we set

    Vi+1 := Vi ∪ {v ∈ V | v is adjacent to some vertex in Vi } .

For each v ∈ Vi+1 \ Vi , we choose one edge ev that joins (see footnote 33) v to a vertex in
Vi (such an edge exists, since v ∈ Vi+1 ; if there are several, we just choose
one of them arbitrarily). Set

    Ei+1 := Ei ∪ {ev | v ∈ Vi+1 \ Vi } .

Finally, we let ϕi+1 be the restriction of the map ϕ to the set Ei+1 . This is
a map from Ei+1 to P1,2 (Vi+1 ) (because any edge ev with v ∈ Vi+1 \ Vi has
one endpoint v in Vi+1 \ Vi ⊆ Vi+1 and the other endpoint in Vi ⊆ Vi+1 ).
Thus, (Vi+1 , Ei+1 , ϕi+1 ) is a well-defined subgraph of G.

This construction yields that (Vi , Ei , ϕi ) is a subgraph of (Vi+1 , Ei+1 , ϕi+1 ) for
each i ∈ N. Hence, V0 ⊆ V1 ⊆ V2 ⊆ · · · , so that |V0 | ≤ |V1 | ≤ |V2 | ≤ · · · . Since
a sequence of integers bounded from above cannot keep increasing forever (and
the sizes |Vi | are bounded from above by |V |, since each Vi is a subset of V), we
thus see that there exists some i ∈ N such that |Vi | = |Vi+1 |. Consider this i.
From |Vi | = |Vi+1 |, we obtain Vi = Vi+1 (since Vi ⊆ Vi+1 ).
In our colloquial model above, Vi = Vi+1 means that no new vertices learn
the rumor on day i + 1; it is reasonable to expect that at this point, every vertex
has heard the rumor. In other words, we claim that Vi = V. A rigorous proof
of this can be easily given using the fact that G is connected (see footnote 34).
Now, we claim that the subgraph (Vi , Ei , ϕi ) is a spanning tree of G. To see
this, we must show that this subgraph is a forest and is connected (since Vi = V
already shows that it is a spanning subgraph). Before we do this, let us give an
example:

33 We say that an edge joins a vertex p to a vertex q if the endpoints of this edge are p and q.
34 Here is the proof in detail: We must show that Vi = V. Assume the contrary. Thus, there
exists a vertex u ∈ V \ Vi . Consider this u. Since G is connected, there is a path from r to u.
This path starts at a vertex in Vi
(since r ∈ V0 ⊆ Vi ) and ends at a vertex in V \ Vi (since u ∈ V \ Vi ). Thus, it must cross over
from Vi into V \ Vi at some point. Therefore, there exists an edge with one endpoint in Vi
and the other endpoint in V \ Vi . Let v and w be these two endpoints, so that v ∈ Vi and
w ∈ V \ Vi . Then, w is adjacent to some vertex in Vi (namely, to v), and therefore belongs to
Vi+1 (by the definition of Vi+1 ). Hence, w ∈ Vi+1 = Vi . But this contradicts w ∈ V \ Vi . This
contradiction shows that our assumption was wrong, qed.

Example 5.4.7. Let G be the following multigraph:

[Figure: the multigraph G, with vertices 1, 2, . . . , 10.]

Set r = 3. Then, the above construction yields

V0 = { 3} ,
V1 = {3, 1, 4} ,
V2 = {3, 1, 4, 2, 5, 6, 10} ,
V3 = {3, 1, 4, 2, 5, 6, 10, 8, 9, 7} = V,

so that Vk = V for all k ≥ 3. Thus, we can take i = 3. Here is an image of the
Vk as progressively growing circles:

[Figure: the graph G with nested circles drawn around the sets V0 , V1 , V2 and V3 .]

(The dark-red inner circle is V0 ; the red circle is V1 ; the orange circle is V2 ;
the yellow circle is V3 = V4 = V5 = · · · = V.) Finally, the edges ev can be
chosen to be the following (we are painting them red for clarity):

[Figure: the graph G with the chosen edges ev highlighted in red and labeled.]

(Here, we have made two choices: We chose e2 to be the edge joining 2 with 1
rather than the edge joining 2 with 4, and we chose e7 to be the edge joining
7 with 6 rather than 7 with 5. The other options would have been equally
fine.)
We now return to the general proof. Let us first show the following:

Claim 1: Let j ∈ N. Each vertex of the graph (Vj , Ej , ϕj ) is
path-connected to r in this graph.

[Proof of Claim 1: We induct on j:


Base case: For j = 0, Claim 1 is obvious, since V0 = {r } (so the only vertex of
the graph in question is r itself).
Induction step: Fix some positive integer k. Assume (as the induction hy-
pothesis) that Claim 1 holds for j = k − 1. That is, each vertex of the graph
(Vk−1 , Ek−1 , ϕk−1 ) is path-connected to r in this graph.
Now, let v be a vertex of the graph (Vk , Ek , ϕk ). We must show that v is
path-connected to r in this graph. If v ∈ Vk−1 , then this follows from the in-
duction hypothesis (since (Vk−1 , Ek−1 , ϕk−1 ) is a subgraph of (Vk , Ek , ϕk )). Thus,
we WLOG assume that v ∉ Vk−1 from now on. Hence, v ∈ Vk \ Vk−1 . According
to the recursive definition of Ek , this entails that there is an edge ev ∈ Ek
that joins v to some vertex u ∈ Vk−1 . Consider this latter vertex u. Then, v
is path-connected to u in the graph (Vk , Ek , ϕk ) (since the edge ev provides a
length-1 path from v to u). However, u is path-connected to r in the graph

(Vk−1 , Ek−1 , ϕk−1 ) (by the induction hypothesis, since u ∈ Vk−1 ), hence also
in the graph (Vk , Ek , ϕk ) (since (Vk−1 , Ek−1 , ϕk−1 ) is a subgraph of (Vk , Ek , ϕk )).
Since the relation “path-connected” is transitive, we conclude from the previous
two sentences that v is path-connected to r in the graph (Vk , Ek , ϕk ).
So we have shown that each vertex v of the graph (Vk , Ek , ϕk ) is path-connected
to r in the graph (Vk , Ek , ϕk ). In other words, Claim 1 holds for j = k. This com-
pletes the induction step, and Claim 1 is proved.]
Claim 1 (applied to j = i) shows that each vertex of the graph (Vi , Ei , ϕi ) is
path-connected to r in this graph. Since the relation “path-connected” is an
equivalence relation, this entails that any two vertices of this graph are path-
connected. Thus, the graph (Vi , Ei , ϕi ) is connected (since it has at least one
vertex). It remains to prove that this graph (Vi , Ei , ϕi ) is a forest.
Again, we do this using an auxiliary claim:

Claim 2: Let j ∈ N. Then, the graph (Vj , Ej , ϕj ) has no cycles.

[Proof of Claim 2: We induct on j:


Base case: The graph (V0 , E0 , ϕ0 ) has no edges (because E0 = ∅) and thus no
cycles. Thus, Claim 2 holds for j = 0.
Induction step: Fix some positive integer k. Assume (as the induction hypoth-
esis) that Claim 2 holds for j = k − 1. That is, the graph (Vk−1 , Ek−1 , ϕk−1 ) has
no cycles.
Now, let C be a cycle of the graph (Vk , Ek , ϕk ). Then, C must use at least
one edge from Ek \ Ek−1 (since otherwise, C would be a cycle of the graph
(Vk−1 , Ek−1 , ϕk−1 ), but this is impossible, since (Vk−1 , Ek−1 , ϕk−1 ) has no cycles).
However, each edge from Ek \ Ek−1 has the form ev for some v ∈ Vk \ Vk−1
(because of how Ek was defined). Thus, C must have an edge of this form.
Consider the corresponding vertex v ∈ Vk \ Vk−1 . The cycle C contains the edge
ev and therefore also contains its endpoint v. However, (again by the definition
of Ek ) the edge ev is the only edge in Ek that contains the vertex v. Thus, the
vertex v cannot be contained in any cycle of (Vk , Ek , ϕk ) (because a cycle would
necessarily include two distinct edges that contain v). This contradicts the fact
that the cycle C contains v.
Forget that we fixed C. We thus have obtained a contradiction for each cycle
C of the graph (Vk , Ek , ϕk ). Hence, the graph (Vk , Ek , ϕk ) has no cycles. In other
words, Claim 2 holds for j = k. This completes the induction step, and Claim 2
is proved.]
Applying Claim 2 to j = i, we see that the graph (Vi , Ei , ϕi ) has no cycles. In
other words, this graph is a forest. Since it is connected, it is therefore a tree.
Since it is a spanning subgraph of G, we thus conclude that it is a spanning tree
of G. Hence, we have constructed a spanning tree of G.
We note an important property of this construction:

Claim 3: For each k ∈ N, we have

Vk = {v ∈ V | d (r, v) ≤ k} ,

where d (r, v) means the length of a shortest path from r to v.

This is easily proved by induction on k. Thus, the spanning tree (Vi , Ei , ϕi )


we have constructed has the following property: For each v ∈ V, the path from
r to v in this spanning tree is a shortest path from r to v in G. For this reason,
this spanning tree is called a breadth-first search (“BFS”) tree. Note that the
choice of root r is important here: It is usually not true that the path from an
arbitrary vertex u to an arbitrary vertex v along our spanning tree is a shortest
path in G. No spanning tree of G has this property, unless G itself is “more or
less a tree” (more precisely, unless Gsimp is a tree)!
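
The construction of this third proof can be sketched in Python. The sketch below uses the usual queue-based formulation of breadth-first search rather than the day-by-day sets Vk ; the two agree in the sense of Claim 3, since Vk consists of the vertices whose computed distance is at most k. The function name bfs_tree, the edge-list input format and the example graph are assumptions of this sketch.

```python
from collections import deque

def bfs_tree(vertices, edges, r):
    # adj[u] lists pairs (neighbor, edge id); edge ids are list positions.
    adj = {v: [] for v in vertices}
    for e, (u, v) in enumerate(edges):
        adj[u].append((v, e))
        adj[v].append((u, e))
    dist = {r: 0}     # dist[v] = day on which v first hears the rumor
    tree_edge = {}    # tree_edge[v] = id of the chosen edge e_v
    queue = deque([r])
    while queue:
        u = queue.popleft()
        for v, e in adj[u]:
            if v not in dist:        # v hears the rumor from u
                dist[v] = dist[u] + 1
                tree_edge[v] = e
                queue.append(v)
    return dist, tree_edge

# Hypothetical connected graph, chosen only for illustration.
V = [1, 2, 3, 4, 5, 6]
E = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (5, 6), (2, 3)]
dist, tree_edge = bfs_tree(V, E, r=1)
print(dist)
print(tree_edge)
```

When several neighbors could have passed the rumor to v, the sketch keeps whichever it meets first; as in the proof, any such choice yields a spanning tree.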
Fourth proof of Theorem 5.4.6 (sketched). We imagine a snake that slithers along
the edges of G, trying to eventually bite each vertex. It starts at some vertex r,
which it immediately bites. Any time the snake enters a vertex v, it makes the
following step:

• If some neighbor of v has not been bitten yet, then the snake picks such
a neighbor w as well as some edge f that joins w with v; the snake then
moves to w along the edge f , bites the vertex w and marks the edge f .

• If not, then the snake marks the vertex v as fully digested and backtracks
(along the marked edges) to the last vertex it has visited but not fully
digested yet.

Once backtracking is no longer possible (because there are no more vertices


left that are not fully digested), the procedure is finished. I claim that the
marked edges at that moment are the edges of a spanning tree of G.
I won’t prove this claim in detail, but I will give some hints. First, however,
an example:

Example 5.4.8. Let G be the following connected multigraph:

[Figure: the connected multigraph G, with vertices 1, 2, . . . , 14.]

Let our snake start its journey at r = 3. It bites this vertex. Then, let’s say
that it picks the vertex 1 as its next victim (it could just as well go to 4 or 7;
the snake has many choices, but we follow one possible trip). Thus, it next
arrives at vertex 1, bites it and marks the edge that brought it to this vertex.
As its next destination, it necessarily picks the vertex 2 (since vertex 3 has
already been bitten). It moves to vertex 2, bites it and marks the edge. Next,
let’s say that it picks the vertex 4 (the other option would be 8). It thus moves
to 4, bites it and marks the edge. Proceeding likewise, it then moves to 5 (the
other options are 6 and 10; the vertices 2 and 3 do not qualify since they are
already bitten), bites 5 and marks an edge. From there, let’s say it moves to
8, bites 8 and marks an edge. Now, there is no longer an unbitten neighbor
of 8 to move to. Thus, the snake marks the vertex 8 as fully digested and
backtracks to the last vertex not fully digested – which, at this point, is 5.
From this vertex 5, it moves on to 9 (this is the only option, since 4 and 8
have already been bitten). And so on. Here is one possible outcome of this
journey (there are a few more decisions that the snake can make here, so you
may get a different one):

[Figure: the graph G with the marked edges highlighted and equipped with arrows.]

Here, the marked edges are drawn in bold red ink, and endowed with an
arrow that represents the direction in which they were first used (e.g., the
edge joining 2 with 4 has an arrow towards 4 because it was first used to get
from 2 to 4).
Now, as promised, let me outline a proof of the above claim (that the marked
edges form a spanning tree of G). To wit, argue the following four observations
(ideally in this order):

1. After each step, the marked edges are precisely the edges along which the
snake has moved so far.

2. After each step, the network of bitten vertices and marked edges is a tree.

3. After enough steps, each bitten vertex is fully digested.

4. At that point, the network of bitten vertices and marked edges is a span-
ning tree (since each neighbor of a fully digested vertex is bitten, thus
fully digested by observation 3).

Details are left to the reader.


The result is that Theorem 5.4.6 is proved once again. However, more comes
out of the above construction if you know where to look. The spanning tree
T of G whose edges are the edges marked by the snake is called a depth-first
search (“DFS”) tree. It has the following extra property: If u and v are two
adjacent vertices of G, then either u lies on the path from r to v in T, or v lies on
the path from r to u in T. (A spanning tree with this property is called a “lineal
spanning tree”. See [BenWil06, §6.1] for details.)
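
The snake's journey corresponds to a recursive depth-first search: moving to an unbitten neighbor is a recursive call, and backtracking is the return from that call. The following Python sketch illustrates this and checks the extra property stated above on one example; the names dfs_tree and is_lineal, the input format and the example graph are assumptions of the sketch.

```python
def dfs_tree(vertices, edges, r):
    adj = {v: [] for v in vertices}
    for e, (u, v) in enumerate(edges):
        adj[u].append((v, e))
        adj[v].append((u, e))
    parent = {r: None}   # bitten vertices, with the vertex they were reached from
    marked = []          # ids of the marked edges, in order of first use

    def visit(v):
        # Plain recursion; very deep graphs would hit Python's recursion limit.
        for w, e in adj[v]:
            if w not in parent:      # w is an unbitten neighbor of v
                parent[w] = v
                marked.append(e)
                visit(w)
        # Returning from visit(v) corresponds to marking v as fully
        # digested and backtracking.

    visit(r)
    return parent, marked

def is_lineal(parent, edges):
    # For every edge uv of G, one endpoint must lie on the tree path
    # from the root to the other endpoint.
    def root_path(v):
        chain = set()
        while v is not None:
            chain.add(v)
            v = parent[v]
        return chain
    return all(u in root_path(v) or v in root_path(u) for u, v in edges)

# Hypothetical connected graph, chosen only for illustration.
V = [1, 2, 3, 4, 5, 6]
E = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (5, 6), (2, 3)]
parent, marked = dfs_tree(V, E, r=1)
print(marked, is_lineal(parent, E))
```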

5.4.5. Applications
Spanning trees have lots of applications:

• A spanning tree of a graph can be viewed as a kind of “backbone” of


the graph, which in particular provides “canonical” paths between any
two vertices. This is useful, e.g., for networking applications where hav-
ing a choice between different paths would be problematic (see, e.g., the
Spanning Tree Protocol).

• A w-minimum spanning tree (see Exercise 5.8 = Homework set #5 exercise


6) solves a global version of the cheapest-path problem. It can also be used
for detecting clusters.

• Depth-first search (the algorithm used in our fourth proof of Theorem


5.4.6) can also be used as a way to traverse all vertices of a given graph
and return back to the starting point. In particular, this provides an al-
gorithmic way to solve mazes (since a maze can be modeled as a graph,
where the vertices correspond to “rooms” and the edges correspond to
“doors”). This appears to have been the original motivation for Trémaux
to invent depth-first search back in the 19th century.

Here is a more theoretical application of spanning trees:

Definition 5.4.9. A vertex v of a connected multigraph G is said to be a cut-


vertex if the graph G \ v is disconnected. (Recall that G \ v is the multigraph
obtained from G by removing the vertex v and all edges that contain v.)

Proposition 5.4.10. Let G be a connected multigraph with ≥ 2 vertices. Then,


there are at least 2 vertices of G that are not cut-vertices.

Proof. Pick a spanning tree T of G (we know from Theorem 5.4.6 that such a
spanning tree exists). Then, T has at least 2 leaves (by Theorem 5.3.2 (a)). But
each leaf of T is a non-cut-vertex of G (why?).
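
The statement used in this proof can be tested on a small example. The following Python sketch builds some spanning tree by a graph search, lists its leaves, and checks that removing any of them leaves the graph connected. The adjacency-set input (which suffices here, since connectedness depends only on which vertices are adjacent), the function names and the example graph are assumptions of this sketch; it is a check on one example, not a proof.

```python
def spanning_tree_parents(adj, r):
    # Parent pointers of some spanning tree, found by a graph search from r.
    parent = {r: None}
    stack = [r]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                stack.append(v)
    return parent

def tree_leaves(parent):
    # Degrees in the tree whose edges join each non-root vertex to its parent.
    deg = {v: 0 for v in parent}
    for v, p in parent.items():
        if p is not None:
            deg[v] += 1
            deg[p] += 1
    return [v for v in deg if deg[v] == 1]

def connected_without(adj, removed):
    # Is G \ removed connected? (Assumes G has at least 2 vertices.)
    rest = [v for v in adj if v != removed]
    seen, stack = {rest[0]}, [rest[0]]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v != removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(rest)

# Hypothetical connected graph in which 2 and 4 are cut-vertices.
adj = {1: {2}, 2: {1, 3, 4}, 3: {2, 4}, 4: {2, 3, 5}, 5: {4}}
leaves = tree_leaves(spanning_tree_parents(adj, 1))
print(leaves, [connected_without(adj, v) for v in leaves])
```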

Remark 5.4.11. It is not true that conversely, any non-leaf of T is a cut-vertex


of G. So we cannot get any lower bound on the number of cut-vertices.
And this is not surprising: Lots of graphs (e.g., the complete graph Kn for
n ≥ 2) have no cut-vertices at all. These graphs are said to be 2-connected,
and their properties have been amply studied (see, e.g., [West01, §4.2] for an
introduction).

5.4.6. Exercises

Exercise 5.6. Let G be a connected multigraph. Let T1 and T2 be two spanning


trees of G.
Prove the following (see footnote 35):

(a) For any e ∈ E ( T1 ) \ E ( T2 ), there exists an f ∈ E (T2 ) \ E ( T1 ) with the


property that replacing e by f in T1 (that is, removing the edge e from
T1 and adding the edge f ) results in a spanning tree of G.

(b) For any f ∈ E ( T2 ) \ E ( T1 ), there exists an e ∈ E ( T1 ) \ E ( T2 ), with the


property that replacing e by f in T1 (that is, removing the edge e from
T1 and adding the edge f ) results in a spanning tree of G.

[Hint: The two parts look very similar, but (to my knowledge) their proofs
are not.]

Exercise 5.7. Let G be a connected multigraph. Let S be the simple graph


whose vertices are the spanning trees of G, and whose edges are defined as
follows: Two spanning trees T1 and T2 of G are adjacent (as vertices of S )
if and only if T2 can be obtained from T1 by removing an edge and adding
another (i.e., if and only if there exist an edge e1 of T1 and an edge e2 of T2
such that e2 ≠ e1 and T2 \ e2 = T1 \ e1 ).
Prove that the simple graph S is itself connected. (In simpler language:
Prove that any spanning tree of G can be transformed into any other span-
ning tree of G by a sequence of legal “remove an edge and add another”
operations, where such an operation is called legal if its result is a spanning
tree of G.)
[Example: If G is the multigraph

[Figure: a multigraph with vertices 1, 2, 3 and edges a, b, c, d],

35 Recall that E ( H ) denotes the edge set of any graph H.



then the graph S looks as follows:

[Figure: the graph S, with each of its five vertices drawn as a small picture of the corresponding spanning tree of G.]
]

Exercise 5.8. Let G = (V, E, ϕ) be a connected multigraph. Let w : E → R be


a map that assigns a real number w (e) to each edge e. We shall call this real
number w (e) the weight of the edge e.
If H = (W, F, ϕ |F ) is a subgraph of G, then the weight w (H) of H is
defined to be ∑_{f∈F} w ( f ) (that is, the sum of the weights of all edges of H).
A w-minimum spanning tree of G means a spanning tree of G that has
the smallest weight among all spanning trees of G.
In our first proof of Theorem 5.4.6, we have seen a way to construct a
spanning tree of G by successively removing non-bridges until only bridges
remain. (A non-bridge means an edge that is not a bridge.)
Now, let us perform this algorithm, but taking care to choose a non-bridge
of largest weight (among all non-bridges) at each step. Prove that the result
will be a w-minimum spanning tree.

Exercise 5.9. Let G be a connected multigraph with an even number of ver-


tices. Prove that there exists a spanning subgraph H of G such that each
vertex of H has odd degree (in H).
[Hint: One way to solve this begins by reducing the problem to the case
when G is a tree.]

5.4.7. Existence and construction of a spanning forest


So we have learnt that connected graphs have spanning trees. What do discon-
nected graphs have?

Corollary 5.4.12. Each multigraph has a spanning forest.

Proof. Apply Theorem 5.4.6 to each component of the multigraph. Then, com-
bine the resulting spanning trees into a spanning forest.
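
A spanning forest can also be computed directly. The following Python sketch uses breadth-first search on each component (a different construction from the bridge-removal argument behind Theorem 5.4.6). The function name and the encoding of a multigraph as a list of (u, v) pairs are assumptions of this sketch.

```python
from collections import deque


def spanning_forest(vertices, edges):
    """Return the indices of edges that form a spanning forest of a multigraph.

    `edges` is a list of (u, v) pairs; parallel edges and loops are allowed.
    """
    adj = {v: [] for v in vertices}
    for i, (u, v) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))
    seen = set()
    forest = []
    for root in vertices:  # one breadth-first search per component
        if root in seen:
            continue
        seen.add(root)
        queue = deque([root])
        while queue:
            x = queue.popleft()
            for y, i in adj[x]:
                if y not in seen:  # edge i reaches a new vertex, so keep it
                    seen.add(y)
                    forest.append(i)
                    queue.append(y)
    return forest


# Hypothetical example with two components, one parallel edge and one triangle.
vertices = [1, 2, 3, 4, 5]
edges = [(1, 2), (1, 2), (2, 3), (3, 1), (4, 5)]
forest = spanning_forest(vertices, edges)
```

The forest found here has 5 − 2 = 3 edges, as it must for a multigraph with 5 vertices and 2 components.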

5.5. Centers of graphs and trees


5.5.1. Distances
Given a graph, we can define a “distance” between any two of its vertices,
simply by counting edges on the shortest path from one to the other:

Definition 5.5.1. Let G be a multigraph.


For any two vertices u and v of G, we define the distance between u and
v to be the smallest length of a path from u to v. If no such path exists, then
this distance is defined to be ∞.
The distance between u and v is denoted by d (u, v) or by d G (u, v) when
the graph G is not clear from the context.
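
Distances as in Definition 5.5.1 can be computed by breadth-first search. The following Python sketch is one way to do this; the function name and the encoding of a multigraph as a list of (u, v) pairs are assumptions of this sketch.

```python
from collections import deque
import math


def distance(vertices, edges, u, v):
    """Smallest length of a path from u to v, or math.inf if no such path exists."""
    adj = {x: set() for x in vertices}
    for a, b in edges:  # parallel edges and loops do not affect distances
        adj[a].add(b)
        adj[b].add(a)
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return dist[x]
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return math.inf


# Hypothetical example: a path 1, 2, 3 with a parallel edge, a loop and an isolated vertex 4.
vertices = [1, 2, 3, 4]
edges = [(1, 2), (1, 2), (2, 3), (3, 3)]
```

Here, distance(vertices, edges, 1, 3) is 2, whereas distance(vertices, edges, 1, 4) is math.inf because 4 lies in a different component.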

Example 5.5.2. If G is the multigraph from Example 5.4.8, then

d G (1, 9) = 4, d G (4, 13) = 2, d G (4, 4) = 0.

Remark 5.5.3. Distances in a multigraph satisfy the rules that you would
expect a distance function to satisfy:

(a) We have d (u, u) = 0 for any vertex u.

(b) We have d (u, v) = d (v, u) for any vertices u and v.

(c) We have d (u, v) + d (v, w) ≥ d (u, w) for any vertices u, v and w. (Here,
we understand that ∞ ≥ m and ∞ + m = ∞ for any m ∈ N.)

Also:

(d) The distances d (u, v) do not change if we replace “path” by “walk” in
the definition of the distance.

(e) If V is the vertex set of our multigraph, then d (u, v) ≤ |V | − 1 for any
vertices u and v.

Proof. Part (d) follows from Corollary 3.3.10. The proofs of (a), (b) and (c) are
then straightforward (the proof of (c) relies on part (d), because splicing two
paths generally only yields a walk, not a path). Finally, in order to prove part
(e), observe that any path of our multigraph has length ≤ |V | − 1 (since its
vertices are distinct).
We note that the definition of a distance becomes simpler if our multigraph
is a tree: Namely, if T is a tree, then the distance d (u, v) between two vertices
u and v is the length of the only path from u to v in T. Thus, in a tree, we do
not have to worry whether a given path is the shortest.
We also notice that if G is a multigraph, and if u and v are two vertices of
G, then the distance d_G (u, v) in G equals the distance d_{G^simp} (u, v) in the simple
graph G^simp . (The reason for this is that any path of G can be converted into
a path of Gsimp having the same length, and vice versa. Of course, this is not
a one-to-one correspondence, but it suffices for our purposes.) Thus, when
studying distances on a multigraph, we can WLOG restrict ourselves to simple
graphs.
The following few exercises give some curious properties of distances in var-
ious kinds of graphs.
Exercise 5.10. Let a, b and c be three vertices of a connected multigraph
G = (V, E, ϕ). Prove that d (b, c) + d (c, a) + d ( a, b) ≤ 2 |V | − 2.
[Solution: This is Exercise 7 on midterm #1 from my Spring 2017 course,
except that the simple graph has been replaced by a multigraph (but this
makes no serious difference); see the course page for solutions.]

Exercise 5.11. Let a, b and c be three vertices of a strongly connected
multidigraph D = (V, A, ψ) such that |V | ≥ 4. For any two vertices u and v of D,
we define the distance d (u, v) to be the smallest length of a path from u to v.
(This definition is the obvious analogue of Definition 5.5.1 for digraphs.)

(a) Prove that d (b, c) + d (c, a) + d ( a, b) ≤ 3 |V | − 4.

(b) For each n ≥ 5, construct an example in which |V | = n and d (b, c) +
d (c, a) + d ( a, b) = 3 |V | − 4. (No proof is required for the example.)

[Solution: This is Exercise 5 on homework set #3 from my Spring 2017
course, except that the simple digraph has been replaced by a multidigraph
(but this makes no serious difference); see the course page for solutions.]

Exercise 5.12. Let G be a tree. Let x, y, z and w be four vertices of G.


Show that the two largest ones among the three numbers

d ( x, y) + d (z, w) , d ( x, z) + d (y, w) and d ( x, w) + d (y, z)

are equal.
[Solution: This is Exercise 6 on midterm #2 from my Spring 2017 course;
see the course page for solutions.]

Exercise 5.13. Let G be a connected multigraph. Let x, y, z and w be four
vertices of G.
Assume that the two largest ones among the three numbers

d ( x, y) + d (z, w) , d ( x, z) + d (y, w) and d ( x, w) + d (y, z)

are not equal.


Prove that G has a cycle of length ≤ d ( x, z) + d (y, w) + d ( x, w) + d (y, z).
[Hint: This is a strengthening of Exercise 5.12. Try deriving it by applying
the latter exercise to a strategically chosen subgraph of G.]
[Solution: This is Exercise 1 on midterm #3 from my Spring 2017 course;
see the course page for solutions.]
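
The statements of Exercises 5.12 and 5.13 can be checked numerically on small examples. The following Python sketch does so for one tree and one 4-cycle; both graphs, the adjacency-dictionary encoding and the function names are hypothetical choices made for this illustration, and the check is of course not a proof.

```python
from collections import deque
from itertools import product


def all_distances(adj):
    """Distances between all pairs of vertices of a connected graph (one BFS per vertex)."""
    d = {}
    for s in adj:
        d[s] = {s: 0}
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in d[s]:
                    d[s][y] = d[s][x] + 1
                    queue.append(y)
    return d


def two_largest_equal(d, x, y, z, w):
    """Are the two largest of the three pairwise distance sums equal?"""
    sums = sorted([d[x][y] + d[z][w], d[x][z] + d[y][w], d[x][w] + d[y][z]])
    return sums[1] == sums[2]


# A tree on 6 vertices and a cycle on 4 vertices, as {vertex: set of neighbors}.
tree = {1: {2}, 2: {1, 3, 5}, 3: {2, 4}, 4: {3}, 5: {2, 6}, 6: {5}}
cycle = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {3, 1}}

d_tree = all_distances(tree)
holds_in_tree = all(two_largest_equal(d_tree, *q) for q in product(tree, repeat=4))

d_cycle = all_distances(cycle)
fails_in_cycle = not two_largest_equal(d_cycle, 1, 3, 2, 4)
# The bound of Exercise 5.13 for x = 1, y = 3, z = 2, w = 4:
bound = d_cycle[1][2] + d_cycle[3][4] + d_cycle[1][4] + d_cycle[3][2]
```

In the tree, the two largest sums agree for every choice of four vertices. In the 4-cycle, the choice x = 1, y = 3, z = 2, w = 4 gives the sums 4, 2, 2, and the bound from Exercise 5.13 equals 4, which is indeed the length of the cycle.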

5.5.2. Eccentricity and centers


We can now define “eccentricities”:

Definition 5.5.4. Let v be a vertex of a multigraph G = (V, E, ϕ). The
eccentricity of v (with respect to G) is defined to be the number

max {d (v, u) | u ∈ V } ∈ N ∪ { ∞} .

This eccentricity is denoted by ecc v or eccG v.

Definition 5.5.5. Let G = (V, E, ϕ) be a multigraph. Then, a center of G
means a vertex of G whose eccentricity is minimum (among all vertices).

(Some authors have a slightly different definition of a “center”: They define
the center of G to be the set of all vertices of G whose eccentricity is minimum.
That is, what they call “center” is the set of what we call “centers”.)
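
Definitions 5.5.4 and 5.5.5 translate directly into a computation. The following Python sketch computes all eccentricities by one breadth-first search per vertex and then picks the vertices of minimum eccentricity. The function names and the encoding of a multigraph as a list of (u, v) pairs are assumptions of this sketch.

```python
from collections import deque
import math


def eccentricities(vertices, edges):
    """Map each vertex to its eccentricity (math.inf if some vertex is unreachable)."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    ecc = {}
    for s in vertices:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        # A vertex that was never reached has distance infinity from s.
        ecc[s] = max(dist.values()) if len(dist) == len(vertices) else math.inf
    return ecc


def centers(vertices, edges):
    """All vertices whose eccentricity is minimum."""
    ecc = eccentricities(vertices, edges)
    smallest = min(ecc.values())
    return [v for v in vertices if ecc[v] == smallest]


# Hypothetical example: the path with vertices 1, 2, 3, 4, 5.
path_vertices = [1, 2, 3, 4, 5]
path_edges = [(1, 2), (2, 3), (3, 4), (4, 5)]
```

For this path, the eccentricities are 4, 3, 2, 3, 4, so the only center is the middle vertex 3.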

Example 5.5.6. Let G be the following multigraph:

[Figure: a multigraph with vertices p, q, r, u, v, w.]

Then, the eccentricities of its vertices are as follows (we are just labeling each
vertex with its eccentricity):

ecc r = 2, ecc u = 3, ecc w = 4, ecc p = 4, ecc q = 3, ecc v = 2.

Thus, the centers of G are the vertices r and v.

Example 5.5.7. Let G be a complete graph Kn (with n vertices). Then, each
vertex of G has the same eccentricity (which is 1 if n ≥ 2 and 0 if n = 1), and
thus each vertex of G is a center of G.

Example 5.5.8. Let G be a graph with more than one component. Then, each
vertex v of G has eccentricity ∞ (because there exists at least one vertex u
that lies in a different component of G than v, and thus this vertex u satisfies
d (v, u) = ∞). Hence, each vertex of G is a center of G.

5.5.3. The centers of a tree


As we see from Example 5.5.8, eccentricity and centers are not very useful
notions when the graph is disconnected. Even for a connected graph, Example
5.5.6 shows that the centers do not necessarily form a connected subgraph.
However, in a tree, they behave a lot better:

Theorem 5.5.9. Let T be a tree. Then:

(a) The tree T has either 1 or 2 centers.

(b) If T has 2 centers, then these 2 centers are adjacent.

(c) Moreover, these centers can be found by the following algorithm:


If T has more than 2 vertices, then we remove all leaves from T (simul-
taneously). What remains is again a tree. If that tree still has more than
2 vertices, we remove all leaves from it (simultaneously). The result is
again a tree. If that tree still has more than 2 vertices, we remove all
leaves from it (simultaneously), and continue doing so until we are left
with a tree that has only 1 or 2 vertices. These vertices are the centers
of T.
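
The leaf-removal algorithm of Theorem 5.5.9 (c) can be sketched in Python as follows. The function name and the encoding of the tree as a list of (u, v) pairs are assumptions of this sketch, and the input is assumed (not checked in full) to be a tree.

```python
def tree_centers(vertices, edges):
    """Centers of a tree, found by repeatedly removing all leaves simultaneously."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    while len(adj) > 2:
        # Collect all current leaves first, so that they are removed simultaneously.
        leaves = [v for v in adj if len(adj[v]) == 1]
        if not leaves:
            raise ValueError("the input is not a tree")
        for leaf in leaves:
            # In a tree with more than 2 vertices, no two leaves are adjacent,
            # so the neighbor of a leaf is never itself removed in this round.
            for neighbor in adj.pop(leaf):
                adj[neighbor].discard(leaf)
    return sorted(adj)


# Hypothetical examples: two paths and a small tree with a branch.
path5 = tree_centers([1, 2, 3, 4, 5], [(1, 2), (2, 3), (3, 4), (4, 5)])
path4 = tree_centers([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)])
branch = tree_centers([1, 2, 3, 4, 5, 6], [(1, 2), (2, 3), (3, 4), (2, 5), (5, 6)])
```

The path with 5 vertices has the single center 3, whereas the path with 4 vertices has the two adjacent centers 2 and 3.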

To prove Theorem 5.5.9, we first study how a tree is affected when all its
leaves are removed:

Lemma 5.5.10. Let T = (V, E, ϕ) be a tree with more than 2 vertices.


Let L be the set of all leaves of T.
Let T \ L be the induced submultigraph of T on the set V \ L. (Thus, T \ L
is obtained from T by removing all the vertices in L and all edges that
contain a vertex in L.)
Then:

(a) The multigraph T \ L is a tree.

(b) For any u ∈ V \ L and v ∈ V \ L, we have

{paths of T from u to v} = {paths of T \ L from u to v}

(that is, the paths of T from u to v are precisely the paths of T \ L from
u to v).

(c) For any u ∈ V \ L and v ∈ V \ L, we have d T (u, v) = d T \ L (u, v).

(d) Each vertex v ∈ V \ L satisfies eccT v = eccT \ L v + 1.

(e) Each leaf v ∈ L satisfies eccT v = eccT w + 1, where w is the unique
neighbor of v in T. (A neighbor of v means a vertex that is adjacent to
v.)

(f) The centers of T are precisely the centers of T \ L.
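
Parts (d) and (e) of Lemma 5.5.10 can be checked numerically on a small example. In the following Python sketch, the tree, its encoding as an adjacency dictionary and the function names are hypothetical choices made for illustration.

```python
from collections import deque


def ecc(adj):
    """Eccentricity of each vertex of a connected graph {vertex: set of neighbors}."""
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        result[s] = max(dist.values())
    return result


def strip_leaves(adj):
    """Return the induced subgraph on the non-leaves, together with the set of leaves."""
    leaves = {v for v in adj if len(adj[v]) == 1}
    inner = {v: adj[v] - leaves for v in adj if v not in leaves}
    return inner, leaves


# A tree T with 6 vertices; its leaves are 1, 4 and 6.
tree = {1: {2}, 2: {1, 3, 5}, 3: {2, 4}, 4: {3}, 5: {2, 6}, 6: {5}}
inner, leaves = strip_leaves(tree)
ecc_T, ecc_inner = ecc(tree), ecc(inner)

# Part (d): every non-leaf loses exactly 1 in eccentricity when the leaves are removed.
part_d = all(ecc_T[v] == ecc_inner[v] + 1 for v in inner)
# Part (e): every leaf has eccentricity 1 more than its unique neighbor.
part_e = all(ecc_T[v] == ecc_T[next(iter(tree[v]))] + 1 for v in leaves)
```

For this tree, both checks succeed: the eccentricities in T are 3, 2, 3, 4, 3, 4 (for the vertices 1, 2, ..., 6), and those in T \ L are 1, 2, 2 (for the vertices 2, 3, 5).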



Example 5.5.11. Let T be the following tree:

[Figure: a tree with vertices 1, 2, ..., 11.]

Then, the set L from Lemma 5.5.10 is {4, 5, 7, 8, 10, 11}, and the tree T \ L
looks as follows:

[Figure: the tree T \ L, with vertices 1, 2, 3, 6 and 9.]

Proof of Lemma 5.5.10. First, we notice that T is a forest (since T is a tree), and
thus has no cycles. In particular, T therefore has no loops and no parallel edges.
Also, for any two vertices u and v of T, there is a unique path from u to v in T.
Next, we introduce some terminology: If p is a path of some multigraph, then
an intermediate vertex of p shall mean a vertex of p that is neither the starting
point nor the ending point of p. In other words, if p = ( p0 , e1 , p1 , e2 , p2 , . . . , ek , pk )
is a path of some multigraph, then the intermediate vertices of p are p1 , p2 , . . . , pk−1 .
Clearly, any intermediate vertex of a path p must have degree ≥ 2 (since the
path p enters it along some edge, and leaves it along another). Hence, if p is a
path of T, then

any intermediate vertex of p must belong to V \ L (12)

(because it must have degree ≥ 2, thus cannot be a leaf of T; but this means
that it cannot belong to L; therefore, it must belong to V \ L).
(b) Let u ∈ V \ L and v ∈ V \ L. Let p be a path of T from u to v. We shall
show that p is a path of T \ L as well.
Indeed, let us first check that all vertices of p belong to V \ L. This is clear for
the vertices u and v (since u ∈ V \ L and v ∈ V \ L); but it also holds for every
intermediate vertex of p (by (12)). Thus, it does indeed hold for all vertices of
p.

We have thus shown that all vertices of p belong to V \ L. Hence, p is a path
of T \ L (since T \ L is the induced submultigraph of T on the set V \ L).
Forget that we fixed p. We have thus shown that every path p of T from u to
v is also a path of T \ L. Hence,

{paths of T from u to v} ⊆ {paths of T \ L from u to v} .


Conversely, we have

{paths of T \ L from u to v} ⊆ {paths of T from u to v} ,


since every path of T \ L is a path of T (because T \ L is a submultigraph of
T). Combining these two facts, we obtain

{paths of T from u to v} = {paths of T \ L from u to v} .


This proves Lemma 5.5.10 (b).
(c) This follows from Lemma 5.5.10 (b), since the distance d G (u, v) of two
vertices u and v in a graph G is defined to be the smallest length of a path from
u to v.
(a) The graph T is a tree, thus a forest. Hence, its submultigraph T \ L is a
forest as well (since any cycle of T \ L would be a cycle of T). It thus remains
to show that T \ L is connected.
First, it is easy to see that T \ L has at least one vertex (see footnote 36). It remains to show
that any two vertices of T \ L are path-connected.
Let u and v be two vertices of T \ L. Then, u ∈ V \ L and v ∈ V \ L. Hence,
Lemma 5.5.10 (b) yields

{paths of T from u to v} = {paths of T \ L from u to v} .


Thus, {paths of T \ L from u to v} = {paths of T from u to v} ≠ ∅ (since there
exists a path of T from u to v (because T is connected)). In other words, there
exists a path of T \ L from u to v. In other words, u and v are path-connected
in T \ L.
We have now shown that any two vertices u and v of T \ L are path-connected
in T \ L. This entails that T \ L is connected (since T \ L has at least one vertex).
This proves Lemma 5.5.10 (a).
36 Proof. We assumed that T has more than 2 vertices. In other words, there exist three distinct
vertices u, v, w of T. Consider these u, v, w. If all three distances d T (u, v), d T (v, w) and
d T (w, u) were equal to 1, then T would have a cycle (of the form (u, ∗, v, ∗, w, ∗, u), where
each asterisk stands for some edge); but this would contradict the fact that T has no cycles.
Thus, not all of these three distances are equal to 1. Hence, at least one of them is ≠ 1.
WLOG assume that d T (u, v) ≠ 1 (otherwise, we permute u, v, w). Hence, the path from
u to v has more than one edge (indeed, it must have at least one edge, since u and v are
distinct). Therefore, this path has at least one intermediate vertex. This intermediate vertex
then must belong to V \ L (by (12)). Hence, it is a vertex of the subgraph T \ L. This shows
that T \ L has at least one vertex.

(d) If u and v are two vertices of T \ L, then the two distances d T (u, v) and
d T \ L (u, v) are equal (by Lemma 5.5.10 (c)); thus, we shall denote both distances
by d (u, v) (since there is no confusion to be afraid of).
Let v ∈ V \ L. We must show that eccT v = eccT \ L v + 1.
Let u be a vertex of T \ L such that d (v, u) is maximum. Thus, eccT \ L v =
d (v, u) (by the definition of eccT \ L v). However, u is a vertex of T \ L, and thus
does not belong to L. Hence, u is not a leaf of T (since L is the set of all leaves
of T). Hence, u has degree ≥ 2 in T (since a vertex in a tree with more than 1
vertex cannot have degree 0).
Now, consider the path p from v to u in the tree T. This path p has length
d (v, u). Since u has degree ≥ 2, there exist at least two edges of T that contain
u. Hence, in particular, there exists at least one edge f that contains u and is
distinct from the last edge of p (see footnote 37). Consider this edge f . Let w be the endpoint
of f other than u. Appending f and w to the end of the path p, we obtain a
walk from v to w. This walk is backtrack-free (since f is distinct from the last
edge of p) and thus must be a path (by Proposition 5.1.2, since T has no cycles).
This path has length d (v, u) + 1 (since it was obtained by appending an edge
to the path p, which has length d (v, u)). Hence, d (v, w) = d (v, u) + 1. But the
definition of eccentricity yields

\[
\operatorname{ecc}_T v \geq d(v, w) = \underbrace{d(v, u)}_{= \operatorname{ecc}_{T \setminus L} v} + 1 = \operatorname{ecc}_{T \setminus L} v + 1. \tag{13}
\]

On the other hand, let x be a vertex of T such that d (v, x ) is maximum. Thus,
eccT v = d (v, x ) (by the definition of eccT v). The path from v to x has length
≥ 1 (since otherwise, we would have x = v and therefore d (v, x ) = d (v, v) = 0,
which would easily contradict the maximality of d (v, x )). Thus, it has a second-
to-last vertex. Let y be this second-to-last vertex. Then, the path from v to
y is simply the path from v to x with its last edge removed. Consequently,
d (v, y) = d (v, x ) − 1. However, it is easy to see that y ∈ V \ L (see footnote 38). In other
words, y is a vertex of T \ L. Thus, the definition of eccentricity yields

\[
\operatorname{ecc}_{T \setminus L} v \geq d(v, y) = \underbrace{d(v, x)}_{= \operatorname{ecc}_T v} - 1 = \operatorname{ecc}_T v - 1,
\]

so that eccT v ≤ eccT \ L v + 1. Combining this with (13), we obtain eccT v =
eccT \ L v + 1. This proves Lemma 5.5.10 (d).

37 If the path p has no edges, then f can be any edge that contains u.
38 Proof. Assume the contrary. Thus, y ∉ V \ L. Hence, y ≠ v (since y ∉ V \ L but v ∈ V \ L).
However, y is the second-to-last vertex of the path from v to x. Therefore, y is either
the starting point v of this path, or an intermediate vertex of this path. Since y ≠ v, we
thus conclude that y is an intermediate vertex of this path. Hence, by (12), we see that y
must belong to V \ L. But this contradicts y ∉ V \ L. This contradiction shows that our
assumption was false, qed.

(e) If u and v are two vertices of T \ L, then the two distances d T (u, v) and
d T \ L (u, v) are equal (by Lemma 5.5.10 (c)); thus, we shall denote both distances
by d (u, v) (since there is no confusion to be afraid of).
Let v ∈ L be a leaf. Let w be the unique neighbor of v in T. We must prove
that eccT v = eccT w + 1.
We first claim that
d (v, u) = d (w, u) + 1 for each u ∈ V \ { v} . (14)
[Proof of (14): We have deg v = 1 (since v is a leaf). In other words, there is a
unique edge of T that contains v. Let e be this edge. The endpoints of e are v
and w (since w is the unique neighbor of v). Thus, v ≠ w (since T has no loops)
and d (v, w) = 1.
Now, let u ∈ V \ {v}. Then, the path from v to u in T must have length ≥ 1
(since u ≠ v), and therefore must begin with the edge e (since e is the only edge
that contains v). If we remove this edge e from this path, we thus obtain a path
from w to u. As a consequence, the path from v to u is longer by exactly 1 edge
than the path from w to u. In other words, we have d (v, u) = d (w, u) + 1. This
proves (14).]
Now, the definition of eccentricity yields
eccT v = max { d (v, u) | u ∈ V } . (15)
This maximum is clearly not attained for u = v (since d (v, v) = 0 is smaller
than d (v, w) = 1). Thus, this maximum does not change if we remove v from
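its indexing set V. Hence, (15) rewrites as
\[
\begin{aligned}
\operatorname{ecc}_T v
&= \max \{ \underbrace{d(v, u)}_{\substack{= d(w, u) + 1 \\ \text{(by (14))}}} \mid u \in V \setminus \{v\} \} \\
&= \max \{ d(w, u) + 1 \mid u \in V \setminus \{v\} \} \\
&= \max \{ d(w, u) \mid u \in V \setminus \{v\} \} + 1.
\end{aligned}
\tag{16}
\]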
On the other hand, the definition of eccentricity yields
eccT w = max { d (w, u) | u ∈ V } . (17)
We shall now show that this maximum does not change if we remove v from
its indexing set V. In other words, we shall show that
max {d (w, u) | u ∈ V } = max { d (w, u) | u ∈ V \ { v}} . (18)
[Proof of (18): Assume that (18) is false. Then, the maximum max {d (w, u) | u ∈ V }
is attained only at u = v. In other words, we have
d (w, v) > d (w, u) for all u ∈ V \ {v} . (19)

However, the tree T has more than 2 vertices. Thus, it has a vertex u that is
distinct from both v and w. Consider this u. Thus, u ∈ V \ { v}, so that (19)
yields d (w, v) > d (w, u). In view of d (w, v) = d (v, w) = 1, this rewrites as
1 > d (w, u), so that d (w, u) < 1. Hence, d (w, u) = 0, and therefore w = u. But this
contradicts the fact that w is distinct from u. This contradiction shows that our assumption
was false, and thus (18) is proved.]
Now, (16) becomes

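\[
\begin{aligned}
\operatorname{ecc}_T v
&= \underbrace{\max \{ d(w, u) \mid u \in V \setminus \{v\} \}}_{\substack{= \max \{ d(w, u) \mid u \in V \} \\ \text{(by (18))}}} + 1 \\
&= \underbrace{\max \{ d(w, u) \mid u \in V \}}_{\substack{= \operatorname{ecc}_T w \\ \text{(by (17))}}} + 1
= \operatorname{ecc}_T w + 1.
\end{aligned}
\]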

This proves Lemma 5.5.10 (e).


(f) Lemma 5.5.10 (e) shows that any vertex v ∈ L has a higher eccentricity
than its unique neighbor. Thus, a vertex v of T that minimizes eccT v cannot
belong to L. In other words, a vertex v of T that minimizes eccT v must belong
to V \ L.
However, the centers of T are defined to be the vertices of T that minimize
eccT v. As we just proved, these vertices must belong to V \ L. Thus, the centers
of T can also be characterized as the vertices v ∈ V \ L that minimize eccT v.
However, a vertex v ∈ V \ L minimizes eccT v if and only if it minimizes eccT \ L v
(because Lemma 5.5.10 (d) yields eccT v = eccT \ L v + 1 for any such vertex v).
Thus, we conclude that the centers of T can be characterized as the vertices
v ∈ V \ L that minimize eccT \ L v. But this is precisely the definition of the
centers of T \ L. As a consequence, we see that the centers of T are precisely
the centers of T \ L. This proves Lemma 5.5.10 (f).
Proof of Theorem 5.5.9. We shall prove parts (a) and (b) of Theorem 5.5.9 by
strong induction on |V ( T )|:
Induction step: Consider a tree T. Assume that parts (a) and (b) of Theorem
5.5.9 are true for any tree with fewer than |V ( T )| many vertices. We must now
prove these parts for our tree T.
If |V ( T )| ≤ 2, then both parts are obvious. Hence, WLOG assume that
|V ( T )| > 2. Thus, the tree T has more than 2 vertices. Let L be the set of all
leaves of T. Note that | L| ≥ 2 (since we know that any tree with at least 2
vertices has at least 2 leaves). Define the multigraph T \ L as in Lemma 5.5.10.
Then, Lemma 5.5.10 (f) shows that the centers of T are precisely the centers of
T \ L.
However, Lemma 5.5.10 (a) yields that T \ L is again a tree. This tree has
fewer vertices than T (since | L| ≥ 2 > 0). Hence, by the induction hypothesis,
both parts (a) and (b) of Theorem 5.5.9 are true for the tree T \ L instead of T.
In other words, the tree T \ L has either 1 or 2 centers, and if it has 2 centers,
then these 2 centers are adjacent. Since the centers of T are precisely the centers
of T \ L, we can rewrite this as follows: The tree T has either 1 or 2 centers,
and if it has 2 centers, then these 2 centers are adjacent. In other words, parts
(a) and (b) of Theorem 5.5.9 hold for our tree T. This completes the induction
step. Thus, parts (a) and (b) of Theorem 5.5.9 are proved.
(c) This follows from Lemma 5.5.10 (f). Indeed, if T has at most 2 vertices,
then all vertices of T are centers of T (this is trivial to check). If not, then each
“leaf-removal” step of our algorithm leaves the set of centers of T unchanged
(by Lemma 5.5.10 (f)), and thus the centers of the original tree T are precisely
the centers of the tree that remains at the end of the algorithm. But the latter
tree has at most 2 vertices, and thus its centers are precisely its vertices. So the
centers of T are precisely the vertices that remain at the end of the algorithm.
Theorem 5.5.9 (c) is proven.
The following exercise shows another approach to the centers of a tree:

Exercise 5.14. Let T be a tree. Let p = ( p0 , ∗, p1 , ∗, p2 , . . . , ∗, pm ) be a longest
path of T. (We write asterisks for the edges since we don’t need to name
them.)
Prove the following:

(a) If m is even, then the only center of T is pm/2 .

(b) If m is odd, then the two centers of T are p(m−1)/2 and p(m+1)/2 .
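
The claim of Exercise 5.14 suggests a second way of computing the centers of a tree. The following Python sketch finds a longest path by two breadth-first searches (a standard technique that is assumed here without proof and is not taken from the text) and then returns its middle vertex or vertices. The function names and the adjacency-dictionary encoding are assumptions of this sketch.

```python
from collections import deque


def farthest_path(adj, start):
    """BFS from `start`; return the path from `start` to a vertex farthest from it."""
    parent = {start: None}
    queue = deque([start])
    last = start
    while queue:
        last = queue.popleft()
        for y in adj[last]:
            if y not in parent:
                parent[y] = last
                queue.append(y)
    # BFS dequeues vertices in order of nondecreasing distance,
    # so `last` is a vertex at maximum distance from `start`.
    path = [last]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]


def centers_via_longest_path(adj):
    """Middle vertex or vertices of a longest path of a tree {vertex: set of neighbors}."""
    a = farthest_path(adj, next(iter(adj)))[-1]  # one endpoint of a longest path
    p = farthest_path(adj, a)                    # a longest path, starting at a
    m = len(p) - 1                               # its length
    if m % 2 == 0:
        return [p[m // 2]]
    return sorted([p[(m - 1) // 2], p[(m + 1) // 2]])


# Hypothetical examples: a tree with a branch, and the path with 4 vertices.
tree = {1: {2}, 2: {1, 3, 5}, 3: {2, 4}, 4: {3}, 5: {2, 6}, 6: {5}}
path4 = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
```

For the first tree, a longest path has length 4 and joins the vertices 4 and 6, so the center is its middle vertex 2. For the path with 4 vertices, the two centers are 2 and 3.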

Remark 5.5.12. Exercise 5.14 is a result by Arthur Cayley from 1875. It shows
once again that each tree has exactly one center or two adjacent centers, and
also shows that any two longest paths of a tree have a common vertex.

The notion of a centroid of a tree is a relative of the notion of a center. We
briefly discuss it in the following exercise:

Exercise 5.15. Let T be a tree. For any vertex v of T, we let cv denote the size
of the largest component of the graph T \ v. (Recall that T \ v is the graph
obtained from T by removing the vertex v and all edges that contain v. Note
that a component (according to our definition) is a set of vertices; thus, its
size is the number of vertices in it.)
The vertices v of T that minimize the number cv are called the centroids of
T.

(a) Prove that T has no more than two centroids, and furthermore, if T has
two centroids, then these two centroids are adjacent.

(b) Find a tree T such that the centroid(s) of T are distinct from the center(s)
of T.

[Example: Here is an example of a tree T, where each vertex v is labelled
with the corresponding number cv :

[Figure: a tree with 11 vertices, each labelled with its number cv; the labels
are 10, 9, 10, 10, 5, 7, 10, 8, 10, 10 and 10.]

Thus, the vertex labelled 5 is the only centroid of this tree T.]

Note the analogy between Exercise 5.15 (a) and Theorem 5.5.9 (a) and (b).

5.6. Arborescences
5.6.1. Definitions
Enough about undirected graphs.
What would be a directed analogue of a tree? I.e., what kind of digraphs
play the same role among digraphs that trees do among undirected graphs?
Trees are graphs that are connected and have no cycles. This suggests two
directed versions:

• We can study digraphs that are strongly connected and have no cycles.
Unfortunately, there is not much to study: Any such digraph has only 1
vertex and no arcs. (Make sure you understand why!)

• We can drop the connectedness requirement. Digraphs that have no
cycles are called acyclic, and more typically they are called dags (short for
“directed acyclic graphs”).
However, these dags aren’t quite like trees. For example, a tree always has
fewer edges than vertices, but a dag can have more arcs than vertices (see footnote 39).

Here is a more convincing analogue of trees for digraphs (see footnote 40):

Definition 5.6.1. Let D be a multidigraph. Let r be a vertex of D.

(a) We say that r is a from-root (or, for short, root) of D if for each vertex v of
D, the digraph D has a path from r to v.

(b) We say that D is an arborescence rooted from r if r is a from-root of D
and the undirected multigraph Dund has no cycles. (Recall that Dund is
the multigraph obtained from D by turning each arc into an undirected
edge. Parallel arcs are not merged into one!)
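
Both parts of Definition 5.6.1 can be tested mechanically. The following Python sketch checks the from-root property by a depth-first search along arcs, and detects cycles of Dund with a union-find structure. The function names and the encoding of a multidigraph as a list of (source, target) pairs are assumptions of this sketch.

```python
def is_from_root(vertices, arcs, r):
    """Does the multidigraph have a path from r to every vertex?"""
    out = {v: [] for v in vertices}
    for s, t in arcs:
        out[s].append(t)
    seen = {r}
    stack = [r]
    while stack:
        x = stack.pop()
        for y in out[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == len(vertices)


def und_has_cycle(vertices, arcs):
    """Does the underlying undirected multigraph have a cycle?"""
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, t in arcs:
        a, b = find(s), find(t)
        if a == b:  # this edge closes a cycle (loops and parallel edges included)
            return True
        parent[a] = b
    return False


def is_arborescence_from(vertices, arcs, r):
    return is_from_root(vertices, arcs, r) and not und_has_cycle(vertices, arcs)


# Hypothetical examples.
V = [1, 2, 3, 4]
good = [(1, 2), (1, 3), (3, 4)]     # an arborescence rooted from 1
closed = good + [(4, 1)]            # the extra arc creates a cycle in the undirected version
```

Note that two arcs between the same two vertices (in either direction) already create a cycle of length 2 in Dund, since parallel arcs are not merged into one.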

Of course, there are analogous notions of a “to-root” and an “arborescence
rooted towards r”, but these are just the same notions that we just defined with
all arrows reversed. So we need not study them separately; we can just take
any property of “rooted from” and reverse all arcs to make it into a property of
“rooted to”.

Example 5.6.2. The multidigraph

[Figure: a multidigraph with vertices 0, 1, 2, 3, 4]

has three from-roots (namely, 0, 1 and 2). It is not an arborescence rooted
from any of them, because turning each arc into an undirected edge yields a
graph with a cycle.
39 For example, here is a dag with 4 vertices and 5 arcs: [figure not reproduced]

40 We recall that we defined a multigraph Dund for every multidigraph D (in Definition
4.4.1). Roughly speaking, this multigraph Dund is obtained by “forgetting the
directions” of the arcs of D. Parallel arcs are not merged into one. For example,
if D = [figure: a multidigraph with vertices 1 and 2], then Dund = [figure: the
corresponding multigraph with vertices 1 and 2].

If we reverse the arc from 0 to 1, then we obtain a multidigraph

[Figure: the same multidigraph with the arc from 0 to 1 reversed]

which has only one from-root (namely, 1) and is still not an arborescence (for
the same reason as before).

Example 5.6.3. Consider the following multidigraph:

[Figure: a multidigraph with vertices 1, 2, ..., 8.]

This is an arborescence rooted from 6. Indeed, it has paths from 6 to all
vertices, and turning each arc into an undirected edge yields a tree.
If we reverse the arc from 1 to 2, we obtain a multidigraph

[Figure: the same multidigraph with the arc from 1 to 2 reversed] ,

which is not an arborescence, because it has no from-root anymore.



5.6.2. Arborescences vs. trees: statement


The above examples suggest that an arborescence rooted from r is basically
the same as a tree all of whose edges have been “oriented away from r”. More
precisely:

Theorem 5.6.4. Let D be a multidigraph, and let r be a vertex of D. Then,
the following two statements are equivalent:

• Statement C1: The multidigraph D is an arborescence rooted from r.

• Statement C2: The undirected multigraph Dund is a tree, and each arc
of D is “oriented away from r” (this means the following: the source of
this arc lies on the unique path between r and the target of this arc on
Dund ).

This is an easy theorem to believe, but an annoyingly hard one to formally
prove in full detail! We shall prove this theorem later.

5.6.3. The arborescence equivalence theorem


First, let us show another bunch of equivalent criteria for arborescences, imitat-
ing the tree equivalence theorem (Theorem 5.2.4):

Theorem 5.6.5 (The arborescence equivalence theorem). Let D = (V, A, ψ)
be a multidigraph with a from-root r. Then, the following six statements are
equivalent:

• Statement A1: The multidigraph D is an arborescence rooted from r.

• Statement A2: We have | A| = |V | − 1.

• Statement A3: The multigraph Dund is a tree.

• Statement A4: For each vertex v ∈ V, the multidigraph D has a unique
walk from r to v.

• Statement A5: If we remove any arc from D, then the vertex r will no
longer be a from-root of the resulting multidigraph.

• Statement A6: We have deg− r = 0, and each v ∈ V \ {r } satisfies
deg− v = 1.
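
Statements A2 and A6 are the easiest of the six to test by machine, since they only involve counting. The following Python sketch evaluates them; the function names and the encoding of a multidigraph as a list of (source, target) pairs are assumptions of this sketch. Note that the theorem only claims their equivalence when r is a from-root of D, which the code does not check.

```python
from collections import Counter


def statement_A2(vertices, arcs):
    """|A| = |V| - 1."""
    return len(arcs) == len(vertices) - 1


def statement_A6(vertices, arcs, r):
    """The root r has indegree 0, and every other vertex has indegree 1."""
    indeg = Counter(t for _, t in arcs)  # missing keys count as 0
    return indeg[r] == 0 and all(indeg[v] == 1 for v in vertices if v != r)


# Hypothetical examples; in both of them, the vertex 1 is a from-root.
V = [1, 2, 3, 4]
arborescence = [(1, 2), (1, 3), (3, 4)]
with_extra_arc = arborescence + [(4, 1)]
```

For the first digraph, both statements hold; for the second one, both fail (it has 4 arcs on 4 vertices, and the root 1 has indegree 1), in agreement with the theorem.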

Proof. We will prove the implications A1 ⟹ A4 ⟹ A5 ⟹ A6 ⟹ A2 ⟹ A3 ⟹ A1.
Since these implications form a cycle that includes all six statements, this will
entail that all six statements are equivalent.

Before we prove these implications, we introduce a notation: If a is any arc
of D, then D \ a shall denote the multidigraph obtained from D by removing
this arc a. (Formally, this means that D \ a := (V, A \ {a}, ψ |_{A \ {a}}).)
We now come to the proofs of the promised implications.
Proof of the implication A1 ⟹ A4: Assume that Statement A1 holds. Thus, D
is an arborescence rooted from r. In other words, r is a from-root of D and the
undirected multigraph Dund has no cycles.
We must show that for each vertex v ∈ V, the multidigraph D has a unique
walk from r to v. The existence of such a walk is clear (because r is a from-root
of D). It is the uniqueness that we need to prove.
Assume the contrary. Thus, there exists a vertex v ∈ V such that two distinct
walks p and q from r to v exist. However, the multidigraph D has no loops (since
any loop of D would be a loop of Dund , and thus create a cycle of Dund , but
we know that Dund has no cycles). Hence, any walk of D is automatically a
backtrack-free walk of Dund (indeed, it is backtrack-free because the only way
two consecutive arcs of a walk in a digraph can be equal is if they are loops).
Therefore, the two walks p and q of D are two backtrack-free walks of Dund .
Thus, there are two distinct backtrack-free walks from r to v in Dund (namely,
p and q). Theorem 5.1.3 thus lets us conclude that Dund has a cycle. But this
contradicts the fact that Dund has no cycles.
This contradiction shows that our assumption was wrong. Hence, we have
proved that for each vertex v ∈ V, the multidigraph D has a unique walk from
r to v. In other words, Statement A4 holds.
Proof of the implication A4 ⟹ A5: Assume that Statement A4 holds.
Let now a be any arc of D. We shall show that r is not a from-root of the
multidigraph D \ a.
Indeed, let s be the source and t the target of the arc a. We shall show that
the digraph D \ a has no path from r to t.
Indeed, assume the contrary. Thus, D \ a has some path p from r to t. This
path does not use the arc a (since it is a path of D \ a).
On the other hand, we have assumed that Statement A4 holds. Applying this
statement to v = s, we conclude that the multidigraph D has a unique walk
from r to s. Let (v0 , a1 , v1 , a2 , v2 , . . . , ak , vk ) be this walk. By appending the arc a
and the vertex t to its end, we extend it to a longer walk

(v0 , a1 , v1 , a2 , v2 , . . . , ak , vk , a, t) ,

which is a walk from r to t. We denote this walk by q.


We have now found two walks from r to t in the digraph D: namely, the
path p and the walk q. These two walks are distinct (since q uses the arc a,
but p does not). However, Statement A4 (applied to v = t) yields that the
multidigraph D has a unique walk from r to t. This contradicts the fact that we
have just found two distinct such walks.

This contradiction shows that our assumption was false. Hence, the digraph
D \ a has no path from r to t. Thus, r is not a from-root of D \ a.
Forget that we fixed a. We have now proved that if a is any arc of D, then r
is not a from-root of D \ a. In other words, if we remove any arc from D, then
the vertex r will no longer be a from-root of the resulting multidigraph. Thus,
Statement A5 holds.
Proof of the implication A5 ⟹ A6: Assume that Statement A5 holds. We must
prove that Statement A6 holds. In other words, we must prove that deg− r = 0,
and that each v ∈ V \ {r } satisfies deg− v = 1.
Let us first prove that deg− r = 0. Indeed, assume the contrary. Thus,
deg− r ≠ 0, so that there exists an arc a with target r. We shall show that r
is a from-root of D \ a.
The arc a has target r. Thus, a path that starts at r cannot use this arc a
(because this arc would lead it back to r, but a path is not allowed to revisit
any vertex), and therefore must be a path of D \ a. Thus we have shown that
any path of D that starts at r is also a path of D \ a. However, for each vertex
v of D, the digraph D has a path from r to v (since r is a from-root of D). This
path is also a path of D \ a (since any path of D that starts at r is also a path
of D \ a). Thus, for each vertex v of D \ a, the digraph D \ a has a path from
r to v. In other words, r is a from-root of D \ a. However, we have assumed
that Statement A5 holds. Thus, in particular, if we remove the arc a from D,
then the vertex r will no longer be a from-root of the resulting multidigraph. In
other words, r is not a from-root of D \ a. But this contradicts the fact that r is
a from-root of D \ a.
This contradiction shows that our assumption was false. Hence, deg− r = 0
is proved.
Now, let v ∈ V \ {r } be arbitrary. We must show that deg− v = 1.
Indeed, assume the contrary. Thus, deg− v ≠ 1. Using the fact that r is a
from-root of D, it is thus easy to see that deg− v ≥ 2.⁴¹ Hence, there exist
two distinct arcs a and b with target v. Consider these arcs a and b.
We are in one of the following three cases:
Case 1: The digraph D \ a has a path from r to v.
Case 2: The digraph D \ b has a path from r to v.
Case 3: Neither the digraph D \ a nor the digraph D \ b has a path from r to
v.
Let us first consider Case 1. In this case, the digraph D \ a has a path from r
to v. Let p be such a path.
We have assumed that Statement A5 holds. Thus, in particular, if we remove
the arc a from D, then the vertex r will no longer be a from-root of the resulting
multidigraph. In other words, r is not a from-root of D \ a. In other words,
there exists a vertex w ∈ V such that the digraph D \ a has no path from r to w
(by the definition of a “from-root”). Consider this vertex w.

⁴¹ Proof. Since r is a from-root of D, we know that the digraph D has a path from r to v. Since
v ≠ r (because v ∈ V \ {r}), this path must have at least one arc. The last arc of this path
is clearly an arc with target v. Thus, there exists at least one arc with target v. In other
words, deg− v ≥ 1. Combining this with deg− v ≠ 1, we obtain deg− v > 1. In other words,
deg− v ≥ 2.

The digraph D has a path q from r to w (since r is a from-root of D). Consider
this path q. If the path q did not use the arc a, then it would be a path of D \ a
as well, but this would contradict the fact that D \ a has no path from r to w.
Thus, the path q must use the arc a.
Consider the part of q that comes after the arc a. This part must be a path
from v to w (since the arc a has target v, whereas the path q has ending point
w). Let us denote this path by q′ . Thus, the path q′ does not use the arc a (since
it was defined as the part of q that comes after a). Hence, q′ is a path of D \ a.
Now, we know that the digraph D \ a has a path p from r to v as well as a
path q′ from v to w. Splicing these paths together, we obtain a walk p ∗ q′ from
r to w. So we know that D \ a has a walk from r to w. According to Corollary
3.3.10, we thus conclude that D \ a has a path from r to w. This contradicts the
fact that D \ a has no path from r to w.
We have thus obtained a contradiction in Case 1.
The same argument (but with the roles of a and b interchanged) results in a
contradiction in Case 2.
Let us finally consider Case 3. In this case, neither the digraph D \ a nor the
digraph D \ b has a path from r to v. However, the digraph D has a path p
from r to v (since r is a from-root of D). Consider this path p. If this path p did
not use the arc a, then it would be a path of D \ a, but this would contradict
our assumption that the digraph D \ a has no path from r to v. Thus, this path
p must use the arc a. For a similar reason, it must also use the arc b. However,
the two arcs a and b have the same target (viz., v) and thus cannot both appear
in the same path (since a path cannot visit a vertex more than once). This
contradicts the fact that the path p uses both arcs a and b. Hence, we have
found a contradiction in Case 3.
We have now found contradictions in all three Cases 1, 2 and 3. This contra-
diction shows that our assumption was false. Hence, deg− v = 1 is proved.
We have now proved that each v ∈ V \ {r } satisfies deg− v = 1. Since we
have also shown that deg− r = 0, we thus have proved Statement A6.
Proof of the implication A6 ⟹ A2: Assume that Statement A6 holds. We must
prove that Statement A2 holds. However, Proposition 4.2.3 yields

\[
|A| = \sum_{v \in V} \deg^- v
= \underbrace{\deg^- r}_{\substack{= 0 \\ \text{(by Statement A6)}}}
+ \sum_{v \in V \setminus \{r\}} \underbrace{\deg^- v}_{\substack{= 1 \\ \text{(by Statement A6)}}}
= 0 + \sum_{v \in V \setminus \{r\}} 1
= \sum_{v \in V \setminus \{r\}} 1
= |V \setminus \{r\}| = |V| - 1.
\]

Hence, Statement A2 holds.


Proof of the implication A2 ⟹ A3: Assume that Statement A2 holds. We must
prove that Statement A3 holds.
For each v ∈ V, the digraph D has a path from r to v (since r is a from-root
of D). Thus, for each v ∈ V, the graph Dund has a path from r to v (since
any path of D is a path of Dund). Therefore, any two vertices u and v of Dund
are path-connected in Dund (because we can get from u to v via r, according
to the previous sentence). Therefore, the graph Dund is connected (since it
has at least one vertex⁴²). Moreover, its number of edges is |A| = |V| − 1 (by
Statement A2). Therefore, the multigraph Dund satisfies Statement T4 of the
tree equivalence theorem (Theorem 5.2.4). Consequently, it satisfies Statement
T1 of that theorem as well. In other words, it is a tree. This proves Statement
A3.
Proof of the implication A3 ⟹ A1: Assume that Statement A3 holds. We must
prove that Statement A1 holds.
The multigraph Dund is a tree (by Statement A3), and thus is a forest; hence,
it has no cycles. Since we also know that r is a from-root of D, we thus conclude
that D is an arborescence rooted from r (by the definition of an arborescence).
In other words, Statement A1 is satisfied.
We have now proved all six implications in the chain
A1 ⟹ A4 ⟹ A5 ⟹ A6 ⟹ A2 ⟹ A3 ⟹ A1. Thus, all six statements A1, A2,
. . ., A6 are equivalent. This proves Theorem 5.6.5.
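
As an illustration only (it plays no role in the proof), here is a minimal Python sketch that tests Statements A2, A5 and A6 on two small examples. The encoding of a multidigraph as a dictionary from arc names to (source, target) pairs, as well as all function names, are assumptions made for this sketch and do not come from the text.

```python
def reachable(arcs, r):
    """Vertices reachable from r; arcs maps arc name -> (source, target)."""
    seen, stack = {r}, [r]
    while stack:
        u = stack.pop()
        for s, t in arcs.values():
            if s == u and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def is_from_root(vertices, arcs, r):
    return reachable(arcs, r) == set(vertices)


def statement_a2(vertices, arcs):
    # Statement A2: |A| = |V| - 1.
    return len(arcs) == len(vertices) - 1


def statement_a5(vertices, arcs, r):
    # Statement A5: removing any single arc destroys the from-root property of r.
    return all(
        not is_from_root(vertices, {b: st for b, st in arcs.items() if b != a}, r)
        for a in arcs
    )


def statement_a6(vertices, arcs, r):
    # Statement A6: indegree 0 at r, indegree 1 everywhere else.
    indeg = {v: 0 for v in vertices}
    for _, t in arcs.values():
        indeg[t] += 1
    return indeg[r] == 0 and all(indeg[v] == 1 for v in vertices if v != r)


V = {1, 2, 3, 4}
arb = {"a": (1, 2), "b": (1, 3), "c": (3, 4)}  # hypothetical arborescence rooted from 1
extra = dict(arb, d=(2, 4))                    # the same digraph with one more arc
```

On `arb` all three checks succeed, whereas on `extra` (where 1 is still a from-root) all three fail, in agreement with the equivalence just proved.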

Exercise 5.16. Let D = (V, A, φ) be a multidigraph that has no cycles⁴³. Let
r ∈ V be some vertex of D. Prove the following:

(a) If deg− u > 0 holds for all u ∈ V \ {r}, then r is a from-root of D.

(b) If deg− u = 1 holds for all u ∈ V \ {r}, then D is an arborescence rooted
from r.

5.7. Arborescences vs. trees


Our next goal is to prove Theorem 5.6.4, which connects arborescences with
trees.
To prove it formally, we introduce a few notations regarding trees. First, we
recall the notion of a distance (Definition 5.5.1). We claim the following simple
property of distances in trees:

Proposition 5.7.1. Let T = (V, E, ϕ) be a tree. Let r ∈ V be a vertex of
T. Let e be an edge of T, and let u and v be its two endpoints. Then,
the distances d (r, u) and d (r, v) differ by exactly 1 (that is, we have either
d (r, u) = d (r, v) + 1 or d (r, v) = d (r, u) + 1).

⁴² This is because r ∈ V.
⁴³ Recall that cycles in a digraph have to be directed cycles (i.e., each arc is traversed from its
source to its target).

Proof. We recall that since T is a tree, the distance d ( p, q) between two vertices
p and q of T is simply the length of the path from p to q. (This path is unique,
since T is a tree.)
Let p be the path from r to u. Then, we are in one of the following two cases:
Case 1: The edge e is an edge of p.
Case 2: The edge e is not an edge of p.
Consider Case 1. In this case, e must be the last edge of p (since otherwise, p
would visit u more than once, but p cannot do this, since p is a path). Thus, if
we remove this last edge e (and the vertex u) from p, then we obtain a path from
r to v. This path is exactly one edge shorter than p. Thus, d (r, v) = d (r, u) − 1,
so that d (r, u) = d (r, v) + 1. So we are done in Case 1.
Now, consider Case 2. In this case, the edge e is not an edge of p. Thus, we
can append e and v to the end of the path p, and the result will be a backtrack-
free walk p′ . However, a backtrack-free walk in a tree is always a path (since
otherwise, it would contain a cycle⁴⁴, but a tree has no cycles). Thus, p′ is a
path from r to v, and it is exactly one edge longer than p (by its construction).
Therefore, d (r, v) = d (r, u) + 1. So we are done in Case 2.
Now, we are done in both cases, so that Proposition 5.7.1 is proven.

Definition 5.7.2. Let T = (V, E, ϕ) be a tree. Let r ∈ V be a vertex of T. Let e
be an edge of T. By Proposition 5.7.1, the distances from the two endpoints
of e to the vertex r differ by exactly 1. So one of them is smaller than the
other.

(a) We define the r-parent of e to be the endpoint of e whose distance to r
is the smaller one. We denote this endpoint by $e^{-r}$.

(b) We define the r-child of e to be the endpoint of e whose distance to r is
the larger one. We denote this endpoint by $e^{+r}$.

Thus, by Proposition 5.7.1, we have

\[
d\left(r, e^{+r}\right) = d\left(r, e^{-r}\right) + 1.
\]

⁴⁴ by Proposition 5.1.2

Example 5.7.3. Here is a tree T, a vertex r, an edge e and its r-parent $e^{-r}$ and
its r-child $e^{+r}$:

[Figure: a tree T in which an edge e and its two endpoints $e^{+r}$ and $e^{-r}$ are marked.]

Definition 5.7.4. Let T = (V, E, ϕ) be a tree. Let r ∈ V be a vertex of T. Then,
we define a multidigraph $T^{r\to}$ by

\[
T^{r\to} := (V, E, \psi),
\]

where ψ : E → V × V is the map that sends each edge e ∈ E to the pair
$(e^{-r}, e^{+r})$. Colloquially speaking, this means that $T^{r\to}$ is the multidigraph
obtained from T by turning each edge e into an arc from its r-parent $e^{-r}$ to
its r-child $e^{+r}$. This is what we mean when we speak of “orienting each edge
of T away from r” in Theorem 5.6.4.
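
The construction in Definition 5.7.4 can be sketched in a few lines of Python. This is only an illustration under assumptions of our own: a tree is encoded as a dictionary from edge names to pairs of endpoints, distances to r are computed by breadth-first search, and the names `orient_away_from`, `V` and `T` are hypothetical.

```python
from collections import deque


def orient_away_from(vertices, edges, r):
    """Orient each edge of a tree from its r-parent to its r-child.

    edges maps edge name -> (u, v); the result maps edge name -> (source, target).
    The input is assumed to be a tree; this is not checked.
    """
    adj = {v: [] for v in vertices}
    for u, v in edges.values():
        adj[u].append(v)
        adj[v].append(u)
    # Breadth-first search computes the distances d(r, v).
    dist = {r: 0}
    queue = deque([r])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    # The endpoint closer to r becomes the source (the r-parent).
    return {e: (u, v) if dist[u] < dist[v] else (v, u) for e, (u, v) in edges.items()}


V = {1, 2, 3, 4, 5}
T = {"e1": (1, 2), "e2": (2, 3), "e3": (2, 4), "e4": (4, 5)}  # hypothetical tree
```

For instance, `orient_away_from(V, T, 4)` turns the edge e3 into an arc from 4 to 2 and the edge e1 into an arc from 2 to 1, while `orient_away_from(V, T, 1)` keeps every pair as written.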

Example 5.7.5. If T is the tree from Example 5.7.3, then $T^{r\to}$ is the following
multidigraph:

[Figure: the tree T from Example 5.7.3 with each edge oriented away from r.]

Now, Theorem 5.6.4 can be rewritten as follows:


Theorem 5.7.6. Let D be a multidigraph, and let r be a vertex of D. Then,
the following two statements are equivalent:

• Statement C1: The multidigraph D is an arborescence rooted from r.


• Statement C2: The undirected multigraph Dund is a tree, and we have
$D = (D^{\mathrm{und}})^{r\to}$. (This is an honest equality, not just some isomorphism.)

The proof of this theorem is best organized by splitting into two lemmas:
Lemma 5.7.7. Let T = (V, E, ϕ) be a tree. Let r ∈ V be a vertex of T. Then,
the multidigraph $T^{r\to}$ is an arborescence rooted from r.
Proof. The idea is to show that if p is a path from r to some vertex v in the tree
T, then p is also a path in the digraph $T^{r\to}$, because all the edges of p have been
“oriented correctly” (i.e., their orientation matches how they are used in p).
Here are the details: Clearly, $(T^{r\to})^{\mathrm{und}} = T$. Hence, the graph $(T^{r\to})^{\mathrm{und}}$ is a
tree and hence has no cycles. Thus, it suffices to prove that r is a from-root of
$T^{r\to}$. In other words, we must prove that
$T^{r\to}$ has a path from r to v        (20)
for each v ∈ V.
We shall prove (20) by induction on d (r, v) (where d means the distance on
the tree T):
Base case: If v ∈ V satisfies d (r, v) = 0, then v = r, and thus $T^{r\to}$ has a path
from r to v (namely, the trivial path (r)). Thus, (20) is proved for d (r, v) = 0.
Induction step: Let k ∈ N. Assume (as the induction hypothesis) that (20)
holds for each v ∈ V satisfying d (r, v) = k. We must now prove the same for
each v ∈ V satisfying d (r, v) = k + 1.
So let v ∈ V satisfy d (r, v) = k + 1. Then, the path of T from r to v has
length k + 1. Let p be this path, let e be its last edge, and let u be its second-to-last
vertex (so that its last edge e has endpoints u and v). Then, by removing
the last edge e from the path p, we obtain a path from r to u that is one edge
shorter than p. Hence, d (r, u) = d (r, v) − 1 < d (r, v). Consequently, the edge
e has r-parent u and r-child v (by Definition 5.7.2). In other words, $e^{-r} = u$
and $e^{+r} = v$. Therefore, in the digraph $T^{r\to}$, the edge e is an arc from u to
v (by Definition 5.7.4). Moreover, we have d (r, u) = d (r, v) − 1 = k (since
d (r, v) = k + 1); therefore, the induction hypothesis tells us that (20) holds for
u instead of v. In other words, $T^{r\to}$ has a path from r to u. Attaching the arc
e and the vertex v to this path, we obtain a walk of $T^{r\to}$ from r to v (since e
is an arc from u to v in $T^{r\to}$). Thus, the digraph $T^{r\to}$ has a walk from r to v,
therefore also a path from r to v. Hence, (20) holds for our v. This completes
the induction step.
Thus, (20) is proved by induction. As we explained above, this yields Lemma
5.7.7.
Lemma 5.7.8. Let D = (V, A, ψ) be an arborescence rooted from r (for some
r ∈ V). Let a ∈ A be an arc of D. Let s be the source of a, and let t be the
target of a. Then:

(a) We have d (r, s) < d (r, t), where d means distance on the tree Dund.

(b) In the multidigraph $(D^{\mathrm{und}})^{r\to}$, the arc a has source s and target t.

Proof. (a) The vertex r is a from-root of D (since D is an arborescence rooted
from r). Thus, D has a path from r to t. Let p be this path. Note that deg− t ≥ 1,
since t is the target of at least one arc (namely, of a).
The digraph D is an arborescence rooted from r, and thus satisfies Statement
A6 in the arborescence equivalence theorem (Theorem 5.6.5). In other words,
we have

deg− r = 0 and deg− v = 1 for each v ∈ V \ {r } .

In particular, this entails deg− v ≤ 1 for each v ∈ V. Applying this to v = t, we
obtain deg− t ≤ 1. Hence, the arc a is the only arc whose target is t.
We have t ≠ r (since deg− r = 0 but deg− t ≥ 1 > 0). Thus, the path p from r
to t has at least one arc. Its last arc is therefore an arc whose target is t. Hence,
this last arc is a (since a is the only arc whose target is t).
If we remove this last arc from the path p, then we obtain a path p′ from r to
s (since s is the source of a).
However, each path of D is a path of Dund. Thus, in particular, p is a path of
Dund from r to t, while p′ is a path of Dund from r to s. Since p′ is exactly one
edge shorter than p, we thus obtain d (r, s) = d (r, t) − 1 < d (r, t). This proves
Lemma 5.7.8 (a).
(b) The arc a of the digraph D has source s and target t. Hence, the edge a
of the tree Dund has endpoints s and t. Since d (r, s) < d (r, t) (by part (a)), this
entails that its r-parent is s and its r-child is t (by Definition 5.7.2). Thus, in the
digraph $(D^{\mathrm{und}})^{r\to}$, this edge a becomes an arc with source s and target t (by
Definition 5.7.4). This proves Lemma 5.7.8 (b).
Proof of Theorem 5.7.6. If (V, A, ψ) is a multidigraph, then we shall refer to the
map ψ : A → V × V (which determines the source and the target of each arc)
as the “psi-map” of this multidigraph.
Write the multidigraph D as D = (V, A, ψ). We shall now prove the implications
C1 ⟹ C2 and C2 ⟹ C1 separately:
Proof of the implication C1 ⟹ C2: Assume that Statement C1 holds. That is,
D is an arborescence rooted from r. We must prove Statement C2. In other
words, we must prove that the undirected multigraph Dund is a tree, and that
$D = (D^{\mathrm{und}})^{r\to}$.
It is clear (by the definition of an arborescence) that Dund is a tree. It thus
remains to prove that $D = (D^{\mathrm{und}})^{r\to}$.
The multidigraphs D and $(D^{\mathrm{und}})^{r\to}$ have the same set of vertices (namely, V)
and the same set of arcs (namely, A); we therefore just need to show that their
psi-maps are the same. In other words, we need to show that ψ′ = ψ, where ψ′
is the psi-map of $(D^{\mathrm{und}})^{r\to}$.
Let a ∈ A be arbitrary. Let ψ (a) = (s, t). Thus, the arc a of D has source s and
target t. Lemma 5.7.8 (b) therefore shows that in the multidigraph $(D^{\mathrm{und}})^{r\to}$,
the arc a has source s and target t as well. In other words, ψ′ (a) = (s, t) (since
ψ′ is the psi-map of this multidigraph). Hence, ψ′ (a) = (s, t) = ψ (a).
Forget that we fixed a. We thus have shown that ψ′ ( a) = ψ ( a) for each
a ∈ A. In other words, ψ′ = ψ. As explained above, this completes the proof of
Statement C2.
Proof of the implication C2 ⟹ C1: Assume that Statement C2 holds. Thus, the
undirected multigraph Dund is a tree, and we have $D = (D^{\mathrm{und}})^{r\to}$. Hence,
Lemma 5.7.7 (applied to T = Dund) yields that the multidigraph $(D^{\mathrm{und}})^{r\to}$ is
an arborescence rooted from r. In other words, D is an arborescence rooted
from r (since $D = (D^{\mathrm{und}})^{r\to}$). This shows that Statement C1 holds.
Having now proved both implications C1 ⟹ C2 and C2 ⟹ C1, we conclude
that Statements C1 and C2 are equivalent. Thus, Theorem 5.7.6 is proved.
Oof.
Let’s get one more consequence out of this. First, let us show that an arbores-
cence can have only one root:
Proposition 5.7.9. Let D be an arborescence rooted from r. Then, r is the
only root of D.
Proof of Proposition 5.7.9. Assume the contrary. Thus, D has another root s dis-
tinct from r. Hence, D has a path from r to s (since r is a root) as well as a path
from s to r (since s is a root). Combining these paths gives a circuit of length
> 0. However, a circuit of length > 0 in a digraph must always contain a cycle
(since Proposition 4.5.9 shows that it either is a path or contains a cycle; but it
clearly cannot be a path). Hence, D has a cycle. Therefore, Dund also has a cycle
(since any cycle of D is a cycle of Dund ). However, Dund has no cycles (since
D is an arborescence rooted from r). The preceding two sentences contradict
each other. This shows that the assumption was wrong, and Proposition 5.7.9
is proven.
Definition 5.7.10. A multidigraph D is said to be an arborescence if there
exists a vertex r of D such that D is an arborescence rooted from r. In this
case, this r is uniquely determined as the only root of D (by Proposition
5.7.9).

Theorem 5.7.11. There are two mutually inverse maps

\[
\{\text{pairs } (T, r) \text{ of a tree } T \text{ and a vertex } r \text{ of } T\} \to \{\text{arborescences}\},
\qquad (T, r) \mapsto T^{r\to}
\]

and

\[
\{\text{arborescences}\} \to \{\text{pairs } (T, r) \text{ of a tree } T \text{ and a vertex } r \text{ of } T\},
\qquad D \mapsto \left(D^{\mathrm{und}}, \sqrt{D}\right),
\]

where $\sqrt{D}$ denotes the root of D.

Proof. The map

\[
\{\text{pairs } (T, r) \text{ of a tree } T \text{ and a vertex } r \text{ of } T\} \to \{\text{arborescences}\},
\qquad (T, r) \mapsto T^{r\to}
\]

is well-defined because of Lemma 5.7.7. The map

\[
\{\text{arborescences}\} \to \{\text{pairs } (T, r) \text{ of a tree } T \text{ and a vertex } r \text{ of } T\},
\qquad D \mapsto \left(D^{\mathrm{und}}, \sqrt{D}\right)
\]

is well-defined because if D is an arborescence, then Dund is a tree. In order to
show that these two maps are mutually inverse, we must check the following
two statements:

1. Each arborescence D satisfies $(D^{\mathrm{und}})^{r\to} = D$, where r is the root of D;

2. Each pair (T, r) of a tree T and a vertex r of T satisfies $(T^{r\to})^{\mathrm{und}} = T$ and
$\sqrt{T^{r\to}} = r$.

However, Statement 1 follows from Theorem 5.7.6 (specifically, from the
implication C1 ⟹ C2 in Theorem 5.7.6). Statement 2 follows from Lemma 5.7.7
(more precisely, the $(T^{r\to})^{\mathrm{und}} = T$ part of Statement 2 is obvious, whereas
the $\sqrt{T^{r\to}} = r$ part follows from Lemma 5.7.7). Thus, Theorem 5.7.11 is
proved.
Theorem 5.7.11 formalizes the idea that an arborescence is “just a tree with a
chosen vertex”. For this reason, arborescences are sometimes called “oriented
trees”, but this name is also shared with a more general notion, which is why I
avoid it.
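
The second map in Theorem 5.7.11 is easy to sketch in Python. The sketch below assumes (without checking) that its input really is an arborescence, encoded as a dictionary from arc names to (source, target) pairs; it finds the root as the unique vertex of indegree 0, which is justified by Statement A6 of Theorem 5.6.5. The function names are hypothetical.

```python
def root_of(vertices, arcs):
    """Return the unique vertex of indegree 0 (the root, by Statement A6)."""
    targets = {t for _, t in arcs.values()}
    candidates = [v for v in vertices if v not in targets]
    if len(candidates) != 1:
        raise ValueError("input is not an arborescence")
    return candidates[0]


def forget_directions(arcs):
    """The underlying undirected multigraph: each arc keeps only its two endpoints."""
    return {a: frozenset(st) for a, st in arcs.items()}


def to_tree_and_root(vertices, arcs):
    """Send an arborescence D to the pair (D^und, root of D)."""
    return forget_directions(arcs), root_of(vertices, arcs)


V = {1, 2, 3, 4}
D = {"x": (3, 1), "y": (3, 2), "z": (2, 4)}  # hypothetical arborescence rooted from 3
tree, r = to_tree_and_root(V, D)
```

Here `r` is 3, and `tree` records the three edges with endpoint sets {1, 3}, {2, 3} and {2, 4}.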

Exercise 5.17. Let G = (V, E, ϕ) be a connected multigraph such that | E| ≥


|V |. Show that there exists an injective map f : V → E such that for each
vertex v ∈ V, the edge f (v) contains v.
(In other words, show that we can assign to each vertex an edge that con-
tains this vertex in such a way that no edge is assigned twice.)

5.8. Spanning arborescences


In analogy to spanning subgraphs of a multigraph, we can define spanning
subdigraphs of a multidigraph:

Definition 5.8.1. A spanning subdigraph of a multidigraph D = (V, A, ψ)


means a multidigraph of the form (V, B, ψ | B ), where B is a subset of A.
In other words, it means a submultidigraph of D with the same vertex set
as D.
In other words, it means a multidigraph obtained from D by removing
some arcs, but leaving all vertices untouched.

Definition 5.8.2. Let D be a multidigraph. Let r be a vertex of D. A spanning


arborescence of D rooted from r means a spanning subdigraph of D that is
an arborescence rooted from r.

Example 5.8.3. Let D = (V, A, ψ) be the following multidigraph:

[Figure: the multidigraph D, with vertices 1, 2, 3, 4 and arcs a, b, c, d, e, f, g.]

Is there a spanning arborescence of D rooted from 1? Yes, for instance,
$(V, \{a, c, e\}, \psi|_{\{a, c, e\}})$:

[Figure: the spanning subdigraph of D with arcs a, c, e.]

By abuse of notation, we shall refer to this spanning arborescence simply
as {a, c, e} (since a spanning subdigraph of D is uniquely determined by its
arc set). Another spanning arborescence of D rooted from 1 is {a, b, e}. Yet
another is {a, b, f}. A non-example is {a, d, f} (indeed, this is an arborescence
rooted from 3, not from 1).
Is there a spanning arborescence of D rooted from 2? Yes, for example
{b, d, f}.
Is there a spanning arborescence of D rooted from 4? No, since 4 is not a
from-root of D.

This illustrates a first obstruction to the existence of spanning arborescences:
Namely, a digraph D can have a spanning arborescence rooted from r only if r
is a from-root. This necessary criterion is also sufficient:

Theorem 5.8.4. Let D be a multidigraph. Let r be a from-root of D. Then, D


has a spanning arborescence rooted from r.

Proof. This is an analogue of the “every connected multigraph has a spanning


tree” theorem (Theorem 5.4.6) that we proved in 4 ways. At least the first proof
easily adapts to the directed case:
Remove arcs from D one by one, but in such a way that the “rootness of r”
(that is, the property that r is a root of our multidigraph) is preserved. So we
can only remove an arc if r remains a root afterwards.
Clearly, this removing process will eventually come to an end, since D has
only finitely many arcs. Let D ′ be the multidigraph obtained at the end of this
process. Then, r is still a root of D ′ , but we cannot remove any more arcs from
D ′ without breaking the rootness of r. That is, if we remove any arc from D ′ ,
then the vertex r will no longer be a from-root of the resulting multidigraph.
This means that D ′ satisfies Statement A5 from the arborescence equivalence
theorem (Theorem 5.6.5). Thus, D ′ satisfies Statement A1 as well (since all six
statements A1, A2, . . ., A6 are equivalent). In other words, D ′ is an arborescence
rooted from r. Since D ′ is a spanning subdigraph of D, we thus conclude that D
has a spanning arborescence rooted from r (namely, D ′ ). This proves Theorem
5.8.4.
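
The arc-removal argument above translates directly into a short procedure. The following Python sketch is an illustration under our own assumptions: a multidigraph is a dictionary from arc names to (source, target) pairs, and the names `reachable`, `spanning_arborescence` and the example digraph are hypothetical. A single pass over the arcs suffices, because an arc that could not be removed at some point cannot be removed later either (removing further arcs never creates new paths).

```python
def reachable(arcs, r):
    """Vertices reachable from r; arcs maps arc name -> (source, target)."""
    seen, stack = {r}, [r]
    while stack:
        u = stack.pop()
        for s, t in arcs.values():
            if s == u and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def spanning_arborescence(vertices, arcs, r):
    """Remove arcs one by one while r stays a from-root; return the surviving arcs."""
    if reachable(arcs, r) != set(vertices):
        raise ValueError("r is not a from-root")
    kept = dict(arcs)
    for a in list(arcs):
        trial = {b: st for b, st in kept.items() if b != a}
        if reachable(trial, r) == set(vertices):
            kept = trial  # r is still a from-root, so the arc a can be dropped
    return kept


V = {1, 2, 3, 4}
D = {"a": (1, 2), "b": (2, 3), "c": (1, 3), "d": (3, 4), "e": (4, 2)}  # hypothetical
B = spanning_arborescence(V, D, 1)
```

With the arcs tried in the order a, b, c, d, e, the procedure drops a and b and keeps {c, d, e}. A different order may produce a different spanning arborescence.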
Question 5.8.5. Can the other three proofs of Theorem 5.4.6 be adapted to
Theorem 5.8.4, too?


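Example 5.8.6. Let n be a positive integer. The n-cycle digraph $\overrightarrow{C}_n$
is defined to be the simple digraph with vertices 1, 2, . . . , n and arcs
12, 23, 34, . . . , (n − 1) n, n1. (Here is how it looks for n = 5:

[Figure: the digraph $\overrightarrow{C}_5$, a directed cycle through the vertices 1, 2, 3, 4, 5.]

)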


Note that this digraph $\overrightarrow{C}_n$ is a directed analogue of the cycle graph Cn. As
we recall from Example 5.4.4, the cycle graph Cn has n spanning trees.
In contrast, the digraph $\overrightarrow{C}_n$ has only one spanning arborescence rooted
from 1. This spanning arborescence is the subdigraph of $\overrightarrow{C}_n$ obtained by
removing the arc n1.



Proof. If we remove the arc n1 from $\overrightarrow{C}_n$, then we obtain the simple digraph E
with vertices 1, 2, . . . , n and arcs 12, 23, . . . , (n − 1) n. This digraph E is easily
seen to be an arborescence rooted from 1 (indeed, 1 is a from-root of E, and the
underlying undirected graph Eund = Pn has no cycles). Thus, E is a spanning
arborescence of $\overrightarrow{C}_n$ rooted from 1.
We shall now prove that it is the only such arborescence. Indeed, let F be
any spanning arborescence of $\overrightarrow{C}_n$ rooted from 1. Then, 1 is a from-root of
F. Hence, for each vertex v ∈ {2, 3, . . . , n}, the digraph F must have a path
from 1 to v, and thus must contain an arc with target v (namely, the last arc
of this path). This arc must be (v − 1, v) (since this is the only arc of $\overrightarrow{C}_n$ with
target v). Thus, for each vertex v ∈ {2, 3, . . . , n}, the digraph F must contain
the arc (v − 1, v). In other words, the digraph F must contain all n − 1 arcs
12, 23, . . . , (n − 1) n. If F were to also contain the remaining arc n1 of $\overrightarrow{C}_n$,
then the underlying undirected graph Fund = Cn would contain a cycle, which
would contradict F being an arborescence. Hence, F cannot contain the arc n1.
Thus, F contains the n − 1 arcs 12, 23, . . . , (n − 1) n and no others. In other
words, F = E. This shows that any spanning arborescence of $\overrightarrow{C}_n$ rooted from
1 must be E. In other words, E is the only spanning arborescence of $\overrightarrow{C}_n$ rooted
from 1. This completes the proof of Example 5.8.6.
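
The claim of Example 5.8.6 can also be confirmed by brute force for small n. The Python sketch below is only an illustration: it counts the arc subsets B with |B| = |V| − 1 for which r is a from-root, which are exactly the spanning arborescences rooted from r by the equivalence of Statements A1 and A2 in Theorem 5.6.5. The encoding and the function names are assumptions of this sketch.

```python
from itertools import combinations


def reachable(arcs, r):
    """Vertices reachable from r; arcs maps arc name -> (source, target)."""
    seen, stack = {r}, [r]
    while stack:
        u = stack.pop()
        for s, t in arcs.values():
            if s == u and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def count_spanning_arborescences(vertices, arcs, r):
    """Count arc subsets of size |V| - 1 for which r is a from-root."""
    n = len(vertices)
    return sum(
        1
        for B in combinations(arcs, n - 1)
        if reachable({a: arcs[a] for a in B}, r) == set(vertices)
    )


def cycle_digraph(n):
    """The n-cycle digraph: vertices 1..n, arcs 12, 23, ..., (n-1)n, n1."""
    vertices = set(range(1, n + 1))
    arcs = {(i, i % n + 1): (i, i % n + 1) for i in vertices}
    return vertices, arcs
```

For example, `count_spanning_arborescences(*cycle_digraph(5), 1)` evaluates to 1.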

5.9. The BEST theorem: statement


We now come to something much more surprising.
Recall that a multidigraph D = (V, A, ϕ) is balanced if and only if each
vertex v satisfies deg− v = deg+ v. This is necessary for the existence of an
Eulerian circuit. If D is weakly connected, this is also sufficient (by Theorem
4.7.2 (a)).
Surprisingly, there is a formula for the number of these Eulerian circuits:

Theorem 5.9.1 (The BEST theorem). Let D = (V, A, ψ) be a balanced multi-


digraph such that each vertex has indegree > 0. Fix an arc a of D, and let
r be its target. Let τ ( D, r ) be the number of spanning arborescences of D
rooted from r. Let ε ( D, a) be the number of Eulerian circuits of D whose last
arc is a. Then,
\[
\varepsilon(D, a) = \tau(D, r) \cdot \prod_{u \in V} \left(\deg^- u - 1\right)!.
\]

The “BEST” in the name of this theorem is an abbreviation for de Bruijn, van
Aardenne–Ehrenfest, Smith and Tutte, who discovered it in the middle of the
20th century.⁴⁵ ⁴⁶

⁴⁵ More precisely, van Aardenne–Ehrenfest and de Bruijn discovered it in 1951 (see [VanEhr51,
§6]), generalizing an earlier result of Smith and Tutte.
⁴⁶ We note that the number of Eulerian circuits of D whose last arc is a is precisely the number

To prove this theorem, we shall restate it in terms of “arborescences to” (as


opposed to “arborescences from”). Mathematically speaking, this restatement
isn’t really necessary (the argument is the same in both cases up to reversing
the directions of all arcs), but it helps make the proof more intuitive, since it
lets us build our Eulerian circuits by moving forwards rather than backwards.
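
Before turning to the proof, Theorem 5.9.1 can be sanity-checked by brute force on a small example. The Python sketch below is an illustration only; the example digraph (a triangle with each edge replaced by two opposite arcs), the dictionary encoding and all function names are assumptions of this sketch. It counts the Eulerian circuits ending with a given arc by exhaustive search and compares the result with the right-hand side of the formula.

```python
from itertools import combinations
from math import factorial, prod


def reachable(arcs, r):
    """Vertices reachable from r; arcs maps arc name -> (source, target)."""
    seen, stack = {r}, [r]
    while stack:
        u = stack.pop()
        for s, t in arcs.values():
            if s == u and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def tau(vertices, arcs, r):
    """Number of spanning arborescences rooted from r (via Statement A2)."""
    n = len(vertices)
    return sum(
        1
        for B in combinations(arcs, n - 1)
        if reachable({b: arcs[b] for b in B}, r) == set(vertices)
    )


def epsilon(arcs, a):
    """Number of Eulerian circuits whose last arc is a."""
    s, r = arcs[a]  # the circuit starts at the target r of a and ends with a

    def extend(v, unused):
        if not unused:
            return 1 if v == s else 0  # only a is left; it must start where we stand
        return sum(
            extend(arcs[b][1], unused - {b}) for b in unused if arcs[b][0] == v
        )

    return extend(r, frozenset(arcs) - {a})


def best_rhs(vertices, arcs, a):
    """Right-hand side of the BEST formula for the arc a."""
    r = arcs[a][1]
    indeg = {v: 0 for v in vertices}
    for _, t in arcs.values():
        indeg[t] += 1
    return tau(vertices, arcs, r) * prod(factorial(indeg[u] - 1) for u in vertices)


V = {1, 2, 3}
D = {"a": (1, 2), "b": (2, 3), "c": (3, 1), "d": (1, 3), "e": (3, 2), "f": (2, 1)}
```

In this example every vertex has indegree 2, so the product of factorials is 1, and both `epsilon(D, "c")` and `tau(V, D, 1)` equal 3.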

5.10. Arborescences rooted to r


Here is the formal definition of “arborescences to”:

Definition 5.10.1. Let D be a multidigraph. Let r be a vertex of D.

(a) We say that r is a to-root of D if for each vertex v of D, the digraph D


has a path from v to r.

(b) We say that D is an arborescence rooted to r if r is a to-root of D and


the undirected multigraph Dund has no cycles.

Clearly, Definition 5.6.1 and Definition 5.10.1 differ only in the direction of
the arcs. In other words, if we reverse each arc of our digraph (turning its
source into its target and vice versa), then a from-root becomes a to-root, and
an arborescence rooted from r becomes an arborescence rooted to r, and vice
versa. Thus, every property that we have proved for arborescences rooted from
r can be translated into the language of arborescences rooted to r by reversing
all arcs.
If you want to see this stated more rigorously, here is a formal definition of “revers-
ing each arc”:

Definition 5.10.2. Let D = (V, A, ψ) be a multidigraph. Then, Drev shall denote the
multidigraph (V, A, τ ◦ ψ), where τ : V × V → V × V is the map that sends each
pair (s, t) to (t, s). Thus, if an arc a of D has source s and target t, then it is also an
arc of Drev , but in this digraph Drev it has source t and target s.
The multidigraph Drev is called the reversal of the multidigraph D; we say that it
is obtained from D by “reversing each arc”.

This notion of “reversing each arc” allows us to reverse walks in digraphs: If w is a


walk from a vertex s to t in some multidigraph D, then its reversal rev w (obtained by
reading w backwards) is a walk from t to s in the multidigraph Drev . The same holds
if we replace the word “walk” by “path”. Thus, we easily obtain the following:

Proposition 5.10.3. Let D be a multidigraph. Let r be a vertex of D. Then:

(a) The vertex r is a to-root of D if and only if r is a from-root of Drev.

(b) The digraph D is an arborescence rooted to r if and only if Drev is an
arborescence rooted from r.

⁴⁶ (continued) of all Eulerian circuits of D counted up to rotation. Indeed, each Eulerian circuit of D
contains the arc a exactly once, and thus can be rotated in a unique way to end with a.

Proof. Completely straightforward unpacking of the definitions.
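
For readers who like to experiment, the reversal operation and its effect on walks can be sketched in Python as follows. The encoding (a dictionary from arc names to (source, target) pairs, and a walk as a list alternating between vertices and arc names) and the function names are assumptions of this sketch.

```python
def reverse_digraph(arcs):
    """D^rev: same arcs, with source and target swapped."""
    return {a: (t, s) for a, (s, t) in arcs.items()}


def reverse_walk(walk):
    """rev w: the walk read backwards."""
    return walk[::-1]


def is_walk(arcs, walk):
    """Check that each arc in the list leads from the vertex before it to the vertex after it."""
    return all(
        arcs[walk[i]] == (walk[i - 1], walk[i + 1]) for i in range(1, len(walk), 2)
    )


D = {"a": (1, 2), "b": (2, 3)}  # hypothetical digraph
w = [1, "a", 2, "b", 3]          # a walk from 1 to 3 in D
```

Here `reverse_walk(w)` is a walk from 3 to 1 in `reverse_digraph(D)`, but not a walk in `D` itself.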

Note that when we reverse each arc in a digraph D, the outdegrees of its
vertices become their indegrees and vice versa. Hence, a balanced digraph D
remains balanced when this happens. In particular, the BEST theorem (Theo-
rem 5.9.1) thus gets translated as follows:

Theorem 5.10.4 (The BEST’ theorem). Let D = (V, A, ψ) be a balanced
multidigraph such that each vertex has outdegree > 0. Fix an arc a of D, and let
r be its source. Let τ (D, r) be the number of spanning arborescences of D
rooted to r. Let ε (D, a) be the number of Eulerian circuits of D whose first
arc is a. Then,
\[
\varepsilon(D, a) = \tau(D, r) \cdot \prod_{u \in V} \left(\deg^+ u - 1\right)!.
\]

We will soon prove Theorem 5.10.4, and then derive Theorem 5.9.1 from it by
reversing the arcs.
First, however, let us state the analogue of the Arborescence Equivalence
Theorem (Theorem 5.6.5) for “arborescences rooted to r” (as opposed to “ar-
borescences rooted from r”):

Theorem 5.10.5 (The dual arborescence equivalence theorem). Let D =


(V, A, ψ) be a multidigraph with a to-root r. Then, the following six state-
ments are equivalent:

• Statement A’1: The multidigraph D is an arborescence rooted to r.

• Statement A’2: We have | A| = |V | − 1.

• Statement A’3: The multigraph Dund is a tree.

• Statement A’4: For each vertex v ∈ V, the multidigraph D has a unique


walk from v to r.

• Statement A’5: If we remove any arc from D, then the vertex r will no
longer be a to-root of the resulting multidigraph.

• Statement A’6: We have deg+ r = 0, and each v ∈ V \ {r } satisfies


deg+ v = 1.

Proof. Upon reversing all arcs of D, this turns into the original Arborescence
Equivalence Theorem (Theorem 5.6.5).

5.11. The BEST theorem: proof


We now come to the proof of the BEST theorem (Theorem 5.9.1). As we said,
we proceed by proving Theorem 5.10.4 first. We first outline the idea of the
proof; then we will give the details.
Proof idea for Theorem 5.10.4. An a-Eulerian circuit shall mean an Eulerian cir-
cuit of D whose first arc is a.
Let e be an a-Eulerian circuit. Its first arc is a; therefore, its first and last
vertex is r.
Being an Eulerian circuit, e must contain each arc of D and therefore contain
each vertex of D (since each vertex has outdegree > 0). For each vertex u ≠ r,
we let e (u) be the last exit of e from u, that is, the last arc of e that has source
u. Let Exit e be the set of these last exits e (u) for all vertices u ≠ r. Then, we
claim:

Claim 1: This set Exit e (or, more precisely, the spanning subdigraph
(V, Exit e, ψ |Exit e )) is a spanning arborescence of D rooted to r.

Let’s assume for the moment that Claim 1 is proven. Thus, given any a-
Eulerian circuit e, we have constructed a spanning arborescence of D rooted to
r.
How many a-Eulerian circuits e lead to a given arborescence in this way?
The answer is rather nice:

Claim 2: For each spanning arborescence (V, B, ψ |B) of D rooted to
r, there are exactly $\prod_{u \in V} (\deg^+ u - 1)!$ many a-Eulerian circuits e such
that Exit e = B.

Let us again assume that this is proven. Combining Claim 1 with Claim
2, we obtain a $\prod_{u \in V} (\deg^+ u - 1)!$-to-1 correspondence between the a-Eulerian
circuits and the spanning arborescences of D rooted to r. Thus, the number
of the former is $\prod_{u \in V} (\deg^+ u - 1)!$ times the number of the latter. But this is
precisely the claim of Theorem 5.10.4. Hence, in order to prove Theorem 5.10.4,
it remains to prove Claim 1 and Claim 2.
Here is the complete proof:
Proof of Theorem 5.10.4. Some notations first:
An outgoing arc from a vertex u will mean an arc whose source is u. An
incoming arc into a vertex u will mean an arc whose target is u.
An a-Eulerian circuit shall mean an Eulerian circuit of D whose first arc is a.
A sparb shall mean a spanning arborescence of D rooted to r.
A spanning subdigraph of D always has the form (V, B, ψ | B ) for some subset
B of A. Thus, it is uniquely determined by its arc set B.
An introduction to graph theory, version August 2, 2023 page 204

Hence, from now on, we shall identify a spanning subdigraph (V, B, ψ | B )
of D with its arc set B. Conversely, any subset B of A will be identified with
the corresponding spanning subdigraph (V, B, ψ | B ) of D. Thus, for instance,
when we say that a subset B of A “is a sparb”, we shall actually mean that the
corresponding spanning subdigraph (V, B, ψ | B ) is a sparb.
For each a-Eulerian circuit e, we define a subset Exit e of A as follows:
Let e be an a-Eulerian circuit. Its first arc is a; thus, its first and last vertex
is r. Being an Eulerian circuit, e must contain each arc of D and therefore also
contain each vertex of D (since each vertex of D has outdegree > 0). For each
vertex u ∈ V \ {r }, we let e (u) be the last exit of e from u; this means the last
arc of e that has source u. We let Exit e be the set of these last exits e (u) for
all u ∈ V \ {r }. Thus, we have defined a subset Exit e of A for each a-Eulerian
circuit e.
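Example 5.11.1. Here is an example of this construction: Let D be the multi-
digraph

[Figure: the multidigraph D with vertices 1, 2, 3, 4, 5 and arcs a, b, c, d, e, f, g, h, i, j, k, l; the source and the target of each arc can be read off from the circuit e below.]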
with r = 1, and let e be the a-Eulerian circuit

(1, a, 2, b, 3, c, 4, d, 5, e, 1, f , 3, g, 3, h, 5, i, 5, j, 2, k, 4, l, 1)

(we have deliberately named the arcs in such a way that they appear on an
Eulerian circuit in alphabetic order). Then,

e (2) = k, e (3) = h, e (4) = l, e (5) = j,



so that Exit e = {k, h, l, j} . Here is Exit e as a spanning subdigraph:

[Figure: the spanning subdigraph Exit e of D, consisting of the arcs h, j, k, l.]
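
The construction of Exit e is easy to replay by machine. The following is a minimal Python sketch under stated assumptions: the dictionary ARCS (arc name to source and target) is read off from the circuit e above, and the names ARCS, CIRCUIT, ROOT and last_exits are hypothetical helpers introduced only for this illustration.

```python
# Arc name -> (source, target), read off from the circuit e of Example 5.11.1.
ARCS = {
    "a": (1, 2), "b": (2, 3), "c": (3, 4), "d": (4, 5),
    "e": (5, 1), "f": (1, 3), "g": (3, 3), "h": (3, 5),
    "i": (5, 5), "j": (5, 2), "k": (2, 4), "l": (4, 1),
}
CIRCUIT = list("abcdefghijkl")  # the arcs of e, in order of appearance
ROOT = 1                        # the vertex r


def last_exits(arcs, circuit, root):
    """Return {u: e(u)} for all vertices u != root."""
    exits = {}
    for arc in circuit:
        source = arcs[arc][0]
        if source != root:
            exits[source] = arc  # a later exit from u overwrites an earlier one
    return exits


exits = last_exits(ARCS, CIRCUIT, ROOT)
print(exits)  # {2: 'k', 3: 'h', 4: 'l', 5: 'j'}
```

The printed dictionary agrees with the values e (2) = k, e (3) = h, e (4) = l, e (5) = j found above.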

Now, we claim the following:

Claim 1: Let e be an a-Eulerian circuit. Then, the set Exit e is a sparb.

Claim 2: For each sparb B (regarded as a subset of A), there are ex-
actly $\prod_{u \in V} (\deg^+ u - 1)!$ many a-Eulerian circuits e such that Exit e = B.

[Proof of Claim 1: The set Exit e contains exactly one outgoing arc (namely,
e (u)) from each vertex u ∈ V \ {r }, and no outgoing arc from r. Thus, |Exit e| =
|V | − 1.
Let us number the arcs of e as a1 , a2 , . . . , am , in the order in which they appear
in e. (Thus, a1 = a, since the first arc of e is a.)
Recall that the arcs in Exit e are the arcs e (u) for all u ∈ V \ {r } (defined as
above – i.e., the arc e (u) is the last exit of e from u). We shall refer to these arcs
as the last-exit arcs.
For each u ∈ V \ {r }, we let j (u) be the unique number i ∈ {1, 2, . . . , m} such
that e (u) = ai . (This i indeed exists and is unique, since each arc of D appears
exactly once on e.) Thus, j (u) tells us how late in the Eulerian circuit e the arc
e (u) appears. Since e (u) is the last exit of e from u, the Eulerian circuit e never
visits the vertex u again after this.
Thus, if a last-exit arc e (u) has target v ≠ r, then

j ( u) < j ( v ) (21)

(because the arc e (u) leads the circuit e into the vertex v, which the circuit then
has to exit at least once; therefore, the corresponding last-exit arc e (v) has to
appear later in e than the arc e (u)).

We shall now show that r is a to-root of Exit e (that is, of the spanning subdi-
graph (V, Exit e, ψ |Exit e )). To this purpose, we must show that for each vertex
v ∈ V, there is a path from v to r in the digraph (V, Exit e, ψ |Exit e ).
Indeed, let v ∈ V be any vertex. We must find a path from v to r in the
digraph (V, Exit e, ψ |Exit e ). It will suffice to find a walk from v to r in this
digraph (by Corollary 4.5.8). In other words, we must find a way to walk from
v to r in D using last-exit arcs only.
So we start walking at v. If v = r, then we are already done. Otherwise, we
have v ∈ V \ {r }, so that the arc e (v) and the number j (v) are well-defined.
We thus take the arc e (v). This brings us to a vertex v′ (namely, the target of
e (v)) that satisfies j (v) < j (v′ ) (by (21)). If this vertex v′ is r, then we are done.
If not, then e (v′ ) and j (v′ ) are well-defined, so we continue our walk by taking
the arc e (v′ ). This brings us to a further vertex v′′ (namely, the target of e (v′ ))
that satisfies j (v′ ) < j (v′′ ) (by (21)). If this vertex v′′ is r, then we are done.
Otherwise, we proceed as before. We thus construct a walk

(v, e (v) , v′ , e (v′ ) , v′′ , e (v′′ ) , . . .)

that either goes on indefinitely or stops at the vertex r.


However, this walking process cannot go on forever (since the chain of in-
equalities j (v) < j (v′ ) < j (v′′ ) < · · · would force the numbers j (v) , j (v′ ) , j (v′′ ) , . . .
to be all distinct, but there are only m distinct numbers in {1, 2, . . . , m}). Thus,
it must stop at the vertex r. So we have found a walk from v to r using last-exit
arcs only. Thus, Exit e has a walk from v to r. Hence, Exit e has a path from v
to r.
Forget that we fixed v. We thus have shown that for each vertex v ∈ V,
there is a path from v to r in the digraph (V, Exit e, ψ |Exit e ). In other words,
r is a to-root of Exit e. Hence, we conclude (using the implication A’2 ⟹ A’1
in Theorem 5.10.5) that Exit e is an arborescence rooted to r (since |Exit e| =
|V | − 1). Therefore, Exit e is a sparb. This proves Claim 1.]
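
The walking argument in this proof can be traced on the circuit of Example 5.11.1. The sketch below is only an illustration under assumptions: ARCS is read off from that example, and exit_arc, j and walk_to_root are hypothetical names. It follows last-exit arcs from a vertex v until it reaches r, and checks inequality (21) at every step along the way.

```python
# Arc name -> (source, target), read off from Example 5.11.1.
ARCS = {
    "a": (1, 2), "b": (2, 3), "c": (3, 4), "d": (4, 5),
    "e": (5, 1), "f": (1, 3), "g": (3, 3), "h": (3, 5),
    "i": (5, 5), "j": (5, 2), "k": (2, 4), "l": (4, 1),
}
CIRCUIT = list("abcdefghijkl")  # a_1, a_2, ..., a_m
ROOT = 1

# position[a_i] = i, so that j[u] is the position of the last exit e(u).
position = {arc: i + 1 for i, arc in enumerate(CIRCUIT)}
exit_arc = {}
for arc in CIRCUIT:
    if ARCS[arc][0] != ROOT:
        exit_arc[ARCS[arc][0]] = arc
j = {u: position[arc] for u, arc in exit_arc.items()}


def walk_to_root(v):
    """Walk from v to the root using last-exit arcs only."""
    walk = [v]
    while v != ROOT:
        arc = exit_arc[v]
        target = ARCS[arc][1]
        if target != ROOT:
            assert j[v] < j[target]  # inequality (21)
        walk += [arc, target]
        v = target
    return walk


print(walk_to_root(3))  # [3, 'h', 5, 'j', 2, 'k', 4, 'l', 1]
```

Since the numbers j (v) strictly increase along the walk, the while loop cannot run forever, which is exactly the termination argument of the proof.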
[Proof of Claim 2: Let B be a sparb. (As before, B is a set of arcs, and we
identify it with the spanning subdigraph (V, B, ψ | B ).)
We must prove that there are exactly $\prod_{u \in V} (\deg^+ u - 1)!$ many a-Eulerian cir-
cuits e such that Exit e = B.
We shall refer to the arcs in B as the B-arcs. Recall that B is an arborescence
rooted to r (since B is a sparb). Hence, by the implication A’1 ⟹ A’6 in Theorem
5.10.5, we see that the outdegrees of its vertices satisfy

$\deg^+_B r = 0$, and $\deg^+_B v = 1$ for all v ∈ V \ {r }

(where $\deg^+_B v$ means the outdegree of a vertex v in the digraph (V, B, ψ | B )).
In other words, there is no B-arc with source r; however, for each vertex u ∈
V \ {r }, there is exactly one B-arc with source u.
Now, we are trying to count the a-Eulerian circuits e such that Exit e = B.

Let us try to construct such an a-Eulerian circuit e as follows:


A turtle wants to walk through the digraph D using each arc of D at most
once. It starts its walk by heading out from the vertex r along the arc a. From
that point on, it proceeds in the usual way you would walk on a digraph: Each
time it reaches a vertex, it chooses an arbitrary arc leading out of this vertex,
observing the following two rules:
1. It never uses an arc that it has already used before.
2. It never uses a B-arc unless it has to (i.e., unless this B-arc is the only
outgoing arc from its current position that is still unused).
Clearly, the turtle will eventually get stuck at some vertex (with no more arcs
left to continue walking along), since D has only finitely many arcs.
Let w be the total walk that the turtle has traced by the time it got stuck.
Thus, w is a trail (i.e., a walk that uses no arc more than once) that starts with
the vertex r and the arc a.
We will soon see that w is an a-Eulerian circuit satisfying Exit w = B. First,
however, let us see an example:
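Example 5.11.2. Let D be the multidigraph

[Figure: the multidigraph D with vertices 1, 2, 3, 4, 5 and arcs a, b, c, d, e, f, g, h, i, j, k, l (the same picture as in Example 5.11.1), with the arcs d, e, h, k drawn bold and in red],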

and let r = 1 and a = a (we called it a on purpose). Let B be the set { d, e, h, k},
regarded as a spanning subdigraph of D. (The arcs of B are drawn bold and
in red in the above picture.)
The turtle starts at r = 1 and walks along the arc a. This leads it to the
vertex 2. It now must choose between the arcs b and k, but since it must not
use the B-arc k unless it has to, it is actually forced to take the arc b next.
This brings it to the vertex 3. It now has to choose between the arcs c, g and
h, but again the arc h is disallowed because it is not yet time to use a B-arc.
Let us say that it takes the arc g. This brings it back to the vertex 3. Next, the
turtle must walk along c (since g is already used, while the B-arc still must

wait until it is the only option). This brings it to the vertex 4. Its next step is
to take the arc l to the vertex 1. From there, it follows the arc f to the vertex
3. Now, it can finally take the B-arc h, since all the other outgoing arcs from
3 have already been used. This brings it to the vertex 5. Now it has a choice
between the arcs e, i and j, but the arc e is disallowed because it is a B-arc.
Let us say it decides to use the arc j. This brings it to the vertex 2. From
there, it takes the B-arc k to the vertex 4 (since it has no other options). From
there, it continues along the B-arc d to the vertex 5. Now, it has to traverse
the loop i, and then leave 5 along the B-arc e to come back to 1. At this point,
the turtle is stuck, since it has nowhere left to go. The walk w we obtained is
thus
w = (1, a, 2, b, 3, g, 3, c, 4, l, 1, f , 3, h, 5, j, 2, k, 4, d, 5, i, 5, e, 1) .
(Of course, other choices would have led to other walks.)
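
The turtle's two rules translate directly into a short simulation. What follows is a minimal sketch, not a definitive implementation: ARCS is read off from Example 5.11.1, and turtle_walk together with its parameter choose (the rule for picking among the allowed arcs, here simply the alphabetically smallest one) are assumptions made for illustration. With this rule the turtle takes c at its first visit to 3, so the walk differs from the one in the example, as expected.

```python
# Arc name -> (source, target), read off from Example 5.11.1.
ARCS = {
    "a": (1, 2), "b": (2, 3), "c": (3, 4), "d": (4, 5),
    "e": (5, 1), "f": (1, 3), "g": (3, 3), "h": (3, 5),
    "i": (5, 5), "j": (5, 2), "k": (2, 4), "l": (4, 1),
}
B = {"d", "e", "h", "k"}  # the sparb of Example 5.11.2


def turtle_walk(arcs, root, first_arc, b_arcs, choose=min):
    """Simulate the turtle; `choose` picks one of the currently allowed arcs."""
    used = {first_arc}
    here = arcs[first_arc][1]
    walk = [root, first_arc, here]
    while True:
        # Rule 1: only arcs that have not been used yet.
        unused = [x for x, (src, _) in arcs.items() if src == here and x not in used]
        if not unused:
            return walk  # the turtle is stuck
        # Rule 2: a B-arc is allowed only if nothing else is left.
        non_b = [x for x in unused if x not in b_arcs]
        arc = choose(non_b or unused)
        used.add(arc)
        here = arcs[arc][1]
        walk += [arc, here]


w = turtle_walk(ARCS, 1, "a", B)
print(w)
```

Passing a different function as choose (for instance max) corresponds to the turtle making other decisions, and leads to other walks.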
Returning to the general case, let us analyze the walk w traversed by the
turtle.

• First, we claim that w is a closed walk (i.e., ends at r).


[Proof: Assume the contrary. Let u be the ending point of w. Thus, u
is the vertex at which the turtle gets stuck. Moreover, u ≠ r (since we
just assumed that w is not a closed walk). Hence, the walk w enters the
vertex u more often than it leaves it (since it ends but does not start at u).
In other words, the turtle has entered the vertex u more often than it has
left it. However, since D is balanced, we have deg− u = deg+ u. The turtle
has entered the vertex u at most deg− u times (because it cannot use an
arc twice, but there are only deg− u many arcs with target u). Thus, it has
left the vertex u less than deg− u times (because it has entered the vertex
u more often than it has left it). Since deg− u = deg+ u, this means that
the turtle has left the vertex u less than deg+ u times. Thus, by the time
the turtle has gotten stuck at u, there is at least one outgoing arc from u
that has not been used by the turtle. Therefore, the turtle is not actually
stuck at u. This is a contradiction. Thus, our assumption was wrong, so
we have proved that w is a closed walk.]

In other words, w is a circuit. We shall next show that w is an Eulerian
circuit.
To do so, we introduce one more piece of notation: A vertex u of D will be
called exhausted if the turtle has used each outgoing arc from u (that is, if each
outgoing arc from u is used in the circuit w).
Since w is a circuit, the ending point of w is its starting point, i.e., the vertex
r. Thus, the turtle must have gotten stuck at r. Hence, the vertex r is exhausted.

• We shall now show that all vertices of D are exhausted.



[Proof: Assume the contrary. Thus, there exists a vertex u of D that is
not exhausted. Consider this u. But B is a sparb, thus an arborescence
rooted to r. Hence, r is a to-root of B. Therefore, there exists a path
p = ( p0 , b1 , p1 , b2 , p2 , . . . , bk , pk ) from u to r in B. Consider this path. Thus,
we have p0 = u and pk = r, and all the arcs b1 , b2 , . . . , bk belong to B.
There exists at least one i ∈ {0, 1, . . . , k} such that the vertex pi is ex-
hausted (for instance, i = k qualifies, since pk = r is exhausted). Consider
the smallest such i. Then, pi ≠ p0 (since pi is exhausted, but p0 = u is
not). Hence, i ≠ 0, so that i ≥ 1. Therefore, pi −1 exists. Moreover, the vertex
pi −1 is not exhausted (since i was defined to be the smallest element
of {0, 1, . . . , k} such that pi is exhausted).
The arc bi has source pi −1 and target pi . Thus, it is an outgoing arc from
pi −1 and incoming arc into pi . Furthermore, it belongs to B (since all the
arcs b1 , b2 , . . . , bk belong to B).
The digraph D is balanced; thus, deg+ ( pi ) = deg− ( pi ).
The vertex pi is exhausted. In other words, the turtle has used each out-
going arc from pi (by the definition of “exhausted”). Since the turtle never
reuses an arc, this entails that the turtle has used exactly deg+ ( pi ) many
outgoing arcs from pi (since deg+ ( pi ) is the total number of outgoing
arcs from pi in D). In other words, it has used exactly deg− ( pi ) many
outgoing arcs from pi (since deg+ ( pi ) = deg− ( pi )).
However, the turtle’s trajectory is a closed walk (in fact, it is the walk w,
which is closed). Thus, it must enter the vertex pi as often as it leaves this
vertex. In other words, the number of incoming arcs into pi used by the
turtle must equal the number of outgoing arcs from pi used by the turtle.
Since we just found (in the preceding paragraph) that the latter number
is deg− ( pi ), we thus conclude that the former number is deg− ( pi ) as
well. In other words, the turtle must have used exactly deg− ( pi ) many
incoming arcs into pi . Since deg− ( pi ) is the total number of incoming arcs
into pi in D, we thus conclude that the turtle must have used all incoming
arcs into pi (since the turtle never reuses an arc).
Hence, in particular, the turtle must have used the arc bi (since bi is an
incoming arc into pi ). This arc bi is an outgoing arc from pi −1 . But bi is
a B-arc, and thus our turtle uses this arc only as a last resort (i.e., after
using all other outgoing arcs from pi −1 ). Hence, we conclude that the
turtle must have used all outgoing arcs from pi −1 (since it has used bi ).
In other words, pi −1 is exhausted. But this contradicts the fact that pi −1
is not exhausted! This shows that our assumption was wrong, and our
proof is finished.47 ]

Thus, we have shown that all vertices of D are exhausted. In other words,
the turtle has used all arcs of D. In other words, the trail w contains all arcs of
D. Since w is a trail and a closed walk, this entails that w is an Eulerian circuit
of D. Since w starts with r and a, this shows further that w is an a-Eulerian
circuit. Since the turtle only used B-arcs as a last resort (and it used each B-arc
eventually, because w is Eulerian), we have Exit w = B.
Thus, the turtle’s walk has produced an a-Eulerian circuit e satisfying Exit e =
B (namely, the walk w). However, this circuit depends on some decisions the
turtle made during its walk. Namely, every time the turtle was at some vertex
u ∈ V, it had to decide which arc to take next; this arc had to be an unused arc
with source u, subject to the conditions that

1. if u ≠ r, then the B-arc48 has to be used last;

2. if u = r, then the arc a has to be used first.

Let us count how many options the turtle has had in total. To make the
argument clearer, we modify the procedure somewhat: Instead of deciding ad
hoc which arc to take, the turtle should now make all these decisions before
embarking on its journey. To do so, it chooses, for each vertex u ∈ V, a total
order on the set of all arcs with source u, such that

1. if u ≠ r, then the B-arc comes last in this order, and

2. if u = r, then the arc a comes first in this order.

47 For the sake of diversity, let me sketch a second proof of the same claim (i.e., that all vertices
in D are exhausted):
Assume the contrary. Thus, there exists a non-exhausted vertex u of D. Consider this u.
Then, u ≠ r (since r is exhausted but u is not). Since u is not exhausted, there is at least
one outgoing arc from u that the turtle has not used. Hence, the turtle has not used the
B-arc outgoing from u (since the turtle never uses a B-arc before it has to). Let f be this
B-arc, and let u′ be its target. Thus, the turtle has not used all incoming arcs of u′ (because
it has not used the arc f ). As a consequence, it has not used all outgoing arcs from u′ either
(because the turtle has left u′ as often as it has entered u′ , but the balancedness of D entails
that deg− (u′ ) = deg+ (u′ )). In other words, the vertex u′ is non-exhausted.
Thus, by starting at the non-exhausted vertex u and taking the B-arc outgoing from u,
we have arrived at a further non-exhausted vertex u′ . Applying the same argument to u′
instead of u, we can take a further B-arc and arrive at a further non-exhausted vertex u′′ .
Continuing like this, we obtain an infinite sequence (u, u′ , u′′ , . . .) of non-exhausted vertices
such that any vertex in this sequence is reached from the previous one by traveling along a
B-arc. Clearly, this sequence must have two equal vertices (since D has only finitely many
vertices). For example, let’s say that u′′ = u′′′′′ . Then, if we consider only the part of the
sequence between u′′ and u′′′′′ , then we obtain a closed walk

(u′′ , ∗, u′′′ , ∗, u′′′′ , ∗, u′′′′′ ) ,

where each asterisk stands for some B-arc (not the same one, of course). This is a closed
walk of the digraph (V, B, ψ | B ). Since this closed walk has length > 0, it cannot be a path;
therefore, it contains a cycle (by Proposition 4.5.9). Thus, we have found a cycle of the
digraph (V, B, ψ | B ). However, the digraph (V, B, ψ | B ) is an arborescence, and thus has no
cycles (because if D is an arborescence, then any cycle of D would be a cycle of Dund ; but
the multigraph Dund has no cycles by the definition of an arborescence). The previous two
sentences contradict each other. This shows that our assumption was wrong, and our proof
is finished.
48 We say “the B-arc”, because there is exactly one B-arc with source u.



Note that this total order can be chosen in $(\deg^+ u - 1)!$ many ways (since
there are deg+ u arcs with source u, and we can freely choose their order except
that one of them has a fixed position). Thus, in total, there are $\prod_{u \in V} (\deg^+ u - 1)!$
many options for how the turtle can choose all these orders. Once these orders
have been chosen, the turtle then uses them to decide which arcs to walk along:
Namely, the first time it visits the vertex u, it leaves it along the first arc (ac-
cording to its chosen order); the second time, it uses the second arc; the third
time, the third arc; and so on. 
So the turtle has $\prod_{u \in V} (\deg^+ u - 1)!$ many options, and each of these options
leads to a different a-Eulerian circuit e (because the total orders chosen by the
turtle are reflected in e: they are precisely the orders in which the respective
arcs appear in e). Moreover, each a-Eulerian circuit e satisfying Exit e = B
comes from one of these options49 .
Therefore, the total number of a-Eulerian circuits e satisfying Exit e = B is the
total number of options, which is $\prod_{u \in V} (\deg^+ u - 1)!$ as we know. This proves
Claim 2.]
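
The counting step can be checked on Example 5.11.2 by letting the turtle fix its total orders in advance, in every allowed way. The following Python sketch is an illustration under assumptions: ARCS is read off from Example 5.11.1, and allowed_orders and follow are hypothetical helper names. It runs the turtle once for each choice of orders and collects the resulting circuits.

```python
from itertools import permutations, product
from math import factorial, prod

# Arc name -> (source, target), read off from Example 5.11.1.
ARCS = {
    "a": (1, 2), "b": (2, 3), "c": (3, 4), "d": (4, 5),
    "e": (5, 1), "f": (1, 3), "g": (3, 3), "h": (3, 5),
    "i": (5, 5), "j": (5, 2), "k": (2, 4), "l": (4, 1),
}
R, A, B = 1, "a", {"d", "e", "h", "k"}
VERTICES = sorted({v for pair in ARCS.values() for v in pair})
OUT = {u: [x for x, (src, _) in ARCS.items() if src == u] for u in VERTICES}


def allowed_orders(u):
    """Total orders on the arcs with source u: a first at r, the B-arc last elsewhere."""
    for order in permutations(OUT[u]):
        if (u == R and order[0] == A) or (u != R and order[-1] in B):
            yield order


def follow(orders):
    """Leave each vertex along its arcs in the pre-chosen order until stuck."""
    next_index = {u: 0 for u in VERTICES}
    here, circuit = R, []
    while next_index[here] < len(orders[here]):
        arc = orders[here][next_index[here]]
        next_index[here] += 1
        circuit.append(arc)
        here = ARCS[arc][1]
    return tuple(circuit)


circuits = {
    follow(dict(zip(VERTICES, choice)))
    for choice in product(*(list(allowed_orders(u)) for u in VERTICES))
}
expected = prod(factorial(len(OUT[u]) - 1) for u in VERTICES)
print(len(circuits), expected)  # 4 4
```

Here the outdegrees are 2, 2, 3, 2, 3, so the product is 1! · 1! · 2! · 1! · 2! = 4, and the four choices of orders indeed give four distinct circuits.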
With Claims 1 and 2 proved, we are almost done. The map

{ a-Eulerian circuits of D } → {sparbs} ,
e ↦ Exit e

is well-defined (by Claim 1). Furthermore, Claim 2 shows that this map is a
$\prod_{u \in V} (\deg^+ u - 1)!$-to-1 correspondence50 (i.e., each sparb B has exactly
$\prod_{u \in V} (\deg^+ u - 1)!$ many preimages under this map). Thus, by the multijection
principle51 , we conclude that52

$(\text{\# of } a\text{-Eulerian circuits of } D) = \left( \prod_{u \in V} (\deg^+ u - 1)! \right) \cdot (\text{\# of sparbs})$.
49 Proof. Let e be an a-Eulerian circuit satisfying Exit e = B. Then, by choosing the appropriate
total orders ahead of its journey, the turtle will trace this exact circuit e. (Of course, the
“appropriate total orders” are the ones dictated by e: That is, for each vertex u ∈ V, the
turtle must pick the same total order on the set of all arcs with source u in which they appear
on e. This choice is legitimate, because the arc a is the first arc of e (so it will certainly come
first in its order), and because each B-arc appears in e after all other arcs from the same
source have appeared (so it will come last in its total order).)
50 An m-to-1 correspondence (where m is a nonnegative integer) means a map f : X → Y
between two sets such that each element of Y has exactly m preimages under f .
51 The multijection principle is a basic counting principle that says the following: Let X and Y
be two finite sets, and let m ∈ N. Let f : X → Y be an m-to-1 correspondence (i.e., a map
such that each element of Y has exactly m preimages under f ). Then, | X | = m · |Y |.
For example, n (intact) sheep have 4n legs in total, since the map that sends each leg to
its sheep is a 4-to-1 correspondence.
52 The symbol “#” means “number”.

Since ε ( D, a) = (# of a-Eulerian circuits of D ) and τ ( D, r ) = (# of sparbs), we
can rewrite this as follows:

$\varepsilon(D, a) = \left( \prod_{u \in V} (\deg^+ u - 1)! \right) \cdot \tau(D, r) = \tau(D, r) \cdot \prod_{u \in V} (\deg^+ u - 1)!$.

This proves Theorem 5.10.4.
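
As a sanity check, the identity of Theorem 5.10.4 can be verified by brute force on the multidigraph of Example 5.11.1. The sketch below is an illustration only: ARCS is read off from that example, and count_eulerian_circuits and count_sparbs are hypothetical names. A sparb is recognized by the criterion used in the proof above: exactly one outgoing arc from each vertex other than r, and a walk from each vertex to r.

```python
from itertools import product
from math import factorial, prod

# Arc name -> (source, target), read off from Example 5.11.1.
ARCS = {
    "a": (1, 2), "b": (2, 3), "c": (3, 4), "d": (4, 5),
    "e": (5, 1), "f": (1, 3), "g": (3, 3), "h": (3, 5),
    "i": (5, 5), "j": (5, 2), "k": (2, 4), "l": (4, 1),
}
VERTICES = sorted({v for pair in ARCS.values() for v in pair})
OUT = {u: [x for x, (src, _) in ARCS.items() if src == u] for u in VERTICES}


def count_eulerian_circuits(first_arc):
    """Count the Eulerian circuits whose first arc is `first_arc`, by backtracking."""
    root = ARCS[first_arc][0]

    def extend(here, used):
        if len(used) == len(ARCS):
            return 1 if here == root else 0
        return sum(extend(ARCS[x][1], used | {x}) for x in OUT[here] if x not in used)

    return extend(ARCS[first_arc][1], frozenset([first_arc]))


def count_sparbs(root):
    """Count the spanning arborescences rooted to `root`."""
    others = [u for u in VERTICES if u != root]
    count = 0
    for choice in product(*(OUT[u] for u in others)):
        step = {u: ARCS[x][1] for u, x in zip(others, choice)}

        def reaches_root(v):
            seen = set()
            while v != root and v not in seen:
                seen.add(v)
                v = step[v]
            return v == root

        count += all(reaches_root(u) for u in others)
    return count


eps = count_eulerian_circuits("a")
tau = count_sparbs(1)
factor = prod(factorial(len(OUT[u]) - 1) for u in VERTICES)
print(eps, tau, factor)  # 44 11 4
```

The three printed numbers satisfy 44 = 11 · 4, in agreement with the theorem.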


Proof of Theorem 5.9.1. As we already mentioned, Theorem 5.9.1 follows from
Theorem 5.10.4 by reversing each arc (i.e., by applying Theorem 5.10.4 to the
digraph $D^{\mathrm{rev}}$ instead of D).

5.12. A corollary about spanning arborescences


Before we actually use the BEST (or BEST’) theorem to count the Eulerian cir-
cuits on any digraph, let us mention a neat corollary for the number of spanning
arborescences:

Corollary 5.12.1. Let D = (V, A, ψ) be a balanced multidigraph. For each
vertex r ∈ V, let τ ( D, r ) be the number of spanning arborescences of D
rooted to r. Then, τ ( D, r ) does not depend on r.

Proof of Corollary 5.12.1. WLOG assume that |V | > 1 (else, the claim is obvious).
If there is a vertex v ∈ V with deg+ v = 0, then this vertex v satisfies deg− v = 0
as well (since the balancedness of D entails deg− v = deg+ v = 0), and therefore
D has no spanning arborescences at all (since any spanning arborescence would
have an arc with source or target v). Thus, we WLOG assume that deg+ v > 0
for all v ∈ V. In other words, each vertex has outdegree > 0.
Let r and s be two vertices of D. We must prove that τ ( D, r ) = τ ( D, s).
Pick an arc a with source r. (This exists, since deg+ r > 0.) Pick an arc b with
source s. (This exists, since deg+ s > 0.)
Applying the BEST’ theorem (Theorem 5.10.4), we get

$\varepsilon(D, a) = \tau(D, r) \cdot \prod_{u \in V} (\deg^+ u - 1)!$ and similarly

$\varepsilon(D, b) = \tau(D, s) \cdot \prod_{u \in V} (\deg^+ u - 1)!$.

However, ε ( D, a) = ε ( D, b), since counting Eulerian circuits that start with a is
equivalent to counting Eulerian circuits that start with b (because an Eulerian
circuit can be rotated uniquely to start with any given arc). Thus, we obtain

$\tau(D, r) \cdot \prod_{u \in V} (\deg^+ u - 1)! = \varepsilon(D, a) = \varepsilon(D, b) = \tau(D, s) \cdot \prod_{u \in V} (\deg^+ u - 1)!$.

Cancelling the (nonzero!) number $\prod_{u \in V} (\deg^+ u - 1)!$ from this equality, we ob-
tain τ ( D, r ) = τ ( D, s). This proves Corollary 5.12.1.
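
The rotation step in this proof (the reason why ε ( D, a) = ε ( D, b)) is simple enough to spell out. In the following sketch, a circuit is stored as the list of its arcs; ARCS is read off from Example 5.11.1, and is_eulerian_circuit and rotate_to_start are hypothetical helper names chosen for illustration.

```python
# Arc name -> (source, target), read off from Example 5.11.1.
ARCS = {
    "a": (1, 2), "b": (2, 3), "c": (3, 4), "d": (4, 5),
    "e": (5, 1), "f": (1, 3), "g": (3, 3), "h": (3, 5),
    "i": (5, 5), "j": (5, 2), "k": (2, 4), "l": (4, 1),
}


def is_eulerian_circuit(arcs, circuit):
    """Check that the arc list uses each arc exactly once and closes up."""
    if sorted(circuit) != sorted(arcs):
        return False
    following = circuit[1:] + circuit[:1]  # the cyclic successor of each arc
    return all(arcs[x][1] == arcs[y][0] for x, y in zip(circuit, following))


def rotate_to_start(circuit, arc):
    """Rotate the circuit so that it starts with the given arc."""
    i = circuit.index(arc)
    return circuit[i:] + circuit[:i]


e = list("abcdefghijkl")          # an a-Eulerian circuit
rotated = rotate_to_start(e, "h")  # the corresponding h-Eulerian circuit
print("".join(rotated))            # hijklabcdefg
print(is_eulerian_circuit(ARCS, rotated), rotate_to_start(rotated, "a") == e)
```

Rotating to start with b and rotating back to start with a are mutually inverse, which is why the two sets of circuits have the same size.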

5.13. Spanning arborescences vs. spanning trees


The BEST theorem (Theorem 5.10.4 or Theorem 5.9.1) connects the # of Eulerian
circuits in a digraph with the # of spanning arborescences of the same digraph.
Now let us try to find a way to compute the latter.
For example, let us try to do this for digraphs of the form Gbidir where G is a
multigraph. I claim that the spanning arborescences of Gbidir rooted to a given
vertex r are just the spanning trees of G in disguise:

Proposition 5.13.1. Let G = (V, E, ϕ) be a multigraph. Fix a vertex r ∈ V.
Recall that the arcs of Gbidir are the pairs (e, i ) ∈ E × {1, 2}. Identify each
spanning tree of G with its edge set, and each spanning arborescence of
Gbidir with its arc set.
If B is a spanning arborescence of Gbidir rooted to r, then we set

$\overline{B}$ := { e | (e, i ) ∈ B} .

(Recall that we are identifying spanning arborescences with their arc sets, so
that “(e, i ) ∈ B” means “(e, i ) is an arc of B”.)
Then:

(a) If B is a spanning arborescence of Gbidir rooted to r, then $\overline{B}$ is a spanning
tree of G.

(b) The map

{spanning arborescences of Gbidir rooted to r } → {spanning trees of G } ,
B ↦ $\overline{B}$

is a bijection.

Example 5.13.2. Here is a multigraph G (on the left) with the corresponding
multidigraph Gbidir (on the right):

[Figure: a multigraph G on the vertices 1, 2, 3, 4, 5 (left) and the multidigraph Gbidir (right).]

Here is a spanning arborescence B of Gbidir rooted to 1, and the correspond-
ing spanning tree $\overline{B}$ of G:

[Figure: the spanning arborescence B of Gbidir and the spanning tree $\overline{B}$ of G.]

(here, the arcs of Gbidir that don’t belong to B, as well as the edges of G that
don’t belong to $\overline{B}$, have been drawn as dotted arrows). It is fairly easy to see
how B can be reconstructed from $\overline{B}$: You just need to replace each edge of $\overline{B}$
by the appropriately directed arc (namely, the one that is “directed towards
1”).

Proof of Proposition 5.13.1. This is an exercise in yak-shaving (and we have, in
fact, shaved a very similar yak in Section 5.7; the only difference is that we are
no longer dealing with trees in isolation, but rather with spanning trees of G).
(a) Let B be a spanning arborescence of Gbidir rooted to r. Then, Bund is a tree
(by the implication A’1 ⟹ A’3 in Theorem 5.10.5). However, it is easy to see
that Bund ≅ $\overline{B}$ as multigraphs (indeed, each vertex v of Bund corresponds to the
same vertex v of $\overline{B}$, whereas any edge (e, i ) of Bund corresponds to the edge e
of $\overline{B}$) 53 . Thus, $\overline{B}$ is a tree (since Bund is a tree)54 , therefore a spanning tree of
G (since $\overline{B}$ is clearly a spanning subgraph of G). This proves Proposition 5.13.1
(a).
(b) We must prove that this map is surjective and injective.
Surjectivity: Let T be a spanning tree of G. Then, the multidigraph T r→
(defined in Definition 5.7.4) is an arborescence rooted from r (by Lemma 5.7.7).
Reversing each arc in this arborescence T r→ , we obtain a new multidigraph
T r← , which is thus an arborescence rooted to r. Unfortunately, T r← is not a
subdigraph of Gbidir , for a rather stupid reason: The arcs of T r← are elements
of E, whereas the arcs of Gbidir are pairs of the form (e, i ) with e ∈ E and
i ∈ {1, 2}.
Fortunately, this is easily fixed: For each arc e of T r← , we let e′ be the arc
(e, i ) of Gbidir that has the same source as e (and thus the same target as e). This
is uniquely determined, since the arcs (e, 1) and (e, 2) of Gbidir have different
sources55 . If we replace each arc e of T r← by the corresponding arc e′ of Gbidir ,
then we obtain a spanning subdigraph S of Gbidir that is an arborescence rooted
to r (since T r← is an arborescence rooted to r, and we have only replaced its
arcs by equivalent ones with the same sources and the same targets). In other
words, we obtain a spanning arborescence S of Gbidir rooted to r. It is easy to
see that $\overline{S}$ = T. Hence, the map

{spanning arborescences of Gbidir rooted to r } → {spanning trees of G } ,
B ↦ $\overline{B}$

53 Here we need to use the fact that for each edge e of $\overline{B}$, exactly one of the two pairs (e, 1) and
(e, 2) is an edge of Bund . But this is easy to check: At least one of the two pairs (e, 1) and
(e, 2) must be an arc of B (since e is an edge of $\overline{B}$). In other words, at least one of the two
pairs (e, 1) and (e, 2) must be an edge of Bund . But both of these pairs cannot be edges of
Bund at the same time (since this would create a cycle, but Bund is a tree and thus has no
cycles). Hence, exactly one of these pairs is an edge of Bund , qed.
54 Alternatively, you can prove this as follows: The vertex r is a to-root of B (since B is an
arborescence rooted to r). Thus, for each v ∈ V, there is a path from v to r in B. By “project-
ing” this path onto $\overline{B}$ (that is, replacing each arc (e, i ) of this path by the corresponding edge
e of $\overline{B}$), we obtain a path from v to r in $\overline{B}$. This shows that the multigraph $\overline{B}$ is connected.
Furthermore, the definition of $\overline{B}$ shows that $|\overline{B}| \le |B| = |V| - 1$ (by Statement A’2 in The-
orem 5.10.5, since B is an arborescence rooted to r). Hence, $|\overline{B}| < |V|$. Thus, we can apply
the implication T5 ⟹ T1 of the Tree Equivalence Theorem (Theorem 5.2.4) to conclude that
$\overline{B}$ is a tree.
55 Proof. The edge e of T is not a loop (because T is a tree, but a tree cannot have any loops).

Hence, its two endpoints are distinct. Thus, the arcs (e, 1) and (e, 2) of Gbidir have different
sources (since their sources are the two endpoints of e).

sends S to T. This shows that T is a value of this map. Since we have proved this
for every spanning tree T of G, we have thus shown that this map is surjective.
Injectivity: The main idea is that, in order to recover a spanning arborescence
B back from the corresponding spanning tree $\overline{B}$, we just need to “orient the
edges of the tree towards r”. Here are the (annoyingly long) details:
Let B and C be two sparbs56 such that $\overline{B} = \overline{C}$. We must show that B = C.
Assume the contrary. Thus, B ≠ C. Let T be the tree $\overline{B} = \overline{C}$. Thus, each edge
e of T corresponds to either an arc (e, 1) or an arc (e, 2) in B (since T = $\overline{B}$), and
likewise for C. Conversely, each arc (e, i ) of B or of C corresponds to an edge e
of T. Hence, from B ≠ C, we see that there must exist an edge e of T such that

• either we have (e, 1) ∈ B and (e, 2) ∈ C,

• or we have (e, 1) ∈ C and (e, 2) ∈ B.

Consider this edge e. We WLOG assume that (e, 1) ∈ B and (e, 2) ∈ C (else,
we can just swap B with C). Let the arc (e, 1) of Gbidir have source s and target
t, so that (e, 2) has source t and target s. The edge e thus has endpoints s and t.
Since B is an arborescence rooted to r, the vertex r is a to-root of B. Hence,
there exists a path p from s to r in B. This path p must begin with the arc (e, 1)
57 . Projecting this path p down onto T, we obtain a path $\overline{p}$ from s to r in T. (By
the word “projecting”, we mean replacing each arc (e, i ) by the corresponding
edge e. Clearly, doing this to a path in B yields a path in T, because T = $\overline{B}$.)
Since the path p begins with the arc (e, 1), the “projected” path $\overline{p}$ begins with
the edge e. Thus, in the tree T, the path from s to r begins with the edge e
(because this path must be the path $\overline{p}$). As a consequence, t must be the second
vertex of this path (since the edge e has endpoints s and t), so that removing the
first edge from this path yields the path from t to r. Thus, d (t, r ) = d (s, r ) − 1,
where d denotes distance on the tree T. Hence, d (t, r ) < d (s, r ).
A similar argument (but with the roles of B and C swapped, as well as the
roles of s and t swapped, and the roles of (e, 1) and (e, 2) swapped) shows that
d (s, r ) < d (t, r ). But this contradicts d (t, r ) < d (s, r ).
This contradiction shows that our assumption was false. Thus, we have
proved that B = C.

56 Henceforth, “sparb” is short for “spanning arborescence of Gbidir rooted to r”.
57 Proof. Since r is a to-root of B, we know that there exists a path from t to r in B. Let $\mathbf{t}$ be this
path. Extending this path $\mathbf{t}$ by the vertex s and the arc (e, 1) (both of which we insert at the start
of $\mathbf{t}$), we obtain a walk $\mathbf{t}'$ from s to r in B. (So, if $\mathbf{t}$ = (t, . . . , r ), then $\mathbf{t}'$ = (s, (e, 1) , t, . . . , r ).)
However, B is an arborescence rooted to r. Thus, Statement A’4 in the Dual Arborescence
Equivalence Theorem (Theorem 5.10.5) shows that for each vertex v ∈ V, the digraph B has
a unique walk from v to r. Hence, in particular, B has a unique walk from s to r. Thus,
p = $\mathbf{t}'$ (since both p and $\mathbf{t}'$ are walks from s to r in B). Since $\mathbf{t}'$ begins with the arc (e, 1), we
thus conclude that p begins with the arc (e, 1).

Forget that we fixed B and C. We thus have shown that if B and C are two
sparbs such that $\overline{B} = \overline{C}$, then B = C. In other words, our map

{spanning arborescences of Gbidir rooted to r } → {spanning trees of G } ,
B ↦ $\overline{B}$

is injective.
We have now shown that this map is both surjective and injective. Hence, it
is a bijection. This proves Proposition 5.13.1 (b).
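
To see the bijection of Proposition 5.13.1 on a concrete case, here is a small brute-force check in Python. Everything specific in it is an assumption made for illustration: the multigraph G (three vertices, four edges named p, q, s, t, two of them parallel), the convention that the arc (e, 1) goes from the first listed endpoint of e to the second, and the helper names is_spanning_tree and sparbs.

```python
from itertools import combinations, product

# A hypothetical multigraph G: edge name -> its two endpoints (s and t are parallel).
EDGES = {"p": (1, 2), "q": (2, 3), "s": (1, 3), "t": (1, 3)}
VERTICES = [1, 2, 3]
R = 1

# G^bidir: each edge e yields the two arcs (e, 1) and (e, 2) in opposite directions.
BIDIR = {}
for e, (u, v) in EDGES.items():
    BIDIR[(e, 1)] = (u, v)
    BIDIR[(e, 2)] = (v, u)


def is_spanning_tree(edge_set):
    """|V| - 1 edges form a spanning tree if and only if they connect all vertices."""
    reached, grew = {VERTICES[0]}, True
    while grew:
        grew = False
        for e in edge_set:
            u, v = EDGES[e]
            if (u in reached) != (v in reached):
                reached |= {u, v}
                grew = True
    return len(reached) == len(VERTICES)


def sparbs(root):
    """Yield the arc sets of the spanning arborescences of G^bidir rooted to root."""
    others = [u for u in VERTICES if u != root]
    out = {u: [x for x, (src, _) in BIDIR.items() if src == u] for u in others}
    for choice in product(*(out[u] for u in others)):
        step = {u: BIDIR[x][1] for u, x in zip(others, choice)}

        def reaches_root(v):
            seen = set()
            while v != root and v not in seen:
                seen.add(v)
                v = step[v]
            return v == root

        if all(reaches_root(u) for u in others):
            yield frozenset(choice)


trees = {frozenset(c) for c in combinations(EDGES, len(VERTICES) - 1) if is_spanning_tree(c)}
projections = [frozenset(e for e, _ in b) for b in sparbs(R)]  # B -> B-bar
print(len(projections), len(set(projections)), set(projections) == trees)
```

The list projections has no repeated entries (injectivity) and, as a set, equals the set of spanning trees of G (surjectivity).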

5.14. The matrix-tree theorem


5.14.1. Introduction
So counting spanning trees in a multigraph is a particular case of counting
spanning arborescences (rooted to a given vertex) in a multidigraph. But how
do we do either? Let us begin with some simple examples:

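Example 5.14.1. There is only one spanning tree of the complete graph $K_1$:

[Figure: the single vertex 1.]

There is only one spanning tree of the complete graph $K_2$:

[Figure: the vertices 1 and 2, joined by an edge.]

There are 3 spanning trees of the complete graph $K_3$:

[Figure: three drawings on the vertices 1, 2, 3, one for each spanning tree.]

(They are all isomorphic, but still distinct.)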



There are 16 spanning trees of the complete graph $K_4$:

[Figure: sixteen drawings on the vertices 1, 2, 3, 4, one for each spanning tree.]

(There are only two non-isomorphic ones among them.)

This example suggests that the # of spanning trees of a complete graph $K_n$ is
$n^{n-2}$.
This is indeed true, and we will prove this later. For now, however, let us
address the more general problem of counting spanning arborescences of an
arbitrary digraph D.
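The counts 1, 1, 3, 16 from Example 5.14.1 can be reproduced by brute force. The following is a minimal sketch, not a method from the text: it assumes the vertices of $K_n$ are 1, 2, ..., n, tries every set of n − 1 edges, and keeps those that contain no cycle (such a set is automatically a spanning tree). The function name `count_spanning_trees_complete` is made up for this illustration.

```python
from itertools import combinations

def count_spanning_trees_complete(n):
    """Count the spanning trees of K_n by testing all (n-1)-element edge sets."""
    vertices = range(1, n + 1)
    edges = list(combinations(vertices, 2))
    count = 0
    for subset in combinations(edges, n - 1):
        # Union-find structure: parent[v] points towards the representative
        # of the connected component that currently contains v.
        parent = {v: v for v in vertices}

        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v

        acyclic = True
        for (u, w) in subset:
            ru, rw = find(u), find(w)
            if ru == rw:          # this edge would close a cycle
                acyclic = False
                break
            parent[ru] = rw       # merge the two components
        # n - 1 edges without a cycle on n vertices form a spanning tree.
        if acyclic:
            count += 1
    return count

counts = [count_spanning_trees_complete(n) for n in range(1, 6)]
print(counts)
```

This is only feasible for very small n, since the number of edge sets grows quickly; it is meant as a sanity check of the pattern, not as a counting method.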

5.14.2. Notations
First, we introduce a notation:

Definition 5.14.2. We will use the Iverson bracket notation: If A is any
logical statement, then we set
\[
[A] :=
\begin{cases}
1, & \text{if } A \text{ is true;} \\
0, & \text{if } A \text{ is false.}
\end{cases}
\]

For example, [ K2 is a tree] = 1 whereas [ K3 is a tree] = 0.


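Definition 5.14.3. Let M be a matrix. Let i and j be two integers. Then,

$M_{i,j}$ will mean the entry of M in row i and column j;

$M_{\sim i,\sim j}$ will mean the matrix M with row i removed and column j removed.

For example,
\[
\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}_{2,3} = f
\qquad \text{and} \qquad
\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}_{\sim 2,\sim 3}
= \begin{pmatrix} a & b \\ g & h \end{pmatrix}.
\]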

5.14.3. The Laplacian of a multidigraph


We shall now assign a matrix to (more or less) any multidigraph:58
Definition 5.14.4. Let D = (V, A, ψ) be a multidigraph. Assume that V =
{1, 2, . . . , n} for some n ∈ N.
For any i, j ∈ V, we let $a_{i,j}$ be the # of arcs of D that have source i and
target j.
The Laplacian of D is defined to be the n × n-matrix $L \in \mathbb{Z}^{n \times n}$ whose
entries are given by
\[
L_{i,j} = \left(\deg^+ i\right) \cdot \underbrace{[i = j]}_{\substack{\text{This is also} \\ \text{known as } \delta_{i,j}}} - \, a_{i,j}
\qquad \text{for all } i, j \in V.
\]
In other words, it is the matrix
\[
L = \begin{pmatrix}
\deg^+ 1 - a_{1,1} & -a_{1,2} & \cdots & -a_{1,n} \\
-a_{2,1} & \deg^+ 2 - a_{2,2} & \cdots & -a_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
-a_{n,1} & -a_{n,2} & \cdots & \deg^+ n - a_{n,n}
\end{pmatrix}.
\]

58 Recall that the symbol “#” means “number”.



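Example 5.14.5. Let D be the digraph

[Figure: a digraph on the vertices 1, 2, 3. Judging by the Laplacian computed below, its arcs are a loop at 1, an arc from 1 to 2, an arc from 2 to 3, and a loop at 3.]

Then, its Laplacian is
\[
\begin{pmatrix}
2 - 1 & -1 & -0 \\
-0 & 1 - 0 & -1 \\
-0 & -0 & 1 - 1
\end{pmatrix}
=
\begin{pmatrix}
1 & -1 & 0 \\
0 & 1 & -1 \\
0 & 0 & 0
\end{pmatrix}.
\]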

One thing we notice from this example is that loops do not matter at all to
the Laplacian L. Indeed, a loop with source i and target i counts once in deg+ i
and once in ai,i , but these contributions cancel out.
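Definition 5.14.4 translates directly into a few lines of code. The sketch below is an illustration under stated assumptions: a multidigraph on the vertex set {1, 2, ..., n} is represented as a list of (source, target) pairs (a representation chosen here, not taken from the text), and the arc list for Example 5.14.5 is the one read off from the Laplacian displayed in that example.

```python
def laplacian(n, arcs):
    """Laplacian of a multidigraph with vertex set {1, ..., n}.

    `arcs` is a list of (source, target) pairs; a pair that occurs several
    times stands for several parallel arcs.
    """
    L = [[0] * n for _ in range(n)]
    for (s, t) in arcs:
        L[s - 1][s - 1] += 1   # the arc contributes 1 to deg^+ s
        L[s - 1][t - 1] -= 1   # ... and 1 to a_{s,t}
    return L

# Arcs of Example 5.14.5, as read off from its Laplacian (an assumption
# about the figure): a loop at 1, 1 -> 2, 2 -> 3, a loop at 3.
example_arcs = [(1, 1), (1, 2), (2, 3), (3, 3)]
L = laplacian(3, example_arcs)

# Dropping the loops does not change the Laplacian.
without_loops = laplacian(3, [(s, t) for (s, t) in example_arcs if s != t])
print(L, without_loops == L)
```

For a loop, both updates hit the same diagonal entry and cancel, which is exactly the cancellation described in the remark above.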
Here is a simple property of Laplacians:

Proposition 5.14.6. Let D = (V, A, ψ) be a multidigraph. Assume that V =


{1, 2, . . . , n} for some positive integer n.
Then, the Laplacian L of D is singular; i.e., we have det L = 0.

Proof. The sum of all columns of L is the zero vector, because for each i ∈ V we
have
\begin{align*}
\sum_{j=1}^{n} L_{i,j}
&= \sum_{j=1}^{n} \left( \left(\deg^+ i\right) \cdot [i = j] - a_{i,j} \right)
\qquad \text{(by the definition of } L\text{)} \\
&= \underbrace{\sum_{j=1}^{n} \left(\deg^+ i\right) \cdot [i = j]}_{\substack{= \deg^+ i \\ \text{(since only the addend} \\ \text{for } j = i \text{ can be nonzero)}}}
 - \underbrace{\sum_{j=1}^{n} a_{i,j}}_{\substack{= \deg^+ i \\ \text{(since this is counting} \\ \text{all arcs with source } i\text{)}}} \\
&= \deg^+ i - \deg^+ i = 0.
\end{align*}
In other words, we have Le = 0 for the vector $e := (1, 1, \ldots, 1)^T$. Thus, this
vector e lies in the kernel (aka nullspace) of L, and so L is singular.
(Note that we used the positivity of n here! If n = 0, then e is the zero vector,
because a vector with 0 entries is automatically the zero vector.)
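Proposition 5.14.6 is easy to test numerically. The following sketch builds the Laplacian of a pseudo-random multidigraph on 5 vertices and checks that every row sums to 0 and that the determinant vanishes. The list-of-pairs representation and the helper names `laplacian` and `det` are assumptions of this illustration; the determinant is computed exactly over the rationals so that no rounding is involved.

```python
import random
from fractions import Fraction

def laplacian(n, arcs):
    """Laplacian of a multidigraph on {1, ..., n}, given as (source, target) pairs."""
    L = [[0] * n for _ in range(n)]
    for (s, t) in arcs:
        L[s - 1][s - 1] += 1
        L[s - 1][t - 1] -= 1
    return L

def det(M):
    """Exact determinant via Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in M]
    size = len(A)
    result = Fraction(1)
    for c in range(size):
        # Find a row at or below position c with a nonzero entry in column c.
        pivot = next((r for r in range(c, size) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            result = -result          # a row swap flips the sign
        result *= A[c][c]
        for r in range(c + 1, size):
            factor = A[r][c] / A[c][c]
            for k in range(c, size):
                A[r][k] -= factor * A[c][k]
    return result

random.seed(0)
n = 5
arcs = [(random.randint(1, n), random.randint(1, n)) for _ in range(12)]
L = laplacian(n, arcs)
row_sums = [sum(row) for row in L]   # the entries of the vector L e
print(row_sums, det(L))
```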

5.14.4. The Matrix-Tree Theorem: statement


Proposition 5.14.6 shows that the determinant of the Laplacian of a digraph is
not very interesting. It is common, however, that when a matrix has determi-
nant 0, its largest nonzero minors (= determinants of submatrices) often carry

some interesting information; they are “the closest the matrix has” to a nonzero
determinant. In the case of the Laplacian, they turn out to count spanning ar-
borescences:

Theorem 5.14.7 (Matrix-Tree Theorem). Let D = (V, A, ψ) be a multidigraph.


Assume that V = {1, 2, . . . , n} for some positive integer n.
Let L be the Laplacian of D. Let r be a vertex of D. Then,

(# of spanning arborescences of D rooted to r ) = det ( L∼r,∼r ) .

Before we prove this, some remarks:

• The determinant det ( L∼r,∼r ) is the (r, r )-th entry of the adjugate matrix
of L.

• The V = {1, 2, . . . , n} assumption is a typical “WLOG assumption”: If


you have an arbitrary digraph D, you can always rename its vertices as
1, 2, . . . , n, and then this assumption will be satisfied. Thus, Theorem
5.14.7 helps you count the spanning arborescences of any digraph. That
said, you can also drop the V = {1, 2, . . . , n} assumption from Theorem
5.14.7 if you are okay with matrices whose rows and columns are indexed
not by numbers but by elements of an arbitrary finite set59 .
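Before the proof, the statement of Theorem 5.14.7 can be checked on a small example. The sketch below counts spanning arborescences rooted to r by brute force and compares the result with $\det(L_{\sim r,\sim r})$. It relies on the characterization via Statement A'6 of Theorem 5.10.5 as it is used later in this section (r is a to-root, r has outdegree 0, every other vertex has outdegree 1). The example digraph, the list-of-pairs representation and all function names are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

def laplacian(n, arcs):
    L = [[0] * n for _ in range(n)]
    for (s, t) in arcs:
        L[s - 1][s - 1] += 1
        L[s - 1][t - 1] -= 1
    return L

def det(M):
    """Exact determinant via Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in M]
    size = len(A)
    result = Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            result = -result
        result *= A[c][c]
        for r in range(c + 1, size):
            factor = A[r][c] / A[c][c]
            for k in range(c, size):
                A[r][k] -= factor * A[c][k]
    return result

def remove_row_and_column(M, i, j):
    """The matrix M with row i and column j removed (1-based indices)."""
    return [[x for c, x in enumerate(row, start=1) if c != j]
            for r, row in enumerate(M, start=1) if r != i]

def count_arborescences(n, arcs, r):
    """Brute force: pick one out-arc for each vertex v != r, then test
    whether following the picked arcs leads from every vertex to r."""
    others = [v for v in range(1, n + 1) if v != r]
    # Arcs are picked by index, so parallel arcs count as different choices.
    out_arcs = [[i for i, (s, _) in enumerate(arcs) if s == v] for v in others]
    count = 0
    for choice in product(*out_arcs):
        nxt = {v: arcs[i][1] for v, i in zip(others, choice)}
        def reaches_r(v):
            steps = 0
            while v != r and steps < n:
                v = nxt[v]
                steps += 1
            return v == r
        if all(reaches_r(v) for v in others):
            count += 1
    return count

# A made-up multidigraph on {1, 2, 3, 4} with two parallel arcs 2 -> 3 and a loop at 2.
arcs = [(1, 2), (1, 3), (2, 3), (2, 3), (3, 1), (3, 4), (4, 1), (4, 2), (2, 2)]
L = laplacian(4, arcs)
results = {r: (count_arborescences(4, arcs, r),
               int(det(remove_row_and_column(L, r, r))))
           for r in range(1, 5)}
print(results)
```

For r = 1, both numbers are 6, which can also be confirmed by hand: the two parallel arcs from 2 to 3 contribute a factor of 2, and three of the four choices at the vertices 3 and 4 avoid a cycle.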

5.14.5. Application: Counting the spanning trees of Kn


Now, let us use the Matrix-Tree Theorem to count the spanning trees of Kn . This
should provide some intuition for the theorem before we come to its proof.
We fix a positive integer n. Let L be the Laplacian of the multidigraph $K_n^{\operatorname{bidir}}$
(where $K_n$, as we recall, is the complete graph on the set {1, 2, . . . , n}). Then,
each vertex of $K_n^{\operatorname{bidir}}$ has outdegree n − 1, and thus we have
\[
L = \begin{pmatrix}
n-1 & -1 & \cdots & -1 \\
-1 & n-1 & \cdots & -1 \\
\vdots & \vdots & \ddots & \vdots \\
-1 & -1 & \cdots & n-1
\end{pmatrix}
\]
(this is the n × n-matrix whose diagonal entries are n − 1 and whose off-diagonal
entries are −1). By Proposition 5.13.1 (b) (applied to G = $K_n$ and r = 1),
there is a bijection between {spanning arborescences of $K_n^{\operatorname{bidir}}$ rooted to 1} and

59 Such matrices are perfectly fine, just somewhat unusual and hard to write down (which row
do you put on top?). See https://mathoverflow.net/questions/317105 for details.

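{spanning trees of $K_n$}. Hence, by the bijection principle, we have
\begin{align*}
& \left(\text{\# of spanning trees of } K_n\right) \\
&= \left(\text{\# of spanning arborescences of } K_n^{\operatorname{bidir}} \text{ rooted to } 1\right) \\
&= \det\left(L_{\sim 1,\sim 1}\right)
\qquad \left(\text{by Theorem 5.14.7, applied to } D = K_n^{\operatorname{bidir}} \text{ and } r = 1\right) \\
&= \det \underbrace{\begin{pmatrix}
n-1 & -1 & \cdots & -1 \\
-1 & n-1 & \cdots & -1 \\
\vdots & \vdots & \ddots & \vdots \\
-1 & -1 & \cdots & n-1
\end{pmatrix}}_{\text{an } (n-1) \times (n-1)\text{-matrix}} .
\end{align*}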

How do we compute this determinant? Here are three ways:


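• The most elementary approach is using row transformations:
\begin{align*}
& \det \begin{pmatrix}
n-1 & -1 & \cdots & -1 \\
-1 & n-1 & \cdots & -1 \\
\vdots & \vdots & \ddots & \vdots \\
-1 & -1 & \cdots & n-1
\end{pmatrix} \\
&= \det \begin{pmatrix}
n-1 & -1 & -1 & -1 & \cdots & -1 \\
-n & n & 0 & 0 & \cdots & 0 \\
-n & 0 & n & 0 & \cdots & 0 \\
-n & 0 & 0 & n & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
-n & 0 & 0 & 0 & \cdots & n
\end{pmatrix}
\qquad \left(\begin{array}{c} \text{here, we have subtracted the 1st row} \\ \text{from each other row} \end{array}\right) \\
&= n^{n-2} \det \begin{pmatrix}
n-1 & -1 & -1 & -1 & \cdots & -1 \\
-1 & 1 & 0 & 0 & \cdots & 0 \\
-1 & 0 & 1 & 0 & \cdots & 0 \\
-1 & 0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
-1 & 0 & 0 & 0 & \cdots & 1
\end{pmatrix}
\qquad \left(\begin{array}{c} \text{here, we have factored out an } n \text{ from each} \\ \text{row except for the first row} \end{array}\right) \\
&= n^{n-2} \underbrace{\det \begin{pmatrix}
1 & 0 & 0 & 0 & \cdots & 0 \\
-1 & 1 & 0 & 0 & \cdots & 0 \\
-1 & 0 & 1 & 0 & \cdots & 0 \\
-1 & 0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
-1 & 0 & 0 & 0 & \cdots & 1
\end{pmatrix}}_{\substack{= 1 \\ \text{(since the matrix is triangular} \\ \text{with diagonal entries } 1, 1, \ldots, 1\text{)}}}
\qquad \left(\begin{array}{c} \text{here, we have added the 2nd,} \\ \text{3rd, etc. rows to the 1st row} \end{array}\right) \\
&= n^{n-2}.
\end{align*}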

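• The so-called matrix determinant lemma says that for any m × m-matrix
$A \in \mathbb{R}^{m \times m}$, any column vector $u \in \mathbb{R}^{m \times 1}$ and any row vector $v \in \mathbb{R}^{1 \times m}$,
we have
\[
\det (A + uv) = \det A + v \left(\operatorname{adj} A\right) u.
\]
This helps us compute our determinant, since
\[
\begin{pmatrix}
n-1 & -1 & \cdots & -1 \\
-1 & n-1 & \cdots & -1 \\
\vdots & \vdots & \ddots & \vdots \\
-1 & -1 & \cdots & n-1
\end{pmatrix}
=
\underbrace{\begin{pmatrix}
n & 0 & \cdots & 0 \\
0 & n & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & n
\end{pmatrix}}_{= A}
+
\underbrace{\begin{pmatrix} -1 \\ -1 \\ \vdots \\ -1 \end{pmatrix}}_{= u}
\underbrace{\begin{pmatrix} 1 & 1 & \cdots & 1 \end{pmatrix}}_{= v} .
\]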

• Here is an approach that is heavier on linear algebra (specifically, eigenvectors
and eigenvalues60 ):
Let $(e_1, e_2, \ldots, e_{n-1})$ be the standard basis of the R-vector space $\mathbb{R}^{n-1}$ (so
that $e_i$ is the column vector with its i-th coordinate equal to 1 and all its
other coordinates equal to 0). Then, we can find the following n − 1 eigenvectors
of our (n − 1) × (n − 1)-matrix
\[
\begin{pmatrix}
n-1 & -1 & \cdots & -1 \\
-1 & n-1 & \cdots & -1 \\
\vdots & \vdots & \ddots & \vdots \\
-1 & -1 & \cdots & n-1
\end{pmatrix} :
\]
– the n − 2 eigenvectors $e_1 - e_i$ for all i ∈ {2, 3, . . . , n − 1}, each of them
with eigenvalue n (check this!);
– the eigenvector $e_1 + e_2 + \cdots + e_{n-1}$ with eigenvalue 1 (check this!).
Since these n − 1 eigenvectors are linearly independent (check this!), they
form a basis of $\mathbb{R}^{n-1}$. Hence, our matrix is similar to the diagonal matrix
with diagonal entries $\underbrace{n, n, \ldots, n}_{n-2 \text{ times}}, 1$ (by [Treil17, Chapter 4, Theorem 2.1]),
and therefore has determinant $\underbrace{n \cdot n \cdots n}_{n-2 \text{ times}} \cdot 1 = n^{n-2}$.
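The second of the three approaches above stops at the decomposition A + uv. Assuming n ≥ 2 (so that the matrices involved have size at least 1), the remaining arithmetic could be filled in as follows. This is a sketch; the symbol $I_{n-1}$ for the (n − 1) × (n − 1) identity matrix is notation introduced only for this computation.

```latex
% Assumption: n >= 2. Here I_{n-1} denotes the (n-1) x (n-1) identity matrix.
\begin{align*}
A &= n I_{n-1}, \qquad u = (-1, -1, \ldots, -1)^T, \qquad v = (1, 1, \ldots, 1), \\
\det A &= n^{n-1}, \qquad \operatorname{adj} A = n^{n-2} I_{n-1}, \\
v \left(\operatorname{adj} A\right) u &= n^{n-2} \cdot vu = -(n-1)\, n^{n-2}, \\
\det (A + uv) &= n^{n-1} - (n-1)\, n^{n-2}
 = n^{n-2} \bigl( n - (n-1) \bigr) = n^{n-2}.
\end{align*}
```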

There are other ways as well. Either way, the result we obtain is $n^{n-2}$. Thus,
we have proved (relying on the Matrix-Tree Theorem, which we haven’t yet
proved):

60 See [Treil17, Chapter 4] for a refresher.



Theorem 5.14.8 (Cayley’s formula). Let n be a positive integer. Then, the #
of spanning trees of the complete graph $K_n$ is $n^{n-2}$.

In other words:

Corollary 5.14.9. Let n be a positive integer. Then, the # of simple graphs
with vertex set {1, 2, . . . , n} that are trees is $n^{n-2}$.

Proof. This is just Theorem 5.14.8, since the simple graphs with vertex set
{1, 2, . . . , n} that are trees are precisely the spanning trees of Kn .
There are many ways to prove Cayley’s formula (Theorem 5.14.8). I can par-
ticularly recommend the two combinatorial proofs given in [Galvin21, §2.4 and
§2.5], as well as Joyal’s proof sketched in [Leinst19]. Most textbooks on enu-
merative combinatorics give one proof or another; e.g., [Stanle18, Appendix to
Chapter 9] gives three. Cayley’s formula also appears in Aigner’s and Ziegler’s
best-of compilation of mathematical proofs [AigZie18, Chapter 33] with four
different proofs. Note that some of the sources use a matrix-tree theorem for
undirected graphs; this is a particular case of our matrix-tree theorem.61
However, in order to complete our proof, we still need to prove the Matrix-
Tree Theorem.

5.14.6. Preparations for the proof


In order to prepare for the proof of the Matrix-Tree Theorem, we state a simple
lemma (yet another criterion for a digraph to be an arborescence):

Lemma 5.14.10. Let D = (V, A, ψ) be a multidigraph. Let r be a vertex of


D. Assume that D has no cycles. Assume moreover that D has no arcs with
source r. Assume furthermore that each vertex v ∈ V \ {r } has outdegree 1.
Then, the digraph D is an arborescence rooted to r.

This lemma is precisely Exercise 4.4 (b), at least after reversing all arcs. But
let us give a self-contained proof here:

61 One more remark: In Corollary 5.14.9, we have counted the trees with n vertices (i.e., simple
graphs with vertex set {1, 2, . . . , n} that are trees). It sounds equally natural to count the
“unlabelled trees with n vertices”, i.e., the equivalence classes of such trees up to isomorphism.
Unfortunately, this is one of those “messy numbers” with no good expression: the
best formula known is recursive. There is also an asymptotic formula (“Otter’s formula”,
[Otter48]): the number of equivalence classes of n-vertex trees (up to isomorphism) is
\[
\approx \beta \frac{\alpha^n}{n^{5/2}} \qquad \text{with } \alpha \approx 2.955 \text{ and } \beta \approx 0.5349.
\]

Proof of Lemma 5.14.10. Let u be any vertex of D. Let $p = (v_0, a_1, v_1, a_2, v_2, \ldots, a_k, v_k)$
be a longest path of D that starts at u. 62 Thus, $v_0 = u$.
We shall show that $v_k = r$. Indeed, assume the contrary. Thus, $v_k \neq r$, so that
$v_k \in V \setminus \{r\}$. Hence, the vertex $v_k$ has outdegree 1 (since we assumed that each
vertex v ∈ V \ {r } has outdegree 1). Thus, there exists an arc b of D that has
source $v_k$. Consider this arc b, and let w be its target. Thus, appending the arc
b and the vertex w to the end of the path p, we obtain a walk
\[
\mathbf{w} = (v_0, a_1, v_1, a_2, v_2, \ldots, a_k, v_k, b, w)
\]
of D that starts at u (since $v_0 = u$). Proposition 4.5.9 shows that this walk $\mathbf{w}$
either is a path or contains a cycle. Hence, $\mathbf{w}$ is a path (since D has no cycles).
Thus, $\mathbf{w}$ is a path of D that starts at u. Since $\mathbf{w}$ is longer than p (namely, longer
by 1), this shows that p is not a longest path of D that starts at u. But this
contradicts the very definition of p.
This contradiction shows that our assumption was false. Hence, vk = r. Thus,
p is a path from u to r (since v0 = u and vk = r). Therefore, the digraph D has
a path from u to r (namely, p).
Forget that we fixed u. We thus have shown that for each vertex u of D,
the digraph D has a path from u to r. In other words, r is a to-root of D.
Furthermore, we have deg+ r = 0 (since D has no arcs with source r), and each
v ∈ V \ {r } satisfies deg+ v = 1 (since we have assumed that each vertex v ∈
V \ {r } has outdegree 1). In other words, the digraph D satisfies Statement A’6
from the dual arborescence equivalence theorem (Theorem 5.10.5). Therefore,
it satisfies Statement A’1 from that theorem as well (since all six statements A’1,
A’2, . . ., A’6 are equivalent). In other words, D is an arborescence rooted to r.
This proves Lemma 5.14.10.
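The proof above argues via a longest path. A more hands-on way to see the same conclusion is to simply follow out-arcs: under the hypotheses of Lemma 5.14.10, every vertex other than r has exactly one out-arc, and since D has no cycles, following these arcs must eventually stop, which can only happen at r. The sketch below illustrates this. It is a paraphrase, not the argument of the text, and the dictionary `out_arc` (mapping each vertex v ≠ r to the target of its unique out-arc) is a representation assumed for the example.

```python
def path_to_root(out_arc, r, u):
    """Follow the unique out-arcs starting at u.

    Returns the list of visited vertices if r is reached, and None if a
    vertex repeats (which means that the digraph has a cycle).
    """
    path = [u]
    seen = {u}
    while path[-1] != r:
        w = out_arc[path[-1]]
        if w in seen:
            return None       # we ran into a cycle
        seen.add(w)
        path.append(w)
    return path

# A made-up digraph on {1, ..., 5} with r = 1 and arcs 2->3, 3->1, 4->3, 5->4.
out_arc = {2: 3, 3: 1, 4: 3, 5: 4}
print([path_to_root(out_arc, 1, u) for u in range(1, 6)])

# With a cycle 2 -> 3 -> 2, the vertex r = 1 is never reached.
print(path_to_root({2: 3, 3: 2}, 1, 2))
```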

5.14.7. The Matrix-Tree Theorem: proof


We shall now prove the Matrix-Tree Theorem (Theorem 5.14.7), guided by the
following battle plan:

1. First, we will prove it in the case when each vertex v ∈ V \ {r } has
outdegree 1. In this case, after removing all arcs with source r from D (these
arcs do not matter, since neither the submatrix L∼r,∼r nor the spanning
arborescences rooted to r depend on them), we have essentially two options
(subcases): either D is itself an arborescence or D has a cycle.

2. Then, we will prove the matrix-tree theorem in the slightly more general
case when each v ∈ V \ {r } has outdegree ≤ 1. This is easy, since a vertex
v ∈ V \ {r } having outdegree 0 trivializes the theorem.

62 Such a path clearly exists, since the length-0 path (u) is a path of D that starts at u, and since
a path of D cannot have length larger than |V | − 1.

3. Finally, we will prove the theorem in the general case. This is done by
strong induction on the number of arcs of D. Every time you have a
vertex v ∈ V \ {r } with outdegree > 1, you can pick such a vertex and
color the outgoing arcs from it red and blue in such a way that each color
is used at least once. Then, you can consider the subdigraph of D obtained
by removing all blue arcs (call it $D^{\operatorname{red}}$) and the subdigraph of D obtained
by removing all red arcs (call it $D^{\operatorname{blue}}$). You can then apply the induction
hypothesis to $D^{\operatorname{red}}$ and to $D^{\operatorname{blue}}$ (since each of these two subdigraphs has
fewer arcs than D), and add the results together. The good news is that
both the # of spanning arborescences rooted to r and the determinant
$\det (L_{\sim r,\sim r})$ “behave additively” (we will soon see what this means).

So let us begin with Step 1. We first study a very special case:

Lemma 5.14.11. Let D = (V, A, ψ) be a multidigraph. Let r be a vertex of


D. Assume that D has no cycles. Assume moreover that D has no arcs with
source r. Assume furthermore that each vertex v ∈ V \ {r } has outdegree 1.
Then:

(a) The digraph D has a unique spanning arborescence rooted to r.

(b) Assume that V = {1, 2, . . . , n} for some n ∈ N. Let L be the Laplacian


of D. Then, det ( L∼r,∼r ) = 1.

Proof. (a) Lemma 5.14.10 shows that the digraph D itself is an arborescence
rooted to r.
As a consequence, D itself is a spanning arborescence of D rooted to r.
Therefore, | A| = |V | − 1 (by Statement A’2 in the Dual Arborescence Equiv-
alence Theorem (Theorem 5.10.5)63 ). Hence, D has no spanning arborescences
other than itself (because the condition | A| = |V | − 1 would get destroyed as
soon as we remove an arc). So the only spanning arborescence of D rooted to r
is D itself. This proves Lemma 5.14.11 (a).
(b) We WLOG assume that r = n (otherwise, we can swap r with n, so that
L∼r,∼r becomes L∼n,∼n ).
Let D ′ be the digraph D with a loop added at each vertex – i.e., the multidi-
graph obtained from D by adding n extra arcs ℓ1 , ℓ2 , . . . , ℓn and letting each arc
ℓi have source i and target i.
Let $S_{n-1}$ denote the group of permutations of the set
\[
\{1, 2, \ldots, n-1\} = \underbrace{\{1, 2, \ldots, n\}}_{= V} \setminus \{ \underbrace{n}_{= r} \} = V \setminus \{r\} .
\]

63 or by the fact that | A| is the sum of the outdegrees of all vertices of D



Now, from r = n, we have
\[
\det \left(L_{\sim r,\sim r}\right) = \det \left(L_{\sim n,\sim n}\right)
= \sum_{\sigma \in S_{n-1}} \operatorname{sign} \sigma \cdot \prod_{i=1}^{n-1} L_{i,\sigma(i)}
\tag{22}
\]
(by the Leibniz formula for the determinant). We shall now study the addends
in the sum on the right hand side of this equality. Specifically, we will show that
the only addend whose product $\prod_{i=1}^{n-1} L_{i,\sigma(i)}$ is nonzero is the addend for σ = id.
Indeed, let $\sigma \in S_{n-1}$ be a permutation such that the product $\prod_{i=1}^{n-1} L_{i,\sigma(i)}$ is
nonzero. We shall prove that σ = id.
Consider an arbitrary v ∈ {1, 2, . . . , n − 1}. Then, $L_{v,\sigma(v)} \neq 0$ (because $L_{v,\sigma(v)}$
is a factor in the product $\prod_{i=1}^{n-1} L_{i,\sigma(i)}$, which is nonzero). However, the definition
of L yields $L_{v,\sigma(v)} = \left(\deg^+ v\right) \cdot [v = \sigma(v)] - a_{v,\sigma(v)}$. Thus,
\[
\left(\deg^+ v\right) \cdot [v = \sigma(v)] - a_{v,\sigma(v)} = L_{v,\sigma(v)} \neq 0.
\]
Hence, at least one of the numbers $[v = \sigma(v)]$ and $a_{v,\sigma(v)}$ is nonzero. In other
words, we have v = σ(v) (this is what it means for $[v = \sigma(v)]$ to be nonzero) or
the digraph D has an arc with source v and target σ(v) (because this is what it
means for $a_{v,\sigma(v)}$ to be nonzero). In either case, the digraph D′ has an arc with
source v and target σ(v) (because if v = σ(v), then one of the loops we added
to D does the trick). We can apply the same argument to σ(v) instead of v, and
obtain an arc with source σ(v) and target σ(σ(v)). Similarly, we obtain an arc
with source σ(σ(v)) and target σ(σ(σ(v))). We can continue this reasoning
indefinitely. By continuing it for n steps, we obtain a walk
\[
\left( v, *, \sigma(v), *, \sigma^2(v), *, \sigma^3(v), \ldots, *, \sigma^n(v) \right)
\]
in the digraph D′, where each asterisk means an arc (we don’t care about what
these arcs are, so we are not giving them names). This walk cannot be a path
(since it has n + 1 vertices, but D′ has only n vertices); thus, it must contain
a cycle (by Proposition 4.5.9). All arcs of this cycle must be loops (because
otherwise, we could remove the loops from this cycle and obtain a cycle of D,
but we know that D has no cycles). In particular, its first arc is a loop. Thus, our
above walk $\left( v, *, \sigma(v), *, \sigma^2(v), *, \sigma^3(v), \ldots, *, \sigma^n(v) \right)$ contains a loop (since
the arcs of the cycle come from this walk). In other words, we have $\sigma^i(v) = \sigma^{i+1}(v)$
for some i ∈ {0, 1, . . . , n − 1}. Since σ is injective, we can apply $\sigma^{-i}$
to both sides of this equality, and conclude that v = σ(v). In other words,
σ(v) = v.
Forget that we fixed v. We thus have shown that σ(v) = v for each v ∈
{1, 2, . . . , n − 1}. In other words, σ = id.

Forget that we fixed σ. We thus have proved that σ = id for each permutation
$\sigma \in S_{n-1}$ for which the product $\prod_{i=1}^{n-1} L_{i,\sigma(i)}$ is nonzero. In other words, the
only permutation $\sigma \in S_{n-1}$ for which the product $\prod_{i=1}^{n-1} L_{i,\sigma(i)}$ is nonzero is the
permutation id.
Thus, the only nonzero addend on the right hand side of (22) is the addend
corresponding to σ = id. Hence, (22) can be simplified as follows:
\[
\det \left(L_{\sim n,\sim n}\right)
= \underbrace{\operatorname{sign}(\operatorname{id})}_{= 1} \cdot \prod_{i=1}^{n-1} L_{i,\operatorname{id}(i)}
= \prod_{i=1}^{n-1} L_{i,\operatorname{id}(i)} .
\]
Since each i ∈ {1, 2, . . . , n − 1} satisfies
\begin{align*}
L_{i,\operatorname{id}(i)} = L_{i,i}
&= \underbrace{\deg^+ i}_{= 1} \cdot \underbrace{[i = i]}_{= 1} - \underbrace{a_{i,i}}_{= 0}
\qquad \text{(by the definition of } L\text{)} \\
&= 1 \cdot 1 - 0 = 1
\end{align*}
(here, $\deg^+ i = 1$ since i has outdegree 1 (because each vertex v ∈ V \ {r } has
outdegree 1, and we can apply this to v = i since i ∈ {1, 2, . . . , n − 1} = V \ {r }),
and $a_{i,i} = 0$ since D has no cycles and thus cannot have a loop with source i),
this can be simplified to $\det \left(L_{\sim n,\sim n}\right) = \prod_{i=1}^{n-1} 1 = 1$. This proves Lemma 5.14.11
(b).
Next, we drop the “no cycles” condition:

Lemma 5.14.12. Let D = (V, A, ψ) be a multidigraph. Let r be a vertex of


D. Assume that each vertex v ∈ V \ {r } has outdegree 1. Then, the MTT
holds for these D and r. (Here and in the following, “MTT” is short for
“Matrix-Tree Theorem”, i.e., for Theorem 5.14.7.)

Proof. First of all, we note that an arc with source r cannot appear in any
spanning arborescence of D rooted to r (since any such arborescence satisfies
deg+ r = 0, according to Statement A’6 in the Dual Arborescence Equivalence
Theorem (Theorem 5.10.5)). Furthermore, the arcs with source r do not affect
the matrix L∼r,∼r , since they only appear in the r-th row of the matrix L (but
this r-th row is removed in L∼r,∼r ).
Hence, any arc with source r can be removed from D without disturbing
anything we currently care about. Thus, we WLOG assume that D has no arcs
with source r (else, we can just remove them from D).
We WLOG assume that r = n (otherwise, we can swap r with n, so that L∼r,∼r
becomes L∼n,∼n ).

We are in one of the following two cases:


Case 1: The digraph D has a cycle.
Case 2: The digraph D has no cycles.
Consider Case 1. In this case, D has a cycle $\mathbf{v} = (v_1, *, v_2, *, \ldots, *, v_m)$ (where
we again are putting asterisks in place of the arcs). This cycle cannot contain
r (since D has no arcs with source r). Thus, all its vertices $v_1, v_2, \ldots, v_m$ belong
to V \ {r }. Hence, for each i ∈ {1, 2, . . . , m − 1}, the vertex $v_i$ has outdegree 1
(since we assumed that each vertex v ∈ V \ {r } has outdegree 1). Consequently,
for each i ∈ {1, 2, . . . , m − 1}, the only arc of D that has source $v_i$ is the arc that
follows $v_i$ on the cycle $\mathbf{v}$. Therefore, in the matrix L, the $v_i$-th row has a 1 in
the $v_i$-th position (because $\deg^+ (v_i) = 1$), a −1 in the $v_{i+1}$-th position (since
the arc that follows $v_i$ on the cycle $\mathbf{v}$ has source $v_i$ and target $v_{i+1}$), and 0s in all
other positions. Since r = n, the same must then be true for the matrix $L_{\sim r,\sim r}$:
That is, the $v_i$-th row of the matrix $L_{\sim r,\sim r}$ has a 1 in the $v_i$-th position, a −1 in
the $v_{i+1}$-th position, and 0s in all other positions. Thus, the sum of the $v_1$-th,
$v_2$-th, . . ., $v_{m-1}$-th rows of $L_{\sim r,\sim r}$ is the zero vector (since the 1s and the −1s
just cancel out)64 .65
So we have found a nonempty set of rows of L∼r,∼r whose sum is the zero
vector. This yields that the matrix L∼r,∼r is singular (by basic properties of
determinants66 ), so its determinant is det ( L∼r,∼r ) = 0. On the other hand, the
digraph D has no spanning arborescence (because, in order to get a spanning
arborescence of D, we would have to remove at least one arc of our cycle v

64 Namely, the −1 in the $v_{i+1}$-th position of the $v_i$-th row gets cancelled by the 1 in the $v_{i+1}$-th
position of the $v_{i+1}$-th row. (We are using the fact that $v_m = v_1$ here.)
65 Let me illustrate this on a representative example: Assume that the numbers
$v_1, v_2, \ldots, v_{m-1}, v_m$ are 1, 2, . . . , m − 1, 1 (respectively). Then, the first m − 1 rows of L look
as follows:
\[
\begin{array}{cccccc}
1 & -1 & & & & \\
 & 1 & -1 & & & \\
 & & 1 & -1 & & \\
 & & & \ddots & \ddots & \\
 & & & & 1 & -1 \\
-1 & & & & & 1
\end{array}
\]
(where all the missing entries are zeroes). Thus, the sum of these m − 1 rows is the zero
vector. The same is therefore true of the matrix $L_{\sim r,\sim r}$ (since the first m − 1 rows of the latter
matrix are just the first m − 1 rows of L, with their r-th entries removed).
The general case is essentially the same as this example; the only difference is that the
relevant rows are in other positions.
66 Specifically, we are using the following fact: “Let M be a square matrix. If there is a certain

nonempty set of rows of M whose sum is the zero vector, then the matrix M is singular.”.
To prove this fact, we let S be this nonempty set. Choose one row from this set, and
call it the chosen row. Now, add all the other rows from this set to this one chosen row.
This operation does not change the determinant of M (since the determinant of a matrix
is unchanged when we add one row to another), but the resulting matrix has a zero row
(namely, the chosen row) and thus has determinant 0. Hence, the original matrix M must
have had determinant 0 as well. In other words, M was singular, qed.

(since an arborescence cannot have a cycle); but then, the source of this arc
would have outdegree 0, and thus we could no longer find a path from this
source to r, so we would not obtain a spanning arborescence). In other words,

(# of spanning arborescences of D rooted to r ) = 0.


Comparing this with det ( L∼r,∼r ) = 0, we conclude that the MTT holds in this
case (since it claims that 0 = 0). Thus, Case 1 is done.
Next, we consider Case 2. In this case, D has no cycles. Then, det ( L∼r,∼r ) = 1
(by Lemma 5.14.11 (b)) and

(# of spanning arborescences of D rooted to r ) = 1 (by Lemma 5.14.11 (a)) .


Thus, the MTT boils down to 1 = 1, which is again true.
So Lemma 5.14.12 is proved.
Next, we venture into a mildly greater generality:

Lemma 5.14.13. Let D = (V, A, ψ) be a multidigraph. Let r be a vertex of D.


Assume that each vertex v ∈ V \ {r } has outdegree ≤ 1. Then, the MTT (=
Matrix-Tree Theorem) holds for these D and r.

Proof. If each vertex v ∈ V \ {r } has outdegree 1, then this is true by Lemma


5.14.12.
Thus, we WLOG assume that this is not the case. Hence, some vertex v ∈
V \ {r } has outdegree ≠ 1. Consider this v. The outdegree of v is ≠ 1, but also
≤ 1 (by the hypothesis of the lemma). Hence, this outdegree must be 0. That
is, there is no arc with source v.
WLOG assume that r = n (otherwise, swap r with n).
We have v ≠ r. Hence, the digraph D has no path from v to r (since any such
path would include an arc with source v, but there is no arc with source v).
Therefore, D has no spanning arborescence rooted to r (because any such
spanning arborescence would have to have a path from v to r). In other words,

(# of spanning arborescences of D rooted to r ) = 0.


Also, det ( L∼r,∼r ) = 0 (since the v-th row of the matrix L∼r,∼r is 0 (because
there is no arc with source v)). So the MTT boils down to 0 = 0 again, and thus
Lemma 5.14.13 is proved.
We are now ready to prove the MTT in the general case:
Proof of Theorem 5.14.7. First, we introduce a notation:

Let M and N be two n × n-matrices that agree in all but one row.
That is, there exists some j ∈ {1, 2, . . . , n} such that for each i ≠ j,
we have
(the i-th row of M) = (the i-th row of N ) .

Then, we write $M \overset{j}{\equiv} N$, and we let $M \overset{j}{+} N$ be the n × n-matrix that
is obtained from M by adding the j-th row of N to the j-th row of M
(while leaving all remaining rows unchanged).
For example, if $M = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}$ and $N = \begin{pmatrix} a & b & c \\ d' & e' & f' \\ g & h & i \end{pmatrix}$, then $M \overset{2}{\equiv} N$
and
\[
M \overset{2}{+} N = \begin{pmatrix} a & b & c \\ d + d' & e + e' & f + f' \\ g & h & i \end{pmatrix} .
\]
A well-known property of determinants (the multilinearity of the determinant)
says that if M and N are two n × n-matrices and j ∈ {1, 2, . . . , n} is a
number such that $M \overset{j}{\equiv} N$, then
\[
\det \left( M \overset{j}{+} N \right) = \det M + \det N.
\]
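The multilinearity property is easy to try out on concrete numbers. The sketch below implements the row-wise sum of two matrices that agree outside their j-th row and checks the determinant identity on one made-up pair of 3 × 3 matrices. The function names `row_add` and `det` are assumptions of this illustration, and the determinant is computed exactly over the rationals.

```python
from fractions import Fraction

def det(M):
    """Exact determinant via Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in M]
    size = len(A)
    result = Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            result = -result
        result *= A[c][c]
        for r in range(c + 1, size):
            factor = A[r][c] / A[c][c]
            for k in range(c, size):
                A[r][k] -= factor * A[c][k]
    return result

def row_add(M, N, j):
    """The matrix obtained from M by adding the j-th row of N to the j-th
    row of M (1-based j). Requires that M and N agree outside row j."""
    if any(M[i] != N[i] for i in range(len(M)) if i != j - 1):
        raise ValueError("M and N must agree in all rows except row j")
    return [[a + b for a, b in zip(M[i], N[i])] if i == j - 1 else list(M[i])
            for i in range(len(M))]

# Two made-up matrices that differ only in their 2nd row.
M = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
N = [[1, 2, 3], [0, -1, 2], [7, 8, 10]]
print(det(M), det(N), det(row_add(M, N, 2)))
```

Here det M = −3 and det N = 23, and the determinant of the combined matrix is 20, as the identity predicts.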

Now, let us prove the MTT. We proceed by strong induction on the # of arcs
of D.
Induction step: Let m ∈ N. Assume (as the induction hypothesis) that the
MTT holds for all digraphs D that have < m arcs. We must now prove it for
our digraph D with m arcs.
WLOG assume that r = n (otherwise, swap r with n).
If each vertex v ∈ V \ {r } has outdegree ≤ 1, then the MTT holds by Lemma
5.14.13. Thus, we WLOG assume that some vertex v ∈ V \ {r } has outdegree
> 1. Pick such a vertex v. We color each arc with source v either red or blue,
making sure that at least one arc is red and at least one arc is blue. (We can
do this, since v has outdegree > 1.) All arcs that do not have source v remain
uncolored.
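Now, let $D^{\operatorname{red}}$ be the subdigraph obtained from D by removing all blue arcs.
Then, $D^{\operatorname{red}}$ has fewer arcs than D. In other words, $D^{\operatorname{red}}$ has < m arcs. Hence,
the induction hypothesis yields that the MTT holds for $D^{\operatorname{red}}$. That is, we have
\[
\left(\text{\# of spanning arborescences of } D^{\operatorname{red}} \text{ rooted to } r\right) = \det \left( L^{\operatorname{red}}_{\sim r,\sim r} \right),
\]
where $L^{\operatorname{red}}$ means the Laplacian of $D^{\operatorname{red}}$.

Likewise, let $D^{\operatorname{blue}}$ be the subdigraph obtained from D by removing all red
arcs. Then, $D^{\operatorname{blue}}$ has fewer arcs than D. Hence, the induction hypothesis yields
that the MTT holds for $D^{\operatorname{blue}}$. That is,
\[
\left(\text{\# of spanning arborescences of } D^{\operatorname{blue}} \text{ rooted to } r\right) = \det \left( L^{\operatorname{blue}}_{\sim r,\sim r} \right),
\]
where $L^{\operatorname{blue}}$ means the Laplacian of $D^{\operatorname{blue}}$.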



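Example 5.14.14. Let D be the multidigraph

[Figure: a multidigraph on the vertices 1, 2, 3, 4, 5, with four of its arcs labelled a, b, c, d.]

with r = 1. Its Laplacian is
\[
L = \begin{pmatrix}
1 & -1 & 0 & 0 & 0 \\
0 & 1 & -1 & 0 & 0 \\
-1 & 0 & 3 & -1 & -1 \\
0 & 0 & 0 & 1 & -1 \\
-1 & 0 & 0 & 0 & 1
\end{pmatrix} .
\]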

Let us pick v = 3 (this is a vertex with outdegree > 1), and let us color the
arcs a and c red and the arcs b and d blue (various other options are possible).
Then, $D^{\operatorname{red}}$ and $D^{\operatorname{blue}}$ look as follows (along with their Laplacians $L^{\operatorname{red}}$ and
$L^{\operatorname{blue}}$):

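[Figure: the digraphs $D^{\operatorname{red}}$ (left) and $D^{\operatorname{blue}}$ (right), each drawn on the vertices 1, 2, 3, 4, 5.]
\[
L^{\operatorname{red}} = \begin{pmatrix}
1 & -1 & 0 & 0 & 0 \\
0 & 1 & -1 & 0 & 0 \\
0 & 0 & 1 & -1 & 0 \\
0 & 0 & 0 & 1 & -1 \\
-1 & 0 & 0 & 0 & 1
\end{pmatrix},
\qquad
L^{\operatorname{blue}} = \begin{pmatrix}
1 & -1 & 0 & 0 & 0 \\
0 & 1 & -1 & 0 & 0 \\
-1 & 0 & 2 & 0 & -1 \\
0 & 0 & 0 & 1 & -1 \\
-1 & 0 & 0 & 0 & 1
\end{pmatrix} .
\]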

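Now, the digraphs D, $D^{\operatorname{blue}}$ and $D^{\operatorname{red}}$ differ only in the arcs with source v,
and as far as the latter arcs are concerned, the arcs of D are divided between
$D^{\operatorname{blue}}$ and $D^{\operatorname{red}}$. Hence, by the definition of the Laplacian, we have
\[
L^{\operatorname{red}} \overset{v}{\equiv} L^{\operatorname{blue}}
\qquad \text{and} \qquad
L^{\operatorname{red}} \overset{v}{+} L^{\operatorname{blue}} = L.
\]
Thus,
\[
L^{\operatorname{red}}_{\sim r,\sim r} \overset{v}{\equiv} L^{\operatorname{blue}}_{\sim r,\sim r}
\qquad \text{and} \qquad
L^{\operatorname{red}}_{\sim r,\sim r} \overset{v}{+} L^{\operatorname{blue}}_{\sim r,\sim r} = L_{\sim r,\sim r}
\]
(here, we have used the fact that r = n and v ≠ r, so that when we remove
the r-th row and the r-th column of the matrix L, the v-th row remains the v-th
row). Hence,
\[
\det \Bigl( \underbrace{L_{\sim r,\sim r}}_{= L^{\operatorname{red}}_{\sim r,\sim r} \overset{v}{+} L^{\operatorname{blue}}_{\sim r,\sim r}} \Bigr)
= \det \left( L^{\operatorname{red}}_{\sim r,\sim r} \overset{v}{+} L^{\operatorname{blue}}_{\sim r,\sim r} \right)
= \det \left( L^{\operatorname{red}}_{\sim r,\sim r} \right) + \det \left( L^{\operatorname{blue}}_{\sim r,\sim r} \right)
\]
(by the multilinearity of the determinant).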

However, a similar equality holds for the # of spanning arborescences: namely,

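we have
\begin{align*}
& \left(\text{\# of spanning arborescences of } D \text{ rooted to } r\right) \\
&= \left(\text{\# of spanning arborescences of } D^{\operatorname{red}} \text{ rooted to } r\right) \\
& \qquad + \left(\text{\# of spanning arborescences of } D^{\operatorname{blue}} \text{ rooted to } r\right) .
\end{align*}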

Here is why: Recall that an arborescence rooted to r must satisfy $\deg^+ v = 1$
(by Statement A’6 in the Dual Arborescence Equivalence Theorem (Theorem
5.10.5), since v ∈ V \ {r }). In other words, an arborescence rooted to r must
contain exactly one arc with source v. In particular, a spanning arborescence of
D rooted to r must contain either a red arc or a blue arc, but not both at the
same time. In the former case, it is a spanning arborescence of $D^{\operatorname{red}}$; in the latter,
it is a spanning arborescence of $D^{\operatorname{blue}}$. Conversely, any spanning arborescence
of $D^{\operatorname{red}}$ or of $D^{\operatorname{blue}}$ rooted to r is automatically a spanning arborescence of D
rooted to r. Thus,

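\begin{align*}
& \left(\text{\# of spanning arborescences of } D \text{ rooted to } r\right) \\
&= \underbrace{\left(\text{\# of spanning arborescences of } D^{\operatorname{red}} \text{ rooted to } r\right)}_{\substack{= \det \left( L^{\operatorname{red}}_{\sim r,\sim r} \right) \\ \text{(as we saw above)}}}
 + \underbrace{\left(\text{\# of spanning arborescences of } D^{\operatorname{blue}} \text{ rooted to } r\right)}_{\substack{= \det \left( L^{\operatorname{blue}}_{\sim r,\sim r} \right) \\ \text{(as we saw above)}}} \\
&= \det \left( L^{\operatorname{red}}_{\sim r,\sim r} \right) + \det \left( L^{\operatorname{blue}}_{\sim r,\sim r} \right)
 = \det \left( L_{\sim r,\sim r} \right)
\end{align*}
(since we proved that $\det \left( L_{\sim r,\sim r} \right) = \det \left( L^{\operatorname{red}}_{\sim r,\sim r} \right) + \det \left( L^{\operatorname{blue}}_{\sim r,\sim r} \right)$). That is, the
MTT holds for our digraph D and its vertex r. This completes the induction
step, and thus the MTT (Theorem 5.14.7) is proved.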
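The structure of this proof doubles as a (slow) recursive procedure for counting spanning arborescences: split at a vertex of outdegree > 1 into a "red" and a "blue" digraph, and fall back on Lemmas 5.14.11 to 5.14.13 once every vertex other than r has outdegree ≤ 1. The sketch below follows that plan. It is an illustration, not an algorithm from the text: it always colors the first out-arc red and the others blue, it represents arcs as (source, target) pairs, and the arc list for Example 5.14.14 is read off from the Laplacian L shown in that example (an assumption about the figure).

```python
def count_arborescences(n, arcs, r):
    """Count spanning arborescences rooted to r, mimicking the induction
    in the proof of the Matrix-Tree Theorem."""
    # Arcs with source r never belong to such an arborescence.
    arcs = [(s, t) for (s, t) in arcs if s != r]
    out = {v: [a for a in arcs if a[0] == v]
           for v in range(1, n + 1) if v != r}
    for v, outgoing in out.items():
        if len(outgoing) > 1:
            # Induction step: one red arc, all other out-arcs of v blue.
            rest = [a for a in arcs if a[0] != v]
            red = rest + outgoing[:1]
            blue = rest + outgoing[1:]
            return (count_arborescences(n, red, r)
                    + count_arborescences(n, blue, r))
    # From here on, each vertex v != r has outdegree <= 1.
    if any(len(outgoing) == 0 for outgoing in out.values()):
        return 0                      # situation of Lemma 5.14.13
    nxt = {v: outgoing[0][1] for v, outgoing in out.items()}
    for v in nxt:
        cur, steps = v, 0
        while cur != r and steps < n:
            cur = nxt[cur]
            steps += 1
        if cur != r:
            return 0                  # a cycle: Case 1 of Lemma 5.14.12
    return 1                          # no cycles: Lemma 5.14.11 (a)

# Arcs consistent with the Laplacian of Example 5.14.14 (an assumption):
# 1->2, 2->3, a loop at 3, 3->1, 3->4, 3->5, 4->5, 5->1.
arcs = [(1, 2), (2, 3), (3, 3), (3, 1), (3, 4), (3, 5), (4, 5), (5, 1)]
print(count_arborescences(5, arcs, 1))
```

This agrees with the determinant side for that example: removing the first row and the first column from L leaves an upper-triangular matrix with diagonal entries 1, 3, 1, 1, so $\det (L_{\sim 1,\sim 1}) = 3$.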
Our above proof of Theorem 5.14.7 has followed [Stanle18, Theorem 10.4].
Other proofs can be found across the literature, e.g., in [VanEhr51, Theorem
7], in [Margol10, Theorem 2.8], in [DeLeen19, Theorem 1] and in [Holzer22,
Theorem 2.5.3]. (Some of these sources prove more general versions of the
theorem. Confusingly, each source uses different notations and works in a
slightly different setup, although most of them quickly reveal themselves to be
equivalent upon some introspection.)

5.14.8. Further exercises on the Laplacian



Exercise 5.18. Let G = (V, E, ϕ) be a multigraph. Let L be the Laplacian of
the digraph $G^{\operatorname{bidir}}$. Prove that L is positive semidefinite.
[Hint: Write L as $N^T N$, where N or $N^T$ is some matrix you have seen
before.
Note that the statement is not true if we replace $G^{\operatorname{bidir}}$ by an arbitrary
digraph D.]

The following two exercises stand at the beginning of the theory of chip-firing
and related dynamical systems on a digraph (see [CorPer18], [Klivan19] and
[JoyMel17] for much more). While the Laplacian is not mentioned in them
directly, it is implicitly involved in the definition of a “donation” (how?).

Exercise 5.19. Let $D = (V, A, \psi)$ be a strongly connected multidigraph.
A wealth distribution on $D$ shall mean a family $(k_v)_{v \in V}$ of integers (one
for each vertex $v \in V$). If $k = (k_v)_{v \in V}$ is a wealth distribution, then we refer
to each value $k_v$ as the wealth of the vertex $v$, and we define the total wealth
of $k$ to be the sum $\sum_{v \in V} k_v$. We say that a vertex $v$ is in debt in a given wealth
distribution $k = (k_v)_{v \in V}$ if its wealth $k_v$ is negative.
For any vertices $v$ and $w$, we let $a_{v,w}$ denote the number of arcs that have
source $v$ and target $w$.
A donation is an operation that transforms a wealth distribution as follows:
We choose a vertex $v$, and we decrease its wealth by its outdegree
$\deg^+ v$, and then increase the wealth of each vertex $w \in V$ (including $v$ itself)
by $a_{v,w}$. (You can think of $v$ as donating a unit of wealth for each arc that has
source $v$. This unit flows to the target of this arc. Note that a donation does
not change the total wealth.)
Let $k$ be a wealth distribution on $D$ whose total wealth is larger than $|A| - |V|$.
Prove that by an appropriately chosen finite sequence of donations, we
can ensure that no vertex is in debt.
[Example: For instance, consider the digraph

[Figure: a digraph with vertices 1, 2, 3, 4, 5, 6.]

with wealth distribution $(k_1, k_2, k_3, k_4, k_5, k_6) = (-1, -1, 1, 2, 0, 1)$. The
vertices 1 and 2 are in debt here, but it is possible to get all vertices out of debt
by having the vertices 4, 5, 6, 1 donate in some order (the order clearly does
not matter for the result^{67}).
Note that vertices are allowed to donate multiple times (although in the
above example, this was unnecessary).]
[Hint: A donation will be called safe if its donor $v$ (that is, the vertex chosen
to lose wealth) satisfies $k_v \geq \deg^+ v$, where $k$ is the wealth distribution
just before this donation. Start by showing that if the total wealth is larger
than $|A| - |V|$, then at least one vertex $v$ has wealth $\geq \deg^+ v$ (and thus can
make a safe donation). Next, show that for any given wealth distribution $k$,
there are only finitely many wealth distributions that can be obtained from
$k$ by a sequence of safe donations. Finally, for any vertex $v$, find a rational
quantity that increases every time that a donor distinct from $v$ makes a donation.
Conclude that in a sufficiently long sequence of safe donations, every
vertex must appear as a donor. But a donor of a safe donation must be out
of debt just before its safe donation, and will never go back into debt.]
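
To get a feeling for donations, it can help to simulate them. The following Python sketch implements a single donation exactly as defined in the exercise; the function name `donate`, the encoding of arcs as (source, target) pairs, and the three-vertex example digraph are assumptions made for illustration and do not come from the text.

```python
def donate(wealth, arcs, v):
    """Return the wealth distribution after vertex v makes a donation.

    wealth: dict mapping each vertex to its (integer) wealth
    arcs:   list of (source, target) pairs; parallel arcs and loops allowed
    v:      the donating vertex
    """
    new_wealth = dict(wealth)
    for (source, target) in arcs:
        if source == v:
            # One unit of wealth travels along each arc with source v.
            new_wealth[source] -= 1
            new_wealth[target] += 1
    return new_wealth

# Hypothetical strongly connected digraph with |A| - |V| = 4 - 3 = 1.
example_arcs = [(1, 2), (2, 3), (3, 1), (1, 3)]
example_wealth = {1: 3, 2: -1, 3: 0}   # total wealth 2 > 1, vertex 2 in debt

after = donate(example_wealth, example_arcs, 1)
```

Processing the arcs one by one has the same net effect as decreasing the wealth of $v$ by $\deg^+ v$ and increasing the wealth of each $w$ by $a_{v,w}$; in particular, a loop at $v$ changes nothing.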

Exercise 5.20. We continue with the setting and terminology of Exercise 5.19.
A clawback is an operation that transforms a wealth distribution as follows:
We choose a vertex $v$, and we increase its wealth by its outdegree
$\deg^+ v$, and then decrease the wealth of each vertex $w \in V$ (including $v$
itself) by $a_{v,w}$. (Thus, a clawback is the inverse of a donation.)
Let $k$ be a wealth distribution on $D$ whose total wealth is larger than $|A| - |V|$.
Prove that by an appropriately chosen finite sequence of clawbacks, we
can ensure that no vertex is in debt.
[Remark: Note that we are still assuming $D$ to be strongly connected.
Otherwise, the truth of the claim is not guaranteed. For instance, for the
digraph

[Figure: a digraph with vertices 1, 2, 3, 4.]

with wealth distribution $(k_1, k_2, k_3, k_4) = (0, 0, -1, 2)$, no sequence of donations
and clawbacks will result in every vertex being out of debt (since the
wealth difference $k_4 - k_3$ is preserved under any donation or clawback, but
this difference is too large to come from a debt-free distribution with total
wealth 1).]
[Hint: Show that any donation is equivalent to an appropriately chosen
composition of clawbacks. Something we know about the Laplacian may
come in useful here.]
67 Depending on the order, some vertices will go into debt in the process, but this is okay as
long as they ultimately end up debt-free.

5.14.9. Application: Counting Eulerian circuits of $K_n^{\text{bidir}}$


Here is one more consequence of the MTT:

Proposition 5.14.15. Let $n$ be a positive integer. Pick any arc $a$ of the multidigraph
$K_n^{\text{bidir}}$. Then, the # of Eulerian circuits of $K_n^{\text{bidir}}$ whose first arc is $a$ is
$n^{n-2} \cdot (n-2)!^n$.

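Proof. Let $r$ be the source of the arc $a$. The digraph $K_n^{\text{bidir}}$ is balanced, and each
of its vertices has outdegree $n - 1$. By the BEST' theorem (Theorem 5.10.4), we
have
\[
\begin{aligned}
&(\text{\# of Eulerian circuits of } K_n^{\text{bidir}} \text{ whose first arc is } a) \\
&= \underbrace{(\text{\# of spanning arborescences of } K_n^{\text{bidir}} \text{ rooted to } r)}_{= n^{n-2}} \cdot \prod_{u=1}^{n} \Bigl( \underbrace{\deg^+ u}_{= n-1} - 1 \Bigr)! \\
&= n^{n-2} \cdot \prod_{u=1}^{n} (n-2)! = n^{n-2} \cdot (n-2)!^n ,
\end{aligned}
\]
qed. (Here, the equality $(\text{\# of spanning arborescences of } K_n^{\text{bidir}} \text{ rooted to } r) = n^{n-2}$
was shown in Subsection 5.14.5 in the case when $r = 1$, and can be proved similarly for arbitrary $r$.)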
In comparison, there is no good formula known for the # of Eulerian circuits
of the undirected graph Kn . For n even, this # is 0 of course (since Kn has
vertices of odd degree in this case). For n odd, the # grows very fast, but little
else is known about it (see https://oeis.org/A135388 for some known values,
and see Exercise 5.22 for a divisibility property).
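
Proposition 5.14.15 is easy to test for very small $n$. The Python sketch below counts, by exhaustive search, the Eulerian circuits of $K_n^{\text{bidir}}$ whose first arc is a fixed arc, and compares the result with $n^{n-2} \cdot (n-2)!^n$. The function names, the labelling of the vertices as $0, 1, \ldots, n-1$ and the restriction to $n \geq 2$ are choices made for this illustration only.

```python
from math import factorial

def count_eulerian_circuits_bidir_complete(n):
    """Count Eulerian circuits of K_n^bidir (n >= 2) starting with arc (0, 1)."""
    arcs = [(u, v) for u in range(n) for v in range(n) if u != v]
    first = (0, 1)
    used = {arc: False for arc in arcs}
    used[first] = True

    def extend(current, remaining):
        # 'current' is the vertex where the trail built so far ends.
        if remaining == 0:
            return 1 if current == first[0] else 0
        total = 0
        for w in range(n):
            arc = (current, w)
            if w != current and not used[arc]:
                used[arc] = True
                total += extend(w, remaining - 1)
                used[arc] = False
        return total

    return extend(first[1], len(arcs) - 1)

def formula(n):
    """The closed form n^(n-2) * ((n-2)!)^n from Proposition 5.14.15."""
    return n ** (n - 2) * factorial(n - 2) ** n
```

The search is exponential, so it is only meant for $n \leq 4$; for these values the count and the closed form can be compared directly.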

Exercise 5.21. Let $n$ be a positive integer. Let $N = \{1, 2, \ldots, n\}$. A map
$f : N \to N$ is said to be $n$-potent if each $i \in N$ satisfies $f^{n-1}(i) = n$. (As
usual, $f^k$ denotes the $k$-fold composition $f \circ f \circ \cdots \circ f$.)
Prove that the # of $n$-potent maps $f : N \to N$ is $n^{n-2}$.
[Hint: What do these $n$-potent maps have to do with trees?]

Exercise 5.22. Let $n = 2m + 1 > 2$ be an odd integer. Let $e$ be an edge of the
(undirected) complete graph $K_n$. Prove that the # of Eulerian circuits of $K_n$
that start with $e$ is a multiple of $(m-1)!^n$.
[Hint: Argue that each Eulerian circuit of $K_n$ is an Eulerian circuit of a
unique balanced tournament. Here, a "balanced tournament" means a balanced
digraph obtained from $K_n$ by orienting each edge.]

5.15. The undirected Matrix-Tree Theorem


5.15.1. The theorem
The Matrix-Tree Theorem becomes simpler if we apply it to a digraph of the
form $G^{\text{bidir}}$:

Theorem 5.15.1 (undirected Matrix-Tree Theorem). Let $G = (V, E, \varphi)$ be a
multigraph. Assume that $V = \{1, 2, \ldots, n\}$ for some positive integer $n$.
Let $L$ be the Laplacian of the digraph $G^{\text{bidir}}$. Explicitly, this is the
$n \times n$-matrix $L \in \mathbb{Z}^{n \times n}$ whose entries are given by
\[
L_{i,j} = (\deg i) \cdot [i = j] - a_{i,j},
\]
where $a_{i,j}$ is the # of edges of $G$ that have endpoints $i$ and $j$ (with loops
counting twice). Then:

(a) For any vertex $r$ of $G$, we have
\[
(\text{\# of spanning trees of } G) = \det(L_{\sim r, \sim r}).
\]

(b) Let $t$ be an indeterminate. Expand the determinant $\det(tI_n + L)$ (here,
$I_n$ denotes the $n \times n$ identity matrix) as a polynomial in $t$:
\[
\det(tI_n + L) = c_n t^n + c_{n-1} t^{n-1} + \cdots + c_1 t^1 + c_0 t^0,
\]
where $c_0, c_1, \ldots, c_n$ are numbers. (Note that this is the characteristic
polynomial of $L$ up to substituting $-t$ for $t$ and multiplying by a power
of $-1$. Some of its coefficients are $c_n = 1$ and $c_{n-1} = \operatorname{Tr} L$ and $c_0 = \det L$.) Then,
\[
(\text{\# of spanning trees of } G) = \frac{1}{n} c_1.
\]

(c) Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the eigenvalues of $L$, listed in such a way that
$\lambda_n = 0$ (we know that 0 is an eigenvalue of $L$, since $L$ is singular).
Then,
\[
(\text{\# of spanning trees of } G) = \frac{1}{n} \cdot \lambda_1 \lambda_2 \cdots \lambda_{n-1}.
\]

Proof. (a) Let $r$ be a vertex of $G$. Then, Proposition 5.13.1 (b) shows that there is
a bijection
\[
\{\text{spanning arborescences of } G^{\text{bidir}} \text{ rooted to } r\} \to \{\text{spanning trees of } G\}.
\]
Hence, by the bijection principle, we have
\[
\begin{aligned}
(\text{\# of spanning trees of } G)
&= (\text{\# of spanning arborescences of } G^{\text{bidir}} \text{ rooted to } r) \\
&= \det(L_{\sim r, \sim r}) \qquad \text{(by the Matrix-Tree Theorem (Theorem 5.14.7))}.
\end{aligned}
\]
This proves Theorem 5.15.1 (a).
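
Theorem 5.15.1 (a) can be checked on small multigraphs by comparing $\det(L_{\sim r,\sim r})$ with a direct enumeration of spanning trees. The following Python sketch does this; the helper names, the labelling of the vertices as $0, 1, \ldots, n-1$, the choice $r = 0$ and the example graphs are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations

def det(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, result = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def laplacian(n, edges):
    """Laplacian of G^bidir for a multigraph on vertices 0..n-1.
    Loops are skipped, since their contributions to deg i and a_{i,i} cancel."""
    L = [[0] * n for _ in range(n)]
    for (u, v) in edges:
        if u != v:
            L[u][u] += 1
            L[v][v] += 1
            L[u][v] -= 1
            L[v][u] -= 1
    return L

def spanning_trees_by_determinant(n, edges):
    L = laplacian(n, edges)
    return det([row[1:] for row in L[1:]])   # remove row and column of r = 0

def spanning_trees_by_enumeration(n, edges):
    """Count the (n-1)-element sets of edges that contain no cycle."""
    count = 0
    for subset in combinations(range(len(edges)), n - 1):
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                x = parent[x]
            return x
        acyclic = True
        for k in subset:
            ru, rv = find(edges[k][0]), find(edges[k][1])
            if ru == rv:
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:
            count += 1
    return count

# Hypothetical examples: the complete graph K_4, and a triangle with one doubled edge.
k4_edges = [(u, v) for u in range(4) for v in range(u + 1, 4)]
doubled_triangle = [(0, 1), (0, 1), (1, 2), (0, 2)]
```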
(b) We claim that
\[
c_1 = \sum_{r=1}^{n} \det(L_{\sim r, \sim r}). \tag{23}
\]
Note that this is a purely linear-algebraic result, and has nothing to do with the
fact that $L$ is the Laplacian of a digraph; it holds just as well if $L$ is replaced by
any square matrix.
Once (23) is proved, Theorem 5.15.1 (b) will easily follow, because (23) entails
\[
\begin{aligned}
\frac{1}{n} c_1
&= \frac{1}{n} \sum_{r=1}^{n} \underbrace{\det(L_{\sim r, \sim r})}_{\substack{= (\text{\# of spanning trees of } G) \\ \text{(by Theorem 5.15.1 (a))}}}
 = \frac{1}{n} \underbrace{\sum_{r=1}^{n} (\text{\# of spanning trees of } G)}_{= n \cdot (\text{\# of spanning trees of } G)} \\
&= \frac{1}{n} \cdot n \, (\text{\# of spanning trees of } G) = (\text{\# of spanning trees of } G).
\end{aligned}
\]
Thus, it remains to prove (23).
A rigorous proof of (23) can be found in [21s, Proposition 6.4.29] or in
https://math.stackexchange.com/a/3989575/ (both of these references actually
describe all coefficients $c_0, c_1, \ldots, c_n$ of the polynomial $\det(tI_n + L)$, not just
the $t^1$-coefficient $c_1$). We shall merely outline the proof of (23) on a convenient
example. We want to compute $c_1$. In other words, we want to compute the
coefficient of $t^1$ in the polynomial $\det(tI_n + L)$ (since $c_1$ is defined to be this
very coefficient). Let us say that $n = 4$, so that $L$ has the form
\[
L = \begin{pmatrix}
a & b & c & d \\
a' & b' & c' & d' \\
a'' & b'' & c'' & d'' \\
a''' & b''' & c''' & d'''
\end{pmatrix}.
\]
Thus,
\[
\det(tI_n + L) = \det \begin{pmatrix}
t + a & b & c & d \\
a' & t + b' & c' & d' \\
a'' & b'' & t + c'' & d'' \\
a''' & b''' & c''' & t + d'''
\end{pmatrix}.
\]
Imagine expanding the right hand side (using the Leibniz formula) and expanding
the resulting products further. For instance, the product
\[
(t + a)(t + b') d'' c'''
\]
becomes $ttd''c''' + tb'd''c''' + atd''c''' + ab'd''c'''$. In the huge sum that results,
we are interested in those addends that contain exactly one $t$, because it is
precisely these addends that contribute to the coefficient of $t^1$ in the polynomial
$\det(tI_n + L)$. Where do these addends come from? To pick up exactly one $t$
from a product like $(t + a)(t + b') d'' c'''$, we need to have at least one diagonal
entry in our product (for example, we cannot pick up any $t$ from the product
$cd'b''a'''$), and we need to pick out the $t$ from this diagonal entry (rather than,
e.g., the $a$ or $b'$ or $c''$ or $d'''$). If we pick the $r$-th diagonal entry, then the rest of
the product is part of the expansion of $\det(L_{\sim r, \sim r})$ (since we must not pick any
further $t$'s and thus can pretend that they are not there in the first place). Thus,
the total $t^1$-coefficient in $\det(tI_n + L)$ will be $\sum_{r=1}^{n} \det(L_{\sim r, \sim r})$. This proves (23),
and thus the proof of Theorem 5.15.1 (b) is complete.
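
Since (23) holds for any square matrix, it can be tested on arbitrary integer matrices. The following Python sketch expands $\det(tI_n + L)$ symbolically, representing each polynomial in $t$ by its list of coefficients, and compares the $t^1$-coefficient with the sum of the determinants $\det(L_{\sim r,\sim r})$. The helper names and the sample matrix are assumptions made for illustration; the Laplace expansion used here is only practical for small $n$.

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists (constant term first)."""
    length = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(length)]

def poly_mul(p, q):
    """Multiply two polynomials given as (nonempty) coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_det(m):
    """Determinant of a square matrix of polynomials (Laplace expansion
    along the first row); the empty matrix has determinant 1."""
    if not m:
        return [1]
    total = [0]
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        term = poly_mul(entry, poly_det(minor))
        if j % 2 == 1:
            term = [-c for c in term]
        total = poly_add(total, term)
    return total

def both_sides_of_23(L):
    """Return (c_1, sum over r of det(L with row r and column r removed))."""
    n = len(L)
    # Entries of t*I_n + L: the diagonal entries are L[i][i] + t.
    shifted = [[[L[i][j], 1] if i == j else [L[i][j]] for j in range(n)]
               for i in range(n)]
    c1 = poly_det(shifted)[1]
    minors_sum = sum(
        poly_det([[[L[i][j]] for j in range(n) if j != r]
                  for i in range(n) if i != r])[0]
        for r in range(n))
    return c1, minors_sum

# Hypothetical sample matrix (deliberately not a Laplacian).
sample = [[2, -1, 3], [0, 5, 1], [4, -2, 7]]
```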
(c) Consider the polynomial $\det(tI_n + L)$ introduced in part (b), and in particular
its $t^1$-coefficient $c_1$.
It is known that the characteristic polynomial $\det(tI_n - L)$ of $L$ is a monic
polynomial of degree $n$, and that its roots are the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ of
$L$. Hence, it can be factored as follows:
\[
\det(tI_n - L) = (t - \lambda_1)(t - \lambda_2) \cdots (t - \lambda_n).
\]
Substituting $-t$ for $t$ on both sides of this equality, we obtain
\[
\det(-tI_n - L) = (-t - \lambda_1)(-t - \lambda_2) \cdots (-t - \lambda_n).
\]
Multiplying both sides of this equality by $(-1)^n$, we find
\[
\begin{aligned}
\det(tI_n + L) &= (t + \lambda_1)(t + \lambda_2) \cdots (t + \lambda_n) \\
&= (t + \lambda_1)(t + \lambda_2) \cdots (t + \lambda_{n-1}) \, t \qquad (\text{since } \lambda_n = 0).
\end{aligned}
\]
Hence, the $t^1$-coefficient of the polynomial $\det(tI_n + L)$ is $\lambda_1 \lambda_2 \cdots \lambda_{n-1}$ (since
this is clearly the $t^1$-coefficient on the right hand side). Since we defined $c_1$
to be the $t^1$-coefficient of the polynomial $\det(tI_n + L)$, we thus conclude that
$c_1 = \lambda_1 \lambda_2 \cdots \lambda_{n-1}$. However, Theorem 5.15.1 (b) yields
\[
(\text{\# of spanning trees of } G) = \frac{1}{n} \underbrace{c_1}_{= \lambda_1 \lambda_2 \cdots \lambda_{n-1}} = \frac{1}{n} \cdot \lambda_1 \lambda_2 \cdots \lambda_{n-1}.
\]
This proves Theorem 5.15.1 (c).

5.15.2. Application: counting spanning trees of $K_{n,m}$


Laplacians of digraphs often have computable eigenvalues, so Theorem 5.15.1
(c) is actually pretty useful. A striking example of a # of spanning trees (specifically,
of the $n$-hypercube graph $Q_n$, which we already met in Subsection 2.14.4)
that can be counted using eigenvalues will appear in Exercise 5.26.
Here, however, let us give a simpler example, in which Theorem 5.15.1 (a)
suffices:

Exercise 5.23. Let $n$ and $m$ be two positive integers. Let $K_{n,m}$ be the simple
graph with $n + m$ vertices

$1, 2, \ldots, n$ and $-1, -2, \ldots, -m$,

where two vertices $i$ and $j$ are adjacent if and only if they have opposite
signs (i.e., each positive vertex is adjacent to each negative vertex, but no two
vertices of the same sign are adjacent).
[For example, here is what $K_{5,2}$ looks like:

[Figure: the graph $K_{5,2}$, with the vertices 1, 2, 3, 4, 5 in the top row and the vertices $-2$, $-1$ in the bottom row.]
]

How many spanning trees does $K_{n,m}$ have?

Solution. If we rename the negative vertices $-1, -2, \ldots, -m$ as $n+1, n+2, \ldots, n+m$,
then the Laplacian $L$ of the digraph $K_{n,m}^{\text{bidir}}$ can be written in block-matrix
notation as follows:
\[
L = \begin{pmatrix} A & B \\ C & D \end{pmatrix},
\]
where

• $A$ is a diagonal $n \times n$-matrix all of whose diagonal entries are equal to $m$
(since there are no edges between positive vertices, and since each positive
vertex has degree $m$);

• $B$ is an $n \times m$-matrix all of whose entries equal $-1$;

• $C$ is an $m \times n$-matrix all of whose entries equal $-1$;

• $D$ is a diagonal $m \times m$-matrix all of whose diagonal entries are equal to $n$.

For instance, if $n = 3$ and $m = 2$, then
\[
L = \begin{pmatrix}
2 & 0 & 0 & -1 & -1 \\
0 & 2 & 0 & -1 & -1 \\
0 & 0 & 2 & -1 & -1 \\
-1 & -1 & -1 & 3 & 0 \\
-1 & -1 & -1 & 0 & 3
\end{pmatrix}.
\]

Theorem 5.15.1 (a) yields
\[
(\text{\# of spanning trees of } K_{n,m}) = \det(L_{\sim r, \sim r}) \qquad \text{for any vertex } r \text{ of } K_{n,m};
\]
thus, we need to compute $\det(L_{\sim r, \sim r})$ for some vertex $r$. We let $r = 1$. Then, the
submatrix $L_{\sim r, \sim r} = L_{\sim 1, \sim 1}$ of $L$ again can be written in block-matrix notation
as follows:
\[
L_{\sim r, \sim r} = \begin{pmatrix} \widetilde{A} & \widetilde{B} \\ \widetilde{C} & D \end{pmatrix}, \tag{24}
\]
where
• $\widetilde{A}$ is a diagonal $(n-1) \times (n-1)$-matrix all of whose diagonal entries are
equal to $m$;
• $\widetilde{B}$ is an $(n-1) \times m$-matrix all of whose entries equal $-1$;
• $\widetilde{C}$ is an $m \times (n-1)$-matrix all of whose entries equal $-1$;
• $D$ is a diagonal $m \times m$-matrix all of whose diagonal entries are equal to $n$.
Fortunately, determinants of block matrices are often not hard to compute, at
least when some of the blocks are invertible. For example, the Schur complement
provides a neat formula. Our life here is even easier, since $\widetilde{A}$ and $D$ are
multiples of identity matrices: namely, $\widetilde{A} = mI_{n-1}$ and $D = nI_m$. We perform
a "blockwise row transformation" on the block matrix
$L_{\sim r, \sim r} = \begin{pmatrix} \widetilde{A} & \widetilde{B} \\ \widetilde{C} & D \end{pmatrix}$,
specifically subtracting the $\widetilde{C}\widetilde{A}^{-1}$-multiple of the first "block row"
$\begin{pmatrix} \widetilde{A} & \widetilde{B} \end{pmatrix}$ from the second "block row"
$\begin{pmatrix} \widetilde{C} & D \end{pmatrix}$ (yes, this is legitimate: it is the same as
left-multiplying by the block matrix
$\begin{pmatrix} I_{n-1} & 0 \\ -\widetilde{C}\widetilde{A}^{-1} & I_m \end{pmatrix}$, which has determinant
1 because it is lower-triangular). As a result, we obtain
\[
\det \begin{pmatrix} \widetilde{A} & \widetilde{B} \\ \widetilde{C} & D \end{pmatrix}
= \det \begin{pmatrix} \widetilde{A} & \widetilde{B} \\ \widetilde{C} - \widetilde{C}\widetilde{A}^{-1}\widetilde{A} & D - \widetilde{C}\widetilde{A}^{-1}\widetilde{B} \end{pmatrix}
= \det \begin{pmatrix} \widetilde{A} & \widetilde{B} \\ 0 & D - \widetilde{C}\widetilde{A}^{-1}\widetilde{B} \end{pmatrix}.
\]
The matrix on the right is "block-upper triangular", so its determinant factors
as follows:^{68}
\[
\det \begin{pmatrix} \widetilde{A} & \widetilde{B} \\ 0 & D - \widetilde{C}\widetilde{A}^{-1}\widetilde{B} \end{pmatrix}
= \det \widetilde{A} \cdot \det\bigl(D - \widetilde{C}\widetilde{A}^{-1}\widetilde{B}\bigr).
\]
68 We are using the fact that if a matrix is block-triangular (with all diagonal blocks being square
matrices), then its determinant is the product of the determinants of its diagonal blocks.
See, e.g., https://math.stackexchange.com/a/1221066/ or [Grinbe20, Exercise 6.29] for a
proof of this fact.

Of course, $\det \widetilde{A} = m^{n-1}$, since $\widetilde{A}$ is a diagonal matrix with $m, m, \ldots, m$ on the
diagonal. Computing $\det\bigl(D - \widetilde{C}\widetilde{A}^{-1}\widetilde{B}\bigr)$ is a bit more complicated, but still
doable: The matrix $\widetilde{A}^{-1}$ is a diagonal matrix with $m^{-1}, m^{-1}, \ldots, m^{-1}$ on the
diagonal; thus, its role in the product $\widetilde{C}\widetilde{A}^{-1}\widetilde{B}$ is merely to multiply everything
by $m^{-1}$. Hence, $\widetilde{C}\widetilde{A}^{-1}\widetilde{B} = m^{-1}\widetilde{C}\widetilde{B}$. Since all entries of $\widetilde{C}$ and $\widetilde{B}$ are $-1$'s, we
see that all entries of $\widetilde{C}\widetilde{B}$ are $(n-1)$'s. Putting all of this together, we see
that $D - \widetilde{C}\widetilde{A}^{-1}\widetilde{B}$ is the $m \times m$-matrix all of whose diagonal entries are equal to
$n - m^{-1}(n-1)$ and all of whose off-diagonal entries are equal to $-m^{-1}(n-1)$.
We have already computed the determinant of a matrix much like this back in
our proof of Cayley's Formula (Subsection 5.14.5); let us deal with the general
case:
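Proposition 5.15.2. Let $n \in \mathbb{N}$. Let $x$ and $a$ be two numbers. Then,
\[
\det \underbrace{\begin{pmatrix}
x & a & a & \cdots & a & a \\
a & x & a & \cdots & a & a \\
a & a & x & \cdots & a & a \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
a & a & a & \cdots & x & a \\
a & a & a & \cdots & a & x
\end{pmatrix}}_{\substack{\text{the } n \times n\text{-matrix} \\ \text{whose diagonal entries are } x \\ \text{and whose off-diagonal entries are } a}}
= (x + (n-1)a)(x - a)^{n-1}.
\]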

Proposition 5.15.2 can be proved using similar reasoning as the determinant
in Subsection 5.14.5; we will say more about it later. For now, let us apply it to
$m$, $n - m^{-1}(n-1)$ and $-m^{-1}(n-1)$ instead of $n$, $x$ and $a$, to obtain
\[
\begin{aligned}
\det\bigl(D - \widetilde{C}\widetilde{A}^{-1}\widetilde{B}\bigr)
&= \underbrace{\Bigl( n - m^{-1}(n-1) + (m-1)\bigl(-m^{-1}(n-1)\bigr) \Bigr)}_{= 1}
 \cdot \Bigl( \underbrace{n - m^{-1}(n-1) - \bigl(-m^{-1}(n-1)\bigr)}_{= n} \Bigr)^{m-1} \\
&= n^{m-1}.
\end{aligned}
\]
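
Both Proposition 5.15.2 and the evaluation just performed are easy to test with exact rational arithmetic. The following Python sketch is one way to do so; the helper names and the sample values of $x$ and $a$ are assumptions made for illustration.

```python
from fractions import Fraction

def det(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, result = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def constant_offdiagonal_matrix(size, x, a):
    """The size x size matrix with x on the diagonal and a elsewhere."""
    return [[x if i == j else a for j in range(size)] for i in range(size)]

def closed_form(size, x, a):
    """Right hand side of Proposition 5.15.2."""
    return (x + (size - 1) * a) * (x - a) ** (size - 1)

def schur_complement_det(n, m):
    """det(D - C~ A~^{-1} B~) for K_{n,m}, computed from its entries."""
    x = n - Fraction(n - 1, m)      # diagonal entries
    a = -Fraction(n - 1, m)         # off-diagonal entries
    return det(constant_offdiagonal_matrix(m, x, a))
```

For instance, `schur_complement_det(n, m)` can be compared with `n ** (m - 1)` for small positive integers `n` and `m`.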

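Now, it is time to combine everything we know. Theorem 5.15.1 (a) yields
\[
\begin{aligned}
(\text{\# of spanning trees of } K_{n,m})
&= \det(L_{\sim r, \sim r})
 = \det \begin{pmatrix} \widetilde{A} & \widetilde{B} \\ \widetilde{C} & D \end{pmatrix} \qquad \text{(by (24))} \\
&= \det \begin{pmatrix} \widetilde{A} & \widetilde{B} \\ 0 & D - \widetilde{C}\widetilde{A}^{-1}\widetilde{B} \end{pmatrix}
 = \underbrace{\det \widetilde{A}}_{= m^{n-1}} \cdot \underbrace{\det\bigl(D - \widetilde{C}\widetilde{A}^{-1}\widetilde{B}\bigr)}_{= n^{m-1}} \\
&= m^{n-1} \cdot n^{m-1}.
\end{aligned}
\]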

Thus, we have obtained the following:


Theorem 5.15.3. Let $n$ and $m$ be two positive integers. Let $K_{n,m}$ be the simple
graph with $n + m$ vertices

$1, 2, \ldots, n$ and $-1, -2, \ldots, -m$,

where two vertices $i$ and $j$ are adjacent if and only if they have opposite signs.
Then,
\[
(\text{\# of spanning trees of } K_{n,m}) = m^{n-1} \cdot n^{m-1}.
\]

See [AbuSbe88] for a combinatorial proof of this theorem.
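
Theorem 5.15.3 can also be confirmed numerically for small $n$ and $m$ by computing $\det(L_{\sim 1,\sim 1})$ directly. In the Python sketch below, the helper names are made up for this illustration, and the vertices are relabelled as $0, 1, \ldots, n+m-1$ (positive vertices first), mirroring the renaming used in the solution above.

```python
from fractions import Fraction

def det(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, result = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def spanning_trees_complete_bipartite(n, m):
    """Number of spanning trees of K_{n,m} via Theorem 5.15.1 (a)."""
    size = n + m
    L = [[0] * size for _ in range(size)]
    for i in range(n):               # positive vertices 0 .. n-1
        for j in range(n, size):     # negative vertices n .. n+m-1
            L[i][i] += 1
            L[j][j] += 1
            L[i][j] -= 1
            L[j][i] -= 1
    minor = [row[1:] for row in L[1:]]   # remove the first row and column
    return det(minor)
```

For small positive integers `n` and `m`, the value `spanning_trees_complete_bipartite(n, m)` should equal `m ** (n - 1) * n ** (m - 1)`.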


Exercise 5.24. Let $n$ be a positive integer. Let $K_{n,2}$ be the simple graph with
vertex set $\{1, 2, \ldots, n\} \cup \{-1, -2\}$ such that two vertices of $K_{n,2}$ are adjacent
if and only if they have opposite signs (i.e., each positive vertex is adjacent to
each negative vertex, but no two vertices of the same sign are adjacent). We
regard $K_{n,2}$ as a multigraph in the usual way.
(a) Without using the matrix-tree theorem, prove that the number of spanning
trees of $K_{n,2}$ is $n \cdot 2^{n-1}$.
(b) Let $K'_{n,2}$ be the graph obtained by adding a new edge $\{-1, -2\}$ to $K_{n,2}$.
How many spanning trees does $K'_{n,2}$ have?
[Example: Here is the graph $K_{n,2}$ for $n = 5$:

[Figure: the graph $K_{5,2}$, with the vertices 1, 2, 3, 4, 5 in the top row and the vertices $-2$, $-1$ in the bottom row.]

And here is the corresponding graph $K'_{n,2}$:

[Figure: the graph $K'_{5,2}$, drawn in the same way.]
]

Exercise 5.25. Let $n$ be a positive integer. Let $A$ be the $(n-1) \times (n-1)$-matrix
\[
\begin{pmatrix}
2 & -1 & 0 & \cdots & 0 \\
-1 & 2 & -1 & \cdots & 0 \\
0 & -1 & 2 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 2
\end{pmatrix},
\]
whose $(i, j)$-th entry is
\[
A_{i,j} := \begin{cases}
2, & \text{if } i = j; \\
-1, & \text{if } |i - j| = 1; \\
0, & \text{otherwise}
\end{cases}
\qquad \text{for all } i, j \in \{1, 2, \ldots, n-1\}.
\]
Prove that $\det A = n$.

[Hint: Recall Example 5.4.4.]

Exercise 5.26. Let $n$ be a positive integer. Let $Q_n$ be the $n$-hypercube graph (as
defined in Definition 2.14.7). Recall that its vertex set is the set $V := \{0, 1\}^n$
of length-$n$ bitstrings, and that two vertices are adjacent if and only if they
differ in exactly one bit. Our goal is to compute the # of spanning trees of
$Q_n$.
Let $D$ be the digraph $Q_n^{\text{bidir}}$. Let $L$ be the Laplacian of $D$. We regard $L$ as a
$V \times V$-matrix (i.e., as a $2^n \times 2^n$-matrix whose rows and columns are indexed
by bitstrings in $V$).
We shall use the notation $a_i$ for the $i$-th entry of a bitstring $a$. Thus, each
bitstring $a \in V$ has the form $a = (a_1, a_2, \ldots, a_n)$. (We shall avoid the shorthand
notation $a_1 a_2 \cdots a_n$ here, as it could be mistaken for an actual product.)
For any two bitstrings $a, b \in V$, we define the number $\langle a, b \rangle$ to be the
integer $a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$.

(a) Prove that every bitstring $a \in V$ satisfies
\[
\sum_{b \in V} (-1)^{\langle a, b \rangle} =
\begin{cases}
2^n, & \text{if } a = 0; \\
0, & \text{otherwise.}
\end{cases}
\]
Here, $0$ denotes the bitstring $(0, 0, \ldots, 0) \in V$.

Now, define a further $V \times V$-matrix $G$ by requiring that its $(a, b)$-th entry
is
\[
G_{a,b} = (-1)^{\langle a, b \rangle} \qquad \text{for any } a, b \in V.
\]
Furthermore, define a diagonal $V \times V$-matrix $D$ by requiring that its $(a, a)$-th
entry is
\[
D_{a,a} = 2 \cdot (\text{\# of } i \in \{1, 2, \ldots, n\} \text{ such that } a_i = 1)
= 2 \cdot (\text{the number of 1s in } a) \qquad \text{for any } a \in V
\]
(and its off-diagonal entries are 0).

Prove the following:

(b) We have $G^2 = 2^n \cdot I$, where $I$ is the identity $V \times V$-matrix.

(c) We have $GLG^{-1} = D$.

(d) The eigenvalues of $L$ are $2k$ for all $k \in \{0, 1, \ldots, n\}$, and each eigenvalue
$2k$ appears with multiplicity $\binom{n}{k}$.

(e) The # of spanning trees of $Q_n$ is
\[
\frac{1}{2^n} \prod_{k=1}^{n} (2k)^{\binom{n}{k}}.
\]

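[Example: As an example, here is the case $n = 3$. In this case, the graph
$Q_n$ looks as follows:

[Figure: the cube graph $Q_3$, with vertices 000, 001, 010, 011, 100, 101, 110, 111.]

The matrices $L$, $G$ and $D$ are
\[
L = \begin{pmatrix}
3 & -1 & -1 & 0 & -1 & 0 & 0 & 0 \\
-1 & 3 & 0 & -1 & 0 & -1 & 0 & 0 \\
-1 & 0 & 3 & -1 & 0 & 0 & -1 & 0 \\
0 & -1 & -1 & 3 & 0 & 0 & 0 & -1 \\
-1 & 0 & 0 & 0 & 3 & -1 & -1 & 0 \\
0 & -1 & 0 & 0 & -1 & 3 & 0 & -1 \\
0 & 0 & -1 & 0 & -1 & 0 & 3 & -1 \\
0 & 0 & 0 & -1 & 0 & -1 & -1 & 3
\end{pmatrix},
\]
\[
G = \begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \\
1 & 1 & -1 & -1 & 1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1 & 1 & -1 & -1 & 1 \\
1 & 1 & 1 & 1 & -1 & -1 & -1 & -1 \\
1 & -1 & 1 & -1 & -1 & 1 & -1 & 1 \\
1 & 1 & -1 & -1 & -1 & -1 & 1 & 1 \\
1 & -1 & -1 & 1 & -1 & 1 & 1 & -1
\end{pmatrix},
\]
\[
D = \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 2 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 4 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 4 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 4 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 6
\end{pmatrix},
\]
where the rows and the columns are ordered by listing the eight bitstrings
$a \in V$ in the order 000, 001, 010, 011, 100, 101, 110, 111. (Thus, the diagonal
entries of $D$ are twice the numbers of 1s in these bitstrings, in this order.)]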
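
The formula in part (e) can be compared with Theorem 5.15.1 (a) for small $n$. The following Python sketch builds the Laplacian of $Q_n^{\text{bidir}}$, computes the determinant of the submatrix obtained by removing the row and the column of the all-zero bitstring, and evaluates the product formula. The helper names are assumptions made for this illustration, and the check is of course no substitute for the proof asked for in the exercise.

```python
from fractions import Fraction
from itertools import product
from math import comb

def det(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, result = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def hypercube_laplacian(n):
    """Laplacian of Q_n^bidir, rows and columns indexed by bitstrings
    in lexicographic order (000, 001, 010, ... for n = 3)."""
    vertices = list(product((0, 1), repeat=n))
    def entry(a, b):
        differing = sum(1 for s, t in zip(a, b) if s != t)
        if differing == 0:
            return n          # each vertex of Q_n has degree n
        return -1 if differing == 1 else 0
    return [[entry(a, b) for b in vertices] for a in vertices]

def spanning_trees_by_determinant(n):
    L = hypercube_laplacian(n)
    return det([row[1:] for row in L[1:]])

def spanning_trees_by_formula(n):
    numerator = 1
    for k in range(1, n + 1):
        numerator *= (2 * k) ** comb(n, k)
    return numerator // 2 ** n
```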

As we promised, let us make a few more remarks about Proposition 5.15.2.
While this proposition can be proved by fairly straightforward row transformations
(first subtracting the first row from all the other rows, then factoring an
$x - a$ from all the latter rows, then subtracting $a$ times each of the latter rows from
the first row to obtain a triangular matrix), it can also be viewed as a particular
case of either of the following two determinantal identities:

Proposition 5.15.4. Let $n \in \mathbb{N}$. Let $a_1, a_2, \ldots, a_n$ be $n$ numbers, and let $x$ be a
further number. Then,
\[
\det \underbrace{\begin{pmatrix}
x & a_1 & a_2 & \cdots & a_{n-1} & a_n \\
a_1 & x & a_2 & \cdots & a_{n-1} & a_n \\
a_1 & a_2 & x & \cdots & a_{n-1} & a_n \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
a_1 & a_2 & a_3 & \cdots & x & a_n \\
a_1 & a_2 & a_3 & \cdots & a_n & x
\end{pmatrix}}_{\text{an } (n+1) \times (n+1)\text{-matrix}}
= \left( x + \sum_{i=1}^{n} a_i \right) \prod_{i=1}^{n} (x - a_i).
\]

Proposition 5.15.5. Let $n \in \mathbb{N}$. Let $x_1, x_2, \ldots, x_n$ be $n$ numbers, and let $a$ be a
further number. Then,
\[
\det \begin{pmatrix}
x_1 & a & a & \cdots & a \\
a & x_2 & a & \cdots & a \\
a & a & x_3 & \cdots & a \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a & a & a & \cdots & x_n
\end{pmatrix}
= \prod_{i=1}^{n} (x_i - a) + a \sum_{i=1}^{n} y_i,
\]
where we set $y_i := \prod_{k \in \{1, 2, \ldots, n\};\ k \neq i} (x_k - a)$ for each $i \in \{1, 2, \ldots, n\}$.

Both of these propositions make good exercises in determinant evaluation.


(Proposition 5.15.4 is [Grinbe20, Exercise 6.21], while Proposition 5.15.5 is https://math.stackexch
.)
See [KleSta19] and [Rubey00] for more applications of the Matrix-Tree Theo-
rem, and [Holzer22] for many more related results.

5.16. de Bruijn sequences


5.16.1. Definition
Let me move on to a more intricate application of what we have learned about
arborescences.
A little puzzle first: What is special about the periodic sequence

|| : 0000 1111 0110 0101 : || ?

(This is an infinite sequence of 0’s and 1’s; the spaces between some of them
are only for readability. The || : and : || symbols are “repeat signs” – they mean
that everything that stands between them should be repeated over and over. So
the sequence above is 0000 1111 0110 0101 0000 1111 . . ..)

One nice property of this sequence is that if you slide a “length-4 window”
(i.e., a window that shows four consecutive entries) along it, you get all 16
possible bitstrings of length 4 depending on the position of the window, and
these bitstrings do not repeat until you move 16 steps to the right. Just see:

0000 11110110010100001111 . . .
0 0001 1110110010100001111 . . .
00 0011 110110010100001111 . . .
000 0111 10110010100001111 . . .
0000 1111 0110010100001111 . . .
00001 1110 110010100001111 . . .
000011 1101 10010100001111 . . .
0000111 1011 0010100001111 . . .
00001111 0110 010100001111 . . .
000011110 1100 10100001111 . . .
0000111101 1001 0100001111 . . .
00001111011 0010 100001111 . . .
000011110110 0101 00001111 . . .
0000111101100 1010 0001111 . . .
00001111011001 0100 001111 . . .
000011110110010 1000 01111 . . .

Note that, as you slide the window along the sequence, at each step, the first
bit is removed and a new bit is inserted at the end. Thus, by sliding a length-4
window along the above sequence, you run through all 16 possible length-4
bitstrings in such a way that each bitstring is obtained from the previous one
by removing the first bit and inserting a new bit at the end. This is nice and
somewhat similar to Gray codes (in which you run through all bitstrings of a
given length in such a way that only a single bit is changed at each step).
Can we find such nice sequences for any window length, not just 4?
Here is an answer for window length 3, for instance:

|| : 000 111 01 : || .
What about higher window lengths?
Moreover, we can ask the same question with other alphabets. For instance,
instead of bits, here is a similar sequence for the alphabet {0, 1, 2} (that is, we
use the numbers 0, 1, 2 instead of 0 and 1) and window length 2:

|| : 00 11 22 02 1 : || .

What about the general case? Let us give it a name:

Definition 5.16.1. Let $n$ and $k$ be two positive integers, and let $K$ be a
$k$-element set.
A de Bruijn sequence of order $n$ on $K$ means a $k^n$-tuple $(c_0, c_1, \ldots, c_{k^n-1})$
of elements of $K$ such that

(A) for each $n$-tuple $(a_1, a_2, \ldots, a_n) \in K^n$ of elements of $K$, there is a unique
$r \in \{0, 1, \ldots, k^n - 1\}$ such that
\[
(a_1, a_2, \ldots, a_n) = (c_r, c_{r+1}, \ldots, c_{r+n-1}).
\]

Here, the indices under the letter "c" are understood to be periodic modulo
$k^n$; that is, we set $c_{q+k^n} = c_q$ for each $q \in \mathbb{Z}$ (so that $c_{k^n} = c_0$ and $c_{k^n+1} = c_1$
and so on).

For example, for $n = 2$ and $k = 3$ and $K = \{0, 1, 2\}$, the 9-tuple

$(0, 0, 1, 1, 2, 2, 0, 2, 1)$

is a de Bruijn sequence of order $n$ on $K$, because if we label the entries of this
9-tuple as $c_0, c_1, \ldots, c_8$ (and extend the indices periodically, so that $c_9 = c_0$),
then we have
\[
\begin{aligned}
(0, 0) &= (c_0, c_1); & (0, 1) &= (c_1, c_2); & (0, 2) &= (c_6, c_7); \\
(1, 0) &= (c_8, c_9); & (1, 1) &= (c_2, c_3); & (1, 2) &= (c_3, c_4); \\
(2, 0) &= (c_5, c_6); & (2, 1) &= (c_7, c_8); & (2, 2) &= (c_4, c_5).
\end{aligned}
\]
This de Bruijn sequence $(0, 0, 1, 1, 2, 2, 0, 2, 1)$ corresponds to the periodic
sequence || : 00 11 22 02 1 : || that we found above.
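
Condition (A) is straightforward to check by machine. The following Python sketch tests whether a given tuple is a de Bruijn sequence of order $n$ on $K$; the function name `is_de_bruijn` and its calling convention are assumptions made for this illustration.

```python
def is_de_bruijn(c, n, alphabet):
    """Check whether the tuple c is a de Bruijn sequence of order n
    on the set 'alphabet' (condition (A) of Definition 5.16.1)."""
    K = set(alphabet)
    length = len(K) ** n
    if len(c) != length or any(entry not in K for entry in c):
        return False
    # Collect the windows (c_r, c_{r+1}, ..., c_{r+n-1}), indices modulo k^n.
    windows = {tuple(c[(r + i) % length] for i in range(n))
               for r in range(length)}
    # There are k^n windows and k^n possible n-tuples, so each n-tuple
    # appears exactly once if and only if all windows are distinct.
    return len(windows) == length

ternary_example = (0, 0, 1, 1, 2, 2, 0, 2, 1)
binary_example = tuple(int(bit) for bit in "0000111101100101")
```

Here `ternary_example` is the 9-tuple from the example above, and `binary_example` is one period of the sequence from the puzzle in Subsection 5.16.1.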

5.16.2. Existence of de Bruijn sequences


It turns out that de Bruijn sequences always exist:

Theorem 5.16.2 (de Bruijn, Sainte-Marie). Let n and k be positive integers.


Let K be a k-element set. Then, a de Bruijn sequence of order n on K exists.

Proof. It looks reasonable to approach this using a digraph. For example, we
can define a digraph whose vertices are the $n$-tuples in $K^n$, and that has an arc
from one $n$-tuple $i$ to another $n$-tuple $j$ if $j$ can be obtained from $i$ by dropping
the first entry and adding a new entry at the end. Then, a de Bruijn sequence
(of order $n$ on $K$) is the same as a Hamiltonian cycle of this digraph.
Unfortunately, we don't have any useful criteria that would show that such a
cycle exists. So this idea seems to be a dead end.

However, let us do something counterintuitive: We try to reinterpret de


Bruijn sequences in terms of Eulerian circuits (rather than Hamiltonian cycles),
since we have a good criterion for the existence of Eulerian circuits (unlike for
that of Hamiltonian cycles)!
We need a different digraph for that. Namely, we let $D$ be the multidigraph
$(K^{n-1}, K^n, \psi)$, where the map $\psi : K^n \to K^{n-1} \times K^{n-1}$ is given by the formula
\[
\psi(a_1, a_2, \ldots, a_n) = ((a_1, a_2, \ldots, a_{n-1}), (a_2, a_3, \ldots, a_n)).
\]
Thus, the vertices of $D$ are the $(n-1)$-tuples (not the $n$-tuples!) of elements
of $K$, whereas the arcs are the $n$-tuples of elements of $K$, and each such arc
$(a_1, a_2, \ldots, a_n)$ has source $(a_1, a_2, \ldots, a_{n-1})$ and target $(a_2, a_3, \ldots, a_n)$. Hence,
there is an arc from each $(n-1)$-tuple $i \in K^{n-1}$ to each $(n-1)$-tuple $j \in K^{n-1}$
that is obtained by dropping the first entry of $i$ and adding a new entry at the
end. (Be careful: If $n = 1$, then $D$ has only one vertex but $k$ arcs. If this confuses
you, just do the $n = 1$ case by hand. For any $n > 1$, there are no parallel arcs in
$D$.)
Example 5.16.3. For example, if $n = 3$ and $k = 2$ and $K = \{0, 1\}$, then $D$
looks as follows (we again write our tuples without commas and without
parentheses):

[Figure: the digraph $D$, with vertices 00, 01, 10, 11 and arcs 000, 001, 010, 011, 100, 101, 110, 111.]

Let us make a few observations about D:

• The multidigraph $D$ is strongly connected.
[Proof: We need to show that for any two vertices $i$ and $j$ of $D$, there is a
walk from $i$ to $j$. But this is easy: Just insert the entries of $j$ into $i$ one by
one, pushing out the entries of $i$. In other words, using the notation $k_p$ for
the $p$-th entry of any tuple $k$, we have the walk
\[
\begin{aligned}
i = (i_1, i_2, \ldots, i_{n-1})
&\to (i_2, i_3, \ldots, i_{n-1}, j_1) \\
&\to (i_3, i_4, \ldots, i_{n-1}, j_1, j_2) \\
&\to \cdots \\
&\to (i_{n-1}, j_1, j_2, \ldots, j_{n-2}) \\
&\to (j_1, j_2, \ldots, j_{n-1}) = j.
\end{aligned}
\]
Note that this walk has length $n - 1$, and is the unique walk from $i$ to $j$
that has length $n - 1$. Thus, the # of walks from $i$ to $j$ that have length
$n - 1$ is 1. This will come in useful further below.]

• Thus, the multidigraph D is weakly connected (since any strongly con-


nected digraph is weakly connected).

• The multidigraph D is balanced, and in fact each vertex of D has outde-


gree k and indegree k.
[Proof: Let $i$ be a vertex of $D$. The arcs with source $i$ are the $n$-tuples
whose first $n - 1$ entries form the $(n-1)$-tuple $i$ while the last, $n$-th entry
is an arbitrary element of $K$. Thus, there are $|K|$ many such arcs. In other
words, $i$ has outdegree $k$. A similar argument shows that $i$ has indegree
$k$. This entails that $\deg^- i = \deg^+ i$. Since this holds for every vertex $i$, we
conclude that $D$ is balanced.]

• The digraph D has an Eulerian circuit.


[Proof: This follows from the directed Euler–Hierholzer theorem (Theorem
4.7.2), since $D$ is weakly connected and balanced. Alternatively, we can
derive this from the BEST theorem (Theorem 5.9.1) as follows: Pick an
arbitrary arc $a$ of $D$, and let $r$ be its source. Then, $r$ is a from-root of $D$
(since $D$ is strongly connected), and thus $D$ has a spanning arborescence
rooted from $r$ (by Theorem 5.8.4). In other words, using the notations of
the BEST theorem (Theorem 5.9.1), we have $\tau(D, r) \neq 0$. Moreover, each
vertex of $D$ has indegree $k > 0$. Thus, the BEST theorem yields
\[
\varepsilon(D, a) = \underbrace{\tau(D, r)}_{\neq 0} \cdot \prod_{u \in V} \underbrace{(\deg^- u - 1)!}_{\neq 0} \neq 0.
\]
But this shows that $D$ has an Eulerian circuit whose last arc is $a$.]
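
The observations above can be checked by machine for small $n$ and $k$. The following Python sketch builds the vertices and arcs of $D$ and tests the outdegrees, the indegrees and the walk from the proof of strong connectivity; all function names are assumptions made for this illustration.

```python
from itertools import product

def de_bruijn_digraph(alphabet, n):
    """Vertices of D are the (n-1)-tuples, arcs are the n-tuples over the alphabet."""
    vertices = list(product(alphabet, repeat=n - 1))
    arcs = list(product(alphabet, repeat=n))
    return vertices, arcs

def source(arc):
    return arc[:-1]      # first n-1 entries

def target(arc):
    return arc[1:]       # last n-1 entries

def shifting_walk(i, j):
    """Vertices of the walk from i to j that pushes the entries of j
    into i one by one (it has length n - 1)."""
    return [i[s:] + j[:s] for s in range(len(i) + 1)]

def is_walk(vertex_list, arcs):
    """Check that consecutive vertices are joined by an arc of D."""
    arc_set = set(arcs)
    for u, v in zip(vertex_list, vertex_list[1:]):
        arc = u + v[-1:]             # the only candidate arc from u to v
        if arc not in arc_set or target(arc) != v:
            return False
    return True

def is_balanced_with_degree(vertices, arcs, k):
    """Check that every vertex has outdegree k and indegree k."""
    return all(
        sum(1 for a in arcs if source(a) == v) == k
        and sum(1 for a in arcs if target(a) == v) == k
        for v in vertices)
```

For example, with `vertices, arcs = de_bruijn_digraph((0, 1), 3)` one obtains the digraph of Example 5.16.3.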

So we know that D has an Eulerian circuit c. This Eulerian circuit leads to a


de Bruijn sequence as follows:
Let p0 , p1 , . . . , pkn −1 be the arcs of c (from first to last). Extend the subscripts
periodically modulo kn (that is, set pq+kn = pq for all q ∈ N). Thus, we obtain
An introduction to graph theory, version August 2, 2023 page 253

an infinite walk69 with arcs p0 , p1 , p2 , . . . (since c is a circuit). In other words,


for each i ∈ N, the target of the arc pi is the source of the arc pi +1 .
In other words, for each i ∈ N, the last n − 1 entries of p_i are the first n − 1
entries of p_{i+1} (since the target of p_i is the tuple consisting of the last n − 1
entries of p_i, whereas the source of p_{i+1} is the tuple consisting of the first n − 1
entries of p_{i+1}). Therefore, for each i ∈ N and each j ∈ {2, 3, . . . , n}, we have

(the j-th entry of p_i) = (the (j − 1)-st entry of p_{i+1}).        (25)

Now, for each i ∈ N, we let x_i denote the first entry of the n-tuple p_i. Then,
x_{q+k^n} = x_q for all q ∈ N (since p_{q+k^n} = p_q for all q ∈ N). In other words,
the sequence (x_0, x_1, x_2, . . .) repeats itself every k^n terms. Note that the k^n-tuple
(x_0, x_1, . . . , x_{k^n−1}) consists of the first entries of the arcs p_0, p_1, . . . , p_{k^n−1} of c (by
the definition of x_i).
For each i ∈ N and each s ∈ {1, 2, . . . , n}, we have

(the s-th entry of p_i)
= (the (s − 1)-st entry of p_{i+1})        (by (25))
= (the (s − 2)-nd entry of p_{i+2})        (by (25))
= (the (s − 3)-rd entry of p_{i+3})        (by (25))
= · · ·
= (the 1-st entry of p_{i+s−1})
= x_{i+s−1}        (since x_{i+s−1} was defined as the first entry of p_{i+s−1}).

In other words, for each i ∈ N, the entries of p_i (from first to last) are
x_i, x_{i+1}, . . . , x_{i+n−1}. In other words, for each i ∈ N, we have

p_i = (x_i, x_{i+1}, . . . , x_{i+n−1}).        (26)

Now, recall that c is an Eulerian circuit. Thus, each arc of D appears exactly
once among its arcs p_0, p_1, . . . , p_{k^n−1}. In other words, each n-tuple in K^n appears
exactly once among p_0, p_1, . . . , p_{k^n−1} (since the arcs of D are the n-tuples in K^n).
In other words, as i ranges from 0 to k^n − 1, the n-tuple p_i takes each possible
value in K^n exactly once.
In view of (26), we can rewrite this as follows: As i ranges from 0 to k^n − 1,
the n-tuple (x_i, x_{i+1}, . . . , x_{i+n−1}) takes each possible value in K^n exactly once
(since this n-tuple is precisely p_i, as we have shown in the previous paragraph).
In other words, for each (a_1, a_2, . . . , a_n) ∈ K^n, there is a unique r ∈
{0, 1, . . . , k^n − 1} such that (a_1, a_2, . . . , a_n) = (x_r, x_{r+1}, . . . , x_{r+n−1}).
Hence, the k^n-tuple (x_0, x_1, . . . , x_{k^n−1}) is a de Bruijn sequence of order n on
K. This shows that a de Bruijn sequence exists. Theorem 5.16.2 is thus proven.
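
The proof above is constructive, so it can be turned into a small program. The following Python sketch is only an illustration under my own choices: the function names de_bruijn and is_de_bruijn are hypothetical, and the Eulerian circuit is found by a standard Hierholzer-style search rather than by any algorithm spelled out in these notes. It builds the arcs of D on the fly, finds an Eulerian circuit, and reads off the first entries of its arcs.

```python
from itertools import product


def de_bruijn(alphabet, n):
    """Return a de Bruijn sequence of order n on `alphabet` (a list of length k**n)."""
    alphabet = tuple(alphabet)
    # Vertices of D are the (n-1)-tuples; the arcs leaving a vertex v are the n-tuples v + (x,).
    unused = {v: list(alphabet) for v in product(alphabet, repeat=n - 1)}
    start = next(iter(unused))
    stack = [(start, None)]  # pairs (current vertex, arc used to enter it)
    arcs = []
    # Hierholzer-style search: extend the walk while possible, record arcs when backtracking.
    while stack:
        v, arc_in = stack[-1]
        if unused[v]:
            arc = v + (unused[v].pop(),)
            stack.append((arc[1:], arc))  # the target of an arc is its last n-1 entries
        else:
            stack.pop()
            if arc_in is not None:
                arcs.append(arc_in)
    arcs.reverse()  # arcs p_0, p_1, ... of an Eulerian circuit, from first to last
    return [arc[0] for arc in arcs]  # x_i = first entry of p_i


def is_de_bruijn(seq, alphabet, n):
    """Check that every n-tuple over `alphabet` occurs exactly once as a cyclic window."""
    m = len(seq)
    windows = {tuple(seq[(i + j) % m] for j in range(n)) for i in range(m)}
    return m == len(alphabet) ** n and len(windows) == m and set(seq) <= set(alphabet)


sequence = de_bruijn("01", 3)
print(sequence, is_de_bruijn(sequence, "01", 3))
```

Which de Bruijn sequence comes out depends on the order in which the search tries the arcs; the checker only confirms the defining property.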

69 We have never formally defined infinite walks, but it should be fairly clear what they are.

Example 5.16.4. For n = 3 and k = 2 and K = {0, 1}, one possible Eulerian
circuit c of D is

(00, 000, 00, 001, 01, 010, 10, 101, 01, 011, 11, 111, 11, 110, 10, 100, 00)

(where the 2-tuples are the vertices and the 3-tuples are the arcs, and where
each tuple is written without commas and parentheses). Note that all 2^3 = 8
arcs of D appear in it, including the loop 000. The first entries of
the arcs of this circuit form the sequence 00010111, which is indeed a de Bruijn
sequence of order 3 on {0, 1}. Any 3 consecutive entries of this sequence
(extended periodically to the infinite sequence || : 00010111 : ||) form the
respective arc of c.

Theorem 5.16.2 is merely the starting point of a theory. Several specific de
Bruijn sequences are known, many of them having peculiar properties. See
[Freder82] for a survey of various such sequences⁷⁰ (note that they are called
“full length nonlinear shift register sequences” in this survey).⁷¹
There are also several variations on de Bruijn sequences. For some of them,
see [ChDiGr92]. (Note that some of the open questions in that paper are still
unsolved.) A variation that recently became quite popular is the notion of a
“universal cycle for permutations” – a string that contains all “permutations”
(more precisely, n-tuples of distinct elements of K) as factors. See [EngVat18]
for some recent progress on minimizing the length of such a string, including
a contribution by a notorious hacker known as 4chan. (This is no longer really
about Eulerian circuits, since some amount of duplication cannot be avoided in
these strings.)

5.16.3. Counting de Bruijn sequences


Let us move in a different direction. Having proved the existence of de Bruijn
sequences in Theorem 5.16.2, let us try to count them!
Question. Let n and k be two positive integers. Let K be a k-element set.
How many de Bruijn sequences of order n on K are there?
To solve this, it makes sense to apply the BEST theorem to the digraph D
we have constructed above. Alas, D is not of the form G^bidir for some
undirected graph G, so we cannot apply the undirected MTT (Matrix-Tree
Theorem). However, D is a balanced multidigraph, and for such digraphs, a version
of the undirected MTT still holds:

70 Some of these sequences (the “prefer-one” and “prefer-opposite” generators) are just dis-
guised implementations of the algorithm for finding an Eulerian circuit implicit in our
proof of the BEST theorem.
71 My favorite is the one obtained by concatenating all Lyndon words whose length divides
n in lexicographically increasing order (assuming that the set K is totally ordered). See
[Moreno04] for the details of that construction.

Theorem 5.16.5 (balanced Matrix-Tree Theorem). Let D = (V, A, ψ) be a
balanced multidigraph. Assume that V = {1, 2, . . . , n} for some positive
integer n.
Let L be the Laplacian of D. Then:

(a) For any vertex r of D, we have

(# of spanning arborescences of D rooted to r) = det (L_{∼r,∼r}).

Moreover, this number does not depend on r.

(b) Let t be an indeterminate. Expand the determinant det (t I_n + L) (here,
I_n denotes the n × n identity matrix) as a polynomial in t:

det (t I_n + L) = c_n t^n + c_{n−1} t^{n−1} + · · · + c_1 t^1 + c_0 t^0,

where c_0, c_1, . . . , c_n are numbers. (Note that this is the characteristic
polynomial of L up to substituting −t for t and multiplying by a power
of −1. Some of its coefficients are c_n = 1 and c_{n−1} = Tr L and c_0 =
det L.) Then, for any vertex r of D, we have

(# of spanning arborescences of D rooted to r) = (1/n) · c_1.

(c) Let λ_1, λ_2, . . . , λ_n be the eigenvalues of L, listed in such a way that
λ_n = 0. Then, for any vertex r of D, we have

(# of spanning arborescences of D rooted to r) = (1/n) · λ_1 λ_2 · · · λ_{n−1}.

(d) Let λ_1, λ_2, . . . , λ_n be the eigenvalues of L, listed in such a way that
λ_n = 0. If all vertices of D have outdegree > 0, then

(# of Eulerian circuits of D) = |A| · (1/n) · λ_1 λ_2 · · · λ_{n−1} · ∏_{u∈V} (deg^+ u − 1)!.

(If you identify an Eulerian circuit with its cyclic rotations, then you
should drop the |A| factor on the right hand side.)
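
Here is a small sanity check of parts (a) and (d) in Python. It is only a sketch on a hypothetical balanced multidigraph of my own choosing (the arc list below is not from the notes). To stay within exact integer arithmetic, it avoids eigenvalues: by parts (a) and (c), the factor (1/n) · λ_1 λ_2 · · · λ_{n−1} in part (d) equals det (L_{∼r,∼r}) for any vertex r, so the code uses that minor instead, and compares the prediction with a brute-force count of Eulerian circuits.

```python
from itertools import permutations
from math import factorial, prod

# Hypothetical balanced multidigraph on the vertices 0, 1, 2 (two parallel arcs from 2 to 0).
arcs = [(0, 1), (1, 2), (2, 0), (0, 2), (2, 0)]
n = 3


def det(m):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


outdeg = [sum(1 for s, _ in arcs if s == v) for v in range(n)]
# Laplacian: L[i][j] = outdeg(i) * [i == j] - (# of arcs from i to j).
L = [[outdeg[i] * (i == j) - arcs.count((i, j)) for j in range(n)] for i in range(n)]


def minor(r):
    """det(L_{~r,~r}): delete the r-th row and the r-th column of L."""
    return det([[L[i][j] for j in range(n) if j != r] for i in range(n) if i != r])


def is_eulerian_circuit(order):
    """`order` lists all arc indices; consecutive arcs must fit together cyclically."""
    return all(arcs[order[i]][1] == arcs[order[(i + 1) % len(order)]][0]
               for i in range(len(order)))


minors = [minor(r) for r in range(n)]  # part (a): these should all agree
brute_force = sum(is_eulerian_circuit(p) for p in permutations(range(len(arcs))))
predicted = len(arcs) * minors[0] * prod(factorial(d - 1) for d in outdeg)
print(minors, brute_force, predicted)
```

The brute-force count treats circuits with different first arcs as different, which matches the convention with the |A| factor.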

Proof. (a) The equality comes from the MTT (Theorem 5.14.7). It remains to
prove that the # of spanning arborescences of D rooted to r does not depend
on r. But this is Corollary 5.12.1.
(b) follows from (a) as in the undirected graph case (proof of Theorem 5.15.1
(b)).⁷²
(c) follows from (b) as in the undirected graph case (proof of Theorem 5.15.1
(c)).
(d) Assume that all vertices of D have outdegree > 0. Then,

(# of Eulerian circuits of D)
= ∑_{a∈A} (# of Eulerian circuits of D whose first arc is a).

However, if a ∈ A is any arc, and if r is the source of a, then

(# of Eulerian circuits of D whose first arc is a)
= (# of spanning arborescences of D rooted to r) · ∏_{u∈V} (deg^+ u − 1)!
        (by the BEST’ theorem (Theorem 5.10.4))
= (1/n) · λ_1 λ_2 · · · λ_{n−1} · ∏_{u∈V} (deg^+ u − 1)!        (by part (c)).

Hence,

(# of Eulerian circuits of D)
= ∑_{a∈A} (# of Eulerian circuits of D whose first arc is a)
= ∑_{a∈A} (1/n) · λ_1 λ_2 · · · λ_{n−1} · ∏_{u∈V} (deg^+ u − 1)!
= |A| · (1/n) · λ_1 λ_2 · · · λ_{n−1} · ∏_{u∈V} (deg^+ u − 1)!.

This proves part (d).

72 In more detail: Just as we proved in our above proof of Theorem 5.15.1 (for the undirected
case), we have c_1 = ∑_{r=1}^{n} det (L_{∼r,∼r}). However, part (a) shows that the number det (L_{∼r,∼r})
does not depend on r. Thus, the sum ∑_{r=1}^{n} det (L_{∼r,∼r}) consists of n equal addends, which
can be written as det (L_{∼r,∼r}) for any vertex r of D. Therefore, this sum can be rewritten
as n · det (L_{∼r,∼r}) for any vertex r of D. Hence, the equality c_1 = ∑_{r=1}^{n} det (L_{∼r,∼r}) can be
rewritten as c_1 = n · det (L_{∼r,∼r}) for any vertex r of D. Therefore, det (L_{∼r,∼r}) = (1/n) · c_1 for
any vertex r of D. Since part (a) yields

(# of spanning arborescences of D rooted to r) = det (L_{∼r,∼r}),

we can rewrite this equality as

(# of spanning arborescences of D rooted to r) = (1/n) · c_1.

Now, let’s try to solve our question – i.e., let’s count the de Bruijn sequences
of order n on K.
Recall the digraph D from our above proof of Theorem 5.16.2. We constructed
a de Bruijn sequence of order n on K by finding an Eulerian circuit of D. This
actually works both ways: The map

{Eulerian circuits of D} → {de Bruijn sequences of order n on K},
c ↦ (the sequence of first entries of the arcs of c)

is a bijection (make sure you understand why!). Hence, by the bijection princi-
ple, we have

(# of de Bruijn sequences of order n on K )


= (# of Eulerian circuits of D ) . (27)

By Theorem 5.16.5 (d), however, we have

(# of Eulerian circuits of D)
= |K^n| · (1/k^{n−1}) · λ_1 λ_2 · · · λ_{k^{n−1}−1} · ∏_{u∈K^{n−1}} (deg^+ u − 1)!,        (28)

where λ_1, λ_2, . . . , λ_{k^{n−1}} are the eigenvalues of the Laplacian L of D, indexed in
such a way that λ_{k^{n−1}} = 0. (Note that the digraph D = (K^{n−1}, K^n, ψ) has k^{n−1}
vertices, not n vertices, so the “n” in Theorem 5.16.5 is k^{n−1} here.)
As we know, each vertex of D has outdegree k. That is, we have deg^+ u = k
for each u ∈ K^{n−1}. Thus,

∏_{u∈K^{n−1}} (deg^+ u − 1)! = ∏_{u∈K^{n−1}} (k − 1)! = ((k − 1)!)^{k^{n−1}}.

Also,

|K^n| · (1/k^{n−1}) = k^n · (1/k^{n−1}) = k.

It remains to find λ_1 λ_2 · · · λ_{k^{n−1}−1}. What are the eigenvalues of L?
The Laplacian L of our digraph D is a k^{n−1} × k^{n−1}-matrix whose rows and
columns are indexed by (n − 1)-tuples in K^{n−1}. Strictly speaking, we should
relabel the vertices of D as 1, 2, . . . , k^{n−1} here, in order to have a “proper matrix”
with a well-defined order on its rows and columns. But let’s not do this; instead,
I trust you can do the relabeling yourself, or just use the more general notion
of matrices that allows for the rows and the columns to be indexed by arbitrary
things (see https://mathoverflow.net/questions/317105 for details).
Let C be the adjacency matrix of the digraph D; this is the k^{n−1} × k^{n−1}-matrix
(again with rows and columns indexed by (n − 1)-tuples in K^{n−1}) whose (i, j)-th
entry is the # of arcs with source i and target j. In particular, the trace of C
is thus the # of loops of D. It is easy to see that the loops of D are precisely the
arcs of the form (x, x, . . . , x) ∈ K^n for x ∈ K; thus, D has exactly k loops. Hence,
the trace of C is k.
Recall the definition of the Laplacian matrix L. We can restate it as follows:

L = ∆ − C, (29)

where ∆ is the diagonal matrix whose diagonal entries are the outdegrees of
the vertices of D. Since each vertex of D has outdegree k, the latter diagonal
matrix ∆ is simply k · I, where I is the identity matrix (of the appropriate size).
Hence, (29) can be rewritten as

L = k · I − C.

Thus, if γ_1, γ_2, . . . , γ_{k^{n−1}} are the eigenvalues of C, then
k − γ_1, k − γ_2, . . . , k − γ_{k^{n−1}} are the eigenvalues of L. Computing the
former will thus help us find the latter.
Furthermore, let J be the k^{n−1} × k^{n−1}-matrix (again with rows and columns
indexed by (n − 1)-tuples in K^{n−1}) all of whose entries are 1. It is easy to see that
the eigenvalues of J are

0, 0, . . . , 0, k^{n−1}        (with k^{n−1} − 1 many zeroes).

(The easiest way to see this is by noticing that J has rank 1 and trace k^{n−1}.⁷³)

Now, here is something really underhanded: We observe that

C^{n−1} = J.

[Proof: We need to show that all entries of the matrix C^{n−1} are 1. So let i and
j be two vertices of D. We must then show that the (i, j)-th entry of C^{n−1} is 1.
Recall the combinatorial interpretation of the powers of an adjacency matrix
(Theorem 4.5.10): For any ℓ ∈ N, the (i, j)-th entry of C^ℓ is the # of walks from
i to j (in D) that have length ℓ. Thus, in particular, the (i, j)-th entry of C^{n−1}
is the # of walks from i to j (in D) that have length n − 1. But this number
is actually 1, as we have already shown in our above proof of Theorem 5.16.2.
This completes the proof of C^{n−1} = J.]
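
If you want to see the identity C^{n−1} = J with your own eyes, here is a quick Python check for a few small values of k and n. This is just a sketch: the helper names adjacency, mat_mul and mat_pow are my own, and I take K = {0, 1, . . . , k − 1} for convenience. It also confirms that the trace of C is k.

```python
from itertools import product


def adjacency(k, n):
    """Adjacency matrix C of D for K = {0, ..., k-1}: vertices are (n-1)-tuples, arcs are n-tuples."""
    verts = list(product(range(k), repeat=n - 1))
    index = {v: i for i, v in enumerate(verts)}
    C = [[0] * len(verts) for _ in verts]
    for v in verts:
        for x in range(k):
            arc = v + (x,)
            # source = first n-1 entries of the arc, target = last n-1 entries of the arc
            C[index[arc[:-1]]][index[arc[1:]]] += 1
    return C


def mat_mul(A, B):
    """Product of two square integer matrices of the same size."""
    size = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(size)) for j in range(size)]
            for i in range(size)]


def mat_pow(A, e):
    """A raised to the e-th power (e >= 0), by repeated multiplication."""
    result = [[int(i == j) for j in range(len(A))] for i in range(len(A))]  # identity matrix
    for _ in range(e):
        result = mat_mul(result, A)
    return result


for k, n in [(2, 3), (3, 2), (2, 4)]:
    C = adjacency(k, n)
    power = mat_pow(C, n - 1)
    trace = sum(C[i][i] for i in range(len(C)))
    print(k, n, all(entry == 1 for row in power for entry in row), trace == k)
```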
How does this help us compute the eigenvalues of C? Well, let γ_1, γ_2, . . . , γ_{k^{n−1}}
be the eigenvalues of C. Then, for any ℓ ∈ N, the eigenvalues of C^ℓ are
γ_1^ℓ, γ_2^ℓ, . . . , γ_{k^{n−1}}^ℓ (this is a fact that holds for any square matrix, and is probably
easiest to prove using the Jordan canonical form or triangularization). Hence, in
particular, γ_1^{n−1}, γ_2^{n−1}, . . . , γ_{k^{n−1}}^{n−1} are the eigenvalues of C^{n−1} = J; but we know
that the latter eigenvalues are 0, 0, . . . , 0, k^{n−1} (with k^{n−1} − 1 many zeroes). Hence,
all but one of the k^{n−1} numbers γ_1^{n−1}, γ_2^{n−1}, . . . , γ_{k^{n−1}}^{n−1} equal 0. Thus, all but one
of the k^{n−1} numbers γ_1, γ_2, . . . , γ_{k^{n−1}} equal 0 (we don’t know what the remaining number is,
since (n − 1)-st roots are not uniquely determined in ℂ). In other words, all but
one of the eigenvalues of C equal 0. The remaining eigenvalue must thus be
the trace of C (because the sum of the eigenvalues of a square matrix is known
to be the trace of that matrix), and therefore equal k (since we know that the
trace of C is k).

73 Here are the details: The matrix J has rank 1 (since all its rows are the same); thus, all but one
of its eigenvalues are 0. It remains to show that the remaining eigenvalue is k^{n−1}. However,
it is known that the sum of the eigenvalues of a square matrix equals its trace. Thus, if all
but one of the eigenvalues of a square matrix are 0, then the remaining eigenvalue equals
its trace. Applying this to our matrix J, we see that its remaining eigenvalue equals its trace,
which is k^{n−1}.
So we have shown that the eigenvalues of C are

0, 0, . . . , 0, k        (with k^{n−1} − 1 many zeroes).

Thus, the eigenvalues of L are

k − 0, k − 0, . . . , k − 0, k − k        (with k^{n−1} − 1 many (k − 0)’s)

(because if γ_1, γ_2, . . . , γ_{k^{n−1}} are the eigenvalues of C, then
k − γ_1, k − γ_2, . . . , k − γ_{k^{n−1}} are the eigenvalues of L). In other words,
the eigenvalues of L are

k, k, . . . , k, 0        (with k^{n−1} − 1 many k’s).

Hence, the eigenvalues λ_1, λ_2, . . . , λ_{k^{n−1}−1} in (28) all equal k. Thus, (28) simplifies to

(# of Eulerian circuits of D)
= |K^n| · (1/k^{n−1}) · k k · · · k · ∏_{u∈K^{n−1}} (deg^+ u − 1)!
        (where the product k k · · · k has k^{n−1} − 1 factors)
= k · k^{k^{n−1}−1} · ((k − 1)!)^{k^{n−1}}
        (since |K^n| · (1/k^{n−1}) = k and ∏_{u∈K^{n−1}} (deg^+ u − 1)! = ((k − 1)!)^{k^{n−1}})
= k^{k^{n−1}} · ((k − 1)!)^{k^{n−1}}
= (k · (k − 1)!)^{k^{n−1}} = k!^{k^{n−1}}        (since k · (k − 1)! = k!).

In view of this, we can rewrite (27) as

(# of de Bruijn sequences of order n on K) = k!^{k^{n−1}}.
Thus, we have proved the following:

Theorem 5.16.6. Let n and k be positive integers. Let K be a k-element set.
Then,

(# of de Bruijn sequences of order n on K) = k!^{k^{n−1}}.

What a nice (and huge) answer!
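
For very small n and k, the formula can be confirmed by sheer brute force. The following Python sketch (with a function name count_de_bruijn of my own choosing, and with K = {0, 1, . . . , k − 1}) enumerates all k^n-tuples and counts those whose cyclic windows of length n are pairwise distinct; this is feasible only for tiny parameters, since there are k^(k^n) candidates.

```python
from itertools import product
from math import factorial


def count_de_bruijn(k, n):
    """Count the k**n-tuples over {0, ..., k-1} whose cyclic windows of length n are all distinct."""
    m = k ** n
    count = 0
    for seq in product(range(k), repeat=m):
        windows = {tuple(seq[(i + j) % m] for j in range(n)) for i in range(m)}
        # m distinct windows among the m possible n-tuples: each n-tuple occurs exactly once
        count += len(windows) == m
    return count


for k, n in [(2, 1), (2, 2), (2, 3), (3, 2)]:
    print(k, n, count_de_bruijn(k, n), factorial(k) ** (k ** (n - 1)))
```

Note that rotations of a de Bruijn sequence are counted separately here, in agreement with the definition of a de Bruijn sequence as a k^n-tuple.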


Our above proof of Theorem 5.16.6 is essentially taken from [Stanle18, Chap-
ter 10].
We note that a combinatorial proof of Theorem 5.16.6 (avoiding any use of
linear algebra) has been recently given in [BidKis02].

5.17. More on Laplacians


Much more can be said about the Laplacian of a digraph. The study of matrices
associated to a graph or digraph is known as spectral graph theory; I’d say the
Laplacian is probably the most prominent of these matrices (even though the
adjacency matrix is somewhat easier to define). The original form of the matrix-
tree theorem (actually a subtler variant of Theorem 5.15.1 (a)) was found by
Gustav Kirchhoff in his study of electricity [Kirchh47] (see [Holzer22, §2.1.1] for
a modern exposition); the effective resistance between two nodes of an electrical
network is a ratio of spanning-tree counts and thus can be computed using the
Laplacian (see, e.g., [Vos16, §2 and §3]). To be more precise, this relies on a
“weighted count” of spanning trees, which is more general than the counting
we have done so far; we will learn about it in the next section.
Another application of Laplacians is to drawing graphs: see “spectral layout”
or “spectral graph drawing” (e.g., [Gallie13]).

5.18. On the left nullspace of the Laplacian


Let me mention one more result about Laplacians of digraphs that answers a
rather natural question you might already have asked yourself. Recall that the
Laplacian L of a digraph D always satisfies Le = 0, where e = (1, 1, . . . , 1)^T is
the column vector all of whose entries are 1. Thus,
the vector e belongs to the right nullspace (= right kernel) of L. It is not hard
to see that if D has a to-root and we are working over a characteristic-0 field,
then e spans this nullspace, i.e., there are no vectors in that nullspace other than
scalar multiples of e. (This is actually an “if and only if”.) What about the left
nullspace of L ? Can we explicitly find a nonzero vector f with f L = 0 ? The
answer is positive:

Theorem 5.18.1 (harmonic vector theorem for Laplacians). Let D = (V, A, ψ)
be a multidigraph, where V = {1, 2, . . . , n} for some n ∈ N.
For each r ∈ V, let τ (D, r) be the # of spanning arborescences of D rooted
to r.
Let f be the row vector (τ (D, 1), τ (D, 2), . . . , τ (D, n)). Then, f L = 0.

Theorem 5.18.1 (or, more precisely, its weighted version, which we will see in
the next section) can be used to explicitly compute the steady state of a Markov
chain (see [KrGrWi10]); a similar interpretation, but in economic terms
(emergence of money in a barter economy), appears in [Sahi14, §1].
We shall give a proof of Theorem 5.18.1 based upon two lemmas. The first
lemma is a general linear-algebraic result:

Lemma 5.18.2. Let B be an n × n-matrix over an arbitrary commutative ring
K. (For example, K can be R, in which case B is a real matrix.) Assume
that the sum of all columns of B is the zero vector. Then, for any r, s, t ∈
{1, 2, . . . , n}, we have

det (B_{∼r,∼t}) = (−1)^{s−t} det (B_{∼r,∼s}).

Proof of Lemma 5.18.2. There are various ways to prove this, but here is probably
the most elegant one:
We WLOG assume that s ≠ t, since otherwise the claim is obvious. Let us
now change the r-th row of the matrix B as follows:

• We replace the s-th entry of the r-th row by 1.

• We replace the t-th entry of the r-th row by −1.

• We replace all other entries of the r-th row by 0.

Let C be the resulting n × n-matrix.⁷⁴ Thus, C agrees with B in all rows other
than the r-th one. Hence, in particular,

C_{∼r,∼k} = B_{∼r,∼k}        for each k ∈ {1, 2, . . . , n}.        (30)

74 For example, if n = 4 and

        ( a      b      c      d    )
    B = ( a′     b′     c′     d′   )
        ( a′′    b′′    c′′    d′′  )
        ( a′′′   b′′′   c′′′   d′′′ )

and s = 1 and t = 3 and r = 2, then

        ( a      b      c      d    )
    C = ( 1      0      −1     0    )
        ( a′′    b′′    c′′    d′′  )
        ( a′′′   b′′′   c′′′   d′′′ ).

Note also that the only nonzero entries in the r-th row of C are⁷⁵ C_{r,s} = 1 and
C_{r,t} = −1. Hence, the entries in the r-th row of C add up to 0.
Recall that the sum of all columns of B is the zero vector. In other words,
in each row of B, the entries add up to 0. The matrix C therefore also has this
property (because the only row of C that differs from the corresponding row
of B is the r-th row; however, we have shown above that in the r-th row, the
entries of C also add up to 0). In other words, the sum of all columns of C is
the zero vector. This easily entails that det C = 0.⁷⁶
On the other hand, Laplace expansion along the r-th row yields

det C = ∑_{k=1}^{n} (−1)^{r+k} C_{r,k} det (C_{∼r,∼k})
= (−1)^{r+s} · 1 · det (C_{∼r,∼s}) + (−1)^{r+t} · (−1) · det (C_{∼r,∼t})

(since the only nonzero entries C_{r,k} in the r-th row of C are C_{r,s} = 1 and C_{r,t} =
−1). Comparing this with det C = 0, we obtain

0 = (−1)^{r+s} · 1 · det (C_{∼r,∼s}) + (−1)^{r+t} · (−1) · det (C_{∼r,∼t})
= (−1)^{r+s} det (C_{∼r,∼s}) − (−1)^{r+t} det (C_{∼r,∼t})
= (−1)^{r+s} det (B_{∼r,∼s}) − (−1)^{r+t} det (B_{∼r,∼t})

(since (30) yields C_{∼r,∼s} = B_{∼r,∼s} and C_{∼r,∼t} = B_{∼r,∼t}).
In other words, (−1)^{r+t} det (B_{∼r,∼t}) = (−1)^{r+s} det (B_{∼r,∼s}). Dividing both
sides of this by (−1)^{r+t}, we obtain det (B_{∼r,∼t}) = (−1)^{s−t} det (B_{∼r,∼s}). This
proves Lemma 5.18.2.
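
Lemma 5.18.2 is easy to test on an example. Here is a Python sketch that does so for one hypothetical 4 × 4 integer matrix (the entries are arbitrary values I picked; only the last column is forced, so that each row sums to 0). The helper names det and minor are mine. Since s − t and (s − t) mod 2 have the same parity, the sign (−1)^{s−t} is computed as (−1)^{(s−t) mod 2} to stay within integers.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def minor(B, r, s):
    """det(B_{~r,~s}): delete the r-th row and the s-th column (1-based, as in the text)."""
    return det([[B[i][j] for j in range(len(B)) if j != s - 1]
                for i in range(len(B)) if i != r - 1])


# Hypothetical integer matrix; the last column is chosen so that every row sums to 0,
# i.e., the sum of all columns of B is the zero vector.
rows = [[3, -1, 4], [1, 5, -9], [2, 6, 5], [-3, 5, 8]]
B = [row + [-sum(row)] for row in rows]
n = len(B)

lemma_holds = all(
    minor(B, r, t) == (-1) ** ((s - t) % 2) * minor(B, r, s)
    for r in range(1, n + 1) for s in range(1, n + 1) for t in range(1, n + 1)
)
print(lemma_holds)
```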
Our next lemma is the following generalization of Theorem 5.14.7:

Theorem 5.18.3 (Matrix-Tree Theorem, off-diagonal version). Let D =
(V, A, ψ) be a multidigraph. Assume that V = {1, 2, . . . , n} for some
positive integer n.
Let L be the Laplacian of D. Let r and s be two vertices of D. Then,

(# of spanning arborescences of D rooted to r) = (−1)^{r+s} det (L_{∼r,∼s}).

75 We are using the notation C_{r,k} for the entry of C in the r-th row and the k-th column.
76 Proof. It is well-known that the determinant of a matrix does not change if we add a column
to another. Hence, the determinant of C will not change if we add each column of C other
than the first one to the first column of C. However, the result of this operation will be
a matrix whose first column is 0 (since the sum of all columns of C is the zero vector),
and therefore this matrix will have determinant 0. Since the operation did not change the
determinant, we thus conclude that the determinant of C was 0. In other words, det C = 0.

Note that Theorem 5.14.7 is the particular case of Theorem 5.18.3 for s = r.
Fortunately, using Lemma 5.18.2, we can easily derive the general case from the
particular:
Proof of Theorem 5.18.3. We have seen (in the proof of Proposition 5.14.6) that
the sum of all columns of the Laplacian L is the zero vector. Hence, Lemma
5.18.2 (applied to K = Q and B = L and t = r) yields

det (L_{∼r,∼r}) = (−1)^{s−r} det (L_{∼r,∼s}) = (−1)^{r+s} det (L_{∼r,∼s})

(since (−1)^{s−r} = (−1)^{r+s}). However, the Matrix-Tree Theorem (Theorem 5.14.7) yields

(# of spanning arborescences of D rooted to r) = det (L_{∼r,∼r})
= (−1)^{r+s} det (L_{∼r,∼s}).

This proves Theorem 5.18.3.
We are now ready to prove Theorem 5.18.1:
Proof of Theorem 5.18.1. For each r, s ∈ {1, 2, . . . , n}, we have

τ (D, r) = (# of spanning arborescences of D rooted to r)
        (by the definition of τ (D, r))
= (−1)^{r+s} det (L_{∼r,∼s})        (31)

(by Theorem 5.18.3).
However, we have f = (τ (D, 1), τ (D, 2), . . . , τ (D, n)). Thus, for each s ∈
{1, 2, . . . , n}, the s-th entry of the row vector f L is⁷⁷

∑_{r=1}^{n} τ (D, r) L_{r,s}
= ∑_{r=1}^{n} (−1)^{r+s} det (L_{∼r,∼s}) L_{r,s}        (by (31))
= ∑_{r=1}^{n} (−1)^{r+s} L_{r,s} det (L_{∼r,∼s})
= det L
        (since Laplace expansion along the s-th column
        yields det L = ∑_{r=1}^{n} (−1)^{r+s} L_{r,s} det (L_{∼r,∼s}))
= 0

(by Proposition 5.14.6). This shows that all entries of f L are 0. In other words,
f L = 0. Theorem 5.18.1 is thus proved.

77 We are using the notation L_{r,s} for the entry of the matrix L in the r-th row and the s-th
column.
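
To see Theorem 5.18.1 in action, here is a Python sketch on a hypothetical (non-balanced) multidigraph of my own choosing, with vertices numbered 0, 1, 2 instead of 1, 2, 3. The function tau counts spanning arborescences rooted to r by brute force, using the fact that in such an arborescence each vertex other than r has exactly one outgoing arc and following these arcs leads to r. The code then forms the row vector f and multiplies it by the Laplacian.

```python
from itertools import product

# Hypothetical multidigraph on the vertices 0, 1, 2 (note the two parallel arcs from 1 to 0).
arcs = [(0, 1), (1, 2), (0, 2), (2, 0), (1, 0), (1, 0)]
n = 3


def tau(r):
    """Brute-force # of spanning arborescences rooted to r: each vertex v != r picks one
    outgoing arc, and following the picked arcs from any vertex must lead to r."""
    others = [v for v in range(n) if v != r]
    choices = [[a for a in arcs if a[0] == v] for v in others]
    count = 0
    for pick in product(*choices):
        step = {a[0]: a[1] for a in pick}
        ok = True
        for v in others:
            for _ in range(n):  # a path to r has fewer than n arcs
                if v == r:
                    break
                v = step[v]
            ok = ok and v == r
        count += ok
    return count


outdeg = [sum(1 for s, _ in arcs if s == v) for v in range(n)]
# Laplacian: L[i][j] = outdeg(i) * [i == j] - (# of arcs from i to j).
L = [[outdeg[i] * (i == j) - arcs.count((i, j)) for j in range(n)] for i in range(n)]

f = [tau(r) for r in range(n)]
fL = [sum(f[r] * L[r][s] for r in range(n)) for s in range(n)]
print(f, fL)
```

Parallel arcs are treated as distinct, so two arborescences that differ only in which copy of an arc they use are counted separately, as they should be in a multidigraph.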

Other proofs of Theorem 5.18.1 exist. In particular, a combinatorial proof is
sketched in [Sahi14, Theorem 1]. (More precisely, [Sahi14, Theorem 1] is the
claim of Theorem 5.18.1 upon reversing all the arcs and replacing
all matrices by their transposes.)⁷⁸

5.19. A weighted Matrix-Tree Theorem


5.19.1. Definitions
We have so far been counting arborescences. A natural generalization of count-
ing is weighted counting – i.e., you assign a certain number (a “weight”) to
each arborescence (or whatever object you are interested in), and then you sum
the weights of all arborescences (instead of merely counting them). This gener-
alizes counting, because if all weights are 1, then you get the # of arborescences.
If you pick the weights to be completely random, then the sum won’t usually
be particularly interesting. However, some choices of weights lead to good
behavior. Let us see what we get if we assign a weight to each arc of our
digraph, and then define the weight of an arborescence to be the product of the
weights of the arcs that appear in this arborescence.

Definition 5.19.1. Let D = (V, A, ψ) be a multidigraph.
Let K be a commutative ring. Assume that an element w_a ∈ K is assigned
to each arc a ∈ A. We call this w_a the weight of the arc a. (You can assume
that K = R, so that the weights are just numbers.)
(a) For any two vertices i, j ∈ V, we let a^w_{i,j} be the sum of the weights of all
arcs of D that have source i and target j.
(b) For any vertex i ∈ V, we define the weighted outdegree deg^{+w} i of i to
be the sum

∑_{a∈A; the source of a is i} w_a.

(c) If B is a subdigraph of D, then the weight w (B) of B is defined to be
the product ∏_{a is an arc of B} w_a. This is the product of the weights of all arcs
of B.
(d) Assume that V = {1, 2, . . . , n} for some n ∈ N. The weighted Laplacian
of D (with respect to the weights w_a) is defined to be the n × n-matrix
L^w ∈ K^{n×n} (note that the “w” here is a superscript, not an
exponent) whose entries are given by

L^w_{i,j} = deg^{+w} i · [i = j] − a^w_{i,j}        for all i, j ∈ V.

78 I tried to explain this proof in more detail in the solutions to Spring 2018 Math 4707 midterm
#3 – see the proof of Theorem 0.7 in those solutions; you be the judge if I succeeded.

These definitions generalize analogous definitions in the “unweighted case”.
Indeed, if we take all the arc weights w_a to be 1, then the weighted outdegree
deg^{+w} i of a vertex i becomes its usual outdegree deg^+ i, and the weighted
Laplacian L^w becomes the usual Laplacian L. The weight w (B) of a subdigraph B
simply becomes 1 in this case.

5.19.2. The weighted Matrix-Tree Theorem


We can now generalize the original MTT (= Matrix-Tree Theorem)⁷⁹ as follows:

Theorem 5.19.2 (weighted Matrix-Tree Theorem). Let D = (V, A, ψ) be a
multidigraph.
Let K be a commutative ring. Assume that an element w_a ∈ K is assigned
to each arc a ∈ A. We call this w_a the weight of the arc a.
Assume that V = {1, 2, . . . , n} for some n ∈ N. Let L^w be the weighted
Laplacian of D.
Let r be a vertex of D. Then,

∑_{B is a spanning arborescence of D rooted to r} w (B) = det (L^w_{∼r,∼r}).

79 To remind: The original MTT is Theorem 5.14.7.

Example 5.19.3. Let D be the following multidigraph:

[Figure: the multidigraph D with vertices 1, 2, 3 and arcs α (from 1 to 2), β (from 2 to 3), γ (from 1 to 3) and δ (from 3 to 1)]

and let r = 3.
Then, D has two spanning arborescences rooted to r. One of the two has arcs
α and β (and thus has weight w_α w_β); the other has arcs γ and β (and thus
has weight w_γ w_β). Hence,

∑_{B is a spanning arborescence of D rooted to r} w (B) = w_α w_β + w_γ w_β.        (32)

The weighted Laplacian L^w is

          ( w_α + w_γ    −w_α    −w_γ )
    L^w = ( 0            w_β     −w_β )
          ( −w_δ         0       w_δ  )

(since, for example, deg^{+w} 1 = w_α + w_γ and a^w_{1,1} = 0 and a^w_{1,2} = w_α). Thus,

    L^w_{∼3,∼3} = ( w_α + w_γ    −w_α )
                  ( 0            w_β  )

and therefore

det (L^w_{∼3,∼3}) = (w_α + w_γ) w_β = w_α w_β + w_γ w_β.

The right hand side of this agrees with that of (32). This confirms the
weighted MTT for our D and r.
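
The same comparison can be made numerically. In the following Python sketch, the sources and targets of the four arcs are read off from the weighted Laplacian in Example 5.19.3, while the numeric weights 2, 3, 5, 7 are arbitrary test values of my own (as are all function names). The code sums the weights of all spanning arborescences rooted to r = 3 by brute force and compares the result with det (L^w_{∼3,∼3}).

```python
from itertools import product
from math import prod

# Arcs as (source, target, weight). The endpoints are read off from the weighted Laplacian
# of the example; the numeric weights are arbitrary test values.
arcs = {"alpha": (1, 2, 2), "beta": (2, 3, 3), "gamma": (1, 3, 5), "delta": (3, 1, 7)}
vertices = [1, 2, 3]
r = 3


def det(m):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def weighted_laplacian():
    """L^w with entries deg^{+w} i * [i = j] - a^w_{i,j}."""
    return [[sum(w for s, t, w in arcs.values() if s == i) * (i == j)
             - sum(w for s, t, w in arcs.values() if (s, t) == (i, j))
             for j in vertices] for i in vertices]


def arborescence_weight_sum():
    """Sum of w(B) over all spanning arborescences B rooted to r, by brute force."""
    others = [v for v in vertices if v != r]
    choices = [[a for a in arcs.values() if a[0] == v] for v in others]
    total = 0
    for pick in product(*choices):  # one outgoing arc for each vertex other than r
        step = {s: t for s, t, _ in pick}

        def reaches_root(v):
            for _ in vertices:
                if v == r:
                    return True
                v = step[v]
            return v == r

        if all(reaches_root(v) for v in others):
            total += prod(w for _, _, w in pick)
    return total


Lw = weighted_laplacian()
minor = [[Lw[i][j] for j in range(3) if vertices[j] != r]
         for i in range(3) if vertices[i] != r]
print(arborescence_weight_sum(), det(minor))
```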

As we already said, the weighted MTT generalizes the original MTT, because
if we take all w a ’s to be 1, we just recover the original MTT.
However, we can also go backwards: we can derive the weighted MTT from
the original MTT. Let us do this.

5.19.3. The polynomial identity trick


First, we recall a standard result in algebra, known as the principle of perma-
nence of polynomial identities or as the polynomial identity trick (it also goes
under several other names). Here is one incarnation of this principle:

Theorem 5.19.4 (principle of permanence of polynomial identities). Let
P (x_1, x_2, . . . , x_m) and Q (x_1, x_2, . . . , x_m) be two polynomials with integer
coefficients in several indeterminates x_1, x_2, . . . , x_m. Assume that the equality

P (k_1, k_2, . . . , k_m) = Q (k_1, k_2, . . . , k_m)        (33)

holds for every m-tuple (k_1, k_2, . . . , k_m) ∈ N^m of nonnegative integers. Then,
P (x_1, x_2, . . . , x_m) and Q (x_1, x_2, . . . , x_m) are identical as polynomials (so that,
in particular, the equality (33) holds not only for every (k_1, k_2, . . . , k_m) ∈ N^m,
but also for every (k_1, k_2, . . . , k_m) ∈ ℂ^m, and more generally, for every
(k_1, k_2, . . . , k_m) ∈ K^m where K is an arbitrary commutative ring).

Theorem 5.19.4 is often summarized as “in order to prove that two polynomials
are equal, it suffices to show that they are equal on all nonnegative integer
points” (where a “nonnegative integer point” means a point – i.e., a tuple of
inputs – all of whose entries are nonnegative integers). Even shorter, one says
that “a polynomial identity (i.e., an equality between two polynomials) needs
only to be checked on nonnegative integers”. For example, if you can prove the
equality
(x + y)^4 + (x − y)^4 = 2x^4 + 12x^2 y^2 + 2y^4
for all nonnegative integers x and y, then you automatically conclude that this
equality holds as a polynomial identity, and thus is true for any elements x and
y of a commutative ring.
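
As a toy illustration, here is a Python sketch (function names lhs and rhs are mine) that checks the above equality on a 5 × 5 grid of nonnegative integers. Both sides have degree at most 4 in each of x and y, so agreement on such a grid already forces the two polynomials to be identical (this is the finite-set variant of the principle, in the spirit of [Alon02, Lemma 2.1]). The last two lines then spot-check the identity at a negative and at a rational input.

```python
from fractions import Fraction


def lhs(x, y):
    return (x + y) ** 4 + (x - y) ** 4


def rhs(x, y):
    return 2 * x ** 4 + 12 * x ** 2 * y ** 2 + 2 * y ** 4


# Check the equality on all points (x, y) with x, y in {0, 1, 2, 3, 4}.
grid_ok = all(lhs(x, y) == rhs(x, y) for x in range(5) for y in range(5))
print(grid_ok)

# Being a polynomial identity, the equality must then hold for other inputs as well.
print(lhs(-3, 7) == rhs(-3, 7))
print(lhs(Fraction(-3, 7), Fraction(5, 2)) == rhs(Fraction(-3, 7), Fraction(5, 2)))
```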

A typical application of Theorem 5.19.4 is to argue that a polynomial identity
you have proved for all nonnegative integers must automatically hold for all
inputs (because of Theorem 5.19.4). Some examples of such reasoning can be
found in [19fco, §2.6.3 and §2.6.4]. A variant of Theorem 5.19.4 is [Conrad21,
Theorem 2.6]; actually, the proof of [Conrad21, Theorem 2.6] can be trivially
adapted to prove Theorem 5.19.4 (just replace “nonempty open set in ℂ^k” by
“N^k”). In truth, there is nothing special about nonnegative integers and the
set N; you could replace N by any infinite set of numbers (or even any
sufficiently large set of numbers, where “sufficiently large” means “more than
max {deg P, deg Q} many”). See [Alon02, Lemma 2.1] for a fairly general
version of Theorem 5.19.4 that includes such cases⁸⁰.

5.19.4. Proof of the weighted MTT


We can now deduce the weighted MTT from the original MTT (Theorem 5.14.7):
Proof of Theorem 5.19.2. The claim of Theorem 5.19.2 (for fixed D and r) is an
equality between two polynomials in the arc weights w_a. (For instance, in
Example 5.19.3, this equality is

    w_α w_β + w_γ w_β = det ( w_α + w_γ    −w_α )
                            ( 0            w_β  ).)

Therefore, thanks to Theorem 5.19.4, it suffices to prove this equality in the
case when all arc weights w_a are nonnegative integers. So let us WLOG assume
that all arc weights w_a are nonnegative integers.
Let us now replace each arc a of D by w_a many copies of the arc a (having
the same source as a and the same target as a). The result is a new digraph D′.
Here is an example:
Example 5.19.5. Let D be the digraph

[Figure: the digraph D, with arcs α, β and γ]

and let the arc weights be w_α = 2 and w_β = 3 and w_γ = 2. Then, D′ looks as
follows:

[Figure: the digraph D′, with arcs α_1, α_2, β_1, β_2, β_3, γ_1, γ_2]

where α_1, α_2 are the two arcs obtained from α, and so on.

80 To be precise, [Alon02, Lemma 2.1] is not concerned with two polynomials being identical,
but rather with one polynomial being identically zero. But this is an equivalent question:
Two polynomials P and Q are identical if and only if their difference P − Q is identically
zero.
Now, recall that the digraph D′ has the same vertices as D, but each arc a
of D has turned into w_a arcs of D′. Thus, the weighted outdegree deg^{+w} i of
a vertex i of D equals the (usual, i.e., non-weighted) outdegree deg^+ i of the
same vertex i of D′. Likewise, for any two vertices i and j, the sum a^w_{i,j}
equals the # of arcs of D′ with source i and target j. Hence, the weighted
Laplacian L^w of D is the (usual, i.e., non-weighted) Laplacian of D′.
Recall again that the digraph D′ has the same vertices as D, but each arc a
of D has turned into w_a arcs of D′. Thus, each subdigraph B of D gives rise
to w (B) many subdigraphs of D′ (because we can replace each arc a of B by
any of the w_a many copies of this arc in D′). Moreover, this correspondence
takes spanning arborescences to spanning arborescences⁸¹, and we can obtain
any spanning arborescence of D′ in this way from exactly one B. Hence,

∑_{B is a spanning arborescence of D rooted to r} w (B) = (# of spanning arborescences of D′ rooted to r).

Thus, applying the original MTT (Theorem 5.14.7) to D′ yields the weighted
MTT for D (since the weighted Laplacian L^w of D is the (usual, i.e.,
non-weighted) Laplacian of D′). This completes the proof of Theorem 5.19.2.
[Remark: Alternatively, it is not hard to adapt our above proof of the original
MTT to the weighted case.]
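The weighted MTT can also be checked by brute force on a small example. The following Python sketch assumes a hypothetical digraph on 3 vertices, given as triples (source, target, weight); the function names are likewise made up for illustration. It sums $w(B)$ over all spanning arborescences $B$ rooted to $r$ and compares the result with the determinant of the weighted Laplacian with its $r$-th row and column removed.

```python
from itertools import product
from math import prod

def det(M):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def weighted_laplacian(n, arcs):
    """L[i][j] = (weighted outdegree of i) * [i = j] - (total weight of arcs from i to j)."""
    L = [[0] * n for _ in range(n)]
    for (s, t, w) in arcs:
        L[s - 1][s - 1] += w
        L[s - 1][t - 1] -= w
    return L

def minor(L, r):
    """Remove the r-th row and the r-th column (r is 1-based)."""
    return [[L[i][j] for j in range(len(L)) if j != r - 1]
            for i in range(len(L)) if i != r - 1]

def arborescence_weight_sum(n, arcs, r):
    """Sum of w(B) over all spanning arborescences B rooted to r."""
    others = [v for v in range(1, n + 1) if v != r]
    out = {v: [(t, w) for (s, t, w) in arcs if s == v] for v in others}
    total = 0
    # An arborescence rooted to r picks exactly one outgoing arc for each vertex other than r.
    for choice in product(*(out[v] for v in others)):
        nxt = {v: t for v, (t, _) in zip(others, choice)}

        def reaches_root(v):
            seen = set()
            while v != r and v not in seen:
                seen.add(v)
                v = nxt[v]
            return v == r

        # The chosen arcs form an arborescence iff every vertex reaches r along them.
        if all(reaches_root(v) for v in others):
            total += prod(w for (_, w) in choice)
    return total

# Hypothetical example: arcs as (source, target, weight).
arcs = [(1, 2, 2), (2, 3, 3), (3, 1, 2), (1, 3, 5), (2, 1, 7)]
L = weighted_laplacian(3, arcs)
for r in (1, 2, 3):
    print(r, arborescence_weight_sum(3, arcs, r), det(minor(L, r)))
```

Parallel arcs are handled correctly as long as they appear as separate triples in the list, since each triple is treated as its own arc.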

5.19.5. Application: Counting trees by their degrees


The weighted MTT has some applications that wouldn’t be obvious from the
original MTT. Here is one:

81 More precisely: Let $B$ be a subdigraph of $D$, and let $B'$ be any of the $w(B)$ many
subdigraphs of $D'$ that are obtained from $B$ through this correspondence. Then, $B$ is a spanning
arborescence of $D$ rooted to $r$ if and only if $B'$ is a spanning arborescence of $D'$ rooted to $r$.

Exercise 5.27. Let $n \geq 2$ be an integer, and let $d_1, d_2, \ldots, d_n$ be $n$ positive
integers. An n-tree shall mean a simple graph with vertex set $\{1, 2, \ldots, n\}$
that is a tree. We know from Corollary 5.14.9 that there are $n^{n-2}$ many
n-trees. How many of these n-trees have the property that

\[
\deg i = d_i \text{ for each vertex } i \ ?
\]

Solution. The n-trees are just the spanning trees of the complete graph $K_n$.
To incorporate the $\deg i = d_i$ condition into our count, we use a generating
function. So let us not fix the numbers $d_1, d_2, \ldots, d_n$, but rather consider the
polynomial

\[
P(x_1, x_2, \ldots, x_n) := \sum_{T \text{ is an } n\text{-tree}} x_1^{\deg 1} x_2^{\deg 2} \cdots x_n^{\deg n}
\tag{34}
\]

in $n$ indeterminates $x_1, x_2, \ldots, x_n$ (where $\deg i$ means the degree of $i$ in $T$). Then,
the $x_1^{d_1} x_2^{d_2} \cdots x_n^{d_n}$-coefficient of this polynomial $P(x_1, x_2, \ldots, x_n)$ is the # of
n-trees $T$ satisfying the property that

\[
\deg i = d_i \text{ for each vertex } i
\]

(because each such n-tree $T$ contributes a monomial $x_1^{d_1} x_2^{d_2} \cdots x_n^{d_n}$ to the sum on
the right hand side of (34), whereas any other n-tree $T$ contributes a different
monomial to this sum).
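For small $n$, the polynomial $P$ can be computed directly from its definition (34). The following Python sketch does this by brute force: it tries all sets of $n-1$ edges of $K_n$ and keeps those that form a tree. The function names and the representation of $P$ as a dictionary from exponent tuples to coefficients are illustrative choices, not part of the text.

```python
from itertools import combinations
from collections import Counter

def n_trees(n):
    """Yield the edge sets of all trees on the vertex set {1, ..., n} (brute force)."""
    all_edges = list(combinations(range(1, n + 1), 2))
    for edges in combinations(all_edges, n - 1):
        # n - 1 edges on n vertices form a tree iff they contain no cycle;
        # we detect cycles with a simple union-find structure.
        parent = list(range(n + 1))

        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v

        acyclic = True
        for (i, j) in edges:
            ri, rj = find(i), find(j)
            if ri == rj:
                acyclic = False
                break
            parent[ri] = rj
        if acyclic:
            yield edges

def degree_polynomial(n):
    """P as a Counter: exponent tuple (deg 1, ..., deg n) -> coefficient."""
    P = Counter()
    for edges in n_trees(n):
        deg = [0] * (n + 1)
        for (i, j) in edges:
            deg[i] += 1
            deg[j] += 1
        P[tuple(deg[1:])] += 1
    return P

P = degree_polynomial(4)
print(sum(P.values()))   # total number of 4-trees
print(P[(1, 2, 2, 1)])   # number of 4-trees with degrees 1, 2, 2, 1
```

This approach is only feasible for very small $n$, since the number of edge sets grows quickly.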
Let us assign to each edge $ij$ of $K_n$ the weight $w_{ij} := x_i x_j$. Then, the definition
of $P(x_1, x_2, \ldots, x_n)$ rewrites as follows:

\[
P(x_1, x_2, \ldots, x_n) = \sum_{T \text{ is an } n\text{-tree}} w(T) ,
\]

where $w(T)$ denotes the product of the weights of all edges of $T$. (Indeed, for
any subgraph $T$ of $K_n$, the weight $w(T)$ equals $x_1^{\deg 1} x_2^{\deg 2} \cdots x_n^{\deg n}$, where $\deg i$
means the degree of $i$ in $T$; this is because each edge $ij$ of $T$ contributes one factor $x_i$
and one factor $x_j$ to the product $w(T)$.)
We have assigned weights to the edges of the graph $K_n$; let us now assign the
same weights to the arcs of the digraph $K_n^{\operatorname{bidir}}$. That is, the two arcs $(ij, 1)$ and
$(ij, 2)$ corresponding to an edge $ij$ of $K_n$ shall both have the weight

\[
w_{(ij,1)} = w_{(ij,2)} = w_{ij} = x_i x_j .
\tag{35}
\]

As we are already used to, we can replace spanning trees of $K_n$ by spanning
arborescences of $K_n^{\operatorname{bidir}}$ rooted to 1, since the former are in bijection with the
latter. Thus, we have

\[
\left( \text{\# of spanning trees of } K_n \right)
= \left( \text{\# of spanning arborescences of } K_n^{\operatorname{bidir}} \text{ rooted to } 1 \right) .
\]

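Moreover, since this bijection preserves weights (because of (35)), we also have

\[
\sum_{\substack{T \text{ is a spanning} \\ \text{tree of } K_n}} w(T)
= \sum_{\substack{B \text{ is a spanning} \\ \text{arborescence of } K_n^{\operatorname{bidir}} \\ \text{rooted to } 1}} w(B) .
\]

In other words,

\[
\sum_{T \text{ is an } n\text{-tree}} w(T)
= \sum_{\substack{B \text{ is a spanning} \\ \text{arborescence of } K_n^{\operatorname{bidir}} \\ \text{rooted to } 1}} w(B)
\]

(since the spanning trees of $K_n$ are precisely the n-trees).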


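To compute the right hand side, we shall use the weighted Matrix-Tree Theorem.
The weighted Laplacian of $K_n^{\operatorname{bidir}}$ (with the weights we have just defined)
is the $n \times n$-matrix $L^w$ with entries given by

\begin{align*}
L^w_{i,j}
&= \deg^{+w} i \cdot [i = j] - a^w_{i,j} \\
&= \begin{cases}
     \deg^{+w} i - a^w_{i,j}, & \text{if } i = j; \\
     - a^w_{i,j}, & \text{if } i \neq j
   \end{cases} \\
&= \begin{cases}
     \deg^{+w} i, & \text{if } i = j; \\
     - a^w_{i,j}, & \text{if } i \neq j
   \end{cases}
   \qquad \left( \begin{array}{c}
     \text{since } a^w_{i,j} = 0 \text{ when } i = j \\
     \text{(because } K_n^{\operatorname{bidir}} \text{ has no loops)}
   \end{array} \right) \\
&= \begin{cases}
     x_i (x_1 + x_2 + \cdots + x_n) - x_i x_j, & \text{if } i = j; \\
     - x_i x_j, & \text{if } i \neq j
   \end{cases} \\
&\qquad \left( \begin{array}{c}
     \text{since } \deg^{+w} i = x_i x_1 + x_i x_2 + \cdots + x_i x_{i-1} + x_i x_{i+1} + \cdots + x_i x_n \\
     = x_i (x_1 + x_2 + \cdots + x_{i-1} + x_{i+1} + \cdots + x_n) \\
     = x_i (x_1 + x_2 + \cdots + x_n) - x_i x_i \\
     = x_i (x_1 + x_2 + \cdots + x_n) - x_i x_j \text{ whenever } i = j, \\
     \text{and since } a^w_{i,j} = x_i x_j \text{ whenever } i \neq j
   \end{array} \right) \\
&= [i = j] \, x_i (x_1 + x_2 + \cdots + x_n) - x_i x_j \\
&= x_i \left( [i = j] (x_1 + x_2 + \cdots + x_n) - x_j \right) .
\end{align*}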
 
We can find its minor $\det \left( L^w_{\sim 1, \sim 1} \right)$ without too much trouble (e.g., using row
transformations similar to the ones we have done back in the proof of Cayley’s
formula⁸²); the result is

\[
\det \left( L^w_{\sim 1, \sim 1} \right) = x_1 x_2 \cdots x_n (x_1 + x_2 + \cdots + x_n)^{n-2} .
\]
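The determinant formula can be spot-checked numerically. The following Python sketch plugs sample integer values into $x_1, x_2, \ldots, x_n$ and compares both sides; the sample values and the function names are arbitrary choices for illustration. Agreement at one point is of course only a sanity check, not a proof of the polynomial identity.

```python
from math import prod

def det(M):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def tree_laplacian_minor(x):
    """Weighted Laplacian of K_n^bidir with arc weights x_i * x_j, first row and column removed."""
    n, s = len(x), sum(x)
    # Entry (i, j) is x_i * ([i = j] * (x_1 + ... + x_n) - x_j), as computed in the text.
    L = [[x[i] * ((s if i == j else 0) - x[j]) for j in range(n)] for i in range(n)]
    return [row[1:] for row in L[1:]]

x = [2, 3, 5, 7]  # arbitrary sample values for x_1, x_2, x_3, x_4
lhs = det(tree_laplacian_minor(x))
rhs = prod(x) * sum(x) ** (len(x) - 2)
print(lhs, rhs)
```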

82 The first step, of course, is to factor an $x_i$ out of the $i$-th row for each $i$.

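Summarizing what we have done so far,

\begin{align*}
P(x_1, x_2, \ldots, x_n)
&= \sum_{T \text{ is an } n\text{-tree}} w(T)
 = \sum_{\substack{B \text{ is a spanning} \\ \text{arborescence of } K_n^{\operatorname{bidir}} \\ \text{rooted to } 1}} w(B) \\
&= \det \left( L^w_{\sim 1, \sim 1} \right)
   \qquad \left( \text{by the weighted Matrix-Tree Theorem} \right) \\
&= x_1 x_2 \cdots x_n (x_1 + x_2 + \cdots + x_n)^{n-2} .
\tag{36}
\end{align*}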
As we recall, we are looking for the $x_1^{d_1} x_2^{d_2} \cdots x_n^{d_n}$-coefficient in this polynomial.
From (36), we see that

\begin{align*}
& \left( \text{the } x_1^{d_1} x_2^{d_2} \cdots x_n^{d_n}\text{-coefficient of } P(x_1, x_2, \ldots, x_n) \right) \\
&= \left( \text{the } x_1^{d_1} x_2^{d_2} \cdots x_n^{d_n}\text{-coefficient of } x_1 x_2 \cdots x_n (x_1 + x_2 + \cdots + x_n)^{n-2} \right) \\
&= \left( \text{the } x_1^{d_1 - 1} x_2^{d_2 - 1} \cdots x_n^{d_n - 1}\text{-coefficient of } (x_1 + x_2 + \cdots + x_n)^{n-2} \right)
\end{align*}

(because when we multiply a polynomial by $x_1 x_2 \cdots x_n$, all the exponents in it
get incremented by 1, so its coefficients just shift by a 1 in each exponent).
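To see this shift of exponents in the smallest interesting case, here is the case $n = 3$ worked out directly from (36). Nothing beyond (36) is used; the comments about the three 3-trees can be verified by listing them.

```latex
\begin{align*}
P(x_1, x_2, x_3)
  &= x_1 x_2 x_3 \, (x_1 + x_2 + x_3)^{3-2} \\
  &= x_1^2 x_2 x_3 + x_1 x_2^2 x_3 + x_1 x_2 x_3^2 .
\end{align*}
% The three monomials correspond to the three 3-trees: the paths with
% middle vertex 1, 2 and 3, whose degree lists are (2,1,1), (1,2,1), (1,1,2).
% For (d_1, d_2, d_3) = (2, 1, 1): the x_1^2 x_2 x_3-coefficient of P equals
% the x_1^1 x_2^0 x_3^0-coefficient of (x_1 + x_2 + x_3)^1, which is 1.
```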
Now, how can we describe the coefficients of $(x_1 + x_2 + \cdots + x_n)^{n-2}$, or,
more generally, of $(x_1 + x_2 + \cdots + x_n)^m$ for some $m \in \mathbb{N}$ ? These are the
so-called multinomial coefficients (named in analogy to the binomial coefficients,
which are their particular case for $n = 2$). Their definition is as follows:
If $p_1, p_2, \ldots, p_n, q$ are nonnegative integers with $q = p_1 + p_2 + \cdots + p_n$, then
the multinomial coefficient $\dbinom{q}{p_1, p_2, \ldots, p_n}$ is defined to be $\dfrac{q!}{p_1! \, p_2! \cdots p_n!}$. If
$q \neq p_1 + p_2 + \cdots + p_n$, then it is defined to be 0 instead. In either case, this
coefficient is easily seen to be an integer.⁸³ The multinomial formula (aka
multinomial theorem) says that for each $k \in \mathbb{N}$, we have

\begin{align*}
(x_1 + x_2 + \cdots + x_n)^k
&= \sum_{\substack{i_1, i_2, \ldots, i_n \in \mathbb{N}; \\ i_1 + i_2 + \cdots + i_n = k}}
   \binom{k}{i_1, i_2, \ldots, i_n} x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n} \\
&= \sum_{i_1, i_2, \ldots, i_n \in \mathbb{N}}
   \binom{k}{i_1, i_2, \ldots, i_n} x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}
\end{align*}

(it does not matter whether we restrict the sum by the condition $i_1 + i_2 + \cdots + i_n = k$
or not, since the coefficient $\dbinom{k}{i_1, i_2, \ldots, i_n}$ is defined to be 0 when this
condition is violated anyway). Hence,

\[
\left( \text{the } x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}\text{-coefficient of } (x_1 + x_2 + \cdots + x_n)^k \right)
= \binom{k}{i_1, i_2, \ldots, i_n}
\]

for any $k \in \mathbb{N}$ and any $i_1, i_2, \ldots, i_n \in \mathbb{N}$. In particular,

\[
\left( \text{the } x_1^{d_1 - 1} x_2^{d_2 - 1} \cdots x_n^{d_n - 1}\text{-coefficient of } (x_1 + x_2 + \cdots + x_n)^{n-2} \right)
= \binom{n-2}{d_1 - 1, d_2 - 1, \ldots, d_n - 1} .
\]

83 See [23wd, Lecture 18, Section 4.12] for an introduction to multinomial coefficients.
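The definition of the multinomial coefficients and the multinomial formula are easy to test for small values. The following Python sketch assumes nonnegative integer inputs, as in the definition above; the function names `multinomial` and `expand_power` are illustrative. It expands $(x_1 + x_2 + \cdots + x_n)^k$ by multiplying out all $n^k$ products of variables and compares the resulting coefficients with the multinomial coefficients.

```python
from collections import Counter
from itertools import product
from math import factorial

def multinomial(q, ps):
    """q! / (p_1! p_2! ... p_n!) if p_1 + ... + p_n = q, and 0 otherwise.

    All arguments are assumed to be nonnegative integers.
    """
    if sum(ps) != q:
        return 0
    result = factorial(q)
    for p in ps:
        result //= factorial(p)  # each intermediate quotient is again an integer
    return result

def expand_power(n, k):
    """Coefficients of (x_1 + ... + x_n)^k as a Counter: exponent tuple -> coefficient."""
    coeffs = Counter()
    # Multiplying out gives one monomial for each way of picking a variable from each factor.
    for picks in product(range(n), repeat=k):
        exps = [0] * n
        for i in picks:
            exps[i] += 1
        coeffs[tuple(exps)] += 1
    return coeffs

coeffs = expand_power(3, 4)
print(all(c == multinomial(4, e) for e, c in coeffs.items()))
print(multinomial(4, (2, 1, 1)), multinomial(4, (2, 1, 0)))
```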
Summarizing, we find

\begin{align*}
& \left( \text{the } x_1^{d_1} x_2^{d_2} \cdots x_n^{d_n}\text{-coefficient of } P(x_1, x_2, \ldots, x_n) \right) \\
&= \left( \text{the } x_1^{d_1 - 1} x_2^{d_2 - 1} \cdots x_n^{d_n - 1}\text{-coefficient of } (x_1 + x_2 + \cdots + x_n)^{n-2} \right) \\
&= \binom{n-2}{d_1 - 1, d_2 - 1, \ldots, d_n - 1} .
\end{align*}

However, the $x_1^{d_1} x_2^{d_2} \cdots x_n^{d_n}$-coefficient of $P(x_1, x_2, \ldots, x_n)$ is the # of n-trees $T$
satisfying the property that

\[
\deg i = d_i \text{ for each vertex } i
\]

(as we have seen above). Thus, we have proved the following:
Theorem 5.19.6 (refined Cayley’s formula). Let $n \geq 2$ be an integer, and let
$d_1, d_2, \ldots, d_n$ be $n$ positive integers. Then, the # of n-trees with the property
that

\[
\deg i = d_i \text{ for each } i \in \{1, 2, \ldots, n\}
\]

is the multinomial coefficient

\[
\binom{n-2}{d_1 - 1, d_2 - 1, \ldots, d_n - 1} .
\]
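Theorem 5.19.6 can be confirmed by brute force for small $n$. The following Python sketch lists all n-trees for $n = 5$ by testing every set of $n - 1$ edges of $K_n$ for connectedness, groups them by their degree lists, and compares each count with the multinomial coefficient from the theorem. The function names are illustrative, and the enumeration is only practical for small $n$.

```python
from collections import Counter
from itertools import combinations
from math import factorial

def multinomial(q, ps):
    """q! / (p_1! ... p_n!) if p_1 + ... + p_n = q, else 0 (nonnegative integers assumed)."""
    if sum(ps) != q:
        return 0
    result = factorial(q)
    for p in ps:
        result //= factorial(p)
    return result

def count_trees_by_degrees(n):
    """Counter: degree list (deg 1, ..., deg n) -> number of n-trees with these degrees."""
    counts = Counter()
    all_edges = list(combinations(range(1, n + 1), 2))
    for edges in combinations(all_edges, n - 1):
        adj = {v: [] for v in range(1, n + 1)}
        for (i, j) in edges:
            adj[i].append(j)
            adj[j].append(i)
        # n - 1 edges on n vertices form a tree iff the graph is connected;
        # we test connectedness by a depth-first search from vertex 1.
        seen, stack = {1}, [1]
        while stack:
            for u in adj[stack.pop()]:
                if u not in seen:
                    seen.add(u)
                    stack.append(u)
        if len(seen) == n:
            counts[tuple(len(adj[v]) for v in range(1, n + 1))] += 1
    return counts

n = 5
counts = count_trees_by_degrees(n)
ok = all(c == multinomial(n - 2, [d - 1 for d in degs]) for degs, c in counts.items())
print(sum(counts.values()), ok)
```

For instance, for $n = 4$ and $(d_1, d_2, d_3, d_4) = (1, 2, 2, 1)$, the theorem predicts $\binom{2}{0, 1, 1, 0} = 2$ trees, namely the two paths from 1 to 4 through the vertices 2 and 3.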

5.19.6. The weighted harmonic vector theorem


The harmonic vector theorem for Laplacians (Theorem 5.18.1) also has a weighted
version:
Theorem 5.19.7 (harmonic vector theorem for weighted Laplacians). Let $D =
(V, A, \psi)$ be a multidigraph, where $V = \{1, 2, \ldots, n\}$ for some $n \in \mathbb{N}$. Let
$K$ be a commutative ring. Assume that an element $w_a \in K$ is assigned to
each arc $a \in A$. For each $r \in V$, let $\tau^w(D, r)$ be the sum of the weights of
all the spanning arborescences of $D$ rooted to $r$. Let $f^w$ be the row vector
$\left( \tau^w(D, 1), \tau^w(D, 2), \ldots, \tau^w(D, n) \right)$. Let $L^w$ be the weighted Laplacian of
$D$. Then, $f^w L^w = 0$.
Proof. Similar to the unweighted case.
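Theorem 5.19.7 can be illustrated on a small example. The following Python sketch assumes a hypothetical digraph on 3 vertices with integer weights (so $K = \mathbb{Z}$), given as triples (source, target, weight); the function names are made up for illustration. It computes the vector $f^w$ by listing all spanning arborescences and then multiplies it by the weighted Laplacian.

```python
from itertools import product
from math import prod

def weighted_laplacian(n, arcs):
    """L[i][j] = (weighted outdegree of i) * [i = j] - (total weight of arcs from i to j)."""
    L = [[0] * n for _ in range(n)]
    for (s, t, w) in arcs:
        L[s - 1][s - 1] += w
        L[s - 1][t - 1] -= w
    return L

def tau(n, arcs, r):
    """Sum of the weights of all spanning arborescences rooted to r (brute force)."""
    others = [v for v in range(1, n + 1) if v != r]
    out = {v: [(t, w) for (s, t, w) in arcs if s == v] for v in others}
    total = 0
    # Pick one outgoing arc for each vertex other than r ...
    for choice in product(*(out[v] for v in others)):
        nxt = {v: t for v, (t, _) in zip(others, choice)}
        good = True
        for v in others:
            seen = set()
            while v != r and v not in seen:
                seen.add(v)
                v = nxt[v]
            good = good and v == r
        # ... and keep the choice only if every vertex reaches r along the picked arcs.
        if good:
            total += prod(w for (_, w) in choice)
    return total

# Hypothetical example: arcs as (source, target, weight).
n = 3
arcs = [(1, 2, 2), (2, 3, 3), (3, 1, 2), (1, 3, 5), (2, 1, 7)]
L = weighted_laplacian(n, arcs)
f = [tau(n, arcs, r) for r in range(1, n + 1)]
fL = [sum(f[i] * L[i][j] for i in range(n)) for j in range(n)]
print(f, fL)
```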
Here ends our study of spanning trees and their enumeration. An interested
reader can learn more from [Rubey00], [Holzer22], [Moon70] and [GrSaSu14].
