The Notorious Collatz Conjecture: Terence Tao, UCLA
The Collatz conjecture is one of
the most elementary unsolved
problems in mathematics.
It is also one of the most “dangerous”
conjectures known – notorious for
absorbing massive amounts of time
from both professional and amateur
mathematicians.
Introduced by Lothar Collatz in
1937, the conjecture is also
known as the “3x+1 conjecture”
or the “Syracuse problem”.
The conjecture involves an innocuous function Col on the
natural numbers {1,2,3,…} defined by the following rule:
• Col(n) equals 3n+1 if n is odd.
• Col(n) equals n/2 if n is even.
(odd) n → 3n+1
(even) n → n/2
n 1 2 3 4 5 6 7 8 9 10
Col(n) 4 1 10 2 16 3 22 4 28 5
n 11 12 13 14 15 16 17 18 19 20
Col(n) 34 6 40 7 46 8 52 9 58 10
n 21 22 23 24 25 26 27 28 29 30
Col(n) 64 11 70 12 76 13 82 14 88 15
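The rule above translates directly into code. The following is a minimal sketch; the function name col is chosen here for illustration and is not part of the original slides.

```python
def col(n: int) -> int:
    """Collatz map: 3n+1 if n is odd, n/2 if n is even."""
    if n < 1:
        raise ValueError("Col is defined on the natural numbers 1, 2, 3, ...")
    return 3 * n + 1 if n % 2 == 1 else n // 2


# Reproduce the three table rows for n = 1..10, 11..20, 21..30.
for start in (1, 11, 21):
    print([col(n) for n in range(start, start + 10)])
```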
Now consider iterates of the Collatz
function Col, in which the output of
the function is fed back into the input:
Col^2(n) = Col(Col(n))
Col^3(n) = Col(Col(Col(n)))
etc.
n 1 2 3 4 5 6 7 8 9 10
Col(n) 4 1 10 2 16 3 22 4 28 5
Col^2(n) 2 4 5 1 8 10 11 2 14 16
Col^3(n) 1 2 16 4 4 5 34 1 7 8
Col^4(n) 4 1 8 2 2 16 17 4 22 4
Col^5(n) 2 4 4 1 1 8 52 2 11 2
Col^6(n) 1 2 2 4 4 4 26 1 34 1
Col^7(n) 4 1 1 2 2 2 13 4 17 4
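The table of iterates can be regenerated by applying the map repeatedly. This is an illustrative sketch; the helper name col_iter is an assumption introduced here.

```python
def col(n: int) -> int:
    """Collatz map: 3n+1 if n is odd, n/2 if n is even."""
    return 3 * n + 1 if n % 2 == 1 else n // 2


def col_iter(n: int, k: int) -> int:
    """Apply Col to n a total of k times, i.e. compute the k-th iterate."""
    for _ in range(k):
        n = col(n)
    return n


# One row per iterate, for n = 1..10, matching the table above.
for k in range(1, 8):
    print(k, [col_iter(n, k) for n in range(1, 11)])
```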
Every natural number n generates a
Collatz sequence (or Collatz orbit)
n, Col(n), Col^2(n), Col^3(n), …
(Diagram: n → Col(n) → Col^2(n) → …)
For instance, n=1 generates the
periodic Collatz sequence
1, 4, 2, 1, 4, 2, 1, 4, 2, 1,…
(Diagram: 1 → 4 → 2 → 1)
If a Collatz sequence reaches the
value 1, it will then cycle through the
values 1, 4, 2 indefinitely.
(Diagram: n → … → 1 → 4 → 2 → 1)
For instance, n=6 generates the
Collatz sequence
6, 3, 10, 5, 16, 8, 4, 2, 1, 4, 2, 1, 4, 2,
1, 4, 2, 1,…
(Diagram: 6 → 3 → 10 → 5 → 16 → 8 → 4 → 2 → 1 → 4)
Collatz sequences are also known as
hailstone sequences, as they can bounce
up and down much as hailstones in a
cloud were once thought to do.
For instance, n=27 generates the Collatz sequence
27, 82, 41, 124, 62, 31, 94, 47, 142, 71, 214, 107, 322, 161, 484, 242, 121, 364, 182,
91, 274, 137, 412, 206, 103, 310, 155, 466, 233, 700, 350, 175, 526, 263, 790, 395,
1186, 593, 1780, 890, 445, 1336, 668, 334, 167, 502, 251, 754, 377, 1132, 566,
283, 850, 425, 1276, 638, 319, 958, 479, 1438, 719, 2158, 1079, 3238, 1619, 4858,
2429, 7288, 3644, 1822, 911, 2734, 1367, 4102, 2051, 6154, 3077, 9232, 4616,
2308, 1154, 577, 1732, 866, 433, 1300, 650, 325, 976, 488, 244, 122, 61, 184, 92,
46, 23, 70, 35, 106, 53, 160, 80, 40, 20, 10, 5, 16, 8, 4, 2, 1, 4, 2, 1, …
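Orbits like this one are easy to generate by machine. The sketch below uses a hypothetical generator collatz_orbit that stops at the first occurrence of 1; it would loop forever on a starting value whose orbit never reaches 1, and no such value is known.

```python
def collatz_orbit(n: int):
    """Yield n, Col(n), Col^2(n), ... and stop once the value 1 has been yielded."""
    while True:
        yield n
        if n == 1:
            return
        n = 3 * n + 1 if n % 2 == 1 else n // 2


orbit = list(collatz_orbit(27))
steps = len(orbit) - 1          # number of applications of Col needed to reach 1
print(steps, max(orbit))        # 111 steps, peaking at 9232
```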
(Pedantic note: the modern theory of hailstone
formation deviates from this classical model, being
based instead on the properties of supercooled
water droplets.)
But just as every hailstone eventually
falls to the ground, we have the
infamous Collatz conjecture: every Collatz
sequence eventually reaches the value 1.
“This is an extraordinarily difficult problem, completely
out of reach of present day mathematics.” – Jeff Lagarias, 2010
“Mathematics is not yet ripe enough for such questions.” –
Paul Erdős, 1983
XKCD, Randall Munroe, March 5, 2010
The Collatz conjecture appears to be
a mere mathematical curiosity, with
no obvious real-world applications.
Why should we try to solve it?
• Pure intellectual challenge
• A benchmark for testing our understanding of
number theory
• Proof attempts have linked the problem to other
areas of mathematics
• It is a simple, but non-trivial, toy model of a
dynamical system
• Modest cash prizes ($50, Harold Coxeter; $500,
Paul Erdős; £1000, Sir Bryan Thwaites)
• Bragging rights
Mathematically speaking, a (discrete)
dynamical system is a state space X,
together with a shift map T from X to
itself. The iterates T, T^2, T^3, … describe
the dynamics of the system.
In the Collatz dynamical system, the
state space is the natural numbers N
= {1,2,3,…} and the shift map is the
Collatz map Col.
Siblings of the discrete dynamical
systems are the continuous dynamical
systems, where the dynamics are given
by ordinary differential equations
(ODEs) or partial differential equations
(PDEs).
Many important real-world systems,
such as fluids, ecosystems, and the
climate, can be viewed as
(continuous) dynamical systems.
The Collatz conjecture highlights the
basic fact that even very simple
equations can lead to amazingly
complicated dynamics.
In mathematics, when we cannot solve a
problem completely, we look for partial
results. Even if they do not lead to a
complete solution, they often reveal
insights about the problem.
It is also useful to locate obstructions – such as
counterexamples to related problems that
highlight difficulties that have to be overcome
in any proposed solution.
What partial results and
obstructions do we have for the
Collatz conjecture?
Partial result: in 2017, a distributed
computing project verified the Collatz
conjecture for all starting values n up to 10^20.
So it is highly unlikely that a counterexample
can be found just from pen and paper search.
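On a very small scale, this kind of verification can be imitated by brute force. The sketch below is only an illustration and is not the method used by the distributed project; the name verify_up_to and the step budget max_steps are assumptions made here.

```python
def verify_up_to(limit: int, max_steps: int = 10_000) -> bool:
    """Return True if every n in 1..limit reaches 1 within max_steps applications of Col."""
    for n in range(1, limit + 1):
        m, steps = n, 0
        while m != 1:
            if steps >= max_steps:
                return False    # budget exhausted: inconclusive for this n
            m = 3 * m + 1 if m % 2 == 1 else m // 2
            steps += 1
    return True


print(verify_up_to(10_000))
```

A False result would only mean that the step budget ran out for some n, not that a counterexample was found.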
One way the Collatz conjecture could fail is
if there is a cycle – a Collatz sequence that
repeats itself indefinitely – other than the
known cycle 1, 4, 2, 1, 4, 2,… (or its shifts).
Partial result: it is known that any such cycle
must have length at least 17,087,915 (Eliahou,
1993). So one cannot simply produce a short
cycle to easily disprove the conjecture!
Obstruction: On the other hand, there are
variants of the Collatz conjecture that have non-
trivial cycles. For instance, if one modifies Col
by sending an odd number n to 3n-1 rather than
3n+1, then two additional cycles appear:
• 5, 14, 7, 20, 10, 5,…
• 17, 50, 25, 74, 37, 110, 55, 164, 82, 41, 122, 61,
182, 91, 272, 136, 68, 34, 17,…
We don’t know if there are any further cycles for this
map.
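The cycles of the 3n-1 variant can be found by a direct search over small starting values. This is a rough sketch; the names col_minus and cycle_from, the search range, and the step budget are all choices made for this illustration.

```python
def col_minus(n: int) -> int:
    """Variant map: 3n-1 if n is odd, n/2 if n is even."""
    return 3 * n - 1 if n % 2 == 1 else n // 2


def cycle_from(n: int, max_steps: int = 10_000):
    """Follow the orbit of n until a value repeats; return the cycle, rotated
    to start at its smallest element, or None if the budget runs out."""
    position = {}               # value -> index of its first appearance in the orbit
    orbit = []
    for _ in range(max_steps):
        if n in position:
            cycle = orbit[position[n]:]
            i = cycle.index(min(cycle))
            return tuple(cycle[i:] + cycle[:i])
        position[n] = len(orbit)
        orbit.append(n)
        n = col_minus(n)
    return None


found = {cycle_from(n) for n in range(1, 1001)}
cycles = {c for c in found if c is not None}
for c in sorted(cycles):
    print(len(c), c)
```

Besides the trivial cycle 1, 2, the search turns up the cycle through 5 and the cycle through 17.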
This obstruction shows that any proof
of the Collatz conjecture must at some
point use a property of the 3n+1 map
that is not shared by the 3n-1 map.
Obstruction: the absence of non-trivial
Collatz cycles can be shown to imply a
difficult result in number theory:
Theorem: The gap between
powers of 2 and powers of 3 goes
to infinity.
3^2 - 2^3 = 9 - 8 = 1; 2^5 - 3^3 = 32 - 27 = 5; 2^8 - 3^5 = 256 - 243 = 13; 3^7 - 2^11 = 2187 - 2048 = 139; …
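The gaps listed above can be tabulated numerically. The sketch below, with a helper name nearest_gap invented for this illustration, computes the distance from 3^b to the nearest power of 2; it only displays the first few gaps and proves nothing about their growth.

```python
def nearest_gap(b: int) -> int:
    """Smallest value of |2^a - 3^b| over exponents a >= 1, for b >= 1."""
    p = 3 ** b
    a = p.bit_length()          # 2^(a-1) <= p < 2^a
    return min(p - 2 ** (a - 1), 2 ** a - p)


for b in range(1, 13):
    print(b, 3 ** b, nearest_gap(b))
```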
(odd) n → 3n+1 → (3n+1)/2
Heuristically, there is a fifty-fifty chance that the number
(3n+1)/2 will also be even, leading to further divisions by 2.
Indeed, a probability theory calculation reveals that the
“expected number” of divisions by 2 one experiences
before reaching an odd number again is equal to two.
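Under the fifty-fifty heuristic, exactly k divisions occur with probability 1/2^k, and the sum of k/2^k over k = 1, 2, 3, … equals 2. The following sketch checks this empirically over a range of odd inputs; the function name and the choice of range are assumptions made for illustration.

```python
def halvings_after_odd_step(n: int) -> int:
    """For odd n, count how many times 3n+1 can be halved before an odd number appears."""
    m = 3 * n + 1
    count = 0
    while m % 2 == 0:
        m //= 2
        count += 1
    return count


odd_inputs = range(1, 2 ** 17, 2)
average = sum(halvings_after_odd_step(n) for n in odd_inputs) / len(odd_inputs)
print(average)                  # close to 2
```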
For comparison, replacing 3n+1 by 5n+1 for odd n gives orbits such as
7, 36, 18, 9, 46, 23, 116, 58, 29, 146, 73, 366, 183, 916, 458, 229, 1146, 573, 2866, 1433, 7166, 3583, 17916, …
One can partially convert these heuristics
into rigorous partial results by working
statistically – studying the behavior of
almost all Collatz orbits, rather than all
orbits, thus excluding “outliers”.
Partial result: in 1976, Terras showed that
almost all initial values n eventually
iterated to a value less than n. (As a first
approximation, think of “almost all” as
meaning “at least 99.99% of all”.)
(almost all) n → … → < n
If one could show that all initial values
n (other than 1) iterated to something
less than n, this would imply the Collatz
conjecture by further iteration.
(all except 1?) n → … → < n
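The idea of iterating to a value less than n can be explored empirically. The sketch below measures, for a small range, what fraction of starting values dip below themselves within a given number of steps; it illustrates the statement only, not Terras's proof, and the function names and step budgets are assumptions.

```python
def drops_below_start(n: int, max_steps: int) -> bool:
    """True if some iterate Col^k(n) with 1 <= k <= max_steps is less than n."""
    m = n
    for _ in range(max_steps):
        m = 3 * m + 1 if m % 2 == 1 else m // 2
        if m < n:
            return True
    return False


def fraction_dropping(limit: int, max_steps: int) -> float:
    """Fraction of n in 2..limit whose orbit dips below n within max_steps steps."""
    hits = sum(drops_below_start(n, max_steps) for n in range(2, limit + 1))
    return hits / (limit - 1)


for k in (1, 3, 10, 100, 1000):
    print(k, fraction_dropping(10_000, k))
```

With a budget of one step only the even numbers qualify, and the fraction climbs toward 1 as the budget grows.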
Partial result: Terras’s result was refined
over the years. In 1979, Allouche showed
that almost all initial values n eventually
iterated to a value less than n^0.869.
(almost all) n → … → < n^0.869
Partial result: in 1994, Korec
lowered this bound further to
n^0.7925.
(almost all) n → … → < n^0.7925
Partial result: In 2019, I showed that almost all
initial values n eventually iterated to a value
less than f(n), for any function f() that grew to
infinity, no matter how slowly. “Almost all
Collatz orbits attain almost bounded values.”
(almost all) n → … → < f(n)
For instance: almost all initial
values n eventually iterate to a
value less than log(log(log(log(n)))).
(almost all) n → … → < log(log(log(log(n))))
This is about as close as one can
get to the Collatz conjecture
without actually solving it.
(all?) n → … → 1
Unfortunately, the statistical methods
used in the proof seem to be unable to
fully resolve the conjecture, which
remains out of reach for now.
The argument was inspired by other dynamical
systems results, and in particular by a 1994 result
of Bourgain on constructing an invariant measure
for the nonlinear Schrödinger equation.
A key difficulty with the Collatz iteration is
that it can greatly distort the distribution of a
set of numbers – some numbers collide into
each other, others get skipped entirely.
(Diagram: Col maps the inputs 1, 2, 3, 4, 5, 6, 7, 8 to 4, 1, 10, 2, 16, 3, 22, 4, so 1 and 8 collide at 4.)
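The distortion can be made concrete by applying Col once to a block of consecutive numbers and recording which outputs are hit more than once and which are missed. This is a small illustrative sketch; the variable names are chosen here.

```python
from collections import Counter


def col(n: int) -> int:
    """Collatz map: 3n+1 if n is odd, n/2 if n is even."""
    return 3 * n + 1 if n % 2 == 1 else n // 2


inputs = range(1, 9)
image_counts = Counter(col(n) for n in inputs)

# Outputs hit by more than one input, and values up to the largest output that are never hit.
collisions = sorted(v for v, c in image_counts.items() if c > 1)
skipped = [m for m in range(1, max(image_counts) + 1) if m not in image_counts]

print(collisions)
print(skipped)
```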
As a consequence, the statistical
behavior of Collatz iteration quickly
becomes intractable to study.
However, I was able to construct an
(approximate) invariant measure – a distribution
of numbers that iterates to something
resembling a smaller version of itself.