Ergodic Hypothesis
189-201, (2007)
www.sbfisica.org.br
1 Departamento de Matemática, Universidade Federal de São Carlos, São Carlos, SP, Brasil
2 Departamento de Física, Universidade Federal de São Carlos, São Carlos, SP, Brasil
Received on 1/6/2006; Accepted on 27/9/2006
An updated discussion on physical and mathematical aspects of the ergodic hypothesis in classical equilibrium
statistical mechanics is presented. Then a practical attitude for the justification of the microcanonical ensemble
is indicated. It is also remarked that the difficulty in proving the ergodic hypothesis should be expected.
Keywords: ergodic hypothesis, statistical mechanics, microcanonical ensemble.
1. Introduction
oliveira@dm.ufscar.br.
An Appendix summarizes the first steps of integration theory and presents selected theorems of ergodic
theory; it is no more than a quick reference for the readers.
2. Microcanonical ensemble
Establishing a mechanical model for the thermodynamic macroscopic observables is not a simple task. By beginning with the Hamilton equations of motion of classical mechanics
$$\dot q = \frac{\partial H}{\partial p}, \qquad \dot p = -\frac{\partial H}{\partial q}, \tag{1}$$
2.2. Three approaches
2.2.1. Time averages
A traditional way of introducing time averages of observables follows. Given a phase space function $f$ that should correspond to a macroscopic physical quantity, measurements of the precise values $f(\xi(t))$ are not possible, since knowledge of the detailed positions and momenta of the particles of the system would be necessary; it is then supposed that the result of a measurement is the time average of $f$.
It is also argued that each measurement of a macroscopic observable at time $t_0$ actually takes a certain interval of time to be realized; in such an interval the microstate $\xi(t)$ changes and so different values of $f(\xi(t))$ are generated, and the time average
$$\frac{1}{t}\int_{t_0}^{t_0+t} f(T^s\xi)\,ds \tag{2}$$
$$f^*(\xi) := \lim_{t\to\infty}\frac{1}{t}\int_{t_0}^{t_0+t} f(T^s\xi)\,ds. \tag{3}$$
2.2.2. Density function
The initial idea is due to Maxwell and was then developed by Gibbs. Consider a huge collection of identical systems, each with its own initial condition at time $t_0$ (some authors call this collection an ensemble; here an ensemble will mean a probability distribution, i.e., a probability measure as defined in the Appendix); the typical behavior of such a collection would correspond to equilibrium. This behavior is characterized by an initial positive density function $\rho_{t_0}:\Omega\to\mathbb{R}_+$. If $\rho_t(\xi)$ is the corresponding density function at time $t$, then $\int_A \rho_t(\xi)\,d\xi$ indicates the average number of microstates that will occupy the set $A$ at $t$ (i.e., a probability); so $\int_\Omega \rho_t(\xi)\,d\xi = 1$ due to the normalization of the total probability. The value of an observable $f$ at time $t$ would be
$$\int_\Omega f(\xi)\,\rho_t(\xi)\,d\xi. \tag{6}$$
2.2.3.
3. A mathematical digression
$$\delta_{\xi_0}(A) = \begin{cases} 1 & \text{if } \xi_0 \in A \\ 0 & \text{if } \xi_0 \notin A \end{cases},$$
then $\delta_{\xi_0}$ is an invariant measure for $\varphi^t$. This construction can be carried out similarly for periodic orbits, and to each periodic orbit an invariant probability measure, concentrated on it, is associated. Hence, invariant measures are also understood as suitable generalizations of equilibrium points and periodic orbits of dynamical systems.
If $\mu$ is invariant for $\varphi^t$, then it is possible to show that, for integrable functions $f:\Omega\to\mathbb{R}$,
$$\int f(\varphi^t\xi)\,d\mu(\xi) = \int f(\xi)\,d\mu(\varphi^{-t}\xi) = \int f(\xi)\,d\mu(\xi), \tag{8}$$
$$\int_{t_1}^{t_1+t} f(T^s\xi)\,ds = \int_{t_1}^{t_0} f(T^s\xi)\,ds + \int_{t_0}^{t_0+t} f(T^s\xi)\,ds + \int_{t_0+t}^{t_1+t} f(T^s\xi)\,ds. \tag{9}$$
Now $\int_{t_0+t}^{t_1+t} f(T^s\xi)\,ds = \int_{t_0}^{t_1} f(T^{s+t}\xi)\,ds$ and so for reasonable (e.g., bounded) functions $f$ this integral, as well as $\int_{t_1}^{t_0} f(T^s\xi)\,ds$, is bounded; therefore after dividing by $t$ and taking $t\to\infty$ both vanish. Hence
$$f^*(\xi) = \lim_{t\to\infty}\frac{1}{t}\int_{t_1}^{t_1+t} f(T^s\xi)\,ds = \lim_{t\to\infty}\frac{1}{t}\int_{t_0}^{t_0+t} f(T^s\xi)\,ds, \tag{10}$$
and a.e. the time average does not depend on the initial time.
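The independence of the time average from the initial time, as in Eq. (10), can be checked numerically on a simple system. The following is only a sketch under stated assumptions: the one-dimensional harmonic oscillator with $H=(p^2+q^2)/2$, the observable $f(q,p)=q^2$, the finite averaging time, and the function names are illustrative choices, not taken from the text.

```python
import math

def flow(q, p, t):
    """Exact flow T^t of the 1D harmonic oscillator H = (p**2 + q**2) / 2."""
    c, s = math.cos(t), math.sin(t)
    return q * c + p * s, p * c - q * s

def time_average(f, q0, p0, t0, t, n=200_000):
    """Midpoint-rule estimate of (1/t) * integral_{t0}^{t0+t} f(T^s xi) ds."""
    ds = t / n
    total = 0.0
    for k in range(n):
        q, p = flow(q0, p0, t0 + (k + 0.5) * ds)
        total += f(q, p)
    return total * ds / t

def observable(q, p):
    # Hypothetical observable f(q, p) = q^2 (an assumption for illustration).
    return q * q

# Same initial condition, two different initial times t0 and t1.
avg_t0 = time_average(observable, 1.0, 0.0, t0=0.0, t=2000.0)
avg_t1 = time_average(observable, 1.0, 0.0, t0=5.0, t=2000.0)
print(avg_t0, avg_t1)
```

For this orbit $q(s)=\cos s$, so both finite-time averages approach $1/2$ as $t$ grows, and their difference is of order $1/t$, in line with the boundedness argument used above.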
The exclusion of sets of measure zero is not just a mathematical nicety. For example, for a gas in a box, consider an initial condition such that the motion of all particles is perpendicular to two opposite faces, resulting in null pressure on the other faces of the box; another situation is one in which all particles are confined in a small portion of the box. Such initial conditions are not found in practice, and the mathematical formalism is wise enough to include them in a set of Lebesgue measure zero for which the results do not apply.
It is remarkable that the volume measure $d\xi$ is a macroscopic equilibrium for Hamiltonian mechanics (Liouville theorem), and that merely the existence of this equilibrium has resulted in well-defined time averages and their independence of the initial time in a set of full volume. Although apparently we have answered some of the questions proposed at the beginning of this section, an important point is, however, missing: we are not sure that the microcanonical ensemble $d\xi$ is the equilibrium to be implemented in statistical mechanics, that is, why is this the invariant measure to be considered? This is the sought-after justification of the ergodic hypothesis, a supposition not fully justified yet, as discussed ahead.
An invariant measure $\mu$ with respect to a general dynamics $\varphi^t$ is called ergodic (or the pair $(\varphi^t,\mu)$ is ergodic) if for every set $A$ with $\varphi^t(A)=A$, one has either $\mu(A)=0$ or $\mu(\Omega\setminus A)=0$, i.e., every invariant set under the dynamics has zero or full measure. Then ergodicity means that the only nontrivial (that is, with nonzero measure) invariant set is, up to a set of measure zero, the whole set $\Omega$; in other words, ergodicity is equivalent to the set $\Omega$ being indecomposable under the dynamics $(\varphi^t,\mu)$. If it is clear which invariant measure is under consideration, one also says that the system or the dynamics $\varphi^t$ is ergodic. This is just one possible way of defining ergodic measures, and a more detailed discussion is presented in the Appendix; compare with Definition 3 and Theorem 4.
(12)
Note first that without Liouville theorem the discussion about density functions in Subsection 2.2.2, including the condition given by Eq. (7) for equilibrium, would be incorrect. Indeed, for the average value $\langle f\rangle$ of $f$ over $\Omega$ at time $t$ we have
$$\langle f\rangle = \int_\Omega f(T^t\xi)\,\rho_0(\xi)\,d\xi = \int_\Omega f(\xi)\,\rho_0(T^{-t}\xi)\,d(T^{-t}\xi) = \int_\Omega f(\xi)\,\rho_0(T^{-t}\xi)\,d\xi. \tag{13}$$
4.
From the discussion in the preceding section, if the ergodic hypothesis holds, then we have a precise mechanical definition of macroscopic equilibrium $d\xi$, the existence a.e. of time averages of integrable functions $f$ representing macroscopic observables, the equality of such averages with space averages (i.e., relation (5)) and a justification of the adoption of the microcanonical ensemble. In summary, the statistical mechanics prescription would be justified.
In this section many aspects related to the ergodic
hypothesis in classical statistical mechanics are discussed. We have tried to cover the main current points,
including some numerical indications.
4.1.
Before going on, we present an example of a Hamiltonian system that is not ergodic. Consider two independent harmonic oscillators; since the phase space can be decomposed into two independent parts, one for each oscillator and both of nonzero $d\xi$ measure, the pair $(T^t, d\xi)$ is not ergodic. The argument easily generalizes to noninteracting systems with finitely many particles, so that for the ergodicity of $(T^t, d\xi)$ the interaction among its particles is fundamental.
For the so-called gas of hard spheres, whose particles interact only via elastic collisions, it was recently demonstrated by Simányi [35] that the system of $N\ge 2$ elastically colliding hard balls, of the same radius and arbitrary masses, on the flat torus of any finite dimension is ergodic (also mixing, see ahead); the energy is fixed and the total momentum is zero. This is a generalization of results by Sinai around 1970 for $N=2$. There are some classes of two-particle mechanical systems on the two-dimensional torus (i.e., the Hamiltonian is the sum of the kinetic energy $K$ and the potential energy $U$) studied by Donnay and Liverani [9], whose potential $U$ is radially symmetric and vanishes outside a disk, for which ergodicity is also proved. These can be considered the most realistic models in statistical mechanics where the ergodic hypothesis has been rigorously established. What is then the attitude when other interactions are present? Unfortunately a satisfactory answer is still missing. Some authors (as in most textbooks) suggest that the postulate of equal a priori probability should be invoked, taking the experimental results as a final confirmation.
Some workers in the area have argued that ergodic theory did not succeed in explaining the positive results of statistical mechanics. For instance, Earman and Rédei [10] claim that many models for which statistical mechanics works are likely not ergodic and so nonergodic properties must be invoked. As an alternative they have proposed that an ergodic-like behavior,
i.e., the validity of Eq. (5), should hold only for a finite set of observables $f$ (see also Ref. [24], where it is proposed to restrict ergodicity and mixing to suitable variables). They assert that for each model the equilibrium statistical mechanics predicts values for only a finite set of observables, and one should investigate whether the ergodic-like behavior holds only for such a set, which may differ from system to system.
There are some criticisms of the assumption that the outcome of a measurement can be described by infinite time averages; for example, only results concerning quantities in equilibrium could be obtained in this way (see Refs. [16] p. 84 and [37] p. 176). Note that in this work only equilibrium statistical mechanics is considered. The claim that sets of Lebesgue measure zero can be neglected is also opposed by some authors; the interested reader is referred to Ref. [25].
Another suggestive argument employed for the justification of the microcanonical ensemble comes from analogies with the second law of thermodynamics. Separate the phase space into a finite number $M$ of disjoint cells and associate a probability $p_j$ that a representative of the microstate is in the $j$-th cell; then we have the constraint $\sum_{j=1}^{M} p_j = 1$. Such a separation of phase space into cells of nonzero volume is usually called a coarse grain partition, and associated to each of them one defines the Gibbs coarse grain entropy
$$S(p_j) = -\sum_{j=1}^{M} p_j \ln p_j.$$
Now impose that the equilibrium is attained for the distribution of $p_j$ with maximum entropy under the above constraint; it is left as an exercise to conclude that such a maximum is obtained for $p_j = 1/M$, for all $j$, that is, equal probability. In the formal limit of infinitely many cells (with vanishing sizes) one gets an indication for the validity of the equal a priori probability. Note, however, that the microscopic evolution does not enter explicitly in the argument, as should be expected, and that this argument only shifts the actual problem to another one.
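The maximization left as an exercise can be worked out with a Lagrange multiplier. The following is a sketch; the multiplier $\lambda$ and the function $\mathcal{L}$ are notation introduced here, and the entropy is taken with the usual minus sign, $S = -\sum_j p_j \ln p_j$.

```latex
\begin{align*}
\mathcal{L}(p_1,\dots,p_M,\lambda)
  &= -\sum_{j=1}^{M} p_j \ln p_j \;-\; \lambda\Big(\sum_{j=1}^{M} p_j - 1\Big),\\
\frac{\partial \mathcal{L}}{\partial p_j}
  &= -\ln p_j - 1 - \lambda = 0
  \quad\Longrightarrow\quad p_j = e^{-1-\lambda}\ \text{(the same for every } j\text{)},\\
\sum_{j=1}^{M} p_j = 1
  &\quad\Longrightarrow\quad p_j = \frac{1}{M},
  \qquad S_{\max} = -\sum_{j=1}^{M}\frac{1}{M}\ln\frac{1}{M} = \ln M .
\end{align*}
```

Since $-p\ln p$ is strictly concave on $[0,1]$, this critical point is the unique maximum of $S$ under the constraint.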
Since the idea of separating the phase space into finitely many cells was mentioned, it is worth recalling Boltzmann's reasoning that led him to the original formulation of the ergodic hypothesis [14] in the 1870s. Boltzmann supposed that under time evolution the cells are cyclically permuted; then it became natural to assume that time averages could be performed by averaging over cells. In the limit of infinitely many cells one would get Eq. (5).
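Boltzmann's cell picture can be mimicked in a few lines. This is a toy sketch: the cell values, the number of cells, and the function name are hypothetical, and the dynamics is assumed to move the system exactly one cell forward per time step.

```python
def time_average_over_cells(values, start, steps):
    """Average f along an orbit that moves one cell forward per time step.

    The dynamics is the cyclic permutation j -> (j + 1) mod M of the M cells.
    """
    m = len(values)
    return sum(values[(start + k) % m] for k in range(steps)) / steps

cell_values = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]   # hypothetical f on M = 6 cells
space_average = sum(cell_values) / len(cell_values)

# Over one full cycle every cell is visited exactly once, from any start.
one_cycle = time_average_over_cells(cell_values, start=2, steps=6)
# Over a long run that is not a multiple of M, the mismatch is of order 1/steps.
long_run = time_average_over_cells(cell_values, start=0, steps=6001)
print(space_average, one_cycle, long_run)
```

Because the orbit visits every cell equally often, the time average reproduces the average over cells; with a permutation that splits into several cycles this would fail, which is the discrete analogue of a decomposable phase space.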
There is clear and widespread discomfort in the literature with respect to the missing proofs of the ergodic hypothesis for a large class of particle interactions. In our opinion such great difficulty for ergodic proofs should be expected due to the possibility of coexistence of different phases. More precisely, suppose that $(T^t, d\xi)$ is not ergodic; another mathematical
$$\lim_{t\to\infty}\mu\big(\varphi^t(A)\cap B\big) = \mu(A)\,\mu(B). \tag{14}$$
This expression means that as time increases the portion of a given measurable set $A$ that then resides in any set $B$ is proportional to the measure of $B$; thus, according to the probability measure $\mu$, $\varphi^t(A)$ becomes uniformly distributed over $\Omega$. If this condition is satisfied then, necessarily, the system is ergodic; in fact, if $\varphi^t(A) = A$, then by taking $B = A$ in (14) it follows that $\mu(A) = \mu(A)^2$, so that $\mu(A) = 0$ or $\mu(A) = 1$, i.e., $(\varphi^t, \mu)$ is ergodic. Therefore, the natural condition claimed to assure convergence to equilibrium implies ergodicity. Although this proof sounds simple today, it hides a complex set of developments; this implication was not known by Boltzmann and Gibbs, and was first proved many years after their works on the foundations of statistical mechanics. Certainly, it is usually harder to show that a system is mixing than to show that it is ergodic (in case such properties hold).
Lombardi [24] argued that although some authors take a pragmatic position toward the microcanonical ensemble [36, 39], in accepting that mixing of $d\xi$ plays a significant role in the description of the approach to macroscopic equilibrium $d\xi$ then, according to the preceding paragraph, ergodicity of $d\xi$ comes up again. Ne-
Another source of indications of the prevalence of nonergodic Hamiltonian systems (with respect to $d\xi$) is provided by perturbations of the so-called completely integrable Hamiltonian systems. Such systems are those that have as many independent constants of motion (like energy) as degrees of freedom; a combination of results by Liouville and Arnold [1] shows that the solutions of the Hamiltonian equations can be explicitly found (up to evaluation of certain integrals and inverse functions) and, in case the accessible phase space is compact, it can be foliated by invariant tori (whose dimension coincides with the number of degrees of freedom) and the orbits are just rotations on a torus. Hence, in this case the motion is described in a simple way. Think of the harmonic oscillator with one degree of freedom, whose phase space can be foliated by its orbits, and these orbits are (continuously deformed) one-dimensional tori. Most textbook examples of mechanical systems are completely integrable.
Clearly, completely integrable systems with more than one degree of freedom do not satisfy the ergodic hypothesis. For a long time it was believed that a small nonlinear perturbation of an integrable system would make it ergodic, so that, aside from the exceptions of completely integrable cases, Hamiltonian systems were ergodic (at least for most compact energy surfaces). However, in one of the first computer simulations in physics, performed by Fermi, Pasta and Ulam in 1955 on anharmonic perturbations of a chain of independent harmonic oscillators (so an integrable system), indications were found that the resulting systems were not ergodic. See Ref. [4] for a collection of papers celebrating 50 years of the Fermi-Pasta-Ulam problem and additional references.
A rigorous approach for the lack of ergodicity of
5. A practical attitude
standard microcanonical, canonical and grand canonical distributions generate the same thermodynamics
[33]; however, it is worth mentioning that the important cases of Coulomb and gravitational potentials are
not included in such proofs.
Maybe, for a given system, the choices of distributions that result in a thermodynamics should be restricted on physical grounds, e.g., suitable boundary conditions. Whether only those standard distributions should be considered seems an interesting question. The existence of other possibilities, such as the coexistence of different thermodynamics for some parameter values of the system and the nonequivalence of ensembles, also opens the door for descriptions of different phases.
Note that this practical attitude is the acceptance
of the classical approach by Boltzmann and Gibbs, but
with one simple additional ingredient: the idea that the
equivalence of ensembles could be enough to justify the
recipe of statistical mechanics.
6. Conclusions
have not dealt with: the problem of irreversibility; the evolution toward equilibrium and the arrow of time; Khinchin's approach [18, 2] to the ergodic question; ergodicity and phase transitions of systems with an infinite number of degrees of freedom [19, 15]. Note that in order to discuss situations far from equilibrium one should go a step beyond the ideas presented above, in particular for estimating the increase of entropy, and so on.
Finally, we mention a personal view expressed by
G. Gallavotti; in his opinion, despite the lack of proper mathematical tools Boltzmann understood many
points related to the ergodic hypothesis better than we
do now.
Acknowledgments
We thank the anonymous referee for valuable suggestions. C.R. de O. thanks CNPq for partial support. T.W. thanks FAPESP for a scholarship.
Appendix
A short discussion on ergodic theory
The (mathematical) ergodic theory can be considered
a branch of dynamical systems, particularly of those
that preserve a measure. After a discussion of (Lebesgue) integration theory, some abstract results related
to ergodicity are presented. It is expected that this collection of results and ideas could clarify many of the
statements in the main body of this article. Details of
the traditional subject of measure and integration can
be found, for instance, in Refs. [6, 32], and of ergodic
theory in Refs. [5, 14, 26, 27, 29, 42].
Measure and integration
Measure theory is a generalization of the concepts of length, area, etc., as well as of probability and density. For example, in the case of the real line $\mathbb{R}$ the natural length of an interval $(a, b)$ is $(b - a)$, its so-called Lebesgue measure, and one tries to extend this notion to all subsets of $\mathbb{R}$ (generalized lengths can also be considered, e.g., $(b^2 - a^2)$, resulting in other measures). However, by using the Axiom of Choice it is possible to construct subsets of $\mathbb{R}$ whose Lebesgue measure would depend on the way the set is decomposed. So, clearly, such sets cannot be assigned a well-defined length, and one says they are not measurable. A consequence of this remark is that each measure must be defined on a specific domain of measurable sets, called a $\sigma$-algebra, and the theory gets rather involved.
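Definition 1 A $\sigma$-algebra in a set $\Omega$ is a collection $\mathcal{A}$ of subsets of $\Omega$ (and each element of $\mathcal{A}$ is called a measurable set) so that
1. $\Omega$ and $\emptyset$ belong to $\mathcal{A}$.
2. If $A \in \mathcal{A}$ then $\Omega \setminus A \in \mathcal{A}$.
3. If $A_1, A_2, A_3, \dots$ are elements of $\mathcal{A}$, then $\bigcup_{j=1}^{\infty} A_j \in \mathcal{A}$.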
and extends it linearly, that is, for $f = \sum_{j=1}^{n} a_j \chi_{A_j}$, $a_j \in \mathbb{R}$, $A_j$ measurable, $1 \le j \le n$ (the so-called simple functions), then
$$\int f(\xi)\,d\mu(\xi) := \sum_{j=1}^{n} a_j\,\mu(A_j).$$
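The integral of a simple function is just a weighted sum of measures, which a short sketch makes concrete. Here the sets $A_j$ are assumed to be intervals of the real line, and the two measures are the usual length $b-a$ and the generalized length $b^2-a^2$ mentioned above; the function names and the example values are illustrative assumptions.

```python
def integrate_simple(terms, measure):
    """Integral of f = sum_j a_j * chi_{A_j}: returns sum_j a_j * measure(A_j).

    `terms` is a list of (a_j, (left, right)) pairs; each A_j is an interval.
    """
    return sum(a * measure(interval) for a, interval in terms)

def lebesgue(interval):
    left, right = interval
    return right - left            # usual length b - a

def squared_length(interval):
    left, right = interval
    return right**2 - left**2      # generalized length b^2 - a^2, for 0 <= a < b

# f = 2 on [0, 1) and 5 on [1, 3); zero elsewhere (hypothetical example).
f_terms = [(2.0, (0.0, 1.0)), (5.0, (1.0, 3.0))]
print(integrate_simple(f_terms, lebesgue))        # 2*1 + 5*2 = 12.0
print(integrate_simple(f_terms, squared_length))  # 2*1 + 5*8 = 42.0
```

The same simple function has different integrals with respect to different measures, which is why the measure must always be specified.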
Functions $f$ that can be approximated by simple functions $f_n$ in a pointwise way are called measurable functions, and their integrals are defined by the corresponding limit
$$\int f\,d\mu := \lim_{n\to\infty}\int f_n\,d\mu.$$
$$\lim_{t\to\infty}\int_\Omega \left|\frac{1}{t}\int_0^t f(\varphi^s\xi)\,ds - f^*(\xi)\right| d\mu(\xi) = 0. \tag{15}$$
Theorem 3 (Birkhoff) Let $\mu$ be an invariant measure for the flow $\varphi^t$. If $f:\Omega\to\mathbb{R}$ is integrable, then
i) $f^*(\xi) := \lim_{t\to\infty}\frac{1}{t}\int_0^t f(\varphi^s\xi)\,ds$ exists $\mu$-a.e. and the function $f^*$ is also integrable.
ii) $f^*(\varphi^t\xi) = f^*(\xi)$ $\mu$-a.e., that is, $f^*$ is constant over orbits.
iii) $\int f^*(\xi)\,d\mu(\xi) = \int f(\xi)\,d\mu(\xi)$.
The proof of Theorem 2 is much simpler than the proof of Birkhoff's Theorem 3, but the former gives no information on the existence of time averages of individual initial conditions $\xi$, since an integral is present before the limit $t\to\infty$. Item ii in Birkhoff's theorem should be expected. Item iii says that the space average of $f$ coincides with the space average of the time average $f^*$, and an important particular case is when the latter is constant.
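Definition 3 If for each integrable $f$ the time average $f^*$ is constant $\mu$-a.e., then the pair $(\varphi^t, \mu)$ is called ergodic. Note that this implies
$$f^*(\xi) = \int f(\xi')\,d\mu(\xi'), \quad \mu\text{-a.e.},$$
and in particular, for the characteristic function $\chi_B$ of a measurable set $B$,
$$\chi_B^*(\xi) = \lim_{t\to\infty}\frac{1}{t}\int_0^t \chi_B(\varphi^s\xi)\,ds = \mu(B), \quad \mu\text{-a.e.},$$
that is, $\varphi^t\xi$ visits each set $B$ with frequency equal to the measure of $B$.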
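The visiting-frequency statement can be illustrated with a discrete-time stand-in for the flow. This is a sketch under explicit assumptions: instead of a flow $\varphi^t$ it uses the rotation $x \mapsto x + \alpha \pmod 1$ of the unit interval with Lebesgue measure, which is ergodic when $\alpha$ is irrational and not ergodic when $\alpha$ is rational; the set $B=[0.2,0.5)$, the starting point, and the function name are hypothetical choices.

```python
import math

def visit_frequency(x0, alpha, left, right, steps):
    """Fraction of times n < steps with (x0 + n*alpha) mod 1 in [left, right)."""
    hits = 0
    for n in range(steps):
        x = (x0 + n * alpha) % 1.0
        if left <= x < right:
            hits += 1
    return hits / steps

golden = (math.sqrt(5.0) - 1.0) / 2.0   # an irrational rotation angle

# B = [0.2, 0.5) has Lebesgue measure 0.3.
freq_irrational = visit_frequency(0.1, golden, 0.2, 0.5, 100_000)
# With alpha = 1/4 the orbit of 0.1 is {0.1, 0.35, 0.6, 0.85}: one point in B.
freq_rational = visit_frequency(0.1, 0.25, 0.2, 0.5, 100_000)
print(freq_irrational, freq_rational)
```

In the irrational case the frequency approaches the measure of $B$, while in the rational case the orbit is periodic and the frequency is $1/4$, different from the measure $0.3$ of $B$.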
There are a number of characterizations of ergodicity, and we mention some of them ahead. A measurable set $A$ is invariant if $\varphi^t(A) = A$ and $\mu$-invariant if $\mu(\varphi^t A \,\triangle\, A) = 0$, $\forall t \in \mathbb{R}$ (recall that $A \triangle B = (A \setminus B) \cup (B \setminus A)$ is the symmetric difference between the sets $A$ and $B$), and a measurable function $f$ is invariant if $f \circ \varphi^t = f$ $\mu$-a.e.; for simplicity, in what follows invariant measures are supposed to be probability measures.
Theorem 4 Let $\mu$ be an invariant probability measure for $\varphi^t$. The following assertions are equivalent:
i) $(\varphi^t, \mu)$ is ergodic.
$$\int f\,d\mu = \int\left(\int f(y)\,d\mu_\xi(y)\right) d\mu(\xi),$$
with $\mu_\xi$ denoting ergodic measures (properly) associated to points $\xi \in A$ and $\mu(A) = 1$ (note that $f \in L^1(\mu_\xi)$ for any $\xi \in A$).
In case $f = \chi_B$ is the characteristic function of the set $B$ one gets
$$\mu(B) = \int \mu_\xi(B)\,d\mu(\xi),$$
which clarifies that every invariant measure can be written in terms of ergodic measures. In case only two ergodic measures $\mu_1$, $\mu_2$ are present, this decomposition reduces to convex combinations $\lambda\mu_1 + (1-\lambda)\mu_2$, with $0 \le \lambda \le 1$. The ergodic measures are the building blocks of invariant measures.
References
[1] V.I. Arnold, Mathematical Methods of Classical Mechanics (Springer-Verlag, New York, 1997).
[2] R.W. Batterman, Philosophy of Science 65, 183 (1998).