
Matthew Schwartz

Statistical Mechanics, Spring 2021

Lecture 1: Probability

1 Basic probability
We are going to be dealing with systems with enormous degrees of freedom, typically governed by
Avogadro's number $N_A = 6.02 \times 10^{23}$. This is the number of hydrogen atoms in a gram, or more
intuitively, the number of molecules of water in a tablespoon. Even a tiny cell, with a diameter of
only 100 microns ($10^{-4}$ m), contains a trillion molecules. In most areas of physics, we work with
small numbers (the fine structure constant $\alpha = \frac{1}{137}$ for example), and calculate things as a Taylor
series in the coupling, $f(\alpha) = \sum c_n \alpha^n$, often keeping only the leading term $f(\alpha) \approx c_1 \alpha$. In statistical
mechanics, we work with a large number $N$ and calculate things as a Taylor expansion in $\frac{1}{N}$, often
keeping only the leading term ($N = \infty$). The key to doing this is not to ask what each particle is
doing, which would be both impossible and impractical, but rather to ask what the probability is
that a particle is doing something. It is imperative therefore to begin statistical mechanics with
statistics.
In general, we will be interested in probabilities of states of a system, which we write as $P_a$ or
$P(a)$. The parameter $a$ represents the microstate, e.g. the positions $\{\vec{q}_i\}$ and momenta $\{\vec{p}_i\}$ of
all the particles in a gas, or the square of the wavefunction $|\psi(\vec{q})|^2$ in quantum mechanics. We
will sometimes think of $a$ as a discrete index (e.g. if we flip a coin, it can land heads up with
$P_H = \frac{1}{2}$ or tails up with $P_T = \frac{1}{2}$) and sometimes continuous. In the continuous case, we call $P(x)$
the probability density, so that $\int_{x_1}^{x_2} P(x)\, dx$ is the probability of finding $x$ between $x_1$ and $x_2$.
Probability densities only become probabilities when integrated.
We will get to know a number of different probability distributions:

Gaussian: $P(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(x - x_0)^2}{2\sigma^2} \right)$   (1)

Poisson: $P_m(t) = \frac{(\lambda t)^m}{m!}\, e^{-\lambda t}$   (2)

Binomial: $B_N(m) = \frac{N!}{m!\,(N-m)!}\, a^m b^{N-m}$   (3)

Lorentzian: $P(x) = \frac{\Gamma}{2\pi}\, \frac{1}{(x - x_0)^2 + \frac{\Gamma^2}{4}}$   (4)

Flat: $P(x) = \text{constant}$   (5)
Probability distributions are always normalized so that they integrate/sum to 1:

$\int dx\, P(x) = 1; \qquad \sum_a P_a = 1$   (6)

Given a probability distribution, we can calculate the expected value of any observable by
integrating/summing against the probability. For example, the expected value of x (the mean) is
$\bar{x} \equiv \langle x \rangle = \int dx\, x\, P(x)$   (7)

or the mean-square is

$\langle x^2 \rangle = \int dx\, x^2\, P(x)$   (8)

The variance of a distribution is the difference between the mean of the square and the square of
the mean:

$\mathrm{Var} \equiv \langle x^2 \rangle - \langle x \rangle^2$   (9)


The square root of the variance is called the standard deviation.


$\sigma \equiv \sqrt{\langle x^2 \rangle - \langle x \rangle^2}$   (10)
While the mean has the intuitive interpretation as the expected outcome, variance is more subtle.
Indeed, developing intuition for variance is a key to mastering statistics. The key point is that the
expected value is worthless if you don't know how likely that value is.
For example, a Gaussian has two parameters, $x_0$ and $\sigma_0$. The first parameter is the mean:

$\langle x \rangle = \int_{-\infty}^{\infty} dx\, x\, \frac{1}{\sqrt{2\pi}\,\sigma_0} \exp\left( -\frac{(x - x_0)^2}{2\sigma_0^2} \right) = x_0$   (11)

The mean of $x^2$ is

$\langle x^2 \rangle = \sigma_0^2 + x_0^2$   (12)

So that the standard deviation is $\sigma = \sqrt{\langle x^2 \rangle - \langle x \rangle^2} = \sigma_0$. This is why we usually just write $\sigma$ instead
of $\sigma_0$ for the parameter of the Gaussian.
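If you want to check Eqs. (11) and (12) yourself, here is a minimal Mathematica sketch (s0 and x0 stand for $\sigma_0$ and $x_0$):

Pg[x_] := 1/(Sqrt[2 Pi] s0) Exp[-(x - x0)^2/(2 s0^2)];   (* the Gaussian of Eq. (1) *)
Integrate[x Pg[x], {x, -Infinity, Infinity}, Assumptions -> s0 > 0]     (* x0, Eq. (11) *)
Integrate[x^2 Pg[x], {x, -Infinity, Infinity}, Assumptions -> s0 > 0]   (* s0^2 + x0^2, Eq. (12) *)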
The standard deviation has an interpretation as the width of a distribution: how far you can
go from the mean before the probability has decreased substantially. For example, in a Gaussian,
the probability of finding $x$ between $x_0 - \sigma$ and $x_0 + \sigma$ is

$\int_{x_0 - \sigma}^{x_0 + \sigma} dx\, P(x) = 0.68$   (13)

So, for a Gaussian, there is a 68% chance that the value of $x$ falls within 1 standard deviation of the mean.
We will often be interested in situations where the mean is zero. Then the standard deviation
is equivalent to the root-mean-square
$x_{\rm RMS} = \sqrt{\langle x^2 \rangle}$   (14)

For example, in a gas the velocities point in random directions, so $\langle \vec{v}\, \rangle = 0$. Thus the characteristic
speed of a gas is characterized not by the mean but by the RMS velocity $v_{\rm RMS} = \sqrt{\langle \vec{v}^{\,2} \rangle}$.
Another important concept is how probability distributions behave when they are combined.
For example, say $P_A(x)$ and $P_B(y)$ are the probabilities of winning $x$ dollars when betting on horse
A and $y$ dollars when betting on horse B. The probability of getting $z$ total dollars is then

$P_{AB}(z) = \int_{-\infty}^{\infty} dx\, P_A(z - x)\, P_B(x)$   (15)

This is the definition of the mathematical operation of convolution between two functions. We
say $P_{AB}$ is the convolution of $P_A$ and $P_B$ and write it as

$P_{AB} = P_A \ast P_B$   (16)
Convolutions are extremely important in statistical mechanics, since we often measure only the
sum of a great many independent processes. For example, the pressure on the wall of a container
is due to the sum of the forces of all the little molecules hitting it, each with its own probability.
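As a quick illustration, here is a hedged Mathematica sketch: the two payoff distributions below are invented for the example, but histogramming the sum of independent draws approximates the convolution of Eq. (15) for any choice.

zA = RandomVariate[NormalDistribution[10, 3], 10^5];       (* hypothetical payoffs from horse A *)
zB = RandomVariate[ExponentialDistribution[1/5], 10^5];    (* hypothetical payoffs from horse B *)
Histogram[zA + zB, 100, "PDF"]                             (* approximates (P_A * P_B)(z) *)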

1.1 Examples
Consider the system of a gas molecule bouncing around in a 1D box of size L centered on x = 0. If
there are no external forces and no position-dependent interactions, the molecule is equally likely
to be anywhere in the box. So
$P(x) = \frac{1}{L}$   (17)
The mean value of the position of the molecule is
$\langle x \rangle = \frac{1}{L} \int_{-L/2}^{L/2} dx\, x = 0$   (18)

Similarly, the mean value of $x^2$ is

$\langle x^2 \rangle = \frac{1}{L} \int_{-L/2}^{L/2} dx\, x^2 = \frac{L^2}{12}$   (19)

So that the standard deviation is

$\sigma = \sqrt{\langle x^2 \rangle - \langle x \rangle^2} = \frac{1}{\sqrt{12}} L \approx 0.29 L$   (20)

Note that the probability of finding $x$ within $\langle x \rangle \pm \sigma$ is $\frac{2\sigma}{L} = 58\%$. It is not 68% because the
probability distribution is not Gaussian. This illustrates that the interpretation of $\sigma$ as a 68%
confidence interval is not always accurate.
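These numbers are easy to reproduce; a minimal Mathematica check of Eqs. (18)-(20) and the 58% figure:

mean = Integrate[x/L, {x, -L/2, L/2}]                 (* 0, Eq. (18) *)
meansq = Integrate[x^2/L, {x, -L/2, L/2}]             (* L^2/12, Eq. (19) *)
sigma = Simplify[Sqrt[meansq - mean^2], L > 0]        (* L/(2 Sqrt[3]), about 0.29 L *)
Simplify[Integrate[1/L, {x, -sigma, sigma}], L > 0]   (* 1/Sqrt[3], about 0.58 *)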
Suppose instead that there is some electric field so that the particles in the box are more likely
to be on one side than the other. We might find some crazy function $P(x) = \frac{0.74}{L} \ln(1 + e^{2x/L})$ for
these probabilities. Then, by numerical integration we find

$\langle x \rangle = 0.59 L; \qquad \langle x^2 \rangle = 0.42 L^2; \qquad \sigma = 0.28 L$   (21)

Also, $\int_{\langle x \rangle - \sigma}^{\langle x \rangle + \sigma} P(x)\, dx = 0.6$, so 60% of the probability is within $\langle x \rangle \pm \sigma$. This is just a contrived example. You should be
able to compute $\langle x \rangle$ and $\sigma$ with any function $P(x)$, at least numerically, and you will generally
find that not exactly 68% is within $\langle x \rangle \pm \sigma$, but often you get something close.

2 Law of large numbers


An extremely important result from probability is that even if $P(x)$ is very complicated, when you
average over many measurements, the result dramatically simplifies. More precisely, the law of
large numbers states that

    The average of the results from a set of independent trials
    varies less and less the more trials are performed

More mathematically, we can state it this way:

• If $P(x)$ has standard deviation $\sigma$, then the probability $P_N(x)$ of finding that the average
  over $N$ draws from $P(x)$ is $x$ will have standard deviation $\frac{\sigma}{\sqrt{N}}$.

Thus as $N \to \infty$, the standard deviation of the average, $\frac{\sigma}{\sqrt{N}}$, goes to 0.
To derive the law of large numbers, let's consider the probability distribution for the center of
mass of molecules in a box. Say there are $N$ molecules in the box and the probability function for
finding each is $P(x)$. Some examples for $P(x)$ are in Section 1.1. We assume that the probabilities
for each molecule are independent: having one at $x$ does not tell us anything about where the
others might be. In this case, what is the mean value of the center of mass of the system? We'll
write $\langle x \rangle_N$, $\langle x^2 \rangle_N$ and $\sigma_N$ for quantities involving the $N$-body system and drop the subscript for
the $N = 1$ case: $\langle x \rangle_1 = \langle x \rangle$ and $\sigma_1 = \sigma$.
For $N = 2$, the center of mass is $x = \frac{x_1 + x_2}{2}$, so the mean value of the center of mass is

$\langle x \rangle_2 = \int_{-L/2}^{L/2} dx_1 \int_{-L/2}^{L/2} dx_2\, P(x_1) P(x_2)\, \frac{x_1 + x_2}{2} = \int_{-L/2}^{L/2} dx_1\, P(x_1)\, \frac{x_1}{2} + \int_{-L/2}^{L/2} dx_2\, P(x_2)\, \frac{x_2}{2} = \langle x \rangle$   (22)

So the mean value for 2 molecules is the same as for 1 molecule. The expectation of $x^2$ with 2
molecules is

$\langle x^2 \rangle_2 = \int_{-L/2}^{L/2} dx_1 \int_{-L/2}^{L/2} dx_2\, P(x_1) P(x_2) \left( \frac{x_1 + x_2}{2} \right)^2$   (23)

$= \frac{1}{4} \int_{-L/2}^{L/2} dx_1\, P(x_1)\, x_1^2 + \frac{1}{2} \int_{-L/2}^{L/2} dx_1\, P(x_1)\, x_1 \int_{-L/2}^{L/2} dx_2\, P(x_2)\, x_2 + \frac{1}{4} \int_{-L/2}^{L/2} dx_2\, P(x_2)\, x_2^2$   (24)

$= \frac{1}{2} \langle x^2 \rangle + \frac{1}{2} \langle x \rangle^2$   (25)

So the standard deviation of the center-of-mass for 2 particles is:

$\sigma_2 = \sqrt{\langle x^2 \rangle_2 - (\langle x \rangle_2)^2} = \sqrt{\frac{1}{2}\langle x^2 \rangle + \frac{1}{2}\langle x \rangle^2 - \langle x \rangle^2} = \frac{1}{\sqrt{2}} \sqrt{\langle x^2 \rangle - \langle x \rangle^2} = \frac{\sigma}{\sqrt{2}}$   (26)

That is, the standard deviation has shrunk by a factor of $\sqrt{2}$ from the one-particle case for any
$P(x)$.
Now say there are $N$ particles. The mean value of the center of mass is

$\langle x \rangle_N = \int_{-L/2}^{L/2} dx_1 \cdots dx_N\, P(x_1) \cdots P(x_N)\, \frac{x_1 + \cdots + x_N}{N} = N \left[ \frac{1}{N} \int_{-L/2}^{L/2} dx_1\, x_1 P(x_1) \right] = \langle x \rangle$   (27)

independent of $N$. The expectation value of $x^2$ is

$\langle x^2 \rangle_N = \int_{-L/2}^{L/2} dx_1 \cdots dx_N\, P(x_1) \cdots P(x_N) \left( \frac{x_1 + \cdots + x_N}{N} \right)^2$   (28)

When we expand $(x_1 + \cdots + x_N)^2$ there are $N$ terms that give $\langle x^2 \rangle$ and the remaining $(N^2 - N)$
terms are the same as $\langle x_1 x_2 \rangle = \langle x \rangle^2$. So,

$\langle x^2 \rangle_N = \int_{-L/2}^{L/2} dx_1 \cdots dx_N\, P(x_1) \cdots P(x_N)\, \frac{1}{N^2} \left[ N x_1^2 + (N^2 - N)\, x_1 x_2 \right]$   (29)

$= \frac{1}{N} \langle x^2 \rangle + \left( 1 - \frac{1}{N} \right) \langle x \rangle^2$   (30)
Therefore

$\sigma_N = \sqrt{\langle x^2 \rangle_N - \langle x \rangle^2} = \frac{1}{\sqrt{N}} \sqrt{\langle x^2 \rangle - \langle x \rangle^2} = \frac{\sigma}{\sqrt{N}}$   (31)

The appearance of $\sqrt{N}$ is called the law of large numbers. Note that Eq. (31), describing how
the standard deviation scales as we average over many molecules, holds for any function $P(x)$.
Different $P(x)$ will give different values of $\sigma$, but the relation between $\sigma_N$ with $N$ molecules and
$\sigma$ with one molecule is universal.
For the gas in the box with a flat $P(x) = \frac{1}{L}$, as in Section 1.1, the expected value of the center
of mass is $\langle x \rangle_N = 0$, just like for any individual gas molecule, and the standard deviation is
$\sigma_N = \frac{\sigma}{\sqrt{N}} \approx 10^{-11} \frac{L}{\sqrt{12}}$. Thus, even though we don't know very well where any of the molecules are,
we know the center of mass to extraordinary precision.
The law of large numbers is the reason that statistical mechanics is possible: we can compute
macroscopic properties of systems (like the center of mass, or pressure, or all kinds of other things)
with great confidence even if we don't know exactly what is going on at the microscopic level.
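Here is a minimal Monte Carlo sketch of Eq. (31) in Mathematica; the exponential distribution below is just an arbitrary stand-in for $P(x)$:

dist = ExponentialDistribution[2];     (* any P(x) will do *)
ndraws = 100;
averages = Table[Mean[RandomVariate[dist, ndraws]], {10^4}];
{StandardDeviation[averages], N[StandardDeviation[dist]/Sqrt[ndraws]]}   (* roughly agree, Eq. (31) *)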

3 Central Limit Theorem


We saw that when we average over a large number $N$ of draws from a probability distribution
$P(x)$, the mean stays fixed and the standard deviation shrinks, $\sigma \to \frac{\sigma}{\sqrt{N}}$. What can we say about
the shape of the probability distribution $P_N(x)$? It turns out we can say a lot. In fact, in the limit
$N \to \infty$ we know $P_N(x)$ exactly: it is a Gaussian!
More precisely, the central limit theorem states that

    When any probability distribution is sampled $N$ times,
    the average of the samples approaches a Gaussian distribution as $N \to \infty$,
    with width scaling like $\sigma \sim \frac{1}{\sqrt{N}}$

There are a lot of ways to prove it. I find the moment approach the most accessible, as discussed
next. Another proof using convolutions and Fourier transforms is in Appendix C.

3.1 CLT proof using moments


One way to prove the central limit theorem is by computing moments. If you specify the complete
set of moments of a function, you know its shape completely. These moments are

mean: $\bar{x} = \langle x \rangle$   (32)

variance: $\sigma^2 = \langle (x - \bar{x})^2 \rangle = \langle x^2 \rangle - \bar{x}^2$   (33)

skewness: $S = \frac{\langle (x - \bar{x})^3 \rangle}{\sigma^3} = \frac{1}{\sigma^3} \left[ \langle x^3 \rangle - 3\bar{x}\langle x^2 \rangle + 2\bar{x}^3 \right]$   (34)

kurtosis: $K = \frac{\langle (x - \bar{x})^4 \rangle}{\sigma^4}$   (35)

$n$th moment: $M_n = \frac{\langle (x - \bar{x})^n \rangle}{\sigma^n}$   (36)
Skewness measures how asymmetric a distribution is around its mean. Kurtosis measures the 4th
moment; more intuitively, higher kurtosis means a probability distribution has a longer tail, i.e. more
outliers from the mean. The higher moments do not have simple interpretations.
Notice that all the higher-order moments are normalized by dividing by powers of $\sigma$ so that
they are dimensionless. To understand this normalization, imagine plotting $P_N(x)$, but shift it to
center around $x = 0$ and rescale the $x$ axis by $\sigma$ so that the width is always 1. Then the curve will
not get any narrower as $N \to \infty$ because its width is fixed to be 1, but its shape may change. The
shape is determined by the numbers $M_n$ with $n > 2$. See Fig. 2 below for an example.
For the Gaussian probability distribution in Eq. (1) the moments are easy to calculate in
Mathematica:

$\bar{x} = x_0; \quad \sigma = \sigma; \quad S = 0; \quad K = 3; \quad M_5 = 0; \quad M_6 = 15; \quad M_7 = 0; \quad M_8 = 105; \;\cdots$ (Gaussian)   (37)

Note that the skewness is zero for a Gaussian because it is symmetric. For a Gaussian, in fact, all
the odd moments ($M_n$ with $n$ odd) vanish. The even moments, normalized to powers of $\sigma$, are
dimensionless numbers given by the formula

$M_n = \begin{cases} 0, & n \text{ odd} \\ \dfrac{2^{-n/2}\, n!}{(n/2)!}, & n \text{ even} \end{cases}$   (38)

These $M_n$ completely determine the shape of a Gaussian. If a function has all of these moments,
it is a Gaussian.
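Indeed, the check takes one line in Mathematica using the built-in CentralMoment (s stands for $\sigma$):

g = NormalDistribution[x0, s];
Table[CentralMoment[g, n]/s^n, {n, 3, 8}]   (* {0, 3, 0, 15, 0, 105}, matching Eqs. (37)-(38) *)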
Now let's compute the moments of the center of mass of our $N$ molecules-in-a-box with probability
$P(x)$. We'll do this for a general $P(x)$, but shift the domain so that $\langle x \rangle = \bar{x} = 0$ in order to
simplify the formulas in Eqs. (33)-(36). For example, the 3rd moment of $P_N(x)$ is

$\langle x^3 \rangle_N = \int_{-L/2}^{L/2} dx_1 \cdots dx_N\, P(x_1) \cdots P(x_N) \left( \frac{x_1 + \cdots + x_N}{N} \right)^3$   (39)

Since $\langle x \rangle = 0$ the only terms in this expression which don't vanish are the ones of the form $x_j^3$. So

$\langle x^3 \rangle_N = \frac{1}{N^2} \langle x^3 \rangle$   (40)

We conclude that the skewness $S_N$ with $N$ molecules is related to the skewness $S_1$ for 1 molecule by

$S_N = \frac{\langle (x - \bar{x})^3 \rangle_N}{\sigma_N^3} = \frac{\langle (x - \bar{x})^3 \rangle / N^2}{(\sigma/\sqrt{N})^3} = \frac{S_1}{\sqrt{N}}$   (41)

In particular, the skewness goes to zero as $N \to \infty$. That is, the distribution becomes more and
more symmetric about the mean as $N \to \infty$.
Now let's look at the 4th moment, the kurtosis. Following the same method we need

$\langle x^4 \rangle_N = \int_{-L/2}^{L/2} dx_1 \cdots dx_N\, P(x_1) \cdots P(x_N) \left( \frac{x_1 + \cdots + x_N}{N} \right)^4$   (42)

In this case, since $\langle x \rangle = 0$, the terms that don't vanish are $x_j^4$ or $x_i^2 x_j^2$ with $i \neq j$. Thinking about the
combinatorics a little you can convince yourself that there are $N$ terms of the form $x_i^4$ and $3N(N-1)$
terms of the form $x_i^2 x_j^2$.$^1$ So,

$\langle x^4 \rangle_N = \frac{1}{N^3} \langle x^4 \rangle + \frac{3(N-1)}{N^3} \langle x^2 \rangle \langle x^2 \rangle$   (43)

Then, calling $K_1 = \frac{1}{\sigma^4}\langle x^4 \rangle$ the kurtosis for $N = 1$, we have

$K_N = \frac{\langle (x - \bar{x})^4 \rangle_N}{\sigma_N^4} = \frac{1}{\sigma^4/N^2} \left[ \frac{1}{N^3} \langle x^4 \rangle + \frac{3(N-1)}{N^3} \langle x^2 \rangle \langle x^2 \rangle \right] = \frac{K_1}{N} + 3\left( 1 - \frac{1}{N} \right)$   (44)

This is interesting: it says that as $N \to \infty$ the kurtosis $K_N \to 3$, independent of the kurtosis of the
one-particle probability distribution! So the skewness goes to zero and the kurtosis goes to 3.
For the 6th moment, the term which dominates at large $N$ is the non-vanishing one with the
largest combinatoric factor: $\langle x^2 \rangle^3$. There are $\binom{N}{3}\binom{6}{2}\binom{4}{2} = \frac{1}{6}N(N-1)(N-2) \cdot 15 \cdot 6 \approx 15\, N^3$ of
these. So $(M_6)_N \to 15$. Similarly, $(M_8)_N \to 105$. In other words, for any $P(x)$ we find that as $N \to \infty$

$S_N \to 0; \quad K_N \to 3; \quad (M_5)_N \to 0; \quad (M_6)_N \to 15; \quad (M_7)_N \to 0; \quad (M_8)_N \to 105; \;\cdots$   (45)

What we are seeing is that at large $N$ all of the higher moments go to those of a Gaussian! If you
work out the details, the general formula is

$(M_n)_N \to \begin{cases} 0, & n \text{ odd} \\ \dfrac{2^{-n/2}\, n!}{(n/2)!}, & n \text{ even} \end{cases}$   (46)

in exact agreement with the moments of a Gaussian, Eq. (38). Thus we always get a Gaussian and the
central limit theorem is proven. Another proof using convolutions is in Appendix C.
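A quick numerical illustration of Eqs. (41), (44) and (45), using an exponential distribution (a deliberately skewed choice) as the single-draw $P(x)$:

avg[n_] := Table[Mean[RandomVariate[ExponentialDistribution[1], n]], {2 10^4}];
{Skewness[avg[1]], Kurtosis[avg[1]]}        (* roughly {2, 9}: far from Gaussian *)
{Skewness[avg[100]], Kurtosis[avg[100]]}    (* close to {0, 3}: the Gaussian values *)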

3.2 Combining flat distributions


Because the central limit theorem is so important, let's try to understand why it is true more
physically. Again, say we have some probability distribution $P(x)$ for molecules in a box, with
$-\frac{L}{2} < x < \frac{L}{2}$. We want to pick $N$ molecules and compute their mean position (center of mass
position) $x = \frac{1}{N}\sum_j x_j$. What is the probability distribution $P_N(x)$ that the mean value is $x$?
To be concrete, let's take the flat distribution $P(x) = \frac{1}{L}$. For $N = 1$, we pick only one molecule with
position $x_1$. Then $x = x_1$ and so $P_1(x) = \frac{1}{L}$: any value for the center-of-mass position is equally likely.
Now say $N = 2$, so we pick two molecules with positions $x_1$ and $x_2$. What is the probability that
they will have mean $x$? For a given $x$ we need $\frac{x_1 + x_2}{2} = x$. For example, if $x = 0$, then for any $x_1$ there
is an $x_2$ that works, namely $x_2 = -x_1$. However, if the mean is all the way on the edge, $x = \frac{L}{2}$, then
not all $x_1$ work; in fact, we need both $x_1$ and $x_2$ to be exactly $\frac{L}{2}$. Thus there are fewer possibilities
when $x$ is close to the boundaries of the box than if $x$ is central. One way to see this is graphically

1. There are $\binom{N}{1} = N$ of the $x_j^4$ terms. There are $\binom{N}{2} = \frac{N!}{2!\,(N-2)!} = \frac{N(N-1)}{2}$ possible pairs $i \neq j$ and there
are $\binom{4}{2} = 6$ ways of picking which two of the 4 terms in the expansion are $i$. So the total number of these terms is
$3N(N-1)$.

Figure 1. The regions in the $x_1$/$x_2$ plane with mean value $x$ are diagonal lines for $L = 2$. The length of
the line is the probability $P_2(x)$. For $x = 0$, the line is longest and probability greatest. For $x = 1$, the line
reduces to a point and the probability to zero.

To be quantitative, the easiest way to calculate the probability is with the Dirac $\delta$-function
$\delta(x)$ (see Appendix A for a refresher on $\delta(x)$). Using the $\delta$-function, we can write the probability
for getting a mean value $x = \frac{x_1 + x_2}{2}$ as

$P_2(x) = \int_{-L/2}^{L/2} dx_1\, P(x_1) \int_{-L/2}^{L/2} dx_2\, P(x_2)\, \delta\left( \frac{x_1 + x_2}{2} - x \right)$   (47)

This is another way of writing a convolution, as in Eq. (15): $P_2 = P \ast P$.


As a check, we can verify that this probability distribution is normalized correctly:

$\int_{-L/2}^{L/2} dx\, P_2(x) = \int_{-L/2}^{L/2} dx \int_{-L/2}^{L/2} dx_1\, P(x_1) \int_{-L/2}^{L/2} dx_2\, P(x_2)\, \delta\left( \frac{x_1 + x_2}{2} - x \right)$

$\qquad\qquad = \int_{-L/2}^{L/2} dx_1\, P(x_1) \int_{-L/2}^{L/2} dx_2\, P(x_2) = 1$   (48)

where we have used the $\delta$-function to integrate over $x$ to get to the second line.
To evaluate $P_2(x)$ we first pull a factor of 2 out of the $\delta$-function using Eq. (82), giving

$P_2(x) = 2 \int_{-L/2}^{L/2} dx_1\, P(x_1) \int_{-L/2}^{L/2} dx_2\, P(x_2)\, \delta(x_1 + x_2 - 2x)$   (49)

Now, the $\delta$-function can only fire if its argument hits zero in the integration region. Since $\frac{x_1 + x_2}{2} = x$,
we can solve for $x_1 = 2x - x_2$. If $x < 0$ then the most $x_1$ can be is $2x - \left( -\frac{L}{2} \right) = \frac{L}{2} + 2x$. In other
words, we have

$P_2(x < 0) = 2 \int_{-L/2}^{L/2 + 2x} dx_1\, P(x_1)\, P(2x - x_1)$   (50)
Taking the flat distribution $P(x) = \frac{1}{L}$, this evaluates to $P_2(x < 0) = \frac{2L + 4x}{L^2}$. Similarly, for $x > 0$
the limit is $x_1 > 2x - \frac{L}{2}$ and for a flat distribution $P_2(x > 0) = \frac{2L - 4x}{L^2}$. Thus we have

$P_2(x) = \begin{cases} \dfrac{2L + 4x}{L^2}, & x < 0 \\ \dfrac{2L - 4x}{L^2}, & x > 0 \end{cases}$   (51)
You can also check this by evaluating Eq. (47) with Mathematica (here $L = 2$, so $P(x_1) = P(x_2) = \frac{1}{2}$,
and the overall factor $2 \cdot \frac{1}{2} \cdot \frac{1}{2}$ comes from rescaling the $\delta$-function as in Eq. (49)):

P = 2 (1/2) (1/2) Integrate[DiracDelta[x1 + x2 - 2 x], {x1, -1, 1}, {x2, -1, 1}];
Plot[P, {x, -1, 1}]

For $N = 3$ we compute

$P_3(x) = \int_{-L/2}^{L/2} dx_1\, P(x_1) \int_{-L/2}^{L/2} dx_2\, P(x_2) \int_{-L/2}^{L/2} dx_3\, P(x_3)\, \delta\left( \frac{x_1 + x_2 + x_3}{3} - x \right)$   (52)
and so on. These successive approximations look like

Figure 2. The average position of N = 1, 2, 3, 4 particles, each of which separately has a flat probability
distribution.

We see that already at N = 4 the flat probability distribution is becoming a Gaussian. Note
also that the widths of the distributions are getting narrower.
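You can reproduce the trend in Figure 2 with a few lines of Mathematica; this is only a sampling sketch, with the box taken to run from $-\frac{1}{2}$ to $\frac{1}{2}$:

avg[n_] := Table[Mean[RandomVariate[UniformDistribution[{-1/2, 1/2}], n]], {10^5}];
GraphicsRow[Histogram[avg[#], 50, "PDF"] & /@ {1, 2, 3, 4}]   (* flat, then triangular, then increasingly Gaussian *)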
The central limit theorem says that the distribution of the mean of $N$ draws from a probability
distribution approaches a Gaussian of width $\frac{\sigma}{\sqrt{N}}$ as $N \to \infty$, independent of the original
probability distribution. That is,

$P_N(x) \to \sqrt{\frac{N}{2\pi\sigma^2}}\, \exp\left( -N\frac{(x - \bar{x})^2}{2\sigma^2} \right)$   (53)
Sometimes we sum the values of the draws from a distribution instead of averaging them. In
this case, the mean grows as $\bar{x} \to N\bar{x}$ and the standard deviation grows like $\sigma \to \sqrt{N}\,\sigma$. Thus an
equivalent phrasing of the central limit theorem is

• Central Limit Theorem: A function with mean $\bar{x}$ and standard deviation $\sigma$ convolved
  with itself $N$ times approaches a Gaussian with mean $N\bar{x}$ and standard deviation $\sqrt{N}\,\sigma$ as
  $N \to \infty$.

Summing the values is what happens when you convolve a function with itself. So for summing
the values, the central limit theorem has the form

$P_N^{\rm sum}(x) = \underbrace{P \ast P \ast \cdots \ast P}_{N} \to \frac{1}{\sqrt{2\pi N \sigma^2}} \exp\left( -\frac{(x - N\bar{x})^2}{2N\sigma^2} \right)$   (54)

A proof of the CLT using convolutions is in Appendix C.


We put the "sum" superscript to remind ourselves that we sum the values from each draw from
$P(x)$ rather than average their values. The relation is simply

$P_N^{\rm sum}(x) = \frac{1}{N} P_N\left( \frac{x}{N} \right)$   (55)

The $\frac{1}{N}$ comes from the fact that the probability distributions are differential, so we should technically
write $P_N^{\rm sum}(x)\, dx = P_N\left( \frac{x}{N} \right) d\frac{x}{N}$. Note when we average $\bar{x} \to \bar{x}$ and $\sigma \to \frac{\sigma}{\sqrt{N}}$, and when we sum
$\bar{x} \to N\bar{x}$ and $\sigma \to \sqrt{N}\,\sigma$, so either way

$\frac{\sigma}{\bar{x}} \to \frac{1}{\sqrt{N}} \frac{\sigma}{\bar{x}}$   (56)

Thus a foolproof way to think of the scaling is that the dimensionless ratio $\frac{\sigma}{\bar{x}}$ should decrease as
$\frac{1}{\sqrt{N}}$.

3.3 Why we take logarithms in statistical mechanics


In statistical mechanics, we will make great use of the central limit theorem. Generally we
have systems composed of enormously large numbers of particles, $N \sim$ Avogadro's number $\sim 10^{24}$.
The things we measure are macroscopic: the pressure a gas puts on a wall is the average pressure.
Microscopically, the gas has a bunch of little molecules hitting and bouncing off the wall, and the
force these molecules impart is constantly varying. We don't care about these tiny fluctuations,
just the average. So any time we try to measure something, like the pressure in a gas, or the
concentration of a chemical, we will necessarily be averaging over an enormous number of fluctuations.
Because of the central limit theorem, the distribution of any macroscopic quantity will be close to
a Gaussian around its mean. The central limit theorem itself doesn't tell us what the mean is, or
how various macroscopic quantities are related; we need physics for that. But it tells us that we
don't need to worry about the precise details of the microscopic description.
Normally when a function $f(x)$ is rapidly falling away from $x \approx \bar{x}$, we Taylor expand around $x = \bar{x}$
and keep the first few terms. We can do this for $P_N(x)$ too. However, the Taylor expansion of a
Gaussian has an infinite number of terms:

$e^{-\frac{x^2}{2\sigma^2}} = \sum_{m=0}^{\infty} \frac{1}{m!}\left( -\frac{x^2}{2\sigma^2} \right)^m = 1 - \frac{x^2}{2\sigma^2} + \frac{1}{2}\left( \frac{x^2}{2\sigma^2} \right)^2 - \frac{1}{6}\left( \frac{x^2}{2\sigma^2} \right)^3 + \cdots$   (57)

You need all the terms to reconstruct the original Gaussian. However, if we take the logarithm
first, then Taylor expand, we find

$\ln e^{-\frac{x^2}{2\sigma^2}} = -\frac{x^2}{2\sigma^2}$   (58)

with only one term. So it will be extremely convenient to start taking the logarithms of our
probabilities. By the central limit theorem, when we average the values,

$\ln P_N(x) \to -N\frac{(x - \bar{x})^2}{2\sigma^2} + \ln\sqrt{\frac{N}{2\pi\sigma^2}}$   (59)

As $N \to \infty$ there are no higher-order terms.
In other words, a Gaussian is an unusual function. It is flat near the peak, but then quickly
drops off and has a long tail. Since the function is smooth near the peak, it's hard to know what's
going on at the tail from expanding near the peak. In particular, you have to work very hard to
get information about points with $x \gtrsim \sigma$ from information at the peak. Taking the logarithm puts
the peak and the tail on the same footing. Of course, we can't get something for nothing: taking
logarithms alone won't solve any problems. But taking logarithms often makes it easier to solve
problems. We will see many examples of this as the course progresses.
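A small Mathematica illustration of the contrast between Eq. (57) and Eq. (58) (s stands for $\sigma$):

Series[Exp[-x^2/(2 s^2)], {x, 0, 8}]   (* Eq. (57): new terms appear at every order *)
(* taking the log first, as in Eq. (58), leaves just -x^2/(2 s^2), with no higher-order terms *)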

4 Poisson distribution
In many physical situations, there is a large number $N$ of possible events, each occurring with very
small probability in a given time interval. For example, if you put a glass out in the rain, there
are lots of possible drops of water that could fall into the glass, but each has a small probability.
Or you have lots of friends on Instagram, each of whom has a small probability of posting something
interesting. Or we have a gas of molecules and each one has a small chance of being in some tiny
volume. Probabilities in situations like this, where each event is uncorrelated with the previous
event, are described by the Poisson distribution.
Let's take a concrete example: radioactive decay. A block of $^{235}$U has $N \approx 10^{24}$ atoms, each of
which can decay with a tiny probability

$dP = \lambda\, dt$   (60)

$\lambda$ is called the decay rate. It has units of $\frac{1}{\text{time}}$. For a single atom of $^{235}$U, this decay rate is
$\lambda = 3 \times 10^{-17}\,{\rm s}^{-1}$. In a mole of uranium ($10^{24}$ atoms), $10^7$ uranium atoms decay, on average, each
second. What is the chance of seeing $m$ decays in a time $t$?
Let's start with $m = 0$ and the time $t$ very small (compared to $\frac{1}{\lambda}$), $t = \delta t$. If the rate to decay
is $dP = \lambda\, dt$ then the probability of not decaying in time $t = \delta t$ is

$P_{\rm no\,decay}(\delta t) = 1 - \lambda\, \delta t$   (61)

For the system to survive to a time $2\,\delta t$ with no decays, it would have to not decay in $\delta t$ and
then not decay again in the next $\delta t$. Since the probability of two uncorrelated occurrences (or
not-occurrences in this case) is the product of the probabilities, $P(a \,\&\, b) = P(a)P(b)$, we then have

$P_{\rm no\,decay}(2\,\delta t) = (1 - \lambda\, \delta t)^2$   (62)

Now we can get all the way to time $t$ by sewing together small times $\delta t = \frac{t}{N}$ and taking $N \to \infty$.
We thus have

$P_{\rm no\,decay}(t) = \lim_{N \to \infty} \left( 1 - \lambda\frac{t}{N} \right)^N = e^{-\lambda t}$   (63)
So that's the m = 0 case: no particles decay.
Using this formula, how long will it take for the probability of some decay to be $\frac{1}{2}$? That's the
same as the probability of no decay being $1 - \frac{1}{2} = \frac{1}{2}$. So we just solve

$\frac{1}{2} = e^{-\lambda t_{1/2}} \;\Rightarrow\; t_{1/2} = \frac{1}{\lambda}\ln 2 = \frac{0.7}{\lambda}$   (64)

We often say $\frac{1}{\lambda}$ is the lifetime and $t_{1/2}$ is the half-life. The two numbers are related by a factor
of $\ln 2$: $t_{1/2} = \frac{1}{\lambda}\ln 2$.
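Plugging in the quoted single-atom rate for $^{235}$U gives a feel for the numbers (a rough conversion of $3.15 \times 10^7$ seconds per year is assumed):

lambda = 3.*10^-17;        (* s^-1, from the text *)
thalf = Log[2]/lambda      (* about 2.3*10^16 s *)
thalf/(3.15*10^7)          (* about 7*10^8 years *)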
Now try $m = 1$. We need the probability that there is exactly one decay in exactly one of the
time intervals. There are $N$ intervals we can pick. So,

$P_{\rm 1\,decay}(t) = \lim_{N \to \infty} N \underbrace{\left( 1 - \lambda\frac{t}{N} \right)^{N-1}}_{N-1\ \text{no decays}} \underbrace{\left( \lambda\frac{t}{N} \right)}_{\text{one decay}} = \lim_{N \to \infty} \left( -t\,\partial_t \right)\left( 1 - \lambda\frac{t}{N} \right)^N$   (65)

In the third term, we have simply rewritten the expression in a smart way with a derivative so we
can reduce it to a previously solved problem, a powerful physicist trick. Now we switch the order
of the limit and the $\partial_t$ and use Eq. (63) to get

$P_{\rm 1\,decay}(t) = -t\,\partial_t P_{\rm no\,decay}(t) = \lambda t\, e^{-\lambda t}$   (66)

For two decays there are $\binom{N}{2} = \frac{N!}{(N-2)!\,2!} = \frac{1}{2}N(N-1)$ ways and we have

$P_{\rm 2\,decays}(t) = \lim_{N \to \infty} \underbrace{\frac{N(N-1)}{2}}_{\text{pick 2 to decay}} \underbrace{\left( 1 - \lambda\frac{t}{N} \right)^{N-2}}_{N-2\ \text{no decays}} \underbrace{\left( \lambda\frac{t}{N} \right)^2}_{\text{two decays}} = \frac{1}{2} t^2\,\partial_t^2\, P_{\rm no\,decay}(t) = \frac{(\lambda t)^2}{2}\, e^{-\lambda t}$   (67)

For general $m$ the result is

$P_m(t) = \frac{(\lambda t)^m}{m!}\, e^{-\lambda t}$   (68)

This is called the Poisson distribution. It gives the probability for exactly $m$ events in time $t$
when each event has a probability per unit time of $\lambda$ and the events are uncorrelated.
In any time $t$ there must have been some number of decays between 0 and $\infty$. Indeed,

$\sum_m P_m(t) = \sum_m \frac{(\lambda t)^m}{m!}\, e^{-\lambda t} = 1$   (69)

So that's consistent (as is the $t$-independence of this sum).


The way we derived the Poisson distribution was for a fixed $m$, as a function of $t$. But it can be
more useful to think of it as a function of $m$ at a fixed value of $t$: $P(m; t) = P_m(t)$. Keep in mind
though that for fixed $t$, $P(m; t)$ as a function of $m$ is a discrete probability distribution (meaning
$m$ is an integer). In contrast, for fixed $m$, $P(m; t)$ is a continuous function of $t$. Moreover, while
it is a normalized probability in $m$, it is simply a function (not a probability distribution) of $t$.
There is no sense in which $\int dt\, P_m(t) = 1$; this doesn't even have the right units.
For a given fixed $t$, how many particles do we expect to have decayed? In other words, what is
the expected value $\langle m \rangle$ in a time $t$? We compute the mean value of $m$ by summing the value of
$m$ times the probability of getting $m$:

$\langle m \rangle = \sum_m m\, P_m(t) = \sum_m m\, \frac{(\lambda t)^m}{m!}\, e^{-\lambda t} = \lambda t$   (70)

The last step is a little tricky; see if you can figure out how to do the sum yourself. (You can
always run Mathematica if you get stuck on steps like this.) The result implies that the expected
number of decays in a time $t$ is $\lambda t$. It makes sense that if you double the time, twice as many
particles decay. How long will it take for half the particles to decay?
The standard deviation of the Poisson distribution is

$\sigma = \sqrt{\langle m^2 \rangle - \langle m \rangle^2} = \sqrt{\lambda t}$   (71)

Again, you can check this yourself as an exercise.
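Both checks are one-liners in Mathematica (writing mu for $\lambda t$):

Sum[mu^m/m! Exp[-mu], {m, 0, Infinity}]       (* 1: the normalization, Eq. (69) *)
Sum[m mu^m/m! Exp[-mu], {m, 0, Infinity}]     (* mu: the mean, Eq. (70) *)
Sum[m^2 mu^m/m! Exp[-mu], {m, 0, Infinity}]   (* mu (1 + mu), so the variance is mu, Eq. (71) *)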
So the Poisson distribution as a function of $m$ at fixed $t$ has mean $\lambda t$ and width $\sqrt{\lambda t}$. Thus the
width compared to the mean is

$\frac{\sigma}{\langle m \rangle} = \frac{1}{\sqrt{\lambda t}}$   (72)

This goes to 0 as $t \to \infty$. In other words, the Poisson distribution gets narrower and narrower as $t$
gets larger. What does this mean physically? It means if we wait one lifetime ($t = \frac{1}{\lambda}$) we should
expect $1 \pm 1$ particle to decay. If we wait 2 lifetimes, we expect $2 \pm \sqrt{2}$ to decay ($t = \frac{2}{\lambda}$, $\langle m \rangle = 2$
and $\sigma = \frac{2}{\sqrt{2}} = \sqrt{2}$). If we wait 100 lifetimes, we expect $100 \pm 10$ to decay. So the longer we wait,
not only are there more decays, but we know more precisely how many decays there will be. This
is, of course, a consequence of the central limit theorem.
So what do you expect the distribution to look like as $t \to \infty$ or $m \to \infty$? Let's first look
numerically. We can plot $P_m(t)$ as a function of $m$, which is a discrete index, or as a function of $t$,
which is continuous:

Figure 3. The Poisson distribution as a function of the discrete index $m$ for various times (left), and as a
function of time for various values of $m$ (right).

We clearly see the Gaussian shape emerging at large t (left) and at large m (right).
Now let's try to see how the Gaussian form arises analytically. First of all, we want the high-statistics
limit, which means large $t$ in units of $\frac{1}{\lambda}$, which also means large $m$. When you see a factor
of $m!$ and want to expand at large $m$, you should immediately think Stirling's approximation:

$x! \approx e^{-x} x^x \times (\cdots)$   (73)

or equivalently

$\ln x! \approx x\ln x - x + \cdots$   (74)

For a simple derivation, see Appendix B. We will use this expansion a lot.
The log of the Poisson distribution is

$\ln P_m(t) = \ln\left[ \frac{(\lambda t)^m}{m!}\, e^{-\lambda t} \right] = m\ln(\lambda t) - \lambda t - \ln m!$   (75)

Then we use Stirling's approximation for $m!$:

$\ln P_m(t) \;\xrightarrow{m \gg 1}\; m\ln(\lambda t) - \lambda t - m\ln m + m + \cdots = m\ln\frac{\lambda t}{m} + (m - \lambda t) + \cdots$   (76)

This is still a mess. But we expect $P_m(t)$ to be peaked around its mean $\langle m \rangle = \lambda t$. So let's Taylor
expand $\ln P_m(t)$ around $m = \lambda t$. The leading term, from setting $m = \lambda t$, makes Eq. (76) vanish.
The next term is

$\left. \frac{\partial}{\partial m} \ln P_m(t) \right|_{m = \lambda t} = \lim_{m \to \lambda t} \left[ \ln(\lambda t) - \ln m \right] = 0$   (77)

which also vanishes. We have to go one more order in the Taylor expansion to get a nonzero answer:

$\left. \frac{\partial^2}{\partial m^2} \ln P_m(t) \right|_{m = \lambda t} = \lim_{m \to \lambda t} \left( -\frac{1}{m} \right) = -\frac{1}{\lambda t}$   (78)

Thus,

$\ln P_m(t) = -\frac{1}{2\lambda t}(m - \lambda t)^2 + \cdots$   (79)

and therefore

$P_m(t) \;\xrightarrow{m \gg 1}\; \frac{1}{\sqrt{2\pi\lambda t}}\, e^{-\frac{(m - \lambda t)^2}{2\lambda t}}$   (80)

This is a Gaussian with mean $\langle m \rangle = \lambda t$ and width $\sigma = \sqrt{\lambda t}$, exactly as expected by the central limit
theorem.
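To see Eq. (80) at work, here is a hedged Mathematica comparison of the exact Poisson distribution with its Gaussian limit, for the arbitrary choice $\lambda t = 100$:

mu = 100.;
DiscretePlot[{mu^m/m! Exp[-mu], 1/Sqrt[2 Pi mu] Exp[-(m - mu)^2/(2 mu)]}, {m, 60, 140}]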
You might not be terribly impressed with this derivation as a check of the central limit theorem.
After all, we expanded $\ln P_m$ to second order around $m = \langle m \rangle$. Doing that for any function $P_m$
is guaranteed to give a Gaussian. But that's really the whole point of the central limit theorem:
any function does give a Gaussian. So in the end you should be impressed after all.

5 Summary
In this lecture, we introduced the basic concepts from probability that will be useful for statistical
mechanics. The key concepts are
• Normalized probability distributions $P(x)$ with $\int dx\, P(x) = 1$

• Mean: $\bar{x} = \langle x \rangle = \int dx\, x\, P(x)$

• Variance: ${\rm var} = \int dx\, (x - \bar{x})^2 P(x)$

• Standard deviation or width: $\sigma = \sqrt{\rm var}$

• The Gaussian distribution $P(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(x - \bar{x})^2}{2\sigma^2} \right)$ has mean $\bar{x}$ and width $\sigma$.

• If you draw $x$ from a Gaussian, it is 68% likely to be between $\bar{x} - \sigma$ and $\bar{x} + \sigma$.

• The convolution of two distributions is defined as $(P_A \ast P_B)(z) = \int_{-\infty}^{\infty} dx\, P_A(z - x) P_B(x)$.
  It describes the probability of getting $z$ as the sum of a number drawn from $P_A$ and another
  number drawn from $P_B$.

• Given a probability distribution $P(x)$ with mean $\bar{x}$ and width $\sigma$, you can construct a new
  probability distribution $P_N(x)$ by averaging over $N$ draws from $P(x)$. The central limit
  theorem (CLT) says that as $N \to \infty$ this new distribution will approach a Gaussian with
  the same mean as $P(x)$ ($\bar{x}_N = \bar{x}$) and a smaller standard deviation $\sigma_N \approx \frac{\sigma}{\sqrt{N}}$. All other
  properties of $P(x)$ are lost after this averaging at large $N$.

• The CLT also implies that if we sum (rather than average) the values from the draws, the mean
  grows like $\bar{x}_N \approx N\bar{x}$ and the standard deviation like $\sigma_N \approx \sqrt{N}\,\sigma$.

• Because of the CLT, Gaussians are very common. Their exponential decay encourages us to
  study logarithms of distributions, which turns fast-varying exponentials into slow-varying
  polynomials: $\ln e^{-\frac{x^2}{2\sigma^2}} = -\frac{x^2}{2\sigma^2}$.

• When we have a rate $dP = \lambda\, dt$ for an event happening that is independent of time, then
  the probability of having $m$ events after a time $t$ is described by the Poisson distribution
  $P_m(t) = \frac{(\lambda t)^m}{m!} e^{-\lambda t}$.

• Stirling's approximation is that $N! \approx \sqrt{2\pi N}\, N^N e^{-N}$ at large $N$. This works very well,
  even at $N = 1$.

Appendix A Dirac δ-function


The Dirac $\delta$-function is very useful in physics, from quantum mechanics to statistical mechanics.
The $\delta$-function is not really a function but rather a distribution. $\delta(x)$ is zero everywhere except at
$x = 0$. When you integrate a function against $\delta(x)$ you pick up the value of that function at 0. That is,

$\int dx\, \delta(x)\, f(x) = f(0)$   (81)

This is the defining property of $\delta(x)$. The integration region has to include $x = 0$ but is otherwise
arbitrary, since $\delta(x) = 0$ if $x \neq 0$.
Another useful property of $\delta$-functions is that if we rescale the argument of $\delta(x)$ by a number
$a$, then the $\delta$-function rescales by $\frac{1}{a}$. That is,

$\delta(a x) = \frac{1}{a}\,\delta(x)$   (82)

To check this, we can change variables from $x \to \frac{x}{a}$ in the integral:

$\int dx\, \delta(a x)\, f(x) = \int d\left( \frac{x}{a} \right) \delta(x)\, f\left( \frac{x}{a} \right) = \frac{1}{a} f(0) = \int dx\, \frac{1}{a}\,\delta(x)\, f(x)$   (83)

It's sometimes helpful to think of the $\delta$-function as the limit of a regular function. There are
lots of functions whose limits are $\delta$-functions. For example, Gaussians:

$\delta(x) = \lim_{\sigma \to 0} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{x^2}{2\sigma^2}}$   (84)

As a check, note that the integral over the Gaussian is 1 regardless of $\sigma$, so the $\delta$-function also
integrates to 1. As $\sigma \to 0$, the width of the Gaussian goes to zero, so it has zero support away from
the mean; that is, it vanishes except at $x = 0$, just like the $\delta$-function.
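Mathematica knows these rules, so Eq. (82) can be checked directly (f is a generic, unspecified test function):

Integrate[DiracDelta[a x] f[x], {x, -Infinity, Infinity}, Assumptions -> a > 0]   (* f[0]/a *)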

Appendix B Stirling's approximation


There are many ways to derive Stirling's approximation. Here's a relatively easy one. We start by
taking the logarithm:

$\ln N! = \ln N + \ln(N-1) + \ln(N-2) + \cdots + \ln 1 = \sum_{j=1}^{N} \ln j$   (85)

For large $N$ we then write the sum as an integral:

$\ln N! = \sum_{j=1}^{N} \ln j \approx \int_1^N dj\, \ln j = N\ln N - N + 1 \approx N\ln N - N$   (86)

That's the answer.


One can include more terms in the expansion by using the Euler-Maclaurin formula for the
difference between a sum and an integral. For example, the next term gives

$N! \approx \sqrt{2\pi N}\, N^N e^{-N}$   (87)

An alternative derivation is to use an integral representation of the factorial as a $\Gamma$ function:
$n! = \Gamma(n+1) = \int_0^\infty x^n e^{-x}\, dx$. For example, Mathematica can simply series expand this around
$n = \infty$ to reproduce Eq. (87). Try it!
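For example, a sketch of the suggested check (the exact form of the output may vary between versions):

Series[Gamma[n + 1], {n, Infinity, 1}]   (* the leading behavior matches Sqrt[2 Pi n] n^n E^-n of Eq. (87) *)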

The next-order correction to this is down by $\frac{1}{12N}$, which gets small fast. You can check that
Stirling's approximation is off by less than 8% already at $N = 1$ and by less than 3% by $N = 3$. For
Avogadro's number $N = 6 \times 10^{23}$ it is off by one part in $10^{25}$.

Appendix C Central limit theorem from convolutions


Here's another slick proof of the central limit theorem. We start with the definition

$P_N(x) = \int dx_1 \cdots dx_N\, P(x_1) \cdots P(x_N)\, \delta\left( \frac{x_1 + \cdots + x_N}{N} - x \right)$   (88)

Now we write the $\delta$-function in Fourier space as

$\delta(x_1 + \cdots + x_N - x) = \int \frac{dk}{2\pi}\, e^{i k (x_1 + \cdots + x_N - x)}$   (89)

So that

$P_N(x) = \int \frac{dk}{2\pi} \int dx_1 \cdots dx_N\, P(x_1) \cdots P(x_N)\, e^{i k \left( \frac{x_1 + \cdots + x_N}{N} - x \right)}$   (90)

Defining the Fourier transform of $P$ as

$\tilde{P}(k) = \int dx\, e^{i k x}\, P(x)$   (91)

we then have

$P_N(x) = \int \frac{dk}{2\pi}\, \tilde{P}\left( \frac{k}{N} \right)^{N} e^{-i k x}$   (92)

Eq. (92) is just the statement that Fourier transforms turn convolutions into products. Then,

$\tilde{P}\left( \frac{k}{N} \right) = \int dx\, e^{i\frac{k x}{N}}\, P(x) = \int dx \left[ 1 + i\frac{k x}{N} + \frac{1}{2}\left( i\frac{k x}{N} \right)^2 + \frac{1}{3!}\left( i\frac{k x}{N} \right)^3 + \cdots \right] P(x)$   (93)

$= 1 + i\frac{k}{N}\bar{x} - \frac{k^2}{2}\frac{\langle x^2 \rangle}{N^2} - i\frac{k^3 \langle x^3 \rangle}{6 N^3} + \cdots$   (94)

Now if we didn't do anything else, then as $N \to \infty$ we see immediately that $\tilde{P}\left( \frac{k}{N} \right) \to 1$ and so
$P_N(x) \to \delta(x)$. This is because the whole distribution is shrinking down to be around $x = 0$. We
don't care about this shrinking, but rather what is happening to the shape of the distribution. So,
as with the moment proof in Section 3.1, we first normalize $P_N$ by shifting so that $\bar{x} = 0$. We also
need to rescale the $x$-axis by a factor of $\sqrt{N}$ so that the width stays finite. Thus we can write

$\tilde{P}\left( \frac{k}{N} \right) = 1 - \frac{k^2}{2N}\left( \frac{\sigma^2}{N} \right) - i\frac{k^3}{6 N^{3/2}}\left( \frac{\langle x^3 \rangle}{N^{3/2}} \right) + \cdots$   (95)

where the terms in parentheses are going to be finite as $N \to \infty$ after rescaling $x$ by $\sqrt{N}$. Then

$\tilde{P}\left( \frac{k}{N} \right)^{N} = \left[ 1 - \frac{k^2 \sigma^2}{2N^2} + O\left( \frac{1}{N^{3/2}} \right) \right]^N = e^{-\frac{k^2 \sigma^2}{2N}}$   (96)

where $e^x = \lim_{N \to \infty}\left( 1 + \frac{x}{N} \right)^N$ was used and the $O(\cdots)$ terms give contributions that vanish as $N \to \infty$.
We can then compute the inverse Fourier transform to get

$P_N(x) = \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, e^{-\frac{k^2 \sigma^2}{2N}}\, e^{-i k x} = \sqrt{\frac{N}{2\pi\sigma^2}}\, e^{-\frac{N x^2}{2\sigma^2}}$   (97)

which is the desired result, the central limit theorem.
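As a concrete check of this chain of steps, here is a minimal Mathematica sketch using the flat box distribution of Section 1.1, for which $\bar{x} = 0$ and $\sigma^2 = \frac{L^2}{12}$:

Pt[k_] = Integrate[Exp[I k x]/L, {x, -L/2, L/2}];   (* P~(k) of Eq. (91); this equals 2 Sin[k L/2]/(k L) *)
Series[Pt[k/n], {k, 0, 3}]   (* 1 - k^2 L^2/(24 n^2) + O(k^4), i.e. Eq. (94) with <x^2> = L^2/12 *)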
