Advanced Quantum Mechanics

— Incomplete Notes —

DAVID GROSS
Institute for Theoretical Physics
University of Cologne

JANUARY 2, 2025
Contents

1 Multi-partite quantum systems
  1.1 Mixed states
    1.1.1 Visualizing mixed states: The Bloch ball
    1.1.2 Time evolution of density operators
    1.1.3 Dynamics of a noisy spin
  1.2 Multi-partite Hilbert space
    1.2.1 Tensor product Hilbert spaces
    1.2.2 The partial trace
  1.3 Dynamics of coupled systems
    1.3.1 The measurement and the classicality problem
    1.3.2 A quantum model for measurements
  1.4 Quantum many-body systems as computers
    1.4.1 Grover’s algorithm
  1.5 Bell inequalities and their implications
    1.5.1 The CHSH scenario
    1.5.2 Operational consequences of Bell inequality violations
    1.5.3 Interpretations
  1.6 Further reading

2 Indistinguishable particles
  2.1 Bosonic and Fermionic Hilbert spaces
    2.1.1 Permutations and occupation numbers
    2.1.2 Single-particle operators
    2.1.3 The exchange interaction
  2.2 Second quantization
    2.2.1 Fock space
    2.2.2 Creation and annihilation operators
    2.2.3 Single- and two-particle operators
  2.3 Quasiparticles and collective excitations
    2.3.1 Phonons
    2.3.2 Global phase gauge symmetry and particle number conservation
  2.4 Bose gas: Take 1
    2.4.1 Approximate solution part 1
  2.5 Detour: Spontaneous symmetry breaking
    2.5.1 Ferromagnetism
    2.5.2 SSB and Bose-Einstein condensation
  2.6 Bose gas: Take 2
  2.7 Further reading

3 Field quantization and quantum theory of light
  3.1 Phonon continuum limit
  3.2 Quantization of the EM field
  3.3 States of the EM field
    3.3.1 Number states
    3.3.2 Coherent states
  3.4 Light-matter interaction
    3.4.1 Spontaneous emission
  3.5 Further reading

4 Scattering theory

5 Symmetries in quantum mechanics

6 Relativistic QM

A Quantum mechanics recap
  A.1 Linear algebra of Hilbert spaces
    A.1.1 Hilbert spaces
    A.1.2 Linear operators
    A.1.3 Dirac notation
    A.1.4 Bases
    A.1.5 The adjoint
    A.1.6 Spectral decomposition (discrete case)
    A.1.7 Spectral decomposition (continuous case)
    A.1.8 More on delta distributions
    A.1.9 More on Fourier transforms
    A.1.10 Functions of operators
    A.1.11 Unitary operators
    A.1.12 Projections
    A.1.13 The trace
    A.1.14 Commuting operators
  A.2 Some concrete systems
    A.2.1 A single harmonic oscillator
    A.2.2 Normal modes
    A.2.3 Central potentials
    A.2.4 Fermionic oscillator
  A.3 Perturbation theory
    A.3.1 Fermi’s golden rule

B Miscellaneous Integrals
  B.1 Gaussian and Fresnel integrals
  B.2 Some Fourier transforms

C Function spaces and distributions
  C.1 Square-integrable functions
    C.1.1 Why go beyond L²?
  C.2 Distributions
    C.2.1 Schwartz space
    C.2.2 Tempered distributions
    C.2.3 Operations on distributions
  C.3 Topological aspects, more pedantry, and generalizations

D Green’s functions
  D.1 Introduction
  D.2 Green’s functions from Fourier transforms
    D.2.1 Direct integration
    D.2.2 Complex integration
    D.2.3 Using the principal value
  D.3 Resolvents
    D.3.1 Resolvents and the spectrum
    D.3.2 The resolvent of the Laplacian
  D.4 Propagators
  D.5 Epilogue

This symbol indicates that you may skip forward without missing much.

Warm-up
To warm up, let’s recall the most basic notations from undergraduate quantum mechanics.
For more details, consult Sec. A of the Appendix.

• With every quantum system, one associates a Hilbert space H.


• Each state corresponds to a normalized vector |ψ⟩ ∈ H.
• Observable quantities are associated with Hermitian operators on H. If A = A† is
Hermitian, it has an eigendecomposition

$$A = \sum_i \lambda_i\, |\phi_i\rangle\langle\phi_i|,$$

where the $\{|\phi_i\rangle\}_i$ form an orthonormal basis of H and the $\lambda_i \in \mathbb{R}$ are the
eigenvalues of A. The possible numerical outcomes of a measurement are then the $\lambda_i$,
the i-th one occurring with probability

$$\Pr_\psi[i] = |\langle\phi_i|\psi\rangle|^2 = \operatorname{tr}\big(|\psi\rangle\langle\psi|\,|\phi_i\rangle\langle\phi_i|\big)$$

if the system is in a state described by |ψ⟩. If one repeats the measurement many
times, the average of the observed outcomes will tend to the expectation value

$$\langle A\rangle_\psi = \sum_i \lambda_i \Pr_\psi[i] = \sum_i \lambda_i |\langle\phi_i|\psi\rangle|^2 = \langle\psi|A|\psi\rangle = \operatorname{tr}\big(|\psi\rangle\langle\psi|\,A\big).$$

• Also, every system is associated with a distinguished Hermitian operator, the
Hamiltonian H. It serves two roles:
– It is the observable describing energy measurements.
– It determines the time evolution of the system via Schrödinger’s equation

$$i\hbar\,\partial_t |\psi(t)\rangle = H|\psi(t)\rangle.$$

• We usually choose a preferred basis for every Hilbert space H, ideally with a clear
physical interpretation. If the Hamiltonian is non-degenerate, the eigenbasis of H is
a natural choice. In this case, saying that the system is in an eigenstate with given
energy completely specifies the state vector.
– Example: The harmonic oscillator, with $|n\rangle$ defined by

$$H|n\rangle = \hbar\omega\left(n + \tfrac{1}{2}\right)|n\rangle.$$

If the Hamiltonian is degenerate, it is natural to add further observables commuting
with H until their common eigenbasis is unique.
– Example: The bound states of the hydrogen atom, for which $|n, l, m\rangle$ is defined
by

$$H|n,l,m\rangle = E_n\,|n,l,m\rangle, \qquad L^2|n,l,m\rangle = \hbar^2\, l(l+1)\,|n,l,m\rangle, \qquad L_z|n,l,m\rangle = \hbar m\,|n,l,m\rangle.$$
Chapter 1

Multi-partite quantum systems

1.1 Mixed states


Goals
In undergraduate QM, “state” means “Hilbert space vector”. To describe noisy
systems (not terribly deep, but practically important) or entangled systems (much
deeper and increasingly important!) one needs to widen the concept of “state” to
include mixed states represented by density operators.

Imagine a process that prepares the state $|\psi_j\rangle$ with probability $q_j$. The probabilities
could e.g. reflect fluctuations of control fields, see below. The collection of states $|\psi_j\rangle$ and
probabilities $q_j$ is called an ensemble. We do not require that the states $|\psi_j\rangle$ be orthogonal
to each other.
If we measure an observable A on this ensemble, the expected value will be

$$\sum_j q_j \operatorname{tr}\big(|\psi_j\rangle\langle\psi_j|\,A\big) = \operatorname{tr}\Big(\Big(\sum_j q_j\, |\psi_j\rangle\langle\psi_j|\Big) A\Big).$$

Thus, the statistics of the experiment are described by replacing the projection |ψ⟩⟨ψ| with
the more general density operator

$$\rho := \sum_j q_j\, |\psi_j\rangle\langle\psi_j| \tag{1.1}$$

so that ⟨A⟩ = tr(ρA). The density operator ρ has the following properties:
1. It is Hermitian, ρ† = ρ.
2. Its eigenvalues form a probability distribution (which is equal to the $q_j$ if and only
if the states $|\psi_j\rangle$ are orthogonal).
Conversely, every operator with these two properties can be realized by an ensemble as in
(1.1).

Here’s the proof. Equation (1.1) implies the normalization property

$$\operatorname{tr}\rho = \sum_j q_j \operatorname{tr}|\psi_j\rangle\langle\psi_j| = \sum_j q_j = 1 \tag{1.2}$$

and the positivity property

$$\langle\phi|\rho|\phi\rangle = \sum_j q_j\, \langle\phi|\psi_j\rangle\langle\psi_j|\phi\rangle = \sum_j q_j\, |\langle\phi|\psi_j\rangle|^2 \ge 0. \tag{1.3}$$

The density operator ρ is Hermitian because every summand in (1.1) is. It thus has
an eigendecomposition $\rho = \sum_j \lambda_j |\phi_j\rangle\langle\phi_j|$, and the above implies

$$\sum_j \lambda_j = \operatorname{tr}\rho = 1, \qquad \lambda_j = \langle\phi_j|\rho|\phi_j\rangle \ge 0,$$

which shows the first claim. Conversely, if $\rho = \sum_j p_j |\phi_j\rangle\langle\phi_j|$ with $p_j$ a probability
distribution, then the eigendecomposition already forms an ensemble realization.
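
To make the two properties concrete, here is a minimal Python sketch that builds ρ from a small ensemble of non-orthogonal qubit states and checks normalization and positivity. The ensemble (|0⟩ and |+⟩, each with probability 1/2) and all function names are illustrative assumptions, not taken from the notes.

```python
import math

def outer(v, w):
    """Return the matrix |v><w| for two complex vectors given as lists."""
    return [[vi * wj.conjugate() for wj in w] for vi in v]

def density_from_ensemble(probs, states):
    """Build rho = sum_j q_j |psi_j><psi_j| from probabilities and normalized states."""
    d = len(states[0])
    rho = [[0j] * d for _ in range(d)]
    for q, psi in zip(probs, states):
        proj = outer(psi, psi)
        for r in range(d):
            for c in range(d):
                rho[r][c] += q * proj[r][c]
    return rho

def eigenvalues_2x2(rho):
    """Eigenvalues of a Hermitian 2x2 matrix, computed from trace and determinant."""
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return tr / 2 - disc, tr / 2 + disc

# Hypothetical ensemble: |0> and |+>, each prepared with probability 1/2.
s = 1 / math.sqrt(2)
probs = [0.5, 0.5]
states = [[1 + 0j, 0j], [s + 0j, s + 0j]]

rho = density_from_ensemble(probs, states)
lam_min, lam_max = eigenvalues_2x2(rho)
```

Because |0⟩ and |+⟩ are not orthogonal, the eigenvalues of this ρ form a probability distribution but differ from the preparation probabilities (1/2, 1/2), in line with property 2.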

If ρ is a density operator with only one non-zero eigenvalue, then ρ = |ψ⟩⟨ψ|. In this
case, we say that ρ describes a pure state. Otherwise, the state is mixed.

Example: Canonical ensemble. Consider a classical system where the i-th microstate
has energy $E_i$. Then, in the Gibbs ensemble, we expect to find the i-th state with probability

$$p_i = \frac{1}{Z}\, e^{-E_i/(kT)}, \qquad Z = \sum_i e^{-E_i/(kT)}.$$

Here, k is the Boltzmann constant, T the temperature, and Z the partition function. Now
let $H = \sum_i E_i |E_i\rangle\langle E_i|$ be a quantum-mechanical Hamiltonian. The quantum Gibbs
ensemble is, by definition, the one described by the density operator

$$\rho = \sum_i p_i\, |E_i\rangle\langle E_i| = \frac{1}{Z}\, e^{-H/(kT)}, \qquad Z = \operatorname{tr} e^{-H/(kT)}.$$

Thus, ρ is the operator that is diagonal in the eigenbasis of the Hamiltonian and has the
classical canonical probabilities as eigenvalues. Convince yourself: ρ is pure if and only if
T = 0 and there is a unique ground state.
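
The "convince yourself" above can be explored numerically. The following sketch computes the canonical probabilities for a hypothetical spectrum and uses the quantity tr ρ² (which equals 1 exactly for pure states) as a purity check. The spectra, temperatures, and function names are assumptions chosen for illustration.

```python
import math

def gibbs_probabilities(energies, kT):
    """Canonical probabilities p_i = exp(-E_i/kT) / Z for a list of energy levels."""
    e_min = min(energies)  # shifting all energies leaves p_i unchanged and avoids overflow
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def purity(probs):
    """tr(rho^2) for a state that is diagonal with eigenvalues probs."""
    return sum(p * p for p in probs)

# Hypothetical spectra in arbitrary energy units.
unique_ground = [0.0, 1.0]
degenerate_ground = [0.0, 0.0, 1.0]

cold = purity(gibbs_probabilities(unique_ground, kT=0.01))
hot = purity(gibbs_probabilities(unique_ground, kT=1000.0))
cold_degenerate = purity(gibbs_probabilities(degenerate_ground, kT=0.01))
```

At low temperature the state with a unique ground level becomes pure (`cold` is close to 1), while a degenerate ground level leaves the state mixed even as T → 0 (`cold_degenerate` is close to 1/2).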

von Neumann entropy. Density matrices allow us to define a quantum-mechanical notion
of entropy. Indeed, recall that with a classical probability distribution p, one associates
the Shannon entropy

$$S(p) = -\sum_i p_i \log p_i, \qquad \text{with the convention } 0 \log 0 = 0.$$

If ρ is a density operator, then the von Neumann entropy S(ρ) is defined as the Shannon
entropy of its eigenvalues. In addition to its central role in statistical physics, von Neumann
entropy can also be used to quantify entanglement, as we will see later.
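
As a small illustration, the definition translates directly into code once the eigenvalues of ρ are known. The sketch below assumes logarithms to base 2 (so entropies come out in bits); the notes leave the base unspecified, and the function name is hypothetical.

```python
import math

def shannon_entropy(probs, base=2.0):
    """Shannon entropy of a probability vector, with the convention 0 log 0 = 0."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# The von Neumann entropy of rho is the Shannon entropy of its eigenvalues.
entropy_pure = shannon_entropy([1.0, 0.0])          # a pure qubit state
entropy_half = shannon_entropy([0.5, 0.5])          # eigenvalues (1/2, 1/2)
entropy_four = shannon_entropy([0.25] * 4)          # eigenvalues (1/4, ..., 1/4)
```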

1.1.1 Visualizing mixed states: The Bloch ball


For spin-1/2 degrees of freedom, one can easily visualize the set of density operators.
Indeed, any 2 × 2 matrix is of the form

$$A = \frac{1}{2}\begin{pmatrix} a_0 + a_3 & a_1 - i a_2 \\ a_1 + i a_2 & a_0 - a_3 \end{pmatrix} = \frac{1}{2}\sum_{i=0}^{3} a_i \sigma_i, \qquad a_i = \operatorname{tr} \sigma_i A \tag{1.4}$$

(this is just saying that the Pauli matrices, together with $\sigma_0 = \mathbb{1}$, form a basis of the
linear space of 2 × 2 matrices). One directly sees that the matrix is Hermitian iff the $a_i$ are
real and has trace equal to one iff $a_0 = 1$. Thus density operators are of the form

$$\rho = \frac{1}{2}\left(\mathbb{1} + a \cdot \sigma\right), \tag{1.5}$$

where $a \in \mathbb{R}^3$ is the Bloch vector. The eigenvalues of ρ are non-negative iff the Bloch
vector lies in the unit ball of $\mathbb{R}^3$; it lies on the unit sphere exactly if ρ is pure.

Figure 1.1: Up to a factor of $\hbar/2$, the i-th component $a_i = \operatorname{tr} \rho\sigma_i$ of the Bloch vector
is the expectation value of the angular momentum along the $e_i$-axis. The length of the
Bloch vector encodes the “purity” of the state. Take an ensemble decomposition
$\rho = \sum_j q_j |\psi_j\rangle\langle\psi_j|$ of a density operator ρ. If $a^{(j)}$ is the Bloch vector of the j-th state, then
the Bloch representation of ρ is the convex combination $a = \sum_j q_j a^{(j)}$.

To see this, use (1.4) to compute $\det\rho = \tfrac{1}{4}(1 - \|a\|^2)$. Because tr ρ = 1, the
eigenvalues are of the form λ, (1 − λ). The determinant is the product of the eigenvalues,
so that

$$\lambda(1-\lambda) = \frac{1}{4}\left(1 - \|a\|^2\right) \quad\Leftrightarrow\quad \lambda = \frac{1}{2}\left(1 \pm \|a\|\right).$$
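
The correspondence between Bloch vectors and density matrices is easy to check by hand or by machine. The sketch below converts in both directions and compares the determinant with the eigenvalue formula just derived. The sample Bloch vector and the function names are illustrative assumptions.

```python
import math

def bloch_to_rho(a):
    """rho = (1 + a . sigma) / 2 for a Bloch vector a = (ax, ay, az)."""
    ax, ay, az = a
    return [[(1 + az) / 2, (ax - 1j * ay) / 2],
            [(ax + 1j * ay) / 2, (1 - az) / 2]]

def rho_to_bloch(rho):
    """Recover a_i = tr(rho sigma_i) from the matrix elements of a Hermitian rho."""
    ax = 2 * rho[0][1].real
    ay = -2 * rho[0][1].imag
    az = (rho[0][0] - rho[1][1]).real
    return (ax, ay, az)

def bloch_eigenvalues(a):
    """Eigenvalues (1 -/+ |a|) / 2 of the density matrix with Bloch vector a."""
    norm = math.sqrt(sum(x * x for x in a))
    return (1 - norm) / 2, (1 + norm) / 2

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

a = (0.3, -0.4, 0.5)            # hypothetical Bloch vector inside the unit ball
rho = bloch_to_rho(a)
lam_minus, lam_plus = bloch_eigenvalues(a)
determinant = det2(rho).real
recovered = rho_to_bloch(rho)
```

For this vector, ‖a‖² = 0.5, so the determinant and the product of the two eigenvalues should both equal (1 − 0.5)/4 = 0.125.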

The maximally mixed state. The center point of the Bloch ball seems special. From
(1.5), it corresponds to $\rho = \tfrac{1}{2}\mathbb{1}$. For a d-dimensional Hilbert space, $\rho = \tfrac{1}{d}\mathbb{1}$ is called the
maximally mixed state. It has eigenvalues (1/d, . . . , 1/d) and thus entropy log d, which
is the highest one can get in d dimensions. In statistical physics language, the maximally
mixed state is thus the Gibbs state for T → ∞.

Non-uniqueness of ensemble decompositions. From Fig. 1.1, it is geometrically obvious
that there are many different ensembles that realize any given mixed state. In particular,
the maximally mixed state in d dimensions can be expressed as

$$\frac{1}{d}\,\mathbb{1} = \frac{1}{d}\sum_{j=1}^{d} |\psi_j\rangle\langle\psi_j|$$

for any ONB $\{|\psi_j\rangle\}_j$. (This is just the completeness relation for the basis.) What seems
like a geometric curiosity at this point is in fact fundamental for a number of uniquely
quantum phenomena, in particular for quantum steering. We’ll come back to this point
later.
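
In the Bloch picture, the non-uniqueness is just the statement that a point in the ball can be written as a convex combination of points on the sphere in many ways. A minimal sketch, with three hypothetical ensembles that all average to the center of the ball:

```python
import math

def mix_bloch(probs, vectors):
    """Bloch vector of the ensemble: the convex combination sum_j q_j a^(j)."""
    return tuple(sum(q * v[i] for q, v in zip(probs, vectors)) for i in range(3))

# Ensemble 1: spin up / down along z, each with probability 1/2.
center_z = mix_bloch([0.5, 0.5], [(0, 0, 1), (0, 0, -1)])

# Ensemble 2: spin up / down along x, each with probability 1/2.
center_x = mix_bloch([0.5, 0.5], [(1, 0, 0), (-1, 0, 0)])

# Ensemble 3: three pure states at 120 degrees in the x-z plane, probability 1/3 each.
# These states are not mutually orthogonal.
trine = [(math.cos(2 * math.pi * k / 3), 0.0, math.sin(2 * math.pi * k / 3))
         for k in range(3)]
center_trine = mix_bloch([1 / 3] * 3, trine)
```

All three results are the zero vector (up to rounding), i.e. the maximally mixed state, even though the preparation procedures differ.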

1.1.2 Time evolution of density operators


The noise-free dynamics of pure states is described by the Schrödinger equation. What is
the analogue for density matrices?
Applying the formal solution $|\psi(t)\rangle = e^{\frac{t}{i\hbar}H}|\psi(0)\rangle$ of the Schrödinger equation to an
ensemble, we get

$$\rho(t) = \sum_i q_i\, |\psi_i(t)\rangle\langle\psi_i(t)| = \sum_i q_i\, e^{\frac{t}{i\hbar}H}|\psi_i(0)\rangle\langle\psi_i(0)|\,e^{-\frac{t}{i\hbar}H} = e^{\frac{t}{i\hbar}H}\rho(0)\,e^{-\frac{t}{i\hbar}H}.$$

Differentiating with respect to t:

$$\partial_t \rho = \partial_t\left(e^{\frac{t}{i\hbar}H}\rho(0)\,e^{-\frac{t}{i\hbar}H}\right) = \frac{1}{i\hbar}\left(H\rho - \rho H\right),$$

which gives the quantum Liouville equation:

$$i\hbar\,\partial_t \rho = [H, \rho]. \tag{1.6}$$

It is called so because it is the quantum analogue of the classical Liouville equation
$\partial_t \rho = \{H, \rho\}$, which governs the time evolution of a probability density ρ on phase space. Up to
a sign, the quantum Liouville equation is the same as the Heisenberg picture time evolution
for observables (why?):

$$A(t) = e^{-\frac{t}{i\hbar}H} A\, e^{\frac{t}{i\hbar}H}, \qquad i\hbar\,\partial_t A = [A, H].$$

1.1.3 Dynamics of a noisy spin


Larmor precession
Recall the noise-free time evolution of a spin in a magnetic field. Plugging the Hamiltonian

$$H = -\frac{\gamma\hbar}{2}\, B \cdot \sigma$$

and the Bloch ball description of the state $\rho = \tfrac{1}{2}(\mathbb{1} + a \cdot \sigma)$ into the Liouville equation
gives

$$i\hbar\,\frac{1}{2}\,\partial_t a(t) \cdot \sigma = [H, \rho] = -\frac{\gamma\hbar}{4}\sum_{i,j=1}^{3} [B_i\sigma_i, a_j\sigma_j] = -i\,\frac{\gamma\hbar}{2}\sum_{ijk} \epsilon_{ijk} B_i a_j \sigma_k,$$

or $\partial_t a = \gamma\, a \times B$. In particular, for $B = B e_z$, with ω := γB the Larmor frequency,

$$a(t) = \begin{pmatrix} a_x\cos(\omega t) + a_y\sin(\omega t) \\ -a_x\sin(\omega t) + a_y\cos(\omega t) \\ a_z \end{pmatrix} \quad\Rightarrow\quad \rho(t) = \frac{1}{2}\begin{pmatrix} 1 + a_z & (a_x - i a_y)\,e^{i\omega t} \\ (a_x + i a_y)\,e^{-i\omega t} & 1 - a_z \end{pmatrix}.$$

Thus, the main diagonal of the density matrix (corresponding to the spin component parallel
to the field) remains constant, while the off-diagonal (corresponding to the spin components
orthogonal to the field) picks up a complex phase factor oscillating with the Larmor
frequency.
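
The closed-form solution can be checked against the equation of motion ∂t a = γ a × B by a finite-difference derivative. The following is a minimal sketch; the values of γ, B, the initial Bloch vector, and the time are arbitrary assumptions in arbitrary units.

```python
import cmath
import math

def precess(a0, omega, t):
    """Bloch vector at time t for a field along e_z (closed-form solution from the text)."""
    ax, ay, az = a0
    c, s = math.cos(omega * t), math.sin(omega * t)
    return (ax * c + ay * s, -ax * s + ay * c, az)

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

gamma, b_field = 2.0, 1.5        # hypothetical values
omega = gamma * b_field          # Larmor frequency
a0 = (0.6, -0.3, 0.5)
t, dt = 0.7, 1e-6

a_t = precess(a0, omega, t)

# Left-hand side: central finite difference for d a / d t.
lhs = tuple((p - m) / (2 * dt)
            for p, m in zip(precess(a0, omega, t + dt), precess(a0, omega, t - dt)))
# Right-hand side: gamma * (a x B) with B along e_z.
rhs = tuple(gamma * x for x in cross(a_t, (0.0, 0.0, b_field)))
max_err = max(abs(l - r) for l, r in zip(lhs, rhs))

# The upper-right matrix element (times 2) picks up the phase exp(i omega t).
off_diag_from_vector = a_t[0] - 1j * a_t[1]
off_diag_from_phase = (a0[0] - 1j * a0[1]) * cmath.exp(1j * omega * t)
```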

Dephasing of a spin
So far, we have just re-packaged undergrad calculations in new language. Let’s go further,
by treating a noisy time evolution of a spin-1/2 system in a magnetic field.
Assume that during the time period t ∈ [0, T], the field strength is not B, but B + ∆B.
Then the Larmor frequency changes accordingly, so that the phase factor picked up by the
upper-right term of the density matrix during the time interval changes as

$$e^{iT\omega} \mapsto e^{iT\omega} e^{i\phi}, \qquad \phi = \gamma T \Delta B.$$

We say the system experiences a phase kick by $e^{i\phi}$.

Now imagine the changes ∆B and thus the phases ϕ fluctuate probabilistically. For
concreteness assume that the ϕ follow a Gaussian distribution

$$p(\phi) = \frac{1}{\sqrt{4\pi\Lambda}}\, e^{-\phi^2/(4\Lambda)}$$

with mean 0 and variance 2Λ for some Λ > 0. Then the expected value of the phase factor
is

$$\mathbb{E}[e^{i\phi}] = \frac{1}{\sqrt{4\pi\Lambda}} \int_{-\infty}^{\infty} e^{i\phi}\, e^{-\phi^2/(4\Lambda)}\, d\phi = e^{-\Lambda}$$

(cf. Eq. (B.1)). We see that random phase kicks cause the off-diagonal terms of the density
operator to attenuate:

$$\rho(T) = \frac{1}{2}\begin{pmatrix} 1 + a_z & (a_x - i a_y)\,e^{i\omega T} e^{-\Lambda} \\ (a_x + i a_y)\,e^{-i\omega T} e^{-\Lambda} & 1 - a_z \end{pmatrix}.$$

Now assume that during the following time periods t ∈ [(n − 1)T, nT], the system experiences
independent phase kicks. Taking expectations again gives rise to additional factors
of $e^{-\Lambda}$, so that the upper-right matrix element at time nT is proportional to $(a_x - i a_y)\,e^{i\omega nT} e^{-n\Lambda}$.
Under reasonable independence assumptions on the distribution of the fluctuations, it is
then justified to interpolate to arbitrary times. Since $e^{-n\Lambda} = e^{-(\Lambda/T)\,nT}$, setting λ = Λ/T, the
system evolves according to

$$\rho(t) = \frac{1}{2}\begin{pmatrix} 1 + a_z & (a_x - i a_y)\,e^{i\omega t} e^{-\lambda t} \\ (a_x + i a_y)\,e^{-i\omega t} e^{-\lambda t} & 1 - a_z \end{pmatrix}$$

(see Fig. 1.2).
We have seen that unavoidable fluctuations in the control fields cause the off-diagonal
elements of the density matrix to tend to zero exponentially fast. The characteristic time
scale 1/λ is called the $T_2$ relaxation time. (As the name suggests, there’s also a $T_1$ time,
which is the time scale during which the diagonal elements of ρ tend to their thermal
equilibrium values.) The limiting density matrix

$$\rho(t \to \infty) = \frac{1}{2}\begin{pmatrix} 1 + a_z & 0 \\ 0 & 1 - a_z \end{pmatrix} \tag{1.7}$$

is a probabilistic mixture of energy eigenstates. The lack of superposition terms means
that (1.7) can be interpreted as a “classical state”.
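
The Gaussian average behind the attenuation factor can be verified with a simple numerical quadrature. The sketch below evaluates E[e^{iϕ}] on a finite grid and compares it with e^{−Λ}; the cut-off at 12 standard deviations, the grid size, and the sample value of Λ are assumptions made for illustration.

```python
import cmath
import math

def mean_phase_factor(lam, n=20001):
    """Numerically average exp(i phi) over a Gaussian with mean 0 and variance 2*lam."""
    sigma = math.sqrt(2 * lam)
    half_width = 12 * sigma                  # the tails beyond this are negligible
    step = 2 * half_width / (n - 1)
    total = 0j
    for k in range(n):
        phi = -half_width + k * step
        total += cmath.exp(1j * phi) * math.exp(-phi * phi / (4 * lam))
    return total * step / math.sqrt(4 * math.pi * lam)

lam = 0.5                                    # hypothetical kick strength
numeric = mean_phase_factor(lam)
exact = math.exp(-lam)

# After n independent kicks the off-diagonal element is damped by exp(-n * lam).
damping_after_10 = exact ** 10
```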

Quantum computers rely on interference effects. Therefore, a system can serve as
a qubit only if its $T_2$ time is long enough that the computation can conclude before
phase coherence is lost (else, costly quantum error correction procedures become
necessary).

Figure 1.2: Left panel: Decoherence time measurement on a spin qubit operated at the
Research Center Jülich and RWTH Aachen with support of the project Matter and Light
for Quantum Computing. From [Struck et al., Low-frequency spin qubit energy splitting
noise in highly purified ²⁸Si/SiGe, npj Quantum Information (2020)]. Right panel: The
trajectory of a dephasing spin in the Bloch ball.

It is instructive to work out how the Liouville equation has to be modified to take
dephasing into account. Noting that one can write the projection of the density
matrix onto its off-diagonal as

$$\frac{1}{2}\left(\rho - \sigma_z \rho\, \sigma_z^\dagger\right),$$

it is easy to verify that ρ(t) satisfies the differential equation

$$\partial_t \rho(t) = -\frac{i}{\hbar}[H, \rho] + \frac{\lambda}{2}\left(\sigma_z \rho\, \sigma_z^\dagger - \rho\right).$$

Such differential equations that describe the time evolution of noisy quantum systems
are called quantum master equations.
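
The "easy to verify" step can also be done numerically: differentiate the dephasing solution ρ(t) by finite differences and compare with the right-hand side of the master equation. This is a sketch under stated assumptions: units with ℏ = 1, the Hamiltonian H = −(ℏω/2)σz for a field along e_z, and arbitrary sample values for ω, λ, and the Bloch vector.

```python
import cmath

SIGMA_Z = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lin(a, b, alpha, beta):
    """Linear combination alpha * a + beta * b of two 2x2 matrices."""
    return [[alpha * a[i][j] + beta * b[i][j] for j in range(2)] for i in range(2)]

def rho_t(a0, omega, lam, t):
    """Dephasing solution: off-diagonals rotate with omega and decay with rate lam."""
    ax, ay, az = a0
    off = (ax - 1j * ay) * cmath.exp((1j * omega - lam) * t)
    return [[(1 + az) / 2, off / 2], [off.conjugate() / 2, (1 - az) / 2]]

def master_rhs(rho, omega, lam, hbar=1.0):
    """Right-hand side -(i/hbar)[H, rho] + (lam/2)(sz rho sz - rho), H = -(hbar omega/2) sz."""
    ham = [[-hbar * omega / 2 * s for s in row] for row in SIGMA_Z]
    commutator = lin(matmul(ham, rho), matmul(rho, ham), 1, -1)
    dephasing = lin(matmul(matmul(SIGMA_Z, rho), SIGMA_Z), rho, 1, -1)
    return lin(commutator, dephasing, -1j / hbar, lam / 2)

a0, omega, lam = (0.6, -0.3, 0.5), 3.0, 0.4   # hypothetical parameters
t, dt = 0.7, 1e-6

derivative = lin(rho_t(a0, omega, lam, t + dt), rho_t(a0, omega, lam, t - dt),
                 1 / (2 * dt), -1 / (2 * dt))
rhs = master_rhs(rho_t(a0, omega, lam, t), omega, lam)
max_err = max(abs(derivative[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```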

Summary

• General states are described by density operators, Hermitian operators
whose eigenvalues form a probability distribution.
• Reversible time evolution of density operators is determined by the quantum
Liouville equation $i\hbar\,\partial_t \rho = [H, \rho]$.
• In d = 2, density operators can be described using their Bloch vector a as
$\rho = \tfrac{1}{2}(\mathbb{1} + a \cdot \sigma)$.
• Phase noise attenuates off-diagonal coefficients of density operators.

1.2 Multi-partite Hilbert space

Goals
We will introduce tensor product Hilbert spaces, and argue why this is the right
space for multiple distinguishable particles. We’ll have to spend a lot of time on
notation (boring, but necessary) and have a first look at entanglement.

1.2.1 Tensor product Hilbert spaces


Two particles are distinguishable if one can construct measurement devices that are sensitive
to one of the particles, but are not influenced by the other. (In contrast, try to build a
detector that will be triggered only by one specific electron!)
More precisely, let $\mathcal{H}_1, \mathcal{H}_2$ be the Hilbert spaces of two particles. We say they are
distinguishable if:

For any state $|\alpha\rangle \in \mathcal{H}_1$ and observable $A = \sum_i a_i |e_i\rangle\langle e_i|$ on $\mathcal{H}_1$,
and any state $|\beta\rangle \in \mathcal{H}_2$ and observable $B = \sum_j b_j |f_j\rangle\langle f_j|$ on $\mathcal{H}_2$,

it makes physical sense to prepare the first particle in the state |α⟩, the second one in the
state |β⟩, and perform the measurements A and B. We also demand that in this case, the
outcome probabilities are independent:

$$\Pr[a_i \text{ and } b_j] = |\langle\alpha|e_i\rangle|^2\, |\langle\beta|f_j\rangle|^2. \tag{1.8}$$

We now construct the Hilbert space $\mathcal{H}_{12}$ associated with the combined system. The
above implies that $\mathcal{H}_{12}$ contains vectors associated with the outcomes $a_i, b_j$. Let’s call
them $|e_i, f_j\rangle$. Because they correspond to different outcomes of an observable, they have to
be orthogonal. The Hilbert space must also contain a vector associated with the preparation
procedure, let’s call it |α, β⟩. The independence condition (1.8) is fulfilled if, for

$$|\alpha\rangle = \sum_i \alpha_i |e_i\rangle, \qquad |\beta\rangle = \sum_j \beta_j |f_j\rangle,$$

we define

$$|\alpha, \beta\rangle = \sum_{ij} \alpha_i \beta_j\, |e_i, f_j\rangle. \tag{1.9}$$

(One can show that this is essentially the only way to satisfy independence.) The resulting
Hilbert space

$$\mathcal{H}_{12} = \Big\{ \sum_{ij} \psi_{ij}\, |e_i, f_j\rangle \;\Big|\; \psi_{ij} \in \mathbb{C} \Big\},$$

together with the rule (1.9), is called the tensor product space $\mathcal{H}_1 \otimes \mathcal{H}_2$.
States that describe independent preparations of the particles, i.e. those of the form
given in (1.9), are called product states. Alternative notations:

$$|\alpha, \beta\rangle = |\alpha\beta\rangle = |\alpha\rangle|\beta\rangle = |\alpha\rangle \otimes |\beta\rangle,$$



and, if the bases referenced are (hopefully) clear from context:

$$|e_i, f_j\rangle = |i, j\rangle.$$

For general elements $|\psi\rangle = \sum_{ij} \psi_{ij} |ij\rangle \in \mathcal{H}_{12}$, the coefficients $\psi_{ij}$ need not factorize as
in (1.9). Such states are called entangled, and we’ll have more to say about them.
The observables A, B associated with the individual particles act on product vectors in
the obvious way:

$$A|\alpha, \beta\rangle = (A|\alpha\rangle)|\beta\rangle, \qquad B|\alpha, \beta\rangle = |\alpha\rangle(B|\beta\rangle).$$

This defines A, B on all of $\mathcal{H}_{12}$, because the product vectors $|e_i, f_j\rangle$ form a basis.
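
Rule (1.9) and the independence condition (1.8) can be illustrated with a few lines of Python. The sketch stores the coefficients of a two-particle state as a flat list, with $|e_i, f_j\rangle$ at position i·dim(H₂) + j (an ordering chosen here for convenience). The factorization test uses the fact that, for two 2-dimensional systems, the coefficients $\psi_{ij}$ factorize exactly when $\psi_{00}\psi_{11} - \psi_{01}\psi_{10} = 0$. The sample states and function names are assumptions.

```python
import math

def kron_vec(alpha, beta):
    """Coefficients alpha_i * beta_j of |alpha, beta>, ordered as index i * len(beta) + j."""
    return [a * b for a in alpha for b in beta]

def is_product_two_qubits(psi, tol=1e-12):
    """For two 2-dimensional systems: psi_ij factorizes iff psi_00 psi_11 - psi_01 psi_10 = 0."""
    return abs(psi[0] * psi[3] - psi[1] * psi[2]) < tol

s = 1 / math.sqrt(2)
alpha = [0.6, 0.8j]            # hypothetical normalized state of particle 1
beta = [s, -s]                 # hypothetical normalized state of particle 2

joint = kron_vec(alpha, beta)

# Joint outcome probabilities in the product basis versus the product of the marginals.
joint_probs = [abs(c) ** 2 for c in joint]
independent_probs = [abs(a) ** 2 * abs(b) ** 2 for a in alpha for b in beta]

# The singlet (|01> - |10>) / sqrt(2) does not factorize.
singlet = [0.0, s, -s, 0.0]
```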

Notation and conventions. If not clear from context, the system on which an operator
acts is explicitly specified:

$$C^{(1)}|\alpha, \beta\rangle = (C|\alpha\rangle)|\beta\rangle, \qquad C^{(2)}|\alpha, \beta\rangle = |\alpha\rangle(C|\beta\rangle).$$

There’s also the “tensor product of operators” notation (sometimes called the Kronecker
product, in particular in computer algebra systems):

$$C^{(1)} D^{(2)} = C \otimes D, \qquad C^{(1)} = C \otimes \mathbb{1}, \qquad C^{(2)} = \mathbb{1} \otimes C.$$

This implies that “tensor products of outer products” equal “outer products of tensor
products” (yeah, I know... you’ll get used to it):

$$|\alpha\rangle\langle\gamma| \otimes |\beta\rangle\langle\delta| = |\alpha\beta\rangle\langle\gamma\delta|. \tag{1.10}$$

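Example: The singlet state. In the theory of the addition of angular momentum (every
student’s favorite topic!), one comes across the singlet state

$$|\Psi^-\rangle = \frac{1}{\sqrt{2}}\left(|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle\right)$$

in $\mathcal{H}_1 \otimes \mathcal{H}_2$, where the $\mathcal{H}_i$ are two-dimensional with basis $\{|{\uparrow}\rangle, |{\downarrow}\rangle\}$.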

1.2.2 The partial trace


Let’s recall the classical notion of a marginal distribution. The statistics of a pair $X_1, X_2$
of random variables is described by their joint distribution $p^{(2)}$:

$$p^{(2)}(x_1, x_2) = \Pr[X_1 = x_1 \text{ and } X_2 = x_2].$$

If one has access only to the first variable, one can obtain its distribution from the joint one
by summing over the irrelevant outcomes:

$$p^{(1)}(x_1) = \Pr[X_1 = x_1] = \sum_{x_2} p^{(2)}(x_1, x_2). \tag{1.11}$$

The result $p^{(1)}$ is called the marginal distribution associated with $X_1$.

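Let’s work out the quantum analogue of the “partial sum” in Eq. (1.11). Assume we
are given a joint state of two particles, described by a density operator $\rho^{(12)}$ on the tensor
product Hilbert space $\mathcal{H}_1 \otimes \mathcal{H}_2$. We would like to compute an effective state $\rho^{(1)}$ that
describes measurements performed on the first particle alone. More precisely, for every
observable A on $\mathcal{H}_1$, we demand

$$\operatorname{tr}\big(\rho^{(1)} A\big) = \operatorname{tr}\big(\rho^{(12)} (A \otimes \mathbb{1})\big). \tag{1.12}$$

To solve this problem, define the partial trace $\operatorname{tr}_2$ of a product operator by computing
the usual trace of the second factor only:

$$\operatorname{tr}_2 (C \otimes D) = C \operatorname{tr}(D).$$

Note that the partial trace maps an operator on the tensor product Hilbert space to an
operator on the first system alone. Next, because any operator M on $\mathcal{H}_1 \otimes \mathcal{H}_2$ can be
expanded in terms of product operators

$$M = \sum_{ijkl} M_{ijkl}\, |ij\rangle\langle kl| = \sum_{ijkl} M_{ijkl}\, |i\rangle\langle k| \otimes |j\rangle\langle l|,$$

one can extend $\operatorname{tr}_2$ linearly to all operators:

$$\operatorname{tr}_2 M = \sum_{ijkl} M_{ijkl}\, |i\rangle\langle k| \operatorname{tr}\big(|j\rangle\langle l|\big) = \sum_{ik} \Big(\sum_j M_{ijkj}\Big) |i\rangle\langle k|.$$

Using this expression, with $\langle ij|\rho^{(12)}|kl\rangle = \rho^{(12)}_{ijkl}$ and $\langle kl|A \otimes \mathbb{1}|ij\rangle = \langle k|A|i\rangle\,\delta_{lj}$:

$$\operatorname{tr}\big(\rho^{(12)} (A \otimes \mathbb{1})\big) = \sum_{ijkl} \langle ij|\rho^{(12)}|kl\rangle\, \langle kl|A \otimes \mathbb{1}|ij\rangle = \sum_{ik} \langle k|A|i\rangle \Big(\sum_j \rho^{(12)}_{ijkj}\Big) = \operatorname{tr}\big(A \operatorname{tr}_2 \rho^{(12)}\big).$$

We have found that

$$\rho^{(1)} = \operatorname{tr}_2 \rho^{(12)}$$

solves Eq. (1.12). In this sense, the partial trace is the quantum analogue of the “partial
sum” of Eq. (1.11). The density matrix $\rho^{(1)}$ is called the marginal state or the reduced
density matrix.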
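
The index formula for the partial trace carries over almost literally to code. Below is a minimal sketch that stores operators on the tensor product space as nested lists, with the basis vector |ij⟩ at row/column index i·d₂ + j (an ordering assumed here), and checks the defining property (1.12) on the singlet state for one arbitrarily chosen Hermitian observable A. All names are illustrative.

```python
import math

def outer(v, w):
    """Matrix |v><w| for two vectors given as lists."""
    return [[vi * complex(wj).conjugate() for wj in w] for vi in v]

def kron(a, b):
    """Kronecker product a (x) b of two matrices."""
    n, m = len(b), len(b[0])
    return [[a[i // n][j // m] * b[i % n][j % m] for j in range(len(a[0]) * m)]
            for i in range(len(a) * n)]

def partial_trace_2(m, d1, d2):
    """tr_2 M: entry (i, k) is the sum over j of M_{ij,kj}."""
    return [[sum(m[i * d2 + j][k * d2 + j] for j in range(d2)) for k in range(d1)]
            for i in range(d1)]

def trace_of_product(x, y):
    """tr(x y) for two square matrices of equal size."""
    n = len(x)
    return sum(x[i][k] * y[k][i] for i in range(n) for k in range(n))

s = 1 / math.sqrt(2)
singlet = [0.0, s, -s, 0.0]                    # basis order: 00, 01, 10, 11
rho12 = outer(singlet, singlet)
rho1 = partial_trace_2(rho12, 2, 2)

identity = [[1, 0], [0, 1]]
obs = [[2, 1 - 1j], [1 + 1j, -1]]              # hypothetical Hermitian observable A

local_value = trace_of_product(rho1, obs)                    # tr(rho1 A)
global_value = trace_of_product(rho12, kron(obs, identity))  # tr(rho12 (A x 1))
```

For the singlet, `rho1` comes out as one half of the identity, so both traces equal tr(A)/2 = 0.5 for this choice of A.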

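Pure product states. For pure product states, we find

$$\rho^{(12)} = |\alpha\beta\rangle\langle\alpha\beta| = |\alpha\rangle\langle\alpha| \otimes |\beta\rangle\langle\beta| \quad\Rightarrow\quad \rho^{(1)} = |\alpha\rangle\langle\alpha| \operatorname{tr}\big(|\beta\rangle\langle\beta|\big) = |\alpha\rangle\langle\alpha|.$$

Physically, this says that measurements on the first particle are sensitive only to the
preparation of the first particle, which reflects the independence property (1.8) we have required
in the distinguishable case.

The singlet state. The partial trace of the singlet state is much more interesting:

$$\operatorname{tr}_2 |\Psi^-\rangle\langle\Psi^-| = \frac{1}{2}\operatorname{tr}_2\Big(|{\uparrow\downarrow}\rangle\langle{\uparrow\downarrow}| - |{\uparrow\downarrow}\rangle\langle{\downarrow\uparrow}| - |{\downarrow\uparrow}\rangle\langle{\uparrow\downarrow}| + |{\downarrow\uparrow}\rangle\langle{\downarrow\uparrow}|\Big) = \frac{1}{2}|{\uparrow}\rangle\langle{\uparrow}| + \frac{1}{2}|{\downarrow}\rangle\langle{\downarrow}| = \frac{1}{2}\mathbb{1}.$$

While the global state $\rho = |\Psi^-\rangle\langle\Psi^-|$ was pure, the partial trace $\operatorname{tr}_2 \rho = \tfrac{1}{2}\mathbb{1}$ is maximally
mixed! If ρ describes a thermodynamic equilibrium state, the total system is at temperature
0, while Alice’s subsystem has temperature ∞. In classical physics, this is impossible.
This example shows that mixed states can occur in QM even in the absence of any form of
classical randomness. We’ll explore the conceptual implication in the next section.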

Entropy of entanglement. Let $|\Psi\rangle \in \mathcal{H}_1 \otimes \mathcal{H}_2$. If |Ψ⟩ = |αβ⟩ is a product state, then
the reduced density matrix $\rho^{(1)} = \operatorname{tr}_2 |\Psi\rangle\langle\Psi| = |\alpha\rangle\langle\alpha|$ is pure and thus has vanishing
von Neumann entropy $S(\rho^{(1)}) = 0$. Since we have defined “entanglement” to be the
property of not-being-a-product-state, it is natural to define $S(\operatorname{tr}_2 |\Psi\rangle\langle\Psi|)$ as a quantitative
measure of entanglement. For the singlet state, this entropy of entanglement is 1 bit,
the highest value realizable in two dimensions. For this reason, the singlet is called a
maximally entangled state.
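
For two spin-1/2 particles, the entropy of entanglement can be computed in a few lines. The sketch below takes the coefficients $\psi_{ij}$ of a pure state as a 2 × 2 table, forms the reduced density matrix with entries $\sum_j \psi_{ij}\psi^*_{kj}$, and evaluates the entropy of its eigenvalues in bits (base-2 logarithm, matching the "1 bit" quoted above). Function names and sample states are assumptions.

```python
import math

def reduced_eigenvalues(psi):
    """Eigenvalues of rho1 = tr_2 |psi><psi| for a 2x2 coefficient table psi[i][j]."""
    rho = [[sum(psi[i][j] * psi[k][j].conjugate() for j in range(2)) for k in range(2)]
           for i in range(2)]
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return tr / 2 - disc, tr / 2 + disc

def entanglement_entropy_bits(psi):
    """Von Neumann entropy (base 2) of the reduced state, with 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in reduced_eigenvalues(psi) if p > 1e-15)

s = 1 / math.sqrt(2)
singlet = [[0.0, s], [-s, 0.0]]          # (|01> - |10>) / sqrt(2)
product = [[1.0, 0.0], [0.0, 0.0]]       # |00>

entropy_singlet = entanglement_entropy_bits(singlet)
entropy_product = entanglement_entropy_bits(product)
```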

Summary

• The global Hilbert space of particles with individual Hilbert spaces $\mathcal{H}_1, \mathcal{H}_2$
with bases $\{|e_i\rangle\}$, $\{|f_j\rangle\}$ is the tensor product space

$$\mathcal{H}_{12} = \Big\{ \sum_{ij} \psi_{ij}\, |e_i f_j\rangle \;\Big|\; \psi_{ij} \in \mathbb{C} \Big\}.$$

• The restriction of a global density operator $\rho^{(12)}$ to the first subsystem is given
by the partial trace $\rho^{(1)} = \operatorname{tr}_2 \rho^{(12)}$.
• Globally pure states can look locally mixed. This is a sign of entanglement.

The rest of this chapter presents some topics in multi-partite quantum systems,
which are, I think, conceptually highly interesting. But we won’t build on them
in the remainder. So it’s fine to skip ahead to Chapter 2.

1.3 Dynamics of coupled systems

Goals
Things will get much more interesting! Our direct objective in this section is to
work out a model in which entangled states arise naturally. Even though the model
is extremely simple, we will, as a by-product, be able to make progress on issues
that seem to pose conceptual problems to QM: The measurement problem and the
question of why the world looks classical, even though it seems to be fundamentally
governed by QM.

1.3.1 The measurement and the classicality problem


The measurement problem
Elementary QM provides two very different rules for time evolution:
• Hamiltonian time evolution: $|\psi\rangle \mapsto e^{\frac{1}{i\hbar}\Delta t H}|\psi\rangle$. Change is continuous in time,
reversible, deterministic, and linear in the state vector.

• Projective measurements: $|\psi\rangle \mapsto P_j|\psi\rangle/\sqrt{p_j}$ with probability $p_j = \langle\psi|P_j|\psi\rangle$.
Change is discontinuous in time, irreversible, non-deterministic, non-linear in the
wave function.

Figure 1.3: The formalism of QM divides the universe into degrees of freedom that are
modeled quantum-mechanically and those that are classical. The boundary between these
two regimes is the Heisenberg cut. For a Stern-Gerlach experiment, the quantum side could
include just the spin (1), but also the motional degrees of freedom (2) of the silver atom, or
the measurement device (3) that records its final position, or even the experimentalist (4)
observing the outcome.

Given that these are completely different, quantum physicists take great care to very
carefully explain when to use the one and when to use the other. ... Huh huh, just kidding.
Check out your introductory textbook and try to find a definition of which properties
exactly a physical process has to fulfill in order to qualify as a “measurement”. I wish you
good luck!
The standard presentation of quantum mechanics divides the world into a “quantum
part” and a “classical part”. The measurement rules connect the two. But it is not clear
which degrees of freedom belong to which side of this cut.
Example: In the standard treatment of the Stern-Gerlach experiment, the spin is modeled
quantum mechanically, but the spatial position of the atom classically. The spin-dependent
movement of the atom is treated as a measurement. But it also seems reasonable
to put the atom’s position on the quantum side of the cut (Fig. 1.3). The interaction between
spin and spatial coordinates is then described by a coherent Hamiltonian time evolution.
A measurement only takes place once an observer records the atom’s position.
We can now state two aspects of quantum mechanics’ measurement problem:
• The pragmatic problem: Why can physicists get away with being so vague about
the notion of “measurement”? Why don’t different modeling decisions produce
different predictions? (We’ll be able to answer this.)
• The philosophical problem: Given that quantum mechanics is supposedly more
fundamental than classical theories, how do we deal with the fact that its predictions
are stated with respect to a classical world? Who’s measuring the wave function of
the universe? (We won’t make progress here. In fact, there’s no agreement on what
the best solution to this issue is. Or whether there is a solution. Or whether there was
a problem in the first place. It’s a mess.)

Why is the macroscopic world classical?


The gravitational potential describing planetary motion and the Coulomb potential binding
an electron to a nucleus are mathematically equivalent. Why then is it the case that we
describe states of bound electrons in terms of delocalized orbitals, whereas Venus seems
to occupy a pretty definite spot in the night sky (however, see Fig. 1.4)?

Figure 1.4: Why do planets and electrons behave differently? An unconventional take.
Source: xkcd.com.

Likewise, why do marbles seem to be in one place at any one time, while from the
perspective of elementary QM, it would be much more natural to assign a momentum
eigenstate to them (which diagonalizes the free Hamiltonian)? Due to their macroscopic
mass, it is compatible with Heisenberg’s uncertainty relation that a marble can be
in a state in which both position and momentum are very precisely determined – but it is by
no means necessary that such a state be adopted. So why then does this seem to happen?
More generally: Which process breaks the unitary invariance of quantum state space
and selects the basis in which we encounter physical objects?

1.3.2 A quantum model for measurements


With these fundamental questions at the back of our heads, let’s start with the Hamiltonian
for a particle with spin interacting with an external magnetic field:

$$H = \frac{P^2}{2m} - \frac{\gamma\hbar}{2}\, \mathbf{B}\cdot\boldsymbol{\sigma}$$

Assume that $\mathbf{B} = Bz\,\mathbf{e}_z$. Then only the z-coordinate participates in the interaction, so
nothing is lost by only treating the spin and the spatial z-coordinate explicitly. The time
evolution is best calculated in the interaction picture. Decompose the Hamiltonian as

$$H = H_0 + H_I, \qquad H_0 = \frac{P_z^2}{2m}, \qquad H_I = -\frac{\gamma\hbar B}{2}\, z\sigma_z.$$

Then the Schrödinger and the interaction-picture wave functions are

$$|\psi_S(t)\rangle = e^{\frac{1}{i\hbar}tH}|\psi_S(0)\rangle, \qquad |\psi_I(t)\rangle = e^{-\frac{1}{i\hbar}tH_0}|\psi_S(t)\rangle = e^{\frac{1}{i\hbar}tH_I}|\psi_S(0)\rangle,$$

where $|\psi_I(t)\rangle$ describes the change of dynamics caused by an interaction term $H_I$.
First treat the case where the particle is initially in a momentum-0 eigenstate:

$$|\psi_S(t=0)\rangle = (\alpha|{\uparrow}\rangle + \beta|{\downarrow}\rangle)\,|k=0\rangle.$$



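Then, with $\delta := \frac{\hbar\gamma B}{2}$,

$$\begin{aligned}
|\psi_I(t)\rangle &= e^{\frac{i\gamma}{2}tBz\sigma_z}\,|\psi_S(t=0)\rangle \\
&= \alpha\, e^{\frac{i\gamma}{2}tBz}\,|{\uparrow}\rangle|k=0\rangle + \beta\, e^{-\frac{i\gamma}{2}tBz}\,|{\downarrow}\rangle|k=0\rangle \\
&= \alpha|{\uparrow}\rangle|k=\delta t\rangle + \beta|{\downarrow}\rangle|k=-\delta t\rangle. \qquad (1.13)
\end{aligned}$$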

This is an entangled state! A measurement of spin and momentum gives correlated out-
comes:
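$$\Pr[s, k+dk] = \begin{cases} |\alpha|^2 & (s,k) = (\uparrow, +\delta t) \\ |\beta|^2 & (s,k) = (\downarrow, -\delta t) \end{cases}.$$

The marginal distribution for the spin variable alone is

$$\Pr[s] = \begin{cases} |\alpha|^2 & s = \uparrow \\ |\beta|^2 & s = \downarrow \end{cases}.$$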

This is exactly what we would have obtained by treating just the spin quantum mechan-
ically! Thus: Using a quantum model for the spatial z-component does not change the
prediction about the measured spin state. All it does is to entangle the measured and the
measuring degree of freedom so that the global state becomes a superposition of consistent
configurations. Indeed, we could have included further degrees of freedom – e.g. the ex-
perimentalist observing the particle momentum. If we model them – simplifying slightly
– as a two-dimensional system with (mental) states |☺⟩ when seeing an upwards moving
atom, and |☹⟩ when encountering one moving downwards, an analogous calculation
would have resulted in

|ψI (t)⟩ = α|↑⟩|δt⟩|☺⟩ + β|↓⟩| − δt⟩|☹⟩, (1.14)

with a similar interpretation if now the experimentalist’s state gets measured.
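
The momentum kick used in (1.13) can be checked in one line. The following is a sketch that assumes the position representation $\langle z|k\rangle \propto e^{ikz/\hbar}$ for a momentum eigenstate with eigenvalue k (a normalization convention that is not spelled out above):

```latex
\begin{align*}
\langle z|\, e^{\pm\frac{i\gamma}{2} tBz}\,|k\rangle
  \;\propto\; e^{\pm\frac{i\gamma}{2} tBz}\, e^{\frac{i}{\hbar}kz}
  \;=\; e^{\frac{i}{\hbar}\left(k \pm \frac{\hbar\gamma B}{2}\,t\right)z}
  \;=\; e^{\frac{i}{\hbar}(k \pm \delta t)z}
  \;\propto\; \langle z|k \pm \delta t\rangle .
\end{align*}
```

Under this assumption, the spin-up component is shifted by +δt and the spin-down component by −δt, which is the content of the last line of (1.13).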


Instead of a momentum eigenstate, let’s use a more realistic Gaussian initial state.
Write |ψk0 ⟩ for a Gaussian wave packet centered around k0 in momentum space:
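$$\langle k|\psi_{k_0}\rangle = (2\pi)^{-1/4}\, e^{-\frac{(k-k_0)^2}{4}}.$$

Then

$$e^{\frac{i\gamma}{2}tBz}\,|\psi_{k_0}\rangle = |\psi_{k_0+\delta t}\rangle$$

so that, if we take $|\psi_S(0)\rangle = |\psi_0\rangle$,

$$|\psi_I(t)\rangle = \alpha|{\uparrow}\rangle|\psi_{\delta t}\rangle + \beta|{\downarrow}\rangle|\psi_{-\delta t}\rangle.$$

The correlations between spin and position now build up over time. Indeed:

$$\Pr[s, k+dk] = \frac{1}{\sqrt{2\pi}} \begin{cases} |\alpha|^2\, e^{-\frac{(k-\delta t)^2}{2}}\, dk & s = \uparrow \\ |\beta|^2\, e^{-\frac{(k+\delta t)^2}{2}}\, dk & s = \downarrow \end{cases}.$$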

At t = 0, the momentum distribution is independent of the spin state. For times t ≃ 1/δ,
the two spin-dependent Gaussian distributions become distinct, but overlap significantly.
Only for t ≫ 1/δ does the sign of a measured momentum value identify the spin state
with certainty.
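
To put numbers on these three regimes, here is a minimal sketch. It assumes the unit-variance Gaussians written above, so that the chance of the sign of k pointing to the wrong spin is a Gaussian tail probability; the function name is made up for illustration.

```python
import math

def misidentification_probability(delta_t: float) -> float:
    """Probability that the sign of the measured k points to the wrong spin.

    Given spin up, k is Gaussian with mean +delta_t and unit variance, so the
    error is Pr[k < 0] = erfc(delta_t / sqrt(2)) / 2. By symmetry the same
    value holds for spin down, so the result does not depend on alpha, beta.
    """
    return 0.5 * math.erfc(delta_t / math.sqrt(2.0))

for delta_t in (0.0, 1.0, 3.0, 5.0):
    error = misidentification_probability(delta_t)
    print(f"delta * t = {delta_t:3.1f}: error probability = {error:.2e}")
```

At δt = 0 the sign is a coin flip; the error then falls off like a Gaussian tail as δt grows.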

Let’s summarize: The coupling term (zσz ) caused the spin and the positional degree
of freedom to become entangled over time. A measurement on the entangled state in
the eigenbases of the two factors z, σz leads to correlated outcomes. Asymptotically, the
correlations are perfect, and a direct measurement of one observable on the initial state
is equivalent to a measurement of the other observable on the final state. One can then
define a measurement to be any process to which the above analysis applies. In this case,
the measuring degree of freedom (called a pointer in this context) can be treated either
classically or quantum-mechanically.
The same framework can be used to identify the basis in which objects present themselves.
To see how, compute the reduced density matrix for the spin. From

|ψI (t)⟩⟨ψI (t)| = |α|2 |↑⟩⟨↑| ⊗ |ψδt ⟩⟨ψδt | + αβ ∗ |↑⟩⟨↓| ⊗ |ψδt ⟩⟨ψ−δt | + . . .

and
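$$\operatorname{tr}|\psi_{\delta t}\rangle\langle\psi_{-\delta t}| = \langle\psi_{-\delta t}|\psi_{\delta t}\rangle = \frac{1}{\sqrt{2\pi}}\int e^{-\frac{(-\delta t-k)^2+(\delta t-k)^2}{4}}\,dk = \frac{1}{\sqrt{2\pi}}\int e^{-\frac{k^2+(\delta t)^2}{2}}\,dk = e^{-(\delta t)^2/2},$$

we can read off the reduced density matrix in the {|↑⟩, |↓⟩}-basis:

$$\rho_{\mathrm{spin}}(t) = \operatorname{tr}_{\mathrm{space}}|\psi_I(t)\rangle\langle\psi_I(t)| = \begin{pmatrix} |\alpha|^2 & \alpha\beta^*\, e^{-(\delta t)^2/2} \\ \alpha^*\beta\, e^{-(\delta t)^2/2} & |\beta|^2 \end{pmatrix}.$$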

Thus, the state of the spin part alone dephases from a pure state at t = 0 to a probabilistic
mixture of |↑⟩ and |↓⟩ for times t ≫ 1/δ. The entropy (of entanglement) gradually builds
up from S(t = 0) = 0 to

S(t → ∞) = −|α|2 log |α|2 − |β|2 log |β|2 .
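
The gradual build-up of entropy can be traced numerically. This is a sketch under two assumptions: the off-diagonal elements are suppressed by the overlap $e^{-(\delta t)^2/2}$ obtained from the integral above, and the entropy is taken with the natural logarithm. The function name is illustrative.

```python
import math

def spin_entropy(alpha: complex, beta: complex, delta_t: float) -> float:
    """Von Neumann entropy (natural log) of the reduced spin state.

    rho = [[|a|^2, a b* c], [a* b c, |b|^2]] with overlap c = exp(-(delta_t)^2 / 2).
    Assumes |alpha|^2 + |beta|^2 = 1.
    """
    p_up, p_down = abs(alpha) ** 2, abs(beta) ** 2
    c = math.exp(-delta_t ** 2 / 2.0)
    # For a 2x2 matrix with trace 1, the eigenvalues are (1 +- sqrt(1 - 4 det)) / 2.
    det = p_up * p_down * (1.0 - c * c)
    root = math.sqrt(max(0.0, 1.0 - 4.0 * det))
    eigenvalues = ((1.0 + root) / 2.0, (1.0 - root) / 2.0)
    return -sum(x * math.log(x) for x in eigenvalues if x > 0.0)

a = b = 1.0 / math.sqrt(2.0)
for delta_t in (0.0, 0.5, 1.0, 2.0, 10.0):
    print(delta_t, spin_entropy(a, b, delta_t))
```

For equal weights the entropy climbs from 0 toward log 2, the value of the limiting formula above.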

Let’s again interpret this calculation from a broader perspective. After the dephasing
time, an unrelated observer will find the spin in a σz -eigenstate and will not encounter
superpositions. Recall what distinguishes the z-axis: It is the one in which the interac-
tion takes place! The bases which we perceive as “classical” are the ones in which the
interaction terms are diagonal, and the emergence of probabilistic mixtures is a result of
entanglement building up. Interactions are local, which is why quantum systems usually
appear to be well-localized in space. However, some interactions select for different bases:
e.g. electrons bound in an atom couple to the environment via the electromagnetic field.
This interaction is sensitive to atomic energy scales and angular momentum – but the
wavelengths of the involved photons are too large for the position of the electron within
the atom to make a meaningful difference. Therefore, the semi-classical description of electrons
in terms of atomic quantum numbers (“n, l, m”) makes sense in this case. In contrast,
whether or not a photon is scattered off the surface of Venus depends on the planet’s
position within its orbit, not on its internal energy or angular momentum.
Further conceptual points:
• Q.: Are measurements discontinuous in time?
A.: Nope! Correlations between the measured system and the environment are built
up at a time scale proportional to the inverse coupling strength. The instantaneous
process postulated in introductory QM can be understood as an effective description
valid for times much larger than that.

• Q.: Are these processes irreversible?


A.: The dynamics on the quantum side of the cut is reversible in theory – the final
measurement still isn’t. This doesn’t lead to practical contradictions, though. As-
sume we put the whole universe, except for ourselves, on the quantum side. What
would it take to reverse the measurement after a blob of silver (in the Stern-Gerlach
case) has been deposited on a screen, but before we have looked at it? The deposit
will have interacted with an enormous number of degrees of freedom: phonons in the
screen, the cosmic background radiation, thermal photons that have since zoomed
off into the sky at the speed of light. Clearly, for all practical purposes (“FAP”),
it is impossible to reverse those interactions. Thus, once a macroscopic record of
an event exists, the irreversibility introduced by QM’s measurement postulate does
not change anything FAP. Philosophically, it might still be a thorny issue though!
This is all good news if you like to compute things (no immediate contradiction).
It’s bad news if you like to understand foundational questions, because there seems
little empirical guidance on offer for how to handle this conceptual inconsistency.
• Q.: In thermodynamics, there’s tension between the fact that entropy increases,
while microscopic dynamics is reversible. The buildup of entanglement seems like
an elegant solution: Local randomness is created from globally reversible dynamics.
Maybe “all entropy is entanglement entropy”! Is that a good way to think about the
apparent increase of entropy?
A.: You betcha!
Concrete estimates for decoherence rates can be found in Table 2 of Tegmark, Apparent
wave function collapse caused by scattering.

1.4 Quantum many-body systems as computers

Goals
Quantum computing is all the rage! We’ll introduce the basic concepts here and
discuss one very cool and comparatively simple application: Grover’s algorithm,
which can search through $N$ items in $\sqrt{N}$ time. You heard me right.

One can iterate the construction of the two-particle Hilbert space to find the space for
n > 2 systems. Assume, for simplicity, that every single-system Hilbert space Hi has
dimension d and basis {|1⟩, . . . , |d⟩}. Then a general state vector in the joint Hilbert space
H = H1 ⊗ · · · ⊗ Hn is of the form
$$|\psi\rangle = \sum_{i_1,i_2,\dots,i_n=1}^{d} \psi_{i_1,\dots,i_n}\, |i_1,\dots,i_n\rangle.$$

You should immediately notice that the sum is over $d^n$ terms, i.e. the dimension of the
joint space is exponentially large in the number of constituents! For a collection of
spin-1/2s arranged on a cube with side length only 10, this gives a
way-bigger-than-merely-astronomical $2^{1000}$. This is:
• Bad news if you work in computational physics. It is absolutely out of the question
even to just store the coefficients ψi1 ,...,in in memory. Fortunately, one can some-
times use clever tricks to make statements about large-n systems without having to
work with explicit representations. More on this: See rest of these notes.

• Potentially good news if you can carefully control large quantum systems. Because
Nature seems to be able to track quantum states that our classical computers can’t, it
stands to reason that quantum systems could be used to solve otherwise intractable
computational problems. More on this: See Sec. 1.4.
Many-body Hamiltonians can usually be efficiently represented, though. The reason
is that physical interactions involve only few particles at a time. A Hamiltonian with only
single- and two-body terms is of the form

$$H = \sum_{k} h^{(k)} + \frac{1}{2}\sum_{k\neq l} h^{(k,l)}$$

where $h^{(k,l)}$ acts non-trivially only on the k-th and l-th Hilbert spaces and can therefore be
specified as a $d^2 \times d^2$-matrix (or, if $d = \infty$, will typically be a simple function of position
and momentum operators).
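
The gap between the two descriptions is easy to quantify. The sketch below simply counts numbers; the function names are made up for illustration, and Hermiticity (which would reduce the Hamiltonian count further) is ignored.

```python
def state_vector_amplitudes(d: int, n: int) -> int:
    """Number of complex coefficients psi_{i1..in} of a general n-body state."""
    return d ** n

def two_body_hamiltonian_entries(d: int, n: int) -> int:
    """Matrix entries needed to list all single- and two-body terms.

    n single-body terms (d x d each) plus n(n-1)/2 unordered pairs
    (d^2 x d^2 each).
    """
    return n * d ** 2 + (n * (n - 1) // 2) * d ** 4

n = 1000  # spin-1/2 particles on a 10 x 10 x 10 cube
print("Hamiltonian entries:", two_body_hamiltonian_entries(2, n))
print("decimal digits of the number of amplitudes:", len(str(state_vector_amplitudes(2, n))))
```

A few million numbers fix the Hamiltonian, while the count of amplitudes is itself a number with hundreds of digits.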
Given the Hamiltonian, typical questions of interest are:

1. Obtain information about the eigenvalues of H, e.g. the energies of the ground states
and of low-lying excitations.
2. Compute thermodynamical potentials, e.g. the free energy

log Z = log tr e−βH .

3. Compute the expectation value $\langle\psi(t)|A^{(i)}|\psi(t)\rangle$ of a local observable. Here,
$|\psi(t)\rangle = e^{\frac{t}{i\hbar}H}|\psi(0)\rangle$ is the time evolution of a state that started out in a simple form, say
$|\psi(0)\rangle = |i_1,\dots,i_n\rangle$.

In general, finding answers to these questions is intractable. The task of quantum many-
body theory is to find special cases or approximations where progress can be made.

Quantum algorithms
It is not obvious that simulating the time evolution of quantum many-body systems actu-
ally is classically intractable. Sure, we have argued above that storing a many-body wave
function in memory is impossible. But we have also seen that any physical time evolution
can be described using only a small number of parameters (the local terms of the Hamilto-
nian, a simple initial state). So it is conceivable that there exists a smart universal way of
keeping track of |ψ(t)⟩ that does not involve working with the full state vector.
Today, there is strong evidence that such a universal strategy does not exist.¹
One piece of evidence is given by the existence of quantum algorithms. These are
methods that allow one to solve a difficult classical computational problem efficiently by
outsourcing parts of the calculation to a quantum device.
¹ There is no rigorous proof of this, though! The issue is that it has so far been beyond the wit of humankind
to prove any reasonable problem to be computationally hard. For example, the infamous “P vs NP” problem asks
for a proof that finding solutions to problems is generally harder than verifying that a proposed solution indeed
works. An imprecise analogue would be: Appreciating classical music is easier than becoming the next Mozart.
Of course this is true – so the fact that there is no mathematical proof that “P ≠ NP” is not generally taken to
be indicative of there being serious doubts about the statement, but rather as testament to the limitations of the
human mind. Sadly, a detailed account is beyond the scope of this lecture.

davidg@repos:~$ sudo grep davidg /etc/shadow
davidg:$6$2kjoNwafEvibRYry$BzfBTGfk0wY2nk3c05OAucPXQPIP8
dWMFGbCNoAs7B1dacUNNUn5fDMyOorDu4QSaxOWZskpObEz3dlBbfI3f
/:19425:0:99999:7:::
davidg@repos:~$

Figure 1.5: Stored SHA-512 hash of my actual university login. If you find a pre-image,
you can read my emails and adjust your grades. Knock yourself out! [If you do succeed,
you could also answer my emails. Come to think of it, maybe I should just post my
password...].

1.4.1 Grover’s algorithm


Here, we will look at one example: Grover’s algorithm. As we’ll see, it is simultaneously:
(i) comparatively easy to understand, (ii) highly surprising in what it achieves, and (iii)
probably of limited practical value even if big quantum computers can be constructed.

Overview
There are some computational puzzles for which the best-known approach is to just try
every possible input to see whether it is a solution.
The most clear-cut cases are used in cryptography. For example, your computer does
not actually know your password! Instead, it stores an n-bit image $y^\star = h(x^\star)$ of the
password $x^\star$ under a cryptographic hash function h. It is designed such that computing
$y = h(x)$ for an input x is easy, but the best-known way of finding a pre-image
$x \in h^{-1}(\{y\})$ given y is to try $\simeq 2^n$ random inputs (Fig. 1.5). To authenticate a user who
claims their password is x, the computer compares $y = h(x)$ to the hash $y^\star$ on file. The
advantage of such an indirect procedure is that not much harm is done if the stored hashes
fall into the wrong hands: A typical value of n is 512 and $2^{512} \gg$ (hadrons in universe),
so recovering the passwords x⋆ is impractical. (Unless, of course, the user chooses a
password that can be guessed with reasonable effort. No hash magic makes “birthday-of-
romantic-partner123lol” a secure choice.)
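
The authentication step can be sketched in a few lines. This is a deliberately simplified illustration: it uses plain SHA-512 from Python's hashlib as the function h, whereas real login systems use salted and intentionally slow schemes (the "$6$..." string in Fig. 1.5 is such a salted format). The password below is hypothetical.

```python
import hashlib
import hmac

def h(password: str) -> bytes:
    """Toy stand-in for the hash function h: plain SHA-512 of the password.

    Assumption for illustration only; production systems add a salt and
    many iterations.
    """
    return hashlib.sha512(password.encode("utf-8")).digest()

# What the computer keeps on file: y* = h(x*), not x* itself.
stored_hash = h("correct horse battery staple")  # hypothetical password x*

def authenticate(claimed_password: str) -> bool:
    """Accept the user if and only if h(x) equals the stored y*."""
    return hmac.compare_digest(h(claimed_password), stored_hash)

print(authenticate("correct horse battery staple"))
print(authenticate("birthday-of-romantic-partner123lol"))
```

Only the 512-bit digest is stored; checking a claimed password costs a single evaluation of h.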
Finding an inverse by trying random inputs does not require that we understand any-
thing about the inner workings of h. All we need is the ability to compute h(x) given x.
Methods that interact with h only in this way are called black box (or oracle) algorithms.
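
A black box search looks like this in code. The sketch assumes a toy 16-bit "hash" built by truncating SHA-512, so that exhaustive search finishes instantly; it walks through the inputs in order rather than at random, which does not change the counting argument. All names and the secret value are hypothetical.

```python
import hashlib
from typing import Callable, Optional, Tuple

def brute_force_preimage(oracle: Callable[[int], bool], n_bits: int) -> Tuple[Optional[int], int]:
    """Try inputs 0, 1, 2, ... and count how often the black box is queried."""
    queries = 0
    for x in range(2 ** n_bits):
        queries += 1
        if oracle(x):  # the only way we ever touch the function
            return x, queries
    return None, queries

N_BITS = 16  # toy size; real hashes use hundreds of bits

def toy_hash(x: int) -> int:
    """Hypothetical 16-bit 'hash': the first two bytes of SHA-512."""
    digest = hashlib.sha512(x.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:2], "big")

secret = 48_611            # hypothetical x*
target = toy_hash(secret)  # y* = h(x*)
found, queries = brute_force_preimage(lambda x: toy_hash(x) == target, N_BITS)
print(found, queries)
```

With only 16 output bits, collisions are common, so the search may return a different pre-image than the secret; either one would pass the check against the stored hash.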
Are black box algorithms really the best way to invert a hash function? Don’t take my
word for it! The vast wealth stored in “crypto currencies” is secure only to the degree that
this assumption is true. Bitcoin is effectively a multi-billion dollar bounty on an improved
algorithm. It hasn’t been claimed as of 2023 (Fig. 1.6).
In light of this, it is truly remarkable that in 1996, Lov Grover showed that a quantum
computer can find x from y in roughly $\sqrt{2^n} = 2^{n/2}$ time steps. In fact, this square root
speedup is possible for any puzzle for which a solution can be efficiently recognized!
Here’s a high-level overview: We model “puzzle for which a solution can be recog-
nized” by a function f that maps n-bit strings x (“candidates”) to 0 (“no solution”) or 1
(“solution!”). In the above example: f (x) = 1 if h(x) = y ⋆ and 0 else. Assume you
have a piece of code that evaluates f on a classical computer in Tn time steps. Then
Grover’s recipe turns that code into a time-dependent two-body Hamiltonian H(t) such
that if $|\psi(0)\rangle = |0,\dots,0\rangle$, then

$$\big|\psi\big(t = cT_n\sqrt{2^n}\big)\big\rangle \simeq |x^\star_1,\dots,x^\star_n\rangle,$$

where $x^\star$ is such that $f(x^\star) = 1$ and c a (reasonably small) constant. A measurement will
then reveal the bits of $x^\star$ with high probability.

Figure 1.6: Left: Cryptocurrency mines consist of racks of computers that try random
inputs hoping to find a solution to a mathematical puzzle. Right: If you can do better,
there’s 300b dollars on the table (as of early 2023). Credit: Wikipedia, Statista.

Grover’s algorithm is also “black box” in the sense that no understanding of “the inner
workings” of f beyond the ability to compute it is required. So how is it possible to find a
solution in drastically less time than it would take to consider a fixed fraction of all inputs?
The answer is that Grover constructs a quantum black box $U_f: |x,0\rangle \mapsto |x,f(x)\rangle$ that
can be run on a superposition of inputs

$$U_f\Big(\sum_x c_x\,|x,0\rangle\Big) = \sum_x c_x\,|x,f(x)\rangle.$$

Thus, just a single invocation of the quantum black box results in a wave function that
carries information about all possible inputs. The tricky part is then to read this information
out. Grover’s contribution was to find a clever trick for getting the amplitudes for all
|x, f (x)⟩ with f (x) = 0 to interfere destructively, so that only the solutions survive.
We’ll work our way through the details next.

The gate model


The connection between classical computer code and quantum Hamiltonians goes via the
gate model of computation.
A classical computer operates on bits, physical systems that can be in one of two
states. Traditionally, these are labeled 0 and 1. A logic gate (or just gate) is a process
that changes the state of a small number of bits in a defined way, see Fig. 1.7. It is known
in classical computer science that any function that can be computed at all can also be
computed by a circuit formed by concatenating reversible logic gates.
We now consider a quantum generalization. To this end, replace each bit by a two-level
quantum system and fix some basis with states labeled |0⟩, |1⟩. Voilà, a quantum bit (or
qubit). We assume that we have detailed control over the dynamics: By adjusting classical
control parameters (external fields, position of the qubits, etc.), we are able to switch off
the time evolution $H = 0$, or to realize single-qubit $H = h^{(i)}$ or two-qubit $H = h^{(ij)}$
Hamiltonians.

Figure 1.7: Classical gates and circuits. (i) The NOT gate inverts the state of a single
bit. (ii) The XOR gate computes the exclusive or x ⊕ y of its inputs. (iii) The CNOT
(or controlled not) gate toggles the state of the second bit if and only if the first bit is
in the 1-state. Note that the CNOT and the NOT gate are reversible: I.e. the input can
be reconstructed given the output. (iv) A reversible circuit. It turns out that anything a
classical computer can do can be represented in this way.

Let’s look at the single-qubit case first. If we set, e.g., $H = h^{(i)} = \frac{\hbar}{2}\sigma_x^{(i)}$ for a duration
$\delta t = \pi$, then the time evolution $U(\delta t) = e^{-i\frac{\pi}{2}\sigma_x^{(i)}} = -i\sigma_x^{(i)}$ acts on the i-th qubit like the
classical NOT gate (ignoring a global phase factor of −i):

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}: \qquad |0\rangle \mapsto |1\rangle, \quad |1\rangle \mapsto |0\rangle \qquad\text{or}\qquad |x_i\rangle \mapsto |\mathrm{NOT}(x_i)\rangle.$$
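
The identity $e^{-i\frac{\pi}{2}\sigma_x} = -i\sigma_x$ can be confirmed numerically. The sketch below uses a plain power series for the matrix exponential in pure Python; the helper names are made up, and the series is only meant for small 2×2 matrices such as this one.

```python
import math

SIGMA_X = [[0, 1], [1, 0]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def expm(a, terms=40):
    """Matrix exponential via its power series sum_k a^k / k!."""
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]  # a^k / k!
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

# U = exp(-i * (pi / 2) * sigma_x), i.e. H = (hbar / 2) sigma_x applied for a duration pi.
U = expm([[-1j * math.pi / 2 * x for x in row] for row in SIGMA_X])
for row in U:
    print([complex(round(z.real, 12), round(z.imag, 12)) for z in row])
```

Up to rounding, the diagonal entries vanish and the off-diagonal entries equal −i, so U swaps |0⟩ and |1⟩ up to the global phase.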

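Likewise, the matrix that is represented in the {|00⟩, |01⟩, |10⟩, |11⟩}-basis by

$$\mathrm{CNOT} = \begin{pmatrix} 1 & & & \\ & 1 & & \\ & & 0 & 1 \\ & & 1 & 0 \end{pmatrix}: \qquad
\begin{array}{c|c}
|x_i x_j\rangle & \mathrm{CNOT}|x_i x_j\rangle \\ \hline
|00\rangle & |00\rangle \\
|01\rangle & |01\rangle \\
|10\rangle & |11\rangle \\
|11\rangle & |10\rangle
\end{array}$$

acts like the CNOT gate, but on qubits. Because this matrix is unitary, it can be implemented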
by a suitable two-qubit Hamiltonian (homework). This construction generalizes to all
reversible logic gates and, as per our previous comment, any classical computation can
thus be realized by a time-dependent Hamiltonian on qubits.
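
The step from a reversible classical gate to a permutation matrix can be made concrete. This sketch builds the 4×4 CNOT matrix from the classical truth table and checks reversibility; the variable names are illustrative.

```python
from itertools import product

def cnot(control: int, target: int) -> tuple:
    """Classical CNOT: flip the target bit if and only if the control bit is 1."""
    return control, target ^ control

# Basis states in the order |00>, |01>, |10>, |11>.
basis = list(product((0, 1), repeat=2))
table = {bits: cnot(*bits) for bits in basis}

# The matrix has a 1 in row cnot(x), column x, and zeros elsewhere.
matrix = [[1 if table[col] == row else 0 for col in basis] for row in basis]
for row in matrix:
    print(row)

# Reversibility: applying CNOT twice returns every input.
print(all(cnot(*cnot(*bits)) == bits for bits in basis))
```

Any reversible gate is a bijection on bit strings, so the same recipe always yields a permutation matrix, which is in particular unitary.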
But of course, most unitaries are not permutations! A quantum gate is any unitary that
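acts on a small number of qubits. Prominent examples with no classical counterpart are

$$Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \quad \text{Z-gate}, \qquad (1.15)$$

$$P = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix} \quad \text{phase gate}, \qquad (1.16)$$

$$H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \quad \text{Hadamard gate}. \qquad (1.17)$$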

The Hadamard gate, e.g., turns basis states into uniform superpositions:

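$$\begin{aligned}
H|0\rangle &= 2^{-1/2}\,(|0\rangle + |1\rangle), \\
(H\otimes H)|0\rangle|0\rangle &= (H|0\rangle)(H|0\rangle) = 2^{-1}\,(|00\rangle + |01\rangle + |10\rangle + |11\rangle), \\
&\;\;\vdots \\
H^{\otimes n}|0,\dots,0\rangle &= 2^{-n/2} \sum_{x_1,\dots,x_n=0}^{1} |x_1,\dots,x_n\rangle.
\end{aligned}$$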
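
The last identity can be checked on a small register. The sketch stores the state vector as a plain list indexed by the integer whose binary digits are x1, …, xn and applies one Hadamard gate per qubit; the function name and the choice n = 4 are illustrative.

```python
import math

def apply_hadamard(state: list, qubit: int) -> list:
    """Apply H to one qubit of a state vector indexed by bit strings x."""
    out = state[:]
    mask = 1 << qubit
    for x in range(len(state)):
        if not x & mask:                      # x has a 0 at this qubit
            a, b = state[x], state[x | mask]  # amplitudes of ...0... and ...1...
            out[x] = (a + b) / math.sqrt(2)
            out[x | mask] = (a - b) / math.sqrt(2)
    return out

n = 4
state = [1.0] + [0.0] * (2 ** n - 1)  # |0,...,0>
for q in range(n):
    state = apply_hadamard(state, q)
print(state)
```

Only n single-qubit gates are needed to spread the amplitude evenly over all 2^n basis states, though the classical simulation above still has to store all of them.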

Grover iterations
Let $f: \{0,1\}^{\times n} \to \{0,1\}$ be a classical function as introduced in Sec. 1.4.1. To represent
f in a quantum computer, we have to consider a reversible version. The common choice
is this:

$$(x, y) \mapsto (x, y \oplus f(x)), \qquad x \in \{0,1\}^{\times n},\; y \in \{0,1\}.$$

As is the case for any reversible function, it can be expressed as a circuit consisting of
reversible classical gates. Re-interpreting these as quantum gates, we arrive at the
(n + 1)-qubit unitary

$$U_f: |x, y\rangle \mapsto |x, y \oplus f(x)\rangle$$

which can indeed be realized by a time-dependent Hamiltonian running in time
proportional to $T_n$.
At this point, we have constructed the quantum black box, and have seen how to create
superposition states using Hadamard gates. Let’s combine these two steps. For simplicity,
assume that there is a unique $x^\star$ such that $f(x^\star) = 1$. Then:

$$U_f\,(H^{\otimes n} \otimes \mathbb{1})\,|0,\dots,0\rangle = 2^{-n/2}\sum_x |x, f(x)\rangle = 2^{-n/2}\sum_{x\neq x^\star} |x, 0\rangle + 2^{-n/2}\,|x^\star, 1\rangle.$$

That’s promising, because a single invocation of the quantum black box did indeed leave
information about $x^\star$ in the output. But it’s not yet useful, because the coefficient in
front of $|x^\star, 1\rangle$ is exponentially small. Performing a measurement will reveal it only with
probability $2^{-n}$, exactly the same as a classical random guess would give.
Grover found a way to amplify the coefficient in front of the solution. His construction
involves the following elements, whose relevance will become clear soon:
1. Instead of $U_f$, which indicates whether a solution has been found by flipping an
auxiliary qubit, use

$$V_f: |x\rangle \mapsto (-1)^{f(x)}\,|x\rangle$$

which changes the sign of the coefficient for the solution. One can construct $V_f$
from $U_f$ by throwing in an extra Hadamard gate. Verifying this is homework.
2. Introduce a second unitary

$$V_\delta: |x\rangle \mapsto (-1)^{\delta(x)}\,|x\rangle,$$

where f is replaced by the “Kronecker delta for bit-strings”, i.e. $V_\delta$ flips the sign of
the coefficient for x = 0.

3. Define the Grover operator to be

$$G = (-H^{\otimes n}\, V_\delta\, H^{\otimes n})\; V_f.$$

The big claim now is that starting from $H^{\otimes n}|0\rangle$, every application of the Grover operator
G will rotate the state vector closer to $|x^\star\rangle$, hitting the target after $\simeq \frac{\pi}{4}\sqrt{2^n}$ iterations.
Proof: Define
$$|\text{☹}\rangle = \frac{1}{\sqrt{2^n - 1}} \sum_{x \neq x^\star} |x\rangle$$

to be the uniform superposition of all non-solutions. Then {|☹⟩, |☺⟩ := |x⋆⟩} form an
ONB for a two-dimensional subspace. Remarkably, the state vector will regularly end up
in this 2-dimensional space, so we can track the progress of the algorithm solely by
considering the dynamics in this small space. (This makes Grover comparatively easy to
analyze. Don’t get your hopes up, though. This never happens again). Indeed, the state
$|+\rangle = H^{\otimes n}|0\rangle$ can be expanded as

$$|+\rangle = \sqrt{\frac{2^n - 1}{2^n}}\; |\text{☹}\rangle + \frac{1}{\sqrt{2^n}}\; |\text{☺}\rangle.$$

As you can see, the initial superposition is almost parallel to the non-solutions |☹⟩. The
angle they enclose is

$$\theta := \angle(|+\rangle, |\text{☹}\rangle) = \arcsin\Big(\frac{1}{\sqrt{2^n}}\Big) \simeq \frac{1}{\sqrt{2^n}}$$

(an excellent approximation for reasonably large n). Now the application of $V_f$ changes
the sign of the coefficient in front of |☺⟩. Geometrically, this corresponds to a reflection
about the plane orthogonal to |☺⟩. Likewise,

$$-H^{\otimes n}\, V_\delta\, H^{\otimes n} = H^{\otimes n}\,(-\mathbb{1} + 2|0\rangle\langle 0|)\,H^{\otimes n} = -\mathbb{1} + 2|+\rangle\langle +|$$

is a reflection about |+⟩. The combination of two reflections is a rotation, and a simple
geometric analysis in the |☹⟩-|☺⟩ plane (Fig. 1.8) shows it is by an angle of 2θ toward
the solution vector |☺⟩. It is reached after k iterations of G, for

$$\theta + k\,2\theta = \frac{\pi}{2} \quad\Leftrightarrow\quad k = \frac{\pi}{4\theta} - \frac{1}{2} \simeq \frac{\pi}{4}\sqrt{2^n} \qquad (1.18)$$
as claimed.
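
The rotation picture can be tested against a brute-force simulation of the full state vector. This is a sketch for a small register: the sign flip implements V_f, and the reflection −1 + 2|+⟩⟨+| is applied using the fact that it maps each amplitude a_x to 2·mean − a_x. The marked bit string and n = 10 are hypothetical choices.

```python
import math

def grover_success_probability(n: int, solution: int, iterations: int) -> float:
    """Run `iterations` Grover steps on a 2^n-dimensional state vector."""
    size = 2 ** n
    state = [1.0 / math.sqrt(size)] * size   # |+> = H^{(x)n}|0>
    for _ in range(iterations):
        state[solution] = -state[solution]   # V_f: flip the sign of the solution
        mean = sum(state) / size             # -1 + 2|+><+| maps a_x -> 2*mean - a_x
        state = [2.0 * mean - a for a in state]
    return state[solution] ** 2

n, solution = 10, 0b1011001110               # hypothetical x*
theta = math.asin(1.0 / math.sqrt(2 ** n))
k = round(math.pi / (4 * theta) - 0.5)       # Eq. (1.18), rounded to an integer
print(k, grover_success_probability(n, solution, k))
print(math.sin((2 * k + 1) * theta) ** 2)    # prediction of the rotation picture
```

The two printed probabilities should agree to numerical precision, since after k iterations the state encloses the angle (2k + 1)θ with |☹⟩.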
Remarks:
• Don’t run Grover for too long! Otherwise, you’ll rotate past the solution |☺⟩.
• Don’t worry if (1.18) has no integer solution. If ⟨ψ|x⋆ ⟩ = 1 − ϵ, you’ll get a
wrong solution x ̸= x⋆ with probability ≃ 2ϵ. But by assumption, we can check the
solution efficiently by computing f (x). If f (x) = 0, just rerun the algorithm.
• While impressive in its generality, the practical utility of Grover’s algorithm is lim-
ited. The “square root speedup” isn’t as large as the exponential speedup some
quantum algorithms promise. What is more, quantum computers are much harder
to build than classical ones and might require a substantial overhead to compensate
for errors. On top of all that, Grover, unlike an exhaustive classical search, cannot
be parallelized. Thus the square root advantage might only materialize for n’s such
that not only $2^n$, but already $\sqrt{2^n}$ is astronomical.
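
The first remark can be illustrated with the closed form that follows from the rotation picture: after k iterations the success probability is sin²((2k + 1)θ). The sketch below evaluates it for a hypothetical n = 20; the function name is made up.

```python
import math

def success_probability(n: int, k: int) -> float:
    """Success probability after k Grover iterations, from the rotation picture."""
    theta = math.asin(1.0 / math.sqrt(2 ** n))
    return math.sin((2 * k + 1) * theta) ** 2

n = 20
k_opt = round(math.pi / 4 * math.sqrt(2 ** n))
for k in (0, k_opt // 2, k_opt, 2 * k_opt):
    print(k, success_probability(n, k))
```

Running twice as long as recommended rotates the state almost all the way back to where the solution has negligible weight.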

Figure 1.8: Time evolution of the Grover algorithm. Left panel: A Grover iteration performs
two reflections that combine to a rotation by 2θ toward the target state |☺⟩ = |x⋆⟩.
Angles not to typical scale! Right panel: The effect of consecutive Grover rotations.

Summary

• The exponential size of the many-body Hilbert space can potentially be put
to use to solve classically hard computational problems.
• Time evolutions of few qubits are described by small unitaries, which are
called quantum gates and generalize classical logic gates.
• Classical computations can be made reversible and reversible gates
re-interpreted as unitaries. This way, classical subroutines can be evaluated
on superpositions of inputs. The resulting state carries information
about their global behavior. Putting this information into a form that can be
read out may require non-trivial efforts (e.g. Grover iterations).

1.5 Bell inequalities and their implications


Classical mechanics tells you what is happening. Quantum mechanics only tells you what
you will observe when you measure. It does not assign values to unmeasured physical
properties.
From the early days of the theory, some scientists – famously Albert Einstein (Fig. 1.9)
– saw this as a sign that quantum mechanics was incomplete, and should be supplanted
by a more detailed description of Nature that does track the time evolution of all physical
properties, measured or not.
In what I feel is one of the most profound findings of modern physics, this program
has since been proven to be impossible: The hypothesis

“Physical properties exist independently of measurements” (1.19)

has been experimentally falsified as a general property of Nature! On top of the surprising
conclusion, this is remarkable because (1.19) feels like a philosophical statement that is
too vague to have testable implications. Yet here we are.
In the following derivation, we have to keep in mind that we want to reason about
theories different from quantum mechanics. This means that we cannot use any concept
that has a meaning only in the context of QM. “Hilbert space”, “entanglement”,
“commutators”, even “photon”... all these terms are verboten until further notice.²

Figure 1.9: Left panel: 1935 New York Times headline reporting on the
Einstein-Podolsky-Rosen paper arguing that quantum mechanics was incomplete. I wonder how Podolsky
and Rosen felt about the framing. Right panel: 2015 New York Times headline reporting
on Einstein being wrong.

Goals
The goals of this section? You got to be kidding me! Understand that, of course.
This has got to be one of the coolest things physics has to offer.

1.5.1 The CHSH scenario


Our challenge now is to come up with a setting in which the vague statement (1.19) leads to
quantitative predictions that can be compared to experiments. The most important case is
the so-called CHSH scenario (Fig. 1.10). While not difficult to understand, it does contain
quite a number of elements that seem ill-motivated at this point. Please bear with me for a
moment.
The scenario contains two observers, Alice and Bob, located at different ends of a
laboratory. There’s a box in the middle. In regular intervals, it emits two systems, one
flying to Alice and one to Bob. Each observer has two measurement devices, labeled 1
and 2. The devices work like this: They have an entry port and when one of the systems
coming from the central box enters a device, one of two lights will flash. The lights are
labeled +1 and −1 respectively. Every time a pair of systems leaves the central box, Alice
and Bob choose one of their measurement devices at random, put it in the path, and record
the observed outcomes.
² Physicists talking about Bell inequalities have a tendency of emphasizing entanglement, or the singlet state
and how the fact that it’s spin-0 means that angular momentum measurements are anti-correlated, and some such
things. These are not wrong and even mildly helpful for the design of experiments that lead to the falsification
we are after. All this is also completely secondary to the main point; a case of people sticking to their comfort
zone.

Figure 1.10: The ingredients of the CHSH scenario (for Clauser, Horne, Shimony and
Holt). Two experimentalists are located at different ends of a laboratory. Each can perform
one of two measurements on systems emanating from a box in the middle. Surprisingly, the
analysis of the set of correlations that are compatible with this extremely vaguely defined
scenario offers profound insights!

OK, some Q&A’s:

• Q.: So what’s up with the talk of “systems”? What are these? Photons? Spins?
A.: Unspecified. For now, these could be puffs of hot air and the measurement
devices random number generators. Our analysis does not depend on assumptions
about their nature. (Also, what’s a photon?)
• Q.: Are Alice’s devices 1 and 2 different? Is Alice’s device 1 different from Bob’s
device 1?
A.: We do not need to make any assumptions about this.
• Q.: Why are the outcomes labeled ±1?
A.: That’s not really essential. This particular choice will work well with our analy-
sis, though.
• Q.: Can Alice rig her boxes together such that she can perform both measurements
on the same incoming system?
A.: For all we know at this point... maybe?
• Q.: Look man. You are clearly just avoiding my questions. Why don’t you study your
system first, and come back once you can give specific answers?!
A.: You got it backwards! The fewer assumptions I need to make, the more generally
applicable my conclusions will be.³
• Q.: How in the world does one come up with this?
A.: Well, it took physics a few decades. Also, literal Einstein missed it.
With the setup established, let’s look at the lab book produced by A&B. Here’s a
possible snapshot:
Alice Bob

i A1 A2 B1 B2
1 + −
2 + +
3 − −
4 + +
⋮

³ I once had a long discussion with a colleague who refused to concede this point, despite me applying all the
logic, persuasion, and appeals to authority I could muster. Very frustrating.


Obviously, in each round i, both Alice and Bob can fill out only the column corresponding
to the measurement they chose to make.
We will now argue that Assumption (1.19) puts quantitative constraints on the type of
data that can appear in this setting. Later, we will see that there are experiments that violate
these constraints—thereby disproving the general validity of (1.19). (Also, QM predicts
the violations correctly. That’s also interesting, but less relevant).
Concretely, if physical properties exist independently of observations, then there exists
a complete table, say
        Alice        Bob
  i    A1   A2    B1   B2
  1    +    −     −    −
  2    +    −     +    +
  3    −    −     +    −
  4    +    +     +    −
  ⋮    ⋮    ⋮     ⋮    ⋮
and in each round, A&B just decide which of the pre-existing values to uncover.
In what may feel like an unmotivated move even by the standards of the present discussion,
associate the expression

\[ C = A_1 B_1 + A_1 B_2 + A_2 B_1 - A_2 B_2 \]

with each complete row. There’s an elegant geometric construction that leads to this
particular formula (the keyword is Bell polytope) – but it takes some time to develop, so
let’s just work with C regardless of where it comes from. In our example:
        Alice        Bob
  i    A1   A2    B1   B2     C
  1    +    −     −    −     −2
  2    +    −     +    +      2
  3    −    −     +    −     −2
  4    +    +     +    −      2
  ⋮    ⋮    ⋮     ⋮    ⋮      ⋮
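Despite being the sum of four terms each valued ±1, the expression (in fact: its absolute
value) is upper-bounded by 2: Factoring out Alice’s variables and applying the triangle
inequality,

\[ |C| = |A_1 (B_1 + B_2) + A_2 (B_1 - B_2)| \le |B_1 + B_2| + |B_1 - B_2| = 2. \]

The last equality holds because B1 and B2 are each ±1: if they agree, the first term is 2 and
the second vanishes; if they differ, it is the other way around.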
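If you don't trust the triangle inequality, the bound can also be checked by brute force, since a complete row has only 16 possible sign patterns. The following is a minimal sketch; the function name `chsh` is just an illustrative choice.

```python
from itertools import product

def chsh(a1, a2, b1, b2):
    """The CHSH expression C for one complete row of the table."""
    return a1 * b1 + a1 * b2 + a2 * b1 - a2 * b2

# Enumerate all 2^4 assignments of +1/-1 to (A1, A2, B1, B2).
values = [chsh(*row) for row in product((+1, -1), repeat=4)]

# C only ever takes the two extreme values allowed by the bound.
print(sorted(set(values)))
```

So for a complete row, C is not merely bounded by 2 in absolute value: it always equals exactly +2 or −2.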

It may seem that we can’t extract observable predictions out of this discussion, because
the expression C involves all four variables, and by assumption, we only have access to
two of them in each round. But there’s a nice trick to get around this! Indeed, if C ≤ 2 in
every run, then the same is true of the average

\[ \langle C \rangle = \frac{1}{N} \sum_{i=1}^{N} C^{(i)} \]

over N runs. But averages are linear, and therefore ⟨C⟩ equals

⟨A1 B1 + A1 B2 + A2 B1 − A2 B2 ⟩ = ⟨A1 B1 ⟩ + ⟨A1 B2 ⟩ + ⟨A2 B1 ⟩ − ⟨A2 B2 ⟩.

Each of the four terms ⟨Ai Bj ⟩ can be estimated by A&B! If they choose their settings
at random, then by the law of large numbers (or, quantitatively, by the Chernoff bound),
their observed mean will converge to the true expected value in the limit of large N . Thus,
Assumption (1.19) implies that the linear combination of these four experimentally acces-
sible numbers is no larger than 2, up to statistical fluctuations that vanish in the large-N
limit. Such a test of (1.19) is called a Bell inequality.
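The estimation procedure can be simulated. The sketch below assumes a pre-existing complete table (i.e. it assumes (1.19) holds), lets A&B reveal only one entry each per round with uniformly random settings, and combines the four empirical means. The helper name `estimate_chsh`, the seed, and the two example tables are illustrative assumptions, not part of the argument.

```python
import random

def estimate_chsh(table, rng):
    """Estimate <C> when only one A and one B entry per row are revealed."""
    sums = {(i, j): 0 for i in (0, 1) for j in (0, 1)}
    counts = {(i, j): 0 for i in (0, 1) for j in (0, 1)}
    for a1, a2, b1, b2 in table:
        # Settings are drawn independently of the hidden values in the row.
        i, j = rng.randrange(2), rng.randrange(2)
        sums[i, j] += (a1, a2)[i] * (b1, b2)[j]
        counts[i, j] += 1
    means = {key: sums[key] / counts[key] for key in sums}
    return means[0, 0] + means[0, 1] + means[1, 0] - means[1, 1]

rng = random.Random(0)
n_rounds = 20000

# A table with independent uniformly random entries, and one where C = 2 in every row.
random_table = [tuple(rng.choice((1, -1)) for _ in range(4)) for _ in range(n_rounds)]
extremal_table = [(1, 1, 1, 1)] * n_rounds

estimate_random = estimate_chsh(random_table, rng)
estimate_extremal = estimate_chsh(extremal_table, rng)
print(estimate_random, estimate_extremal)
```

Whatever table you feed in, the estimate stays at or below 2 up to statistical fluctuations, even though no single round ever reveals a full row.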
Following up on pioneering works that led to the 2022 Nobel Prize, it is today fairly
routine to perform experiments that are compatible with the CHSH setup and yield a value
of ⟨C⟩ ≃ 2.7.
Thus, Assumption (1.19) must be rejected as a general feature of Nature.
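For comparison, here is what the quantum formalism predicts. This is a sketch under explicit assumptions: the box emits the two-spin singlet state, each party measures a spin component in the x-z plane, and the four angles below are one standard choice that maximizes the value. None of this is needed for the argument above, which only uses the measured number.

```python
import math

def spin(theta):
    """2x2 observable cos(theta) Z + sin(theta) X, with eigenvalues +1 and -1."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

def kron(a, b):
    """Kronecker product of two 2x2 matrices as a 4x4 nested list."""
    return [[a[i][k] * b[j][l] for k in range(2) for l in range(2)]
            for i in range(2) for j in range(2)]

def expectation(op, psi):
    """<psi|op|psi> for a real 4-vector psi and a real 4x4 matrix op."""
    return sum(psi[r] * op[r][c] * psi[c] for r in range(4) for c in range(4))

# Singlet (|01> - |10>)/sqrt(2) in the basis |00>, |01>, |10>, |11>.
singlet = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]

# Assumed measurement directions (angles in the x-z plane).
alice = [spin(0.0), spin(math.pi / 2)]
bob = [spin(5 * math.pi / 4), spin(3 * math.pi / 4)]

E = [[expectation(kron(a, b), singlet) for b in bob] for a in alice]
chsh_value = E[0][0] + E[0][1] + E[1][0] - E[1][1]
print(chsh_value, 2 * math.sqrt(2))
```

The predicted value is 2√2 ≈ 2.83, so an experimental value around 2.7 is what one would expect from a slightly imperfect realization.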

1.5.2 Operational consequences of Bell inequality violations


The existence of Bell inequality violations implies some interesting “no-go theorems”, i.e.
statements showing that certain processes are impossible (similar to how the second law of
thermodynamics rules out the existence of perpetual motion machines). In the literature,
these results are usually derived relying on the quantum mechanical formalism. But it’s
both easier and more fundamental to conclude them purely from empirically observed
violations of (1.19).
Further reading: The exposition in this section is, unfortunately, not commonly found
in textbooks. It is based on this paper, which should be better known!

Joint measurements
Recall the Heisenberg uncertainty principle Varψ[X] Varψ[P] ≥ ℏ²/4. It is often verbally
summarized as stating that “position and momentum can’t be measured simultaneously.”
But the relation says no such thing. (Rather, it says that there’s no state |ψ⟩ that would
cause both position and momentum measurements to produce arbitrarily sharply concen-
trated outcomes.)
It is still true, however, that position and momentum cannot be measured simultane-
ously. What is more, this is true for any pair of observables that Alice can use in an
experiment that violates the CHSH inequality. Even better: This no-go statement does not
assume the validity of QM, but is an empirical fact about the universe we live in.
To state the result, we first have to say what we mean by “joint measurement”, again
without using quantum-mechanical concepts. Let’s say two measurement devices are
equivalent if they give the same probability distribution over outcomes for every possible input
(Fig. 1.11). Now consider two measurements 1, 2, say with two outcomes each. A joint
measurement machine for 1, 2 is a device J with two pairs of outcomes (Fig. 1.12). It
must be such that if one only considers the first pair, one obtains a measurement proce-
dure equivalent to 1; and if one only considers the second pair, one obtains a measurement
procedure equivalent to 2. The two original machines are said to be jointly measurable if
there exists a joint measurement machine for them.
Now assume that the two properties probed by Alice in the CHSH scenario are jointly
measurable and that the same is true for the two properties measured by Bob. They could
then use joint measurement machines to produce a complete table, with all properties
A1 , A2 , B1 , B2 provided in every round. The definition of a joint measurement machine
and of equivalent measurements implies that, for each pair i, j, the marginal distributions for

Figure 1.11: Top panel: Each physical property can be measured in many equivalent ways.
Bottom panel: Formalization of this observation for probabilistic theories. Two measurement
devices 1, 1′ are equivalent if for every preparation procedure P, measuring 1 or 1′ leads
to identical probability distributions over outcomes.

Figure 1.12: (i) Two two-outcome measurement devices, 1 and 2, like the ones held by
Alice in the CHSH scenario. They are jointly measurable if there exists a measurement
device J that produces two pairs of outcomes such that: (ii) The first pair (cyan) alone
defines a measurement that is equivalent to 1, and (iii) The second pair (pink) alone defines
a measurement that is equivalent to 2.

Ai Bj that arise this way are identical to the ones that the original measurement devices
realize. In particular, the correlation function C must be the same in both cases. But, as
proven above, in this case |C| ≤ 2.
The contrapositive: In a universe where the CHSH inequality can be violated (such as
ours), there must be pairs of physical properties that cannot, as a matter of principle, be
jointly measured. This is a remarkably far-reaching statement to follow from empirical
observations alone!

• Q.: Wait. In our earlier Q&A, you said that as far as you knew, Alice could measure
her two properties jointly.
A.: And that was the right answer at that point in the analysis! We didn’t have to
assume incompatibility. We derived it. Like the cool kids.

No cloning
Define a universal cloning machine to be a process that takes one physical system as input
and outputs two systems such that: Applying any measurement device to the first or to
the second output is equivalent to applying it to the input. It is clear that the existence of
a universal cloner implies the existence of a joint measurement machine for any pair of
properties (Fig. 1.13). Again, we conclude that in a universe where CHSH violations are
observed, cloning is impossible.

Figure 1.13: Top: A universal cloning machine (i) is a device that takes one physical
system as input and outputs two physical systems, where each of the outputs is indistin-
guishable from the input under any measurement (ii), (iii). Bottom: A cloner can be used
to construct a joint measurement machine.

There’s a famous paper (cited in an academic publication about once every day!) that
derives the no-cloning theorem from quantum mechanics. Here’s their proof: If U is an
operator that “clones two orthogonal states” in that

U |0⟩ = |00⟩, U |1⟩ = |11⟩,

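then by linearity,

\[ U \tfrac{1}{\sqrt{2}} \big(|0\rangle + |1\rangle\big) = \tfrac{1}{\sqrt{2}} \big(|00\rangle + |11\rangle\big) \neq \tfrac{1}{\sqrt{2}} \big(|0\rangle + |1\rangle\big) \otimes \tfrac{1}{\sqrt{2}} \big(|0\rangle + |1\rangle\big), \]
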
so it necessarily fails to clone superpositions of the two states. That’s cool and all, but note
that it assumes the validity of quantum mechanics, whereas our argument doesn’t!
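The linearity argument is easy to check with explicit vectors. In the sketch below, `clone_by_linearity` is a hypothetical helper that applies the unique linear extension of U|0⟩ = |00⟩, U|1⟩ = |11⟩; it is compared against the state a true cloner would have to produce.

```python
import math

# Two-qubit amplitudes are listed in the order |00>, |01>, |10>, |11>.

def clone_by_linearity(psi):
    """Apply the linear map with U|0> = |00> and U|1> = |11> to psi = (a, b)."""
    a, b = psi
    return [a, 0.0, 0.0, b]

def tensor(psi, phi):
    """Product state psi (x) phi of two single-qubit vectors."""
    return [x * y for x in psi for y in phi]

plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]

actual = clone_by_linearity(plus)   # what linearity forces U to output
wanted = tensor(plus, plus)         # what a cloner would need to output
overlap = sum(x * y for x, y in zip(actual, wanted))
print(overlap)
```

The overlap between the two normalized vectors is 1/√2 rather than 1, so they are different states.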

True randomness
Assume I put a die in a cup, shake it vigorously, and put the cup upside down on a table.
Nobody will have any idea which number the die shows, so one might well model the
situation by ascribing a probability of 1/6 to any of the possible outcomes. But note that
this description only reflects my ignorance about the true state of the die. There is no
doubt that some side is facing up even before I lift the cup. In fact, it is conceivable
in principle that a computer coupled to a camera that captured my motions might solve
Newton’s equations and predict the state of the die accurately. Let’s refer to a variable as
pseudo-random if such a prediction is possible in principle, and as truly random otherwise.
A priori, it is unclear whether true randomness exists at all.
But CHSH violations are only possible if the outcomes of Alice and Bob are truly
random. For if some process could predict the outcomes, it could do so independently of

which property they choose to measure. It could therefore predict the full table, and we
are back at the proof by contradiction outlined above.
The fact that no outside observer can predict the outcomes of Alice and Bob means
that they are, in this sense, “private” to them. This observation is the basis of provably
secure quantum key distribution protocols.

1.5.3 Interpretations
We have presented a negative argument that rules out the classical model of a world that
evolves independently from observations. This argument is widely accepted today. However,
there is no positive agreement on what, if anything, should replace it. Below are some
common reactions as I see them.
The orthodox position is to say that the purpose of science is to make empirically
testable predictions. QM excels at this task. Counterfactual questions about “what would
have happened had you measured something else” just amount to storytelling and lie out-
side the remit of science. So Bell is interesting for its operational consequences (Sec. 1.5.2),
but philosophically, there’s not much to be done other than to shrug and move on.
Problems with this position: (1) It is rather unambitious. Theoretical physics has his-
torically offered more than just the ability to predict detector click patterns. To just dis-
allow hypotheticals feels like giving up too early. (2) The elements of reality critique
explained next.
The Bohmians point out that sometimes, one can predict the outcome of a measure-
ment with 100% certainty. (E.g., for a system in the singlet state, when Alice measured
spin along one axis and obtained ↑, Bob will definitely obtain ↓ w.r.t. the same axis). They
argue that in such situations, reality doesn’t change if the now somewhat redundant mea-
surement is performed – so that if we consider outcomes to be real, there must already
have been some element of reality representing them before the measurement. Speaking
in terms of the lab books we analyzed above, they therefore posit that there always is a full
table representing the true state of all elements of reality at any time, measured or not.
By Bell’s argument, the table can’t be independent of the measurements made. A
more detailed analysis shows that one can accommodate CHSH violations only if Bob’s
variables change as a result of Alice interacting with her side of the joint system (and / or
vice versa). There is a simple model developed by David Bohm showing that QM can in
principle be interpreted in such a realistic (i.e. properties have values whether measured
or not) but non-local (i.e. the unmeasured parameters change due to actions far away)
way. In Bohm’s model, the change of unmeasured parameters happens in a subtle way
that is strong enough to enable CHSH violations, but too weak to allow for the exchange of
faster-than-light signals between far away parties.
Therefore, the Bohmians argue, such a description is both necessary (by the
elements-of-reality argument) and possible (by Bohm’s model). There is thus no paradox, and we
should concentrate on working out the details.
The problem with this position is that you get into tension with special relativity even in
the absence of superluminal signals. Recall that if A&B’s actions are space-like separated,
their order in time is observer-dependent. So how can I think about Alice’s actions causing
change at Bob’s end, when in some reference frames, Bob acted first?
Some proponents of Bohm’s program bite this bullet and acknowledge that realist in-
terpretations of QM imply that there must be a distinguished frame of reference in the
universe. They may not like having to break Lorentz invariance, but they like attempts to
discuss away the elements-of-reality critique even less. Needless to say, having seen what

the Michelson–Morley experiment did to the concept of the luminiferous ether, mainstream
physicists are highly reluctant to re-introduce distinguished reference frames.
The loopholers maintain that there are further implicit assumptions in the analysis,
some of which have to be rejected. After improved experimental techniques in the past
few years, the (unfortunately-named) free will loophole is the last major one standing.
Recall that we have assumed that A&B choose their settings randomly. More precisely,
the empirical means for Ai Bj only converge to the expected values ⟨Ai Bj ⟩ if the probabil-
ity of choosing a setting is independent of its value. (Think of an election pollster calling
random citizens on their landlines during work hours, to ask about their voting intentions.
Retired people are more likely to answer the phone—potentially skewing the result, as their
voting preferences are different from the population as a whole). But A&B are physical
systems, too! They share a common history with the central box. It is therefore unjustified,
it is argued, to assume that they can make independent choices.
Problems: (i) The position “proves too much”. It seems like it can be used as a general
argument against all of empirical science (“apples mostly fall upwards, but we only look
when they happen to fall down”). (ii) One can design the choice function of A&B in such
a way that it would take one sophisticated cosmic conspiracy to still produce a CHSH
value of 2.7. People have performed Bell experiments where the settings were driven by
fluctuations in the cosmic background radiation measured at different sections of the night
sky, XOR’ed against the input of internet users participating in an online action game.
The many-worlders contend that QM anyway has a philosophical problem (the one
we didn’t address in Sec. 1.3.1), so let’s fix all issues in one fell swoop. They then throw
out the measurement postulate and posit that there exists a “wave function of the universe”
that evolves under a global Hamiltonian. The reality we experience is an emergent feature
of this wave function – not a pre-existing concept like in standard QM.
Without a measurement postulate that will probabilistically pick one “branch of a su-
perposition”, all of them have an equal right to being considered as “real”. For example,
if |☺⟩ is the state of all of my elementary particles that correlates with me feeling happy,
then summands in a superposition state like α|↑⟩|δt⟩|☺⟩ + β|↓⟩|−δt⟩|☹⟩ (encountered in
Eq. (1.14)) should be interpreted as different co-existing “worlds” (or branches) in which
my feelings are correlated with other degrees of freedom of the universe. In particular, in a
CHSH experiment, all possible outcomes are simultaneously realized in different branches
of the wave function. Any philosophical problem tied to the assumption that only one
branch actually occurs is thus spurious.
The problem here is that the measurement postulate, clunky as it may be, is what
connects the formalism to reality! If you claim it’s unnecessary, it’s on you to re-derive
the empirical content of the theory in this reduced framework. One important touchstone
is the Born rule which says in this language that “if my wave function splits into two
branches with amplitudes α and β, I experience these with probability |α|², |β|² respectively”.
Researchers working on many-world formulations therefore spend a lot of time
thinking about probabilities and their interpretation (but, to my personal taste, haven’t
cracked this nut yet).

1.6 Further reading


To repeat the basics:

• Quantum Mechanics by Leslie Ballentine is a nice presentation that’s somewhat
more careful than many textbooks without being too mathematical.
• Quantum Mechanics 1 & 2 by Cohen-Tannoudji and friends contains an enormous
amount of optional material for each chapter. It can thus both be used as an
introductory textbook and as a reference.
• Modern Quantum Mechanics and Advanced Quantum Mechanics by Sakurai will
also be used for later parts of this course.

The quantum model of the measurement process is described in Chapter 12 of Quantum
Theory by Asher Peres. A classic volume on decoherence theory is Decoherence and the
Appearance of a Classical World in Quantum Theory by Joos, Zeh, Kiefer (of Cologne),
Giulini, Kupsch, and Stamatescu. A standard introduction to quantum computing is Quantum
Computation and Quantum Information by Nielsen and Chuang. The discussion of the
operational consequences of Bell violations follows Quantum Information Theory: An
Invitation by Reinhard Werner.
Chapter 2

Indistinguishable particles

2.1 Bosonic and Fermionic Hilbert spaces


The tensor product construction (Sec. 1.2.1) of the Hilbert space of two distinguishable
particles was guided by the need to represent observables for properties of the first or
second particle alone. (For example, “What is the expected position of the first particle?”,
or “Does the second particle’s spin point up?”, which turned out to be represented by
observables of the form A ⊗ 1 and 1 ⊗ B respectively). However, electrons, say, seem
to be indistinguishable in the sense that any experiment that is sensitive to one electron
will be equally sensitive to any other. It thus makes sense to search for a joint Hilbert
space that only supports observables like “What is the expected position averaged over
all particles?” or “How many particles have their spins pointing up?” that do not involve
unphysical references to specific particles.

The same issue already arises in classical mechanics, where e.g. the two configurations
(q1, q2) and (q2, q1) of point particles describe the same physics. In mechanics,
this redundancy does not usually seem to lead to wrong predictions (unlike the QM
case, as we will see shortly). There’s some indication that things are amiss, though.
The Gibbs paradox says that the classical thermodynamical treatment of a gas of
identical particles does give wrong results unless indistinguishable configurations
are counted only once. However, this doesn’t quite falsify the redundant formu-
lation of classical mechanics, as the connection between this microscopic theory
and thermodynamics depends on unproven assumptions (e.g. the maximum entropy
principle), and so the problem could lie somewhere else.

There’s a simple construction that seems to account for all fundamental particles. Let
H(1) be a single-particle Hilbert space with basis {|i⟩}. If the particles were distinguishable,
a general element of the n-body joint Hilbert space would be

\[ |\psi\rangle = \sum_{i_1,\dots,i_n} \psi_{i_1,\dots,i_n}\, |i_1,\dots,i_n\rangle \in (\mathcal{H}^{(1)})^{\otimes n}. \]

Let’s look for subspaces of (H(1) )⊗n that make sense for indistinguishable particles. Let
τkl be the operator that exchanges the k-th and the l-th factor:

τkl (| . . . , ik , . . . , il , . . . ⟩) = | . . . , il , . . . , ik , . . . ⟩.

If the particles are indistinguishable, then |ψ⟩ and τkl |ψ⟩ should describe the same physics.
This is certainly true if they differ at most by a phase factor. Because τkl² = 1, such a


phase must be ±1. The totally symmetric or Bosonic subspace Symn (H(1) ) consists of all
vectors such that

τkl |ψ⟩ = |ψ⟩ ∀k, l.

The totally anti-symmetric or Fermionic subspace ∧n (H(1) ) (“wedge-n”) consists of all


vectors such that

τkl |ψ⟩ = −|ψ⟩ ∀k, l.

At this point, many texts “prove” that the constructions leading to Fermions and
Bosons are the only conceivable ways of building a quantum theory of indistinguishable
particles. I find all these arguments inconsistent and unhelpful to the
degree that I’m prepared to claim the world would be better if they were all just
forgotten. Ask me about it, or maybe don’t.

2.1.1 Permutations and occupation numbers


The operators τkl (called transpositions) generate the group Sn of all permutations of
the n factors. Recall that a permutation is a way of re-arranging the symbols 1, 2, . . . , n
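(Fig. 2.1). The sign of a permutation π is

\[ \operatorname{sgn}(\pi) = \begin{cases} +1 & \text{if } \pi \text{ is a product of an even number of transpositions,} \\ -1 & \text{if } \pi \text{ is a product of an odd number of transpositions.} \end{cases} \]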

The Bosonic and Fermionic Hilbert spaces can therefore also be defined as the sets of
vectors such that

π|ψ⟩ = |ψ⟩, Bosons


π|ψ⟩ = sgn(π)|ψ⟩ Fermions

for all permutations π ∈ Sn .

You have encountered this concept before, in the definition of the determinant of an
(n × n)-matrix:

\[ \det M = \sum_{\pi \in S_n} \operatorname{sgn}(\pi) \prod_{i=1}^{n} M_{i,\pi(i)}. \tag{2.1} \]
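Formula (2.1) can be turned into code almost verbatim. The sketch below is only meant to illustrate the role of sgn(π); it computes the sign from the parity of the number of inversions (which agrees with the parity of the number of transpositions) and sums over all n! permutations, so it is hopelessly slow for large n. The function names are illustrative.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det_leibniz(m):
    """Determinant of a square nested list via the sum over permutations (2.1)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= m[i][perm[i]]
        total += term
    return total

print(det_leibniz([[1, 2], [3, 4]]))
print(det_leibniz([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))
```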

Figure 2.1: (a) A permutation can be represented as a graph, where each position indicates a letter,
and where the arrows point to where each letter is mapped. (b) One can multiply permutations π1
and π2 by performing one after the other, giving π2π1.

How many permutations of n letters are there? There are n ways of choosing a new
place for the first symbol, then n − 1 ways for the second symbol (as we can’t repeat the
first one), etc, for a total of

|Sn | = n(n − 1) · · · 2 · 1 = n! .

This explains the various “factorials” that will appear in formulas below.
We can now find bases for the Bosonic / Fermionic subspaces. Indeed, if

\[ |\psi\rangle = \sum_{i_1,\dots,i_n} \psi_{i_1,\dots,i_n}\, |i_1,\dots,i_n\rangle \]

is Bosonic, then |ψ⟩ = π|ψ⟩ for all π and thus

\[ |\psi\rangle = \frac{1}{n!} \sum_{\pi} \pi |\psi\rangle = \sum_{i_1,\dots,i_n} \psi_{i_1,\dots,i_n} \Big( \frac{1}{n!} \sum_{\pi} \pi |i_1,\dots,i_n\rangle \Big). \tag{2.2} \]

The vector in parentheses only depends on the number of times nk each single-particle
basis element |k⟩ appears in the product |i1⟩ . . . |in⟩. This motivates the definition of the
occupation number basis. For ni ∈ {0, 1, 2, . . . } such that Σi ni = n, set

\[
\begin{aligned}
|n_1, n_2, \dots\rangle
&:= \frac{1}{\sqrt{n! \prod_k n_k!}} \sum_{\pi \in S_n} \pi\, |\underbrace{1,\dots,1}_{n_1 \times}, \underbrace{2,\dots,2}_{n_2 \times}, \dots\rangle \\
&= \frac{1}{\sqrt{n! \prod_k n_k!}} \sum_{\pi \in S_n} \pi \big( |1\rangle^{\otimes n_1} |2\rangle^{\otimes n_2} \cdots \big).
\end{aligned}
\tag{2.3}
\]

The funky factorial factor makes the vector normalized (check it!). By (2.2), any Bosonic
state vector can be expanded in the occupation number basis.
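Here is one way to do the “check it!” numerically. The sketch builds the vector (2.3) for given occupation numbers by explicitly summing over all n! permutations. Vectors in the n-fold tensor product are stored as dictionaries mapping a tuple of single-particle labels to an amplitude; that representation and the function names are choices made for this example.

```python
from itertools import permutations
from math import factorial, prod, sqrt

def bosonic_state(occ):
    """Vector |n_1, n_2, ...> of Eq. (2.3) as a dict {label tuple: amplitude}."""
    n = sum(occ)
    # Reference product state |1,...,1,2,...,2,...> (labels start at 1).
    ref = tuple(k + 1 for k, nk in enumerate(occ) for _ in range(nk))
    norm = sqrt(factorial(n) * prod(factorial(nk) for nk in occ))
    vec = {}
    for perm in permutations(range(n)):
        key = tuple(ref[p] for p in perm)      # permuted tensor factors
        vec[key] = vec.get(key, 0.0) + 1.0 / norm
    return vec

def norm_squared(vec):
    return sum(amp * amp for amp in vec.values())

for occ in [(2, 0), (1, 1), (2, 1, 0), (3, 2, 1)]:
    print(occ, norm_squared(bosonic_state(occ)))
```

Each distinct arrangement of labels is hit by exactly Πk nk! permutations, which is where the extra factorials under the square root come from.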

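Recall the triplet states of two spin-1/2 particles

\[ |{\uparrow\uparrow}\rangle, \qquad \tfrac{1}{\sqrt{2}} \big( |{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle \big), \qquad |{\downarrow\downarrow}\rangle. \]
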
They are clearly invariant under permutations of the particles. In occupation number
notation with respect to the |↑⟩, |↓⟩-basis, the triplet states are
|2, 0⟩, |1, 1⟩, |0, 2⟩.

If |ψ⟩ is Fermionic, then arguing as above gives

\[ |\psi\rangle = \sum_{i_1,\dots,i_n} \psi_{i_1,\dots,i_n} \Big( \frac{1}{n!} \sum_{\pi} \operatorname{sgn}(\pi)\, \pi |i_1,\dots,i_n\rangle \Big). \tag{2.4} \]

Anti-symmetry makes things a bit more exciting, though: Again look at the vector in
parentheses for some choice i1 , . . . , in of single-particle states. If one state occurs twice
(say ik = il ), then

| . . . , ik , . . . , il , . . . ⟩ + sgn(τkl ) τkl | . . . , ik , . . . , il , . . . ⟩ = 0

which implies that the sum is 0. Therefore, in the Fermionic occupation number basis

\[ |n_1, n_2, \dots\rangle := \frac{1}{\sqrt{n!}} \sum_{\pi \in S_n} \operatorname{sgn}(\pi)\, \pi \big( |1\rangle^{\otimes n_1} |2\rangle^{\otimes n_2} \cdots \big), \tag{2.5} \]

nk must be either 0 or 1. This explains the Pauli principle! Beware that in the Fermi
case, the sign of the occupation number basis elements (2.5) depends on an ordering of
single-particle basis vectors.
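The Pauli principle can be watched in action by anti-symmetrizing explicitly. The sketch below builds the vector (2.5) as a dictionary from label tuples to amplitudes (a representation chosen for this example) and drops entries that cancel. The sign is computed from the parity of the number of inversions.

```python
from itertools import permutations
from math import factorial, sqrt

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def fermionic_state(occ):
    """Anti-symmetrized vector of Eq. (2.5) as a dict {label tuple: amplitude}."""
    n = sum(occ)
    ref = tuple(k + 1 for k, nk in enumerate(occ) for _ in range(nk))
    vec = {}
    for perm in permutations(range(n)):
        key = tuple(ref[p] for p in perm)
        vec[key] = vec.get(key, 0.0) + sign(perm) / sqrt(factorial(n))
    # Remove amplitudes that cancelled exactly.
    return {key: amp for key, amp in vec.items() if abs(amp) > 1e-12}

print(fermionic_state((1, 1)))   # two terms with opposite signs
print(fermionic_state((2, 0)))   # doubly occupied level: nothing survives
```

With a level occupied twice, every term is cancelled by its partner under the exchange of the two equal labels, so the resulting vector is zero.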
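For the anti-symmetrization of general single-particle vectors |α1⟩, . . . , |αn⟩, one also
uses the wedge product notation

\[ |\alpha_1\rangle \wedge \dots \wedge |\alpha_n\rangle := \frac{1}{\sqrt{n!}} \sum_{\pi \in S_n} \operatorname{sgn}(\pi)\, \pi \big( |\alpha_1\rangle \otimes \dots \otimes |\alpha_n\rangle \big), \]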

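pronounced “alpha one, wedge alpha two, ...”. Wedge products are also called Slater
determinants. That’s because one can express the wedge product as a “formal determinant”:

\[
|\alpha_1\rangle \wedge \dots \wedge |\alpha_n\rangle = \frac{1}{\sqrt{n!}} \det
\begin{pmatrix}
|\alpha_1\rangle^{(1)} & |\alpha_2\rangle^{(1)} & \dots & |\alpha_n\rangle^{(1)} \\
|\alpha_1\rangle^{(2)} & |\alpha_2\rangle^{(2)} & \dots & |\alpha_n\rangle^{(2)} \\
\vdots & \vdots & \ddots & \vdots \\
|\alpha_1\rangle^{(n)} & |\alpha_2\rangle^{(n)} & \dots & |\alpha_n\rangle^{(n)}
\end{pmatrix}.
\]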

Here, the super-scripts indicate which tensor factor the vector belongs to.

The singlet state (1/√2)(|↑↓⟩ − |↓↑⟩) = |↑⟩ ∧ |↓⟩ is clearly anti-symmetric. In occupation
number notation with respect to the |↑⟩, |↓⟩-basis, it is given by |1, 1⟩.

Assume dim H(1) = d < ∞. In both the Bose and the Fermi case, the occupation
number bases give us a combinatorial way to compute the dimension of the Hilbert
spaces.
Fermions: Basis elements are labeled by subsets S ⊂ {1, . . . , d} of size |S| = n.
Thus

\[ \dim \wedge^n \mathbb{C}^d = \binom{d}{n}. \]

Bosons: Basis elements are labeled by a partition n = Σ_{i=1}^{d} ni of n into d
non-negative parts. There’s a cute combinatorial argument for computing the number of
such partitions. The answer is

\[ \dim \operatorname{Sym}^n \mathbb{C}^d = \binom{n+d-1}{n}. \tag{2.6} \]

Can you find it? (Spoiler: Search for “stars and bars”).
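Both dimension formulas can be cross-checked by simply listing the basis labels. In the sketch below, Fermionic labels are n-element subsets of the d single-particle levels and Bosonic labels are n-element multisets; the function names are illustrative.

```python
from itertools import combinations, combinations_with_replacement
from math import comb

def fermionic_dim(n, d):
    """Count subsets of size n of d levels (each level occupied at most once)."""
    return sum(1 for _ in combinations(range(d), n))

def bosonic_dim(n, d):
    """Count multisets of size n drawn from d levels (arbitrary occupations)."""
    return sum(1 for _ in combinations_with_replacement(range(d), n))

for n, d in [(2, 2), (2, 3), (3, 4), (4, 3)]:
    print(n, d, fermionic_dim(n, d), comb(d, n), bosonic_dim(n, d), comb(n + d - 1, n))
```

For n = d = 2 this gives 1 and 3, matching the singlet and the triplet of two spin-1/2 particles.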

The occupation number basis adds another possible meaning to the heavily overloaded
notation of “a list of numbers in a ket”. In particular, in

\[ |n_1, n_2, \dots\rangle = \frac{1}{\sqrt{n! \prod_k n_k!}} \sum_{\pi} \pi |1,\dots,1,2,\dots,2,\dots\rangle \]

the numbers in the ket on the l.h.s. count occupations, while the numbers in the ket on
the r.h.s. are indices of some single-particle basis. Which of these definitions is meant,
and which single-particle basis it
is relative to, and whether the occupation numbers are for Fermions or for Bosons,
or whether the numbers have nothing to do with these many-body concepts and are
more general “quantum numbers” (like the labels |n, l, m⟩ of the atomic basis) has
to be inferred from context. There’s no general, reliable rule.
Look. If I were the emperor of physics, I’d outlaw this mess. But I’m not and
everybody is using it. Once you get used to it, you’ll find that this convention
causes surprisingly few catastrophic misunderstandings.

Summary

Let H(1) be a single-body Hilbert space with basis {|i⟩}. Then a general state of n
indistinguishable particles can be expressed in the occupation number basis as

\[ |\psi\rangle = \sum_{n_1, n_2, \dots} c_{n_1,n_2,\dots}\, |n_1, n_2, \dots\rangle, \]

where the sum is over ni ∈ {0, 1, 2, . . . } for Bosons and ni ∈ {0, 1} for Fermions,
and where Σi ni = n. The occupation number basis is defined as

\[ |n_1, n_2, \dots\rangle = \frac{1}{\sqrt{n! \prod_k n_k!}} \sum_{\pi \in S_n} (\operatorname{sgn} \pi)^{\zeta}\, \pi \big( |1\rangle^{\otimes n_1} |2\rangle^{\otimes n_2} \cdots \big), \tag{2.7} \]

where ζ = 0 for Bosons and ζ = 1 for Fermions.

2.1.2 Single-particle operators


We started this section remarking that a full tensor product Hilbert space supports more
observables than are physically meaningful for indistinguishable particles. Let A(i) be an
operator acting on the i-th particle. Then (why?)

\[ \pi A^{(i)} \pi^{-1} = A^{(\pi(i))}. \]

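Thus, if |ψ⟩ is Bosonic or Fermionic,

\[
\begin{aligned}
\operatorname{tr}\big( A^{(i)} |\psi\rangle\langle\psi| \big)
&= \frac{1}{n!} \sum_{\pi \in S_n} \operatorname{tr}\big( A^{(i)} \pi |\psi\rangle\langle\psi| \pi^{-1} \big)
 = \frac{1}{n!} \sum_{\pi \in S_n} \operatorname{tr}\big( \pi^{-1} A^{(i)} \pi |\psi\rangle\langle\psi| \big) \\
&= \frac{1}{n} \sum_{j=1}^{n} \operatorname{tr}\big( A^{(j)} |\psi\rangle\langle\psi| \big).
\end{aligned}
\tag{2.8}
\]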

A measurement on any one particle is thus equal to the average over all of them – the
formalism no longer allows us to pick out the properties of individual particles.
Now assume A has an eigendecomposition

\[ A = \sum_i \lambda_i\, |i\rangle\langle i|. \]

Then for an element of the occupation number basis with respect to the eigenbasis {|i⟩} of
A, one computes from (2.7)

\[ \sum_{j=1}^{n} A^{(j)}\, |n_1, n_2, \dots\rangle = \Big( \sum_i \lambda_i n_i \Big) |n_1, n_2, \dots\rangle. \tag{2.9} \]

In particular, single-body operators are diagonal in the occupation number basis. If the
single-body eigenvalues are sorted λ0 ≤ λ1 ≤ . . . , then the lowest n-body eigenvalue in
the Bosonic case is nλ0 and in the Fermionic case λ0 + · · · + λn−1 . For Fermions, if the
λi ’s describe energies, then λn−1 , the largest energy still occupied in the ground state, is
called the Fermi energy.
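To make the last statements concrete, the following sketch lists all allowed occupation numbers for a handful of levels and evaluates the eigenvalue Σi λi ni from (2.9) for each of them. The level values and function names are made up for illustration.

```python
from itertools import product

def occupations(n, d, fermionic):
    """All occupation tuples (n_1, ..., n_d) with total particle number n."""
    max_occ = 1 if fermionic else n
    return [occ for occ in product(range(max_occ + 1), repeat=d) if sum(occ) == n]

def spectrum(levels, n, fermionic):
    """Sorted n-body eigenvalues sum_i lambda_i n_i, following Eq. (2.9)."""
    return sorted(sum(lam * k for lam, k in zip(levels, occ))
                  for occ in occupations(n, len(levels), fermionic))

levels = [0.5, 1.0, 1.5, 3.0]   # hypothetical sorted single-body eigenvalues
n = 3

bose_ground = spectrum(levels, n, fermionic=False)[0]
fermi_ground = spectrum(levels, n, fermionic=True)[0]
fermi_energy = levels[n - 1]
print(bose_ground, fermi_ground, fermi_energy)
```

The Bosonic minimum puts all three particles into the lowest level, while the Fermionic minimum fills the three lowest levels once each.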

2.1.3 The exchange interaction


Goals

The Coulomb repulsion term between two electrons, h(1,2) ∝ ∥x1 − x2∥⁻¹, does
not depend on spin. However, when combined with the anti-symmetrization postulate
for Fermions, an effective coupling between electron spins arises. It is important,
e.g. in magnetism and atomic physics. We’ll look at a simple case: the electrons
of the Helium atom in first-order perturbation theory.

Treating the nucleus as fixed, the Hamiltonian for the Helium atom is

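\[ H = H_0 + h^{(1,2)}, \qquad H_0 = h^{(1)} + h^{(2)}, \]

\[ h^{(i)} = \frac{P_i^2}{2m} - \frac{2e^2}{4\pi\epsilon_0} \frac{1}{\|x_i\|}, \qquad h^{(1,2)} = \frac{e^2}{4\pi\epsilon_0} \frac{1}{\|x_1 - x_2\|}. \]
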
The eigenfunctions of the single-body Hamiltonian are the same as those for hydrogen
(with Bohr radius halved on account of the higher charge), and with arbitrary spin:

\[ |\phi_{n,l,m}\rangle |s\rangle, \qquad n \ge 1, \quad 0 \le l \le n-1, \quad -l \le m \le l, \quad s \in \{-\tfrac12, \tfrac12\}. \]

By Sec. 2.1.2, their Slater determinants diagonalize the non-interacting part H0 .

Warm up: The ground state


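Write |1⟩ := |ϕ1,0,0⟩ for short. The ground state of H0 is given by

\[ |1{\uparrow}\rangle \wedge |1{\downarrow}\rangle = \tfrac{1}{\sqrt{2}} \big( |1{\uparrow}\rangle |1{\downarrow}\rangle - |1{\downarrow}\rangle |1{\uparrow}\rangle \big). \]
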
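That is: both electrons are in the single-body ground state (spectroscopic notation: 1s²),
with anti-parallel spins. The ground state vector becomes a lot clearer when we group the
spatial and the spin degrees of freedom together:

\[ |1\rangle|{\uparrow}\rangle \wedge |1\rangle|{\downarrow}\rangle = \tfrac{1}{\sqrt{2}} \big( |1\rangle|1\rangle|{\uparrow}\rangle|{\downarrow}\rangle - |1\rangle|1\rangle|{\downarrow}\rangle|{\uparrow}\rangle \big) = |1\rangle|1\rangle\, \tfrac{1}{\sqrt{2}} \big( |{\uparrow}\rangle|{\downarrow}\rangle - |{\downarrow}\rangle|{\uparrow}\rangle \big). \tag{2.10} \]
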
Let’s analyze this. The permutation τ exchanges all degrees of freedom of the electrons:
 
\[ \tau \big( (|\phi_1\rangle|s_1\rangle)(|\phi_2\rangle|s_2\rangle) \big) = (|\phi_2\rangle|s_2\rangle)(|\phi_1\rangle|s_1\rangle). \]

We could also define operators τ (space) and τ (spin) that only act on one of them:
 
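\[ \tau^{(\mathrm{space})} \big( (|\phi_1\rangle|s_1\rangle)(|\phi_2\rangle|s_2\rangle) \big) = (|\phi_2\rangle|s_1\rangle)(|\phi_1\rangle|s_2\rangle), \]

\[ \tau^{(\mathrm{spin})} \big( (|\phi_1\rangle|s_1\rangle)(|\phi_2\rangle|s_2\rangle) \big) = (|\phi_1\rangle|s_2\rangle)(|\phi_2\rangle|s_1\rangle), \]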

so that τ = τ (space) τ (spin) . The Hamiltonian H commutes not only with τ , but (in this
case) with τ (space) and τ (spin) individually. We can therefore find a common eigenbasis, i.e.
energy eigenvectors that also have well-defined parity with respect to the exchange of each
of the spatial and the spin parts. To get anti-symmetry under τ , exactly one of these two
parts has to be anti-symmetric. That’s what happened in (2.10).

The energy correction induced by the interaction in first-order perturbation theory is

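\[
\langle 1|\langle 1|\langle \Psi^-|\, h^{(1,2)}\, |1\rangle|1\rangle|\Psi^-\rangle
= \langle 1|\langle 1|\, h^{(1,2)}\, |1\rangle|1\rangle
= \frac{e^2}{4\pi\epsilon_0} \int \frac{1}{\|x_1 - x_2\|}\, |\langle x_1|1\rangle|^2\, |\langle x_2|1\rangle|^2 \, d^3x_1\, d^3x_2.
\tag{2.11}
\]
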
This expression – called the Coulomb or direct integral – equals the expected value of the
repulsion term experienced by two classical electrons that are found at x with probability
density |⟨x|1⟩|2 .

Excited states
The first excited states of H0 are the ones where one electron remains in the ground state
and one is in |ϕ2,0,0 ⟩ =: |2⟩ (spectroscopic: “1s, 2s”). Taking spin into account, the first
excited energy of the non-interacting Hamiltonian is thus four-fold degenerate:

|1⟩|s1 ⟩ ∧ |2⟩|s2 ⟩ si ∈ {↑, ↓}.

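As discussed above, we can choose a basis of states that are symmetric / anti-symmetric in
the spatial and spin degrees individually:

\[
\begin{aligned}
&\tfrac{1}{\sqrt{2}} \big( |1\rangle|2\rangle + |2\rangle|1\rangle \big)\, \tfrac{1}{\sqrt{2}} \big( |{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle \big) && (S = 0, \text{ “singlet”}) \\[4pt]
&\tfrac{1}{\sqrt{2}} \big( |1\rangle|2\rangle - |2\rangle|1\rangle \big)\, \tfrac{1}{\sqrt{2}} \big( |{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle \big) \\
&\tfrac{1}{\sqrt{2}} \big( |1\rangle|2\rangle - |2\rangle|1\rangle \big)\, |{\uparrow\uparrow}\rangle && (S = 1, \text{ “triplet”}) \\
&\tfrac{1}{\sqrt{2}} \big( |1\rangle|2\rangle - |2\rangle|1\rangle \big)\, |{\downarrow\downarrow}\rangle
\end{aligned}
\]
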

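Again, the energy correction only depends on the spatial part. In particular it is the same
for the last three vectors. For the first two, we get

\[ \tfrac{1}{2} \big( \langle 1|\langle 2| \pm \langle 2|\langle 1| \big)\, h^{(1,2)}\, \big( |1\rangle|2\rangle \pm |2\rangle|1\rangle \big) = \langle 1|\langle 2|\, h^{(1,2)}\, |1\rangle|2\rangle \pm \operatorname{Re} \langle 1|\langle 2|\, h^{(1,2)}\, |2\rangle|1\rangle. \]
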
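The first matrix element is again a “Coulomb integral”

\[ I := \langle 1|\langle 2|\, h^{(1,2)}\, |1\rangle|2\rangle = \frac{e^2}{4\pi\epsilon_0} \int \frac{1}{\|x_1 - x_2\|}\, |\langle x_1|1\rangle|^2\, |\langle x_2|2\rangle|^2 \, d^3x_1\, d^3x_2 > 0, \]
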
which allows for the same probabilistic interpretation as given for Eq. (2.11). The second
one is called the exchange integral

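\[
J := \langle 1|\langle 2|h^{(1,2)}|2\rangle|1\rangle
= 2e^2 \int \frac{1}{4\pi\epsilon_0\,\|x_1 - x_2\|}\, \langle 1|x_1\rangle\, \langle 2|x_2\rangle\, \overline{\langle 2|x_1\rangle}\; \overline{\langle 1|x_2\rangle} \, d^3x_1\, d^3x_2 .
\]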
The exchange integral is also positive, although that’s less obvious.

To see this, rewrite the exchange integral as

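\[
J = \frac{2e^2}{\epsilon_0} \int \frac{1}{4\pi\|x_1 - x_2\|}\, \langle 1|x_1\rangle\, \overline{\langle 2|x_1\rangle}\; \overline{\langle 1|x_2\rangle}\, \langle 2|x_2\rangle \, d^3x_1\, d^3x_2 .
\]
Defining
\[
\phi(x) := \overline{\langle 1|x\rangle}\,\langle 2|x\rangle, \qquad
A := \int \frac{1}{4\pi\|x_1 - x_2\|}\, |x_1\rangle\langle x_2| \, d^3x_1\, d^3x_2 ,
\]
the integral is of the form ⟨ϕ|A|ϕ⟩ with A a translation-invariant operator. By Eq. (A.25), A
is diagonal in the Fourier basis, with eigenvalues given by (2π)^{3/2} times the Fourier
transform of f(x) = 1/(4π∥x∥). From Eq. (C.20), (2π)^{3/2} f̃(k) = 1/∥k∥², so that
\[
J = \frac{2e^2}{\epsilon_0} \int \frac{|\langle \phi|k\rangle|^2}{\|k\|^2}\, d^3k > 0 .
\]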

The effect of the interaction is thus twofold: (i) It uniformly increases the energies by
the Coulomb term I describing the expected repulsion felt by the two electrons (as one
would expect). (ii) It introduces a splitting by 2J of the energies between the symmetric
S = 1 and anti-symmetric S = 0 spin states. The physical way to think about the second
effect is that anti-symmetry in the spatial part “allows the electrons to avoid each other”,
thus decreasing the energy penalty due to electron-electron repulsion.

The Heisenberg model


We have seen that within the 1s, 2s-space, the energy depends only on the spin configura-
tion. Let’s map it to an effective 2-spin model by setting:

|s1 , s2 ⟩ := |1⟩|s1 ⟩ ∧ |2⟩|s2 ⟩.

In this two-spin Hilbert space, the effective Hamiltonian is, up to an irrelevant global shift
of energies,

Heff = −Jτ.

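We can write the transposition τ as (exercise!)
\[
\tau = \frac12 \sum_{j \in \{0,x,y,z\}} \sigma_j^{(1)} \sigma_j^{(2)} = \frac12 \big( 1 + \sigma^{(1)} \cdot \sigma^{(2)} \big)
\qquad (\text{with } \sigma_0 = 1),
\]
and thus, up to another shift,
\[
H_{\mathrm{eff}} = -\frac{J}{2}\, \sigma^{(1)} \cdot \sigma^{(2)} . \tag{2.12}
\]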

The exchange principle can thus be described as an effective interaction between the two
spins. Equation (2.12) is an embryonic version of the Heisenberg model of magnetism.
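
The relation between the transposition and the Pauli matrices can be checked numerically. The following is a minimal sketch, assuming the basis ordering |↑↑⟩, |↑↓⟩, |↓↑⟩, |↓↓⟩ and an illustrative value J = 1 (both are choices made here, not taken from the notes). It compares the swap matrix with (1 + σ(1)·σ(2))/2 and lists the spectrum of −Jτ.

```python
import numpy as np

# Pauli matrices and the 2x2 identity
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
id2 = np.eye(2, dtype=complex)

# sigma^(1) . sigma^(2) on the two-spin space
sigma_dot_sigma = sum(np.kron(s, s) for s in (sx, sy, sz))

# Transposition |s1, s2> -> |s2, s1> in the basis |uu>, |ud>, |du>, |dd>
tau = np.array([[1, 0, 0, 0],
                [0, 0, 1, 0],
                [0, 1, 0, 0],
                [0, 0, 0, 1]], dtype=complex)

swap_from_paulis = 0.5 * (np.kron(id2, id2) + sigma_dot_sigma)
identity_holds = np.allclose(tau, swap_from_paulis)

J = 1.0  # illustrative exchange integral (assumption)
energies = np.sort(np.linalg.eigvalsh(-J * tau))

print(identity_holds)  # whether tau equals (1 + sigma.sigma)/2
print(energies)        # threefold level at -J (triplet), single level at +J (singlet)
```

The threefold degenerate level belongs to the symmetric spin states and lies 2J below the singlet, in line with the splitting described above.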

2.2 Second quantization

Goals
This section is mostly formal (definitions, generic constructions). Not too excit-
ing? Maybe. But familiarizing you with the formalism of “second quantization” is
one of the most important goals of this lecture. Much builds on it. Be alert!

So far, we have considered systems with a fixed number n of particles. We will now
treat the particle number as variable. Mathematically, this actually simplifies some cal-
culations (we won’t have to worry about combinatorial expressions like (2.6) any more).
Physically, this step is necessary e.g. for relativistic theories, where different species of
particles can be converted into each other.

2.2.1 Fock space


Start with a single-particle Hilbert space H(1) . To describe systems with an indefinite
particle number, we’ll use superpositions

\[
|\psi\rangle = \sum_{n=0}^{\infty} |\psi_n\rangle
\]
with |ψn⟩ ∈ Sym^n(H(1)) (Bosons) or |ψn⟩ ∈ ∧^n(H(1)) (Fermions). Terms corresponding
to different particle numbers are taken to be orthogonal, so that inner products are
\[
\langle \psi|\psi'\rangle = \sum_{n=0}^{\infty} \langle \psi_n|\psi_n'\rangle .
\]

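The resulting Hilbert space is called the symmetric/anti-symmetric Fock space
\[
\begin{aligned}
F_S\big(H^{(1)}\big) &= \bigoplus_{n=0}^{\infty} \mathrm{Sym}^n\big(H^{(1)}\big) && \text{(Bosons)}, \\
F_A\big(H^{(1)}\big) &= \bigoplus_{n=0}^{\infty} \wedge^n\big(H^{(1)}\big) && \text{(Fermions)}.
\end{aligned}
\]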

Wait, n = 0 is included? That’s right, we allow for systems with zero particles. To make
sense of that, define
\[
\big(H^{(1)}\big)^{\otimes 0},\ \wedge^{0}\big(H^{(1)}\big),\ \mathrm{Sym}^{0}\big(H^{(1)}\big) := \mathbb{C}^1 ,
\]
the Hilbert space of one-component vectors. Up to a phase, it only contains a single
normalized vector, which is called the vacuum and denoted as |vac⟩ or |0⟩.
This construction is very transparent in the occupation number basis, where it basically
amounts to removing the constraint $\sum_i n_i = n$ (and all the combinatorial nastiness that
comes with it). With respect to a basis {|i⟩} of H(1) , Fock space is the Hilbert space with
basis |n1 , n2 , . . . ⟩, where ni ∈ {0, 1, 2, . . . } for Bosons and ni ∈ {0, 1} for Fermions.
The vacuum is |0, 0, . . . ⟩ = |vac⟩ = |0⟩.
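
To make the occupation number picture concrete, here is a small sketch that enumerates the Fermionic occupation number basis for a single-particle space of dimension d and groups it by particle number. The value d = 4 and all variable names are illustrative choices, not part of the notes.

```python
from itertools import product
from math import comb

d = 4  # dimension of the single-particle space (illustrative assumption)

# Fermions: every occupation number n_i is 0 or 1, with no constraint on the sum
fermion_basis = list(product((0, 1), repeat=d))

# Group basis vectors by total particle number n = sum_i n_i
sectors = {}
for occ in fermion_basis:
    sectors.setdefault(sum(occ), []).append(occ)

sector_dims = {n: len(states) for n, states in sorted(sectors.items())}

print(sector_dims)         # dimension of each n-particle sector
print(len(fermion_basis))  # total dimension of the Fock space
print(sectors[0])          # the single vacuum basis vector
```

Each n-particle sector has dimension "d choose n", and dropping the constraint on n simply collects all of them into one space of dimension 2^d. For Bosons the same enumeration works with unbounded occupation numbers, so a numerical treatment needs a cutoff.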

2.2.2 Creation and annihilation operators


Recall the treatment of the quantum harmonic oscillator (Appendix A.2.1). There, one
introduces the ladder operators that create/destroy excitations in the sense that
\[
a^\dagger |n\rangle = \sqrt{n+1}\,|n+1\rangle \quad\Leftrightarrow\quad a|n\rangle = \sqrt{n}\,|n-1\rangle . \tag{2.13}
\]
The definition might feel a bit unmotivated at first, but it turns out to radically simplify
the analysis. Ladder operators can likewise be introduced on Fock space, and once more,
they turn out to simplify calculations with indistinguishable particles much more than one
could expect.
For any single-particle state |α⟩, the creation operator $a_\alpha^\dagger$ is defined via its action on
n-particle states |ψn⟩ as
\[
a_\alpha^\dagger |\psi_n\rangle =
\underbrace{\sqrt{n+1}}_{\text{scale to match (2.13)}}\;
\underbrace{\frac{1}{(n+1)!} \sum_{\pi \in S_{n+1}} (\operatorname{sgn} \pi)^{\zeta}\, \pi}_{\text{(anti-)symmetrize}}\;
\underbrace{\big(|\alpha\rangle \otimes |\psi_n\rangle\big)}_{\text{add particle in state } |\alpha\rangle}
\tag{2.14}
\]
with ζ = 0 (Bosons) and ζ = 1 (Fermions). The associated annihilation operator is the
adjoint: $a_\alpha = (a_\alpha^\dagger)^\dagger$.

Equation (2.14) is commonly summarized as “a†α creates a particle in state |α⟩”.


This phrase should be thought of as a mnemonic, not as a definition. For one,
it omits the crucial scale factor $\sqrt{n+1}$. What is more, creation operators don’t
usually have a direct physical interpretation (Sec. 2.3.2). Rather, they appear as
mathematical building blocks that allow for a convenient representation of local
operators (Sec. 2.2.3).
It is slightly unfortunate that a† is a more natural starting point than a, requiring the
round-about definition of a as (a† )† . On the upside, the “dagger” symbol used by
physicists to denote the adjoint looks a bit like a “+”, so one can easily remember
that a† is the one that “adds” a particle.

We’ll usually fix a basis {|i⟩} of the single-body Hilbert space and work in the as-
sociated occupation number basis, where the ladder operators act in a transparent way.
Eq. (2.14) implies

\[
a_i^\dagger\, |\dots n_{i-1}, n_i, n_{i+1} \dots\rangle = \sqrt{n_i + 1}\; (-1)^{\zeta \sum_{j<i} n_j}\; |\dots n_{i-1}, n_i + 1, n_{i+1} \dots\rangle .
\]

Here, we use the convention that |n1 . . . ⟩ equals 0 if one of the occupation numbers is
negative, or, in the Fermionic case, additionally if one occupation number exceeds 1. Ex-
plicitly, for Bosons:

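\[
\begin{aligned}
a_i^\dagger\, |\dots n_{i-1}, n_i, n_{i+1} \dots\rangle &= \sqrt{n_i + 1}\; |\dots n_{i-1}, n_i + 1, n_{i+1} \dots\rangle, \\
a_i\, |\dots n_{i-1}, n_i, n_{i+1} \dots\rangle &= \sqrt{n_i}\; |\dots n_{i-1}, n_i - 1, n_{i+1} \dots\rangle,
\end{aligned}
\tag{2.15}
\]
and for Fermions
\[
\begin{aligned}
a_i^\dagger\, |\dots n_{i-1}, n_i, n_{i+1} \dots\rangle &= (-1)^{\sum_{j<i} n_j}\; |\dots n_{i-1}, n_i + 1, n_{i+1} \dots\rangle, \\
a_i\, |\dots n_{i-1}, n_i, n_{i+1} \dots\rangle &= (-1)^{\sum_{j<i} n_j}\; |\dots n_{i-1}, n_i - 1, n_{i+1} \dots\rangle .
\end{aligned}
\tag{2.16}
\]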
Iterating, any basis element can be written using creation operators acting on the vacuum:

\[
|n_1, n_2, \dots\rangle = \frac{(a_1^\dagger)^{n_1}}{\sqrt{n_1!}}\, \frac{(a_2^\dagger)^{n_2}}{\sqrt{n_2!}} \cdots |0\rangle . \tag{2.17}
\]
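
The Fermionic rule (2.16) translates directly into matrices. The sketch below, assuming three modes indexed from 0 and a basis ordered as produced by `itertools.product` (both illustrative choices), builds the creation operators and uses them to generate a basis vector from the vacuum as in (2.17).

```python
import numpy as np
from itertools import product

d = 3  # number of modes (illustrative assumption)
basis = list(product((0, 1), repeat=d))
index = {occ: n for n, occ in enumerate(basis)}

def creation(i):
    """Matrix of a_i^dagger following Eq. (2.16); modes are 0-indexed here."""
    op = np.zeros((len(basis), len(basis)))
    for occ in basis:
        if occ[i] == 1:
            continue  # mode already occupied: the vector is mapped to 0
        sign = (-1) ** sum(occ[:i])  # (-1)^(sum of n_j for j < i)
        new = occ[:i] + (1,) + occ[i + 1:]
        op[index[new], index[occ]] = sign
    return op

a_dag = [creation(i) for i in range(d)]

vac = np.zeros(len(basis))
vac[index[(0,) * d]] = 1.0

# |1, 0, 1> obtained by first filling the last mode, then the first one
state = a_dag[0] @ a_dag[2] @ vac
# Applying the two creation operators in the opposite order flips the sign
swapped = a_dag[2] @ a_dag[0] @ vac

print(state[index[(1, 0, 1)]])
print(swapped[index[(1, 0, 1)]])
```

The sign difference between the two orderings is the anti-symmetry of the Fermionic state, and applying the same creation operator twice gives zero.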

Basis expansions and field operators


Choose a single-body basis {|i⟩} and a state |α⟩ ∈ H (1) . Plugging the expansion
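\[
|\alpha\rangle = \sum_i |i\rangle \langle i|\alpha\rangle
\]
into (2.14) shows that “creation operators can be expanded like kets and annihilation
operators like bras”:
\[
a_\alpha^\dagger = \sum_i \langle i|\alpha\rangle\, a_i^\dagger \quad\Rightarrow\quad a_\alpha = \sum_i \langle \alpha|i\rangle\, a_i . \tag{2.18}
\]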

We don’t need to restrict ourselves to normalizable states. For example, if |α⟩ = |x⟩
is a delta function centered at x ∈ R3 and |i⟩ = |ϕi ⟩ for some smooth function ϕi (x) in
L2 (R3 ), then the above reads
\[
|x\rangle = \sum_i |\phi_i\rangle \langle \phi_i|x\rangle = \sum_i \bar\phi_i(x)\, |\phi_i\rangle ,
\]
and thus the operators “creating / destroying a particle at position x” are
\[
a_x^\dagger = \sum_i \bar\phi_i(x)\, a_i^\dagger \quad\Rightarrow\quad a_x = \sum_i \phi_i(x)\, a_i .
\]

Recall that a classical field is any physical quantity that depends on points in space.
The ax are quantum operators depending on points in space, and thus a first example
of a quantum field. These annihilation field operators and their Heisenberg-picture time
evolution are commonly written as
\[
\hat\Psi(x) := a_x, \qquad \hat\Psi(t, x) := a_x(t) = e^{-\frac{t}{i\hbar} H}\, a_x\, e^{\frac{t}{i\hbar} H} .
\]

Despite the similarity in notation, the field operators Ψ̂(x) should not be confused with
wave functions ψ(x) ∈ L2 (R3 )!

All the caveats that apply to delta functions (App. A.1.8) likewise apply to the
Ψ̂(x). In particular, formulas involving field operators have physical content only
when integrated against smooth functions. (In the mathematical literature, the Ψ̂(x)
are therefore referred to as operator-valued distributions, to indicate that they give
proper operators only after an integration). See the discussion around (2.23) for an
example of how this pans out.

The converse of the above construction also works. From the completeness relation for
delta functions (A.14):
\[
|\alpha\rangle = \int \alpha(x)\, |x\rangle \, d^3x \quad\Rightarrow\quad a_\alpha^\dagger = \int \alpha(x)\, \hat\Psi^\dagger(x) \, d^3x . \tag{2.19}
\]

Commutation relations
As in the treatment of the harmonic oscillator, the commutation relations of the ladder
operators are important in calculations.
To treat the Bosonic and Fermionic cases in parallel, introduce the notation

[A, B]ζ := AB − (−1)ζ BA

so that

[A, B]ζ = AB − BA = [A, B] (Bosons, ζ = 0),


[A, B]ζ = AB + BA = {A, B} (Fermions, ζ = 1).

From (2.15, 2.16), one finds

[ai , a†j ]ζ = δij 1, [ai , aj ]ζ = [a†i , a†j ]ζ = 0. (2.20)
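
For Bosons, these relations can be illustrated with truncated matrices. The sketch below assumes a cutoff n_max on the occupation number, which is purely a numerical device: the truncation spoils the commutator in the highest level, so the check is restricted to the levels below the cutoff.

```python
import numpy as np

n_max = 6  # keep |0>, ..., |n_max> (numerical cutoff, an assumption)

# a|n> = sqrt(n)|n-1>: sqrt(1), ..., sqrt(n_max) on the first superdiagonal
a = np.diag(np.sqrt(np.arange(1, n_max + 1)), k=1)
a_dag = a.T

comm = a @ a_dag - a_dag @ a  # [a, a^dagger] in the truncated space

# Two modes: a_1 = a (x) 1 and a_2 = 1 (x) a
eye = np.eye(n_max + 1)
a1 = np.kron(a, eye)
a2 = np.kron(eye, a)
cross = a1 @ a2.T - a2.T @ a1  # [a_1, a_2^dagger]

print(np.round(np.diag(comm), 10))  # ones, except for the last entry
print(np.abs(cross).max())          # operators on different modes commute
```

The last diagonal entry of the single-mode commutator equals −n_max, an artifact of the cutoff that disappears in the infinite-dimensional Fock space.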

More generally, combining these basis-dependent relations with (2.18) gives

[aα , a†β ]ζ = ⟨α|β⟩ 1 (2.21)

which for field operators formally reads

[Ψ̂(x), Ψ̂† (y)]ζ = δ(x − y) 1. (2.22)



How should one interpret Eq. (2.22)? Recall the general rule that expressions in-
volving delta functions carry meaning only when integrated against smooth func-
tions. Viewed this way, (2.22) turns out to be an equivalent restatement of the un-
problematic version (2.21). Indeed, for smooth functions α(x), β(x), combining
Eq. (2.19) and Eq. (2.22) gives
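\[
\begin{aligned}
[a_\alpha, a_\beta^\dagger]_\zeta &= \int\!\!\int \bar\alpha(x)\, \beta(y)\, [\hat\Psi(x), \hat\Psi(y)^\dagger]_\zeta \, d^3x\, d^3y \\
&= \int\!\!\int \bar\alpha(x)\, \beta(y)\, \delta(x - y)\, 1 \, d^3x\, d^3y \\
&= \int \bar\alpha(y)\, \beta(y)\, 1 \, d^3y = \langle \alpha|\beta\rangle\, 1 .
\end{aligned}
\tag{2.23}
\]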

2.2.3 Single- and two-particle operators


An n-particle Hamiltonian is typically of the form
\[
H = \sum_{k=1}^{n} h^{(k)} + \frac12 \sum_{k \neq l = 1}^{n} h^{(k,l)}
\]
for a single-particle term $h^{(k)}$ (e.g. $h^{(k)} = P_k^2/(2m)$) and an interaction term $h^{(k,l)}$ (e.g.
$h^{(k,l)} = V(x_k - x_l)$). On Fock space, we have to sum over all possible particle numbers
n, so that, e.g., the single-particle term becomes
\[
\bigoplus_{n=1}^{\infty} \sum_{k=1}^{n} h^{(k)} .
\]

These formulas become much cleaner when expressed in terms of creation and annihilation
operators.
Indeed, choose a single-particle basis {|i⟩} and consider the expansion
\[
h = \sum_{ij} \langle i|h|j\rangle\, |i\rangle\langle j| = \sum_{ij} h_{ij}\, |i\rangle\langle j| . \tag{2.24}
\]
We claim that for both Bosons and Fermions, the following holds:
\[
\bigoplus_{n=1}^{\infty} \sum_{k=1}^{n} h^{(k)} = \sum_{ij} h_{ij}\, a_i^\dagger a_j . \tag{2.25}
\]
In other words: We can formally move from single-body operators to many-body operators
by replacing “kets by creation operators and bras by annihilation operators”.

This is not so surprising if we look at (2.24) in the right way. The bra ⟨j| is a linear
map from H(1) to the complex numbers, a space that we have since identified as
the “vacuum sector”. In this sense, ⟨j| maps the single-particle state |j⟩ to |vac⟩.
Dually, we can re-interpret the ket |i⟩ as a linear map C(1) → H(1) , (z) 7→ z|i⟩, or
|vac⟩ 7→ |i⟩. Thus, the familiar matrix element expansion (2.24) can be interpreted
as a superposition of processes that “destroy a particle in state |j⟩ and create one in
state |i⟩, weighted by the amplitude hij ”. From this point of view, (2.25) amounts to
the claim that the same description remains valid in higher particle number sectors.

To verify (2.25) start with the case where {|i⟩} is an eigenbasis of h. We have already
found in (2.9) that in this case, the occupation number basis diagonalizes the single-body
operator, so that
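\[
\Big( \bigoplus_{n=1}^{\infty} \sum_{k=1}^{n} h^{(k)} \Big) |n_1, n_2, \dots\rangle
= \Big( \sum_i \lambda_i n_i \Big) |n_1, n_2, \dots\rangle
= \Big( \sum_i \lambda_i\, a_i^\dagger a_i \Big) |n_1, n_2, \dots\rangle
\]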

as claimed. The general case follows from the fact that, as remarked around (2.18), “cre-
ation operators transform like kets and annihilation operators like bras”: If {|αi ⟩} is an-
other single-particle basis, then inserting completeness relations and using (2.18) gives

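\[
\begin{aligned}
\sum_i \langle i|h|i\rangle\, a_i^\dagger a_i
&= \sum_{ij} \langle i|h|j\rangle\, a_i^\dagger a_j && \text{($h$ is diagonal in the $\{|i\rangle\}$-basis)} \\
&= \sum_{ijkl} \langle i|\alpha_k\rangle \langle \alpha_k|h|\alpha_l\rangle \langle \alpha_l|j\rangle\, a_i^\dagger a_j \\
&= \sum_{kl} \langle \alpha_k|h|\alpha_l\rangle \Big( \sum_i \langle i|\alpha_k\rangle\, a_i^\dagger \Big) \Big( \sum_j a_j\, \langle \alpha_l|j\rangle \Big) \\
&= \sum_{kl} \langle \alpha_k|h|\alpha_l\rangle\, a_{\alpha_k}^\dagger a_{\alpha_l} .
\end{aligned}
\]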
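
Relation (2.25) can be tested numerically for Fermions. The sketch below is written under assumptions chosen here for illustration (three modes, a random Hermitian matrix h, ladder operators built from the sign rule (2.16) with 0-indexed modes). It checks that the many-body operator has as eigenvalues all sums of single-particle eigenvalues over sets of occupied modes, and that it conserves the particle number.

```python
import numpy as np
from itertools import product, combinations

d = 3  # number of single-particle states (illustrative assumption)
basis = list(product((0, 1), repeat=d))
index = {occ: n for n, occ in enumerate(basis)}
dim = len(basis)

def annihilation(i):
    """Matrix of a_i for Fermions, with the sign convention of Eq. (2.16)."""
    op = np.zeros((dim, dim))
    for occ in basis:
        if occ[i] == 0:
            continue  # nothing to remove: the vector is mapped to 0
        sign = (-1) ** sum(occ[:i])
        new = occ[:i] + (0,) + occ[i + 1:]
        op[index[new], index[occ]] = sign
    return op

a = [annihilation(i) for i in range(d)]  # real matrices, so a_i^dagger = a_i.T

# Random Hermitian single-particle operator h (hypothetical example data)
rng = np.random.default_rng(0)
m = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
h = (m + m.conj().T) / 2

# Second-quantized operator: sum_ij h_ij a_i^dagger a_j
H = sum(h[i, j] * (a[i].T @ a[j]) for i in range(d) for j in range(d))
N_op = sum(a[i].T @ a[i] for i in range(d))

lam = np.linalg.eigvalsh(h)
expected = sorted(sum(lam[i] for i in S)
                  for n in range(d + 1)
                  for S in combinations(range(d), n))
spectrum = np.sort(np.linalg.eigvalsh(H))

print(np.allclose(spectrum, expected))   # spectrum matches sums of eigenvalues
print(np.allclose(H @ N_op, N_op @ H))   # particle number is conserved
```

Because h is not diagonal in the basis used to build the ladder operators, the agreement of the spectra also illustrates the basis-independence argued for above.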

Likewise, if h is a two-particle operator on H(1) ⊗ H(1), then the symmetrized n-body
version is
\[
\frac12 \sum_{k \neq l = 1}^{n} h^{(k,l)} ,
\]
where the superscript denotes the two particles on which the operator acts non-trivially.
The factor 1/2 is there to avoid double-counting of (k, l) and (l, k). As above, one can
show that
\[
\frac12 \bigoplus_{n=1}^{\infty} \sum_{k \neq l = 1}^{n} h^{(k,l)} = \frac12 \sum_{ijrs} h_{ijrs}\, a_i^\dagger a_j^\dagger a_s a_r, \qquad h_{ijrs} = \langle ij|h|rs\rangle .
\]

Note that the indices s, r of the annihilation operators are reversed as compared to the
indices in the matrix element! This makes the sign come out right in the Fermionic case.
We omit the proof.

Some concrete operators


Let’s apply the framework developed above to some important examples, both in position
and in momentum representation.

Single-particle potential. The single-particle potential operator is
\[
U = \int U(x)\, |x\rangle\langle x| \, d^3x .
\]
We can directly read off the corresponding expression in second quantization
\[
\int \hat\Psi^\dagger(x)\, U(x)\, \hat\Psi(x) \, d^3x
\]
(“destroy a particle at x, multiply with potential at this point, re-create it”).
Its matrix elements in the Fourier basis are
\[
\langle k'|U|k\rangle = (2\pi)^{-3} \int U(x)\, e^{i(k - k')x} \, d^3x = (2\pi)^{-3/2}\, \tilde U(k' - k),
\]
leading to
\[
(2\pi)^{-3/2} \int \tilde U(k' - k)\, a_{k'}^\dagger a_k \, d^3k'\, d^3k . \tag{2.26}
\]

Read that as: A potential term can change the momentum of particles. The amplitude
associated with a change of q = k′ − k is proportional to the Fourier transform Ũ (q) of
the potential.
If one works in a box of finite volume V = L³, then (in the sense of App. A.1.9), the
expression becomes
\[
\frac{1}{V} \sum_{k, k' \in \frac{2\pi}{L}\mathbb{Z}^3} \tilde U(k' - k)\, a_{k'}^\dagger a_k .
\]

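Momentum and kinetic energy. In the Fourier basis, we directly get
\[
\begin{aligned}
P &= \hbar \int k\, |k\rangle\langle k| \, d^3k \;\mapsto\; \hbar \int k\, a_k^\dagger a_k \, d^3k, \\
\frac{P^2}{2m} &= \frac{\hbar^2}{2m} \int \|k\|^2\, |k\rangle\langle k| \, d^3k \;\mapsto\; \frac{\hbar^2}{2m} \int \|k\|^2\, a_k^\dagger a_k \, d^3k .
\end{aligned}
\]
In the sense of App. A.1.8, one can also express these in position basis:
\[
\begin{aligned}
P &\mapsto -i\hbar \int \hat\Psi^\dagger(x)\, \nabla \hat\Psi(x) \, d^3x, \\
\frac{P^2}{2m} &\mapsto \frac{-\hbar^2}{2m} \int \hat\Psi^\dagger(x)\, \nabla^2 \hat\Psi(x) \, d^3x = \frac{\hbar^2}{2m} \int \big(\nabla \hat\Psi(x)^\dagger\big) \cdot \big(\nabla \hat\Psi(x)\big) \, d^3x .
\end{aligned}
\]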

These expressions are very suggestive, but also easy to misinterpret. Keep in mind
that $\hat\Psi(x) = a_{\delta_x}$ is not a complex function on $\mathbb{R}^3$, but rather a field of annihilation
operators for delta functions indexed by x. If you are confused, read the explanation
in App. A.1.8. If you are not confused, then you’re probably missing something
(confusion is the natural state at this point!), so you should really read App. A.1.8!

Chemical potential. The point of Fock space is that the particle number is variable. The
problem with Fock space is that the particle number is variable. Let’s say you want to find
the ground state of a gas (as we’ll do later). There will be some mechanism (walls of a
container, pressure exerted by other gases, ...) that controls at least the average number of
particles in the gas. We could explicitly describe this mechanism (sounds complicated),
or just follow the lead of the grand canonical ensemble of stat mech and add an effective
term −µN̂ that formally adjusts the energy carried by a particle, and then vary µ until the
ground state shows the right average particle number. The operator implementing this is
just
\[
\int \hat\Psi^\dagger(x)\, (-\mu)\, \hat\Psi(x) \, d^3x = \int a_k^\dagger\, (-\mu)\, a_k \, d^3k .
\]


Interaction potential. Now consider an interaction potential V(x1, x2) = V(x1 − x2)
that only depends on the relative position of two particles. The most prominent example
is, of course, the Coulomb potential. In position basis, the second quantized version is:
\[
\frac12 \int V(x_1 - x_2)\, \hat\Psi^\dagger(x_1)\, \hat\Psi^\dagger(x_2)\, \hat\Psi(x_2)\, \hat\Psi(x_1) \, d^3x_1\, d^3x_2 . \tag{2.27}
\]
The Fourier transform that turns (2.27) into its momentum representation is already
slightly annoying to perform. To guide us, let’s first guess the structure of the momentum
representation. Recall that potentials that are invariant under a simultaneous translation of
all particles conserve total momentum. The most general two-particle process compatible
with that conservation law is one that shifts the two momenta in a symmetric way, say by
±q. We thus expect an integral over terms

f (k1 , k2 , q) a†k1 +q a†k2 −q ak2 ak1

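where the amplitude f(k1, k2, q) remains to be found. Comparison with (2.26) suggests
that f might be related to the Fourier transform of the potential. That turns out to be true:
\[
\begin{aligned}
\langle k_1', k_2'|V|k_1, k_2\rangle
&= \int \frac{d^3x_1\, d^3x_2}{(2\pi)^3 (2\pi)^3}\; e^{i(k_1 - k_1')x_1 + i(k_2 - k_2')x_2}\; V(x_1 - x_2) \\
&= \int \frac{d^3x_1\, d^3x_2}{(2\pi)^3 (2\pi)^3}\; e^{i(k_1 - k_1')x_1 + i(k_2 - k_2')x_2} \int \frac{d^3q}{(2\pi)^{3/2}}\; e^{iq(x_1 - x_2)}\; \tilde V(q) \\
&= \int \frac{d^3q}{(2\pi)^{3/2}}\; \tilde V(q) \int \frac{d^3x_1}{(2\pi)^3}\; e^{i(k_1 - k_1' + q)x_1} \int \frac{d^3x_2}{(2\pi)^3}\; e^{i(k_2 - k_2' - q)x_2} \\
&= (2\pi)^{-3/2} \int \tilde V(q)\; \delta(k_1 + q - k_1')\, \delta(k_2 - q - k_2') \, d^3q .
\end{aligned}
\]
Therefore, the momentum representation of an interaction term is
\[
\frac12\, \frac{1}{(2\pi)^{3/2}} \int \tilde V(q)\; a_{k_1 + q}^\dagger\, a_{k_2 - q}^\dagger\, a_{k_2}\, a_{k_1} \, d^3k_1\, d^3k_2\, d^3q .
\]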

Summary

• Action of ladder operators on occupation number basis:
\[
a_i^\dagger\, |\dots, n_i, \dots\rangle = \sqrt{n_i + 1}\; (-1)^{\zeta \sum_{j<i} n_j}\; |\dots, n_i + 1, \dots\rangle .
\]

• Commutation relations:
\[
[a_i, a_j^\dagger]_\zeta = \delta_{ij}\, 1, \qquad [a_i, a_j]_\zeta = [a_i^\dagger, a_j^\dagger]_\zeta = 0 .
\]

• “Creation operators can be expanded like kets”:
\[
a_\alpha^\dagger = \sum_i \langle i|\alpha\rangle\, a_i^\dagger .
\]

• Annihilation field operators in position basis (not wave functions!):
\[
\hat\Psi(t, x) := e^{-\frac{t}{i\hbar} H}\, a_x\, e^{\frac{t}{i\hbar} H} .
\]

• In second quantization, kets ↦ creation ops and bras ↦ annihilation ops.

2.3 Quasiparticles and collective excitations


The Bosonic Fock space ladder operators act on the occupation number basis in exactly
the same way as the ladder operators associated with quantum harmonic oscillators act
on their energy eigenbasis (compare e.g. Eq. (2.17) to Eq. (A.28)). In fact, other than the
way they have been constructed, there is no systematic way of distinguishing between
an n-dimensional harmonic oscillator and non-interacting Bosons with an n-dimensional
single-particle Hilbert space. Because any Hamiltonian that is quadratic in position and
momentum operators is equivalent to a collection of uncoupled harmonic oscillators when
expressed in normal modes (Sec. A.2.2), such models are widely applicable. Excitations
arising this way are called quasiparticles or collective excitations.

Formally, one says that the two systems are unitarily equivalent. Define a linear map
U from $L^2(\mathbb{R}^n)$ to $F_S(\mathbb{C}^n)$ by requiring that it sends an element $|n_1, \dots\rangle_{L^2(\mathbb{R}^n)}$
of the eigenbasis of n harmonic oscillators as constructed in (A.28) to the element
$|n_1, \dots\rangle_{F_S(\mathbb{C}^n)}$ of the occupation number basis as constructed in (2.17). Then
U, mapping an ONB to an ONB, is unitary and one immediately verifies that
\[
U\, a_i^{L^2(\mathbb{R}^n)}\, U^\dagger = a_i^{F_S(\mathbb{C}^n)} .
\]

The most elementary case is that of lattice vibrations, or phonons. Let’s have a look.

2.3.1 Phonons
Goals
The phonon Hamiltonian is conceptually easy to solve (by undergrad mechan-
ics tools), but has much to teach us! Here, phonons will serve as an example of
how Fock space describes collective excitations, rather than arising from a single-
particle space. We’ll also have the opportunity to recall normal mode expansions.
A continuum limit will later motivate rules for field quantization.

We consider N particles in one dimension whose interaction potential has a minimum
at distance a and goes to 0 for large distances. There is therefore an equilibrium configu-
ration where the particles are arranged in a linear chain, with the k-th particle at position
ka. Let Xk be the position of the k-th particle, measured relative to its equilibrium value.
Expanding the potential around the minimum to second order gives, up to a constant,
\[
H = \sum_{r=1}^{N} \Big( \frac{P_r^2}{2m} + \frac{\kappa}{2}\, (X_r - X_{r+1})^2 \Big) . \tag{2.28}
\]

We have to specify boundary conditions. If the chain is longer than the length scale of any
phenomenon we’ll be studying, boundary effects shouldn’t matter much (c.f. App. A.1.9).
We therefore opt for the mathematically simplest case: cyclic boundary conditions, i.e. we
assume that the indices of the operators in (2.28) only depend on r modulo N .
The chain Hamiltonian is quadratic in positions and momenta and can therefore be
diagonalized using canonical transformations (App. A.2.2). Working out the details is an
excellent exercise, so we only present the final result here.
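For n = 1 . . . N and $k = n\,\frac{2\pi}{L}$ with L = N a the total length, define
\[
\phi_k = \sqrt{\frac{1}{N}} \sum_{r=1}^{N} e^{-ikra}\, X_r, \qquad \pi_k = \sqrt{\frac{1}{N}} \sum_{r=1}^{N} e^{ikra}\, P_r .
\]
In the sense of App. A.2.2, the ϕk, πk correspond to complex normal coordinates associ-
ated with standing waves with quasi-momentum k. Then
\[
a_k = \frac{1}{\sqrt2} \Big( \sqrt{\frac{m\omega_k}{\hbar}}\; \phi_k + i \sqrt{\frac{1}{m\hbar\omega_k}}\; \pi_{-k} \Big), \qquad
\omega_k = 2 \sqrt{\frac{\kappa}{m}}\; |\sin(ka/2)|
\]
define annihilation operators ($[a_k, a_{k'}^\dagger] = \delta_{k,k'}$) that diagonalize the Hamiltonian
\[
H = \sum_k \Big( \frac{1}{2m}\, \pi_k \pi_{-k} + 2\kappa \sin^2(ka/2)\, \phi_k \phi_{-k} \Big) = \sum_k \hbar\omega_k \Big( a_k^\dagger a_k + \frac12 \Big) .
\]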
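
The dispersion relation can be cross-checked against a brute-force diagonalization of the classical chain. The sketch below assumes illustrative values N = 8 and m = κ = a = 1 (not taken from the notes). It writes the potential energy in (2.28) as a quadratic form with a cyclic coupling matrix K and compares the square roots of the eigenvalues of K/m with ω_k.

```python
import numpy as np

N = 8        # number of particles (illustrative assumption)
m = 1.0      # mass (illustrative units)
kappa = 1.0  # spring constant (illustrative units)
a_lat = 1.0  # lattice spacing a (illustrative units)

# (kappa/2) * sum_r (X_r - X_{r+1})^2 = (1/2) X^T K X with cyclic indices
K = np.zeros((N, N))
for r in range(N):
    K[r, r] += 2 * kappa
    K[r, (r + 1) % N] -= kappa
    K[(r + 1) % N, r] -= kappa

# Normal mode frequencies: omega^2 are the eigenvalues of K / m
omega_sq = np.linalg.eigvalsh(K / m)
numeric = np.sort(np.sqrt(np.clip(omega_sq, 0.0, None)))

# Analytic dispersion for k = n * 2 pi / (N a), n = 1, ..., N
k = 2 * np.pi * np.arange(1, N + 1) / (N * a_lat)
analytic = np.sort(2 * np.sqrt(kappa / m) * np.abs(np.sin(k * a_lat / 2)))

print(np.round(numeric, 6))
print(np.round(analytic, 6))
```

Both lists contain one vanishing frequency, the k = 2π/a (equivalently k = 0) mode, which corresponds to a rigid translation of the whole chain.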

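The (Heisenberg picture) equations of motion $i\hbar\,\partial_t a_k(t) = [a_k, H]$ are then solved by
$a_k(t) = e^{-i\omega_k t} a_k(0)$. For the original observables this means
\[
\begin{aligned}
X_r(t) &= \sqrt{\frac{\hbar}{Nm}} \sum_k \frac{1}{\sqrt{2\omega_k}} \big( a_k\, e^{-i\omega_k t + ikar} + a_k^\dagger\, e^{i\omega_k t - ikar} \big), \\
P_r(t) &= -i \sqrt{\frac{m\hbar}{N}} \sum_k \sqrt{\frac{\omega_k}{2}} \big( a_k\, e^{-i\omega_k t + ikar} - a_k^\dagger\, e^{i\omega_k t - ikar} \big) .
\end{aligned}
\tag{2.29}
\]
In these expressions, we’ve grouped adjoint terms together, to emphasize that Xr is Her-
mitian. Sometimes it’s more advantageous to group terms by complex normal modes
instead:
\[
\begin{aligned}
X_r(t) &= \sqrt{\frac{\hbar}{Nm}} \sum_k \frac{1}{\sqrt{2\omega_k}} \big( a_k(t) + a_{-k}^\dagger(t) \big)\, e^{ikar}, \\
P_r(t) &= -i \sqrt{\frac{m\hbar}{N}} \sum_k \sqrt{\frac{\omega_k}{2}} \big( a_k(t) - a_{-k}^\dagger(t) \big)\, e^{ikar} .
\end{aligned}
\tag{2.30}
\]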

Finally, note that every formula in this section equally applies to the classical case,
with the only exception that the Hamilton function reads (c.f. App. A.2.2):
\[
H = \sum_k \hbar\omega_k\, |a_k|^2 .
\]

2.3.2 Global phase gauge symmetry and particle number conservation


A gauge symmetry is a mathematical redundancy in the description of physical objects.
In quantum mechanics, a global phase change $|\psi\rangle \mapsto e^{i\phi}|\psi\rangle$ is such a redundancy. It is
implemented by the 1 × 1 unitary “matrix” $(e^{i\phi}) \in U(1)$, and therefore called global U(1)
gauge symmetry.
When we constructed Fock space, we inadvertently got in tension with that symmetry!
That’s because multiplying all single-particle vectors by $e^{i\phi}$ means that a tensor product
of n such vectors changes by $e^{in\phi}$. In other words, if
\[
\hat N = \sum_i \hat n_i = \sum_i a_i^\dagger a_i
\]
is the total particle number operator, then U(1) acts on Fock space as $e^{i\phi \hat N}$. Thus, U(1)-
transformations induce relative phases between subspaces of different particle numbers.
These will change the expectation values of observables that do not commute with N̂ .
So can we observe global phase changes of single-particle states when working with
many-body systems?
For non-relativistic massive particles (i.e. the kind of systems treated in undergraduate
QM courses), the answer is “no”. Loosely speaking, we expect that in a “non-relativistic
theory” deserving of that name, massive particles cannot be created or destroyed. We should
then require that all physical observables commute with total particle number. The re-
quirement that all physical observables obey an extra symmetry (i.e. [A, N̂ ] = 0) is called
a superselection rule. In particular, because
\[
a^\dagger_{e^{i\phi}\psi} = e^{i\phi}\, a^\dagger_\psi, \qquad a_{e^{i\phi}\psi} = e^{-i\phi}\, a_\psi , \tag{2.31}
\]

linear expressions in ladder operators are not directly observable in the presence of this
superselection rule.
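
The different behavior of number-conserving and linear expressions under the U(1) action can be seen in a single truncated Bosonic mode. The following sketch assumes a cutoff n_max and an arbitrary phase ϕ = 0.7, both chosen here for illustration. It conjugates a† with e^{iϕN̂} and compares the expectation value of a + a† before and after the phase rotation.

```python
import numpy as np

n_max = 5  # numerical cutoff on the occupation number (assumption)
phi = 0.7  # arbitrary U(1) phase (assumption)

levels = np.arange(n_max + 1)
a = np.diag(np.sqrt(levels[1:]), k=1)   # a|n> = sqrt(n)|n-1>
a_dag = a.T
N_op = np.diag(levels).astype(float)    # number operator
U = np.diag(np.exp(1j * phi * levels))  # e^{i phi N}

# Conjugation reproduces the phase in (2.31): U a^dagger U^dagger = e^{i phi} a^dagger
rotated_a_dag = U @ a_dag @ U.conj().T

# A state with indefinite particle number: (|0> + |1>)/sqrt(2)
psi = np.zeros(n_max + 1, dtype=complex)
psi[0] = psi[1] = 1 / np.sqrt(2)
psi_rot = U @ psi

x = a + a_dag  # linear in ladder operators
x_before = (psi.conj() @ x @ psi).real
x_after = (psi_rot.conj() @ x @ psi_rot).real
n_before = (psi.conj() @ N_op @ psi).real
n_after = (psi_rot.conj() @ N_op @ psi_rot).real

print(x_before, x_after)  # changes from 1 to cos(phi)
print(n_before, n_after)  # unchanged
```

The expectation value of the number operator is blind to the phase, while the linear expression picks up a factor cos ϕ in this state.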
The Fock space for phonons was not constructed starting from a single-particle Hilbert
space of a non-relativistic massive particle, so the argument does not apply in this case.
And indeed, the observable (2.29) corresponding to the displacement of the r-th particle
(clearly a measurable quantity, at least in principle) is a linear combination of ladder oper-
ators. Also, as we’ll see next, when the particle number tends to infinity, the physical and
mathematical definition of N̂ becomes iffy, which may lead non-relativistic systems to
behave as if particle number conservation is violated.

Figure 2.2: The motion in a Newton cradle is determined by energy and momentum
conservation alone. (Figure adapted from Wikipedia.)

2.4 Bose gas: Take 1


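Indistinguishable particles with interaction potential V are described by the Hamiltonian
\[
H = \int \hat\Psi^\dagger(x) \Big( \frac{-\hbar^2 \nabla^2}{2m} - \mu \Big) \hat\Psi(x) \, d^3x
+ \frac12 \int \hat\Psi^\dagger(x)\, \hat\Psi^\dagger(y)\, V(x - y)\, \hat\Psi(y)\, \hat\Psi(x) \, d^3x\, d^3y ,
\]
where the factor 1/2 is the one from (2.27).
Simplify: Restrict the gas to a box of finite volume V (App. A.1.9); choose a “hard core
interaction potential” V(x − y) = U δ(x − y), with Fourier transform $\tilde V(q) = V^{-1} U$;
switch to momentum representation; suppress vector notation; set $\epsilon_k = \frac{\hbar^2 k^2}{2m}$. Then
\[
H = \sum_k (\epsilon_k - \mu)\, a_k^\dagger a_k + \frac{U}{2V} \sum_{k, k', q} a_{k+q}^\dagger\, a_{k'-q}^\dagger\, a_{k'}\, a_k . \tag{2.32}
\]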

This still is difficult to treat, so let’s get some intuition first, to guide our analysis.

Superfluidity
At very low temperature, Helium becomes superfluid: A particle slowly passing through
it does not experience friction. Here’s a way to think about that: Recall the Newton cradle
(Fig. 2.2), where one can uniquely determine the number of balls being excited merely
from energy and momentum conservation. Likewise, one may model the interaction be-
tween the particle and the gas as a scattering process, where the particle transfers energy
and momentum to the gas. Now imagine that the energy-momentum relations of the par-
ticle and the excitations of the gas are “out of tune” in the sense that there is no process
that would respect both conservation laws. In this case, no scattering is possible and one
would expect the particle to pass through the gas uninhibited.
With this model in mind, we set it as our goal to work out the energy-momentum
relation of the low-lying excitations of H.

Bose-Einstein condensation
Recall that for non-interacting Bosons (i.e. when V = 0), the ground state is achieved
when all particles are in the lowest-energy state of the single-particle term. It is plausible
(though a very difficult question to treat rigorously) that remnants of this behavior per-
sist for non-zero interaction V and for low-lying states. We will thus treat H under the
assumption that there is a finite density
\[
\rho = \frac{n_0}{V} = \frac{1}{V}\, \langle a_0^\dagger a_0 \rangle \tag{2.33}
\]
of particles occupying the k = 0 mode. To achieve this, we add a “chemical potential
term” −µN̂ to the Hamiltonian and will later adjust µ to achieve (2.33).

Figure 2.3: SSB. TBD.

2.4.1 Approximate solution part 1


Because for low-lying many-body states |ψ⟩, we expect the occupation number $\langle\psi|a_0^\dagger a_0|\psi\rangle$
of the single-particle ground state to be much larger than the ones for other modes, we ne-
glect all terms that are of third order or higher in creation/annihilation operators for k ≠ 0.
A lengthy but uneventful calculation leads to
\[
\begin{aligned}
H = {} & \sum_k (\epsilon_k - \mu)\, a_k^\dagger a_k + \frac{U}{2V}\, a_0^\dagger a_0\, a_0^\dagger a_0 + \frac{2U}{V} \sum_{k \neq 0} a_0^\dagger a_0\, a_k^\dagger a_k \\
& + \frac{U}{2V} \sum_{k \neq 0} \big( a_0^\dagger a_0^\dagger\, a_k a_{-k} + a_0 a_0\, a_k^\dagger a_{-k}^\dagger \big) + O(a_k^3) .
\end{aligned}
\tag{2.34}
\]
To make further progress, we employ Bogoliubov’s c-number substitution: Replace the
operator $\frac{1}{\sqrt V}\, a_0$ with a complex number $\sqrt{\rho}\, e^{i\theta}$.
Wait, we do what? Why would that be justified? The minimal story goes like this:
In the limit V → ∞, the number of Bosons n0 in the k = 0-mode is expected to be
macroscopic n0 = ρV → ∞. Because we cannot physically resolve the number of Bosons
in the mode, $a_0|n_0\rangle = \sqrt{n_0}\,|n_0 - 1\rangle$ “behaves just like” $\sqrt{n_0}\,|n_0\rangle$ with respect to any
measurement we can actually implement. So switching to a mathematical model where a0
is not a ladder operator at all, but rather equal to $\sqrt{\rho V}\, 1$ should give similar results.
Well, OK. That always seemed at most mildly convincing to me. To get a better feeling
for why this is a justified way of arguing, let’s take a detour and introduce a broader
framework for such phenomena, which are connected to spontaneous symmetry breaking.

If you are already fully convinced, or if “mildly convincing” is anyway all you
aim for at this moment, you can skip ahead to Sec. 2.6.

2.5 Detour: Spontaneous symmetry breaking


Broadly interpreted, the concept of spontaneous symmetry breaking (SSB) refers to any
situation where the solutions of a problem are less symmetric than the problem itself.
There are banal ways in which this can manifest (Fig. 2.3), but there are also deep ones. In
the examples we’ll look at, the technical origin of the effect may be traced back to the
(vague, for now) principle

“One cannot implement operators that act on macroscopically many particles.” (2.35)

2.5.1 Ferromagnetism
The guiding phenomenological example is ferromagnetism. If cooled below its Curie tem-
perature, a ferromagnet develops a magnetic moment M ̸= 0. In the absence of external
fields, the moment M is equally likely to point in any direction. Thus, statistically, the
behavior is rotationally invariant. But every time the magnet is cooled down, it “sponta-
neously” singles out one direction in space, thereby “breaking the symmetry”.

The simplest case of a model exhibiting ferromagnetic behavior is the Ising model. It
involves N spin-1/2 particles – and in fact, we can learn a lot by looking at their Hilbert
space in the limit N → ∞, even before introducing the Hamiltonian.
Indeed, consider the two states (depending on the relative phase)
\[
|\psi_\pm^N\rangle = \frac{1}{\sqrt2} \big( |{\uparrow}\rangle^{\otimes N} \pm |{\downarrow}\rangle^{\otimes N} \big) .
\]
The $|\psi_\pm^N\rangle$ are eigenvectors of $\sigma_x^{\otimes N}$ with eigenvalue +1 and −1 respectively. Despite them
being orthogonal, I claim that as N gets macroscopic, the two states become effectively
indistinguishable.
To justify this outrageous claim, assume that just one of the macroscopically many par-
ticles is lost (as will always, realistically, be the case). Then any measurement effectively
takes place on the reduced density matrix
\[
\mathrm{tr}_1\, |\psi_\pm^N\rangle\langle\psi_\pm^N| = \frac12 \big( |{\uparrow}\rangle\langle{\uparrow}| \big)^{\otimes (N-1)} + \frac12 \big( |{\downarrow}\rangle\langle{\downarrow}| \big)^{\otimes (N-1)} ,
\]
which is a uniform mixture of |↑ . . . ⟩, |↓ . . . ⟩, and independent of the relative phase. In this
sense: The operator $\sigma_x^{\otimes N}$ does not actually describe a physically realizable measurement
in the limit N → ∞.
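
The loss of the relative phase can be reproduced for a small number of spins. The sketch below assumes N = 4 and encodes ↑ as 0 and ↓ as 1 (illustrative choices). It traces out the first spin of both states and compares the resulting reduced density matrices.

```python
import numpy as np

N = 4  # number of spins (small, illustrative assumption)

def cat_state(n_spins, sign):
    """(|up...up> + sign * |down...down>) / sqrt(2), with up = 0 and down = 1."""
    psi = np.zeros(2 ** n_spins)
    psi[0] = 1.0    # all spins up
    psi[-1] = sign  # all spins down
    return psi / np.sqrt(2)

def trace_out_first(psi, n_spins):
    """Reduced density matrix of spins 2..N after discarding the first spin."""
    mat = psi.reshape(2, 2 ** (n_spins - 1))  # rows: first spin, columns: the rest
    return mat.T @ mat.conj()

psi_plus = cat_state(N, +1.0)
psi_minus = cat_state(N, -1.0)

overlap = psi_plus @ psi_minus
rho_plus = trace_out_first(psi_plus, N)
rho_minus = trace_out_first(psi_minus, N)

print(overlap)                           # the two pure states are orthogonal
print(np.allclose(rho_plus, rho_minus))  # identical after losing one spin
print(np.diag(rho_plus))                 # weight 1/2 on all-up and on all-down
```

Although the two global states are perfectly distinguishable in principle, no measurement on the remaining N − 1 spins can tell them apart.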
Sometimes, it is beneficial to keep idealized mathematical objects around (like δ func-
tions) even if they are not directly physical. In this case, however, it turns out that we’ll
attain a cleaner understanding of ferromagnetism, superfluidity, and many other impor-
tant quantum many-body phenomena, if we commit to the principle (2.35) and declare
operators like σx⊗N to be unphysical in the limit N → ∞.
Let’s explore this further. Define H↑ to be the space of states that can be reached by
physical operations starting from |↑⟩⊗N and define H↓ analogously. For microscopic N ,
the two spaces are identical, but as N → ∞, they become orthogonal. A good way to see
this is to consider the average magnetization. For a state |ψ⟩, it is defined as
\[
m := \frac{1}{N} \sum_{k=1}^{N} \langle\psi|\sigma_z^{(k)}|\psi\rangle . \tag{2.36}
\]
The average magnetization is +1 on $|{\uparrow}\rangle^{\otimes N}$ and −1 on $|{\downarrow}\rangle^{\otimes N}$. If A is any physical operator,
then by (2.35), A|ψ⟩ differs from |ψ⟩ only on a microscopic number of spins. For N → ∞,
this doesn’t affect the average in (2.36), and we conclude that no quantum-mechanical
process can change m in that limit.
One consequence is that within each of these two physically separated Hilbert spaces,
the average magnetization operator $\frac{1}{N} \sum_{k=1}^{N} \sigma_z^{(k)}$ can be replaced by a number, namely by
±1 respectively. (Spoiler alert: That’s the mechanism that will allow us to replace $\frac{a_0}{\sqrt V}$ by
a complex number for Bose-Einstein condensates.)

We can still mathematically write down superpositions $|\psi\rangle = \sqrt{p}\,|\psi_\uparrow\rangle + e^{i\phi} \sqrt{1 - p}\,|\psi_\downarrow\rangle$
between vectors |ψ↑⟩ ∈ H↑, |ψ↓⟩ ∈ H↓ of the two disjoint spaces. But because for every
physical operation A, the matrix elements between them vanish, ⟨ψ↑|A|ψ↓⟩ = 0, these co-
herent superpositions cannot be experimentally distinguished from the incoherent mixture
\[
\rho = p\, |\psi_\uparrow\rangle\langle\psi_\uparrow| + (1 - p)\, |\psi_\downarrow\rangle\langle\psi_\downarrow| .
\]

That’s a generalization of the example we started with.



It’s time to have a look at the Hamiltonian of the Ising model:


X
H = −J σz(i) σz(j) , J > 0,
i,j

where the sum is over nearest neighbors. The summands give



(i) (j) −J |si sj ⟩ = |↑↑⟩, |↓↓⟩,
−Jσz σz |si sj ⟩ =
+J |si sj ⟩ = |↑↓⟩, |↓↑⟩.

We immediately see that the Hamiltonian is invariant under a simultaneous flip of all spins,
realized by the operator σx⊗N . Also, the ground state energy is −J times the number of
neighboring pairs. It is attained on the subspace with basis |↑⟩⊗N , |↓⟩⊗N , or, equivalently,
with basis |ψ^N_±⟩.
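
Because H is diagonal in the σz product basis, its spectrum can be read off by enumerating classical spin configurations. The following is a minimal sketch under the assumption of an open chain of N = 6 spins (the notes do not fix a lattice); the function names are hypothetical.

```python
from itertools import product


def ising_energy(spins, J=1.0):
    """Energy of an open chain: -J times the sum of s_i * s_j over nearest neighbours."""
    return -J * sum(a * b for a, b in zip(spins, spins[1:]))


def ground_states(n, J=1.0):
    """Brute-force the minimal energy and all configurations that attain it."""
    configs = list(product((+1, -1), repeat=n))
    e_min = min(ising_energy(c, J) for c in configs)
    return e_min, [c for c in configs if ising_energy(c, J) == e_min]


e_min, states = ground_states(6)
print(e_min, states)
```

For the open chain there are N − 1 nearest-neighbour pairs, and the only minimizers are the two fully aligned configurations, which are mapped onto each other by the global spin flip.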
Given the discussion above, it is now easy to see what happens. The spin flip symmetry
of the Hamiltonian is implemented by σx⊗N , which “breaks” in the sense that it becomes
unphysical for N → ∞. For microscopic N , the |ψ^N_±⟩ are pure ground states that are
invariant under the spin flip symmetry (up to phase). As N → ∞, they remain invariant,
but they become effectively mixed. In fact, any ground state α|↑⟩⊗N + β|↓⟩⊗N becomes
a mixture of the two non-symmetric ones |↑⟩⊗N , |↓⟩⊗N . We can now connect back to the
loose definition of “symmetry breaking” in the very beginning: The restriction on physical
observables in macroscopic systems means that there is no longer a pure ground state that
shares the symmetry of the Hamiltonian.

Further comments (not needed in the sequel)


• Because σx σz σx† = −σz , the average magnetization vanishes for every state (pos-
sibly mixed) that is spin flip invariant. An observable that “witnesses the lack of
symmetry” in this way is called an order parameter.
• In reality, there are always at least some tiny external fields around, so we should modify the
Hamiltonian to read Hλ = H + λ Σ_k σz^(k), where λ corresponds to the net external
field. The sign of λ lifts the degeneracy of the ground space. SSB is then witnessed
by the fact that when taking the limits lim_{λ→0} lim_{N→∞} of Hλ (in that order!), the resulting
ground state depends on whether λ approaches 0 from above or from below.
• In elementary QM, one often associates different Hilbert spaces with the same quan-
tum system, as a matter of convenience. For example, a single harmonic oscillator
can be described by the Hilbert space L2 (R) of square-integrable functions, or by
the Fock space F(C). These choices are equivalent: Every basis vector |n⟩ ∈ F(C)
can be mapped to a wave function (in terms of Gaussians and Hermite polynomials)
and this way, every set of expectation values realizable on one of the Hilbert spaces
can be reproduced on the other. In contrast, the fact that the average magnetization
takes on different values in the two Hilbert spaces constructed above, shows that the
representations of the observables of the Ising model on them are inequivalent.

2.5.2 SSB and Bose-Einstein condensation


We are now ready to argue that Bose-Einstein condensation of a macroscopic number of
particles leads to spontaneous symmetry breaking, this time of a continuous symmetry.

Consider a Bose gas contained in a box of volume V . Recall from (2.33) that we are
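interested in states |ψ⟩ that have a fixed density ρ = n0 /V of particles in the k = 0-mode:
\[ \frac{1}{V}\langle\psi|a_0^\dagger a_0|\psi\rangle = \langle\psi|\Big(\frac{1}{\sqrt{V}} a_0^\dagger\Big)\Big(\frac{1}{\sqrt{V}} a_0\Big)|\psi\rangle = \rho. \tag{2.37} \]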
In the limit V → ∞, measuring the precise occupation number

⟨ψ|a†0 a0 |ψ⟩ = V ρ → ∞

would require us to count a macroscopic number of particles. Consistent with the principle
(2.35), we reject this as unphysical. The density, however, should be measurable. Hence
we posit that an observable is physical only if it can be expressed in terms of the re-scaled
ladder operators
\[ \frac{1}{\sqrt{V}}\, a_0, \qquad \frac{1}{\sqrt{V}}\, a_0^\dagger \tag{2.38} \]

as well as the ak , a†k for k ̸= 0 (with coefficients that do not depend on V , of course).
This seemingly minor restriction has dramatic effects in the limit V → ∞. Indeed,
\[ \Big[\frac{1}{\sqrt{V}}\, a_0,\ \frac{1}{\sqrt{V}}\, a_0^\dagger\Big] = \frac{1}{V} \to 0, \tag{2.39} \]

so that in the thermodynamic limit, the operators (2.38) commute! But then all physical
observables commute with a0/√V (why?). This operator therefore plays the same role as
the average magnetization in the Ising model: Its eigenspaces are physically separated
in the sense that relative phases between them are not observable and no vector can be
mapped from one eigenspace to another. If (a0/√V)|ψ⟩ = λ|ψ⟩, then (2.37) implies that
λ = √ρ e^{iθ} for some θ ∈ [0, 2π).
Thus, we may always assume that the dynamics takes place in one of the eigenspaces
\[ H_\theta = \Big\{ |\psi\rangle \ \Big|\ \frac{a_0}{\sqrt{V}}|\psi\rangle = \sqrt{\rho}\, e^{i\theta}|\psi\rangle \Big\}, \]

where a0/√V acts like √ρ e^{iθ}. This is what we set out to justify.

U (1) symmetry breaking


Just like spin flip symmetry before, there are unphysical operations that do connect differ-
ent eigenspaces. This role is played by the U (1) symmetry eiϕN̂ (Sec. 2.3.2). It “breaks”
in the V → ∞ limit, because it involves the diverging total particle number operator N̂ .
Mathematically, however, it holds that

eiϕN̂ Hθ = Hθ+ϕ .

That’s because annihilation operators transform as

e−iϕN̂ aα eiϕN̂ = eiϕ aα (2.40)

(the positive phase gets applied to one more particle than the negative one). Some conse-
quences:

By (2.40), the expectation value ⟨a0/√V⟩ vanishes in any state that is U (1) invariant. The
operator a0/√V therefore constitutes an order parameter. Now comes a big difference to the
Ising example. On Fock space for massive non-relativistic particles, we have a second
condition for an observable to be physical: In addition to fulfilling (2.35), observables
also have to be gauge invariant (Sec. 2.3.2). Hence in this case, the order parameter is
not measurable (unlike the average magnetization, which is the central physical quantity
associated with the Ising magnet). It also means that the physical behavior of the Bose gas
can only depend on ρ, not on θ, so we are free to restrict to the case θ = 0 below.
We found that a state |ψ⟩ is pure with respect to the physical observables only if it is
contained in one of the Hθ spaces. But then, it isn’t U (1)-invariant. Only the mixed state
\[ \rho = \frac{1}{2\pi}\int_0^{2\pi} e^{i\phi\hat N}\,|\psi\rangle\langle\psi|\,e^{-i\phi\hat N}\, d\phi \]

is. We’re again encountering the dichotomy that states are symmetric or pure, but not both.

2.6 Bose gas: Take 2


Back to the Bose gas. We (reasonably) assume that each low-lying state has a non-zero
density ρ = n0 /V of particles occupying the single-body ground state. The value of ρ will
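be determined momentarily. For now, following the discussion on SSB, we just make the
substitution a0/√V ↦ √ρ. Then (2.34) becomes
\[ H \simeq V\Big(-\mu\rho + \frac{U}{2}\rho^2\Big) + \sum_{k\neq 0}\big(\epsilon_k - \mu + 2U\rho\big)\, a_k^\dagger a_k + \frac{U\rho}{2}\sum_{k\neq 0}\big(a_k a_{-k} + a_k^\dagger a_{-k}^\dagger\big). \]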

The effect of ρ on the energy will, in the limit V → ∞, be dominated by the first
term, which is the only one proportional to V . Thus, low-lying states will have a density ρ
minimizing that term. Setting its derivative to zero gives the relation ρ = µ/U . We keep ρ
and eliminate µ, to get
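\[ H = \text{const.} + \sum_{k\neq 0}\big(\epsilon_k + U\rho\big)\, a_k^\dagger a_k + \frac{U\rho}{2}\sum_{k\neq 0}\big(a_k a_{-k} + a_k^\dagger a_{-k}^\dagger\big). \tag{2.41} \]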

This is a quadratic expression in ladder operators, so we know from general principles
(Sec. A.2.2) that it can be diagonalized using a canonical transformation. Let’s find it in
two (and a half) easy steps!

The result of the following “2½ easy steps” is summarized in (2.46). In principle,
one can just check directly that the form of H given there is indeed equal to
(2.41). Below, we only describe a somewhat natural thought process that leads
to (2.46). If you’re in a hurry, skip ahead.

Step 1: Decouple. Start with the rightmost term. It creates / destroys pairs of particles
of opposite momentum. This suggests switching the basis of the single-particle space to
one that consists of superpositions of states moving in opposite directions. Remembering
that | ± k⟩ are represented in position space by complex exponentials that are each other’s
conjugates, the cosine / sine basis
\[ \frac{1}{\sqrt{2}}\big(|k\rangle + |{-k}\rangle\big), \qquad \frac{-i}{\sqrt{2}}\big(|k\rangle - |{-k}\rangle\big) \tag{2.42} \]

seems like a natural candidate. Let’s agree that a vector k is positive if its first non-zero
component is. Then there is exactly one positive wave vector in every pair +k, −k. For
k > 0, define the annihilation operators
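\[ b_k = \frac{1}{\sqrt{2}}(a_k + a_{-k}) \qquad \text{(“positive k label the cosines”)} \]
\[ b_{-k} = \frac{i}{\sqrt{2}}(a_k - a_{-k}) \qquad \text{(“negative k label the sines”)} \]
associated with the new basis. Inverting,
\[ a_k = \frac{1}{\sqrt{2}}(b_k - i\,b_{-k}), \qquad a_{-k} = \frac{1}{\sqrt{2}}(b_k + i\,b_{-k}), \qquad k > 0. \tag{2.43} \]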
Plugging in, the pair term decouples, as hoped:
\[ H = \text{const.} + \sum_{k\neq 0}\Big[\big(\epsilon_k + U\rho\big)\, b_k^\dagger b_k + \frac{U\rho}{2}\big(b_k b_k + b_k^\dagger b_k^\dagger\big)\Big]. \]

Step 2: Solve harmonic oscillator. It turns out that each summand represents a har-
monic oscillator and that a simple re-scaling of position and momentum coordinates will
put it into standard form. To see how this works, we switch to Hermitian operators for the
moment:
\[ X = \frac{1}{\sqrt{2}}(b_k + b_k^\dagger), \qquad P = \frac{-i}{\sqrt{2}}(b_k - b_k^\dagger). \]
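Abbreviating A = ϵk + U ρ, B = U ρ, one directly finds, up to an additive constant −A/2 that we absorb into “const.”,
\[ A\, b_k^\dagger b_k + \frac{B}{2}\big(b_k b_k + b_k^\dagger b_k^\dagger\big) = \frac{A-B}{2}\, P^2 + \frac{A+B}{2}\, X^2. \tag{2.44} \]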
Well, we know how to solve these using undergrad methods (App. A.2.1)! The transfor-
mation
\[ \tilde X = \sqrt[4]{\frac{A+B}{A-B}}\; X, \qquad \tilde P = \sqrt[4]{\frac{A-B}{A+B}}\; P, \qquad E_k = \sqrt{A^2 - B^2} \]
A−B A+B

is obviously canonical, [X̃, P̃ ] = [X, P ], and puts the oscillator into standard form:

\[ \frac{1}{2} E_k\big(\tilde X^2 + \tilde P^2\big) = \frac{1}{2}\sqrt{(A+B)(A-B)}\,\Big(\sqrt{\frac{A-B}{A+B}}\; P^2 + \sqrt{\frac{A+B}{A-B}}\; X^2\Big) = (2.44). \]

Therefore, setting b̃k = (1/√2)(X̃ + iP̃ ), we have diagonalized H (that wasn’t too hard):
\[ H = \text{const.} + \sum_{k\neq 0} E_k\, \tilde b_k^\dagger \tilde b_k, \qquad E_k = \sqrt{\epsilon_k^2 + 2U\rho\,\epsilon_k}. \tag{2.45} \]

Step 2.5: Cleanup. Because E−k = Ek , the Hamiltonian is degenerate and any
unitary transformation within the ±k-subspaces will leave its form invariant. Choosing
\[ c_k := \frac{1}{\sqrt{2}}\big(\tilde b_k - i\,\tilde b_{-k}\big), \qquad c_{-k} := \frac{1}{\sqrt{2}}\big(\tilde b_k + i\,\tilde b_{-k}\big) \qquad \text{for } k > 0 \]

turns out to lead to the cleanest theory. Plugging in all the nested definitions gives
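\[ c_k = u_k a_k - v_k a_{-k}^\dagger, \qquad u_k = \frac{1}{2}\Big(\sqrt{\frac{\epsilon_k}{E_k}} + \sqrt{\frac{E_k}{\epsilon_k}}\Big), \qquad v_k = \frac{1}{2}\Big(\sqrt{\frac{\epsilon_k}{E_k}} - \sqrt{\frac{E_k}{\epsilon_k}}\Big), \]
\[ E_k = \sqrt{\epsilon_k^2 + 2U\rho\,\epsilon_k}, \qquad H = \text{const.} + \sum_{k\neq 0} E_k\, c_k^\dagger c_k. \tag{2.46} \]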

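The coefficients lie on the unit hyperbola:

\[ u_k^2 - v_k^2 = \frac{1}{4}\Big(\frac{\epsilon_k}{E_k} + 2 + \frac{E_k}{\epsilon_k}\Big) - \frac{1}{4}\Big(\frac{\epsilon_k}{E_k} - 2 + \frac{E_k}{\epsilon_k}\Big) = 1 \]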

which implies (exercise) that the inverse transformation is

ak = uk ck + vk c†−k . (2.47)
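
As a numerical sanity check of (2.46) and (2.47), the following sketch evaluates the coefficients for one arbitrarily chosen pair (ϵk, Uρ). It relies on a short calculation that is not spelled out in the notes: inserting (2.47) into (2.41) with A = ϵk + Uρ and B = Uρ, the coefficient of c†k ck comes out as A(u² + v²) + 2Buv, and the coefficient of the pair terms as 2Auv + B(u² + v²). The function name `bogoliubov` and the test values are assumptions made for illustration.

```python
import math


def bogoliubov(eps, u_rho):
    """Return (E, u, v) for single-particle energy eps > 0 and interaction scale U*rho >= 0."""
    energy = math.sqrt(eps ** 2 + 2.0 * u_rho * eps)
    u = 0.5 * (math.sqrt(eps / energy) + math.sqrt(energy / eps))
    v = 0.5 * (math.sqrt(eps / energy) - math.sqrt(energy / eps))
    return energy, u, v


eps, u_rho = 0.3, 1.7            # arbitrary illustrative values
E, u, v = bogoliubov(eps, u_rho)
A, B = eps + u_rho, u_rho

hyperbola = u ** 2 - v ** 2                            # u^2 - v^2
diagonal = A * (u ** 2 + v ** 2) + 2.0 * B * u * v     # coefficient of c_k^dagger c_k
pair = 2.0 * A * u * v + B * (u ** 2 + v ** 2)         # coefficient of c_k c_{-k} + h.c.
print(hyperbola, diagonal - E, pair)
```

If the transformation diagonalizes H, the first number equals one and the other two vanish up to rounding.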

Discussion
We have found that the elementary excitations of the Bose gas are given by quasi-particles
created by the c†k . The ground state is the quasi-particle vacuum characterized by

ck |0⟩(q) = 0 ∀k.

It is not to be confused with the particle vacuum |0⟩(p) characterized by ak |0⟩(p) = 0! For
example, using (2.47), the expected number of particles with momentum k in the quasi-
particle vacuum is

⟨0|(q) a†k ak |0⟩(q) = ⟨0|(q) (uk c†k + vk c−k )(uk ck + vk c†−k )|0⟩(q) = vk2 .

While the above shows that quasi-particle occupation number states | . . . nk . . . ⟩(q) do
not have definite particle numbers, it turns out that they do have definite momentum! In
the exercise, you will show that c†k creates quasi-particles with momentum ℏk. Thus, E(k)
found in (2.45) describes their energy-momentum (or dispersion) relation. Compared to a
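free particle, E(k) involves the additional term 2U ρ ϵk under the square root. It dominates if
\[ \epsilon_k = \frac{\hbar^2\|k\|^2}{2m} \ll 2U\rho \quad\Leftrightarrow\quad \frac{\|\hbar k\|}{m} \ll 2\sqrt{\frac{U\rho}{m}} = 2c, \qquad c := \sqrt{\frac{U\rho}{m}}, \]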
i.e. for velocities much smaller than c. In this regime, we have Ek ≃ c∥ℏk∥, that is, energy
scales linearly with momentum. Beyond that, Ek is convex (“bends upwards”, Fig. 2.4),
so that Ek ≥ c∥ℏk∥ holds in general.
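
The two regimes can be checked numerically. The sketch below works in units chosen for convenience (ℏ = m = Uρ = 1, an assumption made only for illustration) and compares Ek with the linear law c∥ℏk∥; the function names are hypothetical.

```python
import math


def dispersion(k, m=1.0, hbar=1.0, u_rho=1.0):
    """Quasi-particle energy E_k = sqrt(eps_k^2 + 2 U rho eps_k), eps_k = (hbar k)^2 / (2m)."""
    eps = (hbar * k) ** 2 / (2.0 * m)
    return math.sqrt(eps ** 2 + 2.0 * u_rho * eps)


def sound_velocity(m=1.0, u_rho=1.0):
    """c = sqrt(U rho / m)."""
    return math.sqrt(u_rho / m)


c = sound_velocity()
for k in (1e-3, 0.1, 1.0, 10.0):
    # with hbar = 1 the linear law is simply c * k
    print(k, dispersion(k), c * k)
```

For small k the two columns agree closely, while for large k the quadratic free-particle term takes over and Ek lies above the line.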
As alluded to in the very beginning, this means that a particle moving through the Bose
gas at low velocity cannot slow down by transferring energy and momentum to a quasi-
particle. Quantitatively: Let M be the mass of the test particle and p its initial momentum.
Assume it excites a quasi-particle of momentum q. Then energy conservation demands

\[ 0 = \frac{\|p\|^2}{2M} - \frac{\|p-q\|^2}{2M} - E_q = \frac{p\cdot q}{M} - \frac{\|q\|^2}{2M} - E_q \le \frac{\|p\|\,\|q\|}{M} - c\|q\| = \Big(\frac{\|p\|}{M} - c\Big)\|q\| \]

which has a solution only if the test particle has initial velocity ∥p∥/M at least c.

Figure 2.4: Blue line: Dispersion relation E(∥k∥) for the Bose gas. Orange line: E =
cℏ∥k∥ is a good approximation for small ∥k∥, and a lower bound for all k. The x-axis is
in units of mc/ℏ, y-axis in units of mc2 .

2.7 Further reading


A good presentation of many-body theory is Advanced Quantum Mechanics by Schw-
abl, which also covers the Bose gas. Spontaneous symmetry breaking is a complex phe-
nomenon that can be approached from many points of view that might feel quite different.
I enjoy the presentations by Strocchi (Elements of Quantum Mechanics in Infinite Systems
and Symmetry Breaking), but they might be a little too mathematical for the average taste.
A more phenomenological approach in the language of path integrals is in Chapter 6 of
Condensed Matter Field Theory by Alexander Altland (of Cologne) and Ben Simons.
Chapter 3

Field quantization and quantum theory of light

Our goal is to construct a quantum theory for the EM field. Since quantum mechanics
is more fundamental than classical physics, one cannot hope to derive a quantum theory
from its classical limit. “Quantization” thus always involves educated guesses.
To educate ourselves, we’ll first have another look at lattice vibrations (Sec. 2.3.1). For
both their classical and their quantum model, one can easily construct a continuum limit.
The result is a classical and a quantum field theory. Their relation will serve as a template
for quantizing other fields.

3.1 Phonon continuum limit


Recall our treatment of N coupled particles arranged in a line of length L (Sec. 2.3.1). For
phenomena that have length scales much larger than the equilibrium spacing a = L/N ,
the behavior of the model should not depend on the precise value of a. (Try to infer the
lattice spacing from listening to the sound of a string instrument...). More precisely, the
family of models with parameters
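\[ N(\lambda) = \lambda N, \qquad m(\lambda) = \frac{1}{\lambda}\, m, \qquad a(\lambda) = \frac{1}{\lambda}\, a, \qquad \kappa(\lambda) = \lambda\kappa, \]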
for λ ∈ N should all behave similarly (Fig. ??). It thus makes sense to investigate the limit
λ → ∞.
Quantities that do not depend on λ include the total length L = N a, the mass density
ρ = m/a, and the velocity c := √(κa/ρ). Asymptotically, also the dispersion relation
becomes independent:
\[ \omega_k^{(\lambda)} = 2\sqrt{\frac{\kappa\lambda^2}{m}}\;\big|\sin(ka\lambda^{-1}/2)\big| \ \to\ 2\sqrt{\frac{\kappa}{m}}\;\lambda\lambda^{-1}\,|ka/2| = c|k|. \]
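
The convergence of the dispersion relation can be watched numerically. The following sketch assumes κ = m = a = 1 and a fixed wave number k = 1 (illustrative values only) and applies the rescaling N(λ), m(λ), a(λ), κ(λ) given above; the function names are hypothetical.

```python
import math


def omega_lattice(k, lam, kappa=1.0, m=1.0, a=1.0):
    """Lattice dispersion after refinement: kappa -> lam*kappa, m -> m/lam, a -> a/lam."""
    kappa_l, m_l, a_l = lam * kappa, m / lam, a / lam
    return 2.0 * math.sqrt(kappa_l / m_l) * abs(math.sin(k * a_l / 2.0))


def omega_continuum(k, kappa=1.0, m=1.0, a=1.0):
    """Continuum dispersion c|k| with c = sqrt(kappa * a / rho) and rho = m / a."""
    rho = m / a
    return math.sqrt(kappa * a / rho) * abs(k)


k = 1.0
for lam in (1, 10, 100, 1000):
    print(lam, omega_lattice(k, lam), omega_continuum(k))
```

The lattice value approaches the continuum value from below as λ grows, since |sin x| ≤ |x|.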
Recall the formula (2.29) for the displacement of the r-th particle in terms of the nor-
mal coordinates
\[ X_r(t) = \sqrt{\frac{\hbar}{Nm}}\,\sum_k \frac{1}{\sqrt{2\omega_k}}\big(a_k e^{-i\omega_k t + ikar} + a_k^\dagger e^{i\omega_k t - ikar}\big). \]


Let’s rewrite it in a form suitable for our limit. The product N m is just the total mass,
invariantly expressed as Lρ. Also, it makes sense to label the particles not by their index
r = 1, . . . N , but by their equilibrium position x = ra ∈ [0, L]. With these substitutions,
we obtain the “displacement field”
\[ \phi(t,x) = \sqrt{\frac{\hbar}{L\rho}}\,\sum_k \frac{1}{\sqrt{2\omega_k}}\big(a_k e^{-i\omega_k t + ikx} + a_k^\dagger e^{i\omega_k t - ikx}\big). \tag{3.1} \]

The continuum model is now defined as an infinite collection of harmonic oscillators
indexed by k ∈ (2π/L) Z with Hamiltonian

\[ H = \sum_k \hbar c|k|\, a_k^\dagger a_k + \text{const.} \tag{3.2} \]

and associated displacement field ϕ(t, x) given by (3.1).

There’s some trouble brewing in (3.2): The “constant” is Σ_k ½ ℏc|k|, which diverges.
This is the first of the many infinities of quantum field theory. This one is
easy to deal with: For finite N , the sum over the ground state energies of the harmonic
oscillators is finite. Subtracting this constant from the total energy does not
alter physical predictions, as long as we do not dynamically change the ground
state energy (e.g. by putting stress on the material in a way that affects the equilibrium
separation a) or get into the realm of general relativity. Thus, the renormalization
Σ_k ℏc|k| (a†k ak + ½) ↦ Σ_k ℏc|k| a†k ak , while maybe not very principled,
does not affect predictions and makes the continuum limit converge. So let’s adopt
this convention. (We’ll encounter more troubling infinities later.)

As in Sec. 2.3.1 and App. A.2.1, the definitions so far make sense equally in classical
and in quantum mechanics. In QM, the ak ’s are annihilation operators that are taken to
act on Fock space with occupation number basis | . . . nk . . . ⟩. Classically, the ak ’s are
complex numbers and (3.1) is the most general real-valued solution of the wave equation
\[ \Big(\frac{1}{c^2}\,\partial_t^2 - \partial_x^2\Big)\phi(t,x) = 0 \tag{3.3} \]
under cyclic boundary conditions.
We went through this exercise in order to find a strategy for quantizing Maxwell’s
equations. The relation between the classical and the quantum continuum model found
here suggests the following recipe for quantizing classical wave equations:

Summary

• Consider a classical wave equation whose solutions are of the form

\[ \phi(t,x) = \mathcal{N}\sum_k \frac{1}{\sqrt{2\omega_k}}\big(a_k e^{-i\omega_k t} f_k(x) + a_k^\dagger e^{i\omega_k t} f_k(x)^\dagger\big) \]

for some set of modes {fk (x)}k , a constant N and ak ∈ C.

• Choose normalization such that H = Σ_k ℏωk a†k ak is the energy of the field.

• The quantized field is obtained by associating an oscillator with every mode
and replacing the complex coefficients ak by annihilation operators acting
on a Bosonic or Fermionic Fock space.

Fields for which this program can be implemented are called free. We’ll only work
with free fields in this course. General interacting fields are treated in the QFT courses.
How to decide whether to use Fermionic or Bosonic Fock spaces will be a major topic in
Chap. 6.

Further comments
It is also of interest to write down a momentum field π(x) which describes the continuum
limit of the Pr . Because the mass of the individual particles goes to 0 for λ → ∞, only
the momentum density defines an interesting quantity in the limit. Thus, starting from
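\[ \frac{1}{a} P_r = -i\,\frac{1}{a}\sqrt{\frac{\hbar m}{N}}\,\sum_k \sqrt{\frac{\omega_k}{2}}\,\big(a_k e^{ikar} - a_k^\dagger e^{-ikar}\big), \]

and arguing as above, we get for the momentum density field

\[ \pi(x) = -i\sqrt{\frac{\hbar\rho}{L}}\,\sum_k \sqrt{\frac{\omega_k}{2}}\,\big(a_k e^{ikx} - a_k^\dagger e^{-ikx}\big). \]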
k

In the continuum limit, the commutation relation (or iℏ times the Poisson bracket) between
the displacement and the momentum density fields is

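\begin{align}
[\phi(x), \pi(y)] &= \frac{-i\hbar}{2L}\sum_{k,k'}\big[(a_k e^{ikx} + a_k^\dagger e^{-ikx}),\ (a_{k'} e^{ik'y} - a_{k'}^\dagger e^{-ik'y})\big] \\
&= \frac{i\hbar}{L}\sum_{k,k'} e^{i(kx - k'y)}\,[a_k, a_{k'}^\dagger] = \frac{i\hbar}{L}\sum_{k\in\frac{2\pi}{L}\mathbb{Z}} e^{ik(x-y)} = i\hbar\,\delta(x-y).
\end{align}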

3.2 Quantization of the EM field


Classical electrodynamics can be described either in terms of E- and B-fields, or in terms
of scalar and vector potential Φ, A such that

B = ∇ × A, E = −∇Φ − ∂t A. (3.4)

The classical Hamilton function


\[ H = \frac{1}{2m}\big(P - qA\big)^2 + q\Phi \]
for a charged particle is expressed in terms of the potential. This suggests that Φ, A, rather
than E, B, are the right fields to base a quantum theory on.
However, this immediately leads to a problem: Φ, A are determined by the physical
state of the EM field only up to gauge transformations

A 7→ A + ∇χ, Φ 7→ Φ − ∂t χ

with an arbitrary function χ. Here, we get rid of the ambiguity by adopting the Coulomb
gauge, fixed by the gauge condition

∇ · A(t, x) = 0. (3.5)

Further, we restrict to the free-space version of Maxwell’s equations, i.e. we assume that
there are no charges or currents, ρ = j = 0. In this case, the Maxwell equations become
\[ \Phi(t,x) = 0, \qquad \Big(\frac{1}{c^2}\,\partial_t^2 - \partial_x^2 - \partial_y^2 - \partial_z^2\Big) A(t,x) = 0. \tag{3.6} \]
In a box with side length L and cyclic boundary conditions, the space of complex
solutions to Eq. (3.6) is spanned by plane waves of the form
\[ A_k\, e^{\pm i\omega_k t + ikx}, \qquad A_k \in \mathbb{C}^3, \qquad k \in \frac{2\pi}{L}\mathbb{Z}^3, \qquad \omega_k := c\|k\|. \]
The gauge condition (3.5) requires the coefficients Ak to be “transversal” to the wave
vector k:

0 = ∇ · Ak e±iωk t+ikx = ik · Ak e±iωk t+ikx ⇔ k · Ak = 0.




We can take this into account by choosing, for each k, an ortho-normal basis (the polarization vectors)

\[ \{e_1(k), e_2(k)\} \subset \{k\}^\perp \subset \mathbb{R}^3 \qquad\text{with}\qquad e_\lambda(-k) = e_\lambda(k) \]

for the space orthogonal to k (Fig. ??). Then a general real-valued solution to the Maxwell
equations in Coulomb gauge is
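\[ A(t,x) = \sqrt{\frac{\hbar}{\epsilon_0 L^3}}\,\sum_{k,\lambda}\frac{1}{\sqrt{2\omega_k}}\; e_\lambda(k)\,\big(a_{k\lambda}\, e^{-i\omega_k t + ikx} + a_{k\lambda}^\dagger\, e^{+i\omega_k t - ikx}\big), \tag{3.7} \]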

where the sum is over wave vectors k ∈ (2π/L) Z³ and polarization directions λ ∈ {1, 2}. As
discussed before for phonons (Eq. (2.30)), it is often convenient to re-arrange the sum in
(3.7) so that terms corresponding to the same complex mode are grouped together:
\[ A(t,x) = \sqrt{\frac{\hbar}{\epsilon_0 L^3}}\,\sum_{k,\lambda}\frac{1}{\sqrt{2\omega_k}}\; e_\lambda(k)\,\big(a_{k\lambda}(t) + a_{-k\lambda}^\dagger(t)\big)\, e^{ikx}. \tag{3.8} \]

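The time evolution of the E and B-fields follows by applying (3.4). Setting κ = k/∥k∥,
\begin{align}
E(t,x) &= i\sqrt{\frac{\hbar}{\epsilon_0 L^3}}\,\sum_{k,\lambda}\sqrt{\frac{\omega_k}{2}}\; e_\lambda(k)\,\big(a_{k\lambda}\, e^{-i\omega_k t + ikx} - a_{k\lambda}^\dagger\, e^{+i\omega_k t - ikx}\big) \tag{3.9} \\
&= i\sqrt{\frac{\hbar}{\epsilon_0 L^3}}\,\sum_{k,\lambda}\sqrt{\frac{\omega_k}{2}}\; e_\lambda(k)\,\big(a_{k\lambda}(t) - a_{-k\lambda}^\dagger(t)\big)\, e^{ikx}, \tag{3.10} \\
B(t,x) &= i\sqrt{\frac{\hbar}{\epsilon_0 L^3 c^2}}\,\sum_{k,\lambda}\sqrt{\frac{\omega_k}{2}}\; \kappa\times e_\lambda(k)\,\big(a_{k\lambda}\, e^{-i\omega_k t + ikx} - a_{k\lambda}^\dagger\, e^{+i\omega_k t - ikx}\big) \tag{3.11} \\
&= i\sqrt{\frac{\hbar}{\epsilon_0 L^3 c^2}}\,\sum_{k,\lambda}\sqrt{\frac{\omega_k}{2}}\; \kappa\times e_\lambda(k)\,\big(a_{k\lambda}(t) + a_{-k\lambda}^\dagger(t)\big)\, e^{ikx}. \tag{3.12}
\end{align}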

Plugging these expressions into the formula


\[ H_{em} = \frac{\epsilon_0}{2}\int \big(E^2(t,x) + c^2 B^2(t,x)\big)\, d^3x \]

for the energy of the EM field, one finds after some calculations
\[ H_{em} = \sum_{k,\lambda} \hbar\omega_k\, |a_{k\lambda}|^2. \]

The A-field is thus of the form discussed in Sec. 3.1 so that one can perform a free-field
quantization. From now on, we will thus treat the akλ ’s as annihilation operators for a
collection of harmonic oscillators acting on the Fock space Hem .

Notation
For increased legibility, we’ll now write k for (k, λ), with the convention that −k corre-
sponds to (−k, λ). Also, for an element | . . . nk . . . ⟩ of the occupation number basis of
the harmonic oscillators, write |{n}⟩.

3.3 States of the EM field


3.3.1 Number states
Elements |{n}⟩ of the occupation number basis – i.e. states with a definite number of
photons in each mode – are called number states or Fock states. The expected electric field
strength in any number state is

\[ \langle\{n\}|E(x)|\{n\}\rangle \propto \sum_k \langle\{n\}|\big(a_k e^{ikx} - a_k^\dagger e^{-ikx}\big)|\{n\}\rangle = 0. \]

Zero on average does not imply zero with probability one. Indeed, compute the variance:

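\begin{align}
\langle\{n\}|E(x)\cdot E(x)|\{n\}\rangle
&= \frac{-\hbar}{\epsilon_0 L^3}\sum_{k,k'}\frac{\sqrt{\omega_k\omega_{k'}}}{2}\; e_k\cdot e_{k'}\,\langle\{n\}|\big(a_k e^{ikx} - a_k^\dagger e^{-ikx}\big)\big(a_{k'} e^{ik'x} - a_{k'}^\dagger e^{-ik'x}\big)|\{n\}\rangle \\
&= \frac{\hbar}{2\epsilon_0 L^3}\sum_k \omega_k\,\langle\{n\}|\,a_k e^{ikx} a_k^\dagger e^{-ikx} + a_k^\dagger e^{-ikx} a_k e^{ikx}\,|\{n\}\rangle = \sum_k \frac{\hbar\omega_k}{\epsilon_0 L^3}\,(n_k + 1/2),
\end{align}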

which diverges. Another infinity!


The infinity encountered in (3.2) was easy to dismiss, as it related to an unobservable
choice of energy zero point. This one is a somewhat tougher nut to crack, because electric
field strength (proportional to the force exerted on a charged body) has direct physical
consequences. One can argue as follows: Any test particle used to measure the field
strength will have finite extent, so it cannot be concentrated on just one point x in space.
If we replace the point-sized probe by one with a charge density ρ(x), then one can show
(exercise!) that the spatially averaged force
\[ F = \int \rho(x)\, E(x)\, d^3x \]

has finite fluctuations, if ρ is sufficiently spread out. This is physically plausible. The
sum diverges because there are infinitely many summands with increasingly large wave
vector k. But these correspond to fields that oscillate rapidly, so that cancellations over
any finite region cause the net force to be small. Mathematically speaking, we found again

Figure 3.1: Net force is zero, so QFT would presumably be OK with it. (Scene from the
Caucasian Chalk Circle, as depicted in this poster).

(c.f. Sec. 2.2.2) that field operators should be thought of as distributions that have to be
integrated against smooth functions to be meaningful.
Is this a satisfactory solution?
Yes, in that it gives a good reason for why extended bodies don’t regularly get acceler-
ated into orbit due to vacuum fluctuations. No, because it paints quite the violent picture
of the microscopic world, where, supposedly, unbounded forces constantly tear at objects
and only cancellations prevent mayhem (Fig. 3.1). It sure feels like an indication that our
current theories of light and matter become invalid at very short length scales.

3.3.2 Coherent states


Number states have zero expected field strength. Since we expect classical electrodynamics
to emerge as a limiting case, there should be states for which the expectation values
⟨E(x, t)⟩ resemble the classical behavior.
To construct these, recall the coherent states of a single harmonic oscillator. For
α ∈ C, define

\[ |\alpha\rangle = e^{-|\alpha|^2/2}\,\sum_{n=0}^{\infty}\frac{\alpha^n}{\sqrt{n!}}\,|n\rangle. \]

Coherent states are eigenvectors of the annihilation operator


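\[ a|\alpha\rangle = e^{-|\alpha|^2/2}\,\sum_{n=1}^{\infty}\frac{\alpha^n}{\sqrt{n!}}\,\sqrt{n}\,|n-1\rangle = e^{-|\alpha|^2/2}\,\sum_{n'=0}^{\infty}\frac{\alpha^{n'+1}}{\sqrt{(n'+1)!}}\,\sqrt{n'+1}\,|n'\rangle = \alpha\,|\alpha\rangle. \]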
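
The eigenvalue equation can be checked numerically in a truncated Fock space. This is a minimal sketch using NumPy, with a cutoff of 40 levels and α = 1 + 0.5i chosen arbitrarily; truncation introduces a tiny error in the highest level, so the residual is only approximately zero. The function names are hypothetical.

```python
import numpy as np
from math import factorial


def annihilation(dim):
    """Matrix of a in the truncated number basis |0>, ..., |dim-1>: a|n> = sqrt(n)|n-1>."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)


def coherent_state(alpha, dim):
    """Truncated coherent-state vector with components e^{-|alpha|^2/2} alpha^n / sqrt(n!)."""
    n = np.arange(dim)
    sqrt_factorials = np.sqrt([float(factorial(int(i))) for i in n])
    return np.exp(-abs(alpha) ** 2 / 2) * alpha ** n / sqrt_factorials


alpha, dim = 1.0 + 0.5j, 40
psi = coherent_state(alpha, dim)
residual = np.linalg.norm(annihilation(dim) @ psi - alpha * psi)
print(np.linalg.norm(psi), residual)
```

For |α| of order one, the weight in levels near the cutoff is negligible, which is why a modest truncation suffices here.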

A coherent state |{α}⟩ of the entire EM field is one where each mode k ≡ (k, λ) is in
a coherent state |αk ⟩. Let’s compute the expectation value of the E-field:
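\begin{align}
\langle\{\alpha\}|\hat E(x,t)|\{\alpha\}\rangle &= i\sqrt{\frac{\hbar}{\epsilon_0 L^3}}\,\sum_k \sqrt{\frac{\omega_k}{2}}\; e_k\,\langle\{\alpha\}|\big(a_k\, e^{-i\omega_k t + ikx} - a_k^\dagger\, e^{i\omega_k t - ikx}\big)|\{\alpha\}\rangle \\
&= i\sqrt{\frac{\hbar}{\epsilon_0 L^3}}\,\sum_k \sqrt{\frac{\omega_k}{2}}\; e_k\,\big(\alpha_k\, e^{-i\omega_k t + ikx} - \bar\alpha_k\, e^{i\omega_k t - ikx}\big),
\end{align}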

which is indeed the classical value (3.9).



3.4 Light-matter interaction


The Hamiltonian of a single spinless particle with charge q, position and momentum oper-
ators X, P , subject to a field in Coulomb gauge is
\[ H = \frac{\big(P - qA(X)\big)^2}{2m} + U(X) + \sum_k \hbar\omega_k\, a_k^\dagger a_k. \tag{3.13} \]

It acts on a total Hilbert space H = Hpar ⊗ Hem that is the tensor product between the
spaces of the particle Hpar = L2 (R3 ) and of the field Hem . Here, A(X) is defined by
(3.7), where the ladder operators ak , a†k act on Hem , but the parameter x is evaluated on
the position of the particle. In other words

\[ A(X)\,\big(|x\rangle|\psi\rangle\big) = |x\rangle\,\big(A(x)|\psi\rangle\big), \qquad x \in \mathbb{R}^3,\ |\psi\rangle \in H_{em}. \tag{3.14} \]

We will now go through a sequence of simplifications and transformations. Start with

\[ \frac{1}{2m}\big(P - qA(X)\big)^2 = \frac{P^2}{2m} - \frac{q}{2m}\big(P\cdot A(X) + A(X)\cdot P\big) + \frac{q^2}{2m}\, A(X)^2. \]
As a first step, we will neglect the square A(X)2 , which describes two-photon processes.
Next, verify that in Coulomb gauge, momentum and the vector potential commute:
 
P A(X)|ϕ⟩ = −iℏ∇ A(X)|ϕ⟩ = −iℏ(∇ · A(X))|ϕ⟩ + A(X) · P |ϕ⟩ = A(X) · P |ϕ⟩,

so that we can write H ≃ Hpar + Hem + HI , with

\[ H_{par} = \frac{P^2}{2m} + U(X), \qquad H_{em} = \sum_k \hbar\omega_k\, a_k^\dagger a_k, \qquad H_I = -\frac{q}{m}\, P\cdot A(X). \]

So far, we have worked in a “mixed picture”, where the EM field was expressed in
second quantization, but only a single particle in first quantization was present. We now
also pass to the second-quantized picture for the particle. To this end, let {|ϕi ⟩}i be an
eigenbasis of Hpart and denote the corresponding creation operators as b†i , so that

\[ H_{part} = \sum_i E_i\, b_i^\dagger b_i. \]

It remains to treat the interaction Hamiltonian. Even without doing any calculations, we
can see from (3.8) that HI will be of the form

\[ \sum_{ijk} g_{ijk}\,\big(a_k + a_{-k}^\dagger\big)\, b_i^\dagger b_j. \]

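It thus describes a superposition of processes where a photon is removed from or added to
the field, while the state of the particle gets switched. Let’s calculate the amplitudes:
\begin{align}
\sum_{ij}\langle\phi_i|H_I|\phi_j\rangle\, b_i^\dagger b_j &= -\frac{q}{m}\sum_{ij}\langle\phi_i|P\cdot A|\phi_j\rangle\, b_i^\dagger b_j \\
&= -\frac{q}{m}\sqrt{\frac{\hbar}{\epsilon_0 L^3}}\,\sum_{ijk}\frac{1}{\sqrt{2\omega_k}}\int \phi_i^\dagger(x)\,\big(a_k + a_{-k}^\dagger\big)\, e^{ikx}\; e_k\cdot P\,\phi_j(x)\, d^3x\;\; b_i^\dagger b_j \tag{3.15}
\end{align}

so that
\[ g_{ijk} = -\frac{q}{m}\sqrt{\frac{\hbar}{\epsilon_0 L^3\, 2\omega_k}}\int \phi_i^\dagger(x)\, e^{ikx}\; e_k\cdot P\,\phi_j(x)\, d^3x. \]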
The wave lengths associated with atomic transitions are much longer than the length scales
of the atoms themselves. This justifies the dipole approximation, in which the dependence
of the EM field on position is neglected by substituting e^{ikx} ≃ 1. Then
\[ g_{ijk} \simeq -\frac{q}{m}\sqrt{\frac{\hbar}{\epsilon_0 L^3\, 2\omega_k}}\int \phi_i^\dagger(x)\; e_k\cdot P\,\phi_j(x)\, d^3x. \]
In this expression, the momentum operator acts on energy eigenfunctions in position
representation. One can eliminate momentum using

\[ [X, H_{part}] = \frac{i\hbar}{m}\, P \quad\Rightarrow\quad P = \frac{m}{i\hbar}\,[X, H_{part}], \]
so that the coupling constants become
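\[ g_{ijk} = -\frac{q}{m}\sqrt{\frac{\hbar}{\epsilon_0 L^3\, 2\omega_k}}\;\langle\phi_i|(e_k\cdot P)|\phi_j\rangle = iq\,\sqrt{\frac{1}{\epsilon_0 L^3\, 2\hbar\omega_k}}\;(E_j - E_i)\,\langle\phi_i|(e_k\cdot X)|\phi_j\rangle. \tag{3.16} \]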

Because this expression is symmetric under inversion of k, the minus sign of a†−k in (3.15)
can be dropped, so that

\[ \sum_{ij}\langle\phi_i|H_I|\phi_j\rangle\, b_i^\dagger b_j = \sum_{ijk} g_{ijk}\,\big(a_k + a_k^\dagger\big)\, b_i^\dagger b_j. \tag{3.17} \]

3.4.1 Spontaneous emission


The goal is to compute the life time of the first excited state n = 2 of a hydrogen atom. We
will employ first-order time-dependent perturbation theory in the form of Fermi’s Golden
Rule (Sec. A.3.1), which says that HI will cause an initial state |i⟩ to decay at a rate
\[ \Gamma \simeq \frac{2\pi}{\hbar}\int |\langle f|H_I|i\rangle|^2\,\delta(E_i - E_f)\,\rho(f)\, df. \tag{3.18} \]

Here, |i⟩ = |ϕ2,l,m ⟩|0⟩ (we’ll choose l and m later). The delta function ensures that total
energy is conserved. Because the EM field is already in its lowest-energy state, only final
states where the atom has transitioned into its ground state and has emitted photons are
permitted. Because, by Eq. (3.17), HI is linear in ladder operators, the coupling matrix
element is non-zero only for final states that contain a single photon: |f ⟩ = |ϕ1,0,0 ⟩|k⟩,
where k = (k, λ) labels the state of the emitted photon. (This is an artifact of the approxi-
mations we have made – multiple-photon processes are, in principle, possible).
The energy difference between the two lowest levels (the Lyman-α line) is (Sec. A.2.3)
\[ E_{1,2} := \Big(1 - \frac{1}{4}\Big) E_I = \frac{3}{4}\, E_I = \frac{3\alpha^2}{8}\, mc^2. \]
The photon energy is ℏωk = ℏc∥k∥ and energy conservation is thus equivalent to
∥k∥ = E1,2/(ℏc).

It follows that the integral in (3.18) is over states labeled by f = (k, λ), where k lies
on a sphere of radius E1,2/(ℏc). For fixed λ, the density of states in k-space is
ρ(k) d³k = (L/2π)³ d³k. Switching to spherical coordinates,
\[ \rho(k)\, d^3k = \Big(\frac{L}{2\pi}\Big)^3 d^3k = \Big(\frac{L}{2\pi}\Big)^3 r^2\, dr\,\sin\theta\, d\phi\, d\theta = \Big(\frac{L}{2\pi}\Big)^3 \frac{E^2}{\hbar^3 c^3}\, dE\,\sin\theta\, d\phi\, d\theta. \]

Using (3.16, 3.17), the coupling constant for ℏωk = E1,2 is

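\[ |\langle\phi_{2,l,m}|\langle 0|H_I|\phi_{1,0,0}\rangle|k\rangle|^2 = \frac{e^2 E_{1,2}}{2\epsilon_0 L^3}\,\big|\langle\phi_{2,l,m}|(e_k\cdot X)|\phi_{1,0,0}\rangle\big|^2. \]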
To evaluate the matrix element, we need to borrow some results on atomic eigenstates.

Four facts: (F1) The dipole matrix elements ⟨ϕ2,l,m |e · X|ϕ1,0,0 ⟩ are non-zero only
if l = 1. (F2) ⟨ϕ2,l,0 |x|ϕ1,0,0 ⟩ = ⟨ϕ2,l,0 |y|ϕ1,0,0 ⟩ = 0. (F3) States that differ only
in the magnetic quantum number m can be mapped onto each other by a rotation.
(F4) Using the explicit form of the functions ϕn,l,m (x), a tedious integral gives

\[ |\langle\phi_{2,1,0}|z|\phi_{1,0,0}\rangle|^2 = \frac{2^{15}}{3^{10}}\, a_0^2 = \frac{\hbar^2}{m^2c^2}\,\frac{2^{15}}{3^{10}\alpha^2}. \]

Fact (F1) implies that in first-order perturbation theory, the states |ϕ2,l,0 ⟩ have infinite
life time unless l = 1, i.e. only the 2p → 1s transition can be computed in this approx-
imation. By (F3), m can be changed by rotating the atom. But the life time of a level is
independent of the atom’s orientation and hence of m. We trust that our approximations
reproduce this rotational invariance (they do), and compute Γ only for m = 0:
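\[ \Gamma = \frac{2\pi}{\hbar}\,\frac{e^2}{2\epsilon_0 L^3}\,\Big(\frac{L}{2\pi}\Big)^3\Big(\frac{E_{1,2}}{\hbar c}\Big)^3 \int \sum_\lambda \big|\langle\phi_{2,1,0}|(e_k\cdot X)|\phi_{1,0,0}\rangle\big|^2 \sin\theta\, d\phi\, d\theta. \]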

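Then by (F2, F4), only the z-component of ek · X = eλ (k) · X gives a non-zero contribution, namely
\[ \sum_\lambda \big|\langle\phi_{2,1,0}|(e_\lambda(k)\cdot X)|\phi_{1,0,0}\rangle\big|^2 = \frac{2^{15}}{3^{10}}\, a_0^2 \sum_\lambda \big(e_\lambda(k)\big)_z^2. \]

To evaluate the sum, note that with e0 (k) := k/∥k∥, the set {eλ (k)}, λ = 0, 1, 2, forms an
ortho-normal basis. Expressing the length-squared of ez in that basis gets us
\[ 1 = \sum_{\lambda=0}^{2} |e_\lambda(k)\cdot e_z|^2 = \cos^2\theta + \sum_{\lambda=1}^{2}\big(e_\lambda(k)\big)_z^2 \quad\Rightarrow\quad \sum_{\lambda=1}^{2}\big(e_\lambda(k)\big)_z^2 = \sin^2\theta. \]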

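Using the identity sin³θ dθ = − sin²θ d(cos θ) = (z² − 1) dz with z = cos θ (so that θ running
from 0 to π corresponds to z running from 1 to −1), the integration results in
\[ \int_{\phi=0}^{2\pi}\int_{\theta=0}^{\pi} \sin^3\theta\, d\theta\, d\phi = 2\pi\int_{-1}^{1} (1 - z^2)\, dz = 2\pi\cdot\frac{4}{3}. \]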
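
Both the polarization sum and the angular integral can be cross-checked numerically. The sketch below picks one concrete pair of polarization vectors in spherical coordinates; this choice is an assumption made for illustration (it uses only orthonormality and transversality and is not claimed to satisfy the convention eλ(−k) = eλ(k)). The integral is approximated with a simple midpoint rule.

```python
import numpy as np


def polarization_vectors(theta, phi):
    """One possible pair of unit vectors orthogonal to k = (sin t cos p, sin t sin p, cos t)."""
    e1 = np.array([np.cos(theta) * np.cos(phi), np.cos(theta) * np.sin(phi), -np.sin(theta)])
    e2 = np.array([-np.sin(phi), np.cos(phi), 0.0])
    return e1, e2


def z_component_sum(theta, phi):
    """Sum over the two polarizations of the squared z-components."""
    e1, e2 = polarization_vectors(theta, phi)
    return e1[2] ** 2 + e2[2] ** 2


# Midpoint rule for the theta integral of sin^3; the phi integral contributes a factor 2*pi.
n = 20000
theta_grid = (np.arange(n) + 0.5) * np.pi / n
angular_integral = 2.0 * np.pi * np.sum(np.sin(theta_grid) ** 3) * (np.pi / n)

print(z_component_sum(0.7, 1.9), np.sin(0.7) ** 2)
print(angular_integral, 8.0 * np.pi / 3.0)
```

The two numbers on each printed line should agree to within the discretization error of the midpoint rule.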

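To express all quantities in relativistic units, eliminate e² in favor of the fine structure
constant α = e²/(4πϵ0 ℏc). Now brew some coffee, close the door, and plug in:
\begin{align}
\Gamma &= \frac{2\pi}{\hbar}\,\frac{\alpha\, 4\pi\epsilon_0\hbar c}{2\epsilon_0 L^3}\,\frac{\hbar^2}{m^2c^2}\,\frac{2^{15}}{3^{10}\alpha^2}\,\Big(\frac{L}{2\pi}\Big)^3\Big(\frac{3\alpha^2 mc^2}{8\hbar c}\Big)^3\, 2\pi\,\frac{4}{3} && \text{(don’t think, just copy)} \\
&= 2^{17}\, 8^{-3}\, 3^{-8}\, \pi^0 L^0 \epsilon_0^0\, \alpha^5\, \hbar^{-1} m^1 c^2 && \text{(sort by units)} \\
&= \Big(\frac{2}{3}\Big)^8 \alpha^5\, \frac{mc^2}{\hbar} = 6.27\times 10^8\ \text{Hz} = 1/(1.6\ \text{ns}) && \text{(yeah, go ahead and click).}
\end{align}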

Amazingly, given the number of approximations made, this is the accepted value [Radzig,
Smirnov, Reference Data on Atoms, Molecules, and Ions, Table 7.4].
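
For readers who prefer to let a machine do the plugging in, here is a minimal sketch. The numerical constants are CODATA 2018 values that are not part of the notes and are hard-coded as an assumption; the function names are hypothetical. The second function multiplies the individual factors of the “just copy” line for an arbitrary box length L and can be compared with the closed form, which also confirms that L and ϵ0 drop out.

```python
import math

ALPHA = 7.2973525693e-3   # fine structure constant (CODATA 2018 value, assumed here)
HBAR = 1.054571817e-34    # reduced Planck constant in J s
M_E = 9.1093837015e-31    # electron mass in kg
C = 299792458.0           # speed of light in m/s
EPS0 = 8.8541878128e-12   # vacuum permittivity in F/m


def decay_rate_closed_form():
    """Gamma = (2/3)^8 alpha^5 m c^2 / hbar."""
    return (2.0 / 3.0) ** 8 * ALPHA ** 5 * M_E * C ** 2 / HBAR


def decay_rate_plugged_in(box_length):
    """Product of the individual factors in the 'just copy' line; box_length is the side L."""
    e_squared = ALPHA * 4.0 * math.pi * EPS0 * HBAR * C
    golden_rule = 2.0 * math.pi / HBAR
    coupling = e_squared / (2.0 * EPS0 * box_length ** 3)
    dipole_sq = HBAR ** 2 / (M_E ** 2 * C ** 2) * 2.0 ** 15 / (3.0 ** 10 * ALPHA ** 2)
    state_density = (box_length / (2.0 * math.pi)) ** 3
    wave_number_cubed = (3.0 * ALPHA ** 2 * M_E * C ** 2 / (8.0 * HBAR * C)) ** 3
    angular = 2.0 * math.pi * 4.0 / 3.0
    return golden_rule * coupling * dipole_sq * state_density * wave_number_cubed * angular


gamma = decay_rate_closed_form()
print(gamma, "Hz; life time in ns:", 1e9 / gamma)
print(decay_rate_plugged_in(1.0), decay_rate_plugged_in(3.7))
```

Keeping the factors separate mirrors the structure of the derivation (Golden Rule prefactor, coupling, dipole matrix element, density of states, angular integral), which makes it easier to locate a slip than in a single long expression.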

3.5 Further reading


For field quantization, see Photons and Atoms by Cohen-Tannoudji, Dupont-Roc, and
Grynberg. Matter-light interaction follows Quantum Optics by Walls and Milburn and
Advanced Quantum Mechanics by Sakurai (who uses Heaviside-Lorentz units instead of
SI units employed by the other authors – consult Wikipedia to convert).
Chapter 4

Scattering theory

Chapter 5

Symmetries in quantum mechanics

Chapter 6

Relativistic QM

Appendix A

Quantum mechanics recap

In this chapter, we recall some facts that should be familiar from linear algebra and intro-
ductory quantum mechanics courses. The textbook Quantum Mechanics by L. Ballentine
is a good source for this material.

A.1 Linear algebra of Hilbert spaces


A.1.1 Hilbert spaces
A Hilbert space H is a complex vector space with a sesquilinear inner product ⟨·|·⟩.
Sesquilinearity means that for all vectors

α, β, γ ∈ H

and complex numbers z ∈ C, we have

⟨α|β + γ⟩ = ⟨α|β⟩ + ⟨α|γ⟩, (A.1)


⟨α|zβ⟩ = z⟨α|β⟩, (A.2)

as well as
\[ \langle\alpha|\beta\rangle = \overline{\langle\beta|\alpha\rangle}. \tag{A.3} \]
From this, it follows that

⟨α + β|γ⟩ = ⟨α|γ⟩ + ⟨β|γ⟩,


⟨zα|β⟩ = z̄⟨α|β⟩,

i.e. the inner product is anti-linear w.r.t. the first entry and linear w.r.t. the second one.

Beware that mathematicians usually employ the opposite convention, where the
sesquilinear inner product is linear in the first entry!

The norm of a vector α ∈ H is given by
\[ \|\alpha\| := \sqrt{\langle\alpha|\alpha\rangle}. \]

Recall that inner products are required to be definite, i.e. to fulfill

∥α∥ > 0 ∀α ̸= 0.


There are two examples of Hilbert spaces you should be acquainted with: column
vectors and square-integrable functions. Let’s look at both in turn.
The vector space C^d is formed by d-dimensional complex column vectors
\[ \alpha = \begin{pmatrix}\alpha_1\\ \vdots\\ \alpha_d\end{pmatrix} \]
with sesquilinear inner product
\[ \langle\alpha|\beta\rangle = \sum_{i=1}^{d}\bar\alpha_i\beta_i. \tag{A.4} \]
Such Hilbert spaces appear e.g. in the description of spin degrees of freedom.


More involved is the Hilbert space L²(R^n) of square-integrable complex functions on R^n. Given two functions α, β : R^n → C, we can define a “continuous analogue” of Eq. (A.4):
\[ \langle\alpha|\beta\rangle = \int \bar\alpha(x)\beta(x)\,d^nx. \tag{A.5} \]

For the non-pedantic physicist, the space of all wave functions, together with (A.5) defines
a Hilbert space. It is associated with a point particle with n degrees of freedom.

There are three technical problems that one has to address to define the Hilbert space
of functions with mathematical rigor.
The first problem is that the integral is not actually defined for all functions. Take, for example, n = 1 and let α be the indicator function of a non-measurable set (such as a Vitali set). Then
\[ \int |\alpha(x)|^2\,d^nx \]
does not exist (in either the Riemann or the Lebesgue sense). The second problem
is that the integral may be defined, but infinite – take e.g. α(x) = 1 and compute
⟨α|α⟩. To get rid of both problems, we define a function α to be square-integrable
if
\[ \|\alpha\|^2 = \langle\alpha|\alpha\rangle = \int|\alpha(x)|^2\,d^nx \]

exists and is finite. If α, β are square-integrable, then the product ᾱβ is integrable,
and the Cauchy-Schwarz inequality says that

\[ |\langle\alpha|\beta\rangle|^2 \le \|\alpha\|^2\|\beta\|^2 < \infty, \]
so that, by restricting to square-integrable functions, we have rid ourselves of undefined and infinite integrals!
The third problem is that the norm is no longer definite. Indeed, define a function
\[ \alpha(x) = \begin{cases}1 & x = 0,\\ 0 & x\neq 0.\end{cases} \]
Then α ≠ 0, but ∥α∥² = 0. Circumventing this problem requires some mathematical gymnastics: We say that two functions are equivalent if they differ only on a set
of measure zero. This means e.g. that the function α is equivalent to the 0-function,
as the two differ only at one point. If we define the Hilbert space L2 (Rn ) to be the
complex vector space of equivalence classes of square-integrable functions, then
one can show that (A.5) becomes a definite inner product. Problem solved.

Another technical issue with function spaces concerns physical units. Let me say
upfront that one can represent all physical quantities just by real numbers relative to
some fixed set of units, and that in this case, none of the issues below arise. (This is
what we will mainly do in this document). However, attaching a dimension to every
physical quantity has some value in that it can highlight certain inconsistencies and
guide heuristic arguments. So let’s briefly discuss how this would be done in QM.
For example, we may want ψ(x) to be defined not on the set of real numbers, but on a set representing physical positions ([x] = L) measured in some concrete unit of length, say meters m. Then [dx] = L as well, and for the normalization property to work out, we can either stick with the scalar product
\[ \langle\phi|\psi\rangle = \int_{\mathbb R}\bar\phi(x)\psi(x)\,dx, \]
in which case the wave function needs to have the dimension [ψ(x)] = L^{−1/2}, or we retain dimensionless wave functions, in which case we have to redefine the scalar product
\[ \langle\phi|\psi\rangle = \int_{\mathbb R\cdot\mathrm m}\bar\phi(x)\psi(x)\,d\mu(x), \]
with respect to a dimension-free measure
\[ d\mu(x) := \frac{1}{\mathrm m}\,dx. \]
When working in another continuous representation (e.g. momentum, see below), the units will have to be adapted accordingly. Unlike functions that depend on continuous parameters, discrete coefficients remain dimensionless (so that [Σ_i |α_i|²] = 1) and thus do not carry information about their physical interpretation.

We will sometimes consider slight modifications, e.g. by choosing some region R ⊂ R^d and working with the space L²(R) of square-integrable functions on R, subject to appropriate boundary conditions.

A.1.2 Linear operators


A map A between two vector spaces is linear if

A(ϕ + ψ) = A(ϕ) + A(ψ), A(λϕ) = λA(ϕ).

In QM, linear maps between Hilbert spaces are traditionally called operators.
Examples:
• H = Cd : In this case, operators can conveniently be specified as matrices, which act
on column vectors in the usual way. For example, we will have ample opportunity
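to work with the Pauli matrices:
\[ \sigma_x = \begin{pmatrix}0&1\\1&0\end{pmatrix},\quad \sigma_y = \begin{pmatrix}0&-i\\i&0\end{pmatrix},\quad \sigma_z = \begin{pmatrix}1&0\\0&-1\end{pmatrix}. \]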

• H = L²(R): The position operator acts on a function ψ : R → C by multiplying it with its argument
\[ (X\psi)(x) = x\psi(x). \]
The momentum operator maps a function to −iℏ times its derivative
\[ P = -i\hbar\frac{d}{dx} : \psi \mapsto P\psi = -i\hbar\psi'. \]
A.1.3 Dirac notation
Physicists often use notational aids to delineate vector-valued quantities from scalars. In
quantum mechanics, the suggestive Dirac notation (or “bra-ket” notation) is usually em-
ployed. Here, a vector α ∈ H is written as |α⟩. This is called a ket, for reasons that will
be obvious momentarily.
Every vector ψ ∈ H defines a linear function
H → C, ϕ 7→ ⟨ψ|ϕ⟩,
the “projection onto ψ”. In quantum mechanics, we denote this function as ⟨ψ| and call it a bra. Then
we can write
⟨ψ|(|ϕ⟩) = ⟨ψ|ϕ⟩, (A.6)
so a “bra-ket” is a “braket”. This passes for humor around here.
Linear functions from a vector space to C are also called dual vectors or (linear) functionals. In the calculus of variations – i.e. the branch of calculus that turns the action principle into the Euler-Lagrange equation – the word “functional” is used instead to refer to a function that takes other functions as arguments. Don’t be confused by this ambiguity!
In math and engineering, it is common to use a star or sometimes a dagger superscript to denote the functional associated with a vector in a Hilbert space:
|ψ⟩ ↔ ψ, ⟨ψ| ↔ ψ ∗ or ψ † .

The genius of this notation is that one doesn’t need to expend any thoughts on concepts like “dual vectors” or “linear functionals” – the formalism almost forces one to use these objects correctly.
Let’s play around with this. Equation (A.6) is the inner product between |ψ⟩ and |ϕ⟩.
One can combine two vectors also to form an outer product, namely the linear operator H → H defined as
\[ |\beta\rangle \mapsto \big(|\phi\rangle\langle\psi|\big)|\beta\rangle := |\phi\rangle\big(\langle\psi|\beta\rangle\big). \tag{A.7} \]
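Definition (A.7) implies that composing bra’s and ket’s is associative: One can read the expression
\[ |\phi\rangle\langle\psi|\beta\rangle \]
as either
\[ \big(|\phi\rangle\langle\psi|\big)|\beta\rangle \qquad \text{“operator acting on vector”} \]
or as
\[ |\phi\rangle\big(\langle\psi|\beta\rangle\big) \qquad \text{“vector times inner product”}, \]
getting the same result.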

A.1.4 Bases
Let H be a Hilbert space. A set {|ei ⟩}i ⊂ H is called ortho-normal if

⟨ei |ej ⟩ = δi,j .

If in addition, every element |ψ⟩ ∈ H can be expressed as a linear combination
\[ |\psi\rangle = \sum_i \psi_i|e_i\rangle \tag{A.8} \]

with suitable expansion coefficients ψi ∈ C, then we have an ortho-normal basis (ONB).

In physics, unless stated otherwise, “basis” always means “ortho-normal basis”.


Also, one usually assumes that every Hilbert space comes with some distinguished
basis, ideally with a clear physical interpretation.
Hilbert spaces are often infinite-dimensional. In this case, the “sum” in (A.8) is actually an infinite series, and the equality sign is to be interpreted as the statement
\[ \lim_{n\to\infty}\Big\|\sum_{i=1}^{n}\psi_i|e_i\rangle - |\psi\rangle\Big\| = 0. \]

Experience has it that ignoring all these mathematical subtleties tends to not create
major problems in physics. If you are bothered by this, it pays to pick up a book on
functional analysis.

Every ONB fulfills the completeness relation
\[ \sum_i |e_i\rangle\langle e_i| = 1, \tag{A.9} \]

where 1 : |ψ⟩ 7→ |ψ⟩ is the identity map.


To prove it, calculate for an arbitrary |ψ⟩ = Σ_i ψ_i|e_i⟩,
\[ \Big(\sum_i|e_i\rangle\langle e_i|\Big)\Big(\sum_j\psi_j|e_j\rangle\Big) = \sum_{i,j}\psi_j|e_i\rangle\underbrace{\langle e_i|e_j\rangle}_{\delta_{i,j}} = |\psi\rangle. \]

The converse is not true: There are complete sets that are not ortho-normal bases.

Using just the completeness relation, the following important properties of ONBs can
be easily verified:

1. Expansion coefficients are given by inner products
\[ |\psi\rangle = 1|\psi\rangle = \Big(\sum_i|e_i\rangle\langle e_i|\Big)|\psi\rangle = \sum_i\underbrace{\langle e_i|\psi\rangle}_{\psi_i}|e_i\rangle. \]

2. Expansion coefficients of bras are the complex conjugate:
\[ \langle\psi| = \langle\psi|1 = \sum_i\underbrace{\langle\psi|e_i\rangle}_{\bar\psi_i}\langle e_i|. \]

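3. Inner products with respect to an arbitrary ONB:
\[ \langle\psi|\phi\rangle = \langle\psi|1|\phi\rangle = \sum_i\langle\psi|e_i\rangle\langle e_i|\phi\rangle = \sum_i\bar\psi_i\phi_i. \]
The special case where ϕ = ψ is sometimes called the Parseval relation:
\[ \|\psi\|^2 = \langle\psi|\psi\rangle = \sum_i|\psi_i|^2. \]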

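4. Description of operators via matrix elements
\[ A = 1A1 = \sum_{i,j}|e_i\rangle\langle e_i|A|e_j\rangle\langle e_j| = \sum_{i,j}A_{i,j}|e_i\rangle\langle e_j|, \qquad A_{i,j} := \langle e_i|A|e_j\rangle, \tag{A.10} \]
so that
\[ \langle\phi|A|\psi\rangle = \langle\phi|1A1|\psi\rangle = \sum_{ij}\bar\phi_i A_{ij}\psi_j. \]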

The expression (A.10) also shows that for every basis {|ei ⟩}i of the Hilbert space,
the set {|ei ⟩⟨ej |}ij is a basis for the vector space of linear operators.
The Dirac notation allows one to save a bit of ink when working with one fixed ONB.
Say we have agreed to work with {|ei ⟩}i . Then quantum physicists (and no-one else. . . )
commonly drop the symbol e and just put the index into the ket:

|i⟩ := |ei ⟩.

Vector and matrix representations


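Assume that H is finite-dimensional and that some ONB {|i⟩}_{i=1}^{d} has been fixed. Then the calculations above define a one-one relation between H and the Hilbert space C^d of column vectors. Concretely, take the dictionary
\[ \begin{array}{lll}
\text{kets} \leftrightarrow \text{column vectors} & & |\psi\rangle \leftrightarrow \begin{pmatrix}\psi_1\\ \vdots\\ \psi_d\end{pmatrix}\\[2ex]
\text{bras} \leftrightarrow \text{row vectors} & & \langle\psi| \leftrightarrow (\bar\psi_1,\dots,\bar\psi_d)\\[2ex]
\text{operators} \leftrightarrow \text{matrices} & & A \leftrightarrow \begin{pmatrix}A_{1,1}&\dots&A_{1,d}\\ \vdots& &\vdots\\ A_{d,1}&\dots&A_{d,d}\end{pmatrix}
\end{array} \]
with
\[ \psi_i = \langle i|\psi\rangle, \qquad A_{i,j} = \langle i|A|j\rangle. \]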

Under this identification, the composition rules of bra’s, ket’s, and operators correspond to
the usual rules of matrix-vector multiplication. This representation is particularly useful
for computer implementations!
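
To make the dictionary concrete, here is a minimal NumPy sketch. The particular states `psi`, `phi` and the helper `bra` are hypothetical names chosen for illustration; the only thing taken from the text is the identification of kets with column vectors, bras with conjugated row vectors, and operators with matrices.

```python
import numpy as np

# Kets are column vectors (shape (d, 1)); two hypothetical states in C^2.
psi = np.array([[1.0], [1.0j]]) / np.sqrt(2)    # |psi>
phi = np.array([[1.0], [0.0]], dtype=complex)   # |phi>

def bra(ket):
    """Return the bra <ket| as a row vector: the conjugate transpose."""
    return ket.conj().T

inner = (bra(phi) @ psi).item()   # inner product <phi|psi>, a number
outer = phi @ bra(psi)            # outer product |phi><psi|, a 2x2 matrix

# Completeness relation: summing |e_i><e_i| over the standard basis gives 1.
d = 2
basis = [np.eye(d, dtype=complex)[:, [i]] for i in range(d)]
identity = sum(e @ bra(e) for e in basis)

# Matrix elements A_ij = <i|A|j> reproduce the matrix we started with.
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
A_elems = np.array([[(bra(ei) @ sigma_x @ ej).item() for ej in basis]
                    for ei in basis])
```

Associativity of the bra-ket calculus then just becomes associativity of matrix multiplication: `outer @ phi` and `phi * (bra(psi) @ phi).item()` give the same column vector.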

For example, a spin-1/2-degree of freedom is associated with the Hilbert space

H = {α|↑⟩ + β|↓⟩ | α, β ∈ C}

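with basis {|↑⟩, |↓⟩}. Then one can introduce operators either abstractly or as matrices, e.g.:
\[ \sigma_x = |{\uparrow}\rangle\langle{\downarrow}| + |{\downarrow}\rangle\langle{\uparrow}| \;\text{“=”}\; \begin{pmatrix}0&1\\1&0\end{pmatrix}, \qquad \sigma_z = |{\uparrow}\rangle\langle{\uparrow}| - |{\downarrow}\rangle\langle{\downarrow}| \;\text{“=”}\; \begin{pmatrix}1&0\\0&-1\end{pmatrix}. \]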

A.1.5 The adjoint


Recall that in R^d with Euclidean scalar product
\[ (u,v) = \sum_i u_iv_i \]
one can “move a matrix from one entry of the scalar product to the other by taking its transpose”
\[ (u,Av) = (A^tu,v), \qquad (A^t)_{i,j} = A_{j,i}. \]

Likewise, if H is a Hilbert space and A an operator on H, then there is a unique adjoint


operator A† such that

⟨ϕ|Aψ⟩ = ⟨A† ϕ|ψ⟩ ∀ ψ, ϕ ∈ H.

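With respect to a basis, one finds the formula
\[ \langle\phi|A\psi\rangle = \sum_{ij}\bar\phi_iA_{ij}\psi_j = \sum_j\overline{\Big(\sum_i\phi_i\bar A_{ij}\Big)}\psi_j = \langle A^\dagger\phi|\psi\rangle \quad\Rightarrow\quad (A^\dagger)_{ij} = \bar A_{ji}. \]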

The matrix representation of A† is therefore the “conjugate transpose” of the one of A.


The expression A† for the adjoint is pronounced “A dagger”. An operator A is self-adjoint
or Hermitian if A = A† .
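If one chooses a basis of H and expands
\[ A = \sum_{ij}A_{ij}|i\rangle\langle j|, \]
then, just as for the transpose above,
\[ \langle i|A^\dagger|j\rangle = \overline{\langle j|A|i\rangle} = \bar A_{ji} \quad\Rightarrow\quad A^\dagger = \sum_{ij}\bar A_{ji}|i\rangle\langle j|. \]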


Properties. It is easy to see that taking the adjoint


• ...is anti-linear (A + zB)† = A† + z̄B†,
• ...reverses products (AB)† = B†A†,
• ...exchanges “bras” and “kets” (|α⟩⟨β|)† = |β⟩⟨α|.
One can unify the last two properties by slightly generalizing the definition of the adjoint. If
H and K are two Hilbert spaces, then it is still true that for every operator A : H → K, there is
a unique adjoint A† : K → H, such that

⟨ϕ|Aψ⟩K = ⟨A† ϕ|ψ⟩H ϕ ∈ K, ψ ∈ H.

Now C = C^1 is itself a Hilbert space, and we can identify a ket |β⟩ ∈ H with the map
C → H sending z ∈ C to z|β⟩. Then one directly verifies that |β⟩† = ⟨β| and that the first two
properties listed above still hold in the generalized setting. The third property is then a special
case of the second one.
We’ll also use the following consequence

(A|β⟩)† = ⟨β|A† . (A.11)

Examples.
• The Pauli matrices are self-adjoint, as is evident by taking the conjugate-transpose.
• The momentum operator is self-adjoint:
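\[ \begin{aligned}
\langle\phi|P|\psi\rangle &= \int_{-\infty}^{\infty}\bar\phi(x)(-i\hbar)\psi'(x)\,dx\\
&= -\int_{-\infty}^{\infty}(\bar\phi)'(x)(-i\hbar)\psi(x)\,dx && \text{(integration by parts)}\\
&= \overline{\int_{-\infty}^{\infty}\bar\psi(x)(-i\hbar)\phi'(x)\,dx} = \overline{\langle\psi|P|\phi\rangle} = \langle P\phi|\psi\rangle,
\end{aligned} \]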

where we have used that for square-integrable functions limx→±∞ ψ(x) = 0 so that no
boundary terms appear when integrating by parts.

A.1.6 Spectral decomposition (discrete case)


Recall our old friend, the eigenvalue problem: Given an operator A : H → H, find all
λi , |ψi ⟩ such that

A|ψi ⟩ = λi |ψi ⟩

Of course, the λi ’s are the eigenvalues and the |ψi ⟩’s the eigenvectors of A.
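A spectral decomposition (or eigendecomposition) of A is a representation of the form
\[ A = \sum_i\lambda_i|\psi_i\rangle\langle\psi_i|, \qquad 1 = \sum_i|\psi_i\rangle\langle\psi_i|. \tag{A.12} \]
Given a spectral decomposition, compute
\[ A|\psi_j\rangle = \sum_i\lambda_i|\psi_i\rangle\underbrace{\langle\psi_i|\psi_j\rangle}_{\delta_{ij}} = \lambda_j|\psi_j\rangle. \]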

It follows that A has an eigendecomposition if and only if one can find an ONB comprised
of eigenvectors. In this case, one refers to it as A’s eigenbasis, and the λi ’s appearing in
the decomposition are exactly the eigenvalues of A.
Not every operator has an eigenbasis, e.g. the spin-1/2 raising operator
\[ \sigma_+ = \frac12(\sigma_x + i\sigma_y) = \begin{pmatrix}0&1\\0&0\end{pmatrix} \]

does not (why?). There’s a theorem in functional analysis that essentially says that A
has an eigendecomposition if and only if A commutes with its adjoint, i.e. AA† = A† A.
(Though the case when there is a continuum of eigenvalues needs more attention, see
section below).
The most important class of operators for which this holds are, of course, the self-
adjoint ones A = A† . What is more, in this case, all eigenvalues are real. Indeed, A|ψ⟩ =
λ|ψ⟩ implies (taking |ψ⟩ to be normalized without loss of generality)

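\[ \lambda = \langle\psi|A|\psi\rangle = \langle\psi|A^\dagger|\psi\rangle = \overline{\langle\psi|A|\psi\rangle} = \bar\lambda. \]

Thus the self-adjoint operators are exactly those of the form
\[ A = \sum_i\lambda_i|\phi_i\rangle\langle\phi_i|, \qquad \lambda_i\in\mathbb R,\quad \{|\phi_i\rangle\}_i \text{ an ONB}. \]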
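
The statements of this section are easy to try out numerically. The following sketch uses `numpy.linalg.eigh`, which returns real eigenvalues and an ortho-normal eigenbasis for a Hermitian matrix; the test matrix `A` and the helper `commutes_with_adjoint` are our own illustrative choices, not something fixed by the text.

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]])

A = sigma_x + sigma_y                # a hypothetical Hermitian operator
evals, evecs = np.linalg.eigh(A)     # columns of evecs: ortho-normal eigenvectors

# Rebuild A and the identity from the projectors |psi_i><psi_i|.
projectors = [np.outer(v, v.conj()) for v in evecs.T]
A_rebuilt = sum(lam * P for lam, P in zip(evals, projectors))
resolution = sum(projectors)

def commutes_with_adjoint(M):
    """True if M M^dagger = M^dagger M, the criterion quoted in the text."""
    return np.allclose(M @ M.conj().T, M.conj().T @ M)

# The raising operator fails the criterion, so it has no eigenbasis.
sigma_plus = 0.5 * (sigma_x + 1j * sigma_y)
```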

A.1.7 Spectral decomposition (continuous case)


When working out eigendecompositions in infinite dimensions, we can run into trouble.
Let’s see what can go wrong.
First, consider the momentum operator P = −i d/dx. The eigenvalue equation is trivial to solve:
\[ -i\psi' = \lambda\psi \quad\Leftrightarrow\quad \psi(x) = c\,e^{i\lambda x}. \]
Trouble is that these solutions are not square integrable:
\[ \|\psi\|^2 = \int_{-\infty}^{\infty}|c|^2\,dx = \infty. \]

There are additional problems! For the position operator (Xψ)(x) = xψ(x), the eigenvalue equation
\[ x\psi(x) = \lambda\psi(x) \quad \forall x \]
is solved by
\[ \psi(x) = \begin{cases}c & x=\lambda,\\ 0 & \text{else},\end{cases} \]
which has norm ∥ψ∥ = 0. So it seems like there are no eigendecompositions for the two most important operators of QM.
To get around the problem, we widen our domain of discourse by allowing for more
general objects than just square-integrable functions. Let’s first see how this formally
solves our problem. Whether we are “allowed to do this”, i.e. whether the formal con-
struction will lead to inconsistencies is something we’ll worry about later.

Delta distributions
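The distribution δ_y is a formal object whose inner product with a smooth function ψ is defined to be
\[ \langle\delta_y|\psi\rangle = \int\bar\delta_y(x)\psi(x)\,dx := \psi(y). \]
Then the expression
\[ \int x\,|\delta_x\rangle\langle\delta_x|\,dx \]
provides an eigendecomposition of the position operator X in the sense that for any pair of smooth functions ϕ, ψ we get the correct result
\[ \langle\phi|\Big(\int x\,|\delta_x\rangle\langle\delta_x|\,dx\Big)|\psi\rangle = \int x\,\langle\phi|\delta_x\rangle\langle\delta_x|\psi\rangle\,dx = \int x\,\bar\phi(x)\psi(x)\,dx = \langle\phi|X|\psi\rangle. \tag{A.13} \]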

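Likewise, we have the completeness relation
\[ \int|\delta_x\rangle\langle\delta_x|\,dx = 1 \tag{A.14} \]
in the same sense, i.e.
\[ \langle\phi|\Big(\int|\delta_x\rangle\langle\delta_x|\,dx\Big)|\psi\rangle = \int\langle\phi|\delta_x\rangle\langle\delta_x|\psi\rangle\,dx = \int\bar\phi(x)\psi(x)\,dx = \langle\phi|\psi\rangle. \]
So when integrated against smooth functions, the expressions above behave just like an eigendecomposition should. We can work with that!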

Plane waves
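We now turn to the eigendecomposition of the momentum operator. For k ∈ R, define the non-normalizable eigenfunction
\[ \phi_k(x) = (2\pi)^{-1/2}e^{ikx}. \]
We claim that
\[ \int|\phi_k\rangle\langle\phi_k|\,dk = 1, \qquad \int\hbar k\,|\phi_k\rangle\langle\phi_k|\,dk = P, \]
in the sense that for ψ, ϕ vanishing at infinity
\[ \begin{aligned}
\langle\phi|\Big(\int|\phi_k\rangle\langle\phi_k|\,dk\Big)|\psi\rangle &= \langle\phi|\psi\rangle, && \text{(A.15)}\\
\langle\phi|\Big(\int\hbar k\,|\phi_k\rangle\langle\phi_k|\,dk\Big)|\psi\rangle &= \langle\phi|P|\psi\rangle. && \text{(A.16)}
\end{aligned} \]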

To see that this is true, note that the inner product with a function ψ
\[ \langle\phi_k|\psi\rangle = (2\pi)^{-1/2}\int e^{-ikx}\psi(x)\,dx = \tilde\psi(k) \]
gives the Fourier transform ψ̃ of ψ evaluated at k. Recall that the inverse transform is
\[ (2\pi)^{-1/2}\int e^{ikx}\tilde\psi(k)\,dk = \psi(x). \]
The completeness relation Eq. (A.15) thus follows from
\[ \langle\delta_x|\Big(\int|\phi_k\rangle\langle\phi_k|\,dk\Big)|\psi\rangle = (2\pi)^{-1/2}\int e^{ikx}\tilde\psi(k)\,dk = \psi(x). \]

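Next, for ψ vanishing at infinity, integration by parts gives
\[ \begin{aligned}
\hbar k\,\langle\phi_k|\psi\rangle &= (2\pi)^{-1/2}\int\hbar k\,e^{-ikx}\psi(x)\,dx\\
&= (2\pi)^{-1/2}\int\Big(i\hbar\frac{d}{dx}e^{-ikx}\Big)\psi(x)\,dx\\
&= (2\pi)^{-1/2}\int e^{-ikx}\Big(-i\hbar\frac{d}{dx}\psi(x)\Big)\,dx = \langle\phi_k|P|\psi\rangle,
\end{aligned} \]
which implies Eq. (A.16):
\[ \langle\phi|\Big(\int\hbar k\,|\phi_k\rangle\langle\phi_k|\,dk\Big)|\psi\rangle = \langle\phi|\Big(\int|\phi_k\rangle\langle\phi_k|\,dk\Big)P|\psi\rangle = \langle\phi|P|\psi\rangle. \]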

General eigendecompositions
We can now sketch the way in which a general Hermitian operator A has an eigendecom-
position. Consider all solutions to the eigenvalue equation

A|ψλ ⟩ = λ|ψλ ⟩,

regardless of whether |ψ_λ⟩ is square-integrable or not. Assume for simplicity that A is non-degenerate, i.e. that for every λ ∈ R, there is at most one eigenfunction |ψ_λ⟩. An eigenvalue λ ∈ R is called discrete if it is separated from all other eigenvalues by a finite distance. Let D be the set of discrete eigenvalues. Eigenvalues that are not discrete are called continuous. Collect them in another set C. Choose normalization such that
\[ \langle\psi_{\lambda'}|\psi_\lambda\rangle = \delta_{\lambda',\lambda} \quad \lambda\in D, \qquad \langle\psi_{\lambda'}|\psi_\lambda\rangle = \delta(\lambda'-\lambda) \quad \lambda\in C. \]

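Then we have the completeness relation and spectral decomposition
\[ \begin{aligned}
1 &= \int_C|\psi_\lambda\rangle\langle\psi_\lambda|\,d\lambda + \sum_{\lambda\in D}|\psi_\lambda\rangle\langle\psi_\lambda|,\\
A &= \int_C\lambda|\psi_\lambda\rangle\langle\psi_\lambda|\,d\lambda + \sum_{\lambda\in D}\lambda|\psi_\lambda\rangle\langle\psi_\lambda|.
\end{aligned} \tag{A.17} \]
We can unify the treatment of the discrete and the continuous part. Define
\[ \rho = \sum_{\lambda\in D}\delta_\lambda + I_C \quad\text{in terms of the indicator function}\quad I_C(\lambda') = \begin{cases}1 & \lambda'\in C,\\ 0 & \text{else}.\end{cases} \]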
λ∈D

The delta functions allow us to incorporate the sums in (A.17) into the integral:
\[ A = \int\lambda|\psi_\lambda\rangle\langle\psi_\lambda|\rho(\lambda)\,d\lambda. \tag{A.18} \]
The completeness relation generalizes like this: For any subset S ⊂ R,
\[ \int_S|\psi_\lambda\rangle\langle\psi_\lambda|\rho(\lambda)\,d\lambda = P_S, \tag{A.19} \]
where P_S projects onto the space spanned by {|ψ_λ⟩ | λ ∈ S}. This looks somewhat like the formula
\[ \int_S\rho(\lambda)\,d\lambda = \mu(S) \]
for computing the measure of a set S given a density ρ. Therefore, the map S ↦ P_S is called a projection-valued measure and ρ the density of states (with respect to dλ). The interpretation of ρ is particularly clear when applied to sets S that do not intersect the continuous part S ∩ C = ∅. Then
\[ \int_S\rho(\lambda)\,d\lambda = |S\cap D| \]
equals the number of eigenvalues of A in S.
See Chapter 1 of Quantum Mechanics by Ballentine for a more careful, but not too
technical exposition. A rigorous version is the spectral theorem of functional analysis.

A.1.8 More on delta distributions


How to think about distributions
Our account of general eigendecompositions and distributions is not mathematically rig-
orous. It can be made precise, but doing so takes a lecture in functional analysis (c.f. the
spectral theorem and the theory of distributions). Given that we won’t take the time here
to go into more details, how should one deal with distributions that pop up in equations?
Some strategies:
1. Integrate against smooth functions that quickly vanish at infinity. As in (A.13),
even if the intermediate mathematical expression contains δ’s, they should have all
vanished after one has integrated the expression over smooth functions in order to
extract physical quantities. The mathematically rigorous approach is based on this
strategy, and it is the one we will have at the back of our heads in this document.
2. Think of δ as an idealization of “highly concentrated”. One can in principle replace δ_x by functions δ_x^{(ϵ)} that are supported on an ϵ-ball around x, where ϵ is much
smaller than any relevant length scale. The final physical results should then only
weakly depend on the actual choice of ϵ, and one should, in fact, be able to take a
limit ϵ → 0. In this sense, the actual distribution is an idealization that allows one
to directly obtain the limit, without first introducing an ϵ and eliminating it again in
the end.
3. Shut-up-and-calculate. The reason δ’s are so ubiquitous is that they work well as a
computational tool. So in reality, people just use them whenever they would have
used a Kronecker delta in a discrete analogue, and pretend that all algebraic manip-
ulations that are valid for Kronecker deltas also extend to distributions. This mostly
works.

Derivatives of delta functions


While the mathematicians look the other way, let’s get adventurous and represent the mo-
mentum operator in position basis.
The derivative of the delta function δ′_y(x) is a formal object whose inner product with smooth functions vanishing at infinity is defined so that formally the rule of integration by parts holds:
\[ \langle\delta_y'|\psi\rangle = \int\delta_y'(x)\psi(x)\,dx := -\int\delta_y(x)\psi'(x)\,dx = -\psi'(y) \]
and therefore
\[ P = i\hbar\int|\delta_x\rangle\langle\delta_x'|\,dx \]
is valid in the sense that for all smooth ψ, ϕ vanishing at infinity,
\[ i\hbar\int\langle\psi|\delta_x\rangle\langle\delta_x'|\phi\rangle\,dx = -i\hbar\int\bar\psi(x)\phi'(x)\,dx = \langle\psi|P|\phi\rangle. \tag{A.20} \]

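Other expressions are
\[ P = -i\hbar\int|\delta_x'\rangle\langle\delta_x|\,dx = -i\hbar\int|\delta_x\rangle\partial_x\langle\delta_x|\,dx = i\hbar\iint|\delta_y\rangle\,\delta_y'(z)\,\langle\delta_z|\,dy\,dz. \]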

The first holds because shifting the derivative to the bra means that in (A.20), ψ instead of
ϕ gets differentiated, and to remedy that, we need to use integration by parts once more,
which causes the change in sign. The second one holds because ∂x δx (y) = ∂x δ(y − x) =
−δx′ (y), so differentiating the index rather than the argument of the delta function also
incurs a sign change. A similar argument verifies the third expression. This last one is
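interesting, because it is a formal generalization of (A.10) to continuous bases. It expresses P in terms of its “matrix elements”
\[ \langle\delta_y|P|\delta_z\rangle = -i\hbar\int\delta_y(x)\delta_z'(x)\,dx = i\hbar\int\delta_y'(x)\delta_z(x)\,dx = i\hbar\,\delta_y'(z). \]
Using these formulas, the kinetic energy operator reads
\[ \frac{P^2}{2m} = -\frac{\hbar^2}{2m}\int|\delta_x\rangle\partial_x^2\langle\delta_x|\,dx = \frac{\hbar^2}{2m}\int|\delta_x'\rangle\langle\delta_x'|\,dx. \]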

A.1.9 More on Fourier transforms


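Let’s have a closer look at the n-dimensional Fourier basis ϕ_k(x) = (2π)^{−n/2} e^{ikx}, for k ∈ R^n, and the associated transforms
\[ \begin{aligned}
\tilde\psi(k) &:= \langle k|\psi\rangle = (2\pi)^{-n/2}\int e^{-ikx}\psi(x)\,d^nx,\\
\psi(x) &:= \langle x|\psi\rangle = (2\pi)^{-n/2}\int e^{ikx}\tilde\psi(k)\,d^nk.
\end{aligned} \tag{A.21} \]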

Fourier transforms in finite regions


The Fourier basis for functions on Rn is continuous, which, as discussed above, comes
with technical difficulties. Things are much easier for spaces of functions in finite regions.
Concretely, choose some length L and consider the box B = [−L/2, L/2]n with side
length L centered at the origin. Let L2 (B) be the space of functions defined on the region
B with cyclic boundary conditions (i.e. functions take the same values on opposite faces
of the box) and with inner products given by integrals over B only:
Z
⟨ϕ|ψ⟩ = ϕ̄(x)ψ(x) dn x.
B

A plane wave e^{ikx} complies with the boundary conditions if and only if every component k_i of the wave vector is an integer multiple of 2π/L. Indeed, the discrete set of functions
\[ \phi_k(x) := \frac{1}{L^{n/2}}e^{ikx}, \qquad k\in\frac{2\pi}{L}\mathbb Z^n, \]
forms an ONB for L²(B) and the formulas for the Fourier transform become
\[ \begin{aligned}
\tilde\psi(k) &= \frac{1}{L^{n/2}}\int_B e^{-ikx}\psi(x)\,d^nx,\\
\psi(x) &= \frac{1}{L^{n/2}}\sum_{k\in\frac{2\pi}{L}\mathbb Z^n}\tilde\psi(k)e^{ikx}.
\end{aligned} \tag{A.22} \]
Comparison with (A.21) shows that, formally, the transition between a finite and an unbounded volume Fourier transform is facilitated by the substitution
\[ \frac{1}{(2\pi)^{n/2}}\int_{\mathbb R^n}d^nk \;\leftrightarrow\; \frac{1}{L^{n/2}}\sum_{k\in\frac{2\pi}{L}\mathbb Z^n} \tag{A.23} \]
Note the asymmetry in (A.22): Fourier transformation takes the compact domain B to the discrete domain (2π/L)Z^n. We can of course reverse the interpretation of the two functions in (A.22). The formula then says that functions ψ(x) defined on a lattice (2π/L)Z^n can be expanded in terms of plane waves ϕ_k(x) = L^{−n/2} e^{−ikx} with wave vectors k ∈ B. In this context, B is sometimes called the Brillouin zone and k the crystal momentum or quasi-momentum.
Of course, the universe isn’t actually a finite box with cyclic boundary conditions...
...but we may as well pretend it were! Physics is local, so we can assume that all phenom-
ena we are interested in take place in some box that is sufficiently large that the boundary
does not affect the predictions we extract from the theory.

Translation symmetry
Fourier transforms are intimately connected to translation symmetry. Let Ta be the trans-
lation operator that shifts functions along the vector a

(Ta ψ)(x) = ψ(x − a).

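The Fourier basis diagonalizes translations:
\[ \langle x|T_a|\phi_k\rangle = (2\pi)^{-n/2}e^{ik(x-a)} = e^{-ika}\langle x|\phi_k\rangle \quad\Rightarrow\quad T_a = \int e^{-ika}|\phi_k\rangle\langle\phi_k|\,d^nk. \]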

It is the unique common eigenbasis of all Ta (why?). Therefore, if A is any operator that
commutes with translations

[Ta , A] = 0 ∀ a, (A.24)

then A must be diagonal in the Fourier basis, too. Explicitly, (A.24) implies that A is fully
specified by its “first column”

f (z) := ⟨δz |A|δ0 ⟩

in the sense that

⟨δx |A|δy ⟩ = ⟨δx |ATy T−y |δy ⟩ = ⟨δx |Ty A|δ0 ⟩ = ⟨δx−y |A|δ0 ⟩ = f (x − y).

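It then follows that the eigenvalues of A are proportional to the Fourier transform of f:
\[ \begin{aligned}
\langle\delta_x|A|\phi_k\rangle &= (2\pi)^{-n/2}\int\langle\delta_x|A|\delta_y\rangle e^{iky}\,d^ny\\
&= e^{ikx}(2\pi)^{-n/2}\int f(x-y)e^{-ik(x-y)}\,d^ny = (2\pi)^{n/2}\tilde f(k)\,\langle\delta_x|\phi_k\rangle
\end{aligned} \]
so that, summarizing,
\[ A = (2\pi)^{n/2}\int\tilde f(k)\,|\phi_k\rangle\langle\phi_k|\,d^nk. \tag{A.25} \]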

Fourier transform for functions depending on space and time


Common notation and sign conventions slightly differ when one coordinate has the interpretation of a time. Write x = (t, 𝐱) ∈ R^n, with t the “temporal” coordinate and 𝐱 ∈ R^{n−1} the “spatial” ones. Wave vectors are denoted by k = (ω, 𝐤). To compute inner products, we use the Minkowski form
\[ \langle k,x\rangle = \omega t - \mathbf k\mathbf x, \]
which (at least in the case of n = 4) determines the space-time metric in relativity. The commonly used basis of plane waves is
\[ \phi_k(x) = (2\pi)^{-n/2}e^{-i\langle k,x\rangle} = (2\pi)^{-n/2}e^{-i\omega t+i\mathbf k\mathbf x} \]
so that the forward and inverse Fourier transform are, respectively,
\[ \begin{aligned}
\tilde\psi(\omega,\mathbf k) &= (2\pi)^{-n/2}\int e^{i\omega t-i\mathbf k\mathbf x}\psi(t,\mathbf x)\,dt\,d^{n-1}x,\\
\psi(t,\mathbf x) &= (2\pi)^{-n/2}\int e^{-i\omega t+i\mathbf k\mathbf x}\tilde\psi(\omega,\mathbf k)\,d\omega\,d^{n-1}k.
\end{aligned} \tag{A.26} \]
This convention extends to the case n = 1. That is, if a function ψ depends only on time, then its FT is taken to be ψ̃(ω) = (2π)^{−1/2} ∫ e^{iωt} ψ(t) dt, whereas if the single parameter is interpreted as a spatial coordinate or a generic parameter, then ψ̃(k) = (2π)^{−1/2} ∫ e^{−ikx} ψ(x) dx.

Finite Fourier transform


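Occasionally, we’ll come across the finite Fourier transform. It is defined for functions ψ : Z_N → C, where Z_N = {0, . . . , N − 1} with arithmetic done modulo N. The standard basis on this space is given by delta functions δ_x(y) = δ_{xy} so that
\[ |\psi\rangle = \sum_{x\in\mathbb Z_N}\psi(x)|\delta_x\rangle. \]
The Fourier basis is
\[ |\phi_k\rangle = \frac{1}{\sqrt N}\sum_{x\in\mathbb Z_N}e^{ikx}|\delta_x\rangle, \qquad k\in\frac{2\pi}{N}\mathbb Z_N. \]
The Fourier transform and its inverse thus take the form
\[ \tilde\psi(k) = \langle\phi_k|\psi\rangle = \frac{1}{\sqrt N}\sum_{x\in\mathbb Z_N}e^{-ikx}\psi(x), \qquad \psi(x) = \langle\delta_x|\psi\rangle = \frac{1}{\sqrt N}\sum_{k\in\frac{2\pi}{N}\mathbb Z_N}e^{ikx}\tilde\psi(k). \]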

The theory developed above can be easily translated to the finite case.
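
As a sanity check of that translation, here is a small NumPy sketch of the finite case. The size `N`, the random test data and all variable names are assumptions made for illustration. Working through the finite analogue of the calculation leading to (A.25), the factor (2π)^{n/2} should be replaced by √N: a translation-invariant (circulant) matrix with first column f then has eigenvalues √N f̃(k).

```python
import numpy as np

N = 8
x = np.arange(N)
m = np.arange(N)   # labels the wave numbers k_m = 2 pi m / N

# Fourier basis vectors as columns: F[x, m] = exp(i k_m x) / sqrt(N).
F = np.exp(2j * np.pi * np.outer(x, m) / N) / np.sqrt(N)

rng = np.random.default_rng(0)
psi = rng.normal(size=N) + 1j * rng.normal(size=N)
psi_tilde = F.conj().T @ psi       # psi_tilde(k) = <phi_k|psi>

# Translation-invariant operator: <delta_x|A|delta_y> = f(x - y mod N).
f = rng.normal(size=N)             # the "first column" of A
A = np.array([[f[(i - j) % N] for j in range(N)] for i in range(N)])

f_tilde = F.conj().T @ f
A_in_fourier_basis = F.conj().T @ A @ F   # should be diag(sqrt(N) * f_tilde)
```

With these conventions, `psi_tilde` agrees with `np.fft.fft(psi) / np.sqrt(N)`, so in practice one would use the fast Fourier transform instead of building `F` explicitly.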

A.1.10 Functions of operators


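Let
\[ A = \sum_i\lambda_i|\phi_i\rangle\langle\phi_i| \]
be the eigendecomposition of an operator. Then
\[ A^2 = \sum_{ij}\lambda_i\lambda_j|\phi_i\rangle\underbrace{\langle\phi_i|\phi_j\rangle}_{\delta_{ij}}\langle\phi_j| = \sum_i\lambda_i^2|\phi_i\rangle\langle\phi_i| \]
and likewise
\[ A^k = \dots = \sum_i\lambda_i^k|\phi_i\rangle\langle\phi_i|. \]
Thus, if p(x) = Σ_k c_k x^k is a polynomial, then
\[ p(A) = \sum_k c_kA^k = \sum_i p(\lambda_i)|\phi_i\rangle\langle\phi_i|. \]
For an arbitrary function f : C → C, one can thus consistently define its action on operators with an eigendecomposition as
\[ f(A) := \sum_i f(\lambda_i)|\phi_i\rangle\langle\phi_i|. \]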

(This convention is sometimes called the spectral calculus).
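
In code, the spectral calculus is a three-liner for Hermitian matrices. The sketch below is one possible implementation (the function name `apply_function` and the example matrix are our own); it diagonalizes with `numpy.linalg.eigh`, applies f to the eigenvalues, and transforms back.

```python
import numpy as np

def apply_function(A, f):
    """Compute f(A) = sum_i f(lambda_i) |phi_i><phi_i| for a Hermitian matrix A."""
    evals, evecs = np.linalg.eigh(A)
    # Scale column i of evecs by f(lambda_i), then multiply with the bras.
    return (evecs * f(evals)) @ evecs.conj().T

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # hypothetical example, eigenvalues 1 and 3

A_cubed = apply_function(A, lambda lam: lam ** 3)   # agrees with A @ A @ A
sqrt_A = apply_function(A, np.sqrt)                 # squares back to A
```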



A.1.11 Unitary operators


Unitary operators are the Hilbert space analogue of orthogonal rotations in Euclidean vec-
tor spaces: Invertible linear operators that preserve inner products. Let’s work out what
that means.
The inner product between U|ϕ⟩, U|ψ⟩ is ⟨ϕ|U†U|ψ⟩ (recall Eq. (A.11)). Thus U preserves the inner product between any pair of vectors if and only if
⟨ϕ|U†U|ψ⟩ = ⟨ϕ|ψ⟩ ∀ ϕ, ψ ∈ H.

We thus define: An operator is unitary if it is invertible and fulfills U † U = 1.


One can work out that these characterizations are equivalent:
1. U is unitary.
2. U has a spectral decomposition of the form
\[ U = \sum_i e^{i\phi_i}|\psi_i\rangle\langle\psi_i|, \qquad \phi_i\in\mathbb R, \]
i.e. all eigenvalues λ_i = e^{iϕ_i} have absolute value equal to 1.


3. There is a Hermitian operator A such that U = eiA (in the sense of Sec. A.1.10).
4. U is such that U † U = 1 and U U † = 1 (in which case U is automatically invertible,
so we do not have to list this as an extra requirement).
5. If {|ei ⟩}i is an ONB, then so is {U |ei ⟩}i .
In quantum mechanics, unitary operators describe symmetries. The most important symmetry is of course time evolution! The Hermitian operator that generates time evolution U(t) in the sense that U(t) = e^{−itH/ℏ} (as in Point 3.) is nothing but −1/ℏ times the Hamiltonian.
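
A quick numerical illustration of Points 1 to 4, in units where ℏ = 1. The two-level Hamiltonian `H` and the function `time_evolution` are hypothetical choices for this sketch; the exponential is evaluated through the eigendecomposition, as in Sec. A.1.10.

```python
import numpy as np

H = np.array([[1.0, 0.5], [0.5, -1.0]])   # hypothetical Hamiltonian (hbar = 1)

def time_evolution(H, t):
    """U(t) = exp(-i t H), built from the eigendecomposition of H."""
    evals, evecs = np.linalg.eigh(H)
    return (evecs * np.exp(-1j * evals * t)) @ evecs.conj().T

U = time_evolution(H, 0.7)

is_isometry = np.allclose(U.conj().T @ U, np.eye(2))      # U^dagger U = 1
is_coisometry = np.allclose(U @ U.conj().T, np.eye(2))    # U U^dagger = 1
eigenvalue_moduli = np.abs(np.linalg.eigvals(U))          # all equal to 1
```

The group property U(s)U(t) = U(s + t) can be checked the same way.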

A.1.12 Projections
Recall (see Fig. A.1) that in R^d with Euclidean scalar product (u, v) = Σ_i u_i v_i, there is a one-one relation between
• Subspaces V ⊂ Rd , and
• orthogonal projections P , i.e. linear maps fulfilling P = P t , P 2 = P .
The Hilbert space analogue works like this: An operator P is a projector (or projection)
if
1. P = P † , and
2. P 2 = P .
The first property means that P has a spectral decomposition. The second property then implies that the eigenvalues are elements of {0, 1}. Thus,
\[ P = \sum_i|\psi_i\rangle\langle\psi_i|, \]

where the {|ψi ⟩} form an ONB for the subspace V ⊂ H onto which P projects.
Examples:

Figure A.1: Orthogonal projection of u onto the x-y-plane.

• For every normalized vector |ψ⟩ ∈ H, the outer product P = |ψ⟩⟨ψ| is the projec-
tion onto the one-dimensional subspace V = {z|ψ⟩ | z ∈ C}.
• Define the parity operator Π on H = L2 (R) by

Π|δx ⟩ = |δ−x ⟩, that is (Πϕ)(x) = ϕ(−x).

Then it’s easy to see that P_± = ½(1 ± Π) are projection operators onto the space of even and odd functions respectively (why?).
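
The parity example can be mimicked on a grid. In the sketch below we assume a grid that is symmetric about x = 0, so that Π simply reverses the order of the sample points; the grid, the test function and the variable names are illustrative assumptions.

```python
import numpy as np

N = 7                                # number of grid points (assumption)
x = np.linspace(-3.0, 3.0, N)        # grid symmetric about x = 0

Pi = np.eye(N)[::-1]                 # parity: (Pi phi)(x) = phi(-x) on this grid
P_plus = 0.5 * (np.eye(N) + Pi)      # projector onto even functions
P_minus = 0.5 * (np.eye(N) - Pi)     # projector onto odd functions

phi = np.exp(-(x - 1.0) ** 2)        # hypothetical test function, neither even nor odd
even_part = P_plus @ phi
odd_part = P_minus @ phi
```

Both matrices are symmetric and square to themselves, and every function splits as `phi = even_part + odd_part`.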

A.1.13 The trace


The trace of an operator is the sum over its eigenvalues. It can be expressed as
\[ \operatorname{tr}A = \sum_i\langle i|A|i\rangle, \]

where the sum is over any ONB {|i⟩}i .


Some properties:
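• Cyclic invariance:
\[ \operatorname{tr}AB = \sum_{ij}\langle i|A|j\rangle\langle j|B|i\rangle = \sum_{ij}\langle j|B|i\rangle\langle i|A|j\rangle = \operatorname{tr}BA. \]
• Traces of outer products are inner products:
\[ \operatorname{tr}|\alpha\rangle\langle\beta| = \sum_i\langle i|\alpha\rangle\langle\beta|i\rangle = \sum_i\langle\beta|i\rangle\langle i|\alpha\rangle = \langle\beta|\alpha\rangle. \]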
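
These trace identities can be spot-checked on random matrices. The sketch below uses hypothetical random test data; note that `np.vdot(beta, alpha)` conjugates its first argument and therefore computes the inner product ⟨β|α⟩ = Σ_i β̄_i α_i.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 3
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
B = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
alpha = rng.normal(size=d) + 1j * rng.normal(size=d)
beta = rng.normal(size=d) + 1j * rng.normal(size=d)

trace_AB = np.trace(A @ B)
trace_BA = np.trace(B @ A)                           # cyclic invariance

trace_outer = np.trace(np.outer(alpha, beta.conj())) # tr |alpha><beta|
inner = np.vdot(beta, alpha)                         # <beta|alpha>

sum_of_eigenvalues = np.sum(np.linalg.eigvals(A))    # equals tr A
```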

A.1.14 Commuting operators


Assume that two operators A, B have a joint eigenbasis {|ψi ⟩}:

A|ψi ⟩ = ai |ψi ⟩, B|ψi ⟩ = bi |ψi ⟩.

Then

[A, B]|ψi ⟩ = (AB − BA)|ψi ⟩ = (ai bi − bi ai )|ψi ⟩ = 0 ∀ i.



so the operators commute.


Less obvious, but still true is that the converse also holds: If two operators commute,
one can construct a joint eigenbasis. In fact, the conclusion also holds for any set of
mutually commuting operators.
Warning: If two operators commute then it does not follow that every eigenbasis of
one is also an eigenbasis of the other (why?).

A.2 Some concrete systems


A.2.1 A single harmonic oscillator
Classical Hamiltonian mechanics
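Let’s first retrace the solution of a harmonic oscillator
\[ H = \frac{P^2}{2m} + \frac12m\omega^2X^2 \]
in classical mechanics. Choose problem-adapted units for length and momentum:
\[ \tilde X = \sqrt{\frac{m\omega}{\hbar}}\,X, \qquad \tilde P = \sqrt{\frac{1}{m\hbar\omega}}\,P \quad\Rightarrow\quad H = \frac12\hbar\omega(\tilde P^2+\tilde X^2). \]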
Wait, what’s ℏ doing in a classical calculation? Well, it’s convenient to work with dimen-
sionless quantities X̃, P̃ . But then XP/(X̃ P̃ ) is a constant having the dimension of an
action. There’s no preferred scale of action in classical mechanics – but ℏ does the job and
facilitates the later transition to QM.
Next, introduce complex coordinates
1 1
a := √ (X̃ + iP̃ ) ⇒ a† = √ (X̃ − iP̃ ),
2 2
where we use the “dagger” superscript to denote complex conjugation. These complex
coordinates may not have a direct physical interpretation, but they are easy to work with
and we can recover the original position and momentum coordinates as


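\[ X = \sqrt{\frac{\hbar}{2m\omega}}\,(a + a^\dagger) = \sqrt{\frac{2\hbar}{m\omega}}\,\operatorname{Re}(a), \qquad P = -i\sqrt{\frac{m\hbar\omega}{2}}\,(a - a^\dagger) = \sqrt{2m\hbar\omega}\,\operatorname{Im}(a). \]
The Poisson bracket {X, P} = 1 implies
\[ \{a, a^\dagger\} = \frac{1}{2}\big(-i\{\tilde X,\tilde P\} + i\{\tilde P,\tilde X\}\big) = \frac{1}{i\hbar}, \]
so the coordinate change (X, P) → (a, a†) is canonical up to the factor 1/(iℏ). The
Hamilton function reads in complex coordinates
\[ H = \frac{1}{2}\hbar\omega\,(aa^\dagger + a^\dagger a) = \hbar\omega|a|^2 \tag{A.27} \]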
and the equations of motion are (using standard properties of Poisson brackets)

∂t a = {a, H} = ℏω{a, a† a} = ℏω(a† {a, a} + {a, a† }a) = −iωa,

solved by $a(t) = a(0)e^{-i\omega t}$.
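As a sanity check, one can map initial data (X, P) to the complex coordinate a, rotate it by the phase e^{-iωt}, map back, and compare with the familiar cosine/sine solution of the oscillator. The sketch below does this in plain Python; the numerical values of ℏ, m, ω and the initial data are arbitrary assumptions.

```python
import cmath
import math

# Arbitrary illustrative values (assumptions of this sketch).
hbar, m, omega = 1.0, 2.0, 3.0
X0, P0 = 0.7, -1.1

def to_a(X, P):
    """Complex coordinate a = (X~ + i P~)/sqrt(2) in problem-adapted units."""
    Xt = math.sqrt(m * omega / hbar) * X
    Pt = P / math.sqrt(m * hbar * omega)
    return (Xt + 1j * Pt) / math.sqrt(2)

def from_a(a):
    """Recover (X, P) from the real and imaginary parts of a."""
    X = math.sqrt(2 * hbar / (m * omega)) * a.real
    P = math.sqrt(2 * m * hbar * omega) * a.imag
    return X, P

def evolve(t):
    """Time evolution via a(t) = a(0) exp(-i omega t)."""
    return from_a(to_a(X0, P0) * cmath.exp(-1j * omega * t))

def textbook(t):
    """Standard solution of the classical oscillator for comparison."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return X0 * c + P0 / (m * omega) * s, P0 * c - m * omega * X0 * s

def energy(X, P):
    return P * P / (2 * m) + 0.5 * m * omega ** 2 * X * X

print(evolve(1.7), textbook(1.7))
```

The energy ℏω|a|² is conserved automatically, since the time evolution only changes the phase of a.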



Quantum mechanics
Now assume that X, P are not classical phase space coordinates, but instead position
and momentum operators on L²(R). Replacing Poisson brackets {·, ·} by commutators
(1/(iℏ)) [·, ·] and complex conjugates by Hermitian conjugates, the above derivation goes through
verbatim for the quantum case, up until Eq. (A.27). There, the fact that a† and a do not
commute means that we cannot simplify ½(a†a + aa†) to |a|². Using the commutation
relation [a, a†] = 1 of the ladder operators instead, the Hamiltonian becomes
\[ H = \hbar\omega\,\frac{1}{2}(a^\dagger a + aa^\dagger) = \hbar\omega\Big(a^\dagger a + \frac{1}{2}\Big). \]
Momentarily switching back to the position-space representation of the operators, one can
easily see that there is a unique ground state |0⟩, with wave function $\langle\tilde x|0\rangle = \pi^{-1/4} e^{-\tilde x^2/2}$.
From the commutation relations of the ladder operators, it then follows that with
\[ |n\rangle := \frac{1}{\sqrt{n!}}(a^\dagger)^n|0\rangle, \]
the set $\{|n\rangle\}_{n\ge 0}$ forms an ONB of the Hilbert space. It is indeed the eigenbasis of H:
\[ a|n\rangle = \sqrt{n}\,|n-1\rangle \quad\Rightarrow\quad a^\dagger|n\rangle = \sqrt{n+1}\,|n+1\rangle \quad\Rightarrow\quad H|n\rangle = \hbar\omega\Big(n + \frac{1}{2}\Big)|n\rangle. \]
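The ladder structure can be made concrete with finite matrices. The sketch below represents a and a† in the number basis, truncated to N levels (the truncation size N is an assumption of this sketch; the true Hilbert space is infinite-dimensional), and evaluates H = ½(a†a + aa†) in units of ℏω. The truncation spoils only the last diagonal entry.

```python
import math

N = 6  # truncation dimension (assumption; only affects the highest level)

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Matrix elements <m|a|n> = sqrt(n) if m == n - 1, and 0 otherwise.
a = [[math.sqrt(n) if m == n - 1 else 0.0 for n in range(N)] for m in range(N)]
# a is real, so its Hermitian conjugate is the transpose.
adag = [[a[n][m] for n in range(N)] for m in range(N)]

ada = matmul(adag, a)   # number operator a^dagger a
aad = matmul(a, adag)   # a a^dagger (last diagonal entry is a truncation artifact)

# Hamiltonian in units of hbar * omega.
H = [[0.5 * (ada[m][n] + aad[m][n]) for n in range(N)] for m in range(N)]
energies = [H[n][n] for n in range(N)]
print(energies)
```

All but the last diagonal entry follow the pattern n + ½; the final one differs because a† cannot raise the top level of the truncated space.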

A.2.2 Normal modes


Classical Hamiltonian mechanics
In classical mechanics, consider the Hamilton function
\[ H = \sum_k \frac{P_k^2}{2m} + \frac{1}{2}\sum_{kl} V_{kl} X_k X_l, \]
where the potential is given in terms of some coupling matrix $V = (V_{kl})$. Without loss of
generality (why?), we can assume that V is symmetric and thus there exists an orthogonal
O that diagonalizes V:
\[ (OO^t)_{kl} = \delta_{kl}, \qquad (OVO^t)_{kl} = \delta_{kl}\lambda_k. \]
Instead of with the eigenvalues $\lambda_k$ directly, we’ll work with $\omega_k := \sqrt{\lambda_k/m}$. (The energy of
the system is bounded below only if all $\lambda_k \ge 0$. Only this case is physically interesting,
so don’t worry about imaginary $\omega_k$.) The rows of O form an orthonormal basis called the
normal modes of the interaction. Expressing position and momentum in this basis gives
the normal coordinates
\[ \phi_k = \sum_l O_{kl} X_l, \quad \pi_k = \sum_l O_{kl} P_l \quad\Rightarrow\quad X_l = \sum_k O_{kl}\phi_k, \quad P_l = \sum_k O_{kl}\pi_k. \]

This transformation is canonical:
\[ \{\phi_k, \pi_l\} = \sum_{ij} O_{ki} O_{lj} \underbrace{\{X_i, P_j\}}_{\delta_{ij}} = (OO^t)_{kl} = \delta_{kl}, \qquad \{\pi_k, \pi_l\} = \{\phi_k, \phi_l\} = 0. \]

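Plugging in, the Hamiltonian decouples
\[ H = \sum_k \Big(\frac{1}{2m}\pi_k^2 + \frac{1}{2} m\omega_k^2\phi_k^2\Big). \]

Each summand can now be treated as in Sec. A.2.1:
\[ a_k := \sqrt{\frac{m\omega_k}{2\hbar}}\,\phi_k + i\sqrt{\frac{1}{2m\hbar\omega_k}}\,\pi_k \quad\Rightarrow\quad H = \sum_k \hbar\omega_k |a_k|^2, \]
solved by $a_k(t) = a_k(0)e^{-i\omega_k t}$. The transformation back to the original coordinates reads
\[ X_l(t) = \sqrt{\frac{\hbar}{m}} \sum_k \frac{1}{\sqrt{2\omega_k}} \big(a_k e^{-i\omega_k t} O_{kl} + a_k^\dagger e^{i\omega_k t} O_{kl}\big), \]
\[ P_l(t) = -i\sqrt{m\hbar} \sum_k \sqrt{\frac{\omega_k}{2}} \big(a_k e^{-i\omega_k t} O_{kl} - a_k^\dagger e^{i\omega_k t} O_{kl}\big). \]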

In words, the configuration X(t) of the particles is a linear combination of the normal
modes, with the coefficients oscillating with the eigenfrequencies ωk .
Some remarks:

• Even though V is real, it is sometimes convenient to choose the normal modes to be
a set of complex eigenvectors. The most important examples are translation-invariant
couplings V, which are diagonalized by suitable complex exponentials (Sec. A.1.9). The
calculations above can be easily adapted to this case.
• With a bit more effort, one can see that any Hamiltonian that is (i) a quadratic ex-
pression in position and momenta and (ii) has a lower bound on the ground state
energy can be diagonalized in a related way by a canonical change of coordinates.
Check out Williamson’s Theorem for details.
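The normal-mode construction can be illustrated with two coupled particles. In the sketch below, the coupling matrix V, the mass and the initial data are illustrative assumptions; the orthogonal matrix O is written down by hand for this particular V. The code checks that O diagonalizes V and that the configuration built from independently oscillating normal coordinates satisfies Newton's equation m Ẍ = −V X (tested with a finite-difference second derivative).

```python
import math

m = 1.0
V = [[2.0, -1.0], [-1.0, 2.0]]   # illustrative symmetric coupling matrix
s = 1 / math.sqrt(2)
O = [[s, s], [s, -s]]            # rows are the normal modes of this V

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

D = matmul(matmul(O, V), transpose(O))             # diagonal, entries lambda_k
omegas = [math.sqrt(D[k][k] / m) for k in range(2)]

X0 = [1.0, 0.0]                                    # initial positions, zero momenta
phi0 = [sum(O[k][l] * X0[l] for l in range(2)) for k in range(2)]

def X(t):
    """Positions at time t: each normal coordinate oscillates at its own omega_k."""
    phi = [phi0[k] * math.cos(omegas[k] * t) for k in range(2)]
    return [sum(O[k][l] * phi[k] for k in range(2)) for l in range(2)]

def residual(t, h=1e-4):
    """Max |m X'' + V X|, with X'' from a central finite difference."""
    xp, x0, xm = X(t + h), X(t), X(t - h)
    acc = [(xp[l] - 2 * x0[l] + xm[l]) / h ** 2 for l in range(2)]
    force = [-sum(V[l][j] * x0[j] for j in range(2)) for l in range(2)]
    return max(abs(m * acc[l] - force[l]) for l in range(2))

print(D, omegas, residual(0.7))
```

For larger systems one would obtain O from a numerical eigensolver rather than by hand; that step is left out here to keep the sketch dependency-free.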

Quantum mechanics
As was done in the n = 1 case of Sec. A.2.1, assume now that the $X_i, P_i$ are position and
momentum operators on $L^2(\mathbb{R}^n)$. Then, again, the above applies verbatim to the quantum
case, except that the Hamiltonian reads
\[ H = \sum_k \hbar\omega_k\Big(a_k^\dagger a_k + \frac{1}{2}\Big). \]

In terms of normal coordinates
\[ \sqrt{\frac{m\omega_k}{\hbar}}\,\phi_k|\tilde x\rangle = \tilde x_k|\tilde x\rangle, \]
the ground state wave function is
\[ \langle\tilde x|0\rangle = \prod_{i=1}^n \pi^{-1/4} e^{-\tilde x_i^2/2} = \pi^{-n/4} e^{-\|\tilde x\|^2/2} \]

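and the eigenbasis arises from laddering (don’t confuse the quantum numbers $n_i$ with the
number n of degrees of freedom)
\[ |n_1, \dots\rangle = \prod_i \frac{1}{\sqrt{n_i!}} (a_i^\dagger)^{n_i}|0\rangle \quad\Rightarrow\quad H|n_1, \dots\rangle = \sum_k \hbar\omega_k\Big(n_k + \frac{1}{2}\Big)|n_1, \dots\rangle. \tag{A.28} \]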

In principle, it is possible to work out the wave function ⟨x|n1 , . . . ⟩ in terms of the original
coordinates. But this gets ugly pretty quickly, so one usually tries to extract physical
predictions without having to go there.

A.2.3 Central potentials


For some constant κ, consider the Hamiltonian
\[ H = \frac{P^2}{2m} - \frac{\kappa}{\|X\|}. \]
Because the Hamiltonian is rotationally invariant, we can find joint eigenvectors
\[ H|E, l, m\rangle = E|E, l, m\rangle, \]
\[ L^2|E, l, m\rangle = \hbar^2 l(l+1)|E, l, m\rangle, \]
\[ L_z|E, l, m\rangle = \hbar m|E, l, m\rangle. \]
Negative energy solutions correspond to bound states. These energies are quantized:
\[ E_n = -\frac{E_I}{n^2}, \quad n \in \mathbb{N}, \qquad E_I = \frac{m\kappa^2}{2\hbar^2} \quad \text{(ionization energy)}. \]
The angular momentum quantum number l takes integer values between 0 and n − 1 (as
always, the magnetic quantum number m is an integer between −l and l). In spherical
coordinates, the eigenfunctions are of the form
\[ \langle x|n, l, m\rangle = Y_l^m(\theta, \phi)\, e^{-\frac{r}{a_0 n}}\, y_{nl}(r/a_0), \qquad a_0 = \frac{\hbar^2}{m\kappa} \quad \text{(Bohr radius)} \]
where $Y_l^m(\theta, \phi)$ are the spherical harmonics, and $y_{nl}$ is a polynomial of degree n − 1 that
one can explicitly work out in terms of generalized Laguerre polynomials. (Though hardly
anybody is thrilled by the prospect of “working things out in terms of generalized
Laguerre polynomials” – and fortunately, one can often invoke more elegant arguments so
that one doesn’t have to.)

Hydrogen atom, fine structure constant


The Hamiltonian for the electron of a hydrogen atom corresponds to $m = m_e$ (electron
mass) and $\kappa = \frac{e^2}{4\pi\epsilon_0}$ (Coulomb attraction). For atomic problems, it makes sense to express
quantities in the natural units that can be formed by combining the constants ℏ, m, c, and
the fine structure constant
\[ \alpha = \frac{e^2}{4\pi\epsilon_0\hbar c} \simeq \frac{1}{137}. \]
In particular, the natural scales are $mc^2$ for energy; $mc$ for momentum; $\frac{\hbar}{mc}$ for length; and
$\frac{\hbar}{mc^2}$ for time.
Then the factor in front of the potential term reads κ = ℏcα, and we get $a_0 = \frac{\hbar}{mc}\frac{1}{\alpha}$
(Bohr radius), and $E_I = \frac{mc^2}{2}\alpha^2$ (ionization energy of hydrogen).
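To get a feeling for the numbers, the two expressions for the Bohr radius and the ionization energy can be evaluated directly. The sketch below uses CODATA 2018 values for the constants (these numerical values are inputs of this sketch, not part of the notes) and treats the nucleus as infinitely heavy, so the electron mass is used without reduced-mass correction.

```python
# CODATA 2018 values in SI units (inputs of this sketch).
hbar = 1.054571817e-34      # J s
m_e = 9.1093837015e-31      # kg
c = 299792458.0             # m / s
alpha = 7.2973525693e-3     # fine structure constant (dimensionless)
eV = 1.602176634e-19        # J per electron volt

# Natural length scale hbar/(m c), divided by alpha: the Bohr radius.
a0 = hbar / (m_e * c) / alpha

# Natural energy scale m c^2, times alpha^2 / 2: the ionization energy.
E_I = 0.5 * m_e * c ** 2 * alpha ** 2

print(f"a0  = {a0:.6e} m")
print(f"E_I = {E_I / eV:.4f} eV")
```

The results land at roughly half an ångström and a bit under 14 eV, which is the expected order of magnitude for atomic length and energy scales.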

A.2.4 Fermionic oscillator


Let a be such that a, a† fulfill the canonical anti-commutation relations (CAR)

[a, a† ]+ = 1, [a, a]+ = 2a2 = 0, [a† , a† ]+ = 2(a† )2 = 0. (A.29)

The Fermionic number operator N = a† a is a projection:

N 2 = a† aa† a = a† (1 − a† a)a = N − (a† )2 a2 = N.

It follows that the Fermionic occupation numbers can only be 0 and 1. Likewise,

aa† = 1 − a† a = 1 − N (A.30)

is the complementary projection.


As in the Bosonic case, the Hamiltonian H = ℏωN gives rise to the Heisenberg-picture
time evolution a(t) = e−iωt a(0). To see this, plug the expression into the Heisenberg
equation of motion iℏ∂t a(t) = [a(t), H] and use

[a, a† a]− = aa† a − a† aa = a(a† a) = a(1 − aa† ) = a.

Using (A.30),
\[ H' := H - \frac{\hbar\omega}{2}\mathbb{1} = \frac{\hbar\omega}{2}(a^\dagger a - aa^\dagger), \]
so the right hand side generates the same time evolution.
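The simplest representation of the Fermionic oscillator is on $\mathcal{H} = \mathbb{C}^2$, with
\[ a^\dagger = \frac{1}{2}(\sigma_x + i\sigma_y) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \quad\Rightarrow\quad a = \frac{1}{2}(\sigma_x - i\sigma_y) = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \]

Then the occupation number operator and the occupation number basis are
\[ N = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad |0\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad |1\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \]
and the Hamiltonian is
\[ H' = \frac{\hbar\omega}{2}\sigma_z = \frac{\hbar\omega}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]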
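The two-dimensional representation is small enough to verify by direct matrix multiplication. The sketch below builds a† and a from the standard Pauli matrices and checks the anti-commutation relations (A.29), the projection property of N, and the relation a†a − aa† = σ_z. The helper functions are ad-hoc names introduced for this sketch.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B, sign=1):
    return [[A[i][j] + sign * B[i][j] for j in range(2)] for i in range(2)]

def scale(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

# Standard Pauli matrices, identity and zero matrix.
sx = [[0, 1], [1, 0]]
sy = [[0, -1j], [1j, 0]]
sz = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]

adag = scale(0.5, add(sx, scale(1j, sy)))    # creation operator
a = scale(0.5, add(sx, scale(-1j, sy)))      # annihilation operator
N = matmul(adag, a)                          # number operator

anticommutator = add(matmul(a, adag), matmul(adag, a))
print(adag, a, N, anticommutator)
```

With these conventions the occupied state |1⟩ is the first basis vector and the empty state |0⟩ the second, so that a†|0⟩ = |1⟩ and a|0⟩ = 0.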

A.3 Perturbation theory


A.3.1 Fermi’s golden rule
Here’s the problem we want to solve. Consider a quantum system that starts in some
initial state |ψ(t = 0)⟩ = |i⟩. Choose a projection operator PF onto a set of final states
orthogonal to the initial state PF |i⟩ = 0. The goal is to estimate the probability

Pi→F (t) = ⟨ψ(t)|PF |ψ(t)⟩

of finding the system in one of the final states when measured at time t.

We consider the situation where the Hamiltonian is of the form H = H0 + λV , where


H0 is sufficiently simple that the stationary Schrödinger equation can be solved

H0 |f ⟩ = Ef |f ⟩,

and where λV is a “small” perturbation.


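As is standard in perturbation theory, we assume (without much in the way of proof)
that one can expand
\[ |\psi(t)\rangle = \sum_{s=0}^{\infty} \lambda^s |\psi_s(t)\rangle \]
as a power series in λ and that low orders give meaningful answers. Separating the
Schrödinger equation
\[ i\hbar\partial_t \Big(\sum_s \lambda^s|\psi_s\rangle\Big) = (H_0 + \lambda V)\Big(\sum_s \lambda^s|\psi_s\rangle\Big) \]
by degrees of λ gives
\[ i\hbar\partial_t|\psi_0\rangle = H_0|\psi_0\rangle \qquad \text{(0th order)} \]
\[ i\hbar\partial_t|\psi_1\rangle = H_0|\psi_1\rangle + V|\psi_0\rangle \qquad \text{(1st order)} \]
\[ \vdots \]

With initial condition |ψ(t = 0)⟩ = |i⟩, the zeroth-order equation is solved by
\[ |\psi_0(t)\rangle = e^{\frac{t}{i\hbar}E_i}|i\rangle. \]

Plugging this into the first-order one and projecting onto an eigenstate |f⟩ gives
\[ i\hbar\partial_t\langle f|\psi_1(t)\rangle = E_f\langle f|\psi_1(t)\rangle + e^{\frac{t}{i\hbar}E_i}\langle f|V|i\rangle \]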

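which is solved by
\[ \langle f|\psi_1(t)\rangle = \langle f|V|i\rangle\, \frac{1 - e^{\frac{1}{i\hbar}(E_i - E_f)t}}{E_f - E_i}\, e^{\frac{1}{i\hbar}E_f t} \qquad \text{for } E_f \neq E_i, \tag{A.31} \]
\[ \langle f|\psi_1(t)\rangle = \frac{t}{i\hbar}\langle f|V|i\rangle\, e^{\frac{1}{i\hbar}E_f t} \qquad \text{for } E_f = E_i,\ f \neq i. \tag{A.32} \]
Using L’Hôspital’s rule, one verifies that (A.31) tends to (A.32) for $E_f \to E_i$. In this
sense, it suffices to work with (A.31) alone. Its modulus squared is
\[ |\langle f|\psi(t)\rangle|^2 = 4|\langle f|V|i\rangle|^2\, \frac{\sin^2\big((E_i - E_f)\frac{t}{2\hbar}\big)}{(E_i - E_f)^2} \qquad (i \neq f). \]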

With $\epsilon = E_f - E_i$ and $\tau = \frac{t}{2\hbar}$, the fraction is $\sin^2(\epsilon\tau)/\epsilon^2$, the square of the “sinc
function” (Fig. A.2). It has a central peak of height $\tau^2$, its first zeroes at $\epsilon = \pm\frac{\pi}{\tau}$, and shows
oscillations of quadratically decreasing amplitude for ϵ → ±∞. It is known (by the
Dirichlet integral) that the area under the curve is τπ. Therefore, the family of functions
$f_\tau(\epsilon) := \frac{1}{\pi\tau}\sin^2(\epsilon\tau)/\epsilon^2$ converges to a δ-function centered at 0 as τ → ∞.
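The convergence of the normalized squared sinc towards a δ-function can be made plausible numerically: the total area stays at 1 while an increasing share of it concentrates in a fixed window around 0. The sketch below uses a simple midpoint rule; the integration range, the window half-width 0.5, and the chosen values of τ are arbitrary assumptions, and the finite range cuts off a small tail.

```python
import math

def f(eps, tau):
    """sin^2(eps*tau) / (pi * tau * eps^2), with its limiting value at eps = 0."""
    if eps == 0.0:
        return tau / math.pi
    return math.sin(eps * tau) ** 2 / (math.pi * tau * eps ** 2)

def weight(tau, half_width, n=200_000):
    """Midpoint-rule integral of f(., tau) over [-half_width, half_width]."""
    h = 2 * half_width / n
    return h * sum(f(-half_width + (j + 0.5) * h, tau) for j in range(n))

taus = (1.0, 5.0, 20.0)
total = [weight(tau, 200.0) for tau in taus]            # area, up to a small tail
central = [weight(tau, 0.5, n=20_000) for tau in taus]  # share within |eps| < 0.5

for tau, tot, cen in zip(taus, total, central):
    print(f"tau = {tau:5.1f}: total = {tot:.4f}, inside window = {cen:.4f}")
```

The slow, algebraic decay of the tails is visible in how gradually the window share approaches 1; this is one reason why the δ-function replacement in the text requires care.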
Qualitatively, we can now describe which parameters enter the probability $P_{i\to F}(t)$.
By the above, only states |f⟩ with energy $E_f$ in the range $E_i \pm \frac{2\pi\hbar}{t}$ pick up significant
weight. For such states, the modulus squared is proportional to t and the squared coupling
coefficient $|\langle f|V|i\rangle|^2$.

Figure A.2: Squared sinc function $\sin^2(\epsilon\tau)/\epsilon^2$. x axis in units of τ, y axis in units of 1/τ.
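To get a more quantitative statement, let ρ(f) be a measure such that
\[ P_F = \int_F |f\rangle\langle f|\,\rho(f)\, df. \]

In other words, ρ(f) is the “density of states”, in the sense of Sec. A.1.7. Then
\[ \langle\psi(t)|P_F|\psi(t)\rangle = \int_F |\langle f|\psi(t)\rangle|^2 \rho(f)\, df = \int_F 4|\langle f|V|i\rangle|^2\, \frac{\sin^2\big((E_i - E_f)\frac{t}{2\hbar}\big)}{(E_i - E_f)^2}\, \rho(f)\, df. \]

Given the discussion above, optimistically,
\[ \langle\psi(t)|P_F|\psi(t)\rangle \simeq t\, \frac{2\pi}{\hbar} \int_F |\langle f|V|i\rangle|^2\, \delta(E_i - E_f)\,\rho(f)\, df =: t\,\Gamma. \tag{A.33} \]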

Let’s suspend disbelief for a while and take (A.33) at face value. It is called Fermi’s Golden
Rule: The probability Pi→F (t) increases linearly, with slope Γ proportional to the squared
coupling and the density of states, integrated over all final states with the right energy.
The “≃”–step in (A.33) involved quite the leap of faith. The squared-sinc-construction
gives a delta function only in the limit of large times, but first-order perturbation theory is
valid, at most, at short times. It’s unclear whether there’s an intermediate regime where
both approximations simultaneously hold. Also, if the spectrum is discrete, the density of
states ρ(f ) is itself a sum of delta functions (Sec. A.1.7), so that the integral has no obvi-
ous meaning. The cleanest (but not only) way around this issue is to restrict attention to
energies Ei that lie in the continuous part of the spectrum of H0 . This frequently involves
letting the “quantization region” L3 go to ∞ (c.f. Sec. A.1.9). One could analyze the con-
ditions for (A.33) to hold more carefully – but this is rarely done in practice. Experience
has shown that the “golden” rule gives the right answer more often than one could have
hoped, hence the moniker.
Appendix B

Miscellaneous Integrals

B.1 Gaussian and Fresnel integrals


The starting point is the famous formula due to Gauss,
\[ \int_{-\infty}^{\infty} e^{-x^2}\, dx = \sqrt{\pi}, \]
which can be obtained by evaluating its square in polar coordinates.


From there, one finds the general form
\[ \int_{-\infty}^{\infty} e^{-\alpha x^2 + \beta x + \gamma}\, dx = \sqrt{\frac{\pi}{\alpha}}\; e^{\frac{\beta^2}{4\alpha} + \gamma} \tag{B.1} \]
which holds for all complex α, β, γ such that the integral converges: either Re[α] > 0;
or Re[α] = 0, α ≠ 0, and Re[β] = 0 (though in the latter case, the integral is not absolutely
convergent, so it should be handled with care). In the formula, $\sqrt{\pi/\alpha}$ is the principal
square root, defined to be the unique root with argument in (−π/2, π/2]. For real α, β, γ, the
above can be proven by completing the square and using the substitution rule. For complex
coefficients, one has to use a suitable contour integration. The special case α = ∓i and
β = γ = 0 is the complex asymptotic Fresnel integral
\[ \int_{-\infty}^{\infty} e^{\pm i x^2}\, dx = \sqrt{\pi}\, e^{\pm i\pi/4}. \tag{B.2} \]
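Formula (B.1) can be spot-checked numerically for coefficients with Re[α] > 0, where the integrand decays quickly and a plain midpoint rule on a finite interval is sufficient. The coefficient triples below are arbitrary illustrative choices; `cmath.sqrt` returns the principal square root, which is the branch the formula asks for.

```python
import cmath

def gaussian_numeric(alpha, beta, gamma, L=12.0, n=48_000):
    """Midpoint-rule approximation of the integral over [-L, L]."""
    h = 2 * L / n
    total = 0j
    for j in range(n):
        x = -L + (j + 0.5) * h
        total += cmath.exp(-alpha * x * x + beta * x + gamma)
    return h * total

def gaussian_formula(alpha, beta, gamma):
    """Right-hand side of (B.1) with the principal square root."""
    return cmath.sqrt(cmath.pi / alpha) * cmath.exp(beta ** 2 / (4 * alpha) + gamma)

# Illustrative coefficients (assumptions), all with Re(alpha) > 0.
cases = [(2.0, 1.0, 0.5), (1 + 2j, 0.5j, 0.0), (0.5 - 1j, 1 + 1j, 0.2j)]
errors = [abs(gaussian_numeric(*case) - gaussian_formula(*case)) for case in cases]
print(errors)
```

The purely oscillatory case Re[α] = 0 is deliberately left out: there the integrand does not decay, and a naive truncation of the integration range converges only slowly.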

The Gaussian integral (B.1) is taken over the entire real line x → ±∞, but in
fact, it is already close to its asymptotic value if the limits of the integral are large compared
to $1/\sqrt{|\alpha|}$. This is obvious if α has a large real part (because the absolute value of the
integrand decays as $e^{-\operatorname{Re}(\alpha)\, x^2}$). Imaginary parts of α also aid convergence, but for
a more subtle reason: They cause the integrand to oscillate rapidly for large arguments, so
that its contributions to the integral tend to cancel.
To visualize this effect, consider the non-asymptotic real Fresnel integrals
\[ C(x) := \int_0^x \cos(t^2)\, dt, \qquad S(x) := \int_0^x \sin(t^2)\, dt. \]
Separating real and imaginary parts in (B.2) gives
\[ \lim_{x\to\infty} C(x) = \lim_{x\to\infty} S(x) = \sqrt{\frac{\pi}{8}}. \tag{B.3} \]
Their convergence is shown in Fig. B.1.
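The approach of C(x) and S(x) to their common limit can also be reproduced with a few lines of code. The following sketch evaluates both integrals with a midpoint rule (the step density `n_per_unit` is an arbitrary choice) and prints the deviation from √(π/8) for a few upper limits.

```python
import math

def fresnel(x, n_per_unit=2000):
    """Midpoint-rule values of C(x) and S(x) as defined above."""
    n = max(1, int(x * n_per_unit))
    h = x / n
    C = S = 0.0
    for j in range(n):
        t2 = ((j + 0.5) * h) ** 2
        C += math.cos(t2)
        S += math.sin(t2)
    return h * C, h * S

limit = math.sqrt(math.pi / 8)
for x in (2.0, 5.0, 20.0):
    C, S = fresnel(x)
    print(f"x = {x:5.1f}: C - limit = {C - limit:+.4f}, S - limit = {S - limit:+.4f}")
```

The deviations shrink only like 1/x while oscillating in sign, which reflects the cancellation mechanism described in the text rather than any decay of the integrand.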


Figure B.1: The Fresnel integrals C(x) (orange) and S(x) (blue). The integrals quickly
converge towards their asymptotic value $\sqrt{\pi/8}$ (black line), with contributions of larger
arguments canceling due to the oscillating behavior of the integrand.

B.2 Some Fourier transforms


Rotationally invariant functions
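Let V(x) = V(‖x‖) be a rotationally-invariant function on $\mathbb{R}^3$. Its Fourier transform
Ṽ(k) is computed most easily in the coordinate system (r, µ = cos θ, ϕ) where (r, θ, ϕ)
are spherical coordinates with polar axis parallel to k. The volume element is
\[ r^2 \sin\theta\, dr\, d\theta\, d\phi = r^2\, dr\, d\mu\, d\phi \tag{B.4} \]
so that the Fourier transform (writing k = ‖k‖ on the right-hand sides after the first line)
\begin{align}
\tilde V(k) &= (2\pi)^{-3/2} \int e^{-i k\cdot x}\, V(r)\, d^3x \\
&= (2\pi)^{-1/2} \int_0^\infty dr\, r^2 V(r) \int_{-1}^{1} d\mu\, e^{-ikr\mu} \\
&= (2\pi)^{-1/2} \int_0^\infty dr\, r^2 V(r) \left[\frac{e^{-ikr\mu}}{-ikr}\right]_{\mu=-1}^{1} \\
&= \frac{i}{(2\pi)^{1/2}\, k} \int_0^\infty dr\, r V(r) \big(e^{-ikr} - e^{ikr}\big) \tag{B.5} \\
&= \sqrt{\frac{2}{\pi}}\, \frac{1}{k} \int_0^\infty r V(r) \sin(kr)\, dr \tag{B.6}
\end{align}
reduces to a one-dimensional integral.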
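As an illustration of (B.6), consider a screened Coulomb (Yukawa) potential V(r) = e^{-μr}/r. This example is not part of the notes; it is chosen here because r V(r) is a plain exponential, and the one-dimensional integral evaluates in closed form to Ṽ(k) = √(2/π) / (k² + μ²). The sketch below compares this with a direct numerical evaluation; the value of μ, the cutoff R and the step count are assumptions.

```python
import math

mu = 1.0  # inverse screening length of the illustrative Yukawa potential

def r_times_V(r):
    """r * V(r) for V(r) = exp(-mu * r) / r."""
    return math.exp(-mu * r)

def radial_transform(k, R=50.0, n=100_000):
    """Right-hand side of (B.6), truncated at r = R, via the midpoint rule."""
    h = R / n
    total = 0.0
    for j in range(n):
        r = (j + 0.5) * h
        total += r_times_V(r) * math.sin(k * r)
    return math.sqrt(2 / math.pi) * total * h / k

def yukawa_exact(k):
    """Closed-form transform of the Yukawa potential."""
    return math.sqrt(2 / math.pi) / (k * k + mu * mu)

for k in (0.5, 2.0):
    print(k, radial_transform(k), yukawa_exact(k))
```

Sending μ → 0 in the closed form recovers the familiar 1/k² behavior associated with the Coulomb potential, although the integral itself then no longer converges absolutely.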


Appendix C

Function spaces and distributions

In this chapter, we take a more pedantic look at the function spaces that occur in QM. For
simplicity of presentation, we’ll mainly restrict attention to the one-dimensional case.

C.1 Square-integrable functions


What mathematical properties should a “wave function” ψ : R → C for a particle in one
dimension have?
First, according to the Born interpretation, p(x) := |ψ(x)|² is the probability density
describing the distribution of position measurement outcomes. For this interpretation to
make sense, $\int |\psi(x)|^2\, dx$ must equal 1.
Next, physical predictions depend on ψ only through integrals. Integrals stay the same
if the value of the integrand is changed on a set of measure zero. Therefore, two functions
that agree almost everywhere (i.e. everywhere except on a set of measure zero) define the
same physical state and should therefore be identified. For any function ψ : R → C, write
[ψ] for the set of functions that agree with ψ almost everywhere.
These two conditions suggest that wave functions should belong to the space
\[ L^2(\mathbb{R}) = \Big\{ [\psi] \;\Big|\; \psi : \mathbb{R} \to \mathbb{C},\ \int |\psi(x)|^2\, dx < \infty \Big\} \]
of equivalence classes of square-integrable functions.¹ This is indeed the standard choice.


In practice, the identification of functions agreeing almost everywhere is usually left
implicit. That is, L2 (R) is called “the space of square-integrable functions” instead of the
more precise “space of equivalence classes of square-integrable functions”, and one writes
ψ ∈ L2 (R) as a short-hand for [ψ] ∈ L2 (R). We will also follow this convention.
The Cauchy-Schwarz inequality says that
\[ \Big|\int \psi(x)^*\phi(x)\, dx\Big| \le \Big(\int \psi(x)^*\psi(x)\, dx\Big)^{1/2} \Big(\int \phi(x)^*\phi(x)\, dx\Big)^{1/2}, \tag{C.1} \]
so that
\[ \langle\psi|\phi\rangle := \int \psi(x)^*\phi(x)\, dx \]
is well-defined as a sesquilinear form $L^2(\mathbb{R}) \times L^2(\mathbb{R}) \to \mathbb{C}$.

¹ See any textbook on analysis, e.g. Folland’s Modern analysis, Chapter 2 for more details on integration
theory. Just two comments on terminology: (1) All integrals in the theory of function spaces are to be understood
in the sense of Lebesgue. (2) A function f is integrable if the integral $\int f$ exists and is finite. (So, counter-
intuitively, “f is integrable” and “the integral of f exists” are different statements!)

Remarks on the use of equivalence classes


Identifying functions that lead to the same physical predictions makes sense. Let’s re-
iterate, though, that consequently elements of L2 (R) aren’t strictly speaking functions,
but rather equivalence classes of functions. In particular, “the value ψ(x)” of an element
[ψ] ∈ L2 (R) at a point x is not a well-defined concept! This might be surprising, be-
cause in practice, we work with point-wise values ψ(x) all the time. We get away with
this because either: (1) We use ψ(x) in a context (e.g. under an integral) where it does not
matter which representative of the equivalence class has been chosen. Or, (2), there is an
(implicit) convention that fixes a representative. For example, it is easy to see that every
equivalence class contains at most one continuous function (Fig. ??). Thus, if we agree to
use continuous representatives whenever possible, there is no ambiguity for such classes.
The identification also makes the mathematical theory cleaner. For example, for a
function ψ : R → C, the integral $\|[\psi]\|^2 := \int |\psi(x)|^2\, dx$ vanishes if and only if ψ is
supported on a set of measure zero, i.e. iff [ψ] = [0]. The implication ∥[ψ]∥ = 0 ⇒
[ψ] = 0 is part of the mathematical definition of a norm. It is frequently invoked in
physics arguments: For example, in the algebraic treatment of the harmonic oscillator,
one typically shows that ∥a|0⟩∥ = 0 and concludes that a|0⟩ = 0, i.e. that the attempt to
construct a negative-energy eigenstate by laddering leads to the 0 function.

C.1.1 Why go beyond L2 ?


The choice of L2 (Rn ) as the space of wave functions was physically well-motivated. But
it turns out that for the purpose of doing some calculations, it is “too small”, while for
others, “too large”.

Too small: L2 (R) does not contain the eigenfunctions of some important operators.
The eigenfunctions of the momentum operator are plane waves, which have norm ∞, and
therefore do not belong to L2 . The eigenfunctions of the position operator are supported
only on one single point. As elements of L2 , they are therefore equivalent to the function
that is identically 0.

Too large: L²(R) contains elements for which important operators are undefined. For
example, there are classes [ψ] ∈ L²(R) that do not contain any continuous representative,
in which case the action of the momentum operator is not well-defined. For an example
involving the position operator, take the function
\[ \psi(x) = \frac{1}{\sqrt{\pi}\,(x + i)}. \tag{C.2} \]

Then
\[ \int |\psi(x)|^2\, dx = \frac{1}{\pi} \int \frac{1}{x^2 + 1}\, dx = \frac{1}{\pi} \big[\arctan(x)\big]_{-\infty}^{\infty} = 1, \]
so ψ ∈ L²(R). But (by comparison with $\int_a^\infty \frac{1}{x}\, dx = \infty$), one can easily see that the
integral $\langle\psi|X^k|\psi\rangle = \frac{1}{\pi}\int \frac{x^k}{x^2+1}\, dx$ is infinite for even k ∈ N and undefined for odd k ∈ N.
In particular, ∥Xψ∥² = ⟨ψ|X²|ψ⟩ = ∞, implying that Xψ ∉ L²(R).



Figure C.1: Rigged Hilbert spaces are “rigged” in the sense of “fully equipped” (like
Imperator Furiosa’s War Rig, pictured above), not in the sense of “manipulated with the
goal to deceive”, like a loaded die. (OK, maayybe I was just looking for an excuse to
include that picture in my lecture notes).

Discussion
Do these issues mean that L2 (R) is not an appropriate mathematical model for the space
of wave functions? Arguably not!
For the eigenfunction examples, note that infinitely extended or infinitely concentrated
states are unphysical, so we cannot complain that the space L2 (R), designed to model
physical wave functions, does not contain them.
Now let’s look at the function ψ defined in (C.2). The fact that Xψ ∉ L²(R) does not
mean that position measurements aren’t well-defined. To the contrary, $p(x) = |\psi(x)|^2 = \frac{1}{\pi(x^2+1)}$
is a perfectly good probability density describing position measurement outcomes.
It’s just that none of the moments $\langle X^k\rangle$ (including the expectation value, k = 1) exist and
are finite. But nobody ever promised us that all probability distributions can be character-
ized via moments, so there is no fundamental issue with this. Likewise, any ψ ∈ L2 (R),
even if it exhibits discontinuities, has a Fourier transform ψ̃, and thus a probability density
p(ℏk) = |ψ̃(k)|2 over momentum measurement outcomes.
However, the discussion does suggest that for the purpose of doing calculations, it
would be good to identify a “sandwich of spaces”

Φ ⊂ L2 (R) ⊂ Φ′ , (C.3)

where Φ is “sufficiently small” that all relevant operators are well-defined on it, and Φ′ is
“large enough” that it contains a complete set of eigenvectors for all relevant operators.
As we’ll see, the spaces Φ and Φ′ are usually constructed together. Elements of Φ are
called test functions and those of Φ′ distributions. Constellations as in (C.3) are studied as
Gelfand triples or rigged Hilbert spaces (Fig. C.1).
Which spaces of functions are the best choice for Φ, Φ′ depends on the problem one
wants to solve. An important set for quantum mechanics is Schwartz space (after Laurent
Schwartz, not to be confused with Hermann Schwarz of Cauchy-Schwarz-inequality fame)
for Φ and the associated space of tempered distributions for Φ′ . We’ll look at this case next,
and briefly sketch the general theory in Sec. C.3.

Remark. The domain D of a function f is the set of mathematical objects on which f is defined.
One equivalently says that “f is a function on D”... ...except in the theory of Hilbert spaces. We
have seen above that P and X are not defined on certain elements of the Hilbert space L2 (R)
– their domains D(P ), D(X) are strictly smaller. But one still says that “X is the position
operator on L2 (R)”. In general, if T is any linear operator whose domain D(T ) is dense in a

Hilbert space H, it is customary in functional analysis to call T a linear operator on H. (Of


course, physicists don’t worry about such details at all).

C.2 Distributions
C.2.1 Schwartz space
The most important set of test functions Φ in QM is Schwartz space S, the “smooth
functions whose derivatives vanish rapidly”:
\[ \mathcal{S} = \Big\{ \phi \in C^\infty(\mathbb{R}) \;\Big|\; \forall \alpha, \beta \in \mathbb{N}_0 : \sup_x |x^\alpha \partial_x^\beta \phi(x)| < \infty \Big\}. \tag{C.4} \]

The condition $\phi \in C^\infty(\mathbb{R})$ means that elements of Schwartz space are infinitely
differentiable; while the second condition says that ϕ and its derivatives vanish faster than any
inverse polynomial as |x| → ∞. It follows that S is invariant under P and X. It is also
easy to see that any square-integrable function can be arbitrarily-well approximated by
Schwartz-class functions, i.e. for every ψ ∈ L2 (R) and every ϵ > 0, there exists a ϕ ∈ S
such that ∥ψ − ϕ∥ ≤ ϵ. (Technically: S is dense in L2 (R) w.r.t. norm topology).
This already solves half of our problems: Because well-behaved functions are dense,
there is little loss of generality in assuming that any wave function of physical interest lies
in S. One can then apply X and P without any issue.

C.2.2 Tempered distributions


Constructing the space that contains the generalized eigenvectors requires us to take a
little detour: We will first have to study linear functionals S → C.
A function u : R → C is locally integrable if for any compact set K ⊂ R,
\[ \int_K |u(x)|\, dx < \infty. \]
For example, continuous functions are locally integrable, whereas 1/x isn’t (e.g. $\int_0^1 |\frac{1}{x}|\, dx = \infty$).
Now, for any locally integrable function u that grows at most polynomially as |x| →
∞, and for any $l \in \mathbb{N}_0$, define a functional $T_{D^l u} : \mathcal{S} \to \mathbb{C}$ by
\[ T_{D^l u}(\phi) := \int u(x)\, (-\partial_x)^l \phi(x)\, dx. \tag{C.5} \]

(The notation $D^l u$ will be explained below.) Then $T_{D^l u}$ is well-defined as a linear
functional S → C. That’s because ϕ ∈ S implies that $\partial_x^l\phi \in \mathcal{S}$ as well; local integrability of u
and continuity of $\partial_x^l\phi$ imply that the integrand is locally integrable; and finally, the fact that
$\partial_x^l\phi$ vanishes faster than any inverse polynomial, together with the matching growth restriction on
u, means that the integral remains finite as |x| → ∞. A functional of this form is called a
tempered distribution, and the space of all tempered distributions is denoted by S′.
In contrast, note that TDl u is rarely well-defined as a functional on L2 (R). For one,
elements ψ ∈ L2 (R) aren’t in general differentiable, and even if they are, they generally
vanish too slowly for the integral to converge. So we see that S, on account of being
smaller than L2 (R), allows for a larger set of linear functionals! Recall that we’re out to
find a set larger than L2 (R), so this seems like a promising direction to explore. Let’s look
at some examples.

Plane waves: $T_{e^{ikx}}$ defines a linear functional on Schwartz space, but, because $e^{ikx}$ is
not normalizable, not on L²(R).
Delta functional: Let θ(x) be the step function that is 0 for x < 0 and 1 for x ≥ 0.
Then, using integration by parts,
\[ T_{D\theta}(\phi) = -\int \theta(x)\, \partial_x\phi(x)\, dx = -\int_0^\infty \partial_x\phi(x)\, dx = \phi(0). \tag{C.6} \]
The operation only makes sense for functions ϕ that are differentiable at 0 – so certainly
for elements of S, but not necessarily elements of L2 (R).
“Bra vectors”: For every ψ ∈ L²(R), the “bra” $\phi \mapsto \langle\psi|\phi\rangle = T_{\psi^*}(\phi)$ defines a tempered
distribution. (Indeed, every square-integrable function is also locally integrable. That’s an
easy consequence of the Cauchy-Schwarz inequality).
The principal value is important in the theory of partial differential equations, where
one often wants to associate a distribution with the function $u(x) = \frac{1}{x}$ in some way.
Unfortunately, $\frac{1}{x}$ is not locally integrable, and indeed, $\int \frac{\phi(x)}{x}\, dx$ does not in general exist.
But as we’ll see, the principal value
\[ \operatorname{pv}\Big(\frac{1}{x}\Big)(\phi) := \lim_{\epsilon\to 0^+} \int_{\mathbb{R}\setminus(-\epsilon,\epsilon)} \frac{\phi(x)}{x}\, dx \tag{C.7} \]
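is finite for all ϕ ∈ S and, what is more, is given by the tempered distribution $T_{D\log|x|}(\phi)$.
To see that this makes sense, we first need to convince ourselves that log |x|, even though
it diverges as x → 0, is locally integrable. This follows from the fact that the anti-
derivative of log |x| is F(x) = x log |x| − x + C, which remains finite at the singularity:
$\lim_{x\to 0} F(x) = C$. Therefore, $T_{D\log|x|}$ is indeed a tempered distribution. It remains to
be shown that it evaluates to the principal value:
\begin{align}
T_{D\log|x|}(\phi) &= -\int \log|x|\, \phi'(x)\, dx \\
&= \lim_{\epsilon\to 0^+} \Big( -\int_{-\infty}^{-\epsilon} \log(-x)\, \phi'(x)\, dx - \int_{\epsilon}^{\infty} \log x\; \phi'(x)\, dx \Big) \\
&= \lim_{\epsilon\to 0^+} \Big( \int_{-\infty}^{-\epsilon} \frac{\phi(x)}{x}\, dx - \phi(-\epsilon)\log\epsilon + \int_{\epsilon}^{\infty} \frac{\phi(x)}{x}\, dx + \phi(\epsilon)\log\epsilon \Big) \\
&= \operatorname{pv}\Big(\frac{1}{x}\Big)(\phi) + \lim_{\epsilon\to 0^+} \log(\epsilon)\big(\phi(\epsilon) - \phi(-\epsilon)\big) \\
&= \operatorname{pv}\Big(\frac{1}{x}\Big)(\phi) + 2\phi'(0) \underbrace{\lim_{\epsilon\to 0^+} \epsilon\log(\epsilon)}_{=0} = \operatorname{pv}\Big(\frac{1}{x}\Big)(\phi).
\end{align}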
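The identity between the principal value and the distributional derivative of log|x| can be tested numerically for a concrete test function. In the sketch below, the shifted Gaussian ϕ(x) = exp(−(x−1)²) is an arbitrary choice (it is deliberately not even, so that the principal value does not vanish by symmetry). The principal value is computed from the regular integrand (ϕ(x) − ϕ(−x))/x on the half line, which is an equivalent rewriting of the symmetric cutoff in (C.7); integration ranges and step counts are assumptions.

```python
import math

def phi(x):
    """Illustrative test function: a Gaussian centered at x = 1."""
    return math.exp(-(x - 1.0) ** 2)

def dphi(x):
    return -2.0 * (x - 1.0) * phi(x)

def midpoint(g, a, b, n):
    """Midpoint rule; sample points never coincide with the cell boundaries."""
    h = (b - a) / n
    return h * sum(g(a + (j + 0.5) * h) for j in range(n))

# T_{D log|x|}(phi) = -int log|x| phi'(x) dx. The log singularity at 0 is
# integrable; the grid is chosen so that 0 is a cell boundary, not a sample point.
lhs = -midpoint(lambda x: math.log(abs(x)) * dphi(x), -9.0, 11.0, 200_000)

# pv(1/x)(phi): folding the symmetric cutoff onto x > 0 gives a regular integrand.
rhs = midpoint(lambda x: (phi(x) - phi(-x)) / x, 0.0, 11.0, 110_000)

print(lhs, rhs)
```

The two numbers agree up to the discretization error of the midpoint rule near the logarithmic singularity, which is of the order of the step size.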

Powers of 1/r in higher dimensions: In contrast to the previous example, $u(x) = \|x\|^{-k}$
is locally integrable as a function on $\mathbb{R}^n$, as long as n > k. This can be seen by
switching to n-dimensional spherical coordinates, where the volume element is
proportional to $r^{n-1}$, which lifts the singularity at 0. The definition (C.5) is easily adapted to
higher dimensions, and integrating against such a u thus defines a tempered distribution.
Unsurprisingly, the case k = 1, n = 3 is important due to its relation to the Coulomb and
the gravitational potential.

Regular distributions
Distributions of the form Tu (i.e. those that can be expressed without differentiating the
argument before integrating) are called regular. For regular distributions, it is common

to use the same symbol for both the distribution S → C and for the function R → C
defining it:
\[ T(\phi) = \int T(x)\phi(x)\, dx. \tag{C.8} \]

You might complain that such an overloading of notation is not a nice thing to do. And
you’d be right. But things are about to get worse. Such a convention is even used for
non-regular distributions!
Consider e.g. the delta distribution δ(ϕ) = ϕ(0) discussed above. It is not regular.
(Because a hypothetical function giving rise to it would have to be zero everywhere except
at x = 0 – but an integral over a function supported on only one point is zero). But, in
analogy to (C.8), one still writes
\[ \delta(\phi) = \int \delta(x)\phi(x)\, dx. \]

The r.h.s. is not an integral and δ(x) not a function – the entire r.h.s. is to be read as an
elaborate notation for δ(ϕ). Whether this convention is genius (because it allows practi-
tioners to work with distributions without having to learn the abstract theory) or horrific
(because the one job of mathematics is to be rigorous and not to pretend that objects exist
when in fact they don’t) is a question that may be controversially debated.

C.2.3 Operations on distributions


Our goal is still to find generalized eigenvectors for X and P . These will turn out to be
distributions. For that to even make sense, we have to define what it means for an operator
to act on distributions.
Let A be any operator that maps S to S. There is a unique operator $A^t$, the transpose
of A, such that, for ϕ, ψ ∈ S,
\[ \int (A\psi)(x)\,\phi(x)\, dx = \int \psi(x)\,(A^t\phi)(x)\, dx. \]

(This is the bilinear analogue of the definition of the adjoint for sesquilinear inner
products). It directly follows that for regular distributions with u ∈ S, $T_{Au}(\phi) = T_u(A^t\phi)$.
Using the notation in (C.8), this means
\[ (AT)(\phi) = T(A^t\phi). \tag{C.9} \]

We take Eq. (C.9) as the general definition for the action of an operator on distributions.
In words: Operations on distributions are defined by shifting them onto the argument.

Derivatives of distributions
The most important application is the differentiation operator $(D\phi)(x) = \partial_x\phi(x)$. By
partial integration, $D^t = -D$, from which we get
\[ (D T_u)(\phi) = \int u(x)\,(-\partial_x)\phi(x)\, dx = T_{Du}(\phi) \]
and, more generally, $D^l T_u = T_{D^l u}$ (which justifies the notation $D^l u$, as promised).


With these conventions in place, we can explain the notion of “derivative in the sense
of distribution” that you will likely have come across before. Take for example the step

function θ. Seen as a function R → C, it is not differentiable, due to the discontinuity at 0.


But the distribution Tθ does have a derivative: DTθ = δ, as computed in (C.6). Identifying
θ with Tθ , this fact is often expressed as “∂x θ(x) = δ(x) in the sense of distribution”. Note
that every locally integrable function is infinitely differentiable in the sense of distribution
(by virtue of the elements of the test function space S having this property).
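As an application, let’s derive a famous identity that expresses the principal value in
terms of a “side limit of a deformed version of 1/x”, namely
\[ \lim_{\epsilon\to 0^+} \frac{1}{x \pm i\epsilon} = \operatorname{pv}\Big(\frac{1}{x}\Big) \mp i\pi\delta. \tag{C.10} \]

First, recall that the principal complex logarithm
\[ \operatorname{Log}(x + iy) = \log\sqrt{x^2 + y^2} + i \operatorname{Arg}(x + iy), \qquad \operatorname{Arg}(x + iy) \in (-\pi, \pi], \]
is an analytic continuation of the logarithm to the complex numbers, except for a branch
cut on the negative real axis (Fig. ??). It follows that
\[ \lim_{\epsilon\to 0^+} \operatorname{Log}(x \pm i\epsilon) = \log|x| \pm i\pi\theta(-x), \]
which immediately implies (C.10) by differentiating both sides in the sense of distribution,
using $D\log|x| = \operatorname{pv}(\frac{1}{x})$ and $D\theta(-x) = -\delta$.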

Generalized eigenvectors
We say that a distribution T is a generalized eigenvector of an operator A : S → S if

A T = λ T.

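Plane waves are therefore eigenvectors of the differentiation operator D or the
momentum operator P = −iD:
\[ D\, T_{e^{ikx}} = T_{\partial_x e^{ikx}} = T_{ik e^{ikx}} = ik\, T_{e^{ikx}}, \qquad P\, T_{e^{ikx}} = k\, T_{e^{ikx}}. \]

Likewise, if $\delta_a = \partial_x\theta(x - a) : \phi \mapsto \phi(a)$ is the delta distribution at a ∈ R, then
\[ (X\delta_a)(\phi) = \delta_a(X\phi) = a\,\phi(a) = a\,\delta_a(\phi) \quad\Rightarrow\quad X\delta_a = a\,\delta_a. \]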

So, with all these preparations in the bag, it was pretty easy to identify the generalized
eigenvectors!
Interface conditions for piece-wise continuous potentials. In one dimension, the
time-independent Schrödinger equation

\[
\Big(-\frac{\hbar^2}{2m}\partial_x^2 + V(x) - E\Big)\psi(x) = 0 \tag{C.11}
\]
is an ordinary differential equation. The Picard-Lindelöf theorem says that if V is Lipschitz
continuous, then the ODE has a unique solution for any prescribed values of ψ and ψ′ at one point.
A staple of introductory QM lectures are potentials that are only piece-wise continuous.
In this case, there aren’t necessarily solutions to (C.11) in the ordinary sense. Here, we’ll
work out under which conditions one can stitch piece-wise solutions together to get a
generalized eigenvector of the Hamiltonian.
We treat the case where (C.11) has ordinary solutions ψ− (x) on (−∞, 0) and ψ+ (x) on
(0, ∞). Assume that both solutions and their first derivatives are continuous and bounded
around 0, and can thus be extended to 0. We also require that V is bounded around 0. Then

\[
\psi(x) := \begin{cases} \psi_-(x) & x \le 0, \\ \psi_+(x) & x > 0 \end{cases}
\]

is a generalized eigenvector if and only if, for all test functions ϕ,


\[
\int_{-\infty}^{\infty} \phi(x)\Big(-\frac{\hbar^2}{2m}\partial_x^2 + V(x) - E\Big)\psi(x)\,dx = 0.
\]

Because ψ is an ordinary solution away from zero, the integral is equal to


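\[
\lim_{\epsilon\to 0}\int_{-\epsilon}^{\epsilon} \phi(x)\Big(-\frac{\hbar^2}{2m}\partial_x^2 + V(x) - E\Big)\psi(x)\,dx
= -\frac{\hbar^2}{2m}\lim_{\epsilon\to 0}\int_{-\epsilon}^{\epsilon} \phi(x)\,\partial_x^2\psi(x)\,dx,
\]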

where we have used that (V (x) − E) does not contribute to the integral in the limit, as
ϕ, ψ, and V are bounded around 0. Integrating by parts,
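\[
\begin{aligned}
\lim_{\epsilon\to 0}\int_{-\epsilon}^{\epsilon} \phi(x)\psi''(x)\,dx
&= \phi(0)\big(\psi_+'(0) - \psi_-'(0)\big) - \lim_{\epsilon\to 0}\int_{-\epsilon}^{\epsilon} \phi'(x)\psi'(x)\,dx \\
&= \phi(0)\big(\psi_+'(0) - \psi_-'(0)\big) - \phi'(0)\big(\psi_+(0) - \psi_-(0)\big),
\end{aligned}
\]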

which vanishes for all ϕ if and only if the interface conditions


\[
\psi_+(0) = \psi_-(0) \quad\text{and}\quad \psi_+'(0) = \psi_-'(0) \tag{C.12}
\]

are met. Notably, these do not imply that ψ is twice differentiable, which would be required
for ordinary solutions to (C.11).
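As a worked illustration of (C.12), consider a potential step with V = V₋ for x < 0 and V = V₊ for x > 0, and a wave incident from the left. The sketch below is not from the notes: it assumes units with ℏ²/(2m) = 1, the ansatz ψ₋(x) = e^{ik₁x} + r e^{−ik₁x} and ψ₊(x) = t e^{ik₂x}, and the hypothetical helper name `step_amplitudes`. Solving the two interface conditions for r and t gives the familiar reflection and transmission amplitudes.

```python
import cmath

def step_amplitudes(energy, v_left, v_right):
    """Solve the interface conditions (C.12) at x = 0 for a potential step.

    Assumes units with hbar^2 / (2m) = 1, so that k = sqrt(E - V) on each side.
    """
    k1 = cmath.sqrt(energy - v_left)
    k2 = cmath.sqrt(energy - v_right)
    # Continuity of psi:   1 + r = t
    # Continuity of psi':  i*k1*(1 - r) = i*k2*t
    r = (k1 - k2) / (k1 + k2)
    t = 2 * k1 / (k1 + k2)
    return k1, k2, r, t

k1, k2, r, t = step_amplitudes(energy=2.0, v_left=0.0, v_right=1.0)

# Both jumps vanish, as required by (C.12).
psi_jump = (1 + r) - t
dpsi_jump = 1j * k1 * (1 - r) - 1j * k2 * t

# For real k1, k2 the probability current is conserved across the step.
flux_balance = abs(r) ** 2 + (k2 / k1).real * abs(t) ** 2
print(psi_jump, dpsi_jump, flux_balance)
```

Note that ψ″ jumps at x = 0 whenever V₋ ≠ V₊, so the stitched function is a generalized eigenvector but not twice differentiable.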

Fourier transforms of tempered distributions


Because the Fourier transform exchanges X and P , the characterization (C.4) of S, and
hence the space itself, is invariant under Fourier transforms. Applying the general scheme
(C.9), the Fourier transform of a tempered distribution T is (FT) : ϕ ↦ T(F^t ϕ).
To get more explicit formulas, first note that F^t = F:
\[
\int (\mathcal{F}\psi)(k)\,\phi(k)\,dk = \frac{1}{\sqrt{2\pi}}\iint e^{-ikx}\psi(x)\phi(k)\,dx\,dk = \int \psi(x)\,(\mathcal{F}\phi)(x)\,dx.
\]

Thus, using the shorthand “tilde notation” for the Fourier transform,

T̃ (ϕ) = T (ϕ̃).

Delta distribution. For δ, compute


\[
\tilde\delta(\phi) = \delta(\tilde\phi) = \tilde\phi(0) = \frac{1}{\sqrt{2\pi}}\int \phi(x)\,dx = T_{1/\sqrt{2\pi}}(\phi), \tag{C.13}
\]

that is, the FT of δ is a regular distribution, arising from the constant function
\[
\tilde\delta(k) = \frac{1}{\sqrt{2\pi}}. \tag{C.14}
\]

One could be tempted to use the following formal calculation to arrive at the same conclu-
sion:
\[
\tilde\delta(k) = \text{“}\,\frac{1}{\sqrt{2\pi}}\int e^{-ikx}\delta(x)\,dx\,\text{”} = \frac{1}{\sqrt{2\pi}}.
\]
But, unlike (C.13), this is not a rigorous argument given our development of the theory so
far! That’s because we have defined δ(ϕ) only for ϕ ∈ S, but e−ikx is most definitely not
an element of Schwartz space. The integral is therefore only heuristically defined. One
can sometimes make sense of products of distributions – but the issue is subtle and we will
not pursue it here.
Constant functions. The constant function 1(x) = 1 does not have a Fourier transform
in the ordinary sense. For one, the integral (2π)^{−1/2} ∫ dx that would define 1̃(0) is
infinite. However, because T1 defines a tempered distribution, it does have a FT. Slightly
abusing language once again, we call it the FT of 1 (in the sense of distribution).
We can find it by expressing F −1 in terms of F and applying it to (C.14). To this end,
let Π be the parity operator, which mirrors functions about the origin: (Πϕ)(x) = ϕ(−x).
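Then it is easy to see that F† = ΠF^t and hence unitarity of F implies
\[
(\Pi\mathcal{F})\mathcal{F} = \mathcal{F}^\dagger\mathcal{F} = 1 \quad\Rightarrow\quad \mathcal{F}^{-1} = \Pi\mathcal{F}. \tag{C.15}
\]
Applying this to (C.14) gives
\[
\mathcal{F}(1) = \sqrt{2\pi}\,\delta.
\]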
The principal value. From an easy contour integration, the FT of 1/(x + iϵ) is
\[
\frac{1}{\sqrt{2\pi}}\int \frac{e^{-ikx}}{x + i\epsilon}\,dx = -i\sqrt{2\pi}\,e^{-\epsilon k}\theta(k) \;\to\; -i\sqrt{2\pi}\,\theta(k) \qquad (\epsilon\to 0^+).
\]
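Using (C.10), we then find that the FT of the principal value is a regular distribution:
\[
\begin{aligned}
\mathcal{F}\big(\operatorname{pv}(1/x)\big)(k) &= \lim_{\epsilon\to 0^+}\mathcal{F}\Big(\frac{1}{x + i\epsilon}\Big)(k) + i\pi\,\mathcal{F}(\delta)(k) \\
&= i\sqrt{\frac{\pi}{2}}\,\big(-2\theta(k) + 1\big) = -i\sqrt{\frac{\pi}{2}}\,\operatorname{sign}(k).
\end{aligned} \tag{C.16}
\]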
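Combining this result with (C.15), and using that pv(1/x) is odd under parity, gives further
transforms of common distributions:
\[
(\mathcal{F}\operatorname{sign})(k) = -i\sqrt{\frac{2}{\pi}}\,\operatorname{pv}(1/k) \tag{C.17}
\]
\[
(\mathcal{F}\theta)(k) = \frac{1}{2}\,\mathcal{F}(\operatorname{sign} + 1)(k) = -\frac{i}{\sqrt{2\pi}}\big(\operatorname{pv}(1/k) + i\pi\delta\big). \tag{C.18}
\]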
Coulomb and Yukawa potentials. Up to constants, the Coulomb potential is u(x) = −1/∥x∥
in R3. Just like the constant function treated above, it does not have an ordinary
Fourier transform. For example, ũ(0) would be given by
\[
-(2\pi)^{-3/2}\int \frac{1}{\|x\|}\,d^3x = -2\,(2\pi)^{-1/2}\int_0^\infty r\,dr = -\infty. \tag{C.19}
\]
But as discussed in Sec. C.2.2, u(x) defines a regular distribution whose Fourier transform
turns out to be regular again, given by
\[
\tilde u(k) = -\sqrt{\frac{2}{\pi}}\,\frac{1}{\|k\|^2}. \tag{C.20}
\]
Here’s how to find (C.20). Express
\[
T_u(\tilde\phi) = -\int \frac{\tilde\phi(k)}{\|k\|}\,d^3k = -\lim_{s\to 0^+}\int \frac{e^{-s\|k\|}}{\|k\|}\,\tilde\phi(k)\,d^3k \tag{C.21}
\]
as a limit of integrals involving the “regularizing” factor e^{−s∥k∥}.

This is valid, because the integral, interpreted as a function of s ∈ [0, ∞), is
continuous at 0. In fact, it is even differentiable:
\[
-\partial_s\Big|_{s=0}\int \frac{e^{-s\|k\|}}{\|k\|}\,\tilde\phi(k)\,d^3k = \int \tilde\phi(k)\,d^3k, \tag{C.22}
\]

which is finite for ϕ̃ ∈ S. (Note that the same regularization does not work for the
integral in (C.19), which formally corresponds to the case ϕ̃(k) = 1. Of course,
the constant function is not an element of Schwartz space, and indeed, this choice
would cause (C.22) to diverge).

Plugging in the definition of the FT and exchanging integrals,
\[
T_u(\tilde\phi) = \lim_{s\to 0^+}\int \Big(-(2\pi)^{-3/2}\int \frac{e^{-s\|k\|}}{\|k\|}\,e^{-ik\cdot x}\,d^3k\Big)\,\phi(x)\,d^3x.
\]

The expression in parentheses is the FT of
\[
V(x) = -\frac{1}{\|x\|}\,e^{-s\|x\|},
\]

which, up to constants, is known as the Yukawa potential. Its Fourier transform follows
from the general formula (B.5) for rotationally-invariant functions in terms of k = ∥k∥:
\[
\tilde V(k) = \frac{-i}{(2\pi)^{1/2}\,k}\int_0^\infty \big(e^{-sr - ikr} - e^{-sr + ikr}\big)\,dr.
\]
The one-dimensional integral can immediately be solved as
\[
\Big[\frac{e^{r(-s-ik)}}{-s-ik}\Big]_0^\infty - \Big[\frac{e^{r(-s+ik)}}{-s+ik}\Big]_0^\infty
= -\frac{1}{-s-ik} + \frac{1}{-s+ik} = -\frac{2ik}{s^2 + k^2}.
\]

Collecting constants, we get the FT of the Yukawa potential, which gives (C.20) as s → 0:
\[
\tilde V(k) = -\sqrt{\frac{2}{\pi}}\,\frac{1}{s^2 + k^2}. \tag{C.23}
\]
π s2 + k 2

Products and tensor products


If T is a tempered distribution, and v a smooth function that grows at most polynomially
as |x| → ∞, then the product vT between v and T is the tempered distribution

vT : ϕ ↦ T(vϕ). (C.24)

The product between a smooth function and a distribution behaves mostly like the product
between functions. In particular, if T = Tu is regular, then vTu = Tuv .
However, one cannot extend (C.24) to products between arbitrary distributions, while
retaining the basic properties of “multiplication”. For example

δx = 0 by (C.24)
⇒ (δ x) pv(1/x) = 0 pv(1/x) = 0 by above and (C.24)

x pv(1/x) = 1 by (C.24)
⇒ δ (x pv(1/x)) = δ 1 = δ by above and (C.24)

so there is no associative way to assign a meaning to “δ x pv(1/x)”.


Tensor products between distributions are perfectly well-defined, though. If S, T are
distributions, then S ⊗ T is the bilinear form that sends ϕ, ψ ∈ S to S(ϕ)T (ψ).
Using the fictional function notation T (x) for distributions T (as in (C.8)), the situation
can be summarized as: “T (x)S(x) makes no sense, but T (x)S(y) is unremarkable – just
integrate against two test functions”.
Apropos bilinear forms. Take any distribution K on test functions on R2, not
necessarily a tensor product. If ϕ, ψ are test functions on R, then their tensor product
(ϕ ⊗ ψ)(x, y) = ϕ(x)ψ(y) can be paired with K. This way, K, too, defines a bilinear
form:
\[
(\phi, \psi) \mapsto K(\phi\otimes\psi), \quad\text{also written as}\quad K(\phi\otimes\psi) = \int K(x, y)\,\phi(x)\,\psi(y)\,dx\,dy.
\]
One says that K is the integral kernel of the bilinear form.


This definition can be extended straightforwardly to multilinear or sesquilinear functions.
The sesquilinear case is frequently used in quantum mechanics (cf. Eq. (A.20)).

C.3 Topological aspects, more pedantry, and generalizations


Our definition of tempered distributions in Eq. (C.5) was constructive: We showed how
to build distributions concretely given a function u and derivatives ∂xl . The mathematical
theory is usually formulated axiomatically. Distributions are defined indirectly, as linear
functionals on test function spaces, subject to some abstract properties. These properties
are phrased in the language of point set topology. In this section, we briefly introduce this
more abstract point of view.

Topological spaces
Consider a set X. A topology on X is a rule that allows us to decide when a sequence
xk : N → X converges to an element x ∈ X.
As a first example, assume that X is a vector space equipped with a norm ∥ · ∥. This
covers an extremely wide range of spaces, from X = R, the real numbers with norm
∥x∥ = |x| the absolute value, to X = L2 (R) with norm ∥x∥ = ⟨x|x⟩1/2 derived from the
inner product. We say that a sequence xk converges in norm topology to x,
\[
x_k \to x, \quad\text{if}\quad \lim_{k\to\infty}\|x_k - x\| = 0. \tag{C.25}
\]

We’ll use these concepts to give very general definitions of continuity and complete-
ness.
Figure C.2: Sequence continuity is equivalent to the more familiar “ϵ-δ-definition” of
continuity for functions f : R → R. It is more general, though, and can also be applied to
spaces whose topologies do not derive from a distance measure.

Continuity
A function f between two topological spaces is continuous if it maps convergent sequences
to convergent sequences (Fig. C.2), i.e. if
xk → x ⇒ f (xk ) → f (x).
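Example (important!): If ψ ∈ L2(R), then the linear functional ⟨ψ| is continuous.
The proof reduces to the Cauchy-Schwarz inequality. If lim_{k→∞} ∥ϕk − ϕ∥ = 0, then
\[
\big|\langle\psi|\phi_k\rangle - \langle\psi|\phi\rangle\big| = \big|\langle\psi|\big(|\phi_k\rangle - |\phi\rangle\big)\big| \le \|\psi\|\,\|\phi_k - \phi\| \to 0 \qquad (k\to\infty).
\]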

Completeness and Hilbert spaces


Now for completeness. A sequence xk is Cauchy if “its elements eventually become arbi-
trarily close” in the sense that
∀ ϵ > 0, ∃ n ∈ N such that ∀ k, l > n, it holds that ∥xk − xl ∥ ≤ ϵ.
A space X is complete if every Cauchy sequence converges to an element of X.
As an example of a Cauchy sequence, let xk be the approximation of √2 to k decimal
places. This example shows that the rational numbers are not complete: There is no q ∈ Q
such that xk → q (for then q would have to equal √2, which, famously, is not rational).
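The √2 example can be made concrete with exact rational arithmetic. The following sketch (the helper name `sqrt2_truncated` and the chosen index ranges are illustrative) builds the decimal truncations as fractions, checks the Cauchy property for one value of n, and confirms that no term squares to 2.

```python
from fractions import Fraction
from math import isqrt

def sqrt2_truncated(k):
    """sqrt(2) truncated to k decimal places, as an exact rational number."""
    return Fraction(isqrt(2 * 10 ** (2 * k)), 10 ** k)

# xs[j] is the truncation to j + 1 decimal places.
xs = [sqrt2_truncated(k) for k in range(1, 30)]

# Cauchy property: beyond n decimal places, all terms agree to within 10**(-n).
n = 10
tail_gap = max(abs(a - b) for a in xs[n:] for b in xs[n:])
print(float(tail_gap))

# None of the rational terms is an exact square root of 2.
never_exact = all(x * x != 2 for x in xs)
print(never_exact)
```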
In the mathematical literature, a Hilbert space is defined as a
• complex vector space,
• with a sesquilinear inner product ⟨·|·⟩,
• that is complete with respect to the norm derived from the inner product.
The final condition is often glossed over in physics presentations. It is important, though.
For one, it means that series like
\[
|\psi(t)\rangle = \sum_{k=0}^{\infty}\frac{1}{k!}(-itH)^k\,|\psi(0)\rangle := \lim_{K\to\infty}\sum_{k=0}^{K}\frac{1}{k!}(-itH)^k\,|\psi(0)\rangle,
\]

used ubiquitously, are actually well-defined. Another reason is that the equivalence of
“kets” and “bras” requires this property: The set of continuous linear functionals on a
Hilbert space H is denoted by H′ . If |ψ⟩ ∈ H, then we’ve shown above that ⟨ψ| is
continuous, i.e. an element of H′ . The Riesz representation theorem says that the converse
is also true: Every continuous linear functional of a Hilbert space is given by some “bra
vector”.
One can show that L2 (R) is complete, i.e. actually a Hilbert space.
Contrast this with Schwartz space S. It, too, is a complex vector space with the same
sesquilinear inner product as L2(R). But it is not complete in norm topology and hence no
Hilbert space. The argument works just like the √2-example above. Because S is dense
in L2(R), for every ψ ∈ L2(R), there exists a sequence ϕk : N → S converging to ψ in
norm. Such a sequence is Cauchy, but if ψ ∉ S, it has no limit in S.
Topology on Schwartz space


Return to Schwartz space S. Because it is a subspace of L2 (R), we can use the norm
topology also for S. However, there’s a second, important, topology on that space. For
α, β ∈ N0, define the (semi-)norms
\[
\|\phi\|_{\alpha,\beta} := \sup_{x\in\mathbb{R}}\big|x^\alpha\,\partial_x^\beta\phi(x)\big|.
\]

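A sequence ϕk : N → S converges with respect to this family of semi-norms,
\[
\phi_k \xrightarrow{\;\mathcal{S}\;} \phi, \quad\text{if}\quad \lim_{k\to\infty}\|\phi - \phi_k\|_{\alpha,\beta} = 0 \quad \forall\,\alpha, \beta \in \mathbb{N}_0. \tag{C.26}
\]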

To avoid confusion, we’ll write $\phi_k \xrightarrow{L^2} \phi$ if we mean convergence with respect to the
Hilbert space norm. It is easy to see that $\phi_k \xrightarrow{\mathcal{S}} \phi$ implies $\phi_k \xrightarrow{L^2} \phi$, but not the other way
round. One says that the topology (C.26) is finer than norm topology.
There’s a non-trivial regularity theorem (Reed-Simon, Thm. V.10), which states that
the constructive definition (C.5) of tempered distribution characterizes exactly the space
of linear functionals on Schwartz space that are continuous in the sense of (C.26).

Generalizations
The topological formulation above is the basis of generalizations. The common recipe
is to choose a test function space Φ (often norm-dense in L2 (R)), endow it with a finer
topology, and then consider the continuous dual Φ′ .
The most important choice is to take Φ to be the space of bump functions Cc∞ (R):
smooth functions with compact support. “Compact support” means that these functions
are identically zero for |x| large enough. (It is not obvious that one can define functions
that transition smoothly from being identically zero in some region to being non-zero in
other regions, but such functions do exist). In the context of distributions, the space of
bump functions is usually denoted by D.
Recall that Schwartz functions vanish faster than any polynomial, and thus integrals
against locally integrable functions u(x) that grow at most polynomially are finite. Be-
cause bump functions vanish identically for large x, integrals against any locally integrable
u are well-defined. This suggests, correctly, that the space of distributions D′ is even larger
than the space of tempered distributions S ′ .
The structure of D′ is somewhat more complicated than was the case for S ′ . We will
not discuss it here, but, for completeness, give the topology from which it derives. It is
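defined in terms of the (semi-)norms
\[
\|\phi\|_{K,\alpha} := \sup_{x\in K}\big|\partial_x^\alpha\phi(x)\big|
\]
indexed by compact subsets K ⊂ R and a number α ∈ N0, in the same way as (C.26):
\[
\phi_k \xrightarrow{\;\mathcal{D}\;} \phi \quad\text{if}\quad \lim_{k\to\infty}\|\phi - \phi_k\|_{K,\alpha} = 0 \quad \forall\,\alpha, K.
\]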

Because elements of D are smooth, distributions in D′ are arbitrarily often differentiable.
However, the space D is easily seen not to be invariant under Fourier transforms,
so the Fourier transform is not defined on D′. This is the reason the space plays a less
prominent role in quantum theory.
Terminology
The word “distributions” used without qualification is most likely to refer to D′ , but can
also mean a general continuous dual space Φ′ , and may also refer to tempered distributions
S ′ , depending on context. Making matters worse, S is always called “Schwartz space”,
but the name “Schwartz” is also associated with the general mathematical theory of dis-
tributions and in particular also with D′ . “Tempered distributions” always means S ′ , at
least.
Lastly, if a science professor answers an inquiry about a questionable derivation by
claiming that it is to be understood “in the sense of distribution”, they likely mean neither
D′ nor Φ′ nor S ′ . Instead, they are probably vaguely aware of the fact that what they are
doing isn’t quite rigorous, but are optimistic that a smart mathematician could figure it
out, and in any case, want to get through their lecture with their dignity intact and have
found that “distribution” is a fully general incantation that reliably suppresses follow-up
questions.
Needless to say, I would never engage in such tactics.
Appendix D

Green’s functions

D.1 Introduction
In this chapter, we are interested in the affine equation

Lu = f (D.1)

for u, given L and f . We’ll restrict attention to the most important special case, where L
is a translation-invariant differential operator on Rn .

Example: The damped harmonic oscillator. Newton’s equation for the position
u(t) of a particle subject to a driving force mf(t), viscous damping coefficient
2mγ, and undamped eigenfrequency ω0 is
\[
\big(\partial_t^2 + 2\gamma\,\partial_t + \omega_0^2\big)\,u = f.
\]
The problem is to find u(t) given f(t) and the boundary condition u(−∞) = 0.

Now here’s the basic idea: Formally, f(x) = ∫ f(x′)δ(x − x′) dx′ is a superposition
of “delta impulses”. Thus, if we could work out how the system reacts to a delta impulse,
we should be able to solve the general problem by linearity. Exploiting translational
invariance, we may even get away with treating just the case of f(x) = δ(x).
This indeed works out. Assume we can find a G such that

LG = δ. (D.2)

Then defining u to be a superposition of shifted solutions G(x − x′), weighted by f(x′),
\[
u(x) = \int G(x - x')\,f(x')\,d^n x', \tag{D.3}
\]
we get a solution of Lu = f, as anticipated:
\[
(Lu)(x) = \int (LG)(x - x')\,f(x')\,d^n x' = \int \delta(x - x')\,f(x')\,d^n x' = f(x).
\]

Some terminology: G is the Green’s function of L.1 The expression (D.3) for u is
known as the convolution G ⋆ f of G and f .
1 Yes, it’s “the Green’s function of L”, not “the Green function of L” as would be more in line with the

standard naming convention in mathematics (or, I guess, English grammar).


Technically, (D.2) should be interpreted in the language of Chapter C. One assumes


(often implicitly) that f is an element of some space Φ of test functions. Then a Green’s
function G is a distribution in Φ′ and LG = δ is to be understood in the sense of distribu-
tion.
If h is a solution to the homogeneous equation Lh = 0, then G is a Green’s function
if and only if G + h is one. Thus G is unique if and only if L is invertible, in which
case G(x − x′ ) = ⟨x|L−1 |x′ ⟩ gives the matrix element of the inverse. Else, one can find
an entire affine space worth of Green’s functions (just as there’s an entire affine space of
solutions to Lu = f ). Typically, physically motivated boundary conditions are used to
select a particular one.

Elementary examples
The simplest example is L = ∂t. The general solution to Lu = f is, of course, the integral
\[
u(t) = \int_a^t f(t')\,dt'.
\]

Because ∂t is not invertible on the space of all differentiable functions, there is an ambigu-
ity in the solution, represented by a. Fixing a amounts to choosing the boundary condition
u(a) = 0. In these notes, we only treat the translationally-invariant theory, so we will
restrict attention to the choices a = ±∞. For a = −∞,
\[
u(t) = \int_{-\infty}^{t} f(t')\,dt' = \int \theta(t - t')\,f(t')\,dt'.
\]

The Green’s function G+ of ∂t under the boundary condition u(−∞) = 0 is therefore the
step function θ. And indeed, by Sec. C.2.3, ∂t θ(t) = δ(t) holds in the sense of distribution,
so that the step function fulfills (D.2).
An analogous calculation for a = +∞ leads to G−(t) = −θ(−t). Because the solutions
u(t) constructed using G+ only depend on the f(t′) for t′ ≤ t, one usually calls G+
a retarded Green’s function and, likewise, G− an advanced Green’s function. Their difference
h(t) := G+(t) − G−(t) = 1 is a solution of the homogeneous equation ∂t h(t) = 0,
consistent with our reasoning above.
For another example, take a quick look at L = ∂t². Because δ(t)t = 0 as distributions,
\[
\partial_t^2\big(t\,\theta(t)\big) = \partial_t\big(t\,\delta(t) + \theta(t)\big) = \partial_t\theta(t) = \delta(t),
\]
which shows that G+(t) = tθ(t) is a (retarded) Green’s function for ∂t². Other solutions
are
\[
G^-(t) = -t\,\theta(-t), \qquad G = \frac{1}{2}G^- + \frac{1}{2}G^+ = \frac{1}{2}|t|.
\]
2 2 2

D.2 Green’s functions from Fourier transforms


Fourier transforms turn derivatives into multiplications. Thus, for every differential oper-
ator L with constant coefficients, the equation Lu = f is equivalent to an equation

\[
P(k)\,\tilde u(k) = \tilde f(k) \tag{D.4}
\]
involving a polynomial multiplication operator P(k) and the Fourier transforms of the
functions. Formally, (D.4) is trivial to solve:
\[
\tilde u(k) = \frac{\tilde f(k)}{P(k)} \quad\Rightarrow\quad u(x) = (2\pi)^{-n/2}\int e^{ikx}\,\frac{\tilde f(k)}{P(k)}\,d^n k. \tag{D.5}
\]
Likewise, the Fourier transform of the defining equation LG = δ for the Green’s function is
\[
P\,\tilde G = (2\pi)^{-n/2}\,1, \tag{D.6}
\]
which has the formal solution
\[
\tilde G(k) = (2\pi)^{-n/2}\,\frac{1}{P(k)} \quad\Rightarrow\quad G(x) = (2\pi)^{-n}\int e^{ikx}\,\frac{1}{P(k)}\,d^n k. \tag{D.7}
\]
The trouble is, of course, that if P has real zeros, the integrals in Eqs. (D.5, D.7) might
not exist. In the next sections, we’ll go through a variety of methods for extracting
solutions anyway, by modifying these equations.

The problem of characterizing all G̃ ∈ Φ′ that satisfy (D.6) is known as the “prob-
lem of division” in the theory of distributions. In the univariate case, n = 1, it is
fairly easy to solve (we’ll introduce all necessary ingredients in Sec. D.2.3). Essen-
tially, this case is simple because univariate polynomials have finitely many roots. If
P has continuous sets of zeros, the problem can become very complicated, though.

D.2.1 Direct integration


No real zeros
The easiest case is the one where P (k) has no real roots. Then L is invertible (at least as
long as one assumes that u has a Fourier transform). Therefore, there is a unique Green’s
function. It is given by (D.7), which is absolutely integrable for all x.
As an example, let’s treat the damped harmonic oscillator with strictly positive viscous
damping coefficient, γ > 0. It corresponds to
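\[
L = \partial_t^2 + 2\gamma\,\partial_t + \omega_0^2 \quad\Rightarrow\quad P = -\omega^2 - 2i\gamma\omega + \omega_0^2.
\]
The polynomial factorizes as
\[
P = -(\omega - \omega_+)(\omega - \omega_-), \qquad \omega_\pm = -i\gamma \pm \sqrt{\omega_0^2 - \gamma^2}.
\]
Thus (employing the sign convention for the FT of time variables, as in (A.26)),
\[
G(t) = -\frac{1}{2\pi}\int \frac{e^{-i\omega t}}{(\omega - \omega_+)(\omega - \omega_-)}\,d\omega.
\]
Both poles lie in the lower half-plane, so the integral vanishes for t < 0. A simple exercise
in contour integration gives
\[
G(t) = i\,\theta(t)\,e^{-\gamma t}\Big(-\frac{1}{\omega_+ - \omega_-}\,e^{i\sqrt{\omega_0^2 - \gamma^2}\,t} + \frac{1}{\omega_+ - \omega_-}\,e^{-i\sqrt{\omega_0^2 - \gamma^2}\,t}\Big).
\]
Since we’re done integrating in Fourier space, we can re-use the letter ω, defining it to be
√|ω0² − γ²|. Then the above may be simplified (using l’Hôpital for the equality case) to
\[
G(t) = \theta(t)\,e^{-\gamma t}\times\begin{cases} \sin(\omega t)/\omega & \omega_0 > \gamma, \\ t & \omega_0 = \gamma, \\ \sinh(\omega t)/\omega & \omega_0 < \gamma. \end{cases} \tag{D.8}
\]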
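To see (D.3) and (D.8) at work, one can compare the convolution G ⋆ f with a direct numerical integration of the equation of motion. This is a minimal sketch under stated assumptions: only the underdamped case ω₀ > γ is implemented, the Gaussian driving force and all parameter values are arbitrary illustrative choices, and the helper names are hypothetical.

```python
import math

GAMMA, OMEGA0 = 0.3, 2.0
OMEGA = math.sqrt(OMEGA0 ** 2 - GAMMA ** 2)  # underdamped case only

def force(t):
    # Illustrative driving force: a Gaussian pulse, negligible before t = -5.
    return math.exp(-(t - 3.0) ** 2)

def green(t):
    """Retarded Green's function (D.8) for omega0 > gamma."""
    return math.exp(-GAMMA * t) * math.sin(OMEGA * t) / OMEGA if t > 0 else 0.0

def solve_by_convolution(t_end, t_start=-5.0, n=15000):
    """Trapezoidal approximation of u(t_end) = integral of G(t_end - t') f(t') dt'."""
    h = (t_end - t_start) / n
    total = 0.5 * (green(t_end - t_start) * force(t_start) + green(0.0) * force(t_end))
    for j in range(1, n):
        tp = t_start + j * h
        total += green(t_end - tp) * force(tp)
    return total * h

def solve_by_rk4(t_end, t_start=-5.0, n=15000):
    """Integrate u'' + 2*gamma*u' + omega0^2*u = f with u = u' = 0 at t_start (classical RK4)."""
    h = (t_end - t_start) / n
    u, v, t = 0.0, 0.0, t_start
    acc = lambda t, u, v: force(t) - 2 * GAMMA * v - OMEGA0 ** 2 * u
    for _ in range(n):
        k1u, k1v = v, acc(t, u, v)
        k2u, k2v = v + 0.5 * h * k1v, acc(t + 0.5 * h, u + 0.5 * h * k1u, v + 0.5 * h * k1v)
        k3u, k3v = v + 0.5 * h * k2v, acc(t + 0.5 * h, u + 0.5 * h * k2u, v + 0.5 * h * k2v)
        k4u, k4v = v + h * k3v, acc(t + h, u + h * k3u, v + h * k3v)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        t += h
    return u

u_conv = solve_by_convolution(10.0)
u_ode = solve_by_rk4(10.0)
print(u_conv, u_ode)
```

Starting the integration at a finite time where the force is negligible stands in for the boundary condition u(−∞) = 0.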
Locally integrable case


Even if P(k) does have real roots, (D.7) might be locally integrable and thus well-defined
as a distribution (cf. Sec. C.2.2). The most important example is the Laplace operator
$L = -\Delta = -\sum_{i=1}^{3}\partial_{x_i}^2$. Then P(k) = ∥k∥², and, by (C.20),
\[
G(x) = (2\pi)^{-3/2}\,\mathcal{F}^{-1}\Big(\frac{1}{\|k\|^2}\Big)(x) = \frac{1}{4\pi}\,\frac{1}{\|x\|} \tag{D.9}
\]
is a Green’s function. (As the homogeneous equation −∆h = 0 has plenty of solutions, G
is far from unique).

D.2.2 Complex integration


Univariate case
Given a univariate polynomial P(ω), choose a deformation γ of the real axis in the complex
plane that avoids the zeros of P and consider the complex integral
\[
G_\gamma(t) := \frac{1}{2\pi}\int_\gamma \frac{e^{-i\omega t}}{P(\omega)}\,d\omega. \tag{D.10}
\]
The good news is that the integral now exists. The bad news is that it is not clear any more
that it has anything to do with Green’s functions. But actually, it does! Applying L,
\[
L\,G_\gamma(t) = \frac{1}{2\pi}\int_\gamma P(\omega)\,\frac{e^{-i\omega t}}{P(\omega)}\,d\omega = \frac{1}{2\pi}\int_\gamma e^{-i\omega t}\,d\omega = \frac{1}{2\pi}\int_{\mathbb{R}} e^{-i\omega t}\,d\omega = \delta(t).
\]
The cool trick that makes this calculation work is that after multiplying with P (ω), the
integrand is an entire function, so we can move the integration path right back to the real
line without changing the value of the integral.
For the same reason, two paths γ, γ ′ will lead to the same Green’s function if they can
be deformed into each other without crossing a pole. In general, however, Gγ does depend
on the choice of γ.

“Infinitesimal” deformations
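There’s a variant of this construction that can be interpreted as “shifting the roots of P
away from the real axis” instead of “deforming the integration path to avoid the roots”.
Write the complex ω’s on the path γ as ω = u + iv(u), so that dω = (1 + iv′(u)) du. Then
\[
\frac{1}{2\pi}\int_\gamma \frac{e^{-i\omega t}}{P(\omega)}\,d\omega = \frac{1}{2\pi}\int \frac{e^{-iut}}{P(u + iv(u))}\,e^{v(u)t}\,\big(1 + iv'(u)\big)\,du.
\]
If there are no zeros between γ and the real line, the integral does not change under the
substitution v(u) ↦ ϵv(u) for 0 < ϵ ≤ 1. In particular, since both e^{ϵv(u)t} and 1 + iϵv′(u)
tend to 1 as ϵ → 0,
\[
\frac{1}{2\pi}\lim_{\epsilon\to 0^+}\int \frac{e^{-iut}}{P(u + i\epsilon v(u))}\,e^{\epsilon v(u)t}\,\big(1 + i\epsilon v'(u)\big)\,du = \frac{1}{2\pi}\lim_{\epsilon\to 0^+}\int \frac{e^{-iut}}{P(u + i\epsilon v(u))}\,du.
\]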
In the common special case where v(u) is constant, the limit is typically written as
\[
G^\pm(t) = \frac{1}{2\pi}\int \frac{e^{-i\omega t}}{P(\omega \pm i0)}\,d\omega, \tag{D.11}
\]
with the sign depending on sgn v. (This construction should remind you of the formula
(C.10), expressing the principal value in terms of the “side limit” 1/(x ± i0)). Of course,
ω ↦ P(ω ± iϵ) is a polynomial whose roots are shifted by ∓iϵ compared to the ones of P.
Multivariate case
We can reduce the problem for arbitrary n to the n = 1-case. While the technique works
in general, it is particularly natural if one of the variables is distinguished in some way. In
physics applications, this is typically the time. With this in mind, we’ll use x = (t, x) for
the arguments of u, f, G, and k = (ω, k) for the arguments of Fourier transforms.
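Define pk(ω) = P(ω, k). Then pk is a univariate polynomial, and we can just repeat
the n = 1-construction from above, but in a k-dependent way. That is to say, choose
deformations γk avoiding the zeros of pk and define
\[
G_\gamma(x) := \frac{1}{(2\pi)^n}\int\Big(\int_{\gamma_{\mathbf{k}}} \frac{e^{-i\omega t}}{p_{\mathbf{k}}(\omega)}\,d\omega\Big)\,e^{i\mathbf{k}\mathbf{x}}\,d^{n-1}\mathbf{k}, \tag{D.12}
\]
where the prefactor is the normalization of (D.7).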

The proof that Gγ is a Green’s function works exactly as in the univariate case.

Example: The Klein-Gordon equation


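The Klein-Gordon operator $L = \partial_t^2 - \sum_{i=1}^{3}\partial_{x_i}^2 + m^2$ corresponds to the polynomial
\[
P = -\omega^2 + \sum_{i=1}^{3} k_i^2 + m^2 = -(\omega - \omega_{\mathbf{k}})(\omega + \omega_{\mathbf{k}}), \qquad \omega_{\mathbf{k}} = \sqrt{m^2 + \|\mathbf{k}\|^2}. \tag{D.13}
\]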

There are three natural choices for deformations γk avoiding the poles. The simplest ones
are γk± : straight lines parallel to the real axis with imaginary parts ±ϵ. From the discussion
above, the integral does not depend on the value of ϵ > 0. The third contour is γkF , which
moves around −ωk in the lower half-plane and around +ωk in the upper half-plane (the
superscript “F ” is for Feynman; see Fig. ??).
Let’s look at $G^+(x) = G_{\gamma^+}(x)$. The frequency integral can be evaluated exactly as in
the damped harmonic oscillator example above, leading to
\[
2\pi i\,\theta(t)\,\frac{e^{-i\omega_{\mathbf{k}} t} - e^{i\omega_{\mathbf{k}} t}}{2\omega_{\mathbf{k}}},
\]
so that the full integral is
\[
G^+(x) = i\,\theta(t)\int \frac{1}{2\omega_{\mathbf{k}}}\big(e^{-i\omega_{\mathbf{k}} t} - e^{+i\omega_{\mathbf{k}} t}\big)\,e^{-i\mathbf{k}\mathbf{x}}\,\frac{d^3k}{(2\pi)^3}.
\]
The k-integral can be expressed in terms of Bessel functions (but the result isn’t pretty).
In any case, the factor θ(t) means that the convolution u = G+ ⋆ f, evaluated at (t, x),
only depends on f(t′, x′) for t′ ≤ t. We have thus constructed a retarded Green’s function.

D.2.3 Using the principal value


One can generalize the principal value (Sec. C.2.2) to find Green’s functions. We’ll out-
line how one can use this approach to construct all Green’s functions for a given one-
dimensional problem.
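Start with L = ∂t. A Green’s function is a solution to
\[
\omega\,\tilde G(\omega) = \frac{i}{\sqrt{2\pi}}. \tag{D.14}
\]
Thus we’re looking for distributions G̃ “proportional to 1/ω”. That was exactly our
motivation for introducing the principal value in Eq. (C.7). And indeed,
\[
\Big(\omega\operatorname{pv}\Big(\frac{1}{\omega}\Big)\Big)(\tilde f) = \lim_{\epsilon\to 0^+}\int_{|\omega|\ge\epsilon} \frac{1}{\omega}\,\omega\,\tilde f(\omega)\,d\omega = \int 1\cdot\tilde f(\omega)\,d\omega \quad\Rightarrow\quad \omega\operatorname{pv}\Big(\frac{1}{\omega}\Big) = 1.
\]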

Using that ωδ(ω) = 0, an affine space of solutions to (D.14) is given by
\[
\tilde G_\lambda(\omega) = \frac{i}{\sqrt{2\pi}}\operatorname{pv}\Big(\frac{1}{\omega}\Big) + \lambda\,\delta(\omega),
\]
and one can show that these are all. An inverse Fourier transform gives
\[
G_\lambda(t) = \frac{1}{2}\operatorname{sign}(t) + \frac{\lambda}{\sqrt{2\pi}}.
\]
Choosing λ = ±√(π/2), we recover the retarded/advanced Green’s functions G± found
before in Sec. D.1.
Let’s now sketch how to generalize this construction to solve P G̃ = (2π)^{−1/2} for a
general polynomial P.
First assume that all real roots of P are simple, i.e. that
\[
P(\omega) = Q(\omega)\prod_{l=1}^{d}(\omega - a_l),
\]
where Q is a polynomial without real roots and the a_l are distinct real numbers.
Define
\[
\tilde G(\tilde f) = \frac{1}{\sqrt{2\pi}}\;\mathcal{P}\!\!\int \frac{\tilde f(\omega)}{P(\omega)}\,d\omega,
\]
where the symbol $\mathcal{P}\!\!\int$ denotes the principal value integral that is computed by approaching
each of the singularities a_l symmetrically, the same way integration around 0 is handled
in pv(1/ω). The proof that G̃ is indeed a Green’s function then works as the one for
pv(1/ω) given above. The ambiguity in defining G̃ corresponds to adding multiples of
delta distributions supported on the real zeros of P:
\[
\tilde G \mapsto \tilde G + \sum_{l=1}^{d}\lambda_l\,\delta_{a_l}.
\]

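It remains to treat roots with higher multiplicity. Here, we only discuss the case
P(ω) = −ω²; the general case works similarly. The trick is to write
\[
-\frac{1}{\omega^2} = \partial_\omega\frac{1}{\omega}.
\]
But we already know how to associate a distribution with 1/ω (the principal value) and
how to differentiate distributions (by applying minus the derivative to the test function).
Indeed, with
\[
\tilde G = \frac{1}{\sqrt{2\pi}}\,D\operatorname{pv}(1/\omega)
\]
we get, for every test function ϕ,
\[
\begin{aligned}
\big((-\omega^2)\,\tilde G\big)(\phi) &= \frac{1}{\sqrt{2\pi}}\;\mathcal{P}\!\!\int \frac{1}{\omega}\,(-\partial_\omega)\big(-\omega^2\phi(\omega)\big)\,d\omega \\
&= \frac{1}{\sqrt{2\pi}}\;\mathcal{P}\!\!\int \frac{1}{\omega}\,\big(2\omega\,\phi(\omega) + \omega^2\,\partial_\omega\phi(\omega)\big)\,d\omega \\
&= \frac{1}{\sqrt{2\pi}}\Big(2\int \phi(\omega)\,d\omega - \int \phi(\omega)\,d\omega\Big) = \frac{1}{\sqrt{2\pi}}\int \phi(\omega)\,d\omega.
\end{aligned}
\]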
One may verify that the ambiguity is given by

G̃ 7→ G̃ + λ1 δ + λ2 δ ′ .

D.3 Resolvents
Given a linear operator L on a Hilbert space H, the function that maps complex numbers
z to the operator

R(z; L) := (z 1 − L)−1 ,

is called the resolvent of L. (Warning: Some authors use the sign convention (L−z 1)−1 !).
From the discussion in Sec. D.1, it is clear that if (L − z) is invertible, its Green’s function
is

G(z; x) = −⟨0|R(z; L)|x⟩.

We’ll see that even if (L − z) is not invertible, suitable limits in z make sense as distribu-
tions and define Green’s functions.
Independent of their use for constructing Green’s functions, resolvents play an impor-
tant role in functional analysis. We’ll have a brief look at general properties first, and work
out some Green’s functions in Sec. D.3.2.

D.3.1 Resolvents and the spectrum


Recall that if L is a finite-dimensional square matrix, then each of the following conditions
is equivalent to z being an eigenvalue:
• (z − L) is not injective,
• (z − L) is not surjective,
• det(z − L) = 0.
The final condition allows us to prove that over C, every matrix has an eigenvalue. That’s
because the determinant is a polynomial, and thus, by the fundamental theorem of algebra,
has a root over the complex numbers.
In infinite dimensions, the situation is more complicated. Injectivity and surjectivity of
linear maps are no longer equivalent, and the definition of the determinant makes no sense
in infinite dimensions. Still, the analysis of eigenvalues for linear operators, culminating
in the spectral theorem, starts with a classification of how (z − L) is or is not invertible.
Here’s a summary of results. If L is a Hermitian operator, then every z ∈ C falls into
one of three categories:
• (z − L) fails to be injective.
This happens iff z is a proper eigenvalue of L, i.e. iff there is a normalizable |ψ⟩ ∈ H
such that L|ψ⟩ = z|ψ⟩. The set of such z is called the point spectrum of L.
• (z − L) is injective, but not surjective.
This happens iff z is a generalized (but not a proper) eigenvalue of L, i.e. an eigenvalue
associated with a non-normalizable distribution. In this case (z − L)^{−1} is an
unbounded operator on the range of (z − L), which turns out to be dense in H. The set of such
z is called the continuous spectrum of L.
• (z − L) is both injective and surjective.
This happens iff (z−L)−1 is bounded. Because the spectrum of Hermitian operators
is real, Im z ̸= 0 is sufficient for this case. These z form the resolvent set of L.
It also holds that z 7→ R(z; L) is an (operator-valued) analytic function on the resolvent
set. This means one can use results from complex analysis, e.g.:
• The Taylor series of the resolvent converges. (This is almost as nice as it being a
polynomial, and replaces the use of the fundamental theorem of algebra in the
proof that every bounded operator has an eigenvalue).
• The residue theorem for contour integration applies, see Fig. ??.

D.3.2 The resolvent of the Laplacian


Here, we compute the resolvent function for the Laplacian $-\sum_{i=1}^{n}\partial_{x_i}^2$ for n = 1 and
n = 3.

The Laplacian in one dimension


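Set L = −∂x². Compute, using principal square roots,
\[
\begin{aligned}
G(z; x) &= -\langle 0|(z + \partial_x^2)^{-1}|x\rangle \\
&= \frac{-1}{2\pi}\int_{-\infty}^{\infty} \frac{e^{ikx}}{z - k^2}\,dk \\
&= \frac{-1}{2\pi}\int_{-\infty}^{\infty} \frac{e^{ikx}}{-(k + \sqrt{z})(k - \sqrt{z})}\,dk.
\end{aligned}
\]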

Assuming Im z ≠ 0, the expression can be evaluated by contour integration. If x ≥ 0,
then e^{ikx} remains bounded when k takes values in the upper half-plane. The contour is
then positively oriented and encloses a pole at k = ±√z, where the sign is the one of
Im z. Thus,
\[
G(z; x) = 2\pi i\,\operatorname{Res}(\pm\sqrt{z}) = i\,\frac{e^{\pm i\sqrt{z}\,x}}{\pm 2\sqrt{z}} = i\,\frac{e^{\pm i\sqrt{z}\,|x|}}{\pm 2\sqrt{z}}, \qquad \pm = \operatorname{sign}\operatorname{Im} z.
\]

If x < 0, the contour needs to be closed in the lower half-plane, but a similar argument
shows that the same end result holds in this case, too.
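
The closed form can be sanity-checked numerically. The following is a minimal sketch, assuming NumPy; the function name `resolvent_kernel_1d`, the test value of `z` and the step size `h` are illustrative choices, not part of the notes. It checks two properties one would expect of a Green's function for $(-\partial_x^2 - z)$: the homogeneous equation holds away from $x = 0$, and the first derivative jumps by $-1$ at $x = 0$.

```python
import numpy as np

def resolvent_kernel_1d(z, x):
    """Evaluate G(z; x) = i exp(i k0 |x|) / (2 k0), where k0 = +sqrt(z) or -sqrt(z)
    is whichever root lies in the upper half-plane (hypothetical helper)."""
    k0 = np.sqrt(complex(z))      # principal square root
    if k0.imag < 0:               # happens exactly when Im z < 0
        k0 = -k0
    return 1j * np.exp(1j * k0 * np.abs(x)) / (2 * k0)

z = 2.0 + 0.5j   # arbitrary test point with Im z != 0
h = 1e-3         # finite-difference step

# Away from x = 0, G should satisfy (-d^2/dx^2 - z) G = 0.
x0 = 0.7
second_derivative = (resolvent_kernel_1d(z, x0 + h)
                     - 2 * resolvent_kernel_1d(z, x0)
                     + resolvent_kernel_1d(z, x0 - h)) / h**2
residual = -second_derivative - z * resolvent_kernel_1d(z, x0)

# At x = 0, the delta function appears as a jump of -1 in the first derivative.
slope_right = (resolvent_kernel_1d(z, h) - resolvent_kernel_1d(z, 0.0)) / h
slope_left = (resolvent_kernel_1d(z, 0.0) - resolvent_kernel_1d(z, -h)) / h
jump = slope_right - slope_left

print("residual away from the origin:", abs(residual))
print("derivative jump at the origin:", jump)
```

Both quantities are only accurate up to finite-difference error, so the residual should be small rather than exactly zero.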
Now for real values of $z = u \in \mathbb{R}$. The spectrum of the Laplacian is $\mathbb{R}_{\geq 0}$ (associated
with the distributions $e^{ikx}$, for $k = \sqrt{u}$). Therefore, if $u < 0$, it is an element of the
resolvent set, where $R(z)$ is analytic and hence continuous. We thus expect $G(u; x)$ to
be independent of the particular limit taken. On the other hand, values $z = u \geq 0$ are part
of the spectrum, and limits $G(z \to u, x)$ might fail to exist or depend on the particular
way the limit is taken. For concreteness, we evaluate two side limits
 √
±i u|x|
 ±i e √ u>0
√ 2 u
G± (u; x) := lim G(u ± is; x) = e − |u||x| .
s↓0  √ u<0
2 |u|

The side limits $\lim_{s \downarrow 0} G(\pm is; x)$ at $u = 0$ do not converge. These results are compatible
with our expectations.
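The $G^\pm(u; x)$ are indeed Green’s functions for $(-\partial_x^2 - u)$. We only need to verify this
for $u > 0$, where we find

\[
\begin{aligned}
-\partial_x^2 \Big( \pm i\,\frac{e^{\pm i\sqrt{u}\,|x|}}{2\sqrt{u}} \Big)
&= \frac{1}{2}\,\partial_x \Big( (\operatorname{sign} x)\, e^{\pm i\sqrt{u}\,|x|} \Big)
= \delta(x) \pm \frac{i\sqrt{u}}{2}\, e^{\pm i\sqrt{u}\,|x|} \\
\Rightarrow \quad (-\partial_x^2 - u)\, G^\pm(u; x)
&= \delta(x) \pm \frac{i\sqrt{u}}{2}\, e^{\pm i\sqrt{u}\,|x|} \mp \frac{i\sqrt{u}}{2}\, e^{\pm i\sqrt{u}\,|x|} = \delta(x).
\end{aligned}
\]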
The side limits don’t converge if $u = 0$, but the limit of their mean does:

\[
\lim_{u \downarrow 0} \frac{1}{2}\big( G^+(u, x) + G^-(u, x) \big)
= \lim_{u \downarrow 0} \frac{1}{2}\,\frac{i}{2\sqrt{u}}\, 2i \sin\big(\sqrt{u}\,|x|\big)
= -\frac{1}{2}|x|,
\]

which is a Green’s function for $L = -\partial_x^2$, as already argued in Sec. D.1.
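
The behaviour near $u = 0$ can be illustrated numerically. This is a small sketch, assuming NumPy; `side_limit_1d` is a hypothetical helper that evaluates the $u > 0$ branch of the side limits, and the sample values of `u` and `x` are arbitrary. It shows that each side limit grows like $1/(2\sqrt{u})$ while their mean approaches $-|x|/2$.

```python
import numpy as np

def side_limit_1d(u, x, sign):
    """Evaluate the u > 0 side limit: sign * i * exp(sign * i * sqrt(u) * |x|) / (2 sqrt(u)).
    `sign` is +1 or -1 (hypothetical helper)."""
    root = np.sqrt(u)
    return sign * 1j * np.exp(sign * 1j * root * np.abs(x)) / (2 * root)

x = np.linspace(-3.0, 3.0, 7)
deviations = []
for u in (1e-2, 1e-4, 1e-6):
    mean = 0.5 * (side_limit_1d(u, x, +1) + side_limit_1d(u, x, -1))
    # Largest distance between the mean and the claimed limit -|x|/2 on the sample points.
    deviation = np.max(np.abs(mean - (-0.5 * np.abs(x))))
    deviations.append(deviation)
    size = abs(side_limit_1d(u, 0.0, +1))
    print(f"u = {u:.0e}: |G+(u; 0)| = {size:.1f}, max deviation of mean = {deviation:.2e}")
```

The deviation of the mean should shrink roughly in proportion to $u$, whereas the individual side limits blow up.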

The Laplacian in three dimensions


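Evaluate the inner product in spherical coordinates in Fourier space (compare Eq. (B.4))
to find

\[
\begin{aligned}
G(z, x) &= -\langle 0|(z + \Delta)^{-1}|x\rangle \\
&= \frac{-1}{(2\pi)^3} \int d^3k\, \frac{1}{z - \|k\|^2}\, e^{ikx} \\
&= \frac{-1}{(2\pi)^2} \int_0^{\infty} dk\, \frac{k^2}{z - k^2} \int_{-1}^{1} d\mu\, e^{ikx\mu} \\
&= \frac{-1}{(2\pi)^2} \int_0^{\infty} dk\, \frac{k^2}{z - k^2}\, \frac{e^{ikx} - e^{-ikx}}{ikx} \\
&= \frac{-1}{(2\pi)^2\, ix} \int_{-\infty}^{\infty} dk\, \frac{k\, e^{ikx}}{z - k^2} \\
&= \frac{-1}{(2\pi)^2\, ix} \int_{-\infty}^{\infty} dk\, \frac{k\, e^{ikx}}{-(k + \sqrt{z})(k - \sqrt{z})}.
\end{aligned}
\]

From the third line on, $k$ and $x$ stand for the norms $\|k\|$ and $\|x\|$.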
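Arguing as in the one-dimensional case, a contour integration gives for $\operatorname{Im} z \neq 0$,

\[
G(z; x) = 2\pi i \operatorname{Res}\big(\pm\sqrt{z}\big)
= \frac{-1}{2\pi x}\, \frac{(\pm\sqrt{z})\, e^{\pm i\sqrt{z}\,x}}{-2(\pm\sqrt{z})}
= \frac{e^{\pm i\sqrt{z}\,x}}{4\pi x}, \qquad \pm = \operatorname{sign} \operatorname{Im} z.
\]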
The side limits are

\[
G^\pm(u; x) = \lim_{s \downarrow 0} \frac{e^{\pm i\sqrt{u \pm is}\,x}}{4\pi x} = \frac{e^{\pm i\sqrt{u}\,x}}{4\pi x}.
\]
Unlike the one-dimensional case, the expression does make sense for $u = 0$, where we get
$G^\pm(0; x) = \frac{1}{4\pi x}$, as previously computed in Eq. (D.9).
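
As in one dimension, the result can be checked numerically. The sketch below assumes NumPy; `resolvent_kernel_3d` is a hypothetical helper taking the radius $r = \|x\|$, and the test values are arbitrary. For a radial function the Laplacian is $\frac{1}{r}\partial_r^2 (r G)$, so the check is that $(-\Delta - z)G$ vanishes for $r > 0$ and that the flux $-4\pi r^2\, \partial_r G$ through a small sphere tends to 1, as expected for a unit point source.

```python
import numpy as np

def resolvent_kernel_3d(z, r):
    """Evaluate G(z; r) = exp(i k0 r) / (4 pi r), with k0 the square root of z
    in the upper half-plane (hypothetical helper, r = |x| > 0)."""
    k0 = np.sqrt(complex(z))      # principal square root
    if k0.imag < 0:               # happens exactly when Im z < 0
        k0 = -k0
    return np.exp(1j * k0 * r) / (4 * np.pi * r)

z = 1.5 - 0.8j   # arbitrary test point with Im z < 0, so the sign flip is exercised
h = 1e-3

def r_times_G(r):
    return r * resolvent_kernel_3d(z, r)

# Radial Laplacian (1/r) d^2/dr^2 (r G), evaluated by central differences at r0.
r0 = 0.9
laplacian = (r_times_G(r0 + h) - 2 * r_times_G(r0) + r_times_G(r0 - h)) / (h**2 * r0)
residual_3d = -laplacian - z * resolvent_kernel_3d(z, r0)

# Flux of -grad G through a small sphere of radius r_small.
r_small, dr = 1e-2, 1e-5
dG_dr = (resolvent_kernel_3d(z, r_small + dr) - resolvent_kernel_3d(z, r_small - dr)) / (2 * dr)
flux = -4 * np.pi * r_small**2 * dG_dr

print("residual for r > 0:", abs(residual_3d))
print("flux through small sphere:", flux)
```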

D.4 Propagators
Consider again a differential operator $L$ on functions of time $t$ and space $x \in \mathbb{R}^{n-1}$.
There is a close connection between the inhomogeneous equations we have considered so far
and homogeneous initial value problems. For simplicity, we restrict attention to the case
where $L$ is of the form

\[
L = \frac{1}{c}\,\partial_t - H, \tag{D.15}
\]

for some constant $c$ and where $H$ acts only on the spatial degrees of freedom. A prime
example is

\[
L = i\hbar\,\partial_t + \frac{\hbar^2}{2m}\Delta - V,
\]

so that $L\psi(t, x) = 0$ is the Schrödinger equation.
Recall that the inhomogeneous equation with causal boundary conditions is

\[
Lu(t, x) = f(t, x), \qquad \lim_{t' \to -\infty} u(t', x) = 0.
\]

As we have seen, if we have a Green’s function

\[
LG(t, x) = \delta(t)\delta(x), \qquad G(t', x) = 0 \ \text{ for all } t' \leq 0,
\]

then a solution is given by

\[
u(t, x) = \int dt' \int d^{n-1}x'\; G(t - t', x - x')\, f(t', x').
\]

In contrast, the homogeneous problem with initial value ϕ(x) is

Lu(t, x) = 0,
u(s, x) = ϕ(x) for some s ∈ R.

A propagator for $L$ is a distribution $K(t, x)$ satisfying

\[
LK(t, x) = 0, \qquad K(0, x) = \delta(x). \tag{D.16}
\]

One immediately verifies that given $K$, the initial value problem is solved by

\[
u(t, x) = \int d^{n-1}x'\; K(t - s, x - x')\, \phi(x'). \tag{D.17}
\]

For the first-order differential operator (D.15), the propagator is given by the “matrix
elements” of $e^{ctH}$,

\[
K(t, x) = \langle t, x|\, e^{ctH}\, |0, 0\rangle.
\]

(In QM, the exponential is of course known as the time evolution operator $U(t) = e^{\frac{t}{i\hbar} H}$.)
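
A propagator is easiest to test in a case where everything is known in closed form. The sketch below uses the one-dimensional heat equation, $L = \partial_t - \partial_x^2$ (so $c = 1$ and $H = \partial_x^2$ in (D.15)), which is an assumption made purely for illustration and not an example from these notes. Its propagator is the heat kernel $K(t, x) = e^{-x^2/4t} / \sqrt{4\pi t}$. The code propagates a Gaussian initial value $\phi$ from time $s$ to time $t$ by the convolution $u(t, x) = \int dx'\, K(t - s, x - x')\, \phi(x')$ and compares with the known closed form. All function names, the width `sigma`, and the grid are hypothetical choices.

```python
import numpy as np

def heat_kernel(t, x):
    """Propagator of d/dt - d^2/dx^2 for t > 0 (assumed example)."""
    return np.exp(-x**2 / (4.0 * t)) / np.sqrt(4.0 * np.pi * t)

sigma = 0.8  # width of the Gaussian initial value (arbitrary)

def initial_value(x):
    return np.exp(-x**2 / (2.0 * sigma**2))

def propagate(t, x, s=0.0):
    """u(t, x) = integral over x' of K(t - s, x - x') phi(x'), as a Riemann sum."""
    grid = np.linspace(-20.0, 20.0, 4001)
    dx = grid[1] - grid[0]
    return np.sum(heat_kernel(t - s, x - grid) * initial_value(grid)) * dx

def exact(t, x, s=0.0):
    """Closed form: a Gaussian whose variance has grown from sigma^2 to sigma^2 + 2 (t - s)."""
    variance = sigma**2 + 2.0 * (t - s)
    return sigma / np.sqrt(variance) * np.exp(-x**2 / (2.0 * variance))

t = 0.5
points = [-1.5, 0.0, 0.4, 2.0]
numeric = np.array([propagate(t, x) for x in points])
closed_form = np.array([exact(t, x) for x in points])
print("max difference:", np.max(np.abs(numeric - closed_form)))
```

The same convolution structure applies to the Schrödinger case, but the oscillatory kernel makes a naive Riemann sum unreliable there, which is why the heat equation is used here.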

After these preparations, we can now state an observation sometimes known as Duhamel’s
principle. It says that one can solve the inhomogeneous problem in terms of homogeneous
ones, or, equivalently, one can obtain the Green’s function G from the propagator K. For
the latter point of view, just set

G(t, x) = cθ(t)K(t, x),

and verify, using (D.16),

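\[
\begin{aligned}
LG(t, x) &= \partial_t\big(\theta(t)K(t, x)\big) - cH\theta(t)K(t, x) \\
&= \delta(t)K(t, x) + \theta(t)\,\partial_t K(t, x) - c\,\theta(t)HK(t, x) \\
&= \delta(t)\delta(x) + c\,\theta(t)\,LK(t, x) \\
&= \delta(t)\delta(x).
\end{aligned}
\]

The third line uses $\delta(t)K(t, x) = \delta(t)K(0, x) = \delta(t)\delta(x)$.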
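
Duhamel's principle can be tried out on a finite-dimensional toy model in which $H$ is a matrix and there are no spatial coordinates. This is a minimal sketch, assuming NumPy; the matrix `H`, the constant `c`, the source term and all function names are made-up illustrations. The equation $\frac{1}{c}\partial_t u - Hu = f$ is solved once through $G(t) = c\,\theta(t)\,e^{ctH}$, i.e. $u(t) = \int_{-\infty}^{t} dt'\, c\, e^{c(t - t')H} f(t')$, and once by direct Runge-Kutta integration of $\partial_t u = c(Hu + f)$.

```python
import numpy as np

c = 2.0
H = np.array([[-1.0, 0.3],
              [0.3, -0.5]])           # symmetric, negative definite toy "Hamiltonian"
evals, evecs = np.linalg.eigh(H)

def propagator(t):
    """K(t) = exp(c t H), computed from the eigendecomposition of the symmetric H."""
    return evecs @ np.diag(np.exp(c * t * evals)) @ evecs.T

def source(t):
    """Inhomogeneity f(t); it is negligible for t <= -8, mimicking u(-infinity) = 0."""
    return np.exp(-t**2) * np.array([1.0, -2.0])

def duhamel(t, t_start=-8.0, steps=4000):
    """u(t) = integral of c K(t - t') f(t') over t' in [t_start, t], trapezoidal rule."""
    grid = np.linspace(t_start, t, steps + 1)
    dt = grid[1] - grid[0]
    values = np.array([c * propagator(t - tp) @ source(tp) for tp in grid])
    return dt * (values.sum(axis=0) - 0.5 * (values[0] + values[-1]))

def direct_solve(t_end, t_start=-8.0, steps=4000):
    """Classical fourth-order Runge-Kutta for du/dt = c (H u + f), with u(t_start) = 0."""
    def rhs(t, u):
        return c * (H @ u + source(t))
    u = np.zeros(2)
    dt = (t_end - t_start) / steps
    t = t_start
    for _ in range(steps):
        k1 = rhs(t, u)
        k2 = rhs(t + dt / 2, u + dt / 2 * k1)
        k3 = rhs(t + dt / 2, u + dt / 2 * k2)
        k4 = rhs(t + dt, u + dt * k3)
        u = u + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
    return u

u_duhamel = duhamel(1.0)
u_direct = direct_solve(1.0)
print("Duhamel:", u_duhamel)
print("direct :", u_direct)
```

The two results should agree up to discretization error. The factor `c` in front of the propagator is exactly the one in $G = c\,\theta K$.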

D.5 Epilogue
Warning: The concepts treated in this section are used in sufficiently many fields (quantum
physics, partial differential equations, functional analysis) and some of them (Green’s
functions, resolvents, propagators) are sufficiently strongly related that the terminology in
the literature is a complete mess. Any one of the three terms might be used to refer to any
one of the three concepts by various authors. The bottom line is that you have to be careful
when combining results from different sources.
