Harlow QFT2
Harlow QFT2
Lecture notes for Physics 8.324: Relativistic Quantum field theory II, Fall 2024
Contents
1 Introduction and overview 3
1.1 Where we are now . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 What’s next? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2 Quantum Fermions 8
2.1 One fermion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 N fermions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 Some comments on the inner product and hermiticity . . . . . . . . . . . . . . . . . . . . . . 12
2.4 Real fermions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3 Spinors 13
3.1 Angular momentum review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.2 Lie algebra of the Lorentz group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3 The spinor representation 1: general idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.4 The spinor representation 2: explicit matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.5 Hermiticity and complex conjugation properties of the Dirac matrices . . . . . . . . . . . . . 20
1
8 Lattice fermions and the 2D Ising model 74
8.1 Scalar field on the lattice in 1 + 1 dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
8.2 Lattice fermions in 1 + 1 dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
8.3 Nielsen-Ninomiya theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
8.4 Classical 2D Ising model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
8.5 Transfer matrix and the Hamiltonian formulation for the 1D Ising model . . . . . . . . . . . . 82
8.6 Hamiltonian formulation of the 2D Ising model . . . . . . . . . . . . . . . . . . . . . . . . . . 84
8.7 2D Ising as a free Majorana fermion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
10 Quantum electrodynamics II: charged matter and the path integral 101
10.1 Charged matter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
10.2 Some comments on charge quantization and topology . . . . . . . . . . . . . . . . . . . . . . . 102
10.3 Gauge-invariant charged operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
10.4 QED path integral I: gauge-invariant formalism . . . . . . . . . . . . . . . . . . . . . . . . . . 105
10.5 QED path integral II: fixing the gauge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
10.6 Feynman rules for spinor electrodynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
2
1 Introduction and overview
1.1 Where we are now
In the previous semester we developed the basic structure of relativistic quantum field theory. We saw
that constructing a relativistic theory of interacting quantum particles which respects causality requires the
fundamental degrees of the theory to be fields. Restricting for a moment to theories with no fermions,
quantum fields are operators Φa (x) obeying the following two conditions:
(1) Poincare symmetry: they transform in representations
X ′
U (Λ, a)† Φa (x)U (Λ, a) = Daa′ (Λ)Φa (Λ−1 (x − a)) (1.1)
a′
of the Poincaré group, where U (Λ, a) are the unitary operators implementing Poincaré symmetry on
Hilbert space. The representation matrices Daa′ obey
X
Daa′ (Λ1 )Da′ a′′ (Λ2 ) = Daa′′ (Λ1 Λ2 ). (1.2)
a′
Recall that the the Poincaré group is the set of spacetime transformations
with Λ obeying
Λµα Λν β ηµν = ηαβ (1.5)
and also
det Λ = 1
Λ00 ≥ 1. (1.6)
The condition (1.5) ensures the transformation (1.4) preserves the Minkowski space inner product, or equiv-
alently is an isometry of the Minkowski metric ηµν , while the conditions (1.6) ensure that it preserves the
orientations of space and time (i.e. it does not include time reversal and/or spatial reflection). The spacetime
vector a is a spacetime translation while Λ is a Lorentz transformation built out of some combination of
boosts and rotations. Setting a = 0 the Poincaré group reduces to what is sometimes called the proper
orthochronous Lorentz group, usually denoted SO+ (d−1, 1). Incorporating fermions into this formalism
is not difficult; we need to modify it in two ways:
1. We allow for fields which transform in representations of the spin double cover of the Poincare group.
This is a group that locally looks like the Poincare group, but globally allows for the fact that on fields
of half-integer spin a rotation by 2π is equal to −1 instead of 1. We’ll explain this in more detail in
the next few sections.
2. Fermionic fields are required to anticommute instead of commute at spacelike separation.
Last semester we saw that this formalism leads to some rather general consequences for relativistic
quantum mechanics:
Particle non-conservation: In quantum field theories the interactions always allow for particles to
be created and destroyed, essentially because the interaction terms always have the form (a + a† )n and
this always includes terms with different numbers of creation and annihilation operators.
3
Existence of antiparticles: Given any particle of mass m and charge q, we saw that to construct a
field operator obeying microcausality we need to also have a particle of mass m and charge −q.
Spin-statistics theorem: By using the rotational symmetry of the Euclidean path integral, we saw
that particles/fields of integer spin must be bosons and particles/fields of half-integer spin must be
fermions. This is essentially because we can exchange two fields by rotating each by π, and for an
appropriate basis of fields this is the same as a rotation by 2π.
CRT theorem: Again using rotational symmetry in Euclidean signature, we saw that any relativistic
quantum field theory has a special symmetry called CRT which reverses time, reflects one spatial
direction, and exchanges particles and antiparticles.
An example of a relativistic quantum field theory that we studied in great detail last semester is the free
scalar field, with Lagrangian density
1 m2 2
L = − ∂µ ϕ∂ µ ϕ − ϕ . (1.7)
2 2
We solved this theory by introducing the Heisenberg field operator
dd−1 p
Z
1 h ip·x † −ip·x
i
Φ(x) = a p
⃗ e + a p
⃗ e , (1.8)
(2π)d−1 2ωp⃗
p
[ap⃗ , ap⃗ ′ ] = 0
[a†p⃗ , a†p⃗ ′ ] = 0
[ap⃗ , a†p⃗ ′ ] = (2π)d−1 δ d−1 (⃗
p − p⃗ ′ ), (1.11)
which is essentially just the algebra of a continuous set of harmonic oscillators labeled by the spatial mo-
mentum p⃗. Moreover we showed that (after a renormalization of the cosmological constant) the Hamiltonian
is given by
dd−1 p
Z
H= ωp⃗ a†p⃗ ap⃗ . (1.12)
(2π)d−1
Thus the ground state of the theory, also called the vacuum, is the state |Ω⟩ which is annihilated by all
ap⃗ , and by acting on the vacuum with the a†p⃗ s we can create particles, each of which has momentum p⃗ and
energy ωp⃗ . These particles are bosons, since e.g. we have
We also considered interacting theories, in particular spending a lot of time on the λϕ4 theory with
Lagrangian density
1 m2 2 λ 4
L = − ∂µ ϕ∂ µ ϕ − ϕ − ϕ . (1.14)
2 2 4!
4
This theory has many physical applications, for example it can be used to study the critical points of
magnets and the Higgs boson in particle physics (this is the complex version). Unfortunately it cannot
be solved exactly, but we saw we could learn quite a bit about it using perturbation theory. Perturbation
theory is especially nice to construct starting from the path integral formalism. For example the path integral
computes the time-ordered n-point correlation function of ϕ as
We can evaluate these path integrals perturbatively by Taylor-expanding the integrand in λ, which reduces
all integrals to moments of Gaussian integrals that can be done explicitly. The result is the Feynman rules
for computing correlation functions.
In quantum field theory it is the fields that are fundamental, not the particles, and indeed in some
quantum field theories there are no particles at all. On the other hand many quantum field theories do have
particles, such as the standard model of particle physics, and in those theories we are often interested in
the scattering theory of the particles. We formalized this scattering theory by introducing “in” and “out”
eigenstates of the Hamiltonian, whose wave packets are non-interacting at early or late times respectively, and
we defined the S-matrix as the inner product between the in and out states. We were careful to emphasize
that the in and out states in general do not have any simple relationship to the fields appearing in the
Lagrangian, for example in QCD the fields in the Lagrangian describe quarks and gluons but the scattering
states are hadrons. We further explained how the S-matrix can be extracted from the correlation functions
of fields using the LSZ reduction formula, which works by taking the Fourier transform of the correlator
and then taking all external momenta to go on shell and looking at the residue of the resulting pole.
We closed the semester by introducing the renormalization group, initially as a way of interpreting the
various short-distance divergences which arise in computing Feynman diagrams and then as a general philos-
ophy of life. What we found is that the operation of going to low energies (or equivalently long distances) has
a strong focusing behavior, formalized as Polchinski’s theorem, which “forgets” high-energy/short-distance
information about the theory. This focusing behavior is what leads to the extraordinarily predictive nature
of quantum field theory: we can realize the generic low-energy behavior by including only “relevant” and
“marginal” terms in the Lagrangian (these are terms whose energy dimension is less than or equal to d). So
for example for d = 4 the most general Lagrangian we need to write down for a scalar field with a symmetry
ϕ′ = −ϕ is precisely the Lagrangian (1.14).1
gravity.
5
Gauge fields: these are one-form fields Aµ (x) which are also charged under a local gauge symmme-
try. The simplest example is the vector potential of electromagnetism, which has the familiar gauge
transformation
A′µ = Aµ + ∂µ Λ (1.17)
with Λ an arbitrary scalar function of space and time. We will see that such a gauge symmetry is
necessary whenever we have a massless particle of helicity one. There are also more sophisticated
“non-abelian” gauge fields, where Aµ is a matrix of one-forms instead of just a single one-form. Gauge
fields are the mediators of all of the non-gravitational forces in nature, and they also appear in the
description of interesting condensed matter phenomena such as the fractional quantum hall effect.
Gravity: in Einstein’s theory of gravity the geometry of spacetime is described by a dynamical metric
tensor gµν (x). This theory also has a local gauge symmetry, called general coordinate invariance
or diffeomorphism symmetry, which says that the theory looks the same in arbitrary coordinate
systems. Just as having a consistent theory of a massless particle of helicity one (the photon) requires
the gauge symmetry (1.17), having a consistent theory of a massless particle of helicity two (the
graviton) requires general coordinate invariance.
Gravitational theories are sufficiently different from non-gravitational theories that it is conventional to view
them as falling outside of the standard framework of quantum field theory. There are several reasons for
this:
1. No local operators: we will learn later that gauge symmetries typically must be viewed as redundan-
cies rather than physical transformations, which is why in electromagnetism it is the gauge-invariant
field strength tensor
Fµν = ∂µ Aν − ∂ν Aµ (1.18)
which measurable rather than the vector potential Aµ .2 In a gravitational theory general coordinate
transformations are gauge transformations, so no local operator O(x) can be gauge-invariant (since
gauge transformations would move it around). In a gravitational theory gauge-invariant observables
need to be defined in a relational manner, with their locations defined relative to some feature in the
system or else the asymptotic boundary. This is rather inconvenient from the point of view of our field
theory machinery, which is mostly built around correlation functions of local operators.
2. No energy-momentum tensor: in quantum field theory an essential object is the energy-momentum
tensor, which we defined as the derivative of the action with respect to a background metric. The various
Poincaré symmetry generators are expressed as integrals involving the energy-momentum tensor, and
its correlation functions appear in many important quantum field theory applications (for example
the energy flux in a particle detector or the flux of energy out of an evaporating black hole). In
gravitational theories there is no energy-momentum tensor since the metric is a dynamical field rather
than a background field.
3. Less important role for Poincaré symmetry: in ordinary quantum field theory the Poincaré group
arises as the symmetry group of the background Minkowski metric. In gravity the metric is dynamical,
and indeed depending on the matter configuration it need not look anything like Minkowski space. And
moreover even when we do restrict to situations where the metric approaches the Minkowski metric
at large distances, Poincaré symmetry mixes into a larger structure involving “BMS transformations”
which even today is not completely understood.
4. No renormalizable interactions: we saw at the end of last semester that in general relativity the in-
teractions between gravitons and also the interactions of gravitons and matter are non-renormalizeable,
or in the Wilsonian parlance they are irrelevant. This means that they vanish at energies that are low
2 An exception to this statement is the Aharonov-Bohm effect, which measures the gauge-invariant line integral of A around
a closed loop. This however is only really a separate degree of freedom if the loop is topologically non-contractible (or more
carefully non-contractible within the region of spacetime where the gauge field is a good description of the physics).
6
compared to the Planck scale, which is why gravity is so weak, and it also means that gravitational
interactions are not constrained by the focusing behavior of the renormalization group: as we work
to higher and higher orders in Newton’s constant, we need to include more and more terms in the
Lagrangian to parametrize the possible interactions. This means that we need some “UV-complete”
theory of quantum gravity if we are ever to understand in detail what happens in high-energy graviton
scattering.3
5. No theory yet: building off of the previous point, so far we do not have any candidate theory of
quantum gravity that is consistent with everything we know about the world. There is a promising
framework called string theory which seems to contain many of the right ingredients, but so far we only
know how to formulate string theory precisely in unrealistic situations with unbroken supersymmetry
and a non-positive cosmological constant.
6. Black holes and holography: in the special corners where we do understand string theory, the pre-
cise formulation looks nothing like the quantum field theory we have pursued in this class. Rather than
quantizing some matter fields and a metric on spacetime, we instead quantize some strongly-interacting
non-gravitational lower-dimensional theory living in some auxiliary spacetime. The conventional pic-
ture of spacetime is emergent, rather than fundamental. This is a radical change from our quantum
field theory understanding of non-gravitational physics, and it is usually referred to as holography.
Holography seems to be necessary in order to have a consistent theory of quantum black holes.
For these reasons we will therefore mostly not consider gravity any further in this class. Spinor and gauge
fields however will squarely be our business, and indeed the rest of the semester will more or less consist
of developing the formalism for these and introducing some of their most interesting applications. Here are
some of the applications we will consider:
Quantum electrodynamics (QED): the fundamental theory of electrons, positrons, and nuclei
interacting via quantized photons, QED underlies most of atomic physics and thus accounts for much
of the world we see around us. In particular phenomena such as the 2s→1s decay of hydrogen, the
anomalous magnetic moment of the electron, and the Lamb shift cannot be understood without a
quantum theory of light.
Yukawa theory of the nuclear force: How are the protons and neutrons in the atomic nucleus held
together? The answer is that there is an attractive force between nucleons which is mediated by scalar
particles called pions, and we will study this force using quantum field theory. Next semester we will
see how this force arises from quantum chromodynamics (QCD), which is the fundamental theory of
the strong force.
2D Ising model: one of the landmark developments of statistical physics was Onsager’s solution of
the classical Ising model in two spatial dimensions. A key accomplishment of Onsager’s approach was
the calculation of critical exponents in the vicinity of the transition point. Onsager’s original solution
is quite difficult to understand, but it was later realized that the essential reason why the model is
solvable is that at the critical point it can be rewritten in terms of a free fermion field theory. This
enables fairly straightforward calculations of the critical exponents.
Superconductivity: one of the most remarkable phenonema in solid state physics is the possibility of
conductivity with zero resistance. The available microscopic explanation of this, called BCS theory, is
fairly sophisticated and is not strong enough to account for all observed examples of superconductivity.
A general phenomenological theory is available however via the idea that a superconductor is merely
a system in which the gauge symmetry of electromagnetism is spontaneously broken, or “Higgsed”.
3 It must be emphasized however that other non-renormalizeable theories, such as that which controls the physics of pions
and nucleons, can be fit squarely within the framework quantum field theory. So unlike the other problems in this list, this one
is not unique to gravity.
7
The main topic we will not cover this semester is “non-abelian gauge fields”, which are a generalization of
QED that is needed to describe the strong and weak nuclear forces; this will be the main topic of the third
and final semester of QFT.
Problems:
1. Review anything above “what’s next” which you aren’t comfortable with, especially the equations.
2 Quantum Fermions
We now begin our systematic study of fermions in quantum field theory. By definition fermions are particles
with the property that if you have more than one of them then the quantum state is antisymmetric under
exchanging any two of them. This is to be distinguished from the bosons we have encountered so far, for
example in our free scalar field theory the creation operators were commuting so
In order to get fermions we need the creation and annihilation operators to be anticommuting, meaning
that we want to get something like
We’ve also included a spin index σ out of regard for the spin-statistics theorem: fermions cannot have spin
zero since a rotation by π must act on them as −1. Acting on the vacuum with creation operators obeying
the algebra (2.2) automatically gives us fermions, for example we now have
8
To get the fermionic algebra we therefore need fields which obey canonical anticommutation relations:
The canonical commutation relations are an essential part of the procedure for constructing quantum theories
starting from a classical Lagrangian, so these anticommutation relations may seem somewhat mysterious.
We will therefore now study a simplified version of them, learning that a finite number of quantum fermions
has a finite-dimensional Hilbert space and thus does not have a classical limit in the same way a finite number
of bosonic variables (such as the harmonic oscillator) does. This will end up being the reason why you are
familiar with the classical electromagnetic field but not with a “classical electron field”. Once we have better
intuition for quantum fermions, we will return to the problem of constructing fermionic quantum fields that
are consistent with Lorentz invariance and causality.
{Ψ, Ψ} = 0
{Ξ, Ξ} = 0
{Ψ, Ξ} = 1. (2.8)
This is the fermionic analogue of the canonical commutation relations for a single quantum particle moving
in one spatial dimension. The first two lines merely say that
Ψ2 = Ξ2 = 0, (2.9)
Ψ|0⟩ = 0. (2.10)
This is because given any state |ϕ⟩ which isn’t annihilated by Ψ we simply act on it once with Ψ to get a
state Ψ|ϕ⟩ which is annihilated by Ψ since Ψ2 = 0. Acting on |0⟩ with Ξ we then get another state
In modern parlance, the Hilbert space of a single fermion degree of freedom is just that of one qubit. I
emphasize that this is much simpler than the Hilbert space for a single bosonic variable obeying canonical
9
commutation relations, which has an infinite-dimensional Hilbert space spanned by position eigenstates |x⟩.
We can write this representation in terms of Pauli matrices:
1
Ψ= (σx + iσy )
2
1
Ξ = (σx − iσy ) . (2.15)
2
Here of course the Pauli matrices are given by
0 1 0 −i 1 0
σx = σy = σz = . (2.16)
1 0 i 0 0 −1
F = ΞΨ, (2.17)
which obeys
F |0⟩ = ΞΨ|0⟩ = 0
F |1⟩ = ΞΨΞ|0⟩ = {Ξ, Ψ}Ξ|0⟩ = |1⟩. (2.18)
In the Pauli representation the fermion parity operator (−1)F , which acts on |0⟩ as 1 and |1⟩ as −1, is simply
given by
(−1)F = σz . (2.19)
2.2 N fermions
All this generalizes nicely to a system of N fermions Ψa and their conjugates Ξa , which obey the canonical
anticommutation relations
{Ψa , Ψb } = 0
{Ξa , Ξb } = 0
{Ψa , Ξb } = δba . (2.20)
Ψa |0⟩ = 0. (2.21)
We can always find such a state by picking any state which isn’t annihilated by some subset of the Ψa and
then acting on it with them. The order in which we act doesn’t matter since the Ψa all anticommute and
thus can be moved past each other up to a sign. We then construct a basis for the Hilbert space of the form
The action of Ψa and Ξa on these basis states is not hard to work out from the algebra (2.20), for example
if sa = 0 then the state is annihilated by Ψa while if sa = 1 then it is annihilated by Ξa . The Hilbert space
for N fermionic variables is thus just the Hilbert space of N qubits. The fermion number operator F is now
defined to be
XN
F = Ξa Ψa , (2.24)
a=1
10
Figure 1: Creating a five-qubit basis state by acting on the |0⟩ state with fermion operators. The black dots
σ −iσ
indicate the x 2 y factors, and the Jordan-Wigner string operators σz always act on the zero state and thus
only contribute factors of +1.
We can give a more explicit representation of the fermion algebra using Pauli matrices. Considering first
the case N = 2, it not hard to check that the operators
σx + iσy
Ψ1 = ⊗I
2
σx − iσy
Ξ1 = ⊗I
2
σx + iσy
Ψ2 = σz ⊗
2
σx − iσy
Ξ2 = σz ⊗ (2.26)
2
{Ψ1 , Ψ2 } = 0 (2.27)
{Ψ2 , Ξ2 } = 1 (2.28)
σx +iσy σx −iσy
since σz2 = 1 and we already showed that { 2 , 2 } = 1 in the one-fermion case. It is also straight-
forward to confirm (2.22), for example
Ξ1 Ξ2 |00⟩ = |11⟩ (2.29)
σx −iσy
since σz |0⟩ = |0⟩ and 2 |0⟩ = |1⟩. This construction generalizes to N fermions in a nice way: we simply
have
σx + iσy
Ψa = σz ⊗ . . . σz ⊗ ⊗ I ⊗ ... ⊗ I
2
σx − iσy
Ξa = σz ⊗ . . . σz ⊗ ⊗ I ⊗ . . . ⊗ I, (2.30)
2
11
σ ±iσ
where there are N tensor factors in both expressions (for the N qubits) and the operator x 2 y appears
in the ath factor. In the homework you will show in detail that these expressions obey (2.20). The chain
σ ±iσ
of σz operators appearing to the left of x 2 y is typically called the Jordan-Wigner string, and this
representation of the N -fermion algebra on the Hilbert space of N qubits is called the Jordan-Wigner
representation. The expression (2.22) for the fermion basis has a nice graphical representation in terms of
the Jordan-Wigner strings, see figure 1. Fermion parity also has a nice representation along these lines, it is
the complete Jordan-Wigner string:
(−1)F = σz ⊗ . . . ⊗ σz . (2.31)
The easiest way to see this is to exponentiate the action of F on basis states:
P
(−1)F |s1 . . . sN ⟩ = (−1) a sa
|s1 . . . sN ⟩. (2.32)
It is worth emphasizing a rather subtle point about this construction. In quantum mechanics class we
typically learn that the composition of two physical systems is mathematically described by a tensor product.
So it is here: the Hilbert space of N1 + N2 fermions is the tensor product of an N1 -qubit Hilbert space and
an N2 -qubit Hilbert space. On the other hand the fermionic operators are NOT tensor product operators:
a fermion operator on the first system needs to anticommute with a fermion operator on the second system!
Taking care of this anticommutation is precisely the reason for including the Jordan-Wigner string. On the
other hand including the string seems a bit arbitrary: why did we put the σz operators on the left instead of
the right? And for that matter why did we have to introduce an ordering of the fermions at all? The algebra
(2.20) makes no reference to such an ordering, and for bosonic degrees of freedom we didn’t need to pick an
ordering. The place where the ordering appeared is when we introduced the particular basis (2.22) for the
system; other orderings would lead to bases which differ from this one by signs. Keeping track of the signs
associated to this choice is one of the main headaches of dealing with fermions: most of the time it doesn’t
matter and cancels from observable quantities, but every now and then it leads to something deep.
12
2.4 Real fermions
There is an alternative way of presenting the fermion algebra (2.20) where we make the redefinitions
1
χa = √ (Ψa + Ξa ) (2.38)
2
1
ea = √ (Ψa − Ξa ) ,
χ (2.39)
i 2
which then obey the algebra
{χa , χb } = {e eb } = δab
χa , χ
{χa , χ
eb } = 0. (2.40)
χ†a = χa e†a = χa ,
χ (2.41)
so they are conventionally referred to as real fermions. The Ψa are referred to as complex fermions, and
as you would expect one complex fermion is equal to two real fermions. The Jordan-Wigner representation
of real fermions is particularly simple:
σx
χa = σz ⊗ . . . σz ⊗ √ ⊗ I ⊗ . . . ⊗ I
2
σy
ea = σz ⊗ . . . σz ⊗ √ ⊗ I ⊗ . . . ⊗ I.
χ (2.42)
2
In the condensed matter and quantum information communities real fermions are sometimes called “Majo-
rana fermions”, but this term is really a relativistic notion that we will meet presently so we will reserve it
for then.
It is interesting to note that this construction always produces an even number of real fermions. It is a
topic of some controversy whether or not it makes sense to have a quantum system with an odd number of
real fermions. My vote is no: the constructions which I’ve seen always amount to either taking a system with
one more real fermion (to make the total number even) and then “forgetting” about one of the fermions,
which doesn’t give an irreducible representation of the anticommutation relations, or taking a system with
one fewer real fermions (to again get an even number) and then interpreting fermion parity (−1)F as an
“extra” real fermion.
Problems:
1. Confirm the action (2.25) of fermion number on the basis states of N fermions.
2. Confirm the canonical anticommutation relations (2.20) using the explicit representation (2.30).
3. Confirm the Majorana algebra (2.40) starting from the canonical anticommutation relations (2.20),
and also confirm them from the explicit representation (2.42).
3 Spinors
We would now like to construct Lorentz representations for fields that create and annihilate fermions. To
set up this task more precisely, it is useful to first review some facts about angular momentum in quantum
mechanics.
13
3.1 Angular momentum review
In three spatial dimensions a rotation is a 3 × 3 matrix R with unit determinant whose transpose is also
its inverse, or in other words it is an element of the Lie group SO(3).5 In quantum mechanics each spatial
rotation is represented on Hilbert space by a unitary operator U (R), with the unitary operators obeying the
representation condition
U (R)U (R′ ) = U (RR′ ). (3.1)
The rotation group can be represented in many different ways, and trying to work out all the representations
directly by constructing the unitary operators U (R) is somewhat challenging. A better idea is to first think
about how to represent the rotation group in an infinitesimal neighborhood of the identity, by way of the
angular momentum operators Jx , Jy , and Jz . To introduce these we observe that we can write a general
element of SO(3) as
R = eiS , (3.2)
where S is purely imaginary (to make sure that R is real) and the orthogonality condition R−1 = RT implies
that S is antisymmetric:
S T = −S. (3.3)
Setting the determinant of R to one also tells us that
1 = det R = eiTrS , (3.4)
so we also want S to be traceless:
Tr (S) = 0. (3.5)
In fact this follows already from the antisymmetry of S, so we need only require that S is imaginary and
antisymmetric. The set of 3 × 3 imaginary antisymmetric matrices is a three-dimensional vector space,
spanned by the generators
0 0 0 0 0 1 0 −1 0
Jx = i 0 0 −1 Jy = i 0 0 0 Jz = i 1 0 0 . (3.6)
0 1 0 −1 0 0 0 0 0
These are of course the angular momenta about the x, y, and z axes, the signs are chosen so that a
counterclockwise rotation by θ about the unit vector n̂ is
⃗
R(θ, n̂) = e−iθn̂·J . (3.7)
It is simple to check that these generators obey the algebra
X
[Ji , Jj ] = i ϵijk Jk , (3.8)
k
where ϵijk is the completely antisymmetric tensor with ϵxyz = 1. Equation (3.8) is called the Lie algebra
of SO(3). We can think of the Lie algebra as encoding the multiplication rules of SO(3) “near the identity”,
since if we multiply two elements of SO(3) which are near the identity we have
1 1
eiϵS1 eiϵS2 = 1 + iϵS1 − ϵ2 S12 + . . . 1 + iϵS2 − ϵ2 S22 + . . .
2 2
1 2 1
= 1 + iϵ(S1 + S2 ) − ϵ (S1 + S2 )2 − ϵ2 [S1 , S2 ] + . . .
2 2
iϵ(S1 +S2 )− 21 ϵ2 [S1 ,S2 ]+O(ϵ3 )
=e . (3.9)
5 A Lie Group is a group which is also a smooth manifold, which roughly speaking means it locally looks like a piece of Rn
for some n, and for which the group multiplication and inversion operations are infinitely differentiable. In physics we are most
often interested in matrix Lie groups, meaning Lie groups that can be faithfully represented using finite-dimensional matrices.
For matrix Lie groups the Lie algebra is the vector space of infinitesimal generator matrices, which we will see in a moment
needs to be closed under taking commutators.
14
This formula can be extended to all orders in ϵ, which gives something called the Baker-Campbell-Hausdorff
formula, and the higher order terms in the exponent on the right hand side all have the form of nested
commutators of S1 and S2 . Therefore if we know how to compute the commutators of the generators, then
we know how to compute the products of arbitrary group elements near the identity.6 Thus if we want to
construct a unitary representation of SO(3) near the identity, it is enough to find three hermitian matrices
Ji of any dimension obeying the Lie algebra (3.8), since exponentiating them then gives a representation
of SO(3) with that dimension (at least near the identity). Such a set of hermitian matrices is called a
representation of the Lie algebra of SO(3).
In your quantum mechanics class you presumably used ladder operators to construct all the finite-
dimensional irreducible representations of the Lie algebra (3.8), seeing that they are labeled by an integer
or half-integer j = 0, 12 , 1, 3/2, . . . called the spin and that they have dimension 2j + 1. The representations
with integer spin exponentiate to genuine representations of SO(3) (not just near the identity), and they can
all be realized by taking tensor products of the defining spin-one vector representation and then restricting
to invariant subspaces. The half-integer spin representations are more interesting, here we will focus on the
spin-1/2 representation which is furnished by the Pauli matrices:
σx σy σz
Jx = Jy = Jz = . (3.10)
2 2 2
Exponentiating this representation we have the famous formula
−iθn̂·J⃗ θ θ
e = cos I − i(n̂ · ⃗σ ) sin , (3.11)
2 2
This is perhaps surprising, because a rotation by 2π is equal to the identity in SO(3). In other words a
spin-1/2 particle doesn’t really transform in a representation of SO(3), even though we saw in the previous
paragraph that it will do so near the identity. The problem is global in nature: the product of two rotations
by π is equal to the identity in SO(3), but it is equal to −1 in the spin-1/2 representation. The reason
this can happen is that the group SO(3) is not simply-connected - it has closed loops which cannot be
contracted to a point.
To see how non-contractible loops can cause trouble, let’s write the group multiplication rule as
⃗ ⃗ ⃗ ⃗ ⃗ ⃗ ⃗ ⃗
e−iθ1 ·J e−iθ2 ·J = e−iϕ(θ1 ,θ2 )·J . (3.13)
15
Figure 2: Failure of a group representation due to a non-contractible loop: using multiplications near the
⃗ ⃗ ⃗ ⃗ ⃗ ⃗
identity we can get from the identity to eiθ1 ·J and then eiθ1 ·J eiθ2 ·J , but this isn’t the same path in the group
⃗ ⃗ ⃗
as the one we’d use to get from the identity to eiϕ(θ1 ,θ2 ) and if they can’t be deformed continuously into each
other then they might not give the same answer.
For any connected simply-connected Lie group G, every representation of the Lie algebra exponentiates
to a representation of G.
Every connected Lie group G is the quotient G/Γ
e of a unique connected simply-connected Lie group
G which is called its universal covering group, with Γ a discrete central subgroup of G.
e e Moreover
G and G have the same Lie algebra, since their group structure near the identity is identical.
e
SO(3) ∼
= SU (2)/Z2 , (3.15)
where Z2 is the discrete central subgroup of SU (2) consisting of I and −I. The Lie algebra of SU (2) consists
of traceless hermitian 2 × 2 matrices, which of course are spanned by the Pauli matrices, and thus the
Lie algebra of SU (2) is indeed the same as the Lie algebra of SO(3). Non-contractible loops in SO(3) are
precisely those which are obtained from paths in SU (2) which start at an element ge and end at −e
g , see figure
3 for an illustration. This situation is usually described by saying that SU (2) is a double cover of SO(3).
The half-integer spin representations of the Lie algebra of SO(3) really exponentiate to representations of
SU (2) rather than SO(3).9
Before leaving the rotation group there is one other pedagogical point which is worth making. This is
that if we act on the Pauli matrices by conjugation with the spin- 21 representation matrices we have the
vector transformation10 X
⃗ ⃗ ⃗
eiθn̂·J σi e−iθn̂·J = e−iθn̂·J σj , (3.16)
ij
j
where I emphasize that on the left-hand side we have the 2 × 2 spin-1/2 generators J⃗ while on the right-hand
side we have the 3 × 3 spin-1 generators J⃗ . In other words the states in the Hilbert space transform in
8 For people who like topology, the topology of SU (2) is actually just that of the three-sphere S3 , and the Z map we quotient
2
by identifies opposite points on this sphere. The topology of SO(3) is therefore that of the real projective space RP3 . One way
to see the topology of SU (2) is to notice that a general complex 2 × 2 matrix can be written as n4 I + i⃗ n·⃗σ , and then requiring
this matrix to have unit determinant implies that n21 + n22 + n23 + n24 = 1 and requiring it to be unitary requires that n1 , . . . , n4
are real.
9 There is a more old-fashioned description of this situation, where one says that the half-integer spins furnish projective
representations of SO(3), meaning representations where the multiplication rule (3.1) holds only up to a phase.
10 This illustrates a general construction in Lie theory: by conjugating the elements of the Lie algebra by the group trans-
formation we can construct a representation of any Lie group acting on its own Lie algebra. This is called the adjoint
representation, so what we learn here is that the adjoint representation of SU (2) is the j = 1 vector representation.
16
Figure 3: Visualizing a non-contractible loop in SO(3). SU (2) has the topology of the three-sphere S3 , and
SO(3) is obtained by identifying antipodal points (see footnote 8). We can therefore think of SO(3) as the
“northern hemisphere” of S3 (here represented as S2 ), with opposite points identified on the equator (which
is an S2 but is here represented as an S1 ). We can make a noncontractible loop starting at the north pole by
going down to the equator, coming out on the opposite side of the equator, and going back up to the north
pole. Try as you might, you can’t contract this loop to the north pole. It is fun to check however that if you
have a loop which traverses this path twice, you can contract it to the north pole! If you know about the
fundamental group, this shows that π1 (RP3 ) = Z2 .
the spin-1/2 represenation of SU (2), while the Pauli operators transform under conjugation in the spin-one
representation. This is a finite-dimensional model for our field Lorentz transformation formula
X ′
U (Λ)† Φa (x)U (Λ) Daa′ (Λ)Φa (Λ−1 x). (3.17)
a′
You should always be careful to distinguish the transformation of the states from the transformation of the
operators!
17
which (imposing antisymmetry in α and β) tells us that the Lorentz generator matrices are
Here that the α and β indices tell us which generator we are talking about, while the µ and ν indices are
the matrix indices. Via a slightly tedious computation (which you will do on the homework) we can then
compute the Lie algebra of the Lorentz group:
[J αβ , J ρσ ] = i η αρ J βσ + η βσ J αρ − η ασ J βρ − η βρ J ασ . (3.25)
Here I’ve suppressed the matrix indices on J αβ , and the commutator is a matrix commutator. Thus if we
wish to construct a representation of SO+ (d − 1, 1) near the identity, then following our experience with
SO(3) we should look for a set of matrices J αβ which obey the Lorentz algebra (3.25) and then exponentiate
them. This will always give a representation near the identity, and for d ≥ 4 at worst it will give us a
representation of the double cover Spin(d − 1, 1) of SO+ (d − 1, 1). We’ll give an explicit construction of
Spin(d − 1, 1) in the next subsection.
{γ µ , γ ν } = 2η µν . (3.26)
[J µν , [γ α , γ β ]] = [J µν , γ α ]γ β + γ α [J µν , γ β ] − [J µν , γ β ]γ α − γ β [J µν , γ α ]
= i (η µα γ ν − η να γ µ )γ β + γ α (η µβ γ ν − η νβ γ µ ) − (η µβ γ ν − η νβ γ µ )γ α − γ β (η µα γ ν − η να γ µ )
= i η µα [γ ν , γ β ] + η νβ [γ µ , γ α ] − η µβ [γ ν , γ α ] − η να [γ µ , γ β ] (3.30)
and thus
[J µν , J αβ ] = i η µα J νβ + η νβ J µα − η µβ J να − η να J µβ . (3.31)
By exponentiating these generators we therefore can construct a matrix Lie group Spin(d − 1, 1) ⊂ GL(n, C)
with the same Lie algebra as the Lorentz group. We will see soon that Spin(d − 1, 1) is a double cover of
SO+ (d − 1, 1) in just the same way as SU (2) is a double cover of SO(3), and it is simply-connected for d ≥ 4.
18
It will be useful later to note that using (3.24) we can rewrite our expression (3.29) as
α
[J µν , γ α ] = − (J µν ) β γβ , (3.32)
which shows that γ µ transforms as a vector under conjugation by the spinor representation matrices:
µν α
i µν i µν
i
e− 2 ωµν J γ α e 2 ωµν J = e 2 ωµν J γβ . (3.33)
β
with a = 1, 2, . . . , d−2
2 , then substituting into (3.26) we find precisely our real fermion algebra
{χa , χb } = {e eb } = δab
χa , χ
{χa , χ
eb } = 0 (3.35)
γ 0 = iσy ⊗ I ⊗ . . . ⊗ I
γ 1 = σx ⊗ I ⊗ . . . ⊗ I
γ 2a = σz ⊗ . . . ⊗ σz ⊗ σx ⊗ I ⊗ . . . ⊗ I
γ 2a+1 = σz ⊗ . . . ⊗ σz ⊗ σy ⊗ I ⊗ . . . ⊗ I, (3.36)
where in the third and fourth lines we’ve gone back to a = 1, 2, . . . , d−2
2 and the σx /σy appears in the
d
(a + 1)st tensor factor. For even d we have thus found a 2 2 -dimensional representation of Spin(d − 1, 1).
We can check that it is indeed not a representation of the Lorentz group SO+ (d − 1, 1), for example from
this explicit representation we have
−i 2 3 σz
J 23 = [γ , γ ] = I ⊗ ⊗ I ⊗ . . . ⊗ I, (3.37)
4 2
so then 23
e−i2πJ = −1 (3.38)
just as for the spin 1/2 representation of SO(3).
We argued in the previous section that this representation gives an irreducible representation of the
canonical anticommutation relations, and so it gives an irreducible representation of the Dirac algebra (3.26)
as well. On the other hand it is actually a reducible representation of Spin(d − 1, 1). The reason is our old
friend fermion parity, which we will here give new name:
d−2
γ = i− 2 γ 0 γ 1 . . . γ d−1 = σz ⊗ . . . ⊗ σz . (3.39)
This matrix commutes with the Lorentz generators J µν since they are quadratic in γ and therefore have
even fermion parity, and so our 2d/2 -dimensional representation of Spin(d − 1, 1) breaks up into a 2(d−2)/2 -
dimensional block where γ = 1 and a 2(d−2)/2 -dimensional block where γ = −1. These two representations
(when d is even) are called the Weyl representations of Spin(d − 1, 1).
19
What about when d is odd? There is then a simple trick: we simply take the γ-matrices we just
constructed for dimension d − 1, and then append γ d−1 = γ. Since γ anticommutes with all of the other γ µ
matrices and squares to one, it is indeed a valid candidate to complete the Dirac algebra for odd d! Since γ
is now one of the Dirac matrices, it no longer commutes with the Lorentz generators which act on xd−1 . The
spinor representation is therefore irreducible for odd d, and there are no Weyl representations. In general
we can write the dimensionality of the spinor representation as
d
n = 2⌊ 2 ⌋ , (3.40)
where ⌊x⌋ is the “floor” function that gives the largest integer which is less than or equal to x.
It is worth emphasizing that the representation we have constructed isn’t unique: given any invertible
matrix W we can always define
γ µ′ = W γ µ W −1 , (3.41)
which will obey (3.26) provided that γ µ does. Typically the γ-matrices are chosen so that γ 0 is antihermitian
and γ i is hermitian however, in which case we should restrict to unitary W .
To give some concrete expressions, for d = 2 the representation we’ve constructed is
0 1 0 1 1 0
γ0 = γ1 = γ= , (3.42)
−1 0 1 0 0 −1
with the γ = 1 block being called “left-moving” and the γ = −1 block being called “right-moving” (we’ll see
why in the next section). For d = 4 our representation is
0 I 0 I σx 0 σy 0 σz 0
γ0 = γ1 = γ2 = γ3 = γ= .
−I 0 I 0 0 −σx 0 −σy 0 −σz
(3.43)
For d = 4 this is not the most popular representation however, a more popular one, related to this one by
µ
µ
γnew = W γold W † with
1 0 0 i
1 1 0 0 −i
W =√ , (3.44)
2 0 1 i 0
0 −1 i 0
is given by
0 0 I i 0 σi I 0
γ = −i γ = −i γ= . (3.45)
I 0 −σi 0 0 −I
Here the block with γ = 1 is called “left-handed” and the block with γ = −1 is called “right-handed”, again
for reasons we will see next time. I’ll refer to (3.45) as the standard representation of the γ-matrices in
d = 4. You can check that in the standard representation the Lorentz generators are given by
i −σi 0 X1 σk 0
J 0i = J ij = ϵijk , (3.46)
2 0 σi 2 0 σk
k
20
Indeed we have
(γ 0 )3 = −γ 0
γ 0 γ i γ 0 = −γ i (γ 0 )2 = γ i . (3.48)
The complex conjugation properties are more annoying. Returning to even dimensions for a moment, in
our representation (3.36) you can see that γ 0 , γ 1 , γ 2 , γ 4 , . . . are real, γ 3 , γ 5 , . . . are imaginary, and γ is real.
Therefore the matrices11
B1 ≡ γ 3 γ 5 . . . γ d−1 B2 ≡ γB1 (3.49)
obey
d−2
B1 γ µ B1† = (−1) 2 γ µ∗
B2 γ µ B2† = (−1)d/2 γ µ∗
d−2
B1 γB1† = (−1) 2 γ∗
d−2
B2 γB2† = (−1) 2 γ∗. (3.50)
Thus the Dirac representation of Spin(d − 1, 1) is unitarily equivalent to its complex conjugate. The two
Weyl representations will also be self-conjugate if d−2 d−2
2 is even, while they will be exchanged if 2 is odd
(these statements follow from the last two lines of (3.50)).
To understand conjugation in odd dimensions we again define γ d−1 = γ, but now the only choice is to
define
B ≡ γ 3 γ 5 . . . γ d−2 (3.53)
and we then have d−3
Bγ µ B −1 = (−1) 2 γ µ∗ . (3.54)
Finally I’ll mention that in dimensions d = 0, 1, 2, 3, 4 mod 8, we can consistently impose a Majorana
constraint
Ψ∗ = BΨ, (3.55)
on an a field Ψ transforming in the spinor representation of Spin(d − 1, 1). Here B = B1 for d = 0, 2 mod 8,
B = B2 for d = 2, 4 mod 8 (either choice is allowed for d = 2 mod 8, and in fact they are unitarily equivalent),
and B = γ 3 . . . Γd−2 for d = 1, 3 mod 8. These choices follow from finding a B that obeys B ∗ B = 1, which is
needed for the consistency of the constraint since we must have
ψ = B ∗ ψ ∗ = B ∗ Bψ. (3.56)
21
Problems:
1. Confirm the expression (3.16) relating the spin-1/2 and spin-one representations of the rotation group.
Feel free to use mathematica to help with the matrix exponentials.
2. Confirm the Lorentz algebra (3.25) of the Lorentz group starting from the generators (3.24).
3. Show that for d = 4 if we define J x = J 23 , J y = J 31 , and J z = J 12 , then the Lorentz algebra (3.25)
implies that J i obeys the SO(3) Lie algebra (3.8).
4. Confirm the equivalence of the two expressions for γ in equation (3.39).
5. Confirm the equivalence of the two representations (3.43) and (3.45) that we’ve given for the four-
dimensional γ-matrices.
6. Check the conjugation equations (3.50), and find explicit representations of B1 and B2 for the γ-matrix
representations (3.42), (3.43), and (3.45). Hint: for the last representation you will need to use the
transformation in footnote (11).
where the symbol Λ now means an element of Spin(d − 1, 1) rather than an element of SO+ (d − 1, 1). U (Λ)
is the unitary tranformation which represents Λ on Hilbert space, Dab (Λ) is the matrix representing Λ in
the defining spinor representation of Spin(d − 1, 1), and Λ−1 x somewhat heuristically indicates acting on
x with the inverse of the Lorentz transformation which corresponds to the Z2 equivalence class of Λ in
Spin(d − 1, 1). To write this last part more honestly, we should first recall in more detail the double cover
relationship between Spin(d − 1, 1) and the proper orthochronous Lorentz group SO+ (d − 1, 1):
SO+ (d − 1, 1) ∼
= Spin(d − 1, 1)/Z2 . (4.4)
The Z2 we quotient by here is just the overall sign of the matrix in the defining representation of Spin(d−1, 1).
We can therefore introduce a quotient map
noncompact Lie groups the exponential map from the Lie algebra to the Lie group is not surjective so it is necessary to include
products of exponentials to get a group. This is not just pedantic nitpicking, for the physically relevant cases of Spin(2, 1) and
Spin(3, 1) the exponential map is indeed not surjective.
22
which sends each element Λ ∈ Spin(d−1, 1) to the element of SO+ (d−1, 1) corresponding to the equivalence
class {Λ, −Λ}. We can then write the field transformation more accurately as
X
U (Λ)† Ψa (x)U (Λ) = Dab (Λ)Ψb (q(Λ−1 )x), (4.6)
b
where now we honestly have Λ ∈ Spin(d − 1, 1) everywhere. Typically however we will follow common
practice and lazily omit the quotient map in the field transformation. Suppressing also the matrix indices,
we have the more elegant expression
Ψ† Ψ (4.8)
is Lorentz (or really Spin) invariant. This would indeed be the case if the spinor representation matrices
D(Λ) were unitary, but alas they aren’t in general. The culprit is J 0i , which is antihermitian since we have
i i
(J 0i )† = − [γ 0† , γ i† ] = [γ 0 , γ i ] = −J 0i . (4.9)
4 4
We therefore need to write the transformation of Ψ† as
′ i µν†
Ψ† = Ψ† e− 2 ωµν J , (4.10)
γ µ† = γ 0 γ µ γ 0 (4.12)
for the hermitian conjugates of the γ-matrices. This implies that we have
γ µ† γ 0 = −γ 0 γ µ , (4.13)
and thus
i i
J µν† γ 0 = − [γ µ† , γ ν† ]γ 0 = − γ 0 [γ µ , γ ν ] = γ 0 J µν . (4.14)
4 4
23
Therefore we can convert a J µν† to a J µν by moving a γ 0 across it. In particular this means we have
′ i µν† i µν
Ψ† γ 0 = Ψ† e− 2 ωµν J γ 0 = Ψ† γ 0 e− 2 ωµν J , (4.15)
so the quantity
Ψ ≡ Ψ† γ 0 (4.16)
has the simple transformation
′ i µν
Ψ = Ψe− 2 ωµν J . (4.17)
In terms of the group elements and including the position-dependence, we have
Ψ′ (x) = D(Λ)Ψ(Λ−1 x)
′
Ψ (x) = Ψ(Λ−1 x)D(Λ−1 ), (4.18)
so these transformations are primed to cancel each other. Spinor Lagrangians are thus typically constructed
out of Ψ and Ψ rather than Ψ and Ψ† . In particular we have discovered our first Lorentz-invariant quantity:
ΨΨ. (4.19)
This of course is easily fixed, we can just take our candidate Lagrangian term to be iΨΨ. As you might
guess this will end up being a mass term for the fermions which are created and annihilated by Ψ.
To get a sensible quantum field theory we of course also want to have terms involving derivatives of the
fields (otherwise the fields at different points don’t talk to each other). Based on our experience with scalar
fields you might guess that a good term to write down is
∂µ Ψ∂ µ Ψ, (4.21)
but there is a more relevant (in the Wilsonian sense) term that we can write down that has only one derivative
instead of two:
Ψγ µ ∂µ Ψ. (4.22)
To see that this term is Lorentz invariant, we first recall that in the last section we showed (using our new
notation) that13
D(Λ−1 )γ µ D(Λ) = Λµν γ ν . (4.23)
We thus have
′
Ψ γ µ ∂µ Ψ′ = ΨD(Λ−1 )γ µ D(Λ)Λµν ∂ν Ψ
= ΨΛµσ γ σ Λµν ∂ν Ψ
= Ψγ ν ∂ν Ψ, (4.24)
24
This certainly doesn’t look hermitian, or even antihermitian, but it is antihermitian up to a total derivative:
†
Ψγ µ ∂µ Ψ + Ψγ µ ∂µ Ψ = ∂µ Ψγ µ Ψ .
(4.27)
Thus iΨγ µ ∂µ Ψ will integrate to a real action (and more importantly lead to a Hamiltonian density which in-
tegrates to a hermitian Hamiltonian) provided that our boundary conditions at infinity ensure this boundary
term does not contribute (for example by setting Ψ or Ψ to zero at the spatial boundary).
We therefore can write down our first viable candidate Lagrangian density for a spinor field:
L = −i Ψγ µ ∂µ Ψ + mΨΨ .
(4.28)
The overall minus sign is included with the benefit of hindsight: it ensures we will get a Hamiltonian which
is bounded from below. To work out the units of the parameter m, we can use the derivative term (which
must have energy dimension d) to see that the energy dimension of a spinor field is
d−1
[Ψ] = . (4.29)
2
Since mΨΨ should have dimension d, this means that m has units of mass/energy:
[m] = 1. (4.30)
The term (4.21) that we considered above has energy dimension d + 1, and is thus irrelevant as promised.
The Lagrangian (4.28) is called the Dirac Lagrangian, and its equation of motion (for example obtained
by varying with respect to Ψ) is the Dirac equation
γ µ ∂µ Ψ + mΨ = 0. (4.31)
In a moment we will study solutions of the Dirac equation and then use them to construct a quantum
spinor field theory, but it is worthwhile to first mention a convenient notation due to Feynman. In spinor
computations we quite often want to contract some vector or one-form with the γ=matrix vector. Feynman
tells us to indicate this by drawing a slash through the object:
v/ ≡ vµ γ µ = v µ γµ . (4.32)
With this notation we can write the Dirac Lagrangian and the Dirac Equation as
L = −i Ψ∂Ψ/ + mΨΨ
/ + mΨ = 0.
∂Ψ (4.33)
There is a historical comment which is worth making at this point. In some quantum mechanics courses
the Dirac equation (4.31) is introduced as a kind of relativistic generalization of the Schrodinger equation,
and is solved e.g. in the presence of a background Coulomb potential to understand relativistic corrections
to the spectrum of the hydrogen atom. This idea is deeply wrong however, for several reasons:
The Dirac equation accounts for some but not all of the relativistic corrections to the hydrogen atom:
it treats the electron quantum mechanically but not the photon. At higher orders in the fine structure
constant it therefore misses important contributions such as the Lamb shift.
In this archaic interpretation the Dirac equation was seen as successfully predicting that electrons must
have spin 1/2, for example because attempting to have a spin-zero relativistic Schrodinger equation
leads to the Klein-Gordon equation which does not make any sense when viewed as governing a wave
function. In quantum field theory however there is absolutely no problem with having relativistic
scalars, and there is no reason a priori for the electron to have spin 1/2 (except of course that we know
this to be the case experimentally).
25
Taken literally, the Dirac equation predicts the existence of “negative energy electrons”, for which there
is of course no experimental evidence. In quantum field theory these instead become (positive energy)
positrons.
For these reasons (and also others) it is best to discard any “wave function” interpretation of the Dirac
equation: its true physical interpretation is as the equation of motion for a quantum field.
I’ll also mention that from here it is easy enough to write down interacting spinor theories as well. For
example in the Yukawa model of nuclear interactions one has a real scalar field ϕ interacting with a Dirac
spinor Ψ, with Lagrangian
1 m2ϕ 2
L = − ∂µ ϕ∂ µ ϕ −
/ + mΨΨ − igϕΨΨ.
ϕ − i Ψ∂Ψ (4.34)
2 2
Here Ψ creates/annihilates a nucleon (i.e. a proton or neutron) and ϕ creates/annihilates a pion.14 In a
few sections we will learn how to do perturbative scattering calculations in this theory, seeing that the pion
creates an attractive force between the nucleons.
Thus each component of ψ obeys the Klein-Gordon equation! We therefore can look for positive and negative
frequency solutions of the form
p)ei⃗p·⃗x−iωp⃗ t
ψ+ = u(⃗ (4.37)
−i⃗
p·⃗
x+iωp
⃗t
ψ− = v(⃗
p)e , (4.38)
We can also define u = u† γ 0 and v = v † γ 0 , and a little thought shows that these obey
u ip
/+m =0
/ − m = 0.
v ip (4.40)
Solving (4.39) explicitly is a bit tricky. We can make life easier by noting that if u(p) and v(p) are
solutions of (4.39) for some particular p, then we have
reflects the fact that pions are pseudoscalars, meaning that they are odd under parity instead of even. Either kind of interaction
is still called a Yukawa interaction, and we will mostly study the version with no γ since it is a bit easier to calculate in.
26
and
(iΛp p) = (iγ ν Λν µ pµ − m) D(Λ)v(⃗
/ − m) D(Λ)v(⃗ p)
= D(Λ) (iγ µ pµ − m) D(Λ−1 )D(Λ)v(⃗
p)
= D(Λ) ip/ − m v(⃗ p)
= 0. (4.42)
In both cases in the second line we used the γ-matrix transformation (4.23) with the substitution Λ → Λ−1 .
Therefore we can solve (4.39) for some particular “reference” momentum k, write each other timelike (or
null in the case of m = 0) momentum p as some standard Lorentz (or really spin) transformation Lp of k
p = Lp k, (4.43)
and then construct solutions of (4.39) for general p as
p) = D(Lp )u(⃗k)
u(⃗
p) = D(Lp )v(⃗k).
v(⃗ (4.44)
iγ 0 u(⃗k) = u(⃗k)
iγ 0 v(⃗k) = −v(⃗k), (4.46)
so we should take u(⃗k) to live in the +1 eigenspace of the hermitian matrix iγ 0 and v(⃗k) to live in its −1
eigenspace. We can get a physical interpretation of these subspaces by considering the action of the little
group Spin(d − 1) ⊂ Spin(d − 1, 1) of spin transformations which fix k. Since k is in its rest frame the little
group is generated by the rotation matrices
−i i j
J ij = [γ , γ ], (4.47)
4
which commute with γ 0 since they are quadratic in γ i and γ j . If we pick bases ua (⃗k, σ), v a (⃗k, σ) for the
iγ 0 = ±1 subspaces, then the little group must rotate us within these bases:
X ′ X {u}
Daa′ (Λ)ua (⃗k, σ) = ua (⃗k, σ ′ )D̂σ′ ,σ (Λ) (Λk = k)
a′ σ
X ′ X {v}∗
Daa′ (Λ)v a (⃗k, σ) = v a (⃗k, σ ′ )D̂σ′ ,σ (Λ) (Λk = k). (4.48)
a′ σ
{u,v}
The quantities D̂σ′ σ (Λ) must themselves form representations of the little group,
27
the equations (4.48) precisely give the “intertwining” relations between the little group representation acting
on the fields and the little group representation acting on the particles annihilated by ap⃗,σ or created by b†p⃗,σ .
I’ll remind you that a choice of representation of the little group is what we really mean by the spin/helicity
of a particle in relativistic quantum mechanics, so a choice of basis for u and v is thus equivalent to a choice
of spin basis for these particles. In particular a nice choice is to diagonalize a commuting set of the rotation
generators, for example J 12 , J 34 , . . ..
We can make this all more explicit by restricting to d = 4 and using the standard representation of the
γ-matrices. We then have
0 0 I
iγ = (4.51)
I 0
and
ij 1 ijk σk 0
J = ϵ , (4.52)
2 0 σk
in particular with
12 1 σz 0
J = . (4.53)
2 0 σz
We thus can pick basis vectors
1 0 0 −1
√ 0 √ 1 √ 1 √ 0
u(⃗k, +) = m 1
u(⃗k, −) = m
0
v(⃗k, +) = m
0
v(⃗k, −) = m
1 , (4.54)
0 1 −1 0
iγ 0 u(⃗k, ±) = u(⃗k, ±)
iγ 0 v(⃗k, ±) = −v(⃗k, ±)
1
J 12 u(⃗k, ±) = ± u(⃗k, ±)
2
1
12 ⃗
J v(k, ±) = ∓ v(⃗k, ±). (4.55)
2
The last two lines here come from (4.48): they show that u(⃗k, ±) will multiply an annihilation operator for
a particle of Jz = ±1/2 and v(⃗k, ±) will multiply a creation operator for a particle of Jz = ±1/2. Thus a
spinor field in d = 4 creates/annihilates particles of spin 1/2.
It will be useful later for us to note that our spinor basis vectors obey
X
u(⃗ p, σ) = −(p
p, σ)u(⃗ / + im)
σ
X
v(⃗ p, σ) = −(p
p, σ)v(⃗ / − im). (4.56)
σ
The easiest way to confirm these for d = 4 is by direct computation using the standard representation: in
the rest frame we have
X
⃗ ⃗ I I
u(k, σ)u(k, σ) = −im = mγ 0 − im = −/k − im
I I
σ
X
⃗ ⃗ −I I
v(k, σ)v(k, σ) = −im = mγ 0 + im = −/k + im, (4.57)
I −I
σ
28
and then we can boost to a general p⃗ by using the reference Lorentz transformation:
X
u(⃗
p, σ)u(⃗p, σ) = D(Lp )(−/k − im)D(L−1
p ) = −p / − im
σ
X
v(⃗
p, σ)v(⃗ k + im)D(L−1
p, σ) = D(Lp )(−/ p ) = −p
/ + im. (4.58)
σ
Du(⃗k) = u(⃗k)D̂{u}
Dv(⃗k) = v(⃗k)D̂{v}∗ (4.59)
and the spin sums as uu and vv. Taking the adjoints of these and multiplying by γ 0 we have
The spin sums therefore must be invariant under conjugation by D(Λ) with Λ in the little group of k:
where we have used the fact that D̂ is a unitary representation of the little group. I now claim that this
implies that when we must have
u(⃗k)u(⃗k) = A + Bγ 0 + Cγ + Dγ 0 γ
v(⃗k)v(⃗k) = A′ + B ′ γ 0 + C ′ γ + D′ γ 0 γ (4.62)
for some constants A, B, C, D, A′ , B ′ , C ′ , D′ . This is because the set of products of γ-matrices is actually a
basis (with complex coefficients) for all 2⌊d/2⌋ -dimensional matrices. One way to see this is to recall that
the set of all products of Pauli operators is such a basis, and it is not hard to see that by using our tensor-
product-of-Paulis representation of the γ-matrices we can make any such product. The only such products
which will be invariant under arbitrary spatial rotations are those which involve only γ 0 and those which
d−2
involve the product of all of the spatial γ i s. Recalling that γ = i− 2 γ 0 . . . γ d−1 and also that γ 2 = 1 and
(γ 0 )2 = −1, the only options are the identity, γ 0 , γ, and γ 0 γ. We can then use the Dirac equation in the
forms (4.39) and (4.40) to see that C = C ′ = D = D′ = 0 and to determine the ratio of A to B and A′ to
B ′ , confirming (4.57) (and thus (4.56)) up to an overall normalization.
It will also be useful for us next time to know that there is a second set of sum rules involving u and v:
X
ua (⃗ p, σ ′ ) = −2imδσσ′
p, σ)ua (⃗
a
X
v a (⃗ p, σ ′ ) = 2imδσσ′
p, σ)v a (⃗
a
X
ua (⃗ p, σ ′ ) = 0
p, σ)v a (⃗
a
X
v a (⃗ p, σ ′ ) = 0.
p, σ)ua (⃗ (4.63)
a
29
In the matrix language we just introduced, we can write these as
uu = −2im
vv = 2im
uv = 0
vu = 0. (4.64)
These rules can of course be confirmed directly from our explicit expressions for u and v, but it is more fun
to get them from group theory and the Dirac equation. The third and fourth are easy: in the rest frame u
and v are eigenstates of the hermitian operator iγ 0 with different eigenvalues, so they must be orthogonal:
u† v = v † u = 0. (4.65)
Moreover in the rest frame u is proportional to u† and v is proportional to v † by (4.46). The result for
general p⃗ then follows from (4.44). To get the first two lines we need to do a bit more. Indeed note that by
(4.59) and (4.60) we have
D̂{u}† uuD̂{u} = uu
(D̂{v}∗ )† vv D̂{v}∗ = vv. (4.66)
In other words uu and vv must be invariant under conjugation by an irreducible representation of the little
group; by Schur’s lemma they therefore must both be proportional to the identity:
uu = A
vv = B. (4.67)
We can determine the coefficients of proportionality using our previous spin sums:
Au = uuu = −(p
/ + im)u = −2imu
Bv = vvv = −(p
/ − im)v = 2imv, (4.68)
where in the last equality for each line we used the Dirac equation (4.39).
It is also useful to consider the quantities of the form uγ µ u, vγ µ v, etc. These can be computed using the
Gordon identities:
2imu(⃗p ′ , σ ′ )γ µ u(⃗ p ′ , σ ′ ) p′µ + pµ + 2i(p′α − pα )J αµ u(⃗
p, σ) = u(⃗ p, σ)
2imv(⃗p ′ , σ ′ )γ µ v(⃗
p, σ) = −v(⃗p ′ , σ ′ ) p′µ + pµ + 2i(p′α − pα )J αµ v(⃗ p, σ)
2imu(⃗p ′ , σ ′ )γ µ v(⃗ p ′ , σ ′ ) p′µ − pµ + 2i(p′α + pα )J αµ v(⃗
p, σ) = u(⃗ p, σ). (4.69)
30
and
p ′ , σ ′ )γ µ u(⃗
imu(⃗ p ′ , σ ′ )p
p, σ) = u(⃗ ′ µ
/ γ u(⃗ p, σ)
1
= p′α u(⃗ p ′ , σ ′ ) {γ α , γ µ } + [γ α , γ µ ] u(⃗
p, σ)
2
p ′ , σ ′ ) p′µ + 2ip′α J αµ u(⃗
= u(⃗ p, σ). (4.72)
Adding these together gives the first line of (4.69). The other two lines are derived similarly.
The matrix γ d−1 − γ 0 is not hermitian, and in fact it does not have a complete set of eigenvectors. It does
however always have a set of 2⌊d/2⌋−1 linearly-independent eigenvectors with eigenvalue zero. One way to
see this is to instead consider γ 1 − γ 0 , which is related to γ d−1 − γ 0 by a rotation, and which in our tensor
product representation is just
(σx − iσy ) ⊗ I ⊗ . . . ⊗ I. (4.75)
0 0 0 0
The matrix in the first factor is just , which annihilates the vector , so the tensor product of
2 0 1 1
with any basis vector on the other factors is an eigenvector of eigenvalue zero. These 2⌊d/2⌋−1 eigenvectors
transform in the spinor representation of the little group SO(d − 2).
We can make this more concrete by considering specific dimensions. For d = 2 (4.74) just reads
γ 1 − γ 0 u(⃗k) = γ 1 − γ 0 v(⃗k) = 0.
(4.76)
γu = −u
γv = −v, (4.77)
which confirms our notation from last time that a right-moving solution has γ = −1. If we instead take the
reference momentum to be k µ = (k, −k) then we see that a left-moving solution has γ = 1.15 For d = 4 a
basis of solutions for (4.74) is
0 0
√ 1 √ 0
u(⃗k, −) = v(⃗k, +) = 2k
0 u(⃗k, +) = v(⃗k, −) = 2k
1 .
(4.78)
0 0
31
which ensures that u(⃗k, ±) will multiply an annihilation operator for a particle of helicity ±1/2, while v(⃗k, ±)
will multiply a creation operator for a particle of helicity ±1/2 (remember that by definition the helicity of a
massless particle in d = 4 is the angular momentum about its direction of motion). By convention we define
the charge of a field to be the charge of the particle it annihilates, so this is why the upper two components
of a Dirac spinor in d = 4 are called a “left-handed” Weyl spinor while the lower two components are called
a “right-handed” Weyl spinor. The helicity sums again give
X X
u(⃗ p, σ) =
p, σ)u(⃗ v(⃗ p, σ) = −p
p, σ)v(⃗ /, (4.80)
σ σ
as you will check directly from (4.78) in the homework, and one can also again establish this generally (up
to normalization) from group theory.
All of these hold in general dimensions and arbitrary representations of the γ-matrices, both for massive and
massless spinors.
Problems:
1. Show that in even dimensions the interaction ΦΨγΨ between a real scalar and a Dirac spinor is Lorentz-
invariant and hermitian. Show the same for the modified kinetic term iΨγ ∂Ψ,/ with the hermiticity
now holding up to a total derivative.
2. Confirm the null spin sums (4.80) starting from our explicit representations for u and v in d = 4, and if
you are feeling brave also try making a group theory argument determining them up to normalization.
3. (Extra Credit) In 3 + 1 dimensions we can classify all the irreducible representations of Spin(3, 1)
using a a clever trick. Defining Jx = J 23 , Jy = J 31 , and Jz = J 12 , and also Ki = J 0i , show that the
quantities Ai = 21 (Ji + iKi ) and Bi = 12 (Ji − iKi ) obey the Lie algebra of two copies of SU (2):
[Ai , Aj ] = iϵijk Ak
[Bi , Bj ] = iϵijk Bk
[Ai , Bj ] = 0. (4.82)
Therefore any irreducible representation of the Lorentz algebra can be converted to an irreducible
representation of SU (2) × SU (2) and vice versa. Fortunately for us we already know how to think
32
about irreducible representations of SU (2), they are labeled by the spin j and have dimension 2j +1. To
get an irreducible representation of SU (2) × SU (2) we just need an object with a pair of SU (2) indices,
one transforming in the spin jA representation and one transforming in the spin jB representation. In
other words we can represent A ⃗ and B
⃗ as
⃗ a′ b′ ,ab = δbb′ J⃗jA
A a′ a
where J⃗j are the three generators of the spin j representation of SU (2). What is the dimensionality
of the spin (jA , jB ) representation of the Lorentz group? Can you make a guess for which values of jA
and jB give the scalar, Weyl spinor, and vector representations of Spin(3, 1)?
derivative. The correct rule, as we will see in a moment when we check the Dirac equation, is that we should think of the
derivative as acting from the right and re-order any anticommuting variables so that the one we are differentiating with respect
to is on the right. We will see such manipulations in more detail when we discuss the fermionic path integral.
33
and thus
γ 0 ∂0 Ψ + (γ i ∂i + m)Ψ = 0. (5.6)
In going from the second line to the third line of (5.5) we use the anticommutator identity
[AB, C] = A{B, C} − {A, C}B. (5.7)
Since Ψ obeys the Dirac equation, we can expand it in the set of solutions we constructed in the previous
section:
X Z dd−1 p 1 h a i
Ψa (x) = d−1
p u (⃗ p, σ)bp†⃗,σ e−i(⃗p·⃗x−ωp⃗ t) ,
p, σ)ap⃗,σ ei(⃗p·⃗x−ωp⃗ t) + v a (⃗ (5.8)
σ
(2π) 2ωp⃗
p
where ap⃗,σ and bp⃗,σ are operator coefficients for our solutions and as usual ωp⃗ = |p|2 + m2 . To isolate the
operator coefficients, let’s first take the Fourier transforms of Ψ and Ψ† :
Z X 1
dd−1 xΨ(0, ⃗x)e−i⃗p·⃗x = p u(⃗p, σ ′ )ap⃗,σ′ + v(−⃗ p, σ ′ )b†−⃗p,σ′ (5.9)
σ′
2ωp⃗
Z X 1
dd−1 xΨ† (0, ⃗x)ei⃗p·⃗x = p p, σ ′ )a†p⃗,σ′ + v † (−⃗
u† (⃗ p, σ ′ )b−⃗p,σ′ . (5.10)
σ′
2ωp⃗
Using our spin sum rules
u† (⃗ p, σ ′ ) = 2ωp⃗ δσ,σ′
p, σ)u(⃗
v † (⃗ p, σ ′ ) = 2ωp⃗ δσ,σ′
p, σ)v(⃗
u† (⃗ p, σ ′ ) = 0
p, σ)v(−⃗
v † (−⃗ p, σ ′ ) = 0,
p, σ)u(⃗ (5.11)
we can extract the operator coefficients as
Z
1
ap⃗,σ = p dd−1 x u† (⃗
p, σ)Ψ(0, ⃗x)e−i⃗p·⃗x
2ωp⃗
Z
1
a†p⃗,σ = p dd−1 x Ψ† (0, ⃗x)u(⃗
p, σ)e−i⃗p·⃗x
2ωp⃗
Z
1
bp⃗,σ = p dd−1 x Ψ† (0, ⃗x)v(⃗
p, σ)e−i⃗p·⃗x
2ωp⃗
Z
1
b†p⃗,σ = p dd−1 x v † (⃗
p, σ)Ψ(0, ⃗x)ei⃗p·⃗x . (5.12)
2ωp⃗
Our goal is now to compute the anticommutators of the a’s and b’s using the canonical anticommutation
relations (5.3). Many of these vanish automatically by virtue of only involving Ψ’s or only involving Ψ† ’s.
There are four nontrivial anticommutators that we need to compute:
Z
† 1 1 ′ ′
{ap⃗,σ , ap⃗ ′ ,σ′ } = p p dd−1 xdd−1 x′ u† (⃗ p, σ){Ψ(0, ⃗x), Ψ† (0, ⃗x ′ )}u(⃗
p ′ , σ ′ )ei⃗p ·⃗x −i⃗p·⃗x
2ωp⃗ 2ωp⃗ ′
1
= p ′ − p⃗)u† (⃗
(2π)d−1 δ d−1 (⃗ p, σ ′ )
p, σ)u(⃗
2ωp⃗
p ′ − p⃗)δσ,σ′
= (2π)d−1 δ d−1 (⃗ (5.13)
and
Z
1 1 ′
{ap⃗,σ , bp⃗ ′ ,σ′ } = p p dd−1 x e−i(⃗p+⃗p )·⃗x u† (⃗ p ′ , σ′ )
p, σ)v(⃗
2ωp⃗ 2ωp⃗ ′
1
= p + p⃗ ′ )u† (⃗
(2π)d−1 δ d−1 (⃗ p, σ)v(−⃗p, σ ′ )
2ωp⃗
= 0, (5.14)
34
and similarly
Thus we see that the ap⃗,σ and bp⃗,σ obey the anticommutation relations for two independent sets of fermionic
particles, with the convention being to say that ap†⃗,σ creates particles and bp†⃗,σ creates antiparticles. The one-
particle states created by acting with these on the vacuum carry a little group index σ just as we anticipated
in the last section, and in particular for d = 4 they are spin-1/2 particles. In quantum electrodynamics ap†⃗,σ
creates an electron and b†p⃗,σ creates a positron.
where in the last line we have used the anticommutation relation for b and b† to exchange their order at
the cost of an infinite negative contribution to the vacuum energy. This divergence is familiar from our
quantization of a free scalar field, we can absorb it into a renormalization of the cosmological constant. Once
we have done so, we see that the Hamiltonian is bounded from below and we can find the ground state,
usually called the vacuum state in field theory, by imposing the condition
Particles and antiparticles are then created by acting on the vacuum with creation operators, each carrying
energy ωp⃗ .
It is interesting to note that the infinite contribution to the cosmological constant is negative for fermions
and positive for bosons. In theories where there is a symmetry between bosons and fermions, one might
hope that these contributions could cancel. This is indeed the case, and theories with such a symmetry are
called supersymmetric theories. Supersymmetric field theories are quite interesting for several reasons:
(1) The automatic cancellation of divergences we saw here arises in many places in supersymmetric field
theories, leading to theories which are much simpler to analyze than their non-supersymmetric coun-
terparts. This makes supersymmetric field theories a wonderful laboratory for exploring quantum
field theory phenomena, especially in the context of strongly-interacting dynamics where traditional
perturbative approaches are useless.
35
(2) This cancellation also has the possibility of addressing the Higgs hierarchy problem in the standard
model that we mentioned last semester - if bosonic and fermionic contributions to the Higgs mass
cancel in loops, then it is not so puzzling that the Higgs boson is light compared to fundamental
energy scales such as the Planck mass. The experimental constraints on SUSY at the weak scale are
now quite strong however, so this idea seems less appealing than it did a few decades ago.
(3) Supersymmetry arises very naturally in string theory, which so far is our best-understood candidate for
a theory of quantum gravity. Unfortunately this supersymmetry might be broken at very high energy
scales however, so it need not be accessible with near-term experiments.
One could teach a whole class on supersymmetric field theories and supergravity, and in fact Jesse Thaler
is teaching one here at MIT right now! In this class however we will content ourselves with exploring the
simplest supersymmetric field theory in d = 4 spacetime dimensions, the Wess-Zumino model, which you
will do on the homework this week.
where in computing the derivative we (somewhat heuristically since we haven’t discussed Noether’s theorem
for fermionic fields) again take the derivative to act on the Lagrangian density from the right. We can check
this sign by computing the action of the symmetry charge on Ψ:
Z
[Q, Ψa (0, ⃗x)] = q dd−1 x[Ψb† (0, ⃗x ′ )Ψb (0, ⃗x ′ ), Ψa (0, ⃗x)]
Z
= −q dd−1 x{Ψb† (0, ⃗x ′ ), Ψa (0, ⃗x)}Ψb (0, ⃗x ′ )
∂µ J µ = −q ∂µ Ψγ µ Ψ + Ψ∂Ψ
/ = −qΨΨ(m − m) = 0, (5.22)
where in the second equality we used the Dirac equation both for Ψ and its adjoint
∂µ Ψγ µ = mΨ. (5.23)
36
It is interesting to compute the charge operator in terms of the raising and lowering operators, this gives
Z
Q =q dd−1 xΨ† Ψ
Z X Z dd−1 p Z dd−1 p′ 1
d−1 † ′ ′ † p ′ ·⃗
−i⃗ x † ′ ′ p ′ ·⃗
i⃗ x
=q d x √ u (⃗
p , σ )a p
⃗ ′ ,σ ′ e + v (⃗
p , σ )b p
⃗ ′ ,σ ′ e
(2π)d−1 (2π)d−1 2 ωp⃗ ωp⃗
σ,σ ′
× u(⃗ p, σ)ap⃗,σ ei⃗p·⃗x + v(⃗ p, σ)b†p⃗,σ ei⃗p·⃗x
X Z dd−1 p †
=q d−1
ap⃗,σ ap⃗,σ + bp⃗,σ b†p⃗,σ
σ
(2π)
dd−1 p †
Z
ap⃗,σ ap⃗,σ − b†p⃗,σ bp⃗,σ + “∞”.
X
=q d−1
(5.24)
σ
(2π)
Thus after a renormalization we see that a†p⃗,σ creates particles of charge q and b†p⃗,σ creates particles of charge
−q, so they are indeed antiparticles. In quantum electrodynamics q is the electric charge measured in units
of the elementary charge
e = 1.602176634 × 10−19 C, (5.25)
so q = −1 for the electron field and q = 1 for the proton field. In more civilized units we write the elementary
charge in terms of the fine structure constant
e2 1
α= ≈ . (5.26)
4πϵ0 ℏc 137
We will see later in the semester that the perturbation theory of QED is a perturbation series in α, and since
α ≪ 1 this perturbation series is quite good for many observables.
Another interesting symmetry we can consider is chiral symmetry, which in even spacetime dimensions
acts on a Dirac spinor as
Ψ′ (x) = eiθqγ Ψ(x). (5.27)
Here recall that d−2
γ = i− 2 γ 0 . . . γ d−1 . (5.28)
The derivative term in the Dirac Lagrangian is invariant under this transformation:
′
/ ′ = −iΨ† e−iθqγ γ 0 γ µ eiθqγ ∂µ Ψ = −iΨ† e−iθqγ γ 0 e−iθqγ γ µ ∂µ Ψ = −iΨ† γ 0 γ µ ∂µ Ψ,
−iΨ ∂Ψ (5.29)
where we’ve used that γ anticommutes with γ µ , but the mass term isn’t:
′
−iΨ Ψ′ = −iΨe−iθqγ γ 0 eiθqγ Ψ = −iΨe2iθqγ Ψ. (5.30)
Chiral symmetry is therefore only a symmetry of the Dirac Lagrangian if m = 0, in which case you can
think of it as a transformation which rotates the upper Weyl component of Ψ as eiθq and the lower Weyl
component of Ψ as e−iθq . In the homework you will work out the Noether current and charge for chiral
symmetry.
Finally we’ll discuss CRT symmetry, which we showed last semester is a symmetry of any relativistic
quantum field theory. Our general expression for the action of CRT symmetry on the set of dynamical fields
is †
Θ†CRT Φa (x)ΘCRT = ifa DE (RT )ab Φb (RT x) , (5.31)
where DE is the representation of Spin(d) which the fields transform under in the Euclidean path integral,
R is the coordinate transformation which reflects x1 → −x1 , and T is the coordinate transformation which
reverses time. We haven’t yet discussed Euclidean fermions (see the next section), but their transformation
37
under Euclidean rotations is easy enough to guess: the spinor representation of SO(d) is simply that generated
by the Euclidean γ-matrices
0
γE = iγ 0
i
γE = γi (5.32)
via the rotation generators
i µ ν
JEµν = − [γE , γE ]. (5.33)
4
In particular a Euclidean rotation in the 01 plane is generated by
1 0 1
JE01 = [γ , γ ], (5.34)
4
so we have 0
π
,γ 1 ]
DE (RT ) = e−i 4 [γ = −iγ 0 γ 1 , (5.35)
as can be confirmed somewhat tediously by using the γ-matrix algebra or more easily by working with our
product-Pauli representation. Therefore acting on a Dirac spinor we have the CRT transformation
Θ†CRT Ψ(x)ΘCRT = −(γ 0 γ 1 Ψ(RT x))∗ , (5.36)
where here I’ve used the convention of writing ∗ for the Hilbert space adjoint that doesn’t transpose Dirac
indices. On the homework you will confirm that the Dirac Lagrangian is invariant under this transformation.
It is also sometimes possible to define separate C, R, and T symmetry transformations on Dirac spinors; you
will explore this in the next problem set.
38
where
dd−1 p 1 ip·(x2 −x1 )
Z
G(x2 − x1 ) = e (5.40)
(2π)d−1 2ωp⃗
is the free scalar two-point function we computed last semester (see section four for an evaluation of this
integral in terms of a Bessel function). This means that the Dirac two-point function S ab (x2 − x1 ) has the
same qualitative features as the scalar two-point function: exponential decay at spacelike separation when
m > 0, power-law decay at spacelike separation when m = 0, and a power-law divergence as x2 and x1 come
together. As a sanity check we can confirm that when x2 ̸= x1 then S ab (x) obeys the Dirac equation:
(∂/2 + m)ab S bc (x2 − x1 ) = i(∂/2 + m)ab (∂/2 − m)bc G(x2 − x1 ) = iδ ac (∂22 − m2 )G(x2 − x1 ) = 0. (5.41)
We can also compute the two-point function with the operators in the other order, leading to
dd−1 p 1
Z
b ab −ip·(x2 −x1 )
⟨Ω|Ψ (x1 )Ψa (x2 )|Ω⟩ = − / − im) e
(p
(2π)d−1 2ωp⃗
= −i(∂/2 − m)ab G(x1 − x2 ). (5.42)
In the scalar case, we saw that in addition to the two-point function it was also natural to discuss the
time-ordered Feynman propagator
GF (x2 − x1 ) = ⟨Ω|T Φ(x2 )Φ(x1 )|Ω⟩ := Θ(t2 − t1 )⟨Ω|Φ(x2 )Φ(x1 )|Ω⟩ + Θ(t1 − t2 )⟨Ω|Φ(x1 )Φ(x2 )|Ω⟩. (5.43)
For example we saw that this is what the path integral formulation naturally computes, and also that
this is what shows up naturally when evaluating Feynman diagrams. For fermionic operators however this
definition of the time-ordered product is not so natural: fermions anticommute at spacelike separation, so if
the time-ordering does not respect this then the quantity we’d define would be singular when x2 and x1 lie
on the same time slice. To fix this, we define the time-ordered spinor propagator to be
b b
SFab (x2 − x1 ) = Θ(t2 − t1 )⟨Ω|Ψa (x2 )Ψ (x1 )|Ω⟩ − Θ(t1 − t2 )⟨Ω|Ψ (x1 )Ψa (x2 )|Ω⟩. (5.44)
dd p −i
Z
GF (x2 − x1 ) = eip·(x2 −x1 ) (5.46)
(2π) p + m2 − iϵ
d 2
for the scalar Feynman propagator, we have a covariant expression for the spinor propagator:
39
with B being whichever of the two complex conjugation matrices B1 or B2 obeys B T = B (and thus B ∗ B = 1
since both B1 and B2 are unitary). For example for d = 2 and d = 3 we have
B = I, (5.49)
so the Majorana constraint just says the components of Ψ are real, while for d = 4 you will see in the
homework that we want
0 −iσ2
B= . (5.50)
iσ2 0
Note that for d = 2 the Majorana constraint separately constrains the left-moving and right-moving compo-
nents of Ψ, while for d = 4 it relates the left-handed and right-handed components.
Taking the transpose of the constraint (5.48) we see that a Majorana fermion obeys
Ψ = ΨT C, (5.51)
with
C ≡ Bγ 0 . (5.52)
The Lagrangian for a massive Majorana fermion is
i i
L = − Ψ(∂/ + m)Ψ = − ΨT C(∂/ + m)Ψ. (5.53)
2 2
The factor of 1/2 is included because the usual Dirac action counts each independent component of a
Majorana spinor twice. Working out the canonical anticommutation relations and the Hamiltonian from
this action is somewhat subtle due to the presence of the Majorana constraint (5.48), to handle it properly
we need a version of the Hamiltonian formalism that deals with constraints. My favorite approach to this
is called “covariant phase space”, which you can read about in my first paper with Jie-qiang Wu, and there
is another approach called “Dirac brackets” which you can read about in Weinberg’s book. Either way the
result is that the canonical anticommutation relations for this theory are the same as for the Dirac theory:
{Ψa (t, ⃗x), Ψ∗b (t, ⃗y )} = {Ψa (t, ⃗x), B bc Ψc (t, ⃗y )} = δ ab δ d−1 (⃗x − ⃗y ), (5.54)
and thus
{Ψa (t, ⃗x), Ψb (t, ⃗y )} = B ab∗ δ d−1 (⃗x − ⃗y ), (5.55)
and that the Hamiltonian density is given by half of the Dirac result:
i i
H = Ψ γ i ∂i + m Ψ = ΨT C γ i ∂i + m Ψ.
(5.56)
2 2
To get a sense of what the Majorana constraint means for the particle content of the theory, we can write
both sides in terms of our field decomposition:
X Z dd−1 p 1 h ∗ †
i
Ψ∗ = u (⃗
p , σ)a p
⃗,σ e −ip·x
+ v ∗
(⃗
p , σ)b p
⃗ ,σ e ip·x
(2π)d−1 2ωp⃗
p
σ
X Z dd−1 p 1 h †
i
ip·x −ip·x
= Bu(⃗ p , σ)a p
⃗ ,σ e + Bv(⃗ p , σ)b p
⃗,σ e , (5.57)
(2π)d−1 2ωp⃗
p
σ
so comparing coefficients of eip·x and e−ip·x we see that the Majorana constraint imposes a relation between
ap⃗,σ and bp⃗,σ . Indeed if we choose the phase of B appropriately, we can arrange for the relationship to simply
be that
ap⃗,σ = bp⃗,σ . (5.58)
In other words we can arrange so that
p, σ) = v ∗ (⃗
Bu(⃗ p, σ)
p, σ) = u∗ (⃗
Bv(⃗ p, σ). (5.59)
You will confirm this in the homework for d = 4. A Majorana spinor thus creates a fermion which is its own
antiparticle.
40
Problems:
1. What is the Noether current J µ for the chiral symmetryRtransformation (5.27) of a massless Dirac
fermion? Write an expression for the Noether charge Q = dd−1 xJ 0 in d = 4 in terms of the creation
and annihilation operators for massless fermions. Hint: You will need to work out spin sums such
as u† γu = −uγ 0 γu, which you can do in the reference frame by noting that u(⃗k, σ) and v(⃗k, σ) are
eigenvectors of γ.
2. Check that the Dirac Lagrangian is invariant (up to a total derivative) under the CRT transformation
(5.36). Hint: you will need to remember that ΘCRT is antiunitary, but if you do things right you
shouldn’t have to know the complex conjugates of the γ-matrices (i.e. you shouldn’t have to use B1 or
B2 ). Make sure you also take into account the anticommuting nature of Ψ and Ψ† .
3. You should have found on the last homework that for d = 4 we have
0 −σ2
B2 = . (5.60)
σ2 0
p, σ) = −iv ∗ (⃗
B2 u(⃗ p, σ)
p, σ) = −iu∗ (⃗
B2 v(⃗ p, σ) (5.61)
for the u and v we defined in section four. Therefore to get a simple representation of the Majorana
field in terms of these basis vectors we should take our Majorana constraint to be Ψ∗ = BΨ with
B = iB2 so that (5.58) holds. With this choice, compute the Hamiltonian of a Majorana spinor in
terms of the creation and annihilation operators.
4. When working four-dimensional spinors many people like to use a notation which more explicitly
recognizes that the spinor representation is reducible. In this notation, called two-component spinor
notation, the fundamental objects are left-handed Weyl spinors ψα , with α = 1, 2. Right-handed
spinor indices are then written with a dot on them, so for example the complex conjugate of ψα is
denoted ψ α̇ . Note that here the bar just means complex conjugate. One then decomposes a Dirac
spinor in terms of two left-handed Weyl spinors as
!
ψα
Ψ = α̇β̇ , (5.62)
ϵ χβ̇
with
0 1
ϵα̇β̇ = = iσ2 . (5.63)
−1 0
This looks a bit nicer if we define
χα̇ = ϵα̇β̇ χβ̇ , (5.64)
and in general two-component indices can be raised and lowered using ϵ and its inverse −ϵ (there is
also an ϵαβ with the same components at ϵα̇β̇ ). The γ-matrices are then decomposed as
!
µ 0 σαµβ̇
γ = −i , (5.65)
σ µ,α̇β 0
with σ µ = (I, ⃗σ ) and σ µ = (I, −⃗σ ), except when γ 0 appears in the definition of Ψ = Ψ† γ 0 , in which
case it is written as
δ α̇β̇
0 0
γ = −i . (5.66)
δα β 0
41
What is the Dirac Lagrangian written in terms of ψα and χα ? Your expression should only have dotted
indices contracted with dotted indices and undotted indices contracted with undotted indices. If we
impose the Majorana constraint Ψ∗ = BΨ, what does this say about the relationship between ψα and
χα ? This notation is particularly convenient when discussing supersymmetry, if you are brave you can
try solving the next problem using two-component notation.
5. The simplest supersymmetric theory in four spacetime dimensions is the Wess-Zumino model, with
Lagrangian
1 1 m2 2 m2 2 i
L = − ∂µ Φ1 ∂ µ Φ1 − ∂µ Φ2 ∂ µ Φ2 − Φ − Φ − Ψ(∂/ + m)Ψ. (5.67)
2 2 2 1 2 2 2
Here Φ1 and Φ2 are a pair of real scalar fields, and Ψ is a Majorana fermion. Show that the Wess-
Zumino Lagrangian is invariant under the infinitesimal supersymmetry transformation
δS Φ1 = iϵΨ
δS Φ2 = ϵγΨ
δS Ψ = ∂µ (Φ1 + iΦ2 γ)γ µ ϵ − m(Φ1 − iΦ2 γ)ϵ. (5.68)
Here ϵ is an “infinitesimal Grassman Majorana spinor”, which means you should view it as anticom-
muting with Ψ and obeying the same Majorana constraint as Ψ does. As is usual for a global symmetry
you should take ϵ to be position-independent. You will find it useful to make use of the fact that
Cγ µ C −1 = −γ µT . (5.69)
[Qa , Qb ] = 0
[Pa , Pb ] = 0
[Qa , Pb ] = iδba , (6.1)
and to start with we are interested in computing transition amplitudes of the form
42
where |q, t⟩ is a simultaneous eigenstate of the Qa (t):
Qa (t)|q, t⟩ = q a |q, t⟩. (6.3)
Explicitly we have |q, t⟩ = eiHt |q, 0⟩ (there isn’t a sign mistake here, check it). We then constructed the path
integral formalism by repeatedly inserting complete sets of states:
N
Y −1 Z
⟨qf , tf |qi , ti ⟩ = dqm ⟨qf , tf |qN −1 , tf − ϵ⟩⟨qN −1 , tf − ϵ|qN −2 , tf − 2ϵ⟩ . . . ⟨q2 , t2 |q1 , t1 ⟩⟨q1 , ti + ϵ|qi , ti ⟩,
m=1
(6.4)
where we have split the time interval tf − ti into N pieces of size ϵ. We then used the canonical commutation
relations to show that when ϵ is small we have
dM p iϵ Pa pa q′a −q a
Z
−H(q ′ ,p)
⟨q ′ , t + ϵ|q, t⟩ ≈ e ϵ
, (6.5)
(2π)M
and thus the path integral expression
−1 Z −1 Z
N NY " N −1 !#
a
dM pn − qℓa
Y
M
X X qℓ+1
⟨qf , tf |qi , ti ⟩ = lim d qm exp iϵ pℓ,a − H (qℓ+1 , pℓ )
ϵ→0
m=1 n=0
(2π)M a
ϵ
ℓ=0
" Z !#
Z Z tf
qf
X
a
:= Dq|qi Dp exp i dt pa (t)q̇ (t) − H(q(t), p(t)) . (6.6)
ti a
In deriving this we took the operator ordering in H to put all P to the right of all Q. We also learned that
we can compute expectation values of time-ordered products as
Z Z
q
⟨qf , tf |T O1 (Q(t1 ), P (t1 )) . . . Op (Q(tp ), P (tp ))|qi , ti ⟩ = Dq|qfi Dp O1 (q(t1 ), p(t1 )) . . . Op (q(tp ), p(tp ))
" Z !#
tf X
a
× exp i dt pa (t)q̇ (t) − H(q(t), p(t)) ,
ti a
(6.7)
where the On are ordered so that all P are to the left of all Q, and that we could replace the external states
in this correlation function by the ground state provided that we take ti → −∞(1 − iϵ) and tf → ∞(1 − iϵ)
and then divide by the same with no operator insertions. We will now see how to write analogous formulas
for fermions.
43
then we must have (ψ a )2 = 0 since (Ψa )2 = 0. If ψ a is a complex number then this of course just implies
that ψ a = 0. We do have one state, |0⟩, which is annihilated by all of the Ψa , but this is very far from having
a complete basis of eigenstates to insert in deriving a path integral. To solve this problem we need to invent
a new kind of eigenvalue which can square to zero without being zero.
To build intuition, let’s first consider the case where M = 1. The Hilbert space is just that of one qubit.
The idea is to introduce a formal object ψ, called a Grassmann variable, which obeys
{ψ, ψ} = {ψ, Ψ} = {ψ, Ξ} = 0 (6.11)
and also
ψ|0⟩ = |0⟩ψ. (6.12)
Note that this implies that
ψ|1⟩ = ψΞ|0⟩ = −Ξψ|0⟩ = −Ξ|0⟩ψ = −|1⟩ψ. (6.13)
At this point people often ask “but what is the Grassman variable?” I don’t know a particularly elegant
answer to this question: it is simply a symbol ψ which obeys the rules I just stated. Given such a variable
we can then define a set of objects of the form
r = a + bψ, (6.14)
where a and b are complex numbers. The set R of all such objects forms what mathematicians call a ring:
we can add and multiply elements in the obvious way
(a + bψ) + (c + dψ) = (a + b) + (c + d)ψ
(a + bψ)(c + dψ) = ac + (ad + bc)ψ, (6.15)
but there is in general no multiplicative inverse (what would ψ −1 be?). We then introduce an extension H e
of the qubit Hilbert space H we started with, which allows superpositions of |0⟩ and |1⟩ with coefficients in
R. Mathematicans would call H e a module over the ring R, and in fact it is what is sometimes called a
supermodule since we can write it as the direct sum of a submodule with Grassmann weight zero and a
submodule with Grassmann weight one. In this accounting commuting objects such as |0⟩ and any complex
number have weight zero, while anticommuting objects such as ψ and |1⟩ have weight one. Weight zero
objects also commute with weight one objects. The reason we introduce the Grassmann supermodule H e
over the physical Hilbert space H is that it contains nontrivial eigenstates for Ψ. Indeed if we define
|ψ⟩ = |0⟩ − ψ|1⟩, (6.16)
we can easily check that
Ψ|ψ⟩ = ψ|0⟩ = ψ (|0⟩ − ψ|1⟩) = ψ|ψ⟩, (6.17)
where we used that Ψ|0⟩ = 0, Ψ|1⟩ = |0⟩, and the above rules for manipulating ψ. Note that |ψ⟩ has weight
zero, so we can commute it freely with anticommuting objects such as ψ.
Extending this construction to general M is not difficult, we introduce M Grassmann variables ψ a obeying
{ψ a , ψ b } = {ψ a , Ψb } = {ψ a , Ξb } = 0 (6.18)
and
ψ a |0⟩ = |0⟩ψ a . (6.19)
a
We can construct a simultaneous eigenstate of the Ψ in the Grassmann supermodule as
|ψ⟩ = (1 − ψ 1 Ξ1 ) . . . (1 − ψ M ΞM )|0⟩, (6.20)
which again has weight zero. The easiest way to show that this is an eigenstate is to first observe that for
some fixed b (not summed over) we have
Ψa (1 − ψ b Ξb ) = Ψa + ψ b ({Ψa , Ξb } + Ξb Ψa ) = (1 − ψ b Ξb ) Ψa + δba ψ b .
(6.21)
44
Thus when we act on |ψ⟩ with Ψa it moves freely through all the factors not involving ψ a , and when it meets
the factor (1 − ψ a Ξa ) (again with no sum on a) then it generates two terms: one where Ψa moves through
freely and the other where it is replaced by a factor of ψ a . The former term vanishes since when Ψa moves
through the rest of the factors it meets |0⟩ and annihilates it. We thus have
It will also be useful for us to work out the action on |ψ⟩ by a product of some of the Ψa , this is given by
There is no sign on the right hand side since, although we have to move each Ψai through the eigenvalues of
the Ψ’s to its right, we then have to move the eigenvalues back.
The Hilbert space H has an inner product, and it is natural to also define an inner product on the
Grassman supermodule. Given |ϕ1 ⟩, |ϕ2 ⟩ ∈ H and r1 , r2 ∈ R, the inner product on H
e is defined to obey
(|ψ1 ⟩ + |ψ1′ ⟩, |ψ2 ⟩ + |ψ2′ ⟩) = (|ψ1 ⟩, |ψ2 ⟩) + (|ψ1 ⟩, |ψ2′ ⟩) + (|ψ1′ ⟩, |ψ2 ⟩) + (|ψ1′ ⟩, |ψ2′ ⟩). (6.25)
Note that this construction requires us to know how to take the complex conjugates of Grassmann variables.
We thus need to designate whether each Grassman variable is real or complex: if ψ is real then (of course)
we have ψ ∗ = ψ, while if ψ is complex then ψ ∗ should be treated as an independent Grassman variable.
Moreover the action of the complex conjugate on products of Grassman variables is defined to obey
(ψ 1 ψ 2 )∗ = ψ 2∗ ψ 1∗ . (6.26)
To construct the path integral we also need “bra” versions of the eigenstates of Ψa . A natural guess
would be that these should be the duals of |ψ⟩ with respect to the inner product we just defined on the
Grassman supermodule H, e but this doesn’t actually work. The reason is simple: in fermionic systems Ψ
doesn’t necessarily anticommute with its adjoint, e.g. for a Dirac spinor Ψ and Ψ† are canonical conjugates,
so its left eigenstates don’t need to be the duals of its right eigenstates. What we need for the path integral
construction, as we will see in a moment, are really the left eigenstates rather than the duals. A better
approach is to first introduce a bra ⟨e0| ∈ H, which is defined by saying that it obeys
⟨e
0|Ξa = 0 (6.27)
ψ a ⟨e 0|ψ a ,
0| = ⟨e (6.29)
since we have
ψ a ⟨e 0|0⟩ψ a = ⟨e
0|0⟩ = ⟨e 0|ψ a |0⟩. (6.30)
We can get a complete basis for the bras as
45
presentation so that the only properties we need of ⟨e0| are (6.27) and (6.28), so the formulas we write will
be valid for either choice of inner product. We therefore won’t actually need to use the inner product we
constructed on H,e although it is still good to know about it.
We’ll now construct a left eigenstate of Ψ in the single-fermion case with M = 1. Defining18
⟨ψ|
e = −⟨e
0|Ψ(1 − Ξψ) = −⟨e
1| + ⟨e
0|ψ, (6.32)
e = (−1)M ⟨e
⟨ψ| 0|ΨM . . . Ψ1 (1 − Ξ1 ψ 1 ) . . . (1 − ΞM ψ M ), (6.34)
which obeys
e a = ⟨ψ|ψ
⟨ψ|Ψ e a (6.35)
as you will check on the homework. As for the case of a right eigenstate, we also have
e a1 . . . Ψan = ⟨ψ|ψ
⟨ψ|Ψ e a1 . . . ψ an . (6.36)
The weight of ⟨ψ|e is M mod 2, so we have to be careful moving it past anticommuting objects. It will also
be useful for us to know that
⟨ψe′ |ψ⟩ = (ψ ′M − ψ M ) . . . (ψ ′1 − ψ 1 ), (6.37)
which you will also check on the homework. We will see in the next subsection that the right-hand side of
⃗ ′ − ψ).
this equation is the definition of the Grassmann δ-function δ M (ψ ⃗
One way to motivate this that we would like the integral to be linear in the sense that
Z Z Z
dθ (f (θ) + g(θ)) = dθf (θ) + dθg(θ), (6.41)
46
and we’d also like the integral of a total derivative to vanish:
Z
df
dθ = 0. (6.42)
dθ
These two conditions tell us that Z
dθf (θ) = λb (6.43)
with λ some general f -independent constant. We might as well take λ = 1, since otherwise there is no
natural way to choose it.
There is an interesting feature of the behavior of the Grassmann integral under linear redefinitions of the
integration variable. Indeed say we have a change of variables
θ = θ′ c + d, (6.44)
with c, d ∈ C. We’d like the integral to be invariant under this change of variables, i.e. to have
Z Z
dθ′ f (θ′ ) = dθf (θ). (6.45)
The partial derivatives ∂∂θ are defined to act on f from the left, anticommuting with each θm with m ̸= n
n
until they meet a θn , which they then replace by one. So for example
∂
2θ1 θ2 θ3 = −2θ1 θ3 . (6.49)
∂θ2
∂
On terms with no θn , ∂θn gives zero. The integral is defined as
Z
dθ1 . . . dθN f (θ1 , . . . , θN ) = ∂θ1 . . . ∂θN f = aN . (6.50)
We can work out the change of variables rule under a linear transformation
X
′
θn = Lnm θm + bn (6.52)
m
47
in the same way as we did for N = 1: we want to ensure that
Z Z Z
dθ1 . . . dθN f (θ1 , . . . , θN ) = dθ1 . . . dθN f ((Lθ + b)1 , . . . , (Lθ + b)N ) = dθ1′ . . . dθN
′ ′ ′
f (θ1′ , . . . , θN
′
).
(6.53)
The highest-order term in the integrand of the middle quantity is
X X
′ ′ ′
LN,mN . . . L1,m1 θm N
. . . θ m 1
aN = (−1)p(π) LN,π(N ) . . . L1,π(1) θN . . . θ1′ aN = det L θN
′
. . . θ1′ aN ,
m1 ,...,mN π∈SN
(6.54)
where SN is the permutation group on N elements and (−1)p(π) is the sign of π. Thus for (6.53) to hold we
need to have
1
dθ1 . . . dθN = dθ′ . . . dθN
′
, (6.55)
det L 1
where again the factor of det L is inverted compared to what it would be for the transformation of an ordinary
integration measure.
It is useful to introduce a δ-function for Grassmann integration. This is given by
δ N (θ⃗ ′ − θ)
⃗ = (θ′ − θN ) . . . (θ′ − θ1 ) ,
N 1 (6.56)
for any function f . This δ-function also has a useful integral representation,
Z P ′
N ⃗′ ⃗
δ (θ − θ) = (−1) N (N +1)/2
dξ1 . . . dξN e a ξa (θa −θa ) . (6.58)
so to get a nonvanishing contribution to the integral we should pick the second term in each factor. The
sign works out from moving each ξn past the θ’s to its left.
It will also be useful for us to have a formula for the action of a product of the Ξa on |ψ⟩. Starting from
(6.20) we have
∂ ∂
Ξa1 . . . Ξan |ψ⟩ = − a1 . . . − an |ψ⟩, (6.60)
∂ψ ∂ψ
since we can re-order the factors (1 − ψ n Ξn ) in |ψ⟩ so that they appear in the same order as the Ξa and then
use
∂
Ξa (1 − ψ a Ξa ) = Ξa = − a (1 − ψ a Ξa ). (6.61)
∂ψ
Using our integral representation for the δ-function and switching back to our M -fermion notation we then
have
∂ ∂
⟨ψe′ |Ψa1 . . . Ψan Ξb1 . . . Ξbm |ψ⟩ = ⟨ψe′ |ψ ′a1 . . . ψ ′an − b1 . . . − bm |ψ⟩
∂ψ ∂ψ
∂ ∂
= (−1)M ψ ′a1 . . . (−1)M ψ ′an (−1)M +1 b1 . . . (−1)M +1 bm ⟨ψe′ |ψ⟩
∂ψ ∂ψ
Z P a ′a
= (−1)M (M +1)/2 dξ1 . . . dξM ψ ′a1 . . . ψ ′an ξb1 . . . ξbm e a ξa (ψ −ψ ) . (6.62)
48
6.4 The fermion path integral
We now have all the tools we need to imitate our derivation of the bosonic path integral in fermionic theories.
In particular our Grassmann technology has given us the completeness relation
Z Z
dM ψ|ψ⟩⟨ψ|
e = |ψ⟩dM ψ⟨ψ| e = 1, (6.63)
where for future convenience in the second expression we use that |ψ⟩ has weight zero to move the integration
measure between the ket and the bra. We can check this completeness relation by acting on an eigenstate:
Z Z
|ψ ⟩d ψ ⟨ψ | |ψ⟩ = dψ ′ δ M (ψ
′ M ′ e′ ⃗ ′ − ψ)|ψ
⃗ ′ ⟩ = |ψ⟩. (6.64)
Let’s now use this to get a path integral representation for the transition amplitude:
Z Z
M
⟨ψ^ ,
f ft |ψ ,
i it ⟩ = ⟨ψ^ ,
f ft | |ψ , t
N −1 N −1 ⟩d ψN −1 ⟨ψ ^ , t
N −1 N −1 | . . . |ψ1 , t1 ⟩dM ψ1 ⟨ψ
^ 1 , t1 |ψi , ti ⟩, (6.65)
where we have again discretized the time interval tf − ti into N intervals of length ϵ. We can approximate
the inner products at small ϵ using equation (6.62):
′ , t + ϵ|ψ, t⟩ ≈ ⟨ψ
⟨ψ^ g ′ , t| (1 − iϵH(Ψ, Ξ)) |ψ, t⟩
Z P a ′a
= (−1)M (M +1)/2 dM ξ(1 − iϵH(ψ ′ , ξ))e a ξa (ψ −ψ )
" !#
Z X ψ ′a − ψ a
M (M +1)/2 M ′
≈ (−1) d ξ exp iϵ i ξa − H(ψ , ξ) , (6.66)
a
ϵ
where we have ordered the operators in the Hamiltonian so that all Ξa appear to the right of all Ψa . It only
remains to work out the overall sign. Each inner product contributes a Grassmann measure dM ψm dM ξm−1 ,
and since this has an even number of Grassman integrals we can move them all to the left without incurring
any more signs. Thus we have
Z
N M (M +1)/2
⟨ψ
^f , tf |ψi , ti ⟩ = lim (−1) dM ξN −1 dM ψN −1 . . . dM ξ1 dM ψ1 dM ξ0
ϵ→0
" N −1 !#
a
X X ψℓ+1 − ψℓa
× exp iϵ i ξa,ℓ − H(ψℓ+1 , ξℓ ) . (6.67)
a
ϵ
ℓ=0
In both of these expressions we take ψ0 = ψi and ψN = ψf . Typically we won’t try to keep track of the
overall sign, which anyways will cancel out when we use ratios of path integrals to compute correlation
functions, so somewhat more heuristically we can write this as
" Z #
Z tf X
ψf a
⟨ψf , tf |ψi , ti ⟩ = Dψ Dξ exp i
^
ψi
dt i ξa (t)ψ̇ (t) − H(ψ(t), ξ(t)) . (6.69)
ti a
49
This is our path integral expression for a fermionic transition amplitude! Note in particular that iξa is
the canonical momentum, so this confirms the operator ordering in the Legendre transformation we did to
construct the Dirac Hamiltonian.
We can also introduce a path integral representation for time-ordered correlation functions. This requires
us to compute
′ , t + ϵ|O Ψ(t), Ξ(t) |ψ⟩ ≈ ⟨ψ
⟨ψ^ g ′ , t|(1 − iϵH(Ψ, Ξ))O Ψ(t), Ξ(t) |ψ, t⟩
Z P a ′a
= (−1)M (M +1)/2 dM ξ (1 − iϵH(ψ ′ , ξ))O(ψ, ξ)e a ξa (ψ −ψ )
" !#
Z X ψ ′a − ψ a
M (M +1)/2 M ′
≈ (−1) d ξ O(ψ, ξ) exp iϵ i ξa − H(ψ , ξ) , (6.70)
a
ϵ
where we have ordered the operators in O so that all Ξa appear to the left of all Ψa . We then have
Z
N M (M +1)/2
⟨ψ
^ f , tf |T O1 (Ψ(t1 ), Ξ(t1 )) . . . Op (Ψ(tp ), Ξ(tp ))|ψi , ti ⟩ = lim (−1) dM ξ0 dM ξ1 dM ψ1 . . . dM ξN −1 dM ψN −1
ϵ→0
or more heuristically
Z
ψf
⟨ψ
^ f , tf |T O1 (Ψ(t1 ), Ξ(t1 )) . . . Op (Ψ(tp ), Ξ(tp ))|ψi , ti ⟩ = Dψ ψi
Dξ O1 (ψ(t1 ), ξ(t1 )) . . . Op (ψ(tp ), ξ(tp ))
" Z #
tf X
× exp i dt i ξa (t)ψ̇ a (t) − H(ψ(t), ξ(t)) .
ti a
(6.72)
Note that if the operators O are fermionic, the antisymmetry in the time-ordered product will automatically
work out due to the anticommuting nature of the Grassmann variables. As in the bosonic case we can
compute vacuum expectation values using an iϵ prescription:
R ∞(1−iϵ)
R 0 i dtL(ψ(t),ξ(t))
Dψ 0 Dξ O1 (ψ(t1 ), ξ(t1 )) . . . Op (ψ(tp ), ξ(tp ))e −∞(1−iϵ)
⟨Ω|T O1 (Ψ(t1 ), Ξ(t1 )) . . . Op (Ψ(tp ), Ξ(tp ))|Ω⟩ = R ∞(1−iϵ) ,
R 0 i dtL(ψ(t),ξ(t))
Dψ 0 Dξ e −∞(1−iϵ)
(6.73)
where you can see that the overall sign in the measure indeed drops out. Note that the quantity on the
left-hand side is completely physical, not requiring any discussion of the Grassmann supermodule H, e but we
would have had trouble coming up with the quantity on the right-hand side without it.
For our Dirac theory it is useful to write out the Lagrangian more explicitly. Looking at (6.72) and
recalling that Ξ = Ψ† , we have
Thus we see at last that to have a “classical” Lagrangian for a Dirac field, the field variables must be
Grassmann-valued. Note also that when we differentiate this Lagrangian with respect to ψ̇ to extract the
canonical momenta π = iξ we should take the derivative from the right (see again (6.72)), just as we did
when we canonically quantized the Dirac Lagrangian.
50
6.5 Gaussian integrals
Following our experience with bosonic path integrals, let’s now learn how to evaluate fermionic Gaussian
integrals. The first kind of path integral we want to evaluate, analogous to the Dirac partition function, is
Z
T
Z[A] = dξ1 dψ 1 . . . dξM dψ M e−ξ Aψ (6.75)
= det(A). (6.77)
where b and b̂ are vectors whose components are Grassmann variables. Introducing the change of variables
ψ = ψ ′ − A−1 b̂
ξ = ξ ′ + (A−1 )T b, (6.79)
this simplifies to
Z
′T
Aψ ′ −bT A−1 b̂
Z[A, b, b̂] = dξ1 dψ 1 . . . dξM dψ M e−ξ
T
A−1 b̂
= det(A)e−b . (6.80)
By using this expression we can compute correlation functions in the Gaussian ensemble:
T
dξ1 dψ 1 . . . dξM dψ M ψ am . . . ψ a1 ξb1 . . . ξbm e−ξ Aψ
R
∂ ∂ ∂ ∂ −bT A−1 b̂
R = ... ... e . (6.81)
dξ1 dψ 1 . . . dξM dψ M e−ξT Aψ ∂bam ∂ba1 ∂ b̂b1 ∂ b̂bm b,b̂=0
51
so to get a contribution which is nonvanishing when b = b̂ = 0 we should only have b-derivative terms for
the first m factors and only have bT A−1 terms for the second m factors. We have a choice however of which
derivative acts on which bT A−1 , so (as in the bosonic case) we will have a sum over pairings of the ai with
the bi . Due to the anticommuting nature of the b’s and b̂’s however, the pairings will now be weighted by
sign. One nice way to write the answer is as
∂ ∂ ∂ ∂ −bT A−1 b̂ X
... ... e b,b̂=0
= (−1)p(π) (A−1 )a1 bπ(1) . . . (A−1 )a1 bπ(m) . (6.85)
∂bam ∂ba1 ∂ b̂ 1
b b
∂ b̂ m π∈Sm
As an illustration of these results we can use the path integral to easily compute the Feynman propagator
of a Dirac fermion. Recalling that Ξ = Ψ† , and taking into account the iϵ prescription, from the Dirac
Lagrangian (6.74) we have
A = −γ 0 γ 0 (1 + iϵ)∂τ + γ i ∂i + m .
(6.86)
We can invert this to find the Feynman propagator by solving the equation
γ 0 γ 0 (1 + iϵ)∂τ + γ i ∂i + m SF (x)γ 0 = δ d (x),
(6.87)
where I’ve written A−1 = −SF γ 0 since we defined SF (x) = ⟨Ω|T Ψ(x)Ψ(0)|Ω⟩. Writing
dd p
Z
SF (x) = ŜF (p)eip·x , (6.88)
(2π)d
the equation we need to solve is
iγ 0 (1 + iϵ)p0 + iγ i pi + m ŜF (p) = −1.
(6.89)
This is solved by
i(p/ + im)
ŜF (p) = , (6.90)
p2
+ m2 − iϵ
which is precisely the covariant momentum-space propagator we found in the previous section but now with
far less work!
We can also evaluate Majorana-type Gaussian integrals. Defining19
Z
1 T
Z[A] = dψ 1 . . . dψ 2N e 2 ψ Aψ (6.91)
quantum mechanically. If we nonetheless evaluate this integral with an odd number of fermions we’ll just get zero since the
determinant of an odd-dimensional antisymmetric matrix always vanishes (det A = det AT = det(−A) = − det A).
52
6.6 Euclidean path integral for fermions
As in the bosonic case, it is also interesting to consider a Euclidean version of the fermion path integral. We
define Euclidean Heisenberg operators
Ψa (τ ) = eHτ Ψa (0)e−Hτ
Ξa (τ ) = eHτ Ξa (0)e−Hτ (6.96)
and then proceed in the same way as before. The only difference is that everywhere we had iϵH we should
now write ϵH, so our final expression for a transition amplitude with operator insertions is now
Z ψf
⟨ψ
^ ,
f fτ |T O1 (Ψ(τ 1 ), Ξ(τ 1 )) . . . Op (Ψ(τ p ), Ξ(τ p ))|ψ ,
i iτ ⟩ = Dψ Dξ O1 (ψ(τ 1 ), ξ(τ 1 )) . . . Op (ψ(τ p ), ξ(τ p ))
ψi
" Z #
τf X
a
× exp − dt ξa (τ )ψ̇ (τ ) + H(ψ(τ ), ξ(τ )) .
τi a
(6.97)
There is however one novelty in the fermionic Euclidean path integral related to the thermal trace. Recall
that in the bosonic case we could compute the thermal trace as
Z Z Z
−βH q
M
d q⟨q|e |q⟩ = d q M
Dq q Dp e−SE , (6.99)
with !
Z β X
a
SE = dτ i pa (τ )q̇ (τ ) − H(q(τ ), p(τ )) (6.100)
0 a
being the bosonic Euclidean action. In other words the thermal partition function is just the Euclidean
partition function with periodic boundary conditions in Euclidean time, τ ∼ τ + β. What about in the
fermionic case? Here we are in for a surprise. Let’s try to compute the trace of an arbitrary operator O
using fermionic eigenstates: Z
Tr(O) =? dψ⟨ψ|O|ψ⟩.
e (6.101)
This equation however has an immediate problem: if we compute the “trace” of the identity then we just
get Z
dψ⟨ψ|ψ⟩
e = δ M (0) = 0. (6.102)
How can the trace of the identity be zero? Of course it can’t, to see what is really going on let’s compute
the right-hand side of (6.101) in the one-fermion case with the standard inner product (so that ⟨e
0| = ⟨0| and
⟨e
1| = ⟨1|):
Z Z
dψ⟨ψ|O|ψ⟩ = dψ − ⟨1| + ⟨0|ψ O |0⟩ − ψ|1⟩
e
Z
= dψ ψ (⟨0|O|0⟩ − ⟨1|O|1⟩)
= Tr (−1)F O .
(6.103)
53
On the other hand we have
Z Z
dψ⟨ψ|O|
e − ψ⟩ = dψ ψ (⟨0|O|0⟩ − ⟨1|O|1⟩) = Tr(O). (6.104)
Thus we see that to compute a genuine trace we need to use antiperiodic boundary conditions for fermions,
while if we use periodic boundary conditions then we are computing the trace with a factor of (−1)F inserted.
This statement is true in general, and so in particular to compute a thermal partition function for fermions
we need to use antiperiodic boundary conditions around the thermal circle.
Problems:
1. Confirm equation (6.35) starting from the definition (6.34).
0|ΨM . . . Ψ1 Ξ1 . . . ΞM |e
2. Confirm equation (6.37). Hint: only terms which are proportional to ⟨e 0⟩ = 1
contribute, you just need to be careful about the signs.
3. Confirm equation (6.57). Hint: you will make your life a lot easier if you first note that f (θ1′ , . . . , θN
′
)=
′ ′ ′
f (θ1 + (θ1 − θ1 ), . . . , θN + (θN − θN )). Now Taylor expand f in the quantities (θn − θn ). Do any terms
beyond the lowest order contribute to the integral?
4. Show for M fermions that dM ψ⟨ψ|O|ψ⟩ is proportional to Tr (−1)F O and dM ψ⟨ψ|O|
R R
e e − ψ⟩ is
proportional to Tr (O).
5. (Extra credit x2) In this problem we will study C, R, and T symmetries for Dirac fermions.
Let’s first take R to be the spacetime transformation
x0′ = x0
x1′ = −x1
xi′ = xi (i > 1), (6.105)
and also that these equations hold for d even provided that we take DR = eiθR γγ 1 with θR an arbitrary
phase. Argue that for odd d there is no matrix DR which works.
Now let’s define C to act as
Ψ′ = DC Ψ∗ . (6.108)
Show that invariance of the Lagrangian requires
DC† γ 0 DC = −γ 0T = γ 0∗
DC† γ 0 γ µ DC = γ µT γ 0T , (6.109)
54
and also show that for d = 0, 2, 3 mod 4 we can solve these by
†
B2 d = 0 mod 4
DC = B1† d = 2 mod 4 . (6.110)
†
B d = 3 mod 4
x0′ = −x0
xi′ = xi , (6.111)
DT† DT = I
DT† γ 0∗ DT = −γ 0
DT† γ 0∗ Γi∗ DT = −γ 0 γ i . (6.113)
Moreover show that for d = 0, 1, 2 mod4 these conditions are satisfied for
iθT 0
e Γ B1 d = 0 mod 4
DT = eiθT Γ0 B d = 1 mod 4 , (6.114)
iθT 0
e Γ B2 d = 2 mod 4
1 m2ϕ 2
L = − ∂µ ϕ∂ µ ϕ −
ϕ − iψ ∂/ + mψ ψ − igϕψψ. (7.1)
2 2
As we mentioned before this theory is a model of a nucleon (i.e. a proton or a neutron) ψ interacting with
a pion ϕ. As also mentioned before, a better model of this is what I’ll call true Yukawa theory:
1 m2ϕ 2
Ltrue = − ∂µ ϕ∂ µ ϕ −
ϕ − iψ ∂/ + mψ ψ − gϕψγψ. (7.2)
2 2
The difference between the two has to do with the spatial reflection symmetry R. You saw in the homework
last week that this acts on a Dirac fermion as
55
which acts on ψψ and ψγψ as
′
ψψ = ψ † γ 1† γ † γ 0 γγ 1 ψ = ψγ 0 γ 1† γ 0 γγγ 1 ψ = ψψ
′
ψγψ = ψ † γ 1† γ † γ 0 γ 1 ψ = ψγ 0 γ 1† γ 0 γγ 1 ψ = −ψγψ. (7.4)
In other words ψψ transforms as a scalar under spatial reflection while ψγψ transforms as a pseudoscalar.
Thus in order for spatial reflection to be a symmetry, in the theory with Lagrangian (7.1) the R-transformation
of ϕ should be the scalar transformation
ϕ′ (x) = ϕ(Rx), (7.5)
while in the theory with Lagrangian (7.2) ϕ must be a pseudoscalar:
In the real world pions are pseudoscalars, so it is the second choice which is correct. Nonetheless the true
Yukawa theory is a bit more annoying to calculate in, and also has Feynman rules which are a bit less similar
to quantum electrodynamics, so in this section we’ll mostly stick with the Lagrangian (7.1).
Before studying Yukawa theory in detail I must first confess that there is a sense in which it is rather
unnatural. A potential
m2ϕ 2 g3 3 g4 4
V (ϕ) = g1 ϕ + ϕ + ϕ + ϕ (7.7)
2 6 4!
for ϕ would be perfectly allowed by all of the symmetries of the Lagrangian (7.1) (the Yukawa term breaks
the ϕ′ = −ϕ symmetry of the free scalar theory), and each of these terms is relevant or marginal for d ≤ 4.
Why then have we only included the ϕ2 term? In fact the other terms are generated by loop diagrams via
renormalization group flow even if we set them to zero in the bare Lagrangian, and so we haven’t really gotten
rid of them anyways. The right way to deal with this systematically is to just include the full potential in the
Lagrangian and allow its couplings to be renormalized as usual. Our main purpose in studying this theory
however is to get some practice with fermions before we move on to studying quantum electrodynamics,
which does not have this problem, and so we will take the more lazy approach of simply assuming that the
bare g1 , g3 , and g4 couplings have been tuned to cancel whatever contributions to these terms are generated
by loop diagrams. We can thus ignore both these terms in the action and those diagrams.21
DϕDψDψϕ . . . ϕψ . . . ψψ . . . ψeiSϵ
R
b1 bℓ
⟨Ω|T Φ(x1 ) . . . Φ(xm )Ψaℓ (yℓ ) . . . Ψa1 (y1 )Ψ (z1 ) . . . Ψ (zℓ )|Ω⟩ = R . (7.8)
DϕDψDψeiSϵ
Here I suppressed indices and locations on the right-hand side to save space and Sϵ is the full interacting
action with the time integral evaluated according to the iϵ prescription. As we did last semester, we will
approximate the numerator and denominator of this integral perturbatively by Taylor expanding in the
interaction term. Let’s first study the denominator:
∞ n
gn
Z X Z Z
DϕDψDψeiSϵ ∼ DϕDψDψ dd wϕψψ(w) eiSϵ,0 , (7.9)
n=0
n!
where !
∞(1−iϵ) m2ϕ 2
Z Z
d−1 1
x − ∂µ ϕ∂ µ ϕ −
Sϵ,0 = dt d ϕ − iψ ∂/ + m ψ (7.10)
−∞(1−iϵ) 2 2
21 The situation is somewhat better in the true Yukawa theory (7.2), as spatial reflection symmetry now rules out the linear
and cubic terms in the potential since ϕ is a pseudoscalar. The quartic term however still must be included.
56
is the free action. I’ve written ∼ to indicate that the perturbation series is divergent and only gives an
asymptotic approximation to the exact answer, as we discussed last semester. We saw in the previous
section that moments in the Gaussian distribution for a fermionic theory are computed by
Z
1 b1 bm T X
DψDψ ψ am . . . ψ a1 ψ . . . ψ e−ψ Aψ = (−1)p(π) (A−1 )a1 bπ(1) . . . (A−1 )a1 bπ(m) , (7.11)
det A
π∈Sm
where I’ve taken the liberty of changing variables from ξ to ψ so A−1 is now just the spinor Feynman
propagator SFab . Similarly we saw last semester that for a free scalar field we have
s Z
Aϕ 1 T
X Y
det Dϕ ϕa1 . . . ϕam e− 2 ϕ Aϕ ϕ = (A−1
ϕ )aj ak , (7.12)
2π
Pm (j,k)∈P
where Pm is the set of unordered pairings of m objects and A−1 ϕ is the scalar Feynman propagator GF . Note
that the bosonic expression has no sign factor, while the fermion expression does. For expectation values
that involve both bosons and fermions we simply have the product of these two expressions, switching to
field theory notation we have:
s Z
1 Aϕ b1 bℓ
det DϕDψDψ ϕ(x1 ) . . . ϕ(xm )ψ aℓ (yℓ ) . . . ψ a1 (y1 )ψ (z1 ) . . . ψ (zℓ )eiSϵ,0
det A 2π
ℓ
!
XX Y a b Y
(−1)p(π)
p
= SF π(p) (yp − zπ(p) ) GF (xj − xk ) . (7.13)
Pm π∈Sℓ p=1 (j,k)∈Pm
As in our discussion of the free scalar theory, we can interpret the terms in this sum using Feynman
diagrams. We now have two kinds of lines: directed lines for spinor propagators, which we’ll draw so
that the arrow points from Ψ to Ψ, and undirected lines for scalar propagators. For example in the free
fermion+scalar theory we have the six-point function
b1 b2
h i
⟨T ϕ(x1 )ϕ(x2 )Ψa2 (y2 )Ψa1 (y1 )Ψ (z1 )Ψ (z2 )⟩ = GF (x2 −x1 ) SFa1 b1 (y1 − z1 )SFa2 b2 (y2 − z2 ) − SFa1 b2 (y1 − z2 )SFa2 b1 (y2 − z1 ) ,
(7.14)
which comes from the two Feynman diagrams shown in figure 4. In working out the signs such diagrams it
is convenient to use something called Wick contraction notation, which indicate how the operators in the
correlation function are paired. For these two diagrams the contractions are
⟨ϕϕψψψ ψ⟩ (7.15)
and
57
+
Figure 4: The two free contributions to a six point function ⟨T ΦΦΨΨΨΨ⟩ in Yukawa theory. The second
diagram contributes with a minus sign since we need to exchange two fermions before contracting them.
=g
correlation functions. Feel free to take a moment to go and celebrate. On the other hand we now need to
work out the sign of the permutation π for each diagram by permuting the fermions among the interaction
vertices until each ψ is directly to the left of the ψ it will pair up with. Moreover each interaction vertex has
one fermion line going in and one fermion line going out (see figure 5), so all the fermion propagators will
be combined together into lines which either go through the diagram from an external ψ to an external ψ
(neither of which we have for the denominator (7.9)), or else close off into loops. The structure of the Yukawa
interaction is such as we go along one of the lines, we simply multiply the spinor propagator matrices SFab
together from right to left as we go in the direction of the arrow. In particular for the denominator (7.9) we
only have loops, and in that case the overall sign is easy to work out: each fermion loop introduces a term
of the form
ψψ(xn ) . . . ψψ(x1 ), (7.17)
where the positions are labeled in the order they are encountered going around the loop. Each ψ is thus
contracted with the ψ directly to its right, contributing no sign, except for the first ψ which must be
exchanged with the last ψ before being contracted. Therefore each fermion loop simply contributes a minus
sign to the diagram. We thus arrive at a general formula for the sum of vacuum bubble diagrams:
s
pD
Z Z
1 Aϕ X (−1) Y Y Y
det DϕDψDψeiSϵ ∼ g nD dx1 . . . dxnD GF (xi − xj ) SF (xi − xj ) .
det A 2π sD
D (i,j)∈LD
ϕ ψ
LD
ψ
(i,j)∈LD
(7.18)
Here nD is the number of vertices in the diagram, (−1)pD is the overall sign obtained by exchanging the
fermions before you contract neighbors (in this case pD is just the number of fermion loops), sD is the
symmetry factor, LϕD is the set of (unoriented) scalar links in the diagram, Lψ D is the set of fermion lines
through the diagram, and the product of fermion propagators is taken as matrix multiplication from right
to left as you go along each fermion line. As in the scalar case we can simplify this formula by noting that
the sum of all diagrams is the exponential of the sum of connected diagrams:
s " #
(−1)pC
Z Z
1 Aϕ iSϵ
X Y Y Y
det DϕDψDψe ∼ exp g nC dx1 . . . dxnC GF (xi − xj ) SF (xi − xj ) .
det A 2π sC
C (i,j)∈LC
ϕ ψ
LC (i,j)∈LC
ψ
(7.19)
58
+ + +
+ + +...
Figure 6: Leading vacuum bubble diagrams for Yukawa theory. The second and fourth diagrams involve
“tadpoles” where a single scalar propagator ends on a subdiagram which is not connected to anything else.
+ =0
The connected diagrams up through three loops are shown in figure 6. The first two diagrams evaluate to
g2
Z h1 i
dd xdd yGF (x − y) Tr SF (x − y)SF (y − x) − Tr SF (0) Tr SF (0) , (7.20)
2 2
where the factor of 1/2 is one of the rare symmetry factors in Yukawa theory that arise when you have no
external legs. You will work out the next four diagrams on the homework.
These diagrams already illustrate the fact that loops in Yukawa theory generate a linear potential term
g1 ϕ even if we set it to zero in the bare action. This is because the Yukawa theory allows a scalar propagator
to end on some subdiagram that isn’t attached to anything else, for example in the second and fourth
diagrams in figure 6. Such contributions to a diagram are called “tadpoles”. If we add a bare g1 ϕ term
to the action, we can tune it to cancel all tadpole contributions (see figure 7). This cancellation has nice
physical interpretation: the linear term generated in the potential causes the expectation value of ϕ to shift
away from zero, so we need to define a new field
which by construction has zero expectation value. The Feynman diagrams for Φ′ are the same as those for
Φ, except that we no longer include tadpoles. From now on we will not draw any diagrams with tadpoles.
59
+ +...
+ +...
Figure 8: Leading contributions to the two-point function for a scalar and a spinor in Yukawa theory.
+ +...
Figure 9: Leading contributions to the connected fermion four-point function in Yukawa theory.
Let’s now consider the numerator in equation (7.8), which we can write as
Z
b1 bℓ
DϕDψDψ ϕ(x1 ) . . . ϕ(xm )ψ aℓ (yℓ ) . . . ψ a1 (y1 )ψ (z1 ) . . . ψ (zℓ )eiSϵ
∞ n
gn
Z Z
X
aℓ a1 b1 bℓ
= DϕDψDψ ϕ(x1 ) . . . ϕ(xm )ψ (yℓ ) . . . ψ (y1 )ψ (z1 ) . . . ψ (zℓ ) d wϕψψ(w) eiSϵ,0 .
d
n=0
n!
(7.22)
The Feynman diagrams for this expression work in the same way as for the denominator, except that now
there are external legs that the fermion lines can begin and end on. The disconnected vacuum bubbles
exponentiate and cancel with the denominator in (7.8), so relabeling the external positions to uniformize the
notation we have
b1 bℓ
⟨Ω|T Φ(x1 ) . . . Φ(xm )Ψaℓ (xm+ℓ ) . . . Ψa1 (xm+1 )Ψ (xm+ℓ+1 ) . . . Ψ (xm+2ℓ )|Ω⟩
X Z Y Y Y
= g nĈ (−1)pĈ dxm+2ℓ+1 . . . dxm+2ℓ+nĈ GF (xi − xj ) SF (xi − xj ) . (7.23)
Ĉ (i,j)∈Lϕ Lψ (i,j)∈Lψ
Ĉ Ĉ Ĉ
Here Ĉ is the set of tadpole-free diagrams where each interaction vertex is part of a connected component
which contains at least two external legs. We no longer need to write the symmetry factor since connected
diagrams with external legs don’t have them. The sign (−1)PĈ is worked out by exchanging the internal and
external fermions so that each ψ is to the left of the ψ it is contracted with. Fermion loops still each just
contribute −1, so we only really need to work this out for fermion lines which involve the external legs.
60
Figure 10: A one-loop correction to the Yukawa interaction vertex.
We can simplify this expression a bit more by instead computing the connected correlation function, also
known as the cumulant. For example
b1 b2 b1 b2 b1 b2
⟨T Ψa1 (x2 )Ψa1 (x1 )Ψ (y1 )Ψ (y2 )⟩c =⟨T Ψa1 (x2 )Ψa1 (x1 )Ψ (y1 )Ψ (y2 )⟩ − ⟨T Ψa1 (x1 )Ψ (y1 )⟩⟨T Ψa2 (x2 )Ψ (y2 )⟩
b2 b1
+ ⟨T Ψa1 (x1 )Ψ (y2 )⟩⟨T Ψa2 (x2 )Ψ (y1 )⟩. (7.24)
where now C is the set of connected tadpole-free diagrams. The tree and one-loop diagrams for the fermion
and scalar two-point functions are shown in figure 8, and the tree-level contributions to the fermion four-point
function in figure 9. Evaluating the two-point functions through one-loop gives
Z
2
⟨Ω|T Φ(x2 )Φ(x1 )|Ω⟩ = GF (x1 − x1 ) − g dd ydd zGF (x2 − y)GF (z − x1 )Tr (SF (z − y)SF (y − z)) + O(g 4 )
Z
⟨Ω|T Ψ(x2 )Ψ(x1 )|Ω⟩ = SF (x2 − x1 ) + g 2 dd ydd zGF (y − z)SF (x2 − y)SF (y − z)SF (z − x1 ) + O(g 4 ),
(7.26)
while the two tree-level contributions to the connected fermion four-point function give
Z
b b
⟨Ω|Ψa2 (x2 )Ψa1 (x1 )Ψ 1 (y1 )Ψ 2 (y2 )|Ω⟩c =g 2 dd z1 dd z2 GF (z2 − z1 )SFa1 c (x1 − z1 )SFcb1 (z1 − y1 )SFa2 d (x2 − z2 )SFdb2 (z2 − y2 )
Z
− g 2 dd z1 dd z2 GF (z2 − z1 )SFa2 c (x2 − z1 )SFcb1 (z1 − y1 )SFa1 d (x1 − z2 )SFdb2 (z2 − y2 )
+ O(g 4 ) (7.27)
At one-loop there are many diagrams contributing to this four-point function. Ten of them involve
replacing one of the propagators in the tree-level diagrams with its one-loop correction as in figure 8. Four
more arise from replacing one of the interaction vertices with a one-loop correction, as shown in figure 10.
Finally there are four “ladder” diagrams shown in figure 11. As you can see the amount of work to compute
higher-loop correlation functions grows quite quickly with the number of loops.
61
+ + +
following subsection, and the key input into the LSZ formula is the Fourier transform of the time-ordered
connected correlation functions:
b b
⟨Ω|T Φ(k1 ) . . . Φ(km )Ψaℓ (km+ℓ ) . . . Ψa1 (km+1 )Ψ 1 (km+ℓ+1 ) . . . Ψ ℓ (km+2ℓ )|Ω⟩c
Z
b b
= dd x1 . . . dd xm+2ℓ e−ik1 ·x1 −...−ikm+2ℓ ·xm+2ℓ ⟨Ω|T Φ(x1 ) . . . Φ(xm )Ψaℓ (xm+ℓ ) . . . Ψa1 (xm+1 )Ψ 1 (xm+ℓ+1 ) . . . Ψ ℓ (xm+2ℓ )|Ω⟩c .
(7.28)
In perturbation theory these position integrals, and also the integrals over the interaction vertices, can be
evaluated explicitly just as in our scalar theory by using our integral representations
dd p −i
Z
GF (x − y) = eip·(x−y)
(2π)d p2 + m2ϕ − iϵ
dd p i(p / + im) ip·(x−y)
Z
SF (x − y) = e (7.29)
(2π) p + m2ψ − iϵ
d 2
for the scalar and vector propagators. As in the scalar case the rules for evaluating a connected Feynman
diagram C contributing to this Fourier transform are quite simple to remember:
3. Label external momenta by the momenta in the Fourier transform, and label the internal momenta
imposing momentum conservation at each interaction vertex.
4. Multiply by an overall momentum-conserving δ-function (2π)d δ d (k1 + . . . + km+2ℓ ).
5. Write a momentum-space scalar propagator
−i
ĜF (p) = (7.30)
p2 + m2ϕ − iϵ
for each scalar line, both internal and external, write a momentum-space fermion propagator
i(p
/ + im)
ŜF (p) = (7.31)
p2+ m2ψ − iϵ
62
+
Figure 12: Momentum labels for the tree-level contribution to the fermion four-point function.
for each internal fermion line and each external fermion line whose fermion and momentum arrows are
aligned, and write a momentum space fermion propagator
i(−p
/ + im)
ŜF (−p) = (7.32)
p2 + m2ψ − iϵ
for each external fermion line whose fermion and momentum arrows are opposite.
6. Fermion propagators are multiplied as matrices along fermion lines just as in the position-space corre-
lator.
7. Work out the overall sign of the diagram by counting fermion exchanges as in the position space
correlator.
In the LSZ formula we will want to reverse the signs of the external momenta in the Fourier transform
corresponding to incoming particles, in general the rule continues to be that we should write ŜF (p) for
external lines whose momentum and fermion directions are aligned while we should write ŜF (−p) for external
lines whose momentum and fermion directions are opposite.
For example let’s apply these rules to the tree-level fermion four-point function from figure 9. Assigning
momentum labels as in figure 12, we have
b b
h
⟨Ω|T Ψa2 (k2′ )Ψa1 (k1′ )Ψ 1 (−k1 )Ψ 2 (−k2 )|Ω⟩c = (2π)d δ d (k1′ + k2′ − k1 − k2 )g 2 ŜFa1 c (k1′ )ŜFcb1 (k1 )ĜF (k1 − k1′ )ŜFa2 d (k2′ )ŜFdb2 (k2 )
i
− ŜFa2 c (k2′ )ŜFcb1 (k1 )ĜF (k1 − k2′ )ŜFa1 d (k1′ )ŜFdb2 (k2 ) .
(7.33)
63
On the left hand side we have the Fourier transform
Z
′ ′ ′ ′
′aN
⟨Ω|T ON ′
(kN ) . . . O1′a1 (k1′ )O1b1 † (−k1 ) . . . OM
bM †
(−kM )|Ω⟩c = dd x1 . . . dd xN dd x′1 . . . dd x′N eik1 ·x1 +...+ikN ·xN −ik1 ·x1 −...−ikN ·xN
′aN
× ⟨Ω|T ON (x′N ) . . . O1′a1 (x′1 )O1b1 † (x1 ) . . . OM
bM †
(xM )|Ω⟩c
(7.35)
of a time-ordered connected correlation function in an interacting quantum field theory, where you can see
that we have indeed flipped the signs of the momenta for operators associated to particles in the initial state.
On the right-hand side the Zn factors are defined by the matrix elements
Zn ua (⃗k, σ, n)
⟨Ω|Oa (0)|⃗k, σ, n⟩ = q
2ωn,⃗k
Zn v a (⃗k, σ, nc )
⟨⃗k, σ, nc |Oa (0)|Ω⟩ = q , (7.36)
2ωnc ,⃗k
where |⃗k, σ, n⟩ is a one-particle state of particle type n, momentum ⃗k, and spin/helicity σ, nc is the antiparticle
of n, and ua (⃗k, σ, n) and v a (⃗k, σ, nc ) are the objects appearing in the mode decomposition of a free field Φa
with the same quantum numbers as Oa :
X Z dd−1 p 1 h a i
a ip·x a c † −ip·x
Φ (x) = u (⃗
p , σ, n)a p
⃗ ,σ,n e + v (⃗
p , σ, n )a p
⃗ ,σ,nce . (7.37)
(2π)d−1 2ωp⃗
p
σ
We saw last semester that ua and v a are completely determined by Lorentz invariance up to an overall
factor, which is the reason they appear both in the free field and in the interacting matrix elements (7.36)
(that it is the same Zn in both lines of (7.36) is a consequence of CRT symmetry). The arrow in (7.34)
means that the external momenta k and k ′ are all taken to be close to on-shell, meaning to each be close
to obeying k 2 = −m2n with mn the mass of a particle of type n. This formula says that as long as the Oi
operators have a nonzero amplitude to annihilate particles of type ni (this is measured by the Z factors),
and the Oi† operators have a nonzero amplitude to create them (this is measured by the Z ∗ factors), then
the Fourier transform of the time-ordered correlation function has a multidimensional simple pole in this
limit (the factors of k2 +m1 2 −iϵ ) and the residue of this pole is the connected S-matrix element
64
creating fermions. The wave function renormalization constants Zϕ and Zψ are defined by
Zψ ua (⃗k, σ)
⟨Ω|Ψa (0)|⃗k, σ, f ⟩ = q
2ωf,⃗k
Zψ v a (⃗k, σ)
⟨⃗k, σ, f |Ψa (0)|Ω⟩ = q
2ωf,⃗k
a Zψ∗ ua (⃗k, σ)
⟨⃗k, σ, f |Ψ (0)|Ω⟩ = q
2ωf,⃗k
a Zψ∗ v a (⃗k, σ)
⟨Ω|Ψ (0)|⃗k, σ, f ⟩ = q
2ωf,⃗k
Zϕ
⟨Ω|Φ(0)|⃗k, s⟩ = q
2ωs,⃗k
Zϕ∗
⟨⃗k, s|Φ(0)|Ω⟩ = q , (7.38)
2ωs,⃗k
where u and v are now the ones we constructed by solving the Dirac equation. The frequencies are given by
q
ωs,⃗k = |k|2 + m2ϕ
q
ωf,⃗k = |k|2 + m2ψ , (7.39)
where mϕ and mψ are the physical masses of the scalar and fermion particles (not the bare quantities
appearing in the Lagrangian).
In the LSZ formula however we need to be careful about the fact that Ψ is not the complex conjugate of
Ψ. This has no effect for fermions in the final state, but for fermions in the initial state we should multiply
each O† by γ 0 on the right, which converts each u∗ on the right-hand side of (7.34) to a u. Similarly since
we are using Ψ as the O which annihilates antifermions, the “Zu” appearing in (7.34) for each final-state
†
antifermion becomes a Zψ∗ v. Ψ = γ 0† Ψ would therefore supply factors of (Zψ∗ v)∗ = Zψ γ 0† v for each initial-
state antifermion, but since instead want to use Ψ to create initial state antifermions we should multiply by
γ 0 on the right and so we should replace Z ∗ u∗ in the LSZ formula by Zψ v for each initial state antifermion.
To summarize, in the LSZ formula for Yukawa theory we have:
√
−i 2ω ⃗k ′
Final state scalar ⟨⃗k ′ , s|: Φ(k ′ ) on the left-hand side, Zϕ (k′ )2 +ms,
2 −iϵ on the right-hand side.
ϕ
√
−i 2ωs,⃗k
Initial state scalar |⃗k, s⟩: Φ(−k) on the left-hand side, Zϕ∗ k2 +m2 −iϵ on the right-hand side.
ϕ
√
−i 2ω ⃗k ′
Final state fermion ⟨⃗k ′ , σ ′ , f |: Ψa (k ′ ) on the left-hand side, Zψ ua (⃗k ′ , σ ′ ) (k′ )2 +mf,
2 −iϵ on the right-
ψ
hand side.
√
b −i 2ωf,⃗k
Initial state fermion |⃗k, σ, f ⟩: Ψ (−k) on the left-hand side, Zψ∗ ub (⃗k, σ) k2 +m2 −iϵ on the right-hand
ψ
side.
√
a −i 2ω ⃗k ′
Final state antifermion ⟨⃗k ′ , σ ′ , f |: Ψ (k ′ ) on the left-hand side, Zψ∗ v a (⃗k ′ , σ ′ ) (k′ )2 +mf,
2 −iϵ on the
ψ
right-hand side.
65
+
√
−i 2ωf,⃗k
⃗ b ⃗
Initial state antifermion |k, σ, f ⟩: Ψ (−k) on the left-hand side, Zψ v (k, σ) k2 +m2 −iϵ on the right-
b
ψ
hand side.
Let’s now see what the LSZ formula says about how to compute the S-matrix of Yukawa theory in
perturbation theory. Let’s first consider the example of fermion-fermion scattering at tree-level, for which
we computed the relevant four-point function in equation (7.33). We are supposed to take the external
momenta on-shell, in which case each of the fermion propagators in (7.33) can be approximated as
P
i(p
/ + imψ ) −i σ u(p, σ)u(p, σ)
ŜF (p) = 2 ≈ . (7.40)
p + m2ψ − iϵ p2 + m2ψ − iϵ
I emphasize that the sum rule replacing the numerator by a sum over uu only works when the momentum is
on-shell. Comparing this to the LSZ rules just stated, we see that to extract the S-matrix we should simply
replace each final state fermion propagator by a √2ω 1
u(⃗k ′ , σ ′ ) and each initial state fermion propagator
k′
⃗
1
by a √2ω u(⃗k, σ) (here we are using that Zψ = Zϕ = 1 and the physical mass equals the bare mass at
⃗
k
tree-level). The tree-level 2 → 2 fermion connected scattering amplitude is thus given by
1 1 1 1
⟨k2′ , σ2′ , f ; k1′ , σ1′ , f |p1 , σ1 , f ; p2 , σ2 , f ⟩c = p p q q g 2 (2π)d δ d (k1′ + k2′ − k1 − k2 )
2ωk⃗1 2ωk⃗1 2ω ⃗′ 2ωk⃗′
k 1 2
66
Figure 14: Decomposing a four-point function into a central contribution from the sum of pruned diagrams
contracted with the exact propagators on its external lines.
⟨ψψψψψψψψ⟩ (7.44)
and
⟨ψψψψψψψψ⟩, (7.45)
and the negative momenta in fermion propagators are for the lines where the fermion and momentum arrows
are pointing in opposite directions. Taking these external propagators on-shell, we can use that close to
on-shell we have
i −p
P
/ + imψ i σ v(⃗ p, σ)
p, σ)v(⃗
ŜF (−p) = 2 ≈ . (7.46)
p + m2ψ − iϵ p2 + m2ψ − iϵ
Comparing this to our LSZ rules we see that to extract the S-matrix we should replace each final state
′ ′
antifermion propagator by −v(k
√ ,σ ) and each initial state propagator with −v(k,σ)
2ω
√
2ω
(here we are again using
k′
⃗ ⃗
k
that Zψ = 1 at this order in g). We therefore have the covariant connected amplitude
" #
⃗′ ′ ⃗ ⃗ ⃗′ ′ v(⃗k2 , σ2 )u(⃗k1 , σ1 ) × u(⃗k ′1 , σ1′ )v(⃗k ′2 , σ2′ )
2 u(k 1 , σ1 )u(k1 , σ1 ) × v(k2 , σ2 )v(k 2 , σ2 )
iMc (f f → f f ) = ig
f − .
(k1 − k1′ )2 + m2ϕ − iϵ (k1 + k2 )2 + m2ϕ − iϵ
(7.47)
Let’s now generalize what we learned in this example to the case of an arbitrary scattering amplitude
in Yukawa theory. The first point to make is that we showed last semester that the full non-perturbative
two-point function becomes proportional to the free two point function in the limit that its momentum goes
67
on-shell:
b |Zψ |2 i(k/2 + imψ )
⟨Ω|T Ψa (k2 )Ψ (k1 )|Ω⟩ →(2π)d δ(k2 + k1 )
k22 + m2ψ − iϵ
−i|Zψ |2 σ u(⃗k2 , σ)u(⃗k2 , σ)
P
d
= (2π) δ(k2 + k1 )
k22 + m2ψ − iϵ
i|Zψ |2 σ v(−⃗k2 , σ)v(−⃗k2 , σ)
P
= (2π)d δ(k2 + k1 ) (7.48)
k22 + m2ψ − iϵ
with the very important caveat that the mass mψ appearing here is the physical mass of the fermion, while in
the propagator we’ve been using above it is really the bare mass mψ,0 . As before the exact scalar two-point
function also has this property:
−i|Zϕ |2
⟨Ω|T Φ(k2 )Φ(k1 )|Ω⟩ = (2π)d δ(k2 + k1 ) . (7.49)
k22 + m2ϕ − iϵ
A general diagram contributing to the correlation function on the left-hand side of the LSZ formula (7.34)
can be written as a “pruned” diagram with the property that there is no internal line can be cut which
separates one external line from the rest of the diagram contracted with arbitrary diagrams contributing to
the exact two-point function on each of the external legs (see figure 14). To compute the scattering amplitude
we therefore simply need to strip off these external exact propagators, except that there is a small mismatch:
removing an exact fermion propagator removes a factor of |Zψ |2 and removing an exact scalar propagator
removes a factor of |Zϕ |2 , but in the LSZ formula we only want to remove a factor of Zψ ,Zψ∗ , Zϕ , or Zϕ∗ . We
therefore have a leftover factor of Z or Z ∗ in each case, leading to the following rule:
iM
fc =Sum over pruned connected tadpole-free Feynman diagrams, with overall momentum δ-function removed and
with external propagators replaced by the factors shown in figure 15. (7.50)
The minus signs for the external lines involving antiparticles are typically dropped since they never contribute
when we compute the square of the amplitude, so you are free to drop them whenever you compute a cross
section or decay rate, but they need to be there if you want to get the right expression for the S-matrix.22
Let’s study this scattering amplitude in the non-relativistic limit where both fermion momenta are small
compared to their masses, in which case we can treat the momenta in u and u as being at rest. From our
sum rules we have
u(⃗0, σ)u(⃗0, σ ′ ) = −2imδσ,σ′ , (7.52)
and in the non-relativistic limit we simply have
(k1 − k1′ )2 = |⃗k1 − ⃗k ′1 |2 , (7.53)
22 Most QFT books (*ahem* Peskin, Schwartz) don’t write these sign factors, which they can get away with since they don’t
actually work out the LSZ formula for spinor fields and in any case the signs drop out when you square the amplitude. Srednicki
at least acknowledges they are there.
68
Figure 15: Factors for external lines in computing the Yukawa theory S-matrix. Here the dots indicate the
rest of the diagram, so the top row are final state particles/antiparticles and the bottom row are initial state
particles/antiparticles.
We can now compare this to the usual Born approximation for nonrelativistic scattering off of a potential,
which says that the scattering amplitude is23
fc ≈ −4im2 × V (⃗k ′ − ⃗k),
iM (7.55)
ψ
so apparently we have
g2
V (⃗k) = − . (7.56)
|k|2 + m2ϕ
You will show on the homework that this is the Fourier transform of the potential
g 2 1 −mϕ |x|
VY ukawa (⃗x) = − e . (7.57)
4π |x|
Note that potential is attractive: the proton and neutron are pulled together by the Yukawa force. This is
what holds nuclei together!
We can also consider the non-relativistic scattering of a proton and an antineutron, which has scattering
amplitude
⃗′ ′ ⃗ ⃗ ⃗′ ′
iMfc (p n → p n) = ig 2 u(k 1 , σ1 )u(k1 , σ1 ) × v(k2 , σ2 )v(k 2 , σ2 ) (7.58)
(k1 − k1′ )2 + m2ϕ
In the non-relativistic limit this becomes
4ig 2 m2ψ δσ1 ,σ1′ δσ2 ,σ2′
iM
fc (p n → p n) ≈ , (7.59)
|⃗k1 − ⃗k ′ |2 + m2
1 ϕ
23
q To derive this you need to translate Mc back to the S-matrix and remember that in the non-relativistic limit we have
f
2ωf, ⃗k ≈ 2mψ .
p
69
with the different overall sign canceling the differing sign from the sum rule v(0, σ)v(0, σ ′ ) = 2mψ δσ,σ′ , so
the potential between a fermion and an antifermion is the same as for a fermion and a fermion. In particular
it is again attractive! To get repulsive forces we need to exchange a particle of spin one, as we will see when
we study quantum electrodynamics.
where E1 and E2 are the energies of the initial particles, Eβ,j is the energy of the jth outgoing particle, and
p
(k1 · k2 )2 − m21 m22
uα = (7.61)
E1 E2
is the relativistic relative velocity of the initial particles. Last semester we showed that working in the center
of mass frame ⃗k1 = −⃗k2 with two particles in the final state we can simplify this to
dσ 1 1 |k ′ |d−3 f 2
dΩd−2
= If inal
2 (2π)d−2 16|k|Etot2 |Mc | , (7.62)
where |k| is the magnitude of the spatial momentum of either incoming particle, If inal is equal to one if the
final state particles are identical and zero if they are distinguishable, Etot is the total center of mass energy
q q
Etot = |k|2 + m21 + |k|2 + m22 (7.63)
dσ 1 |k|d−4 fc |2 .
= |M (7.65)
dΩd−2 128(2π)d−2 |k|2 + m2ψ
Our remaining job is thus to compute the square of the covariant matrix element (7.42). Writing the square
isn’t so illuminating if we don’t do anything else, but as we discussed last semester the situation improves
if we sum over the spins/helicities in the initial and final states. This isn’t purely a matter of laziness: in a
typical collider experiment it is hard to measure the spins/helicities of the outgoing particles, and the initial
state particle beams typically are not spin/helicity polarized. To capture this situation we should therefore
sum over spin/helicity in the final state and average over spin/helicity in the initial state.24
dσave 1 X dσ 1 |k|d−4 X f 2
≡ d−2 = d−2 2 |Mc | . (7.66)
dΩd−2 22⌊ 2 ⌋ σ,σ′ dΩd−2 128(2π)d−2 22⌊ 2 ⌋ |k|2 + mψ σ,σ′
24 If you are puzzled about why we treated the initial and final states differently, think about the quantum mechanics of the
experiment. A random initial spin is described by a normalized density matrix, for example ρ = 12 | ↓⟩⟨↓ | + 12 | ↑⟩⟨↑ | for a
spin-1/2 particle, while a measurement which cannot distinguish between spin up and spin down in the final state is a projection
P = | ↓⟩⟨↓ | + | ↑⟩⟨↑ |.
70
d−2
Here we’ve used that the irreducible spinor representation of SO(d − 1) is 2⌊ 2 ⌋ -dimensional. In d = 4 this
expression is simply
dσave 1 1 X
fc |2 .
= 2 2 2 |M (7.67)
dΩ2 2048π |k| + mψ ′ σ,σ
We can compute the spin sum explicitly using a standard sequence of tricks. We first note that if f and g
are commuting spinors (for us they will always be u or v), then
Indeed for the square of the first term we have (using an abbreviated notation)
X X
|u1′ u1 u2′ u2 |2 = u1′ u1 u1 u1′ × u2′ u2 u2 u2′
σ,σ ′ σ,σ ′
X
= Tr (u1 u1 u1′ u1′ ) × Tr (u2 u2 u2′ u2′ )
σ,σ ′
h i h i
′ ′
k 1 + imψ ) k/1 + imψ Tr (/
= Tr (/ k 2 + imψ ) k/2 + imψ , (7.71)
and h i
′ ′
X
u2′ u1 u1′ u2 u2 u2′ u1 u1′ = Tr k/1 + imψ (/
k 2 + imψ ) k/2 + imψ (/
k 1 + imψ ) . (7.74)
σ,σ ′
To further evaluate these traces we need to learn how to take the trace of a product of γ-matrices. This
is sometimes called “trace technology”. Let’s start simple: from our product-Pauli representation of the
γ-matrices we clearly have
Tr(γ µ ) = 0. (7.75)
From the Dirac algebra we also have
1 d
Tr (γ µ γ ν ) = Tr ({γ µ , γ ν }) = 2⌊ 2 ⌋ η µν . (7.76)
2
25 Actually here we only need the first of these, but when you compute amplitudes involving antiparticles in the homework
71
Things get trickier as we go to more γ-matrices, so let’s first take d to be even. We can then show that the
trace of the product of any odd number of γ-matrices is zero:
Tr (γ µ1 . . . γ µn ) = Tr (γγγ µ1 . . . γ µn )
= Tr (γγ µ1 . . . γ µn γ)
= (−1)n Tr (γ µ1 . . . γ µn γγ)
= (−1)n Tr (γ µ1 . . . γ µn ) . (7.77)
We can also show that the trace of the product of any odd number of γ-matrices together with γ is zero as
well:
Tr (γγ µ1 . . . γ µn ) = (−1)n Tr (γ µ1 . . . γ µn γ) = (−1)n Tr (γγ µ1 . . . γ µn ) . (7.78)
For the above calculation what we need the trace of the product of four γ matrices: we can compute this as
follows:
Tr γ µ γ ν γ α γ β = 2η µν Tr γ α γ β − Tr γ ν γ µ γ α γ β
= 2η µν Tr γ α γ β − 2η µα Tr γ ν γ β + Tr γ ν γ α γ µ γ β
= 2η µν Tr γ α γ β − 2η µα Tr γ ν γ β + 2η µβ Tr (γ ν γ α ) − Tr γ µ γ ν γ α γ β ,
(7.79)
and thus
d
Tr γ µ γ ν γ α γ β = 2⌊ 2 ⌋ η µν η αβ − η µα η νβ + η µβ η να .
(7.80)
It is useful to re-package these results as statements about the traces of slashed vectors:
d
//b = 2⌊ 2 ⌋ a · b
Tr a
d
/ = 2⌊ 2 ⌋ (a · b)(c · d) − (a · c)(b · d) + (a · d)(b · c) .
//b/cd
Tr a (7.81)
σ,σ ′
d
X
|u2′ u1 u1′ u2 |2 = 22⌊ 2 ⌋ k1 · k2′ − m2ψ k2 · k1′ − m2ψ
σ,σ ′
X X
u1′ u1 u2′ u2 u1 u2′ u2 u1′ = u2′ u1 u1′ u2 u2 u2′ u1 u1′
σ,σ ′ σ,σ ′
d
= 2⌊ 2 ⌋ m4ψ − m2ψ (k1 · k2 + k1 · k1′ + k1 · k2′ + k1′ · k2 + k1′ · k2′ + k2 · k2′ )
+ (k1 · k1′ )(k2 · k2′ ) − (k1 · k2 )(k1′ · k2′ ) + (k1 · k2′ )(k2 · k1′ ) . (7.82)
72
We can evaluate these expressions in the center of mass frame, where
k1 · k2 = −ω 2 − |k|2
k1 · k1′ = −ω 2 + |k|2 cos θ
k1 · k2′ = −ω 2 − |k|2 cos θ
k1′ · k2 = −ω 2 − |k|2 cos θ
k1′ · k2′ = −ω 2 − |k|2
k2 · k2′ = −ω 2 + |k|2 cos θ
(k1 − k1′ )2 = 2|k|2 (1 − cos θ)
(k1 − k2′ )2 = 2|k|2 (1 + cos θ) (7.85)
Here is θ is the angle between ⃗k and ⃗k ′ , and we see that the squared amplitude has an interesting angular
dependence. This is quite different from our tree-level 2 → 2 squared scattering amplitude in ϕ4 theory,
which was just λ2 . In limit where mψ = 0 the expression for the differential cross section isn’t too bad, we
have
"
X
2 4 2⌊ d
⌋ 4 (1 − cos θ)2 (1 + cos θ)2
|M
fc | = g 2 2 |k|
2 +
2
(2|k| (1 − cos θ) + mϕ ) 2 (2|k| (1 + cos θ) + m2ϕ )2
2
σ,σ ′
#
1 1 + cos θ
+ d−4 2 2 . (7.86)
2⌊ 2 ⌋ (2|k|2 (1 − cos θ) + mϕ )(2|k|2 (1 + cos θ) + mϕ )
Life is even simpler if we take the fermions to be distinguishable, in which case only the first term contributes
so we have the full differential cross section
dσave g 4 |k|d−2 (1 − cos θ)2
= 4 × 2 . (7.87)
(1 + 2 |k|
dΩd−2 d−2
32(2π) mϕ
m2
(1 − cos θ))2
ϕ
This has several interesting features, for example when |k| ≪ mϕ the cross section is zero at θ = 0 and
peaked at θ = π, while for |k| ≫ mϕ it is roughly independent of angle.
Problems:
1. Write out the values of the three-loop bubble diagrams in figure 6. You don’t need to evaluate the
integrals over positions, and I’ll forgive you if you ignore the symmetry factors.
2. Write out the contribution to the connected fermion four-point function from the four ladder diagrams
shown in figure 11. You don’t need evaluate the position integrals.
3. Write out the tree-level contributions to the connected fermion four-point function in true Yukawa
theory with interaction −gϕψγψ.
4. Show that (7.56) is the Fourier transform of (7.57).
73
5. Compute the tree-level spin-summed/averaged differential cross section for f f → f f scattering (you
can start from the covariant amplitude (7.47)). If you insist you can work in d = 4, and it is ok to
leave it in a form similar to (7.83).
6. Compute the tree-level spin-summed differential cross section for s s → f f scattering. Make sure to
include both diagrams which contribute. If you insist you can work in d = 4, and it is ok to leave it in
a form similar to (7.83).
7. Extra credit: compute the tree-level spin-summed/averaged differential cross section for f f → f f
scattering in the true Yukawa theory with interaction −gϕψγψ. You can work in d = 4.
We’d like to define a version of this theory where the field lives only on a spatial lattice, with site locations
xi = an (8.4)
with n ∈ Z. The length parameter a is called the lattice spacing. Since the fields are now labeled by
a discrete parameter, we should rescale them so that the canonical commutation relation has a discrete
Kronecker δ instead of a continuous δ-function:
√
Φ(n) = aΦcont (an) (8.5)
√
Π(n) = aΠcont (an), (8.6)
26 Many of the topics in this section are nicely reviewed in two papers by my adviser Leonard Susskind from 1977-1978, I put
74
in terms of which we have
[Φ(n), Π(m)] = iδm,n . (8.7)
These results follow from the fact that in the continuum limit we have the replacements
X Z
a → dx
n
1
δn,m → δ(x − y). (8.8)
a
The spatial derivative in the Hamiltonian should be replaced by a finite difference
Φcont (x + a) − Φcont (x)
∂1 Φcont (x) → , (8.9)
a
so in terms of the lattice field the Hamiltonian is
2 !
1X Π(n)2 Φ(n + 1) − Φ(n)
H= a +
2 n a a3
2 !
1X 2 Φ(n + 1) − Φ(n)
= Π(n) + . (8.10)
2 n a2
75
4
-3 -2 -1 1 2 3
Figure 16: The dispersion relation ω 2 (k) for a lattice scalar field, in units where a = 1. Note that the only
low-energy states are near k = 0.
{Ψcont
i (0, x), Ψcont∗
j (0, y)} = δij δ(x − y), (8.20)
(∂0 − ∂1 )Ψcont
L =0
(∂0 + ∂1 )Ψcont
L = 0, (8.21)
so Ψcont
L is a function only of t + x, i.e. it is “left-moving”, and Ψcont
R is a function only of t − x, i.e. it is
“right-moving”. I emphasize that Ψcont L and Ψ cont
R are decoupled, so we can have a sensible theory which
has only one or the other - field theories in d = 2 with inequivalent numbers of left-moving and right-moving
massless fermions are called chiral theories.28
28 In d = 4 chiral theories are those with inequivalent numbers of left-handed and right-handed massless fermions. The term
is also sometimes used when there are equal numbers but the left-handed and right-handed massless fermions have different
interactions.
76
In this section we will be particularly interested in Majorana fermions, which obey the constraint Ψcont∗ =
B1 Ψcont . Here this just means that Ψcont L and Ψcont
R are real. The Lagrangian and Hamiltonian for a
Majorana spinor in 1 + 1 dimensions are
Z
i
dx Ψcont cont
+ Ψcont cont
L= L (∂0 − ∂1 )ΨL R (∂0 + ∂1 )ΨR
2
Z
i
dx Ψcont cont
− Ψcont cont
H= L ∂1 ΨL R ∂ 1 ΨR , (8.22)
2
and its canonical anticommutation relation is
{Ψcont
i (0, x), Ψcont
j (0, y)} = δij δ(x − y). (8.23)
As in the scalar case we can introduce a latticized version
√
ΨL,R (n) = aΨcontL,R (na) (8.24)
of our spinor field, which in the Dirac case obeys the anticommutation relation
{Ψi (n), Ψ∗j (m)} = δij δnm (8.25)
and in the Majorana case obeys
{Ψi (n), Ψj (m)} = δij δnm . (8.26)
What about the Hamiltonian? Perhaps the most obvious choice of Hamiltonian for a single right-moving
Majorana-Weyl fermion ΨR is
iX ΨR (n + 1) − ΨR (n)
− ΨR (n) , (8.27)
2 n a
but this isn’t actually hermitian since the second term is an (infinite) imaginary constant (remember that
ΨR (n)2 = 12 by the anticommutation relations). On the other hand we don’t care about an additive constant
in the Hamiltonian, so we can instead take the Hamiltonian to be
i X
H=− ΨR (n)ΨR (n + 1). (8.28)
2a n
We can make this look more like the continuum Hamiltonian by writing it as
iX ΨR (n + 1) − ΨR (n − 1)
H=− ΨR (n) , (8.29)
2 n 2a
which is now manifestly hermitian since the dagger just exchanges the two sums. The quantity
ΨR (n + 1) − ΨR (n − 1)
(8.30)
2a
is called the symmetric derivative. It is actually a better approximation to the continuum derivative than
the naive discrete derivative (8.9), since if f is a smooth function we have
f (x + ϵ) − f (x − ϵ)
= f ′ (x) + O(ϵ2 ). (8.31)
2ϵ
The equation of motion for this Hamiltonian is
Ψ̇R (t, n) = i[H, ΨR (t, n)]
1 X
= [ΨR (t, m)ΨR (t, m + 1), ΨR (t, n)]
2a m
ΨR (t, n + 1) − ΨR (t, n − 1)
− , (8.32)
2a
77
1.0
0.5
-3 -2 -1 1 2 3
-0.5
-1.0
Figure 17: The dispersion relation ω(k) for a right-moving lattice Majorana-Weyl spinor ΨR in 1 + 1 di-
mensions, in units where a = 1. Note in particular the extra low-energy modes near k ≈ π, these are the
left-moving doublers.
which is a discrete version of the Dirac equation (8.21) with the spatial derivative replaced by the symmetric
derivative.
We can find the dispersion relation as in the scalar case, by looking for solutions of the form
eika − e−ika
−iω = − , (8.34)
2a
or in other words
sin(ka)
ω= . (8.35)
a
This dispersion relation is plotted in figure 17. Note that for a spinor it is ω which is an analytic function
of k, while for the scalar it was ω 2 . In particular to get ω > 0 we should take k > 0, which reflects the
right-moving nature of the spinor field. For small k we have
ω ≈ k, (8.36)
which is precisely the continuum dispersion relation for a right-moving continuum fermion. On the other
hand we now have a surprise: there are more light excitations near k = πa ! We can get a sense of what these
excitations look like by defining ψR (n) = eiπn χ(n) and k = πa − κ with 0 < κa ≪ 1, in terms of which have
with
ω ≈ κ. (8.38)
In other words these are precisely the low-energy excitations of a left-moving fermion; our attempt to
construct a lattice version of a one-component right-moving Majorana-Weyl spinor has instead resulted
in a theory whose low-energy description is a two-component Majorana spinor with both left-moving and
right-moving fields described by the continuum Hamiltonian (8.22)! This phenomenon is called fermion
doubling, and the extra left-moving mode we generated is called a fermion doubler.
78
8.3 Nielsen-Ninomiya theorem
You may be wondering if the fermion doubling phenomenon we just discovered can be avoided by some more
clever choice of the lattice Hamiltonian, or more generally if it is only a problem in 1 + 1 dimensions. In
fact fermion doubling is a quite general phenomenon, which is present in any spacetime dimension. It is
particularly robust in even spacetime dimensions, where it is formalized in the Nielsen-Ninomiya theorem.
This says that any weakly-interacting lattice fermion theory in even spacetime dimensions which respects
locality, lattice translation symmetry, and chiral symmetry necessarily has an equal number of upper and
lower component massless Weyl fermions at low energies. In other words you can’t put chiral theories on a
lattice. We won’t get into the general proof of the theorem here, but I’ll give you the essence of the argument
for the 1 + 1 dimensional case we just discussed.29
Let’s consider a rather general lattice Hamiltonian
1X
H= Ψ(n)K(n − m)Ψ(m), (8.39)
2 n,m
and we can take K(n − m) = −K(m − n). Hermiticity requires K to be pure imaginary, and the fact that K
depends only on n − m is the a consequence of lattice translation invariance. You will show on the homework
that the equation of motion for this system is
X
Ψ̇(t, n) = i K(m − n)Ψ(t, m), (8.41)
m
The right-hand side of this equation is just the discrete Fourier transform of K. Note in particular that we
have the periodicity
2π
ω k+ = ω(k), (8.44)
a
which is again a consequence of lattice translation symmetry.
To implement the idea of locality, we should require K(n) to vanish at large n. In simple lattice models
such as the one we considered in the previous section K has compact support, which certainly implies this
vanishing, but we can relax that assumption quite a bit and stillP prove the theorem. The most convenient
assumption to make is that K falls off fast enough that the sum m K(m)m is absolutely convergent, which
implies that the Fourier transform of K has a continuous first derivative. From (8.43), we therefore see that
ω(k) is a continuously differentiable map from S1 to R.
The Nielsen-Ninomiya theorem now follows immediately from the observation that any such function
must cross the surface ω = 0 an even number of times to make sure you get back to where you started as
you go around the circle. Moreover half of those crossings must go from negative to positive as you increase
k, while the other half must go from positive to negative (see figure 18 for an illustration). Crossings of the
former type give rise to right-movers, while crossings of the latter type give rise to left-movers. Thus the
number of left-movers and right-movers must be equal!
29 There is one sense in which 1 + 1 is special: the antiparticle of a left/right mover is also a left/right mover, while in 3 + 1
dimensions the antiparticle of a left/right handed particle is right/left handed. This means that in the argument we now give
we won’t need to assume anything about chiral symmetry, while in 3 + 1 dimensions one needs to.
79
Figure 18: The Nielsen-Ninomiya theorem in 1 + 1 dimensions: a fermion dispersion relation ω(k) must be
periodic and have continuous first derivative, so it crosses ω = 0 an equal number of times in either direction.
Low-energy right-movers are shaded red and low-energy left-movers are shaded blue.
Fermion doubling has a rather embarrassing consequence: the standard model of particle physics is a
chiral theory, so here in 2024 we still don’t know how to write down a lattice model which regulates the
standard model at short distances. This means that we don’t actually have a non-perturbatively good
definition of the standard model, which is a bit unsettling since it is supposed to be part of our current
understanding of the fundamental laws of nature.
There is much more that could be said about lattice fermions, in particular there are many proposals for
how to minimize the number of doublers that all have various advantages and disadvantages, but we will
have to leave it here. Will Detmold is teaching a class on lattice field theory next semester, so that is one
place to go if you want more!
This gives a model of a two-dimensional classical ferromagnet with an “easy” direction of magnetization,
which is to be distinguished from the Heisenberg model which allows the spin to point in any direction.
At zero temperature the Ising model has two obvious ground states: we can have all of the spins to be
up (i.e. σ = 1 everywhere), or we can have all of the spins to be down (i.e. σ = −1 everywhere). Either
of these configurations is called “ordered”. On the other hand at infinite temperature all states are equally
likely, so each spin is independently random and we say the system is “disordered”. At finite temperature
things are more interesting. In the partition function
X
Z(β) = e−βE[σ] (8.47)
σ
80
Figure 19: Flipping a cluster in the Ising model with perimeter L = 12. At high temperature it is entropically
favorable to do this even though it increases the energy at the perimeter.
there is a battle between energy and entropy: misaligning some of the spins costs energy, and is thus punished
by the Boltzmann factor e−βE , but the more spins are misaligned the more ways there are of misaligning
them and this can compete with the Boltzmann suppression at high enough temperature (remember that
β = 1/T ). There is a beautiful argument due to Peierls that there is some nonzero temperature below which
the energy wins, ordering the system (almost all spins up or down), and some other finite temperature above
which the entropy wins and the system is disordered. The idea is to look at the energy cost of flipping a
connected cluster of spins compared to the gain in entropy of doing so. Say that the spins are currently all
aligned. The cost in energy of flipping a cluster whose boundary perimeter (meaning the number of bonds
which are misaligned) is L is
∆E = 2JL, (8.48)
while we can upper bound the number N (L) of clusters of perimeter L containing some fixed lattice site
(nx , ny ) by saying that as we go around the perimeter for each link we need to choose one of three directions
to go in, which gives
N (L) < 3L . (8.49)
Therefore when
2βJ > log 3 (8.50)
the gain in entropy from flipping a cluster can’t possibly balance the energy cost of doing so, so it is better
not to flip and the system is ordered. See figure 19 for an illustration. On the other hand, although it is
hard to compute N (L) exactly it isn’t too hard to argue that it is lower bounded by
for some 1 < C < 3 (basically the constraints that the perimeter needs to close and not self-intersect aren’t
strong enough to completely kill the exponential from choosing which direction to go at each step). This
means that in the partition function the gain in entropy will certainly beat the energy cost if we have
in which case many clusters will form and the system will disorder.
It is harder to say what happens in the intermediate temperature window
2J 2J
<T < (8.53)
log 3 log C
81
between these two bounds, but in fact what happens is that there is a single critical temperature
2J
Tc = √ (8.54)
log(1 + 2)
above which the system is disordered and below which it is ordered. Above the critical temperature the
spin-spin correlation function decays exponentially with distance,
where ξ is called the correlation length. At the critical temperature the correlation length goes to infinity
and the system becomes scale-invariant, so the spin-spin correlation function can only decay as a power:
1
⟨σ(0)σ(x)⟩ ∼ . (8.56)
|x|η
The dimensionless number η is another example of a critical exponent, similar to the critical exponent ν
that we computed approximately in the three-dimensional Ising model using the ϵ-expansion at the end of
the previous semester. As in that case, we can rewrite η in terms of the anomalous dimension of the spin
operator σ:
η = 2∆σ . (8.57)
Computing these critical exponents is the great achievement of Onsager’s solution, and in particular there
is the famous formula
1
∆σ = . (8.58)
8
In the remainder of this section we will see how the Ising model can be rewritten in terms of a free lattice
fermion, which makes it clear why it is soluble.30
As a side comment, it is great fun to produce samples of the 2D Ising model using the Metropolis
algorithm. Starting from an arbitrary initial configuration this works by going through the sample and
flipping each spin either if doing so decreases the total energy or with probability e−β∆E even if it doesn’t.
This algorithm converges towards a typical sample from the thermal distribution with inverse temperature
β rather quickly, giving a powerful way to study statistical systems numerically. See figure 20 for some Ising
samples generated this way.
8.5 Transfer matrix and the Hamiltonian formulation for the 1D Ising model
One of the most powerful ideas in statistical mechanics is the transfer matrix, which in many cases allows us
to convert a classical statistical system at finite temperature in D spatial dimensions to a quantum system
at zero temperature in D Euclidean spacetime dimensions. Roughly speaking this limit replaces thermal
fluctuations by quantum fluctuations, with ℏ being the effective “temperature”. As a warmup let’s first see
how this works for the one-dimensional classical Ising model, which has energy functional
JX
E[σ] = (σ(n + 1) − σ(n))2 . (8.59)
2 n
Here I’ve taken the liberty of shifting the ground state energy to be zero, which simplifies life. The thermal
partition function of this system is X
Z(β) = e−βE[σ] , (8.60)
σ
30 The realization that the model can be rewritten in terms of free fermions goes back to Schultz, Mattis, and Lieb in 1964.
Elliott Lieb is a legendary mathematical physicist, who also proved the strong subadditivity of von Neumann entropy and
important results on the quantum stability of matter. He is still around (he’s 92), I had fun talking to him when I was a
postdoc at Princeton and I just saw him last month when I visited!
82
Final Spin Configuration Final Spin Configuration Final Spin Configuration
Figure 20: Samples of the 2D Ising model on a 200 × 200 lattice generated by the Metropolis algorithm
starting from random spins. On the left we have T = 4J, in the center we are at the critical temperature
2J √
T = log(1+ 2)
≈ 2.269J, and on the right we have T = .5J. At high temperature you can see the spins stay
disordered, at the critical temperature blocks of arbitrary sizes are forming and disappearing, while at low
temperature the spins are coalescing into large aligned regions which will eventually merge and order the
magnet.
and if we study this with periodic boundary conditions (i.e. so that σ(N + 1) = σ(1)) then we can write this
partition function as
Z(β) = Tr T N
(8.61)
where βJ
(σ ′ −σ)2
Tσ′ ,σ = e− 2 (8.62)
is called the transfer matrix. We can think of T as the contribution to the partition function from the link
connecting spin σ to spin σ ′ , and the trace arises because of our periodic boundary conditions. In matrix
notation we have
e−2βJ
1
T = −2βJ . (8.63)
e 1
The connection to quantum mechanics arises because when βJ ≫ 1 we can view T as the infinitesimal
Euclidean evolution by a Hamiltonian:
T ≈ 1 − ϵH + O(ϵ2 ), (8.64)
with
ϵ = e−2βJ
H = −σx . (8.65)
We can then take the continuum limit ϵ → 0, in which case a correlation function involving Euclidean time
separation τ comes from a lattice separation ∆n given by
τ
∆n = . (8.66)
ϵ
That the “bare coupling” βJ goes to infinity in this limit is an example of renormalization group flow. By
construction correlation functions computed in the continuum limit will match those computed at distances
which are large compared to the lattice scale in the original model. Indeed we can compute the spin-spin
correlation function: the ground state is simply
1
|Ω⟩ = √ (| ↑⟩ + | ↓⟩) , (8.67)
2
83
and the spin-spin correlation function is
which indeed decays exponentially at large τ as we would expect for a disordered phase (the 1D Ising model
is always disordered).
We can again write the partition function with periodic boundary conditions in the y direction as
Z(β) = Tr T Ny ,
(8.70)
Here you should think of σ(nx ) as the spins in row ny and σ ′ (nx ) as the spins in the row ny + 1. The transfer
matrix includes the Ising interactions for the links connecting these two rows (the first term) and also the
Ising interactions within the ny row. Multiplying the transfer matrix together Ny times and then taking the
trace thus accounts for each interaction once.
To get a quantum interpretation of the transfer matrix we again want to take the limit of continuous
time, which we will arbitrarily interpret to be the y direction. We can guess the right way to take this limit
by noting that if σ and σ ′ differ at m sites then we have
" #
−2mβJy βJx X 2
Tσ′ ,σ = e exp − (σ(nx + 1) − σ(nx )) . (8.72)
2 n
x
βJy → ∞
βJx → λe−2βJy (8.74)
ϵ = e−2βJy
X λX 2
H=− σx (n) + (σz (n + 1) − σz (n))
n
2 n
X X
=− σx (n) − λ σz (n)σz (n + 1) + constant. (8.75)
n n
84
This Hamiltonian is traditionally called the 1D Ising model in a transverse field, or sometimes the
quantum Ising chain. The two terms in the Hamiltonian do not commute, so the ground state is a tug of
war between them. When λ is small the ground state is just the product of the σx = 1 state for each spin,
with no long-range correlation, while when λ is large then the spins all want to be aligned in the σz basis.
This of course is just the phase structure we found for the original 2D Ising model, which apparently has
survived the quantum limit! In particular we should expect a phase transition at some critical value of λ;
we will now see that the transition happens at λ = 1 and is governed by a free Majorana fermion.
−i X
H= Ψ(n)Ψ(n + 1). (8.76)
2a n
We’ve already seen that at low energies this theory is described by the continuum free massless Majorana
fermion theory (8.22) with both left and right moving fields Ψcont
L/R due to fermion doubling. This theory is
scale-invariant, so its correlation functions decay as powers just as we saw for the 2D Ising model at the
critical point. In fact they are the same powers, as we will now see.
The key idea is to use our old Jordan-Wigner transformation to rewrite these Majorana fermions in terms
of qubits. We saw back in the second section that we can represent the algebra of 2N Majorana fermions on
the Hilbert space of N qubits, with the fermions represented by products of Pauli operators. More concretely
we have a qubit for each even n, and we have the following representations:
σx
ψ(2m) = . . . σz ⊗ σz ⊗ √ ⊗ I ⊗ I ⊗ . . .
2
σy
ψ(2m + 1) = . . . σz ⊗ σz ⊗ √ ⊗ I ⊗ I ⊗ . . . , (8.77)
2
where the σx and σy are acting on the mth qubit. We can therefore rewrite the Hamiltonian (8.76) as
i X
H=− Ψ(2m)Ψ(2m + 1) + Ψ(2m − 1)Ψ(2m)
2a m
1 X
= σz (m) + σx (m − 1)σx (m) . (8.78)
4a m
We can act with a rotation by π/2 in the zx plane and shift the sum in the second term to rewrite this as
1
H= − σx (m) + σz (m)σz (m + 1) (8.79)
4a
which (up to an overall rescaling) is precisely the transverse field Ising Hamiltonian (8.75)! More precisely,
it is the transverse field Ising Hamiltonian with λ = 1, which we can thus identify as the critical point since
we’ve now shown it has a massless dispersion relation at low energy. Therefore all the critical exponents
of the 2D Ising model can be computed using the free Majorana fermion theory (8.22), and indeed they
agree with the critical exponents from Onsager’s solution and also with experiment! There is much more
that could be said about this, but we need to get on to electromagnetism so I’ll content myself with a few
comments:
You might be wondering what happens in the fermion description if we set λ ̸= 1 in the transverse
field Ising description. Working this out shows that it introduces a mass in the fermion theory, so the
correlation length indeed becomes finite away from criticality.
85
You may be concerned that the anisotropic continuous-time limit we used to derive the transverse-
field Ising Hamiltonian is quite different from the square lattice isotropic Ising model we started with.
The reason we ended up with the same critical exponents is the renormalization group: the details
of the lattice structure (such as whether or not it is isotropic) do not matter in the continuum since
by Polchinski’s theorem they are irrelevant. On the other hand questions which specifically refer to
bare parameters, such as the transition temperature Tc (or the transition value of λ) do depend on the
lattice details and therefore cannot be computed from the continuum free field theory.
The next thing we’d do if we had more time is the free field computation of the spin-spin correlation
function to recover Onsager’s famous ∆σ = 1/8. This is trickier than you might expect however, as
it isn’t so obvious how to represent the spin operator σz (m) in terms of the continuum Majorana
fermion. The correct thing to do is to remove a small disk in the vicinity of the operator from the
path integral and then impose boundary conditions at the edge of the disk where the fermion changes
sign as you go around it.31 To compute the two-point function the best thing to do is make use of the
conformal symmetry of the problem, which relates the scaling dimension of this operator to the energy
of the ground state on a spatial circle with periodic boundary conditions for the fermion. This ground
state energy is UV-divergent, but can be renormalized to give a nice finite answer called the Casimir
energy. When the dust settles you indeed find ∆σ = 1/8.
Problems:
1. Confirm the equation of motion (8.41) and the dispersion relation (8.43) for the general lattice Hamil-
tonian (8.39), and also check that the frequency ω(k) is real.
it is important to realize that the fermion parity symmetry Ψ′L/R = −ΨL/R must be treated as a gauge symmetry (see the
next section) in order for the match between the critical Ising and massless Majorana Hamiltonians to work. Essentially this is
because of the Jordan-Wigner strings - they should really be thought of as “Wilson lines” for fermion parity, and in particular
if we work with periodic boundary conditions in space then the transverse Ising Hilbert space only allows the fermions to act
in pairs.
86
Infrared divergences: The scattering theory we have presented so far, based on the LSZ formula,
works best when all particles are massive. In QED the photon is massless, which leads to new compli-
cations in defining asymptotic states. In particular an electron can never be found in pure isolation,
it always needs to carry around its Coulomb field, and this field is made out of a large number of
photons. We will see that this implies that the S-matrix element from some fixed number of photons
and electrons to some other fixed number is always zero, since the probability is one that an infinite
number of “soft” photons are radiated in any scattering process. We therefore need to learn how to do
scattering with infinite numbers of particles in the initial/final state.
We clearly have our work cut out for us, so let’s begin.
⃗ = ρ
⃗ ·E
∇
ϵ0
⃗ ⃗
∇·B =0
⃗ ×E
∇ ⃗˙
⃗ = −B
⃗ ×B
∇ ⃗˙
⃗ = µ0 J⃗ + ϵ0 µ0 E. (9.1)
⃗ ×E
∇ ⃗ = −1B ⃗˙
c
⃗ = 1 J⃗ + E ⃗˙
⃗ ×B
∇
c
⃗ + q ⃗v × B.
F⃗ = q E ⃗ (9.3)
c
We will of course also set c = 1, in which case you can remember that to get to Heaviside-Lorentz units
(with c = 1) you can just set ϵ0 = µ0 = 1 - every physics undergraduate’s dream come true!
⃗
You hopefully also learned that you can automatically solve two of Maxwell’s equations by writing E
and B⃗ in terms of a scalar potential ϕ and a vector potential A,
⃗ via
⃗ =∇
B ⃗ ×A
⃗
⃗ = −∇ϕ
E ⃗˙
⃗ − A. (9.4)
⃗ and B.
where Ω is an arbitrary function of space and time, without changing E ⃗
87
⃗ into a one-form
This all looks much nicer if we adopt more relativistic notation. We can combine ϕ and A
gauge field as
Aµ = (−ϕ, A),⃗ (9.6)
in terms of which the gauge transformation is simply
Fµν = ∂µ Aν − ∂ν Aµ , (9.9)
∂ν F µν = J µ
∂α Fβγ + ∂β Fγα + ∂γ Fαβ = 0, (9.12)
with the second line following automatically from the expression for F in terms of A. You showed on the
homework last semester that these equations follow from the Maxwell Lagrangian
1
L = − Fµν F µν + Aµ J µ , (9.13)
4
where Aµ is treated as the fundamental dynamical variable. In this section we will treat J µ as a background
source for Aµ , while in the next section we will build J µ out of dynamical charged fields. Equations (9.12)
and (9.13) are valid in any dimension, so from now on we will consider Maxwell theory in d spacetime
dimensions. We will soon see that the theory is inconsistent unless J µ obeys the conservation equation
∂µ J µ = 0, which is a reasonable requirement for a background current. One indication of this is that with
this requirement the Lagrangian density is invariant under gauge transformations up to a total derivative:
88
(2) The canonical conjugate of Ai is
∂L
Πi = = F i0 = −E i , (9.15)
∂∂0 Ai
so we can write the 0th component of the equation of motion as
∂i F 0i = −∂i Πi = ρ. (9.16)
This of course is just Gauss’s law, but the the point here is that it doesn’t involve the time derivative of
Πi and thus gives a constraint on the canonical variables Ai and Πi instead of a genuinely dynamical
equation of motion. This constraint again is not compatible with the canonical commutation relations
dd−1 xAi (⃗
x)f i (⃗
R
⃗ independently by conjugating by ei
as these would allow us to adjust each component of Π x)
i
with f arbitrary.
We can view the condition Π0 = 0 as a constraint as well, so the basic problem we need to solve in order
to come up with a sensible Hamiltonian formulation of Maxwell theory is to figure out how to handle the
constraints
Π0 = 0
∂i Πi + ρ = 0. (9.18)
The pair of constraints (9.18) are an example of what are called first-class constraints, which means
that if we compute the commutators of the quantities on the left-hand side of the constraint equations using
the naive canonical commutation relation
we get something which vanishes after imposing the constraints. That is clearly true here, as the constraints
only involve Πµ and not Aµ . First class constraints are always related to gauge symmetries, essentially be-
cause their commutators form a Lie algebra that we can exponentiate to construct the gauge transformations
and they always commute with the Hamiltonian since it is gauge-invariant.32 We can check explicitly here
that they generate the gauge symmetry:33
Z
d−1 ′ ′ 0 ′
i d x Ω̇(⃗x )Π (⃗x ), A0 (⃗x) = Ω̇(⃗x)
Z
d−1 ′ ′ j ′ ′
i − d x Ω(⃗x ) ∂j Π (⃗x ) + ρ(⃗x ) , Ai (⃗x) = ∂i Ω(⃗x). (9.20)
In the second of these we neglected a boundary term; this is justified for gauge transformations which vanish
at infinity. We will consider the case of gauge transformations which don’t vanish at infinity later in the
section. In the meantime we have learned the following: in order to have a well-posed initial value problem
in a system with first-class constraints, it is necessary to view gauge transformations which vanish at infinity
as redundancies of description rather than physical transformations. There are two standard ways to do this:
Gauge-fixing: Impose some kind of additional requirement on the dynamical variables which removes
the gauge symmetry. For example in electrodynamics we can impose the Coulomb gauge condition
⃗ ·A
∇ ⃗ = 0, which we will shortly argue removes the gauge freedom. We can then solve the constraints
explicitly (including the gauge-fixing condition) to get an unconstrained system.
32 There is a converse to this statement called Noether’s second theorem, which in this language says that if the Lagrangian
has a gauge symmetry then it is generated by first-class constraints.
33 In the second of these we neglected a boundary term, we will be more careful about it later in the section.
89
Quotient by gauge transformations: Apply an equivalence relation Aµ ∼ Aµ + ∂µ Ω to the set of
solutions, so that they are only defined modulo gauge transformations which vanish at infinity. Physical
observables are then required to be gauge-invariant, in which case there is a good initial value problem
for all observables. Quantum mechanically we start with a larger Hilbert space that doesn’t obey the
constraints and then we restrict to the set of gauge-invariant states which are annihilated by them.
The first method is more standard in practice, but it has two unfortunate aspects. The first is that fixing
the gauge breaks manifest Lorentz invariance, leading to unwieldy expressions which magically end up
being Lorentz-invariant at the end of the calculation. The second is that gauge-fixing introduces apparent
non-locality in the Hamiltonian, which again magically cancels at the end of the calculation. For these
reasons gauge-fixing is often a source of confusion in quantum field theory, for students and researchers alike.
The second method is more elegant, as it preserves manifest Lorentz invariance and locality at every step,
but it is also more abstract and in particular it requires us to introduce a larger Hilbert space that includes
“unphysical” states. We then need to confirm that our evolution does not mix unphysical states with physical
states. Moreover some calculations in the Hamiltonian formalism (which anyways breaks covariance) are
easier once we fix the gauge. Our approach here will be to take the second approach as fundamental and
derive the first approach from within it, which I think is the most enlightening way to proceed: we never
need to worry about whether the theory we are defining is Lorentz-invariant and local, but we are free to fix
the gauge whenever doing so is convenient.
δ
Πµ (⃗x) = −i , (9.21)
δAµ (⃗x)
which by construction obeys the naive canonical commutation relation (9.19). We then construct the true
Hilbert space H by restricting to states which are annihilated by the two constraints (9.18). Since these
constraints generate the gauge symmetry, this is the same as restricting to the set of gauge-invariant states.
People thus often use the terms “physical Hilbert space” or “gauge-invariant Hilbert space” to describe
H. The set of states in Hbig which are annihilated by Π0 is quite simple: it is the functionals which are
independent of $A_0$. We therefore can just write the wave functional as $\Psi[\vec A]$. The physical states are then
those wave functionals obeying
$$i\partial_j\frac{\delta}{\delta A_j(\vec x)}\Psi[\vec A] = -\rho(\vec x)\Psi[\vec A]. \tag{9.22}$$
We also need to define a Hamiltonian on H. We can first try to construct a Hamiltonian on Hbig in the
usual way:
$$\begin{aligned}
H &= \int d^{d-1}x\,\Big[\dot A_0\Pi^0 + \dot{\vec A}\cdot\vec\Pi + \tfrac{1}{2}F^{0i}F_{0i} + \tfrac{1}{4}F_{ij}F^{ij} - A_\mu J^\mu\Big]\\
&= \int d^{d-1}x\,\Big[\tfrac{1}{2}\vec\Pi\cdot\vec\Pi + \tfrac{1}{4}F_{ij}F^{ij} - \vec A\cdot\vec J + \vec\nabla\cdot(A_0\vec\Pi) - A_0\big(J^0 + \vec\nabla\cdot\vec\Pi\big) + \dot A_0\Pi^0\Big]. \tag{9.23}
\end{aligned}$$
We can drop the total derivative term by assuming boundary conditions where A0 → 0 at ∞. This Hamil-
tonian however is not well-defined on Hbig , because Ȧ0 is not something we know how to build out of Aµ
and Πµ . Indeed if we compute the commutator of H with A0 we just get
which although true is not very helpful. This is a concrete illustration of our inability to define a good time
evolution on the full Hilbert space Hbig . On the other hand there is no such obstruction to defining the
Hamiltonian on the physical Hilbert space H: since the last two terms in (9.23) are both proportional to
the constraints, they vanish when H acts on states in H. We therefore can take the Hamiltonian on H to
simply be
$$H = \int d^{d-1}x\,\Big[\tfrac{1}{2}\vec\Pi\cdot\vec\Pi + \tfrac{1}{4}F_{ij}F^{ij} - \vec A\cdot\vec J\Big]. \tag{9.25}$$
For the theory to be consistent, we need to check that this Hamiltonian evolves gauge-invariant states
to gauge-invariant states. This is a bit subtle in general because if ρ is time-dependent then the Gauss
constraint is also time-dependent, so in the Schrodinger picture the gauge-invariant subspace changes with
time. The condition we need is that
$$\big(\vec\nabla\cdot\vec\Pi(\vec x) + \rho(t+\epsilon,\vec x)\big)e^{-i\epsilon H(t)} = e^{-i\epsilon H(t)}\big(\vec\nabla\cdot\vec\Pi(\vec x) + \rho(t,\vec x)\big), \tag{9.26}$$
which to first order in $\epsilon$ holds provided that
$$i\big[H(t),\vec\nabla\cdot\vec\Pi(\vec x)\big] = -\dot\rho(t,\vec x). \tag{9.27}$$
Computing the commutator using (9.25), we indeed find
$$i\big[H(t),\vec\nabla\cdot\vec\Pi(\vec x)\big] = \partial_i J^i(t,\vec x) = -\dot\rho(t,\vec x), \tag{9.28}$$
where the last equality uses the conservation of $J^\mu$.
Note that when ρ̇ ̸= 0 this means that the Hamiltonian isn’t gauge-invariant. This is an artifact of treating
J µ as a background field; once we build J µ out of dynamical fields then this gauge transformation for
the Hamiltonian will be canceled by a compensating transformation from the matter Lagrangian so H will
be gauge-invariant. It is important to emphasize that this argument only works if $J^\mu$ is conserved: it is
inconsistent to try to quantize a Maxwell field coupled to a current which isn't conserved.
$$\partial_0 F^{01} = \partial_1 F^{01} = 0, \tag{9.29}$$
since we must have Ω(0) = Ω(L) because we are on a circle. The classical phase space of this system is
two-dimensional, and in fact h and −E are canonical conjugates:
The Hamiltonian (9.25) is simply
$$H = \frac{L}{2}E^2, \tag{9.34}$$
so Maxwell theory in 1 + 1 dimensions is the same as the problem of the quantum mechanics of a non-relativistic particle on an infinite line! There are a few interesting things about this example. First of all there
are no propagating photons - naively Aµ has two independent components which could support propagating
waves, but the gauge constraints removed both of them. There is however still a single remaining degree of
freedom, the holonomy h and its canonical conjugate −E. h is particularly interesting because it is nonlocal;
in gauge theories nonlocal observables are fairly often of interest. Another worthwhile observation is that
the energy grows linearly with the system size L; we will eventually see that this is the mechanism behind
the confinement of quarks and gluons into hadrons in quantum chromodynamics (QCD), which is the theory
of the strong nuclear force.
The holonomy h brings up an interesting question about the foundations of quantum electrodynamics.
In classical electromagnetism we learn that the fundamental degrees of freedom are the electric and magnetic
fields, with Aµ something of a mathematical afterthought. The holonomy h is novel in this regard because
there is no way to express it in terms of electric and magnetic fields. Indeed in 1 + 1 dimensions there is
no magnetic field, and h is clearly independent of E since they don’t commute. Thus there apparently is
more to electromagnetism than just $\vec E$ and $\vec B$! On the other hand, is the holonomy really measurable?
In fact it is, as is beautifully illustrated by the Aharonov-Bohm effect. Say we have a non-relativistic
charged particle of mass m moving on a circle of circumference L, about which the holonomy h is not zero.
Without loss of generality we can take A1 to be constant, in which case h = LA1 . The Hamiltonian for a
non-relativistic particle moving in an electromagnetic field is
$$H = \frac{|p - qA|^2}{2m} - qA_0, \tag{9.35}$$
which in this case is just
$$H = \frac{(p - qh/L)^2}{2m}. \tag{9.36}$$
The periodic boundary conditions require p to be quantized so that the wave function is single-valued, with
quantization
$$p = \frac{2\pi n}{L}, \tag{9.37}$$
so the energy levels of this system are
$$E_n = \frac{(2\pi n - qh)^2}{2mL^2}. \tag{9.38}$$
Note in particular that they depend on h, which thus can indeed be measured by looking at the time evolution
of the relative phase of the wave function in a superposition of two energy eigenstates.
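To make the h-dependence of the spectrum (9.38) concrete, here is a minimal numerical sketch (the values of q, m, and L are illustrative, nothing physical is assumed); it also checks that shifting h by $2\pi/q$ only relabels the levels n, so only h modulo $2\pi/q$ is observable:

```python
import numpy as np

def levels(h, q=1.0, m=1.0, L=1.0, nmax=20):
    """Energy levels E_n = (2*pi*n - q*h)**2 / (2*m*L**2) of eq. (9.38)."""
    n = np.arange(-nmax, nmax + 1)
    return np.sort((2 * np.pi * n - q * h) ** 2 / (2 * m * L ** 2))

print(levels(h=0.0)[:4])   # spectrum with vanishing holonomy
print(levels(h=1.0)[:4])   # the low-lying levels shift: h is observable
# shifting h by 2*pi/q relabels n, so the low-lying spectrum is unchanged:
print(np.allclose(levels(h=1.0)[:10], levels(h=1.0 + 2 * np.pi)[:10]))
```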
Consider a plane-wave solution $A_\mu(x) = e_\mu e^{ik\cdot x}$, where $e_\mu$ is called the polarization one-form. Substituting this into Maxwell's equation (with $J = 0$) we find
$$k^\mu\big(k_\mu e_\nu - k_\nu e_\mu\big) = k^2 e_\nu - (e\cdot k)k_\nu = 0. \tag{9.40}$$
Let’s first consider the case where k 2 ̸= 0. This equation then says that eν ∝ kν . Solutions of this type are
“pure gauge”, since if eµ = αkµ then we have
$$A_\mu = -i\partial_\mu\big(\alpha e^{ik\cdot x}\big). \tag{9.41}$$
Pure gauge solutions do not contribute to the field strength tensor $F_{\mu\nu}$, as we can confirm from the expression
$$F_{\mu\nu} = i\big(k_\mu e_\nu - k_\nu e_\mu\big)e^{ik\cdot x},$$
which vanishes if $e_\mu \propto k_\mu$. What about solutions with $k^2 = 0$? The equation of motion (9.40) then tells
us that we must either have kν = 0 or else k null with e · k = 0. The holonomy in the previous section is
an example of the former kind of solution, but in infinite volume we will adopt boundary conditions where
A → 0 at infinity so k = 0 solutions can’t be excited. We are thus left with the solutions with null k obeying
e · k = 0. (9.43)
To parametrize the set of e which obey this equation we can find a basis of d − 1 linearly independent vectors
which are orthogonal to k, but these don’t all describe distinct physical polarization states. Indeed since k
is null the polarization vectors eµ and eµ + αkµ are both orthogonal to k but they lead to the same field
strength tensor Fµν . We therefore should impose an equivalence relation
eµ ∼ eµ + αkµ (9.44)
on the set of polarization one-forms, which leads to a basis of d − 2 distinct polarizations. These of course
are the d − 2 polarization states of a helicity-one particle, the photon, in d spacetime dimensions.
To understand photon polarization more explicitly, we can recall our usual rule for the Lorentz trans-
formations of polarization tensors. This is that for each little group element Λ obeying Λk = k, with k the
reference momentum, we have
$$D_{aa'}(\Lambda)u^{a'}(\vec k, \sigma) = u^{a}(\vec k, \sigma')\hat D_{\sigma'\sigma}(\Lambda). \tag{9.45}$$
The little group of a massless particle is generated by the subset SO(d − 2) of the rotation group that
preserves the direction of the reference momentum, together with a set of “null rotations” generated by
the combinations of a rotation that changes the direction of k together with a boost that changes it back.
Specializing to d = 4 and taking the reference momentum to point in the z direction,
We can figure out the helicity basis from the infinitesimal version of (9.45) (replacing u → e),
$$\big(J^{12}\big)^\mu{}_\nu\, e^\nu(\vec k, \pm) = \pm e^\mu(\vec k, \pm), \tag{9.49}$$
which tells us that polarization vectors of definite helicity are
$$e^\mu(\vec k, \pm) = \frac{1}{\sqrt 2}\begin{pmatrix}0\\1\\\pm i\\0\end{pmatrix} + \alpha_\pm k^\mu \tag{9.50}$$
with $\alpha_\pm$ arbitrary. You've hopefully seen these polarization vectors before: $\sigma = +1/-1$ describes right/left-handed circular polarization. As usual the polarization vectors for general momenta are then obtained by
acting on these with an arbitrary Lorentz transformation Lp that maps k to p:
The action of the rest of the little group on $e^\mu(\vec k, \sigma)$ is more surprising. When we discussed the little group for massless particles last semester, we took the generators $J^{13} + J^{01}$ and $J^{23} + J^{02}$ to act trivially in the representation $\hat D$, since otherwise a massless particle would need to have an infinite number of helicity states. This however is not what happens with the helicity vectors (9.50); instead we have
$$\Big[\beta\big(J^{13} + J^{01}\big) + \gamma\big(J^{23} + J^{02}\big)\Big]^\mu{}_\nu\, e^\nu(\vec k, \pm) = \frac{i(\beta \pm i\gamma)}{\sqrt 2}\,k^\mu. \tag{9.52}$$
So in other words (9.45) is actually impossible to satisfy for a massless particle of helicity one created/annihilated
by a one-form gauge field! We can rephrase this by noting that if we try to define a “physical” Heisenberg
gauge field
$$A^{\rm phys}_\mu(x) = \sum_\sigma\int \frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\Big[e_\mu(\vec p, \sigma)a_{\vec p,\sigma}e^{ip\cdot x} + e^*_\mu(\vec p, \sigma)a^\dagger_{\vec p,\sigma}e^{-ip\cdot x}\Big] \tag{9.53}$$
out of creation and annihilation operators on the gauge-invariant Fock space of helicity-one photons with
some particular choice of eµ , under Lorentz transformations we have
$$U(\Lambda)^\dagger A^{\rm phys}_\mu(x)U(\Lambda) = \Lambda_\mu{}^\nu A^{\rm phys}_\nu(\Lambda^{-1}x) + \partial_\mu\Omega(x) \tag{9.54}$$
with $\Omega(x)$ a field-dependent gauge parameter built out of the creation and annihilation operators. Meanwhile the polarization tensors appearing in the mode expansion of $F_{\mu\nu}$ are
$$e_{\mu\nu}(\vec p, \sigma) = i\big(p_\mu e_\nu(\vec p, \sigma) - p_\nu e_\mu(\vec p, \sigma)\big). \tag{9.56}$$
These polarization tensors do obey equation (9.45), since the extra gauge transformation we found for eµ
cancels between the two terms. From here the remaining steps are the same as for the scalar and the spinor
field: we take the Fourier transform of Fµν to extract ap⃗,σ and a†p⃗,σ , use the canonical commutation relations
(9.19) to show that these obey
and then substitute into the Hamiltonian to see that
$$H = \sum_\sigma\int \frac{d^{d-1}p}{(2\pi)^{d-1}}\,\omega_{\vec p}\, a^\dagger_{\vec p\sigma}a_{\vec p\sigma} + \text{constant}. \tag{9.58}$$
These calculations are not hard in principle, but in practice they are somewhat tedious due to the proliferation
of indices. We’ll instead do this calculation below using the gauge-fixing method, which makes it more
manageable.
To simplify our discussion we’ll first take the spatial boundary to be at some finite location Γ, which we can
take to infinity later. For example we can take Γ to be the set of spacetime points (t, ⃗x) obeying |x| = R for
some large but finite R. The variation of the action has a boundary term
$$\delta S \supset \int_\Gamma d^{d-1}x\, n_\mu F^{\mu\nu}\delta A_\nu, \tag{9.60}$$
where nµ is the outward-pointing normal vector at the boundary Γ. In order for the action to be stationary
(up to future and past terms) about solutions of the equations of motion, we therefore need to have
nµ F µν δAν |Γ = 0 (9.61)
for all variations δA which obey the boundary conditions. One way to satisfy this is Neumann boundary
conditions:
nµ F µν |Γ = 0. (9.62)
In old-fashioned language this sets the normal electric field and the tangential magnetic field to zero at
the boundary. The alternative way is Dirichlet boundary conditions:
tµ Aµ |Γ = 0, (9.63)
where tµ is any vector which is tangent to Γ. In more geometric language the pullback of the one-form A
to Γ vanishes. In old-fashioned language these boundary conditions set the normal magnetic field and the
tangential electric field to zero - they are the boundary conditions one would find at the edge of a perfect
conductor.34 In quantum electrodynamics Dirichlet boundary conditions are more natural since we will want
to have Aµ vanish at infinity, so we will adopt them from now on. Dirichlet boundary conditions constrain
the set of allowed gauge transformations; in order to prevent a gauge transformation from generating a
nonzero tangential A at the boundary, we must require it to be constant there:
$$\Omega|_\Gamma = \text{constant}. \tag{9.64}$$
This constant does not need to vanish, and the right thing to do with it is a bit subtle so we will now discuss
it in some detail.
Let’s first remember that our earlier argument that the Gauss constraint generates gauge transformations
neglected a boundary term. This was ok for gauge transformations that vanish at Γ, but we are now interested
in gauge transformations which approach a nonzero constant at Γ so we need to be more careful. To avoid
34 Neumann boundary conditions correspond to a “perfect magnetic conductor”, but given our lack of magnetic monopoles we
the boundary term, we should instead take the generator of an infinitesimal gauge transformation ω(x) on
Hbig to be
$$Q_\omega = \int_{\Sigma_t} d^{d-1}x\,\big(\omega\rho - \partial_i\omega\,\Pi^i\big), \tag{9.65}$$
where Σt is the time slice at time t. You can check that this has the right commutator with Ai to generate
an infinitesimal gauge transformation Ω without dropping any boundary term:
To relate this to the Gauss constraint we can integrate by parts, now being careful about the boundary term:
$$Q_\omega = \int_{\Sigma_t} d^{d-1}x\,\omega\big(\rho + \partial_i\Pi^i\big) - \int_{\partial\Sigma_t} d^{d-2}x\,\omega\, n_i\Pi^i. \tag{9.67}$$
On the physical Hilbert space this vanishes if ω|Γ = 0, reflecting the fact that such gauge transformations
are redundancies. On the other hand for gauge transformations where ω|Γ = ω0 with ω0 constant, on the
physical Hilbert space H we instead have
$$Q_\omega = -\omega_0\int_{\partial\Sigma_t} d^{d-2}x\, n_i\Pi^i = \omega_0\int_{\Sigma_t} d^{d-1}x\,\rho = \omega_0 Q_{\rm el}, \tag{9.68}$$
where Qel is the total electric charge. This need not vanish, which tells us something interesting: gauge
transformations which approach a nonzero constant at infinity are not redundancies - they act nontrivially
on H as long as the net electric charge isn’t zero.35
Constant gauge transformations in Maxwell theory give us a new kind of internal symmetry that we
haven’t encountered before. Last semester we defined an internal symmetry in quantum field theory to be
a unitary operator U which preserves the local algebras A[R] and also leaves the energy-momentum tensor
invariant. These conditions are both satisfied here for
$$U(\theta) = e^{i\theta Q_{\rm el}}, \tag{9.69}$$
so this is a valid internal symmetry. We further said that an internal symmetry is a global symmetry if
there is a local operator which transforms nontrivially under conjugation by U . By Gauss’s law however
we can express Qel as the electric flux through infinity, so it must commute with all local operators. It
therefore isn’t a global symmetry. How can it act nontrivially on the Hilbert space? We will see next time
that the operators which create charged particles are necessarily non-local operators which extend out to
spatial infinity, so these operators can and do have a nonzero commutator with Qel . Symmetries of this type
are often called asymptotic symmetries, especially in the limit where we take Γ to infinity.
Temporal gauge: A0 = 0.
35 There is a more systematic approach to deciding which gauge transformations are redundancies using the covariant phase
space approach to Hamiltonian mechanics: a continuous gauge transformation is a redundancy if and only if it is a zero mode
of the pre-symplectic form. See my first paper with Jie-qiang Wu for an explanation of this formalism and also how to apply
it to Maxwell theory.
Lorenz gauge: ∂ µ Aµ = 0.
Axial gauge: A1 = 0.
For a gauge-fixing condition to be successful, we need to show that every gauge field configuration Aµ differs
by a gauge transformation from one which obeys the gauge-fixing condition, and we also need to show that
the representative which obeys the gauge-fixing condition is unique. The above conditions are not always
strong enough to fulfil both requirements, for example we can stay in temporal gauge while doing a gauge
transformation obeying Ω̇ = 0. Similarly we can stay in Lorenz gauge while doing a gauge transformation
obeying ∂ 2 Ω = 0. On the other hand with the Dirichlet boundary conditions we discussed in the last section,
axial gauge and Coulomb gauge do satisfy both requirements and thus give valid gauge-fixings of Maxwell
theory. It is more convenient to use Coulomb gauge, as it preserves rotational symmetry, so we will proceed
with that.
The procedure for going to Coulomb gauge is straightforward; given a gauge field Aµ , we construct a
gauge transformation Ω such that
$$A^c_\mu = A_\mu + \partial_\mu\Omega \tag{9.70}$$
obeys
$$\partial_i A^c_i = 0. \tag{9.71}$$
We can find Ω by solving the equation
$$\nabla^2\Omega = -\vec\nabla\cdot\vec A, \tag{9.72}$$
which is of course just a version of the Poisson equation $\nabla^2\phi = -\rho$ that we solve to find the electrostatic potential given a charge distribution. We can solve it by introducing a spatial Green's function $K(\vec x)$ obeying
$$\nabla^2 K(\vec x) = -\delta^{d-1}(\vec x). \tag{9.73}$$
This is the same equation that determines the Euclidean propagator of a massless scalar field in d − 1
dimensions, so we already know the solution:
$$K(\vec x) = \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{e^{i\vec p\cdot\vec x}}{p^2} = \frac{1}{(d-3)\Omega_{d-2}|x|^{d-3}}. \tag{9.74}$$
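For example, in d = 4 we have $\Omega_2 = 4\pi$, so $K(\vec x) = \frac{1}{4\pi|\vec x|}$ is just the familiar Coulomb kernel.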
We thus have
$$\Omega(t, \vec x) = \int d^{d-1}x'\, K(\vec x - \vec x')\,\vec\nabla\cdot\vec A(t, \vec x'). \tag{9.75}$$
This indeed vanishes at infinity as long as d ≥ 4. For d = 2, 3 the boundary conditions are more important
since the Coulomb potential grows with distance, so in discussing Coulomb gauge we’ll restrict to d ≥ 4 (we
anyways already solved the theory in d = 2). The Coulomb gauge field thus is given by
$$A^c_i(t, \vec x) = A_i(t, \vec x) + \int d^{d-1}x'\,\partial_i K(\vec x - \vec x')\,\partial'_j A_j(t, \vec x') \tag{9.76}$$
and
$$\begin{aligned}
A^c_0(t, \vec x) &= A_0(t, \vec x) + \int d^{d-1}x'\, K(\vec x - \vec x')\,\partial_j \dot A_j(t, \vec x')\\
&= A_0(t, \vec x) + \int d^{d-1}x'\, K(\vec x - \vec x')\,\big(\nabla^2 A_0(t, \vec x') - \rho(t, \vec x')\big)\\
&= -\int d^{d-1}x'\, K(\vec x - \vec x')\,\rho(t, \vec x'). \tag{9.77}
\end{aligned}$$
In going to the second line for the A0 expression we used Gauss’s law, while in going to the third we
used (9.73). These expressions tell us something quite interesting: the gauge field in Coulomb gauge is a
nonlocal function of the fundamental gauge field Aµ and the background current J µ . This is reflected in the
commutation relations it obeys:36
$$\begin{aligned}
[A^c_i(\vec x), A^c_j(\vec y)] &= 0\\
[\Pi_i(\vec x), \Pi_j(\vec y)] &= 0\\
[A^c_i(\vec x), \Pi_j(\vec y)] &= i\delta_{ij}\delta^{d-1}(\vec x - \vec y) + i\frac{\partial}{\partial x^i}\frac{\partial}{\partial x^j}K(\vec x - \vec y). \tag{9.78}
\end{aligned}$$
Using the fact that
$$\vec\Pi = \dot{\vec A}^c - \vec\nabla A^c_0, \tag{9.79}$$
and also our expression (9.77) for $A^c_0$,
we can rewrite the Hamiltonian in Coulomb gauge as
$$H = \int d^{d-1}x\,\Big[\tfrac{1}{2}\dot{\vec A}^c\cdot\dot{\vec A}^c + \tfrac{1}{4}F_{ij}F^{ij} - \vec A\cdot\vec J\Big] + \frac{1}{2}\int d^{d-1}x\, d^{d-1}y\, K(\vec x - \vec y)\rho(\vec x)\rho(\vec y). \tag{9.80}$$
Note in particular the non-local Coulomb interaction term, which is a consequence of the non-locality of
the Coulomb gauge field. It is sometimes presented as a potential problem for the theory that needs to
be surmounted, but here we know things will be fine since we started with a local and Lorentz-invariant
presentation of the theory. The quantity $\dot{\vec A}^c$ has the same algebra with $\vec A^c$ as $\vec\Pi$ does, since they differ only by $\vec\nabla A^c_0$, and this depends only on $\rho$ and thus commutes with $\vec A^c$.
Why bother introducing the Coulomb-gauge gauge field $A^c_\mu$? The reason is that, despite its inherent
non-locality, it gives us a way to find a gauge-invariant operator on the physical Hilbert space H which obeys
Maxwell’s equations as its Heisenberg equations of motion - precisely the thing we couldn’t accomplish using
the gauge-invariant formalism. Non-locality is just the price we pay for doing this. Indeed since Acµ (x) obeys
Maxwell’s equations, when J µ = 0 we can expand it in a basis of the plane-wave solutions we constructed
before:
$$A^c_\mu(x) = \sum_\sigma\int \frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\Big[e_\mu(\vec p, \sigma)a_{\vec p,\sigma}e^{ip\cdot x} + e^*_\mu(\vec p, \sigma)a^\dagger_{\vec p,\sigma}e^{-ip\cdot x}\Big]. \tag{9.81}$$
We need to require however that the polarization vectors eµ are consistent with the Coulomb gauge condition
(9.71). Since we’ve now set ρ = 0 we will also have Ac0 = 0 , so we must also have e0 (⃗ p, σ) = 0. We therefore
can focus on the spatial components of the gauge field:
$$\vec A^c(x) = \sum_\sigma\int \frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\Big[\vec e(\vec p, \sigma)a_{\vec p,\sigma}e^{ip\cdot x} + \vec e^{\,*}(\vec p, \sigma)a^\dagger_{\vec p,\sigma}e^{-ip\cdot x}\Big]. \tag{9.82}$$
We saw before that the equations of motion require $p^\mu e_\mu(\vec p, \sigma) = 0$, so in Coulomb gauge the polarization vectors must obey
$$\vec p\cdot\vec e(\vec p, \sigma) = 0. \tag{9.83}$$
In d = 4 we have already constructed these polarization vectors, in particular for the reference momentum
$$k^\mu = \omega\begin{pmatrix}1\\0\\0\\1\end{pmatrix} \tag{9.84}$$
we have
$$\vec e(\vec k, \pm) = \frac{1}{\sqrt 2}\begin{pmatrix}1\\\pm i\\0\end{pmatrix}. \tag{9.85}$$
36 There are a number of approaches to deriving this algebra. In particular Weinberg uses a general method called “Dirac
brackets” to combine the gauge fixing condition with the electromagnetic constraints. This is a good thing to know about in
general, but it is unnecessary for us here since we defined Acµ as a nonlocal functional of Aµ rather than a fundamental object
in its own right. We can therefore derive the algebra of Acµ directly from that of Aµ .
Thus Coulomb gauge simply sets α± = 0 in our earlier discussion of the polarization. For more general
momenta the polarization vector is given by
$$\vec e(\vec p, \sigma) = R_{\vec p}\,\vec e(\vec k, \sigma), \tag{9.86}$$
where $R_{\vec p}$ is some fixed rotation which turns $\vec k$ to point in the direction of $\vec p$. These polarization vectors obey
$$\begin{aligned}
\vec e^{\,*}(\vec p, \sigma)\cdot\vec e(\vec p, \sigma') &= \delta_{\sigma,\sigma'}\\
\sum_\sigma e^*_i(\vec p, \sigma)e_j(\vec p, \sigma) &= \delta_{ij} - \frac{p_i p_j}{|p|^2}, \tag{9.87}
\end{aligned}$$
as you can easily check for the reference momentum. You can fairly easily convince yourself that these
equations hold in general dimensions, as they are simply expressing that we have adopted an orthonormal
basis of polarization vectors which is complete in the subspace orthogonal to p⃗.
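As a quick numerical sanity check of (9.87) at the d = 4 reference momentum (a minimal sketch assuming only numpy; the vectors below are just (9.85) with $\hat p = \hat z$):

```python
import numpy as np

# helicity polarization vectors (9.85), reference momentum along z
e = {+1: np.array([1, 1j, 0]) / np.sqrt(2),
     -1: np.array([1, -1j, 0]) / np.sqrt(2)}
p_hat = np.array([0.0, 0.0, 1.0])

# orthonormality: e*(sigma) . e(sigma') = delta_{sigma,sigma'}
for s in (+1, -1):
    for sp in (+1, -1):
        print(s, sp, np.round(np.conj(e[s]) @ e[sp], 12))

# completeness: sum_sigma e*_i e_j = delta_ij - p_i p_j / |p|^2
proj = sum(np.outer(np.conj(e[s]), e[s]) for s in (+1, -1))
print(np.allclose(proj, np.eye(3) - np.outer(p_hat, p_hat)))
```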
To study the algebra of the ap⃗,σ we can extract them the usual way:
$$a_{\vec p,\sigma} = \sqrt{\frac{\omega_{\vec p}}{2}}\int d^{d-1}x\, e^{-i\vec p\cdot\vec x}\,\vec e^{\,*}(\vec p, \sigma)\cdot\Big[\vec A^c(0, \vec x) + \frac{i}{\omega_{\vec p}}\dot{\vec A}^c(0, \vec x)\Big]. \tag{9.88}$$
We can then extract the algebra using the Coulomb gauge commutation relations (9.78), for example we
have
$$\begin{aligned}
[a_{\vec p,\sigma}, a^\dagger_{\vec p',\sigma'}] &= \sqrt{\frac{\omega_{\vec p}\,\omega_{\vec p'}}{4}}\int d^{d-1}x\, d^{d-1}x'\, e^{-i\vec p\cdot\vec x + i\vec p'\cdot\vec x'}\, e^*_i(\vec p, \sigma)e_{i'}(\vec p', \sigma')\times\Big(\frac{i}{\omega_{\vec p}}[\dot A^c_i(\vec x), A^c_{i'}(\vec x')] - \frac{i}{\omega_{\vec p'}}[A^c_i(\vec x), \dot A^c_{i'}(\vec x')]\Big)\\
&= \sqrt{\frac{\omega_{\vec p}\,\omega_{\vec p'}}{4}}\left(\frac{1}{\omega_{\vec p}} + \frac{1}{\omega_{\vec p'}}\right)\int d^{d-1}x\, d^{d-1}x'\, e^{-i\vec p\cdot\vec x + i\vec p'\cdot\vec x'}\, e^*_i(\vec p, \sigma)e_{i'}(\vec p', \sigma')\Big(\delta_{i,i'}\delta^{d-1}(\vec x - \vec x') + \partial_i\partial_{i'}K(\vec x - \vec x')\Big)\\
&= \delta_{\sigma,\sigma'}(2\pi)^{d-1}\delta^{d-1}(\vec p - \vec p').
\end{aligned}\tag{9.89}$$
In going to the last line we have used that the derivatives acting on K will become powers of p⃗ or p⃗ ′ in the
Fourier transform, and the Fourier transform will also set p⃗ = p⃗ ′ . When these powers are contracted with ⃗e
or $\vec e^{\,*}$ they vanish due to the Coulomb gauge condition (9.83). Similarly we have
$$[a_{\vec p,\sigma}, a_{\vec p',\sigma'}] = [a^\dagger_{\vec p,\sigma}, a^\dagger_{\vec p',\sigma'}] = 0. \tag{9.90}$$
Finally one can compute the Hamiltonian; on the homework you will show that this is given (still with $J^\mu = 0$) by
$$\begin{aligned}
H &= \frac{1}{2}\sum_\sigma\int \frac{d^{d-1}p}{(2\pi)^{d-1}}\,\omega_{\vec p}\Big(a^\dagger_{\vec p,\sigma}a_{\vec p,\sigma} + a_{\vec p,\sigma}a^\dagger_{\vec p,\sigma}\Big)\\
&= \sum_\sigma\int \frac{d^{d-1}p}{(2\pi)^{d-1}}\,\omega_{\vec p}\, a^\dagger_{\vec p,\sigma}a_{\vec p,\sigma} + \text{constant}. \tag{9.91}
\end{aligned}$$
Thus the quantization of the Maxwell field in Coulomb gauge (and thus in any gauge) indeed produces the
Fock space of non-interacting photons.
We can also compute the two-point function of the Coulomb gauge field in the usual way:
$$\begin{aligned}
\langle\Omega|A^c_\mu(x_2)A^c_\nu(x_1)|\Omega\rangle &= \sum_{\sigma,\sigma'}\int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{d^{d-1}p'}{(2\pi)^{d-1}}\frac{1}{2\sqrt{\omega_{\vec p}\,\omega_{\vec p'}}}\, e_\mu(\vec p, \sigma)e^*_\nu(\vec p', \sigma')\, e^{ip\cdot x_2 - ip'\cdot x_1}\,\langle\Omega|a_{\vec p,\sigma}a^\dagger_{\vec p',\sigma'}|\Omega\rangle\\
&= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{\sum_\sigma e_\mu(\vec p, \sigma)e^*_\nu(\vec p, \sigma)}{2\omega_{\vec p}}\, e^{ip\cdot(x_2 - x_1)}. \tag{9.92}
\end{aligned}$$
In fact this equation is true in any gauge for the appropriate choice of eµ ; imposing Coulomb gauge we have
A0 = 0 and the two-point function of the spatial components of Ac is
$$\langle\Omega|A^c_i(x_2)A^c_j(x_1)|\Omega\rangle = \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{\delta_{ij} - \frac{p_i p_j}{|p|^2}}{2\omega_{\vec p}}\, e^{ip\cdot(x_2 - x_1)}. \tag{9.93}$$
This is a rather unpleasant and non-covariant expression, which we landed on due to the complicated Lorentz transformation properties of $A^c_\mu$. We can also consider the time-ordered correlation function, which has the
nicer (but still not covariant) expression
$$\langle\Omega|T\, A^c_\mu(x_2)A^c_\nu(x_1)|\Omega\rangle = \int\frac{d^dp}{(2\pi)^d}\frac{-i\sum_\sigma e_\mu(\vec p, \sigma)e^*_\nu(\vec p, \sigma)}{p^2 - i\epsilon}\, e^{ip\cdot(x_2 - x_1)}, \tag{9.94}$$
which again is valid also in other gauges. Next time we will see using the path integral that if we are
sufficiently clever we can justify replacing the helicity sum in the numerator with ηµν in Feynman diagram
calculations, which certainly improves our quality of life going forward.37
Problems:
1. Work out the relationship between SI and Heaviside-Lorentz units (not setting c = 1). Hint: you can
rescale the definition of charge by $(q_{SI}, \rho_{SI}, \vec J_{SI}) = \alpha\times(q_{HL}, \rho_{HL}, \vec J_{HL})$. How should you rescale $\vec E_{SI}$ and $\vec B_{SI}$, and what is the value of $\alpha$?
2. In d = 3 there is a variant of the Maxwell action we can write down called the Chern-Simons action,
given by
$$S_{CS} = \frac{k}{4\pi}\int d^3x\,\epsilon^{\alpha\beta\gamma}A_\alpha F_{\beta\gamma}. \tag{9.95}$$
The factor of 4π is there because for subtle topological reasons if we define our Maxwell field to allow
for magnetic monopoles (i.e. we take the gauge group to be U (1) instead of R), then we need to take k
to be an integer. What are the equations of motion of this theory and what is the conjugate momentum to $\vec A$? What is the Hamiltonian?
3. Confirm equations (9.88) and (9.90).
4. Derive the Hamiltonian (9.91) starting from (9.80).
5. Our first theory with charged dynamical fields coupled to electromagnetism is scalar electrodynam-
ics, which is a theory of a complex scalar field Φ coupled to a Maxwell gauge field Aµ with Lagrangian
density
$$\mathcal L = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} - (\partial_\mu\Phi^* + iqA_\mu\Phi^*)(\partial^\mu\Phi - iqA^\mu\Phi) - m^2|\Phi|^2. \tag{9.96}$$
What is the current density appearing in Maxwell’s equations? What is the equation of motion for
Φ? What are the canonical conjugates of Φ and Φ∗ , and what is the current density expressed in
terms of the canonical coordinates and momenta? Show that this action is invariant under the gauge
transformation
$$\begin{aligned}
A'_\mu &= A_\mu + \partial_\mu\Omega\\
\Phi' &= e^{iq\Omega}\Phi. \tag{9.97}
\end{aligned}$$
37 If you want to try to justify this replacement now, there is a somewhat convoluted explanation in section 8.5 of Weinberg.
The rough idea is that the non-covariance of the propagator in Coulomb gauge is precisely that which is needed to conspire to
remove the non-locality in the Coulomb gauge Hamiltonian (9.80) which arises once dynamical charges are present. From the
point of view of our covariant formulation no such gymnastics are needed.
10 Quantum electrodynamics II: charged matter and the path
integral
In the previous section we quantized the electromagnetic field in the presence of a conserved background
current $J^\mu$. In the real world currents are of course made out of charges, so we now need to introduce
charged matter fields. We will then use the path integral approach to derive the Feynman rules for photons
coupled to spinor and scalar charged fields.
We saw in the previous section however that in order to construct a sensible Hilbert space and Hamiltonian
for quantum electrodynamics it is necessary to impose the Gauss constraint
$$\vec\nabla\cdot\vec\Pi + \rho = 0 \tag{10.2}$$
independently at each point in spacetime. In order for the Hamiltonian to preserve the set of states which
are annihilated by this constraint, it must be invariant under a combined operation where we simultaneously
perform an arbitrary gauge transformation on Aµ and also a transformation of the matter fields where the
“amount” of the symmetry transformation at each point in spacetime is set by the gauge transformation.
In other words the gauge symmetry must act as a local symmetry both on Aµ and on any charged matter
fields.
One simple example of such a theory is scalar electrodynamics, which takes a complex scalar field of
charge q and couples it to a Maxwell field Aµ through the Lagrangian
$$\mathcal L = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} - (\partial_\mu\Phi^\dagger + iqA_\mu\Phi^\dagger)(\partial^\mu\Phi - iqA^\mu\Phi) - m^2\Phi^\dagger\Phi. \tag{10.3}$$
You showed on the previous homework that this Lagrangian density is invariant under the combined gauge
transformation
$$\begin{aligned}
A'_\mu &= A_\mu + \partial_\mu\Omega\\
\Phi' &= e^{iq\Omega}\Phi. \tag{10.4}
\end{aligned}$$
This example is actually a bit more complicated than one might like however, since the quadratic term in A
suggests that perhaps the current J µ has some nontrivial dependence on Aµ . This concern is not actually
realized once we express the current in terms of the canonical momenta, but anyways we can avoid it by
instead coupling to a charged spinor field Ψ. This brings us to our next example of a gauge theory coupled
to matter, spinor electrodynamics:
$$\mathcal L = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} - i\overline\Psi\big(\slashed D + m\big)\Psi, \tag{10.7}$$
where
$$D_\mu = \partial_\mu - iqA_\mu \tag{10.8}$$
is again the gauge covariant derivative and q is the electric charge of the spinor field. The current which
couples to Aµ is simply
$$J^\mu = -q\overline\Psi\gamma^\mu\Psi, \tag{10.9}$$
which is indeed the Noether current for the charge rotation symmetry Ψ′ = eiqθ Ψ. The spinor electrody-
namics Lagrangian is invariant under the gauge symmetry
$$\begin{aligned}
A'_\mu &= A_\mu + \partial_\mu\Omega\\
\Psi' &= e^{iq\Omega}\Psi. \tag{10.10}
\end{aligned}$$
Spinor electrodynamics is arguably the most successful scientific theory we possess, explaining most of atomic
physics, chemistry, radiation, etc, to extraordinary precision. We will spend this section and the next two
studying it in some detail.
$$e = 0.302822\ldots \tag{10.12}$$
This dimensionless number is usually reported in terms of the fine structure constant
$$\alpha = \frac{e^2}{4\pi\epsilon_0\hbar c} = 0.0072973525643(11) \approx \frac{1}{137}. \tag{10.13}$$
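As a quick numerical check of how these two numbers are related (assuming Heaviside-Lorentz natural units, where $\alpha = e^2/4\pi$):

```python
import math

alpha = 0.0072973525643             # fine structure constant quoted in (10.13)
e = math.sqrt(4 * math.pi * alpha)  # charge in Heaviside-Lorentz natural units
print(e)           # ~0.302822, matching (10.12)
print(1 / alpha)   # ~137.036
```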
Why should charge be quantized? We don’t know for sure of course, but there is a very plausible potential
explanation in terms of the topology of the gauge group. So far we have been assuming that Ω is a function
of spacetime that takes values in R. On the other hand let’s imagine that there is some constant e such that
we should identify
$$\Omega \sim \Omega + \frac{2\pi}{e}. \tag{10.14}$$
Looking at the matter field transformation
$$\Psi' = e^{iq\Omega}\Psi, \tag{10.15}$$
this will only be consistent with the periodicity of Ω if we have
$$\frac{q}{e} \in \mathbb{Z}, \tag{10.16}$$
i.e. if charge quantization holds. Since there is extraordinarily good evidence for charge quantization, most
physicists view this as strong evidence that the gauge group of electromagnetism is indeed U (1) instead of
R.
Why didn’t we have to discuss the topology of the gauge group earlier? For the most part we were
studying Maxwell theory in Minkowski space, in which case the spectrum of the Hamiltonian (i.e. the set of
photon states) is independent of whether the gauge group topology is U (1) or R. The only immediate effect
of having the topology be U (1) is that we can only couple the gauge field to charged fields with q ∈ Ze. On
the other hand if the spacetime topology is nontrivial then we can detect the topology of the gauge group
even in the absence of charged matter. For example in 1 + 1 dimensions we found the holonomy degree of
freedom
$$h = \int_0^L dx^1\, A_1, \tag{10.17}$$
which we argued was gauge-invariant via the manipulation
$$h' = \int_0^L dx^1\big(A_1 + \partial_1\Omega\big) = h + \Omega(L) - \Omega(0) = h. \tag{10.18}$$
If $\Omega$ is not periodic, i.e. the gauge group is $\mathbb{R}$, then this manipulation was correct. If $\Omega$ has periodicity $\frac{2\pi}{e}$ however, then it is possible to have gauge transformations which have nontrivial winding as we go around the circle:38
$$\Omega(L) = \Omega(0) + \frac{2\pi n}{e}, \qquad n \in \mathbb{Z}. \tag{10.19}$$
We then can only conclude that
$$h' = h + \frac{2\pi n}{e}, \qquad n \in \mathbb{Z}. \tag{10.20}$$
In order to get something gauge-invariant we thus need to promote the holonomy to a Wilson loop
$$W = e^{ieh}. \tag{10.21}$$
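Indeed under the shift (10.20) the Wilson loop is unchanged: $W' = e^{ie(h + 2\pi n/e)} = e^{2\pi i n}e^{ieh} = W$.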
Another way to think about this is that the quantum mechanics of U (1) Maxwell theory on a circle in 1 + 1
dimensions is that of a particle moving on a circle of circumference $\frac{2\pi}{e}$ (not to be confused with the spatial circle, whose circumference is L).
There is another important feature of U (1) gauge theory: if the gauge group is U (1) then the theory
allows for magnetic monopoles, which you will show on the homework have possible magnetic charges
$$Q_{\rm magnetic} = \int_{S^2}\vec B\cdot d\vec A = \frac{2\pi n}{e}, \qquad n \in \mathbb{Z}. \tag{10.22}$$
So far these have not been observed in nature, which one might use as an argument against the possibility
of U (1) topology for the gauge group, but on the other hand they are expected to be heavy and so hard to
produce and the quantization of charge is quite compelling.
where C is any curve in spacetime which goes from x to y and q is the charge of Ψ. This is gauge-
invariant due to the Wilson line transformation law:
$$e^{iq\int_C A'\cdot dx} = e^{iq\int_C A\cdot dx + iq\int_C \partial\Omega\cdot dx} = e^{iq\Omega(y) - iq\Omega(x)}\, e^{iq\int_C A\cdot dx}. \tag{10.24}$$
The operator (10.23) is non-local: acting with it on the vacuum creates a fermion and an antifermion connected by a line of electric flux.
38 If you like topology this has to do with the fact that π1 (S1 ) = Z.
Connect a local charged operator to spatial infinity using an infinitely long Wilson line:
$$\widetilde\Psi_C(x) = e^{iq\int_C A\cdot dx}\,\Psi(x), \tag{10.25}$$
where now C runs from x to an arbitrary point y at spatial infinity. Acting on the vacuum this operator
creates an antifermion connected to infinity by a line of electric flux. It is invariant under any gauge
transformation which vanishes at infinity, which you’ll recall are the ones we quotient by in defining
the physical Hilbert space H. $\widetilde\Psi_C(x)$ is sometimes called a “dressed” field, with the Wilson line being
the “dressing” of the “bare” charged field Ψ(x).
The last operator is interesting because it transforms nontrivially under gauge transformations which ap-
proach a nonzero constant at infinity. We saw in the last section that these are generated by the total electric
charge operator Qelectric , and indeed we have
$$e^{-i\theta Q_{\rm electric}}\,\widetilde\Psi_C(x)\, e^{i\theta Q_{\rm electric}} = e^{i\theta q}\,\widetilde\Psi_C(x). \tag{10.26}$$
This is essentially just Gauss’s law: the electric field created by $\widetilde\Psi_C(x)$ is detected by
$$Q_{\rm electric} = \int_{S^{d-2}_\infty}\vec E\cdot d\vec A. \tag{10.27}$$
Dressed operators such as $\widetilde\Psi_C(x)$ have an interesting interpretation in terms of gauge-fixing. For example let's take C to be a straight line in the $x^1$ direction from $(t, x, \vec x_\perp)$ to $(t, \infty, \vec x_\perp)$. Now let's go to axial gauge $A_1 = 0$. We then simply have
$$\widetilde\Psi_C(x) = \Psi(x), \tag{10.28}$$
so in other words we can think of $\widetilde\Psi_C(x)$ as the gauge-invariant description of the charged field in axial gauge!
This naturally suggests the question of how to come up with a gauge-invariant description of the charged
field in other gauges, for example Coulomb gauge. To answer this question it is useful to first introduce a
rather general set of dressings, defined by
$$\widetilde\Psi(t, \vec x) = e^{i\int_{\Sigma_t}d^{d-1}y\,\vec E_{\rm cl}\cdot\vec A}\,\Psi(x), \tag{10.29}$$
where $\vec E_{\rm cl}$ is a fixed classical electric field sourced by a point charge q located at $\vec x$ at time t. The axial gauge dressing just mentioned has this form with
$$\vec E_{\rm cl}(y^1, \vec y_\perp) = q\,\theta(y^1 - x^1)\,\delta^{d-2}(\vec y_\perp - \vec x_\perp)\,\hat x^1, \tag{10.31}$$
which you can check indeed obeys Gauss’s law. In general we have the gauge transformation
$$\begin{aligned}
e^{i\int_{\Sigma_t}d^{d-1}y\,\vec E_{\rm cl}\cdot\vec A'} &= e^{i\int_{\Sigma_t}d^{d-1}y\,\vec E_{\rm cl}\cdot\vec\nabla\Omega}\, e^{i\int_{\Sigma_t}d^{d-1}y\,\vec E_{\rm cl}\cdot\vec A}\\
&= e^{i\Omega_\infty Q_{\rm electric} - iq\Omega(x)}\, e^{i\int_{\Sigma_t}d^{d-1}y\,\vec E_{\rm cl}\cdot\vec A}, \tag{10.32}
\end{aligned}$$
so $\widetilde\Psi(t, \vec x)$ has the same gauge transformation as $\widetilde\Psi_C(t, \vec x)$ and in particular is invariant under gauge trans-
formations which vanish at infinity. As you might guess, to get the Coulomb gauge dressing we should
take
$$\vec E_{\rm cl}(\vec y) = \vec\nabla\phi_{\rm cl} \tag{10.33}$$
with
$$\phi_{\rm cl}(\vec y) = \frac{q}{(d-3)\Omega_{d-2}|y - x|^{d-3}}, \tag{10.34}$$
since then we have
$$\int_{\Sigma_t} d^{d-1}y\,\vec E_{\rm cl}\cdot\vec A = -\int_{\Sigma_t}\phi_{\rm cl}\,\vec\nabla\cdot\vec A + \int_{\partial\Sigma_t}\phi_{\rm cl}\,\vec A\cdot d\vec A, \tag{10.35}$$
which vanishes in Coulomb gauge since $\vec\nabla\cdot\vec A = 0$ and $\phi_{\rm cl}\vec A$ goes to zero at infinity fast enough to make the boundary term vanish.39
$$\begin{aligned}
\Pi^0 &= 0\\
\vec\nabla\cdot\vec\Pi + J^0 &= 0. \tag{10.36}
\end{aligned}$$
The first constraint is easy enough to impose by restricting to wave functionals which depend only on $\vec A$, but
the second is more complicated. Let’s therefore define an “intermediate” Hilbert space Hint consisting of
wave functionals which are independent of A0 , but on which we haven’t yet imposed the Gauss constraint.
We’ll then define a projection PGI on Hint which projects to the physical Hilbert space H that is annihilated
by the Gauss constraint.
We next recall that the key step in deriving the path integral is computing the matrix elements in the
position basis of the infinitesimal time-evolution operator e−iϵH , for example in the bosonic case we had
$$\langle q'|e^{-i\epsilon H(Q,P)}|q\rangle \approx \int\frac{dp}{2\pi}\, e^{ip(q'-q) - i\epsilon H(q',p)}. \tag{10.37}$$
In electrodynamics however the Hamiltonian H is only well-defined on the physical Hilbert space, so this
equation cannot hold in Hint . In order to compute something sensible, we need to include the projection
PGI in some way. To this end we’ll note that we can write PGI in terms of a path integral over a quantity
we will prophetically call a0 :
$$P_{GI} = \frac{\int Da_0\, e^{i\epsilon\int d^{d-1}x\, a_0(J^0 + \vec\nabla\cdot\vec\Pi)}}{\int Da_0}, \tag{10.38}$$
where a0 is taken to go to zero at spatial infinity and the infinite quantity in the denominator is there to
ensure that acting on a gauge-invariant state we get one.40 We then have the following manipulation:41
$$\begin{aligned}
\langle\vec A'|e^{-i\epsilon H(\vec A,\vec\Pi,\vec J)}P_{GI}|\vec A\rangle &\approx \int D\pi\,\langle\vec A'|\big(1 - i\epsilon H(\vec A,\vec\Pi,\vec J)\big)|\pi\rangle\langle\pi|P_{GI}|\vec A\rangle\\
&\approx \frac{1}{\int Da_0}\int Da_0\, D\pi\,\langle\vec A'|\pi\rangle\langle\pi|\vec A\rangle\big(1 - i\epsilon H(\vec a\,', \vec\pi, \vec j)\big)\Big(1 + i\epsilon\int d^{d-1}x\, a_0\big(j^0 + \vec\nabla\cdot\vec\pi\big)\Big)\\
&\approx \frac{1}{\int Da_0}\int Da_0\, D\pi\,\exp\Big[i(\vec a\,' - \vec a)\cdot\vec\pi - i\epsilon H(\vec a, \vec\pi, \vec j) + i\epsilon\int d^{d-1}x\, a_0\big(j^0 + \vec\nabla\cdot\vec\pi\big)\Big]\\
&\approx \frac{1}{\int Da_0}\int Da_0\, D\pi\, e^{i\epsilon\int d^{d-1}x\,\mathcal L(a, \vec\pi)}, \tag{10.39}
\end{aligned}$$
39 The vanishing of the boundary term is clear for d > 4, as $\vec A$ and $\phi_{\rm cl}$ should both fall off like $1/r^{d-3}$ so we can naively estimate the size of the boundary term as $r^{d-2}/r^{2d-6} = 1/r^{d-4}$. For d = 4 one needs to think more about the structure of $\vec A$ in Coulomb gauge to see that the boundary term still vanishes; I'll leave the details to you.
40 If we take the gauge group to be U(1) then we should integrate $a_0$ at each point in space over the range $(-\frac{\pi}{\epsilon e}, \frac{\pi}{\epsilon e})$, which becomes $(-\infty, \infty)$ in the continuum limit $\epsilon \to 0$. We can therefore interpret the factor in the denominator as the volume of the gauge group on a given time slice. This statement becomes precise if we introduce a lattice regulator, as we may do later in the semester.
41 Here I don’t write the matter fields explicitly except through their appearance in the current J, but of course these need
where L is the covariant Maxwell Lagrangian written in terms of a and π and as usual we have ordered
H (and the matter fields in J 0 ) so that canonical momenta appear to the right. Thus the covariant path
integral automatically imposes a projection onto gauge-invariant states in addition to time evolution by the
gauge-invariant Hamiltonian! Following our usual procedures we then end up with a fully gauge-invariant
and covariant expression for time-ordered correlation functions of gauge-invariant operators in the vacuum:
$$\langle\Omega|T\, O_N[A, \Phi]\dots O_1[A, \Phi]|\Omega\rangle = \frac{\int Da\, D\phi\, D\phi^*\, O_N[a, \phi]\dots O_1[a, \phi]\, e^{iS_\epsilon[a, \phi, \phi^*]}}{\int Da\, D\phi\, D\phi^*\, e^{iS_\epsilon[a, \phi, \phi^*]}}, \tag{10.40}$$
where as usual the iϵ prescription in the action is used to project onto the vacuum at early and late times
and Φ/ϕ represent the matter fields. O1 and ON are taken to be gauge-invariant, otherwise on the left-hand
side we should sandwich them between projection operators PGI .
Unfortunately in gauge theories the gauge-invariant path integral (10.40) is in some sense “too pure” for practical calculations. The basic problem is that the integrals in the numerator and the denominator both generate infinite factors due to the fact that directions in field space which correspond to gauge transformations are “flat directions” of the integral. We therefore should divide both the numerator and the denominator by the (infinite) volume of the gauge group in order for them to be (reasonably) well-defined. In fact our above argument essentially generated these factors, in the guise of the denominator $\int DA_0$. This leads to some inconvenient infinities when we try to do perturbative calculations. For example we can try to compute a two-point function of the gauge field $A_\mu$. $A_\mu$ itself isn't gauge-invariant of course, but its two-point function will regularly show up as an intermediate step in calculations of gauge-invariant quantities so we had best say a little about it. We can write the kinetic term in the exponent as
$$-\frac{i}{4}\int d^dx\, f_{\mu\nu}f^{\mu\nu} = -\frac{1}{2}\int d^dx\, d^dy\, a_\mu(x)a_\nu(y)A^{\mu\nu}(x, y), \tag{10.41}$$
with42
$$A^{\mu\nu}(x, y) = i\left(\frac{\partial^2}{\partial x_\mu\partial x_\nu} - \eta^{\mu\nu}\frac{\partial^2}{\partial x^\alpha\partial x_\alpha}\right)\delta^d(x - y). \tag{10.42}$$
The photon propagator
$$\Delta_{\mu\nu}(x - y) = \int\frac{d^dp}{(2\pi)^d}\,\hat\Delta_{\mu\nu}(p)\, e^{ip\cdot(x-y)} \tag{10.43}$$
should be the inverse of $A^{\mu\nu}$ in the sense that
$$\int d^dy\, A^{\mu\nu}(x, y)\Delta_{\nu\lambda}(y - z) = \delta^\mu_\lambda\,\delta^d(x - z), \tag{10.44}$$
or equivalently in momentum space
$$i\big(p^2\eta^{\mu\nu} - p^\mu p^\nu\big)\hat\Delta_{\nu\lambda}(p) = \delta^\mu_\lambda. \tag{10.45}$$
We now meet the essential problem: the matrix $(p^2\eta^{\mu\nu} - p^\mu p^\nu)$ is not invertible! Indeed it has $p_\nu$ as a zero eigenvector, and so there is no quantity $\hat\Delta_{\nu\lambda}(p)$ obeying (10.45). This shouldn't be a surprise of course, as $A_\mu$ isn't gauge-invariant and we saw before that to give it a well-defined two-point function we need to gauge fix. This isn't to say that (10.40) is wrong by the way, it isn't; the ambiguity in defining the inverse of $A^{\mu\nu}$ all cancels when we use the propagator in gauge-invariant quantities. Indeed the Euclidean lattice version of (10.40) is the starting point for rigorous lattice calculations in quantum electrodynamics, so it had better not have any fundamental problems. Nonetheless for perturbative calculations it is useful to have a nice photon propagator; we'll now see how to do this.
42 We should replace the time derivatives here by $(1 + i\epsilon)\partial_t$ to implement the $i\epsilon$ prescription; I'll leave this implicit to preserve covariant notation.
10.5 QED path integral II: fixing the gauge
The most obvious way to fix the gauge in the QED path integral is simply to only integrate over configurations
which respect the gauge-fixing condition. This approach to the path integral is described in Weinberg’s
book, where he derives it starting from the canonical quantization in Coulomb gauge that we discussed in
the previous section. It leads to path integrals which are somewhat difficult to evaluate in practice however,
as one ends up needing to insert a factor
$$\prod_x\delta\big(\vec\nabla\cdot\vec A(x)\big) \tag{10.46}$$
in the Lagrangian path integral (10.40) to impose the gauge condition (and thus to get rid of the infinite
factors of the gauge group volume in the numerator and denominator in (10.40)). It was realized by Faddeev
and Popov however that we can get a more manageable path integral if we instead impose the gauge constraint
more softly, via a Gaussian suppression of configurations that don’t obey the gauge fixing condition rather
than a delta function suppression. We’ll now see how to do this starting directly from our gauge-invariant
path integral (10.40). Working just with the numerator, we have the following manipulation
$$\begin{aligned}
\int Da\, D\phi\, D\phi^*\, O_N[a, \phi]\dots O_1[a, \phi]\, e^{iS_\epsilon[a, \phi, \phi^*]} &\propto \int Df\, e^{-\frac{i}{2\xi}\int d^dx\, f^2}\int Da\, D\phi\, D\phi^*\, O_N[a, \phi]\dots O_1[a, \phi]\, e^{iS_\epsilon[a, \phi, \phi^*]}\\
&\propto \int D\Omega\, Da\, D\phi\, D\phi^*\, O_N[a, \phi]\dots O_1[a, \phi]\, e^{iS_\epsilon[a, \phi, \phi^*] - \frac{i}{2\xi}\int d^dx(\partial_\mu a^\mu + \partial^2\Omega)^2}\\
&= \int D\Omega\, Da_\Omega\, D\phi_\Omega\, D\phi^*_\Omega\, O_N[a_\Omega, \phi_\Omega]\dots O_1[a_\Omega, \phi_\Omega]\, e^{iS_\epsilon[a_\Omega, \phi_\Omega, \phi^*_\Omega] - \frac{i}{2\xi}\int d^dx(\partial_\mu a_\Omega^\mu)^2}\\
&\propto \int Da\, D\phi\, D\phi^*\, O_N[a, \phi]\dots O_1[a, \phi]\, e^{iS_\epsilon[a, \phi, \phi^*] - \frac{i}{2\xi}\int d^dx(\partial_\mu a^\mu)^2}. \tag{10.47}
\end{aligned}$$
In the first line we simply multiplied by a constant, and then in the second line we changed variables in the
f integral from f to Ω via the field-dependent expression
f = ∂µ Aµ + ∂ 2 Ω. (10.48)
This change of variables generates a field-independent determinant det(∂ 2 ) which we discarded (hence the
∝). Each value of f is attained in this integral since we can always solve (10.48) for Ω using the Green’s
function for the scalar wave operator $\partial^2$. The solution is not unique since we can add to Ω any function which solves the massless wave equation, so the integral over Ω generates an additional constant factor
reflecting this redundancy (which is the same redundancy which prevented the Lorenz gauge from being a
complete gauge-fixing). In the third line we then used the gauge invariance of the path integral measure and
action, as well as the operator insertions O1 . . . ON , using the abbreviated notation
(aΩ )µ = aµ + ∂µ Ω
ϕΩ = eiqΩ ϕ, (10.49)
and then in the final line we changed the integration variables from (aΩ , ϕΩ ) to (a, ϕ) and then performed
and discarded the integral over Ω which now gives a field-independent constant. You may worry about the
convergence of the f integral, but implicitly here we are using the iϵ prescription t = τ (1 − iϵ) so the measure
in the dd x integral includes a factor of (1 − iϵ) which makes the f -integral convergent for ξ > 0. The final
upshot of this is that we can replace equation (10.40) by43
$$\langle\Omega|T\, O_N[A, \Phi]\dots O_1[A, \Phi]|\Omega\rangle = \frac{\int Da\, D\phi\, D\phi^*\, O_N[a, \phi]\dots O_1[a, \phi]\, e^{iS_\epsilon[a, \phi, \phi^*] - \frac{i}{2\xi}\int d^dx(\partial_\mu a^\mu)^2}}{\int Da\, D\phi\, D\phi^*\, e^{iS_\epsilon[a, \phi, \phi^*] - \frac{i}{2\xi}\int d^dx(\partial_\mu a^\mu)^2}}, \tag{10.50}$$
43 It must be acknowledged that the interpretation of this procedure in the operator formalism isn’t so clear: does canonical
quantization make sense if we only suppress configurations which violate the gauge-fixing condition rather than forbidding
them entirely? The answer to this question ends up being “yes”, but we need to introduce a rather involved formalism called
BRST quantization to see it. We won’t pursue this further here, as anyways we already have a gauge-invariant operator
interpretation of the path integral (10.40) through our construction of the physical subspace of Hbig .
Figure 21: Feynman rules for spinor electrodynamics. Photon propagators are indicated by wavy lines.
for arbitrary ξ > 0. This is a substantial improvement on (10.40), as at least those gauge transformations
which have ∂ 2 Ω ̸= 0 are now suppressed in the integral.
Let’s see what this says about the photon propagator. Including our new “gauge-fixing” term in the
action, equation (10.42) is now modified to
$$A^{\mu\nu}(x, y) = i\left[\left(1 - \frac{1}{\xi}\right)\frac{\partial^2}{\partial x_\mu\partial x_\nu} - \eta^{\mu\nu}\frac{\partial^2}{\partial x^\alpha\partial x_\alpha}\right]\delta^d(x - y). \tag{10.51}$$
The momentum-space inverse condition (10.44) then becomes
$$i\left[p^2\eta^{\mu\nu} - \left(1 - \frac{1}{\xi}\right)p^\mu p^\nu\right]\hat\Delta_{\nu\lambda}(p) = \delta^\mu_\lambda, \tag{10.52}$$
which is solved by
$$\hat\Delta_{\mu\nu}(p) = \frac{-i\eta_{\mu\nu}}{p^2 - i\epsilon} + i(1 - \xi)\frac{p_\mu p_\nu}{(p^2 - i\epsilon)^2}, \tag{10.53}$$
so the position-space propagator is
$$\Delta_{\mu\nu}(x - y) = \int\frac{d^dp}{(2\pi)^d}\left[\frac{-i\eta_{\mu\nu}}{p^2 - i\epsilon} + i(1 - \xi)\frac{p_\mu p_\nu}{(p^2 - i\epsilon)^2}\right]e^{ip\cdot(x-y)}. \tag{10.54}$$
Anyone sane would look at these expressions and immediately set ξ = 1, which is called Feynman gauge.
There are however a few other options which are considered, most commonly the Lorenz gauge ξ = 0, in
which case the Gaussian in the path integral becomes a δ-function imposing the Lorenz gauge condition.
True masochists can work with ξ as a free parameter, as it gives a useful check that any physical observable
must be independent of ξ. We however will stick with Feynman gauge, so from now on our photon propagator
is
$$\Delta_{\mu\nu}(x - y) = \int\frac{d^dp}{(2\pi)^d}\frac{-i\eta_{\mu\nu}}{p^2 - i\epsilon}\, e^{ip\cdot(x-y)} = \eta_{\mu\nu}G_F(x - y). \tag{10.55}$$
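As a sanity check that (10.53) really does invert the gauge-fixed kinetic operator (10.51), here is a minimal numerical sketch, assuming a mostly-plus metric and a generic off-shell momentum (the specific numbers are arbitrary):

```python
import numpy as np

xi = 0.37                               # arbitrary gauge-fixing parameter
eta = np.diag([-1.0, 1.0, 1.0, 1.0])    # mostly-plus metric, d = 4
p_up = np.array([0.3, 1.1, -0.7, 0.4])  # generic off-shell momentum p^mu
p_dn = eta @ p_up                       # p_mu
p2 = p_up @ p_dn                        # p^2

# momentum-space kinetic operator i[ p^2 eta^{mu nu} - (1 - 1/xi) p^mu p^nu ]
K = 1j * (p2 * eta - (1 - 1 / xi) * np.outer(p_up, p_up))
# propagator (10.53), with the i*epsilon dropped since p^2 is nonzero here
D = -1j * eta / p2 + 1j * (1 - xi) * np.outer(p_dn, p_dn) / p2**2

print(np.allclose(K @ D, np.eye(4)))    # True: K^{mu nu} Delta_{nu lambda} = delta^mu_lambda
```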
Figure 22: Leading diagrams for the two-point function of the gauge-invariant operator $\overline\psi\psi$ (self-contractions are removed by normal ordering).
Figure 23: Leading diagrams for the photon propagator.
Instead of the Yukawa interaction $-ig\phi\overline\psi\psi$, which supplies a factor of g for each vertex, we now have the QED interaction $-qa_\mu\overline\psi\gamma^\mu\psi$, which supplies a factor of $-iq\gamma^\mu$ for each vertex. The $\gamma^\mu$ matrix is multiplied by the fermion propagators which are attached to the vertex, with the multiplication going from right to left in the direction of the fermion arrows as in Yukawa theory.
Instead of the scalar propagator
$$\hat G_F(p) = \frac{-i}{p^2 + m_\phi^2 - i\epsilon} \tag{10.56}$$
we now have the photon propagator
$$\hat\Delta_{\mu\nu}(p) = \frac{-i\eta_{\mu\nu}}{p^2 - i\epsilon}. \tag{10.57}$$
In order to use these expressions the external operators must be gauge-invariant.
In the next section we will learn how to compute perturbative scattering amplitudes in QED.
Problems:
1. Show that the Maxwell kinetic term indeed can be written as (10.41) with Aµν given by (10.42), and
also that (10.53) solves (10.52).
2. Starting from (10.53), write an expression for the free time-ordered two-point function of Fµν in
momentum space. Make sure that your answer is independent of ξ.
3. Write out position and momentum space expressions for the one-loop correction to the Fµν two-point
function from the diagram shown in figure 23. You don’t need to evaluate the trace or the loop integrals.
4. In this problem we’ll pursue one approach to the Dirac quantization of magnetic monopoles. One way
to think about magnetic monopoles is that on a two-sphere surrounding the monopole the gauge field
is given in spherical coordinates by44
$$A_\phi = -\frac{n}{2e}(1 + \cos\theta). \tag{10.59}$$
Show that this gauge potential describes a constant magnetic flux through the sphere, whose integral
is
$$\int\vec B\cdot d\vec A = \frac{2\pi n}{e}. \tag{10.60}$$
Compute the holonomy
$$h = \int_0^{2\pi} d\phi\, A_\phi \tag{10.61}$$
around a circle of constant θ. Does this vanish at θ = π? How about at θ = 0? In U (1) gauge theory
the Wilson line
W = eieh (10.62)
should become the identity for a circle of vanishing size. Applying this to the circle at θ = 0, what do
we learn about n?
5. Extra credit: show that the boundary term in (10.35) indeed vanishes for d = 4. You’ll need to
remember how to solve Maxwell’s equations in Coulomb gauge given a fixed background current, and
you’ll also need to be a bit clever.
44 Beware that here $A_\phi$ is the $\phi$ component of the one-form A, i.e. the one-form is $A_\phi d\phi$. $A_\phi$ is not the component of a vector
A in the direction of a unit vector ϕ̂, so if you want to use formulas for the spherical curl e.g. in Griffiths then you need to
figure out how to rescale Aϕ . Your life will be easier if you instead just integrate the two-form Fµν on the two-sphere using the
intrinsic definition of the integral of a two-form on a two-manifold.
for the charged fields):
$$\langle\Omega|T\, O_N[A, \Psi, \overline\Psi]\dots O_1[A, \Psi, \overline\Psi]|\Omega\rangle = \frac{\int Da\, D\psi\, D\overline\psi\, O_N[a, \psi, \overline\psi]\dots O_1[a, \psi, \overline\psi]\, e^{iS_\epsilon[a, \psi, \overline\psi] - \frac{i}{2\xi}\int d^dx(\partial_\mu a^\mu)^2}}{\int Da\, D\psi\, D\overline\psi\, e^{iS_\epsilon[a, \psi, \overline\psi] - \frac{i}{2\xi}\int d^dx(\partial_\mu a^\mu)^2}}, \tag{11.1}$$
where O1 . . . ON are all gauge-invariant. We also saw how to evaluate this path integral perturbatively using
Feynman diagrams, in which we chose to set ξ = 1 and use the photon propagator
$$\hat\Delta_{\mu\nu}(p) = \frac{-i\eta_{\mu\nu}}{p^2 - i\epsilon}. \tag{11.2}$$
As usual the spinor propagator is
$$\hat S_F(p) = \frac{i(\slashed p + im)}{p^2 + m^2 - i\epsilon}, \tag{11.3}$$
where m is the (bare) mass of the electron, and the interaction vertex factor is −iqγ µ . The next step in
most QFT textbooks is to take each $O_i$ in (11.1) to be $A_\mu$, $\Psi$, or $\overline\Psi$, and then apply the LSZ formula to
extract the S-matrix by taking the external momenta to be on shell and stripping off the resulting poles.
Unfortunately however these operators are not gauge-invariant, so this manipulation is not justified without
some further explanation.
For Aµ the idea is simple: what the LSZ procedure really requires us to do is take the Fourier transform
of Aµ , take the external momentum to be on-shell, and then contract with ϵµ∗ (p, σ) for a photon in the final
state or ϵµ (p, σ) for a photon in the initial state. In the path integral this means that the operator we are
really inserting is
$$e^{\mu*}(\vec p, \sigma)\int d^dx\, e^{-ip\cdot x}a_\mu(x) \tag{11.4}$$
for a photon in the final state, or
$$e^{\mu}(\vec p, \sigma)\int d^dx\, e^{ip\cdot x}a_\mu(x) \tag{11.5}$$
for a photon in the initial state, with $p^0 = |\vec p|$. The polarization vectors here are chosen in some arbitrary gauge,
for example we can use the Coulomb gauge eµ . Regardless of the choice we make the operators (11.4) and
(11.5) are actually gauge-invariant, since under a gauge transformation
a′µ = aµ + ∂µ Ω (11.6)
the Fourier transform of aµ changes by ipµ Ω̂(p) and on-shell we have e · p = e∗ · p = 0. Thus we can use
(11.4) and (11.5) in (11.1) with no problem.45
The situation is substantially more complicated for the electron/positron fields $\Psi$ and $\overline\Psi$. As explained in the previous section, to make these gauge-invariant we need to dress them using a classical solution of Gauss's law:
$$\widetilde\Psi(x) = e^{iq\int_{\Sigma_t}d^{d-1}x\,\vec E_{\rm cl}\cdot\vec A}\,\Psi(x). \tag{11.7}$$
We can think of the dressing as creating a coherent state of the electromagnetic field, and in a proper treatment of the asymptotic states of QED this coherent state must be included in order to get sensible transition amplitudes. The theory of these asymptotic states goes well beyond what we can reasonably cover in this class; there is a nice four-paper series “coherent soft-photon states and infrared divergences” by Kibble from 1968 which I encourage you to look at if you want to learn more about it. There is also a famous paper by Faddeev and Kulish a few years later which is more commonly cited; it somewhat simplifies Kibble's formalism but the connection to LSZ is less clear. I will content myself with just making a few comments:
45 Of course we could also just use the gauge-invariant operator F
µν in the LSZ formula, which would remove the need for this
discussion, but that introduces an unnecessary extra index and for the non-abelian gauge theories we’ll discuss next semester
the analogue of Fµν isn’t gauge-invariant so we might as well stick with Aµ .
Due to the nonlocality of $\widetilde\Psi$ and the masslessness of the photon, when we Fourier transform correlation functions involving $\widetilde\Psi$ and go on the mass shell the singularity we find is modified from a simple pole to a branch cut. Extracting the scattering amplitude now requires isolating the coefficient of this cut.
The details of the solution E⃗ cl end up not contributing to the coefficient of the branch cut as the
external momentum goes on shell, it is only the small-momentum part of its Fourier transform which
matters and this serves primarily to cancel certain “soft photon infrared divergences” that arise at
higher orders in perturbation theory applied to the “bare” correlation function.
Indeed in the standard textbook treatment of QED the dressing is neglected entirely: one simply inserts
bare charged fields into the path integral (11.1) and then tries to extract an S-matrix. A price is paid
however for doing this: the dressing factors no longer cancel the soft photon infrared divergences and
in fact the latter set the scattering amplitude to zero!46 The textbook approach to this (see chapter
13 of Weinberg for a nice presentation) is to introduce a small photon mass to regulate the divergence,
square the regularized amplitude to get a differential cross section, and then sum over the number of
low-energy photons in the final state. One can then argue that this “inclusive” differential cross section
is finite in the limit that the mass of the photon goes to zero.
I must confess that I was never happy with the standard approach described in this third bullet point -
why should we need to introduce a nonzero photon mass when the correlation functions of gauge-invariant
operators are all finite? Is it really true that it is only the squares of transition amplitudes which make
sense in QED, and if so is this a modification of quantum mechanics? The approach based on dressing and
coherent states described by Kibble/Faddeev/Kulish is much more satisfying to me, since charged particles
really do come together with the Coulomb fields and neglecting this is a crime for which one should be and
is punished. This does not mean that the differential cross sections computed by the standard method are
wrong of course, but to properly interpret them it is better to remember the dressing.
Fortunately for us the QED calculations we will actually do are either at lowest order in perturbation
theory, in which case we can neglect the dressing part of $\widetilde\Psi$, or they involve one-loop diagrams which can be
interpreted entirely in terms of gauge-invariant correlation functions where these infrared subtleties are not
important. We thus will for the most part leave the infrared problem here.
For external photons we can instead use the gauge-invariant operator $F_{\mu\nu}$, whose one-photon matrix element defines the constant $Z_\gamma$:
$$\langle\Omega|F_{\mu\nu}(0)|\vec k, \sigma, \gamma\rangle = Z_\gamma\frac{i\big(k_\mu e_\nu(\vec k, \sigma) - k_\nu e_\mu(\vec k, \sigma)\big)}{\sqrt{2\omega_{\vec k}}}, \tag{11.8}$$
where |⃗k, σ, γ⟩ is a one-photon state (one-photon states are well-defined in QED, unlike one-electron states
since the latter need to be dressed). To get an element of $i\widetilde{\mathcal M}_c$ we want to remove the LSZ factors of
$$\sum_{\sigma'}\frac{-i\sqrt{2\omega_{\vec k'}}}{Z_\gamma}\, e_\mu(\vec k', \sigma')\,\frac{1}{(k')^2 - i\epsilon}\,\langle\vec k', \sigma', \gamma| \tag{11.9}$$
Figure 24: External leg factors for computing the connected covariant amplitude $\widetilde{\mathcal M}_c$ in quantum electrodynamics. As in Yukawa theory we should sum over all pruned connected diagrams, with the external propagators replaced by these factors. The dot indicates the interaction vertex which is connected to the rest of the diagram, so the top row are final state factors and the bottom row are initial state factors.
for each initial-state photon in correlation functions involving Aµ . The exact external propagator legs in the
correlation function near the mass shell each have the form
$$Z_\gamma^2\,\frac{-i\eta_{\mu\nu}}{k^2 - i\epsilon}\, J^\nu, \tag{11.11}$$
where J ν indicates the current that the propagator is contracted with at the interaction vertex, and so
to extract the scattering states we should strip off the exact external propagator by only drawing pruned
diagrams and then contracting each interaction current J µ which is attached to an external line with a factor
of $Z_\gamma e_\mu(\vec k', \sigma')$ for each photon in the final state and a factor of $Z_\gamma e_\mu(\vec k, \sigma)$ for each photon in the initial state (the factors of $\sqrt{2\omega_{\vec k}}$ are absorbed into the definition of $i\widetilde{\mathcal M}_c$, which removes these factors and also the overall momentum-conserving δ-function). These rules are shown in figure 24.
In the traditional approach to the renormalization of QED the phase of Ψ and the sign of Aµ are adjusted
so that both Zψ and Zγ are real and positive, and one then defines the conventional factors
$$Z_\psi = \sqrt{Z_2}, \qquad Z_\gamma = \sqrt{Z_3}. \tag{11.12}$$
We will see in the next section that the electron charge q must be renormalized, and this renormalization is
conventionally parametrized by a constant Z1 > 0 via
$$q = \sqrt{Z_3}\,\frac{Z_2}{Z_1}\, q_0, \tag{11.13}$$
where q0 is the bare electron charge. In fact however this constant is not free: if we define the renormalized
fields
$$\begin{aligned}
\Psi_R &= Z_2^{-1/2}\Psi\\
A_{R,\mu} &= Z_3^{-1/2}A_\mu, \tag{11.14}
\end{aligned}$$
then to preserve the gauge invariance of the gauge covariant derivative we must have
$$\partial_\mu - iq_R A_{R,\mu} = \partial_\mu - iq_0 A_\mu, \tag{11.15}$$
Figure 25: The tree-level contribution to e+ e− → µ+ µ− .
and thus
Z1 = Z2 . (11.16)
The electron mass m is also renormalized as
mR = m + δm, (11.17)
so to work out the renormalization of QED we need to compute the three quantities Z2 , Z3 , and δm. We
will return to this in the next section when we consider loop diagrams.
where in the second line we’ve adopted the same abbreviated notation for u and v that we used in Yukawa
theory and also set ϵ → 0, since at tree level there are no integrals over momentum. To compute
the spin-averaged differential cross section we need to square this and sum/average over the external spins.
Noting that
$$(\overline f\gamma^\mu g)^* = g^\dagger\gamma^0\gamma^\mu\gamma^0\gamma^{0\dagger}f = \overline g\gamma^\mu f \tag{11.19}$$
for f , g equal to any u or v, we have (setting d = 4)
$$\begin{aligned}
\sum_{\sigma,\sigma'}|\widetilde{\mathcal M}_c|^2 &= \frac{q^4}{(k_1 + k_2)^4}\sum_{\sigma,\sigma'}\overline u_{1'}\gamma^\mu v_{2'}\,\overline v_{2'}\gamma^\nu u_{1'}\,\overline v_2\gamma_\mu u_1\,\overline u_1\gamma_\nu v_2\\
&= \frac{q^4}{(k_1 + k_2)^4}\,{\rm Tr}\Big[(\slashed k_{1'} + im_\mu)\gamma^\mu(\slashed k_{2'} - im_\mu)\gamma^\nu\Big]\,{\rm Tr}\Big[(\slashed k_2 - im_e)\gamma_\mu(\slashed k_1 + im_e)\gamma_\nu\Big]\\
&= \frac{16q^4}{(k_1 + k_2)^4}\Big[(k_1')^\mu(k_2')^\nu + (k_1')^\nu(k_2')^\mu - (k_1'\cdot k_2')\eta^{\mu\nu} + m_\mu^2\eta^{\mu\nu}\Big]\Big[(k_1)_\mu(k_2)_\nu + (k_1)_\nu(k_2)_\mu - (k_1\cdot k_2)\eta_{\mu\nu} + m_e^2\eta_{\mu\nu}\Big]\\
&= \frac{32q^4}{(k_1 + k_2)^4}\Big[(k_1\cdot k_1')(k_2\cdot k_2') + (k_1\cdot k_2')(k_2\cdot k_1') - m_\mu^2(k_1\cdot k_2) - m_e^2(k_1'\cdot k_2') + 2m_\mu^2 m_e^2\Big]. \tag{11.20}
\end{aligned}$$
As usual we can convert this into a spin summed/averaged differential cross section in the center of mass
frame using the formula
$$\frac{d\sigma_{\rm ave}}{d\Omega} = \frac{|k'|}{64\pi^2 E_{\rm tot}^2|k|}\times\frac{1}{4}\sum_{\sigma,\sigma'}|\widetilde{\mathcal M}_c|^2. \tag{11.21}$$
In the center of mass frame the particles all have the same energy ω = Etot /2, with
$$|k| = \sqrt{\omega^2 - m_e^2}, \qquad |k'| = \sqrt{\omega^2 - m_\mu^2}, \tag{11.22}$$
We thus have
$$\begin{aligned}
\frac{1}{4}\sum_{\sigma,\sigma'}|\widetilde{\mathcal M}_c|^2 &= \frac{q^4}{2}\left[\Big(1 - \sqrt{1 - \tfrac{m_\mu^2}{\omega^2}}\sqrt{1 - \tfrac{m_e^2}{\omega^2}}\cos\theta\Big)^2 + \Big(1 + \sqrt{1 - \tfrac{m_\mu^2}{\omega^2}}\sqrt{1 - \tfrac{m_e^2}{\omega^2}}\cos\theta\Big)^2 + \frac{2}{\omega^2}\big(m_\mu^2 + m_e^2\big)\right]\\
&= q^4\left[1 + \frac{m_\mu^2 + m_e^2}{\omega^2} + \Big(1 - \frac{m_\mu^2}{\omega^2}\Big)\Big(1 - \frac{m_e^2}{\omega^2}\Big)\cos^2\theta\right], \tag{11.24}
\end{aligned}$$
and therefore
$$\frac{d\sigma_{\rm ave}}{d\Omega} = \frac{q^4\sqrt{1 - \frac{m_\mu^2}{\omega^2}}}{64\pi^2 E_{\rm tot}^2\sqrt{1 - \frac{m_e^2}{\omega^2}}}\left[1 + \frac{m_\mu^2 + m_e^2}{\omega^2} + \Big(1 - \frac{m_\mu^2}{\omega^2}\Big)\Big(1 - \frac{m_e^2}{\omega^2}\Big)\cos^2\theta\right]. \tag{11.25}$$
The spin summed/averaged total cross section is
\sigma_{\rm ave} = \int d\Omega\,\frac{d\sigma_{\rm ave}}{d\Omega}
  = \frac{q^4}{16\pi E_{\rm tot}^2}\,\sqrt{\frac{1-m_\mu^2/\omega^2}{1-m_e^2/\omega^2}}\left[1 + \frac{m_\mu^2+m_e^2}{\omega^2} + \frac13\left(1-\frac{m_\mu^2}{\omega^2}\right)\left(1-\frac{m_e^2}{\omega^2}\right)\right]
  = \frac{q^4}{12\pi E_{\rm tot}^2}\,\sqrt{\frac{1-m_\mu^2/\omega^2}{1-m_e^2/\omega^2}}\left[1 + \frac{m_\mu^2+m_e^2}{2\omega^2} + \frac{m_e^2 m_\mu^2}{4\omega^4}\right] .     (11.26)
Note that the cross section vanishes when ω = mµ and must be taken to zero below this, since for ω < mµ
the electron and positron do not have enough energy to produce the muon and antimuon. This kind of
sudden turning on of a production cross section is called a threshold.
In the real world we have
m_e \approx 5.1\times 10^5\ {\rm eV} , \qquad m_\mu \approx 1.1\times 10^8\ {\rm eV} ,     (11.27)
so to good accuracy we can set me = 0 in (11.26); the result is a bit simpler but still has clear threshold
behavior. In the vicinity of the threshold the angular dependence of the differential cross section is suppressed,
so when the muon/antimuon pair is produced almost at rest their momenta are essentially equally likely to be
pointing in any direction. On the other hand in the ultra high energy limit ω ≫ mµ we can simply take mµ
to zero, leading to the nice expressions
\frac{d\sigma_{\rm ave}}{d\Omega} = \frac{q^4}{64\pi^2 E_{\rm tot}^2}\left(1+\cos^2\theta\right) , \qquad \sigma_{\rm ave} = \frac{q^4}{12\pi E_{\rm tot}^2} .     (11.29)
In particular there is now strong angular dependence of the differential cross section: it is peaked at θ = 0
(“forward scattering”) and θ = π (“backward scattering”) and minimized at θ = π/2.
In the total cross section the power of q can be immediately seen from
the diagram and the power of Etot is set by dimensional analysis, so we only need to evaluate the diagram
to get the 12π.
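If you would like to see the threshold explicitly, here is a minimal numerical sketch of (11.26) in Python. It is only a plug-in of the formula above in natural units (with q² = 4πα); the function name and the sample energies are just for illustration.

import numpy as np

alpha = 1 / 137.036            # fine-structure constant, so q^2 = 4*pi*alpha
q2 = 4 * np.pi * alpha
m_e, m_mu = 0.511e6, 105.7e6   # lepton masses in eV

def sigma_mumu(omega):
    """Spin-averaged total cross section (11.26) for e+e- -> mu+mu-, natural units (1/eV^2)."""
    if omega > m_mu:           # above threshold
        E_tot = 2 * omega
        ratio = np.sqrt((1 - m_mu**2 / omega**2) / (1 - m_e**2 / omega**2))
        bracket = 1 + (m_mu**2 + m_e**2) / (2 * omega**2) + (m_e**2 * m_mu**2) / (4 * omega**4)
        return q2**2 / (12 * np.pi * E_tot**2) * ratio * bracket
    return 0.0                 # below threshold the muon pair cannot be produced

# threshold behavior and approach to the massless limit q^4/(12 pi E_tot^2)
for omega in [1.01 * m_mu, 2 * m_mu, 10 * m_mu]:
    massless = q2**2 / (12 * np.pi * (2 * omega)**2)
    print(f"omega = {omega/1e9:5.2f} GeV   sigma = {sigma_mumu(omega):.3e}   massless limit = {massless:.3e}")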
Figure 26: Experimental measurement of the R ratio of σave (e+ e− → hadrons) to the high-energy limit of
σave (e+ e− → µ+ µ− ). The jumps around 3 GeV and 8 GeV correspond to the thresholds for the charm and
bottom quarks respectively, with the various other peaks corresponding to hadronic resonances not captured
by our simple tree-level analysis. The solid black line is the tree-level result described in the text. Image
shamelessly stolen from Peskin and Schroeder.
Since quarks are electrically charged, we can pair-produce them by colliding electron and positron pairs just
as we saw for muons in the previous subsection. At tree level the diagram is the same as in figure 25, except
that we should label the final state quarks as q and q̄ instead of µ− and µ+ . Moreover the electron charge
q should be replaced by the quark electric charge at the upper interaction vertex (but not the lower one), the
muon mass should be replaced by the quark mass, and we should multiply by a factor of 3 since there are
three colors of quark. Roughly speaking we thus should expect the total cross section for e+ e− → q q̄ to be
given by
\sigma_{\rm ave}(e^+e^-\to q\bar q) = R\,\sigma(e^+e^-\to\mu^+\mu^-) ,     (11.30)
with
R = 3\sum_i q_i^2 ,     (11.31)
with the sum being over all quark types whose mass is less than Ecom /2 and qi being the electric charge of
the ith type of quark. There is a major problem with this expectation however: in QCD the potential energy
between a quark and antiquark grows linearly with distance, so if we try to separate them eventually there
is enough energy to pull more quarks and antiquarks into existence. These additional quarks/antiquarks
combine with the original quark and antiquark to form hadrons, which are bound states of quarks and
gluons such as pions, protons, and neutrons that are uncharged under the strong force. This dynamical
process is called hadronization, and it is the mechanism which enforces confinement - one can never
find quarks, antiquarks, or gluons in isolation. You will learn more about confinement next semester. The
amazing thing however is that hadronization happens at energy scales which are of order the QCD scale ΛQCD,
and as long as we are colliding our electron-positron pair at energies which are large compared to this then
we can cleanly separate the perturbative scattering process shown in figure 25 (with the muon/antimuon
replaced by a quark/antiquark) from the complicated hadronization dynamics. Indeed the only effect of
hadronization is to replace the outgoing quarks by jets, which are collimated streams of hadrons that are
moving roughly in the direction the initial quarks were moving in. Therefore a better version of (11.30) is
obtained by replacing the q q̄ final state with a hadronic one, i.e. \sigma_{\rm ave}(e^+e^-\to{\rm hadrons}) = R\,\sigma(e^+e^-\to\mu^+\mu^-),
with R again given at first pass by (11.31). We are then looking at the total cross
section where the electron-positron pair produces some kind of hadronic final state - we can hope that away
from special energies this will be dominated by back-to-back jets coming from a q q̄ pair. Let’s see what to
expect:
expect:
For center of mass energies above ΛQCD but below 2mc ≈ 2.6 GeV the R ratio is
3\left(\frac{2}{3}\right)^2 + 6\left(\frac{1}{3}\right)^2 = 2 .     (11.34)
For center of mass energies above 2mc but below 2mb ≈ 8.4 GeV the R ratio is
2 + 3\left(\frac{2}{3}\right)^2 = \frac{10}{3} \approx 3.3 .     (11.35)
For center of mass energies above 2mb and below 2mt , the latter of which has not been reached in any
e+ e− collider, the R ratio is
\frac{10}{3} + 3\left(\frac{1}{3}\right)^2 = \frac{11}{3} \approx 3.7 .     (11.36)
The experimental data (as of 1995, sorry) is shown in figure 26, as you can see this rather naive prediction
actually works quite well! It also is quite sensitive to the details of QCD, to get these numbers we needed
to know about how many quarks there are of each charge and also their masses.
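Here is a small Python sketch of this counting; the quark masses used below are rough illustrative values whose only role is to decide which thresholds have been crossed.

# Tree-level R ratio (11.31): R = 3 * (sum of squared charges of kinematically accessible quarks).
quarks = {"u": (2/3, 0.002), "d": (-1/3, 0.005), "s": (-1/3, 0.1),
          "c": (2/3, 1.3), "b": (-1/3, 4.2), "t": (2/3, 173.0)}   # (charge, rough mass in GeV)

def R(E_com):
    """Naive R ratio at center-of-mass energy E_com (GeV), counting quarks with 2m below E_com."""
    return 3 * sum(charge**2 for charge, mass in quarks.values() if E_com > 2 * mass)

for E in [2.0, 5.0, 20.0]:
    print(f"E_com = {E:5.1f} GeV   R = {R(E):.3f}")
# expected: 2.0 below the charm threshold, 10/3 ~ 3.33 below bottom, 11/3 ~ 3.67 above it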
(k_1+k_2)^2 + m^2 = 2\,k_1\cdot k_2 , \qquad (k_1-k_2')^2 + m^2 = -2\,k_1\cdot k_2'
and
(\slashed p + im)\gamma^\mu u(\vec p,\sigma) = \gamma^\mu(-\slashed p + im)u(\vec p,\sigma) + 2p^\mu u(\vec p,\sigma) = 2p^\mu u(\vec p,\sigma) ,     (11.38)
Figure 27: Compton scattering at tree level.
which give
′
!
∗ ∗ ∗
2
fc (e γ → e γ) = −iq u1′ /e2′ k/2 /e2 + 2(k1 · e2 )/e2′ /e k/ /e ′ − 2(k1 · e∗2′ )/e2
iM − −
+ 2 2 2 u1 . (11.39)
2 k1 · k2 k1 · k2′
Note that on the right-hand side we have already summed over the electron spins, but we have left the
photon polarization sums to be evaluated.
In evaluating the polarization sum here there is a quite useful trick. In Coulomb gauge we saw that the
polarization sum is given by
\sum_\sigma e_i(\vec p,\sigma)\,e^*_j(\vec p,\sigma) = \delta_{ij} - \frac{p_i p_j}{|p|^2} ,     (11.41)
while any sum involving e0 vanishes. We can write this in a more covariant way by introducing a null rigging
vector ℓµ , which is defined to obey
\ell^\mu p_\mu = -1 , \qquad \ell^\mu e_\mu(\vec p,\sigma) = 0 .     (11.42)
More concretely in d = 4 if p^\mu = (\omega,0,0,\omega) then we have \ell^\mu = \frac{1}{2\omega}(1,0,0,-1). We then have
\sum_\sigma e_\mu(\vec p,\sigma)\,e^*_\nu(\vec p,\sigma) = \eta_{\mu\nu} + \ell_\mu p_\nu + p_\mu \ell_\nu ,     (11.43)
σ
as you can check by contracting this tensor with pµ , ℓµ , and eµ in either index and seeing that it agrees with
(11.41). This expression actually holds in any gauge provided that after a gauge transformation e′µ = eµ +αpµ
we choose a new rigging vector so that (11.42) continues to hold. The nice thing about this expression is
that in Feynman amplitudes the external polarization one-forms are always contracted with an insertion of
the electric current j^\mu = -q\bar\psi\gamma^\mu\psi. Since this current is conserved, if we replace any external factor e_\mu(\vec p,\sigma)
or e^*_\mu(\vec p,\sigma) with p_\mu in a squared scattering amplitude then we must get zero by the current conservation
equation p_\mu j^\mu = 0. This statement is called the Ward-Takahashi identity, as it is a version of the Ward
identity we discussed last semester for correlation functions involving conserved currents.47 This means that
in expressions such as (11.40), where both indices are contracted with a scattering amplitude, we can simply
make the replacement
\sum_\sigma e_\mu(\vec p,\sigma)\,e^*_\nu(\vec p,\sigma) \to \eta_{\mu\nu} .     (11.45)
This is essentially the same reason that we could replace the complicated Coulomb gauge propagator by the
Feynman gauge propagator.
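If you prefer not to do the contractions by hand, here is a quick numerical check of (11.43) in Python for p^µ = (ω, 0, 0, ω), assuming the mostly-plus metric η = diag(−1, 1, 1, 1) used in these notes; the value of ω is arbitrary.

import numpy as np

w = 2.7
eta = np.diag([-1.0, 1.0, 1.0, 1.0])
p_up = np.array([w, 0.0, 0.0, w])                   # p^mu
l_up = np.array([1.0, 0.0, 0.0, -1.0]) / (2 * w)    # rigging vector ell^mu, (11.42)
p_dn, l_dn = eta @ p_up, eta @ l_up                 # lower the indices

# the two transverse Coulomb-gauge polarizations for propagation along z
e1 = np.array([0.0, 1.0, 0.0, 0.0])
e2 = np.array([0.0, 0.0, 1.0, 0.0])

lhs = np.outer(e1, e1) + np.outer(e2, e2)           # sum over sigma of e_mu e*_nu (real here)
rhs = eta + np.outer(l_dn, p_dn) + np.outer(p_dn, l_dn)

print(np.allclose(lhs, rhs))                        # True
print(l_up @ p_dn)                                  # -1, checking (11.42)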
Returning now to our Compton scattering calculation, we have
′
" !
X
2 q4 ′ γ ν k/2 γ µ + 2k1µ γ ν γ µ k/2 γ ν − 2k1ν γ µ
|Mc | = ηµα ηνβ Tr (/
f k 1 + im) +
4 k1 · k2 k1 · k2′
σ,σ ′
′
!#
γ α k/2 γ β + 2k1α γ β γ β k/2 γ α − 2k1β γ α
× (/k 1 + im) + . (11.46)
k1 · k2 k1 · k2′
From here it is “just” a slog in computing traces of γ-matrices to evaluate this, I’ve left the details to you
in the homework and the result is:
\frac14\sum_{\sigma,\sigma'}|\widetilde{\mathcal M}_c|^2 = 2q^4\left[\frac{k_1\cdot k_2}{k_1\cdot k_2'} + \frac{k_1\cdot k_2'}{k_1\cdot k_2} + 2m^2\left(\frac{1}{k_1\cdot k_2'} - \frac{1}{k_1\cdot k_2}\right) + m^4\left(\frac{1}{k_1\cdot k_2'} - \frac{1}{k_1\cdot k_2}\right)^2\right] .     (11.47)
The next step is to convert this expression to a differential cross section. So far we have been doing this
in the center of mass frame, but Compton scattering is usually studied in the “lab” frame where the initial
electron is at rest. We therefore need to revisit the relationship between the covariant matrix element and
the differential cross section. In the lab frame we have
k_1 = (m,\vec 0) , \quad k_2 = (\omega,\omega\hat n) , \quad k_1' = (E_1',\vec k_1') , \quad k_2' = (\omega',\omega'\hat n') ,     (11.48)
where n̂ and n̂′ are unit vectors, and with a little effort one can show that the energy of the outgoing photon
is
\omega' = \frac{\omega}{1 + \frac{\omega}{m}(1-\cos\theta)} ,     (11.49)
where θ is the angle between n̂ and n̂′ . An analysis similar to the one we did in the center of mass frame
(see Peskin and Schroeder for help) then shows that the relationship between squared matrix element and
47 This argument is a bit too quick, as the Ward identity we derived last semester had some contact terms on the right-hand
side:
\partial_\mu\langle T J^\mu(x)O_1(y_1)\ldots O_n(y_n)\rangle = i\sum_{m=1}^n \delta^d(x-y_m)\,\langle T O_1(y_1)\ldots \delta_S O(y_m)\ldots O_n(y_n)\rangle + \ldots ,     (11.44)
where δS O is the symmetry transformation of O and the . . . indicates possible non-universal terms proportional to derivatives
of δ d (x − ym ) which are called Schwinger terms. To get from this to a statement about scattering amplitudes we can observe
that QED scattering amplitudes have the form eµ1 ∗ (⃗k ′1 , σ1′ ) . . . eν1 (k⃗1 , σ1 ) . . . ⟨β|T J µ1 (k1′ ) . . . J ν1 (k1 ) . . . |α⟩, where |α⟩ and |β⟩
are the initial and final states of the electrons/positrons. If we replace one of the e or e∗ by its associated kµ we can replace the
current correlator by the Fourier transform of the right-hand side of (11.44). The universal contact term does not contribute
since the currents are all neutral under the gauge symmetry. It is more subtle to argue that there are no Schwinger term
contributions; for spinor electrodynamics the argument is essentially that the current J^\mu = -q\bar\Psi\gamma^\mu\Psi involves no derivatives so
when we compute [J 0 , J ν ] there is no place for a derivative of a δ-function to come from. A diagrammatic argument leading to
the same conclusion is given in Peskin and Schroeder.
the differential cross section in the lab frame for the elastic scattering of a massless particle off of a massive
particle is given by
\frac{d\sigma}{d\Omega} = \frac{(\omega')^{d-2}}{16 m^2\omega^2\,(2\pi)^{d-2}}\,|\widetilde{\mathcal M}_c|^2 .     (11.50)
Specializing to d = 4 the spin summed/averaged differential cross section is thus
\frac{d\sigma_{\rm ave}}{d\Omega} = \frac{(\omega')^2}{64\pi^2 m^2\omega^2}\,\frac14\sum_{\sigma,\sigma'}|\widetilde{\mathcal M}_c|^2 .     (11.51)
In the lab frame we also have
k_1\cdot k_2 = -m\omega , \qquad k_1\cdot k_2' = -m\omega' ,     (11.52)
so combining everything we at last arrive at the famous Klein-Nishina formula for the differential cross
section of Compton scattering:
\frac{d\sigma_{\rm ave}}{d\Omega} = \frac{q^4}{32\pi^2 m^2}\left(\frac{\omega'}{\omega}\right)^2\left[\frac{\omega}{\omega'} + \frac{\omega'}{\omega} + 2m\left(\frac1\omega - \frac1{\omega'}\right) + m^2\left(\frac1{\omega'} - \frac1\omega\right)^2\right]
  = \frac{q^4}{32\pi^2 m^2}\left(\frac{\omega'}{\omega}\right)^2\left[\frac{\omega}{\omega'} + \frac{\omega'}{\omega} - \sin^2\theta\right] ,     (11.53)
where ω ′ is given in terms of ω and the scattering angle θ by equation (11.49). In particular in the non-
relativistic limit ω ≪ m we have ω ′ ≈ ω and thus
\frac{d\sigma_{\rm ave}}{d\Omega} \approx \frac{q^4}{32\pi^2 m^2}\left(1+\cos^2\theta\right) ,     (11.54)
which a glance at an electromagnetism textbook (or wikipedia) will convince you is indeed the differential
cross section for the Thomson scattering of light off of a charged particle in classical electromagnetism. It
is peaked at forward and backward scattering just as we found for the high-energy limit of e+ e− → µ+ µ− .
The total cross section in this limit is
\sigma_{\rm ave} = \frac{q^4}{6\pi m^2} ,     (11.55)
so as far as low-energy photons are concerned the “size” of an electron is
r_{\rm thomson} = \sqrt{\sigma_{\rm ave}/\pi} = \frac{q^2}{\sqrt6\,\pi m} .     (11.56)
Note that this is smaller than the Compton wavelength 1/m of the electron by a factor of \frac{4}{\sqrt6}\alpha \approx .01, with
α being the fine-structure constant. On the other hand in the high-energy limit we instead have
\frac{d\sigma_{\rm ave}}{d\Omega} \approx \frac{q^4}{32\pi^2\, m\,\omega\,(1-\cos\theta)} ,     (11.57)
which is peaked at forward scattering as you might expect for high energy scattering off of a fixed target.
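Here is a small Python sketch of the Klein-Nishina formula (11.53), including a numerical check of the Thomson limit (11.54)-(11.55). Nothing here goes beyond plugging into the formulas above in natural units (with q² = 4πα); the sample energies are arbitrary.

import numpy as np

alpha = 1 / 137.036
q2 = 4 * np.pi * alpha
m = 0.511e6   # electron mass in eV

def dsigma_dOmega(omega, theta):
    """Klein-Nishina differential cross section (11.53) in the lab frame, natural units."""
    omega_p = omega / (1 + (omega / m) * (1 - np.cos(theta)))   # outgoing photon energy, (11.49)
    return (q2**2 / (32 * np.pi**2 * m**2)) * (omega_p / omega)**2 * (
        omega / omega_p + omega_p / omega - np.sin(theta)**2)

# low-energy (Thomson) check at a fixed angle: ratio to (11.54) should be ~ 1
omega_low, theta = 1.0, 0.7   # a 1 eV photon
thomson = q2**2 * (1 + np.cos(theta)**2) / (32 * np.pi**2 * m**2)
print(dsigma_dOmega(omega_low, theta) / thomson)

# total Thomson cross section by numerical integration, compared with (11.55): should be ~ 1
thetas = np.linspace(0, np.pi, 2001)
integrand = 2 * np.pi * np.sin(thetas) * dsigma_dOmega(omega_low, thetas)
print(np.trapz(integrand, thetas) * 6 * np.pi * m**2 / q2**2)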
Problems:
1. Compute the differential cross section for electron-muon scattering e− µ− → e− µ− . You should find
that you can re-use some results from our analysis of e+ e− → µ+ µ− .
2. Derive the γ-matrix contraction identities
γ µ γµ = d
γ µ γ α γµ = −(d − 2)γ α
γ µ γ α γ β γµ = (d − 4)γ α γ β + 4η αβ
γ µ γ α γ β γ γ γµ = −2γ γ γ β γ α − (d − 4)γ α γ β γ γ .
3. Evaluate the trace from the spin sum in the Compton scattering expression (11.46), showing that
the result is (11.47). Hint: you would be wise to use the contraction identities from the previous
problem before computing the trace, as this way you never need to compute a trace of more than
four γ-matrices. If necessary you can look at Peskin and Schroeder for help, but beware that they use
different conventions for the metric and the γ-matrices.
4. Write out the connected covariant amplitude i\widetilde{\mathcal M}_c for Moller scattering e− e− → e− e− . Make sure to
include both diagrams.
5. Extra credit: Compute the spin summed/averaged differential cross section for Moller scattering.
where \tilde p is p except that its 0 component is p^0(1+i\epsilon), and we are now distinguishing between the bare mass
m_0 and the physical mass m that we will introduce in a moment. We can parametrize the exact propagator
as
\hat S_F^{\rm exact}(p) = \frac{i}{\slashed{\tilde p} - i\left(m_0 + \Sigma(\slashed{\tilde p})\right)} ,     (12.2)
where \Sigma(\slashed{\tilde p}) is a matrix called the self-energy of the electron. By Lorentz invariance it must have the form
\Sigma(\slashed{\tilde p}) = A(\tilde p^2) + i\slashed{\tilde p}\,B(\tilde p^2) ,     (12.3)
The physical mass is found by the location of the pole in p2 , so we have
m = \frac{m_0 + A(-m^2)}{1 + B(-m^2)} .     (12.6)
There is an easy way to remember this formula: we can rewrite it as
\delta m \equiv m - m_0 = \Sigma(\slashed p)\big|_{\slashed p = im} .     (12.7)
It is tempting to derive this “by inspection” from equation (12.2) (as most textbooks do), but it is not
actually possible to have p/ = im for some choice of momentum pµ (none of the γ-matrices are proportional
to the identity and indeed in the standard d = 4 representation they have only off diagonal components) so
that argument should be viewed as heuristic. We can also extract the field renormalization constant Z2 ; to
do this it is convenient to rewrite (12.5) as
\hat S_F^{\rm exact}(p) = \frac{i\left(\slashed p + i\,\frac{m_0+A(p^2)}{1+B(p^2)}\right)}{\left(1+B(p^2)\right)\left[p^2 + \left(\frac{m_0+A(p^2)}{1+B(p^2)}\right)^2 - i\epsilon\right]}     (12.8)
Figure 28: Leading contribution to the electron self-energy in spinor electrodynamics.
where in the second line we shifted the integration variable. The form (12.3) of Σ is now clear. The term
involving /ℓ integrates to zero by rotational invariance. The remaining terms go like 1/ℓ4 at large ℓ, so this
integral has a logarithmic UV divergence. We therefore need to regulate it, for example using Pauli-Villars
or dimensional regularization. In the former we subtract the same integrand with µ replaced by Λ, after
which the momentum integral can be evaluated in d = 4, e.g. using mathematica, to give
\Sigma(\slashed p) = \frac{q^2}{8\pi^2}\int_0^1 dx\,\left(2m_0 + ix\slashed p\right)\log\frac{x\Lambda^2}{x(1-x)p^2 + (1-x)m_0^2 + x\mu^2} .     (12.19)
This integral is finite in the limit that µ → 0 so the self-energy does not have an infrared divergence. We
can compute the one-loop renormalization of the electron mass using (12.7), which at this order we can rewrite
as
\delta m \approx \Sigma(\slashed p)\big|_{\slashed p = im_0}     (12.20)
since Σ already has a factor of q 2 so the difference between m and m0 on the right-hand side is higher order
in q 2 . We thus have
\delta m = \frac{q^2 m_0}{8\pi^2}\int_0^1 dx\,(2-x)\log\frac{x\Lambda^2}{(1-x)^2 m_0^2}
  = \frac{3q^2 m_0}{32\pi^2}\left(1 + 2\log\frac{\Lambda^2}{m_0^2}\right) .     (12.21)
In the first term we can take µ → 0, but doing so in the second gives a logarithmic divergence in the integral
at x = 1. Evaluating the integral and expanding at small µ one finds
i\Sigma'(\slashed p)\big|_{\slashed p = im_0} = -\frac{q^2}{8\pi^2}\left(\frac94 + \log\frac{\Lambda}{m_0} - \log\frac{m_0^2}{\mu^2}\right) ,     (12.23)
The first term in the parentheses is now the same as (12.19), while the last two are easily integrated to give
\Sigma_{DR}(\slashed p) = \Sigma_{PV}(\slashed p) + \frac{q^2}{8\pi^2}\left(3m_0 + \frac{i\slashed p}{4}\right) .     (12.31)
Therefore the mass renormalizations in the two schemes are related as
\delta m_{DR} = \delta m_{PV} + \frac{11 q^2 m_0}{32\pi^2}     (12.32)
and the field renormalizations are related as
Z_{2,DR} = Z_{2,PV} - \frac{q^2}{32\pi^2} .     (12.33)
for some functions Π1 and Π2 . It will soon be clear why I extracted the factor of p2 from Π1 . In the meantime
the matrix we want to invert to find the propagator is
(\hat\Delta^{-1} - \Pi)^{\mu\nu} = i\left[p^2\eta^{\mu\nu}(1+\Pi_1) - \left(1 - \tfrac1\xi + \Pi_2\right)p^\mu p^\nu\right] ,     (12.37)
whose inverse is
\hat\Delta^{\rm exact}_{\mu\nu} = \frac{-i}{(p^2-i\epsilon)(1+\Pi_1)}\left(\eta_{\mu\nu} - \left(1-\xi(1+\Pi_2)\right)\frac{p_\mu p_\nu}{p^2-i\epsilon}\right) .     (12.38)
This propagator is simpler in the Landau gauge ξ = 0, which is part of its appeal, but we will stick with
Feynman gauge ξ = 1.
We can further constrain the photon self-energy by noting that it has a simple relationship to the Fourier
transform of the two-point function of the electromagnetic current: the latter is given by
\langle T J(p)J(-p)\rangle = -\left(\Pi + \Pi\hat\Delta\Pi + \Pi\hat\Delta\Pi\hat\Delta\Pi + \ldots\right)
  = -\Pi\left(1-\hat\Delta\Pi\right)^{-1}
  = -\left(1-\Pi\hat\Delta\right)^{-1}\Pi .     (12.39)
As we discussed in the previous section, correlation functions of currents such as this one must have the
property that if we dot either current into its momentum pµ we get zero by the Ward-Takahashi identity.
We therefore see that the self-energy must obey
Figure 29: The one-loop contribution to the photon self-energy.
(2) We can extract the field renormalization constant Z3 for the photon from this formula: the residue of
the pole is
Z_3 = \frac{1}{1+\Pi(0)} .     (12.44)
As usual we can simplify this integral using a Feynman parameter and a shift of the integration variable:
\Pi^{\mu\nu}(p) = -q^2\int_0^1 dx\int\frac{d^d\ell}{(2\pi)^d}\,\frac{\mathrm{Tr}\left[(\slashed\ell + im)\gamma^\mu(\slashed\ell - \slashed p + im)\gamma^\nu\right]}{\left[x\left(\ell^2+m^2-i\epsilon\right) + (1-x)\left((\ell-p)^2+m^2-i\epsilon\right)\right]^2}
  = -q^2\int_0^1 dx\int\frac{d^d\ell}{(2\pi)^d}\,\frac{\mathrm{Tr}\left[(\slashed\ell + im)\gamma^\mu(\slashed\ell - \slashed p + im)\gamma^\nu\right]}{\left[(\ell-(1-x)p)^2 + x(1-x)p^2 + m^2 - i\epsilon\right]^2}
  = -q^2\int_0^1 dx\int\frac{d^d\ell}{(2\pi)^d}\,\frac{\mathrm{Tr}\left[(\slashed\ell + (1-x)\slashed p + im)\gamma^\mu(\slashed\ell - x\slashed p + im)\gamma^\nu\right]}{\left[\ell^2 + x(1-x)p^2 + m^2 - i\epsilon\right]^2} .     (12.46)
and by symmetry the terms which are linear in ℓ will integrate to zero and thus can be dropped. Moreover
the quantity ℓµ ℓν must integrate to something proportional to η µν , and by contracting with ηµν we can see
that the correct replacement is
\ell^\mu\ell^\nu \to \frac{\ell^2}{d}\,\eta^{\mu\nu} .     (12.48)
Wick rotating the integral we thus have
\Pi^{\mu\nu}(p) = -iq^2\,2^{\lfloor d/2\rfloor}\int_0^1 dx\int\frac{d^d\ell}{(2\pi)^d}\,\frac{\left(\frac2d - 1\right)\ell^2\eta^{\mu\nu} + x(1-x)\left(p^2\eta^{\mu\nu} - 2p^\mu p^\nu\right) - m^2\eta^{\mu\nu}}{\left[\ell^2 + x(1-x)p^2 + m^2\right]^2} .     (12.49)
This integral has a quadratic UV divergence, and also a subleading logarithmic divergence. As usual it
is up to us how to regulate the integral, but now we can get into trouble if we aren’t careful. For example
say we use a hard momentum cutoff ℓ2 < Λ2 . Then the leading divergence has the form
Λ2 η µν , (12.50)
which does not match the form (12.42) that we argued for from gauge invariance in the previous subsection.
A hard momentum cutoff violates gauge invariance! This doesn’t mean that it is absolutely wrong, but it
means that we need to include a gauge non-invariant counterterm (in this case a UV-divergent photon mass)
to restore gauge invariance. Including gauge non-invariant terms in the action to fix the gauge non-invariance
of the cutoff is rather inconvenient in practice, so essentially nobody does it. For example in lattice gauge
theory the cutoff always preserves gauge invariance. Here we will use dimensional regularization, which
also preserves gauge invariance as we will now see (essentially because the current conservation equation
∂µ J µ = 0 is dimension-independent).48
48 There is an important caveat to this statement, which is that in chiral gauge theories such as the standard model where
ΨL and ΨR have different gauge couplings it isn’t so clear how to continue these theories away from d = 4 while preserving
gauge invariance (what do we do with γ5 ?). This leads to the possibility of anomalies, which can spoil the gauge invariance of
a quantum field theory without the possibility of repair by a gauge non-invariant counterterm. We will study this phenomenon
in detail next semester.
Evaluating (12.49) using our integration formula (12.25) gives
\Pi^{\mu\nu}(p) = -iq_d^2\,2^{\lfloor d/2\rfloor}\,\frac{\Omega_{d-1}}{(2\pi)^d}\int_0^1 dx\left[\frac{2-d}{d}\,\eta^{\mu\nu}\,\frac{\Gamma\!\left(\frac d2+1\right)\Gamma\!\left(1-\frac d2\right)}{2}\left(x(1-x)p^2+m^2\right)^{\frac{d-2}{2}} + \left(x(1-x)\left(p^2\eta^{\mu\nu}-2p^\mu p^\nu\right) - m^2\eta^{\mu\nu}\right)\frac{\Gamma\!\left(\frac d2\right)\Gamma\!\left(2-\frac d2\right)}{2}\left(x(1-x)p^2+m^2\right)^{\frac{d-4}{2}}\right] ,     (12.51)
which we can simplify using the Γ function identity Γ(x + 1) = xΓ(x) to get
\Pi^{\mu\nu}(p) = -iq_d^2\,2^{\lfloor d/2\rfloor}\,\frac{\Omega_{d-1}}{(2\pi)^d}\,\Gamma\!\left(\frac d2\right)\Gamma\!\left(2-\frac d2\right)\left(p^2\eta^{\mu\nu} - p^\mu p^\nu\right)\int_0^1 dx\,x(1-x)\left(x(1-x)p^2+m^2\right)^{\frac{d-4}{2}} .     (12.52)
To expand near d = 4 we need to decide how to handle the quantity ⌊d/2⌋: the standard convention seems to
be to just set it equal to two, so we’ll respect this, but one could also choose d/2 and get a scheme which
differs slightly from the standard dimensional regularization. Expanding near d = 4 as d = 4 − 2ϵ, we then
have
\Pi(p^2) = \frac{q^2}{2\pi^2}\int_0^1 dx\,x(1-x)\left(\frac1\epsilon + \log(4\pi) - \gamma + \log\frac{\tilde\mu^2}{x(1-x)p^2+m^2}\right) .     (12.54)
Introducing an explicit UV cutoff by49
\log\Lambda^2 = \log\tilde\mu^2 + \frac1\epsilon + \log(4\pi) - \gamma ,     (12.55)
we can rewrite this as
\Pi(p^2) = \frac{q^2}{2\pi^2}\int_0^1 dx\,x(1-x)\,\log\frac{\Lambda^2}{x(1-x)p^2+m^2} ,     (12.56)
so in particular we have
\Pi(0) = \frac{q^2}{6\pi^2}\log\frac{\Lambda}{m}     (12.57)
and thus
Z_3 \approx 1 - \Pi(0) = 1 - \frac{q^2}{6\pi^2}\log\frac{\Lambda}{m} .     (12.58)
Having now determined δm, Z2 , and Z3 at one loop, this completes the one-loop renormalization of spinor
electrodynamics.
49 This identification only defines Λ up to a small rescaling, and it is only the coefficient of the logarithm that will be meaningful for us anyways.
Figure 30: The scale dependence of vacuum polarization. Measuring the electric flux through a sphere which
is large compared to 1/m feels the full screening of the vacuum pairs, while spheres which are small compared
to 1/m are only screened by those pairs which are small enough.
In the Wilsonian approach to renormalization the way we are supposed to interpret this equation is that it
tells us how to tune q0 as a function of the cutoff Λ so that q stays fixed as we vary Λ. Given our calculation
of Z3 in the previous subsection, at one loop this tuning is apparently
q_0(\Lambda) = \left(1 + \frac{q^2}{12\pi^2}\log\frac{\Lambda}{m}\right) q ,     (12.60)
\beta(q) \equiv \frac{dq_0}{d\log\Lambda} = \frac{q_0^3}{12\pi^2} .     (12.61)
What these equations say is that the effective electromagnetic coupling grows logarithmically with increasing
energy (or decreasing distance). This is sometimes described as saying quantum electrodynamics is “infrared-
free”, as the coupling gets weaker as we flow to lower energies.50 If we take the cutoff Λ to be large enough
that
\frac{q^2}{12\pi^2}\log\frac{\Lambda}{m} \sim 1     (12.62)
then the perturbative approximation we made in this calculation will break down, but this does not happen
until the absurdly large energy scale
\Lambda = m_e\, e^{12\pi^2/q^2} = m_e\, e^{3\pi/\alpha} \approx m_e\, e^{1300} ,     (12.63)
which is hardly something for us to worry about in the real world. On the other hand this does mean that
spinor electrodynamics may not make sense as a continuum quantum field theory.
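To get a feeling for how absurd this scale is, here is a two-line Python estimate of (12.63); it only uses α and is purely illustrative.

import numpy as np

alpha = 1 / 137.036
exponent = 3 * np.pi / alpha                              # the exponent in (12.63)
print(f"3*pi/alpha = {exponent:.0f}")                     # ~ 1290, i.e. the e^1300 quoted above
print(f"Lambda / m_e ~ 10^{exponent / np.log(10):.0f}")   # an absurd number of orders of magnitude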
There is a nice physical picture of the scale-dependence of the electron charge, shown in figure 30. The
idea is that the presence of a “bare” electric charge polarizes the vacuum by biasing the orientation of the
50 It stops flowing below the mass of the electron however, and indeed we will see in the next subsection that the intrinsic
definition of the renormalized charge q is that it is the coupling at the scale of the electron mass.
Figure 31: Electron-proton scattering by exchanging the exact photon propagator.
electron-positron pairs which are inherently present in the vacuum due to the entanglement illustrated by
the nonvanishing two-point function of the electron field at spacelike separation. These pairs come in all
sizes, but pairs which are larger than the Compton wavelength 1/m are exponentially suppressed so roughly
speaking we can say they don’t exist. The electric charge as measured at distances which are large compared
to the Compton wavelength is screened by these pairs, since the surface where we measure the electric flux
will necessarily cut some of the screening pairs and thus result in a smaller measured value of the charge
than its true bare value. As we make the surface smaller than 1/m however, the screening pairs which are
larger than the size of the surface do not contribute so the screening effect becomes less complete.
Vacuum polarization also leads to an interesting modification of the Coulomb potential at short distances.
To extract this contribution we can consider the scattering of an electron off of a proton using the exact
photon propagator in the exchange as in figure 31. Using our expressions (12.43) for the exact photon
propagator and (12.44) for the exact photon field renormalization constant we can evaluate this diagram as
Taking the non-relativistic limit as in our discussion of Yukawa theory this becomes
so comparing to the Born approximation we find that the Fourier transform of the non-relativistic potential
is
V(\vec k) = \frac{-q^2\left(1+\Pi(0)\right)}{|k|^2\left(1+\Pi(|k|^2)\right)} .     (12.66)
Since we have computed Π(p²) only to leading order in q we can approximate this as
V(\vec k) \approx \frac{-q^2}{|k|^2}\left(1 + \Pi(0) - \Pi(|k|^2)\right) .     (12.67)
Expanding (12.56) at momenta small compared to me gives
\Pi(0) - \Pi(|k|^2) \approx \frac{q^2\,|k|^2}{60\pi^2 m_e^2} .     (12.69)
Therefore we have
V(\vec k) \approx -\frac{q^2}{|k|^2} - \frac{q^4}{60\pi^2 m_e^2} ,     (12.70)
whose Fourier transform is
V(\vec r) = -\frac{q^2}{4\pi|r|} - \frac{q^4}{60\pi^2 m_e^2}\,\delta^3(\vec r)     (12.71)
  = -\frac{\alpha}{|r|} - \frac{4\alpha^2}{15 m_e^2}\,\delta^3(\vec r) .     (12.72)
This additional potential shows that when the electron is very close to the proton it feels an additional
attraction due to the unscreening of the proton. This effect is actually measurable: it predicts a shift of the
energy levels of hydrogen which at first order in perturbation theory is given by
\Delta E = -\frac{4\alpha^2}{15 m_e^2}\,|\Psi(0)|^2     (12.73)
where Ψ(⃗r) is the unperturbed wave function. Atomic wave functions of nonzero angular momentum vanish
at the origin, so this effect only affects s orbitals. For example for the 1s and
2s states of hydrogen we have
|\Psi_{1s}(0)|^2 = \frac{m_e^3\alpha^3}{\pi} , \qquad |\Psi_{2s}(0)|^2 = \frac{m_e^3\alpha^3}{8\pi} ,     (12.74)
so the energy shifts are
\Delta E_{1s} = -\frac{4 m_e\alpha^5}{15\pi} \approx -8.99\times10^{-7}\ {\rm eV} , \qquad \Delta E_{2s} = -\frac{m_e\alpha^5}{30\pi} \approx -1.12\times10^{-7}\ {\rm eV} .     (12.75)
These contributions are smaller than the usual fine structure of hydrogen by one power of α, but they are still
measurable and in fact have been measured! In ordinary hydrogen they are not the largest effects at O(α5 ),
but they become so for “muonic hydrogen” where the electron is replaced by a muon since in that case the
squared wave functions are proportional to m_\mu^3 so we get an enhancement by a factor of \left(\frac{m_\mu}{m_e}\right)^2 \approx 4\times10^4
compared to the naive energy scale of mµ .
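Here is a quick numerical evaluation of (12.75) in Python; the only inputs are α and the lepton masses, and the last line just illustrates the muonic-hydrogen enhancement mentioned above.

import numpy as np

alpha = 1 / 137.036
m_e = 0.511e6    # eV
m_mu = 105.7e6   # eV

dE_1s = -4 * m_e * alpha**5 / (15 * np.pi)
dE_2s = -m_e * alpha**5 / (30 * np.pi)
print(f"Delta E_1s = {dE_1s:.3e} eV")            # ~ -8.99e-7 eV
print(f"Delta E_2s = {dE_2s:.3e} eV")            # ~ -1.12e-7 eV
print(f"(m_mu/m_e)^2 = {(m_mu / m_e)**2:.2e}")   # ~ 4e4, the muonic-hydrogen enhancement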
where A ⃗ cl is the vector potential for the classical gauge field. We will be particularly interested in the effects
of this Hamiltonian on a single charged particle, which at first order in the background field are controlled
by the matrix element
\langle\vec p\,',\sigma'|H_{\rm back}|\vec p,\sigma\rangle = -\int d^{d-1}x\,\vec A_{\rm cl}(x)\cdot\langle\vec p\,',\sigma'|\vec J(x)|\vec p,\sigma\rangle .     (12.77)
This motivates us to study the matrix element
\langle\vec p\,',\sigma'|J^\mu(x)|\vec p,\sigma\rangle     (12.78)
Now let’s consider the interacting theory. By translation invariance we must have
\langle\vec p\,',\sigma'|J^\mu(x)|\vec p,\sigma\rangle = \langle\vec p\,',\sigma'|e^{-ix\cdot P}J^\mu(0)e^{ix\cdot P}|\vec p,\sigma\rangle = e^{i(p-p')\cdot x}\,\langle\vec p\,',\sigma'|J^\mu(0)|\vec p,\sigma\rangle ,     (12.81)
= (2\pi)^{d-1}\delta^{d-1}(\vec p\,'-\vec p)\,\langle\vec p\,',\sigma'|J^0(0)|\vec p,\sigma\rangle     (12.82)
and thus
\langle\vec p\,',\sigma'|J^0(0)|\vec p,\sigma\rangle = q_0\,\delta_{\sigma'\sigma} .     (12.83)
Moreover by current conservation we must have
(p - p')_\mu\,\langle\vec p\,',\sigma'|J^\mu(0)|\vec p,\sigma\rangle = 0 .     (12.84)
Now let’s see what we can learn from Lorentz invariance. The quantity
\sqrt{2\omega_{\vec p}}\,\sqrt{2\omega_{\vec p\,'}}\;\langle\vec p\,',\sigma'|J^\mu(0)|\vec p,\sigma\rangle     (12.85)
transforms as a four vector if we act with a Lorentz transformation on the one-particle states. This motivates
us to define
\langle\vec p\,',\sigma'|J^\mu(0)|\vec p,\sigma\rangle = -q_0\,\frac{1}{\sqrt{2\omega_{\vec p}}}\,\frac{1}{\sqrt{2\omega_{\vec p\,'}}}\;\bar u(\vec p\,',\sigma')\,\Gamma^\mu(p,p')\,u(\vec p,\sigma) ,     (12.86)
where Γµ is a spinor matrix which is also a spacetime vector, and which must be built out of p, p′ , and
γ µ in some way. Γµ is called the vertex function. Here u and u are serving in their role as intertwiners,
converting the Lorentz transformations of the two one-particle states to the vector Lorentz transformation.
In principle the matrix structure of Γµ could include \slashed p or \slashed p', but these can always be anticommuted through
any other γ-matrices so that they are adjacent to u or \bar u respectively, in which case we can use the Dirac
equation expressed as
\slashed p\,u(\vec p,\sigma) = im\,u(\vec p,\sigma) , \qquad \bar u(\vec p\,',\sigma')\,\slashed p' = im\,\bar u(\vec p\,',\sigma')     (12.87)
to remove their matrix structure. The matrix structure of Γµ thus comes entirely from powers of the γ
matrices. Any term with more than one γ µ can be simplified however, since only one of them can have a
51 Note that we are using the “bare” current here, so it integrates to the bare charge.
free µ index so the rest must be contracted together (we’ve already removed all of their contractions with p
or p′ ). We can then use our γ-matrix contraction identities to remove all of these contractions. The result
is that we must have
\bar u(\vec p\,',\sigma')\,\Gamma^\mu(p,p')\,u(\vec p,\sigma) = \bar u(\vec p\,',\sigma')\left[F\!\left((p-p')^2\right)\gamma^\mu - \frac{i}{2m}(p+p')^\mu\,G\!\left((p-p')^2\right) + \frac{(p-p')^\mu}{2m}\,H\!\left((p-p')^2\right)\right]u(\vec p,\sigma) ,     (12.88)
where F , G, and H are scalar functions called form factors. They must be real by the hermiticity of J µ ,
and in fact by the current conservation condition (12.84) we must have H = 0. Moreover by (12.83) we must
have
F (0) + G(0) = 1. (12.89)
In free field theory we see from (12.80) that F = 1 and G = 0, but in interacting electrodynamics they are
both nontrivial functions.
Now let’s see what we can say about the interaction of our charged particle with a classical field. In the
non-relativistic limit we’d like to expand the vertex function at small momenta, but at zero momentum we
only recover the relation (12.89) so we need to work at first order in the momenta. In the G term this is
easy since we already have a factor of p + p′ , but in the F term we need to be careful since there is a linear
term hiding in \bar u\gamma^\mu u. To extract it we can use the Gordon identity
\bar u(\vec p\,',\sigma')\gamma^\mu u(\vec p,\sigma) = \frac{1}{2im}\,\bar u(\vec p\,',\sigma')\left((p+p')^\mu + 2i(p'-p)_\alpha J^{\alpha\mu}\right)u(\vec p,\sigma)     (12.90)
that we derived back in section four, where
J^{\alpha\mu} = -\frac{i}{4}\left[\gamma^\alpha,\gamma^\mu\right] ,     (12.91)
from which we have
\bar u'\,\Gamma^\mu u = \bar u'\left[-\frac{i}{2m}(p+p')^\mu(F+G) + \frac{1}{m}(p'-p)_\alpha J^{\alpha\mu}F\right]u .     (12.92)
Both terms now have an explicit power of momenta, so in the non-relativistic limit of the spatial components
we can equate the momenta in u′ and u, which allows us to contract them to get
where Sσji′ ,σ are the standard spin matrices for SO(d − 1). For d = 4 we can write them as
S^{ji} = \frac{1}{2}\epsilon^{jik}S^k     (12.94)
with S^k = \frac{\sigma^k}{2}. For a time-independent background vector potential the matrix element of the interaction
Hamiltonian is thus
\langle\vec p\,',\sigma'|H_{\rm back}|\vec p,\sigma\rangle \approx \int d^3x\,A^i_{\rm cl}(\vec x)\,e^{i(\vec p-\vec p\,')\cdot\vec x}\,\frac{q_0}{2m}\left(-(p+p')^i + 2i(p-p')^j\,\epsilon^{jik}S^k_{\sigma'\sigma}\,F(0)\right) .     (12.95)
The first term in parentheses is just the usual -\vec p\cdot\vec A coupling of a charged particle to the electromagnetic field.
The second term is more interesting: it describes a magnetic moment of the particle. Indeed integrating
by parts the second term has the form
-\frac{q_0 F(0)}{m}\int d^3x\,B^k_{\rm cl}(\vec x)\,S^k_{\sigma'\sigma}\,e^{i(\vec p-\vec p\,')\cdot\vec x} ,     (12.96)
Figure 32: The one-loop correction to the vertex function. The dashed photon line indicates that it is a
classical background field.
The factor of Z2 is there in the first term because of the on-shell external electrons, and the loop integral
comes from the diagram shown in figure 32. I’ll just compute the integral in d = 4, since in this case the
numerator simplification is quite a bit more complicated if we use dimensional regularization; we’ll eventually
use Pauli-Villars to deal with the logarithmic UV divergence of the integral. µ is again a small photon mass
to regulate an infrared divergence which arises when ℓ ≈ p, since then all three factors in the denominator
are close to vanishing so there is a logarithmic infrared divergence. We can simplify the numerator using our
formulas for contracting γ matrices, and we can combine the denominators using a slightly more advanced
Feynman parameter identity:
\frac{1}{ABC} = \int_0^1 dx\int_0^1 dy\int_0^1 dz\,\delta(x+y+z-1)\,\frac{2}{(xA+yB+zC)^3} ,     (12.104)
after which we can write the loop contribution as
\bar u'\,\Gamma^\mu u \supset 4iq_0^2\int_0^1 dx\int_0^1 dy\int_0^1 dz\,\delta(x+y+z-1)\int\frac{d^4\ell}{(2\pi)^4}\,\frac{\bar u'\left[\slashed\ell\gamma^\mu(\slashed\ell+\slashed q) - 2im(2\ell+q)^\mu - m^2\gamma^\mu\right]u}{\left(x(\ell^2+m^2) + y((q+\ell)^2+m^2) + z((\ell-p)^2+\mu^2) - i\epsilon\right)^3} ,     (12.105)
where I’ve defined the momentum transfer
q = p′ − p. (12.106)
q is not to be confused with the renormalized charge, which will not appear in this calculation until the very
end. We can simplify the quantity in the denominator using that p and p′ are on shell, which results in
x(ℓ2 + m2 ) + y((q + ℓ)2 + m2 ) + z((ℓ − p)2 + µ2 ) = (ℓ + yq − zp)2 + xyq 2 + (1 − z)2 m2 + zµ2 , (12.107)
so shifting the integration momentum by a constant we get
\bar u'\,\Gamma^\mu u \supset 4iq_0^2\int_0^1 dx\int_0^1 dy\int_0^1 dz\,\delta(x+y+z-1)\int\frac{d^4\ell}{(2\pi)^4}\,\frac{\bar u'\left[(\slashed\ell+z\slashed p-y\slashed q)\gamma^\mu(\slashed\ell+z\slashed p+(1-y)\slashed q) - 2im(2\ell+2zp+(1-2y)q)^\mu - m^2\gamma^\mu\right]u}{\left(\ell^2 + xyq^2 + (1-z)^2m^2 + z\mu^2 - i\epsilon\right)^3} .     (12.108)
Simplifying the numerator further requires some work: we can drop all terms which are linear in ℓ, we can
replace \ell^\mu\ell^\nu by \frac14\eta^{\mu\nu}\ell^2, and we can move \slashed p to the right to act on u and \slashed p' to the left to act on \bar u'. When
the dust settles we have52
\bar u'\,\Gamma^\mu u \supset 4iq_0^2\int_0^1 dx\int_0^1 dy\int_0^1 dz\,\delta(x+y+z-1)\int\frac{d^4\ell}{(2\pi)^4}\,\frac{\bar u'\left[\left(-\frac12\ell^2 + (1-x)(1-y)q^2 + (z^2+2z-1)m^2\right)\gamma^\mu + imz(z-1)(p+p')^\mu\right]u}{\left(\ell^2 + xyq^2 + (1-z)^2m^2 + z\mu^2 - i\epsilon\right)^3} .     (12.109)
This is now in a form where we can extract the form factors F and G: Wick rotating, which supplies a factor
of i and removes the iϵ, we have
F(q^2) = 1 + (Z_2-1) + 4q_0^2\int_0^1 dx\int_0^1 dy\int_0^1 dz\,\delta(x+y+z-1)\int\frac{d^4\ell}{(2\pi)^4}\,\frac{\frac12\ell^2 - (1-x)(1-y)q^2 - (z^2+2z-1)m^2}{\left(\ell^2 + xyq^2 + (1-z)^2m^2 + z\mu^2\right)^3}
G(q^2) = -8q_0^2 m^2\int_0^1 dx\int_0^1 dy\int_0^1 dz\,\delta(x+y+z-1)\int\frac{d^4\ell}{(2\pi)^4}\,\frac{z(1-z)}{\left(\ell^2 + xyq^2 + (1-z)^2m^2 + z\mu^2\right)^3} .     (12.110)
52 The simplification also produces a term u′ (im(z − 2)(x − y)q µ )u in the numerator, but this integrates to zero since it is
antisymmetric in x and y while the denominator is symmetric. This is reassuring, as such a term would have given us a nonzero
form factor H which would have violated current conservation.
We can now (at last) evaluate these integrals using our standard formulas. The momentum integral in
the expression for F has logarithmic UV divergence which we can regulate using Pauli-Villars, which results
in
F(q^2) = 1 + (Z_2-1) + \frac{q_0^2}{8\pi^2}\int_0^1 dx\int_0^1 dy\int_0^1 dz\,\delta(x+y+z-1)\left[-\frac{(1-x)(1-y)q^2 + (z^2+2z-1)m^2}{xyq^2 + (1-z)^2m^2 + z\mu^2} + \log\frac{z\Lambda^2}{xyq^2 + (1-z)^2m^2 + z\mu^2}\right] .     (12.111)
Looking at our one-loop expression (12.24) for Z2 , we can see that the logarithmic divergence cancels so
F (q 2 ) is UV-finite. Evaluating the Feynman parameter integrals at q 2 = 0 we have
F(0) = 1 + \frac{q_0^2}{8\pi^2} .     (12.112)
Our expression for G(q 2 ) is finite both in the IR and the UV so we can set µ = 0 and Λ = ∞. We then have
G(q^2) = -\frac{q_0^2 m^2}{4\pi^2}\int_0^1 dx\int_0^1 dy\int_0^1 dz\,\delta(x+y+z-1)\,\frac{z(1-z)}{xyq^2 + (1-z)^2 m^2} .     (12.113)
ae = .00115965218059(13). (12.117)
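For comparison, here is the one-loop number in Python. It assumes, as (12.89), (12.112), and (12.113) at q² = 0 suggest, that the one-loop anomalous moment is a_e = F(0) − 1 = q0²/(8π²) ≈ α/(2π) (Schwinger's result); the remaining gap is made up by higher-loop QED corrections plus small hadronic and weak contributions.

import numpy as np

alpha = 1 / 137.035999
a_e_one_loop = alpha / (2 * np.pi)       # assumed one-loop result a_e = alpha/(2*pi)
a_e_measured = 0.00115965218059          # the experimental value quoted above

print(f"alpha/(2 pi) = {a_e_one_loop:.8f}")   # ~ 0.00116141
print(f"measured a_e = {a_e_measured:.8f}")
print(f"difference   = {a_e_one_loop - a_e_measured:.2e}")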
13.1 Spontaneous symmetry breaking in quantum mechanics
A rather trivial example of spontaneous symmetry breaking happens in the non-relativistic hydrogen atom:
there are two degenerate 1s states, one for each spin of the electron, and they are mixed by spatial rotations.
We thus can say that rotational invariance is spontaneously broken for the non-relativistic hydrogen atom.53
This example might feel contrived, and indeed it is sometimes claimed that in quantum mechanics with a
finite number of degrees of freedom obeying canonical commutation relations, with no extra label such as
spin, the ground state can’t be degenerate. To give you a flavor of these arguments, here is one in the
context of a nonrelativistic particle moving in a potential V (x) in one dimension. Say that ψ1 and ψ2 are
normalizable wave functions obeying
Multiplying the first equation by ψ2 and the second equation by ψ1 , taking the difference, and then integrating
we see that
ψ1 ψ2′ − ψ1′ ψ2 = constant. (13.2)
By normalizability ψ1 and ψ2 must go to zero at infinity, and for “reasonable” potentials ψ1′ and ψ2′ will not
grow fast enough to cancel this. The constant must therefore be zero. We can then rewrite this equation as
\frac{\psi_1'}{\psi_1} = \frac{\psi_2'}{\psi_2} ,     (13.3)
which we can integrate to see that ψ1 must be proportional to ψ2 and thus they are really the same state.
On the other hand this argument involved various assumptions, and if we drop them then
spontaneous symmetry breaking is possible. A simple example is the quantum mechanics of a particle on a
circle x ∼ x + L with Lagrangian
L = \frac{m\dot x^2}{2} + \frac{\theta}{L}\,\dot x .     (13.4)
The canonical momentum is
p = m\dot x + \frac{\theta}{L} ,     (13.5)
and the Hamiltonian is
H = \frac{1}{2m}\left(p - \frac{\theta}{L}\right)^2 .     (13.6)
The energy eigenstates are
ψn (x) = eipn x , (13.7)
with
p_n = \frac{2\pi n}{L} ,     (13.8)
so the energy levels are
E_n = \frac{1}{2mL^2}\left(2\pi n - \theta\right)^2 .     (13.9)
In particular if we choose θ = π, then the n = 0 and n = 1 states are degenerate ground states. These are
actually related by a symmetry, x → −x. The Lagrangian may not look invariant under this symmetry, but
it is invariant up to a total derivative. The symmetry is manifest in the Hamiltonian formulation, as the
Hamiltonian is invariant under
x' = -x , \qquad p' = -p + \frac{2\theta}{L} .     (13.10)
53 In real hydrogen this degeneracy is broken by the interaction between the spin of the electron and the spin of the proton,
Figure 33: A scalar potential that exhibits spontaneous symmetry breaking.
Note that this symmetry only respects the periodicity of x for θ = πm with m ∈ Z, as otherwise it does not
respect the quantization of p.54
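Here is a tiny Python illustration of the spectrum (13.9); at θ = π the two lowest levels are exactly degenerate, which is the spontaneous breaking discussed above. The numerical values of m and L are arbitrary.

import numpy as np

m, L = 1.0, 1.0

def E(n, theta):
    """Energy levels (13.9) of a particle on a circle with a theta-term."""
    return (2 * np.pi * n - theta)**2 / (2 * m * L**2)

for theta in [0.0, np.pi / 2, np.pi]:
    levels = sorted(E(n, theta) for n in range(-2, 3))
    print(f"theta = {theta:.2f}: lowest levels {['%.3f' % e for e in levels[:3]]}")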
This potential again has the form shown in figure 33, but we proved in the last section that theories of this
type have a unique ground state. So what is going on?
We can get a better sense of what is happening by defining a pair of states |±⟩ by the property that |±⟩
minimizes
⟨±|H|±⟩ (13.15)
subject to the constraint that
\langle\pm|x|\pm\rangle = \pm\sqrt{\frac{6k^2}{\lambda}} .     (13.16)
Were this theory to exhibit spontaneous symmetry breaking then |±⟩ would be a pair of degenerate ground
states. Let’s study their matrix elements. The x′ = −x symmetry is implemented on the Hilbert space by a
unitary operator U that commutes with H, so we have
\langle+|H|+\rangle = \langle-|U^\dagger H U|-\rangle = \langle-|H|-\rangle \equiv a , \qquad \langle+|H|-\rangle \equiv b .     (13.17)
Moreover we have
⟨+|H|−⟩ = ⟨−|U † H|−⟩ = ⟨−|HU † |−⟩ = ⟨−|H|+⟩ = b. (13.18)
By hermiticity we also have
⟨+|H|−⟩ = ⟨−|H|+⟩∗ , (13.19)
so b must be real. Within the subspace spanned by |±⟩ we thus have
H = \begin{pmatrix} a & b \\ b & a \end{pmatrix} ,     (13.20)
Diagonalizing H with nonzero b shows that the true eigenstates are |+⟩ ± |−⟩, with energy a ∓ |b|. Thus the
ground state is
|Ω⟩ = |+⟩ + |−⟩, (13.22)
which is invariant under the symmetry so there is no spontaneous symmetry breaking.
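Here is a small numerical illustration of this diagonalization in Python: for negative b the ground state is the symmetric combination, matching (13.22), and as the tunneling element b → 0 (the infinite-volume limit discussed next) the two combinations become degenerate. The value of a is arbitrary.

import numpy as np

a = 1.0
for b in [-0.3, -1e-6, 0.0]:
    H = np.array([[a, b], [b, a]])
    evals, evecs = np.linalg.eigh(H)
    print(f"b = {b:+.1e}: energies {evals}, ground state {np.round(evecs[:, 0], 3)}")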
Let’s now apply this discussion to the quantum field theory (13.11). The key difference is that in order
to tunnel from |+⟩ to |−⟩ or vice versa, the field now needs to tunnel everywhere in space. The tunneling
exponent in the WKB approximation therefore picks up an extra factor of the volume of space. More
concretely if we define a constant mode
\phi_0 = \frac{\int d^{d-1}x\,\phi}{\rm Vol} , \qquad \pi_0 = \frac{\int d^{d-1}x\,\dot\phi}{\rm Vol} ,     (13.23)
the Hamiltonian derived from (13.11) becomes
H = \frac{\pi_0^2}{2\,{\rm Volume}} + {\rm Volume}\times V(\phi_0) .     (13.24)
Thus in the tunneling exponent (13.21) we should replace m → Volume and V → Volume × V , which gives
an overall factor of the spatial volume in the exponent. Thus in the infinite volume limit we have b = 0, so
Figure 34: Turning on an explicit symmetry breaking perturbation. This forces the system to pick |+⟩ or
|−⟩ to be the true ground state.
there are indeed two degenerate ground states exchanged by the symmetry.55 Thus we see that spontaneous
symmetry breaking is considerably more robust in quantum field theory than it is in quantum mechanics.
Let’s say a bit more precisely what the difference is between this example and the quantum mechanical
examples we discussed above. In the field theory model (13.11), the field ϕ is an example of what is called
an order parameter - it is a dynamical variable whose expectation value tells us which vacuum we are in.
In the hydrogen atom example we can take the order parameter to be the spin of the electron in the z-basis,
while in the example with a periodic particle it is the momentum p. In both of the single particle cases
the order parameter commutes with the Hamiltonian, as otherwise it would need to have fluctuations in the
ground state. What can happen in quantum field theory however is that we can have an order parameter
which has fluctuations in any finite volume, but which stops fluctuating in the limit of infinite volume (we will
see this more explicitly momentarily). This allows for more interesting examples of spontaneous symmetry
breaking.
Concretely, consider perturbing the potential by a small explicit breaking term:
V(x) = gx - \frac{k^2}{2}x^2 + \frac{\lambda}{4!}x^4 .     (13.25)
The rule of degenerate perturbation theory is that the perturbed Hamiltonian will be diagonal in the basis which
diagonalizes the perturbation. If we look at the matrix elements of the perturbation we have
\langle\pm|gx|\pm\rangle = \pm g\sqrt{\frac{6k^2}{\lambda}}     (13.26)
by assumption, while the off-diagonal matrix elements of gx are much smaller since ⟨x|+⟩ and ⟨x|−⟩ are
peaked at different places. We can also see this directly from the plot of the potential as in figure 34, where
55 If the spontaneously broken symmetry is continuous then b only vanishes like a power of the volume, but it still vanishes.
the linear perturbation breaks the degeneracy and favors one minimum or the other depending on the sign
of g. In our matrix language we are thus now diagonalizing
H = \begin{pmatrix} a+c & b \\ b & a-c \end{pmatrix}     (13.27)
which has eigenvalues a \mp \sqrt{b^2+c^2} and eigenvectors which when |b| ≪ |c| are close to |∓⟩ if c is positive
and |±⟩ if c is negative. This idea should also be familiar in the context of the Ising model, where if we turn
on a small external magnetic field this biases the system towards all up or all down depending on the sign
of the field. In particular if the strength of the perturbation stays nonzero in the large volume limit then
eventually we will always be in the regime |b| ≪ |c| no matter how small the perturbation.
For quantum field theory in infinite volume we can be somewhat more quantitative about this situation.
Indeed let’s say that we have a set of degenerate vacua |v⟩ where v is some set of parameters. We will assume
translation symmetry is not broken, in which case these vacua obey
P⃗ |v⟩ = 0 (13.28)
with P⃗ the total spatial momentum, and since there are no particles in these vacua they should be discrete
eigenvectors of P⃗ (i.e. they shouldn’t be part of some continuous set of δ-function normalized momentum
eigenstates). We can thus take each |v⟩ state to have norm one, and by choosing an orthonormal set of |v⟩
we can assume that
⟨u|v⟩ = δu,v . (13.29)
Note that this is the usual Kronecker δ, even though v may include continuous parameters (in the next
subsection we will see examples where it does). Our goal is now to show that we can additionally choose
this basis so that for any local operator O(x) we have
\langle u|O(x)|v\rangle = \delta_{u,v}\,\langle v|O(x)|v\rangle .     (13.30)
In other words local operators can never mix between the different vacua. This thus implies that any
deformation of the Hamiltonian by a local operator (or a sum of local operators) will be diagonal in the
|v⟩ basis, and thus that these will be the states among which the ground state is selected when we add a
symmetry-breaking perturbation. The idea is that for any two local operators O1 and O2 we have
\langle u|O_1(\vec x)O_2(0)|v\rangle = \sum_w\langle u|O_1(0)|w\rangle\langle w|O_2(0)|v\rangle + \sum_N\int\frac{d^{d-1}p}{(2\pi)^{d-1}}\,\langle u|O_1(0)|N,\vec p\rangle\langle N,\vec p|O_2(0)|v\rangle\,e^{i\vec p\cdot\vec x}     (13.31)
where |N, p⃗⟩ are some complete set of single and multiparticle states. If we consider the limit of large |x|
the momentum integral should vanish by the Riemann-Lebesgue lemma (assuming some mild integrability of
these matrix elements), so we thus have
\lim_{|x|\to\infty}\langle u|O_1(\vec x)O_2(0)|v\rangle = \sum_w\langle u|O_1(0)|w\rangle\langle w|O_2(0)|v\rangle .     (13.32)
Now let’s assume that O1 and O2 are bosonic. They therefore must commute at spacelike separation, and
so we must have
\sum_w\langle u|O_1(0)|w\rangle\langle w|O_2(0)|v\rangle = \sum_w\langle u|O_2(0)|w\rangle\langle w|O_1(0)|v\rangle .     (13.34)
In other words the matrices ⟨u|O1 (0)|v⟩ and ⟨u|O2 (0)|v⟩ commute for any bosonic O1 and O2 . If O1 or O2
is fermionic then its matrix elements between the |v⟩ must vanish since each |v⟩ should be invariant under
Figure 35: The sombrero potential for the O(2) vector model.
fermion parity (which is built out of infinitesimal rotations). We can therefore simultaneously diagonalize
all of the local operators, leading to (13.30).
We can think of these |v⟩ states as being precisely the ones where the cluster decomposition principle
is obeyed:
\lim_{|x|\to\infty}\langle v|O_1(\vec x)O_2(0)|v\rangle = \langle v|O_1(0)|v\rangle\,\langle v|O_2(0)|v\rangle     (13.35)
for any O1 and O2 . One consequence of this is that the fluctuations of the zero modes of operators vanish
in these states:
\langle v|\,\frac{\int d^{d-1}x\,O(\vec x)}{\rm Volume}\,\frac{\int d^{d-1}y\,O(\vec y)}{\rm Volume}\,|v\rangle = \frac{1}{\rm Volume}\int d^{d-1}x\,\langle v|O(\vec x)O(0)|v\rangle
  = \langle v|O(0)|v\rangle\,\langle v|O(0)|v\rangle
  = \langle v|\,\frac{\int d^{d-1}x\,O(\vec x)}{\rm Volume}\,|v\rangle\,\langle v|\,\frac{\int d^{d-1}y\,O(\vec y)}{\rm Volume}\,|v\rangle ,     (13.36)
where in the first equality I used translation invariance, in the second I used that the integral is determined
by the large |x| regime since the contribution from any finite region is eliminated by the infinite volume
factor, and in the third I again used translation invariance. This shows that these |v⟩ states are indeed those
where the zero modes of the fields have definite values, and in fact usually the label v is precisely just some
list of these expectation values. In our ϕ4 theory with negative mass squared these are precisely the states
|±⟩, while in the states |+⟩ ± |−⟩ the zero mode of ϕ has O(1) fluctuations.
This theory is usually called the “O(N ) vector model”, as it has an O(N ) global symmetry that rotates
among the ϕi fields:
ϕ′i = Rij ϕj , (13.38)
with R ∈ O(N ). For N = 2 this potential is called the sombrero or mexican hat potential, it is shown in
figure 35. The minimum of this potential is an S^{N-1} in field space located at
|\phi| = \frac{m}{\sqrt\lambda} ,     (13.39)
so to pick a vacuum we need to pick a unit vector in RN . This unit vector is preserved by an O(N − 1)
subgroup of O(N ), so this situation is typically described by saying that the O(N ) global symmetry of the
model is spontaneously broken to a O(N − 1) subgroup, with the latter still leaving the vacuum invariant.
In particular for the case of N = 2 the unbroken symmetry is just O(1) = Z2 .
What are excitations of this theory about the ground state |n̂⟩ in which the zero mode of ϕi is pointing
in the n̂ direction? Looking at figure 35 the answer is clear: in the radial direction orthogonal to the ground
state SN −1 there is a massive scalar field excitation (with positive mass squared), while for each of the N − 1
field directions along the SN −1 there is a massless scalar field. These massless scalars are called Goldstone
bosons, or sometimes Nambu-Goldstone bosons, and their existence is the hallmark of a spontaneously
broken continuous symmetry. Note that there is an independent massless scalar for each broken generator
of O(N ). In particular for N = 2 there is one Goldstone boson. We can extract it explicitly by writing
ϕ1 = ρ cos θ
ϕ2 = ρ sin θ, (13.40)
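Here is a small sympy sketch of this mode counting. It assumes the quartic potential V = −(m²/2)|ϕ|² + (λ/4)|ϕ|⁴, whose minimum sits at |ϕ| = m/√λ as in (13.39); the Hessian at the minimum then has one eigenvalue 2m² (the massive radial mode) and N − 1 zero eigenvalues (the Goldstone bosons).

import sympy as sp

N = 3
m, lam = sp.symbols('m lambda', positive=True)
phi = sp.Matrix(sp.symbols(f'phi0:{N}', real=True))
s = (phi.T * phi)[0]                           # |phi|^2

V = -m**2 / 2 * s + lam / 4 * s**2             # assumed potential with minimum at |phi| = m/sqrt(lambda)
vacuum = {phi[0]: m / sp.sqrt(lam), **{phi[i]: 0 for i in range(1, N)}}

hess = sp.simplify(sp.hessian(V, list(phi)).subs(vacuum))
print(hess.eigenvals())                        # {2*m**2: 1, 0: N-1}: one massive mode, N-1 Goldstones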
The Ward identity that we proved last semester shows that we have
\frac{\partial}{\partial x^\mu}\langle v|T J^\mu(x)O_n(0)|v\rangle = i\,\delta^d(x)\,\langle v|\delta_S O_n(0)|v\rangle + \ldots ,     (13.46)
where . . . indicates possible Schwinger terms involving derivatives of the δ-function. Defining the Fourier
transform
\langle v|T J^\mu(x)O_n(0)|v\rangle = \int\frac{d^dp}{(2\pi)^d}\,\langle v|T J^\mu(p)O_n(0)|v\rangle\,e^{ip\cdot x} ,     (13.47)
56 This is not the most general version of the theorem. There is also a non-relativistic version of Goldstone’s theorem, and
the spontaneous breaking of spacetime symmetries such as translation and rotation invariance also leads to Goldstone bosons.
we can rewrite (13.46) as
pµ ⟨v|T J µ (p)On (0)|v⟩ = ⟨v|δS On (0)|v⟩ + . . . , (13.48)
where the . . . terms come from Schwinger terms and are proportional to positive powers of momentum
(from the derivatives on the δ function). Now let’s consider the limit of zero momentum. As long as
\langle v|\delta_S O_n(0)|v\rangle \neq 0, the two point function in momentum space must have a term
\langle v|T J^\mu(p)O_n(0)|v\rangle \supset \langle v|\delta_S O_n(0)|v\rangle \times \frac{p^\mu}{p^2} .     (13.49)
Therefore the two-point function must have a pole at zero momentum, which I’ll remind you is equivalent
to saying there is a massless particle which can be created by O(0) and annihilated by J µ . Moreover the
momentum pµ can be recognized as the intertwiner which allows a vector field J µ to annihilate a particle of
zero spin (think ∂ µ ϕ in free field theory), so our massless particle has spin zero. This particle must have the
same internal symmetry charges as J µ since it is annihilated by it. This conclusion required that
\langle v|\delta_S O_n(0)|v\rangle = \sum_m T_{nm}\,\langle v|O_m(0)|v\rangle \neq 0 ,     (13.50)
but this is precisely the condition that the vacuum transforms nontrivially under symmetry generated by
J µ (assuming that we choose On to include any fields which have a vacuum expectation value). Moreover
if there is more than one J µ for which this is true then there is an independent Goldstone boson for each
spontaneously-broken direction in the internal global symmetry group. Indeed let Jaµ be a basis for the set
of broken currents and let b run over the set of Goldstone bosons contributing to this two-point function.
The relevant matrix elements are determined by Lorentz invariance to be
\langle v|J_a^\mu(0)|\vec p,b\rangle = \frac{i F_{ab}\,p^\mu}{\sqrt{2\omega_{\vec p}}} , \qquad \langle\vec p,b|O_n(0)|v\rangle = \frac{Z_{bn}}{\sqrt{2\omega_{\vec p}}} ,     (13.51)
We can view the right hand side as a matrix T_{nm}, which by definition has a rank which is equal to the
number N_b of broken generators (i.e. the number of generators T^a_{nm} such that no linear combination of them
leaves the vacuum invariant). The left-hand side is a product of two matrices, so since their product has
rank Nb they must each separately have rank at least Nb . In particular Zbn has rank Nb , so On creates Nb
linearly-independent boson states. This is the most we could possibly create, since Fab has rank Nb .
There are many examples of Goldstone bosons in nature, here are a few:
In QCD in the limit mu = md = 0 there is an SU (2) chiral symmetry. This symmetry turns out to be
spontaneously broken, leading to three Goldstone bosons: the charged pions π± and the neutral pion
π0 .
In superfluid liquid helium particle number symmetry is spontaneously broken, which leads to a mass-
less Goldstone boson whose dynamics account for the remarkable bulk properties of the system.
In a crystal spatial translation symmetry is spontaneously broken, which leads to Goldstone bosons
called phonons.
In the early universe it seems that there was an approximate spontaneously broken shift symmetry of
some field called the inflaton, whose approximate Goldstone boson fluctuations eventually gave rise to
the anisotropy of the cosmic microwave background and the stars and galaxies we see today.
13.5 The Abelian Higgs model
What happens if a gauge symmetry is spontaneously broken? This turns out to be a rather slippery question
to make precise, and in particular to the extent that spontaneous breaking of a gauge symmetry makes sense
it does not lead to any vacuum degeneracy.57 It is thus safest not to refer to (or think of) the phenomenon
as spontaneous symmetry breaking; I’ll instead refer to it as the Higgs mechanism.58 Rather than give a
general discussion we’ll instead just consider a simple example, the Abelian Higgs model, which is another
name for scalar QED with a negative mass squared:
L = -\frac14 F^{\mu\nu}F_{\mu\nu} - (D_\mu\phi)^*D^\mu\phi + m^2|\phi|^2 - \frac{\lambda}{4}|\phi|^4 .     (13.53)
Here we take ϕ to have charge q, so the covariant derivative is
Dµ ϕ = (∂µ − iqAµ ) ϕ. (13.54)
We can guess that the potential causes ϕ to get a nonzero expectation value, but since ϕ is not gauge-invariant
we need to be more careful about analyzing what this theory does. It turns out the most useful approach is
to introduce the field redefinition
ϕ(x) = ρ(x)eiqθ(x) , (13.55)
with ρ > 0. This field redefinition is not good near ρ = 0, but at least to all orders in perturbation theory
we are ok as long as we end up expanding around somewhere other than ρ = 0. In terms of these variables
the gauge transformations of this theory are
A′µ = Aµ + ∂µ Ω
θ′ = θ + Ω
ρ′ = ρ. (13.56)
Substituting this redefinition into the covariant derivative we have
Dµ ϕ = (∂µ ρ − iq(Aµ − ∂µ θ)ρ)eiqθ , (13.57)
so the Lagrangian becomes
L = -\frac14 F_{\mu\nu}F^{\mu\nu} - \partial_\mu\rho\,\partial^\mu\rho + m^2\rho^2 - \frac{\lambda}{4}\rho^4 - q^2\rho^2\left(A_\mu - \partial_\mu\theta\right)\left(A^\mu - \partial^\mu\theta\right) .     (13.58)
The potential thus sets ρ to
\rho_0 = \sqrt{\frac{2m^2}{\lambda}} .     (13.59)
I emphasize that ρ is gauge-invariant, so this expectation value does not break the gauge symmetry; the
vacuum is gauge-invariant (as it had better be since it should be a physical state). The Lagrangian (13.58)
has a number of remarkable features, which we will now discuss in turn:
The variable θ appears only in derivatives, so one might naively expect it to lead to a massless scalar
particle. If we remove the photon from the theory then this is indeed the case, with θ being the
Goldstone boson for the spontaneous breaking of U (1) global charge rotation symmetry. In fact this is
just the O(2) vector model written in slightly different variables. In the theory (13.58) however, we can
remove the θ field by doing the gauge transformation Ω = −θ. This is called going to unitary gauge,
which has gauge-fixing condition θ = 0. The theory (in any gauge) thus has no physical Goldstone
boson, it has been removed by the gauge redundancy. This is consistent with the general slogan that
the Higgs mechanism is not the same thing as spontaneous symmetry breaking.
57 This
has been shown rigorously from a lattice point of view, where it is called “Elitzur’s theorem”.
58 Thehistory of the Higgs mechanism is a convoluted mess, with the list of people deserving some credit including Anderson,
Brout, Englert, Guralnik, Hagen, Higgs, Kibble, and Nambu. I’m not enough of a historian to take sides, but it does need a
name and Higgs is a fine one.
If we look at fluctuations of the ρ field, which we can parametrize as \rho = \rho_0 + \frac{\delta}{\sqrt2}, the kinetic term for
δ has the form
L \supset -\frac12\partial_\mu\delta\,\partial^\mu\delta - m^2\delta^2 ,     (13.60)
so this theory has a massive scalar excitation with mass m_{\rm Higgs} = \sqrt2\,m. This is called the Higgs
boson, and a close analogue of it was discovered in 2012 by the ATLAS and CMS experiments at the
Large Hadron Collider in Geneva.
The quadratic action for the photon now includes a mass term
L \supset -\frac{\mu^2}{2}\,A_\mu A^\mu ,     (13.61)
with the photon mass µ being given by59
\mu = \sqrt2\,q\rho_0 = \frac{2qm}{\sqrt\lambda} .     (13.64)
This term does not look gauge-invariant, but the full Lagrangian is; the Higgs mechanism thus gives
us a gauge-invariant way to realize a massive photon. More concretely the gauge-invariant
massive photon field is \tilde A_\mu = A_\mu - \partial_\mu\theta. This situation is sometimes described by saying that the
massive photon has “eaten the Goldstone boson”. Of course in the standard model the true photon is
massless, but an analogue of this mechanism leads to three heavy massive particles of spin one called
the W± and Z bosons.
In the standard model the Higgs mechanism gives mass to matter fields in addition to gauge bosons.
Here is how it works. Say we have two charged spinors Ψ1 and Ψ2 , with charges q1 and q2 respectively,
and moreover let’s assume that q1 + q = q2 . Then we can have a gauge invariant Yukawa-type term in
the Lagrangian
L \supset -ig\,\phi\,\bar\psi_2\psi_1 + {\rm c.c.}     (13.65)
If ϕ gets an expectation value then this becomes a mass term that mixes ψ1 and ψ2 , and we can
rediagonalize the spinor basis to find a pair of massive spinors.
13.6 Superconductivity
The first physics application of the Higgs mechanism was not actually to particle physics; it was originally
proposed (before the work of Higgs) by Nambu and then Anderson as a model of superconductivity. In
this context it is therefore sometimes called the Anderson-Higgs mechanism, especially by condensed matter
physicists, although I don’t think that is fair to Nambu. In any event the key point is that a superconductor
is really nothing but a material in which some low-energy field with electric charge q has a radial part ρ
which gets an expectation value. In other words it is a material in which electromagnetism is Higgsed!
We won’t concern ourselves here with why this happens, for that we need the BCS theory for conventional
superconductors and something fancier for unconventional superconductors, but given this definition we can
easily understand the most famous superconducting phenomena without any need for complicated models.60
The idea is that inside a superconducting material we can describe the low-energy behavior using the
photon field Aµ and the field θ corresponding to the phase of the operator whose radial part gets an expecta-
tion value. In order for the low-energy action to be gauge-invariant, it can depend on Aµ and θ only through
59 To see that this term indeed has the interpretation of a photon mass, we can look at how it modifies Maxwell’s equations.
These become
∂ µ Fµν = µ2 Aν , (13.62)
which in Lorenz gauge ∂µ Aµ = 0 have the form
∂ 2 Aν = µ2 Aν . (13.63)
Thus the solutions in momentum space obey p^2 = -\mu^2 .
60 We here only give a rather cursory treatment, see section 21.6 of Weinberg for more.
Figure 36: Magnetic flux through a superconducting loop in the presence of a persistent current I: the flux
through the disk bounded by the dashed line is quantized in units of 2π/q, so the current cannot decay. The
current is flowing at the edge of the loop; inside of it both B⃗ and J⃗ vanish.
the combination Aµ − ∂µ θ. Moreover we’d like the system to be stable when A and θ both vanish, so the
energy should have a minimum when this quantity vanishes. These expectations are realized for the Abelian
Higgs Lagrangian (13.58), but we do not need to use the details of this Lagrangian (and in fact it is
only a good approximation near the transition temperature, where it is referred to as the Landau-Ginzburg
theory). In general if we are deep inside a superconductor which is in (or near) its ground state we should
expect that
Aµ = ∂µ θ. (13.66)
Since this is pure gauge, the electric and magnetic fields must both vanish. The vanishing of the electric field
is familiar for a perfect conductor, while the vanishing of the magnetic field inside of a superconductor is called
the Meissner effect.61
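Spelling this out: wherever (13.66) holds (with θ single-valued in a small neighborhood, even if it winds nontrivially around the loop considered below), the field strength vanishes identically,
\[
F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu = \partial_\mu\partial_\nu\theta - \partial_\nu\partial_\mu\theta = 0\,,
\]
which is the statement that both E⃗ and B⃗ vanish in the bulk.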
Of course the most famous feature of a superconductor is electrical conductivity with zero resistance. To
understand this, we first need to appreciate that since θ is the phase of an operator of charge q it is actually
a periodic variable obeying
\theta \sim \theta + \frac{2\pi n}{q}\,. \qquad (13.67)
Now let's consider a closed loop of superconducting material. Consider a circular path L which circles the
loop once within the material, which we can view as the boundary of a disk D that extends outside of the
superconducting material. See figure 36 for an illustration. We can compute the magnetic flux through D
via the following calculation:
\int_D \vec{B}\cdot d\vec{A} \;=\; \int_L \vec{A}\cdot d\vec{x} \;=\; \int_L \vec{\nabla}\theta\cdot d\vec{x} \;=\; \theta(2\pi)-\theta(0) \;=\; \frac{2\pi n}{q}\,, \qquad n\in\mathbb{Z}. \qquad (13.68)
Here the first equality is Stokes' theorem, the second uses (13.66) along L (which lies deep inside the superconducting material), and the final value follows from the periodicity (13.67).
Thus the magnetic flux through the loop is quantized, with the quantization integer being set by the number
of times that θ winds around the loop. Since this flux is linearly proportional to the current circulating
around the loop (see e.g. the Biot-Savart law), the current is also quantized. In particular the current cannot decay continuously, so if it starts with n ̸= 0 then it must remain nonzero. In other
words a superconducting loop can support a persistent current that lasts forever without any external
61 Incidentally this already shows that there is more to a superconductor than just being a perfect conductor - the magnetic
field inside the latter has to be constant, but it doesn’t have to be zero.
potential applied, which is the hallmark of superconductivity! Experimentally the set of allowed currents
can be measured and gives q = −2e, so apparently in a typical superconductor the operator whose radial
component gets an expectation value has the charge of two electrons - the excitations it annihilates are
typically called Cooper pairs.
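As a quick numerical check on q = −2e, we can restore ℏ and SI units in the flux quantum 2π/|q| appearing in (13.68) (the factors of ℏ below are supplied by me, since the notes use natural units):
\[
\Phi_0 \;=\; \frac{2\pi\hbar}{|q|} \;=\; \frac{h}{2e}
\;\approx\; \frac{6.63\times10^{-34}\ {\rm J\,s}}{2\times 1.60\times10^{-19}\ {\rm C}}
\;\approx\; 2.07\times10^{-15}\ {\rm Wb}\,,
\]
which is indeed the superconducting flux quantum measured in experiments.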
It is interesting to consider where the persistent current is located within the current loop. Inside the loop
we have B⃗ = 0 by the Meissner effect, and so by Ampere's law we have (assuming the current is stationary)
J⃗ = ∇⃗ × B⃗ = 0. So there is no current flowing inside the loop. It therefore must all be flowing right at the
surface! This is a typical feature of superconductors - most of the interesting physics has to do with what
happens at an interface between a superconductor and a material (such as air) in which electromagnetic
gauge symmetry is not Higgsed. Another example of such a phenomenon is the Josephson effect, where
bringing two superconductors close together, separated by a thin barrier, results in a current between them which oscillates in time when a voltage is applied across the barrier. The Josephson
effect has many applications, for example in SQUIDs (which are very precise magnetometers) and as qubits
for potential quantum computing architectures.
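For reference, the standard Josephson relations (quoted here without derivation, and with Δφ denoting the difference of the condensate phases of the two superconductors, which in the notation of this section is qΔθ up to conventions) are
\[
I = I_c\,\sin(\Delta\varphi)\,, \qquad \frac{d(\Delta\varphi)}{dt} = \frac{2eV}{\hbar}\,,
\]
so at zero voltage a fixed phase difference supports a dissipationless dc current, while a constant applied voltage V makes the current oscillate at angular frequency 2eV/ℏ.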