Combined Notes
In this section we will revise some of the basic operations of linear algebra. Most of these operations
should be familiar to you from PHYS 10071. However, we will also introduce some new concepts that
will be useful in the rest of the course, for example working with n × m matrices. Make sure that you
are comfortable with these concepts and are well practiced with them as they will be used throughout
the course.
By the end of this lecture you should be able to:
• Perform basic operations on vectors and matrices, e.g. matrix multiplication and addition.
where a_{ij} is the element in the i-th row and j-th column of the matrix, and can in general be complex.
Matrix addition is both commutative and associative:
A + B = B + A, (commutative)   (0.5)
(A + B) + C = A + (B + C), (associative).   (0.6)
Scalar multiplication:
When a matrix is multiplied by a scalar λ, each element of the matrix is multiplied by that scalar, such that [λA]_{ij} = λ a_{ij}.
Matrix multiplication:
The biggest difference between matrix algebra and regular algebra is found in matrix multiplication,
which is defined as follows: For an n × m matrix A and an m × p matrix B,
C = AB  ⇒  c_{ij} = \sum_{k=1}^{m} a_{ik} b_{kj},   (0.10)
where C is an n × p matrix. Each element of the matrix C is given by the sum of the products of the
rows of A with the columns of B.
An important distinction to make is that matrix multiplication is not commutative, that is, AB ≠ BA in general. It is, however, distributive and associative.
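As a quick numerical check of these rules, here is a minimal sketch in Python/numpy (the matrices below are arbitrary examples chosen only for illustration):

    import numpy as np

    # Two arbitrary 2x2 matrices (illustration only)
    A = np.array([[1, 2],
                  [3, 4]])
    B = np.array([[0, 1],
                  [1, 0]])

    print(A @ B)                                     # the product AB
    print(B @ A)                                     # BA, in general different from AB
    print(np.allclose(A @ B, B @ A))                 # False: not commutative
    print(np.allclose((A + B) @ A, A @ A + B @ A))   # True: distributive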
You should try and follow this through yourself, and practice with the following exercises:
Exercise 0.1:
Consider the following matrices,
A = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}, \quad B = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}, \quad C = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.   (0.15)
Calculate the following products:
(i) AB
(ii) BA
(iii) ABC
(iv) CBA
Exercise 0.2:
The Pauli matrices may be represented as follows:
σ_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad σ_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad σ_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.   (0.16)
Show that they satisfy the commutation relation
[σ_x, σ_y] = σ_x σ_y − σ_y σ_x = 2i σ_z.   (0.17)
Transpose:
The transpose of a matrix A is defined as the matrix A^T such that [a^T]_{ij} = a_{ji}, that is, the rows of
A become the columns of A^T and vice versa. The transpose has an interesting property when applied to a product of matrices.
For example, consider the product of two matrices A and B, C = AB. Taking the transpose of C we
find,
C^T = (AB)^T = B^T A^T  ⇒  [c^T]_{ij} = \sum_{k=1}^{m} a_{jk} b_{ki}.   (0.18)
Exercise 0.3:
The adjoint (or Hermitian conjugate) of a matrix A is the complex conjugate of its transpose, A† = (A^T)^*.   (0.19)
Trace:
The trace of a matrix A is defined as the sum of the diagonal elements of A, that is,
Tr(A) = \sum_{i=1}^{n} a_{ii}.   (0.20)
A crucial property of the trace is that it is invariant under cyclic permutations, that is, Tr(ABC) =
Tr(BCA) = Tr(CAB). If you plan to study quantum mechanics in the future, this property comes
up repeatedly. Try your hand at proving this property for yourself.
Exercise 0.4:
Prove that the trace of a matrix is invariant under cyclic permutations, Tr(ABC) = Tr(BCA) = Tr(CAB).
It is also important to note that the trace of a matrix is equal to the sum of its eigenvalues, which will
come in handy later in the course.
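Both properties are easy to verify numerically; a minimal numpy sketch (random matrices, chosen only for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))

    # Cyclic invariance: Tr(ABC) = Tr(BCA) = Tr(CAB)
    print(np.trace(A @ B @ C), np.trace(B @ C @ A), np.trace(C @ A @ B))

    # The trace equals the sum of the eigenvalues
    print(np.trace(A), np.linalg.eigvals(A).sum().real)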
Determinant:
The determinant of a matrix A is defined as,
det(A) = \sum_{σ ∈ S_n} \mathrm{sgn}(σ) \prod_{i=1}^{n} a_{iσ(i)},   (0.22)
where S_n is the set of all permutations of the set {1, 2, . . . , n}. The above general case is a bit
complicated, so let's consider the determinant of a general 3 × 3 matrix (which is the largest matrix I will ask you
to find a determinant for),
A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}.   (0.23)
The determinant of A is then given by,
det(A) = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} + a_{12} \begin{vmatrix} a_{23} & a_{21} \\ a_{33} & a_{31} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}
= a_{11}(a_{22}a_{33} − a_{23}a_{32}) + a_{12}(a_{23}a_{31} − a_{21}a_{33}) + a_{13}(a_{21}a_{32} − a_{22}a_{31}).   (0.24)
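As a sanity check of Eq. (0.24), the following sketch compares the expansion with numpy's built-in determinant (the 3 × 3 matrix is an arbitrary example):

    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])   # arbitrary example matrix

    a = A  # shorthand so the indices match a_ij (0-based here)
    det_expansion = (a[0, 0] * (a[1, 1] * a[2, 2] - a[1, 2] * a[2, 1])
                     + a[0, 1] * (a[1, 2] * a[2, 0] - a[1, 0] * a[2, 2])
                     + a[0, 2] * (a[1, 0] * a[2, 1] - a[1, 1] * a[2, 0]))

    print(det_expansion, np.linalg.det(A))   # the two values agree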
Matrix inverse:
For a general matrix A, we can define the inverse matrix as the matrix A^{-1} such that AA^{-1} = 1.
It is important to note that not all matrices have an inverse, for example, the zero matrix 0 does not
have an inverse. If a matrix does have an inverse, then it is unique. The inverse of a matrix can be
found by using the formula,
A^{-1} = \frac{1}{\det(A)} C^T,   (0.25)
where C^T is the transpose of the matrix of cofactors C corresponding to A. I don't want to go into
any detail on how to calculate the inverse with this method; later in the course we will learn how to
do this with eigenvalues and the spectral decomposition of a matrix. However, it is important to note
that if det(A) = 0, then the above formula blows up, and thus A is not invertible.
the inertial frame S is moving at a velocity −v relative to an observer in S′. Therefore, we can apply
a second Lorentz transformation, this time with v → −v, to obtain:
x = Λ(−v)x′.   (0.28)
The above expression gives us a natural definition for the identity matrix, Λ(−v)Λ(v) = Λ(v)Λ(−v) =
1, where 1 is the identity matrix. We say that Λ(−v) is the matrix inverse of Λ(v) and vice versa.
Exercise 0.5:
Two spacetime events x_1 and x_2 occur in an inertial frame S, where
x_1 = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad x_2 = \begin{pmatrix} ct \\ u\cos(α) \\ u\sin(α) \\ 0 \end{pmatrix}.   (0.30)
Using matrix algebra, find the corresponding spacetime coordinates in a frame S′, which is
moving at a velocity v_x in the x-direction and v_y in the y-direction relative to S. The Lorentz
transformation for the above motion is,
x′ = \begin{pmatrix} γ & −γβ_x & −γβ_y & 0 \\ −γβ_x & 1 + (γ−1)β_x^2/β^2 & (γ−1)β_xβ_y/β^2 & 0 \\ −γβ_y & (γ−1)β_xβ_y/β^2 & 1 + (γ−1)β_y^2/β^2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} x,   (0.31)
where β_x = v_x/c, β_y = v_y/c, β^2 = β_x^2 + β_y^2 and γ = 1/\sqrt{1 − β^2}.
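If you would like to check the matrix algebra, a minimal numpy sketch (my own construction for illustration, with arbitrarily chosen velocities) builds the boost matrix of Eq. (0.31) and confirms that Λ(−v) inverts Λ(v):

    import numpy as np

    def boost(bx, by):
        """General boost matrix for velocity (bx, by, 0) in units of c."""
        b2 = bx**2 + by**2
        g = 1.0 / np.sqrt(1.0 - b2)
        return np.array([
            [g,     -g*bx,                  -g*by,                  0.0],
            [-g*bx, 1 + (g - 1)*bx**2/b2,   (g - 1)*bx*by/b2,       0.0],
            [-g*by, (g - 1)*bx*by/b2,       1 + (g - 1)*by**2/b2,   0.0],
            [0.0,   0.0,                    0.0,                    1.0],
        ])

    L = boost(0.3, 0.4)
    Linv = boost(-0.3, -0.4)
    print(np.allclose(L @ Linv, np.eye(4)))   # True: Lambda(-v) is the inverse of Lambda(v)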
Your life for the next 2 years will be much easier if you are able to do this. From this point onwards,
eigenvalue problems will crop up time and again. Indeed, if you are able to diagonalise a matrix then
you can solve any linear system of equations, differential or otherwise.
The eigenvalues of a matrix A are the values λ that satisfy the linear equation,
Av = λv,   (0.32)
where v is the corresponding eigenvector.
The procedure for calculating eigenvalues is most easily understood through examples. Consider
the 2 × 2 matrix,
σ_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.   (0.34)
This matrix is one of the so-called Pauli matrices, which will come up time and again in this course,
and are used extensively in quantum theory. The characteristic equation for this matrix is given by,
det(σ_x − λ1) = det\begin{pmatrix} −λ & 1 \\ 1 & −λ \end{pmatrix} = λ^2 − 1 = 0,   (0.35)
which has the solutions λ = ±1. The corresponding eigenvectors are then found by solving the equations,
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = ± \begin{pmatrix} v_1 \\ v_2 \end{pmatrix},   (0.36)
which yields v_2 = ±v_1, therefore the eigenvectors are:
v_± = v_1 \begin{pmatrix} 1 \\ ±1 \end{pmatrix}.   (0.37)
Note that the eigenvectors are not unique: the prefactor v_1 can be any complex number and v_± will still
be an eigenvector of σ_x. This is a general feature of eigenvectors; they are only defined up to a
multiplicative constant. However, we quite often place constraints on eigenvectors based on physical
intuition. For example, in quantum mechanics we require eigenvectors to be normalised in order to be
valid states of a quantum system (i.e. the eigenvector corresponds to a wavefunction which has to be
normalised). The normalisation condition for an eigenvector v_± is given by the dot product of the vector
with itself,
⟨v_±|v_±⟩ = 1,   (0.38)
where we have used the notation ⟨·|·⟩ to denote the dot product of two vectors, where the left hand
side is conjugated. Or to put it mathematically,
⟨v_±|v_±⟩ ≡ v̄_± · v_±,
where the bar over the top of a vector or variable denotes a complex conjugate, ā ≡ a^*. This change
in notation may seem strange, but I promise, the reason for this will become clear. Using the above
expression, we find v_1 = 1/\sqrt{2}, and so the eigenvectors of the matrix σ_x are given by:
v_± = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ ±1 \end{pmatrix}.   (0.39)
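These results are easy to verify numerically; a short numpy sketch (added purely for illustration):

    import numpy as np

    sigma_x = np.array([[0, 1],
                        [1, 0]])

    # eigh is appropriate for Hermitian matrices; it returns real eigenvalues
    # and orthonormal eigenvectors (as the columns of its second output)
    vals, vecs = np.linalg.eigh(sigma_x)
    print(vals)   # [-1.  1.]
    print(vecs)   # columns proportional to (1, -1)/sqrt(2) and (1, 1)/sqrt(2)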
Exercise 0.6:
Find the eigenvalues and normalised eigenvectors of the following matrices:
(i) σ_y = \begin{pmatrix} 0 & −i \\ i & 0 \end{pmatrix}
(ii) σ_z = \begin{pmatrix} 1 & 0 \\ 0 & −1 \end{pmatrix}
Note that in part (i) you will need to use the complex conjugate when normalising the eigen-
vectors.
For good measure, let's try another example, but with a more complicated 3 × 3 matrix,
S = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.   (0.40)
The characteristic equation for this matrix is given by,
det(S − λ1) = det\begin{pmatrix} −λ & 1/\sqrt{2} & 0 \\ 1/\sqrt{2} & −λ & 1/\sqrt{2} \\ 0 & 1/\sqrt{2} & −λ \end{pmatrix} = −λ(λ^2 − 1) = 0,   (0.41)
which has the solutions λ = 0, ±1. We can now find the corresponding eigenvectors by solving the
equations,
\frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = λ \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}.   (0.42)
Here I will show you how to find the eigenvector for λ = +1, and leave the others as an exercise. For
λ = +1, by carrying out the matrix multiplication we find the following equations,
v_2 = \sqrt{2} v_1,   (0.43)
v_1 + v_3 = \sqrt{2} v_2,   (0.44)
v_2 = \sqrt{2} v_3.   (0.45)
Exercise 0.7:
Find the normalised eigenvectors associated to the λ = 0 and λ = −1 eigenvalues of the matrix,
S = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.   (0.48)
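A quick way to check your answers to this exercise is a numpy sketch like the following (eigh orders the eigenvalues from smallest to largest):

    import numpy as np

    S = np.array([[0, 1, 0],
                  [1, 0, 1],
                  [0, 1, 0]]) / np.sqrt(2)

    vals, vecs = np.linalg.eigh(S)
    print(np.round(vals, 10))   # eigenvalues -1, 0, +1
    print(np.round(vecs, 3))    # columns are the normalised eigenvectors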
• For matrix algebra Chapter 8, section 3 and 4 in Riley, Hobson, and Bence.
• For matrix operations such as the trace and determinant, section 8.8 and 8.9 in Riley, Hobson,
and Bence.
• For eigenvalues and eigenvectors, section 8.13 in Riley, Hobson, and Bence.
There are additional exercises at the end of this chapter to practice if you need to.
Lecture 1
Vector Spaces
Welcome to PHYS20672. This course is all about vector spaces, complex numbers, and linear algebra.
It is a course that is both abstract and practical, and will form the foundation of your studies in physics
and mathematics. Particularly if you have ambitions to study quantum mechanics, the language of
vector spaces is essential.
1.1 Housekeeping
A few things to note before we begin:
• This course will start with vector spaces before moving on to complex analysis.
• Dr Jake Iles-Smith will be teaching the first half of the course, his office number is 7.26 Schuster.
• Dr Mike Godfrey will be teaching the second half of the course, his office number is 7.19 Schuster.
• Jake will be available for questions for the hour after the Monday and Thursday lecture.
• There are 10 problem sheets, including one sheet of revision covering linear algebra. You
should work through these sheets in your own time.
• Solutions to problem sheets will be posted on blackboard the week after the problem sheets have
been released.
• There is a Piazza message board for this course. Please engage with this and ask questions, but
as always, please keep things constructive.
• A change from previous years is that exam questions will be mixed! You will not be able to avoid
questions on vector spaces, so please keep abreast with this part of the course!
• The exam will cover definitions, proofs, as well as the application of knowledge covered in the
lectures. If it has a brightly coloured box around it, it’s probably important!
For the vector spaces portion of the course I am trialling two things this year: 1) Some new lecture
notes– there will be typos, please let me know if you find any. 2) I will be recording a few videos to go
through examples in more detail. Generally I will also repeat these examples in the lecture notes for
those who prefer. You can post feedback on the Piazza.
Disclaimer: I will assume that you are comfortable with basic operations in linear alge-
bra, such as: matrix multiplication; trace, transpose, and determinant of a matrix; and
eigenvalue problems. If you can’t remember how to do these operations, then I have
placed some revision material in the Blackboard folder for week 0. Please practice these
skills!
• K. F. Riley, M. P. Hobson and S. J. Bence, Mathematical Methods for Physics and Engineering,
3rd ed. (Cambridge, 2006)
• R. Shankar, Principles of Quantum Mechanics, 2nd ed. (Springer, 1994). [This is also the
recommended book for PHYS30201 Mathematical Fundamentals of Quantum Mechanics next
year.]
Riley et al give a nice introduction to vector spaces and their algebra from a general perspective, but
do not discuss function spaces. Shankar is pitched more towards the elements of linear algebra required
to study quantum mechanics. From the outset, Shankar uses Dirac notation, which we will cover in
lecture 2 and 3. eBooks for Riley and Shankar can be found through the library.
The primary recommended textbook for the second part of the course is the "Schaum’s Outline"
by Spiegel et al. The lectures follow this book fairly closely. The weekly outlines contain detailed
references to sections from it. It’s not written in a standard textbook style – it’s more like lecture
notes. The book has many examples for practice (as well as the "640 fully solved problems" advertised
on the cover). It’s been the recommended book to go to for this subject for many years. (I remember
using it as an undergraduate.)
If you are interested in more advanced aspects we touch upon in this course, then you can have a look
at
• Altland and von Delft, Mathematics for Physicists: Introductory Concepts and Methods, (Cam-
bridge, 2019)
This is a slightly more advanced book, but treats the mathematical concepts discussed in this course
carefully and in a rigorous manner.
¹ In most cases you have encountered, these are actually infinite-dimensional spaces!
A field in mathematics is simply a set of numbers which has addition and multiplication defined
upon it. For example, the set of real numbers F = R is a field, as is the set of complex numbers F = C.
These are the two fields that we will be working with in this course.
We can now define precisely what we mean by a ‘vector space’:
Definition 1.1: Vector space
(i) Addition: The set is closed under addition, such that if u, v ∈ V, then w = u + v ∈ V.
In other words, if two vectors in a vector space are added together, the resulting object is also
a vector within the same space. This addition operation is commutative:
a + b = b + a, (1.2)
and associative:
(a + b) + c = a + (b + c). (1.3)
(ii) Scalar Multiplication: The set is closed under multiplication by a scalar (i.e. any complex
number), so that for u ∈ V we have λu ∈ V for λ ∈ F, where F is the field over which the vector
space is defined. Scalar multiplication is associative:
λ(µu) = (λµ)u,   (1.4)
as well as distributive:
(λ + µ)u = λu + µu for λ, µ ∈ F.   (1.5)
Multiplication by unity leaves a vector unchanged, 1 × u = u.
(iii) Null vector: There exists a null vector 0 defined such that u + 0 = u.
(iv) Negative vector: All vectors have a corresponding negative vector −u, which satisfies
u + (−u) = 0.   (1.6)
There are two things to note about the above definitions. The first is that subtraction is only defined
indirectly, through the definition of the negative vector, i.e. we have
u − v = u + (−v).   (1.7)
Second, there is no definition of length. This is curious as no doubt you have heard the phrase ‘vectors
have both direction and magnitude’ countless times. As defined above, the notion of vectors is much
more general than this statement suggests. Let’s look at some examples of vector spaces.
The first example is the real vectors in three dimensions that we have already encountered. This
is typically denoted as R3 , where the superscript gives the dimensionality of the vector space, and the
R is the field over which the vector space is defined. We can check whether the above definition of a
vector space holds for R3 .
(i) First, if we add two vectors together u, v 2 R3 , then the result is still a vector in R3 . So R3 is
closed under addition.
(ii) Second, if we multiply a vector in R3 by a real scalar, then the result is still a vector in R3,
therefore R3 is closed under scalar multiplication.
(iii) The null vector is simply the vector with all components equal to zero.
(iv) The negative vector is simply the vector with all components multiplied by −1.
Exercise 1.1
Consider the set of real 2 × 2 matrices of the form,
a = \begin{pmatrix} α & β \\ γ & δ \end{pmatrix},
where α, β, γ, δ ∈ R. Show that these matrices form a vector space over the real numbers.
Exercise 1.2
Show that the set of functions f (x), for x 2 [0, L], form a vector space.
• Section 8.1 in Riley, Hobson and Bence. This is all written in vector notation as opposed to
Dirac notation covered in lecture 2 this week.
• Section 1.1-1.4 Shankar. Shankar uses Dirac notation from the outset, so it might be worth
waiting until the end of Lecture 3 before turning to this reference.
Lecture 2
Linear dependence and basis vectors
• Define linear independence, and check whether a set of vectors are linearly independent.
• Define a basis vector, and show that any vector can be written as a linear combination of basis
vectors.
These vectors are linearly independent if the only solution to the equation αu + βv = 0 is α = β = 0.
Substituting the vectors in, we obtain two equations,
α + β = 0,
α − β = 0,
which are only satisfied by α = β = 0, so u and v are linearly independent. Adding a third vector w, the condition αu + βv + γw = 0 instead gives
α + β + 3γ = 0,
α − β + 2γ = 0.
Solving this, we find α = −(5/2)γ and β = −γ/2. These equations are underdetermined, and therefore there
are an infinite number of solutions, parameterised by γ. More formally we define linear independence
as follows:
A set of vectors {u_i for i = 1, 2, · · · , n} is said to be linearly independent if the equation
\sum_{i=1}^{n} λ_i u_i = 0
has only one solution: λ_i = 0 ∀ i. Otherwise, they are said to be linearly dependent.
A vector space V has dimension N if it can accommodate no more than N linearly independent vectors u_i.
We often denote such a vector space by V N (R) if the vector space is real, and V N (C) if it is complex,
or just V N if we want to be vague!
Notice that we have already seen an example of this definition of dimensionality in action. Consider
the example 2-component vectors we used to explain linear dependence. In this case, we had the two
linearly independent vector u and v, but when we added the third vector w, they became linearly
dependent. This is because the vector space we are considering here is 2-dimensional, so it can only
accommodate 2 linearly independent vectors. So regardless of our choice of w, the vectors would have to be
linearly dependent!
In reading around the suggested texts, you may also come across the term span. This is a term
that is intimately linked to concepts of linear independence and dimensionality. So let me define it
here for good measure:
The span of a set of vectors {ui for i = 1, 2, · · · , n} is the set of all vectors that can be written
as a linear combination of the ui .
So in our previous example, the span of {u, v} is the set of all vectors that can be written as a
linear combination of u and v. In this case, we found that upon adding a third vector w, we could
write w as a linear combination of u and v; we therefore say that w is in the span of u and v. In fact we can
go further, and write any arbitrary vector in R2 as a linear combination of u and v. Therefore, we can
say that R2 is spanned by {u, v}.
Theorem 2.1:
Any vector u in an N-dimensional vector space V^N can be written as a linear combination of N linearly independent vectors {e_i}_{i=1}^{N}.
Proof. This follows from the definitions of linear independence and dimensionality: since there
can be no more than N linearly independent vectors, there must be a relation of the form
\sum_{i=1}^{N} λ_i e_i + λ_0 u = 0,   (2.4)
where u ∈ V^N is an arbitrary vector, and not all λ_i are zero. In particular, the definition of
linear dependence requires λ_0 ≠ 0. We therefore have:
u = −\frac{1}{λ_0} \sum_{i=1}^{N} λ_i e_i = \sum_{i=1}^{N} u_i e_i,   (2.5)
with u_i = −λ_i/λ_0.
The above theorem is a very important result, and we will use it extensively throughout the course.
Its main implication is that it allows us to define a basis for a vector space:
Any set of N linearly independent vectors in V^N is called a basis, and they are said to span V^N,
or synonymously, they are complete. This allows us to write any vector v ∈ V^N as,
v = \sum_{i=1}^{N} v_i e_i,   (2.6)
where the v_i are the expansion coefficients of v in this basis.
You have already come across basis vectors in previous courses, for example for R3 , one choice of
basis vectors are the cartesian unit vectors, e1 = i, e2 = j, and e3 = k. Notice that I wrote one choice
of basis vectors, it is important to note that the choice of ei is not unique. For example, a vector in
cartesian coordinates will remain unchanged by a rotation of the coordinate system, with the rotation
impacting only the definition of the unit vectors. Or equally we could represent our vector in terms
of spherical coordinates, in which case the basis vectors would be different again. All are completely
valid representations of the same vector. We will return to this point later in the course.
Exercise 2.4
If {e_i} is a basis of V^N, prove that for any u ∈ V^N, the coefficients u_i in the expansion
u = \sum_{i=1}^{N} u_i e_i are unique.
Exercise 2.5
For the set of real 2 × 2 matrices, show that
e_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad e_2 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad e_3 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad e_4 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix},   (2.7)
form a basis.
For example, let's consider the vector space R3, the set of triples (x, y, z), where x, y, z ∈ R.
A subspace of R3 is the set of vectors (x, y, 0), where x, y ∈ R, which define the xy-plane in R3. We can
check that these vectors satisfy the requirements of a subspace:
• For two vectors u = (x1 , y1 , 0) and v = (x2 , y2 , 0), we have u + v = (x1 + x2 , y1 + y2 , 0), which
is also in the subspace. Therefore it is closed under addition.
• For a vector u = (x, y, 0) and a scalar λ ∈ R, we have λu = (λx, λy, 0), which is also in the
subspace. Therefore it is closed under scalar multiplication.
Theorem 2.2:
It is also instructive to think about a counterexample. For example, the set of vectors H =
{(x, y) : x^2 + y^2 ≤ 1}, i.e. the vectors that lie within a circle of radius 1, is not a subspace of R2.
To see this, consider an arbitrary point within the unit circle, say (x_1, y_1), and some scalar λ. We can
easily choose a λ such that λx_1 or λy_1 > 1, thus lying outside of the unit circle. Therefore, the set is
not closed under scalar multiplication, so it is not a subspace of R2.
• 3blue1brown is a great youtube channel for visualising linear algebra concepts. Check out the
video on linear combinations and basis vectors, https://www.youtube.com/watch?v=k7RM-ot2NWY.
Lecture 3
Inner Product Spaces & Dirac notation
Up to now, we have established a framework for describing vectors in an arbitrary vector space.
Missing from this rather abstract construction is the notion of direction and magnitude that, prior to
this course, underpinned our understanding of what a vector is. In Euclidean space both the magnitude
of a vector and its direction (relative to another vector) is given by the familiar scalar/dot product,
defined as a · b = |a||b| cos ✓, where ✓ is the angle between a and b, and |a| is the magnitude of vector
a. In this lecture we will generalise the notion of the dot/scalar product to abstract vector spaces,
which will be referred to as the inner product.
By the end of this lecture you should be able to:
• Define an inner product space, and calculate the inner product between two vectors.
• Write vectors in Dirac notation, and use this to calculate inner products.
The inner product between two vectors, denoted ⟨a|b⟩, is defined as a scalar function, that is,
a function that takes two vectors and returns a scalar quantity. For a complex vector space
V^N(C), the inner product has the following properties:
(i) It is linear in the second argument, that is, if w = λu + µv, then ⟨a|w⟩ = λ⟨a|u⟩ + µ⟨a|v⟩.
(ii) Under complex conjugation we have ⟨a|w⟩ = \overline{⟨w|a⟩}. This is a trivial requirement in the case
of real vector spaces.
The above definition also allows us to naturally define the idea of a magnitude of a vector, which we
call the norm:
‖a‖^2 = ⟨a|a⟩ ≥ 0.   (3.1)
Exercise 3.1
Confirm that the dot product a · b for a, b ∈ R3 satisfies the above properties.
A vector space which has a linear inner product is called an inner product space. Inner products are
exceptionally important in the physical sciences. Clearly from the above exercise, Euclidean space is an
inner product space, therefore Newtonian mechanics can be phrased in such language. Though it may
not be obvious just yet, Quantum mechanics is also described in terms of a complex inner product
space called Hilbert space.
Exercise 3.2
Show that if w = λu + µv, then we have: ⟨w|a⟩ = λ̄⟨u|a⟩ + µ̄⟨v|a⟩. This property is called
antilinearity or conjugate linearity.
3.1.1 Orthogonality
We have now recovered the notion of the magnitude of a vector using the inner product; we can also develop
the concept of direction using the idea of orthogonality:
Consider two vectors a and b that are elements of the inner product space V^N. a and b are
said to be orthogonal if they satisfy ⟨a|b⟩ = 0.
This naturally allows us to define the concept of an orthogonal and orthonormal basis:
So, how do we actually calculate the inner product of two vectors? Consider the set of vectors
{e_j}_{j=1}^{N}, which form a complete and orthonormal basis set for the vector space V^N. Two vectors
a, b ∈ V^N with representations a = \sum_j a_j e_j and b = \sum_j b_j e_j will have an inner product of the form
⟨a|b⟩ = \sum_{i,j=1}^{N} ā_i ⟨e_i|e_j⟩ b_j,   (3.2)
where we have used the linear and antilinear properties of the inner product. Using the definition of
an orthonormal basis, ⟨e_i|e_j⟩ = δ_{ij}, we then have:
⟨a|b⟩ = \sum_{i,j=1}^{N} ā_i ⟨e_i|e_j⟩ b_j = \sum_{i,j=1}^{N} ā_i δ_{ij} b_j = \sum_{i=1}^{N} ā_i b_i.   (3.3)
Note that this is exactly the same form as the scalar product used in Euclidean vector spaces.
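A quick numerical illustration of Eq. (3.3) for a complex space (a minimal sketch with arbitrarily chosen components): note that the first vector is conjugated, which numpy's vdot does automatically.

    import numpy as np

    a = np.array([1 + 1j, 2 - 1j])   # components a_i in an orthonormal basis
    b = np.array([3j, 1 + 2j])

    inner = np.sum(np.conj(a) * b)   # sum_i conj(a_i) b_i, as in Eq. (3.3)
    print(inner, np.vdot(a, b))      # vdot conjugates its first argument: same result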
Finally, for a given vector a which can be written in the orthonormal basis {e_j}_{j=1}^{N}, the definition
of the inner product also provides us with some understanding of what it means to write a vector in a
particular basis. If we take the inner product with a particular basis vector e_k, we have:
⟨e_k|a⟩ = \sum_{j=1}^{N} a_j ⟨e_k|e_j⟩ = \sum_{j=1}^{N} a_j δ_{kj} = a_k,   (3.5)
therefore the coefficient a_k = ⟨e_k|a⟩, often called the projection of a on e_k, tells us how much of the
vector a is in the e_k direction.
Suppose we are given a set of linearly independent vectors from which we
wish to construct an orthonormal basis. This process is called orthogonalisation. There are
many different approaches to orthogonalisation; in this section we will present a particularly famous
method called Gram–Schmidt orthogonalisation.
Physically, what is happening in this procedure: In step 1, an initial vector is set. In step 2, we
take the second vector and subtract the component of the first vector from this vector. In step 3, we
take the third vector and subtract the components of the first two vectors from this vector. This is
repeated iteratively until we have cycled through our full set of vectors. The fact that we can write
this procedure in such general terms shows that you can always find an orthonormal basis for any
inner-product space!
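The explicit steps are not reproduced above, but as a rough sketch of the idea, a minimal implementation (assuming the input vectors are linearly independent) looks like this:

    import numpy as np

    def gram_schmidt(vectors):
        """Orthonormalise a list of linearly independent vectors (numpy arrays)."""
        basis = []
        for v in vectors:
            w = v.astype(complex)
            # subtract the projection of v onto each basis vector found so far
            for e in basis:
                w = w - np.vdot(e, w) * e
            basis.append(w / np.linalg.norm(w))
        return basis

    vecs = [np.array([1.0, 1.0, 0.0]),
            np.array([1.0, 0.0, 1.0]),
            np.array([0.0, 1.0, 1.0])]
    es = gram_schmidt(vecs)
    print(np.round([[np.vdot(a, b) for b in es] for a in es], 10))  # identity matrix

The final line prints the matrix of mutual inner products, which should be the identity if the output really is orthonormal.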
All vectors are replaced with a ket, |ψ⟩, and components are included as a scalar multiple as usual. This
allows us to clearly distinguish between vector and scalar quantities without having to faff around
with bold symbols or arrows over the top!
With every ket, |ψ⟩, there is also a bra, ⟨ψ|, associated with it. This bra has both addition and
scalar multiplication defined in the same way as kets, meaning that they form their own vector space.
If our vector |ψ⟩ is an element of the vector space V, then ⟨ψ| is a vector in V^*, which we call the dual space,
or simply the dual, of V.
There exists a one-to-one mapping between the vector space V and its dual V^*, which we denote
with a dagger,
(|ψ⟩)† = ⟨ψ|,   (⟨ψ|)† = |ψ⟩,   (3.15)
and is often referred to as the adjoint or Hermitian conjugate. This map is antilinear, such that:
(α|ψ⟩ + β|φ⟩)† = ᾱ⟨ψ| + β̄⟨φ|.   (3.16)
One of the key virtues of Dirac notation is that it naturally allows us to write inner products such that
⟨ψ|φ⟩ = (⟨ψ|)(|φ⟩), that is, we form an inner product by acting with a bra from the left on a ket. For example,
(α⟨ψ| + β⟨φ|)|ω⟩ = α⟨ψ|ω⟩ + β⟨φ|ω⟩,
⟨ω|(α|ψ⟩ + β|φ⟩) = α⟨ω|ψ⟩ + β⟨ω|φ⟩.   (3.17)
The best way to understand how useful Dirac notation can be is to see it in action: Consider a
two-dimensional complex vector space V^2, with orthonormal basis {|e_1⟩, |e_2⟩}. From the definition of
orthonormality, we have:
⟨e_1|e_1⟩ = ⟨e_2|e_2⟩ = 1 and ⟨e_1|e_2⟩ = ⟨e_2|e_1⟩ = 0.   (3.18)
As this is a basis, then by definition we can represent any vector in V^2(C) in terms of it, such that
|ψ⟩ = a|e_1⟩ + b|e_2⟩ where a, b ∈ C.   (3.19)
The corresponding bra is then,
⟨ψ| = ā⟨e_1| + b̄⟨e_2|.
Using the definition of orthonormality we then have:
⟨ψ|ψ⟩ = (ā⟨e_1| + b̄⟨e_2|)(a|e_1⟩ + b|e_2⟩) = |a|^2 + |b|^2.
Go through the following exercises to get some practice with Dirac notation:
Exercise 3.3
For two vectors |ψ⟩ = a|e_1⟩ + b|e_2⟩ and |φ⟩ = c|e_1⟩ + d|e_2⟩, where a, b, c and d ∈ C, show that:
⟨φ|ψ⟩ = ac̄ + bd̄.   (3.20)
Show that ⟨φ|ψ⟩ = \overline{⟨ψ|φ⟩} for this specific case, and generalise this to V^N(C).
Exercise 3.4
Write the column vectors,
\begin{pmatrix} 2+i \\ i \end{pmatrix} and \begin{pmatrix} 5+i \\ i \end{pmatrix},   (3.21)
in Dirac notation, and calculate their norms and the inner product, assuming that the basis used
is orthonormal.
Exercise 3.5
A vector |a⟩ ∈ V^N can be represented in the orthonormal basis {|e_j⟩}_{j=1}^{N} as |a⟩ = \sum_{j=1}^{N} a_j |e_j⟩.
Show that a_j = ⟨e_j|a⟩.
Proof. This result (the Cauchy–Schwarz inequality, |⟨a|b⟩| ≤ ‖a‖ ‖b‖) is trivially proven if either |a⟩ or
|b⟩ is the null vector, |0⟩. Assuming |a⟩ and |b⟩ are not |0⟩, the proof is found by first considering
|u⟩ = |a⟩ − λ|b⟩, where λ = ⟨b|a⟩/‖b‖²; the inequality then follows from ⟨u|u⟩ ≥ 0.
The above inequality holds for any inner product space, and is a very useful result. In the context of
quantum mechanics you can use the Cauchy–Schwarz inequality to prove a general form of the uncertainty
principle, ΔA ΔB ≥ |⟨[A, B]⟩|/2, where A and B are two Hermitian observables.
Another useful result is the triangle inequality, which follows from the Cauchy–Schwarz inequality:
Exercise 3.6
Prove the triangle inequality using properties of the Cauchy-Schwarz inequality.
• Section 8.1.2 and 8.1.3 in Riley, Hobson and Bence. This is all in vector notation.
• Section 1.2-1.5 Shankar. This is all in Dirac notation, and would act as a nice revision of the
basic principles of Vector spaces.
Please do ensure that you are comfortable with Dirac notation, as we will be using it extensively
throughout the rest of the course. Problems Q7-11 on problem sheet 1 and Q1-5 on problem sheet 2
cover the material for this lecture.
Lecture 4
Linear operators
Now we have introduced vector spaces, alongside the notion of distance and direction provided to us
by the inner product, we can now consider how a vector can be manipulated or transformed. In the
language of linear algebra, vectors are transformed by linear operators, which we will introduce in this
lecture.
By the end of the lecture, you should be able to:
• write linear operators in terms of Dirac notation using the outer product.
• Define the adjoint of an operator, and determine whether an operator is unitary or Hermitian.
consider a vector |c⟩ = µ|a⟩ + λ|b⟩, where |a⟩, |b⟩ ∈ V and λ, µ are scalars. A linear operator Â
is defined such that:
|c′⟩ = Â|c⟩ = µ(Â|a⟩) + λ(Â|b⟩) ∈ W.   (4.1)
We say that the operator Â has mapped our vector |c⟩ ∈ V to a new vector |c′⟩ in another vector
space W.
For the rest of this course we will restrict ourselves to the simplifying case W = V; this is the most
important case for applications of linear algebra in quantum mechanics, although it is not always true
in more general applications of linear operators.
(i) The addition of linear operators is distributive, such that, for two operators Â and B̂, we have (Â + B̂)|v⟩ = Â|v⟩ + B̂|v⟩.
(ii) We also define that scalar multiplication for a linear operator is given by (λÂ)|v⟩ = λ(Â|v⟩).
(iv) The null operator is given by Ô|v⟩ = |0⟩, where |0⟩ is the null vector.
Exercise 4.1
Use the above definitions to show that the set of all linear operators acting on a vector space is
itself a vector space.
where Â operates on |v⟩ first, followed by B̂. Note that in general products of operators are not
commutative, that is, B̂Â ≠ ÂB̂.
Using this definition of the product of operators we have one final definition, the inverse of an
operator:
Definition 4.2: Inverse operator
If for an operator Â there exists an operator B̂ such that ÂB̂ = B̂Â = 1̂, then B̂ is called the inverse of Â, which we denote B̂ = Â^{-1}.
By the above definitions, we can identify |c⟩⟨a| as a linear operator. This operator is sometimes referred
to as a dyad, or described as the outer product of the vectors |c⟩ and |a⟩,
Consider the complex vector space V^N(C). The outer product of vectors |a⟩, |b⟩ ∈
V^N, denoted |a⟩⟨b|, is a linear operation which constructs a linear operator on V^N. The outer
product has the properties:
• It is linear in the first argument, that is, if |c⟩ = µ|a⟩ + λ|b⟩, then |c⟩⟨d| = µ|a⟩⟨d| + λ|b⟩⟨d|.
• It is anti-linear in its second argument, that is, if |d⟩ = µ|a⟩ + λ|b⟩, then |c⟩⟨d| = µ̄|c⟩⟨a| + λ̄|c⟩⟨b|.
Let us illustrate the action of projection operators with an example: Suppose an inner product space
V^N is spanned by the orthonormal basis {|e_j⟩}_{j=1}^{N}; then we may construct the operator P̂_j = |e_j⟩⟨e_j|.
We can see trivially that P̂_j^2 = |e_j⟩⟨e_j|e_j⟩⟨e_j| = P̂_j, and P̂_j is therefore a projection operator. If we
consider its action on an arbitrary vector in V^N, |b⟩ = \sum_{k=1}^{N} b_k |e_k⟩, then we have:
P̂_j|b⟩ = \sum_{k=1}^{N} b_k P̂_j|e_k⟩ = \sum_{k=1}^{N} b_k |e_j⟩⟨e_j|e_k⟩ = b_j|e_j⟩,   (4.4)
where we have used that ⟨e_j|e_k⟩ = δ_{jk} for an orthonormal basis. The projection operator P̂_j therefore
projects out the j-th component of the vector |b⟩.
Exercise 4.2
Suppose an inner product space V^N is spanned by the orthonormal basis {|e_j⟩}_{j=1}^{N}. Show that
P̂ = |e_1⟩⟨e_1| + |e_2⟩⟨e_2| is a projection operator, and determine its action on an arbitrary state
|b⟩ = \sum_{k=1}^{N} b_k |e_k⟩.
It is important to recognise that there exists a completeness relation, 1̂ = \sum_{j=1}^{N} |e_j⟩⟨e_j|, for any orthonormal basis of V^N.
We will see later that it can be used to change the representation of a vector between orthonormal basis
sets.
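As a numerical illustration of projectors, outer products, and the completeness relation (a minimal sketch; the basis and vector are arbitrary choices):

    import numpy as np

    # Standard orthonormal basis of C^3, written as column vectors
    e = [np.eye(3)[:, j] for j in range(3)]

    # Projector P_j = |e_j><e_j| as an outer product
    P = [np.outer(ej, ej.conj()) for ej in e]

    b = np.array([1.0 + 2.0j, -1.0j, 3.0])
    print(P[1] @ b)                        # picks out the j=1 component: [0, -1j, 0]
    print(np.allclose(sum(P), np.eye(3)))  # completeness: sum_j |e_j><e_j| = identity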
For a linear operator Â, we can define its adjoint (or synonymously, Hermitian conjugate) Â†
as:
⟨u|Â†|v⟩ = \overline{⟨v|Â|u⟩}   ∀ |u⟩, |v⟩ ∈ V.   (4.6)
If we choose |v⟩ = |e_j⟩ and |u⟩ = |e_i⟩, where the vectors {|e_j⟩}_{j=1}^{N} are an orthonormal basis for V,
then we find:
(Â†)_{ij} = ⟨e_i|Â†|e_j⟩ = \overline{⟨e_j|Â|e_i⟩} = Ā_{ji},   (4.7)
so in matrix form, the Hermitian conjugate or adjoint of an operator corresponds to its conjugate
transpose. This is the same mapping that we used to send kets to bras.
Exercise 4.3
Prove the following statements,
1. (ÂB̂)† = B̂†Â†.
The definition of the adjoint allows us to define two very important classes of operators, which we
encounter frequently in both linear algebra and quantum mechanics:
An operator Â is Hermitian (self-adjoint) if Â† = Â. In matrix language, we find that the components satisfy A_{ij} = Ā_{ji}. For a real vector space, a
self-adjoint matrix is equivalent to a real symmetric one.
An operator Û is unitary if
Û†Û = ÛÛ† = 1̂,   (4.9)
that is, Û^{-1} = Û†.
For a real vector space then the Hermitian conjugate reduces to the transpose, and we have
Û T Û = 1̂, which is the definition of an orthogonal matrix.
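In matrix language both definitions are easy to check numerically; a minimal sketch (the matrices below are standard examples chosen for illustration):

    import numpy as np

    def dagger(M):
        return M.conj().T   # adjoint = conjugate transpose

    sigma_y = np.array([[0, -1j],
                        [1j,  0]])
    U = (1 / np.sqrt(2)) * np.array([[1, 1],
                                     [1, -1]])   # real orthogonal, hence unitary

    print(np.allclose(dagger(sigma_y), sigma_y))   # Hermitian: A^dagger = A
    print(np.allclose(dagger(U) @ U, np.eye(2)))   # Unitary: U^dagger U = 1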
As we discussed in lecture 2, while a decomposition of a vector into a particular basis set is unique, the
choice of basis is not. For a variety of reasons, it is often useful to change the basis of a vector or linear
operator. For example, in quantum mechanics if we wish to measure a particular property of a quantum
system, it is often convenient to write the state of the system in the eigenbasis of the measurement.
The expansion coefficients of the resulting basis will then give the probability of measuring each of the
possible outcomes. For example, if we wish to measure the spin of an electron in the Z direction, we
can write its state as,
|ψ⟩ = α|↑⟩ + β|↓⟩,   (5.1)
the probability of measuring spin up can then be trivially identified as |α|^2, and similarly for spin down,
|β|^2.
By the end of this lecture you should be able to:
• Change the basis of a vector and linear operator.
• Write vectors and linear operators as matrices.
Since both bases are complete and orthonormal, they each have a completeness relation associated to them,
1̂ = \sum_{j=1}^{N} |e_j⟩⟨e_j|  and  1̂ = \sum_{j=1}^{N} |f_j⟩⟨f_j|.   (5.2)
Both completeness relations are completely valid representations of the identity, but written in terms
of a different set of basis vectors. These completeness relations will be crucial for changing the basis
of a given vector or operator.
Figure 5.1.1: A figure showing a change of basis for a vector |v⟩. Here we have rotated the basis vectors |x⟩ and
|y⟩ by some angle α.
We have not changed the vector, applying only the identity operator through the completeness relation,
but we now have two representations of the same vector. Effectively, we have changed the basis of our
vector. Intuitively, we can think of changing basis as a rotation of the coordinate system which leaves
the vector |vi invariant.
To make this more concrete, let's take a simple example of a two-dimensional vector space, with
orthonormal basis vectors {|x⟩, |y⟩}. Since this is a complete orthonormal basis, there exists a completeness
relation 1̂ = |x⟩⟨x| + |y⟩⟨y|. An arbitrary vector in this space, |v⟩, may be written as,
|v⟩ = v_x|x⟩ + v_y|y⟩,
where v_x = ⟨x|v⟩ and v_y = ⟨y|v⟩. Now, suppose we wish to rotate the coordinate system by some angle
α, as shown in Fig. 5.1.1. In this case we have a new basis {|x′⟩, |y′⟩}, which can be written in terms
of the unrotated basis as,
|x′⟩ = cos α |x⟩ + sin α |y⟩  and  |y′⟩ = cos α |y⟩ − sin α |x⟩.   (5.6)
Associated with this new basis is a completeness relation, 1̂ = |x′⟩⟨x′| + |y′⟩⟨y′|. This allows us to write,
|v⟩ = v_{x′}|x′⟩ + v_{y′}|y′⟩,   (5.7)
with v_{x′} = ⟨x′|v⟩ and v_{y′} = ⟨y′|v⟩. We have not done anything to our vector, but simply changed the
definition of the axes we use to represent it.
Let's do a quick example. Consider the vector space R2, which has an orthonormal basis {|x⟩, |y⟩}.
A vector |v⟩ in this space is given by,
|v⟩ = |x⟩ + 2|y⟩.   (5.8)
Let's say we want to rewrite |v⟩ in terms of the linearly independent basis,
|u⟩ = |x⟩ + |y⟩  and  |w⟩ = |x⟩ − |y⟩.   (5.9)
To do this, we first rewrite the basis vectors |x⟩ and |y⟩ in terms of |u⟩ and |w⟩,
|x⟩ = \frac{1}{2}(|u⟩ + |w⟩)  and  |y⟩ = \frac{1}{2}(|u⟩ − |w⟩).   (5.10)
Subbing this into our expression for |v⟩, we have,
|v⟩ = \frac{1}{2}(|u⟩ + |w⟩) + (|u⟩ − |w⟩) = \frac{3}{2}|u⟩ − \frac{1}{2}|w⟩.   (5.11)
Exercise 2.4
Consider the two-dimensional inner-product space V^2(C). This space has an orthonormal
basis {|0⟩, |1⟩}. We can use this basis to construct a new basis {|+⟩, |−⟩}, such that
|±⟩ = \frac{1}{\sqrt{2}}(|0⟩ ± |1⟩).
(c) Consider another vector |φ⟩ = γ|+⟩ + δ|−⟩. Show that the inner product
⟨φ|ψ⟩ = \frac{1}{\sqrt{2}}\left( γ̄(α + β) + δ̄(α − β) \right),
is the same in both the {|0⟩, |1⟩} and {|+⟩, |−⟩} basis sets.
(d) Show that any inner product is left invariant by a change of basis.
It is worth noting that while a vector |vi might look quite different when written in terms of
different basis vectors, the underlying vector is the same. One way to see this is to consider the inner
product of the vector with itself:
Consider a vector |v⟩ ∈ V^N, where the vector space V^N has two orthonormal bases {|e_j⟩}_{j=1}^{N} and {|f_j⟩}_{j=1}^{N}, with expansions |v⟩ = \sum_j v_j|e_j⟩ = \sum_j c_j|f_j⟩.
If the inner product is left invariant by a change of basis, then we must have,
\sum_{j=1}^{N} |v_j|^2 = \sum_{j=1}^{N} |c_j|^2.   (5.15)
Now recall that v_j = ⟨e_j|v⟩ and c_j = ⟨f_j|v⟩, therefore we have,
\sum_{j=1}^{N} ⟨v|e_j⟩⟨e_j|v⟩ = \sum_{j=1}^{N} ⟨v|f_j⟩⟨f_j|v⟩.   (5.16)
where the term ⟨e_j|Â|e_k⟩ specifies how the operator acts on the basis vectors. We can also apply
the completeness relation in terms of another orthonormal basis {|f_j⟩}_{j=1}^{N}, yielding a completely
equivalent representation of the operator,
Â = 1̂Â1̂ = \left( \sum_{j=1}^{N} |f_j⟩⟨f_j| \right) Â \left( \sum_{k=1}^{N} |f_k⟩⟨f_k| \right) = \sum_{j,k=1}^{N} ⟨f_j|Â|f_k⟩ |f_j⟩⟨f_k|.   (5.19)
This operator has exactly the same action on any given vector; we have simply changed the
basis with which we represent it.
As an example, let us consider a complex vector space C^2 with orthonormal basis
{|0⟩, |1⟩}. One of the Pauli operators that acts on this space can be written as,
σ̂_x = |0⟩⟨1| + |1⟩⟨0|.   (5.20)
Say we want to rewrite σ̂_x in terms of a second orthonormal basis for C^2, {|+⟩, |−⟩}, where |±⟩ =
\frac{1}{\sqrt{2}}(|0⟩ ± |1⟩). We can do this in two ways: first, we can write the basis vectors {|0⟩, |1⟩} in terms of
{|+⟩, |−⟩}, such that,
|0⟩ = \frac{1}{\sqrt{2}}(|+⟩ + |−⟩)  and  |1⟩ = \frac{1}{\sqrt{2}}(|+⟩ − |−⟩).   (5.21)
Subbing these expressions into σ̂_x, we have,
σ̂_x = |0⟩⟨1| + |1⟩⟨0|
= \frac{1}{2}(|+⟩ + |−⟩)(⟨+| − ⟨−|) + \frac{1}{2}(|+⟩ − |−⟩)(⟨+| + ⟨−|)
= \frac{1}{2}(|+⟩⟨+| − |+⟩⟨−| + |−⟩⟨+| − |−⟩⟨−|) + \frac{1}{2}(|+⟩⟨+| + |+⟩⟨−| − |−⟩⟨+| − |−⟩⟨−|)
= |+⟩⟨+| − |−⟩⟨−|.   (5.22)
Alternatively, we could have written the completeness relation in terms of the basis {|+⟩, |−⟩},
and found the action of σ̂_x on these basis vectors. This again yields,
σ̂_x = |+⟩⟨+| − |−⟩⟨−|.   (5.24)
Both approaches are completely valid, and yield the same result. However, the latter approach is
often more convenient and systematic when dealing with high-dimensional vector spaces. The above
representation of ˆx is actually a very special representation, known as the spectral decomposition of
the operator, which we will return to in Lecture 7.
Supporting questions for this lecture can be found in Problem Sheet 2 Question 9.
Lecture 6
Matrix representations of vectors and operators
The more observant of you may have noticed that there is an intimate link between matrices, operators,
and vectors. Indeed, as we shall see in this lecture, we can represent any vector or operator in terms
of a matrix. When doing this it is important to specify which basis you are using to write the matrix
representation, as —unlike the underlying vector or operator— the matrix representation will change
depending on the basis used.
In this lecture we will,
|v⟩ = \sum_{j=1}^{N} v_j |e_j⟩.   (6.1)
This is what is referred to as a representation of |v⟩. Now, as this complete basis leads to a unique set
of coefficients v_j, it is common to express this simply as an N × 1 matrix, or equivalently a column
vector
|v⟩ → \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_N \end{pmatrix}.   (6.2)
The arrow notation above should be read as ‘is represented by’. With this notation, then, we can identify
the basis vectors {|e_j⟩} as having matrix representations:
|e_1⟩ → \begin{pmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad |e_2⟩ → \begin{pmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \cdots, \quad |e_N⟩ → \begin{pmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}.   (6.3)
It is important to note that the above matrix representation is not unique. Just as we can represent a
vector in terms of a different basis, we can also write its matrix representation in terms of a different
basis. The resulting vector will look the same, but the context in which it is represented will be
different. Therefore, when you write a matrix representation of a vector, it is crucial that you state
the basis in which it is represented.
Just as there is a matrix representation of ket vectors, we can also associate a matrix representation
with bra vectors. This can be found by recognising that a bra vector is simply the adjoint of a ket
vector, or in other words, the conjugate transpose, hv| = |vi† . Therefore, we have,
⟨v| = \sum_{j=1}^{N} v_j^* ⟨e_j| → \begin{pmatrix} v̄_1 & v̄_2 & \cdots & v̄_N \end{pmatrix}.   (6.4)
where we have used that c_j = ⟨e_j|c⟩ and b_j = ⟨e_j|b⟩. We have also introduced A_{jk} = ⟨e_j|Â|e_k⟩, which
we refer to as the matrix elements of Â. Indeed, if you look back at the week 0 lecture notes (or your
first year maths notes), you can see that c_j = \sum_{k=1}^{N} A_{jk} b_k is the definition of matrix multiplication!
So we can represent the expression |c⟩ = Â|b⟩ in matrix form as:
\begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_N \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1N} \\ A_{21} & A_{22} & \cdots & A_{2N} \\ \vdots & \vdots & \cdots & \vdots \\ A_{N1} & A_{N2} & \cdots & A_{NN} \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_N \end{pmatrix},   (6.7)
where the matrices and vectors are represented in the {|e_j⟩}_{j=1}^{N} basis.
Another useful and equivalent representation of the operator  is in terms of the outer product
introduced in the previous section
Â = 1̂Â1̂ = \left( \sum_{j=1}^{N} |e_j⟩⟨e_j| \right) Â \left( \sum_{k=1}^{N} |e_k⟩⟨e_k| \right) = \sum_{j,k=1}^{N} A_{jk} |e_j⟩⟨e_k|,
where A_{jk} = ⟨e_j|Â|e_k⟩ as before. Since the above representation is equivalent to the matrix representation
of Â, this allows us to write a matrix definition of the outer product, such that for two vectors
|a⟩ = \sum_{j=1}^{N} a_j |e_j⟩ and |b⟩ = \sum_{j=1}^{N} b_j |e_j⟩, the matrix representation of the outer product is
|a⟩⟨b| → \begin{pmatrix} a_1 b̄_1 & a_1 b̄_2 & \cdots & a_1 b̄_N \\ a_2 b̄_1 & a_2 b̄_2 & \cdots & a_2 b̄_N \\ \vdots & \vdots & \cdots & \vdots \\ a_N b̄_1 & a_N b̄_2 & \cdots & a_N b̄_N \end{pmatrix} = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_N \end{pmatrix} \begin{pmatrix} b̄_1 & b̄_2 & \cdots & b̄_N \end{pmatrix}.   (6.8)
The outer product is often also referred to as the Dyadic product, where two vectors are used to
construct a matrix, and can sometimes be seen written as ab† .
In the case of the completeness relation, we can use the dyadic product to construct a particularly
simple form for the identity operator. If we consider the matrix representation of the outer product
|e_1⟩⟨e_1| → \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \begin{pmatrix} 1 & 0 & \cdots & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \cdots & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}.   (6.9)
Repeating this procedure for the full basis, and adding the results together, we have
1̂ = \sum_{j=1}^{N} |e_j⟩⟨e_j| → \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \cdots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix},   (6.10)
where we have unity on the diagonal, and zeroes everywhere else. Alternatively, one could find the
matrix elements as 1̂_{jk} = ⟨e_j|1̂|e_k⟩ = ⟨e_j|e_k⟩ = δ_{jk}.
The underlying vector is left unchanged, however the matrix representations have changed. Therefore,
it is crucial to state what basis a particular matrix representation is given in. Though it is slightly
cumbersome notation, an arrow with the basis stated above it is useful for keeping track of which basis
has been used in the representation of a vector or matrix.
Let's now consider a vector space V^N with two different orthonormal bases {|e_j⟩}_{j=1}^{N} and {|f_j⟩}_{j=1}^{N}.
Each basis can be used to represent a vector |v⟩, with matrix elements a_j = ⟨e_j|v⟩ and b_j = ⟨f_j|v⟩.
How can we change between the two equivalent representations of |v⟩?
Since the vector is left unchanged by our choice of basis, we can write,
|v⟩ = \sum_{j=1}^{N} a_j |e_j⟩ = \sum_{k=1}^{N} b_k |f_k⟩.   (6.13)
If we take the inner product with the basis vector |e_l⟩ and use the orthonormality of the basis, then
we have:
a_l = ⟨e_l|v⟩ = \sum_{k=1}^{N} b_k ⟨e_l|f_k⟩ = \sum_{k=1}^{N} (U)_{lk} b_k.   (6.14)
Notice that the right-hand side of this expression is simply a matrix product between a matrix U
with matrix elements (U)_{lk} = ⟨e_l|f_k⟩ and the column of coefficients b_k, where I have used boldface
to denote a matrix. So, if we want to change the basis with which we represent a vector, we simply
take the matrix product:
\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_N \end{pmatrix} = \begin{pmatrix} ⟨e_1|f_1⟩ & \cdots & ⟨e_1|f_N⟩ \\ ⟨e_2|f_1⟩ & \cdots & ⟨e_2|f_N⟩ \\ \vdots & \cdots & \vdots \\ ⟨e_N|f_1⟩ & \cdots & ⟨e_N|f_N⟩ \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_N \end{pmatrix}.   (6.15)
We can interpret the action of the matrix U as a rotation of the column vector into a new basis. This
can be shown explicitly by considering the example given in Fig. 5.1.1. For this example, we transform
the representation of a vector written in the basis {|xi , |yi} into the new basis {|x0 i , |y 0 i} using the
matrix equation
\begin{pmatrix} v_{x′} \\ v_{y′} \end{pmatrix} = \begin{pmatrix} ⟨x′|x⟩ & ⟨x′|y⟩ \\ ⟨y′|x⟩ & ⟨y′|y⟩ \end{pmatrix} \begin{pmatrix} v_x \\ v_y \end{pmatrix} = \begin{pmatrix} \cos α & \sin α \\ −\sin α & \cos α \end{pmatrix} \begin{pmatrix} v_x \\ v_y \end{pmatrix},   (6.16)
where we have used the definition of the |x′⟩ and |y′⟩ given in Eq. 5.6. Notice that the matrix used
to change representation has the same form as a rotation matrix in two dimensions. If you are unsure
about this result, then you can check it using geometric arguments.
An important property of the matrix used to transform between representations is that it is unitary:
(U†U)_{jk} = \sum_{m=1}^{N} (U†)_{jm} (U)_{mk} = \sum_{m=1}^{N} ⟨f_j|e_m⟩⟨e_m|f_k⟩ = ⟨f_j| \left( \sum_m |e_m⟩⟨e_m| \right) |f_k⟩ = δ_{jk},   (6.17)
which are the components of the identity matrix. So we only need one matrix to switch between
representations of vectors.
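A worked numerical sketch of Eqs. (6.14)–(6.17), assuming the standard column representations of the {|0⟩, |1⟩} and {|+⟩, |−⟩} bases and an arbitrarily chosen set of coefficients:

    import numpy as np

    # Column representations of the two bases, written in the {|0>, |1>} basis
    e = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]            # {|0>, |1>}
    f = [np.array([1.0, 1.0]) / np.sqrt(2),
         np.array([1.0, -1.0]) / np.sqrt(2)]                    # {|+>, |->}

    # U_lk = <e_l | f_k>, as in Eq. (6.14)
    U = np.array([[np.vdot(el, fk) for fk in f] for el in e])

    b = np.array([0.3, 0.7])         # components of |v> in the f-basis (arbitrary)
    a = U @ b                        # components of the same vector in the e-basis
    print(a)
    print(np.allclose(U.conj().T @ U, np.eye(2)))   # U is unitary, Eq. (6.17)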
Although the operator is equivalent in both bases, the matrix elements A_{jk} ≠ Ã_{jk} are different. We
can find the relationship between the two representations in a similar fashion to Eq. 6.14:
A_{jk} = \sum_{lm} ⟨e_j|f_l⟩ Ã_{lm} ⟨f_m|e_k⟩,   (6.19)
where we see that this is once again in matrix product form. If we suppose that A is the matrix
representation of Â with respect to the basis {|e_j⟩}_{j=1}^{N}, and likewise Ã is its representation w.r.t.
{|f_j⟩}_{j=1}^{N}, then the representations can be linked through the product,
A = UÃU†.   (6.20)
However, say we want to transform the representation of σ_x into the basis {|+⟩, |−⟩}. First, we find
the unitary matrix U that transforms between representations,
U = \begin{pmatrix} ⟨+|0⟩ & ⟨+|1⟩ \\ ⟨−|0⟩ & ⟨−|1⟩ \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & −1 \end{pmatrix}.   (6.22)
Transforming σ_x with this matrix then gives a diagonal matrix with entries ±1,
which is the same as the spectral representation we found in the previous lecture. An important note to
finish this lecture with is that the underlying operator has not changed, only its matrix representation.
Therefore, whenever you write a matrix for an operator (or a vector), be clear exactly which basis you
are using to represent it.
Supporting questions for this lecture can be found in Problem Sheet 2 Question 8.
Lecture 7
Eigenvalue problems and Diagonalisation
It is hard to emphasise enough the importance of the eigenvalue problem. It emerges in all branches
of science and mathematics: you've already seen that quantum mechanics boils down to solving eigenvalue
problems, and in the maths of waves and fields you have seen that eigenvalue problems arise in the solution
of differential equations. Here we are going to study the eigenvalue problem in the context of linear
operators and vector spaces, but keep in mind that these ideas are transferable to all manner of physical contexts!
By the end of this lecture you should be able to:
• Derive the properties of the eigenvalues and eigenvectors of Hermitian and Unitary operators.
• Define the spectral representation of a linear operator and diagonalise the matrix representation
of linear operators.
Â|u⟩ = λ|u⟩,
where λ and |u⟩ are an eigenvalue and eigenvector associated to Â respectively, and |u⟩ ≠ |0⟩,
the null vector. Rearranging the eigenvalue equation gives
(Â − λ1̂)|u⟩ = 0,
where we have used |u⟩ = 1̂|u⟩. For an N-dimensional vector space, this equation has non-trivial
solutions (i.e. |u⟩ ≠ |0⟩) if
det(Â − λ1̂) = 0,   (7.3)
which generates a polynomial of degree N, with N solutions for λ ∈ C. One can then find the
eigenvectors by substituting a particular value of λ into the eigenvalue equation.
Exercise 7.1
A vector space V^2 has an orthonormal basis {|0⟩, |1⟩}. The Pauli operators σ̂_x, σ̂_y and σ̂_z have
the following matrix representations with respect to the basis {|0⟩, |1⟩}:
σ_z = \begin{pmatrix} 1 & 0 \\ 0 & −1 \end{pmatrix}, \quad σ_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad and \quad σ_y = \begin{pmatrix} 0 & −i \\ i & 0 \end{pmatrix}.   (7.4)
Find the eigenvalues and eigenvectors for each of the Pauli operators.
It is worth noting that each distinct eigenvalue gives a distinct eigenvector. If an eigenvalue is
repeated m > 1 times, we say that it is degenerate, and there may be up to m linearly independent
eigenvectors corresponding to this eigenvalue.
Let Â be a Hermitian operator, that is, Â† = Â. The eigenvalues of this operator are real, and
its eigenvectors are orthogonal.
⟨u_j|Â|u_k⟩ = ⟨u_j|Â†|u_k⟩ = \overline{⟨u_k|Â|u_j⟩} = λ̄_j ⟨u_j|u_k⟩,   (7.7)
and since we also have ⟨u_j|Â|u_k⟩ = λ_k ⟨u_j|u_k⟩, subtracting the two expressions gives
(λ_k − λ̄_j) ⟨u_j|u_k⟩ = 0.   (7.8)
There are two cases to consider: for j = k, we must have λ_j = λ̄_j ∈ R, since ⟨u_j|u_j⟩ > 0 for all
|u_j⟩ ≠ |0⟩. For j ≠ k, the eigenvalues are distinct, therefore λ_j ≠ λ_k, so we must have ⟨u_j|u_k⟩ = 0.
Exercise 7.3
Prove the above theorem as an exercise.
Â = \sum_{j=1}^{N} λ_j |u_j⟩⟨u_j|.
This is called the spectral representation of Â. In matrix form, this corresponds to a diagonal matrix
with matrix elements ⟨u_j|Â|u_k⟩ = λ_k ⟨u_j|u_k⟩ = λ_k δ_{jk}. This yields,
Â → A_diag = \begin{pmatrix} λ_1 & 0 & \cdots & 0 \\ 0 & λ_2 & \cdots & 0 \\ \vdots & \vdots & \cdots & 0 \\ 0 & 0 & \cdots & λ_N \end{pmatrix}   (in the basis {|u_j⟩}_{j=1}^{N}).   (7.12)
Given a matrix representation of Â in terms of the orthonormal basis {|e_j⟩}_{j=1}^{N}, which we denote A,
we can transform this operator into its eigenbasis following the same procedure given in Section 6.3,
A_diag = TAT†,   (7.13)
where the elements of the unitary matrix are given by (T)_{jk} = ⟨u_j|e_k⟩; equivalently, the j-th column of
T† is the j-th eigenvector of Â written in the {|e_j⟩} basis.
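A short numpy check of Eq. (7.13) (a sketch with an arbitrary Hermitian matrix). Note that numpy's eigh returns the eigenvectors as the columns of its second output, so the matrix T with elements (T)_{jk} = ⟨u_j|e_k⟩ is the conjugate transpose of that output:

    import numpy as np

    A = np.array([[2.0, 1.0 - 1.0j],
                  [1.0 + 1.0j, 3.0]])      # an arbitrary Hermitian matrix

    vals, V = np.linalg.eigh(A)            # columns of V are the eigenvectors |u_j>
    T = V.conj().T                         # (T)_jk = <u_j|e_k>

    A_diag = T @ A @ T.conj().T            # T A T^dagger
    print(np.round(A_diag, 10))            # diagonal, with the eigenvalues on the diagonal
    print(vals)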
Exercise 7.4
Using the eigenvalues and eigenvectors found previously, write the Pauli operators in diagonal
form.
So Â|u⟩ is also an eigenvector of B̂ with eigenvalue λ. The implications of this statement depend on
whether the eigenvalue is non-degenerate or degenerate:
(1) Non-degenerate: If λ is non-degenerate, then there is only one eigenvector |u⟩ associated to the
eigenvalue λ. This means that Â|u⟩ = µ|u⟩, where µ is a scalar. Therefore |u⟩ is a simultaneous
eigenvector of B̂ and Â.
(2) Degenerate: If λ has degeneracy m > 1, with corresponding eigenvectors {|u_j⟩}_{j=1}^{m}, then any
linear combination of these eigenvectors is still an eigenvector of B̂. The operator Â acts within this
subspace, and it is possible to find a basis that simultaneously diagonalises Â and B̂.
Let’s start by doing some quantum mechanics: The time-dependent Schrödinger equation can be
written in vector form as,
iℏ \frac{∂}{∂t}|ψ⟩_t = Ĥ|ψ⟩_t.   (8.1)
This is precisely the same Schrödinger equation you used in Introduction to QM, but we replace the
wavefunction with a ket, and Ĥ is the Hamiltonian operator. Since this is a simple first-order ODE,
the solution to this differential equation is (which we will prove later),
|ψ⟩_t = e^{−iĤt/ℏ}|ψ⟩_0,   (8.2)
where we have exponentiated the Hamiltonian. But how do we calculate exp(Â)? More generally,
how do we apply functions f to operators?
In this lecture we will,
• Apply a function to an operator using the spectral representation.
• Find the trace of an operator and show that it is basis independent.
If we build up the matrix representation of f(Â) in the eigenbasis of Â in the usual way, then we
obtain:
f(Â) → diag( f(λ₁), f(λ₂), …, f(λ_N) ),   (8.5)
i.e. we have a diagonal matrix with the function applied to the eigenvalues of Â along the diagonal.
This is only valid if the series expansion of f(x) converges for all x = λ_j.
Let's do an example: Consider the 2D vector space ℂ², with orthonormal basis {|0⟩, |1⟩}. The
Pauli operator σ̂_x = |0⟩⟨1| + |1⟩⟨0| has a spectral decomposition,
σ̂_x = |+⟩⟨+| − |−⟩⟨−|,   (8.6)
where |±⟩ = (|0⟩ ± |1⟩)/√2 are the eigenvectors of σ̂_x. We can define a rotation in ℂ² as
R̂(θ) = exp(−iθσ̂_x) = e^{−iθ}|+⟩⟨+| + e^{iθ}|−⟩⟨−|, or in matrix form (with respect to the basis {|0⟩, |1⟩}),
R̂(θ) → ( cos(θ)  −i sin(θ) ; −i sin(θ)  cos(θ) ).   (8.10)
An alternative way to find the matrix representation of R̂(θ) would be to use the definition of the
Taylor expansion:
R̂(θ) = Σ_{n=0}^{∞} ((−iθ)ⁿ/n!) σ̂_xⁿ,   (8.11)
and note that σ̂_x² = 1̂, so that we have two terms in the expansion:
R̂(θ) = σ̂_x Σ_{n odd} (−iθ)ⁿ/n! + 1̂ Σ_{n even} (−iθ)ⁿ/n!
      = cos(θ) 1̂ − i sin(θ) σ̂_x.   (8.12)
Both are valid approaches and lead to the same result. One advantage of the latter, however, is
that it does not require knowledge of the eigenvectors of σ̂_x. This is useful from a computational
point of view, as finding the eigenvectors of an operator in an N-dimensional space can be a computationally
expensive task: the computational complexity (i.e. the time taken for a computational task to run) of an
eigendecomposition scales as O(N³). Therefore it is often advantageous to use the Taylor expansion and
repeatedly multiply matrices together to find the function of an operator.
Exercise 8.1
Find the exponential f(Ĥ) = exp(−βĤ), where Ĥ is a Hermitian operator with eigenbasis
{|ε_i⟩}_{i=1}^N, and β ∈ ℝ.
As a final note, a property that will be useful in your studies of quantum mechanics: if
Â is a Hermitian operator, then the function
Û(θ) = exp(−iÂθ),
where θ ∈ ℝ, is also unitary. This can easily be seen using the spectral representation of Â: with
Â = Σ_{j=1}^N λ_j |λ_j⟩⟨λ_j|, we have
Û(θ) = exp(−iÂθ) = Σ_{j=1}^N exp(−iλ_j θ) |λ_j⟩⟨λ_j|.   (8.13)
This is exactly the form of the time-evolution operator in quantum mechanics, where Â is the Hamiltonian
operator and θ is the time.
Consider a linear operator Â which acts on the vector space V^N. For a given orthonormal basis
{|e_j⟩}_{j=1}^N, the trace of Â is the sum of its diagonal matrix elements,
Tr(Â) = Σ_{j=1}^N ⟨e_j|Â|e_j⟩.   (8.16)
The trace is independent of the choice of orthonormal basis. To see this, we consider a second
orthonormal basis {|f_j⟩}_{j=1}^N, which allows us to write Â = Σ_{lm} Ã_{lm} |f_l⟩⟨f_m|, with Ã_{lm} = ⟨f_l|Â|f_m⟩.
Taking the trace with respect to the basis |e_j⟩, we have:
Tr(Â) = Σ_{j=1}^N ⟨e_j|Â|e_j⟩ = Σ_{j=1}^N ⟨e_j| ( Σ_{lm} Ã_{lm} |f_l⟩⟨f_m| ) |e_j⟩
      = Σ_{lmj} Ã_{lm} ⟨e_j|f_l⟩⟨f_m|e_j⟩
      = Σ_{lmj} Ã_{lm} ⟨f_m|e_j⟩⟨e_j|f_l⟩
      = Σ_{lm} Ã_{lm} ⟨f_m| ( Σ_j |e_j⟩⟨e_j| ) |f_l⟩
      = Σ_{lm} Ã_{lm} ⟨f_m|f_l⟩
      = Σ_{l=1}^N ⟨f_l|Â|f_l⟩.   (8.17)
(i) The trace is linear, that is, for operators Â and B̂ and scalars α, β, we have Tr(αÂ + βB̂) = α Tr(Â) + β Tr(B̂).
(iii) The trace of the product of operators Â and B̂ is independent of the order of the product, that
is, Tr(ÂB̂) = Tr(B̂Â). This is true even if the operators are non-commuting.
(iv) It follows that the trace of a product of three or more operators is invariant under cyclic permutations:
Tr(ÂB̂Ĉ) = Tr(B̂ĈÂ) = Tr(ĈÂB̂).
(v) The complex conjugate of a trace corresponds to the adjoint of its argument:
[Tr(Â)]* = Tr(†),
[Tr(ÂB̂Ĉ)]* = Tr(Ĉ†B̂††),   (8.21)
where in the latter case, we have taken care to reverse the order of the operators in the product.
Exercise 8.2
Prove properties (i-v) of the trace.
If we have the eigenvalues and eigenvectors of an operator, then we can find a very simple expression
for the trace. Consider the spectral representation of a Hermitian operator Â, with normalised
eigenvectors {|λ_j⟩}_{j=1}^N and corresponding eigenvalues λ_j. Using the definition of the trace, we have:
Tr(Â) = Σ_{j=1}^N ⟨λ_j|Â|λ_j⟩ = Σ_{j=1}^N λ_j ⟨λ_j|λ_j⟩ = Σ_{j=1}^N λ_j.   (8.22)
Exercise 8.3
The quantum harmonic oscillator (QHO) is described by the Hamiltonian Ĥ = ℏω(n̂ + 1/2). The
Hermitian operator n̂ is often called the number operator, as its eigenvectors {|n⟩}_{n=0}^∞ correspond
to the number of excitations in the QHO, and satisfy the eigenvalue equation n̂|n⟩ = n|n⟩.
(c) The average energy of the QHO while in thermal equilibrium can be found using the trace
formula ⟨Ĥ⟩ = Tr(Ĥρ̂). Show that
⟨Ĥ⟩ = ℏω/(e^{βℏω} − 1) + ℏω/2.
This is a formula you will come across a number of times in your statistical mechanics
course. [Hint: you can use the identity −(1/ℏω) ∂/∂β Σ_{n=0}^∞ e^{−βnℏω} = Σ_{n=0}^∞ n e^{−βnℏω}.]
• Shankar pg 54-57.
Over the past 8 lectures we have made nebulous references to functions being linked to vectors and
linear algebra. In this lecture we will make this link explicit and formalise the relationship between
functions and linear algebra.
Specifically, by the end of this lecture you should be able to:
• Define the concept of completeness in function space and the coordinate representation.
Figure 9.1.1: A string clamped between two points A and B. The continuous function of the amplitude ψ(x)
pictured in (a) can be approximated by the discrete points shown in (b).
Let's consider an example of a vibrating string clamped between two points A and B, as shown in
Fig. 9.1.1. At a given instant, the oscillation of this string is described by the continuous function ψ(x)
for x ∈ [A, B]. A natural way to store this function in a computer would be to discretise the x axis into
N pieces and evaluate the function at each point x_n = n∆x, where ∆x = (B − A)/N. The result will be
a list of N numbers, (ψ(x₁), ψ(x₂), …, ψ(x_N)), which we can treat as a vector.
For two functions f and g which are defined on the interval x ∈ [a, b], the inner product can be
quite naturally defined as
⟨f|g⟩ = ∫_a^b f̄(x) g(x) dµ(x),   (9.2)
where dµ(x) is called the measure of the function space. The choice of this measure can vary
significantly depending on the function space considered.
For example, in the case of applications in quantum mechanics, we have dµ(x) = dx, and the inner
product of two wavefunctions ψ(x) and φ(x) is:
⟨ψ|φ⟩ = ∫_a^b ψ̄(x) φ(x) dx.   (9.3)
which is 0 only if f(x) = 0 ∀x. If ‖f‖² is finite, then we say that f is square integrable. Vector spaces
of functions with an inner product defined, where all functions have finite norm, are of such importance
to physics that they have a special name, they are called Hilbert spaces. In fact, last semester you
made extensive use of Hilbert spaces in quantum mechanics!
Take the time to check for yourselves that the above definition of an inner product satisfies the
same requirements as in Sec. 3.1, i.e., it is linear in the second argument, antilinear in the first, and
⟨f|g⟩ = (⟨g|f⟩)*.
The above definition of an inner product allows us to naturally extend the notion of orthogonality
to functions: Two functions f and g are orthogonal if hf |gi = 0.
Note: This is the first place where we have to be cautious about convergence. If the domain of integration
is infinite (i.e. a or b = ±∞) then f and g must tend to zero sufficiently quickly as |x| → ∞ for
the inner product to be finite. This is similar to the requirement we have for the existence of Fourier
transforms.
where the coefficients f_n are defined as in Sec. 3.1 through the inner product,
f_n = ⟨u_n|f⟩ = ∫_a^b ū_n(x) f(x) dx.   (9.6)
Note Eq. 9.5 is written in terms of the components of f, since we reference some value of x. We could
instead make the vector nature of f more explicit by writing the above expansion in Dirac notation,
|f⟩ = Σ_{n=1}^∞ f_n |u_n⟩.
While the language we have used to describe the expansion in 9.5 may be somewhat new to you,
the idea of expanding one function in terms of a set of functions should be quite familiar to you. Here
are two examples you should have encountered before:
Example 1. Consider the wavefunctions of the quantum harmonic oscillator, u_n(x) = A_n H_n(x) e^{−x²/2},
where H_n(x) are the Hermite polynomials and A_n is a normalisation constant. These functions
form an orthonormal basis for the space of square-integrable functions on the domain x ∈ (−∞, ∞),
and we can write the wavefunction for a general state of the quantum harmonic oscillator as
ψ(x) = Σ_n α_n u_n(x), where α_n is a probability amplitude.
Example 2. Consider the set of periodic functions on [−π, π]. The set of plane waves u_n(x) =
e^{inx}/√(2π) can be used as an orthonormal basis:
⟨u_n|u_m⟩ = (1/2π) ∫_{−π}^{π} e^{i(m−n)x} dx = { 1 for n = m,  0 for n ≠ m }.   (9.7)
Now a set of basis functions is said to be complete if any function in the space can be represented
by the basis; this necessarily implies the existence of a completeness relation. For example, consider
the functions f, g ∈ F. If the set of basis functions {u_n}_{n=1}^∞ is a complete orthonormal basis for F,
then this implies
f(x) = Σ_{n=1}^∞ f_n u_n(x)   and   g(x) = Σ_{n=1}^∞ g_n u_n(x),   (9.8)
with f_n = ⟨u_n|f⟩ and g_n = ⟨u_n|g⟩. Taking the inner product, we then have
⟨f|g⟩ = Σ_{n=1}^∞ f̄_n g_n = Σ_{n=1}^∞ ⟨f|u_n⟩⟨u_n|g⟩ = ⟨f| ( Σ_{n=1}^∞ |u_n⟩⟨u_n| ) |g⟩,   (9.9)
where δ(x) is the Dirac δ-function. Though this expression looks more intimidating with the presence
of the δ-function, it is equivalent to the completeness relations we have studied previously. To see this,
consider the components of the identity operator we used in the discrete case, (1̂)_{ij} = δ_{ij}, where the
Kronecker-δ is 1 when i = j, and 0 for all other values. Now, recall that for function spaces u_n is a
vector and u_n(x) is a specific component of u_n labelled by x. So we can understand Eq. 9.11 as the
components of the identity operator in function space, where δ(x − y) is zero everywhere except when
x = y. There is a complication here that when x = y, the δ-function is ill defined unless under an
integral sign, however the intuition holds.
where rather than a sum over the basis, we have used an integral since x is a continuous variable. This
definition allows us to straightforwardly identify
f(x) = ⟨x|f⟩ = ∫_a^b δ(y − x) f(y) dy,   (9.13)
where we have noticed that the δ-function satisfies the requirements to recover f(x) from the inner
product. So the Dirac δ-function plays the role of a basis vector for position vectors. We will generally
stick with Dirac notation |x⟩ as the basis vector, as manipulating δ-functions is fraught with problems
since they are not well defined outside of integrals. Ignoring these strict mathematical considerations,
we can define the overlap between two position vectors as
⟨x|x′⟩ = ∫_a^b δ(y − x) δ(y − x′) dy = δ(x − x′),   (9.14)
Note: The basis |x⟩ is not square integrable, since ⟨x|x⟩ = δ(x − x) = δ(0) = ∞. So the basis |x⟩ is
not normalisable!
Aside: We can see the equivalence with the functional form of the completeness relation given in
Eq. 9.11 explicitly. If we consider the matrix elements of 1̂ with respect to the position basis,
⟨x|1̂|y⟩ = ⟨x|y⟩ = δ(x − y).
Notice that we are using the arrow to indicate that g(x) is the representation of |g⟩ with respect to
the basis |x⟩. We can show this by considering the components of g(x),
g(x) = ⟨x|g⟩ = ⟨x|X̂|f⟩ = ∫_a^b dx′ ⟨x|X̂|x′⟩⟨x′|f⟩ = ∫_a^b dx′ x′ ⟨x|x′⟩⟨x′|f⟩ = x ⟨x|f⟩,   (9.17)
where we have used ⟨x|x′⟩ = δ(x − x′). Alternatively, since |x⟩ is an eigenstate of X̂, we could
write the spectral decomposition of X̂ as
X̂ = ∫_a^b x |x⟩⟨x| dx.   (9.18)
Exercise 9.1
Show the spectral decomposition of X̂ given in Eq. 9.18. Demonstrate that X̂ is a hermitian
operator.
Another important class of linear operators in function spaces are differential operators, e.g.
|g⟩ = K̂|f⟩  →  g(x) = Σ_n h_n(x) dⁿf(x)/dxⁿ   (in the |x⟩ basis),   (9.19)
where the h_n(x) are arbitrary functions of x. Check that the above operator satisfies the definitions of
linearity.
As an example, let's consider a linear operator D̂ whose action on f is to take the first derivative,
that is,
|g⟩ = D̂|f⟩  ⟷  g(x) = df(x)/dx.   (9.20)
An interesting property of the operator D̂ can be seen by considering the inner product
⟨h|g⟩ = ⟨h|D̂|f⟩ = ∫_a^b h̄(x) g(x) dx,   (9.21)
where h is some arbitrary function. Using Eq. 9.20 and integrating by parts, we see that
⟨h|D̂|f⟩ = ∫_a^b h̄(x) (df/dx) dx = [h̄(x)f(x)]_a^b − ∫_a^b f(x) (dh̄/dx) dx = −(⟨f|D̂|h⟩)*,
where we have assumed the functions vanish at the boundaries x = a and x = b. Comparing with the
definition of the adjoint, ⟨h|D̂|f⟩ = (⟨f|D̂†|h⟩)*, we see that D̂† = −D̂, so the
differential operator is anti-Hermitian, also referred to as skew-Hermitian. A Hermitian operator can
be constructed from D̂ using
K̂ = −iD̂,   (9.22)
which should look very familiar as the momentum operator you used in quantum mechanics!
In the last lecture, we showed that functions can be readily represented as vectors in a vector space,
defining the concepts of basis functions and differential operators in the language of linear algebra. In
this lecture we will show how to change basis and introduce the momentum representation of functions,
which complements the coordinate representation introduced last lecture.
By the end of this lecture you should be able to:
• Change the basis used to represent a function.
• Define the momentum representation of a function, and change between coordinate and momen-
tum representation.
• Define the spectral representation of a linear operator in the momentum representation.
Inserting this resolution of the identity into the coordinate representation of f, we obtain
|f⟩ = ∫_a^b dx f(x) 1̂|x⟩ = ∫_a^b dk ( ∫_a^b dx f(x) ⟨k|x⟩ ) |k⟩ = ∫_a^b dk f̃(k) |k⟩.   (10.4)
Recall that K̂ is Hermitian, and therefore its eigenvalues are real, k ∈ ℝ, and its eigenvectors are
orthogonal, satisfying ⟨k|k′⟩ = δ(k − k′). Applying the position-basis bra to the eigenvalue equation, we have
−i dφ_k(x)/dx = k φ_k(x)   ⟹   ⟨x|k⟩ = φ_k(x) = N e^{ikx},   (10.9)
where N is a normalisation constant. Substituting this expression back into Eq. 10.5, we have
f̃(k) = N̄ ∫_{−∞}^{∞} dx f(x) e^{−ikx},   (10.10)
Exercise 10.1
Use the orthonormality of the momentum eigenstates |k⟩ to show that N = 1/√(2π).
We can therefore define two completely equivalent representations of the function f as
|f⟩ = 1̂|f⟩ = ∫_{−∞}^{∞} dx f(x) |x⟩,   (10.11)
|f⟩ = 1̂|f⟩ = ∫_{−∞}^{∞} dk f̃(k) |k⟩.   (10.12)
In the first expression the expansion coefficients are the values of f(x); we call this the position representation.
In the second, the expansion coefficients are the values of f̃(k), and we refer to this as the momentum
representation.
On first acquaintance, this theorem probably seems slightly mysterious. However, with the machinery
of vector spaces it has a clear interpretation: the inner product of two functions is independent of the
choice of basis. This becomes even clearer when we consider Parseval's theorem, which for g(x) = f(x)
reads
∫_{−∞}^{∞} |f(x)|² dx = ∫_{−∞}^{∞} |f̃(k)|² dk,   (10.14)
which essentially states that the norm of a vector f is independent of the choice of basis.
Exercise 10.2
Using the definition of the inner product in function space, ⟨f|g⟩ = ∫_{−∞}^{∞} f̄(x) g(x) dx, prove the
multiplication theorem.
Notice the similarities with the spectral decomposition of the position operator X̂ in the position
representation.
Now things are a little more interesting for the position operator:
⟨k|X̂|f⟩ = ∫_{−∞}^{∞} ⟨k|x⟩⟨x|X̂|f⟩ dx = (1/√(2π)) ∫_{−∞}^{∞} x f(x) e^{−ikx} dx = i d f̃(k)/dk.   (10.17)
So in the momentum representation the position operator looks just like the momentum operator in position
space!
It is hard to overstate just how important complex numbers are to modern physics and mathematics.
Throughout your first and second year courses, you will no doubt have represented classically oscillating
systems in terms of complex exponentials, e.g. in simple harmonic motion. Then in quantum mechanics,
complex numbers began to take a more central role, appearing in linear operators and in the
wavefunction of a system. In the remainder of this course, we are going to develop the ideas of complex
analysis, and demonstrate just how powerful complex numbers can be. This will culminate in our
proof of Cauchy's theorem (quite possibly the most beautiful theorem in mathematics), where real
integrals that are impossible to solve analytically with standard tools from real analysis are evaluated
by simply counting the number of infinities in the complex plane. But before we get to this, we have
some work to do and tools to develop.
In this lecture we will:
• Revise the basic properties of complex numbers, and their polar representation.
• Define the concept of branch points and branch cuts of multivalued functions.
z = x + iy,   where x, y ∈ ℝ,
where we have introduced the imaginary unit with i² = −1, and we denote that a number is complex by
z ∈ ℂ. We will often write x = Re(z) and y = Im(z), which define the real and imaginary parts
respectively. Two complex numbers can be added,
(x + iy) + (a + ib) = (x + a) + i(y + b),
as well as multiplied:
(x + iy)(a + ib) = (xa − yb) + i(ya + xb).
We also have the complex conjugate z̄ = x − iy, which allows us to define the modulus of a complex
number:
|z|² = z z̄ = x² + y² ≥ 0.   (11.1)
Figure 11.1.1: (a) An argand diagram for the complex number z = x + iy. (b) An argand diagram for a product
of complex numbers, note that the magnitude of p has not been drawn to scale.
z = re^{iθ},   (11.2)
where r = √(x² + y²) is the magnitude of z, and θ = arg(z) = arctan(y/x) is its argument, or sometimes
phase. Graphically, we can represent a complex number using an Argand diagram (which is sometimes
simply referred to as the complex plane), as shown in Fig. 11.1.1.
Using the polar representation, multiplication of complex numbers has a simple geometric interpretation:
if z = re^{iθ} and w = ρe^{iφ}, then the product of the two is p = zw = rρ e^{i(θ+φ)},
where the magnitude of the new complex number is |p| = rρ and the argument is θ + φ. We can also
write a complex exponential in terms of Euler's identity, e^{iθ} = cos(θ) + i sin(θ).
Combining the above representations with the properties of the exponential function yields de Moivre's
theorem:
zⁿ = rⁿ e^{inθ} = rⁿ (cos(nθ) + i sin(nθ)).   (11.4)
It is important to note that from the polar representation of z, the argument of z is not unique, since
e^{iθ} = e^{i(θ + 2πm)} for any integer m. So we have that arg(z) = θ + 2mπ, where m ∈ ℤ. To get around
this multi-valued nature of the argument, we define the principal value of arg(z), which is restricted to
the range −π < θ ≤ π. The argument restricted to this range will be denoted Arg(z).
Given any positive integer n ≥ 1 and any choice of complex numbers a₀, a₁, …, aₙ such that
aₙ ≠ 0, the polynomial equation
P(z) = aₙzⁿ + ⋯ + a₁z + a₀ = 0   (11.5)
has at least one solution z ∈ ℂ.
Put another way, the polynomial P(z) has n (not necessarily distinct) roots, allowing us to write
P(z) = aₙ(z − z₁)(z − z₂)⋯(z − zₙ),
where {z_k ∈ ℂ}_{k=1}^n are the roots of P(z) = 0. Although this may appear obvious, or perhaps not
very interesting on first acquaintance, it's worth taking a moment to reflect on just how remarkable
this statement is: if I were to restrict the coefficients a_k to the reals, then there are clearly innumerable
situations where the roots of P(z) cannot be expressed as real numbers. Similarly, if the a_k are taken
to be rational, then it's perfectly possible to get irrational roots, e.g. for P(z) = z² − 2, the roots are
±√2. Clearly complex numbers are a rather special set of numbers, with the roots of P(z) belonging
to the same field as the polynomial. Unfortunately, we are not quite equipped to prove this theorem
at the moment; we will tackle this later, but it is sufficiently important for us to state it here without
proof.
Exercise 11.1:
Consider the polynomial P(z) = aₙzⁿ + ⋯ + a₁z + a₀. If the coefficients a_k ∈ ℝ ∀k, prove that
the roots are either real or occur in complex conjugate pairs (i.e. if z_k is a root then z̄_k is
also a root).
θ = pπ/3 = 0, π/3, 2π/3, π, 4π/3, 5π/3,
and plugging this back in, we have
z_p = e^{ipπ/3} = 1, e^{iπ/3}, e^{i2π/3}, −1, e^{i4π/3}, e^{i5π/3};
these are the six 6th roots of unity. The Argand diagram given in Fig. 11.2.1 shows the roots, which lie
on the unit circle.
Figure 11.2.1: The 6th roots of unity.
[Figure: a point z with argument π/3 in the z-plane is mapped under w = z^{1/2} to a point with argument π/6 in the w-plane.]
so a point in the complex plane maps to half the angle and with magnitude reduced by a square root.
More interesting things happen if we consider the action of the function z^{1/2} on all points in the
complex plane. Before applying the function, z can be defined anywhere in the complex plane, as
indicated by the hatched area in Fig. 11.3.1. Now applying the function and assuming we take the
principal range for the arguments, i.e. −π < Arg(z) ≤ π, the new complex variable w has argument
−π/2 < Arg(w) ≤ π/2. After applying the function, the new complex number is defined only for
Re(w) ≥ 0, only half the complex plane, as shown in Fig. 11.3.1. Clearly this is not the full picture:
what happened to the negative square root?
This is a consequence of not being sufficiently general with our definition of z: we need to use the
definition,
So the square root function is multi-valued, i.e. every value of z ∈ ℂ can be mapped to two different
points in the w-plane. The exception to this occurs at the origin z = 0. This is a special point, known
as a branch point:
A point z0 is a branch point of a multivalued function f (z) if the value of f (z) does not return
to its initial value as a closed circuit is traced out around that point in such a way that f varies
continuously along the circuit.
Let's apply this definition to the function f(z) = z^{1/2}, taking a small circuit in the complex plane
z = εe^{iθ} for −π ≤ θ ≤ π, where ε is very small. This defines a circle of radius ε centred about the
point z = 0. The function z^{1/2} has two possible values,
z^{1/2} = ε^{1/2} e^{iθ/2}   (branch 1),
z^{1/2} = ε^{1/2} e^{i(θ/2 + π)}   (branch 2).
Considering branch 1: as we cross the line separating θ = π and θ = −π, the circuit becomes
discontinuous:
θ = π   ⟹   z^{1/2} = ε^{1/2} e^{iπ/2} = iε^{1/2},
θ = −π  ⟹   z^{1/2} = ε^{1/2} e^{−iπ/2} = −iε^{1/2}.   (11.9)
Therefore, z = 0 is a branch point of z^{1/2}. So we say that z^{1/2} has two branches and one branch point
at z = 0.
Exercise 11.2:
Show that the function f(z) = [(z − a)(z − b)]^{1/2} has two branches, and branch points at z = a
and z = b.
In this chapter we will consider functions in the complex plane more generally, and in particular what
it means for a complex function to be continuous and differentiable. This will set the scene for one of the
most important results in complex analysis, the Cauchy–Riemann equations, which tell us whether or not
a function is differentiable. These will be relied on to prove many of the results developed later
in the course.
where u, v ∈ ℝ. Any complex function can be written in this form with varying degrees of algebra.
For example, consider the function f(z) = z̄/z; we have
f(z) = z̄/z = (x − iy)/(x + iy) × (x − iy)/(x − iy) = [(x² − y²) − 2ixy]/(x² + y²),   (12.1)
In this case, u(x, y) = e^x cos(y) and v(x, y) = e^x sin(y). Note that this also means that trigonometric and
hyperbolic functions can be written in this form, for example,
sin(z) = (1/2i)(e^{iz} − e^{−iz}) = (1/2i)(e^{ix}e^{−y} − e^{−ix}e^{y}) = sin(x) cosh(y) + i cos(x) sinh(y).   (12.3)
The more astute of you may have spotted a potential ambiguity in how we distinguish between
trigonometric functions and their hyperbolic counterparts, since sin(z) contains both hyperbolic and
trig terms. For concreteness, we define the two families of functions using the convention
cosh(z) = (e^z + e^{−z})/2,   cos(z) = (e^{iz} + e^{−iz})/2,
sinh(z) = (e^z − e^{−z})/2,   sin(z) = (e^{iz} − e^{−iz})/(2i).   (12.4)
The familiar identities still work, for example cos²(z) + sin²(z) = 1, but in general it is not guaranteed
that |sin(z)| ≤ 1.
There are various types of singularity that we will discuss in this course; we have already met one
family of them: Branch points are singularities since the associated function takes different values as
we approach it from different directions. Another family of singularities are those points at which a
function is ill-defined, for example f (z) = 1/z has a singularity at z = 0.
Figure 12.3.1: A figure showing the approach to point z0 along two different paths.
Let us assume that the function f(z) = u(x, y) + iv(x, y) is analytic in the complex plane. This
means that the derivative exists, such that
df/dz (z₀) = lim_{z→z₀} [ (u(x, y) − u(x₀, y₀)) + i(v(x, y) − v(x₀, y₀)) ] / [ (x − x₀) + i(y − y₀) ].   (12.8)
Using the path independence of analytic functions, we can equate the real and imaginary parts of both
of these expressions to obtain the Cauchy–Riemann equations,
∂u/∂x = ∂v/∂y,   ∂u/∂y = −∂v/∂x.
It is important to note that we have made no assumptions about our function f(z) other than that it is
analytic. This means that the Cauchy–Riemann equations provide a necessary and sufficient condition
for a complex function to be differentiable. In other words, if a function is analytic then it must
satisfy the Cauchy–Riemann equations; conversely, if a function satisfies the Cauchy–Riemann equations then it must
be analytic. Therefore, when we are asking whether a function is differentiable, we need only check whether it
satisfies the Cauchy–Riemann equations.
A function f (z) = u(x, y) + iv(x, y) is said to be analytic at a point z0 if and only if it satisfies
the Cauchy Riemann equations at that point.
lim_{z→z₀} f(z)/g(z) = lim_{z→z₀} f′(z)/g′(z).   (12.13)
The chain rule is also valid, such that if f(z) is analytic, and g(w) is analytic at w = f(z), then g(f(z))
is analytic at z. This yields
dg/dz = (dg/df)(df/dz).   (12.14)
Similarly, the product rule is also valid: for f(z) and g(z) analytic, we have
d(fg)/dz = f dg/dz + (df/dz) g.   (12.15)
∇²g(x, y) = ∂²g(x, y)/∂x² + ∂²g(x, y)/∂y² = 0.   (12.16)
If f(z) = u(x, y) + iv(x, y) is analytic, then u(x, y) and v(x, y) are necessarily harmonic, that is, they
each satisfy Laplace's equation (12.16).
This can be proven using the Cauchy–Riemann equations. Taking the partial derivative of ∂u/∂x = ∂v/∂y
with respect to x,
∂²u/∂x² = ∂²v/∂x∂y,   (12.18)
and similarly taking the derivative of ∂u/∂y = −∂v/∂x with respect to y,
∂²u/∂y² = −∂²v/∂x∂y.   (12.19)
Combining the two,
∂²u/∂x² = −∂²u/∂y²,   (12.20)
which is clearly a solution of Laplace's equation. A similar result can be obtained for v(x, y).
It is quite common to see the functions u and v satisfying the above condition called conjugate
functions: if we are given one of these functions, we can work out the other (up to an additive constant).
To make this clear, let's do an example:
Suppose f(z) is analytic, with u(x, y) = λx³ + 3xy². Determine λ and v(x, y).
Since u must be harmonic, we compute
∂²u/∂x² = 6λx,   (12.21)
and
∂²u/∂y² = 6x.   (12.22)
Equating the two via Laplace's equation, we have
∂²u/∂x² + ∂²u/∂y² = 6λx + 6x = 0.   (12.23)
Therefore, we have λ = −1. From the Cauchy–Riemann equations, we have
∂u/∂x = ∂v/∂y = −3x² + 3y².   (12.24)
Integrating with respect to y, we find v(x, y) = −3x²y + y³ + g(x). Taking the derivative with respect
to x and using the other Cauchy–Riemann equation,
∂v/∂x = −6xy + g′(x) = −∂u/∂y = −6xy.   (12.25)
Therefore, we find that g′(x) = 0 and g(x) = A is a constant. Putting this together, we obtain
v(x, y) = −3x²y + y³ + A,   so that   f(z) = u + iv = −z³ + iA.
Conformal mappings
In this chapter we will build on the ideas of functions in the complex plane, where we will now consider
how functions transform entire regions of the complex plane rather than single points. In this way, we
consider functions as mappings. We will use these methods, alongside the ideas of analytic functions,
to drastically simplify problems that are often encountered in physics. This will be done through the
use of conformal mappings, which are mappings that preserve the angle between lines.
Before we dive into conformal mappings, though, we will review functions and consider how they
may be thought of in the context of mappings.
Figure 6.1.1: The function z 2 as a mapping. Lines of constant x in the z-variable (blue) are mapped to the blue
parabolas in the w-plane. Similarly lines of constant y (red curves) are mapped to parabolas. Notice that the
angle between the red and blue is preserved by the mapping.
Consider the lines of constant x = c, where c is a real constant. These curves are illustrated by the
blue lines in Fig. 6.1.1. After the mapping, the lines of constant real part become w = f(c + iy) =
c² − y² + 2icy, which are naturally parabolic, as can be seen from the blue curves in the right-hand plot of
Fig. 6.1.1. Similarly, we can consider the lines of constant imaginary part of z, i.e. y = c, given by the
red lines in Fig. 6.1.1. In this case, these lines map to w = f(x + ic) = x² − c² + 2ixc, which are also
parabolas, as shown in Fig. 6.1.1 by the red curves. The savvy reader might notice something curious
about the above plots: though both families of curves are parabolic after the mapping, the red and blue
curves remain orthogonal to one another (with the exception of the point at w = 0). The mapping has
preserved the angle between them. This is a general feature of analytic functions, which we shall prove later.
Figure 6.1.2: The exponential function as a mapping. A line of length 2π maps to a circle in the complex plane.
A strip of thickness 2π, as given by the shaded region in the z-plane, maps to the entire w-plane.
The strip maps to the entire w-plane: you can see this from |w| = e^x; as we move along the strip the radius of the
circle gets larger and larger, encompassing the entire w-plane.
δw = w − w₀ = f(z₀ + δz) − f(z₀) = (df/dz)|_{z₀} δz = δz f′(z₀).   (6.3.0.2)
As f′(z₀) is complex, its phase is the angle through which the mapping rotates the tangent to the
curve at this point. Any other curve passing through the same point will also be rotated by the same
amount. So the angle between them is conserved.
Let's make this statement more explicit by supposing δz = εe^{iθ}, where ε ≪ 1, and writing f′(z₀) =
M e^{iα}. Then
δw = (εe^{iθ})(M e^{iα}) = (εM) e^{i(θ+α)}.   (6.3.0.3)
So locally f scales δz by some factor M and rotates it by a fixed angle α. Note: we require f′(z₀) ≠ 0,
otherwise the angles are clearly not preserved. If a shape exists over a region where f′(z) is roughly
constant in the z plane, then it is mapped (rotated and magnified) as a whole in the w plane.
Conformal mappings are useful for physicists because u and v are solutions of the Laplace equation,
∇²u = ∇²v = 0. We use a conformal mapping to transform a problem in a given geometry into a solvable
problem, and then, with the inverse mapping, obtain the solution. If we can find a solution that satisfies
the boundary conditions, then it is the only solution, as a consequence of the uniqueness theorem.
In this course we restrict the discussion to two dimensions, however this still includes many useful
problems in 3D which are symmetric around an axis. Note: exam questions will give guidance on what
mapping should be used (see the example sheets).
The general strategy here is:
1. Define the problem in the xy plane.
2. Find a mapping that takes curves of a constant quantity (e.g. electric potential) to a new, simpler
problem in the plane Z = X(x, y) + iY(x, y).
3. Solve the simpler problem in the Z-plane, then map the solution back via
u(x, y) + iv(x, y) = Φ(X(x, y), Y(x, y)) + iΨ(X(x, y), Y(x, y)),   (6.3.0.4)
where Φ(X, Y) is our solution in the XY plane and Ψ(X, Y) is chosen to make u + iv analytic.
Or, more simply, Φ and Ψ must satisfy the Cauchy–Riemann equations:
∂Φ/∂X = ∂Ψ/∂Y,   ∂Φ/∂Y = −∂Ψ/∂X.   (6.3.0.5)
¹ Conformal means 'same shape'.
Figure 6.3.1: Two infinite conducting plates (black) held at different potentials. The arising electric field is
drawn in blue and the equipotentials are represented in red.
6.3.1 Electrostatics
As a reminder, in electrostatics we have
E⃗ = −∇u,   ∇·E⃗ = ρ/ε₀,   (6.3.1.1)
where u is the potential. In a vacuum (ρ = 0), these yield the Laplace equation
∇²u = 0.   (6.3.1.2)
Solutions of eq. 6.3.1.2 differ according to the boundary conditions they satisfy. For instance, we can
require that the potential is constant on a conductor, or that the field is constant at large distances.
Figure 6.3.2: Two semi-infinite conducting plates (black) held at different potentials separated by an insulator
at the origin. The labelled points assist understanding the mapping.
Figure 6.3.3: Mapping of two conducting plates to Z = Ln z where the solution is trivial. Equipotentials are
drawn in red and field lines are drawn in blue. Labelled points are described in the text.
Figure 6.3.4: Two semi-infinite conducting plates (black) held at different potentials separated by an insulator
at the origin. Field lines are drawn in blue and equipotentials represented in red.
Figure 6.3.5: Left: two perpendicular heat conductors held at different temperatures with an insulator of radius
R closing the arc. Right: Z = Ln z map that transforms the conductors such that they are parallel.
This is analogous to the previous example where field lines are replaced by heat flow and equipotentials
by isotherms.
∇·(ρv⃗) = −∂ρ/∂t.   (6.3.3.2)
Rearranging and expanding we have
∂ρ/∂t + (∇ρ)·v⃗ + ρ(∇·v⃗) = 0.   (6.3.3.3)
As the fluid is incompressible, its density cannot change over time, i.e.
dρ/dt = 0.   (6.3.3.4)
Expanding the total derivative we find
dρ/dt = ∂ρ/∂t + (∂ρ/∂x)(dx/dt) + (∂ρ/∂y)(dy/dt).   (6.3.3.5)
If we assume the volume of interest moves at the same rate as the fluid³, then
dρ/dt = ∂ρ/∂t + (∇ρ)·v⃗ = 0.   (6.3.3.6)
Comparing with the continuity equation (eq. 6.3.3.3), we are left with
∇·v⃗ = 0.   (6.3.3.7)
This idealised fluid, which ignores friction/viscosity, is called potential flow². Lines tangential to the
flow (i.e. with no flow across them) are known as streamlines. These lines represent the trajectory a
particle would follow in the flow. The lines perpendicular to the streamlines are lines of constant velocity
potential; this is the scalar field u.
² You can study this in great detail in courses on fluid dynamics.
³ That is, (dx/dt, dy/dt) = v⃗.
Figure 6.3.6: Left: a cylinder centred at the origin of the xy plane with various positions indicated. Right: the
Z = z + z1 map to the XY plane with the transformed points shown. The cylinder is ‘squashed’ onto the X axis.
(x, y) → (X, Y):
C(−1, 0) → C′(−2, 0),
D(0, 1) → D′(0, 0),   (6.3.3.11)
E(−2, 0) → E′(−2.5, 0),
F(2, 0) → F′(2.5, 0).
Writing
X + iY = (r + 1/r) cos θ + i (r − 1/r) sin θ   (6.3.3.12)
means that Y = 0 if r = 1, or if θ = 0 or θ = π. So, lines parallel to the X axis are streamlines in this
mapping,
ψ = V₀ Y = V₀ (r − 1/r) sin θ,   (6.3.3.13)
where V₀ is the magnitude of the flow and r ≥ 1. The velocity potentials are parallel to the Y axis
and given by
φ = V₀ X = V₀ (r + 1/r) cos θ.   (6.3.3.14)
This solution is drawn in figure 6.3.7. This problem is identical to finding the resultant E⃗ field when
a conducting cylinder is placed in a uniform field; here the fluid streamlines represent the electric
equipotentials and the fluid velocity potentials represent the electric field lines.
Figure 6.3.7: Streamlines (red) and velocity potentials (blue) for fluid flowing around a cylinder.
Figure 6.3.8: Left: a channel of width 2π with various points defined. Right: the Z + e^Z = z mapping that
transforms the channel such that it 'folds back' on itself.
therefore
x = X + e^X cos Y,   y = Y + e^X sin Y.   (6.3.3.16)
Figure 6.3.9: Streamlines (red) and velocity potentials (blue) for fluid flowing out of a channel.
(x, y) → (X, Y):
A(−1 − e^{−1}, −π) → A′(−1, −π),
B(−1, −π) → B′(0, −π),
C(1 − e, −π) → C′(1, −π),   (6.3.3.17)
D(−1 − e^{−1}, π) → D′(−1, π),
E(−1, π) → E′(0, π),
F(1 − e, π) → F′(1, π).
So the mapping breaks the plot at B′ and E′ and folds back. This has the desired effect of maintaining
parallel streamlines within the channel, which then diverge on exit. The inverse mapping takes the
region −π ≤ Y ≤ π to the whole of the z-plane.
To proceed, we could solve for φ + iψ, but it is considerably easier to parametrically plot the
streamlines and velocity potentials in the xy plane by setting one of X and Y constant in each case,
for each line. This is shown in figure 6.3.9. Note that the velocity potential changes discontinuously
on the channel. These are branch cuts of the function Z = f(z).
Chapter 7
Complex (contour) integration
Figure 7.0.2: A curve C from a to b that can be split into two small paths C1 and C2 at some intermediate
point s.
where these are now line integrals defined for functions defined over the real plane. dx and dy are not
independent, but related by the path C. If we define d⃗r = (dx, dy), then
∫_C f(z) dz = ∫_C (u, −v)·d⃗r + i ∫_C (v, u)·d⃗r.   (7.5)
A number of conventions we use in real integrals also hold for complex integrals defined on a path.
For instance, if a path C runs from a to b, the reversed path will run from b to a; ∆zk → −∆zk .
Therefore,
∫_a^b f(z) dz = −∫_b^a f(z) dz,   (7.6)
or
∫_C f(z) dz = −∫_{−C} f(z) dz.   (7.7)
If s is a point on C between a and b, then we are also free to split the integral:
∫_a^b f(z) dz = ∫_a^s f(z) dz + ∫_s^b f(z) dz.   (7.8)
Or, equivalently,
∫_{C=C₁+C₂} f(z) dz = ∫_{C₁} f(z) dz + ∫_{C₂} f(z) dz.   (7.9)
Such a path where these relationships will hold is drawn in figure 7.0.2. We can also define an integral
as the difference of two other integrals:
∫_{C₁=C−C₂} f(z) dz = ∫_C f(z) dz − ∫_{C₂} f(z) dz.   (7.10)
If the path C is closed (a = b), the integral is written as '∮_C'. If the curve is non-self-intersecting,
it is known as a Jordan curve. The convention is that these are traversed anticlockwise².
² Unless specified otherwise.
Proof. We start by looking again at S_n = Σ_{k=1}^n f(ξ_k) ∆z_k. Taking the absolute value,
|S_n| ≤ Σ_{k=1}^n |f(ξ_k)| |∆z_k| ≤ M Σ_{k=1}^n |∆z_k|,   (7.12)
and in the limit the final sum tends to the length L of the path C, so |∫_C f(z) dz| ≤ ML,
as required.
7.2 Examples
7.2.1 ∫_0^{1+i} z² dz
To start, we will integrate along y = x², as drawn in fig. 7.2.1. On this path we have dy = 2x dx. The
function f(z) = z² = x² − y² + 2ixy. On the path y = x², this reduces to
f(z) = x² − x⁴ + 2ix³ = u + iv.   (7.14)
The integral becomes
∫_C f(z) dz = ∫_C (u dx − v dy) + i ∫_C (u dy + v dx)
            = ∫_0^1 (x² − x⁴) dx − ∫_0^1 2x³ dy + i ∫_0^1 (x² − x⁴) dy + i ∫_0^1 2x³ dx.   (7.15)
We could also consider a different path C ′ = C1′ + C2′ where C1′ is y = 0 for x ∈ [0, 1] and C2′ is
x = 1 for y ∈ [0, 1], as shown in figure 7.2.2. On C1′ we have
y = 0, dy = 0, u = x2 , and v = 0. (7.17)
y = 0, dy = 0, u = x², and v = 0.   (7.17)
∮_C (1/z) dz along |z| = 1
On this path we choose to write x and y in terms of polar coordinates such that
Chapter 8
Cauchy's theorem
If f(z) is analytic within and on a closed contour C enclosing a surface S (see fig. 8.0.1), then
∮_C f(z) dz = 0.   (8.1)
Proof. To prove this relation, we will start by considering purely real variables and then extend to
the complex case. Recall Stokes' theorem:
∮_C A⃗·d⃗r = ∫_S (∇⃗ × A⃗)·dS⃗.   (8.2)
Figure 8.0.2: Two contours C1 and C2 with the same initial and final points.
where we have used the Cauchy-Riemann equations (equations REF). Cauchy’s theorem therefore only
holds provided f (z) is analytic.
Exercise:
Why does ∮_C (1/z) dz ≠ 0, as we saw in subsubsection 7.2.1?
Cauchy's theorem may appear innocuous, but it allows us to state that there is path independence when
integrating between two points, provided the function is analytic. Suppose we have two open paths C₁
and C₂ with the same start and end points (see fig. 8.0.2), such that we can define a closed contour
C = C₂ − C₁. If f(z) is analytic in the region contained by C, then
∮_C f(z) dz = 0 = ∫_{C₂} f(z) dz − ∫_{C₁} f(z) dz,
∴ ∫_{C₂} f(z) dz = ∫_{C₁} f(z) dz.   (8.7)
Cauchy's theorem also allows us to define the integral as a function of its endpoint, i.e. allowing the
endpoint variable z to vary. Let
F(z) = ∫_a^z f(ξ) dξ,   (8.8)
Figure 8.0.3: The integral of f(ξ) along some paths C1 and C2 only depends on the endpoints z1 and z2.
In the above we have made use of the fact that f(z) is continuous, so
∫_z^{z+∆z} f(ξ) dξ → f(z) ∫_z^{z+∆z} dξ   as ∆z → 0,   (8.10)
and the path independence of contour integrals, so ∫_z^{z+∆z} dξ = ∆z for any line. So, for analytic
functions, F(z) = ∫_a^z f(ξ) dξ is the anti-derivative of f(z).
Consider the integral of f (ξ) along some path C from z1 to z2 as displayed in figure 8.0.3. We
could equally traverse paths C1 and C2 that pass through some point a. The integral along C therefore
becomes
∫_{z₁}^{z₂} f(ξ) dξ = ∫_{C₁} f(ξ) dξ + ∫_{C₂} f(ξ) dξ = −F(z₁) + F(z₂) = F(z₂) − F(z₁),   (8.11)
demonstrating explicit dependence only on the endpoints. If we change the starting point a, F(z)
only varies by an additive constant. This allows us to define indefinite complex integrals:
∫^z f(ξ) dξ,   (8.12)
Figure 8.0.4: Contours C and C ′ enclosing some non-analytic region depicted by the shaded area.
which are unique only up to an additive constant. This allows us to immediately write down indefinite
integrals for a variety of common functions, including:
∫ zⁿ dz = z^{n+1}/(n+1) + c,   ∫ cos(z) dz = sin(z) + c,
∫ (1/z) dz = ln z + c (for z ≠ 0),   ∫ e^z dz = e^z + c,   etc.,   (8.13)
where c ∈ ℂ. Revisiting the example we saw in subsubsection 7.2.1,
∫_0^{1+i} z² dz = (1/3) z³ |_0^{1+i} = (1/3)(1 + i)³ = (2/3)(−1 + i).   (8.14)
So we formally see the path independence of the integral ∫ z² dz.
Cauchy's theorem can also be applied to scenarios where the function is not analytic at isolated
points; such a function is known as a meromorphic function. For instance, we could consider f(z) = 1/z,
which is not defined at z = 0 (subsubsection 7.2.1). These points are called singularities.
In these cases, if two distinct contours contain the same region of non-analyticity, as shown in
figure 8.0.4, then
∮_C f(z) dz = ∮_{C′} f(z) dz,   which in general is ≠ 0.   (8.15)
Proof. Consider a composite contour as shown in figure 8.0.5. From Cauchy's theorem, this contour
must satisfy ∮ f(z) dz = 0. We can write this contour integral in terms of the four paths displayed:
∫_A^B f(z) dz + ∫_B^E f(z) dz + ∫_E^D f(z) dz + ∫_D^A f(z) dz = 0,   (8.16)
where the second and fourth integrals run along C and C′ respectively. As the two straight segments
are brought together, their contributions cancel. In this limit the integrals along paths C and C′ become closed:
∫_B^E f(z) dz → ∮_C f(z) dz   and   ∫_D^A f(z) dz → −∮_{C′} f(z) dz,   (8.18)
Figure 8.0.5: Path deformation of contours C and C ′ such that the enclosed area no longer contains the non-
analytic region. Labelled points described in main body.
Figure 8.0.6: The integral around a closed path that encompasses singularities is equivalent to the sum of smaller
contours that circle each singularity individually.
where we introduce a minus sign for contour C′ as the path is traversed clockwise. Therefore equation 8.16
becomes
∮_C f(z) dz − ∮_{C′} f(z) dz = 0,
∮_C f(z) dz = ∮_{C′} f(z) dz.   (8.19)
This technique of deforming contours can be extended to paths that enclose multiple singularities;
such a case is shown in figure 8.0.6.
What is the result of an integral around a contour that encloses one singularity? We have already
found this in the example seen in subsubsection 7.2.1:
∮_{|z|=1} (1/z) dz = 2πi.   (8.20)
By Cauchy's theorem this is true for any contour which encloses the point z = 0. Similarly,
∮_C dz/(z − a) = { 0 for a outside of C,   2πi for a inside of C }.   (8.21)
Figure 8.0.7: Two contours that surround a different number of the poles of f(z) = [(z − 2)(z − 3)]^{−1}.
The result in equation 8.21 is remarkably simple, yet is vital for the rest of the course.
8.0.1 Example
Consider the integral
∮_C dz/[(z − 2)(z − 3)],   (8.22)
which clearly has singularities at z = 2 and z = 3. We will first consider a contour C₁ defined by
|z| = 5/2, as depicted in figure 8.0.7. Contour C₁ only surrounds the pole at z = 2, so, after separating
the denominator into partial fractions, the integral becomes
∮_{C₁} [ −1/(z − 2) + 1/(z − 3) ] dz = −2πi + 0.   (8.23)
As the second pole is outside of this contour, it contributes 0 to the integral. If instead we were to
choose the contour C₂ defined by |z| = 4, then both poles would be inside it and we would have
∮_{C₂} dz/[(z − 2)(z − 3)] = −2πi + 2πi = 0.   (8.24)
f(a) = (1/2πi) ∮_C f(z)/(z − a) dz.   (8.25)
Often this is used the other way round.
Proof. To prove Cauchy's integral formula, we use Cauchy's theorem to shrink the contour to a small
circle around z = a, as shown in figure 8.1.1.
Consider C to be a circle of radius ε centred on a, so z = a + εe^{iθ} and dz = iεe^{iθ} dθ. Therefore,
∮_C f(z)/(z − a) dz = ∫_0^{2π} [ f(a + εe^{iθ}) / (εe^{iθ}) ] iεe^{iθ} dθ = i ∫_0^{2π} f(a + εe^{iθ}) dθ,   (8.26)
which is valid for any ε > 0. Now consider what happens as we shrink ε:
∮_C f(z)/(z − a) dz = i lim_{ε→0} ∫_0^{2π} f(a + εe^{iθ}) dθ = i ∫_0^{2π} f(a) dθ = 2πi f(a),   (8.27)
∴ f(a) = (1/2πi) ∮_C f(z)/(z − a) dz,
as required.
where we have made use of the Cauchy integral formula (eq. 8.25). Combining the two integrals we
find
f′(a) = lim_{δa→0} (1/2πi) ∮_C f(z) / [(z − a)(z − a − δa)] dz = (1/2πi) ∮_C f(z)/(z − a)² dz,   (8.30)
as desired.
We can generalise the above two results, extending to any derivative of order n, using
dⁿf(a)/daⁿ = f^{(n)}(a) = (n!/2πi) ∮_C f(z)/(z − a)^{n+1} dz.   (8.31)
8.1.1 Examples
∮_C (cos z / z) dz
Consider the integral
I = ∮_C (cos z / z) dz,   (8.32)
for some contour C that encloses z = 0. Applying Cauchy's integral formula (eq. 8.25) with a = 0,
so that 1/(z − a) = 1/z and f(z) = cos z, we have
I = 2πi f(a = 0) = 2πi cos 0 = 2πi.   (8.33)
So Cauchy's results mean we only need to determine whether the contour encloses a singularity, and then
determine the order of the pole.
∮_C dz/[(z − 2)(z − 3)] (again)
Here, we revisit the integral
I = ∮_C dz/[(z − 2)(z − 3)]   (8.34)
for a contour C defined by |z| = 4. Now, we make use of Cauchy's integral formula, splitting C into
two smaller contours that each just enclose one pole, as drawn in figure 8.1.2. Now, I becomes
I = ∮_{C₁} dz/[(z − 2)(z − 3)] + ∮_{C₂} dz/[(z − 2)(z − 3)]   (with f(z) = 1/(z − 3) on C₁ and f(z) = 1/(z − 2) on C₂)
  = 2πi [ 1/(z − 3)|_{z=2} + 1/(z − 2)|_{z=3} ]
  = 2πi [ 1/(2 − 3) + 1/(3 − 2) ] = 0.   (8.35)
Figure 8.1.2: A contour C that encloses the poles of f(z) = [(z − 2)(z − 3)]^{−1}, split into two smaller contours
C1 and C2 that enclose each pole individually.
Figure 8.1.3: A contour C that encloses the poles of f(z) = e^z/(z² + π²)², split into two smaller contours C1 and
C2 that enclose each pole individually.
∮_C e^z/(z² + π²)² dz for |z| = 4
Consider the integral
I = ∮_C e^z/(z² + π²)² dz   for |z| = 4.   (8.36)
The denominator of this integrand can be factorised to reveal two double poles,
I = ∮_C e^z / [(z − iπ)²(z + iπ)²] dz.   (8.37)
We can therefore replace contour C with two smaller contours around each of these poles at ±iπ:
I = ∮_{C₁} e^z/[(z − iπ)²(z + iπ)²] dz + ∮_{C₂} e^z/[(z − iπ)²(z + iπ)²] dz.   (8.38)
This is drawn in figure 8.1.3. We now apply Cauchy's general integral formula (eq. 8.31) for each pole.
Within C₁,
'f(z)' = e^z/(z + iπ)²,   (8.39)
∮_{C₁} e^z/[(z − iπ)²(z + iπ)²] dz = 2πi d/dz [ e^z/(z + iπ)² ]_{z=iπ} = (1/2π²)(iπ − 1).   (8.40)
Applying the same approach to the integral around C₂ we find
∮_{C₂} e^z/[(z − iπ)²(z + iπ)²] dz = 2πi d/dz [ e^z/(z − iπ)² ]_{z=−iπ} = (1/2π²)(iπ + 1).   (8.41)
So the result of this integral is
∮_C e^z/[(z − iπ)²(z + iπ)²] dz = i/π.   (8.42)
• We have expressions for f^{(n)}(a) in terms of f(z) on C. Therefore, all derivatives exist.
• For analytic f(z), the integral F(z) = ∫_a^z f(ξ) dξ is path independent and dF/dz = f(z). Hence, F(z) is analytic. Then, by Cauchy's formulas, all
higher derivatives of F(z) (and f(z)) exist. We can turn Cauchy's theorem around and say that
path independence of F(z) implies analyticity of f(z). Furthermore, path independence of F(z)
implies ∮_C f(z) dz = 0. This is known as Morera's theorem - the converse of Cauchy's theorem -
if
∮_C f(z) dz = 0   (8.44)
for any path C within a region R, then f(z) is analytic within R.
• We can also derive Liouville's theorem, which states that every bounded entire function must be constant.
Proof. We start by finding Cauchy's inequality, which comes from combining the estimation lemma
(eq. 7.11) and Cauchy's integral formula. Consider the integral of f(z) around a closed circular
contour C of radius R centred on z = a:
|f^{(n)}(a)| = | (n!/2πi) ∮_C f(z)/(z − a)^{n+1} dz | ≤ (n!/2π) (M/R^{n+1}) 2πR,
∴ |f^{(n)}(a)| ≤ M n! / Rⁿ.   (8.45)
This is Cauchy's inequality. If f(z) is analytic everywhere, then in the limit
R → ∞ all derivatives tend to zero, provided f(z) is bounded, i.e. has a maximum modulus
somewhere in the plane. Hence f(z) is constant. Mathematically, we could consider just the first
derivative in this limit, such that
lim_{R→∞} |f′(a)| ≤ lim_{R→∞} M/R = 0,
⟹ lim_{R→∞} f′(a) = 0,   (8.46)
∴ f(a) = constant.
So, every interesting (non-constant) function f(z) must have non-analyticities somewhere in the
infinite complex plane.
• We can also now prove the fundamental theorem of algebra: an nth order polynomial P (z) has
n roots (not necessarily distinct).
Writing P(z) = (z − z₁)Q(z), where Q(z) is an (n − 1)th order polynomial, we can repeat this step and show Q(z) has at least
one root:
P(z) = (z − z₁)(z − z₂)R(z),   (8.48)
where R(z) is an (n − 2)th order polynomial. Iterating this process, we show that P(z) has n
roots:
P(z) = (z − z₁)(z − z₂) ⋯ (z − zₙ) c,   (8.49)
where c is a constant.
If f(z) is analytic except for at P poles and has N zeros within some curve C, then
(1/2πi) ∮_C f′(z)/f(z) dz = N − P.   (8.50)
This type of function, analytic except at isolated poles, is known as meromorphic.
Proof. Start with a function with one pole of order p and one zero of order n within C,
f(z) = g(z) (z − β)ⁿ / (z − α)^p,   (8.51)
where g(z) is analytic with no zeros within C. Differentiating f(z) and dividing by f(z), we find
f′(z)/f(z) = g′(z)/g(z) + n/(z − β) − p/(z − α).   (8.53)
Figure 8.1.4: The argument theorem illustrated. The integral around a closed path that encompasses singularities
(dots) and zeros (crosses) is equivalent to the sum of smaller contours that circle each point individually.
∮_C f′(z)/f(z) dz = ∮_C g′(z)/g(z) dz + 2πi (n − p),   (8.54)
and the first term vanishes by Cauchy's theorem since g′/g is analytic within C. The same argument
extends to several zeros and poles, giving ∮_C f′(z)/f(z) dz = 2πi (N − P),
where N is the sum of the orders of the zeros of f(z) within C and P is the sum of the orders of all poles of f(z)
within C.
Another way we can understand the argument theorem is to replace the full contour C by a set of
small circular contours about each zero and pole, as shown in figure 8.1.4. Mathematically, this means
∮_C f′(z)/f(z) dz = Σ_i ∮_{C_i} f′(z)/f(z) dz = 2πi (N − P).   (8.59)
z_f = e^{2πi} z_i.   (8.61)
We now move to discuss power series involving complex numbers. This will allow us to broaden what
we have seen so far in complex integration.
If f(z) is analytic within a circle of radius R centred on a (fig. 9.0.1), then for all z such that
|z − a| < R, we can express f(z) as a power series:
f(z) = f(a) + f′(a)(z − a) + (1/2!) f″(a)(z − a)² + ⋯ + (1/n!) f^{(n)}(a)(z − a)ⁿ + ⋯.   (9.1)
Proof. Let C be a circular path within the region of convergence of f(z), and let z and a be within this
path, as shown in figure 9.0.1. Using Cauchy's integral formula, we can introduce a:
f(z) = (1/2πi) ∮_C f(ξ)/(ξ − z) dξ = (1/2πi) ∮_C f(ξ)/[(ξ − a) − (z − a)] dξ.   (9.2)
Expanding the denominator as a geometric series,
f(z) = (1/2πi) ∮_C [f(ξ)/(ξ − a)] [1 − (z − a)/(ξ − a)]^{−1} dξ
     = (1/2πi) ∮_C [f(ξ)/(ξ − a)] [1 + (z − a)/(ξ − a) + ((z − a)/(ξ − a))² + ⋯] dξ   (9.3)
     = Σ_{n=0}^{∞} (1/n!) (z − a)ⁿ f^{(n)}(a).
All Taylor series you have previously encountered can still be applied for complex variables within
their radius of convergence. For instance,
e^z = Σ_n zⁿ/n!,
sin z = z − z³/3! + ⋯,
cos z = 1 − z²/2! + ⋯,   and
ln(1 + z) = z − z²/2 + z³/3 + ⋯.   (9.4)
Figure 9.0.1: A circular contour centred on z = a inside the region of analyticity |z − a| < R of f(z).
As an example, consider f(z) = 1/(1 + z) expanded about z = 1. The derivatives are
f′(z) = −1/(1 + z)²,  f″(z) = 2/(1 + z)³,  f‴(z) = −6/(1 + z)⁴,  …,  f^{(n)}(z) = (−1)ⁿ n!/(1 + z)^{n+1}.   (9.5)
At z = 1,
f(1) = 1/2,  f′(1) = −1/4,  f″(1) = 2/8,  …,  f^{(n)}(1) = (−1)ⁿ n!/2^{n+1}.   (9.6)
The expansion is then
1/(1 + z) = Σ_{n=0}^{∞} (−1)ⁿ (z − 1)ⁿ / 2^{n+1} = 1/2 − (z − 1)/4 + (z − 1)²/8 − ⋯.
As a second example, consider expanding the cosine: f(z) = cos z, f′(z) = −sin z, f″(z) = −cos z, f‴(z) = sin z.   (9.9)
Figure 9.0.2: Two circular contours centered on z = a inside the region of analyticity R1 < |z − a| < R2 of
f (z).
Shrinking the gap between lines AB and DC, A → D and B → C, such that
∫_D^A → ∮_{C₁},   ∫_B^C → −∮_{C₂},   and   ∫_A^B → −∫_C^D.   (9.19)
Figure 9.0.3: A contour C that connects points A, B, C & D whilst tracing contours C1 & C2 such that the
function is analytic within the region enclosed.
We will first concentrate on the first integral in equation 9.20. For this integral,
|ξ − a| > |z − a|, so we can write
1/(ξ − z) = 1/[(ξ − a) − (z − a)] = [1/(ξ − a)] [1 − (z − a)/(ξ − a)]^{−1}
          = [1/(ξ − a)] [1 + (z − a)/(ξ − a) + ((z − a)/(ξ − a))² + ⋯].   (9.21)
This allows us to write the integral as a series,
(1/2πi) ∮_{C₁} f(ξ)/(ξ − z) dξ = (1/2πi) ∮_{C₁} f(ξ)/(ξ − a) dξ + (z − a)(1/2πi) ∮_{C₁} f(ξ)/(ξ − a)² dξ
   + (z − a)²(1/2πi) ∮_{C₁} f(ξ)/(ξ − a)³ dξ + ⋯ + (z − a)ⁿ (1/2πi) ∮_{C₁} f(ξ)/(ξ − a)^{n+1} dξ + ⋯   (9.22)
   = Σ_{n=0}^{∞} aₙ (z − a)ⁿ,
where
aₙ = (1/2πi) ∮_{C₁} f(ξ)/(ξ − a)^{n+1} dξ.   (9.23)
Note that, when f is analytic inside C₁, aₙ equals f^{(n)}(a)/n! (cf. eq. 8.31).
Turning our attention to the second integral in equation 9.20: here, on C₂, we have |ξ − a| < |z − a|,
so we can write
1/(ξ − z) = 1/[(ξ − a) − (z − a)] = [−1/(z − a)] [1 − (ξ − a)/(z − a)]^{−1}
          = [−1/(z − a)] [1 + (ξ − a)/(z − a) + ((ξ − a)/(z − a))² + ⋯].   (9.24)
Therefore we can write the second contour integral as
−(1/2πi) ∮_{C₂} f(ξ)/(ξ − z) dξ = [1/(z − a)] (1/2πi) ∮_{C₂} f(ξ) dξ + [1/(z − a)²] (1/2πi) ∮_{C₂} f(ξ)(ξ − a) dξ + ⋯
   + [1/(z − a)ⁿ] (1/2πi) ∮_{C₂} f(ξ)(ξ − a)^{n−1} dξ + ⋯   (9.25)
   = Σ_{n=1}^{∞} bₙ/(z − a)ⁿ,
where
bₙ = (1/2πi) ∮_{C₂} f(ξ)(ξ − a)^{n−1} dξ.   (9.26)
So, within the annulus of radii R₁ and R₂, f(z) is represented by the Laurent series:
f(z) = Σ_{n=0}^{∞} aₙ (z − a)ⁿ + Σ_{n=1}^{∞} bₙ/(z − a)ⁿ.   (9.27)
As f(z) is analytic between C₁ and C₂, we can replace these contours in the definitions of aₙ and bₙ
with any contour C lying within the annulus of convergence.
Note: the part with positive powers of (z − a) is known as the analytic part, and the part with negative
powers of (z − a) is called the principal part of the Laurent series.
Chapter 10
The residue theorem
The residue theorem is one of the most elegant theorems in mathematics. It is a powerful tool that
lets us compute line integrals for analytic functions on closed curves (which we will explore shortly),
infinite series (which will be explored after) and real integrals (the final and arguably most applicable
part of the course).
What is a residue? The residue of f(z) at a point z = a is the coefficient b₁ of the 1/(z − a) term
in the principal part of the Laurent series of f(z) about z = a. The Laurent series for some function f(z)
expanded about z = a, valid within 0 < |z − a| < R, is
f(z) = Σ_{n=1} bₙ/(z − a)ⁿ + Σ_{n=0} aₙ (z − a)ⁿ.   (10.1)
Then, for some closed curve C lying within the region 0 < |z − a| < R and enclosing z = a,
∮_C f(z) dz = 2πi b₁.   (10.2)
The residue theorem states that for a closed curve C that encloses a number of poles of f(z), the
contour integral around C is
∮_C f(z) dz = 2πi × (sum of the residues at all poles inside C).   (10.3)
Let's consider an example and compare the methods we've seen previously with the residue theorem.
Let's evaluate
∮_C 1/[z(z − 2)] dz,   (10.4)
where C encloses both z = 0 and z = 2 (see fig. 10.0.1). Here, we split C into contours C₁ and C₂
that enclose only z = 0 and z = 2 respectively (as in subsection 8.1.3):
∮_C 1/[z(z − 2)] dz = ∮_{C₁} 1/[z(z − 2)] dz + ∮_{C₂} 1/[z(z − 2)] dz.   (10.5)
We then evaluate each of these using Cauchy's integral formula (eq. 8.25). For C₁, we set
1/[z(z − 2)] = f(z)/z,   (10.6)
so
∮_{C₁} 1/[z(z − 2)] dz = 2πi f(0) = 2πi [1/(z − 2)]_{z=0} = −πi.   (10.7)
Figure 10.0.1: A contour C that encloses both poles of f (z) at z = 0 and z = 2, and smaller contours C1 and
C2 that enclose each pole individually.
g(z)
If f (z) is given in the form z−a , then the residue is simply g(a).
z−1
Example: f (z) = z(z−2)
(z − 1) 0−1
1
res (0) = lim z = = . (10.19)
z→0 z(z − 2) 0−2 2
At z = 2, the residue is
(z − 1) 2−1
1
res (2) = lim (z − 2) = = . (10.20)
z→2 z(z − 2) 2 2
Example: f(z) = 1/sin z
The function
\[
f(z) = \frac{1}{\sin z} \tag{10.21}
\]
has simple poles at z = nπ. The residue at these points is
\[
\mathrm{res}\,(n\pi) = \lim_{z \to n\pi} \frac{z - n\pi}{\sin z} = \frac{0}{0} \overset{\text{L'H\^{o}pital}}{=} \frac{1}{\cos n\pi} = (-1)^{n}. \tag{10.22}
\]
Example: f(z) = g(z)/h(z)
where g(a) ≠ 0 and h(a) = 0. In such cases, we can Taylor expand h(z) about z = a: since h(a) = 0, the expansion begins h(z) = (z − a)h′(a) + ..., so that for a simple pole (h′(a) ≠ 0) the residue is res(a) = g(a)/h′(a). Our previous example conforms with this result with g(z) = 1 and h(z) = sin(z).
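The g(a)/h′(a) rule is easy to test; the sketch below, assuming sympy, compares the computed residue of 1/sin z at the first few poles against 1/cos(nπ) = (−1)^n.

# Sketch (assuming sympy): res(n*pi) of 1/sin(z) versus the g/h' rule value 1/cos(n*pi).
from sympy import symbols, sin, cos, pi, residue

z = symbols('z')
for n in (0, 1, 2):
    print(n, residue(1 / sin(z), z, n * pi), 1 / cos(n * pi))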
We can extend the approach outlined above to functions with a pole of order n at z = a. Consider
such a function f (z), then g(z) = (z − a)n f (z) is analytic at z = a. Taylor expanding g(z) about
z = a,
\[
g(z) = g(a) + \frac{z - a}{1!}\,g'(a) + \dots + \frac{(z - a)^{n-1}}{(n - 1)!}\,g^{(n-1)}(a) + \dots. \tag{10.26}
\]
Since f(z) = g(z)/(z − a)^n, the coefficient of 1/(z − a) in the resulting Laurent series gives the residue:
\[
\mathrm{res}\,(a) = b_1 = \frac{g^{(n-1)}(a)}{(n - 1)!}
= \lim_{z \to a}\left\{\frac{1}{(n - 1)!}\,\frac{d^{(n-1)}}{dz^{(n-1)}}\Big[(z - a)^{n} f(z)\Big]\right\}. \tag{10.28}
\]
These cases are more work, unless f (z) is already in the form g(z)(z − a)−n and the Taylor series of
g(z) is known. As an example, we could find the residue of
\[
f(z) = \frac{\sin z}{z^{8}} \tag{10.29}
\]
at z = 0. Taylor expanding the numerator we find
\[
\frac{\sin z}{z^{8}} = \frac{z - \frac{1}{3!}z^{3} + \frac{1}{5!}z^{5} - \frac{1}{7!}z^{7} + \dots}{z^{8}}. \tag{10.30}
\]
So, the coefficient of 1/z, and hence the residue, is
\[
\mathrm{res}\,(0) = -\frac{1}{7!} = -\frac{1}{5040}. \tag{10.31}
\]
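As a cross-check (a sketch assuming sympy), the direct residue and the derivative formula of eq. (10.28) with n = 8 give the same value:

# Both routes should return -1/5040, as in eq. (10.31).
from sympy import symbols, sin, residue, diff, factorial, limit

z = symbols('z')
f = sin(z) / z**8

print(residue(f, z, 0))                                  # direct residue at z = 0
print(limit(diff(z**8 * f, z, 7), z, 0) / factorial(7))  # eq. (10.28) with n = 8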
Further reading:
Chapter 11
Complex methods for real integrals
We will now conclude this course by bringing together everything we have learnt about complex analysis to evaluate real integrals. This might sound odd at first, but it will allow us to evaluate integrals that are nearly impossible to tackle with other analytic methods. We have been building towards this for some time, so you should expect to see a question on real integration in the exam.
Consider the integral
\[
I = \int_{-\infty}^{\infty} \frac{1}{x^{2} + a^{2}}\,dx = \lim_{R \to \infty} \int_{-R}^{R} \frac{1}{x^{2} + a^{2}}\,dx, \tag{11.1}
\]
for real a > 0. The approach you learnt in first year (or before) is to make the substitution x = a tan θ. The result of this is:
\[
I = \lim_{R \to \infty} \frac{1}{a}\left[\arctan\frac{x}{a}\right]_{-R}^{R}
= \lim_{R \to \infty} \frac{2}{a}\arctan\frac{R}{a} = \frac{2}{a}\,\frac{\pi}{2} = \frac{\pi}{a}. \tag{11.2}
\]
We can also evaluate this integral by using complex integration and the residue theorem. On the real axis z = x, so there the complex integrand 1/(a² + z²) dz reduces to 1/(a² + x²) dx. Furthermore, we can rewrite the integrand to make the poles explicit:
\[
\frac{1}{a^{2} + z^{2}} = \frac{1}{(z + ia)(z - ia)}; \tag{11.3}
\]
so there are simple poles at z = ±ia. The trick with all of these problems is to choose a suitable
contour that both has a segment that corresponds to the definite integral we wish to calculate and
allows us to make use of the residue theorem. In this example, such a contour traces a semicircle of
radius R with its base on the real axis where we explore the limit R → ∞; this is drawn in figure
11.0.1. We can therefore extract the integral of interest by splitting the contour integral up into each
path:
\[
I_C = \oint_C \frac{1}{z^{2} + a^{2}}\,dz = I_1 + I_2
= \underbrace{\int_{-R}^{R} \frac{1}{z^{2} + a^{2}}\,dz}_{\text{along real axis}}
+ \underbrace{\int_{R}^{-R} \frac{1}{z^{2} + a^{2}}\,dz}_{\text{along semicircle}}. \tag{11.4}
\]
So,
\[
I = \lim_{R \to \infty} I_1 = \lim_{R \to \infty}\left[I_C - I_2\right]. \tag{11.5}
\]
Figure 11.0.1: A contour C that consists of a semicircle of radius R in the upper-half plane and a straight line
along the real axis. C encloses the upper pole at z = ia.
From the result above, we can indeed see that in the limit R → ∞ the integral along the semicircle, I2, goes to zero. So, we conclude that
\[
\int_{-\infty}^{\infty} \frac{1}{x^{2} + a^{2}}\,dx = \frac{\pi}{a}. \tag{11.8}
\]
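A quick numerical sanity check of eq. (11.8) is straightforward; this is a sketch assuming scipy, with the arbitrary choice a = 2.

# Numerical check of eq. (11.8): the integral should equal pi/a.
import numpy as np
from scipy.integrate import quad

a = 2.0
value, err = quad(lambda x: 1.0 / (x**2 + a**2), -np.inf, np.inf)
print(value, np.pi / a)   # both ~1.5708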
That was quite a lot of effort, and probably not worth it for such a simple integral. Consider instead
a more challenging problem that cannot be solved with a standard substitution, such as
\[
I = \int_{-\infty}^{\infty} \frac{1}{x^{6} + a^{6}}\,dx, \tag{11.9}
\]
for real a > 0. From the fundamental theorem of algebra (REF), the integrand has six simple poles:
\[
z^{6} + a^{6} = 0 \quad\text{or}\quad z^{6} = -a^{6}
\;\Rightarrow\; z = a(-1)^{1/6} = a\left(e^{i\pi + 2\pi i k}\right)^{1/6} = a\,e^{i\pi/6}\,e^{2\pi i k/6}, \tag{11.10}
\]
with k = 0, 1, 2, . . . , 5. We can evaluate the integral by using an identical contour to the previous
example; however, it will now enclose three poles instead of just one (see fig. 11.0.2).
Figure 11.0.2: A contour C that consists of a semicircle of radius R in the upper-half plane and a straight line along the real axis. C encloses the upper poles at z = ae^{iπ/6}, ia, ae^{5iπ/6}.
Like the previous example, we write the integral we wish to evaluate as the limit as R tends to
infinity of the difference between the closed-contour integral and the semicircular part:
\[
I = \lim_{R \to \infty} I_1 = \lim_{R \to \infty}\left[I_C - I_2\right]
= \lim_{R \to \infty}\left[\oint_C \frac{1}{z^{6} + a^{6}}\,dz
- \underbrace{\int_{R}^{-R} \frac{1}{z^{6} + a^{6}}\,dz}_{\text{along semicircle}}\right]. \tag{11.11}
\]
You should be able to use the estimation lemma to show that I2 vanishes as R → ∞. So we just need to find the value of IC, which follows from the residue theorem once more:
\[
I_C = 2\pi i\left[\mathrm{res}\,(a e^{i\pi/6}) + \mathrm{res}\,(ia) + \mathrm{res}\,(a e^{5i\pi/6})\right]. \tag{11.12}
\]
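Evaluating the three residues by hand is straightforward but fiddly. The sketch below, assuming sympy, uses the g/h′ rule from the previous chapter (each pole is simple, so the residue of 1/(z^6 + a^6) at z0 is 1/(6 z0^5)) and sums the contributions.

# Sketch (assuming sympy): sum of upper-half-plane residues of 1/(z**6 + a**6) via res = 1/(6*z0**5).
from sympy import symbols, exp, I, pi, simplify

a = symbols('a', positive=True)
upper_poles = [a * exp(I * pi * (2 * k + 1) / 6) for k in (0, 1, 2)]  # a*e^{i pi/6}, i*a, a*e^{5 i pi/6}

I_C = 2 * pi * I * sum(1 / (6 * p**5) for p in upper_poles)
print(simplify(I_C))            # closed-contour value, equal to I once I2 -> 0
print(I_C.subs(a, 1).evalf())   # numerical value for a = 1, ~2.0944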
Figure 11.1.1: A plot illustrating that sin θ (solid) is greater than 2θ/π (dashed) for 0 < θ < π/2.
Example
\[
I = \int_{-\infty}^{\infty} \frac{e^{ikx}}{1 + x^{2}}\,dx, \tag{11.17}
\]
for real k > 0. Again, we write
\[
\oint \frac{e^{ikz}}{1 + z^{2}}\,dz = I_1 + I_2, \tag{11.18}
\]
using the same contour as before (fig. 11.0.1) and
\[
I = \lim_{R \to \infty} I_1. \tag{11.19}
\]
What about the integral around the semicircle, I2 ? Here we can make use of the method used to derive
the estimation lemma (eq. 7.11):
\[
\left|\int_{I_2} \frac{e^{ikz}}{1 + z^{2}}\,dz\right|
\le \int_{I_2} \left|\frac{e^{ikz}}{1 + z^{2}}\right| |dz|
\le \int_{0}^{\pi} \frac{\overbrace{e^{-kR\sin\theta}}^{\text{writing } y = R\sin\theta}}{R^{2} - 1}\,\left|iRe^{i\theta}\right| d\theta
= \frac{R}{R^{2} - 1}\int_{0}^{\pi} e^{-kR\sin\theta}\,d\theta. \tag{11.20}
\]
To proceed, we note that the integrand is symmetric about θ = π/2 and that sin θ > 2θ/π for 0 < θ < π/2, as shown in figure 11.1.1. Therefore,
\[
\frac{R}{R^{2} - 1}\int_{0}^{\pi} e^{-kR\sin\theta}\,d\theta
\le \frac{2R}{R^{2} - 1}\int_{0}^{\pi/2} e^{-2kR\theta/\pi}\,d\theta
= \frac{\pi}{k}\,\frac{1}{R^{2} - 1}\left(1 - e^{-kR}\right). \tag{11.21}
\]
So
\[
\lim_{R \to \infty} I_2 = 0. \tag{11.22}
\]
To evaluate the closed-contour integral, we write
\[
\oint_C \frac{e^{ikz}}{1 + z^{2}}\,dz = \oint_C \frac{e^{ikz}}{(z - i)(z + i)}\,dz, \tag{11.23}
\]
Figure 11.1.2: A contour C composed of two concentric semicircles of radii R and ϵ in the upper half plane
centred over the pole at the origin.
where the pole at z = i is enclosed by the contour. The residue at this pole is
\[
\mathrm{res}\,(i) = \left.\frac{e^{ikz}}{z + i}\right|_{z = i} = \frac{e^{-k}}{2i}. \tag{11.24}
\]
Hence I = 2πi × res(i) = πe^{−k}. If k had instead been negative, we would have closed the contour in the lower half plane and obtained I = πe^{k}. The result that is valid for any real k ≠ 0 is therefore πe^{−|k|}.
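The same steps can be reproduced symbolically; a minimal sketch, assuming sympy and k > 0:

# Sketch: residue at z = i of exp(i*k*z)/(1 + z**2) and the resulting integral value.
from sympy import symbols, exp, I, pi, residue, simplify

z = symbols('z')
k = symbols('k', positive=True)

res_i = residue(exp(I * k * z) / (1 + z**2), z, I)
print(res_i)                         # expected exp(-k)/(2*I), as in eq. (11.24)
print(simplify(2 * pi * I * res_i))  # expected pi*exp(-k)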
This is an example of the use of Jordan’s lemma, which states that I2 (the integral along the semicircular arc) will vanish as R → ∞ for an integrand of the form e^{ikz} f(z) if:
• k > 0,
• the semicircular arc lies in the upper half plane, and
• |f(z)| → 0 uniformly on the arc as R → ∞.
On the other hand, if k has the opposite sign we need to close the contour in the lower half plane. For that case, Jordan’s lemma reads:
• k < 0,
• the semicircular arc lies in the lower half plane, and
• |f(z)| → 0 uniformly on the arc as R → ∞.
Another example of an improper integral we can now evaluate is
\[
\int_{-\infty}^{\infty} \frac{\sin x}{x}\,dx. \tag{11.26}
\]
Written in terms of the complex exponential (sin x = Im e^{ix}), the integrand e^{iz}/z now has a pole at z = 0, because the numerator is 1 there. Instead we redefine the real integral I by making use of the contour drawn in figure 11.1.2. The integral around the closed
contour, IC , is
\[
I_C = I_1 + I_2 + I_3 = 0, \tag{11.28}
\]
\[
I = \lim_{\substack{R \to \infty \\ \epsilon \to 0}} I_1. \tag{11.29}
\]
Figure 11.3.1: A contour C composed of two concentric circles of radii R and ϵ connected via a bridge such that
the branch cut on the positive x axis is not enclosed by C.
Applying these to the integral at hand, in which the contour C is the circle |z| = 1, gives
\[
I = \oint_C \frac{1}{5 + 4\cdot\frac{1}{2}\left(z + \frac{1}{z}\right)}\,\frac{dz}{iz}
= \frac{1}{2i}\oint_C \frac{1}{z^{2} + \frac{5}{2}z + 1}\,dz
= \frac{1}{2i}\oint_C \frac{1}{\left(z + \frac{1}{2}\right)(z + 2)}\,dz. \tag{11.40}
\]
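The real integral behind eq. (11.40) (inferred from its first line) is ∫₀^{2π} dθ/(5 + 4 cos θ). As a sanity check, the sketch below, assuming scipy and sympy, compares a direct numerical quadrature with the residue evaluation; only the pole at z = −1/2 lies inside |z| = 1.

# Numerical quadrature of the (inferred) theta integral versus the residue evaluation of eq. (11.40).
import numpy as np
from scipy.integrate import quad
from sympy import symbols, residue, pi, I, Rational, simplify

num, _ = quad(lambda t: 1.0 / (5.0 + 4.0 * np.cos(t)), 0.0, 2.0 * np.pi)

z = symbols('z')
res = residue(1 / ((z + Rational(1, 2)) * (z + 2)), z, -Rational(1, 2))
exact = simplify(2 * pi * I * res / (2 * I))

print(num, exact)   # both should correspond to 2*pi/3 ~ 2.0944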
We continue by inspecting each integral in turn, starting with I2. As the contour for I2 is a circle of radius R, we change the integration variable from z to θ via z = Re^{iθ}. So,
\[
I_2 = \int_{0}^{2\pi} \frac{R^{-\alpha} e^{-i\alpha\theta}}{1 + R e^{i\theta}}\, iRe^{i\theta}\, d\theta. \tag{11.46}
\]
By using the estimation lemma, we find
\[
|I_2| \le \frac{R^{-\alpha + 1}}{R - 1}\, 2\pi. \tag{11.47}
\]
For large R this tends to 2πR−α which tends to 0 as R → ∞. I4 is also along a circular contour, so
using a similar approach we work out
\[
I_4 = \int_{2\pi}^{0} \frac{\epsilon^{-\alpha} e^{-i\alpha\theta}}{1 + \epsilon e^{i\theta}}\, i\epsilon e^{i\theta}\, d\theta. \tag{11.48}
\]
\[
|I_4| \le \frac{\epsilon^{1 - \alpha}}{1 - \epsilon}\, 2\pi. \tag{11.49}
\]
This also tends to 0 as ϵ → 0 provided α < 1. Finally we consider I3, where z = xe^{2πi}. This can be written in terms of I1:
\[
I_3 = \int_{R}^{\epsilon} \frac{\left(x e^{2\pi i}\right)^{-\alpha}}{1 + x}\,dx
= -e^{-2\pi i\alpha}\underbrace{\int_{\epsilon}^{R} \frac{x^{-\alpha}}{1 + x}\,dx}_{I_1}
= -e^{-2\pi i\alpha} I_1. \tag{11.50}
\]
\[
I_C = I_1 + I_3 = I_1\left(1 - e^{-2\pi i\alpha}\right) = \underbrace{2\pi i\, e^{-i\pi\alpha}}_{\text{from residue theorem}}, \tag{11.51}
\]
so that
\[
I_1\left(e^{i\pi\alpha} - e^{-i\pi\alpha}\right) = 2\pi i, \quad\text{or}\quad I_1 = \frac{\pi}{\sin(\pi\alpha)}. \tag{11.52}
\]
So the result for I is finite and positive for 0 < α < 1, as it should be.
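A numerical check of eq. (11.52) is easy; this sketch assumes scipy, takes the integral to be I1 = ∫ x^{−α}/(1 + x) dx from 0 to ∞ (as suggested by eq. (11.50) in the limits ϵ → 0, R → ∞), and picks α = 1/3 as an arbitrary test value.

# Numerical check: the integral should equal pi/sin(pi*alpha).
import numpy as np
from scipy.integrate import quad

alpha = 1.0 / 3.0
f = lambda x: x**(-alpha) / (1.0 + x)
v1, _ = quad(f, 0.0, 1.0)     # integrable singularity at x = 0
v2, _ = quad(f, 1.0, np.inf)
print(v1 + v2, np.pi / np.sin(np.pi * alpha))   # both ~3.6276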
The common approach here is to shift the integration variable so that the integral reduces to a Gaussian, which we know how to integrate; i.e.,
\[
\int_{-\infty}^{\infty} e^{-\alpha(x - x_0)^{2}}\,dx = \int_{-\infty}^{\infty} e^{-\alpha u^{2}}\,du = \sqrt{\frac{\pi}{\alpha}}, \tag{11.54}
\]
Figure 11.4.1: Rectangular contour C of height b and width 2R split into four integrals.
where we have made the substitution u = x − x0 . We can prove this approach is valid for imaginary
x0 = ib by again extending the integral to the complex plane. Consider the integral
\[
I = \oint_C e^{-z^{2}}\,dz \tag{11.55}
\]
around the contour as displayed in figure 11.4.1 in the limit R → ∞. This contour encloses no
singularities so the integral yields zero. Therefore,
I1 + I2 + I3 + I4 = 0. (11.56)
So, the imaginary shift x0 = ib does not change the value of the integral.
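A brief numerical illustration of this conclusion (a sketch assuming scipy, with the arbitrary choices α = 1 and b = 1): integrating exp(−(x − ib)²) along the real axis still gives √π, with a negligible imaginary part.

# Numerical check that the imaginary shift x0 = i*b leaves the Gaussian integral unchanged.
import numpy as np
from scipy.integrate import quad

b = 1.0
f = lambda x: np.exp(-(x - 1j * b)**2)

re, _ = quad(lambda x: f(x).real, -np.inf, np.inf)
im, _ = quad(lambda x: f(x).imag, -np.inf, np.inf)
print(re, im, np.sqrt(np.pi))   # expect re ~ 1.7725 and im ~ 0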
z = nπ + w, (11.62)
such that
\[
\cot z = \frac{\cos(n\pi + w)}{\sin(n\pi + w)} = \frac{(-1)^{n}\cos w}{(-1)^{n}\sin w} = \cot w
= \frac{1}{w} - \frac{w}{3} - \frac{w^{3}}{45} + \mathcal{O}(w^{5}), \tag{11.63}
\]
Figure 11.5.1: A square contour C with vertices at ±L ± iL, where L = (N + 1/2)π. The simple poles on the real axis are represented by dots and the third-order pole at the origin is shown as a circled dot.
in which we have written down a few terms of the ‘known’ (= looked-up) Laurent series for cot w.
Thus,
\[
\frac{\cot z}{z^{2}} = \frac{1}{(n\pi + w)^{2}}\left(\frac{1}{w} - \frac{w}{3} + \dots\right). \tag{11.64}
\]
Therefore the residues at the simple poles are
\[
\mathrm{res}\,(n\pi) = \lim_{w \to 0}\, w\,\frac{1}{(n\pi + w)^{2}}\left(\frac{1}{w} - \frac{w}{3} + \dots\right)
= \frac{1}{n^{2}\pi^{2}}, \quad\text{for } n \ne 0. \tag{11.65}
\]
To find the residue from the third order pole at n = 0, we again make use of the Laurent series for the
cotangent:
\[
\frac{\cot z}{z^{2}} = \frac{1}{z^{2}}\left(\frac{1}{z} - \frac{z}{3} + \mathcal{O}(z^{3})\right)
= \frac{1}{z^{3}} - \frac{1}{3z} + \mathcal{O}(z). \tag{11.66}
\]
From the coefficient of 1/z we find the residue
\[
\mathrm{res}\,(0) = -\frac{1}{3}. \tag{11.67}
\]
The result of the contour integral is therefore
\[
I_N = 2\pi i\left(-\frac{1}{3} + 2\sum_{n=1}^{N}\frac{1}{n^{2}\pi^{2}}\right). \tag{11.68}
\]
To complete our work, we use the estimation lemma to show that IN vanishes for N → ∞. To do
this, we consider a square contour centered on the origin with side length 2L, see figure 11.5.1. On
this contour,
\[
|f(z)| = \frac{1}{|z^{2}|} = \frac{1}{x^{2} + y^{2}} \le \frac{1}{L^{2}}. \tag{11.69}
\]
On the vertical side where z = L + iy,
\[
\cot z = \frac{\cos(L + iy)}{\sin(L + iy)} = \frac{-\sin L\,\sin(iy)}{\sin L\,\cos(iy)} = -\tan(iy) = -i\tanh y, \tag{11.70}
\]
so |cot z| = |tanh y| ≤ 1 on this side, and similarly for z = −L + iy. On the horizontal side, we have z = x + iL,
\[
\cot z = \frac{\tfrac{1}{2}\left(e^{i(x + iL)} + e^{-i(x + iL)}\right)}{\tfrac{1}{2i}\left(e^{i(x + iL)} - e^{-i(x + iL)}\right)}
= i\,\frac{e^{ix} e^{-L} + e^{-ix} e^{L}}{e^{ix} e^{-L} - e^{-ix} e^{L}}
\approx \frac{i\, e^{-ix} e^{L}}{-e^{-ix} e^{L}} = -i, \tag{11.72}
\]
where on the last line we have assumed e^{L} ≫ e^{−L} for large L; the corrections to the final result are very small, of order e^{−2L}. Thus, on all sides, |cot z| ≲ 1 and
\[
\left|\frac{\cot z}{z^{2}}\right| \lesssim \frac{1}{L^{2}}, \tag{11.73}
\]
as desired. The estimation lemma then bounds |I_N| by the length of the contour, 8L, multiplied by 1/L², so I_N vanishes as N (and hence L) tends to infinity.
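Since I_N → 0, the bracket in eq. (11.68) must vanish, which rearranges to Σ_{n=1}^{∞} 1/n² = π²/6. A minimal numerical illustration of that limit:

# Partial sums of 1/n**2 approaching pi**2/6.
import numpy as np

n = np.arange(1, 100001)
print(np.sum(1.0 / n**2), np.pi**2 / 6)   # ~1.644924 vs 1.6449340...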
Similar sums can be evaluated using other trigonometric multipliers in the integrand. We choose the trig function whose poles match the series to be summed: cot z and cosec z (whose poles are at z = nπ) can be used for sums over all integers in which the terms have the same or alternating signs, respectively; while tan z and sec z (poles at z = (n + 1/2)π) are useful for sums over odd integers in which the terms have the same or alternating signs, respectively.