
MATHEMATICAL METHODS OF PHYSICS I – 2014

THOMAS CREUTZIG

Abstract. These are lecture notes in progress for Ma Ph 451 – Mathematical Physics I. The lecture starts with a brief discussion of linear algebra, Hilbert spaces and classical orthogonal polynomials. Then, as an instructive example, the Lie group SU(2) and its Hilbert space of square integrable functions will be discussed in detail. The focus of the second part of the lecture will then be on Lie groups in general, their Lie algebras and their representation theory. Guiding examples are sl(2; R) and sl(3; R). Finally, we discuss how Lie groups, their harmonic analysis and especially Lie algebras appear in the example of bosonic strings.

April 23, 2014.


Contents
1. Introduction
2. Linear Algebra
2.1. Vector Spaces
2.2. Linear Transformations and Operators
2.3. Operators
2.4. Eigenvectors and eigenvalues
2.5. Examples
2.6. Exercises
2.7. Solutions
3. Hilbert Spaces
3.1. The definition of a Hilbert space
3.2. Square integrable functions
3.3. Classical orthogonal polynomials
3.4. Gegenbauer polynomials and hypergeometric functions
3.5. Hermite polynomials
3.6. Exercises
3.7. Solutions
4. Harmonic Analysis
4.1. Motivation
4.2. Distributions
4.3. Fourier Analysis
4.4. Harmonic analysis on the Lie group U(1)
4.5. Harmonic analysis on S^3
4.6. Summary
4.7. Exercises
4.8. Solutions
5. Lie Groups
5.1. The Haar measure
5.2. Lie subgroups of GL(n, C)
5.3. Left-invariant vector fields
5.4. Example of a Lie supergroup
6. Lie Algebras
6.1. The Casimir element of a representation
6.2. Jordan Decomposition
6.3. Root Space Decomposition
6.4. Finite-dimensional irreducible representations of sl(2; R)
6.5. Representation theory
6.6. Highest-weight representations of sl(3; R)
6.7. Exercises
6.8. Solutions
7. The bosonic string
7.1. The free boson compactified on a circle
7.2. The Virasoro algebra
7.3. Lattice CFT
7.4. Fermionic Ghosts
7.5. BRST quantization of the bosonic string
8. Possible Exam Questions
References

1. Introduction
We will cover subjects of interest in mathematical physics. Lie theory is a fascinating area of mathematical physics with various applications in both mathematics and physics, so the focus of this lecture will be on Lie theory. Lie groups are smooth manifolds that at the same time are groups. They were introduced by Sophus Lie. In studying Lie groups it turns out that it is much simpler to consider infinitesimal transformations. These transformations carry an algebraic structure called a Lie algebra. Lie algebras and Lie groups are important as they appear as the symmetries of physical systems. This leads to a variety of connections to interesting modern topics of mathematical physics. Both Lie groups and Lie algebras are often best visualized using matrices. Examples that are familiar to most physicists are the Heisenberg Lie group and the Lie group of the standard model of particle physics.
The three-dimensional Heisenberg group is
$$H = \left\{ \begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix} \;\middle|\; a, b, c \in \mathbb{R} \right\}.$$
So it is a subgroup of the invertible $3 \times 3$ matrices. Its Lie algebra $\mathfrak{h}$ on the other hand is spanned by
$$\mathfrak{h} = \mathrm{span}\left\{\, p = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},\; q = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix},\; z = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \right\}.$$
Lie group and Lie algebra are related by the exponential map
$$\exp(X) = \sum_{n=0}^{\infty} \frac{X^n}{n!} \in H \qquad \text{for all } X \in \mathfrak{h}.$$
(You can check this by doing explicit matrix multiplication.) The commutator of matrices defines an algebra structure on $\mathfrak{h}$,
$$[X, Y] = XY - YX.$$
Performing the appropriate matrix multiplication one finds that these commutators or Lie brackets are zero, except for
$$[p, q] = z.$$
But this is exactly the Heisenberg Lie algebra of quantum mechanics under the identification
$$p = -i\hbar \frac{d}{dx}, \qquad q = x, \qquad z = -i\hbar.$$
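Both claims are quickly verified with a computer algebra system. The following SymPy sketch (an illustration, not part of the notes themselves) checks that the exponential of a general element of $\mathfrak{h}$ lands in $H$ and that $[p, q] = z$ is the only non-vanishing bracket:

    import sympy as sp

    # basis of the Heisenberg Lie algebra h
    p = sp.Matrix([[0, 1, 0], [0, 0, 0], [0, 0, 0]])
    q = sp.Matrix([[0, 0, 0], [0, 0, 1], [0, 0, 0]])
    z = sp.Matrix([[0, 0, 1], [0, 0, 0], [0, 0, 0]])

    a, b, c = sp.symbols('a b c', real=True)
    X = a*p + c*q + b*z                  # a general element of h

    # X is nilpotent, so the exponential series terminates
    assert X**3 == sp.zeros(3, 3)
    expX = sp.eye(3) + X + X*X/2
    print(expX)                          # upper unitriangular, hence in H

    # the commutators: [p, q] = z is the only non-zero bracket
    assert p*q - q*p == z
    assert p*z - z*p == sp.zeros(3, 3)
    assert q*z - z*q == sp.zeros(3, 3)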
The Lie group of the standard model is U(1) × SU(2) × SU(3). This is the product of a one-
dimensional Lie group U(1), a three-dimensional Lie group SU(2) and an eight-dimensional
one SU(3). The standard model is a gauge theory, and each generator of the Lie group has
an associated gauge boson. The electroweak interaction is described by U(1) × SU(2), and
the U(1)-gauge particle is the photon, while the ones for SU(2) are called W ± and Z gauge
bosons. Quantum chromodynamics is associated to the SU(3) and the eight gauge bosons are
called gluons. Quarks and leptons then come with colors, spins and electric charges. These quantities encode how the particles behave under the action of the gauge group, that is the Lie group U(1) × SU(2) × SU(3). A mathematician would call these quantities weights, and the representation theory of the underlying Lie algebra tells how the gauge group acts.
The initial purpose of studying Lie theory was the understanding of certain differential equations. If you look back to your quantum mechanics course, the Heisenberg algebra and also the Lie algebra of infinitesimal rotations describing spin and angular momentum were used to solve the Schrödinger equation, a differential equation. In general, the representation theory of Lie algebras is a great aid in simplifying certain second order differential equations.
There are more modern connections between Lie theory, physics and mathematics. String theory is a quantum theory of strings rather than of point-like particles. Its initial motivation was the search for a quantum theory that incorporates both particle physics and gravity. The framework of string theory uses a variety of mathematics, ranging from geometry, topology and algebra to number theory. But Lie theory is always central, simply because a Lie group or Lie algebra of some kind will always appear as a symmetry of the theory in question. Moreover, the world-sheet theory of a string is a two-dimensional conformal field theory. Local conformal transformations in two dimensions generate an infinite-dimensional algebra, the Witt algebra. Its central extension is called the Virasoro algebra. This is an infinite-dimensional Lie algebra and it is contained in the symmetry algebra of every world-sheet theory of every string. Only due to Lie theory can the conformal field theory be solved exactly. This is extremely special; exactly solvable quantum field theories are very rare. Both conformal field theory and string theory have led to tremendous progress in pure mathematics. I plan to tell you a little bit about the bosonic string, infinite-dimensional Lie algebras and monstrous moonshine at the end of the lecture as an interesting application/generalization of what you have learnt. This monstrous moonshine is a surprising connection between modular functions (number theory) and finite groups. But this is by far not the only progress in mathematics due to physics related to Lie theory. Another one is three-dimensional topological quantum field theories and invariants of knots. You surely know what a knot is, and you can probably imagine that it is difficult to describe arbitrary knots on some three-manifold. However, Witten found that there is a topological field theory, called Chern–Simons theory, that can be used to understand the problem of describing knots. If you want to describe a knot in some three-manifold M, you do not really care about the size of the knot. So you need a metric independent theory to describe them: Chern–Simons theory. Let G be a compact Lie group, the gauge group. You should think of this Lie group as a subgroup of the invertible n × n matrices. The Chern–Simons action is then built out of Lie algebra valued (that is, matrix valued) gauge connections A,
 
$$S = \frac{k}{4\pi} \int_M \mathrm{tr}\left( A \wedge dA + \frac{2}{3}\, A \wedge A \wedge A \right).$$
More correctly, $A$ is a matrix in some representation $R$ of $G$, and $\mathrm{tr}$ denotes the trace of the matrix, that is the sum over its diagonal entries. To each knot $C$, one can associate a Wilson loop
$$W(C, R) = \mathrm{tr}\, P \exp\left( \int_C A \right)$$
as the trace of the path ordered exponential of the integral of the gauge connection along the knot. Expectation values of these Wilson loops are then invariants of knots that are naturally associated to representations of compact Lie groups. This physics description finally allowed knots to be characterized in an efficient and tractable manner. Both Edward Witten and Vaughan Jones are pioneers in this area, and both of them received the Fields Medal. Chern–Simons theory is a topological quantum field theory. However, if the manifold M has a boundary, then its boundary degrees of freedom are described by a two-dimensional conformal quantum field theory, the Wess–Zumino–Novikov–Witten model of the Lie group G. The action of this theory looks similar to the Chern–Simons action, except that the gauge connection is replaced by the Maurer–Cartan one-form. This is the invariant Lie algebra valued one-form of a Lie group.
The conformal field theory is completely described by the representation theory of the affinization of the Lie algebra of G. Turning things around, another huge success of the last decades in mathematical physics was the interaction between infinite-dimensional Lie algebras (Kac–Moody algebras), conformal field theory and modular forms. Note that Robert Moody has been a professor at the University of Alberta. The success is due to the very surprising fact that the representation theory of the Lie algebra is guided by modular forms, that is by functions that behave nicely under the action of the Möbius group $SL(2;\mathbb{Z})$,
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \tau = \frac{a\tau + b}{c\tau + d}.$$
However, for a physicist this is less of a surprise. The point being that the axioms of two-dimensional conformal field theory tell us that the representation category of the field theory is a modular one. The really beautiful thing now is that this modular representation category is essentially a formal way of looking at knots and their invariants. To summarize, physics knows three-dimensional topological field theories and their two-dimensional conformal field theories at the boundary. These two somehow imply a relation between knots on three-manifolds, infinite-dimensional Lie algebras and modular forms. This is very fascinating, and here at the University of Alberta in the mathematical physics group we all perform research in closely related directions.

2. Linear Algebra
We start with a review of linear algebra, a course that most of you have taken in the first year of your studies. Linear algebra deals with linear transformations on finite-dimensional vector spaces. Such linear transformations are best represented using matrices, and important examples are translations, rotations and reflections.
We will mostly be concerned with the field of complex numbers
$$\mathbb{C} = \left\{ x + iy \mid x, y \in \mathbb{R};\ i^2 = -1 \right\}.$$
This section follows chapter 2 of [H].

2.1. Vector Spaces. A vector space is defined as follows.

Definition 1. A vector space $V$ over $\mathbb{C}$ is a set $V$ whose elements $|a\rangle \in V$ are called vectors. This set is endowed with two operations, addition of vectors
$$+ : V \times V \to V, \qquad |a\rangle, |b\rangle \mapsto |a\rangle + |b\rangle,$$
and multiplication by a scalar
$$\cdot : \mathbb{C} \times V \to V, \qquad \lambda, |a\rangle \mapsto \lambda |a\rangle.$$
Addition satisfies the following list of properties for every three vectors $|a\rangle, |b\rangle, |c\rangle \in V$.
(1) commutativity: $|a\rangle + |b\rangle = |b\rangle + |a\rangle$;
(2) associativity: $\left( |a\rangle + |b\rangle \right) + |c\rangle = |a\rangle + \left( |b\rangle + |c\rangle \right)$;
(3) additive identity: there exists a unique vector $|0\rangle$, called the zero vector, satisfying $|0\rangle + |a\rangle = |a\rangle$;
(4) inverse: there exists a unique vector $-|a\rangle$ satisfying $|a\rangle + (-|a\rangle) = |0\rangle$.
Multiplication by a scalar satisfies, for every $|a\rangle \in V$ and every two scalars $\lambda, \mu \in \mathbb{C}$:
(1) associativity: $\lambda \left( \mu |a\rangle \right) = (\lambda\mu) |a\rangle$;
(2) multiplicative identity: $1\, |a\rangle = |a\rangle$.
Finally, there are two distributivity laws combining vector addition and multiplication by a scalar. For every two vectors $|a\rangle, |b\rangle \in V$ and for every two scalars $\lambda, \mu \in \mathbb{C}$ they are as follows.
(1) $\lambda \left( |a\rangle + |b\rangle \right) = \lambda |a\rangle + \lambda |b\rangle$;
(2) $(\lambda + \mu) |a\rangle = \lambda |a\rangle + \mu |a\rangle$.
We will recall some more well-known linear algebra terms.
(1) A sum of vectors $\lambda_1 |a_1\rangle + \cdots + \lambda_n |a_n\rangle$ is called a linear combination of the vectors $|a_1\rangle, \ldots, |a_n\rangle$. If the relation $\lambda_1 |a_1\rangle + \cdots + \lambda_n |a_n\rangle = |0\rangle$ implies that $\lambda_1 = \cdots = \lambda_n = 0$, then the vectors $|a_1\rangle, \ldots, |a_n\rangle$ are said to be linearly independent. This means that the vector $|a_i\rangle$ for any $1 \le i \le n$ cannot be written as a linear combination of the other $n-1$ vectors $|a_j\rangle$, $j \ne i$.
(2) A subset $W$ of a vector space $V$ is itself a vector space, that is a sub vector space of $V$, if it is closed under addition and multiplication by a scalar. In formulae this means that for all $|a\rangle, |b\rangle \in W$ and any scalar $\lambda \in \mathbb{C}$, also $|a\rangle + |b\rangle$ and $\lambda |a\rangle$ are in $W$.
(3) If $S$ is a set of vectors in our vector space $V$, then the set of all linear combinations of the vectors in $S$ is a sub vector space of $V$. This sub vector space has the name span of $S$, and we write $W_S$.
(4) A basis of a vector space $V$ is a set $B$ of linearly independent vectors such that its span is the vector space $V$ itself, $W_B = V$.
(5) All bases of a given vector space have the same cardinality. In particular, if a basis of $V$ has only finitely many elements, say $d$, then every basis of $V$ has $d$ elements. The number $d$ is then called the dimension of $V$.
Another useful structure one can introduce on vector spaces is an inner product.

Definition 2. An inner product on a vector space $V$ is a map $V \times V \to \mathbb{C}$ that maps $|a\rangle, |b\rangle \in V \times V$ to $\langle a | b \rangle \in \mathbb{C}$, satisfying for all $|a\rangle, |b\rangle, |c\rangle \in V$ and all $\lambda, \mu \in \mathbb{C}$:
(1) sesquilinearity or hermiticity: $\langle a | b \rangle = \langle b | a \rangle^*$;
(2) linearity in one component: $\langle a | \left( \lambda |b\rangle + \mu |c\rangle \right) = \lambda \langle a | b \rangle + \mu \langle a | c \rangle$;
(3) positive definiteness: $\langle a | a \rangle \ge 0$, and $\langle a | a \rangle = 0$ if and only if $|a\rangle = |0\rangle$.
Here $\mu^*$ denotes the complex conjugate of the complex number $\mu$.


The Dirac bra-ket notation has its original use in quantum mechanics. We say that $\langle a|$ is the bra of $a$ and $|a\rangle$ is the ket of $a$. It is nothing but a notation, and every other way of denoting vectors is of course allowed, though it is useful to keep some conventions. The inner product singles out a special and very useful kind of basis.

Definition 3. Let $V$ be a vector space with inner product. We call two vectors $|a\rangle, |b\rangle$ orthogonal if $\langle a | b \rangle = 0$. A vector $|a\rangle$ is called normalized or a unit vector if $\langle a | a \rangle = 1$. A basis $B = \{ |e_1\rangle, \ldots, |e_n\rangle \}$ is called orthonormal if
$$\langle e_i | e_j \rangle = \delta_{i,j} := \begin{cases} 1 & \text{if } i = j \\ 0 & \text{else} \end{cases}.$$
Such a basis can always be found, and the algorithm to do so is called the Gram–Schmidt orthonormalization process.

Theorem 1. Let $V$ be a finite-dimensional vector space with inner product; then there exists an orthonormal basis.

The statement is best proven using induction. The proof is constructive and provides an algorithm to find an orthonormal basis.

Proof. Let $|a_1\rangle, \ldots, |a_n\rangle$ be a basis of $V$. Define the sequence of sub vector spaces
$$W_1 \subset \cdots \subset W_n$$
where $W_m$ is the span of $|a_1\rangle, \ldots, |a_m\rangle$, so that $W_n = V$. By property (3) of the definition of an inner product $\langle a_1 | a_1 \rangle > 0$, hence
$$|e_1\rangle := \frac{|a_1\rangle}{\sqrt{\langle a_1 | a_1 \rangle}}$$
is normalized and gives an orthonormal basis of the one-dimensional vector space $W_1$.
Our induction hypothesis is now that $W_m$ has an orthonormal basis $|e_1\rangle, \ldots, |e_m\rangle$. Define
$$|b_{m+1}\rangle := |a_{m+1}\rangle - \sum_{i=1}^{m} |e_i\rangle \langle e_i | a_{m+1} \rangle.$$
Then for all $1 \le j \le m$, we have
$$\langle e_j | b_{m+1} \rangle = \langle e_j | a_{m+1} \rangle - \sum_{i=1}^{m} \langle e_j | e_i \rangle \langle e_i | a_{m+1} \rangle = \langle e_j | a_{m+1} \rangle - \sum_{i=1}^{m} \delta_{i,j} \langle e_i | a_{m+1} \rangle = 0.$$
Moreover $|b_{m+1}\rangle \ne |0\rangle$, since $|a_{m+1}\rangle$ is not a vector of $W_m$ and hence linearly independent of the $|e_1\rangle, \ldots, |e_m\rangle$. Hence $\langle b_{m+1} | b_{m+1} \rangle > 0$ and
$$|e_{m+1}\rangle := \frac{|b_{m+1}\rangle}{\sqrt{\langle b_{m+1} | b_{m+1} \rangle}}$$
together with $|e_1\rangle, \ldots, |e_m\rangle$ forms an orthonormal basis of $W_{m+1}$.
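The constructive proof translates directly into an algorithm. Here is a minimal NumPy sketch (an illustration, assuming the standard inner product on $\mathbb{C}^n$; note that np.vdot conjugates its first argument, matching an inner product that is linear in the second slot):

    import numpy as np

    def gram_schmidt(vectors):
        """Orthonormalize a list of linearly independent vectors."""
        basis = []
        for a in vectors:
            # subtract the projections onto the previous basis vectors
            b = a - sum(np.vdot(e, a) * e for e in basis)
            basis.append(b / np.sqrt(np.vdot(b, b).real))
        return basis

    es = gram_schmidt([np.array([1.0, 1.0, 0.0]),
                       np.array([1.0, 0.0, 1.0]),
                       np.array([0.0, 1.0, 1.0])])
    # check orthonormality: <e_i|e_j> = delta_ij
    G = np.array([[np.vdot(ei, ej) for ej in es] for ei in es])
    assert np.allclose(G, np.eye(3))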


Using similar ideas one can show the following Schwarz inequality.

Theorem 2. Let $V$ be a vector space with inner product. Then for any two vectors $|a\rangle, |b\rangle$ the inequality
$$\langle a | a \rangle \langle b | b \rangle \ge |\langle a | b \rangle|^2$$
holds. Equality is true if and only if $|a\rangle$ and $|b\rangle$ are proportional to each other.

Proof. Let
$$|c\rangle := |b\rangle - \frac{\langle a | b \rangle}{\langle a | a \rangle}\, |a\rangle.$$
The geometric interpretation is that $|c\rangle$ is the difference of $|b\rangle$ with its projection onto $|a\rangle$, so that especially $|c\rangle$ is orthogonal to $|a\rangle$. We can use this expression to write the inner product of $|b\rangle$ with itself as
$$\langle b | b \rangle = \frac{|\langle a | b \rangle|^2}{\langle a | a \rangle} + \langle c | c \rangle.$$
Since $\langle c | c \rangle \ge 0$, the Schwarz inequality follows. Equality holds if and only if $\langle c | c \rangle = 0$, that is if and only if $|c\rangle = |0\rangle$. But this is true if and only if $|b\rangle$ coincides with its projection onto $|a\rangle$, that is the two vectors are proportional to each other.

In physics, the states of a physical system often form a vector space. In order to measure
properties of physical states one then needs an inner product as well as operators that operate
on the space of physical states.

2.2. Linear Transformations and Operators. Unless otherwise stated, vector spaces are finite dimensional.

Definition 4. Let $V$ and $W$ be two vector spaces over $\mathbb{C}$. A map
$$T : V \to W$$
is called a linear map or a linear transformation if for all $|a\rangle, |b\rangle \in V$ and all $\lambda \in \mathbb{C}$ the following two properties are satisfied.
(1) $T\left( |a\rangle + |b\rangle \right) = T |a\rangle + T |b\rangle$;
(2) $T\left( \lambda |a\rangle \right) = \lambda\, T |a\rangle$.
The space of linear transformations from $V$ to $W$ is denoted by $L(V, W)$.
The space of linear transformations has itself an interesting algebraic structure. Let $S, T \in L(V, W)$ for two complex vector spaces $V$ and $W$. We can define addition by
$$S + T : V \to W, \qquad |a\rangle \mapsto S |a\rangle + T |a\rangle,$$
and multiplication by a scalar $\lambda$ by
$$\lambda T : V \to W, \qquad |a\rangle \mapsto \lambda\, T |a\rangle.$$
It is a direct computation to verify that these define a vector space structure on $L(V, W)$.
We can also compose linear maps. Namely, let $V, W, X$ be three vector spaces and let $T \in L(V, W)$, $S \in L(W, X)$ be two linear transformations; then the composition $S \circ T \in L(V, X)$ is defined as
$$S \circ T : V \to X, \qquad |a\rangle \mapsto S\left( T |a\rangle \right).$$
This product allows one to define an algebra structure on $L(V) := L(V, V)$. In the next definition, we use $a, b$ to denote vectors and not the Dirac bra-ket notation.

Definition 5. Let $V$ be a vector space over $\mathbb{C}$; then $V$ is a $\mathbb{C}$-algebra if there is a product
$$\cdot : V \times V \to V, \qquad (a, b) \mapsto ab$$
satisfying
$$a(b + c) = ab + ac, \qquad (a + b)c = ac + bc$$
and
$$a(\lambda b) = \lambda(ab), \qquad (\lambda a)b = \lambda(ab)$$
for all $a, b, c \in V$ and all $\lambda \in \mathbb{C}$.
It is again a direct computation to verify that composition defines an algebra structure on $L(V)$.

Theorem 3. Let $V$ and $W$ be two vector spaces over $\mathbb{C}$. Then $L(V, W)$ is a $\mathbb{C}$-vector space and $L(V)$ is even an (associative) $\mathbb{C}$-algebra.
Let now
$$T : V \to W$$
be a linear transformation and $L(V, W)$ the $\mathbb{C}$-vector space of linear transformations. We list some properties and definitions.
• If $W = \mathbb{C}$, then $L(V, W)$ is called the dual space of $V$ and it is denoted by $V^*$. The elements of $V^*$ are called linear functionals; they map each vector in $V$ to a complex number.
• The set of vectors $|a\rangle$ in $V$ that are mapped by $T$ to the zero vector in $W$, that is $T |a\rangle = |0\rangle \in W$, is called the kernel of $T$. Using the axioms of linear transformations one verifies that the kernel of a linear transformation is a sub vector space of $V$.
• The image of $T$ is the set of vectors $|b\rangle$ in $W$ such that there is at least one $|a\rangle$ in $V$ with $T |a\rangle = |b\rangle$. Again using the axioms of linear transformations one can verify that the image of a linear transformation is a sub vector space (of $W$).
• The linear transformation $T$ is called injective if the kernel of $T$ is just the zero vector $|0\rangle \in V$. It is called surjective if the image of $T$ is the complete vector space $W$, and the transformation is called bijective if it is both injective and surjective. In that case $V$ and $W$ are said to be isomorphic. If $V = W$, then a bijective linear transformation from $V$ to $V$ is said to be an automorphism.
Proposition 4. Let $T : V \to W$ be a linear transformation; then
$$\dim(\text{kernel of } T) + \dim(\text{image of } T) = \dim(V).$$

Proof. This has probably been proven in a linear algebra course. One takes a basis
$$|a_1\rangle, \ldots, |a_m\rangle$$
of the kernel of $T$ and also one
$$|b_1\rangle, \ldots, |b_n\rangle$$
of the image of $T$. One then chooses vectors $|a_{m+i}\rangle$ in $V$ with the property that
$$T |a_{m+i}\rangle = |b_i\rangle.$$
The $|a_1\rangle, \ldots, |a_{m+n}\rangle$ cannot be linearly dependent. Let $|a\rangle$ be an arbitrary vector in $V$; then there exists $|c\rangle$ in the span of $|a_{m+1}\rangle, \ldots, |a_{m+n}\rangle$ with
$$T |a\rangle = T |c\rangle,$$
and hence $|a\rangle - |c\rangle$ is in the kernel of $T$. So the $|a_1\rangle, \ldots, |a_{m+n}\rangle$ also span $V$ and therefore form a basis, which proves the claim.
It follows that two vector spaces can only be isomorphic if their dimensions coincide. In fact, we have

Proposition 5. Two vector spaces over $\mathbb{C}$ are isomorphic if and only if they have the same dimension.

Proof. Let $V$ and $W$ be two vector spaces of the same dimension $n$ and let
$$|a_1\rangle, \ldots, |a_n\rangle$$
be a basis of $V$ and
$$|b_1\rangle, \ldots, |b_n\rangle$$
a basis of $W$. Then
$$T : V \to W, \qquad \lambda_1 |a_1\rangle + \cdots + \lambda_n |a_n\rangle \mapsto \lambda_1 |b_1\rangle + \cdots + \lambda_n |b_n\rangle$$
defines a linear transformation. As it maps basis vectors to basis vectors it must be both injective and surjective.

Let $V$ be a vector space with an inner product. We can then think of the vectors in the dual space as the bra-vectors. The reason is as follows. Consider $|a\rangle \in V$. Then $|a\rangle$ defines a linear functional $f_a$ on $V$ by
$$f_a : V \to \mathbb{C}, \qquad |b\rangle \mapsto \langle a | b \rangle.$$
So it is natural to think of $f_a$ as $\langle a|$, and from now on we will use this intuitive notation $f_a = \langle a|$. These linear functionals span a sub vector space of $V^*$ of the same dimension as $V$, so that by the above proposition the two are isomorphic. We even have that all linear functionals are of this form.

Theorem 6. Let $V$ be a vector space with inner product; then $V \cong V^*$, that is, the vector space is isomorphic to its own dual.

Proof. We construct an isomorphism. Let $|a_1\rangle, \ldots, |a_n\rangle$ be a basis of $V$. Then we define
$$f_i : V \to \mathbb{C}, \qquad \lambda_1 |a_1\rangle + \cdots + \lambda_n |a_n\rangle \mapsto \lambda_i.$$
Again one inspects that this map is a linear transformation, and clearly the $f_1, \ldots, f_n$ are linearly independent. We have to show that every linear functional is in the span of these $f_i$. For this let $f$ be a linear functional with action on basis vectors $f\left( |a_i\rangle \right) = \mu_i$; then
$$f = \mu_1 f_1 + \cdots + \mu_n f_n,$$
so that the map is also surjective and we have constructed the desired bijective linear transformation.

We can now define two useful maps.

Definition 6. Let $V$ be a vector space over $\mathbb{C}$ with inner product. Then the dagger is defined as
$$\dagger : V \to V^*, \qquad |a\rangle \mapsto |a\rangle^\dagger := \langle a|.$$
Let $V$ and $W$ be two vector spaces over $\mathbb{C}$ with inner products, and let $T \in L(V, W)$. Then the pullback $T^* \in L(W^*, V^*)$ of $T$ is the map
$$T^* : W^* \to V^*, \qquad f \mapsto f \circ T.$$

2.3. Operators. Linear transformations from a vector space to itself form an algebra. Such linear transformations are called linear operators, and they form the algebra of linear operators on the given vector space. Let $L(V)$ be the algebra of linear operators on the $\mathbb{C}$-vector space $V$ with inner product. The identity is a linear transformation, so our algebra has a multiplicative identity. A linear operator that has a non-trivial kernel cannot have an inverse, but every bijective linear transformation does, as it maps any basis of $V$ to another one. We denote the inverse of a linear operator $T$ by $T^{-1}$ and it satisfies
$$T \circ T^{-1} = T^{-1} \circ T = 1.$$
It is the unique operator with this property: if there is another transformation $R$ with $T \circ R = 1$, then multiplying this equation on the left by $T^{-1}$ and using associativity, uniqueness follows. In quantum mechanics, operators represent a physical quantity such as energy, some charge or momentum. The measurement of this quantity in a physical state is then given by the expectation value of this operator in the given state. It is defined as
$$\langle T \rangle_a := \langle a | T | a \rangle$$
for a state $|a\rangle$ and an operator $T$. Since operators form an algebra, we can use addition and multiplication to define polynomials of operators. Using the inverse of operators we can further define rational functions in the operators. We can even define power series and Laurent series in the operators. These are then formal series, since we don't have a notion of convergence yet. An example is the exponential of an operator $T$, defined by the series expansion of the standard exponential function on the complex numbers,
$$\exp(T) := \sum_{k=0}^{\infty} \frac{T^k}{k!}.$$
But operators do not necessarily commute. It is useful to define the commutator of two operators $U, T$ in $L(V)$ as
$$[T, U] := TU - UT.$$
The commutator is a bilinear map, that is a map from $L(V) \times L(V)$ to $L(V)$ that is linear in each component. It satisfies the properties
$$[U, T] = -[T, U] \qquad \text{(antisymmetry)}$$
and
$$[U, [S, T]] + [T, [U, S]] + [S, [T, U]] = 0 \qquad \text{(Jacobi identity)}$$
for all $U, S, T$ in $L(V)$. A vector space with a bilinear map that is antisymmetric and satisfies the Jacobi identity is called a Lie algebra. So we have

Theorem 7. Let $V$ be a vector space; then the commutator gives $L(V)$ the structure of a Lie algebra.

A useful formula for products of exponentials of operators is the Baker–Campbell–Hausdorff formula:
$$\exp(t(U + S)) = \exp(tU)\exp(tS)\exp\left(-\frac{t^2}{2}[U, S]\right)\exp\left(\frac{t^3}{6}\left(2[S, [U, S]] + [U, [U, S]]\right)\right)\cdots,$$
where the dots indicate exponentials of higher powers of the variable $t$.
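For the Heisenberg matrices $p$ and $q$ of the introduction, the bracket $[p, q] = z$ is central, so all higher nested brackets vanish and the product formula truncates after the first correction term. The following SymPy sketch (illustrative only) checks this:

    import sympy as sp

    t = sp.symbols('t')
    p = sp.Matrix([[0, 1, 0], [0, 0, 0], [0, 0, 0]])
    q = sp.Matrix([[0, 0, 0], [0, 0, 1], [0, 0, 0]])

    def mexp(M):
        # exponential of a nilpotent 3x3 matrix: the series terminates
        return sp.eye(3) + M + M*M/2 + M*M*M/6

    lhs = mexp(t*(p + q))
    rhs = mexp(t*p) * mexp(t*q) * mexp(-t**2/2 * (p*q - q*p))
    assert (lhs - rhs).expand() == sp.zeros(3, 3)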
There is a list of important types of operators.

Definition 7. Let $V$ be a vector space over $\mathbb{C}$ with inner product, and let $T$ be in $L(V)$; then the adjoint or Hermitian conjugate of $T$ is the operator $T^\dagger$ in $L(V)$ defined by
$$\langle a | T^\dagger | b \rangle = \langle b | T | a \rangle^* \qquad \text{for all } |a\rangle, |b\rangle \text{ in } V.$$

The adjoint is somehow an operator analogue of complex conjugation. The analogues of real numbers are then those operators that do not change under Hermitian conjugation.

Definition 8. A linear operator $T$ in $L(V)$ is Hermitian or self-adjoint if
$$\langle b | T^\dagger | a \rangle = \langle b | T | a \rangle \qquad \text{for all } |a\rangle, |b\rangle \text{ in } V.$$
One often uses the slightly confusing short-hand notation $T^\dagger = T$. It is called anti-Hermitian if
$$\langle b | T^\dagger | a \rangle = -\langle b | T | a \rangle \qquad \text{for all } |a\rangle, |b\rangle \text{ in } V,$$
and the short-hand notation is $T^\dagger = -T$.
So anti-Hermitian operators are the analogue of purely imaginary numbers. Indeed, by definition of the adjoint, the expectation values of Hermitian operators are real, while those of anti-Hermitian ones are purely imaginary. This statement is actually an if and only if statement.

Definition 9. An operator $T$ in $L(V)$ is called positive if all its expectation values are non-negative real numbers. We say that it is positive definite if all its expectation values are positive real numbers.
We use the notation $T \ge 0$ for a positive operator, and $T > 0$ for a positive definite one.
Definition 10. An operator $T$ in $L(V)$ is called unitary if
$$\langle a | b \rangle = \langle a | T^\dagger T | b \rangle \qquad \text{for all } |a\rangle, |b\rangle \text{ in } V.$$
Unitary means that the adjoint of the operator is the same as its inverse.
Let $V$ be a vector space with inner product; then to every vector $|a\rangle$ in $V$ we can associate an operator $T_a$ via
$$T_a : V \to V, \qquad |b\rangle \mapsto |a\rangle \langle a | b \rangle.$$
In physics one uses the notation
$$T_a := |a\rangle \langle a|,$$
which we will also adopt. What does this operator do? It maps a vector orthogonal to $|a\rangle$ to the zero vector, while every vector parallel to $|a\rangle$ is mapped to itself. If the norm of $|a\rangle$ is one, then it is thus the projection operator onto the subspace spanned by $|a\rangle$. Given a set of orthonormal vectors $\{ |a_i\rangle \}$, the operator
$$\sum_{i=1}^{m} |a_i\rangle \langle a_i|$$
is then the projection operator onto the sub vector space spanned by these $m$ orthonormal vectors. Especially if these vectors form an orthonormal basis, this operator is just the identity,
$$\sum_{i=1}^{m} |a_i\rangle \langle a_i| = 1.$$
This relation is called the completeness relation.
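Numerically, $|a\rangle\langle a|$ is the outer product of a coordinate vector with its complex conjugate. The following NumPy sketch (an illustration with a random orthonormal basis of $\mathbb{C}^3$) verifies idempotency of the projectors and the completeness relation:

    import numpy as np

    # an orthonormal basis of C^3: the columns of a unitary matrix
    Q, _ = np.linalg.qr(np.random.rand(3, 3) + 1j*np.random.rand(3, 3))
    basis = [Q[:, i] for i in range(3)]

    P = [np.outer(a, a.conj()) for a in basis]   # the projectors |a_i><a_i|
    for Pi in P:
        assert np.allclose(Pi @ Pi, Pi)          # projectors are idempotent
    assert np.allclose(sum(P), np.eye(3))        # completeness relation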
You are surely familiar with matrices, and surely also with the concept that matrices represent linear transformations. So we will only very briefly recall this here. Let $V$ be a finite-dimensional complex vector space with inner product, $B$ a basis, and $T$ a linear operator on $V$. Every basis allows a matrix representation of $T$. Call the basis vectors $|a_1\rangle, \ldots, |a_n\rangle$; then the matrix of $T$ with respect to the basis $B$ has components $T_{ij}$ defined by
$$T |a_j\rangle = \sum_{i=1}^{n} T_{ij} |a_i\rangle.$$
A particularly nice basis is an orthonormal one, so let $B$ be orthonormal; then the matrix entries take the nice form
$$\langle a_i | T | a_j \rangle = \sum_{k=1}^{n} T_{kj} \langle a_i | a_k \rangle = T_{ij}.$$
Let $T_B = (T_{ij})$ be the matrix for a linear operator $T$ in a given basis $B$; then the matrix in the basis $B$ for the adjoint is the complex conjugate transpose of $T_B$,
$$T_B^\dagger = \left( T_B^t \right)^* = \left( T_{ji}^* \right).$$

Matrices are then called Hermitian, unitary, etc., if they are the matrices with respect to some basis of Hermitian, unitary, etc., operators.
We have mentioned power series and Laurent series in linear operators. Using the matrix form of a linear operator in some basis, the matrix form of the Laurent or power series of the operator is then just the Laurent or power series of its matrix. It can then be said to converge if it converges componentwise.

2.4. Eigenvectors and eigenvalues. Linear algebra has taught us that a matrix is diagonalizable if there exists a basis such that in this basis the matrix can have non-zero entries only on the diagonal. For a linear operator $T$ on a vector space $V$, this translates to the statement that $T$ is diagonalizable if and only if there exists a basis of $V$ such that each basis vector $|a_i\rangle$ satisfies $T |a_i\rangle = \lambda_i |a_i\rangle$ for some complex number $\lambda_i$, called the eigenvalue of the eigenvector $|a_i\rangle$. The set of eigenvalues is usually called the spectrum of the operator. As I have the impression that you are all very familiar with these concepts, we will only repeat a few important statements.

Theorem 8. Let $V$ be a vector space with inner product, and $U, T$ two linear operators on $V$. Then $U$ and $T$ are simultaneously diagonalizable, that is they possess a common basis of eigenvectors, if and only if they commute.

Simultaneously diagonalizable operators commute, since diagonal matrices do. The statement is then proven by looking at the eigenspace of $U$ for a given eigenvalue and using commutativity to observe that the eigenspace is $T$-invariant. Using that distinct eigenspaces are orthogonal, one can then show that every invariant subspace of a diagonalizable operator can be decomposed into eigenspaces, so that the theorem follows.
Lemma 9. Let $V$ be a vector space with inner product and let $V = M \oplus M^\perp$ be an orthogonal decomposition. This means that every vector in $M$ is orthogonal to every vector in $M^\perp$. Let $T$ be a linear operator on $V$; then $M$ is invariant under $T$ if and only if $M^\perp$ is invariant under $T^\dagger$.

The proof is left as an exercise.

Definition 11. An operator $T$ in $L(V)$ is called normal if it commutes with its adjoint.

Both Hermitian and unitary operators are normal.

Theorem 10 (spectral decomposition). Let $T$ in $L(V)$ be a normal operator with spectrum $\{\lambda_1, \ldots, \lambda_n\}$ and corresponding eigenspaces $V_1, \ldots, V_n$. Then there exist non-zero mutually orthogonal projection operators $P_1 : V \to V_1, \ldots, P_n : V \to V_n$ such that
$$P_1 + \cdots + P_n = 1 \qquad \text{and} \qquad \lambda_1 P_1 + \cdots + \lambda_n P_n = T.$$
In other words, every normal operator can be written as a linear combination of projection operators with coefficients the eigenvalues of the normal operator.
Proof. The intersection of two eigenspaces with distinct eigenvalues is trivial, hence the direct sum
$$M := V_1 \oplus \cdots \oplus V_n$$
exists. We claim that $M = V$. Let $M^\perp$ be the orthogonal complement of $M$ in $V$. $T$ leaves $M$ invariant, and since $T$ commutes with its adjoint the same is true for $T^\dagger$. By the above lemma $T$ leaves $M^\perp$ invariant. The eigenvalues of a linear operator are given by the roots of its characteristic polynomial, and the characteristic polynomial is a polynomial of degree the dimension of the vector space. But every polynomial of degree at least one has at least one root over $\mathbb{C}$, so if $M^\perp$ were not zero-dimensional, $T$ restricted to $M^\perp$ would have an eigenvector, contradicting the fact that all eigenvectors of $T$ lie in $M$. Hence $M^\perp$ must be zero-dimensional. It follows that the sum of the projection operators is the identity on $M = V$. Further, the operators $\lambda_1 P_1 + \cdots + \lambda_n P_n$ and $T$ coincide on every vector in $M = V$ and hence must be the same.
The eigenvalues of a Hermitian operator are always real and those of unitary ones always lie on the unit circle.
One advantage of diagonalizable operators is that it is now easy to say explicitly what functions of such an operator are. So let $T$ in $L(V)$ be a diagonalizable operator with spectrum $\{\lambda_1, \ldots, \lambda_n\}$, corresponding eigenspaces $V_1, \ldots, V_n$ and projection operators $P_1 : V \to V_1, \ldots, P_n : V \to V_n$. Let $f$ be a function on the space of linear operators, for example a polynomial, a rational function, a power series or a Laurent series; then
$$f(T) = \sum_{i=1}^{n} f(\lambda_i) P_i.$$
Now also the notion of convergence makes sense, and we can say that $f(T)$ converges in a basis consisting of eigenvectors if $f$ converges at all eigenvalues.
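This formula is easy to test numerically. A sketch with $f = \exp$ and a randomly generated Hermitian (hence normal) operator, compared against SciPy's matrix exponential:

    import numpy as np
    from scipy.linalg import expm

    A = np.random.rand(4, 4) + 1j*np.random.rand(4, 4)
    T = A + A.conj().T                    # Hermitian, therefore normal

    lam, V = np.linalg.eigh(T)            # orthonormal eigenvectors as columns
    P = [np.outer(V[:, i], V[:, i].conj()) for i in range(4)]

    fT = sum(np.exp(l) * Pi for l, Pi in zip(lam, P))
    assert np.allclose(fT, expm(T))       # exp(T) = sum_i exp(lambda_i) P_i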

2.5. Examples.

Quantum Chromodynamics and SU(3). Those of you who have taken a standard model or quantum field theory course will know that quantum chromodynamics, that is the strong force, is an SU(3)-gauge theory. We will put the objects appearing there in the framework of this section. Let $V = \mathbb{C}^3$. The Lie group $SL(3;\mathbb{C})$ has $V$ as defining representation. This means there is a homomorphism
$$\rho : SL(3;\mathbb{C}) \to \{ M \in \mathrm{Mat}_3(\mathbb{C}) \mid \det(M) = 1 \}.$$
Warning: In mathematics one distinguishes the abstract Lie group from the Lie group of matrices in its defining representation. In physics one does not, as the two are isomorphic. There is then the risk that one confuses representations of the Lie group with the Lie group itself. In any case you can picture the Lie group $SL(3;\mathbb{C})$ as the group of unit determinant three by three matrices. In the same sense you should picture the unitary real form $SU(3)$ as the subgroup of unitary three by three matrices of determinant one, that is
$$SU(3) \cong \left\{ M \in \mathrm{Mat}_3(\mathbb{C}) \mid \det(M) = 1;\ M^\dagger = M^{-1} \right\}.$$
Let $X = \mathbb{R}^{3,1}$ be our four-dimensional Minkowski space-time. A quantum field is then an operator valued map from $X$, where the operators themselves act on some infinite-dimensional vector space. We do not want to be concerned with any details of that here. The point of this example is that the quantum fields of the three quarks can be organized in a vector
$$\psi_{\text{quark}}(x) = \begin{pmatrix} \psi_{\text{red}}(x) \\ \psi_{\text{green}}(x) \\ \psi_{\text{blue}}(x) \end{pmatrix}$$
whose components correspond to the three colored quarks. Quantum chromodynamics also has eight gauge fields, the gluons. But $SU(3)$ is an eight-dimensional real Lie group, and the eight gluons correspond to a choice of basis of the underlying Lie algebra, called the Gell-Mann matrices $\lambda_1, \ldots, \lambda_8$.
In summary, here we see that the quantum fields representing quarks are vector ($\mathbb{C}^3$) valued objects carrying a representation of the gauge group $SU(3)$. The gauge particles themselves, the gluons, are also vector (or matrix, if you wish) valued objects acting (by gauge transformations) on the quarks.
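As a small numerical aside (a sketch, with one Gell-Mann matrix written out by hand in the standard normalization), exponentiating $i$ times a Hermitian traceless matrix indeed produces an element of $SU(3)$, that is a unitary matrix of determinant one:

    import numpy as np
    from scipy.linalg import expm

    lam2 = np.array([[0, -1j, 0],
                     [1j,  0, 0],
                     [0,   0, 0]])      # the Gell-Mann matrix lambda_2

    M = expm(1j * 0.7 * lam2)           # a one-parameter gauge transformation
    assert np.allclose(M.conj().T @ M, np.eye(3))   # unitary: M^dagger = M^{-1}
    assert np.isclose(np.linalg.det(M), 1.0)        # determinant one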

The one-dimensional harmonic oscillator. This is one of the first and most instructive problems of a quantum mechanics course. We consider the one-dimensional harmonic oscillator of mass $m$ and frequency $\omega$. The space of physical states consists of square integrable complex-valued functions in one variable. This is an infinite-dimensional vector space, and hence this problem already leads us to what we will deal with in the coming lectures. The position and momentum operators $x$ and $p$ act on the space of complex-valued functions in one variable by multiplication with the variable $x$, respectively by $-i\hbar \frac{d}{dx}$. The time-independent Schrödinger equation is
$$H\psi(x) = \left( -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{m\omega^2}{2}x^2 \right)\psi(x) = E\psi(x).$$
It is useful to define the characteristic length
$$x_0 := \sqrt{\frac{\hbar}{\omega m}}.$$
The Schrödinger equation is a differential equation that might be solved using standard methods. If desired, we can have a look at that later in the course. But there is also an algebraic way to find the eigenfunctions. Let
$$a = \frac{1}{\sqrt{2}}\left( \frac{x}{x_0} + x_0 \frac{d}{dx} \right), \qquad a^\dagger = \frac{1}{\sqrt{2}}\left( \frac{x}{x_0} - x_0 \frac{d}{dx} \right).$$
The commutation relation is $[a, a^\dagger] = 1$, and the Hamilton operator becomes
$$H = \hbar\omega\left( a^\dagger a + \frac{1}{2} \right).$$
So finding the eigenstates of $H$ amounts to finding the eigenstates of $n = a^\dagger a$. Let $\psi_\nu$ be an eigenfunction of $n$ with eigenvalue $\nu$ and let us denote the inner product on square-integrable functions by $(\,,\,)$; then
$$\nu(\psi_\nu, \psi_\nu) = (\psi_\nu, a^\dagger a \psi_\nu) = (a\psi_\nu, a\psi_\nu) \ge 0.$$
The smallest possible eigenvalue is thus $\nu = 0$. Positive definiteness of the inner product implies that
$$0 = a\psi_0(x) = \frac{1}{\sqrt{2}}\left( \frac{x}{x_0} + x_0 \frac{d}{dx} \right)\psi_0(x).$$
The solution of this equation with norm one is
$$\psi_0(x) = \frac{1}{\sqrt{\sqrt{\pi}\, x_0}}\, e^{-\frac{1}{2}\left(\frac{x}{x_0}\right)^2}.$$
Using $[n, a^\dagger] = a^\dagger$ and $[n, a] = -a$, it is straightforward to see that $a^\dagger \psi_\nu$ is an eigenfunction with eigenvalue $\nu + 1$, while $a\psi_\nu$ is one with eigenvalue $\nu - 1$. The first property allows one to construct eigenfunctions with all non-negative integers as eigenvalues, starting from $\psi_0$. The second property, together with the positive definiteness of the inner product of square integrable functions, allows one to show that these are all.
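The algebraic manipulations above are easy to verify symbolically. A SymPy sketch in units with $x_0 = 1$ (illustrative; here a and ad act on functions of $x$ as the operators defined above):

    import sympy as sp

    x = sp.symbols('x', real=True)
    psi0 = sp.pi**sp.Rational(-1, 4) * sp.exp(-x**2/2)

    a  = lambda f: (x*f + sp.diff(f, x)) / sp.sqrt(2)   # annihilation operator
    ad = lambda f: (x*f - sp.diff(f, x)) / sp.sqrt(2)   # creation operator

    assert sp.simplify(a(psi0)) == 0                    # a annihilates psi_0

    f = x**3 * sp.exp(-x**2)                            # a test function
    assert sp.simplify(a(ad(f)) - ad(a(f)) - f) == 0    # [a, a^dagger] = 1

    psi1 = ad(psi0)                                     # the nu = 1 state
    assert sp.simplify(ad(a(psi1)) - psi1) == 0         # n psi_1 = 1 * psi_1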

2.6. Exercises.
(1) Let $V$ be a vector space with inner product. Let $|a\rangle$ and $|b\rangle$ be two vectors. $|b\rangle$ can be uniquely written as a sum of a vector parallel to $|a\rangle$ and another one orthogonal to $|a\rangle$. The vector parallel to $|a\rangle$ is called the projection of $|b\rangle$ onto $|a\rangle$. Show that the projection of $|b\rangle$ onto $|a\rangle$ is given by the formula
$$\frac{\langle a | b \rangle}{\langle a | a \rangle}\, |a\rangle.$$
(2) Let $V$ be a finite-dimensional vector space; show that $L(V)$ is associative, that is, for each three elements $A, B, C$ in $L(V)$,
$$A \circ (B \circ C) = (A \circ B) \circ C$$
holds.
(3) Give an example of operators on a vector space that do not commute.
(4) Show that an operator is Hermitian if and only if all its expectation values are real.
(5) Prove Lemma 9.
(6) Give an example of a non-diagonalizable operator on a finite-dimensional vector space.

2.7. Solutions.

(1) We write $|b\rangle = p(b)|a\rangle + |b^\perp\rangle$. Here $p(b)|a\rangle$ is the projection of $|b\rangle$ onto $|a\rangle$ and $|b^\perp\rangle$ is orthogonal to $|a\rangle$. Since $p(b)|a\rangle$ is parallel to $|a\rangle$ there is a $\lambda$ with $p(b)|a\rangle = \lambda |a\rangle$. Taking the inner product of $|a\rangle$ with $|b\rangle$, we get
$$\langle a | b \rangle = \lambda \langle a | a \rangle,$$
so that the claim follows.
(2) $V$ is a finite-dimensional vector space. Choose a basis $B$ of $V$, and let $A_B, B_B, C_B$ be the matrices of $A, B, C$ with respect to this basis. Matrix multiplication is associative (write down the product of three matrices with the indices and use associativity of multiplication in $\mathbb{C}$).
(3) Choose two random $n \times n$ matrices; most likely they won't commute. Or take the operators $x$ and $\frac{d}{dx}$ acting on a vector space of polynomials in one variable.
(4) We already noted in the lecture that the expectation value of a Hermitian operator is real. By the spectral decomposition theorem a Hermitian operator is diagonalizable, distinct eigenspaces are orthogonal, and the eigenvectors in fact form an orthogonal basis. Say this basis is $\{ |a_1\rangle, \ldots, |a_n\rangle \}$ with eigenvalues $\{\lambda_1, \ldots, \lambda_n\}$. Assume the statement is true for every eigenvector $|a_i\rangle$, that is each $\lambda_i$ is real. Then the expectation value of an arbitrary vector $|a\rangle = \mu_1 |a_1\rangle + \cdots + \mu_n |a_n\rangle$,
$$\langle T \rangle_a = |\mu_1|^2 \lambda_1 + \cdots + |\mu_n|^2 \lambda_n,$$
is also real. It thus remains to show that all $\lambda_i$ are real. We have
$$\langle T \rangle_{a_i} = \langle a_i | T | a_i \rangle = \lambda_i \langle a_i | a_i \rangle,$$
but also
$$\langle T \rangle_{a_i} = \langle a_i | T | a_i \rangle = \langle a_i | T^\dagger | a_i \rangle = \langle a_i | T | a_i \rangle^* = \lambda_i^* \langle a_i | a_i \rangle^* = \lambda_i^* \langle a_i | a_i \rangle,$$
so that the claim $\lambda_i = \lambda_i^*$ follows.
(5) Assume that $T$ leaves $M$ invariant, that is, for all $|a\rangle$ in $M$ we have $T|a\rangle$ in $M$. This means for all $|b\rangle$ in the orthogonal complement of $M$, that is in $M^\perp$, we have
$$\langle b | T | a \rangle = 0,$$
but this is the same as
$$\langle a | T^\dagger | b \rangle = 0,$$
and hence for all $|b\rangle$ in $M^\perp$ also $T^\dagger |b\rangle$ is orthogonal to $M$ and hence in $M^\perp$. The converse direction is exactly the same argument, interchanging $M$ and $M^\perp$ as well as $T$ and $T^\dagger$.
(6) Let $V = \mathbb{C}^2$ with basis $v$ and $w$. The operator $J$ that maps $v$ to $w$ and $w$ to zero has Jordan normal form
$$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$$
in this basis. It cannot be diagonalized.

3. Hilbert Spaces
Hilbert spaces appear in many problems of physics and mathematics. You have probably first heard about them in quantum mechanics, and we have just seen as an example the one-dimensional quantum mechanical harmonic oscillator.
In mechanics, one is interested in a physical state, an object for example; its properties such as energy, momentum, position, etc.; and how it changes in time. We thus have three fundamental properties of a mechanical system: states, observables and time evolution.
In classical mechanics, the states of a physical system are the elements of the phase space; observables are functions on the space of physical states; and time evolution is given by a path or flow in the phase space. In quantum mechanics, the situation is quite different, and the data is summarized as follows.
(1) The space of physical states is associated to a separable (that is, with countable basis) complex Hilbert space $\mathcal{H}$. The physical states are represented by vectors in the Hilbert space, subject to the identification of two vectors that only differ by a phase. These equivalence classes are known as rays in the Hilbert space.
(2) Each real physical observable is associated with a Hermitian operator $T$ on $\mathcal{H}$. The only possible measurements are the eigenvalues $\lambda_a$ of (normalized) eigenstates $|a\rangle$ of $T$ in $\mathcal{H}$. For a given arbitrary state $|b\rangle$ in $\mathcal{H}$, the probability of measuring $\lambda_a$ is given by $|\langle a | b \rangle|^2$. The expectation value of a measurement, that is the weighted average of possible measurements, is $\langle b | T | b \rangle$.
(3) Symmetries of the physical system are described by unitary operators on $\mathcal{H}$. Especially the time evolution of the system is given by a one-parameter family of unitary transformations $U(t)$, for $t$ in $\mathbb{R}$ the time variable. If $H$ is the time-independent Hamiltonian of the physical system, a special Hermitian operator whose eigenvalues correspond to the possible energies of states in $\mathcal{H}$, then
$$U(t) := \exp\left( -\frac{i}{\hbar}\, t H \right). \qquad (3.1)$$
You have all learnt Schrödinger's uncertainty principle: two observables can only be measured simultaneously if the corresponding operators commute. Suppose we are given a state $|a\rangle$ in our Hilbert space $\mathcal{H}$ and two operators $T, U$ on $\mathcal{H}$ that do not commute. If we first measure the observable
associated to $T$, we will get some eigenvalue $\lambda$ of $T$. If we repeat the measurement, we will always get the same answer. So the measurement of the observable projects the state onto an eigenstate $|b\rangle$ (of eigenvalue $\lambda$) of the associated operator $T$. But if we now want to measure the observable for $U$, we will project onto an eigenstate of $U$. Since $U$ and $T$ do not commute, such an eigenstate need not be one of $T$, and a subsequent measurement of the observable of $T$ may give an answer that is not $\lambda$ again.
After this short motivation, let us turn to defining a Hilbert space.

3.1. The definition of a Hilbert space. As we would like to deal with infinite-dimensional vector spaces, we might like to study infinite sums, and thus we need a notion of convergence. So let in this section $\mathcal{H}$ be a possibly infinite-dimensional complex vector space with inner product. We would also like $\mathcal{H}$ to be separable, that is, to have a countable basis.

Definition 12. A norm on $\mathcal{H}$ is a map $\|\ \| : \mathcal{H} \to \mathbb{R}$ with
(1) $\| |a\rangle \| \ge 0$ for all $|a\rangle$ in $\mathcal{H}$, and $\| |a\rangle \| = 0$ if and only if $|a\rangle = |0\rangle$;
(2) $\| \lambda |a\rangle \| = |\lambda|\, \| |a\rangle \|$ for all $\lambda$ in $\mathbb{C}$ and all $|a\rangle$ in $\mathcal{H}$;
(3) $\| |a\rangle + |b\rangle \| \le \| |a\rangle \| + \| |b\rangle \|$ for all $|a\rangle, |b\rangle$ in $\mathcal{H}$.

Proposition 11. Let $\mathcal{H}$ be a possibly infinite-dimensional complex vector space with inner product. Then
$$\| |a\rangle \| := \sqrt{\langle a | a \rangle} \qquad (3.2)$$
defines a norm on $\mathcal{H}$.

Proof. The first two properties follow directly from the inner product properties. For the triangle inequality, recall the Schwarz inequality, and we compute
$$\langle a + b | a + b \rangle = \langle a | a \rangle + \langle b | b \rangle + \langle a | b \rangle + \langle a | b \rangle^* \le \langle a | a \rangle + \langle b | b \rangle + 2|\langle a | b \rangle| \le \langle a | a \rangle + \langle b | b \rangle + 2\| |a\rangle \|\, \| |b\rangle \| = \left( \| |a\rangle \| + \| |b\rangle \| \right)^2,$$
so that the statement follows by taking the square root on both sides of the inequality.

Having a norm, we can introduce the notion of convergence. Given a sequence of points $a_1, a_2, \ldots$ in a set $\mathcal{H}$ with norm, with induced distance $d(a, b) := \| a - b \|$, we say that the sequence converges if there is another point $a$ in $\mathcal{H}$ such that for every $\varepsilon > 0$ there is an $N_0$ with $d(a, a_n) < \varepsilon$ for all $n > N_0$. We actually need a more restrictive type of convergence, the Cauchy sequence.

Definition 13. Let $\mathcal{H}$ be a set with norm and induced distance $d$; then a sequence $\{a_n \mid n \in \mathbb{Z}_{\ge 0};\ a_n \in \mathcal{H}\}$ is called a Cauchy sequence if for every $\varepsilon > 0$ there exists $N_0$ such that for all $n, m > N_0$ we have
$$d(a_n, a_m) < \varepsilon.$$
One can show that every convergent sequence is a Cauchy sequence, using the triangle inequality. But the converse is not necessarily true.

Example 1. The reason a Cauchy sequence might not converge in a set $\mathcal{H}$ is that the limit might not be an element of this set. For example, consider a Cauchy sequence in the rational numbers that converges to an irrational number. Such a sequence is thus non-convergent because the set, the rational numbers, is somehow incomplete.
Definition 14. A set $\mathcal{H}$ with norm is called complete if every Cauchy sequence converges in this set. If $\mathcal{H}$ is a vector space with inner product and the norm is given by the inner product as in (3.2), then if $\mathcal{H}$ is complete, it is called a Hilbert space.

We already saw that the rational numbers are not complete, but both the real and complex numbers are (with norm the absolute value). This generalizes to finite-dimensional inner product spaces.

Proposition 12. Every Cauchy sequence in a finite-dimensional inner product space over the real or complex numbers, with norm defined by (3.2), is convergent. This means that every finite-dimensional vector space (over the real or complex numbers) with inner product is complete with respect to the norm given by (3.2).

The proof is postponed to the exercises. You can find it in many textbooks, for example [H] page 217. For infinite-dimensional inner product spaces over a complete field (like $\mathbb{C}$ or $\mathbb{R}$) it is in general difficult to decide whether a space is a Hilbert space or not. An example is given on page 217 of [H]. Another one is
Example 2. Let $X$ be the space of sequences $\{a_1, \ldots \mid a_i \in \mathbb{R};\ \text{all but finitely many } a_i = 0\}$ with almost all entries zero and only finitely many non-zero. Let $a = \{a_1, \ldots\}$ and $b = \{b_1, \ldots\}$ be two such sequences; then an inner product is defined by
$$(a, b) = \sum_{i=1}^{\infty} a_i b_i.$$
This gives $X$ the structure of an inner product space. An orthonormal basis of this space is given by $\{e(1), e(2), \ldots\}$ where $e(n)$ is the sequence that has zeros everywhere except that its $n$-th component is one. We can then construct a sequence of sequences $\{F_1, \ldots\}$ in $X$ via
$$F_1 = \frac{1}{2}\, e(1), \qquad F_n = F_{n-1} + \frac{1}{2^n}\, e(n).$$
The sequence $F_n$ has its first $n$ components non-zero and zeros otherwise. So its limit (it exists, since the sum $\sum 2^{-n}$ converges) has non-zero entries everywhere, which is a sequence that is not an element of $X$.
In this example, I have already called the $e(n)$ a basis (even though $X$ is not a Hilbert space). So let us see what we mean by a basis.

Proposition 13. Let $\mathcal{H}$ be an infinite-dimensional Hilbert space over $\mathbb{C}$, and let $\{ |e_1\rangle, \ldots \}$ be an infinite ordered set of orthonormal vectors in $\mathcal{H}$. Let $|a\rangle$ be in $\mathcal{H}$ and define complex numbers $a_i := \langle e_i | a \rangle$. Then the Bessel inequality
$$\sum_{i=1}^{\infty} |a_i|^2 \le \langle a | a \rangle$$
holds.
Proof. Define a sequence of vectors
$$|a_n\rangle := \sum_{i=1}^{n} a_i |e_i\rangle.$$
The Schwarz inequality proven in the last section gives
$$|\langle a | a_n \rangle|^2 \le \langle a | a \rangle \langle a_n | a_n \rangle = \langle a | a \rangle \sum_{i=1}^{n} |a_i|^2. \qquad (3.3)$$
The inner product of $|a_n\rangle$ with $|a\rangle$ is
$$\langle a | a_n \rangle = \sum_{i=1}^{n} a_i \langle a | e_i \rangle = \sum_{i=1}^{n} a_i a_i^* = \sum_{i=1}^{n} |a_i|^2.$$
Substituting this identity into (3.3) gives
$$\sum_{i=1}^{n} |a_i|^2 \le \langle a | a \rangle.$$
That is, the sequence $\{A_n = \sum_{i=1}^{n} |a_i|^2 \mid n \in \mathbb{Z}_{>0}\}$ of non-negative real numbers is monotonously growing and bounded from above, hence convergent. The limit must satisfy the same inequality.

It follows that the vector
$$\sum_{i=1}^{\infty} a_i |e_i\rangle := \lim_{n \to \infty} \sum_{i=1}^{n} a_i |e_i\rangle$$
converges, in the sense that it has finite norm. The question is whether this vector coincides with $|a\rangle$.

Definition 15. A sequence of orthonormal vectors $\{ |e_1\rangle, \ldots \}$ in a Hilbert space $\mathcal{H}$ is called complete if the only vector that is orthogonal to all $|e_i\rangle$ is the zero vector. In this case $\{ |e_1\rangle, \ldots \}$ is called a basis of $\mathcal{H}$.

This definition is justified by

Proposition 14. Let $\{ |e_1\rangle, \ldots \}$ be an orthonormal sequence of a Hilbert space $\mathcal{H}$. Then the following statements are equivalent.
(1) $\{ |e_1\rangle, \ldots \}$ is complete.
(2) $|a\rangle = \sum_{i=1}^{\infty} |e_i\rangle \langle e_i | a \rangle$ for all $|a\rangle$ in $\mathcal{H}$.
(3) $\sum_{i=1}^{\infty} |e_i\rangle \langle e_i| = 1$.
(4) $\langle a | b \rangle = \sum_{i=1}^{\infty} \langle a | e_i \rangle \langle e_i | b \rangle$ for all $|a\rangle, |b\rangle$ in $\mathcal{H}$.
(5) $\| |a\rangle \|^2 = \sum_{i=1}^{\infty} |\langle e_i | a \rangle|^2$ for all $|a\rangle$ in $\mathcal{H}$.

Proof. (1) $\to$ (2): Let
$$|c\rangle = |a\rangle - \sum_{i=1}^{\infty} |e_i\rangle \langle e_i | a \rangle;$$
then $|c\rangle$ is orthogonal to all $|e_i\rangle$ and hence must be the zero vector in a complete Hilbert space.
(2) $\to$ (3): We have that
$$1\, |a\rangle = |a\rangle = \sum_{i=1}^{\infty} |e_i\rangle \langle e_i | a \rangle$$
for all $|a\rangle$ in $\mathcal{H}$. That means
$$\sum_{i=1}^{\infty} |e_i\rangle \langle e_i|$$
acts as the identity on every vector in the Hilbert space and hence is the identity operator.
(3) $\to$ (4):
$$\langle a | b \rangle = \langle a | 1 | b \rangle = \sum_{i=1}^{\infty} \langle a | e_i \rangle \langle e_i | b \rangle$$
for all $|a\rangle, |b\rangle$ in $\mathcal{H}$.
(4) $\to$ (5): This is statement (4) with $|a\rangle = |b\rangle$.
(5) $\to$ (1): Let $|a\rangle$ be orthogonal to all $|e_i\rangle$; then by (5) the norm of $|a\rangle$ must be zero. But the only vector of zero norm in an inner product space is the zero vector itself.

The equality
$$\| |a\rangle \|^2 = \sum_{i=1}^{\infty} |\langle e_i | a \rangle|^2 = \sum_{i=1}^{\infty} |a_i|^2, \qquad a_i := \langle e_i | a \rangle,$$
is called the Parseval equality, the complex numbers $a_i$ are called the generalized Fourier coefficients, and the relation
$$\sum_{i=1}^{\infty} |e_i\rangle \langle e_i| = 1$$
is called the completeness relation. The definition of a Hilbert space has been rather abstract. Interestingly, there is a very concrete way to picture all separable infinite-dimensional Hilbert spaces over the real or complex numbers.
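Anticipating the square integrable functions of the next subsection, one can watch the Bessel inequality and the Parseval equality at work numerically. A sketch (assuming the orthonormal system $e_n(x) = \sqrt{2/\pi}\sin(nx)$ of $L^2(0, \pi)$ and the test vector $f(x) = x$):

    import numpy as np
    from scipy.integrate import quad

    f = lambda x: x
    norm_sq = quad(lambda x: f(x)**2, 0, np.pi)[0]    # <f|f> = pi^3/3

    # generalized Fourier coefficients a_n = <e_n|f>
    coeffs = [quad(lambda x: np.sqrt(2/np.pi)*np.sin(n*x)*f(x), 0, np.pi)[0]
              for n in range(1, 200)]
    partial = np.cumsum(np.array(coeffs)**2)

    assert np.all(partial <= norm_sq + 1e-9)   # Bessel at every truncation
    print(norm_sq, partial[-1])                # partial sums approach <f|f>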

3.2. Square integrable functions. The set of square integrable functions on a real interval $[a, b]$ is denoted by $L^2_\omega(a, b)$. Here $L$ is due to Lebesgue, who is responsible for understanding functions that are not continuous. $\omega$ is the weight function; this is a strictly positive, and hence real valued, function on the interval. $L^2_\omega(a, b)$ is then the set of all functions
$$f : [a, b] \to \mathbb{C}$$
(you can replace $\mathbb{C}$ by $\mathbb{R}$ if you wish) with finite norm, defined by the inner product given by the Lebesgue integral
$$(f, g) := \int_a^b f(x)^* g(x)\, \omega(x)\, dx. \qquad (3.4)$$
A famous theorem due to Riesz and Fischer is

Theorem 15. The space $L^2_\omega(a, b)$ is a separable Hilbert space, and all separable complex Hilbert spaces are isomorphic to a Hilbert space of square integrable functions on some interval with some weight function.

This is really nice: it tells us that studying separable Hilbert spaces, that is Hilbert spaces with countable basis, is the same as studying spaces of square integrable functions. Finding a basis is due to a theorem by Stone and Weierstrass.

Theorem 16. The sequence of monomials $\{1, x, x^2, x^3, \ldots\}$ forms a basis of any $L^2_\omega(a, b)$.

We are not completely happy yet, as in physics we are usually interested in an orthonormal basis (possibly corresponding to the eigenvectors of some important operator).
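The inner product (3.4) is straightforward to implement with numerical quadrature. A minimal sketch (the function name inner is ours, chosen for illustration):

    import numpy as np
    from scipy.integrate import quad

    def inner(f, g, omega, a, b):
        """(f, g) = int_a^b f(x)* g(x) omega(x) dx, for real-valued inputs."""
        return quad(lambda x: np.conj(f(x)) * g(x) * omega(x), a, b)[0]

    # the monomials span L^2 by the Stone-Weierstrass theorem, but they are
    # not orthogonal: with weight omega = 1 on [0, 1],
    print(inner(lambda x: x, lambda x: x**2, lambda x: 1.0, 0, 1))  # 1/4 != 0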
3.3. Classical orthogonal polynomials. The following theorem is a generalization of the Gram–Schmidt process to a wide class of Hilbert spaces of square integrable functions.

Theorem 17. Define a sequence of functions on the interval $[a, b]$,
$$F_n(x) := \frac{1}{\omega(x)} \frac{d^n}{dx^n}\left( \omega(x)\, s(x)^n \right), \qquad (3.5)$$
with
(1) $F_1(x)$ a polynomial of degree one;
(2) $s(x)$ a polynomial of degree at most two with only real roots;
(3) $\omega(x)$ a strictly positive function on the interval $(a, b)$ with the boundary conditions $\omega(a)s(a) = \omega(b)s(b) = 0$.
Then $F_n(x)$ is a polynomial of degree $n$ and it is orthogonal to any polynomial $p_k(x)$ of degree $k < n$, that is,
$$\int_a^b p_k(x) F_n(x) \omega(x)\, dx = 0 \qquad \text{for } k < n.$$
These polynomials are called classical orthogonal polynomials.

This theorem is particularly useful if $s(x)$ is just a polynomial of degree one. In that case define
$$G_n(x) := \omega(x) F_n(x). \qquad (3.6)$$
Then there is the obvious recurrence relation
$$G_n(x) = \frac{d^n}{dx^n}\left( \omega(x)\, s(x)^{n-1}\, s(x) \right) = s(x)\, \frac{d}{dx} G_{n-1}(x) + n\, G_{n-1}(x)\, \frac{d}{dx} s(x). \qquad (3.7)$$
Proof. We first claim that the identity
$$\frac{d^m}{dx^m}\left( \omega(x) s(x)^n P_k(x) \right) = \omega(x) s(x)^{n-m} P_{k+m}(x) \qquad (3.8)$$
holds for all $m \le n$. Here $P_k$ stands for an arbitrary polynomial of degree at most $k$. We fix $n$ and prove the statement by induction on $m$. For $m = 0$ this is an honest identity. Using (3.5) with $n = 1$, we derive
$$s(x) \frac{d}{dx}\omega(x) = \omega(x) F_1(x) - \omega(x) \frac{d}{dx}s(x) = \omega(x) p_1(x), \qquad (3.9)$$
since $s(x)$ is a polynomial of degree at most two and $F_1(x)$ one of degree one. We thus get for $m > 0$,
$$\begin{aligned}
\frac{d^m}{dx^m}\left( \omega(x) s(x)^n P_k(x) \right) &= \frac{d}{dx}\left( \frac{d^{m-1}}{dx^{m-1}}\left( \omega(x) s(x)^n P_k(x) \right) \right) \\
&= \frac{d}{dx}\left( \omega(x) s(x)^{n-m+1} P_{k+m-1}(x) \right) \\
&= \omega(x) s(x)^{n-m}\left( (n-m+1)\,\frac{ds}{dx}\, P_{k+m-1}(x) + s(x)\, \frac{d}{dx}P_{k+m-1}(x) \right) + \frac{d\omega}{dx}\, s(x)^{n-m+1} P_{k+m-1}(x) \\
&= \omega(x) s(x)^{n-m} P_{k+m}(x),
\end{aligned}$$
where (3.9) was used on the last term. So the claim follows. Setting $k = 0$ and $P_0 = 1$ it follows that
$$\frac{d^m}{dx^m}\left( \omega(x) s(x)^n \right)\Big|_{x=a,b} = 0 \qquad (3.10)$$
for all $m < n$. Using this equation together with integration by parts, it follows inductively (for $k < n$) that
$$\int_a^b P_k(x) F_n(x) \omega(x)\, dx = (-1)^k \int_a^b \left( \frac{d^k}{dx^k} P_k(x) \right) \frac{d^{n-k}}{dx^{n-k}}\left( \omega(x) s(x)^n \right) dx. \qquad (3.11)$$
But the $k$-th derivative of a polynomial of degree at most $k$ is a constant $(-1)^k C$, so that we get
$$\int_a^b P_k(x) F_n(x) \omega(x)\, dx = C \int_a^b \frac{d^{n-k}}{dx^{n-k}}\left( \omega(x) s(x)^n \right) dx = C\, \frac{d^{n-k-1}}{dx^{n-k-1}}\left( \omega(x) s(x)^n \right)\Big|_a^b = 0. \qquad (3.12)$$
So orthogonality of $F_n(x)$ to all polynomials of degree lower than $n$ follows. It remains to prove that $F_n(x)$ is a polynomial of degree $n$. The identity (3.8) with $k = 0$, $P_0 = 1$ and $m = n$ implies that $F_n$ has at most degree $n$. So we write $F_n(x) = \alpha x^n + p_{n-1}(x)$. Multiplying both sides by $\omega(x) F_n(x)$ and integrating over our interval, we get
$$\int_a^b F_n(x)^2 \omega(x)\, dx = \int_a^b \left( \alpha x^n + p_{n-1}(x) \right) F_n(x) \omega(x)\, dx = \alpha \int_a^b x^n F_n(x) \omega(x)\, dx;$$
positive definiteness of the inner product implies that the left-hand side is non-zero if $F_n \ne 0$, and hence in that case $\alpha$ must be non-zero. In other words, in order to finish the proof we have to show that $F_n \ne 0$. For this, observe that (3.11) also holds upon replacing $P_k$ by any square integrable function $f(x)$, so that with $f(x) = x^n$ and $k = n - 1$ we get
$$\int_a^b x^n F_n(x) \omega(x)\, dx = (-1)^{n-1} \int_a^b \left( \frac{d^{n-1}}{dx^{n-1}} x^n \right) \frac{d}{dx}\left( \omega(x) s(x)^n \right) dx = (-1)^{n-1} n! \int_a^b x\, \frac{d}{dx}\left( \omega(x) s(x)^n \right) dx.$$
Integration by parts gives
$$\int_a^b x\, \frac{d}{dx}\left( \omega(x) s(x)^n \right) dx = x\, \omega(x) s(x)^n \Big|_a^b - \int_a^b \omega(x) s(x)^n\, dx = -\int_a^b \omega(x) s(x)^n\, dx,$$
using (3.10). Positivity of the inner product forces this integral to be non-zero for even $n$, so that $F_n \ne 0$ for $n$ even. Assume that there is an even $n$ with $F_{n-1} = 0$. Recall the definition of $G_n$ (3.6); then also $G_{n-1} = 0$, and by (3.7) we see that we get a contradiction to $F_n \ne 0$ if $s(x)$ is just a polynomial of degree at most one. Hence let $s(x) = \alpha x^2 + \beta x + \gamma$ with $\alpha \ne 0$. Then, by the Leibniz rule and $G_{n-1} = 0$,
$$G_n(x) = s(x)\, \frac{d^n}{dx^n}\left( \omega s^{n-1} \right) + n\, \frac{ds}{dx}\, \frac{d^{n-1}}{dx^{n-1}}\left( \omega s^{n-1} \right) + \frac{n(n-1)}{2}\, \frac{d^2 s}{dx^2}\, \frac{d^{n-2}}{dx^{n-2}}\left( \omega s^{n-1} \right) = \alpha n(n-1)\, \frac{d^{n-2}}{dx^{n-2}}\left( \omega(x) s(x)^{n-1} \right), \qquad (3.13)$$
but the derivative of the right-hand side is proportional to $G_{n-1}$ and hence vanishes. It follows that $G_n$ is constant, and hence $F_n$ is proportional to $1/\omega(x)$. Since we already know that $F_n$ is a polynomial of degree exactly $n$, the same must be true for $1/\omega(x)$, say $1/\omega(x) = p_n(x)$. Inserting this into the definition of $G_{n-1}$ yields
$$0 = G_{n-1}(x) = \frac{d^{n-1}}{dx^{n-1}}\left( \omega(x) s(x)^{n-1} \right) = \frac{d^{n-1}}{dx^{n-1}}\left( \frac{s(x)^{n-1}}{p_n(x)} \right),$$
so that $s(x)^{n-1}/p_n(x)$ must be a polynomial of degree at most $n - 2$, and hence the roots of $p_n$ must be a subset of those of $s(x)^{n-1}$. But $s$ only has two roots (which may coincide), and since $p_n$ is a polynomial of degree $n$, both roots must be roots of $p_n$. Positivity of $\omega(x)$ together with the boundary conditions $\omega(a)s(a) = \omega(b)s(b) = 0$ implies that $p_n$ cannot have any roots in $[a, b]$.
This is a contradiction. So in every case we get a contradiction to our assumption that $F_{n-1} = 0$. This completes the proof.
Having this theorem, we would like to know how to use it. There are four cases:
(1) s(x) has no root, that means it is the constant polynomial;
(2) s(x) has one root with multiplicity one, that means it is a polynomial of degree exactly
one;
(3) s(x) has two distinct roots;
(4) s(x) has one root with multiplicity two.
The first two cases have been studied by Hassani [H], so let us look at the third one. We would like to determine $\omega(x)$. Let
$$s(x) = (x - \alpha)(x - \beta), \qquad (3.14)$$
so $\alpha$ and $\beta$ are the two distinct real roots of $s(x)$. Recall that
$$F_1(x) = \frac{1}{\omega(x)} \frac{d}{dx}\left( \omega(x) s(x) \right)$$
is a polynomial of degree one. Dividing both sides of this equation by $s(x)$ gives
$$\frac{F_1(x)}{s(x)} = \frac{1}{\omega(x) s(x)} \frac{d}{dx}\left( \omega(x) s(x) \right) = \frac{d}{dx} \ln\left( \omega(x) s(x) \right).$$
We thus have to integrate the left-hand side in order to determine the weight function $\omega(x)$. For this, note the partial fraction decomposition
$$\frac{F_1(x)}{s(x)} = \frac{A}{x - \alpha} + \frac{B}{x - \beta}.$$
Here $A = F_1(\alpha)/(\alpha - \beta)$ and $B = -F_1(\beta)/(\alpha - \beta)$ (it is a short exercise to verify this identity). This fraction is the derivative of
$$\ln\left( (x - \alpha)^A (x - \beta)^B \right),$$
so that up to a constant (which we can set to one) we get
$$\omega(x) s(x) = (x - \alpha)^A (x - \beta)^B$$
and hence
$$\omega(x) = (x - \alpha)^{A-1} (x - \beta)^{B-1}.$$
The boundary conditions $\omega(a)s(a) = \omega(b)s(b) = 0$ imply that the interval of integration is bounded by $\alpha$ and $\beta$ and that $A, B > 0$. Also note that we can always translate and rescale the interval, so that without loss of generality $\alpha = -1$, $\beta = 1$.
The fourth case can be treated analogously. Let
s(x) = (x − α)2 , (3.15)
then a similar analysis reveals that
A
ω(x) = e− x−α (x − α)B−2 ,
with A and B defined by
F1 (x) A B
= 2
+ .
s(x) (x − α) x−α
Translating our integation intervall allows to set α = 0. It turns out that it is impossible to find
a positive weight function with the desired boundary conditions. It doesnot exist.
26 T CREUTZIG

So that we can summarize

Theorem 18. The following data inserted in the previous theorem defines up to isomorphism all
classical polynomials.
2
(1) s(x) = 1: the weight function is ω(x) = e−x and the intervall is [−∞, ∞]. The resulting
polynomials are called Hermite polynomials and they are denoted by Hn ;
(2) s(x) = x: the weight function is ω(x) = xν e−x with ν > −1 and the interval is [0, ∞].
The resulting polynomials are called Laguerre polynomials and they are denoted by Lnν ;
(3) s(x) = (1 − x)(1 + x): the weight function is ω(x) = (1 + x)µ (1 − x)ν with ν, µ > −1
and the interval is [−1, 1]. The resulting polynomials are called Jacobi polynomials and
µ,ν
they are denoted by Pn ;
(4) s(x) = x2 : this case doesnot exist.

There is a Schrödinger equation for these polynomials

Theorem 19. Let k and σ the coefficients F1 (x) = kx + . . . and s(x) = σ x2 + . . . . Then the
Schrödinger equation
d2 d
s(x) Fn (x) + F1 (x) Fn (x) = (kn + σ n(n − 1))Fn (x) (3.16)
dx2 dx
holds.

Proof. This will be an exercise.

One can ask a somehow inverse question, namely under which assumption does a differential
equation of type
d2 d
q(x) 2 Fn (x) + `(x) Fn (x) = λm Fn (x) (3.17)
dx dx
admit solutions that are mutually orthogonal for distinct λn . A computation reveals that this is
true if ω(x)q(x) vanishes sufficiently fast at the boundary of the interval of integration, and in
addition the identity
d
(ω(x)q(x)) = `(x)ω(x)
dx
is true.

3.4. Gegenbauer polynomials and hypergeometric functions. We now would like to dis-
cuss an example. The most prominent examples in physics are the Hermite polynomials in the
one-dimensional quantum mechanical harmonic oscillator, and the Gegenbauer polynomials in
problems with a spherical symmetric potential, as for example in electro-statics. The Gegen-
bauer polynomial are the special case µ = ν of Jacobi polynomials. Let us define them as
α−1/2,α−1/2
Cnα (x) := Pn ,
so that α < −1/2. The data for these polynomials is
1
s(x) = 1 − x2 , ω(x) = (1 − x2 )α− 2 , C1α (x) = (2α + 1)x.
So that they satisfy the differential equation
d2 α d
(1 − x2 ) Cn (x) + (2α + 1) Cnα (x) = 2(α + 1)n − n2 Cnα (x).

dx 2 dx
MA PH 451 27

The importance of the Gegenbauer polynomials comes from the form of its generating function,
that is ∞
1
2
(1 − 2xt + t )
α = ∑ Cnα (x)t n . (3.18)
n=0
For example in spherical symmetric potentials, we have to deal with expressions of the form
1
|~r −~r0 |2α
for two vectors ~r,~r0 . Let r, r0 be the length of these vectors, and let ϑ be the angle between
these two vectors, then an exercise in linear algebra shows (look at your first year linear algebra
textbook) that
r02 r0
 
0 2 2
|~r −~r | = r 1 + 2 − 2 cosϑ
r r
0
so that with t = rr , we get
1 ∞
r0n α
= ∑ n+2 Cn (cosϑ ).
|~r −~r0 |2α n=0 r
Another important property of these polynomials is there normalization, that is
Z 1
1 Γ(n + 2α)
Cnα (x)Cnα (x)(1 − x2 )α− 2 = 21−2α π (3.19)
−1 (n + α)Γ(α)2 Γ(n + 1)
where the Γ function satisfies the relation Γ(x + 1) = xΓ(x) with Γ(1) = 1. The Gegenbauer
polynomials can be expressed in terms of hypergeometric functions
   
α n + 2α − 1 1 1
Cn (x) = 2 F1 −n, n + 2α, α + ; (1 − x) (3.20)
n 2 2
where the hypergeometric series is a function that appears frequently in various areas of physics
and mathematics. It is ∞
(α)n (β )n n
2 F1 (α, β , γ; x) = ∑ x (3.21)
n=0 n!(γ)n
for |x| < 1 and (α)n is a short-hand notation
Γ(α + n)
(α)n := .
Γ(α)
The relation to hypergeometric function actually holds to all Jacobi polynomials, it is
(α + 1)n
Pnα,β (x) = 2 F1 (−n, α + β + 1 + n, α + 1; (1 − 2x)). (3.22)
n!
So that this class of classical polynomials is part of hypergeometic functions. The hyper ge-
omtric functions satisfy the (Euler hypergeometric) differential equation
d2 d
x(1 − x) 2 2 F1 (α, β , γ; x) + (γ − (α + β + 1)x) 2 F1 (α, β , γ; x) = αβ 2 F1 (α, β , γ; x). (3.23)
dx dx
Hypergeometric functions appear in many areas as in number theory, especially the theory of
partitions of integers; in modular and elliptic functions and in conformal field theory and string
theory. There a class of theories, called the Virasoro minimal models (but also many others)
relates to hypergeomtric functions. Especially correlation functions of four fields inserted at
0, 1, x and ∞ satisfy the differential equation (3.23). But these correlation functions are usually
28 T CREUTZIG

computed using the integral identity


Z 1
Γ(γ)
2 F1 (α, β , γ; x) = yβ −1 (1 − y)γ−β −1 (1 − xy)−α dy (3.24)
γβ Γ(γ − β ) 0
which holds if the real part of γ is larger than the real part of β , which must be positive.

3.5. Hermite polynomials. Every classical orthogonal polynomial satisfies some nice differ-
ential and integral equation. We have seen such an equation for Jacobi polynomials in the last
section. Here, I would like to present a similar equation for Hermite polynomial as well as some
more properties of them. The Hermite polynomial can be represented by the following integral
2
2n (−i)n ex ∞ Z
2
Hn (x) = √ e−t +2itxt n dt. (3.25)
π −∞
They also satisfy an integral equation
1 y2
Z ∞
x2
e− 2 Hn (x) = √ eixy e− 2 Hn (y)dy. (3.26)
in 2π −∞
The use of orthogonal polynomials is that they form an orthogonal basis of square integrable
functions. This means that every other square integable function can be expressed in terms of
this basis. For Hermite polynomials the more precise statement is

2
Theorem 20. Let f (x) be in Lω (−∞, ∞) for ω = e−x and let f be smooth, then

f (x) = ∑ cnHn(x)
n=0
with Z i
1
cn = n √ n f tyω(x)Hn (x) f (x)dx.
2 n! π −∞

In many cases it is possible to explicitely compute these coefficients.

Example 3. Let f (x) = x2p for p = 0, 1, . . . . Then the expansion of this monomial in terms of
Hermite polynomials is
p
x2p = ∑ c2nH2n(x),
n=0
so that most coefficients in the expansion vanish. In order to compute the c2n , we need the
integral representation for the Γ function, that is
  Z ∞
1 2
Γ p−n+ = e−x x2p−2n dx
2 −∞
as well as the identity

 
2p−2n 1
2 Γ p−n+ (p − n)! = π(2p − 2n)!.
2
MA PH 451 29

Then by partially integrating (the second equation), we get


Z i
1 2
c2n = √ n f tye−x x2p H2n (x)dx
22n (2n)! π −∞
1
Z i 2n 
−x2 2p d

−x2
= √ n f tye x e dx
22n (2n)! π −∞ dx2n
Z i
1 (2p)! 2
= 2n √ n f tye−x x2p−2n dx (3.27)
2 (2n)! π (2p − 2n)! −∞
 
1 (2p)! 1
= 2n √ Γ p−n+
2 (2n)! π (2p − 2n)! 2
(2p)!
= 2p
2 (2n)!(p − n)!
Similarly one can show for odd degree monomials that
(2p + 1)! p 1
x 2p+1
= 2p+1 ∑ H2n+1 (x). (3.28)
2 n=0 (2n + 1)!(p − n)!

3.6. Exercises.

(1) Proof Proposition 12.


(2) Find the following differential equation for classical orthogonal polynomials. Let k and
σ the coefficients F1 (x) = kx + . . . and s(x) = σ x2 + . . . . Then
 
1 d d
ω(x)s(x) Fn (x) = (kn + σ n(n − 1))Fn (x) (3.29)
ω(x) dx dx
and
d2 d
s(x) 2
Fn (x) + F1 (x) Fn (x) = (kn + σ n(n − 1))Fn (x) (3.30)
dx dx
hold. Proceed in the following steps:
• Use (3.8) to prove
Z b  
d d
Fm (x) ω(x)s(x) Fn (x) dx = 0
a dx dx
for m < n.
• Use (3.8) to show that
 
1 d d
ω(x)s(x) Fn (x)
ω(x) dx dx
is a polynomial of at most degree n.
• Use the first step together with the orthogonality of the Fn with respect to the inner
product on L2ω (a, b), that is
Z b
(Fm , Fn ) = Fm (x)Fn (x)ω(x)dx = 0
a
if n 6= m, to show that
 
1 d d
ω(x)s(x) Fn (x) = γFn (x) (3.31)
ω(x) dx dx
for some proportionality constant γ.
30 T CREUTZIG

• Use (3.9) to show that


d2
 
d d d
ω(x)s(x) Fn (x) = ω(x)s(x) 2 Fn (x) + ω(x)F1 (x) Fn (x) (3.32)
dx dx dx dx
• We introduce some notation. Let α be the coefficient in front of xn of Fn , that is
Fn (x) = αxn + . . . . Let β be the norm of Fn , that is
Z b
(Fn , Fn ) = Fn (x)Fn (x)ω(x)dx = β .
a
Use this notation together with (3.31) and the previous step to prove (3.29).
• Combine the last two steps to (3.30).

3.7. Solutions.

(1) We repeat the proof of [H] on page 216/217. Let a 1 , . . . be a Cauchy sequence of
vectors in a finite-dimensional vector space V . Let e1 , . . . , en be an orthonormal
basis of V , and define coefficients αik such that
n
ai = ∑ αik ek .

k=1
Then
2 n  2
ai − a j = ∑ αik − α jk ek

k=1
n 2
= ∑ αik − α jk .

k=1
The left-hand side tends to zero for large i, j, as the sequence is Cauchy. So the same
must be true for the right-hand side. But the right-hand side is a sum of non-negative
summands. Hence each term must tend to zero. The complex numbers are complete,
hence the limits
lim αik := αk
i→∞
exists. Let
n
∑ αk ek

a = ,
k=1
then the cauchy sequence converges to this vector since
2 n 2 n 2
lim ai − a = lim ∑ αik − αk = ∑ lim αik − αk = 0.

i→∞ i→∞ i→∞
k=1 k=1
(2) We will prove this statement following the indicated steps.
• Equation (3.8) with n = m = 1 reads
d
(ω(x)s(x)Pk (x)) = ω(x)Pk+1 (x) (3.33)
dx
for every polynomial Pk of degree k. We partially integrate twice and use the boundary
behaviour (3.10) to get
Z b   Z b 
d d d d
Fm (x) ω(x)s(x) Fn (x) dx = − Fm (x) ω(x)s(x) Fn (x)dx
a dx dx a dx dx
Z b   
d d
= Fm (x) ω(x)s(x) Fn (x)dx.
a dx dx
MA PH 451 31

d
Using (3.33) with k = m − 1 and Pm−1 = dx Fm (x), we get
Z b    Z b
d d d
Fm (x) ω(x)s(x) Fn (x)dx = (Pm−1 (x)ω(x)s(x))Fn (x)dx
a dx dx a dx
Z b
= Pm (x)ω(x)Fn (x)dx = 0
a
for all m < n since Fn is orthogonal to any polynomial of degree less than n.
• We again use (3.33), that is the special case of n = m = 1 of (3.8). Then for Pn−1 =
d
dx Fn , we get
 
1 d d 1
ω(x)s(x) Fn (x) = ω(x)Pn (x) = Pn (x)
ω(x) dx dx ω(x)
that indeed the expression is a polynomial Pn of degree at most n.
• The polynomial Pn of the last step is a plynomial of degree at most n, so we can
expand it in terms of those orthogonal polynomials that have degree at most n, i.e.
n
Pn = ∑ γmFm.
m=1
The first step tells us that (Fm , Pn ) = 0 for all m < n. The orthogonality of the Fm
implies that (Fm , Pn ) = γm (Fm , Fm ), and hence γm must be zero for m < n. The claim
follows with γ = γn .
• We have
d2
 
d d d d
ω(x)s(x) Fn (x) = ω(x)s(x) 2 Fn (x) + (ω(x)s(x)) Fn (x).
dx dx dx dx dx
Equation (3.9) can be rewritten as
d
(ω(x)s(x)) = ω(x)F1 (x)
dx
so that the claim follows.
• We first compute the inner product of Fn with xn . Since the inner product of Fn with
any polynomial of lower degree vanishes, we have (Fn , Fn − αxn ) = 0, and hence
(Fn , xn ) = β /α. The task is to compute γ, using the previous two steps we have
β γ = γ(Fn , Fn ) = (Fn , γFn )
  
1 d d
= Fn , ω(x)s(x) Fn (x)
ω(x) dx dx
d2
 
d
= Fn , s(x) 2 Fn (x) + F1 (x) Fn (x)
dx dx
= ασ n(n − 1)(Fn , x ) + αkn(Fn , xn )
n

= β (σ n(n − 1) + kn).
So that the claim follows. Here, we again used the orthogonality of Fn to polynomials
of lower degree.
• Inserting now (3.32) in this differential equation proofs the final statement.
32 T CREUTZIG

4. H ARMONIC A NALYSIS
So far, we have talked about Hilbert spaces of functions, which have a countable basis con-
sisting of certain polynomials. We now would like to study larger Hilbert spaces and thus have
to relax our notion of functions.

4.1. Motivation. Consider a Hilbert space of square integrable functions H = Lω (a, b). We
have learnt that we can expand any function in this space in a given basis, say {ei , . . . },

f (x) = ∑ fnen(x),
n=0
where the coefficients are given by
Z b
fn = e∗n (x) f (x)ω(x)dx
a
if the basis is orthonormal. The expansion of a function is thus given (for an orthonormal basis)
by an assignment
Z b
b : H × Z≥0 → C, ( f , n) 7→ fn = e∗ (x) f (x)ω(x)dx (4.1)
a
from H times a set indexing the basis to the complex number. So that the expansion of our
function f in the basis becomes

f (x) = ∑ b( f , n)en(x).
n=0
Now we would like to replace our countable basis by a basis {ey |y ∈ I} indexed by an uncount-
able set I, then the situation should be analogous. There should be an assignment
b : H × I → C, ( f , y) 7→ fy , (4.2)
such that the element f can be expanded in terms of the uncountable basis
Z
f= fy ey dy.
I
Since the new basis is uncountable, we here have to replace the sum over the set of basis
elements by an integral. There are now some obvious questions:
• What is a natural uncountable set for the space of square integrable functions?
• What is fy ?
• What are the ey ?
A candidate for the first question is the interval
I = [a, b]
of integration. In order to proceed, we need the notion of orthogonality. We define

Definition 16. The Dirac delta distribution δ (y − x) is defined by requiring that for any square
integrable function on [a, b] the identity
Z b
f (y)δ (y − x)dy = f (x)
a
holds for all x ∈ [a, b].
MA PH 451 33

Then we define a distribution valued inner product on the basis vector ey by



Z b
ez ey = ω(x)e∗z (x)ey (x)dx = δ (y − z).
a
So that orthonormality should be interpreted in a distributional sence and the role of 1 is replaced
by the Dirac delta distribution. With this notion, we can compute the expansion coefficients

Z b ∗ Z bZ b

Z b
ez f =
ez (x) f (x)ω(x)dx = ez (x) fy ey (x)ω(x)dydx = δ (z − y) fy dy = fz .
a a a a
Here we interchanged the order of integrations. We assume this to be allowed. We observe, that
the expansion coefficients are as in the seperable case determined by the inner product with the
(in some distributional sense orthonormal) basis vectors. Now, let
Z b
g(x) = gy ey (x)dy
a
be another element of our Hilbert space. The inner product is then coefficient wise.

Z bZ bZ b ∗ ∗ Z bZ b

Z b
f g =
fy ey (x)gz ez (x)ω(x)dxdydz = fy gz δ (y − z)dydz = fy∗ gy dy
a a a a a a
There is also a completeness relation. Consider
Z Z b
f (x) = f = fy ey (x)dy = e∗y (z)ω(z)ey (x) f (z)dzdy
I a
so that the identity opertor is the integral over
Z b
δ (x − z) = e∗y (z)ω(z)ey (x)dy. (4.3)
a

4.2. Distributions. We have seen, that we need to learn about distributions in order to study
bases of uncountable cardinality. Distributions act on suitable spaces of test functions. We
choose our space of test functions to be C∞ (C). That is the space of smooth (infinitely differen-
tiable functions) on R with values in C.

Definition 17. A distribution is a linear functional on C∞ (C), that means it maps every smooth
function to a complex number.

Example 4. Let f be a sufficiently integrable function on the real line, then the distribution T f
is defined by Z
T f (g) := f (x)g(x)dx.
R
Usually one abuses notation and identifies the distribution corresponding to a function by the
same name, that is
Tf = f .
We will adopt this notation.

Example 5. In the same sense, the Dirac delta distribution is defined by


δ ( f ) = f (0)
and one abuses the notation as in the definition of last section, that is
Z
δ ( f ) = f (0) = δ (x) f (x)dx.
R
34 T CREUTZIG

The derivative of a distribution φ is defined by partial integration


d d
Z Z
φ (x) f (x)dx = − φ (x) f (x)dx,
R dx R dx
that is we assume that test functions vanish sufficiently fast at ±∞. This definition then implies
that
d
δ (x − y) = θ (x − y) (4.4)
dx
where the Heavyside step function is defined as
(
1 if x > 0
θ (x) := .
0 if x < 0
A useful way to represent the delta distribution is as a limit of a series of distributions associated
to some functions.
Definition 18. Let {φ1 (x), . . . } be a series of functions such that
Z
lim φn (x) f (x)dx
n→∞ R

exists for all smooth functions f (x). Then we say that the series of functions φ1 (x), . . . converges
to the distribution φ (x) defined by this limit.
Example 6. One can show that the two sequences
   
n −n2 x2 sin(nx)
√ e ,
π πx
both converge to the Dirac delta distribution. A good idea to proof the statement is to proof it
for a basis of the Hilbert space. But we will not go into the details here. We just note, that the
second sequence can be used to derive the following integral representation of the Dirac delta
distribution
1
Z
δ (x) = eixt dt.
2π R
This follows, since the second sequence has the integral representation
sin(nx) 1 n ixt
Z
= e dt.
πx 2π −n
We will see another example soon in the context of Fourier series.
4.3. Fourier Analysis. We turn to functions on the circle S1 of radius one. The circle is
S1 = eiα | − π ≤ α < π .


So that we can identify functions on the circle with functions on the intercal [−π, π]. We know
that a good basis for the space of some square integrable functions are monomials. But there
are also other possibilites.
Theorem 21. The Hilbert space of functions on the circle, H = L(−π, π), has orthonormal basis
{e1 , . . . } given by
1
en (x) = √ einx .

This means that for every square integrable function f ∈ H, there is an expansion
1
f (x) = √ ∑ fn einx
2π n∈Z
MA PH 451 35

with Fourier coefficients


1 π
Z
fn = √ f (x)e−inx dx.
2π −π
We donot proof this theorem. Orthonormality of the en is a direct computation, while com-
pleteness requires a Stone-Weierstrass theorem in two variables, that then can be specialized to
functions on the circle. Recall that in a finite-dimensional vector space, the identity operator is
the sum of projection operators associated to a basis. In our case, the operator
1 π
Z
Pn ( f ) = f (y)e−iny dyeinx = fn en (x)
2π −π
projects f onto the subspace spanned by en (x). Further, the completeness relation becomes
∑ Pn = 1,
n∈Z
since
1
Z π
∑ Pn( f ) = ∑ 2π −π
f (y)e−iny dyeinx = ∑ fnen(x) = f (x)
n∈Z n∈Z n∈Z
so that we get the representation of the delta distribution on the circle
1
δ (x − y) = ∑
2π n∈Z
dyein(x−y) .

Next, we ask what does it mean that a function converges to its Fourier series. Let f ∈ H,
then we say that its Fourier series
1
√ ∑ fn einx
2π n∈Z
converges in H = L(−π, π) if the partial sums
N
1
fN = √ ∑
2π n=−N
fn einx

converge in the sense that


lim || fN − f || = 0.
N→∞
This means the difference of fN and f becomes a function of measure zero. If we allow for
piecewise continuous functions, then at the points of discontinuity the difference between orig-
inal function and its Fourier series is given by the following theorem.
Theorem 22. The Fourier series of a piecewise contiuous function f (x) on [−π, π] converges
pointwise to
1
lim ( f (x + ε) + f (x − ε))
ε→0 2
at x ∈ (−π, π) and to
1
( f (π) + f (−π))
2
at x = ±π.
Fourier analysis on the circle is the simplest example of harmonic analysis on a Lie group.
So let us express everything we know in an unusual language.
4.4. Harmonic analysis on the Lie group U(1). The real unitary Lie group U(1) is the one-
dimensional real manifold S1 . We can parameterize elements in S1 by
eiα , −π ≤ α < π.
36 T CREUTZIG

Then the product


eiα eiβ = ei(α+β )
defines a group structure on S1 , the Lie group U(1). The tangent space at a point is nothing
d
but a line. An infinitesimal translation on the line ` = a + bx is given by the derivative dx .
The operator of infinitesmial translations on the tangent space of a Lie group is called invariant
vector field. U(1) is a commutive Lie group, in which case there is only one type of invariant
vector field. In the general non-commutative case there are both left- and right-invariant vector
fields, corresponding to infinitesimal left and right translations. The invariant vector fields
satisfy the commutation relations of the Lie algebra of the Lie group. The Lie algebra of U(1)
d
is called u(1) and it is the one-dimesional abelian Lie algebra. Indeed, the vector field dx
commutes with itself. Harmonic analysis is then the study of square integrable functions on
the Lie group. These functions respect the Lie group symmetry in the sense, that they are
representations of the Lie algebra of vector fields. In our case, the square integrable functions
on the Lie group are best described as periodic functions on the real line. We then saw that a
basis is given by the einx , and indeed each basis vector carries a one-dimensional representation
of u(1) given by
d inx
e = ineinx .
dx
In the non-commutative setting, good basis elements won’t be invariant under the Lie algebra
of vector fields, but they will carry a nice, that means irreducible, representation of the Lie
algebra. But there is another operator, called the Casimir or Laplace operator. It is a second-
order differential operator, that is invariant under conjugation by a Lie group element. In our
case, this is simply
d d
∆= .
dx dx
Conjugating by a Lie group element is rather trivial
eiα ∆e−iα = ∆.
Of course our good basis elements are eigenfunctions of the Laplace operator,
∆einx = −n2 einx .
This will still be true in the non-commutative setting. One essential task of harmonic analysis
is then to find these eigenfunctions. Finally, we need a measure on our Lie group. This measure
must respect the Lie group symmetry, and it is called the invariant measure or Haar measure. In
our case, it is
dµ(U(1)) = dx.
Invariance has the following meaning. Let g = eix our group-valued (that is S1 -valued) function,
then a constant group element eiα acts by multiplication
eix eiα = ei(x+α) .
It thus translates our variable x 7→ x + α. Now, consider an integral of some function f (x) on
S1 , that is f (x) is periodic with periodicity 2π:
Z π Z π+α Z π
f (x + α)dx = f (x)dx = f (x)dx.
−π −π+α −π
In the last equality we used the periodicity of f (x). We thus see, that the measure is translation
invariant, that is it respects the symmtry of the Lie group of the circle.
MA PH 451 37

The quantum mechanics interpretation of harmonic analysis on the circle is a free particle
with momentum. The momentum operator is
d
p = −ih̄ ,
dx
and the Hamiltonian describing the kinetic energy of the particle is

H = − ∆,
2m
so that the particle (of mass m) given by the wave-function
ψn (x) = einx
h̄n2
has momentum h̄n and kinetinc energy 2m .

4.5. Harmonic analysis on S3 . we now turn to a much more complicated example of harmonic
analyis, functions on the three-sphere S3 . The three-sphere is embedded in R4 as
S3 = {(x1 , x2 , x3 , x4 ) ∈ R4 |x12 + x22 + x32 + x42 = 1}.
Equivalently, we can write a = x1 + ix2 and b = x3 + ix4 , then we can view the three-sphere as
being embedded in C2 ,
S3 = {(a, b) ∈ C2 ||a|2 + |b|2 = 1}.
This second way gives us a good matrix representation of S3 . Consider a 2 × 2 matrix
 
a b
M= .
c c
We would like this matrix to have unit determinant, that is ad − bc = 1 and to be unitary, that is
its inverse coincides with its adjoint:
   ∗ ∗
d −b −1 † a c
=M =M = ∗ ∗ .
−c a b d
So that d = a∗ and c = −b∗ . The determinant one condition then translates to |a|2 + |b|2 = 1,
so that we see that
n o
3 −1 †
S = SU(2) = M ∈ Mat2 (C) | det M = 1, M = M .
We see that the set of points on the three-sphere can be identified with unitary two by two
matrices of determinant one. Given two such matrices A, B, we can take its matrix product. The
inverse of the product is
(AB)−1 = B−1 A−1 = B† A† = (AB)† .
So that we observe that the product of two unitary matrices is still unitary. Its determinant is
still one, since
det(AB) = det(A) det(B) = 1 · 1 = 1.
All unitary matrices are b definition invertible and hence matrix multiplication defines a group
structure. This group is isomorphic to the unitary real form SU(2) of the Lie group SL(2, C). It
is a real Lie group. Let us learn more about this Lie group. In order to study Lie groups, one
usually first considers its infinitesimal analouge the underlying Lie algebra. Its Lie algebra is
38 T CREUTZIG

called su(2). It consists of the following matrices


   
a b


su(2) = M = ∈ Mat2 (C) tr(M) = a − d = 0, M = −M

c d
    (4.5)
ia b + ic

= M= ∈ Mat2 (C) a, b, c ∈ R

−b + ic −ia
We thus see, that su(2) is a three-dimensional Lie algebra (over R) generated by
     
i 0 0 1 0 i
σ1 = , σ2 = , σ3 = .
0 −i −1 0 i 0
These are the famous Pauli matrices of physics. There products are easily comoted, they are
 
1 0
σi σ j = εi jk σk − δi, j .
0 1
Here δi, j is the Kronecker delta (
1, i= j
δi, j =
0, i 6= j
and εi jk is completely anti-symmetric in all three indices
εi jk = −ε jik = −εik j .
Its normalization is ε123 = 1. It follows that the commutator of the Pauli matrices is
[σi , σ j ] = 2εi jk σk .
For computational purposes, the Pauli matrices are not perfect. It is convenient to pass to the
complexification sl(2; C) = su(2) ⊗R C. Define
     
1 0 0 1 0 0
h= , e= , f= .
0 −1 0 0 1 0
These matrices form a basis of the complexification. Their commutation relations are
[h, e] = 2e, [h, f ] = −2 f , [e, f ] = h.
You might have seen this algebra in discussing the angular momentum and spin in quantum me-
chanics. e and f are the latter operaors, while the h-eigenvalue is the spin or angular momentum
in a distinguished direction.
The Lie algebra can be obtained as the algebra of left or right-invariant vector fields of the Lie
group. The invariant vector fields are infinitesimal translation operators, they are the differential
operators acting on functions on the Lie group. So really, we are now computing the Lie group
d
analouge of dx . For this let
 iθ
eiθ2 sin φ

e 1 cos φ
g(φ , θ1 , θ2 ) = .
−e−iθ2 sin φ e−iθ1 cos φ
g is a map from [0, 2π)3 to SU(2). The inteval is chosen such that the map is injective. Sur-
jectivity is verified by writing a = eiθ1 r and b = e−θ2 s for two real numbers r and s. The unit
determinant condition then impies r2 + s2 = 1, so that r = cos φ and s = sin φ is a good choice
of parameterization.
Even though we haven’t yet said yet what a Lie group is (we will do that later), let’s define
invariant vector fields. Just think of the Lie group as SU(2).
MA PH 451 39

Definition 19. Let g be a parameterixation of a Lie group G, and let g be the Lie algebra of G.
Then the left-invariant vector field for X ∈ g is the differential operator satisfying
LX g = −Xg.
The right-invariant vector field is defined analogously (up to a minus sign)
RX g = gX.

You should think of the invariant vector fields as two commuting copies of the Lie algebra.

Theorem 23. The map


(X,Y ) 7→ (LX , RY )
is a homomorphism from two commuting copies of the Lie algebra to the vector space of dif-
ferential operators on the Lie group.

Proof. Let X,Y be two arbitrary elements of g. We have to show that LX LY − LY LX = L[X,Y ] ,
but
(LX LY − LY LX )g = −LX Y g + LY Xg = −Y LX g + XLY g = (Y X − XY )g = −[X,Y ]g = L[X,Y ] g.
So that the action of LX LY − LY LX and L[X,Y ] coincides on G and hence they are the same
operators. Similarly for the right invariant vector fields
(RX RY − RY RX )g = RX gY − RY gX = g(XY −Y X) = g[X,Y ] = R[X,Y ] g.
The commutativity of the two types of vector fields follows similarly
RX LY g = −RX Y g = −Y RX g = −Y gX = LY gX = LY RX g.

This is pretty nice, it means that functions on the Lie group carry an action of its Lie algebra.
In other words analyzing the space of functions directly leads us to the representation theory of
the Lie algebra. After this general interlude, let us turn to the technical question of computing
the invariant vector fields for SU(2). We choose to do computation in the e, f , h basis (it is
simpler). The computation goes in a few steps.

(1) Let α, β , γ be functions on [0, 2π)3 , then the differential operator


d d d
Dα,β ,γ = α +β +γ
dθ1 dθ2 dφ
acts on g as follows
 iθ
eiθ2 (iβ sin φ + γ cos φ )

e 1 (iα cos φ − γ sin φ )
Dα,β ,γ g = −iθ2
e (iβ sin φ − γ cos φ ) e−iθ1 (−iα cos φ − γ sin φ )
The proof is a direct computation, take the derivatives of the components of the matrix
g.
40 T CREUTZIG

(2) The elements e, f , h act as follows on g


e 2 sin φ −e−iθ1 cos φ
   iθ 
0 −1
−eg = g= ,
0 0 0 0
 iθ
−e 1 cos φ −eiθ2 sin φ
  
−1 0
−hg = g= ,
0 1 −e−iθ2 sin φ e−iθ1 cos φ
   
0 0 0 0
−fg = g= .
−1 0 −eiθ1 cos φ −eiθ2 sin φ
These identities are verified by performing the appropriate matrix multipliations.
(3) The equation
−eg = Dα,β ,γ g (4.6)
is true for
i sin φ −i(θ1 +θ2 ) i cos φ −i(θ1 +θ2 ) 1
α =− e , β= e , γ = − e−i(θ1 +θ2 ) .
2 cos φ 2 sin φ 2
You can now directly verify this identity. But the way to derive it is to consider (4.6)
and to solve this matrix equation for its components. You thus get four equations for
the three unknown functions α, β , γ. It turns out that this system has the uniqe solution
given above.
(4) In analogy to step three one shows that the equation
−hg = Dα,β ,γ g
is true for
α = β = i, γ = 0.
(5) In analogy to step three one shows that the equation
− f g = Dα,β ,γ g
is true for
i sin φ i(θ1 +θ2 ) i cos φ i(θ1 +θ2 ) 1
α =− e , β= e , γ = ei(θ1 +θ2 ) .
2 cos φ 2 sin φ 2
We summarize

Theorem 24. The left invariant vector fields of SU(2) are


i sin φ −i(θ1 +θ2 ) d i cos φ −i(θ1 +θ2 ) d 1 d
Le = − e + e − e−i(θ1 +θ2 ) ,
2 cos φ dθ1 2 sin φ dθ2 2 dφ
 
d d
Lh = i + , (4.7)
dθ1 dθ2
i sin φ i(θ1 +θ2 ) d i cos φ i(θ1 +θ2 ) d 1 d
Lf = − e + e + ei(θ1 +θ2 ) .
2 cos φ dθ1 2 sin φ dθ2 2 dφ
It is instructive to verify that the commutation relations of these vector fields are indeed
those of su(2). This was the first step of harmonic analysis on the Lie group SU(2), finding
infinitesimal translation operators. The second step is to find a Laplace operator. For this we
define

Definition 20. The universal envelopping algebra of sl(2, C) is the ring of functions
U(sl(2, C)) = C[e, h, f ]/I.
MA PH 451 41

Where the ideal I is generated by the polynomials


[x, y] − (xy − yx)
for all x, y in sl(2, C). In words, the universal envelopping algebra is the polynomial ring in
the generators of the Lie algebra, where the Lie bracket [x, y] coincides with the commutator of
elements, that is (xy − yx). In the universal envelopping algebra you can compute Lie brackets
as you are used to in matrix algebras.

Definition 21. A very important element of the universal envelopping algebra is the Casimir
element. For sl(2, C) it is
Ω = hh − 2h + 4e f .

Let us compute the commutator of Ω with the generators of the Lie algebra.
Ωe − eΩ = hhe − 2he + 4e f e − ehh + 2eh − 4ee f
= 4eh + 4e f e − 4ee f (since ehh = heh − 2eh = hhe − 2he − 2eh)
= 4he − 4he = 0
Ωh − hΩ = hhh − 2hh + 4he f − hhh + 2hh + 4e f h = −8e f + 8e f = 0 (4.8)
Ω f − f Ω = hh f − 2h f + 4e f f − f hh + 2 f h − 4 f e f
= −4h f + 4e f f − 4 f e f (since f hh = h f h + 2 f h = hh f + 2h f + 2 f h)
= −4h f + 4h f = 0
So its importance is that the Casimir commutes with the Lie algebra. For sl(2, C) a stronger
statement is even true.

Theorem 25. The center of sl(2, C) inside its universal envelopping algebra is the polynomial
ring C[Ω].

The Laplace operator is then defined to be1


−∆ = Lh Lh − 2Lh + 4Le L f . (4.9)
By construction it commutes with the left-invariant vector fields. The theory of compact Lie
groups implies that this operator also commutes with the right-invariant vector fields. We know
from linear algebra, that two operators that commute can be simultaneously diagonalized. One
of our main goals of this course will be to learn about representations of Lie algebras. So how
does this statement help us? In harmonic analysis we are interested in the question: What are
the representations of the invariant vector fields which the functions on the Lie group carry? The
Laplacian commuting with the Lie algebra of vector field means that two functions transforming
in the same irreducible representation (we will learn what this word means later) of the Lie
algebra have the same eigenvalue of the Laplacacian. This means that finding the eigenvalue of
the Laplacian is the first step in analyzing functions on a Lie group. Computing the Laplacian

1The minus sign is just a choice of normalization, but it ensures that we get the same result as deducing the Laplace
operator from the Riemannian metric on S3 .
42 T CREUTZIG

explicitely is now straight-forward though tedious. We compute


sin2 φ d 2 cos2 φ d 2 d d d2
4Le L f = − − + 2 − +
cos2 φ dθ12 sin2 φ dθ22 dθ1 dθ2 dφ 2
   
d d cos φ sin φ d
+ 2i + − −
dθ1 dθ2 sin φ cos φ dφ
d2 d2 d d
Lh Lh = − 2
− 2
−2 ,
dθ1 dθ2 dθ1 dθ2
so that the explicit form of the Laplace operator is
d2 1 d2 d2
 
1 cos φ sin φ d
∆= + + + − . (4.10)
cos2 φ dθ12 sin2 φ dθ22 dφ 2 sin φ cos φ dφ
You might wish to rewrite this expression using the trigonometric identity
 
cos φ sin φ cos 2φ
− =2 .
sin φ cos φ sin 2φ
We have differential operators and a Laplace operator for our Lie group. Next, we need an
invariant measure. Computing this is a little but subtle. I will give a method that works for any
Lie group.

Definition 22. Let g be a Lie group-valued element, then its Maurer-Cartan one-from is
ω(g) = g−1 dg.

This one-form is defined such that it is left-invariant under the translation (from the left) of a
constant Lie group element h,
ω(hg) = (hg)−1 d(hg) = g−1 h−1 hdg = g−1 dg = ω(g)
The Maurer Cartan form is a one-form with values in the Lie algebra of G. We can thus write
ω(g) = ω(e)0 e + ω( f )0 f + ω(h)0 h.
The ω(x)0 for x in su(2) are also one-forms, and they are called the dual one-forms to the
left-invariant vector fields Lx . The Haar measure is then defined to be
dµ(g) = ω(e)0 ∧ ω( f )0 ∧ ω(h)0 .
It is by definition left-invariant, that is for any function f (g) on our Lie group G, we have
Z Z Z
f (hg)dµ(g) = f (hg)dµ(hg) = f (g)dµ(g).
G G G
How do we compute this quantity in an efficient way? The Haar measure of a compact Lie
group has the nice property that
dµ(g) = dµ(g−1 ).
We have
     
−1 −1 d −1 d −1 d
ω(g ) = −(dg)g = g g dθ1 + g g dθ2 + g g−1 dφ .
dθ1 dθ2 dφ
MA PH 451 43

Our task is to expand this expression in the Lie algebra basis. For this, it is usefull to express
the derivatives in terms of left-invariant vector fields:
d  
g = e−i(θ1 +θ2 ) L f − ei(θ1 +θ2 ) Le g

d   
g = −i cos2 φ Lh + i sin φ cos φ e−i(θ1 +θ2 ) L f + ei(θ1 +θ2 ) Le g (4.11)
dθ1
d   
g = −i sin2 φ Lh − i sin φ cos φ e−i(θ1 +θ2 ) L f + ei(θ1 +θ2 ) Le g.
dθ2
Here we did nothing but inverted the 3 × 3 matrix which expresses the left-invariant vector
fields in terms of the derivatives of our angles θ1 , θ2 , φ . Having this expression, we insert it in
the definition of the Maurer Cartan form of g−1 , to get
ω(g−1 ) = ω(e)e + ω( f ) + ω(h)
ω(e) = ei(θ1 +θ2 ) (dφ + i sin φ cos φ (dθ1 − dθ2 ))
(4.12)
ω( f ) = e−i(θ1 +θ2 ) (−dφ + i sin φ cos φ (dθ1 − dθ2 ))
ω(h) = i sin2 φ dθ2 + i cos2 φ dθ1 .
Using the anti-symmetry of the wedge product, the Haar measure takes the explicit form
dµ(g−1 ) = ω(e) ∧ ω( f ) ∧ ω(h)
= 2i sin φ cos φ dφ ∧ (dθ2 − dθ1 ) ∧ i sin2 φ dθ2 + i cos2 φ dθ1

(4.13)
= −2 sin φ cos φ (dφ ∧ dθ1 ∧ dθ2 ) − sin2 φ − cos2 φ


= sin 2φ (dφ ∧ dθ1 ∧ dθ2 ).


In the case of S3 , there is a more intuitive way to derive this measure. Let me present this
second derivation. However note, that this second derivation does not nicely generalize to any
compact Lie group. Consider the threeball
B3r = {(a, b) ∈ C2 ||a|2 + |b|2 ≤ r}
of radius r. Then the boundary of the unit three-ball is the unit-three sphere. On C2 , we have
the natural measure da ∧ da∗ ∧ db ∧ db∗ . Let us parameterize elements in B3r by
a = r cos φ eiθ1 , b = r sin φ eiθ2 ,
so that at r = 1 we recover our previous parameterization of S3 . Let
da ∧ da∗ ∧ db ∧ db∗ = µ(r)dr ∧ µ(φ , θ1 , θ2 )dφ ∧ dθ1 ∧ dθ2
be the measure in the new coordinates. Let f be a function on the unit three-ball, then its integral
is Z 1 Z
µ(r)dr ∧ f (r, φ , θ1 , θ2 )µ(φ , θ1 , θ2 )dφ ∧ dθ1 ∧ dθ2 .
0 S3
And this integral is invariant under rotating the angles by means of an SU(2) transformation,
but rotations only act on the latter part, hence for every r
Z
f (r, φ , θ1 , θ2 )µ(φ , θ1 , θ2 )dφ ∧ dθ1 ∧ dθ2
S3
44 T CREUTZIG

must already be SU(2) invariant. So µ(φ , θ1 , θ2 )dφ ∧ dθ1 ∧ dθ2 must be an invariant measure.
It is computed from the Jacobian of the change of coordinates
 d d d d 
dr a dφ a dθ1 a dθ2 a
 d a∗ d a∗ d a∗ d a∗ 
µ(r)µ(φ , θ1 , θ2 ) = det drd dφ dθ1 dθ2 

d
 dr b dφ b dθd 1 b dθd 2 b 

(4.14)
d ∗ d ∗ d ∗ d ∗
dr b dφ b dθ1 b dθ2 b
= r3 sin 2φ
so that
µ(r) = r3 and µ(φ , θ1 , θ2 ) = sin 2φ .
We thus obtain the exact same Haar measure as before.
Let us summarize what we have done so far. We have computed a measure, differential oper-
ators and a Laplace operator on the Lie group. So that we can finally turn to the most important
question. What are functions on the Lie group? And what is a good basis of functions? There
is a very powerful theorem due to Peter and Weyl.

Theorem 26. Let G be a compact Lie group and H a Hilbert space, such that H is a unitary
representation of G, then H is a direct sum of irreducible finite-dimensional representations of
the underlying Lie algebra.

I still haven’t told you what a Lie group is, and we will do that in the next chapter. Here, we
have introduced another word, representation of a Lie algebra. What is that?

Definition 23. Let V be a vector space and g a Lie algebra, then V is called a representation ρ
of g, if there is a linear map
ρ : g → End(V )
from g to the ring of linear operators on V (this is denoted by End(V)), satisfying
ρ([x, y]) = ρ(x)ρ(y) − ρ(y)ρ(x)
for all x, y in g.
Such a representation is called finite-dimensional if V is finite-dimensional and it is called
irreducible if there is no subrepresentation than V itself. This means, there is no subvector space
W of V that is invariant under the action of g.

Let us turn to our example of sl(2; C). C2 is a representation of it, and the action of the Lie
algebra is given by the matrices we used to define the Lie algebra. This is the reason that this
representation carries the special names fundamental representation, standard representation
and also defining representation. Clearly a basis of this representation is given by
 
v= 1 0 and w= 0 1 .
Using the explicit form, we can compute
hv = v, ev = 0, f v = w.
We see that v is an eigenvector of eigenvalue one of h. Such an eigenvalue is denoted as weight
in the theory if Lie algebras. We also see that v is annihilated by e. A vector with such a property
is called a highest-weight vector (of weight one). If we look at our second vector w, we find
hw = −w, ew = v, f w = 0.
MA PH 451 45

So indeed the weight of w is lower (minus one) than the weight of v. Since w has the lowest
weight in this representation, it is called a lowest-weight vector. How do we get more interesting
representations of this Lie algebra? Consider the symmetric product of C2 , Sym2 (C2 ). This is
the vector space with basis {vv, vw, ww}. We can define an action of sl(2; C) by
x(ab) = (xa)b + a(xb)
for all x in sl(2; C) and all a, b in C2 . What is the action on our basis then?
hvv = 2vv, evv = 0, f vv = 2vw,
hvw = 0, evw = ww, f vw = vv,
hww = −2ww, eww = 2vw, f ww = 0.
Is this representation irreducible? It is, and in order to see this, we assume that it is not irre-
ducible. In that case, there must be a vector z = a1 ww + a2 wv + a3 vv for some complex numbers
a1 , a2 , a3 such that this vector is a vector of an sl(2; C)-invariant subspace of Sym2 (C2 ). For
this let i ∈ {1, 2, 3} be such that ai 6= 0 but a j = 0 for j < i. Then
e3−i z = ai vv,
so that vv is in this sub-module, but then applying f to vv gives vw and applying it twice ww,
so that the invariant subspace is already the complete vector space Sym2 (C2 ) and we have a
contradiction to our assumption.
In other words, we obtain the three-dimensional irreducible highest-weight representation
of sl(2; C), the highest-weight is 2, the highest-weight-vector is vv. In this case, ww is the
lowest-weight vector of lowest-weight −2.

Exercise 1. Show that the n-fold symmetric product of C2 carries the n + 1-dimensional irre-
ducible representation of sl(2; C) (you have to show that the representation is irreducible). Find
the highest-weight vector and the highest-weight.

The n-fold symmetric product has basis {vk wn−k }. This vector space has a unique su(2)-
invariant inner product. We fix normalization to be (vn , vn ) = 1. The adjoint of e is f . We
compute
f (vk wn−k ) = kvk−1 wn−k+1 , e(vk wn−k ) = (n − k)vk+1 wn−k−1 ,
so that
   
2 k+1 n−k−1 k+1 n−k−1 k n−k k n−k
(n − k) (v w ,v w ) = (e(v w ), e(v w )
 
= ( f e(vk wn−k ), vk wn−k
 
= (n − k)(k + 1) vk wn−k , vk wn−k .
Iterating this procedure, we find
  (n − k)!k!
vk wn−k , vk wn−k = (vn , vn )
n!
so that we can conclude that there is a positive inner product with orthonormal basis
  12
n
en,k = vk wn−k .
k
46 T CREUTZIG

Now, we want to find highest-weight vectors and thus highest-weight representations that are
functions on our Lie group. Let
ψa,b (φ , θ1 , θ2 )
be a (periodic) function in our three angles φ , θ1 , θ2 and let a, b be non-negative integers. We
want it to be a highest-weight vector of highest-weight n = a + b for the action of sl(2; C) given
by the left-invariant vector fields. In other words, we want ψa,b to satisfy
Lh ψa,b (φ , θ1 , θ2 ) = nψa,b (φ , θ1 , θ2 )
and
Le ψa,b (φ , θ1 , θ2 ) = 0.
The first condition can be satisfied if we write
ψa,b (φ , θ1 , θ2 ) = fa,b (φ )e−iaθ1 −ibθ2 .
It thus remains to find fa,b (φ ). The condition Le ψa,b = 0 translates into
sin φ cos φ 1 d
−a +b − fa,b (φ ) = 0
cos φ sin φ fa,b (φ ) dφ
This equation can be rewritten using
d sin φ d cos φ 1 d d
ln cos φ = − , ln sin φ = , fa,b (φ ) = ln fa,b (φ ).
dφ cos φ dφ sin φ fa,b (φ ) dφ dφ
Namely we get
d   
ln fa,b (φ )(cos(φ ))−a (sin(φ ))−b =0

and hence up to a normalization
fa,b (φ ) = (cos(φ ))a (sin(φ ))b
and the highest-weight vector becomes
ψa,b (φ , θ1 , θ2 ) = (cos(φ ))a (sin(φ ))b e−iaθ1 −ibθ2 .
The theorem of Peter and Weyl tells us, that this function must transform in an irreducible-
highest weight representation of weight n = a + b. But such a representation is the unique
n + 1-dimensional irreducible representation of highest-weight n, call it ρn . It is spanned by
ψa,b , L f ψa,b , L2f ψa,b , . . . , Lnf ψa,b .
Note, that a non-trivial implication of the Peter Weyl theorem is that
Ln+1
f ψa,b = 0.
Computing the basis vectors is both straight-forward and tedious. For example one gets
 
−i(a−1)θ1 −i(b−1)θ2 a−1 b+1 a+1 b−1
L f ψa,b = e −a(cos φ ) (sin φ ) + b(cos φ ) (sin φ )

L2f ψa,b = e−i(a−2)θ1 −i(b−2)θ2 a(a − 1)(cos φ )a−2 (sin φ )b+2 + 2ab(cos φ )a (sin φ )b +

+b(b − 1)(cos φ )a+2 (sin φ )b−2 .
The lowest-weight vector can be computed exactly in the same way as in the highest-weight
case.
MA PH 451 47

Exercise 2. Let a, b be non-negative integers with a + b = n. Make the Ansatz


φa,b (θ1 , θ2 , φ ) = ga,b (φ )eiaθ1 +ibθ2 .
Show that φa,b has weight −n, i.e.
Lh φa,b = −nφa,b .
We require that φa,b is a lowest-weight vector, that is
L f φa,b = 0.
Show that this implies that ga,b satisfies exactly the same differential equation as fa,b before,
that is
d   −a −b

ln ga,b (φ )(cos(φ )) (sin(φ )) = 0.

Conclude that the lowest-weight vector is
φa,b (φ , θ1 , θ2 ) = (cos(φ ))a (sin(φ ))b eiaθ1 +ibθ2 .
In other words, the lowest-weight vector is the complex conjugate of the highest-weight vector.

We thus get for the norm of the highest-weight vector


 Z π Z π Z π
ψa,b , ψa,b = (cos(φ ))2a (sin(φ ))2b sin 2φ dθ1 dθ2 dφ
−π −π −π
Z π
= 8π 2
(cos(φ ))2a+1 (sin(φ ))2b+1 dφ (2 cos φ sin φ = sin 2φ )
−π
Z 1
= 8π 2 (1 − x2 )a x2b+1 dx (x = sinφ , dx = cos φ dφ , (cos φ )2 = 1 − x2 )
−1
Z 1
a
= 8π 2 (1 − x2 )a−1 x2(b+1)+1 dx (4.15)
b+1 −1
..
.
a!b! 1 Z
2
= 8π x2(b+a)+1 dx
(b + a)! −1
a!b!
= 8π 2 .
(b + a + 1)!
We have learnt before, that there is up to normalization of the norm of the highest-weight vector
a unique invariant inner product on a representation. So that we have determined all norms of
all vectors in the highest-weight representation.
There is one more nice observation we can make. The Casimir (the Laplacian in terms of the
vector fields) commutes with all elements of the Lie algebra. Hence all elements of ρn have the
same eigenvalue of the Laplacian. It is most easy to compute this number acting on the lowest-
weight state. However, since we already have an explicit expression for the highest-weight
state, let us rewrite the Laplcacian as
∆ = −Lh Lh + 2Lh − 4Le L f = −Lh Lh − 2Lh − 4L f Le .
Here we used the relation L f Le − Le L f = −h. Now, we can compute
∆ψa,b = −Lh Lh − 2Lh − 4L f Le ψa,b = (−Lh Lh − 2Lh )ψa,b = −n2 − 2n ψa,b .
 
48 T CREUTZIG

In other words, every element in the representation ρn has Casimir eigenvalue −n2 − 2n. We
summarize our findings
Theorem 27. The Hilbert space of square integrable functions on the Lie group SU(2) has basis
n o
Lmf ψa,b (φ , θ1 , θ2 ) a, b ∈ Z≥0 , 0 ≤ m ≤ a + b; ψa,b (φ , θ1 , θ2 ) = (cos(φ ))a (sin(φ ))b e−iaθ1 −ibθ2 .

The highest-weight vectors of highest weight n are the ψa,b (φ , θ1 , θ2 ) with a + b = n and the
functions
ψa,b , L f ψa,b , L2f ψa,b , . . . , Lnf ψa,b
span the irreducible n + 1 dimensional highest-weight representation ρn of sl(2; C) of highest-
weight n. The Laplacian has eigenvalue −n2 − 2n on each function in this representation. As
a representation of the left-invariant vector fields, the space of square-integrable functions on
SU(2) decomposes as
L2µ (SU(2)) =
M
dim(ρn )ρn , (4.16)
n∈Z≥0
since there are dim(ρn ) = n + 1 distinct highest-weight vectors of highest-weight n.
If one also considers the right-invariant vector fields, one can do better. Recall that right-
invariant vector fields commute with the left-invariant ones. A representation of two commuting
copies of a Lie algebra g decomposes as
ρ1 ⊗ ρ2 ,
where ρ1 is a representation of the first copy of the algebra and ρ2 of the second one. The action
of g ⊕ g on a vector v ⊗ w is then defined as
(x ⊕ y)(v ⊗ w) = (xv ⊗ w) ⊕ (v ⊗ yw).
We can rewrite our SU(2) decomposition (4.16) as
L2µ (SU(2)) =
M
ρnL ⊗ Cn+1 .
n∈Z≥0

Here we put an upper index L on the representation to indicate that it is a representation of


the left-invariant vector fields. It is now suggestive that the right-invariant ones act on the
multiplicity vector spaces Cn+1 , and indeed the following theorem is true.
Theorem 28. Under the action of both left- and right-invariant vector fields, the space of square
integrable functions on SU(2) decomposes as
L2µ (SU(2)) =
M
ρnL ⊗ ρnR . (4.17)
n∈Z≥0

In order to prove this statement, you have to show that the n+1 functions ψa,b (with a+b = n)
carry the n + 1-dimensional representation of the sl(2; C)-action given by the right-invariant
vector fields. By the theorem of Peter and Weyl, this amounts to finding a highest-weight vector
of highest-weight n.
Question: Can you formulate an analogous theorem for the circle S1 = U(1)?
4.6. Summary. This section has been quite some work, so let us summarize what we have
done. We have started by learning that the three-sphere carries a group structure, namely of the
Lie group SU(2). We then have decided that harmonic analysis on the three-sphere is harmonic
analysis on the group. For the latter, we had to compute differential operators. These operators
MA PH 451 49

are called invariant vector fields. There are left-invariant ones and right-invariant ones. They
make functions on the Lie group into a representation of two commuting copies of the underly-
ing Lie algebra, (su(2)). We then decided that it is easier to work with the complexification of
this real Lie algebra. The complexification is called sl(2; C). Once having obtained these dif-
ferential operators, we immediately obtained the Laplacian from the Casimir element of the Lie
algebra. The next step was then to find an invariant measure, which has been computed from a
left-invariant one-form, the Maurer-Cartan form. After all this preparation, we have learnt that
sl(2; C) has special representations, called irreducible finite-dimensional highest-weight repe-
sentations. There is a very important theorem due to Peter and Weyl, that tells us that exactly
these representations appear in the harmonic analysis of SU(2). Indeed, we where then able
to explicitly compute highest- and lowest-weight vectors as well as their norm. The norm of
all other elements in the representation is then determined by the one of the highest-weight
vector. The final theorem is then your homework problem, to find the decomposition of square
integrable functions on SU(2) under both left- and right-action of the underlying Lie algebra of
vector fields.

4.7. Exercises.

(1) Proof Theorem 28.


In other words, compute the right-invariant vector fields in a similar manner as we
have done it for the left-invariant ones. The first step for that is the same as for the left-
invariant ones. The second step is also very analogous to the case of the left-invariant
vector fields, namely it is to compute the right-action of the matrices corresponding to
the Lie algebra on the Lie group valued function g. Then you have to solve equations of
type (4.6), that is for example ge = Dα,β ,γ g gives you Re .
Secondly, you need to use these vector fields to find highest-weight vectors, that is
functions ψ that satisfy the differential equations
Re ψ = 0, Rh ψ = nψ.

4.8. Solutions.

(1) The first task is to compute the right-invariant vector fields. We use the same parame-
terization of a Lie group valued matrix g as in the lecture. We then compute
0 eiθ1 cos φ
 
ge =
0 −e−iθ2 sin φ
 iθ
−eiθ2 sin φ

e 1 cos φ
gh =
−e−iθ2 sin φ −e−iθ1 cos φ
 iθ 
e 2 sin φ 0
g f = −iθ1
e cos φ 0
The next step is to solve the differential matrix equations
Dα,β ,γ g = gX
50 T CREUTZIG

for X = e, h, f . This gives four equations for the three functions α, β , γ. There must be
a unique solution in each case and indeed the answer to these equations turns out to be
i sin φ i(θ1 −θ2 ) d i cos φ i(θ1 −θ2 ) d 1 d
Re = − e − e + ei(θ1 −θ2 ) ,
2 cos φ dθ1 2 sin φ dθ2 2 dφ
 
d d
Rh = −i − ,
dθ1 dθ2
i sin φ −i(θ1 −θ2 ) d i cos φ −i(θ1 −θ2 ) d 1 d
Rf = − e − e − e−i(θ1 −θ2 ) .
2 cos φ dθ1 2 sin φ dθ2 2 dφ
we are now looking for functions φa,b (θ1 , θ2 , φ ) with a + b = nsatisfying the system of
differential equations
Rh φa,b = nφa,b , Re φa,b = 0.
For this we make the separation of variables Ansatz
φa,b (θ1 , θ2 , φ ) = ga,b (φ )ei(aθ1 −bθ2 )
so that the equation Rh φa,b = nφa,b is satisfied. Plugging the Ansatz in the second dif-
ferential equation, we can solve it exactly with the same technique as in the lecture to
get that
ga,b (φ ) = (cos φ )a (sin φ )b .
So that we get n + 1 solutions for our differential equation. These are all highest-weight
vectors for the n + 1 dimensional irreducible representation of sl(2; R). But especially,
we see that φ0,n = ψ0,n is both a highest-weight vector for the left and for the right
regular action. Since these two actions commute this is a highest-weight vector for the
tensor product and the theorem follows.

5. L IE G ROUPS
In this section, we will repeat part of the analysis of last section in a much more general
setting. We will learn about important concepts of Lie groups. I use the book by Daniel Bump
[Bu] as reference.
Definition 24. Let G be a n-dimensional real manifold, and let {Uα } be a set of open subsets
of G that cover G, such that each open subset looks like Rn in the sense that there is a bijective
map
φα : Uα → Rn
that is bi-continuous. In other words both φα and its inverse are continuous maps. A transition
map is then a map
−1
φαβ = φβ ◦ φα
φα (Uα ∩Uβ )
 
from φα Uα ∩Uβ to φβ Uα ∩Uβ . A manifold is called smooth if all transition maps are
smooth, that is infinitely differentiable. A Lie group G (over R) is a smooth manifold and also
a group, such that group multiplication and inversion are smooth maps.
Example 7. A list of examples are
(1) Rn with group operation addition of vectors.
(2) R \ {0} with group operation multiplication.
(3) positive real numbers with group operation multipliation
MA PH 451 51

(4) the unit circle S1 . We have called this the Lie group U(1).
(5) real invertible n × n matrices. This is called the Lie group GL(n, R). Most Lie groups
of interests will turn out to be subgroups of the complex Lie group GL(n, C).
(6) unitary n × n matrices of determinant one, the special unitary Lie group SU(n).
As a physicist, we are interested in problems, where the Lie group is the continuous group
of symmetries of the problem. A physical observable is then an objective that carries an action
of the symmetry, that is an action of the Lie group. There are two important notions of action:
representation and action.
Definition 25. An action of a Lie group G on a manifod M is an assignment ρ, that assigns to
each g in G a diffeomorphism ρ(g) on M, with the properties that this assignment is compatible
with the group structure, that is
ρ(1) = Id, ρ(gh) = ρ(g)ρ(h)
for all h, g in the Lie group. In addition one wants ρ(g) to be a smooth map on M.
A representation of a Lie group G is a vector space V together with a group morphism
ρ : G → End(V ). We will be interested in finite-dimensional representations, that is V = Rn or
V = Cn . In that case End(V) is the Lie group of invertible (real or complex) n × n matrices. The
map ρ is then a morphism of Lie groups, which means it is a smooth map that respects the Lie
group structure in the sense, that ρ(gh) = ρ(g)ρ(h) and the image of the identity is the identity
matrix.
Example 8. A very important example of a group action is the group action on a coset space.
Let G be a Lie group and H be a sub Lie group of G, that is a subset of G that itself is a Lie
group, then the space of cosets
G/H := {gh|g ∈ G, h ∈ H, gh = g0 h0 if g−1 g0 ∈ H}
is called the coset G/H of G. It carries an action of G by multiplication from the left. Similarly
one can also define the coset
H \ G := {hg|g ∈ G, h ∈ H, hg = h0 g0 if g0 g−1 ∈ H}
it then carries an action of the Lie group by multiplication from the right. Important interesting
manifolds like spheres can be constructed as such coset manifolds.
A very important example of a representation of a Lie group is the Hilbert space of square
integrable function on the Lie group, denoted by
L2µ (G),
where µ = µ(G) refers to the Haar measure, that is a measure that respects the group action.
This measure is the weight function of the Hilbert space, and part of our coming analysis will
also be concerned with this measure. Also the Hilbert space of square integrable function on
a coset space is a representation of a Lie group. The analysis of this Hilbert space uses the
resulting analysis of the parent Lie group Hilbert space. We will however not look at such
examples, as it is getting more and more complicated.
We will now describe harmonic analysis on a Lie group in more detail. Harmonic analysis
is nothing but quantum mechanics on the group manifold. One might also like to view it as
a somehow semi-classical limit of quantum field theory on the Lie group, so in any case it is
very important for a theoretical quantum physicist. We will see, that we essentially need to
52 T CREUTZIG

understand a nice class of representations of Lie algebras. The most important theorem is the
famous theorem of Peter and Weyl that we have already mentioned in the previous chapter.
Definition 26. Let H be a Hilbert space, a representation ρ : G → End(H) is called unitary if
the inner product respects the Lie group in the sense that
(ρ(g)v, ρ(g)w) = (v, w)
for all g in G and all v, w in H.
If H is the Hilbert space of square integrable functions on G, then every function f carries
two-commuting actions of G, the left-regular action defined by
Lh : H → H, f (g) 7→ f (gh).
and the right-regular one
Rh : H → H, f (g) 7→ f (h−1 g)
The Peter-Weyl theorem then has two versions. Both of them you have already seen in the
example of SU(2) in the last chapter.
Theorem 29. (Peter and Weyl)
(1) Let H be a Hilbert space and ρ : G → End(H) a unitary representation of a compact
group G. Then H is a direct sum of finite-dimensional irreducible representations.
(2) Let H = L2µ (G) be the Hilbert space of square integrable functions of a compact Lie
group G. Then H is a unitary representation of G. Moreover, let R be the set of all
finite-dimensional unitary representations of G, then under the left-right action of G,
the Hilbert space decomposes as
L2µ (G) =
M
ρ L ⊗ ρ̄ R .
ρ∈R

Here, by ρ̄ we denote the representation conjugate to ρ, meaning that the one-dimensional


representation is contained in the tensor product of the two representations and hence there
exists a linear map on this product.
Compare this result with the last theorem of last chapter. We will now turn on the relation
between Lie group and Lie algebra, and then rephrase this theorem in terms of representations
of the Lie algebra of invariant vector fields. We start with the invariant measure.
5.1. The Haar measure. We require that G is a locally compact Lie group. This means that
every point x in G has a compact neighbourhood. This is not a severe restriction, as every
compact Lie group is especially locally compact, but also Rn is locally compact and more
generally every Lie group that looks locally like Euclidean space is locally compact. Looking
like Euclidean space means that every point has an open neighbourhood that is homeomorphic
(there is a bi-continuous map) to Euclidean space.
Definition 27. A measure µ on a locally compact group G is called a left Haar measure if it is
regular, that is
µ(X) = inf{µ(U)|U ⊃ X,U open} = sup{µ(K)|K ⊂ X, K compact}
and if it is left-invariant, that is µ(X) = µ(gX)
Such a measure has the property, that any compact set has finite measure and any nonempty
open set has measure > 0.
MA PH 451 53

Theorem 30. If G is a locally compact group, then there is a unique left Haar measure.

Left-invariance of the measure amounts to left-invariance of the integral of an integrable


function f Z Z
f (γg)dµ(g) = f (g)dµ(g).
G G
There is also a unique right Haar measure. Left and right measure do not necessarily coincide.
For example, let   
y x
G= |x, y ∈ R, y > 0
0 1
then one can show that the left measure is
dµL = y−2 dydx,
but the right measure is
dµR = y−1 dydx.

Definition 28. A Lie group G is called unimodular if left and right Haar measure coincide.

So unimodular Lie groups are in a sense very symmetric. These are the type of Lie groups
appearing frequently in physical problems. Let g in G, then g acts on G via conjugation, that is
h 7→ ghg−1 . Every conjugation is an automorphism of G, since
g1g−1 = 1, ghg−1 gh0 g−1 = ghh0 g−1 .
Every automorphism takes a measure to another measure, so conjugation maps the left Haar
measure to another left Haar measure. Uniqueness implies that this must be a constant multiple
of the original measure. Thus for every g in G, there is δ (g) > 0, such that
Z Z Z
f (g−1 hg)dµL (h) = g(h)dµL (ghg−1 ) = δ (g) f (h)dµL (h).
G G G

Definition 29. A quasicharacter is a continuous homomorphism


χ : G → C \ {0}.
If |χ(g)| = 1 for all g in G, then χ is called a unitary quasicharacter.

Proposition 31. The function


δ : G → R>0
is a quasicharacter and the measure δ (h)µL (h) is right invariant.

Proof. Conjugation by first g1 and then by g2 is the same as conjugating by g1 g2 . Thus


δ (g1 )δ (g2 ) = δ (g1 g2 ) and hence the map is a homomorphism, it is also continuous and hence
a quasicharacter. Using the left invariance of the Haar measure, we get
Z Z Z Z
−1 −1
δ (g) f (h)dµL (h) = f (g hg)dµL (h) = f (gg hg)dµL (h) = f (hg)dµL (h).
G G G G
Replace f by δ f , so that
Z Z
δ (g) f (h)δ (h)dµL (h) = f (hg)δ (hg)dµL (h).
G G
Now using that δ is a homomorphism and dividing by δ (g) gives the result.

Proposition 32. If G is compact, then G is unimodular and µ is finite.


54 T CREUTZIG

Proof. The map δ is a homomorphism, so that its image must be a subgroup of R>0 . Since G
is compact and δ is continuous its image must also be compact. The only compact subgroup
of the positive real numbers is {1}. Thus δ (g) = 1 for all g in G. Hence the left Haar measure
coincides with the right Haar measure. The volume of a compact subset of a locally compact
group is finite, so the volume of G must already be finite.
Finally, another useful result is
Proposition 33. If G is unimodular, then g 7→ g−1 is an isometry.
Proof. The map g 7→ g−1 turns a left-invariant measure into a right-invariant measure. If both
measures agree, then the map g 7→ g−1 multiplies the left Haar measure by a positive constant.
Since the map is of order two, this constant must be one.
We now turn to the essential example of Lie groups, these are subgroups of GL(n, C).
5.2. Lie subgroups of GL(n, C). Every classical Lie group is such a subgroup, so actually we
are studying a quite generic situation in this section. To get an idea we list the standard examples
Example 9. Interesting subgroups of GL(n, C) are
(1) The orthogonal (compact) group O(n),
O(n) = g ∈ GL(n, R)|ggt = I


(2) The unitary (compact) group U(n),


U(n) = g ∈ GL(n, C)|g(g∗ )t = I


(3) The special (compact) unitary group SU(n),


SU(n) = {g ∈ U(n)|detg = 1}
(4) The symplectic group (compact) Sp(2n)
Sp(2n) = g ∈ GL(2n, R)|gJgt = J


for  
0 −In
J=
In 0
Exercise 3. Show, that the sets of the example are groups.
We want to find the Lie algebra of these groups. So let us recall the definition.
Definition 30. A Lie algebra g is a vector space g together with a bilinear operation
[ , ] : g×g → g
satisfying
[U, T ] = −[T,U] (antisymmetry)
and
[U, [S, T ]] + [T, [U, S]] + [S, [T,U]] = 0 Jacobi identity
for all U, S, T in g.
Let Matn (C) be the vector space of n × n matrices over the complex numbers. The Lie group
GL(n, C) is the group of all invertible such matrices. The important map is the exponential map:
1 1
exp : Matn (C) → GL(n, C), exp(X) = I + X + X 2 + X 3 + . . . (5.1)
2 6
MA PH 451 55

This series converges for every X. You prove this by observing this statement for matrices in
Jordan normal form and then use that every matrix can be brought into such a form. Let G be a
Lie sub group of GL(n, C), we want to somehow think of the Lie algebra of G, called Lie(G) as
all matrices X, such that exp(X) is in G. We will now explain this intuitive picture.

Proposition 34. Let U be an open subset of Rn , and let x in U. Then there is a smooth function
f with compact support contained in U that does not vanish at x.

Proof. Assume that x = (x1 , . . . , xn ) is the origin (otherwise translate the point x into the origin).
Define     
exp − 1 − |x|2 −1 if |x| ≤ r
f (x1 , . . . , xn ) = r2 .
0 otherwise

The function f is smooth, and has support inside the ball |x| ≤ r. For sufficiently small r it
vanishes outside of U.

Definition 31. Let G be a Lie group, and let g be a point in G. Let γi : (−1, 1) → G be a
curves with γi (0) = g for i = 1, 2. Then choose an open set U ⊂ G containing g and a chart
φ : U → Rn . γ1 and γ2 are equivalent if φ ◦ γ1 = φ ◦ γ2 . The set of such equivalence classes is
called the tangent space of G at the point g.

Proposition 35. Let G be a subgroup of GL(n, C) and let X be an n × n matrix over C. Then the
path t → exp(tX) is tangent to G at the identity if and only if it is contained in G for all t.

Proof. If exp(tX) is contained in G, then it is tangent to G at the identity. Suppose that there
is t0 > 0 with exp(t0 X) not an element of G, but the path is still tangent to G. We will derive a
contradiction to this assumption. With the previous proposition, we know that there is a smooth
compactly supported function φ0 on GL(n, C) with φ0 (g) = 0 for all g in G and φ0 (exp(t0 X)) 6=
0. Let Z
f (t) = φ (exp(tX)), φ (h) = φ0 (hg)dµ
G
with the left Haar measure dµ on G. So that φ is constant on the left cosets hG of G and
especially vanishes on G, but is non-zero at exp(t0 X). For any t, we can write the derivative as
d
f 0 (t) = φ (exp(tX)exp(uX)) =0

du u=0
since the path exp(tX)exp(uX) is tangent to the coset exp(tX)G and φ is constant on such cosets.
Moreover f (0) = 0 and hence f (t) = 0 for all t, but this is a contradiction to f (t0 ) 6= 0.

Corollary 36. Let G be a subgroup of GL(n, C). The set Lie(G) of all X ∈ Matn (C) such that
exp(tX) ⊂ G is a vector space whose dimension is equal to the dimension of G as a manifold.

Proposition 37. Let G be a subgroup of GL(n, C). The map


X → exp(X)
gives a diffeomorphism of a neighborhood of the identity in Lie(G) onto a neighborhood of the
identity in G.

Proof. Recall the expansion of exp(X) = I + X + X 2 /2 + . . . so that the Jacobian of exp at the
identity is one. Hence exp induces a diffeomorphism of an open neighborhood of the identity
in Matn (C) onto a neighborhood of the identity in GL(n, C). Since Lie(G) is a vector space of
56 T CREUTZIG

the same dimension as G the Inverse Function Theorem (you might have had that in analysis)
implies that Lie(G) ∩U must be mapped onto an open neighborhood of the identity in G.
Proposition 38. Let G be a subgroup of GL(n, C), then for X,Y in Lie(G), also [X,Y ] is in
Lie(G).
Proof. Using the expansion of exp, we see that
exp etX Ye−tX = etX exp(Y )e−tX ,


so that with Y also etX Ye−tX is an element of Lie (G). Thus Lie (G) contains
1 tX −tX t
−Y = XY −Y X = X 2Y − 2XY X +Y X 2 + . . .
 
e Ye
t 2
this is true for all t. Taking the limit t → 0 shows that [X,Y ] = XY −Y X is also in Lie(G).
In the algebra chapter, we have seen, that the commutator gives an associative algebra the
structure of a Lie algebra.
Theorem 39. Lie(G) is a Lie subalgebra of gl(n, C) of same dimension as G.
Example 10. The important examples are
(1) The Lie algebra of O(n, R) is o(n, R). It consists of all matrices X satisfying X + X t = 0.
(2) The unitary Lie algebra is denoted by u(n, R) and consists of all complex matrices X
satisfying X ∗ + X t = 0.
(3) The special unitary Lie algebra is denoted by su(n, R) and consists of all traceless uni-
tary matrices.
(4) The symplectic Lie algebra is denoted by
 sp(2n, R) and consists of all matrices X satis-
0 −In
fying XJ + JX t = 0 with J = .
In 0
In order to see that these examples are indeed the Lie algebras of the Lie groups presented
at the beginning of this section, one has to exponentiate the relations. For example, let X in
o(n, R), then tX t = −tX for all t and hence
exp(tX)−1 = exp(tX)t
so that exp(tX) is in O(n, R) for all t. Thus o(n, R) is a subgroup of Lie(O(n, R)). For the
converse direction, suppose that X in Lie(O(n, R)), then for all t
I = exp(tX)exp(tX)t
  
1 2 2 t 1 2 t 2
= I + tX + t X + . . . I + tX + t X + . . .
2 2
 1  2 
= I + t X + X t + t 2 X 2 + 2XX t + X t +...
2
This is true for all t, hence X + X t = 0 and hence Lie(O(n, R)) is o(n, R).
5.3. Left-Invariant vector fields. Let G be a Lie group. G acts on itself by multiplication from
the left. Let us call this action Lg ,
Lg : G → G, h 7→ gh.
Consider the Tangent space at a point h, that is Th (G). Recall that this space was the space of
equivalence classes of curves γ : (−1, 1) → G with γ(0) = h. If we compose such a curve with
MA PH 451 57

Lg , we get a tangent vector at the point gh. We thus have defined a map
Lg,∗ : Th → Tgh , γ 7→ Lg ◦ γ.

Definition 32. A vector field X on a Lie group G is a collection of assignments, that assigns to
each g ∈ G an element Xg in Tg (G). A vector field X on G is called left-invariant if
Lg,∗ (Xh ) = Xgh .

Proposition 40. The vector space of left-invariant vector fields is closed under the commutator
[ , ] and is a Lie algebra of dimension dim(G). If e is the identity of G and if Xe in Te (G),
then there is a unique left-invariant vector field X on G with the prescribed tangent vector at the
identity.

Proof. Given a tangent vector at the identity Xe , a left-invariant vector field is defined by Xg =
Lg,∗ (Xe ). Conversely every left-invariant vector field must satisfy this identity. Hence the space
of left-invariant vector fields is isomorphic to the tangent space of G at the identity. Therefore
its vector space dimension equals the dimension of G. The Lie algebra structure is given by the
tangent vectors at the identity.

The Lie algebra of a Lie group, is the Lie algebra of the left-invariant vector fields. However,
we have in the previous section already defined the Lie algebra of GL(n, C) and its subgroups
in terms of n × n matrices. We thus need to see that the two definitions define isomorphic Lie
algebras. let G = GL(n, C). Then both the tangent space and the Lie algebra of n × n matrices
are of dimension n2 . Now, let X be an n × n matrix. The tangent space at the identity may be
identified with n × n matrices, as it is a vector space of that dimension. This means to each X,
we can define a differential operator LX via
d
LX f (g) = f (g exp(tX))

dt t=0
for every square integrable function f on G. This defines a left-invariant derivation on the space
of functions. If one studies vector fields, one would learn that they are derivations, and that the
LX are exactly the left-invariant vector fields. In practice, one computes the vector fields as we
did it in the case of SU(2). That is one tries to find a suitable parameterization of the Lie group,
and then one computes derivatives that one can combine so that they act as the left-invariant
vector fields. The right invariant vector fields are defined analogously
d
RX f (g) = f ( exp(−tX)g)

dt t=0
In summary, we have seen how the action of the Lie group induces one of the underlying Lie
algebra. The Lie algebra of a Lie group is the Lie algebra of infinitesimal transformation along
the directions tangent to the Lie group. Whether we study the action of the Lie group, or its
Lie algebra is thus very much related. Especially, every irreducible unitary representation of a
compact Lie group is also an irreducible unitary one of its Lie algebra. The Peter Weyl theorem
can thus be rephrased.

Theorem 41. (Peter and Weyl)


Let H = L2µ (G) be the Hilbert space of square integrable functions of a compact Lie group G.
Then H is a unitary representation of Lie(G). Moreover, let R be the set of all finite-dimensional
unitary representations of Lie(G), then under the left-right action of Lie(G), the Hilbert space
58 T CREUTZIG

decomposes as
L2µ (G) =
M
ρ L ⊗ ρ̄ R .
ρ∈R

This is exactly our final result of the harmonic analysis on SU(2). Recall, that once we knew
all irreducible representations of the Lie algebra su(2), we had to find functions that transform
in these representations under the action of the vector fields. finding this action amounted
to solving a set of first order differential equation. Our next goal is thus to understand the
representation theory of the Lie algebras of compact Lie groups. Before we turn to this goal,
we will look at another example of harmonic analysis. Unfortunately, for larger Lie groups than
SU(2) harmonic analysis becomes very cumbersome. However, there are also non-compact Lie
groups and Lie supergroups. Studying them is usually much harder than compact Lie groups.
But there is one Lie supegroup, that is the super-analogue of the commutative Lie group R. This
Lie supergroup can be nicely explicitly studied. In order to repeat and to illustrate some of the
objects we have introduced we will study this Lie supergroup now.

5.4. Example of a Lie supergroup. Lie supergroups are a natural generalization of Lie groups.
From a physics point of view, they describe physics in a somehow supersymmetric world.
Whether such a world is realistic is of course very arguable. However, there is one applica-
tion. The biggest success of string theory is the AdS/CFT conjecture of Juan Maldacena. It
says that string theory on a supermanifold of Anti-de-Sitter type is dual to super Yang-Mills
theory on the asymptotic boundary of the manifold. This conjecture is so important, because
gauge theories like Yang-Mills only allow for a theoretical description within a weakly cou-
pled regime. This duality gives then a description of a fairly realistic gauge theory beyond the
weakly coupled regime. The supermanifolds appearing in this description are Lie supergroups
and their cosets.
We are interested in the Lie supergroup GL(1|1). Its Lie algebra is called gl(1|1) and it is
generated by four elements, two bosonic ones E, N and two fermionic ones ψ ± . The super-
commutators are
[E, N] = [N, E] = [E, ψ ± ] = [ψ ± , E] = 0
[N, ψ ± ] = −[ψ ± , N] = ±ψ ±
[ψ + , ψ − ] = [ψ − , ψ + ] = E, [ψ ± , ψ ± ] = 0.
As Lie algebras can often be thought of as Lie algebras of matrices, we can do the same for this
Lie superalgebra. Namely the identification
   
e 0 n 0
ρe,n (E) = , ρe,n (E) =
0 e 0 n−1
   
+ 0 e − 0 0
ρe,n (ψ ) = , ρe,n (ψ ) =
0 0 1 0
defines a Lie superalgebra homomorphism from gl(1|1) into the space of 2 × 2 supermatrices.
Here all the name super means is, that the commutator of two matrices A and B is defined as
[A, B] = A0 B0 − B0 A0 + A0 B1 − B1 A0 + A1 B0 − B0 A1 + A1 B1 + B1 A1 ,
where for      
a b a 0 0 b
A= , A0 = , A1 = .
c d 0 d c 0
MA PH 451 59

The even subalgebra is generated by E and N, and these two elements commute. Hence the
even subalgebra is just the two-dimensional commutative Lie algebra and its Lie group is R2
(or if we compactify S1 × S1 ). While a Lie superalgebra is not a Lie algebra, a Lie supergroup
is a Lie group, but not over the complex numbers. We also have to allow fermionic numbers.
Such numbers are called odd Grassmann numbers, and two such numbers η± satisfy
η±2 = 0, η+ η− = −η− η+ .
We have learnt in the previous sections that it is a good idea to think about the Lie group of
a Lie algebra as the exponentials of the Lie algebra. For our case the Lie algebra consists of
elements of the form
aE + bN + θ+ ψ + + θ− ψ− ,
where a and b are Grassmann even numbers (like for example complex numbers) and η± are
Grassmann odd. Hence a Lie supergroup element is of the form
exp xE + yN + θ+ ψ + + θ− ψ− .

(5.2)
If you prefer matrices, then our above Lie superalgebra homomorphism induces one on the Lie
supergroup and the image are invertible matrices of the form
 x 
e θ+
(5.3)
θ− ey
for Grassmann even numbers x, y and odd ones θ± . We thus have two ways to parameterize
the Lie supergroup. Let us now use both of them to compute the invariant vector fields. The
left-invariant ones are defined by
LX g = gX
for X in E, N, ψ + and L− g = −gψ − , respectively by
LXe,n g = gρe,n (X)
e,n
for X in E, N, ψ + and L− g = −gρe,n (ψ − ). We start with the matrix form and proceed as in the
SU(2) case. We use the ρ1,1/2 representation as it is fairly symmetric.

(1) We first modify our parameterization a little bit to


 y 
x e θ+
g(x, y, θ± ) = e
θ− e−y
(2) We then compute
 y 
d d x e 0
g = g, g=e ,
dx dy 0 −e−y
    (5.4)
d x 0 1 d x 0 0
g=e , g=e
dθ+ 0 0 dθ− 1 0
60 T CREUTZIG

(3) We then compute the left action of the corresponding matrices


 
1 0
g =g
0 1
1 x ey −θ+
   
1 1 0
g = e
2 0 −1 2 θ− −e−y
   −y
 (5.5)
0 1 x θ− e
g =e
0 0 0 0
   
0 0 x 0 0
g =e
1 0 ey θ+
(4) Combining (5.4) and (5.5), we then get the invariant vector fields
1,1/2 d
LE =
dx 
1,1/2 1 d d d
LN = − θ+ + θ−
2 dy dθ+ dθ−
      (5.6)
1,1/2 −y θ− d d θ− θ+ d
L+ = e + + 1−
2 dx dy 2 dθ+
     
1,1/2 y θ+ d d θ+ θ− d
L− = −e − + 1−
2 dx dy 2 dθ−
In order to redo an analogous computation for the parameterization (5.2), it is again better to
pass to a modified parameterization. In this case it is
− +
g = eθ− ψ exE+yN eθ+ psi .
One can now proceed as in the previous case in computing the invariant vector fields. We will
only give the result.
d d d
LE = , LN = − θ+
dx dy dθ+
(5.7)
d −y d d
L+ = , L− = −e + θ+ .
dθ+ dθ− dx
It is a good exercise with Grassmann variables that indeed these differential operators satisfy
the defining equations of the invariant vector fields. Another good exercise is to verify that
they satisfy the commutation relations of the Lie superalgebra. Having these vector fields let us
compute the Haar measure. We proceed in the following steps

(1) The ordinary differential operators are in terms of the invariant vector fields
d d d d
= LE , = LN + θ+ L+ , = L+ , = −ey L− − ey θ+ LE .
dx dy dθ+ dθ−
(2) The Maurer-Cartan one form is
       
−1 −1 d −1 d −1 d −1 d
g dg = g g dx + g g dy + g g dθ+ + g g dθ−
dx dy dθ+ dθ−
= g−1 (LE dx + (LN + θ+ L+ )dy + L+ dθ+ − ey (L− + θ+ LE )dθ− )g
= Edx + N + θ+ ψ + dy + ψ + dθ+ + ey ψ − + θ+ E dθ−
 

= E(dx + θ+ dθ− ) + Ndy + ψ + dθ+ + ψ − ey dθ−


MA PH 451 61

(3) Hence the dual one forms are ω(E) = (dx + θ+ dθ− ), ω(N) = dy, ω(ψ + ) = dθ+ and
ω(ψ − ) = ey dθ− .
(4) the Haar measure is then almost the exterior power of the dual one forms, it is
dµ = e−y dxdydθ− dθ+ .
Functions on the supergroup are then functions on R2 combined with odd functions. More
precisely a basis of functions consists of the following
f0 (e, n); = eiex+iny , f± (e, n) = θ± f0 (e, n), f2 (e, n) = θ+ θ− f0 (e, n).
We see that all fi (e, n) have LE eigenvalue ie and that the Ln eigenvalue of f0 (e, n) and f− (e, n)
is in, while the one of f+ (e, n) and f2 (e, n) is i(n − 1). We say that the vector i(e, n) and
i(e, n − 1) are the weight of f0 (e, n) and f− (e, n) respectively of f+ (e, n) and f2 (e, n). We also
see that f0 (e, n) and f− (e, n) are annihilated by L+ . We call L+ an annihilation operator and L−
a creation operator. Then in analogy to sl(2), we can call f0 (e, n) and f− (e, n) highest-weight
vectors of highest-weight i(e, n). The action of L− then maps as
L− f0 (e, n) = −ie f+ (e, n), L− f− (e, n) = −ie f2 (e, n) − f0 (e, n − 1)
and f0 (e, n) together with f+ (e, n) forms a two-dimensional representation of gl(1|1). The same
is true for f− (e, n) and ie f2 (e, n) + f0 (e, n − 1). These representations are irreducible unless
e = 0. In that case we get four-dimensional indecomposable but reducible modules. We will
terminate our analysis of this example here. The purpose was, that you see once again how to
explicitly analyse a Lie (super)group.

6. L IE A LGEBRAS
In the last section, we have already encountered Lie algebras and its representation theory.
We will now turn to a more general study of it. I use the book by Humphries [Hu] as reference.
We have already had the definition of a Lie algebra

Definition 33. A Lie algebra g is a vector space g together with a bilinear operation
[ , ] : g×g → g
satisfying
[U, T ] = −[T,U] (antisymmetry)
and
[U, [S, T ]] + [T, [U, S]] + [S, [T,U]] = 0 Jacobi identity
for all U, S, T in g.

Let φ : g → g0 be a vector space homomorphism. It is called a Lie algebra homomorphism if


both g and g0 are Lie algebras and if
φ ([x, y]) = [φ (x), φ (y)]
for all x, y in g. If one wants to work concretely with Lie algebras, a useful concept are the
structure constants. For this let g be a finite-dimensional Lie algebra and {x1 , . . . , xn } be a basis
of g. Its structure constants are defined as
n
[xi , x j ] = ∑ fi j k xk ,
k=1
62 T CREUTZIG

so that antisymmetry implies


n  
0 = [xi , x j ] + [x j , xi ] = ∑ k
fi j + f ji k
xk
k=1
and hence
fi j k + f ji k = 0
for all i, j, k in 1, . . . , n. The Jacobi identity implies
n  
0 = ∑ fi j k fkl m + f jl k fki m + fli k fk j m ,
k=1
which can be derived analogously to the previous case. The main examples of this section will
be the special linear Lie algebras, that are the Lie algebras of traceless square matrices. For
small n, explicite bases are

Example 11.

(1) for n = 2, we already have encountered sl(2; R). It is generated by e, f , h. they corre-
spond to the matrices
     
0 0 1 0 0 1
f= , h= , f= ,
1 0 0 −1 0 0
and their commutation relations are
[h, e] = 2e, [h, f ] = −2 f , [e, f ] = h.
In order to have the same notation as the following two examples one needs to redefine
h = hα , e = eα , f = fα .
(2) The Lie algebra sl(3; R) is generated by eight elements, which we call fα1 , fα2 , fα1 +α2 ,
hα1 , hα2 , eα1 , eα2 , eα1 +α2 . In terms of matrices, the f 0 s are the lower triangular matrices
     
0 0 0 0 0 0 0 0 0
fα1 = 1 0 0, fα2 = 0 0 0, fα1 +α2 = 0 0 0,
0 0 0 0 1 0 1 0 0
the h0 s are the traceless diagonal matrices
   
1 0 0 0 0 0
hα1 = 0 −1 0, hα2 = 0 1 0 
0 0 0 0 0 −1
and the e0 s are the upper triangular matrices
     
0 1 0 0 0 0 0 0 1
eα1 = 0 0 0, eα2 = 0 0 1, eα1 +α2 = 0 0 0.
0 0 0 0 0 0 0 0 0
MA PH 451 63

The non-zero commutation relations are


[ fα1 , fα2 ] = − fα1 +α2 , [eα1 , eα2 ] = eα1 +α2 ,
[hα1 , fα1 ] = −2 fα1 , [hα1 , eα1 ] = 2eα1 ,
[hα1 , fα2 ] = fα2 , [hα1 , eα2 ] = −eα2 ,
[hα1 , fα1 +α2 ] = − fα1 +α2 , [hα1 , eα1 +α2 ] = eα1 +α2 ,
[hα2 , fα1 ] = fα1 , [hα2 , eα1 ] = −eα1 ,
(6.1)
[hα2 , fα2 ] = −2 fα2 , [hα2 , eα2 ] = 2eα2 ,
[hα2 , fα1 +α2 ] = − fα1 +α2 , [hα2 , eα1 +α2 ] = eα1 +α2 ,
[eα1 , fα1 ] = hα1 , [eα2 , fα2 ] = hα2 , [eα1 +α2 , fα1 +α2 ] = hα1 + hα2 ,
[eα1 , fα1 +α2 ] = − fα2 , [eα2 , fα1 +α2 ] = fα1 ,
[ fα1 , eα1 +α2 ] = eα2 , [ fα2 , eα1 +α2 ] = −eα1 .
You see that these are quite a lot of relations.
(3) For sl(n; R), let Ei, j be the matrix that has entry zero everywhere except a one at the
j − th position of row number i. Then define
hαi = Ei,i − Ei+1,i+1 , eαi = Ei,i+1 , fαi = Ei+1,i
and further generators are defined similarly to the sl(3; R) case.
A Lie algebra g acts on itself via the commutator. This action is called the adjoint represen-
tation
ad : g → End(g), ad(x)(y) = [x, y].
More generally a representation ρ of a Lie algebra g is a vector space V together with a Lie
algebra homomorphism
ρ : g → End(V ).
Where a Lie algebra homomorphism is defined as follows. A vector space homomorphism (over
our usual field C or R) is a map
ρ :V →W
of vector spaces that is linear, meaning that
ρ(ax + by) = aρ(x) + bρ(y)
for all a, b in our field and all x, y in our vector space V . Both a finite-dimensional Lie algebra g
and the space of linear operators (matrices) on a vector space are themselves vector spaces. A
vector space homomorphism
ρ : g → End(V )
is called a Lie algebra homomorphism, if g is a Lie algebra and if
[ρ(x), ρ(y)] = ρ([x, y])
for all x, y in g. If V is a finite-dimensional vector space, then End(V ) is a subalgebra of the
algebra of square matrices acting on V . Matrices have a very natural bilinear form, the trace.
The trace is denoted by tr and it is the sum of the diagonal entries of a given matrix.
Definition 34. A bilinear form B : g × g → is called non-degenerate if for every non-zero x in g
there exists y in g with B(x, y) 6= 0. The bilinear form is called symmetric if
B(x, y) = B(y, x)
64 T CREUTZIG

and invariant if
B(x, [y, z]) = B([x, y], z)
for all x, y, z in g.

Proposition 42. Let A, B,C be three n × n matrix then


tr(AB) = tr(BA)
and
tr(A[B,C]) = tr([A, B]C).

Proof. Let ai j be the entry on position (i, j) of the matrix A and bi j and ci j correspondingly for
the matrices B and C. The trace is the sum over the diagonal elements, hence
tr(AB) = ∑ ai j b ji = ∑ bi j a ji = tr(BA).
1≤i, j≤n 1≤i, j≤n

Using this identity for the two matrices AC and B, we compute


tr(A[B,C]) − tr([A, B]C) = tr(A(BC −CB)) − tr((AB − BA)C)
= tr(−ACB + BAC)
= tr(BAC − BAC) = 0.

This is nice, it tells us that every matrix represention of a Lie algebra defines an invariant
symmetry bilinear form

Corollary 43. Let ρ : g → End(V ) be a finite-dimensional representation of a finite-dimensional


Lie algebra g, then
B(x, y) = tr(ρ(x)ρ(y))
defines an invariant symmetric bilnear form on g.

A given Lie algebra surely has many finite-dimensional representations. The question is now,
how are these related and if there is at least one such bilinear form that is non-degenerate.

Definition 35. Let g be a Lie algebra and I a subset of g. We call I an ideal if for every x in g
and for every y in I also [x, y] is in I. So especially every ideal is a sub Lie algebra.
A Lie algebra is called simple if it is not abelian and it has no non-trivial ideal.
Let B be a symmetric invariant bilinear form of g. The radical SB is the set of all degenerate
elements in the sense
SB = {x ∈ g |B(x, y) = 0 for all y ∈ g}.

Proposition 44. The radical SB of a Lie algebra g with symmetric, invariant bilinear form B is
an ideal of g.

Proof. Let x in SB and y in g, we have to show that [x, y] is in SB . For this let z be an arbitrary
element of g. By definition B(x, z) = 0, we have to show that also B([x, y], z) = 0. By invariance
of the bilinear form B([x, y], z) = B(x, [y, z]) but the latter vanishes since x has zero product with
every other element of the Lie algebra.

So we see that a bilinear form can only be non-degenerate if its radical vanishes. This espe-
cially happens for a simple Lie algebra as such a Lie algebra has no non-trivial ideal.
MA PH 451 65

Definition 36. A Lie algebra is called semi-simple if it is the direct sum of simple Lie algebras.
The Killing form of a Lie algebra is the invariant symmetric bilinear form defined by the trace
in the adjoint representation, that is
κ(x, y) = tr(ad(x)ad(y)).
If you want to, you can express the Killling form in terms of the structure constants. That is
in the adjoint representation, the basis vector xi acts by multiplication with the matrix fi j k and
hence
∑ fim n f jn m .

κ xi , x j =
1≤m,n≤dim(g)
For an abelian Lie algebra all structure constants are identical zero and hence the Killing form
is identical zero. A nice structural theorem is
Theorem 45. A finite-dimensional Lie algebra is semi-simple if and only if its Killing form is
non-degenerate.
Both having a non-degenerate bilinear form and being simple (having no ideals) are important
properties that usually hold in questions of interest in physics.
Example 12. Let us look back at s(3; R). We have seen that this Lie algebra is the Lie algebra of
traceless 3 × 3 matrices. This means that the Lie algebra acts on three-dimensional real vector
space. The corresponding invariant symmetric bilinear form is easy to read off:
B(hα1 , hα1 ) = B(hα2 , hα2 ) = 2, B(hα1 , hα2 ) = −1
B( fα1 , eα1 ) = B( fα2 , eα2 ) = B( fα1 +α2 , eα1 +α2 ) = 1.
Let us choose the ordered basis { fα1 , fα2 , fα1 +α2 , hα1 , hα2 , eα1 , eα2 , eα1 +α2 } of sl(3; R). Then in
this basis
ad(hα1 ) = diag(−2, 1, −1, 0, 0, 2, −1, 1), ad(hα2 ) = diag(1, −2, −1, 0, 0, −1, 2, 1)
and hence
κ(hα1 , hα1 ) = κ(hα2 , hα2 ) = 12, κ(hα1 , hα2 ) = −6.
The invariant bilinear form is uniquely determined by these three products. The reason is that
the bilinear from is invariant and symmetric. It is an instructive exercise to show that. We
observe that κ = 6B, the two bilinear forms only differ by a scalar.
This is actually not a coincidence, an important implication of the Lemma of Schur is
Theorem 46. Up to a scalar multiple the invariant symmetric non-degenerate bilinear form of a
simple finite-dimensional Lie algebra is unique.
This statement is a Lie algebra analouge of the uniqueness of both left-right invariant Haar
measure and Laplacian of compact Lie groups. Recall that the Laplacia is constructed from the
Casimir element of the underlying Lie algebra. We will now discuss this secial and important
element.
6.1. The Casimir element of a representation. If you study algebra or representation theory
you will often hear the statement this follows from Schur’s Lemma, exactly as I just said for the
uniqueness of the Killing form. Now I would like to tell you how it implies the uniqueness of
the Casimir operator, which is the Lie algebra analouge of the Laplacian, which in turn played
the role of the Hamiltonian of a system with compact Lie group symmetry. The important class
of representations are irreducible representations and completely reducible ones.
66 T CREUTZIG

Definition 37. Let ρ : g → End(V ) be a representation of a Lie algebra. A sub representation of


ρ is a sub vector space W ⊂ V with the property that ρ(x)w in W for every x in g and for every
w in W .
A representation is called irreducible if there is no non-trivial subrepresentation, where V
itself and the zero vector are the two trivial sub representations.
A representation is called completely reducible, if V is the direct sum
M
V= Vi
of vector spaces Vi such that each Vi is an irreducible sub representation of ρ.

Lemma 47. Schur’s Lemma


Let ρ : g → End(V ) be an irreducible representation of g. Then the only elements of End(V )
that commute with the image of g under ρ are the scalars, that are the scalar multiples of the
identity matrix if V is finite-dimensional and if we identify the endomorphism ring with the
space of square matrices.

Let g be a simple finite-dimensional Lie algebra. If you wish you can generalize to semi-
simple Lie algebras, that are direct sums of simple ones. Let ρ : g → End(V ) be an irreducible
representation and let Bρ the trace of matrices representing the Lie algebra in the endomorphism
ring of V with respect to some basis of V . Let us choose a basis {xi , . . . , xn } of g, then there
exists a dual basis {y1 , . . . , yn } satisfying

Bρ xi , y j = δi, j .
We want to use the bilinear form to construct an operator, that is an element in the endomor-
phism ring of V that commutes with all x in g. So let x be an arbitrary element, then we use our
two bases to define matrices via
n n
[x, xi ] = ∑ ai j x j , [x, yi ] = ∑ bi j y j .
j=1 j=1

The matrix b whose entries are the bi j is minus one times the transpose of the matrix a with
entries ai j . This can be seen by the following computation
n
∑ ai j Bρ

aik = x j , yk = Bρ ([x, xi ], yk ) = Bρ (−[xi , x], yk )
j=1
n
= −Bρ (xi , [x, yk ]) = − ∑ bk j Bρ xi , y j

j=1
= −bki .
Here we used anti-symmetry of the Lie bracket as well as invariance and linearity of the bilinear
form. We define the Casimir operator of the representation ρ as
n
Cρ = ∑ ρ(xi )ρ(yi ).
i=1
This is by definition an element of the endomorphism ring of V . It is not clear yet, that this
defintion is independent of our choice of basis.

Proposition 48. The Casimir operator commutes with ρ(x) for every x in g.
MA PH 451 67

Proof. In the endomorphism ring we have the identity


[ρ(x), ρ(y)ρ(z)] = ρ(x)ρ(y)ρ(z) − ρ(y)ρ(z)ρ(x)
= (ρ(x)ρ(y) − ρ(y)ρ(x))ρ(z) + ρ(y)(ρ(x)ρ(z) − ρ(z)ρ(x))
= [ρ(x), ρ(y)]ρ(z) + ρ(y)[ρ(x), ρ(z)].
for all x, y, x in g. Note, that in this argument we could have replaced ρ(x), ρ(y), ρ(z) with any
three endomorphisms. This equality holds because the endomorphism ring is associative, which
is true because matrix multiplication is associative. Remember, you proofed this statement in
your very first homework problems.
We thus get
n
ρ(x),Cρ = ∑ [ρ(x), ρ(xi )ρ(yi )]
 
i=1
n
= ∑ ([ρ(x), ρ(xi )]ρ(yi ) + ρ(xi )[ρ(x)ρ(yi )])
i=1



= ai j ρ(x j )ρ(yi ) + bi j ρ(xi )ρ(y j )
1≤i, j≤n
= 0.
So indeed the Casmir operator commutes with every element ρ(x) for every x in g.
Corollary 49. Let ρ : g → end(V ) be an irreducible finite-dimensional representation of a simple
Lie algebra, then Cρ is the endomorphism that acts by multpiplication with the scalar
dim(g)
cρ = .
dim(V )
Especially the definition of the Casimir operator is independent of the choice of basis.
Proof. We are precisely in the situation of Schur’s Lemma, so Cρ must act as a scalar, that
is if we choose a basis of V and let every endomorphism be represented by its matrix in this
representation, then the Casimir operator is a scalar cρ times the identity matrix, Cρ = cρ Id. In
order to compute this number, we take the trace
tr(Cρ ) = tr(cρ Id) = dim(V )cρ ,
but also
n n n
tr(Cρ ) = ∑ tr(ρ(xi )ρ(yi )) = ∑ Bρ (xi , yi ) = ∑ 1 = n.
i=1 i=1 i=1
The result follows since the dimension of g is n.
Example 13. Let g be sl(2; R) and V = R2 . Let ρ be the mapped we used to define the Lie
algebra, that is
     
0 1 1 0 0 0
ρ(x) = , ρ(h) = , ρ( f ) = .
0 0 0 −1 1 0
Then the bilinear form is
Bρ (e, f ) = 1, Bρ (h, h) = 2.
So that a dual basis is given by e, h/2, f . The Casimir operator is thus
1
C = xy + hh + yx.
2
68 T CREUTZIG

Compare this result with the constuction of the Laplacian in our harmonic analysis of SU(2).
Can you see that up to a scalar they are the same? The scalar is just a normalization. We thus
find
1
Cρ = ρ(x)ρ(y) + hh + ρ(y)ρ(x)
2
     
1 0 1 1 0 0 0
= + =
0 0 2 0 1 0 1
 
3 1 0
= .
2 0 1
Note, that indeed
3 dim(sl(2; R))
= .
2 dim(R2 )
6.2. Jordan Decomposition. In linear algebra, the main goal is to find procedures to bring
matrices in a nice form. In general, the nicest possible form is the Jordan normal form. For
complex matrices it reads

Theorem 50. Let A be an n×n matrix over the complex numbers, and let PA = (λ1 −t)r1 . . . (λm −
t)rm be its characteristic polynomial. Then there exists an invertible matrix S, such that
 
D1 0 . . . 0
 0 D2 . . . 0 
−1
SAS =  ..
 
. . .. 
 . . . 
0 ... 0 Dm
where Di is a ri × ri matrix of the form
 
λi 1 0 . . . 0
 0 λ1 0 . . . 0 
Di =  ..
 
.. .. 
. . .
0 . . . . . . 0 λi

Note, that especially SAS−1 splits into a diagonal matrix D and a nilpotent (upper triangular)
matrix N such that [D, N] = 0. If we are working over another field K than the complex numbers,
as for example the real ones, the Jordan form is

Theorem 51. Let A be an n × n matrix over a field K, and let PA = (λ1 − t)r1 . . . (λm − t)rm be its
characteristic polynomial. Then there exists an invertible matrix S, such that
 
D1 0 . . . 0
 0 D2 . . . 0 
SAS−1 =  ..
 
.. .. 
 . . . 
0 ... 0 Dm
where Di is a ri × ri matrix of the form
 
λi ∗ ∗ . . . ∗
 0 λ1 ∗ . . . ∗ 
Di =  ..
 
. . .. 
. . .
0 . . . . . . 0 λi
MA PH 451 69

such that
SAS−1 = D + N
where D is a diagonal matrix and N is a nilpotent upper triangular matrix with [D, N] = 0.

An n × n matrix represents an endomorphism of n-dimensional vector space with respect to


some chosen basis. The Jordan normal form tells us that there exists another basis, related to the
first one by the basis change matrix S, such that the endomorphism is represented by a matrix
in Jordan form. Especially this matrix (and hence the endomorphism) splits into a diagonal
part and a nilpotent one. The preimage under the basis change of the diagonal part, S−1 DS, is
usually called the semi-simple part of the endomorphism and the one of the upper triangular
one S−1 NS is called the nilpotent one. The decomposition
A = S−1 DS + S−1 NS
is then called the Jordan decomposition of A.
A Lie algebra acts on itself and this action is called the adjoint representation. Let g be a
n-dimensional Lie algebra, then the adjoint representation is represented by n × n matrices in
some given basis of g. Let x be an arbitrary element of g and
ad(x) = S + N
be the Jordan decomposition of ad(x). It defines uniquely elements s and n of g such that
ad(s) = S and ad(n) = N. The decomposition
x = s+n
is called the abstract Jordan decomposition of x. The important theorem is

Theorem 52. Let g ⊂ gl(Rn ) be a finite-dimensional Lie algebra.


(1) Then g contains the semi-simple and nilpotent parts in gl(Rn ) of all its elements. In
particular the abstract and the usual Jordan decompositions in g coincide.
(2) Let ρ : g → End(V ) be a finite-dimensional representation of g. If x = s + n is the
abstract Jordan decomposition of x in g, then ρ(x) = ρ(s) + ρ(n) is the usual Jordan
decomposition of ρ(x).

Example 14. Consider our prime example sl(3; R). We know all Lie brackets of this algebra
(6.1), but we also know the matrix form of elements in the three-dimensional representation.
So that by the previous theorem, we know that hα1 and hα2 are the semi-simple elements and
fα1 , fα2 , fα1 +α2 , eα1 , eα2 , eα1 +α2 are the nilpotent ones. It is instructive to verify this for an ex-
ample. For example, let us look at the action of eα1 . We find
[eα1 , fα1 +α2 ] = − fα2 , [eα1 , fα2 ] = 0
[eα1 , fα1 ] = hα1 , [eα1 , hα1 ] = −2eα1 , [eα1 , eα1 ] = 0
[eα1 , hα2 ] = eα1
[eα1 , eα2 ] = eα1 +α2 , [eα1 , eα1 +α2 ] = 0
so that ad(eα1 )3 = 0 and eα1 is indeed a nilpotent element. We see that we actually have con-
structed our basis of sl(3; R) according to some Jordan decomposition.

6.3. Root Space Decomposition. We will now use the Jordan decomposition to find conve-
nient decompositions and hence conveneient bases of Lie algebras. For this, let g be a simple
70 T CREUTZIG

Lie algebra. The semi-simple elements with respect to the abstract Jordan decomposition form
a subalgebra.

Definition 38. Let g be a simple (or semi-simple) Lie algebra, then a subalgebra of semi-simple
elements is called a toral subalgebra. A maximal toral subalgebra is called the Cartan subalge-
bra of g and it is usually denoted by h.

Proposition 53. A toral subalgebra is abelian, and the Cartan subalgebra is a maximal abelian
subalgebra.

The reason for this statement is that semi-simple elements can be diagonalized and diagonal
matrices commute. So think of the Cartan subalgebra as the subalgebra of the diagonal elements
of your favourite matrix representation of your Lie algebra.

Example 15.
(1) In the case of sl(2; R) the Cartan subalgebra is the one-dimensional abelian Lie algebra
spanned by h.
(2) n the case of sl(3; R) the Cartan subalgebra is the two-dimensional abelian Lie algebra
spanned by hα1 and hα2 .

Of course a Cartan subalgebra is not unique, but for a simple Lie algebra it is unique up to
automorphisms of the Lie algebra.
So the Cartan subalgebra is a maximal abelian subalgebra. Hence in the adjoint representa-
tion, the elements of the Cartan subalgebra form a set of commuting endomorphisms (matri-
ces). What do we know from linear algebra? A very important statement is that commuting
endomorphisms are simultaneously diagonalizable, especially using the adjoint action the Lie
algebra decomposes into a direct sum of common eigenspaces of the elements of the Cartan
subalgebra. Call our Lie algebra g, and its Cartan subalgebra h. Let h∗ be its dual space and
let eα be an element in an eigenspace with eigenvalues defining a linear functional α in h∗ such
that
[h, eα ] = α(h)eα .
Compare this situation to our examples of sl(2; R) and sl(3; R). In other words, there exists a
finite subset ∆ of h∗ such that M
g = h⊕ gα ,
α∈∆
where
gα = {x ∈ g|[h, x] = α(h)x}
and α in ∆ if and only if gα is non-trivial. The space gα is called the root space of the root α
and the decomposition of g into root spaces is called the root space decomposition of g.

Example 16. Let us look at our standard examples


(1) For g = sl(2; R) the Cartan subalgebra is one-dimensional and hence its dual space
is also one-dimensional. Remember that we introduced the somehow stupid looking
notation
h = hα , e = eα , f = fα
so that the Lie bracket reads
[hα , eα ] = 2eα , [hα , fα ] = −2 fα , [eα , fα ] = hα .
MA PH 451 71

Define the element α in h∗ by α(hα ) = 2, then the relations become


[hα , eα ] = α(hα )eα , [hα , fα ] = −α(hα ) fα , [eα , fα ] = hα ,
so that eα spans the one-dimensional roots space gα and fα spans the one-dimensional
root space g−α .
(2) For g = sl(3; R) the situation becomes more complicated. The Cartan subalgebra is
two-dimensional at it is spanned hα1 and hα2 . We define the roots α1 and α2 as the two
linear funcions on h defined by
α1 (hα1 ) = 2, α1 (hα2 ) = −1, α2 (hα1 ) = −1, α2 (hα2 ) = 2.
If you look back at the commutation relations of sl(3; R), we see that the definition is
made to satisfy
[h, eβ ] = β (h)eβ , [h, fβ ] = −β (h) fβ
for β in {α1 , α2 , α1 + α2 }. Especially, we see that each eβ spans the one-dimensional
root space gβ and each fβ the one-dimensional root space g−β . But we see more if we
look very carefully at the relations of sl(3; R). Namely, we see that
[gα , gβ ] ⊂ gα+β .
This is actually a property that has to follow from the Jacobi identity. Moreover looking
at the Killing form, we observe (with g0 = h) that if α + β 6= 0, then gα is orthogonal to
gβ relative to the Killing form. Finally, we also observed previously that all eα and fα
are nilpotent.
The properties illustrated in the example all hold in generality:
Proposition 54. Let g be a simple finite-dimensional Lie algebra with Cartan subalgebra h. Let
α and β be roots, then the following are true:
(1) [gα , gβ ] ⊂ gα+β
(2) x in gα for α 6= 0 is nilpotent
(3) if α + β 6= 0, then gα is orthogonal to gβ relative to the Killing form
(4) the restriction of the Killing form to h is non-degenerate.
The non-degeneracy of the Killing form restricted to h is very nice, as it allows us to identify
h with h∗ via hα is the unique element with the property that κ(hα , h) = α(h) for all h in h. It
is a nice exercise that indeed this holds (depending on your normalization of Killing form of
course) in our examples of sl(2; R) and sl(3; R).
Theorem 55. Let g be a simple Lie algebra with Cartan subalgebra h and root system ∆, then
(1) ∆ spans h∗
(2) If α in ∆ then −α in ∆
(3) For x in gα and y in g−α we have [x, y] = κ(x, y)hα .
(4) If α is in∆, then [gα , g−α ] is one-dimensional with basis hα
(5) α(hα ) = κ(hα , hα ) 6= 0 for α in ∆
(6) If α in ∆ and xα is any nonzero element in gα , then there exists yα in g−α such that
xα , yα and tα = [xα , yα ] span the three-dimensional subalgebra isomorphic to sl(2; R)
(7) tα = κ(h2hα ,h
α
α)

Proof.
72 T CREUTZIG

(1) If ∆ doesnot span h∗ , then there exists (by duality) non-zero h in h such that α(h) = 0
for all α in ∆, but this forces h to commute with all x in g which cannot be true since g
is simple.
(2) Assume that α in ∆ but −α not, then κ(gα , g) = 0 and we get a contradiction to the
non-degeneracy of the Killing form.
(3) Let x in gα and y in g−α and h an arbitrary element of h. Then
κ(h, [x, y]) = κ([h, x], y) = α(h)κ(x, y) = κ(hα , h)κ(x, y) = κ(κ(x, y)hα , h)
= κ(h, κ(x, y)hα )
so that by the non-degeneracy of the restriction of the Killing form to h the claim [x, y] −
κ(x, y)hα = 0 follows.
(4) The last statement says that [gα , g−α is spanned by hα as long as [gα , g−α ] 6= 0. But if
for non-zero x in gα , we have κ(x, y) = 0 for all y in g−α then we get a contradiction to
the non-degeneracy of the Killing form.
(5) The proof of this statement will be omitted because we lack some preparation for it
(6) Let x be a non-zero element of gα , then by the last statement and since κ(x, g−α ) 6= 0,
we can find an element y in g−α with κ(x, y) = κ(hα2,hα ) . Define
2hα
tα =
κ(hα , hα )
then by (3) [x, y] = tα . Moreover,
2 2α(hα )
[tα , x] = [hα , x] = x = 2x
α(hα ) α(hα )
and similarly [tα , y] = −2y.
(7) This follows directly from the proof of the previous statement.

We have seen that roots are very efficient in describing a simple Lie algebra, and we also
have seen that each vector x in a root space gα comes with distinguihsed vectors y in g−α and
h in the Cartan subalgebra such that these three elements generate a copy of sl(2; R). We now
want to look at the representation theory. For this both roots and these sl(2; R) subalgebras are
important.

6.4. Finite-dimensional irreducible representations of sl(2; R). Recall the commutation re-
lations of g = sl(2; R)
[h, e] = 2e, [h, f ] = −2 f , [e, f ] = h.
Recall that h spans the Cartan subalgebra, that e spans the root space gα and f the one of −α,
where α(h) = 2. Let ρ : g → End(V ) be a finite-dimensional representation of g, then the action
of h on V can be diagonalized M
V= Vλ
λ
with
Vλ = {v ∈ V |hv = λ (h)v}.
The eigenvalues λ (h) are described by a linear function λ on h. These linear functions are
called weights and the eigenspaces Vλ are called weight spaces. The set of all λ such that
MA PH 451 73

Vλ 6= 0 is called the set of weights of the representation V . We identify weights with complex
numbers via λ (h)
Lemma 56. If v in Vλ , then ρ(e)v in Vλ +2 and ρ( f )v in Vλ −2 .
Proof.
ρ(h)ρ(e)v = ρ([h, e])v + ρ(e)ρ(h)V = 2ρ(e)v + λ ρ(e)v = (λ + 2)ρ(e)v
and similarly for ρ( f ).
So for a finite-dimensional representation V , there must exist a weight λ such that Vλ +2 = 0.
Such a weight is called a highest-weight.
Lemma 57. Let ρ : g → End(V ) be a finite-dimensional irreducibe representation of g and let
v0 in Vλ be a highest-weight vector. Define
1
vi = ρ( f )i v0
i!
and v−1 = 0 then
(1) ρ(h)vi = (λ − 2i)vi
(2) ρ( f )vi = (i + 1)vi+1
(3) ρ(e)vi = (λ − i + 1)vi−1
Proof. The first statement follows by iterative appliation of the previous lemma, whike the
second one is the definition of the vectors vi . The last statement is proven by induction. The
case i = 0 is true since v−1 = 0. The induction step follows by the following computation
iρ(e)vi = ρ(e)ρ( f )vi−1
= ρ([e, f ])vi−1 + ρ( f )ρ(e)vi−1
= ρ(h)vi−1 + ρ( f )ρ(e)vi−1
= (λ − 2(i − 1))vi−1 + (λ − i + 2)ρ( f )vi−2
= (λ − 2(i − 1))vi−1 + (i − 1)(λ − i + 2)vi−1
= i(λ − i + 1)vi−1
dividing both sides by i finishes the proof.
The first point of this lemma tells us that all vi are linearly independent as they have different
ρ(h) eigenvalues. Since V is finite-dimensional, there must be a m with vm 6= 0 but vm+1 = 0.
Hence all vm+i must vanish too. Above lemma shows that v0 , . . . , vm form a basis of a submodule
of ρ and since the representation is irreducible they form a basis of V . The matrix ρ(h) is a
diagonal matrix, while ρ(e) is upper triangular and ρ( f ) is lower triangular. The third statement
of above lemma tells us something very interesting. For i = m + 1 the left-hand side is zero, but
for the right-hand side this can only be true if λ = m. In other words, the highest weight is a
non-negative integer m and the dimension of the representation is m + 1. We summarize
Theorem 58. Let ρ : g → End(V ) be a m + 1-dimensional irreducibe representation of g, then
the weight space decomposition of V is
V = V−m ⊕V−m+2 ⊕ · · · ⊕Vm−2 ⊕Vm
the representation has a highest-weight vector of highest-weight m, so every m + 1-dimensional
irreducible representation of sl(2; R) is isomorphic to the irreducible higest-weight representa-
tion of highest-weight m.
74 T CREUTZIG

We remark, that when we studied functions on the Lie group SU(2), we have introduced
representations of sl(2; R) in a natural way. Namely the Lie algebra acts on the polynomial
ring in two variables in a degree preserving way. But the vector space of polynomials of degree
m is m + 1-dimensional and one can show that it carries the highest-weight representation of
highest-wright m. Such natural representations also exist for other Lie algebras and for example
sl(3; R) acts natrually on homogeneous polynomials in three-variables.

6.5. Representation theory. We have seen, that sl(2; R) and also sl(3; R) allow a triangular
decomposition
g = g− ⊕ h ⊕ g+ ,
where g− is spanned by the creation operators, the fα and g+ are the annihilation operators, the
eα . This picture can be generalized. Let g be a simple finite-dimensional Lie algebra. A subset
Π of the root system ∆ is called simple if Π is a basis of h∗ and if each root can be written as
β= ∑ kα α
α∈Π
with integral coefficients kα , either all non-positive or all non-negative. A root is then called
positive if all coefficients are non-negative and negative if all coefficients are non-positive. The
space of all positive respectively negative roots is denotned by ∆± , and we have
g = g− ⊕ h ⊕ g+ ,
with M
g± = gα .
α∈∆±
We want to think of the elements in g− as creation operators and of those in g+ as annihilation
operators.

Example 17. For g = sl(3; R) the simple roots are α1 and α2 . The six roots of g are
±α1 , ±α2 , ±(α1 + α2 )
so that α1 , α2 , (α1 + α2 ) are positive roots and −α1 , −α2 , −(α1 + α2 ) are negative ones.

Let ρ :→ End(V ) be an irreducible finite-dimensional representation of g, then since abstract


and usual Jordan decomposition agree the Cartan subalgebra h must act diagonalizable on V .
Hence V decomposes into eigenspaces
M
V= Vλ ,
λ
where the direct sum is over λ in h∗ , and if Vλ 6= 0 then we call λ a weight and Vλ a weight
space. We say that a weight λ is larger than another weight µ if
λ −µ = ∑ kα α
α∈Π
and all kα ≤ 0. Of course there are weights that are neither larger nor smaller, a comparison is
not allways possible. We say that a vector vλ in Vλ is a highest-weight state of highest-weight
λ if µ < λ for all weights µ of the representation ρ.

Theorem 59. Every irreducible g module has a unique highest-weight state, it is a highest-weight
representation.
MA PH 451 75

This means that


ρ(eα )vλ = 0, for all eα in g+
(6.2)
ρ(h)vλ = λ (h)vλ .
and all other states of the representation can be written as a linear combinations of monomials
of the form
∏ ρ( fα )nα vλ
α∈∆−
with non-negative integers nα and fα an element of gα . Such a vector has weight
µ =λ− ∑ n−α α
α∈∆+

and hence clearly obeys the relation µ < λ .

6.6. Highest-weight representations of sl(3; R). Let us conclude this section with a detailed
description of some irreducible highest-weight representations of sl(3; R). Let R[x, y, z] be the
ring of polynomials over the real numbers in three variables. Define a linear map via
d d d d
ρ(hα1 ) = x − y , ρ(hα2 ) = y − z ,
dx dy dy dz
d d d
ρ(eα1 ) = x , ρ(eα2 ) = y , ρ(eα1 +α2 ) = x , (6.3)
dy dz dz
d d d
ρ( fα1 ) = y , ρ( fα2 ) = z , ρ( fα1 +α2 ) = z ,
dx dy dx
then it is a computation that ρ defines a Lie algebra homomorphism, for example
d d d d d
[ρ(eα1 ), ρ(eα2 )] = x y − y x = x = ρ(eα1 +α2 ) = ρ([eα1 , eα2 ]).
dy dz dz dy dz
We thus get an action on the space of homogeneous polynomials. Let us analyze those de-
gree by degree. Degree zero polynomials are just the real numbers and they carry the one-
dimensional representation of g. Degree one polynomials form R3 , and we get the three-
dimensional defininig representation spanned by x, y and z. What are the weights of these states?
For example for x, we get
ρ(hα1 )x = x, ρ(hα2 )x = 0,
let λ = aα1 + bα2 , then λ (hα1 ) = 2a − b and λ (hα2 ) = 2b − a, so that the weight of x is
1
λx = (2α1 + α2 ).
3
Similarly one finds that the weight of y is
1
λy = (−α1 + α2 ),
3
and
1
λz = (−α1 − α2 ).
3
We observe that
λy = λx − α1 , λz = λx − α1 − α2
76 T CREUTZIG

so that x is our highest-weight-vector of highest-weight λ = λx . This is in perfect agreement


with the action of the creation operators
d d
ρ( fα1 )x = y x = y, ρ( fα2 )y = z y = z.
dx dy
Let us know look at a monomial of degree n, for example
xa yb zc
with non-negative integers a, b, c satisfying a + b + c = n. Then by induction and by the Leibniz
rule of the derivative one can compute that the weight of this monomial is
aλx +bλy +cλz = nλx −((n − a)λx + b(α1 − λx ) + c(α1 + α2 − λx )) = nλx −(cα2 + (b + c)α1 ).
In other words, nλx is a highest-weight with highest-weight vector xn . The monomial can then
be rewritten as
d b d c n
   
a b c a!
x y z = y z x
n! dx dx
so that
a!
xa yb zc = (ρ( fα1 ))b (ρ( fα1 +α2 ))c xn .
n!
So that we can summarize

Theorem 60. The polynomial ring in three variables decomposes as a direct sum

M
R[x, y, z] = Vnλ
n=0
of irreducible highest-weight representations Vnλ of g = sl(3; R) of highest-weight
n
nλ = (2α1 + α2 )
3
Representations of this type appear frequently in physics, and they are termed osciallator
d
representations, where you should think of dx as the annihilation operator ax and as x as the

creation operator ax .
More generally this result (with appropriate ordering of simple roots) generalizes as follows

Theorem 61. The polynomial ring in m-variables (m > 1) decomposes as a direct sum

M
R[x1 , . . . , xm ] = Vnλ
n=0
of irreducible highest-weight representations Vnλ of g = sl(m; R) of highest-weight
n
nλ = ((m − 1)α1 + (m − 2)α2 + · · · + αm−1 )
m
6.7. Exercises.
(1) Show that the cross product of vectors
     
x1 y1 x2 y3 − y2 x3
× : R3 × R3 → R3 , x2  × y2  = x3 y1 − y3 x1 
x3 y3 x1 y2 − y1 x2
defines a Lie algebra structure on R3 .
(2) Determine all non-abelian Lie algebras g (up to isomorphism) of dimension two.
MA PH 451 77

(3) Compute the Casimir of sl(2; R) in the adjoint representation without using Corollary
49.
(4) Compute the Casimir of sl(3; R) in the three-dimensional representation without using
Corollary 49.

6.8. Solutions.
(1) we choose as a basis of R3 the standard unit vectors
     
1 0 0
σ1 = 0 ,
  σ2 = 1 ,
  σ3 = 0  .

0 0 1
The cross-product is by definition anti-symmetric. Further, we compute
σ1 × σ2 = σ3 , σ2 × σ3 = σ1 , σ3 × σ1 = σ2 .
These are exactly the commutation relations of the Pauli-matrices, which can be com-
pactly written as
[σi , σ j ] = εi jk σk
with the totally anti-symmetric tensor ε. These define the Lie algebra su(2), see(4.5).
(2) Let g be a two-dimensional Lie algebra, and let x, y be a basis of g. Then the most
general form of the possible commutation relations are
[x, x] = [y, y] = 0, [x, y] = ax + by
for some constants a and b. If both these numbers are zero, then the Lie algebra is
abelian. Otherwise we can assume that a 6= 0 (if a = 0, then b 6= 0 and we can rename).
The aim is to find another basis, such that the commutation relations look very nice. Let
y0 = ay , then
[x, y0 ] = x + by0 .
Let x0 = x + by0 , then
[x0 , y0 ] = x0 .
We thus have shown, that every non-abelian two-dimensional Lie algebra has a basis
x0 , y0 such that the commutation relations take the simple form as above. Note, that this
Lie algebra is a Lie algebra of two by two matrices, and it can be viewed as a subalgebra
of sl(2, R) generated by f and 12 h.
(3) The strategy to compute a Casimir is as follows:
• Choose an ordered basis x1 , . . . , xn of g
• Find the corresponding representation matrices ρ(xi ).
• Compute the traces tr(ρ(xi ), ρ(x j )) and use them to find a dual basis y1 , . . . , yn defined
by tr(ρ(xi )ρ(y j )) = δi, j .
• The Casimir is the matrix
n
Cρ = ∑ ρ(xi )ρ(yi ).
i=1
So let’s do that. As a basis, we choose x1 = e, x2 = h, x3 = f . Then the representation
matrix for ad(e) is defined using
ad(e)(e) = [e, e] = 0, ad(e)(h) = [e, h] = −2e, ad(e)( f ) = [e, f ] = h,
78 T CREUTZIG

so that in our ordered basis we have


 
0 −2 0
ad(e) = 0 0 1.
0 0 0
Similarly one computes
$$\mathrm{ad}(h) = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -2 \end{pmatrix}, \qquad \mathrm{ad}(f) = \begin{pmatrix} 0 & 0 & 0 \\ -1 & 0 & 0 \\ 0 & 2 & 0 \end{pmatrix}.$$
The non-zero traces are then
$$\mathrm{tr}(\mathrm{ad}(e)\mathrm{ad}(f)) = \mathrm{tr}(\mathrm{ad}(f)\mathrm{ad}(e)) = 4, \qquad \mathrm{tr}(\mathrm{ad}(h)\mathrm{ad}(h)) = 8,$$
so that a dual basis is y_1 = \frac{1}{4}f, y_2 = \frac{1}{8}h, y_3 = \frac{1}{4}e, and the Casimir is
$$C_{\mathrm{ad}} = \frac{1}{4}\,\mathrm{ad}(e)\mathrm{ad}(f) + \frac{1}{8}\,\mathrm{ad}(h)\mathrm{ad}(h) + \frac{1}{4}\,\mathrm{ad}(f)\mathrm{ad}(e) = \frac{1}{4}\begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{pmatrix} + \frac{1}{8}\begin{pmatrix} 4 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 4 \end{pmatrix} + \frac{1}{4}\begin{pmatrix} 0 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
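The whole computation is easily checked by machine. The following numpy sketch (variable names are my own) follows the bulleted strategy above and reproduces the identity matrix:

import numpy as np

# adjoint representation matrices of sl(2;R) in the ordered basis (e, h, f)
ad_e = np.array([[0., -2., 0.], [0., 0., 1.], [0., 0., 0.]])
ad_h = np.diag([2., 0., -2.])
ad_f = np.array([[0., 0., 0.], [-1., 0., 0.], [0., 2., 0.]])

basis = [ad_e, ad_h, ad_f]
# Gram matrix of traces tr(ad(x_i) ad(x_j))
gram = np.array([[np.trace(a @ b) for b in basis] for a in basis])
ginv = np.linalg.inv(gram)
# dual basis y_j = sum_i (gram^{-1})_{ji} x_i satisfies tr(x_i y_j) = delta_{ij}
dual = [sum(ginv[j, i] * basis[i] for i in range(3)) for j in range(3)]

casimir = sum(xi @ yi for xi, yi in zip(basis, dual))
print(casimir)  # the 3x3 identity matrix, as computed above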
(4) We already know the representation matrices. As we naturally identify sl(3; R) with its three-dimensional matrix representation, I will omit the ρ indicating the representation. We choose our basis to be x_1 = e_{α_1}, x_2 = e_{α_2}, x_3 = e_{α_1+α_2}, x_4 = h_{α_1}, x_5 = h_{α_2}, x_6 = f_{α_1}, x_7 = f_{α_2}, x_8 = f_{α_1+α_2}. We know from the lecture that the trace can only be non-zero for vectors in root spaces g_α and g_β with α + β = 0, so there are only a few possibilities for non-zero traces. They can be compactly written as
$$\mathrm{tr}\big(e_\alpha f_\beta\big) = \delta_{\alpha,\beta}, \qquad \mathrm{tr}(h_{\alpha_1} h_{\alpha_1}) = \mathrm{tr}(h_{\alpha_2} h_{\alpha_2}) = 2, \qquad \mathrm{tr}(h_{\alpha_1} h_{\alpha_2}) = -1.$$
Thus the dual vectors y_1 = f_{α_1}, y_2 = f_{α_2}, y_3 = f_{α_1+α_2}, y_6 = e_{α_1}, y_7 = e_{α_2}, y_8 = e_{α_1+α_2} are obvious. It remains to find y_4 and y_5. We make the Ansatz y_4 = a h_{α_1} + b h_{α_2}, and we look for real numbers a and b such that
$$\mathrm{tr}(h_{\alpha_1} y_4) = 1, \qquad \mathrm{tr}(h_{\alpha_2} y_4) = 0.$$
But
$$\mathrm{tr}(h_{\alpha_1} y_4) = 2a - b, \qquad \mathrm{tr}(h_{\alpha_2} y_4) = 2b - a,$$
so the second equation tells us that a = 2b, while inserting this into the first one gives 3b = 1. Hence
$$y_4 = \frac{1}{3}(2h_{\alpha_1} + h_{\alpha_2}).$$
Similarly, we find
$$y_5 = \frac{1}{3}(h_{\alpha_1} + 2h_{\alpha_2}).$$

So the Casimir becomes
$$\begin{aligned}
C = \sum_{i=1}^{8} x_i y_i &= \begin{pmatrix} 1&0&0\\0&0&0\\0&0&0 \end{pmatrix} + \begin{pmatrix} 0&0&0\\0&1&0\\0&0&0 \end{pmatrix} + \begin{pmatrix} 1&0&0\\0&0&0\\0&0&0 \end{pmatrix} + \frac{1}{3}\begin{pmatrix} 2&0&0\\0&1&0\\0&0&0 \end{pmatrix} \\
&\quad + \frac{1}{3}\begin{pmatrix} 0&0&0\\0&1&0\\0&0&2 \end{pmatrix} + \begin{pmatrix} 0&0&0\\0&1&0\\0&0&0 \end{pmatrix} + \begin{pmatrix} 0&0&0\\0&0&0\\0&0&1 \end{pmatrix} + \begin{pmatrix} 0&0&0\\0&0&0\\0&0&1 \end{pmatrix} \\
&= \frac{8}{3}\begin{pmatrix} 1&0&0\\0&1&0\\0&0&1 \end{pmatrix}.
\end{aligned}$$
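Again this can be verified numerically; a short numpy sketch (E below is a helper of mine producing the matrix units E_ij):

import numpy as np

def E(i, j):
    # matrix unit E_ij: entry 1 at row i, column j, zeros elsewhere
    m = np.zeros((3, 3))
    m[i, j] = 1.
    return m

e1, e2, e12 = E(0, 1), E(1, 2), E(0, 2)
f1, f2, f12 = E(1, 0), E(2, 1), E(2, 0)
h1, h2 = E(0, 0) - E(1, 1), E(1, 1) - E(2, 2)

xs = [e1, e2, e12, h1, h2, f1, f2, f12]
ys = [f1, f2, f12, (2*h1 + h2)/3, (h1 + 2*h2)/3, e1, e2, e12]

C = sum(x @ y for x, y in zip(xs, ys))
print(C)  # 8/3 times the 3x3 identity matrix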

7. THE BOSONIC STRING
In this final section, we illustrate features of what we have learnt in the example of the bosonic string. String theory is a quantum field theory of strings, that is, of one-dimensional objects instead of point-like particles. The original motivation of string theory was to find a quantum field theory that consistently incorporates gravity. As one-dimensional objects have more structure than points, string theory (if formulated correctly) is guaranteed to refine the quantum theory of point-like objects. The development of string theory started with purely bosonic strings. By now it is realized that this is not sufficient and that supersymmetry has to be incorporated. Nonetheless, the bosonic string by itself has many nice features, and it will be our concluding example for this course.
There are many books on string theory, both scientific and popular. A very enjoyable popular science book (to me) is The Elegant Universe by Brian Greene. We won't really follow any textbook, but [P] is a suitable reference.

7.1. The free boson compactified on a circle. A string propagating in time sweeps out a two-dimensional surface in space-time. This surface is called the world-sheet of the string. The world-sheet quantum field theory is a two-dimensional conformal quantum field theory (CFT), so these two-dimensional CFTs are the building blocks of string theory. CFTs are intimately connected to Lie groups and Lie algebras: in some sense the symmetry algebra of a CFT is a quantization of a Lie algebra, and its representation theory is a quantization of harmonic analysis on the corresponding Lie group. We learnt that the circle S^1 is the one-dimensional abelian Lie group U(1) with Lie algebra u(1) represented by d/dx. It acts on periodic functions f : R → R. Periodicity means that f(x + 2π) = f(x), and a basis of functions has been given by f_n(x) = e^{inx}. Multiplication of functions is then easily performed,
$$f_n(x)\, f_m(x) = f_{n+m}(x),$$
and is seen to be the same as addition in the integers. The inner product of functions is (we take an appropriate normalization of the measure)
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} f_n(x)\, f_m(x)\, dx = \delta_{n+m,0}.$$
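This orthogonality relation is easy to confirm symbolically; a minimal sympy sketch for small n and m:

import sympy as sp

x = sp.symbols('x', real=True)
for n in range(-2, 3):
    for m in range(-2, 3):
        # (1/2pi) * integral of f_n f_m over [-pi, pi]
        val = sp.simplify(
            sp.integrate(sp.exp(sp.I*n*x) * sp.exp(sp.I*m*x), (x, -sp.pi, sp.pi)) / (2*sp.pi))
        assert val == (1 if n + m == 0 else 0)
print("orthogonality holds for |n|, |m| <= 2")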

The free boson compactified on a circle is a quantization of this data. Let us denote the generator of u(1) by u; then we consider its loop algebra
$$u(1) \otimes R[t, t^{-1}]$$
of formal Laurent polynomials in one variable with coefficients in u(1). This is an infinite-dimensional abelian Lie algebra. Abelian Lie algebras are not too interesting. However, this Lie algebra allows for a central extension
$$\hat{u}(1) = u(1) \otimes R[t, t^{-1}] \oplus RK \oplus Rd$$
with non-zero commutation relations
$$[u_n, u_m] = n\,\delta_{n+m,0}\, K, \qquad [d, u_n] = n\, u_n,$$
with the short-hand notation u_n = u ⊗ t^n. Note that d acts as t\frac{d}{dt}: it acts as a differential operator, and it is called a derivation. The Lie algebra û(1) is called the affinization of u(1), or the Kac–Moody algebra of u(1). It is the symmetry algebra of the uncompactified free boson CFT. What does this mean? A quantum field theory is in the first place a theory of fields. So what are these fields? The Heisenberg field or free boson is the formal Laurent polynomial
$$X(z) = \sum_{n \in Z} u_n\, z^{-n-1}$$
with coefficients in the symmetry algebra û(1). It acts naturally on representations of the symmetry algebra. The most important representation is the vacuum representation V_0. It is generated by a highest-weight state |0⟩, the vacuum. The vacuum satisfies
$$u_0 |0\rangle = 0, \qquad u_n |0\rangle = 0 \ \text{ for all } n > 0, \qquad K |0\rangle = |0\rangle,$$
while the u_n for negative n create the descendent states of the infinite-dimensional vacuum representation. Its dual ⟨0| is defined by interchanging the roles of creation and annihilation operators. We define a norm by ⟨0|0⟩ = 1. The algebraic structure of the free boson is encoded in correlation functions:
$$\begin{aligned}
\langle 0|\, X(z)X(w)\, |0\rangle &= \langle 0|\sum_{n\in Z} u_n z^{-n-1} \sum_{m\in Z} u_m w^{-m-1}|0\rangle\\
&= \frac{1}{zw}\,\langle 0|\sum_{n=1}^{\infty} u_n z^{-n} \sum_{m=1}^{\infty} u_{-m} w^{m}|0\rangle\\
&= \frac{1}{zw}\sum_{n=1}^{\infty}\sum_{m=1}^{\infty}\langle 0|\,[u_n, u_{-m}]\,|0\rangle\, z^{-n} w^{m}\\
&= \frac{1}{zw}\sum_{n=1}^{\infty}\sum_{m=1}^{\infty} n\,\delta_{n,m}\,\langle 0|\, K\, |0\rangle\, z^{-n} w^{m}\\
&= \frac{1}{zw}\sum_{n=1}^{\infty} n\, z^{-n} w^{n}\\
&= \frac{1}{(z-w)^2}.
\end{aligned}$$
In the last equality, we used the derivative of the geometric series
$$\frac{1}{1 - \frac{w}{z}} = \sum_{n=0}^{\infty} z^{-n} w^{n},$$
so the identity only holds for |w| < |z|. Correlation functions are neatly summarized in the algebraic structure called the operator product algebra. For the free boson, the operator product algebra is completely determined by
$$X(z)X(w) \sim \frac{1}{(z-w)^2}.$$
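One can also convince oneself numerically of the series manipulation behind this two-point function; a quick sketch (the sample points z, w and the truncation order N are arbitrary choices of mine, subject to |w| < |z|):

# check (1/(z*w)) * sum_{n>=1} n z^{-n} w^n = 1/(z-w)^2 for |w| < |z|
z, w, N = 2.0, 0.5, 200
series = sum(n * z**(-n) * w**n for n in range(1, N + 1)) / (z * w)
exact = 1.0 / (z - w)**2
print(series, exact)  # agree to machine precision, since |w/z| = 1/4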
The free boson X should be viewed as a quantization of the Lie algebra generator d/dx. This operator acts naturally on the square integrable functions on the circle; a basis of these functions is given by the e^{inx}, on which d/dx acts as
$$\frac{d}{dx}\, e^{inx} = i n\, e^{inx}.$$
We would now like to find fields that quantize this action. For this, define the field φ(z) via
$$\frac{d}{dz}\,\varphi(z) = X(z)$$
and define the normally ordered exponential
$$V_\alpha(z) = {:}\, e^{\alpha\varphi(z)}\, {:}$$
(the factor of i is hidden in the definition of φ). In quantum field theory, due to non-commutativity, there is an issue in how to write products of operators. Normal ordering means that all annihilation operators (the u_n with n > 0) are placed to the right of the creation operators. This procedure ensures that the action of fields on states is well-defined. The operator product of X(z) with the field V_α very much resembles the action of the Lie algebra u(1) on its square integrable functions, namely
$$X(z)\, V_\alpha(w) \sim \frac{\alpha\, V_\alpha(w)}{(z-w)}.$$
Moreover, the product of functions translates to the operator product of fields as
$$V_\alpha(z)\, V_\beta(w) = {:}V_\alpha(z)V_\beta(w){:}\,(z-w)^{\alpha\beta} \sim (z-w)^{\alpha\beta}\, V_{\alpha+\beta}(w) + \ldots.$$
Here we see that the operator product is multi-valued (we have to decide on a choice of root) if αβ is not an integer. So if we restrict α to √2 times an integer, we get a well-defined operator product algebra. The integers are an example of a lattice, and the conformal field theory generated by X and the V_{√2 n} for integer n is called the free boson compactified on R/√2Z = S^1, or just the lattice CFT of the lattice √2Z.
In quantum field theory, the interaction of physical quantities is encoded in expectation values; these are the correlation functions. Correlation functions should be viewed as the quantum analog of the inner product. For the free boson, two-point functions are
$$\langle V_\alpha(z)\, V_\beta(w)\rangle = \delta_{\alpha+\beta,0}\,(z-w)^{-\alpha^2}$$
and three-point functions are
$$\langle V_\alpha(z)\, V_\beta(w)\, V_\gamma(x)\rangle = \delta_{\alpha+\beta+\gamma,0}\,(z-w)^{\alpha\beta}(z-x)^{\alpha\gamma}(w-x)^{\beta\gamma}.$$
You can go on with this, and for n field insertions you get
$$\langle V_{\alpha_1}(z_1)\cdots V_{\alpha_n}(z_n)\rangle = \delta_{\alpha_1+\cdots+\alpha_n,0}\,\prod_{i<j}\big(z_i - z_j\big)^{\alpha_i\alpha_j}.$$

Note that the corresponding quantum mechanical correlation function is
$$\langle e^{i\alpha_1 x} \cdots e^{i\alpha_n x}\rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{i\alpha_1 x} \cdots e^{i\alpha_n x}\, dx = \delta_{\alpha_1+\cdots+\alpha_n,0}.$$
Recall that the Laplacian on the circle is just the second derivative ∆ = \frac{d}{dx}\frac{d}{dx}, and also recall that our free boson analogue of the ordinary derivative is the free boson X(z) itself, while its normally ordered product with itself is (up to a factor of 1/2) the Virasoro field. The Laplacian was something like the Hamiltonian, and the Virasoro field is actually also called the energy-momentum field of the CFT. In general, there is a nice analogy between harmonic analysis on a Lie group and a CFT on a Lie group.
Harmonic analysis on a Lie group is called the semi-classical (or quantum mechanical) limit
of two-dimensional CFT on the same Lie group. In this limit
• fields of the CFT become square integrable functions on the Lie group.
• the product of fields becomes ordinary multiplication of functions.
• correlation functions become inner products.
• the infinite-dimensional symmetry Lie algebra degenerates to the Lie algebra of invariant
vector fields.
• the Virasoro field (the energy-momentum tensor) becomes the Laplacian.
In other words, one application of what you have learnt in this course: you have learnt the quantum mechanics limit of a string propagating on a Lie group.
Before we turn to more general lattice CFTs, we have to discuss the most important structure of every world-sheet CFT of a string: its Virasoro algebra.

7.2. The Virasoro algebra. The Virasoro algebra is a central extension of the Lie algebra of continuous derivations on Laurent series in one variable t. This Lie algebra of continuous derivations is also called the Witt algebra, and it is generated by the −t^{n+1}\frac{d}{dt} with commutation relations
$$\Big[-t^{n+1}\frac{d}{dt},\; -t^{m+1}\frac{d}{dt}\Big] = -(n-m)\, t^{n+m+1}\frac{d}{dt}.$$
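These relations are easily verified symbolically; a small sympy sketch, representing the Witt generators as differential operators acting on a generic test function:

import sympy as sp

t = sp.symbols('t')
f = sp.Function('f')(t)

def L(n, g):
    # Witt generator L_n = -t^{n+1} d/dt applied to g
    return -t**(n + 1) * sp.diff(g, t)

for n in range(-2, 3):
    for m in range(-2, 3):
        lhs = L(n, L(m, f)) - L(m, L(n, f))   # [L_n, L_m] f
        rhs = (n - m) * L(n + m, f)
        assert sp.simplify(lhs - rhs) == 0
print("Witt relations hold for -2 <= n, m <= 2")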
The Virasoro algebra has generators L_n for n in Z plus a central element C. The commutation relations are almost those of the Witt algebra (up to setting C to zero),
$$[L_n, L_m] = (n-m)L_{n+m} + \frac{n^3-n}{12}\,\delta_{n+m,0}\, C.$$
The central element C acts in representations of CFT as multiplication by a number c. This
number is called the central charge c, and the quotient of the Virasoro algebra by the ideal
generated by the relation C = c is called the Virasoro algebra of central charge c. The importance
of this algebra is that it appears as the symmetry algebra of every two-dimensional CFT. In
other words, every two-dimensional CFT and hence every world-sheet theory of a string has an
infinite-dimensional Lie algebra as symmetry. This is a very helpful and powerful structure.
How do we find this structure in the free boson CFT? We define the Virasoro field
$$T(z) = \frac{1}{2}\,{:}X(z)X(z){:} = \sum_{n\in Z} z^{-n-2}\;\frac{1}{2}\sum_{m=0}^{\infty}\big(u_{n-m}\, u_m + u_{-m-1}\, u_{n+m+1}\big) = \sum_{n\in Z} L_n\, z^{-n-2}.$$
It is an instructive and laborious computation to verify that
$$[L_n, L_m] = (n-m)L_{n+m} + \frac{n^3-n}{12}\,\delta_{n+m,0};$$

we indeed get the Virasoro algebra relations at central charge one. This laborious computation can be circumvented by knowing that the operator product algebra of the central charge c Virasoro algebra is
$$T(z)T(w) \sim \frac{c/2}{(z-w)^4} + \frac{2\,T(w)}{(z-w)^2} + \frac{\frac{d}{dw}T(w)}{(z-w)}$$
and by verifying that the operator product of \frac{1}{2}{:}X(z)X(z){:} with itself indeed is the same for c = 1. The power of the operator product is that the complete algebraic structure of the infinite-dimensional symmetry algebra of a CFT is encoded in its generating fields. In the case of the Virasoro algebra this means it is all encoded in this innocent looking operator product. In order to get to the bosonic string, we have to replace the integers by other lattices.
7.3. Lattice CFT. The first example of a bosonic string is given by starting with a CFT of a self-dual even rank 24 lattice. What does this mean? A lattice L of rank n is a free Z-module of dimension n. This means there are n generators x_1, . . . , x_n such that the set
$$L = \{m_1 x_1 + \cdots + m_n x_n \,|\, m_i \in Z\}$$
is the lattice L. It is closed under addition, and it carries an action of Z via multiplication. Further, we require that this set has a quadratic form
$$Q : L \times L \to \mathbb{Q}.$$
We call a lattice integral if this quadratic form takes values in the integers, and even if it takes values in the even integers. Even integral lattices lift to conformal field theories very much like the free boson compactified via the lattice √2Z. Namely, to each element γ in L we can associate a vertex operator V_γ, such that the operator product algebra is defined by lattice addition:
$$V_\gamma(z)\, V_\mu(w) = {:}V_\gamma(z)V_\mu(w){:}\,(z-w)^{Q(\gamma,\mu)} \sim (z-w)^{Q(\gamma,\mu)}\, V_{\gamma+\mu}(w) + \ldots.$$


As for the free boson, integrality ensures that we don't have to take any roots, so there is no multi-valuedness. Modules of a lattice theory are parameterized by elements of the coset L′/L, where L′ is the dual lattice with respect to Q. So if a lattice is self-dual, then the only module of the lattice theory is the lattice theory itself. In that case, we are talking about the lattice CFT of the self-dual lattice. If the quadratic form takes values only in the even integers, then we ensure that it is a bosonic theory; otherwise there will also be fermions. Geometrically, a lattice CFT describes a string propagating on the n-dimensional torus R^n/L. A lattice CFT of rank n (that is, its complexification has dimension n as a complex vector space) has n copies of the Heisenberg algebra, the free boson, as a subalgebra. Its Virasoro field is the sum of the Virasoro fields of these subalgebras.
Example 18. There are two important examples for us:
(1) The two-dimensional lattice L_{1,1} = Zx ⊕ Zy with quadratic form Q(x, y) = 1 and Q(x, x) = Q(y, y) = 0.
(2) A self-dual even lattice Λ of rank 24. These lattices are called Niemeier lattices, and the most prominent example is the Leech lattice.
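For L_{1,1}, these defining properties can be read off from its Gram matrix; a minimal numpy sketch (using that an integral lattice is self-dual exactly when its Gram matrix has determinant ±1):

import numpy as np

# Gram matrix of L_{1,1} in the basis (x, y): Q(x,x) = Q(y,y) = 0, Q(x,y) = 1
gram = np.array([[0, 1], [1, 0]])
assert all(gram[i, i] % 2 == 0 for i in range(2))  # even diagonal, hence even lattice
assert abs(round(np.linalg.det(gram))) == 1        # unimodular, hence self-dual
print("L_{1,1} is even and self-dual")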
You might have heard that the bosonic string is only consistent in 26 dimensions. This statement is actually false; it should (probably) say in dimension at most 26. We will come to that. The full world-sheet CFT is not the bosonic string; rather, its states are described by a semi-infinite cohomology, called BRST-cohomology.

7.4. Fermionic Ghosts. The next ingredient we need is a fermionic CFT. This is a theory based on a Lie superalgebra. The Lie superalgebra is generated by fermionic elements c_n, b_n for n in Z and only one bosonic element K. The non-zero commutation relations are
$$[b_n, c_m] = [c_m, b_n] = \delta_{n+m,0}\, K.$$
These look very similar to those of û(1). The associated fields are called the bc-ghosts, and they are
$$b(z) = \sum_{n\in Z} b_n\, z^{-n-1}, \qquad c(z) = \sum_{n\in Z} c_n\, z^{-n}.$$
The Lie superalgebra structure is encoded in the operator product
$$b(z)c(w) \sim c(z)b(w) \sim \frac{1}{(z-w)}.$$
This CFT has a Virasoro field
$$T_{\text{ghost}}(z) = {:}\, b(z)\,\frac{d}{dz}c(z)\,{:} \; - 2\,\frac{d}{dz}{:}\big(b(z)c(z)\big){:}$$
of central charge −26. This central charge is the reason people claim that the bosonic string is only consistent in 26 dimensions. Why are these ghosts so important?

7.5. BRST quantization of the bosonic string. Consider a CFT that has as commuting subalgebras the fermionic ghosts and a Virasoro algebra of central charge c. Then one can define the BRST-current
$$J_{\text{BRST}}(z) = {:}c(z)T(z){:} + \frac{1}{2}{:}c(z)T_{\text{ghost}}(z){:} + \frac{3}{2}\frac{d^2}{dz^2}c(z).$$
This is a fermionic field. It satisfies the following important operator products:
$$J_{\text{BRST}}(z)\, b(w) \sim \frac{3}{(z-w)^3} + \frac{{:}b(w)c(w){:}}{(z-w)^2} + \frac{T(w) + T_{\text{ghost}}(w)}{(z-w)},$$
$$J_{\text{BRST}}(z)\, J_{\text{BRST}}(w) \sim \frac{(c/2-9)\,{:}c(w)\frac{d}{dw}c(w){:}}{(z-w)^3} - \frac{\big((c/2-9)/2\big)\,{:}c(w)\frac{d^2}{dw^2}c(w){:}}{(z-w)^2} - \frac{\big((c-26)/12\big)\,{:}c(w)\frac{d^3}{dw^3}c(w){:}}{(z-w)}.$$
Here it is very important that the first order pole of the operator product of the BRST-current with itself vanishes if and only if c = 26. Define the BRST-differential
$$Q_{\text{BRST}} = \frac{1}{2\pi i}\oint J_{\text{BRST}}(z)\, dz,$$
such that it picks up the residue of the operator product of a field with the BRST-current. We thus see that
$$Q_{\text{BRST}}\, b(w) = T(w) + T_{\text{ghost}}(w), \qquad Q_{\text{BRST}}\, J_{\text{BRST}}(w) = \frac{c-26}{12}\,{:}c(w)\frac{d^3}{dw^3}c(w){:}\,.$$
In particular, if and only if c = 26 we have
$$Q_{\text{BRST}}\, J_{\text{BRST}}(w) = 0$$
and hence also
$$Q_{\text{BRST}}\, Q_{\text{BRST}} = 0.$$

This is very important. Let V be a vector space on which Q_{BRST} acts, that is, we have a map
$$Q_{\text{BRST}} : V \to V,$$
and since Q_{\text{BRST}}^2 = 0, the image of this map is contained in the kernel, so that we can look at the vector space of equivalence classes
$$H_V = \frac{\ker(Q_{\text{BRST}} : V \to V)}{\mathrm{im}(Q_{\text{BRST}} : V \to V)} = \big\{[v]\,\big|\, v \in V,\ Q_{\text{BRST}}\, v = 0,\ [v] = [w] \Leftrightarrow Q_{\text{BRST}}\, x = v - w \text{ for some } x \in V\big\}.$$
This set of equivalence classes is called the BRST-cohomology on V.
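The linear algebra behind this definition is transparent in finite dimensions. A toy sketch (the nilpotent matrix Q below is an example of my own, not the actual BRST operator):

import numpy as np

# toy nilpotent operator on R^4: Q e1 = e2, Q e3 = e4, Q e2 = Q e4 = 0
Q = np.zeros((4, 4))
Q[1, 0] = 1.
Q[3, 2] = 1.
assert np.allclose(Q @ Q, 0)  # Q^2 = 0, so im(Q) is contained in ker(Q)

rank = np.linalg.matrix_rank(Q)   # dim im(Q)
dim_ker = 4 - rank                # dim ker(Q)
print("dim H =", dim_ker - rank)  # here 2 - 2 = 0: every closed vector is exact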
Definition 39. Let V be a two-dimensional conformal field theory of central charge 26 that is graded by L_{1,1} and that contains a rational unitary CFT of central charge 24 as a subalgebra, such that the Virasoro field of V is the sum of the Virasoro fields of the two subalgebras. Further, let the only module of V be V itself. Let W be the product of this CFT with the fermionic ghost CFT, and let X be the kernel of b_0 acting on the vacuum module of W. The space of physical states of a bosonic string propagating on a world-sheet described by V is the semi-infinite BRST-cohomology H_X.
An example would be to take for V the product of the lattice CFT of a Niemeier lattice with the lattice CFT of the lattice L_{1,1}. In this case V = W. This describes the original bosonic string, a string propagating on a 26-dimensional torus. There are a few important theorems, most importantly the no-ghost theorem, which tells us that the states of the bosonic string live in a Hilbert space [P].
Let us outline the mathematical importance of the bosonic string. Lian and Zuckerman showed that the states of the bosonic string have the structure of an infinite-dimensional Lie algebra. Richard Borcherds realized that these are actually very nice Lie algebras, and he called them generalized Kac–Moody algebras. Frenkel, Lepowsky and Meurman were able to construct a CFT of central charge 24 whose automorphism group is the largest finite simple sporadic group, the monster. The Lie algebra of the bosonic string had the favourable property that its graded dimensions were counted by an automorphic product. Borcherds was able to use the properties of this automorphic product to relate Hauptmoduls of genus zero subgroups of the modular group to twisted partition functions of the monster CFT. He received the Fields Medal, the highest possible award for a mathematician, for this work. Terry Gannon, here at the University of Alberta, is a leading expert on this subject, and he wrote the standard textbook [G].
There is a generalization of this story, which my master advisor Nils Scheithauer has pursued. Namely, there are very few other Lie algebras (at most 12) that behave like the Lie algebras of the two known bosonic strings. They also connect to automorphic forms and finite sporadic groups. The conjecture is that these are all Lie algebras of physical states of bosonic strings. The conjecture is still open; in my master thesis I took care of a family of four cases.

8. POSSIBLE EXAM QUESTIONS


Exercise 4. Let H be the Lie algebra generated by p, q, z, d with commutation relations
$$[d, p] = p, \qquad [d, q] = -q, \qquad [p, q] = z,$$
all others vanishing. In particular, p, q, z generate the three-dimensional Heisenberg Lie algebra of the introduction and of the harmonic oscillator in quantum mechanics.

(1) A quadratic Casimir operator is a polynomial of exactly degree two in the generators of the Lie algebra with the property that it commutes with every element of the Lie algebra. This Lie algebra H has two quadratic Casimir operators, one of them being C_1 = z^2. Find the other one.
Hint: Make the Ansatz C_2 = α dz + β(pq + qp) and determine α and β such that C_2 commutes with d, p, q and z.
(2) Let
$$g(v, w, x, y) = e^{vp}\, e^{xd + yz}\, e^{wq};$$
the left Maurer–Cartan one-form is defined as
$$\omega(g) = g^{-1}\, dg = g^{-1}\frac{dg}{dv}\, dv + g^{-1}\frac{dg}{dw}\, dw + g^{-1}\frac{dg}{dx}\, dx + g^{-1}\frac{dg}{dy}\, dy.$$
It takes values in H, so it can be written as
$$\omega(g) = \omega(d)\, d + \omega(z)\, z + \omega(p)\, p + \omega(q)\, q$$
with the dual one-forms ω(X) for X in H. Compute ω(g) and the right Maurer–Cartan one-form ω(g^{-1}). The left Haar measure is the wedge product of the dual one-forms of the left Maurer–Cartan one-form, and the right Haar measure correspondingly. Do these two measures agree?
The following formula might be helpful:
$$e^X Y e^{-X} = \sum_{n=0}^{\infty} \frac{(\mathrm{ad}(X))^n(Y)}{n!} = Y + [X, Y] + \frac{1}{2}[X, [X, Y]] + \frac{1}{6}[X, [X, [X, Y]]] + \ldots.$$
(3) sl(3; R) contains a copy of sl(2; R) as a subalgebra, generated by e_{α_1}, h_{α_1}, f_{α_1}, so sl(3; R) can be viewed as an sl(2; R)-module via the adjoint action. Decompose sl(3; R) into a direct sum of irreducible sl(2; R)-modules under this action.
(4) sl(3; R) contains a copy of sl(2; R) as a subalgebra, generated by e_{α_1+α_2}, h_{α_1} + h_{α_2}, f_{α_1+α_2}, so sl(3; R) can be viewed as an sl(2; R)-module via the adjoint action. Decompose sl(3; R) into a direct sum of irreducible sl(2; R)-modules under this action.
(5) What is the highest-weight of the adjoint representation of sl(3; R)?
(6) Consider the polynomial ring R[η, ν] in two odd variables η, ν. This means η and ν satisfy η^2 = ν^2 = 0 and ην + νη = 0, so that the polynomial ring has real basis 1, η, ν, ην; it is four-dimensional. The derivatives d/dν and d/dη act on the basis vectors as follows:
$$\frac{d}{d\nu}\, 1 = 0, \quad \frac{d}{d\nu}\,\eta = 0, \quad \frac{d}{d\nu}\,\nu = 1, \quad \frac{d}{d\nu}\,\nu\eta = \eta, \quad \frac{d}{d\nu}\,\eta\nu = -\eta,$$
and
$$\frac{d}{d\eta}\, 1 = 0, \quad \frac{d}{d\eta}\,\eta = 1, \quad \frac{d}{d\eta}\,\nu = 0, \quad \frac{d}{d\eta}\,\nu\eta = -\nu, \quad \frac{d}{d\eta}\,\eta\nu = \nu.$$
Further, these operators themselves satisfy the rules
$$\frac{d}{d\eta}\frac{d}{d\eta} = \frac{d}{d\nu}\frac{d}{d\nu} = 0, \qquad \frac{d}{d\eta}\frac{d}{d\nu} + \frac{d}{d\nu}\frac{d}{d\eta} = 0.$$
Show:
• The map ρ : sl(2; R) → End(R[η, ν]) defined by
$$\rho(e) = \eta\,\frac{d}{d\nu}, \qquad \rho(h) = \eta\,\frac{d}{d\eta} - \nu\,\frac{d}{d\nu}, \qquad \rho(f) = \nu\,\frac{d}{d\eta}$$
defines a representation of sl(2; R).


• Decompose R[η, ν] into irreducible representations of sl(2; R).

REFERENCES
[B] V. Bouchard. Ma Ph 451 – Mathematical Methods for Physics I. Winter 2013.
[Bu] D. Bump. Lie Groups. Springer Graduate Texts in Mathematics.
[G] T. Gannon. Moonshine Beyond the Monster. Cambridge University Press.
[H] S. Hassani. Mathematical Physics: A Modern Introduction to Its Foundations.
[Hu] J. E. Humphreys. Introduction to Lie Algebras and Representation Theory. Springer Graduate Texts in Mathematics.
[P] J. Polchinski. String Theory. Cambridge University Press.

(T. Creutzig) 573 CAB, UNIVERSITY OF ALBERTA


E-mail address: creutzig@ualberta.ca
