
18.784: Seminar in Number Theory


Lecturer: Professor Ju-Lee Kim
Notes by: Andrew Lin

Spring 2020

Introduction
This class is a CI-M class, so it will consist primarily of student presentations.
Looking at the syllabus, most of this class is centered around presentation, writing, and participation. The schedule
is already posted on the Stellar website, and there is a Google Sheet where we can sign up for presentation times. Our
first talk is just for feedback – it won’t be graded.
We’ll be starting with chapter 7 of Serre’s “A Course in Arithmetic,” and then we’ll move on to Diamond and
Shurman’s “A First Course in Modular Forms.” We can get both of these texts from the library.

Fact 1
Notes from my own presentations will be copied from my handouts, while notes from others’ are my transcriptions.

1 February 4, 2020
Here are a few points about how to give a good presentation:
• Read and understand (digest) the material we’re being given, and then explain it! The rest of the class will only
see this final “explain” part, but that doesn’t mean the first two parts aren’t important.
• Have a good lecture plan. Decide what to include and what to exclude, keeping in mind that we’re trying to help
our classmates understand the material. Also, include good examples!
• Be ready to respond to questions. We’ll have lots of discussions, and we get to ask many questions.
• Speak clearly and loudly.
• Organize board space well (dividing the board into smaller boards, picking blackboard order, etc.) Make sure the
lecture “flows” well on the board, so people can go back to previous theorem statements, definitions, and so on.
(Write in complete sentences.)
• Label definitions, lemmas, and so on.
• Make handouts for lectures and distribute them beforehand. (This also helps with bad handwriting.) It is still
unclear whether handouts will be graded.
• Interact with the audience – face them from time to time when lecturing, make eye contact, and pause for
questions.

Professor Kim has office hours (see Stellar for timeslots each week), where she can help us with presentation plans
and material. All of us will be given comment forms to fill out for each presenter – we should make sure our comments
are constructive, because they will be given (anonymously) to the presenter.
For our writing project, we can pick our own topic – Professor Kim is very open-minded.
Our main goals in this class are to understand modular forms, elliptic curves, and L-functions. So today, we’ll do
a brief introduction for motivation and to tie together some of these topics.

Example 2
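The Riemann zeta function
$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}$$
is an example of an L-function.

This function is well-defined for all $\operatorname{Re}(s) > 1$, and it has a pole at $s = 1$. We can also write it in the Euler product
form
$$\zeta(s) = \prod_{p} \frac{1}{1 - p^{-s}},$$
where the product runs over all primes $p$.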

ζ(s) also satisfies a functional equation, which gives a meromorphic continuation to all of C. The whole idea is that
this can contain a lot of information!
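
As a quick numerical sanity check of the two expressions for ζ(s), here is a minimal Python sketch that compares a truncated Dirichlet series with a truncated Euler product at s = 2. The function names and truncation limits are our own choices, and the comparison value ζ(2) = π²/6 is a standard fact that is not proved in these notes.

```python
import math

def zeta_partial_sum(s, terms):
    """Truncated Dirichlet series: sum of 1/n^s for n = 1..terms."""
    return sum(1.0 / n**s for n in range(1, terms + 1))

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit**0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [p for p, is_prime in enumerate(sieve) if is_prime]

def zeta_euler_product(s, prime_limit):
    """Truncated Euler product over primes p <= prime_limit."""
    product = 1.0
    for p in primes_up_to(prime_limit):
        product *= 1.0 / (1.0 - p**(-s))
    return product

series_value = zeta_partial_sum(2, 100_000)
product_value = zeta_euler_product(2, 100_000)
exact_value = math.pi**2 / 6   # known closed form for zeta(2), assumed here
print(series_value, product_value, exact_value)
```

Both truncations only make sense for Re(s) > 1, which is the region where the series and product converge.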

Definition 3 (Loose definition)


In general, an L-function takes the form
$$L(s) = \sum_{n=1}^{\infty} \frac{a_n}{n^s},$$
where $a_n \in \mathbb{C}$ and the sequence $\{a_n\}$ contains some arithmetic information.

Often, L-functions can be associated with an elliptic curve L(s, E) or a modular form L(s, f ) – we’ll talk about
this more later.

Example 4 (Dirichlet series)


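Let $N \in \mathbb{Z}_{>0}$, and consider a character (basically a one-dimensional representation) $\chi : (\mathbb{Z}/N\mathbb{Z})^* \to \mathbb{C}^*$.

This can be lifted to $\tilde{\chi} : \mathbb{Z} \to \mathbb{C}$ by defining
$$\tilde{\chi}(a) = \begin{cases} \chi(a \bmod N) & \gcd(a, N) = 1 \\ 0 & \text{otherwise.} \end{cases}$$

We can then define the L-function
$$L(s, \chi) = \sum_{n=1}^{\infty} \frac{\tilde{\chi}(n)}{n^s}.$$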
This actually gives us the following result from chapter 6 of Serre’s book:

Theorem 5 (Dirichlet)
If gcd(a, N) = 1, then there exist infinitely many primes p with p ≡ a mod N.
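
To make the lifted character and its L-function concrete, here is a small Python sketch for the nontrivial character mod 4. The helper names are our own. The comparison value L(1, χ) = π/4 is the classical Leibniz series, a standard fact that is assumed here rather than taken from the notes.

```python
import math

def chi_mod_4(n):
    """Lifted nontrivial character mod 4: 0 if gcd(n, 4) != 1,
    +1 if n = 1 (mod 4), -1 if n = 3 (mod 4)."""
    if math.gcd(n, 4) != 1:
        return 0
    return 1 if n % 4 == 1 else -1

def dirichlet_l_partial_sum(chi, s, terms):
    """Truncated Dirichlet series: sum of chi(n) / n^s for n = 1..terms."""
    return sum(chi(n) / n**s for n in range(1, terms + 1))

approx_l_value = dirichlet_l_partial_sum(chi_mod_4, 1, 200_000)
print(approx_l_value, math.pi / 4)
```

The fact that this value is nonzero is the kind of analytic input that goes into Dirichlet's theorem.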

Definition 6
An elliptic curve is a smooth projective algebraic curve of genus 1 with a distinguished point.

Rational elliptic curves can be written in the form (after transformations)
$$y^2 = x^3 + ax + b, \quad a, b \in \mathbb{Z},$$
where the discriminant $4a^3 + 27b^2$ is nonzero. This is an example of a Diophantine equation, which people have been
interested in trying to solve for integer solutions:

Example 7
Are there three consecutive integers whose product is a square?

(This can be rewritten as finding integral solutions to $y^2 = x^3 - x$.)

Example 8
Similarly, what positive integers are congruent (which means they’re a possible area of a right triangle with rational
side lengths)?

This can actually be rephrased in terms of elliptic curves as well, after many changes of variables:

Proposition 9
An integer $n \in \mathbb{N}$ is congruent if and only if $y^2 = x^3 - n^2 x$ has rational solutions with $y \neq 0$.

For example, for $n = 1$, we have the elliptic curve $y^2 = x^3 - x$, whose only rational solutions are $(0, 0), (1, 0), (-1, 0)$.
So 1 is not a congruent number.
We can be more precise with this problem, though. Let $C_n$ be the set of triples $(a, b, c) \in \mathbb{Q}^3$ corresponding
to right triangles with area $n$ (so $a^2 + b^2 = c^2$ and $\frac{ab}{2} = n$), and let $E_n$ be the solutions $(x, y) \in \mathbb{Q}^2$ with $y \neq 0$
to the elliptic curve $y^2 = x^3 - n^2 x$. Then there’s a direct bijection between the two sets:
$$(a, b, c) \mapsto \left( \frac{nb}{c - a}, \frac{2n^2}{c - a} \right), \qquad (x, y) \mapsto \left( \frac{x^2 - n^2}{y}, \frac{2nx}{y}, \frac{x^2 + n^2}{y} \right).$$
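
The two maps between right triangles of area n and points on y² = x³ − n²x can be checked with exact rational arithmetic. The following Python sketch uses function names of our own choosing and tries the maps on the (3, 4, 5) triangle (area 6) and on a rational triangle of area 5.

```python
from fractions import Fraction

def triangle_to_point(a, b, c, n):
    """Send a right triangle (a, b, c) of area n to (x, y) = (nb/(c-a), 2n^2/(c-a))."""
    a, b, c, n = map(Fraction, (a, b, c, n))
    return n * b / (c - a), 2 * n**2 / (c - a)

def point_to_triangle(x, y, n):
    """Send a point (x, y) with y != 0 to the triangle ((x^2-n^2)/y, 2nx/y, (x^2+n^2)/y)."""
    x, y, n = map(Fraction, (x, y, n))
    return (x**2 - n**2) / y, 2 * n * x / y, (x**2 + n**2) / y

def on_curve(x, y, n):
    """Check whether (x, y) satisfies y^2 = x^3 - n^2 x."""
    return y**2 == x**3 - n**2 * x

# The (3, 4, 5) triangle has area 6.
x6, y6 = triangle_to_point(3, 4, 5, 6)

# A rational right triangle of area 5: sides 3/2, 20/3, 41/6.
x5, y5 = triangle_to_point(Fraction(3, 2), Fraction(20, 3), Fraction(41, 6), 5)
print((x6, y6), (x5, y5))
```

Using `Fraction` instead of floating point means the curve equation is tested exactly, not just approximately.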

Elliptic curves can also be used to prove Fermat’s last theorem – specifically, the relevant objects are called Frey
curves. For any odd prime $p$, the idea is to consider
$$y^2 = x(x - a^p)(x + b^p)$$
if $a^p + b^p = c^p$. (Then the discriminant is divisible by $a^p, b^p, c^p$.) Such a Frey curve cannot be modular, but it turns out all
rational elliptic curves are modular! So this is a contradiction, but it takes many pages to prove this.
So how do we construct L-functions associated to an elliptic curve $E$? We start with our basic form $\sum_{n=1}^{\infty} \frac{a_n}{n^s}$, and
we specify our coefficients: $a_p$ for prime $p$ is equal to $p + 1$ minus the number of rational points of $E$ over $\mathbb{F}_p$, and
similarly $a_{p^n}$ is $p^n + 1$ minus the number of rational points of $E$ over $\mathbb{F}_{p^n}$. And then we can define $a_{mn} = a_m \cdot a_n$ for
relatively prime $m, n$, and this allows us to determine all coefficients.
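
For prime p, the coefficient a_p can be computed by brute-force point counting. The Python sketch below does this for curves y² = x³ + ax + b, under the assumption that "the number of rational points" counts the point at infinity as well as the affine solutions. The function names are ours.

```python
def count_points(a, b, p):
    """Number of points on y^2 = x^3 + a*x + b over F_p (p an odd prime),
    including the single point at infinity."""
    # square_roots[r] = number of y in F_p with y^2 = r
    square_roots = [0] * p
    for y in range(p):
        square_roots[y * y % p] += 1
    affine = sum(square_roots[(x**3 + a * x + b) % p] for x in range(p))
    return affine + 1

def a_p(a, b, p):
    """Coefficient a_p = p + 1 - #E(F_p)."""
    return p + 1 - count_points(a, b, p)

# Coefficients for the curve y^2 = x^3 - x at a few odd primes.
coefficients = {p: a_p(-1, 0, p) for p in (3, 5, 7, 11, 13)}
print(coefficients)
```

In these examples a_p² ≤ 4p, which is why the normalized coefficients in the next remark land in [−1, 1].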
Remark 10. If we’re interested in statistics or probability, we can consider the distribution of the normalized coefficients $\frac{a_p}{2\sqrt{p}}$
(which are contained in $[-1, 1]$). This is called the Sato-Tate distribution.

Definition 11
The upper half-plane is defined as $H = \{z \in \mathbb{C} \mid \operatorname{Im}(z) > 0\}$, and the special linear group $SL_2(\mathbb{R})$ is defined as
the set of matrices $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ with real coefficients and determinant 1.

$SL_2(\mathbb{R})$ acts on the upper half-plane by taking a complex number $z$ and sending it to $gz = \frac{az+b}{cz+d}$.

Definition 12
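A function $f : H \to \mathbb{C}$ is a modular function of weight $k \in 2\mathbb{N}$ if
$$f(gz) = (cz + d)^k f(z)$$
for all $g = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in SL_2(\mathbb{Z})$.

If $f$ is also holomorphic on $H$ and at infinity, then $f$ is a modular form.

Let’s think about $H/SL_2(\mathbb{Z})$: the group $SL_2(\mathbb{Z})$ is generated by the two elements $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ and $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$, and this gives us a
fundamental domain. And if we take the closure, this gives us a modular curve, which is closely related to elliptic
curves.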

Example 13 (Quadratic forms)


Given an $n \in \mathbb{N}$, can we write $n$ as a sum of $k$ square numbers $a_1^2, \cdots, a_k^2$? Denote the number of ways to do this
$S_k(n)$.

For example, $S_2(5) = 8$ (we can use positive and negative numbers, and we can swap their order). It turns out
there’s a closed form (valid for odd $n$, where $\left( \frac{-1}{\cdot} \right)$ is the Jacobi symbol)
$$S_2(n) = 2 \left( 1 + \left( \frac{-1}{n} \right) \right) \sum_{d \mid n} \left( \frac{-1}{d} \right),$$
but how are we supposed to relate this to modular forms? Well, consider the theta function
$$\Theta(z) = \sum_{j \in \mathbb{Z}} e^{2\pi i z \cdot j^2} = \sum_{j \in \mathbb{Z}} q^{j^2}$$
(where we denote $q = e^{2\pi i z}$). Then this is a generating function, and the $q^n$ coefficient $c_n$ of $\Theta^k$ is the number of
ways to write $n$ as the sum of $k$ squares. And it turns out that $\Theta^k$ is actually a modular form – just not as defined
above. (We’ll be more precise in the future.) It has weight $\frac{k}{2}$, which can be pretty interesting to study as well.
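
The generating-function claim is easy to test by machine: we expand Θ^k as a truncated power series in q and compare its coefficients with a brute-force count. This is a minimal Python sketch with function names of our own; it only works with the formal series, not with Θ as a function of z.

```python
def theta_power_coefficients(k, max_n):
    """Coefficients c_0..c_max_n of Theta^k, where Theta = sum over j in Z of q^(j^2)."""
    theta = [0] * (max_n + 1)
    j = 0
    while j * j <= max_n:
        theta[j * j] += 1 if j == 0 else 2   # j and -j contribute the same power of q
        j += 1
    result = [1] + [0] * max_n               # start from the constant series 1
    for _ in range(k):
        product = [0] * (max_n + 1)
        for i, ci in enumerate(result):
            if ci == 0:
                continue
            for m, tm in enumerate(theta):
                if tm and i + m <= max_n:
                    product[i + m] += ci * tm
        result = product
    return result

def sums_of_two_squares(n):
    """Brute-force count of ordered pairs (a, b) of integers with a^2 + b^2 = n."""
    bound = int(n**0.5) + 1
    return sum(1 for a in range(-bound, bound + 1)
                 for b in range(-bound, bound + 1) if a * a + b * b == n)

theta_squared = theta_power_coefficients(2, 30)
print(theta_squared[5])
```

The same function with k = 4 or k = 8 gives the counts for sums of four or eight squares.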
And finally, how do we construct L-functions associated to a modular form? For any modular form, $f(z+1) = f(z)$
(because $z + 1$ is the action of $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ on $z$). So that means we can write $f$ as a Fourier series
$$f(z) = \sum_{n=0}^{\infty} a_n e^{2\pi i n z},$$
and the Fourier coefficients $a_n$ will go into the L-function (except tossing the constant term). This turns out to also
have nice properties – this has something to do with “eigenvalues of the Hecke operators.” We’ll see how everything
is connected in the next few months!

2 February 6, 2020

Serre 7.1 – Dhruv Rohatgi


We’ll start by talking about the modular group, its action on H, and its fundamental domain.

Definition 14
The upper half-plane is defined to be H = {z ∈ C : Im z > 0}.

Definition 15
Given any ring $R$, we can define $SL_2(R)$ to be the multiplicative group of $2 \times 2$ matrices $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ with $a, b, c, d \in R$
and $ad - bc = 1$. (We’ll be using $R = \mathbb{Z}$ here.)

We can then also define an action of $SL_2(\mathbb{Z})$ on the upper half-plane, as a map from $SL_2(\mathbb{Z}) \times H$ to $H$, via
$$gz = \varphi(g, z) = \frac{az + b}{cz + d}, \qquad g = \begin{bmatrix} a & b \\ c & d \end{bmatrix}.$$

We need to check that this is a group action first – we just check that $g(hz) = (gh)z$, which we can do with direct
calculation. To check that $gz$ is in $H$ if $z$ is in $H$, notice that
$$\operatorname{Im}(gz) = \operatorname{Im}\left( \frac{az + b}{cz + d} \right) = \frac{\operatorname{Im}\left( (az + b)(c\bar{z} + d) \right)}{|cz + d|^2}.$$

Expanding the numerator gives $(ad - bc) \operatorname{Im} z = \operatorname{Im} z$ (since $ad - bc = 1$), and then we’re dividing by something
positive. So this means that $gz$ lies in the upper half-plane.
What can we say about this group action? Notice that $-I$ acts trivially on the upper half-plane (because we have
$-z$ divided by $-1$).

Definition 16
The projective special linear group $G = PSL_2(\mathbb{Z})$ is defined as $SL_2(\mathbb{Z})/\{\pm I\}$. This is also called the modular
group.

We can consider the induced group action of this modular group on $H$, and a useful thing to have is a set of
generators. If we define
$$S = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}, \qquad T = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix},$$
then $Sz = -\frac{1}{z}$ and $Tz = z + 1$.

Theorem 17
S and T generate G.

We can prove this algebraically, but a more geometric approach works well here. Define the region
$$D = \left\{ z \in H : -\tfrac{1}{2} \leq \operatorname{Re} z \leq \tfrac{1}{2}, \ |z| \geq 1 \right\}.$$

We’ll show that, apart from identifications on its boundary, this domain contains exactly one representative from each orbit.

Theorem 18
We have the following facts:
1. There’s a surjective map from $D \to H/G$. Moreover, for all $z \in H$, there exists a $g \in \langle S, T \rangle$ such that
$gz \in D$.

2. This map is also “mostly” injective aside from “boundary cases:” if $z, z' \in D$, $z \neq z'$, but $z' = gz$ for some
$g \in G$, then $\operatorname{Re}(z) = \pm \frac{1}{2}$ and $g = T^{\pm 1}$, or $|z| = 1$ and $g = S$.

3. Finally, every $z \in D$ has trivial stabilizer except for $i$, $\rho$, $-\bar{\rho}$, where $\rho = -\frac{1}{2} + \frac{\sqrt{3}}{2} i$.

Why does this imply that $S$ and $T$ generate $G$? Let $z$ be in the interior of $D$, and let $g \in G$. By (1), there is
some $h \in \langle S, T \rangle$ such that $h(gz) \in D$, and by (2) this means $hgz = z$. That means by (3) that $hg = I$, which means
$g = h^{-1} \in \langle S, T \rangle$.

Proof. We’ll first prove surjectivity: for any $z \in H$, there exists some $n$ such that $T^n z$ has real part between $-\frac{1}{2}$ and
$\frac{1}{2}$. If this point has magnitude 1 or larger, we’re done. Otherwise, apply $S$, which increases the imaginary part,
and then apply more $T$s to fix the real part again. To prove this terminates, note that every time we apply an $S$, we extend a
strictly increasing sequence of imaginary parts $\operatorname{Im}(z) < \operatorname{Im}(z_2) < \cdots$, where $z_2 = ST^n z$ and so on. But the set
$$\{ \operatorname{Im}(gz) : g \in G \wedge \operatorname{Im}(gz) > \operatorname{Im}(z) \}$$
is finite, because $\operatorname{Im}(gz) = \frac{\operatorname{Im}(z)}{|cz+d|^2}$ can only be at least $\operatorname{Im}(z)$ if $|cz + d| \leq 1$, which can happen only for a finite number
of $(c, d)$.
To show the other parts, take $z, z' \in D$ so that $z' = gz$, $g \in G$. Without loss of generality, say that $\operatorname{Im}(z') \geq \operatorname{Im}(z)$,
which tells us that $|cz + d| \leq 1$ (just like the above part). Since we’re in the fundamental domain $D$, we have
$\operatorname{Im}(z) \geq \frac{\sqrt{3}}{2}$ and $|cz + d| \geq |c| \operatorname{Im}(z)$, which forces $|c| \leq 1$ and leaves a finite number of cases to check.
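
The surjectivity argument is really an algorithm: shift by a power of T until the real part is in range, apply S if the point is inside the unit circle, and repeat. Here is a minimal floating-point sketch in Python. The function name, the tolerance `eps`, and the step limit are our own choices, and the floating-point arithmetic means boundary cases are only handled approximately.

```python
import math

def reduce_to_fundamental_domain(z, max_steps=10_000, eps=1e-12):
    """Move z (with Im z > 0) into D using only S: z -> -1/z and T: z -> z + 1.
    Returns the reduced point and the list of moves, in the order applied."""
    if z.imag <= 0:
        raise ValueError("z must lie in the upper half-plane")
    moves = []
    for _ in range(max_steps):
        shift = -math.floor(z.real + 0.5)      # power of T putting Re z in [-1/2, 1/2)
        if shift != 0:
            z = z + shift
            moves.append(f"T^{shift}")
        if abs(z) >= 1 - eps:                  # already in D, up to rounding
            return z, moves
        z = -1 / z                             # apply S, which raises Im z when |z| < 1
        moves.append("S")
    raise RuntimeError("did not terminate; z may be too close to the real axis")

reduced, moves = reduce_to_fundamental_domain(complex(0.37, 0.02))

# A point of D and its image under g = [[2, 1], [1, 1]] should reduce to the same point.
z0 = complex(0.1, 2.0)
gz0 = (2 * z0 + 1) / (z0 + 1)
reduced_gz0, _ = reduce_to_fundamental_domain(gz0)
print(reduced, moves, reduced_gz0)
```

Since every move is S or a power of T, the returned list is also a word in the generators that realizes the element g from part (1) of the theorem.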

Serre 7.2.1 – Vanshika Jain

Definition 19
A function $f$ is weakly modular of weight $2k$, $k \in \mathbb{Z}$, if $f$ is meromorphic (analytic everywhere except at poles)
on $H$ and satisfies
$$f(z) = (cz + d)^{-2k} f\left( \frac{az + b}{cz + d} \right)$$
for all $\begin{bmatrix} a & b \\ c & d \end{bmatrix} \in SL_2(\mathbb{Z})$.

Using the quotient rule (and the fact that $ad - bc = 1$), we find that
$$\frac{d(gz)}{dz} = \frac{1}{(cz + d)^2},$$
so the main relation of our definition can be written as
$$f(gz) \, d(gz)^k = f(z) \, dz^k.$$

One way to interpret this is to think of the differential form $f(z) \, dz^k$ as being invariant under action by $G$. Since $G$ is
generated by $S$ and $T$, it suffices to check invariance under $S$ and $T$ to check the weakly modular condition.

Proposition 20
Let $f$ be a meromorphic function on $H$. Then $f$ is weakly modular of weight $2k$ if and only if
$$f(z + 1) = f(z), \qquad f\left( -\frac{1}{z} \right) = z^{2k} f(z)$$
(corresponding to action by $T$ and $S$, respectively).

Both of these just come from plugging in the matrices for $S$ and $T$ into the weakly modular condition. This tells us
that $f$ is periodic, so we can apply the change of variables $q = e^{2\pi i z}$ to get a new function $\tilde{f}$ which is meromorphic
on the unit disk with 0 removed. Specifically, we have
$$\tilde{f}(q) = \sum_{n=-\infty}^{\infty} a_n q^n, \qquad f(z) = \sum_{n=-\infty}^{\infty} a_n e^{2\pi i n z}.$$

Notice that as $z \to i\infty$, $q \to 0$.

Definition 21
If $\tilde{f}$ extends to a meromorphic (resp: holomorphic) function at the origin, then $f$ is meromorphic (resp: holomorphic)
at $\infty$.

In these cases, the infinite sum $\sum a_n q^n$ only needs to sum from $-m$ to $\infty$ and $0$ to $\infty$, respectively (corresponding
to a pole of order $m$ and a holomorphic function, respectively).

Definition 22
A weakly modular function is modular if it is meromorphic at $\infty$ (that is, $\tilde{f}$ has at most a pole at 0). If $f$ is
holomorphic at $\infty$, we can also define the value of $f$ at infinity via $f(\infty) = \tilde{f}(0)$.

So modular functions are analytic on $H$ except at poles, invariant under transformations of $G$ (up to some scaling
factors), and $\tilde{f}$’s Laurent expansion has at most a finite-order pole at 0.

Definition 23
A modular form is a modular function that is holomorphic everywhere (including ∞). A modular form which has
value 0 at infinity is called a cusp form.

So to recap, a modular form of weight $2k$ can be written in the form
$$f(z) = \sum_{n=0}^{\infty} a_n q^n = \sum_{n=0}^{\infty} a_n e^{2\pi i n z},$$
converging in the unit disk $|q| < 1$, satisfying $f\left( -\frac{1}{z} \right) = z^{2k} f(z)$ – it’s a cusp form if $a_0 = 0$.

Example 24
If we’re given two modular forms f , f ′ with weight 2k, 2k ′ , then f f ′ is a modular form of weight 2k + 2k ′ – we
can check the two equations (action under T and S) both hold.

Example 25
The function
$$q \prod_{n=1}^{\infty} (1 - q^n)^{24} = q - 24q^2 + \cdots$$
is a cusp form of weight 12, and it will come up later on.
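
The first few coefficients of this product can be expanded by hand, but it is quicker to let a program multiply out the truncated product. A minimal Python sketch follows; the function name is our own, and only formal power series arithmetic with integers is used.

```python
def delta_coefficients(max_n):
    """Coefficients of q^1, ..., q^max_n in q * prod_{n>=1} (1 - q^n)^24."""
    # series[d] is the coefficient of q^d in the product, for d = 0..max_n-1
    series = [1] + [0] * (max_n - 1)
    for n in range(1, max_n):          # factors with n >= max_n do not affect these degrees
        for _ in range(24):
            # multiply by (1 - q^n) in place, from high degree down to low degree
            for d in range(max_n - 1, n - 1, -1):
                series[d] -= series[d - n]
    return series                      # the extra factor q shifts the index by one

first_coefficients = delta_coefficients(6)
print(first_coefficients)
```

The first two entries reproduce the q − 24q² stated in the example.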

Serre 7.2.2 - Swapnil Garg

Definition 26
A lattice $\Gamma$ of a finite-dimensional real vector space $V$ (of dimension $n$) is a subgroup of $V$ isomorphic to $\mathbb{Z}^n$. $\Gamma$ must
also span $V$, which means that it contains a basis of $V$.

Example 27 (Non-example)
Take the set of points $(a + b\sqrt{2}, 0)$ with $a, b \in \mathbb{Z}$ (integer combinations of $1$ and $\sqrt{2}$). The result is isomorphic to $\mathbb{Z}^2$ but
does not span $\mathbb{R}^2$, so it is not a lattice.

If we look at lattices in R2 , we can generate any lattice with two basis vectors (w1 , w2 ). To avoid double-counting,
we’ll be a bit more precise:

Definition 28
Let $R$ be the set of lattices in $\mathbb{C}$ (viewed as $\mathbb{R}^2$), and let $M$ be the set of ordered pairs $(w_1, w_2)$ with
$w_1, w_2 \in \mathbb{C} \setminus \{0\}$ and $\frac{w_1}{w_2} \in H$.

Observe that all lattices can then be generated by (w1 , w2 ), so we have a surjective map from M to R. A natural
next question to ask is when two ordered pairs generate the same lattice – this is where SL2 (Z) comes in. First, we’ll
need a lemma:

Lemma 29
Suppose that $v_1 = g v_2$, where $v_1 = (w_1, w_2)$, $v_2 = (w_1', w_2')$ are two-dimensional (complex) vectors and $g$ is a real
$2 \times 2$ matrix with nonzero determinant. Then the imaginary parts of $z = \frac{w_1}{w_2}$ and $z' = \frac{w_1'}{w_2'}$ have the same sign if and only if $g$ has
positive determinant.

(This was basically proven by Dhruv’s part of the lecture.)

Proposition 30
v1 , v2 ∈ M map to the same lattice in R if and only if v1 = gv2 for some g ∈ SL2 (Z).

Proof. The backwards direction is clear, because $v_1 = g v_2$ is a lattice transformation – we map our basis vectors to
other vectors in our lattice, and we have the same unit cell size.
The forwards direction is very similar: if $v_1 = (w_1, w_2)$ and $v_2 = (w_1', w_2')$ generate the same lattice, then
$w_1 = a w_1' + b w_2'$ and $w_2 = c w_1' + d w_2'$ for some integers $a, b, c, d$.
This means that $v_1 = g v_2$ with $g = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, and having the same unit cell size requires the determinant to be $\pm 1$. By the above
lemma, we know that the determinant of $g$ is positive, so $g \in SL_2(\mathbb{Z})$.

This means that $R$ is in bijection with $M/SL_2(\mathbb{Z})$. Now we also want to mod out by the action of $\mathbb{C}^*$ by scalar
multiplication (which scales and rotates our lattices). If we consider the map from $M$ to $H$ sending $(w_1, w_2)$
to $\frac{w_1}{w_2}$, notice that $(w_1, w_2)$ maps to the same point as $(\lambda w_1, \lambda w_2)$. So $H$ is in bijection with $M/\mathbb{C}^*$, meaning pairs of
basis vectors mod $\mathbb{C}^*$ are identified with the upper half-plane. And if we mod this out by $SL_2(\mathbb{Z})$, we find that $R/\mathbb{C}^*$
is in bijection with $H/G$, where $G$ is the modular group defined earlier. (It’s okay to mod out by $G$ rather than $SL_2(\mathbb{Z})$
because $\pm I$ don’t do anything.) And this is why we introduce lattices – $R/\mathbb{C}^*$ is very closely related to $H/G$, which is
closely related to the fundamental domain $D$.

Definition 31
A lattice function $F : R \to \mathbb{C}$ has weight $2k$ if $F(\lambda \Gamma) = \lambda^{-2k} F(\Gamma)$ for all $\lambda \in \mathbb{C}^*$, $\Gamma \in R$.

We can think of $F$ as acting on the basis vectors instead of the whole lattice – this gives us a map $F : M \to \mathbb{C}$,
where $F(w_1, w_2)$ is $F$ of the lattice generated by $(w_1, w_2)$. The lattice function condition then becomes
$$F(\lambda w_1, \lambda w_2) = \lambda^{-2k} F(w_1, w_2),$$
so $w_2^{2k} F(w_1, w_2)$ is invariant under scaling and depends only on $\frac{w_1}{w_2}$. This motivates us to define
$$f\left( \frac{w_1}{w_2} \right) = w_2^{2k} F(w_1, w_2).$$

Also, because $F$ is invariant under $SL_2(\mathbb{Z})$, we know that
$$f(z) = F(z, 1) = F(az + b, cz + d) = (cz + d)^{-2k} f\left( \frac{az + b}{cz + d} \right),$$
which is exactly the modular function condition. So adding a few extra conditions means that $f$ can be a modular
function!

3 February 11, 2020

Serre 7.2.3 – Christian Altamirano


Last week, we defined modular functions, and we’ll be presenting some examples today. We’ll start with a lemma:

Lemma 32
Let $\Gamma$ be a lattice in $\mathbb{C}$. Then $\sum'_{\gamma \in \Gamma} \frac{1}{|\gamma|^{\sigma}}$ is convergent for $\sigma > 2$, where $\sum'$ denotes a sum over all nonzero
elements of the lattice.

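Proof. There are two ideas we can use here. First, we can majorize the series by (a constant times) the double
integral
$$\iint \frac{dx \, dy}{(x^2 + y^2)^{\sigma/2}}$$
(taken over the region outside a small disk around the origin) by writing out the double integral as a Riemann sum.
Then the double integral is easy to evaluate by using polar coordinates.
Another idea is to bound the number of points in the lattice with $|\gamma|$ between $n$ and $n + 1$ – there are $O(n)$ of these by an
area argument because $O([n + 1]^2 - n^2) = O(n)$, so the infinite sum for the lattice is convergent whenever $\sum \frac{1}{n^{\sigma - 1}}$ is
convergent, because (up to the finitely many points with $|\gamma| < 1$)
$$\sum'_{\gamma \in \Gamma} \frac{1}{|\gamma|^{\sigma}} \leq \sum_{n=1}^{\infty} O(n) \cdot \frac{1}{n^{\sigma}} = K \sum_{n=1}^{\infty} \frac{1}{n^{\sigma - 1}}.$$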

With this, we can construct an example:

Example 33
Let $k > 1$ be an integer and let $\Gamma$ be a lattice. Then define a function on lattices
$$G_k(\Gamma) = \sum'_{\gamma \in \Gamma} \frac{1}{\gamma^{2k}}.$$

From the above lemma, this is an absolutely convergent sum – this is called the Eisenstein series of index $k$. We
know that our lattice $\Gamma$ can be defined by two complex numbers $w_1, w_2$, and any point can be written as $m w_1 + n w_2$
(an integer linear combination), so we can instead define
$$G_k(w_1, w_2) = \sum'_{(m,n)} \frac{1}{(m w_1 + n w_2)^{2k}},$$
where the sum runs over all pairs of integers $(m, n) \neq (0, 0)$.

Since the “shape” of the lattice depends only on $z = \frac{w_1}{w_2}$, we can define
$$G_k(z) = G_k(z, 1) = \sum'_{(m,n)} \frac{1}{(mz + n)^{2k}}.$$

Proposition 34
$G_k(z)$ is a modular form of weight $2k$ for all integers $k > 1$, and we have $G_k(\infty) = 2\zeta(2k)$.

Proof. First, we show that $G_k(z)$ is weakly modular (of weight $2k$). We can show this by looking at the two transformations
under $T$ and $S$: recall that $f$ is weakly modular of weight $2k$ if it satisfies
$$f(z + 1) = f(z), \qquad f\left( -\frac{1}{z} \right) = z^{2k} f(z),$$
and we can check that both of these equations hold (each substitution just permutes the pairs $(m, n)$ in the sum).

The next step is to show $G_k(z)$ is holomorphic everywhere (including at $\infty$). Let $z \in D$ (the fundamental domain
of the modular group) – note that
$$|mz + n|^2 \geq m^2 - mn + n^2 = |m\rho - n|^2,$$
where $\rho = e^{2\pi i/3}$, by a direct calculation, and thus
$$|G_k(z)| \leq \sum'_{(m,n)} \frac{1}{|m\rho - n|^{2k}}$$
is convergent by Lemma 32 – in fact, the series converges normally on $D$. Now, given any $g \in G = \langle S, T \rangle$, applying the same
argument to $G_k(g^{-1} z)$ shows that the series also converges normally on the translate $gD$, and these translates cover $H$. Thus $G_k$ is holomorphic in the
upper half-plane.

Finally, we need to show that $G_k$ is holomorphic at $\infty$, which is the same as showing that $G_k(z)$ has a limit as
$z \to \infty$. We can assume that $z$ is in the fundamental domain (meaning we take $z \to i\infty$), and then
$$\lim_{z \to i\infty} \sum'_{(m,n)} \frac{1}{(mz + n)^{2k}} = \sum'_{n \in \mathbb{Z}} \frac{1}{n^{2k}} = 2\zeta(2k)$$
because any term with $m \neq 0$ will disappear (the denominator becomes infinite), and normal convergence lets us take the limit term by term.

The main idea of normal convergence is that a series of functions
$$f = \sum_n f_n$$
converges normally if $\sum_n \sup_z |f_n(z)|$ converges. Basically, it’s a very strong condition – it implies that the series
converges absolutely and uniformly, and in particular at every point.
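
Both claims about G_k can be probed numerically with a truncated sum. In the Python sketch below (function name and truncation bounds are our own), we sum over the square |m|, |n| ≤ bound. That square is symmetric under (m, n) ↦ (n, −m), so the S-relation holds for the truncated sum up to rounding, while the value at a point high in the upper half-plane is only approximately 2ζ(2k). The closed form ζ(4) = π⁴/90 is a standard fact assumed here.

```python
import math

def eisenstein_truncated(k, z, bound):
    """Sum of 1/(m z + n)^(2k) over integer pairs (m, n) != (0, 0) with |m|, |n| <= bound."""
    total = 0j
    for m in range(-bound, bound + 1):
        for n in range(-bound, bound + 1):
            if m == 0 and n == 0:
                continue
            total += 1 / (m * z + n) ** (2 * k)
    return total

# S-relation G_k(-1/z) = z^(2k) G_k(z), tested for k = 2 at a sample point.
z = complex(0.3, 1.2)
lhs = eisenstein_truncated(2, -1 / z, 40)
rhs = z**4 * eisenstein_truncated(2, z, 40)

# Behaviour near infinity: G_2(10i) should be close to 2 * zeta(4) = pi^4 / 45.
near_infinity = eisenstein_truncated(2, 10j, 100)
limit_value = math.pi**4 / 45
print(abs(lhs - rhs), near_infinity, limit_value)
```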

Serre 7.3.1 – Anton Trygub

Definition 35
Let $f$ be a meromorphic function on $H$, not identically zero. Then the order of $f$ at $p$ is the integer $n$ so that $\frac{f}{(z-p)^n}$ is holomorphic
and nonzero at $p$. If $n$ is positive, then $p$ is called a zero of $f$, and if $n$ is negative, then $p$ is called a
pole of $f$. Denote this order $\nu_p(f)$.

We know that by definition, a modular function $f$ satisfies
$$f(z) = (cz + d)^{-2k} f\left( \frac{az + b}{cz + d} \right)$$
for all $\begin{bmatrix} a & b \\ c & d \end{bmatrix} \in SL_2(\mathbb{Z})$. This tells us that
$$\nu_z(f) = \nu_{\frac{az+b}{cz+d}}(f);$$
in other words, we have $\nu_p(f) = \nu_{gp}(f)$ for any $p \in H$ and $g \in G$. So this function $\nu$ is constant on every orbit of $p$ –
it only depends on the image of $p$ in $H/G$.
For convenience, we’ll also define the order $\nu_\infty(f)$ to be the order at $q = 0$ of $\tilde{f}(q)$, where the function $\tilde{f}$ is defined
so that $q = e^{2\pi i z}$ and $\tilde{f}(q) = f(z)$.

Claim 36. Let $f$ be a modular function of weight $2k$. Then $f$ has only finitely many zeros and poles in the fundamental
domain $D = \{ |\operatorname{Re}(z)| \leq \frac{1}{2}, |z| \geq 1, \operatorname{Im}(z) > 0 \}$.

Proof. Since $\tilde{f}$ is meromorphic at $q = 0$, and zeros and poles are isolated for meromorphic functions, there exists a
punctured neighborhood of 0 where $\tilde{f}$ has no zeros and poles – specifically, there is an $r > 0$ such that $\tilde{f}$ has no zeros or poles for all
$0 < |q| < r$. Because $\tilde{f}(e^{2\pi i z}) = f(z)$, this means that $f(z)$ does not have any zeros or poles for $\operatorname{Im}(z) > \frac{\log(1/r)}{2\pi}$. So
any zeros or poles of $f$ in the fundamental domain are in the compact region
$$D_r = \left\{ x \in D : \operatorname{Im}(x) \leq \frac{\log(1/r)}{2\pi} \right\}$$
and there can only be a finite number of zeros or poles here, as desired.

Definition 37
Let ep be the order of the stabilizer of the point p under G.

From the first theorem of the chapter (where we said that we have an injective function except with a few
exceptions), we know that
$$e_p = \begin{cases} 2 & p = g(i) \\ 3 & p = g(\rho) \\ 1 & \text{otherwise} \end{cases}$$
(in each case, for some $g \in G$).

Theorem 38
Let $f$ be a modular function of weight $2k$, not identically zero. Then
$$\nu_\infty(f) + \sum_{p \in H/G} \frac{1}{e_p} \nu_p(f) = \frac{k}{6}.$$

Our previous claim tells us that there are only finitely many points $p \in H/G$ with order different from zero, so this
sum makes sense. We can also rewrite this with our calculation of $e_p$:
$$\nu_\infty(f) + \frac{1}{2} \nu_i(f) + \frac{1}{3} \nu_\rho(f) + \sum_{\substack{p \in H/G \\ p \neq i, \rho}} \nu_p(f) = \frac{k}{6}.$$

Serre 7.3.1 continued – Zack Chroman


We’ll now prove the above theorem. We’ll denote the sum over points $p$ of the domain $D$ which are not $i$ or $\rho$ with the
symbol $\sum^*$.
The big theorem we’ll need to show this is the following:

Theorem 39
If $f$ is meromorphic on a closed domain $A$ with no poles on the boundary $\partial A$, then
$$\frac{1}{2\pi i} \int_{\partial A} f(z) \, dz = \sum_{p \in A} \operatorname{Res}_p(f),$$
where the residue $\operatorname{Res}_p(f)$ is defined to be the coefficient $a_{-1}$ in the Laurent series expansion
$$f(z) = \sum_{n=-\infty}^{\infty} a_n (z - p)^n.$$

If we apply the residue theorem to the function $\frac{f'}{f}$, we get the following nice corollary:

Corollary 40 (Argument principle)


We have
$$\frac{1}{2\pi i} \int_{\partial A} \frac{f'}{f} \, dz = \sum_{p \in A} \nu_p(f).$$

We’ll skip the proof of this for now – it is an exercise in expanding out the Laurent series.
To prove the theorem we want, we will use the residue theorem to evaluate a certain integral: we take the fundamental
domain $D$, but we avoid a neighborhood of the points $\rho$, $i$, and $\rho + 1$ (the contour, with labeled points $A, B, B', C, C', D, D', E$, is pictured in Serre).

Here, we’re picking $A$ and $E$ to be high enough so that all of the zeros and poles in the fundamental domain are
below line $AE$, and we’re also making the radius of the curves from $B$ to $B'$, $C$ to $C'$, and $D$ to $D'$ go to 0.
We assume that there are no zeros or poles along the parts from $A$ to $B$ or $E$ to $D'$ – if there are, then we just avoid those
by drawing an identical curve on the left and right.
Call our curve $C$. By the residue theorem, we know that
$$\int_C \frac{f'}{f} \, dz = 2\pi i \sum^* \nu_p(f),$$
but we can also break this up along each of the individual parts. The integrals along $A$ to $B$ and $D'$ to $E$ are identical
by periodicity, but we’re going in opposite directions, so they cancel out.
Now, under the transformation $z \mapsto e^{2\pi i z}$, the line $EA$ gets mapped to a circle $\omega$ around 0, and thus
$$\int_E^A \frac{f'}{f} \, dz = \int_\omega \frac{d\tilde{f}}{\tilde{f}} = -2\pi i \, \nu_\infty(f).$$

(The negative sign comes from the orientation of the circle.) The rest of the curves, like from B to B ′ , are all partial
circles. To evaluate these, we need a lemma:

Lemma 41
Integrating $f$ along a circular arc of angle $\alpha$ centered at a simple pole $p$ yields $\operatorname{Res}_p(f) \cdot \alpha i$ in the limit as the radius goes to 0.

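The idea is that the integral all the way around the pole is $2\pi i \operatorname{Res}_p(f)$, so it’s just proportional to the arc length.
Proof. First of all, we can ignore all Laurent series terms except the $\frac{a_{-1}}{z - p}$ term (the contribution of everything else goes to 0
as the radius shrinks). Thus, we want the integral
$$\int_{\text{arc}} \frac{a_{-1}}{z - p} \, dz.$$
Substituting $z = p + r e^{2\pi i \theta}$, where $r$ is the radius and $\theta$ runs from 0 to $x = \frac{\alpha}{2\pi}$, we have $dz = 2\pi i \, r e^{2\pi i \theta} \, d\theta$ and thus our integral becomes
$$\int_0^x \frac{a_{-1}}{r e^{2\pi i \theta}} \cdot 2\pi i \, r e^{2\pi i \theta} \, d\theta = x a_{-1} \cdot 2\pi i = a_{-1} \cdot \alpha i,$$
as desired.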
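
The arc lemma can be checked numerically on a made-up example. The Python sketch below integrates a function with a simple pole along a small arc using the midpoint rule. The test function, its residue, the radius, and the step count are all illustrative choices of ours.

```python
import cmath
import math

def arc_integral(f, center, radius, angle, steps=20_000):
    """Midpoint-rule approximation of the integral of f along the counterclockwise arc
    z = center + radius * e^{i t}, for 0 <= t <= angle."""
    dt = angle / steps
    total = 0j
    for j in range(steps):
        t = (j + 0.5) * dt
        direction = cmath.exp(1j * t)
        z = center + radius * direction
        total += f(z) * (1j * radius * direction) * dt   # f(z) * dz/dt * dt
    return total

pole = 0.5 + 0.25j
residue = 3 - 2j

def sample_function(z):
    """Simple pole at `pole` with the given residue, plus holomorphic terms."""
    return residue / (z - pole) + 2 + (z - pole)

alpha = math.pi / 3
arc_value = arc_integral(sample_function, pole, 1e-4, alpha)
full_circle_value = arc_integral(sample_function, pole, 1e-4, 2 * math.pi)
print(arc_value, residue * 1j * alpha)
```

The small leftover difference comes from the holomorphic terms, whose contribution shrinks with the radius, matching the proof.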

With this, note that as $r \to 0$, the arc from $B$ to $B'$ becomes an arc of angle $\frac{\pi}{3}$ (because the angle between the
vertical line and the tangent at $\rho$ is $\frac{\pi}{3}$). This means that
$$\int_B^{B'} \frac{f'}{f} \, dz = -\frac{2\pi i}{6} \nu_\rho(f),$$
and similarly
$$\int_D^{D'} \frac{f'}{f} \, dz = -\frac{2\pi i}{6} \nu_\rho(f).$$

Meanwhile, the circular arc from $C$ to $C'$ approaches a semicircle, so
$$\int_C^{C'} \frac{f'}{f} \, dz = -\frac{2\pi i}{2} \nu_i(f).$$
Finally, note that the matrix $S = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ sends the arc $B'C$ to the arc $DC'$ (oriented in that direction). The
definition of a modular function tells us that
$$f(Sz) = z^{2k} f(z) \implies \int_{B'}^{C} \frac{df(Sz)}{f(Sz)} = 2k \int_{B'}^{C} \frac{dz}{z} + \int_{B'}^{C} \frac{df}{f} = \int_D^{C'} \frac{df}{f}.$$

This means that the sum
$$\int_{B'}^{C} \frac{df}{f} + \int_{C'}^{D} \frac{df}{f} = -2k \int_{B'}^{C} \frac{dz}{z}$$
(where the signs come from us switching the order of integration), and now this is just an arc integral on the
unit circle. The angle from $B'$ to $C$ goes to $\frac{\pi}{6}$, which is $\frac{1}{12}$ of the full circle traversed clockwise, so this evaluates to
$-2k \cdot \left( -\frac{2\pi i}{12} \right) = \frac{2\pi i k}{6}$.
Putting everything together, this means that
$$\int_C \frac{f'}{f} \, dz = -2\pi i \left( \nu_\infty(f) + \frac{\nu_\rho(f)}{3} + \frac{\nu_i(f)}{2} - \frac{k}{6} \right),$$
but we know from before that this integral is also equal to $2\pi i \sum^* \nu_p(f)$. Rearranging and canceling the $2\pi i$ gives us
the result we want.

4 February 13, 2020

Serre 7.3.2 – David Wu


We’ve been considering properties of modular forms, and now we will look at the whole space of modular forms
together. Let $M_k$ be the space of modular forms of weight $2k$, and let $M_k^0$ be the space of cusp forms of weight $2k$. Both are
vector spaces over $\mathbb{C}$, because sums and scalar multiples of modular forms of weight $2k$ are again modular forms of weight $2k$, and 0 is always a modular form of
weight $2k$.
Here’s how we can relate modular forms and cusp forms:

Proposition 42
We have $M_k = M_k^0 \oplus \mathbb{C} G_k$ for $k \geq 2$ (which means we add a complex number times the Eisenstein series).

Proof. Consider the map from $M_k$ to $\mathbb{C}$ sending $f$ to $f(\infty)$. By definition of cusp forms, the kernel of this map is
$M_k^0$, and the image is a subspace of $\mathbb{C}$ (so either all of $\mathbb{C}$ or just 0). But we know that $G_k(\infty) \neq 0$, so we can indeed
decompose as stated.

The nice thing about Eisenstein series is that we can work directly with them – in contrast, we don’t actually have
that many examples of cusp forms. But one good example is

$$\Delta = g_2^3 - 27 g_3^2,$$
where $g_2 = 60 G_2$ and $g_3 = 140 G_3$. (This is an element of $M_6^0$ – a cusp form of weight 12.)

Theorem 43
The space of modular forms $M_k$ is trivial for $k < 0$ and $k = 1$. For $k = 0, 2, 3, 4, 5$, $M_k$ is a vector space spanned
by $1, G_2, G_3, G_4, G_5$ respectively, and $M_k^0$ is trivial. Also, multiplication by $\Delta$ gives an isomorphism from
$M_{k-6}$ to $M_k^0$.

Basically, we can classify for small values of k and then move to higher weights as well.

Proof. Recall from last time the formula
$$\nu_\infty(f) + \frac{1}{2} \nu_i(f) + \frac{1}{3} \nu_\rho(f) + \sum^*_{H/G} \nu_p(f) = \frac{k}{6}.$$

All of these orders are nonnegative because we’re working with holomorphic functions $f$, so we must have $k \geq 0$ to
have a nonzero function $f$. Also, if $k = 1$, then there are no terms that can give a contribution of $\frac{1}{6}$ – the smallest
nonzero contribution is $\frac{1}{3}$.
Also, if we have a nonzero cusp form, then $\nu_\infty(f) \geq 1$ by definition, so the left hand side is at least 1 (and therefore for
$k < 6$, there are no cusp forms other than 0). By Proposition 42, this means $M_k = \mathbb{C} G_k$ for $k = 2, 3, 4, 5$, so $M_k$ is indeed just a
one-dimensional vector space.
Finally, let’s first establish a few properties of our discriminant function $\Delta$. If we apply our counting formula above
to $G_2$, we find that $\nu_\rho(G_2) = 1$ and $\nu_p(G_2) = 0$ for all other points $p$. Similarly, $\nu_i(G_3) = 1$ and $\nu_p(G_3) = 0$ for all
other $p$. This means that if we look at our discriminant $\Delta = g_2^3 - 27 g_3^2$, it will be nonzero at $i$ (because there is a
zero for the $g_3$ term, but not for the $g_2$ term). Therefore, $\Delta$ is not identically zero, and because it is a cusp form,
$\nu_\infty(\Delta) \geq 1$. Again applying our counting formula (with $k = 6$), all of the other $\nu_p$s must be zero for $\Delta$, so $\Delta$ is nonzero on all of $H$.
So now take any $f \in M_k^0$, and define $g = \frac{f}{\Delta}$ – this is holomorphic because $\Delta$ is nonzero everywhere except at $\infty$ (where
$f$ already has a zero), and that means we can find the orders of the zeros of $g$:
$$\nu_p(g) = \begin{cases} \nu_p(f) & p \neq \infty \\ \nu_p(f) - 1 & p = \infty, \end{cases}$$
and since $g$ is weakly modular of weight $2k - 12$ and all of these orders are nonnegative, we do indeed end up in $M_{k-6}$.
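
The case analysis in this proof amounts to listing the nonnegative integer solutions of the counting formula for small k. Here is a short Python sketch that enumerates them with exact fractions. The function name and the tuple layout (order at ∞, order at i, order at ρ, total order at all other points) are our own conventions.

```python
from fractions import Fraction

def valence_solutions(k):
    """All tuples (v_inf, v_i, v_rho, rest) of nonnegative integers with
    v_inf + v_i/2 + v_rho/3 + rest = k/6."""
    target = Fraction(k, 6)
    solutions = []
    if target < 0:
        return solutions                      # no holomorphic solutions for k < 0
    for v_inf in range(int(target) + 1):
        for v_i in range(int(2 * target) + 1):
            for v_rho in range(int(3 * target) + 1):
                rest = target - v_inf - Fraction(v_i, 2) - Fraction(v_rho, 3)
                if rest >= 0 and rest.denominator == 1:
                    solutions.append((v_inf, v_i, v_rho, int(rest)))
    return solutions

for k in range(-1, 7):
    print(k, valence_solutions(k))
```

For k = 1 the list is empty, for k = 2 and k = 3 the only solutions are a simple zero at ρ and a simple zero at i respectively, and no solution with k < 6 has a zero at ∞, which matches the statements used in the proof.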

Corollary 44
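We have an explicit formula for all $k \geq 0$:
$$\dim M_k = \begin{cases} \left\lfloor \frac{k}{6} \right\rfloor & k \equiv 1 \bmod 6 \\ \left\lfloor \frac{k}{6} \right\rfloor + 1 & \text{otherwise.} \end{cases}$$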

This tells us that these vector spaces are finite-dimensional, which is nice!

Proof. Applying Theorem 43 gives the desired result for $k = 0, 1, 2, 3, 4, 5$ (verify directly). To pass to larger $k$, note
that adding 6 to $k$ makes the right-hand side of the equation just increase by 1, and by Proposition 42,
$$M_{k+6} = M_{k+6}^0 \oplus \mathbb{C} G_{k+6} \cong M_k \oplus \mathbb{C} G_{k+6}$$
(last isomorphism from Theorem 43), which means the dimension of $M_{k+6}$ is one more than the dimension of $M_k$.

Corollary 45
$M_k$ has a basis $\{ G_2^m G_3^n : m, n \in \mathbb{Z}_{\geq 0}, \ 2m + 3n = k \}$.

Proof. First we show that this set of vectors generates the vector space, and then we show that they are linearly independent. If k is small, specifically for all k ≤ 3, we can check directly: for example, $M_2$ is spanned by $G_2$ and $M_3$ is spanned by $G_3$.
For all k > 3, we use induction: the Chicken McNugget Theorem tells us that there exist nonnegative integers α, β with 2α + 3β = k. So $g = G_2^\alpha G_3^\beta$ is a modular form of weight k, and it is nonzero at ∞ (because $G_2$ and $G_3$ are nonzero at ∞). So for any modular form $f \in M_k$, there exists λ ∈ C such that f − λg is a cusp form, which means
\[ f - \lambda g = \Delta h, \quad h \in M_{k-6}. \]
By the inductive hypothesis, h is a linear combination of monomials $G_2^a G_3^b$ with 2a + 3b = k − 6, so we can write
\[ f - \lambda g = (g_2^3 - 27g_3^2) \sum_{2a+3b=k-6} c_{a,b}\, G_2^a G_3^b, \]
and expanding this out (recall that $g_2$ and $g_3$ are constant multiples of $G_2$ and $G_3$) gives the appropriate result.
To show that these monomials are independent, say that we have a dependence relation
\[ a_1 G_2^{\alpha} G_3^{\beta} + a_2 G_2^{\alpha-3} G_3^{\beta+2} + \cdots = 0. \]
Dividing through by a suitable monomial in $G_2$ and $G_3$ tells us that $G_2^3/G_3^2$ satisfies a polynomial equation, so the fundamental theorem of algebra tells us that $G_2^3/G_3^2$ is constant, which is not true (because $G_2$ is zero at ρ while $G_3$ is not, and this is not the zero function).
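
As a quick sanity check, Corollaries 44 and 45 should agree: the number of monomials $G_2^m G_3^n$ with 2m + 3n = k must equal the dimension formula. The following is a minimal sketch; the function names `dim_formula` and `monomial_exponents` are hypothetical helpers introduced here, not notation from the lecture.

```python
def dim_formula(k: int) -> int:
    """Dimension of M_k from Corollary 44 (k is the index, so the weight is 2k)."""
    return k // 6 if k % 6 == 1 else k // 6 + 1


def monomial_exponents(k: int) -> list[tuple[int, int]]:
    """All (m, n) with m, n >= 0 and 2m + 3n = k, i.e. the basis of Corollary 45."""
    return [(m, (k - 2 * m) // 3)
            for m in range(k // 2 + 1)
            if (k - 2 * m) % 3 == 0]


# The two corollaries should give the same count for every k.
for k in range(0, 40):
    assert len(monomial_exponents(k)) == dim_formula(k)

print(monomial_exponents(6))   # exponents (m, n) of G_2^m G_3^n spanning M_6
```

For k = 6 the two exponent pairs correspond to $G_3^2$ and $G_2^3$, matching the fact that ∆ is a combination of exactly these two forms.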

So we’ve shown that we can classify the vector spaces of modular forms, and we have explicit formulas for the basis elements and dimensions of these spaces.

Serre 7.3.3 – Shreyas Balaji


We’ll talk about a specific nice modular function:

Definition 46
The modular invariant is given by
\[ j = \frac{1728\, g_2^3}{\Delta}. \]

To motivate the 1728 coefficient in the numerator, let’s think about the series expansion: one way we can make
it nice is to make the residue at ∞ equal to 1 (because j has a simple pole at ∞, which we’ll show).

Proposition 47
j is a modular function of weight 0.

Proof. $g_2$ is a modular form of weight 2k = 4, so $g_2^3$ is a modular form of weight 12. ∆ also has weight 12, so the quotient has weight zero.

Proposition 48
j is holomorphic on H, and it has a simple pole at ∞.

Proof. Recall from the previous section that g2 is holomorphic on H and at ∞, and ∆ has a simple zero at ∞ and is
nonzero everywhere else. Dividing will not introduce any zeros or poles except at ∞.

Proposition 49
j defines, by passage to the quotient, a bijection H/G → C.

Proof. Here, “passage to quotient” means we quotient everything in the upper half plane by G – for this to be valid,
j should be equal on the equivalence classes in H/G, but remember that j has weight zero (so this is well-defined).
To show that we have a bijection, we need to show that for any λ ∈ C, there exists a unique ω ∈ H/G such that
j(ω) = λ.
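Define the modular form
\[ f_\lambda = 1728 g_2^3 - \lambda\Delta, \]
which has weight 12 and vanishes exactly at the points of H where j = λ (because $f_\lambda = \Delta (j - \lambda)$ and ∆ is nonzero on H). We’re trying to show that $f_\lambda$ has a unique zero modulo G: we can apply the counting zeros formula to find that
\[ v_\infty(f_\lambda) + \frac12 v_i(f_\lambda) + \frac13 v_\rho(f_\lambda) + \sum^{*}_{p \in H/G} v_p(f_\lambda) = \frac{k}{6}, \]
where k = 6 in this case, so the right-hand side is 1. Since $f_\lambda$ is nonzero at ∞ (there $g_2 \ne 0$ and ∆ = 0), there are only a few options: a double zero at i, a triple zero at ρ, or a simple zero at one other point. In every case there is exactly one point of H/G at which we have a zero, which is what we want.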

Proposition 50
Let f be a meromorphic function over H. Then the following are equivalent:
1. f is a modular function of weight 0.

2. f is a quotient of two modular forms of the same weight.

3. f is a rational function of j.

Proof. (3) implies (2) implies (1), so all we need to do is to show that (1) implies (3). Let f be a modular function of
weight 0 – in fact, we can assume that f is holomorphic over H, because it starts off meromorphic with finitely many
poles and then we can multiply by some polynomial in j to get rid of those.
The discriminant ∆ is zero at ∞, so the function $g = \Delta^n f$ is holomorphic at ∞ for some n. Now g has weight 12n (because f has weight 0 and ∆ has weight 12), so we know that g is a linear combination
\[ g = \sum_{2\alpha+3\beta=6n} c_{\alpha,\beta}\, G_2^\alpha G_3^\beta. \]
By linearity, it suffices to show that each term gives a rational function of j, so we may assume
\[ f = \frac{G_2^\alpha G_3^\beta}{\Delta^n}. \]
We know that 2α + 3β = 6n, so α is a multiple of 3 and β is a multiple of 2. Let α = 3p and β = 2q, so that p + q = n and
\[ f = \frac{G_2^{3p} G_3^{2q}}{\Delta^{p+q}} \]
(the exponent p + q in the denominator comes from making sure f has weight 0). And now f must be a rational function of j: $G_2^{3p}/\Delta^p$ and $G_3^{2q}/\Delta^q$ are both rational functions of j, and we’re done.

Serre 7.4.1 – Andrew Gu


We’ll be talking about Bernoulli numbers, which will be helpful for computing some coefficients.

Definition 51
The Bernoulli numbers $B_k$ are defined by the power series expansion
\[ \frac{x}{e^x - 1} = 1 - \frac{x}{2} + \sum_{k=1}^\infty (-1)^{k+1} B_k \frac{x^{2k}}{(2k)!}. \]

The infinite sum has no odd-degree terms; to check this, note that
\[ \frac{x}{e^x-1} + \frac{x}{2} = \frac{x}{2}\cdot\frac{e^x+1}{e^x-1}, \]
which is an even function. As an example, the first few Bernoulli numbers are
\[ B_1 = \frac16, \quad B_2 = \frac1{30}, \quad B_3 = \frac1{42}, \quad \cdots. \]
The Bernoulli numbers are sometimes defined differently via
\[ \frac{x}{e^x-1} = \sum_{k=0}^\infty b_k \frac{x^k}{k!}, \]

so we don’t throw away the odd-degree terms. These give the same numbers up to some small changes, but we’ll just
use the Bk (because that’s what the book uses).
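
To see how the two conventions relate, here is a small sketch that computes the $b_k$ with exact rational arithmetic and converts them to the $B_k$ used in these notes. It assumes the standard recurrence $\sum_{j=0}^{m} \binom{m+1}{j} b_j = 0$ for m ≥ 1 (not derived in the lecture), and the names `bernoulli_b` and `serre_B` are hypothetical.

```python
from fractions import Fraction
from math import comb


def bernoulli_b(count: int) -> list[Fraction]:
    """Return [b_0, ..., b_{count-1}] with x/(e^x - 1) = sum_k b_k x^k / k!.

    Uses the recurrence sum_{j=0}^{m} C(m+1, j) b_j = 0 for m >= 1.
    """
    b = [Fraction(1)]
    for m in range(1, count):
        b.append(-sum(comb(m + 1, j) * b[j] for j in range(m)) / (m + 1))
    return b


def serre_B(k: int) -> Fraction:
    """B_k in the convention of Definition 51: B_k = (-1)^(k+1) * b_{2k}."""
    return (-1) ** (k + 1) * bernoulli_b(2 * k + 1)[2 * k]


print([serre_B(k) for k in (1, 2, 3)])
```

The odd-index values $b_3, b_5, \ldots$ come out as zero, which is the evenness statement above.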

Theorem 52
We have a formula for the zeta function at even values: for all integers k ≥ 1,
\[ \zeta(2k) = \frac{2^{2k-1}}{(2k)!} B_k \pi^{2k}. \]

Proof. Set x = 2iz in the definition of the Bernoulli numbers to find that
\[ \frac{2iz}{e^{2iz}-1} = 1 - iz - \sum_{k=1}^\infty \frac{2^{2k} B_k z^{2k}}{(2k)!}. \]
Moving the iz term to the left gives
\[ z\cot z = 1 - \sum_{k=1}^\infty \frac{2^{2k}}{(2k)!} B_k z^{2k} \]
by expanding out the definitions of the exponentials. But we can also find a different formula for z cot z: starting with the Euler product
\[ \sin z = z \prod_{n=1}^\infty \left(1 - \frac{z^2}{n^2\pi^2}\right), \]
we apply logarithmic differentiation, meaning we send $f \mapsto f'/f$. This turns products into sums:
\[ \frac{\cos z}{\sin z} = \frac1z + \sum_{n=1}^\infty \frac{-2z/(n^2\pi^2)}{1 - z^2/(n^2\pi^2)}. \]
Multiplying both sides by z yields
\[ z \cot z = 1 - 2\sum_{n=1}^\infty \frac{z^2}{n^2\pi^2 - z^2}. \]
This is almost a power series expansion. Expanding each fraction as a geometric series,
\[ z\cot z = 1 - 2\sum_{n=1}^\infty \frac{z^2/(n^2\pi^2)}{1 - z^2/(n^2\pi^2)} = 1 - 2\sum_{n=1}^\infty\sum_{k=1}^\infty \left(\frac{z^2}{n^2\pi^2}\right)^k \]
is valid as long as |z| < π. Exchanging the sums yields
\[ z\cot z = 1 - 2\sum_{k=1}^\infty \frac{z^{2k}}{\pi^{2k}} \sum_{n=1}^\infty \frac{1}{n^{2k}}. \]
Comparing the coefficients of $z^{2k}$ in the two expansions of z cot z gives $2\zeta(2k)/\pi^{2k} = 2^{2k}B_k/(2k)!$, which is exactly what we want (both are power series valid in a neighborhood of 0, so they must agree).
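
Theorem 52 is easy to test numerically. The sketch below hard-codes the three Bernoulli numbers listed earlier and compares the closed form against a partial sum of the zeta series; the function names and the choice of 100,000 terms are assumptions made for illustration.

```python
from math import factorial, pi

# First few Bernoulli numbers in the convention of Definition 51.
B = {1: 1 / 6, 2: 1 / 30, 3: 1 / 42}


def zeta_formula(k: int) -> float:
    """Right-hand side of Theorem 52: 2^(2k-1) * B_k * pi^(2k) / (2k)!."""
    return 2 ** (2 * k - 1) * B[k] * pi ** (2 * k) / factorial(2 * k)


def zeta_partial(s: int, terms: int = 100_000) -> float:
    """Partial sum of zeta(s); the tail is below terms^(1-s) / (s-1)."""
    return sum(1.0 / n ** s for n in range(1, terms + 1))


for k in (1, 2, 3):
    print(2 * k, zeta_formula(k), zeta_partial(2 * k))
```

For k = 1 the partial sum converges slowly (the tail is about 1/terms), so only the first few digits agree; for k = 2 and k = 3 the agreement is close to machine precision.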

5 February 13, 2020

Serre 7.4.2 – Michelle Xu


We’ll be discussing the Eisenstein series of index k (an example of a modular form)
\[ G_k(z) = \sum_{(m,n) \ne (0,0)} \frac{1}{(nz+m)^{2k}}. \]
We’ll be trying to express this as a Taylor expansion in terms of $q = e^{2\pi i z}$.

Lemma 53
For all k ≥ 2, we have
\[ \sum_{m\in\mathbb{Z}} \frac{1}{(z+m)^k} = \frac{1}{(k-1)!}(-2\pi i)^k \sum_{n=1}^\infty n^{k-1}q^n. \]

Proof. Recall Euler’s sine product formula: taking the log derivative of the expression for sin x yields
\[ x\cot x = 1 - 2\sum_{m=1}^\infty \frac{x^2}{m^2\pi^2 - x^2}. \]
Set x = πz and divide both sides by z to yield the expression
\[ \pi\cot\pi z = \frac1z - 2\sum_{m=1}^\infty \frac{z}{m^2 - z^2}, \]
and we can break up the fraction into a simpler sum
\[ \pi\cot\pi z = \frac1z + \sum_{m=1}^\infty\left(\frac{1}{z+m} + \frac{1}{z-m}\right) = \sum_{m\in\mathbb{Z}} \frac{1}{z+m}. \]
But we can rewrite the left hand side another way:
\[ \pi\cot\pi z = \pi\frac{\cos\pi z}{\sin\pi z} = i\pi\frac{q+1}{q-1}, \]
because
\[ \frac{q+1}{q-1} = \frac{2\cos^2\pi z + 2i\sin\pi z\cos\pi z}{-2\sin^2\pi z + 2i\sin\pi z\cos\pi z} = -i\frac{\cos\pi z}{\sin\pi z}, \]
and we can continue to simplify this via
\[ i\pi\frac{q+1}{q-1} = i\pi\left(1 + \frac{2}{q-1}\right) = i\pi - 2i\pi\sum_{n=0}^\infty q^n. \]
Setting these equal to each other yields
\[ \sum_{m\in\mathbb{Z}} \frac{1}{z+m} = i\pi - 2i\pi\sum_{n=0}^\infty q^n. \]
Differentiating k − 1 times with respect to z (this is why we need k ≥ 2) yields the result.
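
The lemma can be checked numerically at a sample point. This is only a sketch: the point z = 0.3 + 0.8i, the exponent k = 4, and the truncation sizes are arbitrary choices, and the helper names `lattice_side` and `q_side` are hypothetical.

```python
import cmath
from math import factorial, pi


def lattice_side(z: complex, k: int, cutoff: int = 2000) -> complex:
    """Truncation of the sum over m in Z of 1/(z+m)^k to |m| <= cutoff."""
    return sum(1 / (z + m) ** k for m in range(-cutoff, cutoff + 1))


def q_side(z: complex, k: int, terms: int = 60) -> complex:
    """Truncation of (-2 pi i)^k / (k-1)! * sum_{n>=1} n^(k-1) q^n."""
    q = cmath.exp(2j * pi * z)
    series = sum(n ** (k - 1) * q ** n for n in range(1, terms + 1))
    return (-2j * pi) ** k / factorial(k - 1) * series


z = 0.3 + 0.8j  # arbitrary sample point in the upper half plane
print(lattice_side(z, 4), q_side(z, 4))
```

The q-series side converges very quickly because |q| is tiny once Im z is moderately large, which is one reason q-expansions are so convenient in practice.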

This allows us to get to our Taylor expansion:

Proposition 54
For all k ≥ 2, we have
\[ G_k(z) = 2\zeta(2k) + 2\frac{(2i\pi)^{2k}}{(2k-1)!}\sum_{n=1}^\infty \sigma_{2k-1}(n)q^n, \]
where $\sigma_k(n)$ is defined to be the sum of the kth powers of the divisors of n: $\sigma_k(n) = \sum_{d|n} d^k$.

Proof. We break up the sum into the contribution from n = 0 and n ≠ 0:
\[ G_k(z) = \sum_{m \ne 0} \frac{1}{m^{2k}} + \sum_{n\ne0}\sum_{m\in\mathbb{Z}} \frac{1}{(nz+m)^{2k}}. \]
The first sum just gives us twice ζ(2k) (the terms for m and −m agree), and we can pull out a factor of 2 in the other sum as well by pairing (m, n) with (−m, −n):
\[ G_k(z) = 2\zeta(2k) + 2\sum_{n=1}^\infty\sum_{m\in\mathbb{Z}} \frac{1}{(nz+m)^{2k}}. \]
The inner sum is what we have in the lemma, except that we replace z with nz and k with 2k. So we can rewrite our lemma as
\[ \sum_{m\in\mathbb{Z}} \frac{1}{(nz+m)^{2k}} = \frac{1}{(2k-1)!}(-2\pi i)^{2k}\sum_{a=1}^\infty a^{2k-1}q^{na}. \]
Plugging this into our Eisenstein series yields (relabeling n as d)
\[ G_k(z) = 2\zeta(2k) + \frac{2(-2i\pi)^{2k}}{(2k-1)!}\sum_{d=1}^\infty\sum_{a=1}^\infty a^{2k-1}q^{da}. \]
Then the coefficient of $q^n$ receives a contribution of $a^{2k-1}$ for every factorization n = da, that is, for every divisor a of n, which gives us $\sigma_{2k-1}(n)$ as we want (note that $(-2i\pi)^{2k} = (2i\pi)^{2k}$).

Corollary 55
Let $E_k(z) = 1 + \gamma_k \sum_{n=1}^\infty \sigma_{2k-1}(n)q^n$, where $\gamma_k = (-1)^k \frac{4k}{B_k}$. Then we can rewrite
\[ G_k(z) = 2\zeta(2k)E_k(z). \]

Proof. Last time, we found that for all k ≥ 1,
\[ \zeta(2k) = \frac{2^{2k-1}}{(2k)!}B_k\pi^{2k}. \]
This means that
\[ \gamma_k = (-1)^k\frac{4k(2\pi)^{2k}}{2(2k)!\,\zeta(2k)} = \frac{(2i\pi)^{2k}}{(2k-1)!\,\zeta(2k)}, \]
and this gives us the constant that we want.

We can see some examples of how this looks for different k.

Example 56
$B_2 = \frac{1}{30}$, so $\gamma_2 = 240$ and we can write
\[ E_2(z) = 1 + 240\sum_{n=1}^\infty \sigma_3(n)q^n. \]
We also know that $\zeta(4) = \frac{\pi^4}{90}$, so we can also write
\[ g_2 = 60G_2 = 120\zeta(4)E_2 = \frac{4\pi^4}{3}E_2. \]

In general, every $E_k$ is a polynomial in $E_2$ and $E_3$; this is the same argument as with the $G_k$. For example, because $E_4 = E_2^2$ and $E_5 = E_2E_3$, we can get identities like
\[ \sigma_7(n) = \sigma_3(n) + 120\sum_{m=1}^{n-1}\sigma_3(m)\sigma_3(n-m). \]
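
The divisor-sum identity above comes from comparing $q^n$ coefficients in $E_4 = E_2^2$, and it can be verified directly for small n. A minimal sketch with a naive divisor sum (the helper names `sigma` and `convolution_side` are hypothetical):

```python
def sigma(power: int, n: int) -> int:
    """Sum of the power-th powers of the positive divisors of n."""
    return sum(d ** power for d in range(1, n + 1) if n % d == 0)


def convolution_side(n: int) -> int:
    """sigma_3(n) + 120 * sum_{m=1}^{n-1} sigma_3(m) * sigma_3(n - m)."""
    return sigma(3, n) + 120 * sum(sigma(3, m) * sigma(3, n - m)
                                   for m in range(1, n))


# Check the identity sigma_7(n) = convolution_side(n) for small n.
for n in range(1, 40):
    assert sigma(7, n) == convolution_side(n)
```

For n = 2 both sides equal 129: the left side is 1 + 2^7, and the right side is 9 + 120 · 1 · 1.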

Serre 7.4.3 – Andrew Lin


Recall that a modular form is holomorphic everywhere in H, so we can express it as a Fourier series

\[ f(z) = \tilde f(q) = \sum_{n=0}^\infty a_nq^n, \]
where $q = e^{2\pi iz}$. A natural question to ask about is the order of growth of the $a_n$ (to understand the contribution of higher-order terms). To answer this, recall that the space of modular forms of weight 2k can be decomposed via
\[ M_k = M_k^0 \oplus \mathbb{C}G_k, \]
where $M_k^0$ is the space of cusp forms, and $G_k$ is the Eisenstein series of weight 2k. We’ll study these separately:

Proposition 57
Let k ≥ 2. For the modular form $f = G_k$, we have $|a_n| = \Theta(n^{2k-1})$. In other words, there exist A, B > 0 such that (for all n ≥ 1) we have
\[ An^{2k-1} \le |a_n| \le Bn^{2k-1}. \]

Proof. In the previous section, we showed that the coefficients $a_n$ (for n ≥ 1) satisfy
\[ a_n = (-1)^kC\sigma_{2k-1}(n), \]
where C > 0 is a constant (depending on k) independent of n, and $\sigma_{2k-1}(n)$ is the sum of the (2k − 1)th powers of the divisors of n. We can bound these magnitudes both from above and below:
\[ |a_n| = C\sigma_{2k-1}(n) \ge Cn^{2k-1}, \]
and also
\[ \frac{|a_n|}{n^{2k-1}} = C\sum_{d|n}\frac{1}{d^{2k-1}} \le C\sum_{d\ge1}\frac{1}{d^{2k-1}} = D < \infty \]
(in the first equality we use the fact that summing $d^{2k-1}/n^{2k-1} = 1/(n/d)^{2k-1}$ over all divisors d is the same if we replace n/d with d; the infinite sum converges because 2k − 1 ≥ 3). This means we have shown that $Cn^{2k-1} \le |a_n| \le Dn^{2k-1}$, and thus $|a_n| = \Theta(n^{2k-1})$, as desired.
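
The two-sided bound in this proof says that $\sigma_{2k-1}(n)/n^{2k-1}$ always lies between 1 and ζ(2k − 1). Here is a small numerical illustration for k = 2; the helper names and the range n < 500 are arbitrary choices for the sketch.

```python
def sigma(power: int, n: int) -> int:
    """Sum of the power-th powers of the positive divisors of n."""
    return sum(d ** power for d in range(1, n + 1) if n % d == 0)


def normalized_sigma(k: int, n: int) -> float:
    """sigma_{2k-1}(n) / n^(2k-1), which the proof bounds between 1 and D/C."""
    return sigma(2 * k - 1, n) / n ** (2 * k - 1)


def zeta_upper_bound(s: int, terms: int = 10_000) -> float:
    """Partial sum of zeta(s) plus the integral bound on the tail."""
    partial = sum(1.0 / d ** s for d in range(1, terms + 1))
    return partial + terms ** (1 - s) / (s - 1)


k = 2
ratios = [normalized_sigma(k, n) for n in range(1, 500)]
print(min(ratios), max(ratios), zeta_upper_bound(2 * k - 1))
```

The minimum is attained at n = 1, and highly divisible n push the ratio toward (but never past) ζ(3).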

The subspace of cusp forms is a bit more tricky:

Theorem 58 (Hecke)
For any cusp form f of weight 2k, we have $|a_n| = O(n^k)$; that is, $|a_n|/n^k$ is bounded as n → ∞.

Proof. By definition, the power series expansion of f has $a_0 = 0$, which means that as q → 0 (meaning z = x + iy tends to i∞), the magnitude of f is O(|q|). In other words,
\[ |f(z)| = O(|e^{2\pi i(x+iy)}|) = O(e^{-2\pi y}). \]
To relate the magnitude of f to y more explicitly, note that for modular forms f of weight 2k, the function $\varphi(z) = |f(z)|y^k$ is invariant under the modular group G. This is because $\operatorname{Im}(gz) = \operatorname{Im}(z)/|cz+d|^2$, so
\[ \varphi(gz) = |f(gz)|\operatorname{Im}(gz)^k = \left(|f(z)||cz+d|^{2k}\right)\cdot\left(\operatorname{Im}(z)^k|cz+d|^{-2k}\right) = \varphi(z). \]
But φ is continuous (on the fundamental domain), and as the imaginary part y → ∞, we know that φ goes to 0 because the exponential term $e^{-2\pi y}$ dominates the polynomial term $y^k$. This means that φ is bounded, or that for all z ∈ H,
\[ \varphi(z) \le M \implies |f(z)| \le My^{-k}. \]
This is helpful, because we can now extract the coefficient $a_n$ by considering the function $f(z)/q^{n+1}$. If we consider $q = e^{2\pi i(x+iy)}$ for a fixed y and send x from 0 to 1, the contour follows a circle C once counterclockwise around q = 0, and thus the residue formula tells us
\[ |a_n| = \left|\frac{1}{2\pi i}\int_C \frac{f(z)\,dq}{q^{n+1}}\right| \le \frac{1}{2\pi}\int_C \left|\frac{f(z)\,dq}{q^{n+1}}\right| = \int_0^1 \frac{|f(x+iy)|}{|q^n|}\,dx \]
(using the substitution $q = e^{2\pi i(x+iy)} \implies dq = 2\pi iq\,dx$). And we can now bound this with the inequality
\[ |a_n| \le \int_0^1 |My^{-k}q^{-n}|\,dx = My^{-k}e^{2\pi ny}, \]
and taking y = 1/n gives $|a_n| \le Me^{2\pi}n^k$, the desired result.

Corollary 59
If a modular form f of weight 2k is not a cusp form, then the coefficients have order of magnitude $n^{2k-1}$ (that is, $|a_n| = \Theta(n^{2k-1})$).

Proof. Such modular forms of weight 2k can be written as $cG_k + h$, where h is a cusp form and c ∈ C is nonzero. The coefficients of $G_k$ are $\Theta(n^{2k-1})$ while those of h are $O(n^k)$, so the $G_k$ coefficients dominate for large n.

Remark 60. Work by Pierre Deligne has shown that Theorem 58 can be improved: it has been shown that $a_n = O(n^{k-1/2}\sigma_0(n))$, where $\sigma_0(n)$ denotes the number of divisors of n. Since $\sigma_0(n)$ is subpolynomial, this tells us that for all ε > 0, we have the stronger bound $a_n = O(n^{k-1/2+\varepsilon})$.

Serre 7.4.4 – Nikhil Reddy


Today, we’ll talk a bit about the discriminant
\[ \Delta = g_2^3 - 27g_3^2, \]
where $g_2 = 60G_2$ and $g_3 = 140G_3$. To get some of the constants to work out a bit better, we can also write this in terms of $E_2$ and $E_3$: because
\[ g_2 = 120\zeta(4)E_2, \quad g_3 = 280\zeta(6)E_3, \]
we can substitute these values in to find that
\[ \Delta = (2\pi)^{12}\,12^{-3}(E_2^3 - E_3^2). \]
We know the q-series for $E_2$ and $E_3$, so we can directly compute the first few coefficients (which are all integers):
\[ \Delta = (2\pi)^{12}(q - 24q^2 + 252q^3 - 1472q^4 + \cdots). \]

This power series may look a bit familiar:

Theorem 61
We have

\[ \Delta = (2\pi)^{12}q\prod_{n=1}^\infty(1-q^n)^{24}. \]

This proof is a “bit artificial” because the natural method is to use elliptic curves.
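
Before the proof, the statement can at least be sanity-checked on the first few coefficients. The sketch below compares $(E_2^3 - E_3^2)/1728$ with $q\prod(1-q^n)^{24}$ using truncated integer power series. It assumes $E_3 = 1 - 504\sum\sigma_5(n)q^n$, which follows from $\gamma_3 = (-1)^3 \cdot 12/B_3$ with $B_3 = 1/42$; the function names are hypothetical.

```python
def sigma(power: int, n: int) -> int:
    """Sum of the power-th powers of the positive divisors of n."""
    return sum(d ** power for d in range(1, n + 1) if n % d == 0)


def multiply(a: list[int], b: list[int]) -> list[int]:
    """Product of two power series, truncated to len(a) coefficients."""
    size = len(a)
    out = [0] * size
    for i, ai in enumerate(a):
        if ai:
            for j in range(size - i):
                out[i + j] += ai * b[j]
    return out


def delta_from_eisenstein(size: int) -> list[int]:
    """Coefficients of (E_2^3 - E_3^2) / 1728 up to q^(size-1)."""
    e2 = [1] + [240 * sigma(3, n) for n in range(1, size)]
    e3 = [1] + [-504 * sigma(5, n) for n in range(1, size)]
    e2_cubed = multiply(multiply(e2, e2), e2)
    e3_squared = multiply(e3, e3)
    return [(x - y) // 1728 for x, y in zip(e2_cubed, e3_squared)]


def delta_from_product(size: int) -> list[int]:
    """Coefficients of q * prod_{n>=1} (1 - q^n)^24 up to q^(size-1)."""
    series = [1] + [0] * (size - 1)
    for n in range(1, size):
        factor = [0] * size
        factor[0], factor[n] = 1, -1
        for _ in range(24):
            series = multiply(series, factor)
    return [0] + series[: size - 1]


print(delta_from_eisenstein(6))
print(delta_from_product(6))
```

Both lists start 0, 1, −24, 252, −1472, matching the expansion of ∆ given above (without the factor $(2\pi)^{12}$).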

Proof. Let $F = q\prod_{n=1}^\infty(1-q^n)^{24}$: it’s enough to show that F is a cusp form of weight 12, and then check the coefficient of the q term to figure out the scaling factor (because the space of cusp forms of weight 12 has dimension 1).
Define the two series (a primed sum means we omit the term (m, n) = (0, 0))
\[ G_1(z) = \sum_n{\sum_m}'\frac{1}{(m+nz)^2}, \qquad G(z) = \sum_m{\sum_n}'\frac{1}{(m+nz)^2}. \]

Because the double sum is not absolutely summable, the order of summation here is important. Note that G1 is not
a modular form:

Proposition 62
We have
\[ G_1(z) = \frac{(2\pi)^2}{12} - 2(2\pi)^2\sum_{n=1}^\infty\sigma_1(n)q^n \]
and
\[ G_1\left(-\frac1z\right) = z^2G_1(z) - 2\pi iz. \]

Proof. Earlier, we showed that for k ≥ 2,
\[ G_k(z) = 2\zeta(2k) + 2\frac{(2\pi i)^{2k}}{(2k-1)!}\sum_{n=1}^\infty\sigma_{2k-1}(n)q^n. \]

The proof basically follows the same way if you plug in k = 1 instead and sum in the correct order (as we do).
We’ll postpone the second identity for the end.

To show that F has weight 12, we need to show that
\[ F\left(-\frac1z\right) = z^{12}F(z). \]
We’ll take the logarithmic differential of both sides:
\[ \frac{dF}{F} = \left(\frac1q - 24\sum_{n=1}^\infty\frac{nq^{n-1}}{1-q^n}\right)dq. \]
We can write this more nicely as
\[ \frac{dF}{F} = \frac{dq}{q}\left(1 - 24\sum_{n=1}^\infty\frac{nq^n}{1-q^n}\right) \]
and then expand as a geometric series to find
\[ \frac{dF}{F} = \frac{dq}{q}\left(1 - 24\sum_{n,m=1}^\infty nq^{mn}\right) = \frac{dq}{q}\left(1 - 24\sum_{n=1}^\infty\sigma_1(n)q^n\right). \]
Using the first part of Proposition 62 yields
\[ \frac{dF}{F} = \frac{12}{(2\pi)^2}G_1(z)\frac{dq}{q}, \]
and now $\frac{dq}{q} = 2\pi i\,dz$ tells us that
\[ \frac{dF}{F} = \frac{12i}{2\pi}G_1(z)\,dz. \]
Plugging in −1/z, we find that
\[ \frac{dF(-\frac1z)}{F(-\frac1z)} = \frac{12i}{2\pi}G_1\left(-\frac1z\right)\frac{dz}{z^2}, \]
and now by the second part of Proposition 62, this is
\[ \frac{12i}{2\pi}\cdot\frac{z^2G_1(z) - 2\pi iz}{z^2}\,dz = \left(\frac{12i}{2\pi}G_1(z) + \frac{12}{z}\right)dz. \]
But now this means that
\[ \frac{dF(-\frac1z)}{F(-\frac1z)} = \frac{dF}{F} + \frac{12}{z}\,dz, \]
which means that $F(-\frac1z) = c\,z^{12}F(z)$ for some constant c, by reversing the logarithmic derivative. Now looking at z = i, we find that F(i) = cF(i), so c = 1 (because F(i) ≠ 0). So indeed F is a modular form of weight 12, as desired. (Checking the constant factor between F and ∆ just comes from looking at the q-term.)

We can now prove the second identity
\[ G_1\left(-\frac1z\right) = z^2G_1(z) - 2\pi iz. \]
We will need to introduce two more series
\[ H_1(z) = \sum_n{\sum_m}'\frac{1}{(m-1+nz)(m+nz)}, \qquad H(z) = \sum_m{\sum_n}'\frac{1}{(m-1+nz)(m+nz)}, \]
where the primed sums avoid both (m, n) = (1, 0) and (0, 0). These two series can be computed directly, because the series telescopes via
\[ \frac{1}{(m-1+nz)(m+nz)} = \frac{1}{m-1+nz} - \frac{1}{m+nz}. \]
So for $H_1$, each n ≠ 0 contributes
\[ \sum_m\frac{1}{(m-1+nz)(m+nz)} = 0, \]
and the n = 0 contribution just gives us 2, so $H_1(z) = 2$. A more complicated calculation gives us $H(z) = 2 - \frac{2\pi i}{z}$, and now it turns out that
\[ G_1\left(-\frac1z\right) = z^2G(z). \]
This is true because of how we write the series: plugging in −1/z brings a factor of $z^2$ up to the numerator but swaps the order of summation. So now the series with general term
\[ \frac{1}{(m+nz)^2} - \frac{1}{(m-1+nz)(m+nz)} = \frac{-1}{(m+nz)^2(m-1+nz)} \]
is absolutely convergent (the terms have order 3), so its sum does not depend on the order of summation. This gives
\[ G - H = G_1 - H_1 \implies G - G_1 = H - H_1 = -\frac{2\pi i}{z}, \]
which yields the result because
\[ G_1\left(-\frac1z\right) = z^2G(z) = z^2\left(G_1(z) - \frac{2\pi i}{z}\right) = z^2G_1(z) - 2\pi iz. \]

6 February 25, 2020

Serre 7.4.5 – Michael Tang


We’ll be doing some cleanup from section 2.3 (lattices and elliptic curves), and then we’ll talk a bit more about the
coefficients of ∆.

Definition 63
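Let Γ be a lattice in the complex plane (Γ is isomorphic to $\mathbb{Z}^2$ and spans C over real linear combinations). Then the Weierstrass ℘-function is
\[ \wp_\Gamma(u) = \frac{1}{u^2} + {\sum_{\gamma\in\Gamma}}'\left(\frac{1}{(u-\gamma)^2} - \frac{1}{\gamma^2}\right), \]
where the primed sum omits γ = 0.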

We’re going to show that this function and its derivative satisfy the equation of an elliptic curve. To do this, we
start with the Laurent expansion:

Proposition 64
The Laurent expansion of $\wp_\Gamma$ is
\[ \wp_\Gamma(u) = \frac{1}{u^2} + \sum_{k=2}^\infty(2k-1)G_k(\Gamma)u^{2k-2}, \]
where $G_k(\Gamma) = {\sum_{\gamma\in\Gamma}}'\frac{1}{\gamma^{2k}}$.

Sketch. There’s a lot of algebra, and we can look at the 18.783 lecture notes for more details. We can expand out $G_k$ in the proposition as a sum as well to get a double summation:
\[ \frac{1}{u^2} + \sum_{k=2}^\infty{\sum_{\gamma\in\Gamma}}'\frac{(2k-1)u^{2k-2}}{\gamma^{2k}}. \]
This converges absolutely, so we can swap the order of summation:
\[ = \frac{1}{u^2} + {\sum_{\gamma\in\Gamma}}'\sum_{k=2}^\infty(2k-1)\frac{u^{2k-2}}{\gamma^{2k}}. \]

For each k, this is an “arithmetico-geometric series” which we can evaluate directly. But to get the exact form of the
Weierstrass p function, we need to do a trick: we actually add in the “odd terms” with γ 2k+1 back in (they cancel out
because γ n cancels with (−γ)n ), and that will give us what we want.
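
The “arithmetico-geometric” step in this sketch can be written out explicitly. The following is one way to do it, assuming |u| < |γ| so that the series converges (the index j is introduced here for illustration):

```latex
\[
\frac{1}{(u-\gamma)^2} - \frac{1}{\gamma^2}
  = \frac{1}{\gamma^2}\left(\frac{1}{(1-u/\gamma)^2} - 1\right)
  = \sum_{j=1}^{\infty} (j+1)\,\frac{u^{j}}{\gamma^{j+2}} .
\]
```

Summing over all nonzero γ, the terms with odd j cancel between γ and −γ, and the terms with j = 2k − 2 give exactly $(2k-1)G_k(\Gamma)u^{2k-2}$.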

Proposition 65
Let $x = \wp_\Gamma(u)$ and $y = \wp'_\Gamma(u)$. Then
\[ y^2 = 4x^3 - g_2x - g_3, \]
where $g_2 = 60G_2(\Gamma)$ and $g_3 = 140G_3(\Gamma)$.

Sketch. Define the function
\[ F(u) = y^2 - (4x^3 - g_2x - g_3). \]
Write out the Laurent expansion and bash, and we’ll see that the negative-exponent terms cancel out, and also F(0) = 0. This means that F is holomorphic at u = 0.
Now $\wp_\Gamma$ and $\wp'_\Gamma$ are both periodic with respect to the lattice, so F is also periodic. And this means that F is holomorphic at all points of the lattice Γ. A uniform convergence argument shows that F, $\wp_\Gamma$, $\wp'_\Gamma$ are all holomorphic at all points not on the lattice, so F is holomorphic everywhere.
Now F is entire and periodic, so it is bounded by its supremum on one (compact) fundamental domain. It must therefore be constant by Liouville, and thus F = 0 everywhere.

This is the Weierstrass form for elliptic curves – we can prove that it is nonsingular, so there is something about
an isomorphism between curves and lattices (though we don’t have the background).
Next, we’ll go back to the modular form
\[ \Delta = g_2^3 - 27g_3^2. \]
Recall a few important properties here: ∆ has weight 12, and it’s a cusp form (this explains the coefficient 27). Also, it has the q-expansion
\[ \Delta = (2\pi)^{12}q\prod_{n=1}^\infty(1-q^n)^{24}. \]

Definition 66
The Ramanujan τ function is defined such that
\[ q\prod_{n=1}^\infty(1-q^n)^{24} = \sum_{n=1}^\infty\tau(n)q^n. \]

We can check some small values: τ(1) = 1, τ(2) = −24, τ(3) = 252, τ(4) = −1472. These grow pretty quickly, but we can give a bound on the coefficients. Recall that a cusp form of weight 2k has q-coefficients that are $O(n^k)$ (and in fact $O(n^{k-1/2+\varepsilon})$). So in this case, $\tau(n) = O(n^6)$ and actually $O(n^{11/2+\varepsilon})$.
The τ function has nice properties, but we’ll prove them later on:

Proposition 67
We have the following results:
• τ is multiplicative: for any m, n relatively prime, τ (mn) = τ (m)τ (n). (This means we only need the values
at prime powers.)

• We have the second-order recursion (for prime p and n ≥ 1)
\[ \tau(p^{n+1}) = \tau(p)\tau(p^n) - p^{11}\tau(p^{n-1}). \]

• We can write the Dirichlet series $L(\tau, s) = \sum_{n=1}^\infty\frac{\tau(n)}{n^s}$ as a product
\[ L(\tau, s) = \prod_{p\text{ prime}}\frac{1}{1 - \tau(p)p^{-s} + p^{11-2s}}. \]

It’s conjectured (unproven) that τ(n) ≠ 0 for all n, and this has been checked up to about $8 \times 10^{23}$.
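
The first two properties in Proposition 67 are easy to test on small values. The sketch below expands the product $q\prod(1-q^n)^{24}$ with in-place integer arithmetic; `tau_values` is a hypothetical helper name and the cutoff of 30 is arbitrary.

```python
def tau_values(limit: int) -> list[int]:
    """Return [0, tau(1), ..., tau(limit)] from q * prod (1 - q^n)^24."""
    series = [1] + [0] * (limit - 1)          # running product, degrees 0..limit-1
    for n in range(1, limit):
        for _ in range(24):                   # multiply by (1 - q^n) in place
            for degree in range(limit - 1, n - 1, -1):
                series[degree] -= series[degree - n]
    return [0] + series                       # the leading factor q shifts indices


tau = tau_values(30)
print(tau[1:7])

# Multiplicativity on coprime arguments.
assert tau[6] == tau[2] * tau[3]
assert tau[15] == tau[3] * tau[5]
# Second-order recursion at prime powers (n = 1 and n = 2 for p = 2).
assert tau[4] == tau[2] * tau[2] - 2 ** 11 * tau[1]
assert tau[8] == tau[2] * tau[4] - 2 ** 11 * tau[2]
```

Note that τ is not completely multiplicative: τ(4) = −1472 differs from τ(2)² = 576, and the recursion accounts for exactly that difference.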

Serre 7.5.1 – Natalie Stewart
We’ll start by redefining a few familiar concepts:

Definition 68
Let E be a set. Then $X_E$ denotes the free abelian group on E:
\[ X_E = \left\{\sum_{x\in E}c_xx \;\middle|\; c_x\in\mathbb{Z},\ \text{all but finitely many zero}\right\}. \]

Definition 69
A correspondence on a set E is an endomorphism $T : X_E \to X_E$. An equivalent way to think about this is that we can write
\[ T(x) = \sum_{y\in E}n_T(x,y)\,y, \]
which is equivalent to a function $n_T : E \times E \to \mathbb{Z}$ such that, for each x, only finitely many $n_T(x, y)$ are nonzero.

Definition 70
Let f be a function from a set E to a group G. Define $Tf : E \to G$ as the composition
\[ E \hookrightarrow X_E \xrightarrow{\;T\;} X_E \xrightarrow{\;f\;} G. \]

Note that the endomorphisms on (say) an abelian group can be added pointwise and multiplied via composition,
so they have a structure like a ring. We’ll be defining a bunch of correspondences which basically form a subring.

Definition 71
Let R be the set of lattices on C. Define the correspondence T(n) by
\[ T(n)\Gamma = \sum_{\substack{\Gamma'\in R\\ (\Gamma:\Gamma')=n}}\Gamma' \]

for all n > 1. For all λ ∈ C× , define the map from XR to XR via

Rλ Γ = 1 · (λΓ),

where (λΓ) denotes the homothetic rescaling of Γ.

Note that if Γ′ ⊂ Γ is a sublattice of index n, then Γ′ will contain nΓ. This gives us a chain of inclusions nΓ ⊂ Γ′ ⊂ Γ, and
\[ \{\Gamma' \mid (\Gamma:\Gamma') = n\} \cong \{G \le \Gamma/n\Gamma \mid |G| = n\}. \]
We’ll be using this a lot in the upcoming proof. This reduces the job to counting the subgroups of order n of the group Γ/nΓ. We’ll use the fact that if n is a prime, then this number is actually n + 1.

Proposition 72
For all λ, µ ∈ C× and n, m ∈ Z>1 , we have the following:
1. Homothety operators multiply: Rλ Rµ = Rλµ .

2. Rλ T (n) = T (n)Rλ (we have commutativity).

3. T (n)T (m) = T (nm) if n, m are relatively prime.

4. $T(p^n)T(p) = T(p^{n+1}) + pT(p^{n-1})R_p$ if p is a prime.

Proof. (1) and (2) are clear (arguments about scaling).


For (3), note that if (Γ : Γ′′ ) = nm, then the coefficient of Γ′′ in T (m)T (n)Γ comes from counting the lattices Γ′
where Γ′′ ⊂ Γ′ ⊂ Γ and (Γ : Γ′ ) = n, and by an isomorphism theorem this is just the number of subgroups G of Γ/Γ′′
with |G| = m. Since m, n are relatively prime, it’s sufficient to check that the coefficient of Γ′′ in T (n)T (m)Γ is 1,
which means there is a unique subgroup of Γ/Γ′′ of order m (this is true).
For (4), note that a sublattice appears on either side only if its index is $p^{n+1}$, so we can just check that the coefficients are the same there. Consider Γ′′ such that $(\Gamma : \Gamma'') = p^{n+1}$: from an earlier argument, any Γ′ with index p must contain pΓ. Suppose that the coefficient of Γ′′ in $T(p^n)T(p)\Gamma$ is a, and the coefficient in $T(p^{n-1})R_p\Gamma$ is c. We wish to show that a = 1 + pc. We break into cases:
(case i) If Γ′′ is not contained in pΓ, then c = 0, and we want to show that a = 1. Note that $|\Gamma/p\Gamma| = p^2$, so the image of Γ′ under the projection from Γ to Γ/pΓ has order p. Also, because Γ′′ is not contained in pΓ, the image of Γ′′ is nontrivial. Since the image of Γ′′ is contained in the image of Γ′, which has prime order p, the two images coincide. So Γ′ is determined by Γ′′: there exists a unique Γ′.
(case ii) If Γ′′ is contained in pΓ, which is contained in every Γ′ of index p, we can check that c = 1 and a = p + 1 (there are p + 1 choices of Γ′, as noted above).

Corollary 73
Inductively, we can show that the operators $T(p^n)$ are polynomials in T(p) and the homothety operator $R_p$. Also, the ring generated by T(p) and $R_\lambda$ for p prime and λ ∈ C× contains T(n) for all n > 1, and it is commutative.

One important idea is that these correspondences act on our lattice functions: recall that a lattice function of weight 2k from R to C satisfies
\[ F(\lambda\Gamma) = \lambda^{-2k}F(\Gamma). \]
Using the homothety operator, we can write that
\[ F(R_\lambda\Gamma) = \lambda^{-2k}F(\Gamma), \]
so T(n)F is also a lattice function of weight 2k. And this gives us an extra characterization similar to the above proposition:
\[ T(m)T(n)F = T(mn)F \ \ (m, n \text{ relatively prime}), \qquad T(p^n)T(p)F = T(p^{n+1})F + p^{1-2k}T(p^{n-1})F. \]

Serre 7.5.2 – Kaarel Haenni
We’ll be talking about sublattices of a particular lattice. Throughout this, we’ll be fixing a lattice Γ with basis ω1 , ω2 :
define Γ(n) to be the set of sublattices of Γ of index n.

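Definition 74
Let $S_n$ be the set of integer matrices $\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$ with ad = n, a ≥ 1, and 0 ≤ b < d.

Definition 75
For any $\sigma = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \in S_n$, define $\Gamma_\sigma$ to be the sublattice with basis $\omega_1' = a\omega_1 + b\omega_2$ and $\omega_2' = d\omega_2$.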

Theorem 76
The map σ → Γσ is a bijection from Sn to Γ(n).

Proof. First of all, we check that the index is indeed n: this is equal to the ratio of areas of a fundamental cell of Γσ
to Γ, which is indeed det σ = ad = n. (These two equalities follow by looking at the “cross product” of two vectors in
the complex plane and considering the number of lattice points in one fundamental domain.)
To show that we have a bijection, we construct an inverse map sending a sublattice Γ′ to σ(Γ′ ). Define the two
groups
Y1 = Γ/(Γ′ + Zω2 ), Y2 = Zω2 /(Γ′ ∩ Zω2 ).

Both of these are cyclic groups, generated by the images of ω1 and ω2 respectively. Let a and d be the orders of these cyclic groups. Now defining $\omega_2' = d\omega_2$, we know that $\omega_2' \in \Gamma'$. Also, by definition of a, there is a unique b with 0 ≤ b < d such that $a\omega_1 + b\omega_2 \in \Gamma'$. Then let $\omega_1' = a\omega_1 + b\omega_2$. Now we have a, b, d and we can just define
\[ \sigma(\Gamma') = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix}. \]
Showing that these maps are inverses is routine, so we’ll skip it here.

Example 77
Let p be a prime. Then the sublattices of Γ of index p can be easily characterized: $S_p$ consists of the matrices
\[ \begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 1 & b \\ 0 & p \end{pmatrix} \text{ for all } 0 \le b < p. \]

(This proves the statement from Natalie’s lecture that there are p + 1 sublattices of index p.) These correspond to the sublattices $\omega_1' = p\omega_1, \omega_2' = \omega_2$ (scaling one variable by p) and $\omega_1' = \omega_1 + b\omega_2, \omega_2' = p\omega_2$ (scaling the other by p and doing a shear).
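
Theorem 76 makes the sublattices of index n completely enumerable. As a small illustration, the following sketch lists the triples (a, b, d) describing $S_n$; the function name `upper_triangular_reps` is a hypothetical helper. Since each divisor d of n contributes d choices of b, the total count is the divisor sum $\sigma_1(n)$, which is p + 1 for a prime p.

```python
def upper_triangular_reps(n: int) -> list[tuple[int, int, int]]:
    """All (a, b, d) with a*d = n, a >= 1 and 0 <= b < d (the set S_n)."""
    return [(n // d, b, d)
            for d in range(1, n + 1) if n % d == 0
            for b in range(d)]


print(upper_triangular_reps(2))       # the 3 sublattices of index 2
print(len(upper_triangular_reps(5)))  # p + 1 sublattices for p = 5
```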

7 February 27, 2020

Serre 7.5.3, 7.5.4 – Michelle Xu


We recently introduced the operator T(n), which acts on lattices as follows:
\[ T(n)\Gamma = \sum_{(\Gamma:\Gamma')=n}\Gamma'. \]

Today, we’ll talk about this in the context of modular functions.


Recall that we have a correspondence between lattice functions F of weight 2k and weakly modular functions f of weight 2k, given by
\[ \omega_2^{2k}F(\Gamma(\omega_1,\omega_2)) = f\left(\frac{\omega_1}{\omega_2}\right). \]

Definition 78
T(n) acts on functions f with the relation
\[ T(n)f(z) = n^{2k-1}\,T(n)F(\Gamma(z,1)). \]

(The $n^{2k-1}$ is there for aesthetic reasons.) Remember that the set of sublattices of index n for a lattice Γ(ω1, ω2) can be represented with the set of bases
\[ \begin{pmatrix}\omega_1'\\ \omega_2'\end{pmatrix} = \begin{pmatrix}a & b\\ 0 & d\end{pmatrix}\begin{pmatrix}\omega_1\\ \omega_2\end{pmatrix}, \]
where ad = n, 0 ≤ b < d, a ≥ 1. This means we can also rewrite our equation above as
\[ T(n)f(z) = n^{2k-1}\sum_{(\Gamma:\Gamma')=n}F(\Gamma') = n^{2k-1}\sum_{a,d,b}d^{-2k}f\left(\frac{az+b}{d}\right) \]
(expressing in terms of the bases, with Γ = Γ(z, 1)).

Proposition 79
Let f be a weakly modular function of weight 2k. Then the function T (n)f is also weakly modular of weight 2k,
and it is holomorphic if f is.

Proof. Remember that the action of SL2(Z) on a basis leaves the lattice, and hence our lattice function, invariant, so
\[ T(n)f(z) = n^{2k-1}T(n)F(\Gamma(z,1)) = n^{2k-1}T(n)F(\Gamma(az+b, cz+d)), \]
and converting this back to our weakly modular functions gives us
\[ T(n)f(z) = (cz+d)^{-2k}\,T(n)f\left(\frac{az+b}{cz+d}\right), \]
so this is indeed of weight 2k. And now we have a finite number of terms in the explicit sum for T(n)f above, each of which is meromorphic (because f is meromorphic). So T(n)f is also meromorphic, and it is holomorphic if f is (for the same reason).

Proposition 80
Our operator T (n) satisfies the following equations:
• $T(n)T(m)f = T(mn)f$ for m, n relatively prime.

• $T(p)T(p^n)f = T(p^{n+1})f + p^{2k-1}T(p^{n-1})f$ for p prime and n ≥ 1.

Proof. Recall that we proved a very similar set of relations for lattice functions – we just plug this in along with our
definition of an operator.
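The main point is that the $p^{2k-1}$ factor is different; this comes from the $n^{2k-1}$ in the definition of our operator on modular functions: the two sides differ by a normalization of $p^{2(2k-1)}$, which combines with the $p^{1-2k}$ for lattice functions to give $p^{2k-1}$.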

Proposition 81
Let f be a modular function of weight 2k (so it is also meromorphic at infinity) with Laurent expansion $f(z) = \sum_{m\in\mathbb{Z}}c(m)q^m$. Then T(n)f is also a modular function, and we can write
\[ T(n)f = \sum_{m\in\mathbb{Z}}\gamma(m)q^m, \]
where $\gamma(m) = \sum_{a\ge1,\ a|(n,m)}a^{2k-1}c\left(\frac{mn}{a^2}\right)$.

Proof. We expand f with its Laurent series to write
\[ T(n)f = n^{2k-1}\sum_{\substack{ad=n,\ a\ge1\\ 0\le b<d}}d^{-2k}\sum_{m\in\mathbb{Z}}c(m)e^{2\pi im(az+b)/d} \]
(here we have expanded $q = e^{2\pi iz}$).
Fix d and m, and consider the sum $\sum_{0\le b<d}e^{2\pi imb/d}$. If d | m, then every exponent is a multiple of 2πi, so this sum is just d. Otherwise, we have a sum over roots of unity, which is 0. To use this, we define m′ = m/d, and then we can write
\[ T(n)f = n^{2k-1}\sum_{\substack{ad=n,\ a\ge1\\ m'\in\mathbb{Z}}}d^{-2k+1}c(m'd)q^{am'}. \]
To simplify further, let μ = am′, and now we just have a single exponent for q:
\[ T(n)f(z) = \sum_{\mu\in\mathbb{Z}}q^\mu\sum_{a|(n,\mu),\ a\ge1}\left(\frac nd\right)^{2k-1}c\left(\frac{\mu d}{a}\right). \]
Since n/d = a and μd/a = μn/a², this is what we want after relabeling indices. We just need to show that T(n)f is meromorphic at infinity. Because f is meromorphic at infinity, its Laurent expansion has only finitely many negative terms: there is some N with c(m) = 0 for all m ≤ −N. This means that $c(\mu d/a) = 0$ for all μ ≤ −nN (because the smallest d/a can be is 1/n). This means that T(n)f is meromorphic at infinity, and because it is weakly modular (from earlier arguments), T(n)f is indeed a modular function, as desired.

So T (n) brings modular functions to modular functions, which is nice.

Corollary 82
Using the same notation as above, we have $\gamma(0) = \sigma_{2k-1}(n)c(0)$, where $\sigma_{2k-1}(n)$ is the sum of the (2k − 1)th powers of the divisors of n, and γ(1) = c(n). Also, if n = p is prime, then we have
\[ \gamma(m) = \begin{cases}c(pm) & m \not\equiv 0 \pmod p,\\ c(pm) + p^{2k-1}c\left(\frac mp\right) & m \equiv 0 \pmod p.\end{cases} \]

(These can be verified by directly plugging in m = 0, 1 and n = p and using divisibility properties.)
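
The coefficient formula of Proposition 81 translates directly into code. The sketch below applies it to $E_2 = 1 + 240\sum\sigma_3(n)q^n$ from Example 56 (weight 4, so k = 2); the output turns out to be $\sigma_3(n)$ times the input, consistent with the eigenfunction discussion later in these notes. The names `hecke_coefficient` and `e2` are hypothetical, and the truncation size is arbitrary.

```python
from math import gcd


def sigma(power: int, n: int) -> int:
    """Sum of the power-th powers of the positive divisors of n."""
    return sum(d ** power for d in range(1, n + 1) if n % d == 0)


def hecke_coefficient(c: list[int], n: int, m: int, k: int) -> int:
    """gamma(m) = sum over a | gcd(n, m) of a^(2k-1) * c(m*n / a^2)."""
    g = gcd(n, m)   # gcd(n, 0) == n, so m = 0 sums over all divisors of n
    return sum(a ** (2 * k - 1) * c[m * n // (a * a)]
               for a in range(1, g + 1) if g % a == 0)


# E_2 = 1 + 240 * sum sigma_3(n) q^n has weight 4, so k = 2.
size = 60
e2 = [1] + [240 * sigma(3, n) for n in range(1, size)]
for n in (2, 3, 4):
    for m in range(size // n):
        assert hecke_coefficient(e2, n, m, k=2) == sigma(3, n) * e2[m]
```

The special cases of Corollary 82 are visible here: m = 0 gives $\sigma_3(n)c(0)$, and m = 1 gives c(n).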

Corollary 83
If f is a modular form (resp. cusp form), then T (n)f is modular (resp. cusp) as well.

Proof. The argument follows analogously as above, replacing “meromorphic at infinity” with “holomorphic at infinity.”
Cusp form follows from the fact that if c(0) = 0, then γ(0) = 0 from the above corollary.

With this, we’ll study something more specific about our T(n): eigenvalues and eigenfunctions. We’ll assume that $f = \sum_{n=0}^\infty c(n)q^n$ is a modular form of weight 2k for some k > 0.

Definition 84
f is an eigenfunction for all T (n) if we have λ(n) ∈ C such that

T (n)f = λ(n)f .

(Then the λ(n) are the set of eigenvalues for T (n).)

Theorem 85
Let $f = \sum_{n=0}^\infty c(n)q^n$ be an eigenfunction. Then the coefficient c(1) is nonzero, and if we normalize c(1) = 1, we have c(n) = λ(n).

Proof. We know that the coefficient of q for T(n)f, which is γ(1), is equal to c(n). But by the definition of the eigenfunction, T(n)f should have q-coefficient λ(n)c(1), so c(n) = λ(n)c(1). Then c(1) can’t be zero, because otherwise c(n) = 0 for all n ≥ 1 (and that means f is a constant, and the only constant modular form for k > 0 is zero). Thus c(1) ≠ 0, and the second claim follows easily by setting c(1) = 1.

Corollary 86
If two modular forms of weight 2k are eigenfunctions of T (n) with the same eigenvalues λ(n), and they are both
normalized (to c(1) = 1), then they coincide.

(This is because the Laurent expansion is completely defined by the eigenvalues.)


This allows us to say something more specific about our eigenfunctions f :

Corollary 87
Suppose that f is an eigenfunction of all T(n), and it is normalized to c(1) = 1. Then
\[ c(m)c(n) = c(mn) \quad\text{whenever } \gcd(m,n) = 1, \]
\[ c(p)c(p^n) = c(p^{n+1}) + p^{2k-1}c(p^{n-1}) \quad\text{for } p \text{ prime and } n \ge 1. \]

(We plug in the properties of our operator T (n) into the eigenfunction statement.)

Definition 88
Let f be an eigenfunction. Then
\[ \Phi_f(s) = \sum_{n=1}^\infty\frac{c(n)}{n^s} \]
is the Dirichlet series defined by the coefficients of f.

This series converges absolutely for all Re s > 2k, because we proved that modular forms’ coefficients have order of magnitude $c(n) = O(n^{2k-1})$.

Corollary 89
Let P be the set of prime numbers. Then
\[ \Phi_f(s) = \prod_{p\in P}\frac{1}{1 - c(p)p^{-s} + p^{2k-1-2s}}. \]

Proof. First, we want to show that we can write our Dirichlet series as

!
Y X
n −ns
Φf (s) = c(p )p .
p∈P n=0

(This proof comes from Serre chapter 6.) Let S be a finite set of prime numbers, and let N(S) be the set of integers
with prime factors only in S. We know that coefficients are multiplicative, so we can rewrite

!
X c(n) Y X
m −ms
= c(p )p .
ns m=0
n∈N(s) p∈S

Now as S approaches the set of all primes, our infinite product will converge to the result above. From this, we just
need to show the inner term works out: in other words, for all primes p, we want to show (defining Q = p −s )

X 1
c(p n )Qn = .
n=0
1 − c(p)Q + p 2k−1 Q2

To do this, we consider the series

\[ \psi(Q) = \left( \sum_{n=0}^{\infty} c(p^n)\,Q^n \right) \left( 1 - c(p)\,Q + p^{2k-1}\,Q^2 \right); \]

we wish to show that this is equal to 1. Now the Q-coefficient is equal to c(p) − c(p) = 0, and Q^{n+1} for all n ≥ 1
has coefficient (expanding out the relevant terms)

\[ c(p^{n+1}) - c(p)\,c(p^n) + p^{2k-1}\,c(p^{n-1}), \]

which is zero because of the recursive equation for c(p^n). And this means that we only have a constant term, and
ψ(Q) = c(1) = 1 by normalization, completing the proof.
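
As a sanity check on this last step, here is a minimal sketch that multiplies the two series out for one sample choice of coefficients. It assumes the test data c(p^n) = σ_{2k−1}(p^n) (the Eisenstein eigenvalues that appear in these notes); the function names are ours and purely illustrative.

```python
def sigma_prime_power(p: int, n: int, r: int) -> int:
    """sigma_r(p^n) = 1 + p^r + p^(2r) + ... + p^(nr) for a prime p."""
    return sum(p ** (r * j) for j in range(n + 1))


def euler_factor_product(p: int, k: int, terms: int) -> list[int]:
    """Coefficients of (sum_n c(p^n) Q^n) * (1 - c(p) Q + p^(2k-1) Q^2),
    truncated to the powers Q^0, ..., Q^(terms-1)."""
    r = 2 * k - 1
    c = [sigma_prime_power(p, n, r) for n in range(terms)]
    quadratic = [1, -c[1], p ** r]
    out = [0] * terms
    for i, ci in enumerate(c):
        for j, qj in enumerate(quadratic):
            if i + j < terms:
                out[i + j] += ci * qj
    return out


# Sample choice (an assumption for illustration): p = 3, k = 2.
coeffs = euler_factor_product(p=3, k=2, terms=8)
print(coeffs)
```

Every coefficient past the constant term should vanish, which is exactly the statement ψ(Q) = 1 up to the truncation order.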

Hecke also proved a few interesting facts about Φ_f(s) – this will have a meromorphic continuation over the plane,
just like the Riemann zeta function.

Serre 7.5.5, 7.5.6 – Shreyas Balaji


Throughout this section, “normalized eigenfunction” means c(1) = 1. We’ll do a few examples of eigenfunctions for
the Hecke operators T(n) first:

Proposition 90
The Eisenstein series G_k is an eigenfunction of T(n) with eigenvalue λ(n) = σ_{2k−1}(n). Specifically, the normalized
eigenfunction is

\[ f = (-1)^k \frac{B_k}{4k} E_k = (-1)^k \frac{B_k}{4k} + \sum_{n=1}^{\infty} \sigma_{2k-1}(n)\,q^n, \]

where B_k are the Bernoulli numbers.

Proof. Let G_k(Γ) be the function from the set of lattices R to the complex numbers, where

\[ G_k(\Gamma) = \sum_{\gamma \in \Gamma,\ \gamma \neq 0} \frac{1}{\gamma^{2k}}. \]

We can just show that Eisenstein series are eigenfunctions for all T(p) for prime p (and then use the relations to build
up to all T(n)). For any prime p,

\[ T(p)\,G_k(\Gamma) = \sum_{(\Gamma : \Gamma') = p} \ \sum_{\gamma \in \Gamma',\ \gamma \neq 0} \frac{1}{\gamma^{2k}}. \]

Consider a particular γ ∈ Γ: then there are two cases. (1) We have γ ∈ pΓ (that is, the lattice vectors scaled by p).
There are p + 1 sublattices that have index p in Γ, and if it’s inside pΓ, it’s inside all of the sublattices as well. (2) γ
is contained in exactly one sublattice. So we can split these up to find that

T (p)Gk (Γ) = Gk (Γ) + pGk (pΓ)

(the first term from every point being counted once, and the second coming from points that are counted p + 1 times).
And note that G_k(pΓ) = p^{−2k} G_k(Γ), so

\[ T(p)\,G_k(\Gamma) = (1 + p^{1-2k})\,G_k(\Gamma), \]

which shows that G_k(Γ), viewed as a function on lattices, is an eigenfunction for the operator T(p). And now
this means the modular form G_k associated with the function G_k(Γ) is an eigenfunction under T(p) with eigenvalue
p^{2k−1}(1 + p^{1−2k}) = σ_{2k−1}(p), as desired. (We pick up a p^{2k−1} factor when we convert between modular functions
and modular forms.)

Proposition 91
Let f be defined as above. Then the Dirichlet series Φf (s) = ζ(s)ζ(s − 2k + 1).

Proof. We can write out the coefficients

\[ \Phi_f(s) = \sum_{n=1}^{\infty} \frac{\sigma_{2k-1}(n)}{n^s} = \sum_{a, d \ge 1} \frac{a^{2k-1}}{a^s d^s}. \]

We can now separate this into two different sums:


\[ \Phi_f(s) = \left( \sum_{d \ge 1} \frac{1}{d^s} \right) \left( \sum_{a \ge 1} \frac{1}{a^{s+1-2k}} \right), \]

which is exactly what we want.
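
A quick numerical comparison makes the factorization concrete. This is only a sketch: the values k = 2 and s = 10 and the truncation point are arbitrary assumptions, chosen so that both sides converge quickly.

```python
def sigma(r: int, n: int) -> int:
    """Divisor power sum sigma_r(n) = sum of d^r over divisors d of n."""
    return sum(d ** r for d in range(1, n + 1) if n % d == 0)


def zeta_partial(s: float, terms: int) -> float:
    """Partial sum of the Riemann zeta function for real s > 1."""
    return sum(n ** -s for n in range(1, terms + 1))


K = 2        # Serre's k, so the weight is 2k = 4 (assumed sample value)
S = 10.0     # a real point well inside Re s > 2k (assumed sample value)
TERMS = 400  # truncation point for all three sums

dirichlet_series = sum(sigma(2 * K - 1, n) / n ** S for n in range(1, TERMS + 1))
zeta_product = zeta_partial(S, TERMS) * zeta_partial(S - 2 * K + 1, TERMS)
print(dirichlet_series, zeta_product)
```

The two printed numbers should agree to roughly machine precision, since the discarded tails are far smaller than floating-point rounding at this choice of s.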

Proposition 92
The Δ function is an eigenfunction of T(n), with normalized eigenfunction

\[ (2\pi)^{-12} \Delta = q \prod_{n=1}^{\infty} (1 - q^n)^{24} = \sum_{n=1}^{\infty} \tau(n)\,q^n. \]

Proof. Remember that the space of cusp forms of weight 12 is of dimension 1, and this space is stable under action
of T (n). Thus T (n)∆ must be some constant times ∆, and we can check the coefficients for normalization.

Corollary 93
The τ function satisfies τ(nm) = τ(n)τ(m) for all relatively prime n, m, and τ(p)τ(p^n) = τ(p^{n+1}) + p^{11} τ(p^{n−1})
for all prime p and n ≥ 1.

(This follows from directly applying results from above about the coefficients c(n) of eigenfunctions.)
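
Both relations are easy to test on actual coefficients. The following sketch expands the product q ∏(1 − q^m)^24 with integer arithmetic and then checks the two identities; the helper name `tau_table` and the cutoff 30 are ours.

```python
from math import gcd


def tau_table(limit: int) -> list[int]:
    """Return [tau(0), ..., tau(limit)] with the convention tau(0) = 0,
    read off from q * prod_{m >= 1} (1 - q^m)^24."""
    # poly[i] is the coefficient of q^i in prod (1 - q^m)^24, for i < limit.
    poly = [1] + [0] * (limit - 1)
    for m in range(1, limit):
        for _ in range(24):
            # Multiply by (1 - q^m) in place, highest degree first.
            for i in range(limit - 1, m - 1, -1):
                poly[i] -= poly[i - m]
    return [0] + poly  # the extra factor q shifts every index by one


LIMIT = 30
tau = tau_table(LIMIT)
print(tau[1:7])

multiplicative = all(
    tau[m * n] == tau[m] * tau[n]
    for m in range(1, LIMIT + 1)
    for n in range(1, LIMIT + 1)
    if m * n <= LIMIT and gcd(m, n) == 1
)
recursion = all(
    tau[p] * tau[p ** n] == tau[p ** (n + 1)] + p ** 11 * tau[p ** (n - 1)]
    for p in (2, 3, 5)
    for n in range(1, 5)
    if p ** (n + 1) <= LIMIT
)
print(multiplicative, recursion)
```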
We’ll now move on to complements: we will be stating but not proving a lot of results.

Definition 94
Let f, g be cusp forms of weight 2k. Then define a measure μ via

\[ d\mu(f, g) = f(z)\,\overline{g(z)}\,y^{2k}\,\frac{dx\,dy}{y^2}, \]

where x, y are the real and imaginary part of z.

Note that μ is invariant under action by G (because of the y^{2k} term) and it is bounded over H/G (because f and g
are cusp forms, so they fall off exponentially as y → ∞).

Definition 95
The Petersson scalar product on M_k^0 is defined as

\[ \langle f, g \rangle = \int_{H/G} d\mu(f, g) = \int_D f(z)\,\overline{g(z)}\,y^{2k-2}\,dx\,dy, \]

where D is the fundamental domain.

Fact 96
The operators T (n) turn out to be Hermitian under the Petersson scalar product: we have

\[ \langle T(n) f, g \rangle = \langle f, T(n) g \rangle. \]

Since the T(n) commute with each other, the operators are simultaneously diagonalizable. Furthermore, every
f in M_k^0 lies in the span of the common eigenvectors of the T(n), so there exists an orthogonal
basis of M_k^0 made of eigenvectors of T(n) with real eigenvalues (because the operators are Hermitian).
We’ll move on to something completely unrelated:

Definition 97
Let M_k(Z) be the set of weight 2k modular forms which can be written as f = Σ_{n=0}^∞ c_n q^n for c_n integers.

It turns out that there’s a specific Z-basis of M_k(Z) which extends to a C-basis of M_k (and M_k(Z) is stable
under the action of T(n)). Specifically, we pick

\[ \{ E_2^{\alpha} F^{\beta} : \alpha + 3\beta = k/2 \} \]

for k even, and

\[ \{ E_3 E_2^{\alpha} F^{\beta} : \alpha + 3\beta = (k - 3)/2 \} \]

for k odd. The nice thing is that this Z-basis is also a C-basis and T(n) preserves M_k(Z), so the coefficients of the
characteristic polynomial of T(n) must be integers (because T(n) is given by an integer matrix in this basis).

Fact 98 (Recently proven Petersson conjecture)


Let f = Σ_{n≥1} c(n) q^n be a cusp form of weight 2k which is also a normalized eigenfunction for T(n). Define

\[ \Phi_{f,p}(T) = 1 - c(p)\,T + p^{2k-1}\,T^2; \]

then we can factor this polynomial as (1 − α_p T)(1 − α′_p T), where α_p + α′_p = c(p) and α_p α′_p = p^{2k−1}. Then α_p, α′_p
are complex conjugates.

8 March 3, 2020

Serre 7.6.1-7.6.4 – Vanshika Jain


We’ll start with a few definitions that we’ll need throughout the rest of this lecture:

Definition 99
An invariant measure µ is a measure preserved under translation and rotation.

For example, the normal “product measure” dx1 dx2 · · · dxn is an invariant measure.

Definition 100
The dual of a vector space V is the space of linear functionals (linear maps from V to the scalar field) on V , with
structure of pointwise addition and scalar multiplication.

Definition 101
A rapidly decreasing (smooth) function is a function f (x) such that f , f ′ , f ′′ , · · · exist everywhere and decay
faster than any negative power of x.

(f(x) = e^{−x²} is such an example.) Note that the space of rapidly decreasing, smooth functions has the property
that the Fourier transform is an automorphism of the space.
Let V be a real vector space of dimension n, equipped with an invariant measure μ. Let V′ be the dual of V, and
let f be a rapidly decreasing smooth function on V. Then the Fourier transform of f is defined to be

\[ \hat{f}(y) = \int_V e^{-2\pi i \langle x, y \rangle} f(x)\,\mu(x). \]

Let Γ be a lattice in V, and let Γ′ be the dual lattice in V′ (that is, the set of y ∈ V′ such that ⟨x, y⟩ = y(x) is
an integer for all x ∈ Γ). Since V is a finite-dimensional vector space, there exists an isomorphism between V and V′ given by

\[ v \mapsto \langle v, \cdot \rangle \]

for an inner product ⟨·, ·⟩.

Example 102
If we let V = Rn , Γ = Zn , the dual vector space is isomorphic to Rn , and the lattice Γ′ is just Zn if we work with
the usual dot product.

Proposition 103
Let v = µ(V /Γ) (the measure of one fundamental region of Γ), and let f be a rapidly decreasing smooth function.
Then
\[ \sum_{x \in \Gamma} f(x) = \frac{1}{v} \sum_{y \in \Gamma'} \hat{f}(y). \]

Proof. Replace µ with a scalar multiple so that the measure of the fundamental domain becomes 1. If we take a basis
e1 , · · · , en of Γ, we can then identify V with Rn , Γ with Zn , and take µ to be the ordinary product measure. Then the
formula we’re trying to show just reduces to the classical Poisson summation formula
\[ \sum_{x \in \mathbb{Z}^n} f(x) = \sum_{x \in \mathbb{Z}^n} \hat{f}(x), \]

which is a property of the Fourier transform.

We’ll now apply this to quadratic forms: suppose that V is equipped with a symmetric bilinear form ⟨x, y⟩ that is
positive and nondegenerate. As before, this defines an isomorphism between V and V′, so the dual lattice Γ′ can be
thought of as a lattice in V instead of V′: a point y ∈ Γ′ if and only if ⟨x, y⟩ is an integer for all x ∈ Γ.

Definition 104
Associate a function to each lattice Γ via

\[ \Theta_\Gamma(t) = \sum_{x \in \Gamma} e^{-\pi t \langle x, x \rangle}. \]

Proposition 105
We have the relation (for all real t > 0 and lattices Γ)

\[ \Theta_\Gamma(t) = t^{-n/2}\,v^{-1}\,\Theta_{\Gamma'}(t^{-1}). \]

Proof. Let f(x) = e^{−π⟨x,x⟩}: this is rapidly decreasing and smooth, so if we pick an orthonormal basis to identify V
with R^n and μ with the product measure, we have

\[ f(x) = e^{-\pi (x_1^2 + x_2^2 + \cdots + x_n^2)}. \]

Since the Fourier transform of e^{−πx²} is itself, we can use the previous result: note that t^{1/2}Γ has dual lattice t^{−1/2}Γ′,
and the volume of our lattice t^{1/2}Γ is t^{n/2} v. Thus

\[ \sum_{x \in t^{1/2}\Gamma} f(x) = \sum_{x \in t^{1/2}\Gamma} e^{-\pi (x_1^2 + \cdots + x_n^2)} = \sum_{x \in \Gamma} e^{-\pi t (x_1^2 + \cdots + x_n^2)} = \Theta_\Gamma(t). \]

But we can use the Poisson summation formula and do the same trick with t^{−1/2}Γ′ to find that this is also equal to
t^{−n/2} v^{−1} Θ_{Γ′}(t^{−1}), as desired.
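
The simplest instance of this relation is easy to test numerically. The sketch below assumes V = R with the usual product, Γ = Z (so Γ′ = Z, n = 1 and v = 1), in which case the claim reads Θ(t) = t^{−1/2} Θ(1/t); the sample values of t are arbitrary.

```python
import math


def theta_integers(t: float, cutoff: int = 60) -> float:
    """Theta_Gamma(t) for Gamma = Z inside R: sum over x in Z of exp(-pi t x^2).
    The cutoff is an assumption; the discarded terms are negligible for t >= 0.1."""
    tail = sum(math.exp(-math.pi * t * x * x) for x in range(1, cutoff + 1))
    return 1.0 + 2.0 * tail


checks = []
for t in (0.3, 1.0, 2.5):
    left = theta_integers(t)
    right = t ** -0.5 * theta_integers(1.0 / t)
    checks.append((t, left, right))
    print(t, left, right)
```

Each printed row should show the same value twice, up to floating-point rounding.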

Everything with Θ here can be represented using a matrix: if we let e_1, ···, e_n be a basis for our lattice Γ, we can
define a_{ij} = ⟨e_i, e_j⟩, which gives us a positive, symmetric, nondegenerate matrix A = (a_{ij}). Thus, for any
x = Σ_i x_i e_i ∈ V, we can write

\[ \langle x, x \rangle = \sum_{i,j} a_{ij}\,x_i x_j, \]

so we can write our function

\[ \Theta_\Gamma(t) = \sum_{x_i \in \mathbb{Z}} e^{-\pi t \sum_{i,j} a_{ij} x_i x_j}. \]

Defining the determinant using the wedge product, we can see that the volume v of Γ is the square root of det A.
Furthermore, if B = (b_{ij}) is the inverse matrix, we have a dual basis

\[ e_i' = \sum_j b_{ij}\,e_j. \]

Then the e_i′ form a basis of Γ′ – we have ⟨e_i′, e_j′⟩ = b_{ij}. In particular, this tells us that if v′ = μ(V/Γ′), we have
v v′ = 1.
For the upcoming section, we’ll be dealing with some special cases – pairs (V, Γ) with a few nice properties. First
of all, we want Γ′ to be equal to Γ, which in particular says that ⟨x, y⟩ is an integer for all x, y ∈ Γ. This also
implies that Γ’s matrix A has integer coefficients, and A has determinant 1.
We’ll also assume that ⟨x, x⟩ is always even – this means that Γ is of type II. We’ll see why this is useful soon.

Serre 7.6.5-7.6.7 – Andrew Lin
In these final sections of the book, we’ll discuss a particular kind of modular form with particularly nice properties.
Much of the background work has already been done: throughout this lecture, we’ll assume that we have a lattice Γ
on a vector space V of dimension n, such that the fundamental domain has volume 1 and the diagonal entries of the
matrix A = (a_{ij}) are even (we have a type II lattice). Since any x ∈ Γ can be written as a Z-combination of basis
elements Σ_i x_i e_i, we know that

\[ \langle x, x \rangle = \sum_i x_i^2\,a_{ii} + 2 \sum_{i<j} x_i x_j\,a_{ij} \]

is always even for a type II lattice.

Definition 106
For a lattice Γ, let r_Γ(m) be the number of elements x ∈ Γ such that ⟨x, x⟩ = 2m.

We know that our bilinear form is positive and nondegenerate, and thus it is essentially a rescaling of the dot
product (because we can diagonalize the matrix A that defines it). Thus, r_Γ(m) is bounded by a constant (depending
on Γ) times the number of points in Z^n of distance at most √(2m) from the origin. Such points are contained in the
n-dimensional box of side length 2√(2m): thus r_Γ(m) = O(m^{n/2}) for any Γ, so r_Γ(m) is polynomially bounded in m.
This means the infinite sum

\[ \sum_{m \ge 0} r_\Gamma(m)\,q^m \]

is convergent for all |q| < 1. (It’s good to make sure it’s clear why the constant term r_Γ(0) = 1 for any lattice Γ.)
Here’s where we’ll connect this back to our modular forms: notice that what we’ve written down is the Fourier
expansion for a modular form on the upper half-plane, where q = e^{2πiz}.

Definition 107
The theta function associated to a lattice Γ is
\[ \theta_\Gamma(z) = \sum_{m \ge 0} r_\Gamma(m)\,e^{2\pi i m z} = \sum_{x \in \Gamma} e^{\pi i z \langle x, x \rangle}. \]

Because this sum is absolutely convergent and we have no poles, θΓ is holomorphic on the upper half-plane.

Theorem 108
Suppose Γ satisfies the above assumptions. Then
(a) n is a multiple of 8,

(b) θ_Γ is a modular form of weight n/2.

Proof. The proof of the first point is outside the scope of this lecture [1], but we’ll do a somewhat circular proof of (a)
once we prove (b).
[1] See chapter 5 of Serre. The main idea is that the signature of our bilinear form can be essentially related to a
canonical element of the dual lattice of (Γ/2Γ)n, if we look at images of the dot product mod 8.

We know that θ_Γ is holomorphic, and it satisfies the relation under T because we’ve written it as a Fourier series.
Thus, it suffices to check the relation under S: to do this, we will instead check that

\[ \boxed{\ \theta_\Gamma\!\left( -\frac{1}{z} \right) = (iz)^{n/2}\,\theta_\Gamma(z)\ }. \]

(The extra i^{n/2} factor cancels out if n is a multiple of 8.) To show this result, note that both sides are holomorphic on
the entire half-plane, so it suffices to show that they are equal on a set containing a limit point. Specifically, plugging
in z = it, note that the left hand side evaluates to (using θ_Γ(z) = Σ_{x∈Γ} e^{πiz⟨x,x⟩})

\[ \theta_\Gamma\!\left( -\frac{1}{z} \right) = \sum_{x \in \Gamma} e^{-(\pi/t) \langle x, x \rangle} = \Theta_\Gamma\!\left( \frac{1}{t} \right), \]

while the right hand side is

\[ (iz)^{n/2}\,\theta_\Gamma(z) = t^{n/2} \sum_{x \in \Gamma} e^{-\pi t \langle x, x \rangle} = t^{n/2}\,\Theta_\Gamma(t). \]

(Here, we use the definition Θ_Γ(t) = Σ_{x∈Γ} e^{−πt⟨x,x⟩} from a previous section.) These expressions are indeed equal (see
Serre 7.6.2) as long as the volume v associated to our lattice is 1, which is one of our assumptions. Thus we’ve shown
the boxed result, and thus the S-relation holds: θ_Γ is indeed a modular form.
With this, we can verify that n must be a multiple of 8 (in a circular manner). Suppose not (for the sake of
contradiction); then we can assume n ≡ 4 (mod 8) by replacing Γ with Γ ⊕ Γ (either once or twice) and still have a
(theoretically) valid lattice. The boxed equation then becomes

\[ \theta_\Gamma\!\left( -\frac{1}{z} \right) = -z^{n/2}\,\theta_\Gamma(z), \]

so the differential form ω = (dz)^{n/4} θ_Γ(z) is sent to

\[ \left( d\!\left( -\tfrac{1}{z} \right) \right)^{n/4} \theta_\Gamma\!\left( -\frac{1}{z} \right) = \left( \frac{dz}{z^2} \right)^{n/4} \cdot \left( -z^{n/2}\,\theta_\Gamma(z) \right) = -\omega \]

under S. But ω is invariant under T, so ST transforms ω into −ω. This is a contradiction with the fact that (ST)^3
is the identity, and thus we must have had n be a multiple of 8 from the start.

Corollary 109
Given any lattice satisfying our assumptions, there exists a cusp form f_Γ of weight 2k = n/2 such that θ_Γ = E_k + f_Γ.

Proof. Both θΓ and Ek have constant term 1 in their q-expansions, so their difference is a cusp form.

Taking this result and directly reading off coefficients yields the following:

Corollary 110
For any lattice Γ satisfying our assumptions, we have

\[ r_\Gamma(m) = \frac{4k}{B_k}\,\sigma_{2k-1}(m) + O(m^k), \]

again taking k = n/4.

(Here, we need to use that the coefficients |a_m| of a cusp form are O(m^k), and also that k must be even.)

Fact 111
In general, the cusp forms f_Γ are not identically zero. However, Siegel has shown that a weighted mean of the f_Γ
is zero: if C_n is the set of isomorphism classes of our Γs, and n_Γ is the order of the automorphism group of Γ in
C_n, then

\[ \sum_{\Gamma \in C_n} \frac{1}{n_\Gamma}\,f_\Gamma = 0. \]

Another way to say this is that (because f_Γ = θ_Γ − E_k)

\[ \sum_{\Gamma \in C_n} \frac{1}{n_\Gamma}\,\theta_\Gamma = \left( \sum_{\Gamma \in C_n} \frac{1}{n_\Gamma} \right) E_k. \]

Since Ek is an eigenfunction of our Hecke operators T (n), so is this weighted mean.

With this, we can look at a few concrete examples.

• The smallest n where a lattice with our desired properties exists is n = 8. The theta functions then correspond
to modular forms of weight 4, but there are no nonzero cusp forms of weight 4. Thus, we must have

\[ \theta_\Gamma(z) = E_2(z) = 1 + 240 \sum_{m \ge 1} \sigma_3(m)\,q^m \]

for any 8-dimensional lattice Γ. We have a matrix representation for such a lattice:

\[ \Gamma_8 = \begin{bmatrix}
 2 &  0 & -1 &  0 &  0 &  0 &  0 &  0 \\
 0 &  2 &  0 & -1 &  0 &  0 &  0 &  0 \\
-1 &  0 &  2 & -1 &  0 &  0 &  0 &  0 \\
 0 & -1 & -1 &  2 & -1 &  0 &  0 &  0 \\
 0 &  0 &  0 & -1 &  2 & -1 &  0 &  0 \\
 0 &  0 &  0 &  0 & -1 &  2 & -1 &  0 \\
 0 &  0 &  0 &  0 &  0 & -1 &  2 & -1 \\
 0 &  0 &  0 &  0 &  0 &  0 & -1 &  2
\end{bmatrix} \]

(this has determinant 1), and in fact, this is the only isomorphism class of lattices in C_8.

• The next smallest value of n is n = 16. There are still no nonzero cusp forms of weight 8 (for k = 4), so we
know that

\[ \theta_\Gamma(z) = E_4(z) = 1 + 480 \sum_{m \ge 1} \sigma_7(m)\,q^m. \]

Notably, this is true even though there are two different isomorphism classes of lattices: Γ_16 (as defined in Serre
chapter 5) [2] and Γ_8 ⊕ Γ_8. Since the theta function of a direct sum of lattices is the product of their theta
functions, this yields the surprising identity

\[ 1 + 480 \sum_{m \ge 1} \sigma_7(m)\,q^m = \left( 1 + 240 \sum_{m \ge 1} \sigma_3(m)\,q^m \right)^2. \]

For example, looking at the m = 2 coefficient, we have 480(2^7 + 1) = 2 · 240(2^3 + 1) + 240^2.
[2] These lattices are defined by taking a subset of the half-integer lattice points which satisfy certain conditions:
for example, the sum of the coordinates must be an even integer.

• Finally, looking at the case n = 24 finally gives us a nonzero cusp form: the space of modular forms is spanned
by the two functions

\[ E_6 = 1 + \frac{65520}{691} \sum_{m \ge 1} \sigma_{11}(m)\,q^m, \qquad F = q \prod_{m \ge 1} (1 - q^m)^{24} = \sum_{n \ge 1} \tau(n)\,q^n. \]

Thus, we can write our theta function

\[ \theta_\Gamma(z) = E_6 + c_\Gamma F \]

for some constant c_Γ which depends on the lattice. Computing this constant requires us to count the number
of points x ∈ Γ with ⟨x, x⟩ = 2, since comparing the q-coefficients yields

\[ r_\Gamma(1) = \frac{65520}{691} + c_\Gamma. \]

For example, Γ = Γ_8 ⊕ Γ_8 ⊕ Γ_8 has r_Γ(1) = 720, and Γ = Γ_24 has r_Γ(1) = 1104. Of special attention is the
Leech lattice, which has applications to coding theory and sphere packings: this particular lattice has r_Γ(1) = 0,
so c_Γ = −65520/691.
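
The three examples above can be spot-checked with exact arithmetic. The sketch below makes a few assumptions of its own: the Gram matrix of Γ_8 is taken to be the E8 Cartan matrix in Bourbaki numbering (our reading of the matrix in Serre), the Eisenstein series are truncated at 12 terms, and all helper names are illustrative.

```python
from fractions import Fraction

# Assumed Gram matrix of Gamma_8 (E8 Cartan matrix, Bourbaki numbering).
GAMMA8 = [
    [2, 0, -1, 0, 0, 0, 0, 0],
    [0, 2, 0, -1, 0, 0, 0, 0],
    [-1, 0, 2, -1, 0, 0, 0, 0],
    [0, -1, -1, 2, -1, 0, 0, 0],
    [0, 0, 0, -1, 2, -1, 0, 0],
    [0, 0, 0, 0, -1, 2, -1, 0],
    [0, 0, 0, 0, 0, -1, 2, -1],
    [0, 0, 0, 0, 0, 0, -1, 2],
]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n, det = len(rows), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det
        det *= rows[col][col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n):
                rows[r][c] -= factor * rows[col][c]
    return det


def sigma(r, n):
    return sum(d ** r for d in range(1, n + 1) if n % d == 0)


def eisenstein(constant, r, terms):
    """Coefficients of 1 + constant * sum_{m >= 1} sigma_r(m) q^m."""
    return [1] + [constant * sigma(r, m) for m in range(1, terms)]


def multiply(a, b):
    """Product of two truncated q-expansions of equal length."""
    return [sum(a[i] * b[m - i] for i in range(m + 1)) for m in range(len(a))]


TERMS = 12
E2 = eisenstein(240, 3, TERMS)  # weight 4 (Serre's k = 2)
E4 = eisenstein(480, 7, TERMS)  # weight 8 (Serre's k = 4)

det_gamma8 = determinant(GAMMA8)
identity_holds = multiply(E2, E2) == E4

# c_Gamma = r_Gamma(1) - 65520/691 for the three lattices named above.
c_values = {name: r1 - Fraction(65520, 691)
            for name, r1 in [("3 x Gamma_8", 720), ("Gamma_24", 1104), ("Leech", 0)]}
print(det_gamma8, identity_holds, c_values)
```

Only the determinant, the evenness of the diagonal, and the first few coefficients of the identity are tested here; none of this replaces the modular-forms argument.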

As a closing remark, needing to restrict ourselves to Type II lattices severely restricts the kinds of lattices we can
deal with. In particular, forcing that the diagonal terms are even means that we cannot handle quadratic forms such
as

\[ x_1^2 + x_2^2 + \cdots + x_n^2, \]

which we would need in order to count, for instance, the number of ways in which a positive integer can be written as
the sum of n squares. In order to deal with such cases, we relax our conditions slightly: now ⟨x, x⟩ can be any integer,
so defining an analogous theta function can be done with respect to the subgroup of G generated by S and T^2. Such
functions have “weight n/2” (which may not be an integer), and the fundamental domain now has two cusps, yielding
two different “Eisenstein series.”
Further discussion will be postponed to later in this class, though.

9 March 5, 2020

The LMFDB
Today, David Roe is here to talk to us about the L-functions and Modular Forms Database. The link to the website
can be found at https://www.lmfdb.org.
This project is focused on creating a database of objects in computational number theory – a lot of development
is happening at MIT and at other parts of the world. It’s open-source, and we can use it to learn about some of the
objects we’ve been talking about in this class!
On the left side of the website, there’s a tab with links to various parts of the database. Modular forms, for
example, are split up into their four types (classical, Maass, Hilbert, and Bianchi). The database also includes elliptic
curves over Q and over number fields, genus 2 and higher genus curves (calculation gets harder as genus increases),
and abelian varieties over finite fields.

Remark 112. The dimension of the abelian variety attached to the elliptic curve (the Jacobian) is actually the genus
of the curve. We might talk more about this later.

Other parts of the left column of the website include other objects in algebraic number theory: there’s a tab for
number fields classified by discriminant (for example, Q[√d] for d squarefree has discriminant d if d ≡ 1 mod 4 and
4d otherwise). And we define invariants similarly for other number fields. There’s also a classification of Galois groups
and Sato-Tate groups (which are related to counting points on elliptic curves mod p).
We’ll take a quick look at the classical modular forms tab. One nice feature is that there are links which
expand out and explain terms: for example, we can click on the word newform, and we’ll get a definition and further
explanation of the mathematics there. So the purpose of the LMFDB is to provide both a research tool for experienced
mathematicians and an expository one for those of us who are learning!
This means that there are two ways to look for a particular modular form: first of all, we can browse by the links
at the top of the page.

Fact 113
We’ve only been talking about modular forms of full level so far, which means that we want to transform under
all matrices in SL2 (Z). But changing the level means we only look at specific subgroups, which gives us more
freedom. Modular forms that come up in the proof of Fermat’s Last Theorem are of weight 2 and high level.

Secondly, we can search for something more specific. For example, if we know a specific label for our modular
form (this means we have a permanent URL), we can type it in to find its homepage. Alternatively, we can search for
a modular form with specific properties.
So once we find a modular form, we get to its homepage! Every page starts with some parameters and invariants
– for example, we are told about the level and weight of the form, and its analytic rank. (The error terms in the point
counts for elliptic curves give us a modular form a_1 q + a_2 q^2 + a_3 q^3 + ···, where a_p is the error term for the elliptic
curve over F_p. We can also put those coefficients into an L-function and look at the order of the zero.) We’re
given the q-expansion – for weight 1, the coefficients often live in cyclotomic fields, so we see them in terms of roots of unity.

Fact 114
We can search for modular forms by dimension as well, which tells us about the size of the “block” when we try
to “diagonalize” the space of modular forms over Q instead of C. (Each block corresponds to a factor of the
characteristic polynomial over Q, and newforms correspond to these blocks.) So a lot of calculations end up being
linear algebra here!

It’s also good to look at the elliptic curves part of the database, which is also classified in a few nice ways. The
discriminant is something we define directly in terms of the equation of the curve, and we also define the conductor
(which has a more complicated definition, but it has the same prime divisors as the discriminant). We can think of
the conductor as the level of the associated modular form!
The individual elliptic curve homepages are interesting too: they tell us about integral points, group structure,
certain invariants, and so on. Many pages on this database also have related pages linked on the right, which are
helpful if we want to learn more.
So how is this database generated? Each section has pages talking about completeness, reliability, and source
of the data (for example, there might be assumptions like the generalized Riemann hypothesis).

Writing topics
Professor Kim has updated the Stellar page with sample papers and sample abstracts (from last year’s Seminar in
Number Theory).

The most important (and maybe most difficult) part of writing a survey paper is to choose a topic, so that’s
what we’ll brainstorm now! Let’s (for example) think about modular forms. We need to write an abstract, which is
supposed to (briefly) summarize the main content and results of our paper. For example, the abstract might say that
“we will define modular forms and their properties, and we’ll prove the dimension formula.” It’s good to say that “this
is a survey/expository paper based on (books/papers),” since we won’t really be doing original research.
Most papers start with an introduction, which is basically an expanded summary of the content of our paper.
But if there is some history or background for the subject, we can also talk about that to give some motivation for
the mathematics, and we can also state a main theorem and the ingredients for its proof (if they exist). One main
purpose for the introduction is to state the structure of the paper! For example, we might say that “in section 2,
we do (something), and in section 3, we cover (other things),” summarizing each section.

Remark 115. When we write an introduction, we might need to introduce new terminology which is not defined yet.
In principle, we want to define everything before we use it, but an introduction should not have too many detours. So
it’s often good to give label numbers to where definitions are actually made!

Let’s think about how we might structure a paper about modular forms, for example. We want to start with a
definition – a modular form is a function satisfying a holomorphic condition, as well as invariance with respect to the
action of SL2 (Z) – but to have this definition, we need to define things like H, the relations of SL2 (Z), how SL2 (Z)
acts on H, and so on.
Once the definition is established, a good next step is to give some examples (Eisenstein function, discriminant,
proving that both of these are indeed modular forms), and then start proving some properties about them. The specific
sample paper we can see on Stellar finished by proving the dimension formula, and it also had an appendix with results
from complex analysis (so that they aren’t too distracting).

Fact 116
We should be careful about plagiarism in our papers – we should make sure to quote and cite our sources! We
can label sources by numbers or by initials: for example, a book by Diamond and Shurman written in 2005 can
be labeled as [DS05] or as [1]. But if we’re using a source, we should cite the specific theorem or proof from the
book so that it’s easy to look up.

Remark 117. We’re writing this paper for an audience like our classmates – basically, we have some background in
modular forms.

10 March 10, 2020

Diamond and Shurman 1.2 – David Wu


We’ll start off the new book by talking about congruence subgroups. First, a quick review of notation and
generalization: let

\[ \gamma = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

be an element of SL_2(Z). We previously defined what it means for a function to be weakly
modular with respect to the full modular group G: basically, f is weakly modular of weight k with respect to SL_2(Z) if

\[ f(\gamma(\tau)) = (c\tau + d)^k f(\tau) \]

for all γ ∈ SL_2(Z), τ ∈ H.

But now we’ll make a generalization: we’ll replace SL2 (Z) with a congruence subgroup Γ, which will allow weights
to be any nonnegative integer k.
What is causing this generalization? We know that −I is an element of SL2 (Z), and the group action −I(τ ) = τ
fixes complex numbers τ ∈ H. So previously, when k was odd, the definition of a modular function would force

f (τ ) = (−1)k f (τ ),

which means that f is trivial. The main idea here is that we won’t need to include −I in our congruence subgroup!
At the end of the last lecture, we were talking about theta functions, and we introduced (for a lattice Γ)

\[ r_\Gamma(m) = \#\{ x \in \Gamma : \langle x, x \rangle = 2m \}. \]

We required gnarly conditions on Γ last time to make sure we had a modular form when we assembled the generating
function θ_Γ. In particular, we (sort of) showed last time that

\[ \theta_\Gamma\!\left( -\frac{1}{z} \right) = (iz)^{n/2}\,\theta_\Gamma(z). \]

Let’s define this more rigorously now:

Example 118
Let r(n, k) be the number of lattice points v ∈ Z^k such that

\[ n = \sum_{i=1}^{k} v_i^2. \]

This defines a theta function θ(τ, k) = Σ_{n=0}^∞ r(n, k) q^n (where q = e^{2πiτ}).

This is mostly a sketch – we won’t show that this actually converges as a power series or anything like that
– but the reason this function is important is that we have Lagrange’s four squares theorem, which tells us that
any nonnegative integer can be written as the sum of four squares. This can be rephrased as saying that θ(τ, 4) has
all nonzero coefficients in its q-expansion: once we develop a bit more theory, we’ll be able to make progress on this
question!

Claim 119. θ(τ, 4) is a modular form of weight 2 with respect to a specific congruence subgroup Γ.

Proof sketch. First, we can notice that if i + j = k, then we have the convolution

\[ r(n, k) = \sum_{\ell + m = n} r(\ell, i)\,r(m, j). \]

Basically, if we want to write n as a sum of k squares, we go through all possible cases of what the first i squares add
up to. And this also relates nicely to our generating function: we have

θ(τ, k1 )θ(τ, k2 ) = θ(τ, k1 + k2 ).
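
The convolution gives a direct way to tabulate r(n, 4). The sketch below builds r(n, 1), convolves twice, and compares against Jacobi's classical formula r(n, 4) = 8 Σ d over divisors d of n with 4 ∤ d; that formula is a known result which these notes do not state, so it is used here only as an external cross-check. The cutoff 50 is arbitrary.

```python
def one_square_counts(limit: int) -> list[int]:
    """r(n, 1) for 0 <= n <= limit: the number of integers v with v^2 = n."""
    counts = [0] * (limit + 1)
    v = 0
    while v * v <= limit:
        counts[v * v] += 1 if v == 0 else 2  # v and -v are distinct unless v = 0
        v += 1
    return counts


def convolve(a: list[int], b: list[int]) -> list[int]:
    """Coefficient-wise version of theta(tau, i) * theta(tau, j), truncated."""
    return [sum(a[l] * b[m - l] for l in range(m + 1)) for m in range(len(a))]


def jacobi_four_squares(n: int) -> int:
    """Jacobi's formula for r(n, 4) with n >= 1 (external fact, see lead-in)."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)


LIMIT = 50
r1 = one_square_counts(LIMIT)
r2 = convolve(r1, r1)
r4 = convolve(r2, r2)
print(r4[:8])
```

Every entry of `r4` being positive is the content of the four squares theorem on this range.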

Since q = e^{2πiτ}, our function θ is Z-periodic. This means that we already have (assuming absolute convergence for
now)

\[ \theta\!\left( \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}(\tau),\ 4 \right) = \theta(\tau + 1, 4) = \theta(\tau, 4) \]

for all τ, so the action under T is “correct.” For the action under S, our goal is to show that θ(−1/τ, 4) is related
to θ(τ, 4). Define the function θ(τ) = θ(τ, 1); then we know that θ^4 = θ(τ, 4), so it’s good enough to figure out the
behavior of θ. The coefficients of θ count the number of ways to write a number as a single perfect square, so
θ(τ) = Σ_{d∈Z} e^{2πi d² τ} (possibly with an extra constant factor), and then Poisson summation gives us

\[ \theta\!\left( -\frac{1}{4\tau} \right) = \sqrt{-2i\tau}\ \theta(\tau). \]

Notice, though, that this transformation gives us the matrix

\[ \begin{bmatrix} 0 & -1 \\ 4 & 0 \end{bmatrix}, \]

which does not have determinant 1. Instead, we’ll have to use

\[ \begin{bmatrix} 1 & 0 \\ 4 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 1/4 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & -1 \\ 4 & 0 \end{bmatrix}, \]

which maps τ to τ/(4τ + 1), and this gives us (with some algebra)

\[ \theta\!\left( \frac{\tau}{4\tau + 1} \right) = \sqrt{4\tau + 1}\ \theta(\tau). \]

This implies (raising everything to the fourth power) that

\[ \theta\!\left( \frac{\tau}{4\tau + 1},\ 4 \right) = (4\tau + 1)^2\,\theta(\tau, 4). \]
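
Both the matrix factorization and the resulting weight-2 transformation can be checked numerically. This is a sketch under our own assumptions: the sample point τ = 0.2 + 0.5i and the series cutoff are arbitrary, and θ(τ) is taken to be Σ_{d∈Z} e^{2πi d² τ} with no extra constant factor.

```python
import cmath
from fractions import Fraction as F


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# The factorization displayed above, in exact rational arithmetic.
left = [[F(0), F(1, 4)], [F(-1), F(0)]]
middle = [[F(1), F(-1)], [F(0), F(1)]]
right = [[F(0), F(-1)], [F(4), F(0)]]
product = matmul(matmul(left, middle), right)


def theta(tau: complex, cutoff: int = 20) -> complex:
    """theta(tau) = sum over d in Z of exp(2 pi i d^2 tau), truncated at |d| <= cutoff."""
    return 1 + 2 * sum(cmath.exp(2j * cmath.pi * d * d * tau)
                       for d in range(1, cutoff + 1))


tau = 0.2 + 0.5j                 # arbitrary sample point in the upper half-plane
image = tau / (4 * tau + 1)      # action of [[1, 0], [4, 1]] on tau
lhs = theta(image) ** 4
rhs = (4 * tau + 1) ** 2 * theta(tau) ** 4
print(product, abs(lhs - rhs))
```

The printed difference should be at the level of floating-point rounding.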

So now we’ve verified the modular form equation

\[ \theta(\gamma(\tau), 4) = (c\tau + d)^2\,\theta(\tau, 4) \]

under action of

\[ \gamma = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad \gamma = \begin{bmatrix} 1 & 0 \\ 4 & 1 \end{bmatrix}. \]

Before, we cared about being weakly modular with respect to S and T, and now
we’ll look at being weakly modular with respect to these new matrices. But here we’re describing things with respect
to the actual generators, and we want something more general:

Definition 120
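Let N be a positive integer. The principal congruence subgroup of level N is

\[ \Gamma(N) = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in SL_2(\mathbb{Z}) : \begin{bmatrix} a & b \\ c & d \end{bmatrix} \equiv \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \pmod{N} \right\}, \]

where congruence is taken entry-wise.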

As a check, we can make sure Γ(N) is actually a subgroup of SL2 (Z) – this can be done just by running through
the axioms. Notice that Γ(1) = SL2 (Z), and also that Γ(N) has finite index in SL2 (Z). Later, we’ll state a result
that helps us explicitly calculate this index, but first we’ll finally make our important definitions:

Definition 121
A subgroup Γ ⊂ SL2 (Z) is a congruence subgroup if Γ(N) ⊂ Γ for some N. Then Γ is called a congruence
subgroup of level N.

Definition 122
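Some special subgroups:

\[ \Gamma_0(N) = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in SL_2(\mathbb{Z}) : \begin{bmatrix} a & b \\ c & d \end{bmatrix} \equiv \begin{bmatrix} * & * \\ 0 & * \end{bmatrix} \pmod{N} \right\} \]

and

\[ \Gamma_1(N) = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in SL_2(\mathbb{Z}) : \begin{bmatrix} a & b \\ c & d \end{bmatrix} \equiv \begin{bmatrix} 1 & * \\ 0 & 1 \end{bmatrix} \pmod{N} \right\}, \]

where ∗ can be any element.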

This may seem a bit unmotivated, but we’ll see why they’re useful later. It’s possible to find generators explicitly
for Γ(N), Γ0 (N), and Γ1 (N), but it’s a bit annoying.

Proposition 123
Let N be a positive integer. Then the following results hold:
1. Γ(N) ⊂ Γ_1(N) ⊂ Γ_0(N) ⊂ SL_2(Z).

2. The map SL_2(Z) → SL_2(Z/NZ) (the natural map) is a surjection with kernel Γ(N), so SL_2(Z)/Γ(N) is
isomorphic to SL_2(Z/NZ).

3. Consider the map Γ_1(N) → Z/NZ, sending the matrix with entries a, b, c, d (rows (a, b) and (c, d)) to b mod N.
Then this map is a surjection with kernel Γ(N), so Γ_1(N)/Γ(N) is isomorphic to Z/NZ.

4. Consider the map Γ_0(N) → (Z/NZ)^*, sending the matrix with entries a, b, c, d (rows (a, b) and (c, d)) to d mod N.
Then this map is a surjection with kernel Γ_1(N), so Γ_0(N)/Γ_1(N) is isomorphic to (Z/NZ)^*.

The main point is that Γ(N), Γ_0(N), Γ_1(N) are not super mysterious: we can find explicit correspondences between
them. We’ll just do a quick sketch of the second point:

Proof sketch of (2). The identity element of SL_2(Z/NZ) is the class of the identity matrix, so Γ(N) is indeed the
kernel here by definition. Showing surjectivity just requires us to lift SL_2(Z/NZ) to SL_2(Z), and the Euclidean
algorithm tells us that this can be done.

This isomorphism allows us to find explicit formulas for the index of these subgroups, and this tells us that Γ0 and
Γ1 have finite index (which can be explicitly calculated as well).
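
For small N the indices can simply be counted. The sketch below uses the surjection onto SL_2(Z/NZ) from the proposition, so that each index equals |SL_2(Z/NZ)| divided by the size of the subgroup's image, and compares the counts against the standard closed forms |SL_2(Z/NZ)| = N³ ∏(1 − 1/p²), [SL_2(Z) : Γ_0(N)] = N ∏(1 + 1/p) and [SL_2(Z) : Γ_1(N)] = N² ∏(1 − 1/p²), with products over primes p dividing N. Those formulas are not stated in these notes; they are quoted as well-known results and treated here as assumptions to be tested.

```python
from fractions import Fraction


def prime_divisors(n: int) -> list[int]:
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def brute_force_indices(n: int) -> tuple[int, int, int]:
    """(|SL_2(Z/nZ)|, index of Gamma_0(n), index of Gamma_1(n)) by enumeration."""
    one = 1 % n
    residues = range(n)
    group = [(a, b, c, d)
             for a in residues for b in residues
             for c in residues for d in residues
             if (a * d - b * c) % n == one]
    gamma0 = [m for m in group if m[2] == 0]                      # c = 0 mod n
    gamma1 = [m for m in gamma0 if m[0] == one and m[3] == one]   # also a = d = 1
    return len(group), len(group) // len(gamma0), len(group) // len(gamma1)


def formula_indices(n: int) -> tuple[Fraction, Fraction, Fraction]:
    """The closed forms quoted in the lead-in (assumed, not from the notes)."""
    minus, plus = Fraction(1), Fraction(1)
    for p in prime_divisors(n):
        minus *= 1 - Fraction(1, p * p)
        plus *= 1 + Fraction(1, p)
    return n ** 3 * minus, n * plus, n * n * minus


for level in range(1, 9):
    print(level, brute_force_indices(level))
```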

Definition 124
Let the factor of automorphy be j(γ, τ) = cτ + d for a matrix

\[ \gamma = \begin{bmatrix} a & b \\ c & d \end{bmatrix}. \]

Also, let the weight-k operator [γ]_k be defined as

\[ (f[\gamma]_k)(\tau) = j(\gamma, \tau)^{-k} f(\gamma(\tau)). \]

We find after unpacking definitions that this just gives us a nice way to say that f is weakly modular of weight k if
f [γ]k = f . We can also translate properties (like those about group actions) that we previously proved into operator
notation:

Lemma 125
We have, for all τ ∈ H, γ, γ′ ∈ SL_2(Z),
• j(γγ′, τ) = j(γ, γ′(τ)) j(γ′, τ).

• (γγ′)(τ) = γ(γ′(τ)).

• [γγ′]_k = [γ]_k [γ′]_k (as operators).

• Im(γ(τ)) = Im τ / |j(γ, τ)|^2.

• dγ(τ)/dτ = 1 / j(γ, τ)^2.

This is making sure that the action under γγ′ is the same as the action under γ′ followed by the action under γ.
The same proofs work – we’re just using operator notation, and the idea is that there’s a more intrinsic view of these
operators independent of the modular form f. One nicer way to show these that we haven’t discussed yet, though, is
to define a group action of SL_2(Z) on column vectors by matrix multiplication. Then we find that

\[ \gamma \begin{bmatrix} \tau \\ 1 \end{bmatrix} = \begin{bmatrix} \gamma(\tau) \\ 1 \end{bmatrix} j(\gamma, \tau), \]

and this allows us to show the identities easily.


The point is that if f is weakly modular with respect to a set of matrices, it’s also weakly modular with respect to
the group that is generated by those matrices. It turns out that the theta function θ(τ, 4) is weakly modular of weight
2 with respect to the subgroup Γ0 (4).

Diamond and Shurman 1.2 (continued) – Dhruv Rohatgi


We’ll now formally define modular forms with respect to the congruence subgroup and apply this to the four square
theorem.

Definition 126
Let Mk (Γ) be the vector space of modular forms of weight k over a congruence subgroup Γ. A function f : H → C
is a modular form of weight k with respect to Γ if
1. f is weakly modular of weight k with respect to Γ,

2. f is holomorphic on H,

3. f is holomorphic at the cusps.

Let’s define what this last point actually means by thinking about what happened when Γ = SL2 (Z). In that case,
we knew that f (z) = f (z + 1), so f was Z-periodic, and there was a transform f˜ : D \ {0} → C defined by

f (z) = f˜(e 2πiz ) = f˜(q).

Because f was holomorphic on the upper half-plane, f˜ was holomorphic on the punctured unit disk, so there was a
Laurent expansion in q: our condition forced us to say that there were no negative coefficients there.
Now, in our general case, we know that $\Gamma \supset \Gamma(N) \ni \begin{bmatrix} 1 & N \\ 0 & 1 \end{bmatrix}$, where this last matrix represents translation by N.

So a function that is weakly modular with respect to Γ satisfies f (z) = f (z + N) for some N, and now we can define

f (z) = f˜(e 2πiz/N ),

where we’ll define $q = q_N = e^{2\pi i z/N}$. As before, we have a Laurent expansion in q, and we’ll say that a function f
which just satisfies (1) and (2) (that is, weakly modular of weight k with respect to Γ, and holomorphic on H) is also
holomorphic at ∞ if f˜ has a holomorphic extension to q = 0.
But this isn’t strong enough for a general subgroup Γ: we need the vector space of modular forms of a
fixed weight to be finite dimensional. In the SL2 (Z) case, we can map any rational point on the real line to ∞ with a
transformation in SL2 (Z), but this isn’t quite so nice in the general case:

Definition 127
Let Γ be a congruence subgroup. Then a cusp of Γ is an equivalence class of Q∪{∞} under action by Γ. (Because
Γ has finite index, there are only a finite number of cusps.)

Definition 128
A function f : H → C satisfying (1) and (2) above is holomorphic at the cusps if f [γ]k is holomorphic at ∞ for all
γ ∈ SL2 (Z).

(Notably, this maps any cusp to infinity.) At first glance, this seems a bit hard to work with: if we’re given a
q-expansion, it’s easy to say that it’s holomorphic at ∞, but not so much with other cusps. Fortunately, we have an
easier condition to check:

Proposition 129
Let f : H → C be holomorphic (on H) and weakly modular of weight k with respect to a congruence subgroup Γ.
Suppose f is holomorphic at ∞, with q-expansion (in q = qN as above)
$$f(z) = \sum_{n=0}^{\infty} a_n q^n$$
whose coefficients satisfy $a_n = O(n^r)$ for some constant r (that is, they are polynomially bounded). Then f is a modular form
with respect to Γ (it is holomorphic at the cusps).

Proof sketch. Doing some calculations (bounding the sum by an integral) shows that if z = x + iy, we have
$$|f(z)| \le c + \frac{c_1}{y^{r+1}}.$$
Now if we let α ∈ SL2 (Z), it suffices to show that q · f (αz) converges to 0 as q → 0.
Then q · f (αz) extends to a holomorphic function with a zero, so we can divide out the q again. To do this, we use
the above bound: taking $q = e^{2\pi i z/N}$ with x ∈ [0, N] (without loss of generality),
$$\operatorname{Im}(\alpha z) = \frac{\operatorname{Im} z}{|cz + d|^2},$$
and using the bound on x and Im z, this is at least
$$\frac{y}{N^2 + y^2} \ge \frac{C}{y}$$
for large y (up to constants depending on α). So |f (αz)| is at most $O(y^{r+1})$, but
q is exponentially decaying, so we’re done.

We’ll return to the motivating example from David’s lecture:

Proposition 130
θ4 (z) = θ(z, 4) is a modular form in M2 (Γ0 (4)).

Proof. We showed in the previous lecture that θ4 (z) is weakly modular. To show that this is holomorphic, we know
that
$$\theta_4(z) = \sum_{n=0}^{\infty} r(n, 4) q^n.$$
On any compact subset of H, |q| is bounded by a constant less than 1, and r (n, 4) is bounded polynomially, so this will converge absolutely
and uniformly. And because $|r(n, 4)| = O(n^4)$, we can use the previous proposition to show that θ4 is holomorphic at the cusps as
well.

We’ll now look at some examples that help us characterize our vector space of modular forms. Now that we can
have modular forms of any integer weight, we now write the weight 2 Eisenstein series as

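$$G_2(z) = \sum_{n \in \mathbb{Z}} \sum_{m \in \mathbb{Z}} \frac{1}{(nz + m)^2} = 2\zeta(2) - 8\pi^2 \sum_{n=1}^{\infty} \sigma_1(n) q^n,$$
where the term with n = m = 0 is omitted from the double sum.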

This isn’t a modular form in any reasonable sense, but here’s how we can use it:

Proposition 131
We have the functional equation
$$G_2(\gamma z) = (cz + d)^2 G_2(z) - 2\pi i c (cz + d)$$
for any $\gamma = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ in SL2 (Z). Thus, for any positive integer N, we can define
$$G_{2,N}(z) = G_2(z) - N G_2(Nz),$$
and this will be a modular form of weight 2 over Γ0 (N).

Proof sketch. We saw in a previous lecture (with some difficulty) that
$$G_2\left(-\frac{1}{z}\right) = z^2 G_2(z) - 2\pi i z,$$
and also that
$$G_2(z + 1) = G_2(z).$$
Thus, the functional equation is satisfied for $\gamma = S, S^{-1}, T, -I$ (if we just plug in the correct values of c and d for each).
Suppose that $G_2(\gamma_1 z) = (c_1 z + d_1)^2 G_2(z) - 2\pi i c_1 (c_1 z + d_1)$, and also that $G_2(\gamma_2 z) = (c_2 z + d_2)^2 G_2(z) - 2\pi i c_2 (c_2 z + d_2)$.
Then we can check by direct computation that $G_2(\gamma_1 \gamma_2 z)$ also satisfies the desired equation, and since $S, S^{-1}, T, -I$
generate SL2 (Z), we’ve shown the functional equation for all γ.
Now if we define G2,N as above, we can show that this is weakly modular with respect to Γ0 (N) by direct computation.
To show holomorphicity, we do something similar as with θ4 : since the coefficients of G2 (z) and NG2 (Nz) are
polynomially bounded and we’re subtracting two such terms, their difference is also polynomially bounded. Using the
proposition above then shows that G2,N (z) is a modular form.
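
To make the functional equation concrete, here is a rough numerical sketch in Python that evaluates G2 from its q-expansion, truncated at 200 terms (an arbitrary cutoff chosen for this illustration). It checks the S-transformation, the consequence G2 (i) = π obtained by plugging z = i into that transformation, and the weight-2 invariance of G2,2 under one sample matrix of Γ0 (2).

```python
import cmath
import math


def sigma1(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


TERMS = 200  # truncation point of the q-expansion (an assumption of this sketch)
SIGMA1 = [0] + [sigma1(n) for n in range(1, TERMS + 1)]


def G2(z):
    """Truncated q-expansion 2*zeta(2) - 8*pi^2 * sum_{n>=1} sigma_1(n) q^n."""
    q = cmath.exp(2j * math.pi * z)
    series = sum(SIGMA1[n] * q ** n for n in range(1, TERMS + 1))
    return math.pi ** 2 / 3 - 8 * math.pi ** 2 * series


def G2N(z, N):
    """The corrected series G_{2,N}(z) = G_2(z) - N * G_2(N z)."""
    return G2(z) - N * G2(N * z)


z = 0.3 + 0.9j

# S-transformation: G2(-1/z) = z^2 G2(z) - 2*pi*i*z
err_S = abs(G2(-1 / z) - (z ** 2 * G2(z) - 2j * math.pi * z))

# Setting z = i in the S-transformation forces G2(i) = pi
err_i = abs(G2(1j) - math.pi)

# gamma = [[1, 0], [2, 1]] lies in Gamma_0(2); G_{2,2} should have weight 2 under it
c, d = 2, 1
gamma_z = z / (c * z + d)
err_mod = abs(G2N(gamma_z, 2) - (c * z + d) ** 2 * G2N(z, 2))

print(err_S, err_i, err_mod)
```

The correction term −2πi c(cz + d) is exactly what cancels when we pass from G2 to G2,N.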

Example 132
Consider N = 2 in the above calculation, giving us the series G2,2 (z).

Then
$$G_{2,2}(z) = G_2(z) - 2G_2(2z),$$
and looking at the q-expansion shows that
$$G_{2,2}(z) = -2\zeta(2) - 8\pi^2 \sum_{n=1}^{\infty} \left( \sigma_1(n) q^n - 2\sigma_1(n) q^{2n} \right) = -2\zeta(2) - 8\pi^2 \sum_{n=1}^{\infty} \left( \sigma_1(n) - 2\sigma_1(n/2) \right) q^n,$$
where σ1 is zero if its argument is not an integer. Notice that 2σ1 (n/2) is actually the sum of all even divisors of
n, so we can write
$$G_{2,2}(z) = -2\zeta(2) - 8\pi^2 \sum_{n=1}^{\infty} \Bigg( \sum_{d \mid n,\ 2 \nmid d} d \Bigg) q^n.$$

This is a modular form with respect to Γ0 (2).

Example 133
Take N = 4, giving us the series G2,4 (z).

This, similarly, shows that
$$G_{2,4}(z) = -\pi^2 - 8\pi^2 \sum_{n=1}^{\infty} \Bigg( \sum_{d \mid n,\ 4 \nmid d} d \Bigg) q^n.$$

This is a modular form with respect to Γ0 (4). Since G2,2 (z) is a modular form with respect to Γ0 (2), it’s also a modular
form with respect to Γ0 (4). So now we have three modular forms with respect to Γ0 (4), but it turns out (we’ll learn
in May, perhaps) that the dimension of M2 (Γ0 (4)) = 2. This means that θ4 is in the span of G2,2 and G2,4 (because
those two are linearly independent), and we also know the first few coefficients

θ4 (z) = 1 + 8q + · · · .

This is enough to tell us what linear combination to take: we actually just want
$$\theta_4(z) = -\frac{1}{\pi^2} G_{2,4}(z).$$
So we know what the q-expansion looks like: we have
$$\theta_4(z) = 1 + 8 \sum_{n=1}^{\infty} \Bigg( \sum_{d \mid n,\ 4 \nmid d} d \Bigg) q^n.$$
And the coefficients here are always positive, because d = 1 always divides n (and is not divisible by 4), so we’ve proven Lagrange’s four squares
theorem!
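
The resulting formula r (n, 4) = 8 · (sum of the divisors of n not divisible by 4) can be checked directly for small n. Here is a minimal brute-force sketch in Python; the function names are hypothetical and the range of n is arbitrary.

```python
from itertools import product
from math import isqrt


def r4_bruteforce(n):
    """Count integer 4-tuples (a, b, c, d) with a^2 + b^2 + c^2 + d^2 = n."""
    m = isqrt(n)
    return sum(
        1
        for v in product(range(-m, m + 1), repeat=4)
        if sum(x * x for x in v) == n
    )


def r4_divisor_formula(n):
    """8 times the sum of the positive divisors of n that are not divisible by 4."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)


if __name__ == "__main__":
    for n in range(1, 21):
        print(n, r4_bruteforce(n), r4_divisor_formula(n))
```

Since the divisor sum always contains d = 1, every count is at least 8.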

11 March 12, 2020


The two speakers (Kaarel and Anton) for today aren’t coming due to sickness, so we’ll have to change the schedule a
bit. (They’ll present after spring break.)

The deadline for our paper abstract will stay the same, but we will have an extension. (We should make sure we
do submit it by the end of break, though.)
Office hours, as well as class, will be done by Zoom. We’ll be livestreaming student lectures, so the time zone for
students is important. We can either use a tablet or use slides / Beamer.

12 March 31, 2020

Diamond and Shurman 1.3 – Kaarel Haenni


We’ll start by recalling a lemma from a previous lecture: two bases give the same lattice if and only if we have a
basechange matrix γ ∈ SL2 (Z) that gets us from one to the other.

Definition 134
A complex torus is a set C/Λ of additive cosets {z + Λ : z ∈ C} with algebraic structure as a quotient of C:
addition is given by addition in C, and geometric structure is also induced by C.

This turns out to make C/Λ a Riemann surface – we won’t define this rigorously, but we can just think of this as
things looking “locally like C.” A good way to think about C/Λ is that we have a fundamental domain of the lattice Λ,
except we identify opposite edges. Topologically, there are neighborhoods that look like normal neighborhoods in C,
but there are also neighborhoods that “cross over the boundary:” quotienting one pair of edges gives us a tube, and
then quotienting the other pair gives us a torus.
We might have seen this next result from regular complex analysis before:

Theorem 135 (Open mapping theorem for Riemann surfaces)


Let X and Y be two Riemann surfaces, and let f : X → Y be a holomorphic map. Then either f is constant or f
is an open mapping (it maps open subsets to open subsets).

(Here, holomorphic means that the map looks holomorphic on neighborhoods that “look like” C.)

Proof sketch. Restrict the map to neighborhoods and then apply the usual open mapping theorem; now take a union
over the whole surface.

Corollary 136
Let X and Y be compact Riemann surfaces, and say that f : X → Y is holomorphic. Then f is either constant
or a surjection.

Proof. Suppose f is not constant. A theorem of topology says that if X is compact, then f (X) is compact, and therefore f (X) is closed. But X
is also the whole space, so X is also open, meaning f (X) is open by the open mapping theorem. Thus f (X) is closed
and open, and Y is connected – therefore the decomposition $Y = f(X) \sqcup (Y \setminus f(X))$ into disjoint open sets means that f (X) must be the whole space.

Now we’ll look at a specific set of maps that we care about: in particular, everything from before also applies
because complex tori are compact Riemann surfaces.

Proposition 137
Suppose φ : C/Λ → C/Λ′ is holomorphic. Then there exist m, b ∈ C such that mΛ ⊂ Λ′ , and the map has the
explicit form φ(z + Λ) = mz + b + Λ′ , where m and b are unique mod Λ′ . Additionally, this map is bijective if and
only if mΛ = Λ′ .

Proof sketch. A lifting theorem from topology means that we can lift φ to a map between the universal covering
spaces. The universal covering space of a complex torus is C, so we have a map φ̃ from C → C which satisfies the
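following commutative diagram:
$$\begin{array}{ccc}
\mathbb{C} & \xrightarrow{\ \tilde{\varphi}\ } & \mathbb{C} \\
{\scriptstyle \pi_\Lambda} \downarrow & & \downarrow {\scriptstyle \pi_{\Lambda'}} \\
\mathbb{C}/\Lambda & \xrightarrow{\ \varphi\ } & \mathbb{C}/\Lambda'
\end{array}$$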

Now for any λ ∈ Λ, consider the function fλ (z) = φ̃(z + λ) − φ̃(z). Since the diagram commutes, fλ must map into Λ′
(z + λ and z project to the same point of C/Λ, so their images under φ̃ must lie in the same coset of Λ′).
But Λ′ is a discrete subset of C, so any continuous map from C into Λ′ has to be constant. Thus its derivative is
zero, which means that φ̃′ (z + λ) = φ̃′ (z). But now the derivative of φ̃ is holomorphic and it’s also Λ-periodic – this
means that it is bounded, and therefore it is constant by Liouville’s theorem. This means that φ̃(z) = mz + b, as
desired, and then finding φ from φ̃ is diagram chasing with the above commutative diagram.
And this is an if and only if statement: any map of this form is holomorphic, as long as mΛ ⊆ Λ′ . Explicitly finding
two elements that map to the same element (showing that it’s not injective) gives us the last part of this theorem.

We’re considering holomorphic maps, which preserves the geometric structure, but we also want to preserve the
algebraic structure:

Corollary 138
Let φ : C/Λ → C/Λ′ be holomorphic, or equivalently φ(z + Λ) = mz + b + Λ′ with mΛ ⊆ Λ′ . Then the following
are equivalent:
• φ is a group homomorphism.

• b ∈ Λ′ , which means that φ(z + Λ) = mz + Λ′ .

• φ(0) = 0.

This is not too hard to show – we just plug in the properties of a group homomorphism. In particular, this tells
us that there exists a nonzero holomorphic homomorphism from C/Λ to C/Λ′ if and only if there is a nonzero m ∈ C such that
mΛ ⊆ Λ′ (and we have an isomorphism if there is equality).

Example 139
Consider a lattice Λ = ω1 Z ⊕ ω2 Z with the usual normalization τ = ω1 /ω2 ∈ H. Let Λτ = τ Z ⊕ Z, and consider the
map
$$\varphi_\tau : \mathbb{C}/\Lambda \to \mathbb{C}/\Lambda_\tau$$
given by φτ (z + Λ) = z/ω2 + Λτ .

We can notice that this indeed maps C/Λ to C/Λτ – this is an isomorphism of complex tori, and in fact this sends
our complex tori (up to SL2 (Z)) to some τ ∈ H. To be more precise, tori are isomorphic if they are sent to the same
orbit – the isomorphism classes of complex tori are in bijection with orbits of SL2 (Z) of the upper half plane.

Definition 140
An isogeny is a nonzero holomorphic homomorphism between complex tori.

We have the following properties:


• Holomorphic isomorphisms are isogenies.

• Isogenies surject.

• Isogenies have finite kernel. (This is because complex analysis tells us that it has a discrete kernel, and then we
have a discrete set in a compact space.)
There are two main examples of isogenies that we care about: these might not be all of the isogenies, and this has
to do with complex multiplication.

Example 141
The multiply-by-N map [N] : C/Λ → C/Λ sends a point z + Λ to Nz + Λ.

The kernel here is important: it’s the set of points z + Λ such that Nz ∈ Λ, and we call this the set of N-torsion
points. Letting E = C/Λ, we notice that we have a group structure:

E[N] ≅ Z/NZ × Z/NZ.

If we draw a picture, this basically means we split up our complex torus into an N × N grid.

Example 142
Let C ⊆ E[N] be a cyclic group of order N in the N-torsion subgroup. Then the preimage of C under the quotient map
C → C/Λ forms a superlattice of Λ, which we also call C – we can describe it explicitly in terms of a basis (ω1 , ω2 )
of Λ and a generator of C.

Then we get a map π : C/Λ → C/C, where z + Λ is sent to z + C (this is a projection map). We can convince
ourselves that this is a holomorphic homomorphism, and its kernel is C. Pictorially, the easiest example of this is to
take the subgroup generated by ω2 /N + Λ: then C/C looks like a slice of the previous fundamental domain, and then we just
project down.

Proposition 143
Any isogeny φ : C/Λ → C/Λ′ can be factored as
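$$\varphi : \mathbb{C}/\Lambda \xrightarrow{\ [n]\ } \mathbb{C}/\Lambda \xrightarrow{\ \pi\ } \mathbb{C}/nK \xrightarrow{\ \sim\ } \mathbb{C}/\Lambda',$$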

where K is the kernel of φ.

Basically we multiply by n, then do a cyclic quotient, and then we do an isomorphism. (Proof omitted for now.)

Proposition 144
C/Λ is isogenous to C/Λ′ if there exists an isogeny between the two. This is an equivalence relation.

Proof. Being reflexive and transitive are both easy – symmetry is a bit trickier, but the idea is that there is a dual
isogeny that we can explicitly construct.

This dual isogeny has a few properties:


• The multiply-by-N map’s dual is itself.

• The dual of a cyclic quotient is the dual of a different cyclic quotient of the same order (the “orthogonal one”)
and then scaling up by N.

• The dual of an isomorphism is its inverse.

• The dual of a composition is the composition of the duals, but in the opposite order (just like inverses of functions).

• The dual of the dual is itself.

• The sum of duals is the dual of the sum.


For the purpose that we’re using these for, which is studying modular forms, it turns out that isogeny is a better
equivalence relation than isomorphism.
The last topic of this section is the Weil pairing, but we don’t have too much time to talk about it: recall that

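$$E[N] \cong \langle \omega_1/N + \Lambda \rangle \times \langle \omega_2/N + \Lambda \rangle,$$
and let µN be the group of Nth roots of unity. We can define an inner product eN : E[N] × E[N] → µN as follows:
given P, Q ∈ E[N], we can assemble a coefficient matrix γ (with entries in Z/NZ) such that
$$\begin{bmatrix} P \\ Q \end{bmatrix} = \gamma \begin{bmatrix} \omega_1/N + \Lambda \\ \omega_2/N + \Lambda \end{bmatrix}.$$
We then define
$$e_N(P, Q) = e^{2\pi i \det \gamma / N},$$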

and we can check that this is actually independent of our choice of basis for Λ – this is because determinants are
basically the ratio of areas for fundamental cells.
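
Here is a small Python sketch of this definition in coordinates. We assume (for this illustration only) that a torsion point is stored as a pair (a, b) of integers mod N, standing for a·ω1 /N + b·ω2 /N + Λ, so that the rows of γ are just the coordinate pairs of P and Q. To stay in exact arithmetic, the main function returns the exponent det γ mod N rather than the root of unity itself.

```python
import cmath
import math


def weil_exponent(P, Q, N):
    """Return det(gamma) mod N, where the rows of gamma are the coordinates of P and Q."""
    a, b = P
    c, d = Q
    return (a * d - b * c) % N


def weil_pairing(P, Q, N):
    """Return e_N(P, Q) = exp(2*pi*i * det(gamma) / N) as a complex number."""
    return cmath.exp(2j * math.pi * weil_exponent(P, Q, N) / N)


def add(P, Q, N):
    """Add two N-torsion points coordinate-wise."""
    return ((P[0] + Q[0]) % N, (P[1] + Q[1]) % N)


N = 5
P, Q, R = (2, 3), (1, 4), (3, 3)

# The basis points pair to the primitive root exp(2*pi*i/N)
print(weil_exponent((1, 0), (0, 1), N))

# Bilinearity in the first slot and antisymmetry, written additively in the exponent
print((weil_exponent(add(P, R, N), Q, N)
       - weil_exponent(P, Q, N) - weil_exponent(R, Q, N)) % N)
print((weil_exponent(P, Q, N) + weil_exponent(Q, P, N)) % N)
```

The last two printed values are 0, reflecting that the determinant is bilinear and alternating in its rows.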

Diamond and Shurman 1.4 – Anton Trygub


Today, we’re going to talk about the connection between elliptic curves and complex tori. We’ll start with any lattice
Λ and consider the Weierstrass ℘ function
$$\wp(z) = \frac{1}{z^2} + \sum_{\omega \in \Lambda - \{0\}} \left( \frac{1}{(z - \omega)^2} - \frac{1}{\omega^2} \right)$$
for all z ∈ C not in the lattice Λ. (We’ll show later on that this is an absolutely convergent sum.) Notice that
the derivative of this function is
$$\wp'(z) = -2 \sum_{\omega \in \Lambda} \frac{1}{(z - \omega)^3},$$

and now if we consider any ω ∈ Λ, we can consider the function

f (z) = ℘(z + ω) − ℘(z).

Then the derivative is


f ′ (z) = ℘′ (z + ω) − ℘′ (z) = 0,

because ℘′ is periodic, so f (z) is constant. And then we can calculate the exact constant by substituting in z = −ω/2: then
$$f\left(-\frac{\omega}{2}\right) = \wp\left(\frac{\omega}{2}\right) - \wp\left(-\frac{\omega}{2}\right) = 0,$$
because ℘ is an even function. Therefore ℘(z + ω) = ℘(z) for all ω ∈ Λ, and ℘ is periodic.

Definition 145
The Eisenstein series for a lattice Λ are defined as
$$G_k(\Lambda) = \sum_{\omega \in \Lambda - \{0\}} \frac{1}{\omega^k}.$$

We’re going to make a connection between elliptic curves with complex tori now through the next few results:

Proposition 146
The Laurent expansion of ℘(z) is
$$\wp(z) = \frac{1}{z^2} + \sum_{n \ge 2,\ n \text{ even}} (n + 1) G_{n+2}(\Lambda) z^n$$
for all 0 < |z| < inf{|ω| : ω ∈ Λ − {0}}.

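Proof. We can rewrite
$$\frac{1}{(z - \omega)^2} - \frac{1}{\omega^2} = \frac{1}{\omega^2} \left( \frac{1}{(1 - z/\omega)^2} - 1 \right) = \frac{1}{\omega^2} \left( \left( 1 + \frac{z}{\omega} + \left( \frac{z}{\omega} \right)^2 + \cdots \right)^2 - 1 \right),$$
and then expanding this out yields
$$= \frac{z}{\omega^3} \left( 2 + 3 \frac{z}{\omega} + 4 \left( \frac{z}{\omega} \right)^2 + \cdots \right).$$
Taking $p = \sup_{\omega \in \Lambda - \{0\}} \frac{|z|}{|\omega|}$ (which is less than 1 by assumption), this sum is bounded absolutely by
$$\left| \frac{z}{\omega^3} \right| \left( 2 + 3p + 4p^2 + \cdots \right) < \infty.$$
Now summing over ω, since $\sum 1/|\omega|^3$ was shown to be absolutely convergent, we do indeed have absolute convergence
of this series.
So we can now rearrange terms to find that
$$\wp(z) = \frac{1}{z^2} + \sum_{\omega \in \Lambda - \{0\}} \sum_{n=1}^{\infty} (n + 1) \frac{z^n}{\omega^{n+2}},$$
and the terms with odd n cancel because the contributions of ω and −ω have opposite signs. This yields the desired result
by evaluating the inner sum to be Gn+2 (Λ), and now we’ve shown that ℘(z) is well-defined as well.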

Proposition 147
We have the relation
(℘′ (z))2 = 4℘(z)3 − g2 (Λ)℘(z) − g3 (Λ),

where g2 (Λ) = 60G4 (Λ) and g3 (Λ) = 140G6 (Λ).

Proof. From the above proposition, we know that
$$\wp(z) = \frac{1}{z^2} + 3G_4(\Lambda) z^2 + 5G_6(\Lambda) z^4 + O(z^6),$$
and also that
$$\wp'(z) = -\frac{2}{z^3} + 6G_4(\Lambda) z + 20G_6(\Lambda) z^3 + O(z^5).$$
By expansion, we see that ℘′ (z)^2 and 4℘(z)^3 − g2 (Λ)℘(z) − g3 (Λ) are both
$$\frac{4}{z^6} - \frac{24G_4(\Lambda)}{z^2} - 80G_6(\Lambda) + O(z^2).$$
Thus their difference is holomorphic, but it is also Λ-periodic. Thus it is bounded and constant, and taking z → 0 makes
this difference go to 0, and thus the difference is zero.
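
The differential equation can also be tested numerically. The sketch below, in Python, truncates every lattice sum to the symmetric box |m|, |n| ≤ M, so the two sides only agree approximately. The lattice, the cutoff M, and the sample point z are arbitrary choices made for this illustration.

```python
def lattice_points(w1, w2, M):
    """Nonzero lattice points m*w1 + n*w2 with |m|, |n| <= M (a symmetric truncation)."""
    return [
        m * w1 + n * w2
        for m in range(-M, M + 1)
        for n in range(-M, M + 1)
        if (m, n) != (0, 0)
    ]


def wp(z, pts):
    """Truncated Weierstrass function: 1/z^2 + sum (1/(z - w)^2 - 1/w^2)."""
    return 1 / z ** 2 + sum(1 / (z - w) ** 2 - 1 / w ** 2 for w in pts)


def wp_prime(z, pts):
    """Truncated derivative: -2 * sum over all lattice points (including 0) of 1/(z - w)^3."""
    return -2 / z ** 3 - 2 * sum(1 / (z - w) ** 3 for w in pts)


def eisenstein(k, pts):
    """Truncated lattice Eisenstein series G_k = sum of w^(-k) over nonzero lattice points."""
    return sum(w ** (-k) for w in pts)


pts = lattice_points(1.0, 0.3 + 1.2j, 60)
g2 = 60 * eisenstein(4, pts)
g3 = 140 * eisenstein(6, pts)

z = 0.31 + 0.17j
lhs = wp_prime(z, pts) ** 2
rhs = 4 * wp(z, pts) ** 3 - g2 * wp(z, pts) - g3
relative_error = abs(lhs - rhs) / abs(lhs)
print(relative_error)
```

The relative error shrinks as M grows, since the neglected tails of the lattice sums decay like a power of 1/M.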

Proposition 148
Let Λ be a lattice generated by (ω1 , ω2 ), and say that ω3 = ω1 + ω2 . Then the cubic equation satisfied by ℘(z)
and ℘′ (z) above, $y^2 = 4x^3 - g_2(\Lambda)x - g_3(\Lambda)$, can be rewritten as
$$y^2 = 4(x - e_1)(x - e_2)(x - e_3),$$
where the $e_i = \wp(\omega_i/2)$ are distinct.

Proof. The function f (z) = ℘(z) − t is meromorphic for any t, so there is some translation P of the basis parallelogram
such that there are no poles or zeros on its boundary. By periodicity, the contributions of opposite sides of this boundary cancel, so that
tells us that the contour integral
$$\frac{1}{2\pi i} \int_{\partial P} \frac{f'(z)}{f(z)} \, dz = 0.$$
Therefore, the number of poles and zeros inside of ∂P is equal (counted with multiplicity). The only pole is the point inside P in the lattice Λ,
and that has order 2, so there are two zeros of f inside P .
But ℘′ is an odd function and Λ-periodic, so ℘′ (ω/2) = −℘′ (−ω/2) = −℘′ (ω/2), meaning ℘′ (ω/2) = 0, for any ω ∈ Λ with ω/2 ∉ Λ. So then we can take the $e_i = \wp(\omega_i/2)$ that we defined
above, and they will indeed be roots of $4x^3 - g_2(\Lambda)x - g_3(\Lambda)$ (because those are exactly the points where ℘′ (z)^2 = 0).
And these roots are pairwise different because ωi /2 is a double root of ℘(z) − ei (its derivative is zero at that point too).
Since we’re only allowed two zeros in P , there is no other value of z in P such that ℘(z) = ei . And this logic works
for each of the three values ei .

Corollary 149
The function ∆ is nonvanishing on H.

Proof. For any τ ∈ H, we can consider the corresponding lattice Λτ and the cubic $4x^3 - g_2(\Lambda_\tau)x - g_3(\Lambda_\tau)$. This
discriminant is
$$\frac{g_2(\tau)^3}{16} - \frac{27 g_3(\tau)^2}{16} = \frac{\Delta(\tau)}{16},$$
and this is nonzero because the roots are distinct by the previous proposition.
Thus ∆ is nonzero.

So now we know that the map z ↦ (℘Λ (z), ℘′Λ (z)) takes nonlattice points of C to points on the elliptic curve
$y^2 = 4x^3 - g_2(\Lambda)x - g_3(\Lambda)$. Indeed, for any value of x ∈ C, we have two values of y that satisfy the equation,
corresponding to the points (x, y ) and (x, −y ) respectively. We can also extend this map by defining a point at infinity
for the elliptic curve, and this means that we have a bijection between the complex torus C/Λ and the elliptic curve
$y^2 = 4x^3 - g_2(\Lambda)x - g_3(\Lambda)$.
Call this map (℘, ℘′ ). (℘, ℘′ ) actually transforms the group law from the complex torus onto the elliptic curve.
If we have two points z1 + Λ, z2 + Λ on the torus, then (℘(z1 ), ℘′ (z1 )) and (℘(z2 ), ℘′ (z2 )) form a secant (or tangent)
line in C2 of the form ax + by + c = 0. Consider the meromorphic function

f (z) = a℘(z) + b℘′ (z) + c.

We’ll use the argument principle again. When b ≠ 0, this has a triple pole at 0 and zeros at z1 + Λ, z2 + Λ. We must
have three zeros in this case, and the third zero must be at the point z3 + Λ where z1 + z2 + z3 = 0 in C/Λ. Meanwhile, when b = 0,
f has a double pole at 0 but still zeros at z1 + Λ and z2 + Λ, so we can say that z1 + z2 + z3 = 0 in C/Λ as well (taking z3 = 0)
– this corresponds to the point at infinity (℘(0), ℘′ (0)). The whole point is that the points on the elliptic curve that
also satisfy ax + by + c = 0 are exactly (℘(zi ), ℘′ (zi )), where we’ve just defined z3 , and thus collinear triples on the
elliptic curve sum to zero in the complex torus.
We’ve described a way to go from a complex torus to an elliptic curve, and it turns out we can go in reverse as
well:

Theorem 150
For any elliptic curve $y^2 = 4x^3 - a_2 x - a_3$ where $a_2^3 - 27a_3^2 \neq 0$, there exists a lattice Λ such that a2 = g2 (Λ) and
a3 = g3 (Λ).

Proof. We have the surjective map j : H → C, so for any a2 and a3 , there exists a τ such that
$$j(\tau) = \frac{1728 a_2^3}{a_2^3 - 27a_3^2} \implies \frac{g_2(\tau)^3}{g_2(\tau)^3 - 27g_3(\tau)^2} = \frac{a_2^3}{a_2^3 - 27a_3^2}.$$
When a2 and a3 ≠ 0, we can take the reciprocal of both sides and subtract from 1 to find that
$$\frac{27g_3(\tau)^2}{g_2(\tau)^3} = \frac{27a_3^2}{a_2^3}.$$
This is an invariant quantity when we scale a lattice by ω: in fact, for Λ = ωΛτ we have $g_2(\Lambda) = g_2(\tau)\omega^{-4}$ and $g_3(\Lambda) = g_3(\tau)\omega^{-6}$. Pick ω accordingly, and then
g3 (Λ) = a3 and g2 (Λ) = a2 as desired.
In the edge cases, if a2 = 0, then g2 (τ ) = 0, and we can just scale ω again to get g3 to the correct value
without worrying about g2 . The same thing works if a3 = 0.

The important takeaway here is that complex tori and elliptic curves are interchangeable!

13 April 2, 2020

Diamond and Shurman 1.5 – Swapnil Garg


The topic for this section is moduli spaces and modular curves, but we’ll only be talking about specific examples of
the former rather than general theory. Recall that the Weierstrass ℘ function gives us a bijection between complex

tori (equivalent to fundamental domains of lattices in C) and complex elliptic curves, and the mapping is also a group
isomorphism (with addition corresponding to collinearity). We also know that there is a holomorphic group isomorphism
(meaning that the two complex tori are isomorphic) if and only if Λ′ = mΛ for some m ∈ C: this means one is a
complex multiple of the other.
The purpose of this lecture is to show that complex elliptic curves are in bijection with orbits of the action of
SL2 (Z) on the upper half-plane. We’ll define specific kinds of moduli spaces here – these three are the only ones that
we’re going to be using, so we won’t define a moduli space completely. Recall that for any integer N, E[N] is the
N-torsion subgroup of E, which consists of the points P in the complex torus with NP = 0. We know that $|E[N]| = N^2$ (we
have an N by N grid), and the Weil pairing is defined as
$$e_N(P, Q) = e^{2\pi i \det \gamma / N},$$
where det γ is the size of the fundamental domain of the lattice generated by P and Q, relative to the size of that of Λ/N. A
few weeks ago, we defined some congruence subgroups of SL2 (Z):
$$\Gamma_0(N) = \left\{ \gamma \in SL_2(\mathbb{Z}) : \gamma \equiv \begin{bmatrix} * & * \\ 0 & * \end{bmatrix} \bmod N \right\},$$
$$\Gamma_1(N) = \left\{ \gamma \in SL_2(\mathbb{Z}) : \gamma \equiv \begin{bmatrix} 1 & * \\ 0 & 1 \end{bmatrix} \bmod N \right\},$$
$$\Gamma(N) = \left\{ \gamma \in SL_2(\mathbb{Z}) : \gamma \equiv \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \bmod N \right\}.$$
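
These three definitions translate directly into membership tests. Here is a minimal Python sketch, again storing a matrix as a tuple (a, b, c, d); the function names are just for illustration.

```python
def det(g):
    """Determinant of the 2x2 integer matrix g = (a, b, c, d)."""
    a, b, c, d = g
    return a * d - b * c


def in_gamma0(g, N):
    """g is in Gamma_0(N): determinant 1 and lower-left entry divisible by N."""
    return det(g) == 1 and g[2] % N == 0


def in_gamma1(g, N):
    """g is in Gamma_1(N): in Gamma_0(N) with both diagonal entries congruent to 1 mod N."""
    a, _, _, d = g
    return in_gamma0(g, N) and a % N == 1 % N and d % N == 1 % N


def in_gamma(g, N):
    """g is in Gamma(N): congruent to the identity matrix mod N."""
    return in_gamma1(g, N) and g[1] % N == 0


examples = [(1, 1, 0, 1), (1, 4, 0, 1), (3, 1, 8, 3), (0, -1, 1, 0)]
for g in examples:
    print(g, in_gamma0(g, 4), in_gamma1(g, 4), in_gamma(g, 4))
```

The nesting of the tests mirrors the chain of inclusions Γ(N) ⊆ Γ1 (N) ⊆ Γ0 (N) ⊆ SL2 (Z).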
We’ll connect these definitions with our elliptic curves as follows:

Definition 151
An enhanced elliptic curve for Γ0 (N) is an ordered pair (E, C), where E is a complex elliptic curve and C is an
order N cyclic subgroup of E. An enhanced elliptic curve for Γ1 (N) is an ordered pair (E, Q), where E is a
complex elliptic curve and Q is a point on E of order N. An enhanced elliptic curve for Γ(N) is an ordered pair
(E, (P, Q)), where E is a complex elliptic curve and (P, Q) generate E[N] with the Weil pairing eN (P, Q) = e 2πi/N .

Notably, we can generate an enhanced elliptic curve for Γ0 (N) from one for Γ1 (N) by looking at the cyclic group
generated by Q. For all three definitions, we define an equivalence relation such that (E, x) and (E ′ , x ′ ) are equivalent
if there is an isomorphism that sends E → E ′ and x → x ′ – we’ll denote the equivalence classes as [E, x].

Definition 152
Modding out the sets of enhanced elliptic curves for Γ0 (N), Γ1 (N), and Γ(N) by the equivalence relation yield
S0 (N), S1 (N), S(N) respectively, which are examples of moduli spaces.

When N = 1, all three of these are just the space of complex elliptic curves with equivalence by scaling (that we
introduced earlier), since the torsion group is trivial.

Definition 153
The modular curve Y (Γ) for a congruence subgroup Γ of SL2 (Z) is the quotient space of orbits of H under the
action of Γ, which can also be written as Γ\H. Denote Y0 (N), Y1 (N), Y (N) to be the modular curves for Γ0 , Γ1 , Γ
respectively.

Recall that Λτ denotes the lattice τ Z ⊕ Z for any τ ∈ H. Such lattices correspond to elliptic curves Eτ , and this
is unique up to adding an integer to τ .

Theorem 154
We have the following descriptions for the moduli spaces introduced:
1. The moduli space of Γ0 (N) is
$$S_0(N) = \{[E_\tau, \langle 1/N + \Lambda_\tau \rangle] : \tau \in H\}.$$

2. The moduli space of Γ1 (N) is
$$S_1(N) = \{[E_\tau, 1/N + \Lambda_\tau] : \tau \in H\}.$$

3. The moduli space of Γ(N) is
$$S(N) = \{[E_\tau, (\tau/N + \Lambda_\tau, 1/N + \Lambda_\tau)] : \tau \in H\}.$$

Basically, all of the order-N subgroups are isomorphic to taking the point 1/N (which generates a specific subgroup
in the lattice). Here, 1/N + Λτ denotes an additive coset of C, and indeed N times that point is an element of the
lattice.

Proof. Note that E is isomorphic to C/Λτ ′ for some τ ′ . The three proofs are similar: for (1), if we have an enhanced
elliptic curve (E, C) for Γ0 (N), we can say that the cyclic subgroup C is generated by (cτ ′ + d)/N + Λτ ′ , and then we
can find a, b such that ad − bc ≡ 1 mod N. In (2), we similarly find a, b for the point (cτ ′ + d)/N + Λτ ′ , and in
(3), we know that (aτ ′ + b)/N + Λτ ′ , (cτ ′ + d)/N + Λτ ′ are the two points that generate E[N]. In all cases, we get a
matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ in SL2 (Z/NZ), which lifts to an element γ ∈ SL2 (Z). Now the element τ we want is
$$\tau = \gamma \tau' = \frac{a\tau' + b}{c\tau' + d},$$
because we indeed can write this enhanced elliptic curve with m = cτ ′ + d and
$$m\Lambda_\tau = m(\tau \mathbb{Z} \oplus \mathbb{Z}) = (a\tau' + b)\mathbb{Z} \oplus (c\tau' + d)\mathbb{Z} = \tau' \mathbb{Z} \oplus \mathbb{Z} = \Lambda_{\tau'},$$
where the third equality comes because we’re acting by an element $\gamma^{-1}$ of SL2 (Z), and then
$$m\left( \frac{1}{N} + \Lambda_\tau \right) = \frac{c\tau' + d}{N} + \Lambda_{\tau'}.$$
In other words, we’ve found a τ such that our enhanced elliptic curve reduces to the desired form. We now want to
show that each equivalence class [Eτ , (data)] maps to equivalent enhanced elliptic curves. This can again be done by
writing our elements τ = γτ ′ and verifying that multiplying by m = cτ ′ + d sends one to the other.
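
The rescaling step in this proof can be checked numerically. In the Python sketch below, the matrix γ and the point τ′ are arbitrary choices, and the helper name `rescale` is hypothetical. It verifies that mτ = aτ′ + b, and that τ′ and 1 are integer combinations of mτ and m with coefficients read off from the inverse matrix of γ, which is why mΛτ = Λτ′.

```python
def rescale(gamma, tau_prime):
    """Given gamma = (a, b, c, d) in SL2(Z), return tau = gamma(tau') and m = c*tau' + d."""
    a, b, c, d = gamma
    m = c * tau_prime + d
    tau = (a * tau_prime + b) / m
    return tau, m


gamma = (2, 1, 5, 3)          # determinant 2*3 - 1*5 = 1
a, b, c, d = gamma
tau_prime = 0.2 + 0.7j
tau, m = rescale(gamma, tau_prime)

# m * tau and m form a basis (a*tau' + b, c*tau' + d) of the lattice spanned by tau' and 1
err_basis = abs(m * tau - (a * tau_prime + b))

# Conversely, tau' and 1 are integer combinations of m*tau and m (coefficients from gamma^{-1})
err_tau = abs(d * (m * tau) - b * m - tau_prime)
err_one = abs(-c * (m * tau) + a * m - 1)

# The marked point 1/N + Lambda_tau is sent to (c*tau' + d)/N + Lambda_tau'
N = 7
err_point = abs(m * (1 / N) - (c * tau_prime + d) / N)

print(err_basis, err_tau, err_one, err_point)
```

All four printed errors should be at the level of floating-point rounding.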

Theorem 155
In all three cases of the theorem above, we have equivalences [Eτ , (data)] ∼ [Eτ ′ , data] if and only if Γτ = Γτ ′
(so τ, τ ′ are in the same Γ-orbit and map to the same element in the modular curve). Thus, we have bijections
from S0 (N) → Y0 (N), S1 (N) → Y1 (N), and S(N) → Y (N).

Proof. We’ve shown the backwards direction above already. Now if we have equivalent enhanced elliptic curves from
Eτ and Eτ ′ , then mΛτ = Λτ ′ . Since (mτ, m) now forms a basis for Λτ ′ , we can write mτ = aτ ′ + b and m = cτ ′ + d,
which gives us $\gamma = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. We wish to show that γ is indeed in the corresponding congruence subgroup Γ, which we
can verify by noting that m(1/N + Λτ ) goes to 1/N + Λτ ′ .

We can call these specific representatives enhanced elliptic curves of the special type. For example, (Eτ , 1/N +
Λτ ) is an enhanced elliptic curve of the special type, but not (Eτ , −1/N + Λτ ). This theorem indeed tells us that
Y (1) = SL2 (Z)\H, which is what we wanted to show initially (because torsion data doesn’t tell us anything). We
talked about the j-invariant earlier in this class – we can now associate each complex elliptic curve with the orbit
SL2 (Z)τ , and can define a value j(E) = j(τ ) for each curve. It turns out that elliptic curves with rational j-values
correspond to modular forms – this is connected to Fermat’s last theorem!
Since moduli spaces and modular curves are equivalent, we can now take maps of modular curves and make them
into maps of moduli spaces.

Example 156
We have a natural map from Y1 (N) → Y0 (N) taking orbits of τ in Γ1 (N) to orbits in Γ0 (N). This translates into
a map from S1 (N) to S0 (N), which takes [E, Q] to [E, hQi].

Example 157
Γ1 (N) is a normal subgroup of Γ0 (N), so the quotient acts on Y1 (N). Translating this to our moduli spaces, we
get an important map γ, defined as
$$\Gamma_1(N)\gamma : [E, Q] \mapsto [E, dQ]$$

(where d is the bottom-right element of the matrix γ). This is a Hecke operator, and it will come up later.

Recall that modular forms have some weight k by definition: there’s a connection here as well.

Definition 158
Let Γ be one of the groups Γ0 (N), Γ1 (N), Γ(N). A function F : {enhanced elliptic curves for Γ} → C is degree-k
homogeneous with respect to Γ if we have

$$F(\mathbb{C}/m\Lambda, mx) = m^{-k} F(\mathbb{C}/\Lambda, x)$$

for any lattice Λ and complex number m. Here, x is a cyclic group of order N if Γ is Γ0 (N), a point of order N if
it is Γ1 (N), and a pair of points if it is Γ(N).

Basically, functions on enhanced elliptic curves give functions on the upper half-plane (as a function of τ ).

Definition 159
The dehomogenized function f : H → C corresponding to a degree-k homogeneous F is

f (τ ) = F (C/Λτ , x),

where x is ⟨1/N + Λτ ⟩ if Γ = Γ0 (N), 1/N + Λτ if Γ = Γ1 (N), and (τ /N + Λτ , 1/N + Λτ ) if it is Γ(N).

Proposition 160
A degree-k dehomogenized function f is weight-k invariant with respect to Γ. In other words, $f(\gamma(\tau)) = (c\tau + d)^k f(\tau)$.

We can see this by letting $m = (c\tau + d)^{-1}$ in the definition.
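
As an illustration with no torsion data (the case N = 1), the lattice Eisenstein series G6 is degree-6 homogeneous, so its dehomogenization f (τ ) = G6 (Λτ ) should satisfy f (γ(τ )) = (cτ + d)^6 f (τ ). The Python sketch below checks this approximately. The lattice sum is truncated to a box of half-width M, and that cutoff, the matrix γ, and the point τ are all assumptions of this sketch.

```python
def eisenstein_tau(k, tau, M=50):
    """Truncated G_k of the lattice tau*Z + Z: sum of (m*tau + n)^(-k) over |m|, |n| <= M."""
    return sum(
        (m * tau + n) ** (-k)
        for m in range(-M, M + 1)
        for n in range(-M, M + 1)
        if (m, n) != (0, 0)
    )


a, b, c, d = 2, 1, 1, 1       # a matrix in SL2(Z)
tau = 0.3 + 1.1j
gamma_tau = (a * tau + b) / (c * tau + d)

k = 6
lhs = eisenstein_tau(k, gamma_tau)
rhs = (c * tau + d) ** k * eisenstein_tau(k, tau)
relative_error = abs(lhs - rhs) / abs(rhs)
print(relative_error)
```

The two sides differ only because the truncation boxes for the two lattices do not match, and that discrepancy shrinks quickly as M grows.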


Recall that lattice functions correspond to functions on the upper half-plane: we’re saying here that functions on
enhanced elliptic curves do the same, and we can get a function F on enhanced elliptic curves from a function f that
is weight-k invariant with respect to a congruence subgroup. This is well-defined, because two enhanced elliptic curves
of the special type will have the same values of f , and then we can define

F (C/Λτ ′ , x) = m−k F (C/Λτ , x).

Diamond and Shurman 2.1-2 – Natalie Stewart


The point of this lecture is to start allowing us to talk about everything from a differential geometry perspective with
Riemann surfaces. Recall that a congruence subgroup Γ ⊆ SL2 (Z) is a subgroup containing one of the principal
congruence subgroups Γ(N) – then the modular curve for Γ is the orbit space Y (Γ) = Γ\H. But we only really know
how this looks with respect to the quotient topology, and we’re going to upgrade this now.

Definition 161
A Riemann surface is a connected complex 1-dimensional manifold – that is, it’s a connected topological space
with a countable basis which is Hausdorff, and for any two intersecting neighborhoods Um , Un , we have coordinate
charts φm : Um → D, φn : Un → D (D being the unit disk) such that the following composition is holomorphic:
$$\varphi_m(U_m \cap U_n) \xrightarrow{\ \sim\ } U_m \cap U_n \xrightarrow{\ \sim\ } \varphi_n(U_m \cap U_n).$$

One note about the last point is that we can map Um and Un conformally to the unit disk D, and we want the
“transfer map” between the images of the intersection Um ∩ Un onto the disks to be holomorphic.
We have a surjective open mapping H → Y (Γ) (which maps τ to Γτ ), which tells us immediately that Y (Γ) is
second-countable and connected. To show that Y (Γ) is a Riemann surface, we need to show that it is Hausdorff,
provide coordinate charts on Y (Γ), and show that the transfer maps defined above are holomorphic.

Proposition 162
Let τ1 , τ2 be two points in the upper half-plane. Then the action of SL2 (Z) (and of Γ) is properly discontinuous:
we can find neighborhoods U1 and U2 of τ1 , τ2 such that

γ(U1) ∩ U2 ≠ ∅ =⇒ γ(τ1) = τ2.

This takes a long time to prove, so we won’t talk too much about this here.

Corollary 163
The modular curve Y (Γ) is Hausdorff for any congruence subgroup Γ.

Proof. Suppose we have two different points π(τ1), π(τ2) ∈ Y(Γ). By definition, τ1, τ2 are in different orbits, so
γ(τ1) ≠ τ2 for every γ ∈ Γ. We can therefore choose two neighborhoods as in the above proposition so that
γ(U1) ∩ U2 = ∅ for all γ ∈ Γ, which means that π(U1) ∩ π(U2) = ∅ (every point of U1 is in a different orbit from
every point of U2). Since π is the quotient map for a group action, it’s an open mapping, and thus π(Ui) is a
neighborhood of π(τi), which gives us our desired neighborhoods.

Giving coordinate charts is significantly more difficult, so we’ll go through some random-seeming constructions
which will come together:

Definition 164
Let G be a group acting on a space X, and let Γ ⊂ G be a subgroup. The isotropy subgroup for τ under Γ (for
any τ ∈ X) is the set of γ ∈ Γ which fix τ .

Definition 165
A point τ ∈ H is an elliptic point for Γ if the containment {±I}Γτ ⊃ {±I} is a proper containment. (π(τ) is
also called elliptic.)

In other words, elliptic points are preserved by some nontrivial transformation of H by Γ.

Proposition 166
Let Γ be a congruence subgroup of SL2 (Z). Then the isotropy subgroups Γτ are finite cyclic groups for all elliptic
points τ .

We’ll prove this next lecture – this was also proved in Serre. This allows us to make the following definition:

Definition 167
The period of τ ∈ H, denoted hτ , is

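\[ h_\tau = \left| \{\pm I\}\Gamma_\tau / \{\pm I\} \right| = \begin{cases} |\Gamma_\tau|/2 & \text{if } -I \in \Gamma_\tau, \\ |\Gamma_\tau| & \text{if } -I \notin \Gamma_\tau. \end{cases} \]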

Using a bit of general topology, we’ll define this for the modular curve as well:

Lemma 168
Let G be a group acting on a space X, and let Γ ⊂ G. If τ ∈ X is a point, then for any γ ∈ G, we have

\[ \gamma\,\Gamma_\tau\,\gamma^{-1} = (\gamma\Gamma\gamma^{-1})_{\gamma(\tau)}. \]

When γ ∈ Γ, this reduces to the period of τ being constant across orbits – thus, we can associate the period with
points of Y (Γ) instead of τ .

Proof sketch. If we take any α ∈ Γτ, we know that (γαγ^{−1})(γ(τ)) = γα(τ) = γ(τ), which shows that the left side is
contained in the right. The other direction is similar – if γαγ^{−1} (with α ∈ Γ) is an element of the right side, then
γα(τ) = γ(τ), so α(τ) = τ.

We’ll now move to something that looks unrelated: we can extend the action of SL2(R) on H to an action of
GL2(C) on the Riemann sphere:
\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} z = \frac{az + b}{cz + d}. \]

Definition 169
The straightening map for a point τ ∈ H is the action of
$\delta_\tau = \begin{bmatrix} 1 & -\tau \\ 1 & -\bar{\tau} \end{bmatrix} \in \mathrm{GL}_2(\mathbb{C})$.

Note that δτ(τ) = 0 and δτ(τ̄) = ∞, which is nice because we can map neighborhoods of τ to neighborhoods of 0. It
turns out this is also a straightening in a more powerful sense if we connect this with isotropy groups: since elements
of congruence subgroups have real entries, Γτ = Γτ̄, which helps when we’re dealing with the Riemann sphere (and complex
numbers). Using the above lemma about conjugation, we find an equality of isotropy groups

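\[ (\delta_\tau(\{\pm I\}\Gamma)\delta_\tau^{-1})_0 / \{\pm I\} = \delta_\tau(\{\pm I\}\Gamma_\tau / \{\pm I\})\delta_\tau^{-1} = (\delta_\tau(\{\pm I\}\Gamma)\delta_\tau^{-1})_\infty / \{\pm I\}, \]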

and we’ll denote this group G. (In words, the transformations induced by the isotropy group of the image of Γ under
conjugation of the straightening map at 0 and ∞ are the same as the conjugate of the isotropy groups around τ .) We
know that G is given by the properties of the isotropy groups, and for all g ∈ G, we know that 0 and ∞ are preserved,
so we must actually have g(z) = az. In addition, because G is finite cyclic, each element of G is just a rotation around
the origin by some integer multiple of 2π/hτ, since |G| = hτ.
This helps us think about how to define our coordinate charts using our straightening maps: under the map δτ ,
neighborhoods π(U) turn into radial sectors of the neighborhood of 0. Then we can apply the hτ -fold “wrapping map,”
$\rho_{h_\tau}(z) = z^{h_\tau}$, to get the whole neighborhood.
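
To make the rotation picture concrete, here is a minimal numerical sketch (the helper names `matmul`, `inverse`, `straightening`, and `multiplier` are illustrative, not from the notes). It conjugates an element fixing τ by the straightening matrix and reads off the factor a in g(z) = az. The matrices S and ST below are written with S = [[0, −1], [1, 0]], which is an assumption about the sign convention; the fractional linear action does not depend on that sign.

```python
import cmath

def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def inverse(A):
    """Inverse of an invertible 2x2 complex matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def straightening(tau):
    """The matrix [[1, -tau], [1, -conj(tau)]]: sends tau to 0 and conj(tau) to infinity."""
    return ((1, -tau), (1, -tau.conjugate()))

def multiplier(gamma, tau):
    """For gamma fixing tau, return a such that (delta gamma delta^-1)(z) = a * z."""
    delta = straightening(tau)
    (a, b), (c, d) = matmul(matmul(delta, gamma), inverse(delta))
    # A map fixing both 0 and infinity is given by a diagonal matrix.
    assert abs(b) < 1e-9 and abs(c) < 1e-9, "gamma does not fix tau"
    return a / d

S = ((0, -1), (1, 0))     # fixes i
ST = ((0, -1), (1, 1))    # fixes omega = e^{2 pi i / 3}
omega = cmath.exp(2j * cmath.pi / 3)

print(multiplier(S, 1j))       # approximately -1: rotation by 2*pi/2
print(multiplier(ST, omega))   # a primitive cube root of unity: rotation by a multiple of 2*pi/3
```

The two multipliers have orders 2 and 3, matching the periods of i and ω for SL2(Z).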
So we need to decide what our small neighborhood U looks like, and we’ll use the following corollary of proper
discontinuity of SL2 (Z) action:

Corollary 170
Let Γ be a congruence subgroup of SL2 (Z). Then τ ∈ H has a neighborhood U such that

γ(U) ∩ U ≠ ∅ =⇒ γ ∈ Γτ.

This neighborhood has no elliptic points except possibly τ .

Now, we can pick a point π(τ ) ∈ Y (Γ), and we’ll let U be the neighborhood of τ from the above corollary. We’ll
introduce the notation δ = δτ , ρ = ρhτ (for the straightening map and wrapping map, respectively), ψ = ρ ◦ δ, and
V = ψ(U).
We claim there is a bijection φ : π(U) → V such that φ ◦ π = ψ. This is essentially an algebraic manipulation: for
τ1, τ2 ∈ U, π(τ1) = π(τ2) if and only if the two points are in the same Γ-orbit, which (by our choice of U) is true if
and only if τ1 is in Γτ τ2. This is equivalent to
δ(τ1) ∈ δ(Γτ τ2) = (δΓτ δ^{−1})(δ(τ2)),

and since δΓτ δ^{−1} acts by rotations through multiples of 2π/h (with h = hτ), this holds exactly when
δ(τ1)^h = δ(τ2)^h, meaning that ψ(τ1) = ψ(τ2).

Lemma 171
φ is a homeomorphism.

Proof. This is a verification that the map and its inverse are continuous: we know that ψ and π are both continuous
open surjections (onto V and π(U) respectively) with φ ◦ π = ψ, so a subset W of V is open if and only if
φ^{−1}(W) = π(ψ^{−1}(W)) is open in π(U).

We now need to show that the transition maps between these charts are holomorphic, and this is a lot harder. We’ll define

V1,2 = φ1 (π(U1 ) ∩ π(U2 ))

(the small subset of the disks corresponding to the intersection of the neighborhoods), and define V2,1 similarly.

Proposition 172
The transition map $\phi_{2,1} : V_{1,2} \to V_{2,1}$, defined as $\phi_2 \circ \phi_1^{-1}$, is holomorphic.

Proof. Fix a point x in the intersection of the two neighborhoods, and choose some preimages τ1 ∈ U1, τ2 ∈ U2 such that
x = π(τ1) = π(τ2). Fix γ ∈ Γ such that τ2 = γ(τ1) (they’re in the same orbit under Γ, so this exists). We’ll check
holomorphicity locally at φ1(x) – specifically, if U1,2 = U1 ∩ γ^{−1}(U2), we’ll prove holomorphicity of φ2,1 on φ1(π(U1,2)).
Define δ1 = δτ1 and h1 = hτ1 for convenience. First assume that φ1(x) = 0, so that τ1 is the center of U1. We know that
any point q = φ1(x′) in the neighborhood in question (φ1(π(U1,2))) is of the form

\[ q = \psi_1(\tau') = (\delta_1(\tau'))^{h_1} \]

for some τ′ ∈ U1,2, where ψ1 = φ1 ◦ π. Let τ̃2 ∈ U2 be the center of U2 (this τ̃2 is the point such that ψ2(τ̃2) = 0),
and let δ2 and h̃2 be its straightening map and period. Then we can chase some equalities to find that

\[ \phi_{2,1}(q) = \psi_2(\gamma(\tau')) = \left( (\delta_2 \gamma \delta_1^{-1})(q^{1/h_1}) \right)^{\tilde{h}_2}. \]

The central trick here is that we stick in a $\delta_1^{-1}\delta_1$ term to introduce a q. If h1 = 1, this is a
composition of holomorphic maps, so it is holomorphic. Otherwise, τ1 is elliptic – τ2 = γ(τ1) is in the same orbit, so it
is elliptic with the same period. We defined U2 in such a way that there is only one elliptic point (possibly, at its
center), so τ2 = τ̃2, so h1 = h̃2. We know that $\delta_2 \gamma \delta_1^{-1}$ is a fractional linear transformation, and
in this case we can note that 0 and ∞ are both fixed (0 goes to τ1, then τ2, then 0, and ∞ goes to τ̄1, then τ̄2, then ∞).
Thus, $\delta_2 \gamma \delta_1^{-1}$ is just multiplication by some a, and we now have a nice form for the transfer map:

\[ \phi_{2,1}(q) = (a q^{1/h_1})^{\tilde{h}_2} = a^{\tilde{h}_2} q, \]

which is a holomorphic map as desired.
We’ve only proved this for the case where φ1 (x) = 0, but this also proves it for the case where φ2 (x) = 0
(the inverse of a bijective, holomorphic map is holomorphic). And now we can add an intermediate function: write
φ2,1 as φ2,3 ◦ φ3,1 , where φ3 : U3 → V3 sends our point x to 0, and the composition of two holomorphic maps is
holomorphic.

And this tells us that modular curves Y (Γ) are Riemann surfaces, which is exactly what we wanted to show.

14 April 7, 2020

Diamond and Shurman 2.3 – Andrew Gu


Recall that an elliptic point for a congruence subgroup Γ is a point τ ∈ H such that

Γτ = {γ ∈ Γ : γτ = τ}

is larger than {±I}. We’ll also call the image of τ in the modular curve Y(Γ) elliptic, and we’ll say that the period of
such an elliptic point is |Γτ|/2 when −I ∈ Γτ and |Γτ| otherwise.

Last time, we skipped the proof that Γτ is a finite cyclic group, and that’s going to be the main purpose of this
lecture. It turns out that most of the proof is showing that this works for Y (1), so we’ll start by looking at the elliptic
points there.
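Recall that the fundamental domain of the upper half-plane is defined as
\[ D = \left\{ \tau \in H : |\operatorname{Re}(\tau)| \le \tfrac{1}{2},\ |\tau| \ge 1 \right\}, \]
and the generators of SL2(Z) are
\[ S = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \qquad T = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}. \]
We saw that the map π : D → Y(1) is surjective in one of the first lectures: basically we translate with T and then
use S to get the imaginary part large enough. However, this map is not injective: we know that if π(τ1) = π(τ2) for
distinct τ1, τ2 ∈ D, then either we’re working with the boundary points on the left and right or with the unit circle.
The idea is that (swapping τ1 and τ2 if necessary so that Im(τ2) ≥ Im(τ1)) for
$\gamma = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$,
\[ \tau_2 = \gamma\tau_1 \implies |c\tau_1 + d| \le 1. \]
Since Im(τ1) ≥ √3/2, this forces |c| ≤ 1. When c = 0, we know that
$\gamma = \pm\begin{bmatrix} 1 & b \\ 0 & 1 \end{bmatrix}$, which means τ2 = τ1 + b. On the other hand, when c = ±1
(replacing γ by −γ, we may take c = 1),
\[ (\operatorname{Re}(\tau_1) + d)^2 \le 1 - \operatorname{Im}(\tau_1)^2 \le \tfrac{1}{4}, \]
which means |d| ≤ 1 and we can again characterize the elliptic points. This gives us the result: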

Proposition 173
The elliptic points of SL2(Z) are SL2(Z)i and SL2(Z)ω, where ω = e^{2πi/3}. Thus, the isotropy groups for elliptic
points are conjugates of ⟨S⟩ and ⟨ST⟩, which have order 4 and 6 respectively. Therefore, all nontrivial elements of
finite order are conjugate to one of −I, S^{±1}, (ST)^{±1}, (STST)^{±1}, which have order 2, 4, 6, 3.
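
The orders quoted in this proposition depend on the sign chosen for S. As a small sanity check, the sketch below uses S = [[0, −1], [1, 0]] (the convention of Diamond and Shurman, assumed here), for which ST has order 6; with the opposite sign of S, the product ST becomes the negative of this matrix and has order 3, while the fractional linear maps are unchanged. The helper names are illustrative.

```python
import cmath

def matmul(A, B):
    """Product of two 2x2 integer matrices given as nested tuples."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def order(M, bound=24):
    """Smallest n >= 1 with M^n = I, or None if there is none up to `bound`."""
    identity = ((1, 0), (0, 1))
    P = M
    for n in range(1, bound + 1):
        if P == identity:
            return n
        P = matmul(P, M)
    return None

def mobius(M, z):
    """Fractional linear action of M on a complex number z."""
    (a, b), (c, d) = M
    return (a * z + b) / (c * z + d)

S = ((0, -1), (1, 0))    # assumed sign convention
T = ((1, 1), (0, 1))
ST = matmul(S, T)
minus_ST = tuple(tuple(-x for x in row) for row in ST)
omega = cmath.exp(2j * cmath.pi / 3)

print(order(S), order(ST), order(matmul(ST, ST)), order(T))  # 4 6 3 None
print(order(minus_ST))                                       # 3
print(abs(mobius(S, 1j) - 1j), abs(mobius(ST, omega) - omega))  # both approximately 0
```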

Note that τ = i and τ = ω both generate lattices Z + Zτ – one is square and one is hexagonal. Both of these
have rotational automorphisms around the origin, which tells us more about the complex elliptic curves associated to
the lattices. But we don’t have too much to say about this right now.
With this, we can think about the elliptic points of Y (Γ) in general:

Proposition 174
Let Γ be a congruence subgroup. Then Y(Γ) always has finitely many elliptic points, each with finite cyclic isotropy
group and period 2 or 3.

Proof. Pick any τ ∈ H. The isotropy group Γτ is a subgroup of SL2(Z)τ. Since SL2(Z)τ is always cyclic from the
above argument, so is Γτ. We can think about what possible groups Γτ can be at an elliptic point: if SL2(Z)τ is
conjugate to ⟨S⟩, then Γτ can’t be a proper subgroup (the proper subgroups are contained in {±I}, which we’re
excluding), so Γτ must be conjugate to ⟨S⟩ itself. Thus the group has order 4, meaning that the elliptic point has
period 4/2 = 2. Similarly, if SL2(Z)τ has order 6, then the elliptic point has period 3.
To show that there are finitely many elliptic points, note that Γ has finite index d in SL2(Z), so we can write SL2(Z)
as a union of cosets Γγj. We know that the elliptic points for Γ are always in the SL2(Z)-orbit of i or ω, so they are
a subset of SL2(Z)i ∪ SL2(Z)ω. Taking the images in Y(Γ) means that our elliptic points are a subset of

EΓ = {Γγj i, Γγj ω : 1 ≤ j ≤ d},

which is at most 2d elliptic points, as desired.

In general, we don’t have exactly 2d elliptic points: recall that we worked with Γ(N), Γ1 (N), Γ0 (N) in previous
lectures.

Proposition 175
Γ(N) for N > 1, Γ1 (N) for N > 3, and Γ0 (N) for N divisible by a prime p ≡ −1 mod 12 all have no elliptic points.

We know that a nontrivial isotropy group must be generated by an element of order 3, 4, 6, so they must have
characteristic polynomial x² + 1 or x² ± x + 1. The idea of the proof is to show that there are no generators of this
form in the subgroup.
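
For Γ0(N), this idea can be checked by brute force. A matrix of determinant 1 and trace t has characteristic polynomial x² − tx + 1, and if its lower-left entry is divisible by p, then its diagonal entries are roots of that polynomial mod p. The sketch below (with an illustrative helper `has_root`) confirms that for a few primes p ≡ −1 mod 12 there are no such roots for t ∈ {−1, 0, 1}, while for p = 13 there are.

```python
def has_root(t, p):
    """True if x^2 - t*x + 1 has a root modulo p."""
    return any((x * x - t * x + 1) % p == 0 for x in range(p))

for p in (11, 23, 47, 59):                          # primes congruent to -1 mod 12
    print(p, [has_root(t, p) for t in (-1, 0, 1)])  # [False, False, False]

print(13, [has_root(t, 13) for t in (-1, 0, 1)])    # [True, True, True]
```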

Proof. For Γ(N), note that it is a normal subgroup of SL2(Z), so conjugates of its elements are also in Γ(N). And
notice that S, T, ST, (ST)² are all not in Γ(N), so there are no elements of order 3, 4, 6.
For Γ1(N), notice that the trace of any matrix in it is 2 mod N, and the traces of matrices of order 3, 4, 6 are
−1, 0, 1 respectively. So whenever N > 3, none of those numbers are 2 mod N.
Finally, for Γ0(N), suppose that there is a prime p ≡ −1 mod 12 that divides N. We have a matrix
$\gamma = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ with c ≡ 0 mod p, and we know that the trace is
t = a + d ∈ {−1, 0, 1}, while det γ ≡ ad ≡ 1 mod p. Then a and d are roots of x² − tx + 1 mod p, so the discriminant
t² − 4 ∈ {−4, −3} must be a perfect square in Z/pZ. But whenever p is −1 mod 4,
\[ \left(\frac{-4}{p}\right) = \left(\frac{-1}{p}\right) = -1, \]
and similarly when p is −1 mod 3, −3 is not a quadratic residue. Thus we get the result that we want.

For Γ1, it turns out that N = 2, 3 both have elliptic points: Γ1(2) contains the element
$\begin{bmatrix} 1 & -1 \\ 2 & -1 \end{bmatrix}$, and we can solve the equation $\tau = \frac{\tau - 1}{2\tau - 1}$ to
get the elliptic point $\frac{1}{2} + \frac{i}{2}$ with period 2. Similarly, Γ1(3) contains
$\begin{bmatrix} 1 & -1 \\ 3 & -2 \end{bmatrix}$, and $\frac{1}{2} + \frac{\sqrt{3}}{6} i$ is our elliptic point in this case.
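
These two fixed points can be recovered numerically by solving the quadratic cτ² + (d − a)τ − b = 0 that expresses γ(τ) = τ. The following is a minimal sketch; `fixed_point` and `in_gamma1` are illustrative helper names.

```python
import cmath

def fixed_point(gamma):
    """Fixed point in the upper half-plane of gamma = ((a, b), (c, d)) with c != 0."""
    (a, b), (c, d) = gamma
    # gamma(tau) = tau  <=>  c*tau^2 + (d - a)*tau - b = 0
    disc = (d - a) ** 2 + 4 * b * c      # equals (a + d)^2 - 4 when det = 1
    roots = [((a - d) + s * cmath.sqrt(disc)) / (2 * c) for s in (1, -1)]
    return next(r for r in roots if r.imag > 0)

def in_gamma1(gamma, N):
    """Membership test for Gamma_1(N): det 1, c = 0 and a = d = 1 modulo N."""
    (a, b), (c, d) = gamma
    return a * d - b * c == 1 and c % N == 0 and a % N == 1 % N and d % N == 1 % N

g2 = ((1, -1), (2, -1))
g3 = ((1, -1), (3, -2))
print(in_gamma1(g2, 2), fixed_point(g2))   # True, approximately 0.5 + 0.5i
print(in_gamma1(g3, 3), fixed_point(g3))   # True, approximately 0.5 + 0.2887i
```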
We can use this to classify elliptic points of Γ0 (N). Looking ahead, ε2 and ε3 count the number of elliptic points
of order 2 and 3, and they show up in the dimension formulas for modular functions.

Proposition 176
The period 2 elliptic points of Γ0(N) are in bijective correspondence with ideals J of Z[i] such that Z[i]/J ≅ Z/NZ.
Similarly, the period 3 elliptic points of Γ0(N) are in bijective correspondence with ideals J of Z[e^{2πi/6}] such that
Z[e^{2πi/6}]/J ≅ Z/NZ.

Basically, we consider elements of order 3, 4, 6 and look at quadratic residues again. This yields the following
formulas:

Proposition 177
The number of elliptic points of order 2 in Γ0(N) is $\prod_{p \mid N} \left(1 + \left(\frac{-1}{p}\right)\right)$
when N ≢ 0 mod 4, and 0 otherwise.
Similarly, the number of elliptic points of order 3 in Γ0(N) is
$\prod_{p \mid N} \left(1 + \left(\frac{-3}{p}\right)\right)$ when N ≢ 0 mod 9, and 0 otherwise. Here, we’re using the
Legendre symbol except with the convention that $\left(\frac{-1}{2}\right) = \left(\frac{-3}{3}\right) = 0$ (and
$\left(\frac{-3}{2}\right) = -1$) – this is only because p = 2 is weird when we’re solving the equation
n² + 1 ≡ 0 mod N.
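
As a hedged cross-check of these product formulas, the sketch below compares them with a brute-force count of solutions to n² + 1 ≡ 0 mod N. For period 3 it uses the equation n² − n + 1 ≡ 0 mod N, which is an assumption of this sketch (it is the analogous equation for a primitive sixth root of unity). All function names are illustrative.

```python
def prime_divisors(n):
    """Distinct prime divisors of n, by trial division."""
    ps, p = [], 2
    while p * p <= n:
        if n % p == 0:
            ps.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        ps.append(n)
    return ps

def symbol_minus1(p):
    """(-1/p), with the convention (-1/2) = 0."""
    return 0 if p == 2 else (1 if p % 4 == 1 else -1)

def symbol_minus3(p):
    """(-3/p), with the conventions (-3/3) = 0 and (-3/2) = -1."""
    return 0 if p == 3 else (1 if p % 3 == 1 else -1)

def eps2(N):
    """Product formula for the number of period 2 elliptic points of Gamma_0(N)."""
    if N % 4 == 0:
        return 0
    result = 1
    for p in prime_divisors(N):
        result *= 1 + symbol_minus1(p)
    return result

def eps3(N):
    """Product formula for the number of period 3 elliptic points of Gamma_0(N)."""
    if N % 9 == 0:
        return 0
    result = 1
    for p in prime_divisors(N):
        result *= 1 + symbol_minus3(p)
    return result

def count_roots(f, N):
    """Number of residues x mod N with f(x) = 0 mod N."""
    return sum(1 for x in range(N) if f(x) % N == 0)

print(eps2(13), eps3(13))   # 2 2
```

For every N up to a few hundred, `eps2(N)` and `eps3(N)` agree with the two brute-force root counts.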

Diamond and Shurman 2.4-5 – Nikhil Reddy


We’ll talk about cusps in this lecture. Recall from previous lectures that a congruence subgroup Γ acts on the upper
half-plane from the left, and then we define the modular curve Y(Γ) to be the set of Γ-equivalence classes on H. Last
time, we showed that Y(Γ) is a Riemann surface, and we’ll be extending that result here.
If we look at the fundamental domain D (which is the set of points with real part between −1/2 and 1/2, outside of the
unit circle) on the Riemann sphere, the region looks like a triangle (circles and lines are basically equivalent),
except that we’re missing the point at infinity. So we’ll introduce that now: when we adjoin it, we also need to adjoin
its orbit under SL2(Z). We can show that every point of the orbit is rational (or ∞), but we need to first show that we
can get from ∞ to any rational point:

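Lemma 178
For any r = a/c ∈ Q in lowest terms, there exists $\gamma = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in \mathrm{SL}_2(\mathbb{Z})$
such that ∞ is sent to r.

To prove this, we just find b, d ∈ Z such that ad − bc = 1, which is possible by Bézout’s identity because a and c are coprime.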
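
A matrix in SL2(Z) sending ∞ to a given rational a/c (in lowest terms, with c > 0) can be built explicitly with the extended Euclidean algorithm. This is a minimal sketch; the function names `ext_gcd` and `matrix_sending_infinity_to` are illustrative.

```python
def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g, for b > 0 (so that g = gcd(a, b) > 0)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def matrix_sending_infinity_to(a, c):
    """Return ((a, b), (c, d)) with ad - bc = 1; this matrix sends infinity to a/c."""
    if c <= 0:
        raise ValueError("expected a denominator c > 0")
    g, x, y = ext_gcd(a, c)
    if g != 1:
        raise ValueError("a/c must be in lowest terms")
    d, b = x, -y          # a*x + c*y = 1  becomes  a*d - b*c = 1
    return ((a, b), (c, d))

print(matrix_sending_infinity_to(22, 7))   # ((22, 3), (7, 1))
```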

Definition 179
Define H∗ to be H ∪ Q ∪ {∞}. The compactified modular curve X(Γ) is

X(Γ) = Γ\H∗ = Y (Γ) ∪ Γ\(Q ∪ {∞}).

We use the notation X0(N), X1(N), X(N) for the special congruence subgroups. A cusp is a Γ-equivalence class
of points in Q ∪ {∞}, that is, an element of Γ\(Q ∪ {∞}).

Proposition 180
X(Γ) always has finitely many cusps. In particular, X(1) has only one cusp as we showed above.

Proof. The index of Γ is finite in SL2(Z), so we can take coset representatives γj with SL2(Z) = ∪j Γγj and apply
them to ∞ to get the Γ-equivalence classes Γγj(∞). Since SL2(Z)(∞) reaches every rational point, we must have reached
all of the equivalence classes, and thus there are finitely many cusps in total.

We use the Euclidean topology for H, but we need to do a bit more for H∗ .

Definition 181
A neighborhood around ∞ is of the form

NM = {τ ∈ H : Im(τ ) > M}.

The idea here is that circles centered at ∞ look like lines. Then we can define neighborhoods NM ∪ {∞}, and we
can also get neighborhoods around rationals by defining α(NM ∪ {∞}) to be a neighborhood around α(∞) for any
α ∈ SL2(Z). Since we’ve defined our neighborhoods this way, every γ ∈ SL2(Z) is a homeomorphism of H∗, and we get the
quotient topology from the natural projection map π : H∗ → X(Γ).
Basically, the neighborhoods around rational points are open disks tangent to the real axis at the rational point,
plus the rational point itself.

Theorem 182
X(Γ) is a compact Riemann surface.

Recall that a Riemann surface needs to be connected, Hausdorff, and it needs a series of charts (sometimes called
an atlas), which are pairs (Um , Vm ) such that Um is a neighborhood around m and Vm is some open set in C (often the
open disk) such that we have a homeomorphism φm : Um → Vm . In addition, if Um and Un intersect, the induced map

φm (Um ∩ Un ) → φn (Um ∩ Un )

should be holomorphic as a function from C to C. And we just need to add being compact to this list.

Proposition 183
X(Γ), the compactified modular curve, is connected, compact, and Hausdorff.

Proof. First, we show that H∗ is connected, and then we can project down with the projection map (the continuous image
of a connected space is connected). We showed previously that H is connected, so if H∗ is a union of two disjoint open
sets, then one must contain H and the other is contained in Q ∪ {∞}. But no nonempty open subset of H∗ is contained in
Q ∪ {∞}.
For compactness, we can first show that the fundamental domain plus the point at infinity, D∗ = D ∪ {∞}, is compact in
H∗. Then we know that translates of D∗ cover H∗, so

H∗ = SL2(Z)(D∗),

and then we can break up SL2(Z) into the right cosets of Γ:
\[ H^* = \bigcup_j \Gamma\gamma_j(D^*). \]

Apply π to each term to find that
\[ X(\Gamma) = \bigcup_j \pi(\Gamma\gamma_j(D^*)) = \bigcup_j \pi(\gamma_j(D^*)). \]

But π and γj are both continuous maps, so this means X(Γ) is a finite union of compact sets, which means it is
compact. (We have a finite union because Γ has finite index in SL2(Z).)
Being Hausdorff is a bit more complicated: recall that the idea is that two points p1 , p2 must have neighborhoods
U1 , U2 such that U1 ∩ U2 = ∅. We now only need to consider the cases where the two points aren’t both in H (because
we did that in the previous class).
In the first case, when s1 ∈ Q ∪ {∞} and τ2 ∈ H, take some neighborhood U2 of τ2 with compact closure in H. We need to
show that there is a neighborhood of s1 that does not intersect γU2 for any γ ∈ Γ, but note that for all α ∈ SL2(Z)
and τ ∈ H,

Im(α(τ)) ≤ max{Im(τ), 1/Im(τ)}.

Thus we get an absolute upper bound on the imaginary part of α(τ) for τ ∈ U2, which means that we just need to pick a
large enough M such that NM ∪ {∞} does not intersect SL2(Z)U2. And now if α1(∞) = s1, we have

SL2 (Z)U2 ∩ α1 (NM ∪ {∞}) = ∅,

and we’ve found our two neighborhoods.


In the other case where we have two points s1, s2 ∈ Q ∪ {∞}, we’ll pick two neighborhoods (such that α1 ∈ SL2(Z) takes
∞ to s1 and α2 ∈ SL2(Z) takes ∞ to s2)

U1 = α1(N2 ∪ {∞}), U2 = α2(N2 ∪ {∞}).

In N2, we’re high enough in the imaginary axis that the only way we can have equivalence is a translation. If the
images π(U1) and π(U2) intersect, then we know that γα1(τ1) = α2(τ2) for some γ ∈ Γ and τ1, τ2 ∈ N2 (the two points
are equal up to a γ-action). This means that $\alpha_2^{-1}\gamma\alpha_1$ is a translation, which means that
γ(s1) = s2 because ∞ is fixed. But that means π(s1) = π(s2). Thus the neighborhoods don’t intersect for distinct cusps,
and now we’ve indeed shown that X(Γ) is Hausdorff.

From here, we have to deal with our charts: last week, we discussed this for any τ ∈ H. Basically, we start with
some open set U whose center is an elliptic point – because things look equivalent around such elliptic points, we’ll
have regions that are Γ-equivalent, which means its map to a unit disk is not so nice. So we had to use a straightening
map $\delta = \begin{bmatrix} 1 & -\tau \\ 1 & -\bar{\tau} \end{bmatrix}$ last time, which sends our neighborhood to a
neighborhood of 0 (because τ goes to 0 and the conjugate goes to ∞), and then we see that U becomes a sector of a
circle. We then make that into a full circle by using ρ, which raises δ(U) to the power hτ. (This is the identity except
at elliptic points, which have hτ greater than 1.) Notably, we make the neighborhoods small enough that the Uτ avoid
other elliptic points.
We need to figure out how to define our charts for the cusps of H∗ as well. We’ll use the same strategy: starting with
a straightening map, and then identifying things that are Γ-equivalent. If we look at this picture from ∞, we can
consider the neighborhoods N2 ∪ {∞}. Under Γ = SL2(Z), this neighborhood is periodic with period 1, and that is the only
equivalence. In general, this is periodic with period h, and thus we just need to use the map

ρ(τ) = e^{2πiτ/h}.

This is sort of like the q-series map earlier, where we turned our function into a Fourier series on the unit disk. So
if we take an s ∈ Q ∪ {∞}, we will send our point to ∞ first, and then we’ll use the identification map ρ to fill out
the whole disk. (We might use different hs’s for different points.)
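Formally, we’re defining
\[ h_s = \left| \mathrm{SL}_2(\mathbb{Z})_\infty / (\delta\{\pm I\}\Gamma\delta^{-1})_\infty \right|, \]
where δ ∈ SL2(Z) takes s to ∞. Basically, we’re extracting out the fact that $(\delta\Gamma\delta^{-1})_\infty$ is
generated, up to sign, by $\begin{bmatrix} 1 & h_s \\ 0 & 1 \end{bmatrix}$. It should be clear that hs is finite –
otherwise, Γ would have infinite index in SL2(Z).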
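
For Γ = Γ0(N), which contains −I, the width of the cusp γ(∞) is the smallest h > 0 for which the conjugate of the translation by h under γ lies in Γ0(N) (here γ plays the role of δ^{−1}). The sketch below finds it by direct search; the helper names are illustrative, and the choice of Γ0(N) is only an example.

```python
def matmul(A, B):
    """Product of two 2x2 integer matrices given as nested tuples."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def inverse_sl2(M):
    """Inverse of a determinant 1 matrix."""
    (a, b), (c, d) = M
    return ((d, -b), (-c, a))

def in_gamma0(M, N):
    """Membership in Gamma_0(N) for M in SL2(Z): lower-left entry divisible by N."""
    return M[1][0] % N == 0

def width(gamma, N, bound=10000):
    """Width for Gamma_0(N) of the cusp gamma(infinity), where gamma is in SL2(Z)."""
    for h in range(1, bound + 1):
        translation = ((1, h), (0, 1))
        conjugate = matmul(matmul(gamma, translation), inverse_sl2(gamma))
        if in_gamma0(conjugate, N):
            return h
    return None

to_infinity = ((1, 0), (0, 1))    # sends infinity to infinity
to_zero = ((0, -1), (1, 0))       # sends infinity to 0

print(width(to_infinity, 13), width(to_zero, 13))   # 1 13
```

For N = 13 the two widths add up to 14.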
Defining ψ to be the composition ρ ◦ δ, we know that ψ is not necessarily a bijection. However, we can show that
π(U) (which is an open subset of our modular curve) is in bijection with V = ψ(U) via a map φ with ψ = φ ◦ π, because
only the Γ-equivalent points are sent to the same point. π and ψ are both open continuous maps, so φ is open and
continuous as well, which shows that φ is indeed a homeomorphism.
To show holomorphicity, the idea is that “maps look more or less holomorphic.” When we have one point in H and
one in Q ∪ {∞}, consider the map from V1 to U1, which sends q to $\delta_1^{-1}(q^{1/h_1})$. After applying this map, we
go from U1 to U2 by applying some γ, and then we go from U2 to V2 by applying the exponential map
$x \mapsto e^{2\pi i \delta_2(x)/h_2}$. Everything here is holomorphic except when we’re taking the h1th roots. But our
domain never contains 0, because that would mean that π(U1) ∩ π(U2) contains an elliptic point, and then π(U2) contains
the elliptic point π(τ1) (and we defined our neighborhoods so that the only elliptic points are possibly at the center),
so we can always define $q^{1/h_1}$ in a consistent way.
For the case where we have two cusps, we know that δ2 γδ1−1 is a translation if our neighborhoods intersect, and
then we just get another chain of holomorphic maps – this case isn’t too bad.
And now we’re done – we’ve shown all of the required conditions for X(Γ) being a compact Riemann surface.
We can now state a first version of the Modularity Theorem:

Theorem 184
Let E be a complex elliptic curve with rational j-invariant. Then there is a positive integer N such that there is a
surjective holomorphic function X0 (N) → E, known as a modular parameterization.

15 April 9, 2020

Diamond and Shurman 3.1 – Christian Altamirano


We’ll talk today about the genus of compact Riemann surfaces. There is a rigorous definition for genus, but intuitively,
it is just the number of “holes” in our surface (so 0 for a sphere and 1 for a torus). Our goal will be to compute this
number for a few Riemann surfaces.
Recall that the modular curve X(Γ) is defined to be the set of orbits {Γτ : τ ∈ H∗ = H ∪ Q ∪ {∞}}. We showed
earlier that X(Γ) is a Riemann surface, and whenever we have a nonconstant holomorphic map f : X → Y between
compact Riemann surfaces, f is surjective, and f −1 (y ) is discrete (so finite) for any y ∈ Y .
It turns out that there is a well-defined degree d ∈ Z+ for our function f , so that |f −1 (y )| = d for all but finitely
many points y ∈ Y . We’ll be proving something a bit more general:

Definition 185
For any x ∈ X, let ex ∈ Z+ be the ramification degree of f at x. In other words, ex is the multiplicity with which
f takes 0 to 0 as a local map: in local coordinates centered at x and f(x), f looks like $g(z) = z^{e_x} h(z)$ with h(0) ≠ 0.

Recall the following theorem from complex analysis:

Theorem 186 (Local Mapping Theorem)


Let f be a holomorphic function, and suppose that f(z) − w0 has a zero of order n at z0 (that is, with ramification
degree n at z0). Then for every w ≠ w0 sufficiently near w0, the equation f(z) = w has exactly n distinct solutions near z0.

Lemma 187
There exists d such that
\[ \sum_{x \in f^{-1}(y)} e_x = d \]

for any y ∈ Y .

Proof. Let E be the set of exceptional points – that is, the points with ramification degree more than 1. This set is
finite, because in local coordinates all such points are zeros of f′, which are isolated (and X is compact). Therefore,
X′ = X \ E and Y′ = Y \ f(E) are both still connected. Now fix y ∈ Y′; we know that for each x ∈ f^{−1}(y), we have a
neighborhood Ux such that f is locally bijective on Ux (because x has ramification degree 1); shrink these neighborhoods
so that they are disjoint from each other, and such that they map to the same neighborhood V ⊂ Y′ containing y. (This is
okay because there are finitely many points in the preimage.) Now we can define a function y ↦ |f^{−1}(y)| on V; this is
a locally constant integer-valued function on the connected set Y′, so it must be constant.
Let that constant be d: we now know that $\sum_{x \in f^{-1}(y)} e_x = d$ for all points y ∈ Y′ (everything except the
images of exceptional points). To extend this to Y, note that whenever y = f(x) and x is an exceptional point, we can
find a neighborhood N(y) of y such that every other point y′ in that neighborhood has $\sum_{x \in f^{-1}(y')} e_x = d$.
(Basically, we’re replacing ex points of multiplicity 1 with 1 point of multiplicity ex.) Thus we have the result that we want.

Definition 188
Define the degree of f : X → Y to be the unique d ∈ Z+ such that
\[ \sum_{x \in f^{-1}(y)} e_x = d \]

for all y ∈ Y .

Theorem 189 (Riemann-Hurwitz)


Let f : X → Y be a nonconstant holomorphic map of degree d between compact Riemann surfaces, and let gX, gY be the
genera of X and Y. Then
\[ 2g_X - 2 = d(2g_Y - 2) + \sum_{x \in X} (e_x - 1). \]

Proof sketch. The idea is to triangulate Y so that the images of the ramified points are vertices – we get VY vertices,
EY edges, and FY faces, and we know that 2 − 2g = F − E + V for any surface. Then lifting under f^{−1} yields a
triangulation of X with dEY edges and dFY faces, but only $dV_Y - \sum_x (e_x - 1)$ vertices: we lose
$\sum_x (e_x - 1)$ vertices due to ramification.
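
As a quick sanity check of the Riemann-Hurwitz formula, here are two standard examples worked out (they are illustrations, not taken from the lecture). The first is the power map on the Riemann sphere, and the second is a degree 2 map from a torus to the sphere, assuming it is ramified at exactly four points.

```latex
% Example 1: f(z) = z^n on the Riemann sphere, so g_X = g_Y = 0 and d = n.
% The only ramified points are 0 and \infty, each with e_x = n.
\[ 2 \cdot 0 - 2 = n(2 \cdot 0 - 2) + 2(n - 1) = -2n + 2n - 2 = -2. \]

% Example 2: a degree 2 map from a torus (g_X = 1) to the sphere (g_Y = 0),
% ramified at four points with e_x = 2 each.
\[ 2 \cdot 1 - 2 = 2(2 \cdot 0 - 2) + 4(2 - 1) = -4 + 4 = 0. \]
```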

So now we can return to modular curves: suppose that Γ1 ⊂ Γ2 are congruence subgroups of SL2 (Z). There
is a natural projection f : X(Γ1 ) → X(Γ2 ), sending the orbits Γ1 τ → Γ2 τ . We can think of this as a nonconstant
holomorphic map between Riemann surfaces, and we can therefore calculate the degree:

Proposition 190
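We have
\[ \deg(f) = [\{\pm I\}\Gamma_2 : \{\pm I\}\Gamma_1] = \begin{cases} [\Gamma_2 : \Gamma_1]/2 & \text{if } -I \in \Gamma_2,\ -I \notin \Gamma_1, \\ [\Gamma_2 : \Gamma_1] & \text{otherwise.} \end{cases} \]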

Proof. Partition {±I}Γ2 into cosets $\bigcup_j \{\pm I\}\Gamma_1\gamma_j$. Pick a point Γ2τ (an orbit in X(Γ2)) such
that this is not the image of a point that ramifies; we’ll show that f^{−1}(Γ2τ) = {Γ1γjτ}. (There are only finitely
many points that ramify.)
We verify both inclusions: Γ1γjτ ∈ f^{−1}(Γ2τ), because f(Γ1γjτ) = Γ2γjτ = Γ2τ, and for any Γ1τ′ ∈ f^{−1}(Γ2τ), we know
that f(Γ1τ′) = Γ2τ′ = Γ2τ, and then we can find γ ∈ Γ2 such that τ′ = γτ. Writing γ = ±γ′γj with γ′ ∈ Γ1 for some j,
this means that Γ1τ′ = Γ1γjτ.

There is no ramification here, so ex = 1 for every point in f −1 (Γ2 τ ), meaning that deg f is the number of cosets.

Here, recall that we multiply by {±I} in the congruence subgroups, because −I fixes every point and the action of
SL2(Z) factors through SL2(Z)/{±I}.
Earlier on in the class, we discussed the local structure on Riemann surfaces with the straightening maps δ and
wrapping maps $\rho_1(z) = z^{h_1}$, $\rho_2(z) = z^{h_2}$. For a neighborhood U of τ in H, define ρ1 ◦ δ(U) = V1 and
ρ2 ◦ δ(U) = V2. Denote by Γj,τ the isotropy subgroup of τ in Γj for j = 1, 2; we proved last time that
hj = |{±I}Γj,τ|/2 ∈ {1, 2, 3} (because elliptic points have period either 2 or 3). Since the local map from V1 to V2 is
the holomorphic map $q \mapsto q^{h_2/h_1}$, the ratio h2/h1 should be integral, and there are very few cases: we must
either have h1 = 1 or h1 = h2.

Proposition 191
The ramification degree for τ ∈ H is h2 when τ is an elliptic point for Γ2 but not for Γ1 and 1 otherwise. This is
also the index [{±I}Γ2,τ : {±I}Γ1,τ].

Similarly, we can define a ramification degree for cusps. Earlier, we showed that if U is a neighborhood of
s ∈ Q ∪ {∞}, our maps ρ1(z), ρ2(z) are $e^{2\pi i z/h_1}$ and $e^{2\pi i z/h_2}$, respectively, so this time the local
map looks like $q \mapsto q^{h_1/h_2}$, where hj = [SL2(Z)s : {±I}Γj,s] are the widths.

Proposition 192
The ramification degree for an s ∈ Q ∪ {∞} is h1/h2. This is also the index [{±I}Γ2,s : {±I}Γ1,s].

Proposition 193
Let Γ1 be normal in Γ2 . Then all points in X(Γ1 ) with the same image in X(Γ2 ) have the same ramification
degree.

Proof. Suppose Γ1τ1 and Γ1τ2 are points in X(Γ1) that map to the same point in X(Γ2). We know that Γ2τ1 = Γ2τ2,
so we can find γ ∈ Γ2 such that γτ1 = τ2. And the idea from here is that conjugation by γ preserves Γ1 (by normality),
so it carries the isotropy group of τ1 to that of τ2 and does not change the period.

This now allows us to compute the genus: recall that for a group action G on X, Gx is the orbit of x ∈ X. We’ll
consider the case Γ1 = Γ and Γ2 = SL2 (Z).

Definition 194 (Local definitions)


Let y2 = SL2(Z)i, y3 = SL2(Z)e^{2πi/3}, and y∞ = SL2(Z)∞. These are an elliptic point of period 2, elliptic point
of period 3, and cusp of X(1), respectively. Let ε2, ε3 be the number of elliptic points of Γ in f^{−1}(y2) and
f^{−1}(y3), and let ε∞ be the number of cusps of X(Γ).

But any elliptic point Γτ of period 2 has an image that is also an elliptic point of period 2. (If h1 > 1, then
h2 = h1 > 1). The only elliptic point of period 2 in X(1) is y2, so we indeed have f(Γτ) = y2. The same argument works
for ε3, so ε2 and ε3 account for all of the elliptic points.
So now for h = 2, 3, we know that the ramification degree of the points in f^{−1}(yh) is 1 for the elliptic points of Γ
and h otherwise. Thus, the degree of f is
\[ d = \sum_{x \in f^{-1}(y_h)} e_x = h \cdot (|f^{-1}(y_h)| - \varepsilon_h) + 1 \cdot \varepsilon_h. \]

Similarly, we can find ε∞ by noting that
\[ d = \sum_{x \in f^{-1}(y_\infty)} e_x \implies \sum_{x \in f^{-1}(y_\infty)} (e_x - 1) = d - \varepsilon_\infty, \]

and then we can apply Riemann-Hurwitz:

Theorem 195
The genus of X(Γ) is
\[ g = 1 + \frac{d}{12} - \frac{\varepsilon_2}{4} - \frac{\varepsilon_3}{3} - \frac{\varepsilon_\infty}{2}, \]
where d is the degree of the natural projection f : X(Γ) → X(1).

(Here, we’ve used the fact that X(1) has genus 0 – its fundamental domain D is basically a triangle whose sides are glued together into a sphere.)
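
For reference, here is one way the Riemann-Hurwitz computation behind this genus formula can be written out. It is a sketch that only uses the degree identities for f^{−1}(y2), f^{−1}(y3), f^{−1}(y∞) and the fact that X(1) has genus 0.

```latex
% For h = 2, 3: d = h(|f^{-1}(y_h)| - \varepsilon_h) + \varepsilon_h, and only the
% non-elliptic points over y_h are ramified (each with e_x = h), so
\[ \sum_{x \in f^{-1}(y_h)} (e_x - 1) = (h - 1)\bigl(|f^{-1}(y_h)| - \varepsilon_h\bigr) = \frac{h - 1}{h}\,(d - \varepsilon_h). \]
% Riemann-Hurwitz with g_{X(1)} = 0:
\[ 2g - 2 = -2d + \frac{1}{2}(d - \varepsilon_2) + \frac{2}{3}(d - \varepsilon_3) + (d - \varepsilon_\infty)
          = \frac{d}{6} - \frac{\varepsilon_2}{2} - \frac{2\varepsilon_3}{3} - \varepsilon_\infty, \]
\[ g = 1 + \frac{d}{12} - \frac{\varepsilon_2}{4} - \frac{\varepsilon_3}{3} - \frac{\varepsilon_\infty}{2}. \]
% Check with \Gamma = SL_2(Z): d = \varepsilon_2 = \varepsilon_3 = \varepsilon_\infty = 1 gives g = 0.
```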

Theorem 196
Let p be a prime, and let k = p + 1. The genus of the modular curve X0(p) = X(Γ0(p)) is
\[ g = \begin{cases} \lfloor k/12 \rfloor - 1 & \text{if } k \equiv 2 \bmod 12, \\ \lfloor k/12 \rfloor & \text{otherwise.} \end{cases} \]

k will show up later as the weight of some modular forms – this should remind us of computing the dimension of
cusp forms of some weight k.

Proof. We’ll need a few results: let $\alpha_j = \begin{bmatrix} 1 & 0 \\ j & 1 \end{bmatrix}$ for j ∈ [0, p − 1], and
let $\alpha_\infty = \begin{bmatrix} 1 & -1 \\ 1 & 0 \end{bmatrix}$. The point of these αj s is that we can split
SL2(Z) into a disjoint union $\bigcup_j \Gamma_0(p)\alpha_j$.
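
One half of this claim is easy to check by machine: two matrices A, B lie in the same coset Γ0(p)A = Γ0(p)B exactly when AB^{−1} has lower-left entry divisible by p. The sketch below (illustrative helper names) verifies that the p + 1 representatives give pairwise distinct cosets for p = 13; it does not check that they exhaust SL2(Z).

```python
from itertools import combinations

def matmul(A, B):
    """Product of two 2x2 integer matrices given as nested tuples."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def inverse_sl2(M):
    """Inverse of a determinant 1 matrix."""
    (a, b), (c, d) = M
    return ((d, -b), (-c, a))

def coset_reps(p):
    """The representatives alpha_0, ..., alpha_{p-1}, alpha_infinity."""
    reps = {j: ((1, 0), (j, 1)) for j in range(p)}
    reps["inf"] = ((1, -1), (1, 0))
    return reps

def same_coset(A, B, p):
    """True if Gamma_0(p) A = Gamma_0(p) B, i.e. A B^{-1} is in Gamma_0(p)."""
    return matmul(A, inverse_sl2(B))[1][0] % p == 0

reps = coset_reps(13)
distinct = all(not same_coset(reps[s], reps[t], 13)
               for s, t in combinations(reps, 2))
print(len(reps), distinct)   # 14 True
```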
We’ll skip the proofs of a few computational results:

Lemma 197
X0 (p) has exactly two cusps at 0 and ∞.

Lemma 198
For any 0 ≤ j < p, we have γαj(i) = αj(i) for some γ ∈ Γ0(p) of order 4 if and only if j² + 1 ≡ 0 mod p.

The purpose of these results is that the number of elliptic points of period 2 is the number of solutions to
x² + 1 ≡ 0 mod p. This is because we have a nontrivial matrix γ that fixes αj(i), so αj(i) is an elliptic point, and
because it is in the orbit of i, it must have period 2. And we know how to compute this number from the residue of p
mod 4 using some results in algebra (it’s 2 for p ≡ 1 mod 4, 0 for p ≡ 3 mod 4, and 1 for p = 2).

Lemma 199
For any 0 ≤ j < p, we have γαj(e^{2πi/3}) = αj(e^{2πi/3}) for some γ ∈ Γ0(p) of order 6 if and only if j² − j + 1 ≡ 0 mod p.

The logic is the same here, and we also know how to compute this number from the residue of p mod 3. Putting all of
this together means that we can do casework on the residue of p mod 12, to give the desired result.
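
The casework can be cross-checked numerically. The sketch below plugs d = p + 1 (the number of cosets), ε∞ = 2, and the root counts of x² + 1 and x² − x + 1 mod p into the genus formula g = 1 + d/12 − ε2/4 − ε3/3 − ε∞/2, and compares the result with the closed form in terms of k = p + 1. Function names are illustrative.

```python
from fractions import Fraction

def count_roots(f, p):
    """Number of residues x mod p with f(x) = 0 mod p."""
    return sum(1 for x in range(p) if f(x) % p == 0)

def genus_X0(p):
    """Genus of X_0(p) from the general genus formula, for a prime p."""
    d = p + 1
    eps2 = count_roots(lambda x: x * x + 1, p)
    eps3 = count_roots(lambda x: x * x - x + 1, p)
    eps_inf = 2
    g = 1 + Fraction(d, 12) - Fraction(eps2, 4) - Fraction(eps3, 3) - Fraction(eps_inf, 2)
    assert g.denominator == 1, "the genus should be an integer"
    return int(g)

def genus_closed_form(p):
    """Closed form in terms of k = p + 1."""
    k = p + 1
    return k // 12 - 1 if k % 12 == 2 else k // 12

print([genus_X0(p) for p in (2, 3, 5, 7, 11, 13, 17, 37)])   # [0, 0, 0, 0, 1, 0, 1, 2]
```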

Example 200
The prime p = 13 is the smallest prime for which all four possible elliptic points occur (two of period 2 and two of period 3).

We can also consider some coset representatives of SL2 (Z)/Γ0 (13):


" #" # " # " #
1 0 0 1 1 −1 0 −1
βj = , β∞ = α∞ ,
j 1 1 0 0 1 1 0

and this partitions the fundamental domain D of X0(13) into 14 regions. There are 13 points in SL2(Z)i, and we can
actually associate these with the βj(i): it turns out that jj′ + 1 ≡ 0 mod 13 if and only if γβj(i) = βj′(i) for some γ
of degree 4.
We also obtain some facts about elliptic points: for example, because 5 · 5 + 1 ≡ 0 mod 13, this gives an elliptic
point of period 2 (taking j = j′ = 5). So this allows us to identify points of the boundary arc with each other, as
well as understand the orientation of the fundamental domain.
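
The pairing j ↔ j′ determined by jj′ + 1 ≡ 0 mod 13 is easy to tabulate. This small sketch (the function `partner` is an illustrative name) lists it; the indices paired with themselves are exactly the solutions of j² + 1 ≡ 0 mod 13.

```python
def partner(j, p=13):
    """The residue j' with j*j' + 1 = 0 mod p, or None if there is none (j = 0)."""
    for jp in range(p):
        if (j * jp + 1) % p == 0:
            return jp
    return None

pairing = {j: partner(j) for j in range(13)}
print(pairing)
print([j for j in range(13) if pairing[j] == j])   # [5, 8]
```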

Modular forms and representations of real groups – Professor Kim


We’ll talk about how modular forms induce representations for real groups (like SLn (R), SOn (R), and so on).

Definition 201
Let G be a group (in general, we’re interested in locally compact groups such as SL2(R)). A representation of
G on a vector space Vπ is a continuous homomorphism π : G → Aut(Vπ ).

Definition 202
Let G act on a topological space X, and let $V_\pi = C_c^\infty(X)$ (the space of compactly supported smooth functions).
Define the right regular representation RX via the action

(RX(g)f)(x) = f(xg)

when X = G.

One of the questions to study is the space of L2 (square-integrable) functions L2(G) or L2(Γ\G) for some discrete
subgroup Γ. The point is that modular forms (as well as Maass forms) generate representations of SL2(R), which
give automorphic representations.
To understand what’s going on, let’s fix a group G = SL2 (R) and let K = SO(2), Γ = SL2 (Z). We know that
G/K is homeomorphic to H, because K can be characterized as the stabilizer of i in G, so we send g to gi . Now
recall that a modular form is a holomorphic function with the automorphic condition

$$f(gz) = j(g, z)^k f(z),$$

where j(g, z) = cz + d. This identity can be notationally rewritten as $(f|_k g)(z) = f(z)$ – this is the [γ]k operator from
earlier in the class – and now we can define a function

$$\varphi_f(g) = (f|_k g)(i) = j(g, i)^{-k} f(gi).$$

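This function φf inherits some properties from f: it is smooth (because f is nice), it defines a function in
$C^\infty(\Gamma\backslash SL_2(\mathbb{R}))$, and the right action of the rotation matrix
$$r_\theta = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \in K$$
pops out as a “character” in k. Since $j(g r_\theta, i) = j(g, i)\, j(r_\theta, i) = j(g, i)\, e^{i\theta}$, with this choice of $r_\theta$ we get
$$\varphi_f(g r_\theta) = e^{-ik\theta} \varphi_f(g).$$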

This means that φf spans a one-dimensional representation of K.


Also, since f is holomorphic at ∞, there is moderate growth of this function φf at i∞. (In addition, if f is
cuspidal, then φf is rapidly decreasing.) Finally, because f is holomorphic, $\partial f/\partial\bar{z} = 0$, which means there exists a
differential operator F such that F · φf = 0.
In general, the above properties define automorphic forms of SL2 (R), and more generally any real reductive group.

Definition 203
Let G be a connected real reductive group, and let Γ be an arithmetic subgroup of G. A function φ on Γ\G is an
automorphic form if φ is smooth, of moderate growth, and K-finite (the span of the right K-translates of φ is finite-dimensional),
plus another property related to Maass forms and the Lie algebra.

We can let A(G, Γ) be the space of automorphic forms on G with respect to Γ. This is G-stable with respect to
the right-regular representation, and we can use this to study $L^2(\Gamma\backslash G)$ because the cuspidal automorphic forms are
contained in it: $A_0(G, \Gamma) \subset L^2(\Gamma\backslash G)$. If we go back to the SL2(Z) example, take a function f ∈ Mk(Γ), and
consider
$$V_{\varphi_f} = \langle R(g)\varphi_f \mid g \in G \rangle.$$

This gives us a holomorphic discrete series πk of minimal weight k, and we can make the connection here:

$$M_k(\Gamma) \cong \operatorname{Hom}_{(\mathfrak{g}, K)}(\pi_k, A(G, \Gamma)).$$

16 April 14, 2020

Diamond and Shurman 3.2 – Zack Chroman


We’ll be talking today about automorphic forms. Recall from earlier in the class the weight-k operator

(f [α]k )(τ ) = j(α, τ )−k f (α(τ )),

where j(α, τ ) = cτ + d and α ∈ SL2 (Z). In Serre, we defined a modular function in order to define a modular form,
and we went from that to restricting to specific congruence subgroups. We’ll work backwards now:

Definition 204
An automorphic form is a function f : H → Ĉ that is meromorphic, weight-k invariant under the congruence
subgroup Γ, and such that f[α]k is meromorphic at infinity for all α ∈ SL2(Z). Denote the space of automorphic forms of weight k
by Ak(Γ).

In the case k = 0, $A_0(\Gamma) \cong \mathbb{C}(X(\Gamma))$ is the space of meromorphic functions on
X(Γ). This is because our functions need to be invariant under Γ, so we can define them on the quotient X(Γ).

Theorem 205
We have that $A_0(SL_2(\mathbb{Z})) = \mathbb{C}(j)$, where $j = 1728 g_2^3/\Delta$.

Here, C(j) is the space of rational functions in j.

Proof. Clearly C(j) is contained in A0(SL2(Z)), because j is weight zero. To go backwards, we’ll need to do more
work. We know that the Laurent series is of the form

$$j(\tau) = \frac{1}{q} + \sum_{n=0}^{\infty} a_n q^n,$$

so the only pole of j is at infinity, because that’s the only point where ∆ = 0. So j is holomorphic on the space Y(1)
(which is X(1) without the point at infinity), which means j : Y(1) ≅ C is a homeomorphism. In general, if we have a
meromorphic function f which has zeros $z_1, \dots, z_a$ and poles $p_1, \dots, p_b$ at finite points (listed with multiplicity), we can define

$$g(\tau) = \frac{\prod_{i=1}^{a} (j(\tau) - j(z_i))}{\prod_{i=1}^{b} (j(\tau) - j(p_i))}.$$

This function has the same zeros and poles as f at finite points, so f/g has no poles and zeros at any finite point. And the number of
poles and zeros need to add up to the same number on this compact Riemann surface, including multiplicity, so the
ratio must have no zeros or poles even at infinity, which means that f = cg for some constant c. Since g is a rational
function of j, this means f is also a rational function of j.
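
The shape of the Laurent series used in the proof can be checked numerically. The sketch below is not from the lecture: it assumes the standard equivalent normalization $j = E_4^3 / (q\prod_{n\ge 1}(1-q^n)^{24})$ with $E_4 = 1 + 240\sum \sigma_3(n) q^n$, and the function names are made up for illustration.

```python
def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)


def multiply(a, b, terms):
    """Product of two power series (coefficient lists), truncated to `terms`."""
    out = [0] * terms
    for i, ai in enumerate(a):
        for k, bk in enumerate(b):
            if i + k >= terms:
                break
            out[i + k] += ai * bk
    return out


def j_coefficients(terms):
    """Return [c_0, c_1, ...] where j = c_0/q + c_1 + c_2 q + ...

    Assumption (not from the notes): j = E4^3 / (q * prod (1 - q^n)^24).
    """
    e4 = [1] + [240 * sigma3(n) for n in range(1, terms)]
    e4_cubed = multiply(multiply(e4, e4, terms), e4, terms)

    # Coefficients of prod_{n >= 1} (1 - q^n)^24, i.e. Delta / q up to a constant.
    delta_over_q = [1] + [0] * (terms - 1)
    for n in range(1, terms):
        for _ in range(24):
            for i in range(terms - 1, n - 1, -1):
                delta_over_q[i] -= delta_over_q[i - n]

    # Power series division; the denominator has constant term 1.
    quotient = []
    for m in range(terms):
        correction = sum(quotient[i] * delta_over_q[m - i] for i in range(m))
        quotient.append(e4_cubed[m] - correction)
    return quotient


print(j_coefficients(4))
```

The first entry is the coefficient of 1/q, confirming that j has a simple pole at infinity and no other negative powers of q.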

Later on, we’ll be able to compute A0 (Γ) for other congruence subgroups, too.
Our next goal is to extend this result – when k ≠ 0, we can’t make our automorphic forms into a map on X(Γ),
because even if we have two points that are equivalent under Γ, they’ll have some conjugate value under the action.
But we should be able to define the order of vanishing consistently, and what’s morally going on is that two points
under the action of γ ∈ Γ will keep the order the same because the factor of automorphy cz + d has no zeros or poles
on H.

Definition 206
If π(τ) (where π is the projection from H∗ to X(Γ)) is not a cusp, define the order of vanishing at π(τ) as

$$\nu_{\pi(\tau)}(f) = \frac{\nu_\tau(f)}{h},$$
where $h = |\{\pm I\}\Gamma_\tau / \{\pm I\}|$ is the period of the point τ.

The idea is that the coordinate charts are no longer injective – they’re h-to-one. So this order of vanishing is
sometimes not an integer, if we’re looking at elliptic points where we have half- or third-integers.
But in the case where π(τ) is a cusp, the order of f at τ is more complicated. We’ll consider τ = ∞ first: recall
that we needed the local coordinate $q = e^{2\pi i\tau/h}$, and then we wrote f as a power series in q, defining the order to
be the lowest degree term. Here, h is the width: it’s the minimal h such that one of $\pm\begin{bmatrix} 1 & h \\ 0 & 1 \end{bmatrix}$ is in Γ.
If f is h-periodic, we can write down a power series in q, and everything works out. But it’s possible that
$-\begin{bmatrix} 1 & h \\ 0 & 1 \end{bmatrix}$ is in Γ, but $\begin{bmatrix} 1 & h \\ 0 & 1 \end{bmatrix}$ is not. Then for odd k we have f(τ + h) = −f(τ), meaning our function is 2h-periodic instead of h-periodic.
So we’ll define h′ = 2h, so that we’re writing our power series as

$$f(\tau) = \sum_{n=m}^{\infty} a_n q_{h'}^n, \qquad q_{h'} = e^{2\pi i\tau/h'},$$

and then we’ll define the order at infinity $\nu_\infty(f) = m/2$.


And to get the complete definition at cusps, we just need to conjugate from ∞ to s:

Definition 207
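Let s be a cusp. Then the order of vanishing at π(s) is

$$\nu_{\pi(s)}(f) = \begin{cases} \nu_s(f)/2 & k \text{ odd and } (\alpha^{-1}\Gamma\alpha)_\infty = \left\langle -\begin{bmatrix} 1 & h \\ 0 & 1 \end{bmatrix} \right\rangle, \\ \nu_s(f) & \text{otherwise,} \end{cases}$$

where α is an element of SL2(Z) such that α∞ = s.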

In almost all cases, f will be h-periodic, and we won’t have the extra case. We’ll call the extra case an irregular
cusp, and we’ll see later on that the only case this happens for our familiar congruence subgroups is s = 1/2 for Γ1(4).
From here, we’re done with the definitions, and we now have a nice result to characterize the space of cusp forms:

Theorem 208
Let k, N ∈ N such that k(N + 1) = 24, and let Sk be the space of all cusp forms of weight k. Define
$\varphi_k(\tau) = \eta(\tau)^k \eta(N\tau)^k$, where $\eta(\tau) = e^{2\pi i\tau/24} \prod_{n=1}^{\infty} (1 - q^n)$.

• If Sk(Γ1(N)) is nonzero, it is equal to Cφk.

• In addition, if Sk(Γ0(N)) is nonzero, then Sk(Γ0(N)) = Sk(Γ1(N)) = Cφk.

Here, N is either 1 or a prime number, so the second point follows from the first because Sk (Γ0 (N)) ⊆ Sk (Γ1 (N)).
An important case is N = 1, k = 12: this tells us that all cusp forms of weight 12 over SL2 (Z) are multiples of ∆.
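
The q-expansion of φk can be computed from the product definition. Because k(N + 1) = 24, the prefactor of $\eta(\tau)^k\eta(N\tau)^k$ is exactly q, so only the infinite product needs expanding. The following is a minimal sketch with a made-up function name `eta_product_coeffs`; it returns the coefficients of φk/q.

```python
def eta_product_coeffs(N, k, terms):
    """Coefficients of prod_{n >= 1} (1 - q^n)^k (1 - q^(N n))^k, truncated.

    When k * (N + 1) == 24 this is the q-expansion of phi_k / q.
    """
    coeffs = [1] + [0] * (terms - 1)

    def multiply_by_one_minus_q_power(step):
        # In-place multiplication by (1 - q^step); going from high to low
        # degree keeps the not-yet-updated coefficients available.
        for i in range(terms - 1, step - 1, -1):
            coeffs[i] -= coeffs[i - step]

    for n in range(1, terms):
        for _ in range(k):
            multiply_by_one_minus_q_power(n)
            multiply_by_one_minus_q_power(N * n)
    return coeffs


tau_values = eta_product_coeffs(N=1, k=12, terms=5)  # coefficients of Delta / ((2 pi)^12 q)
level_11 = eta_product_coeffs(N=11, k=2, terms=6)    # coefficients of phi_2 / q for N = 11
print(tau_values)
print(level_11)
```

For N = 1, k = 12 the list starts with Ramanujan's tau values 1, −24, 252, consistent with φ12 being a multiple of ∆.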

Proof. We can first define

$$g = \varphi_k^{N+1} = (2\pi)^{-24} \Delta(\tau)\Delta(N\tau),$$

and some more substitution yields

$$g = q^{N+1} \prod_{n=1}^{\infty} (1 - q^n)^{24} (1 - q^{Nn})^{24}.$$

Now ∆(Nτ) is a cusp form in S12(Γ0(N)), so substitution tells us that g ∈ S24(Γ0(N)) ⊆ S24(Γ1(N)).

Lemma 209
Our function g has a zero of order N + 1 at every cusp π(s) (here, π is the projection map onto X(Γ1 (N))).

Proof of lemma. We’ll need to blackbox a few results: it turns out that all cusps are regular for N ≠ 4. Also, the cusps
of X(Γ0 (N)) are π0 (∞) and π0 (0), and the widths here are 1 and N respectively. Finally, we’ll need the projection
map X(Γ1 (N)) → X(Γ0 (N)) to be unramified at the cusps.
The second assumption is the most important here, because it gives us a handle on the cusps on the larger space
X(Γ1 (N)). For any cusp s ∈ X(Γ1 (N)), we know that π(s) lies over either π0 (∞) or π0 (0).

In the first case, where π(s) is a cusp over π0(∞), we can write s = α∞ for some matrix α ∈ Γ0(N). Since g is
weight-24 invariant under Γ0(N), it is invariant under the weight-24 action of α, which means that

νπ(s) (g) = νs (g) = ν∞ (g) = N + 1,

where the last equality comes from the direct expansion of g.
In the second case, π(s) is a cusp over π0(0), so we look at the weight-24 operator of $S = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$ on g, which looks like

$$g[S]_{24} = N^{-12} q_N^{N+1} \prod_{n=1}^{\infty} (1 - q_N^n)^{24} (1 - q_N^{Nn})^{24},$$

where $q_N = e^{2\pi i\tau/N}$. Since S sends ∞ to zero, this expansion in $q_N$ shows that the order of g at π(0), and hence at π(s), is
also N + 1.

One important point is that we needed to write things in terms of qN instead of our usual q, because the width of
π0(0) is N. So now we can finish the proof: if f ∈ Sk(Γ1(N)), we can consider the automorphic form $f^{N+1}/g$, which has weight 0. This is
holomorphic on Y(Γ1(N)), because g has no zeros there (it’s the product of ∆ functions), and because of our lemma, this
is also holomorphic at the cusps. This is because the numerator $f^{N+1}$ has a zero of order at least N + 1 at every cusp, so the order is at
least as large in the numerator as the denominator. But then this is a holomorphic function defined on a compact set
X(Γ1(N)), so we actually have that $f^{N+1}/g = c$ for a constant c, which implies that f = c′φk ∈ Cφk (φk is one of the
(N + 1)th roots of g, and we need to pick the same (N + 1)th root by continuity throughout).

We haven’t shown that φk is actually a cusp form yet – we do know this is true for (N, k) = (1, 12), because then
Sk(SL2(Z)) = C∆. On the other hand, whenever k is odd, Sk(Γ0(N)) = 0, because −I is in Γ0(N). But it does
turn out that (using the dimension formulas that we’ll go over in the next few classes) we do have

Sk (Γ0 (N)) = Sk (Γ1 (N)) = Cφk :

we have a space of cusp forms of dimension 1.

Diamond and Shurman 3.3 – Michael Tang


The topic of this section is meromorphic differentials – these will be helpful in talking about modular forms for the
rest of the chapter. Today, we’ll go over some motivation, define local and then global differentials, and then we’ll
go from differentials to automorphic forms. On Thursday, we’ll finish the lecture by going in the opposite direction,
giving us an isomorphism between automorphic forms and differentials, and we can use this to talk about dimension
of modular forms.
We’ll start with the transformation rule: for any function f ∈ A2n (Γ), we have

f (γ(τ )) = j(γ, τ )2n f (τ ) ∀γ ∈ Γ.

We can notice that $\frac{d\gamma(\tau)}{d\tau} = j(\gamma, \tau)^{-2}$ by computing with the quotient rule, so we can now say that

$$f(\gamma(\tau)) = \left(\frac{d\gamma(\tau)}{d\tau}\right)^{-n} f(\tau),$$

and now we can symbolically treat the derivative as a fraction and rewrite as

$$f(\gamma(\tau))(d\gamma(\tau))^n = f(\tau)(d\tau)^n.$$
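
The derivative identity dγ(τ)/dτ = j(γ, τ)^{-2} is easy to sanity-check numerically. This is only an illustrative sketch: the matrix `gamma`, the sample point `tau`, and all function names are arbitrary choices made here, and the derivative is approximated by a central difference.

```python
def mobius(gamma, tau):
    """Apply the fractional linear transformation of gamma = (a, b, c, d) to tau."""
    a, b, c, d = gamma
    return (a * tau + b) / (c * tau + d)


def automorphy_factor(gamma, tau):
    """j(gamma, tau) = c * tau + d."""
    _, _, c, d = gamma
    return c * tau + d


def numerical_derivative(gamma, tau, step=1e-6):
    """Central-difference approximation of d(gamma(tau))/d(tau)."""
    return (mobius(gamma, tau + step) - mobius(gamma, tau - step)) / (2 * step)


gamma = (2, 1, 1, 1)  # an element of SL2(Z): 2*1 - 1*1 = 1
tau = 0.3 + 1.2j      # an arbitrary point in the upper half-plane

lhs = numerical_derivative(gamma, tau)
rhs = automorphy_factor(gamma, tau) ** -2
print(abs(lhs - rhs))  # should be tiny
```

The determinant-one condition matters: for a general matrix the derivative picks up a factor of ad − bc.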

This doesn’t have any real meaning yet, but we’ll make it more formal later. But this is nice, because f (z)(dz)n is
invariant under the action of Γ – we’ll make the definitions make sense now, so that we can exploit the useful properties
here.
We’ll start by looking at open sets on C:

Definition 210
Let V ⊆ C be an open set, and let n ∈ N. The meromorphic differentials on V of degree n are

Ω⊗n (V ) = {f (q)(dq)n : f meromorphic on V }.

We can set up a vector space structure over C, because f and g being meromorphic means f + g, cf are both
meromorphic as well. The dq here is just a symbol – everything is really just determined by f and n.
In order to study this on Riemann surfaces, we’ll need to talk about a mapping between local differentials: if we
have two open sets V1 , V2 and a holomorphic map φ between them, we’ll induce a pullback map φ∗ in the other
direction between the differentials, defined by

φ∗ (f (q2 )(dq2 )n ) = [f (φ(q1 ))(φ′ (q1 ))n ](dq1 )n .

This is basically a u-substitution. We can confirm that whenever φ is holomorphic and f is meromorphic, (f ◦ φ)(φ′ )n
is also meromorphic.

Example 211
If we take the inclusion ι : V1 ⊆ V2 , then ι∗ is just a restriction:

$$\iota^*(f(q)(dq)^n) = [f(\iota(q)) \cdot 1^n](dq)^n = f|_{V_1}(q)(dq)^n.$$

We can note a few properties of this pullback map:


• φ∗ is a linear map from Ω⊗n (V2 ) to Ω⊗n (V1 ).

• φ∗ is a contravariant operator: (φ2 ◦ φ1 )∗ = φ∗1 ◦ φ∗2 .


Both of these results are just bookkeeping of our earlier definitions. We’ll also state a few properties without proof:

• If φ is surjective, then φ∗ is injective. Similarly, if φ is a bijection, so is φ∗ , and we have the identity (φ∗ )−1 =
(φ−1 )∗ .

We now want to define differentials on the entire Riemann surface X. Recall that when we defined the Riemann
surface, we used a collection of coordinate charts (φj )j∈J , each of which maps neighborhoods Uj of X (here the Uj
must cover X) to open sets in C, and one condition is that they must be compatible:

$$\varphi_{k,j} = \varphi_k \circ \varphi_j^{-1}$$

must be a holomorphic map from Vj → Vk for all j, k ∈ J. So similarly, a global differential should be a collection of
local differentials which are compatible.

Definition 212
Let X be a Riemann surface with charts φj : Uj → Vj . A meromorphic differential on X of degree n is a
collection of (ωj )j∈J , where ωj ∈ Ω⊗n (Vj ), such that for all j, k ∈ J,

$$\varphi_{k,j}^*\left(\omega_k|_{\varphi_k(U_j \cap U_k)}\right) = \omega_j|_{\varphi_j(U_j \cap U_k)}$$

(the pullback of the transition map does send ωk to ωj whenever both are defined). We denote the set of
differentials by Ω⊗n (X).

Example 213
Consider X = C/Λ to be a complex torus. Recall that the coordinate charts φj are just local inverses of the projection map
π : C → X, defined locally in the simple way. The transition maps are then just
“corrections between fundamental domains:” we have φk,j(z) = z + λ for some λ ∈ Λ.

We can check the compatibility condition – because each transition map has derivative 1, we can just say that
ωj = dz for all j ∈ J, and then indeed φ∗k,j(ωk) = dz = ωj, and the collection of ωs makes sense on X.
So now we can work towards automorphic forms: let Γ be a congruence subgroup of SL2 (Z), and we’ll spend the
rest of this lecture on X(Γ). Remember that neighborhoods of X(Γ) look like π(Uj), where π is the projection from H∗
to X(Γ) and Uj is a neighborhood in H∗. Then we defined the coordinate charts φj indirectly via

ψj = φj ◦ π,

where ψj (of the form $e^{2\pi i\delta_j(\tau)/h}$ at a cusp) is the identification map we’ve been discussing. So the local differentials are defined on Vj, and
now we can define a global pullback of ω by pulling back the ωj s individually:

π ∗ (ω)|Uj′ = ψj∗ (ωj′ ),

where the primes mean that we’re only working on H. We just need to make sure these local pullbacks agree on the
intersections of the Uj s, in the same way that we needed to make sure the differentials were compatible. In other
words, we need to show that
 
$$\psi_j^*\left(\omega_j|_{V_{j,k}}\right) = \psi_k^*\left(\omega_k|_{V_{k,j}}\right),$$

where Vj,k = ψj(Uj′ ∩ Uk′) and Vk,j = ψk(Uj′ ∩ Uk′).

Proof. We know that ψj = φj ◦ π and ψk = φk ◦ π, so solving for π yields

$$\varphi_j^{-1} \circ \psi_j = \pi = \varphi_k^{-1} \circ \psi_k.$$

This means that

$$\varphi_{kj} = \varphi_k \circ \varphi_j^{-1} = \psi_k \circ \psi_j^{-1},$$

meaning that we can write ψk = φkj ◦ ψj. This is helpful, because the compatibility condition for the ωs has to do
with the pullback of the transition map. So taking the pullback of both sides and using contravariance yields

$$\psi_k^* = \psi_j^* \circ \varphi_{kj}^* \implies \psi_k^*(\omega_k) = \psi_j^*(\varphi_{kj}^*(\omega_k)) = \psi_j^*(\omega_j),$$

as desired.

This shows that π ∗ is well-defined, because the local pullbacks take the same value whenever they intersect. And
the Uj′ s cover H – we’ll talk on Thursday about extending this to the cusps, too.
In summary, if we have a meromorphic differential ω ∈ Ω⊗n (X(Γ)) of degree n, we can construct a meromorphic
differential π ∗ (ω) on H, also of degree n. We can then define the function f such that

π ∗ (ω) = f (τ )(dτ )n .

Since π ◦ γ = π for every γ ∈ Γ, the pullback doesn’t care about the individual elements of Γ, so it must be Γ-invariant: in
other words, γ∗(π∗(ω)) = (π ◦ γ)∗(ω) = π∗(ω).
And then
f (τ )(dτ )n = γ ∗ (f (τ )(dτ )n ) = f (γ(τ ))(γ ′ (τ ))n (dτ )n ,

and plugging in the definition of the derivative in our case yields

f (τ ) = f (γ(τ ))j(γ, τ )−2n ,

and now we’ve gotten that f is weakly modular of weight 2n, which is a (roundabout) way to motivate our definition
of weakly modular!
Next time, we’ll show that f is actually an automorphic form – we need to show meromorphicity when acting by
the weight-k operators and at infinity. It turns out that it’s easier to study differential forms than their corresponding
automorphic forms, so this will help us get some more insight out of the space A2n(Γ).

17 April 16, 2020

Diamond and Shurman 3.3 continued – Michael Tang


We’ll continue discussing meromorphic differentials today – as a recap, recall that we defined the local differentials
$f(q)(dq)^n \in \Omega^{\otimes n}(V)$ on open sets V ⊆ C, and then we defined pullback maps $\Omega^{\otimes n}(V_2) \to \Omega^{\otimes n}(V_1)$ given a
holomorphic map between open sets V1 → V2. We combined these local differentials into global differentials on Riemann
surfaces, making sure that they were compatible through the pullbacks of the transition maps. Specifically, if we consider
a congruence subgroup Γ ⊆ SL2(Z), we pulled back a differential ω on the modular curve X(Γ) to a global differential $f(\tau)(d\tau)^n$ on H. We
noticed that this yields a weakly modular function f of weight 2n, and it turns out that f[α]2n is actually meromorphic at
∞ for all α ∈ SL2(Z), so f is an element of A2n(Γ). This is important, because it gives us a map from differentials to automorphic
forms.
Today, we’ll consider the inverse mapping as well: we’ll start with f ∈ A2n (Γ), and we’ll construct the ω that pulls
back to f. This will give us an isomorphism between the spaces of meromorphic differentials and automorphic forms, and
we’ll see why this is important.
Remember that the mapping ω → f was defined via the pullback map

π ∗ (ω)|Uj′ = ψj∗ (ωj′ ) = f (τ )(dτ )n .

Here, Uj′ is an open subset of H, π is the projection map, and ψ takes a neighborhood Uj to a neighborhood Vj . To
show this was valid, we needed to show that the local pullbacks agreed: the compatibility condition we needed to
satisfy was
$$\psi_k^*(\omega_k) = (\psi_j^* \circ \varphi_{kj}^*)(\omega_k) = \psi_j^*(\omega_j).$$

Let’s show the converse is true as well:

Lemma 214
Suppose (ωj) is a collection of differentials on the Vj s, and let ψj : Uj → Vj be identification maps for X(Γ). If
ψj∗ (ωj′ ) = ψk∗ (ωk′ ) on the intersections, then the ωj s are compatible, meaning they define a global differential.

Proof. Run the previous argument in reverse: each equality is an if and only if statement.

This is helpful because we can start with an f , construct the individual ωj s, make sure they pull back to f (τ )(dτ )n ,
and that will give us the ω.
For the inverse construction, let’s start with an automorphic form f ∈ A2n(Γ). We are doing things locally, so we
can start with a neighborhood Uj ⊆ H∗, and we want to construct an ωj so that

ψj∗ (ωj ) = (f (τ )(dτ )n )|Uj .

We’ll want to do a “pushforward,” so we can try to pull back along an inverse map $\psi_j^{-1}$. Unfortunately, ψj isn’t invertible
in general, so we’ll need to look more carefully at what’s going on. Recall that when we set up the charts of Riemann
surfaces, we wrote
$$\psi_j = \rho_j \circ \delta_j,$$
where δj is a fractional linear transformation (the action of the matrix $\begin{bmatrix} 1 & -\tau_j \\ 1 & -\bar{\tau}_j \end{bmatrix}$), while $\rho_j(z) = z^h$ for non-cusps and
$\rho_j(z) = e^{2\pi i z/h}$ for cusps. So we can try to do our pushforward in two steps: construct an intermediate differential λj such that
pulling back λj by δj gives the original differential $f(\tau)(d\tau)^n$, and then construct ωj such that pulling back ωj by ρj gives the intermediate differential λj.
We’ll start with the linear transformation: luckily this is invertible, because the determinant $\tau_j - \bar{\tau}_j$ is nonzero whenever
τj ∈ H, so we just need to define
$$\lambda_j = \alpha^*\left(f(\tau)(d\tau)^n|_{U_j'}\right),$$
where $\alpha = \delta_j^{-1}$. And doing some computation with this,

$$\lambda_j = \alpha^*\left(f(\tau)(d\tau)^n|_{U_j'}\right) = f(\alpha(z))(\alpha'(z))^n (dz)^n,$$

and here $\alpha'(z) = \det\alpha \cdot j(\alpha, z)^{-2}$. (We can’t assume ad − bc = 1 for the matrix like usual.) This yields

$$\lambda_j = f(\alpha(z)) \cdot (\det\alpha)^n j(\alpha, z)^{-2n} (dz)^n,$$

and we can still define (by analogy) $(f[\alpha]_{2n})(z) = (\det\alpha)^n j(\alpha, z)^{-2n} f(\alpha(z))$, so that $\lambda_j = (f[\alpha]_{2n})(z)(dz)^n$ – this is a generalization of the weight-k
operator.
So now we need to find an ωj such that ρ∗j (ωj ) = λj . If we take a small neighborhood Uj , we can try to find a
formula for a local differential form that will pull back, at least in a small region. We’ll just do the case where τj is not
a cusp, so we have ρj (z) = z h .

Lemma 215
The function $z^n (f[\alpha]_{2n})(z)$ is only dependent on $z^h$.

Remember that h is defined to be the size of the isotropy subgroup $\{\pm I\}(\delta_j\Gamma\delta_j^{-1})_0/\{\pm I\}$ of the point 0 = δj(τj), and we showed that
this is finite cyclic, generated by a rotation $r_h(z) = e^{2\pi i/h} z$.

Proof. This follows from the Γ-invariance of $f(\tau)(d\tau)^n$, which means that $\lambda_j = \alpha^*(f(\tau)(d\tau)^n)$ is $\delta_j\Gamma\delta_j^{-1}$-invariant:

$$(\delta_j\gamma\delta_j^{-1})^* = (\delta_j^{-1})^* \circ \gamma^* \circ \delta_j^* = \alpha^* \circ \gamma^* \circ (\alpha^{-1})^*,$$

and applying this to $\alpha^*(f(\tau)(d\tau)^n)$ works out the way we want. And this means $r_h^*(\lambda_j) = \lambda_j$, so

$$(e^{2\pi i/h} z)^n (f[\alpha]_{2n})(e^{2\pi i/h} z) = z^n (f[\alpha]_{2n})(z),$$

which means that $z^n (f[\alpha]_{2n})(z)$ is invariant under a rotation by an hth root of unity, meaning it is only dependent on
$z^h$.

The point is that we can now define


$$g_j(z^h) = z^n (f[\alpha]_{2n})(z)$$

for some meromorphic function gj. It turns out that

$$\omega_j = \frac{g_j(q)}{(hq)^n} (dq)^n$$

now works, because we can check that pulling back by ρj yields

$$\rho_j^*(\omega_j) = (f[\alpha]_{2n})(z)(dz)^n = \lambda_j,$$

because the (hq)n makes the derivatives work out. So now we’ve constructed the local differential ωj !
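
For the non-cusp case, the pullback claimed above can be written out from the definition of the pullback, using $\rho_j(z) = z^h$ and hence $\rho_j'(z) = h z^{h-1}$. The following is just that computation spelled out, using only the formulas already stated:

```latex
\begin{align*}
\rho_j^*(\omega_j)
  &= \frac{g_j(\rho_j(z))}{(h\,\rho_j(z))^n}\,(\rho_j'(z))^n\,(dz)^n \\
  &= \frac{g_j(z^h)}{h^n z^{hn}}\; h^n z^{(h-1)n}\,(dz)^n \\
  &= z^{-n}\, g_j(z^h)\,(dz)^n \\
  &= (f[\alpha]_{2n})(z)\,(dz)^n = \lambda_j .
\end{align*}
```

The factor $(hq)^n$ in the denominator is exactly what cancels the $n$th power of the derivative of $z \mapsto z^h$, leaving the $z^{-n}$ needed to undo the definition of $g_j$.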

Fact 216
For the cusp case, we can find a similar property: $(f[\alpha]_{2n})(z)$ turns out to only be a function of $e^{2\pi i z/h}$, say
$(f[\alpha]_{2n})(z) = g_j(e^{2\pi i z/h})$, and $\omega_j = \frac{g_j(q)}{(2\pi i q/h)^n}(dq)^n$.

So in all cases, we have a local differential ωj , such that pulling back gives us our original f (τ )(dτ )n . Since all the
ωj s pull back to the same f , they are compatible, so we get a global differential on the Riemann surface.
Combining all of this together, we get the isomorphism we’ve been after:

Theorem 217
The space of automorphic forms is isomorphic to the space of meromorphic differentials: for any n ∈ N and
congruence subgroup Γ,
Ω⊗n (X(Γ)) ∼
= A2n (Γ)

as complex vector spaces.

We do have a bijection, and we just need to show that there is a linear map between them, but the pullback map
is linear and we’re not doing anything weird with multiplication or addition in ways that are not compatible.
This is important, because we want to compute the dimensions of the spaces Mk (Γ) and Sk (Γ) (modular forms
and cusp forms, respectively). Here, modular forms are automorphic forms that are holomorphic, and cusp forms
are such forms that vanish at ∞. Instead of looking at the dimensions directly, we can look at the images under the
isomorphism, and compute the dimensions of the images in Ω⊗k/2 (X(Γ)). Riemann surfaces turn out to have more
structure – we’ll be able to write things in terms of the order of vanishing in a purely algebraic way. For example, being
a modular form means that the order of vanishing of f at every point π(τj) and π(sj) is at least 0.

Diamond and Shurman 3.4 – David Wu
We’ll follow up on the previous topic by talking about divisors and the Riemann-Roch formula. Throughout this lecture,
we’ll be talking about general Riemann surfaces X, though we really care about X(Γ).

Definition 218
A divisor D on a compact Riemann surface X is a finite formal Z-linear combination of points of X:
$$D = \sum_{x \in X} n_x x, \qquad n_x \in \mathbb{Z}, \quad n_x = 0 \text{ for all but finitely many } x.$$

We can look at the set of all divisors on X, denoted Div(X), and this forms the free abelian group on the points
of X: basically, we add two divisors by adding the coefficients. We’ll define a (partial) ordering on Div(X) as well: say
that D ≥ D′ if the coefficients satisfy $n_x \ge n'_x$ for all x.

Definition 219
The degree of a divisor is $\deg(D) = \sum_x n_x$ (a finite sum, since we assume that nx is zero for all but finitely many x).

Notice that the degree map from Div(X) to Z is a homomorphism of abelian groups. A natural question is to ask
about the kernel of this map – one natural place to start is to look at the function field of meromorphic functions
C(X). (Remember that being a meromorphic function f on X means that composing it with the inverse of each
coordinate chart yields a meromorphic function on an open subset of C.) For any such nonzero meromorphic function f, we have the divisor
$$\operatorname{div}(f) = \sum_{x \in X} \nu_x(f) x,$$

where νx(f) is the order of vanishing of f at x. Then we claim that the map

div : C(X)∗ → Div(X)

is a homomorphism as well, because div(f1 f2 ) = div(f1 ) + div(f2 ). (Just consider the Laurent expansion at each point:
the powers should add when we multiply the functions.)

Proposition 220
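For any nonzero f ∈ C(X), the divisor div(f) has degree zero.

Proof. A meromorphic function f on X is the same as a holomorphic map $\tilde{f} : X \to \hat{\mathbb{C}}$, where $\hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$ is the Riemann sphere.
We know that (from Christian’s lecture) the degree
$$d = \sum_{x \in \tilde{f}^{-1}(y)} \nu_x(\tilde{f})$$

(counting each point x with the multiplicity with which f̃ takes the value y there) is constant in y. Taking y = 0 and y = ∞ shows that the number of zeros of f is equal to the number of poles, counted with multiplicity, so deg(div(f)) = 0.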

As a more handwavy illustrative example, if we take X to be the Riemann sphere and we draw a closed contour,
this is saying that we can apply the argument principle on both sides of the contour.
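
On the Riemann sphere, divisors of rational functions can be written down by hand, which makes the last two statements easy to test. The sketch below is an illustration only: divisors are stored as dictionaries from points to integer coefficients, the label `"oo"` for the point at infinity and all function names are choices made here, and the input is assumed to be the list of finite roots of the numerator and denominator with multiplicity.

```python
from collections import Counter

INFINITY = "oo"  # label for the point at infinity (a naming choice made here)


def divisor_of_rational_function(zeros, poles):
    """Divisor of f(z) = prod (z - a) / prod (z - b) on the Riemann sphere.

    `zeros` and `poles` list the finite roots of the numerator and the
    denominator, repeated according to multiplicity.
    """
    divisor = Counter()
    for a in zeros:
        divisor[a] += 1
    for b in poles:
        divisor[b] -= 1
    # Near infinity f behaves like z^(len(zeros) - len(poles)), so its
    # order of vanishing there is len(poles) - len(zeros).
    divisor[INFINITY] += len(poles) - len(zeros)
    return {x: n for x, n in divisor.items() if n != 0}


def degree(divisor):
    """Sum of the coefficients of a divisor."""
    return sum(divisor.values())


def add(d1, d2):
    """Add two divisors coefficient by coefficient."""
    total = Counter(d1)
    for x, n in d2.items():
        total[x] += n
    return {x: n for x, n in total.items() if n != 0}


f1 = divisor_of_rational_function(zeros=[0, 0], poles=[1])     # z^2 / (z - 1)
f2 = divisor_of_rational_function(zeros=[1], poles=[0, 2, 2])  # (z - 1) / (z (z - 2)^2)
product = divisor_of_rational_function(zeros=[0, 0, 1], poles=[1, 0, 2, 2])

print(f1, degree(f1))
print(add(f1, f2) == product)
```

Both features from the text show up: each divisor has degree zero once the point at infinity is included, and the divisor of a product is the sum of the divisors.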

Definition 221
Let Divℓ (X) be the group of divisors of nonzero meromorphic functions, and let Div0 (X) be the group of divisors
of degree 0 (this is the kernel of the degree map). By the proposition above, Divℓ (X) is a subgroup of Div0 (X).

There are divisors that don’t come from meromorphic functions, but we have a handy theorem:

Theorem 222 (Abel)
Let g be the genus of X. Then there is a lattice Λg spanning $\mathbb{C}^g$ such that

$$\operatorname{Div}^0(X)/\operatorname{Div}^\ell(X) \cong \mathbb{C}^g/\Lambda_g$$

(a g-dimensional complex torus).

For example, if we let Λi = iZ ⊕ Z and consider the elliptic curve C/Λi, we can compute the divisor of ℘′/℘, where
℘ is the Weierstrass function – we know the zeros and poles of this function, so this is easy to do directly.

Definition 223
For any divisor D on X, the linear space of D is

L(D) = {f ∈ C(X) : f = 0 or div(f ) + D ≥ 0}.

This might look mysterious in general, but if we have D = div(f˜) for some meromorphic function f˜, L(D) consists
of the functions f that “cancel out the poles,” meaning f f˜ must be holomorphic. We can indeed check that

νx (f1 + f2 ) ≥ min(νx (f1 ), νx (f2 )),

so L(D) is closed under addition. In fact, L(D) is a finite-dimensional vector space: we’ll denote its dimension by
ℓ(D). Take an n ∈ N, and let ω ∈ Ω⊗n(X) be a nonzero meromorphic differential on X. We have a local expression at each
point x ∈ X of the form
$$\omega_x = f_x(q)(dq)^n,$$

where fx just means we’re looking at the local expression at the point x, and now we set $\nu_x(\omega) = \nu_0(f_x)$ (because q is centered
around 0) and define
$$\operatorname{div}(\omega) = \sum_{x \in X} \nu_x(\omega) x.$$

Here, we’ll have div(ω1ω2) = div(ω1) + div(ω2) for any ω1 ∈ Ω⊗n(X) and ω2 ∈ Ω⊗m(X).

Definition 224
A canonical divisor on X is a divisor of the form div(λ), where λ is a nonzero element of Ω1 (X).

Theorem 225 (Riemann-Roch)


Let X be a compact Riemann surface of genus g, and let div(λ) be any canonical divisor on X. For any D ∈ Div(X),

$$\ell(D) = \deg(D) - g + 1 + \ell(\operatorname{div}(\lambda) - D).$$

This is important because we have explicit information about canonical divisors. (We won’t prove this.)

Corollary 226
Take previous notation. We have the following results:
1. ℓ(div(λ)) = g.

2. deg(div(λ)) = 2g − 2.

3. For any divisor D with deg(D) < 0, we have ℓ(D) = 0.

4. For any divisor D of degree larger than 2g − 2, ℓ(D) = deg(D) − g + 1.

Proof. For (1), if f is nonconstant, it has a pole (for example, polynomials have a pole at infinity). This means
div(f) ≥ 0 cannot occur, which means that L(0) only contains constant functions, meaning ℓ(0) = 1. Taking D = 0
in Riemann-Roch, we find that 1 = −g + 1 + ℓ(div(λ)), so ℓ(div(λ)) = g as desired.
For (2), we can set D = div(λ), which yields g = deg(div(λ)) − g + 1 + 1. Rearranging gives the desired result.
For (3), suppose that ℓ(D) > 0, so there is a nonzero element f ∈ L(D). This means div(f) ≥ −D, and taking
degrees yields 0 ≥ −deg(D), that is deg(D) ≥ 0, so we’ve proved the contrapositive.
Finally, (2) and (3) imply (4): we have

$$\deg(\operatorname{div}(\lambda) - D) = \deg(\operatorname{div}(\lambda)) - \deg D < 0,$$

since the degree of D is larger than 2g − 2, so ℓ(div(λ) − D) = 0 by (3), and Riemann-Roch gives ℓ(D) = deg(D) − g + 1.

Recall that Zack’s lecture showed us that a nonzero automorphic form j ′ ∈ A2 (Γ) exists for any congruence
subgroup Γ. Let f be any nonzero automorphic form of weight 2: define

λ = ω(f ) ∈ Ω1 (X(Γ))

to be the meromorphic differential for f . Then div(λ) is a canonical divisor, and by (2) above, the degree must be
(2g − 2). So if we take any positive even integer k,

$$\lambda^{k/2} \in \Omega^{\otimes k/2}(X(\Gamma))$$

(multiplying the meromorphic functions and the dqs), and because degree is a homomorphism, this gives us a degree
of k(g − 1) for the divisor of λk/2 . We know the equality of vector spaces

Ω⊗k/2 (X(Γ)) = C(X(Γ))λk/2 ,

and for any meromorphic f , deg(div(f )) = 0. Thus by additivity of the degree, every nonzero differential ω ∈
Ω⊗k/2 (X(Γ)) has a degree of k(g − 1).
So now if Γ is a congruence subgroup and g is the genus of X(Γ), the space of holomorphic one-forms Ω1hol(X(Γ)) is
isomorphic to L(div(λ)), the linear space of div(λ). By (1) above, the dimension of this linear space is ℓ(div(λ)) = g, so the dimension of
S2(Γ) is g. (We can show that the space of weight-2 cusp forms is isomorphic to the space of holomorphic differentials
of degree 1.) In general, this argument will let us find general dimension formulas for Mk(Γ) and Sk(Γ).
In summary, we’ve defined these divisors in terms of the order of vanishing of the functions. When we proved the
formulas for dimension on SL2 (Z), we did an argument relating orders of vanishing at i , ρ, ∞, and so on, and we’re
generalizing here: we’re taking the Riemann surface X(Γ) and using order of vanishing data to relate that to the space
of modular forms.

18 April 21, 2020

Diamond and Shurman 3.5 – Kaarel Haenni


The plan for today is to discuss dimension formulas for the spaces of modular forms and cusp forms of weight k: in
this lecture, we’ll combine results from the past few lectures to answer this question for even k.
We’ll start with a quick review: recall that an automorphic form is a function f : H → Ĉ which is like a modular
form, but it only needs to be meromorphic everywhere. If we take a nonzero f ∈ Ak(Γ), it may not make sense to
talk about f as a function on X(Γ) because it’s not constant on orbits, but we can talk about the order of vanishing
(which is constant). At a non-cusp x = π(τ), we define this order via $\nu_x(f) = \nu_\tau(f)/h$, where h is the period of τ. We want to talk about the
formal sum
$$\operatorname{div}(f) = \sum_{x \in X(\Gamma)} \nu_x(f) x,$$

which looks like the divisor from last lecture, but the small technical difficulty is that these νx may be rational rather
than just being integers. So we’ll just fix this: we’ll define the rational-coefficient divisor space DivQ(X) to be the
space of formal sums $\sum n_x x$, where nx are rational and almost all zero. This still has a natural abelian group structure, a ≥
relation, and a degree function (adding all the coefficients, just like for the integer case).
Riemann-Roch does not extend directly to this Q case, but it will still be useful. Recall that a corollary of Riemann-Roch
is that when we have a compact Riemann surface of genus g, the linear space L(D) of a divisor D ∈ Div(X)
satisfies

$$\ell(D) = \dim(L(D)) = \deg(D) - g + 1$$

whenever deg(D) > 2g − 2. We also know that we have an isomorphism of complex vector spaces

ω : Ak (Γ) → Ω⊗k/2 (X(Γ)),

such that f is sent to the differential ω(f) that pulls back to $f(\tau)(d\tau)^{k/2}$. With this, we’ll move on to new material: our goal is to
reduce the dimension calculations to finding dimensions of linear spaces. We’ll look at k ≥ 2 and even, so that we
know there exists a nonzero function f ∈ Ak (Γ). Let C(X(Γ)) be the field of meromorphic functions on X(Γ): recall
that
Ak (Γ) = C(X(Γ))f

(the automorphic forms are equal to the meromorphic functions times a particular automorphic form). Using this description, we
can describe the space of modular forms via

$$M_k(\Gamma) = \{f_0 f \in A_k(\Gamma) : f_0 f = 0 \text{ or } \operatorname{div}(f_0 f) \ge 0\},$$

which is isomorphic to the complex vector space

$$\{f_0 \in \mathbb{C}(X(\Gamma)) : f_0 = 0 \text{ or } \operatorname{div}(f_0) + \operatorname{div}(f) \ge 0\},$$

where we’ve used the fact that div is a homomorphism. This is just the linear space of div(f), and it seems that
we might want to use Riemann-Roch. But div(f) might have non-integer coefficients, so instead we’ll just take the
floor, ⌊div(f)⌋, which is defined via
$$\left\lfloor \sum n_x x \right\rfloor = \sum \lfloor n_x \rfloor x.$$

89
Since div(f0 ) has all integer coefficients (because f0 is meromorphic), we actually have

div(f0 ) + div(f ) ≥ 0 ⇐⇒ div(f0 ) + bdiv(f )c ≥ 0.

This means that the dimension satisfies


dim(Mk (Γ)) = `(bdiv(f )c),

and we’ll now try to understand the right-hand side better. Let ω be the meromorphic differential which pulls back
to f (τ )(dτ )k/2 on H: recall that {x2,i } denotes the period 2 elliptic points of X(Γ), of which there are a total of ε2 ,
{x3,i } and ε3 are defined similarly for period 3, and {xi } and ε∞ denote the cusps.

Definition 227
We define the formal divisor
\[ \operatorname{div}(d\tau) = -\sum_i \frac{1}{2}\, x_{2,i} - \sum_i \frac{2}{3}\, x_{3,i} - \sum_i x_i. \]

Here, ω kind of looks like f · (dτ )k/2 , so we want to “expand out the div.”

Proposition 228
We do have
\[ \operatorname{div}(\omega) = \operatorname{div}(f) + \frac{k}{2} \operatorname{div}(d\tau). \]

Proof. We just compare coefficients, using the fact that
\[ \nu_x(\omega) = \nu_x(f) - \frac{k}{2}\left(1 - \frac{1}{h}\right) \]
at non-cusps (where $h$ is the period of $x$), and the same thing without the $\frac{1}{h}$ at cusps.

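If we rearrange this result and take floors term by term (each point x appears in at most one sum, and the first term div(ω) has integer coefficients), we get
\[ \lfloor \operatorname{div}(f) \rfloor = \operatorname{div}(\omega) + \sum_i \left\lfloor \frac{k}{4} \right\rfloor x_{2,i} + \sum_i \left\lfloor \frac{k}{3} \right\rfloor x_{3,i} + \sum_i \frac{k}{2}\, x_i. \]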

We can now use Riemann-Roch on the floor-div: usually there is a canonical differential term which we want to avoid.

Proposition 229
The degree of bdiv(f )c is larger than 2g − 2.

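Proof. We know that the degree map is a homomorphism, so we can bash out the sum: we know that $\deg(\operatorname{div}(\omega)) = k(g-1)$, so
\[ \deg(\lfloor \operatorname{div}(f) \rfloor) = k(g-1) + \left\lfloor \frac{k}{4} \right\rfloor \varepsilon_2 + \left\lfloor \frac{k}{3} \right\rfloor \varepsilon_3 + \frac{k}{2}\, \varepsilon_\infty, \]
and then we bound the floors by the worst case based on k:
\[ \ge \frac{k}{2}(2g-2) + \frac{k-2}{4}\, \varepsilon_2 + \frac{k-2}{3}\, \varepsilon_3 + \frac{k}{2}\, \varepsilon_\infty. \]
This can be rewritten as
\[ 2g - 2 + \frac{k-2}{2}\left(2g - 2 + \frac{\varepsilon_2}{2} + \frac{2\varepsilon_3}{3} + \varepsilon_\infty\right) + \varepsilon_\infty, \]
where the parenthetical term is at least 0 from a previous lecture, and $\varepsilon_\infty$ is the number of cusps (equivalence classes of Q ∪ {∞}), so it is at least 1. This shows that the degree is large enough.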

And now applying Riemann-Roch in this simple case yields

dim(Mk (Γ)) = `(bdiv(f )c) = deg(bdiv(f )c) − g + 1,

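meaning that
\[ \dim(M_k(\Gamma)) = (k-1)(g-1) + \left\lfloor \frac{k}{4} \right\rfloor \varepsilon_2 + \left\lfloor \frac{k}{3} \right\rfloor \varepsilon_3 + \frac{k}{2}\, \varepsilon_\infty. \]
We’ll now talk briefly about cusp forms, but the story is basically the same: we just have that
\[ S_k(\Gamma) \cong L\left(\Big\lfloor \operatorname{div}(f) - \sum_i x_i \Big\rfloor\right). \]
The strategy is basically the same, except that we require the orders of vanishing at cusps to be at least 1. This time, when we calculate the degree of the floored divisor, each cusp contributes $\frac{k}{2} - 1$ instead of $\frac{k}{2}$, so we lose the last $\varepsilon_\infty$ term in the rewritten bound from the proof of Proposition 229, and we need k ≥ 4 so that the term involving the parenthetical expression is strictly larger than 0. (For the k = 2 case, $\lfloor \operatorname{div}(f) \rfloor - \sum_i x_i$ is just the canonical divisor div(λ), so the linear space has dimension g.)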
Finally, let’s look at the k ≤ 0 case: every element of $M_0(\Gamma)$ must be constant by Liouville’s theorem, so $S_0(\Gamma) = \{0\}$. Notice that for k < 0, if there is any nonzero function $f \in M_k(\Gamma)$, then $f^{12}\Delta^{-k}$ would be a nonzero element of $S_0(\Gamma)$, so indeed we also just have $M_k(\Gamma) = \{0\}$ for all k < 0 (which implies that the spaces of cusp forms are also zero).
So all of our results can be summarized in a compact form:

Theorem 230
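For even k, we have
\[ \dim(M_k(\Gamma)) = \begin{cases} (k-1)(g-1) + \left\lfloor \frac{k}{4} \right\rfloor \varepsilon_2 + \left\lfloor \frac{k}{3} \right\rfloor \varepsilon_3 + \frac{k}{2}\, \varepsilon_\infty & k \ge 2 \\ 1 & k = 0 \\ 0 & k < 0 \end{cases} \]
and
\[ \dim(S_k(\Gamma)) = \begin{cases} (k-1)(g-1) + \left\lfloor \frac{k}{4} \right\rfloor \varepsilon_2 + \left\lfloor \frac{k}{3} \right\rfloor \varepsilon_3 + \left(\frac{k}{2} - 1\right) \varepsilon_\infty & k \ge 4 \\ g & k = 2 \\ 0 & k \le 0. \end{cases} \]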

This theorem is nice because it gives us results in terms of congruence subgroups which we know how to calculate.
And note that when Γ = SL2 (Z), we already have a similar result which can be easily derived by the results we know.
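
As a quick sanity check on Theorem 230, here is a minimal Python sketch that evaluates both formulas from the invariants $g, \varepsilon_2, \varepsilon_3, \varepsilon_\infty$. The function name and argument order are made up for illustration, and the invariants passed in for $\Gamma_0(11)$ (genus 1, no elliptic points, two cusps) are standard values assumed here rather than derived in these notes.

```python
def dims_even_weight(k, g, e2, e3, e_inf):
    """Return (dim M_k, dim S_k) for even k, following Theorem 230.

    g is the genus of X(Gamma); e2, e3, e_inf count the period-2 points,
    the period-3 points, and the cusps. (Hypothetical helper.)
    """
    if k % 2 != 0:
        raise ValueError("this sketch only covers even weights")
    if k < 0:
        return 0, 0
    if k == 0:
        return 1, 0
    # Terms shared by the formulas for M_k and S_k.
    common = (k - 1) * (g - 1) + (k // 4) * e2 + (k // 3) * e3
    dim_m = common + (k // 2) * e_inf
    dim_s = g if k == 2 else common + (k // 2 - 1) * e_inf
    return dim_m, dim_s


# SL2(Z): genus 0, one elliptic point of each period, one cusp.
print(dims_even_weight(12, 0, 1, 1, 1))
# Gamma_0(11): genus 1, no elliptic points, two cusps (assumed inputs).
print(dims_even_weight(2, 1, 0, 0, 2))
```

For SL2(Z) and k = 12 this reproduces the familiar two-dimensional space spanned by $\Delta$ and $E_{12}$, with a one-dimensional space of cusp forms.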

Example 231
We can prove that
M2 (Γ0 (p)) = S2 (Γ0 (p)) ⊕ CG2,p

by showing that the dimensions add up.

Basically, $S_2$ is just missing a dimension of 1 from $M_2$, and note that $G_{2,p}(\tau) = G_2(\tau) - pG_2(p\tau)$ is an element of $M_2$ but not an element of $S_2$.

Example 232
For all k even such that k(N + 1) = 24, we have dim(Sk (Γ0 (N))) = 1.

This in particular means that Sk (Γ0 (N)) = Sk (Γ1 (N)) = Cφk , where

\[ \varphi_k(\tau) = \eta(\tau)^k\, \eta(N\tau)^k \]

for the Dedekind eta function η. In particular, this means that

\[ M_2(\Gamma_0(11)) = \mathbb{C}\varphi_2 \oplus \mathbb{C}G_{2,11}: \]

we can find some very explicit descriptions of modular forms.
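
The dimension claim in Example 232 can be checked numerically for the prime levels N = 11, 5, 3, 2 (the remaining case N = 1 is SL2(Z) with k = 12). The sketch below is hedged: it assumes the standard facts that $\Gamma_0(p)$ has index p + 1 in SL2(Z), two cusps, and elliptic point counts given by the number of solutions of $x^2 + 1 \equiv 0$ and $x^2 + x + 1 \equiv 0$ mod p, together with the genus formula $g = 1 + \frac{d}{12} - \frac{\varepsilon_2}{4} - \frac{\varepsilon_3}{3} - \frac{\varepsilon_\infty}{2}$. None of these are proved in this section, and all function names are hypothetical.

```python
from fractions import Fraction


def gamma0_prime_data(p):
    """Return (g, e2, e3, e_inf) for Gamma_0(p), p prime (assumed standard formulas)."""
    e2 = sum(1 for x in range(p) if (x * x + 1) % p == 0)
    e3 = sum(1 for x in range(p) if (x * x + x + 1) % p == 0)
    e_inf = 2
    index = p + 1  # index of Gamma_0(p) in SL2(Z)
    g = 1 + Fraction(index, 12) - Fraction(e2, 4) - Fraction(e3, 3) - Fraction(e_inf, 2)
    if g.denominator != 1:
        raise ArithmeticError("genus formula did not give an integer")
    return int(g), e2, e3, e_inf


def dim_cusp_forms_even(k, g, e2, e3, e_inf):
    """dim S_k for even k >= 2, as in Theorem 230."""
    if k == 2:
        return g
    return (k - 1) * (g - 1) + (k // 4) * e2 + (k // 3) * e3 + (k // 2 - 1) * e_inf


def example_232_dimensions():
    """Map each (k, N) with k(N + 1) = 24 and N prime to dim S_k(Gamma_0(N))."""
    pairs = [(2, 11), (4, 5), (6, 3), (8, 2)]
    return {(k, p): dim_cusp_forms_even(k, *gamma0_prime_data(p)) for k, p in pairs}


print(example_232_dimensions())
```

Under these assumptions every entry of the returned dictionary should equal 1, matching the statement that $S_k(\Gamma_0(N))$ is spanned by $\varphi_k$.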


Finally, we have the isomorphism
\[ S_k(\mathrm{SL}_2(\mathbb{Z})) \cong S_2(\Gamma_0(p)), \]

where k = p + 1 and p is an odd prime. And when p = 11, we have the simple isomorphism where we multiply by the function $\varphi_2 / \Delta$.

Diamond and Shurman 3.6 – Vanshika Jain


We’ll prove the dimension formulas for the odd-k case here. Here’s the result we’re trying to show:

Theorem 233
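Let Γ be a congruence subgroup of SL2(Z), and let k be an odd integer. If −I ∈ Γ, then $M_k(\Gamma) = S_k(\Gamma) = 0$. Otherwise, let g be the genus of X(Γ), $\varepsilon_3$ be the number of elliptic points of period 3, $\varepsilon_\infty^{\mathrm{reg}}$ be the number of regular cusps and $\varepsilon_\infty^{\mathrm{irr}}$ be the number of irregular cusps. Then
\[ \dim M_k(\Gamma) = \begin{cases} (k-1)(g-1) + \left\lfloor \frac{k}{3} \right\rfloor \varepsilon_3 + \frac{k}{2}\, \varepsilon_\infty^{\mathrm{reg}} + \frac{k-1}{2}\, \varepsilon_\infty^{\mathrm{irr}} & k \ge 3 \\ 0 & k < 0, \end{cases} \]
\[ \dim S_k(\Gamma) = \begin{cases} (k-1)(g-1) + \left\lfloor \frac{k}{3} \right\rfloor \varepsilon_3 + \frac{k-2}{2}\, \varepsilon_\infty^{\mathrm{reg}} + \frac{k-1}{2}\, \varepsilon_\infty^{\mathrm{irr}} & k \ge 3 \\ 0 & k < 0. \end{cases} \]
When k = 1, if $\varepsilon_\infty^{\mathrm{reg}} > 2g - 2$, then the dimensions of $M_1$, $S_1$ are $\varepsilon_\infty^{\mathrm{reg}}/2$ and 0 respectively, and otherwise, we have $\dim M_1 \ge \varepsilon_\infty^{\mathrm{reg}}/2$ and $\dim S_1 = \dim M_1 - \varepsilon_\infty^{\mathrm{reg}}/2$.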
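
To make the odd-weight formulas concrete, here is a small Python sketch for k ≥ 3. It uses exact fractions because $\frac{k}{2}$ is a half-integer, and checks that the totals come out integral. The sample invariants for $\Gamma_1(5)$ (genus 0, four regular cusps) and $\Gamma_1(4)$ (genus 0, two regular cusps and one irregular cusp) are standard values assumed for illustration, not computed in these notes, and the function name is hypothetical.

```python
from fractions import Fraction


def dims_odd_weight(k, g, e3, e_reg, e_irr):
    """Return (dim M_k, dim S_k) for odd k >= 3 and -I not in Gamma (Theorem 233)."""
    if k % 2 == 0 or k < 3:
        raise ValueError("this sketch only covers odd k >= 3")
    # Terms shared by both formulas.
    common = (k - 1) * (g - 1) + (k // 3) * e3 + Fraction(k - 1, 2) * e_irr
    dim_m = common + Fraction(k, 2) * e_reg
    dim_s = common + Fraction(k - 2, 2) * e_reg
    if dim_m.denominator != 1 or dim_s.denominator != 1:
        raise ArithmeticError("inputs are inconsistent: dimensions must be integers")
    return int(dim_m), int(dim_s)


# Gamma_1(5): assumed genus 0, no elliptic points, 4 regular cusps.
print(dims_odd_weight(3, 0, 0, 4, 0))
# Gamma_1(4): assumed genus 0, 2 regular cusps and 1 irregular cusp.
print(dims_odd_weight(5, 0, 0, 2, 1))
```

In both cases the difference dim M_k − dim S_k equals the number of regular cusps, which is the Eisenstein space dimension one expects for odd k ≥ 3.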

We’ll use the following tools:

• The fact that dim(Mk (Γ)) = `(bdiv(f )c) still holds for odd k.

• We have an isomorphism of vector spaces $A_n(\Gamma) \to \Omega^{\otimes n/2}(X(\Gamma))$ for even n. In particular, we can take $f \in A_k(\Gamma)$ and consider $f^2$ for n = 2k.

First, note that Mk (Γ) = 0 for all k < 0, so Sk (Γ) = 0 for all k < 0 as well. And if k is odd and −I ∈ Γ, then

f (τ ) = (f [−I]k )(τ ) = −f (τ ),

which implies that the whole space $A_k(\Gamma) = 0$, which means that $M_k(\Gamma) = S_k(\Gamma) = 0$.

From here on out, k will be odd and positive, and we can assume −I ∉ Γ. Recall that X(Γ) can have both regular and irregular cusps: an irregular cusp π(s) ∈ X(Γ) is one where the stabilizer of ∞ in $\alpha^{-1}\Gamma\alpha$ is generated by $-\begin{bmatrix} 1 & h \\ 0 & 1 \end{bmatrix}$, where α(∞) = s. (The cusp is regular if there’s no negative sign.) Then we have that the order of vanishing is
\[ \nu_{\pi(s)}(f) = \begin{cases} \nu_s(f)/2 & \text{if } s \text{ is an irregular cusp} \\ \nu_s(f) & \text{otherwise} \end{cases} \]

for a cusp π(s) ∈ X(Γ). (Basically, this is because the Fourier series width is 2h instead of just h.) Also, recall that there is an isomorphism between automorphic forms and differentials when k ∈ N is even (see above). We’ll use that isomorphism now: take ω to be the differential $\omega(f^2)$, which pulls back to $f(\tau)^2 (d\tau)^k$ on H. We’ll also define $\{x_{3,i}\}$, $\{x_i\}$, $\{x_i'\}$ to be the period 3 elliptic points, regular cusps, and irregular cusps, with counts $\varepsilon_3$, $\varepsilon_\infty^{\mathrm{reg}}$, and $\varepsilon_\infty^{\mathrm{irr}}$.
We’ll similarly define the formal divisor
\[ \operatorname{div}(d\tau) = -\sum_i \frac{2}{3}\, x_{3,i} - \sum_i x_i - \sum_i x_i'. \]

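Note that because −I ∉ Γ, there are no period 2 elliptic points, so this is consistent with last time’s definition. Replacing f by $f^2$ and $\frac{k}{2}$ by k in the argument from the last lecture shows similarly that

\[ \operatorname{div}(\omega) = 2\operatorname{div}(f) + k \operatorname{div}(d\tau), \]

and now we can combine these two to find that

\[ \operatorname{div}(f) = \frac{1}{2} \operatorname{div}(\omega) + \sum_i \frac{k}{3}\, x_{3,i} + \sum_i \frac{k}{2}\, x_i + \sum_i \frac{k}{2}\, x_i'. \]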

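Taking the floor pointwise, the order of vanishing is integral everywhere except possibly at the $x_{3,i}$, $x_i$, $x_i'$. For an elliptic point x of period 3, we know that we can write the order of vanishing as
\[ \nu_x(f) = m + \frac{j}{3} \implies \frac{1}{2}\nu_x(\omega) = m + \frac{j - k}{3}, \]
but j ≡ k mod 3 because $\frac{1}{2}\nu_x(\omega)$ is integral, so the integral part of the order of vanishing is just $\frac{1}{2}\nu_x(\omega) + \lfloor \frac{k}{3} \rfloor$. Similarly, we can find the integral parts at $x_i$ and $x_i'$: these have additional terms of $\frac{k}{2}$ and $\frac{k-1}{2}$, respectively (plus the original $\frac{1}{2}\nu_x(\omega)$), so
\[ \dim(M_k(\Gamma)) = \ell(\lfloor \operatorname{div}(f) \rfloor) = \ell\left( \frac{1}{2}\operatorname{div}(\omega) + \sum_i \left\lfloor \frac{k}{3} \right\rfloor x_{3,i} + \sum_i \frac{k}{2}\, x_i + \sum_i \frac{k-1}{2}\, x_i' \right). \]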

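Our goal is to use a corollary of Riemann-Roch, so we want to get a bound on the degree of the divisor here (specifically, we again want it to be larger than 2g − 2). But
\[ \deg(\lfloor \operatorname{div}(f) \rfloor) = k(g-1) + \left\lfloor \frac{k}{3} \right\rfloor \varepsilon_3 + \frac{k}{2}\, \varepsilon_\infty^{\mathrm{reg}} + \frac{k-1}{2}\, \varepsilon_\infty^{\mathrm{irr}} > (k-2)\left( g - 1 + \frac{\varepsilon_3}{3} + \frac{\varepsilon_\infty}{2} \right) + (2g - 2) > 2g - 2 \]

for all k ≥ 3, so now the simpler form of Riemann-Roch tells us that

\[ \ell(\lfloor \operatorname{div}(f) \rfloor) = (k-1)(g-1) + \left\lfloor \frac{k}{3} \right\rfloor \varepsilon_3 + \frac{k}{2}\, \varepsilon_\infty^{\mathrm{reg}} + \frac{k-1}{2}\, \varepsilon_\infty^{\mathrm{irr}}, \]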

as desired. For cusp forms, we’ll look at functions of the form $f_0 f \in A_k(\Gamma)$: note that at a regular cusp,

\[ \nu_x(f_0 f) > 0 \iff \nu_x(f_0 f) \ge 1 \iff \nu_x(f_0) + \nu_x(f) - 1 \ge 0, \]

and at an irregular cusp,
\[ \nu_x(f_0 f) > 0 \iff \nu_x(f_0 f) \ge \frac{1}{2} \iff \nu_x(f_0) + \nu_x(f) - \frac{1}{2} \ge 0 \]
(this is coming from the order of vanishing from earlier in the lecture). So we want to measure the dimension
\[ \ell\left( \Big\lfloor \operatorname{div}(f) - \sum_i x_i - \sum_i \frac{1}{2}\, x_i' \Big\rfloor \right), \]

and we just do similar calculations to get the result in our main theorem.
To summarize, we used the isomorphism with f 2 instead of f , which gives us our differential. This allows us to
calculate an explicit formula for the divisor div(f ), and then we do some case-by-case analysis to get the integral parts.
We’ll now prove that $A_k(\Gamma)$ does contain a nonzero f when k > 0 is odd and −I ∉ Γ. Take a nonzero differential $\lambda \in \Omega^1(X(\Gamma))$: for example, we can consider the differential that pulls back to $j'(\tau)\,d\tau$. Now take an element $x_0 \in X(\Gamma)$. By Riemann-Roch, the degree of the divisor of λ is 2g − 2, so

\[ \operatorname{div}(\lambda) - 2(g-1)x_0 \in \mathrm{Div}^0, \]

where $\mathrm{Div}^0$ is the set of divisors of degree 0. (In general, $\mathrm{Div}^\ell$ denotes the divisors of nonzero meromorphic functions on X(Γ).) Abel’s theorem tells us that we have a mapping $\mathrm{Div}^0/\mathrm{Div}^\ell \to \mathbb{C}^g/\Lambda_g$: let $z + \Lambda_g$ be the image of the element $\operatorname{div}(\lambda) - 2(g-1)x_0 + \mathrm{Div}^\ell$. If a divisor $D \in \mathrm{Div}^0$ has that $D + \mathrm{Div}^\ell$ maps to $\frac{z}{2} + \Lambda_g$ (where $z \in \mathbb{C}^g$), then we know that
\[ 2D = \operatorname{div}(\lambda) - 2(g-1)x_0 + \operatorname{div}(f) \]

for some $f \in \mathbb{C}(X(\Gamma))$. We can rearrange this to

\[ \operatorname{div}(f\lambda) = 2(D + (g-1)x_0), \]

and remember our goal now is to find an automorphic form which is weight-k and nonzero, and we’ll do this by finding something of weight 1 and raising it to the kth power. Let $\tilde{f}$ be the function such that fλ on X(Γ) pulls back to $\tilde{f}(\tau)\,d\tau$ on H. This is nice because we know what div(dτ) is: we find that
\[ \operatorname{div}(\tilde{f}) = \operatorname{div}(f\lambda) - \operatorname{div}(d\tau) = 2(D + (g-1)x_0) + \sum_i \frac{2}{3}\, x_{3,i} + \sum_i x_i + \sum_i x_i'. \]

We know that at elliptic points of period 3, $\nu_\tau(\tilde{f})$ is $3\nu_{\pi(\tau)}(\tilde{f})$ and we have a 2 in the numerator for this term, so this is even. Since $\tilde{f}$ is weight-2 invariant, there exists a function $f_1$ such that $f_1^2 = \tilde{f}$ and $f_1[\gamma]_1 = \chi(\gamma) f_1$, where χ : Γ → {±1} (so it’s an automorphic form up to some negative sign). Let Γ′ be the subgroup of γ ∈ Γ such that χ(γ) = 1. If the index of Γ′ in Γ is 1, then we’re good; otherwise, we find a function $f_2$ which is negative at the correct spots, and multiplying $f_1$ by $f_2$ gives the weight-1 form whose kth power is our element of $A_k(\Gamma)$, as desired.

19 April 21, 2020

Diamond and Shurman 3.8, 4.1 – Michelle Xu


We’ll cover cusps for congruence subgroups (explicit forms, cusps in relation to double cosets) and Eisenstein series
(review of material from Serre) in this lecture: the topics might seem disjoint, but we’ll connect them in the next
lecture.
Recall that a cusp is an equivalence class of Q ∪ {∞} under the action of Γ; the set of cusps is denoted Γ\(Q ∪ {∞}). We know that there is always an element of SL2(Z) which maps ∞ to r for any rational r, so there is only one cusp for SL2(Z). However, this won’t be true with our congruence subgroups.

Lemma 234
Let $s = \frac{a}{c}$ and $s' = \frac{a'}{c'}$ be elements of Q ∪ {∞} such that gcd(a, c) = gcd(a′, c′) = 1. Then for any γ ∈ SL2(Z),
\[ s' = \gamma(s) \iff \begin{bmatrix} a' \\ c' \end{bmatrix} = \pm\gamma \begin{bmatrix} a \\ c \end{bmatrix}. \]

Proof. When c, c′ are both nonzero, we know that if $\gamma = \begin{bmatrix} p & q \\ r & t \end{bmatrix}$,
\[ s' = \gamma(s) \iff \frac{a'}{c'} = \frac{pa + qc}{ra + tc}. \]
But both fractions are in lowest terms (the right one because we can invert γ to write a and c as integer combinations of pa + qc and ra + tc), so up to a common sign we must have a′ = pa + qc, c′ = ra + tc as desired.
If c = 0, then a = ±1 based on our condition, so $\frac{a'}{c'} = \frac{p}{r}$. Here, we have gcd(p, r) = 1 because pt − qr = 1, which means that a′ = ±p, c′ = ±r. (And c′ = 0 is similar.)

We’ll now focus more on the congruence subgroups:

Lemma 235
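Let (a, c) and (a′, c′) be pairs of coprime integers. Then
\[ \begin{bmatrix} a' \\ c' \end{bmatrix} = \gamma \begin{bmatrix} a \\ c \end{bmatrix} \text{ for some } \gamma \in \Gamma(N) \iff \begin{bmatrix} a' \\ c' \end{bmatrix} \equiv \begin{bmatrix} a \\ c \end{bmatrix} \bmod N. \]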

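Proof. Here, γ is the identity matrix mod N, so left-to-right is clear. For the other direction, we start with the specific case a = 1, c = 0, which means that a′ ≡ 1 and c′ ≡ 0 mod N. Bezout tells us that for any x, y ∈ Z, we have m, n ∈ Z such that mx + ny = gcd(x, y), and in general we can make mx + ny equal to any multiple of gcd(x, y). In our specific case, because a′ and c′ are coprime, we can find β, δ such that
\[ a'\delta - c'\beta = \frac{1 - a'}{N}, \]
and this means we can use the matrix
\[ \gamma = \begin{bmatrix} a' & \beta N \\ c' & 1 + \delta N \end{bmatrix} \in \Gamma(N), \]
which has determinant $a' + N(a'\delta - c'\beta) = 1$ and sends the column vector (1, 0) to (a′, c′).

In general (when we don’t have (a, c) = (1, 0)), we know that there exist b, d ∈ Z such that ad − bc = 1, so using the matrix $\alpha = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$,
\[ \alpha \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} a \\ c \end{bmatrix} \implies \alpha^{-1} \begin{bmatrix} a' \\ c' \end{bmatrix} \equiv \alpha^{-1} \begin{bmatrix} a \\ c \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \bmod N, \]
so we can first find γ′ ∈ Γ(N) such that
\[ \alpha^{-1} \begin{bmatrix} a' \\ c' \end{bmatrix} = \gamma' \alpha^{-1} \begin{bmatrix} a \\ c \end{bmatrix}, \]
and use $\gamma = \alpha\gamma'\alpha^{-1}$, which lies in Γ(N) because Γ(N) is a normal subgroup of SL2(Z).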
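
The special case of this proof is constructive, and a short Python sketch makes the Bezout step explicit. Given coprime (a′, c′) ≡ (1, 0) mod N, it solves $a'\delta - c'\beta = (1 - a')/N$ with the extended Euclidean algorithm and returns a matrix in Γ(N) whose first column is (a′, c′). The helper names `ext_gcd` and `lift_to_gamma_N` are hypothetical, and the sign convention for β and δ is one consistent choice among several.

```python
def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g, where g is a gcd of a and b (up to sign)."""
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y


def lift_to_gamma_N(a1, c1, N):
    """Return [[a1, beta*N], [c1, 1 + delta*N]] in Gamma(N) for coprime (a1, c1) = (1, 0) mod N."""
    if (a1 - 1) % N != 0 or c1 % N != 0:
        raise ValueError("need (a1, c1) congruent to (1, 0) mod N")
    g, u, w = ext_gcd(a1, c1)  # a1*u + c1*w == g
    if abs(g) != 1:
        raise ValueError("a1 and c1 must be coprime")
    u, w = u * g, w * g        # normalize so that a1*u + c1*w == 1
    t = (1 - a1) // N          # exact, since a1 = 1 mod N
    delta, beta = u * t, -w * t  # then a1*delta - c1*beta == t
    return [[a1, beta * N], [c1, 1 + delta * N]]


print(lift_to_gamma_N(6, 5, 5))
```

The determinant of the returned matrix is $a' + N(a'\delta - c'\beta) = a' + (1 - a') = 1$, and every entry is congruent to the corresponding entry of the identity mod N.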

We’ll use these lemmas to prove more about our cusps:

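Proposition 236
Take the same notation as above, so $s = \frac{a}{c}$, $s' = \frac{a'}{c'}$. Then we have an explicit description for s and s′ to correspond to the same cusp:
\begin{align*}
\Gamma(N)s' = \Gamma(N)s &\iff \begin{bmatrix} a' \\ c' \end{bmatrix} \equiv \pm \begin{bmatrix} a \\ c \end{bmatrix} \bmod N, \\
\Gamma_1(N)s' = \Gamma_1(N)s &\iff \begin{bmatrix} a' \\ c' \end{bmatrix} \equiv \pm \begin{bmatrix} a + jc \\ c \end{bmatrix} \bmod N, \\
\Gamma_0(N)s' = \Gamma_0(N)s &\iff \begin{bmatrix} ya' \\ c' \end{bmatrix} \equiv \pm \begin{bmatrix} a + jc \\ yc \end{bmatrix} \bmod N
\end{align*}
for some j, y (with y coprime to N).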


Proof. For Γ(N), the left hand side is equivalent to saying that s′ = γ(s) for some γ ∈ Γ(N), which is equivalent to $\begin{bmatrix} a' \\ c' \end{bmatrix} = \pm\gamma \begin{bmatrix} a \\ c \end{bmatrix}$ by the first lemma, and then we can apply the second lemma.

For $\Gamma_1(N)$, any matrix can be written in the form
\[ \begin{bmatrix} 1 & * \\ 0 & 1 \end{bmatrix} \bmod N = \begin{bmatrix} aN + 1 & bN + j \\ cN & dN + 1 \end{bmatrix}. \]

We decompose the congruence subgroup as
\[ \Gamma_1(N) = \bigcup_j \Gamma(N) \begin{bmatrix} 1 & j \\ 0 & 1 \end{bmatrix}, \]

and we can now apply the results we already know for Γ(N): the left hand side is equivalent to saying that $s' \in \Gamma_1(N)s$, so $s' \in \Gamma(N) \begin{bmatrix} 1 & j \\ 0 & 1 \end{bmatrix} s$ for some j. But then we can just multiply out and apply the lemmas for Γ(N), and this gives us the desired result.

For $\Gamma_0$, we’ll decompose again:
\[ \Gamma_0(N) = \bigcup_y \Gamma_1(N) \begin{bmatrix} x & k \\ N & y \end{bmatrix}, \]

where we take the union over y relatively prime to N, and we need xy − Nk = 1. And the rest is a similar calculation: the left side is equivalent to $s' \in \Gamma_1(N) \begin{bmatrix} x & k \\ N & y \end{bmatrix} s$ for some y, which is equivalent to $\Gamma_1(N)s' = \Gamma_1(N) \frac{xa + kc}{Na + yc}$, and applying the result for $\Gamma_1$ tells us that
\[ \begin{bmatrix} a' \\ c' \end{bmatrix} \equiv \pm \begin{bmatrix} xa + kc + jyc \\ yc \end{bmatrix} \bmod N, \]

and we multiply the top row by y (using xy ≡ 1 mod N) to get the result.
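
The criterion for $\Gamma_0(N)$ is concrete enough to count cusps by brute force. The sketch below is a hedged illustration: it assumes (beyond what is stated here) that the cusp class of a/c depends only on (a, c) mod N and that every residue pair with gcd(a, c, N) = 1 lifts to a coprime pair, so that cusps correspond to orbits of such residue pairs under $(a, c) \mapsto (\pm y^{-1}(a + jc), \pm yc)$. The function name is made up, and the three-argument `pow` with exponent −1 needs Python 3.8 or later.

```python
from math import gcd


def count_cusps_gamma0(N):
    """Count cusps of Gamma_0(N), N >= 2, using the criterion of Proposition 236."""
    units = [y for y in range(1, N) if gcd(y, N) == 1]
    pairs = [(a, c) for a in range(N) for c in range(N)
             if gcd(gcd(a, c), N) == 1]
    seen = set()
    count = 0
    for a, c in pairs:
        if (a, c) in seen:
            continue
        count += 1  # (a, c) starts a new equivalence class
        for y in units:
            y_inv = pow(y, -1, N)
            for j in range(N):
                for sign in (1, -1):
                    # All (a', c') with y*a' = sign*(a + j*c), c' = sign*y*c mod N.
                    seen.add((sign * y_inv * (a + j * c) % N, sign * y * c % N))
    return count


for level in (4, 9, 11, 12):
    print(level, count_cusps_gamma0(level))
```

For prime level the count is 2, which agrees with the value $\varepsilon_\infty = 2$ used implicitly in the $\Gamma_0(11)$ example from the previous lecture.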

This means we have explicit conditions for checking whether two elements of Q ∪ {∞} represent the same cusp, and this allows us to count the number of cusps for each subgroup. We won’t carry out that count here; instead, we’ll discuss these cusps in a group-theoretic manner:

Definition 237
Let G be an arbitrary group with subgroups H1 , H2 . A double coset of G is a subset of the form H1 gH2 , and the
space of double cosets is denoted H1 \G/H2 .

Distinct double cosets are disjoint, and we can therefore decompose G as
\[ G = \bigcup_{g \in R} H_1 g H_2, \]
where R is a set of double coset representatives.

Proposition 238
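Let Γ be a congruence subgroup of SL2(Z). Let P be the parabolic subgroup
\[ P = \left\{ \pm \begin{bmatrix} 1 & j \\ 0 & 1 \end{bmatrix} : j \in \mathbb{Z} \right\}. \]
Then the map between Γ\SL2(Z)/P and cusps of Γ is a bijection: $\Gamma \begin{bmatrix} a & b \\ c & d \end{bmatrix} P$ maps to Γ(a/c).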

(We won’t prove this, but it’s a nice way to study cusps and is more general.)
We’ll finish by reviewing the Eisenstein series: recall that we define these to be
\[ G_k(\tau) = \sum_{(c,d) \in \mathbb{Z}^2 \setminus \{(0,0)\}} \frac{1}{(c\tau + d)^k}, \]

where we sum over all pairs of integers except (0, 0). We also define the normalized Eisenstein series $E_k(\tau) = \frac{G_k(\tau)}{2\zeta(k)}$.
We’ll be rewriting the Eisenstein series in nicer forms now:

Lemma 239
We have
\[ G_k(\tau) = \zeta(k) \sum_{\gcd(c,d)=1} \frac{1}{(c\tau + d)^k}. \]

Proof. Reorganize the order of summation by the gcd (absolute convergence allows us to do this for k ≥ 4):

\[ G_k(\tau) = \sum_{n=1}^{\infty} \sum_{\gcd(c,d)=n} \frac{1}{(c\tau + d)^k} = \sum_{n=1}^{\infty} \frac{1}{n^k} \sum_{\gcd(c,d)=1} \frac{1}{(c\tau + d)^k}, \]

which is what we want.

Lemma 240
Let $P_+$ be the positive part of the parabolic subgroup of SL2(Z) (P, except without the ±). Then
\[ E_k(\tau) = \frac{1}{2} \sum_{\gamma \in P_+ \backslash \mathrm{SL}_2(\mathbb{Z})} j(\gamma, \tau)^{-k}, \]

where j(γ, τ) = cτ + d for γ with bottom row (c, d).

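Proof. Take the previous lemma and divide by 2ζ(k) to find
\[ E_k(\tau) = \frac{1}{2} \sum_{\gcd(c,d)=1} \frac{1}{(c\tau + d)^k}. \]

We know that c, d have no common divisor, so we can index with SL2(Z) as long as we remove duplicates of the same (c, d). And this requires proving that the sets of matrices in SL2(Z) with the same (c, d) are exactly the orbits of $P_+$. One direction is easy: we can check that multiplying an element of SL2(Z) on the left by an element of $P_+$ preserves the (c, d) pair. For the other direction, if we consider both $\begin{bmatrix} a_1 & b_1 \\ c & d \end{bmatrix}$ and $\begin{bmatrix} a_2 & b_2 \\ c & d \end{bmatrix}$, then comparing determinants gives $(a_1 - a_2)d = (b_1 - b_2)c$, so

\[ a_1 = a_2 + \frac{(b_1 - b_2)c}{d}, \]
and indeed $n = \frac{b_1 - b_2}{d}$ (an integer because gcd(c, d) = 1) gives $\begin{bmatrix} a_1 & b_1 \\ c & d \end{bmatrix} = \begin{bmatrix} 1 & n \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a_2 & b_2 \\ c & d \end{bmatrix}$, which means we have a perfect indexing of (c, d) pairs. Substituting everything in gives the result.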

This in particular shows that $E_k(\tau)$ is weakly modular of weight k: when we act with some γ′ ∈ SL2(Z), we can use the cocycle property of the factor of automorphy j(γ, τ), as well as the fact that right multiplication by γ′ permutes the cosets in $P_+ \backslash \mathrm{SL}_2(\mathbb{Z})$.
We found in Serre that
Mk = Sk ⊕ CGk ,

which motivates the following definition in general:

Definition 241
The weight-k Eisenstein space $E_k(\Gamma)$ of a congruence subgroup Γ is the quotient space $M_k(\Gamma)/S_k(\Gamma)$.

We will prove later that this is a subspace of Mk and also complementary to Sk .

Proposition 242
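The dimension of the Eisenstein space $E_k(\Gamma)$ is $\varepsilon_\infty$ when k ≥ 4 is even, $\varepsilon_\infty^{\mathrm{reg}}$ when k ≥ 3 is odd and −I ∉ Γ, $\varepsilon_\infty - 1$ when k = 2, $\varepsilon_\infty^{\mathrm{reg}}/2$ when k = 1 and −I ∉ Γ, 1 when k = 0, and 0 otherwise.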

This is basically just casework from previous lectures and subtracting dimensions.

Diamond and Shurman 4.2 – Andrew Gu


In this lecture, we will generalize the construction from last section to construct multiple Eisenstein series for congruence subgroups Γ, and we’ll get an explicit basis for the Eisenstein space. Throughout this lecture, an overline means reduction mod N, $P_+^N$ denotes $P_+ \cap \Gamma(N)$, k ≥ 3 is an integer, and $\bar{v} \in (\mathbb{Z}/N\mathbb{Z})^2$ is a row vector of order N (in other words, gcd(c, d, N) is 1 for any lift v = (c, d) of $\bar{v}$). Also, let $\varepsilon_N$ be equal to $\frac{1}{2}$ for N = 1 or 2 and be equal to 1 otherwise (this is compensating for −I being in Γ(N)).

Definition 243
For any row vector $\bar{v}$, define the Eisenstein series
\[ E_k^{\bar{v}}(\tau) = \varepsilon_N \sum_{\substack{(c,d) \equiv v \bmod N \\ \gcd(c,d)=1}} (c\tau + d)^{-k}. \]

Proposition 244
Let δ be any element of SL2(Z) where the bottom row of the matrix reduces to $(c_v, d_v)$ mod N. Then we can group matrices by bottom row:
\[ E_k^{\bar{v}}(\tau) = \varepsilon_N \sum_{\gamma \in P_+^N \backslash \Gamma(N)\delta} j(\gamma, \tau)^{-k}, \]

where j(γ, τ) is the usual factor of automorphy cτ + d.

We’ll omit this proof for now. Our goal is to show that this series is an element of Mk (Γ(N)): to do that, we
need to show that it is holomorphic, weight-k invariant, and holomorphic at all cusps. The first property is basically
identical to the N = 1 case (because of uniform convergence, the function is again holomorphic everywhere).
To understand whether the transformation law works correctly, we consider the weight-k operator:

Proposition 245 (Transformation law)


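For any γ ∈ SL2(Z), we have
\[ (E_k^{\bar{v}}[\gamma]_k)(\tau) = E_k^{\overline{v\gamma}}(\tau). \]

In other words, we just end up with another Eisenstein series corresponding to a different row vector.

Proof. This is just a computation: recall that the weight-k operator is defined via

\[ (f[\gamma]_k)(\tau) = j(\gamma, \tau)^{-k} f(\gamma(\tau)), \]

so the left side is equal to

\[ \varepsilon_N \sum_{\gamma' \in P_+^N \backslash \Gamma(N)\delta} j(\gamma, \tau)^{-k}\, j(\gamma', \gamma(\tau))^{-k}, \]

which can be simplified via the cocycle property $j(\gamma'\gamma, \tau) = j(\gamma', \gamma(\tau))\, j(\gamma, \tau)$ to

\[ = \varepsilon_N \sum_{\gamma' \in P_+^N \backslash \Gamma(N)\delta} j(\gamma'\gamma, \tau)^{-k}, \]

and relabeling indices, this is summing $j(\gamma'', \tau)^{-k}$ over $\gamma'' \in P_+^N \backslash \Gamma(N)\delta\gamma$, which is exactly the right side.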

Corollary 246
Ekv is weight-k invariant.

Proof. For any γ ∈ Γ(N), $\overline{v\gamma} = \bar{v}$ (because γ is the identity mod N), so this follows directly from the above result:

\[ (E_k^{\bar{v}}[\gamma]_k)(\tau) = E_k^{\overline{v\gamma}}(\tau) = E_k^{\bar{v}}(\tau), \]

as desired.

It remains to show that the behavior at the cusps is what we want: we’ll start by looking at the cusp ∞. The only terms in the Eisenstein series that do not vanish as Im(τ) → ∞ are those where $(c\tau + d)^{-k}$ does not go to zero, so we must have c = 0 and |d| = 1, and this only occurs when $\bar{v} = \pm\overline{(0, 1)}$. N = 1 and 2 are special cases, because we have both (0, 1) and (0, −1). When k is odd, these contributions cancel, and otherwise we get a total contribution of $\frac{1}{2}\left(1^{-k} + (-1)^{-k}\right) = 1$ from the two $j(\gamma, \tau)^{-k}$ terms. Thus, as Im(τ) → ∞, the limit of $E_k^{\bar{v}}(\tau)$ is only nonzero when $\bar{v} = \pm\overline{(0, 1)}$, and furthermore we must have either N ≥ 3 or k even.

Remark 247. This is part of why the normalization N factor is nice: the values of our normalized Eisenstein series
Ekv are always integers at cusps.

Now note that when k is odd and N ∈ {1, 2}, the Eisenstein space is zero (because −I is in the congruence subgroup). So we’ll assume that we’re working with N ≥ 3 and k even until we do our Fourier coefficient calculation.

Proposition 248
$E_k^{\bar{v}}$ is holomorphic at all cusps.

Proof. Let v = (c, d) be the bottom row of δ ∈ SL2(Z). If $s = \frac{a'}{c'}$ is any cusp in Q ∪ {∞}, let $\alpha = \begin{bmatrix} a' & b' \\ c' & d' \end{bmatrix} \in \mathrm{SL}_2(\mathbb{Z})$ take ∞ to s: then the transformation law in Proposition 245 tells us that

\[ E_k^{\bar{v}}(s) = (E_k^{\bar{v}}[\alpha]_k)(\infty) = E_k^{\overline{v\alpha}}(\infty), \]

where the superscript can be written as the reduction of (0, 1)δα. We know that this last Eisenstein series is only nonzero at ∞ if
\[ (0, 1)\delta\alpha \equiv \pm(0, 1) \bmod N \implies \begin{bmatrix} a' \\ c' \end{bmatrix} \equiv \pm \begin{bmatrix} -d \\ c \end{bmatrix} \bmod N. \]

This means that $E_k^{\bar{v}}$ is only nonvanishing at the cusp (orbit) Γ(N)(−d/c) and is zero everywhere else. Thus the value is always finite, so we do have holomorphicity at the cusps.

Note now that all cusps are regular, so Ek (Γ(N)) has dimension ε∞ (Γ(N)) (since we’re assuming k ≥ 3 and N is
not 1 or 2). Also notice above that each cusp can be associated with an explicit Eisenstein series which is only nonzero
at that cusp, so we can use that set of Eisenstein series to create a basis.

Corollary 249
We can construct an Eisenstein series by summing over cosets: for any congruence subgroup Γ of level N, the sum over coset representatives
\[ E_{k,\Gamma}^{\bar{v}} = \sum_{\gamma_j \in \Gamma(N) \backslash \Gamma} E_k^{\bar{v}}[\gamma_j]_k \]

is an element of $M_k(\Gamma)$.

We’ll now move on to computing Fourier series. $E_k^{\bar{v}}$ is normalized, but we want to instead look at the non-normalized series (where we don’t require gcd(c, d) = 1)

\[ G_k^{\bar{v}}(\tau) = \sum_{(c,d) \equiv v \bmod N} (c\tau + d)^{-k} \]

when calculating these Fourier coefficients. We’ll group our terms according to gcd(c, d), which is always coprime to N by assumption, and this yields
\[ G_k^{\bar{v}}(\tau) = \sum_{\substack{n \ge 1 \\ \gcd(n,N)=1}} \; \sum_{\substack{(c,d) \equiv v \bmod N \\ \gcd(c,d)=n}} (c\tau + d)^{-k}, \]

and now factoring out the gcd term in the inner sum yields
\[ = \sum_{\substack{n \ge 1 \\ \gcd(n,N)=1}} \frac{1}{n^k} \sum_{\substack{(c',d') \equiv n^{-1}v \bmod N \\ \gcd(c',d')=1}} (c'\tau + d')^{-k}. \]

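Thus, we can write
\[ G_k^{\bar{v}}(\tau) = \frac{1}{\varepsilon_N} \sum_{n \in (\mathbb{Z}/N\mathbb{Z})^*} \zeta_+^n(k)\, E_k^{n^{-1}\bar{v}}(\tau), \]

where

\[ \zeta_+^n(k) = \sum_{\substack{m \ge 1 \\ m \equiv n \bmod N}} \frac{1}{m^k} \]

is the modified zeta function for any $n \in (\mathbb{Z}/N\mathbb{Z})^*$. (This reduces to the usual zeta function for N = 1.) We can similarly rewrite the E series in terms of the G series: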
similarly rewrite the E series in terms of the G series:

Proposition 250
We have
\[ E_k^{\bar{v}}(\tau) = \varepsilon_N \sum_{n \in (\mathbb{Z}/N\mathbb{Z})^*} \zeta_+^n(k, \mu)\, G_k^{n^{-1}\bar{v}}(\tau) \]

for the function

\[ \zeta_+^n(k, \mu) = \sum_{\substack{m \ge 1 \\ m \equiv n \bmod N}} \frac{\mu(m)}{m^k}, \]

where μ(m) is 0 for non-squarefree m, and otherwise $(-1)^r$ if exactly r prime numbers divide m.

We prove this with a lot of calculation and a Möbius inversion: one main idea is that $\sum_{d \mid m} \mu(d) = 1$ when m = 1 and 0 otherwise. Now recall that we can choose a collection of vectors that represent each cusp once, giving a basis of $\{E_k^{\bar{v}}\}$s: the result above now tells us that we can get a basis of $\{G_k^{\bar{v}}\}$s as well.
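
The Möbius identity used in this inversion is easy to test numerically. The following Python sketch implements μ by trial division and checks that the divisor sum is 1 for m = 1 and 0 otherwise; both function names are illustrative only.

```python
def mobius(m):
    """Moebius function: 0 if m has a square factor, else (-1)**(number of prime factors)."""
    result = 1
    p = 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0  # p**2 divided the original m
            result = -result
        p += 1
    if m > 1:  # one remaining prime factor
        result = -result
    return result


def mobius_divisor_sum(m):
    """Sum of mu(d) over the positive divisors d of m."""
    return sum(mobius(d) for d in range(1, m + 1) if m % d == 0)


print([mobius_divisor_sum(m) for m in range(1, 13)])
```

This is the same cancellation that removes the non-coprime pairs (c, d) when passing from the G series back to the E series.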
To get a Fourier series, we’ll need the following lemma which was proved in Serre earlier in this class:

Lemma 251
For any k ≥ 2 and τ ∈ H, we have
\[ \sum_{d \in \mathbb{Z}} \frac{1}{(\tau + d)^k} = C_k \sum_{m=1}^{\infty} m^{k-1} q^m, \]

where $q = e^{2\pi i \tau}$ and $C_k = \frac{(-2\pi i)^k}{(k-1)!}$.

This now allows us to compute the Fourier series of $G_k^{\bar{v}}$ (remembering that we’re summing over all nonzero (c, d)) by rewriting as
\[ G_k^{\bar{v}}(\tau) = \sum_{c \equiv c_v} \sum_{d \in \mathbb{Z}} \frac{1}{(c\tau + d_v + Nd)^k} = \frac{1}{N^k} \sum_{c \equiv c_v} \sum_{d \in \mathbb{Z}} \frac{1}{\left( \frac{c\tau + d_v}{N} + d \right)^k}. \]

Notice that this does sum over all $(c, d) \equiv (c_v, d_v)$ mod N, and now we’ll break this up by values of c.

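• c = 0 yields the constant term of the Fourier series (the above lemma will show that all other values of c have no constant term, or alternatively we can think of taking τ to +i∞): if $c_v \equiv 0$ mod N, then this is

\[ \sum_{\substack{d \equiv d_v \bmod N \\ d \ne 0}} \frac{1}{d^k}, \]

and otherwise it is zero.

• For c > 0 (corresponding to points $\frac{c\tau + d_v}{N}$ in the upper half plane), we can use our lemma with $\frac{c\tau + d_v}{N}$ in place of τ. This then evaluates to

\[ \frac{1}{N^k} \sum_{\substack{c \equiv c_v \\ c > 0}} \sum_{d \in \mathbb{Z}} \frac{1}{\left( \frac{c\tau + d_v}{N} + d \right)^k} = \frac{C_k}{N^k} \sum_{\substack{c \equiv c_v \\ c > 0}} \sum_{m=1}^{\infty} m^{k-1} \mu_N^{d_v m} q_N^{cm}, \]
where $\mu_N = e^{2\pi i/N}$ and $q_N = e^{2\pi i \tau/N}$. (The c and $d_v$ in the exponents can be understood if we expand out $e^{2\pi i m(c\tau + d_v)/N}$ as in the lemma.) This can then be rewritten by collecting terms with the same exponent n = cm of $q_N$ as

\[ = \frac{C_k}{N^k} \sum_{n=1}^{\infty} \Bigg( \sum_{\substack{m \mid n,\ m > 0 \\ n/m \equiv c_v}} m^{k-1} \mu_N^{d_v m} \Bigg) q_N^n, \]

and we have a Fourier expansion for the c > 0 terms.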



• On the other hand, for c < 0, we can write $\frac{c\tau + d_v}{N} + d = -\left( \frac{-c\tau - d_v}{N} - d \right)$. The first term in the parentheses, $\frac{-c\tau - d_v}{N}$, now does lie in the upper half-plane, and applying the lemma gives something similar: we have

\[ \frac{1}{N^k} \sum_{\substack{c \equiv c_v \\ c < 0}} \sum_{d \in \mathbb{Z}} \frac{1}{\left( \frac{c\tau + d_v}{N} + d \right)^k} = (-1)^k \frac{C_k}{N^k} \sum_{\substack{c \equiv c_v \\ c < 0}} \sum_{m=1}^{\infty} m^{k-1} \mu_N^{-d_v m} q_N^{-cm}, \]

and we can now rewrite this sum again by collecting the exponent of $q_N$: if we replace m by −m so that n = cm is again positive, m must be negative, and the $(-1)^k$ term combines with the $(-1)^{k-1}$ from the $m^{k-1}$ factor. We end up with
\[ \frac{C_k}{N^k} \sum_{n=1}^{\infty} \Bigg( \sum_{\substack{m \mid n,\ m < 0 \\ n/m \equiv c_v}} -m^{k-1} \mu_N^{d_v m} \Bigg) q_N^n, \]

where notice that we now sum over the negative divisors of n instead of the positive ones.

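Putting all of these together, we get the full Fourier expansion for $G_k^{\bar{v}}$: the $q_N^n$-coefficient of $G_k^{\bar{v}}(\tau)$ for a vector $\bar{v}$ of order N and n ≥ 1 is
\[ \frac{C_k}{N^k}\, \sigma_{k-1}^{\bar{v}}(n), \]
where
\[ \sigma_{k-1}^{\bar{v}}(n) = \sum_{\substack{m \mid n \\ n/m \equiv c_v \bmod N}} \operatorname{sgn}(m)\, m^{k-1} \mu_N^{d_v m}, \]
and the sum runs over both positive and negative divisors m of n.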
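
The generalized divisor sum is straightforward to compute, and doing so gives a sanity check against the level 1 case. The Python sketch below (with a hypothetical function name `sigma_v`) sums over positive and negative divisors m of n with n/m ≡ c_v mod N. For N = 1 and even k it should return twice the classical $\sigma_{k-1}(n)$, and for odd k the positive and negative divisors should cancel.

```python
import cmath


def sigma_v(k, n, c_v, d_v, N):
    """Sum of sgn(m) * m**(k-1) * mu_N**(d_v*m) over divisors m of n (of either sign)
    with n/m congruent to c_v mod N, where mu_N = exp(2*pi*i/N)."""
    mu_N = cmath.exp(2j * cmath.pi / N)
    total = 0
    for m in range(1, n + 1):
        if n % m != 0:
            continue
        for signed_m in (m, -m):
            # n // signed_m is exact because |signed_m| divides n.
            if (n // signed_m - c_v) % N == 0:
                sign = 1 if signed_m > 0 else -1
                total += sign * signed_m ** (k - 1) * mu_N ** (d_v * signed_m)
    return total


# Level 1, weight 4: expect approximately 2 * sigma_3(6) = 2 * 252.
print(sigma_v(4, 6, 0, 0, 1))
```

Because the roots of unity are floating-point values, results are only approximate; comparing with a small tolerance is safer than testing for exact equality.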

As a final note, this Fourier expansion tells us that the coefficients are polynomially-bounded for any Ekv , and thus
we find directly that we have a modular form.

20 April 21, 2020

Diamond and Shurman 4.3 – Andrew Lin


Recall that last week, we introduced the Eisenstein series associated to specific congruence subgroups Γ. We were able to describe Eisenstein series that live in $M_k(\Gamma)$, and we also found that we could describe the “Eisenstein space”
\[ E_k(\Gamma) = M_k(\Gamma)/S_k(\Gamma), \]

computing explicit basis elements of $E_k(\Gamma(N))$. The point of this section is to start giving some background for looking at modular forms for $\Gamma_1(N)$, as well as cover some general number theory concepts.

Throughout this, we will let GN denote the multiplicative group (Z/NZ)∗ : recall that this group has order φ(N),
and it consists of the residues mod N that are relatively prime to N.

Definition 252
A Dirichlet character mod N is a homomorphism χ : GN → C∗ .

Here are a few useful properties we can extract from this definition:
• Because GN is a finite group, all elements have finite order, so χ(n) is a root of unity for all n ∈ GN .
• The set of Dirichlet characters form a group, called the dual group ĜN of GN . Indeed, the law of composition
for two characters χ, ψ is the product character (χψ)(n) = χ(n)ψ(n). This group has an identity element 1N ,
which just sends everything to 1 (known as the trivial character), and the inverse of a character χ is $\chi^{-1} = \bar{\chi}$, which takes the complex conjugate of χ(n) for all n.
• Recall that for prime numbers p, the multiplicative group (Z/pZ)∗ is cyclic. Then the dual group is also cyclic:
each character is determined by which root of unity a primitive root g is sent to.

This last result can be generalized to all integers, not just primes:

Fact 253
The dual group ĜN is isomorphic to GN (and in particular, there are φ(N) Dirichlet characters mod N).

The proof can be found in Serre: the main idea is to induct on the order of the group by considering subgroups of
GN and noting that for any subgroup H of G, we can decompose G as the product of H and G/H (and same for the
dual groups). However, there is no canonical isomorphism in general (for example, we need to pick a primitive root arbitrarily for the prime N case).
The next result generalizes the duality between the two groups:

Proposition 254 (Orthogonality relations)


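We have
\[ \sum_{n \in G_N} \chi(n) = \begin{cases} \varphi(N) & \chi = 1_N \\ 0 & \text{otherwise} \end{cases} \]

and
\[ \sum_{\chi \in \hat{G}_N} \chi(n) = \begin{cases} \varphi(N) & n = 1 \\ 0 & \text{otherwise.} \end{cases} \]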

These may look familiar (for example) from 18.702.

Proof. We’ll just prove the first result; the two proofs are basically the same. If $\chi = 1_N$, then we are just adding up the total number of elements in the group, which is φ(N). Otherwise, there is some element y such that χ(y) ≠ 1. Then summing over the whole group,
\[ \sum_{n \in G_N} \chi(n) = \sum_{n \in G_N} \chi(ny) = \chi(y) \sum_{n \in G_N} \chi(n), \]

which means that

\[ (\chi(y) - 1) \Big( \sum_{n \in G_N} \chi(n) \Big) = 0, \]

and the result follows. We do the same thing for the second relation, except we multiply by a character instead (and use the fact that we still have a group).
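
Both orthogonality relations can be checked numerically in a small case. The sketch below builds all four Dirichlet characters mod 5 from the primitive root 2 (so $\chi_r(2^a) = e^{2\pi i r a/4}$) and evaluates both sums. The modulus 5, the choice of primitive root, and the names `character`, `sum_over_group`, and `sum_over_characters` are assumptions made only for this illustration.

```python
import cmath

p = 5
g = 2  # a primitive root mod 5
dlog = {pow(g, a, p): a for a in range(p - 1)}  # discrete logarithm table


def character(r):
    """The character chi_r mod p sending g**a to exp(2*pi*i*r*a/(p-1))."""
    return lambda n: cmath.exp(2j * cmath.pi * r * dlog[n % p] / (p - 1))


characters = [character(r) for r in range(p - 1)]
units = range(1, p)

# First relation: sum over the group, one value per character.
sum_over_group = [sum(chi(n) for n in units) for chi in characters]
# Second relation: sum over all characters, one value per unit n.
sum_over_characters = {n: sum(chi(n) for chi in characters) for n in units}

print([round(abs(s), 6) for s in sum_over_group])
print({n: round(abs(s), 6) for n, s in sum_over_characters.items()})
```

Only the trivial character (r = 0) and the identity element n = 1 should give sums close to φ(5) = 4, with every other sum close to 0 up to rounding error.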

We’ll now say a bit more about this idea of building up characters from smaller ones: suppose that d divides N,
and we have a Dirichlet character χd mod d. Then we can construct a Dirichlet character mod N in the natural way
by letting
χN (n mod N) = χd (n mod d),

but this is only guaranteed to be consistent because d divides N. We denote this via $\chi_N = \chi_d \circ \pi_{N,d}$, where $\pi_{N,d}$ takes an element of $G_N$ and reduces it mod d.

Definition 255
The conductor of a Dirichlet character χ mod N is the smallest d|N such that we can write χ = χd ◦ πN,d for
some χd .

Because of the simple group structure here, we can equivalently say that the conductor is the smallest d such that
χ is identically equal to 1 for all elements in the normal subgroup

K = {n ∈ GN : n ≡ 1 mod d},

which is the kernel of the projection map.

Example 256
The conductor of the Dirichlet character ψ mod 12 sending 1, 7 to +1 and 5, 11 to −1 is 3.

Definition 257
A Dirichlet character mod N is primitive if its conductor is N.

For example, ψ “only cares about the element mod 3,” so it is not primitive.
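
Using the kernel description of the conductor given above, the value in Example 256 can be checked mechanically. The following Python sketch represents a character mod N as a dictionary on the units (a representation chosen here for illustration) and returns the smallest divisor d of N such that the character is trivial on all units congruent to 1 mod d.

```python
from math import gcd


def conductor(chi, N):
    """Smallest d dividing N such that chi(n) == 1 for every unit n = 1 mod d."""
    units = [n for n in range(1, N + 1) if gcd(n, N) == 1]
    for d in range(1, N + 1):
        # Comparing with 1 % d also handles d = 1, where every unit qualifies.
        if N % d == 0 and all(chi[n] == 1 for n in units if n % d == 1 % d):
            return d


psi = {1: 1, 5: -1, 7: 1, 11: -1}  # the character from Example 256
print(conductor(psi, 12))
```

For ψ, the divisors 1 and 2 fail because ψ(5) = −1, while d = 3 works because the only units mod 12 congruent to 1 mod 3 are 1 and 7.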
We can extend the definition of χ to all integers by turning it into a function χ : Z → C:
\[ \chi(n) = \begin{cases} \chi(n \bmod N) & \gcd(n, N) = 1 \\ 0 & \text{otherwise.} \end{cases} \]

This function is indeed consistent with the original χ we described, and it is still multiplicative, although it is not a homomorphism because 0 is not invertible in C. (Keep in mind that this also means the trivial character $1_N$ is no longer just equal to 1 everywhere, and for example χ(0) is only nonzero when N = 1.) Then we can restate the orthogonality relations in a more helpful form for later use, though they are not really any different:
\[ \sum_{n=0}^{N-1} \chi(n) = \begin{cases} \varphi(N) & \chi = 1_N \\ 0 & \text{otherwise} \end{cases}, \qquad \sum_{\chi \in \hat{G}_N} \chi(n) = \begin{cases} \varphi(N) & n \equiv 1 \bmod N \\ 0 & \text{otherwise.} \end{cases} \]

Definition 258
The Gauss sum of a character χ mod N is

\[ g(\chi) = \sum_{n=0}^{N-1} \chi(n)\, \mu_N^n, \]

where $\mu_N = e^{2\pi i/N}$ is the standard Nth root of unity.

For example, the Gauss sum of ψ above is zero, which does not tell us very much useful information.

Proposition 259
For any primitive character χ mod N and any integer m,

\[ \sum_{n=0}^{N-1} \chi(n)\, \mu_N^{nm} = \bar{\chi}(m)\, g(\chi). \]

Notice the special cases when m = N and m = 1: the sum of χ is zero for a primitive character.

Proof. When gcd(m, N) = 1, this is simple: multiply the left side by $\bar{\chi}(m)\chi(m) = 1$ to get

\[ \bar{\chi}(m) \sum_{n=0}^{N-1} \chi(n)\chi(m)\, \mu_N^{nm} = \bar{\chi}(m) \sum_{n=0}^{N-1} \chi(nm)\, \mu_N^{nm}, \]

and now nm still runs over all residues mod N, so the sum is g(χ), recovering the desired result.
Otherwise, the right hand side is zero because $\bar{\chi}(m) = 0$. Let d = gcd(m, N), and set m′d = m and N′d = N. Now rewrite the sum based on the residue of n mod N′: for any given residue n′ mod N′, the exponents for those corresponding terms look like $\mu_N^{(n' + kN')m}$ for some k. But d | m, so N divides kN′m and the kN′ terms disappear, and we’re just left with an exponent of n′m. Noting that $(\mu_N)^d = \mu_{N'}$, this leaves us with
\[ \sum_{n=0}^{N-1} \chi(n)\, \mu_N^{nm} = \sum_{n'=0}^{N'-1} \Bigg( \sum_{\substack{0 \le n < N \\ n \equiv n' \bmod N'}} \chi(n) \Bigg) \mu_{N'}^{n'm'}. \]

To show the desired result, it suffices to show that each parenthetical term is zero. First of all, consider the sum over the subgroup K of units congruent to 1 mod N′ (that is, when n′ = 1): since χ is primitive and N′ < N, the restriction of χ to this kernel is not trivial, so by a similar “sum-over-group” argument, the sum over K is zero. And now we can sum over any other coset n′K and get zero by multiplicativity (residues n′ not coprime to N′ only contribute zero terms), and we’re done.

Corollary 260
The Gauss sum of a primitive character has magnitude $\sqrt{N}$ (in particular, it is nonzero).

Proof. We know that

\[ g(\chi)\overline{g(\chi)} = g(\chi) \sum_{m=0}^{N-1} \bar{\chi}(m)\, \mu_N^{-m}, \]

and bringing g(χ) in and using the above proposition yields

\[ = \sum_{m=0}^{N-1} \sum_{n=0}^{N-1} \chi(n)\, \mu_N^{nm} \mu_N^{-m} = \sum_{n=0}^{N-1} \chi(n) \sum_{m=0}^{N-1} \mu_N^{(n-1)m}, \]

where we’ve swapped the order of summation. But the inner sum over m (by a familiar argument) is zero unless n − 1 is a multiple of N, which happens exactly when n = 1. And that means the sum just reduces to $\chi(1) \sum_{m=0}^{N-1} 1 = N$, so $|g(\chi)|^2 = N$, as desired.
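
Both the twisted-sum identity of Proposition 259 and the magnitude in Corollary 260 can be checked in floating point. The sketch below stores a character as a dictionary on the units and treats missing keys as 0, matching the extension of χ to all integers. The quadratic character mod 5 (which is primitive) and the non-primitive ψ mod 12 from Example 256 serve as test cases, and the function name `gauss_sum` is chosen here for illustration.

```python
import cmath


def gauss_sum(chi, N, m=1):
    """Sum of chi(n) * mu_N**(n*m) for n = 0..N-1; m = 1 gives the Gauss sum g(chi)."""
    mu_N = cmath.exp(2j * cmath.pi / N)
    return sum(chi.get(n, 0) * mu_N ** (n * m) for n in range(N))


legendre_5 = {1: 1, 2: -1, 3: -1, 4: 1}  # primitive quadratic character mod 5
psi_12 = {1: 1, 5: -1, 7: 1, 11: -1}     # conductor 3, so not primitive mod 12

print(abs(gauss_sum(legendre_5, 5)) ** 2)  # close to N = 5
print(abs(gauss_sum(psi_12, 12)))          # close to 0
```

The second value illustrates why primitivity matters: for the non-primitive character ψ the Gauss sum vanishes, so the nonvanishing conclusion of the corollary fails.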

Proposition 261
If N = 1 or 2, then all Dirichlet characters have χ(−1) = 1. Otherwise, there are an even number of Dirichlet
characters, and half of them have each of χ(−1) = ±1.

Proof. It’s easy to write down the only Dirichlet character for N = 1 and 2: it’s the trivial one. Otherwise, χ(−1)
must either be −1 or 1 (because $\chi(-1)^2 = \chi(1) = 1$), and there does exist a character that takes −1 mod N to −1,
because we can lift a nontrivial character either mod p for an odd prime p|N (sending a primitive root mod p to a
primitive root of unity) or the nontrivial character mod 4 when 4|N (which sends ±1 to ±1). So we have a surjective
homomorphism from the group of characters ĜN to {±1}, and the correspondence theorem yields the result.

This is most of what we’ll need for the upcoming theory of Eisenstein spaces, and we’ll finish with an important idea of decomposition: we’ll construct subspaces of the modular forms $M_k(\Gamma_1(N))$, corresponding to the congruence subgroup of matrices

\[ \gamma \equiv \begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix} \bmod N. \]

Definition 262
The χ-eigenspace of a Dirichlet character χ mod N is

Mk (N, χ) = {f ∈ Mk (Γ1 (N)) : f [γ]k = χ(dγ )f ∀γ ∈ Γ0 (N)},

where dγ is the lower-right entry of γ.

This is clearly a (linear) subspace of Mk (Γ1 (N)).

Lemma 263
Mk (N, 1) = Mk (Γ0 (N)), and Mk (N, χ) is trivial unless χ(−1) = (−1)k .

Proof. For the first statement, note that 1(dγ) = 1 because gcd(dγ, N) = 1, and the space of functions left invariant under [γ]k for all γ ∈ Γ0(N) is exactly Mk(Γ0(N)).
For the second statement, −I is an element of Γ0(N) with lower-right entry −1, so any f ∈ Mk(N, χ) satisfies $f[-I]_k = \chi(-1) f$. But $f[-I]_k = (-1)^k f$ for every f, and the result follows.

Proposition 264
The vector spaces Mk(Γ1(N)), Sk(Γ1(N)), and Ek(Γ1(N)) all decompose as direct sums of χ-eigenspaces: for example,

\[ M_k(\Gamma_1(N)) = \bigoplus_{\chi} M_k(N, \chi), \]

and analogously for Sk and Ek.

Proof. The proofs are the same for the first two spaces, and then quotienting yields the result for the third (since (A ⊕ B)/(C ⊕ D) ≅ A/C ⊕ B/D for spaces C ⊂ A, D ⊂ B). We’ll only present the first proof. Define the weight-k operator

\[ \langle d \rangle = \left[ \begin{pmatrix} a & b \\ c & \delta \end{pmatrix} \right]_k \]

for any matrix in Γ0(N) with δ ≡ d mod N. This is a well-defined, multiplicative operator called a diamond operator, but we need more theory from Chapter 5 to see why this is indeed well-defined. Now for each character χ mod N, we can define

\[ \pi_\chi = \frac{1}{\phi(N)} \sum_{d \in (\mathbb{Z}/N\mathbb{Z})^*} \chi(d)^{-1} \langle d \rangle. \]

This is a linear projection operator: applying πχ twice as a double sum and using multiplicativity produces each term $\chi(d)^{-1}\langle d \rangle$ exactly φ(N) times. To understand where this projection operator sends our modular forms, if we take any f ∈ Mk(Γ1(N)) and γ ∈ Γ0(N), then

\[ (\pi_\chi f)[\gamma]_k = \frac{1}{\phi(N)} \sum_{d \in (\mathbb{Z}/N\mathbb{Z})^*} \chi(d)^{-1} (\langle d \rangle f)[\gamma]_k \]

still sums over all d because the lower-right element dγ of γ is invertible mod N:

\[ = \frac{1}{\phi(N)} \sum_{d \in (\mathbb{Z}/N\mathbb{Z})^*} \chi(d_\gamma)\,\chi(d)^{-1}\chi(d_\gamma)^{-1} \langle d d_\gamma \rangle f = \chi(d_\gamma)\,\pi_\chi f, \]

so we do project into Mk(N, χ) (πχ f satisfies the characteristic condition), and when f is already in the χ-eigenspace, πχ acts as

\[ \pi_\chi f = \frac{1}{\phi(N)} \sum_{d} \chi(d)^{-1} \langle d \rangle f = \frac{1}{\phi(N)} \sum_{d} \chi(d)^{-1}\chi(d) f = f, \]

so we project onto the subspace.


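To finish, note that in the sum over characters

\[ \sum_\chi \pi_\chi = \sum_\chi \frac{1}{\phi(N)} \sum_{d \in (\mathbb{Z}/N\mathbb{Z})^*} \chi(d)^{-1} \langle d \rangle, \]

the coefficient of ⟨d⟩ is zero except when d = 1, in which case we just end up with $\frac{1}{\phi(N)} \cdot \phi(N) \cdot \langle 1 \rangle$, which is the identity operator on Mk(Γ1(N)).
This means the subspaces are spanning. Finally, for any two distinct characters χ, χ′,

\[ \pi_\chi \circ \pi_{\chi'} = \frac{1}{\phi(N)^2} \sum_{d, e \in (\mathbb{Z}/N\mathbb{Z})^*} \chi(d)^{-1} \chi'(e)^{-1} \langle de \rangle \]

by multiplicativity, but collecting the terms with any fixed product c = de (so that $e = c d^{-1}$) yields a coefficient

\[ \chi'(c)^{-1} \sum_{d} \chi(d)^{-1} \chi'(d), \]

which is zero since $\chi^{-1}\chi'$ is not the trivial character. Thus we’ve shown that distinct eigenspaces intersect trivially: no nonzero form lies in two of them, and we’re done.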
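
The two orthogonality facts used in this proof can be checked numerically. The sketch below assumes N = 5, where all four characters can be written down from the primitive root 2; the function names are ours for illustration, not notation from the text.

```python
import cmath

def characters_mod_prime(p, g):
    """All characters of (Z/pZ)^* for a prime p with primitive root g.

    The a-th character sends g^j to exp(2*pi*i*a*j/(p-1)).
    """
    logs = {}
    x = 1
    for j in range(p - 1):
        logs[x] = j
        x = (x * g) % p
    return [
        {d: cmath.exp(2j * cmath.pi * a * j / (p - 1)) for d, j in logs.items()}
        for a in range(p - 1)
    ]

chars = characters_mod_prime(5, 2)

def sum_over_characters(d):
    """sum_chi chi(d)^{-1}: the coefficient of <d> in phi(N) * sum_chi pi_chi."""
    return sum(1 / chi[d] for chi in chars)

def pairing(chi1, chi2):
    """sum_d chi1(d)^{-1} chi2(d), which should vanish when chi1 != chi2."""
    return sum(chi2[d] / chi1[d] for d in chi1)
```

Here `sum_over_characters(1)` should be φ(5) = 4 while the other values of d give 0, matching the spanning step; `pairing` of two different characters should give 0, matching the trivial-intersection step.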

Diamond and Shurman 4.4 – Natalie Stewart


The basic structure of this lecture is to go through a laundry list of functions to define and use: we’ll be discussing the gamma function Γ, the zeta function ζ (and the associated ξ), the L-functions L(s, χ), and the modified zeta functions $\zeta_+^n(s)$. The point is to provide meromorphic extensions to all of C for these functions, and to characterize the poles and important explicit values.

Definition 265
The gamma function is

\[ \Gamma(s) = \int_0^\infty e^{-t}\, t^{s-1}\, dt. \]

This is an “extension of the factorial function”: we have that Γ(n + 1) = n! for nonnegative integers n, and in fact the integral converges whenever Re(s) > 0, because we have exponential decay from $e^{-t}$. The main point is that we have a functional equation for all Re(s) > 0 of the form

\[ \Gamma(s) = \frac{\Gamma(s+n)}{s(s+1)\cdots(s+n-1)}, \]

which follows by iterating the identity Γ(s + 1) = sΓ(s), itself proved via integration by parts (we lower the power of the polynomial term).
The point is that we can use this to extend Γ to a function on all of C, except that we avoid points where we have to divide by zero:

\[ \lim_{s \to -n} (s+n)\Gamma(s) = \lim_{s \to -n} \frac{\Gamma(s+n+1)}{s(s+1)\cdots(s+n-1)} = \frac{(-1)^n}{n!}. \]

So when we meromorphically continue Γ to the domain where Re(s) ≤ 0, we will have a simple pole at each integer −n ∈ Z≤0 with residue $\frac{(-1)^n}{n!}$. There’s also another formula that we can use:

Proposition 266
\[ \Gamma(s)\Gamma(1-s) = \frac{\pi}{\sin(\pi s)}. \]

In particular, this means that Γ(s) is nonvanishing on all of C (we just need to check explicitly at the positive integers, where the right-hand side has poles and where we already know the value of Γ), so 1/Γ is an entire function.
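
Both the reflection formula and the residues of Γ at the nonpositive integers can be spot-checked with Python's `math.gamma`, which accepts negative non-integer arguments. This is a rough numerical sketch only; the step size `eps` and the function names are our own choices.

```python
import math

def reflection_gap(s):
    """Gamma(s) * Gamma(1 - s) - pi / sin(pi * s) for a real non-integer s."""
    return math.gamma(s) * math.gamma(1 - s) - math.pi / math.sin(math.pi * s)

def residue_estimate(n, eps=1e-7):
    """Approximate lim_{s -> -n} (s + n) * Gamma(s) by evaluating at s = -n + eps."""
    return eps * math.gamma(-n + eps)
```

For example, `residue_estimate(2)` should be close to (−1)^2/2! = 1/2, with an error of order `eps`.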

Proof sketch. Break down both sides into limits and products: we use the fact that

\[ \lim_{n \to \infty} \left(1 - \frac{t}{n}\right)^n = e^{-t}, \qquad \sin(\pi s) = \pi s \prod_{m=1}^{\infty} \left(1 - \frac{s^2}{m^2}\right) \]

(the latter is the Euler product, which “tracks the zeros of the sine”). Then breaking down the integrals means that we want to show

\[ I_n(s) = \int_0^n \left(1 - \frac{t}{n}\right)^n t^{s-1}\, dt \]

converges to Γ(s), and these are easier to integrate by parts because we are integrating polynomials. We find that

\[ I_n(s)\, I_n(-s) = \frac{1}{-s^2 \prod_{m=1}^{n} (1 - s^2/m^2)}, \]

and taking n → ∞ will give us what we want.


We can verify that $\Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi}$ (change variables to a Gaussian integral). If we define the constant $C_k = \frac{(-2\pi i)^k}{\Gamma(k)}$, then in general we can compute at half-integers that, for even k,

\[ \frac{\pi^{-(1-k)/2}\, \Gamma((1-k)/2)}{\pi^{-k/2}\, \Gamma(k/2)} = \frac{C_k}{2}. \]

This is shown by directly manipulating products:

\[ \frac{C_k}{2} = \frac{(-2\pi i)^k}{2\Gamma(k)} = \frac{2^k \pi^k (-1)^{k/2}}{2\,(1 \cdot 3 \cdots (k-1))\,(2 \cdot 4 \cdots (k-2))}, \]

and then we distribute factors of −2 into the parentheses on the left and +2 into the parentheses on the right.
(Everything works out – the point is to set up a parity between powers of π and values where Γ is evaluated.)
We’ll now move on to another function:
We’ll now move on to another function:

Definition 267
The Riemann zeta function is defined as

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} \]

for any Re(s) > 1.

The relevant complex analysis argument shows that this is indeed absolutely convergent on this half-plane, and we also have the product formula

\[ \zeta(s) = \prod_{p \text{ prime}} (1 - p^{-s})^{-1} \]

(we can basically multiply each factor for a prime p over to the other side, and that removes all of the terms whose denominators are divisible by p). We’ll show in section 4.9 that if we define the function

\[ \xi(s) = \pi^{-s/2}\, \Gamma(s/2)\, \zeta(s), \]

we have the functional equation ξ(s) = ξ(1 − s), and the only poles of this function are at s = 0, 1. And this tells us that we have a meromorphic continuation of ζ to the entire complex plane, such that the only pole is at s = 1 (with residue 1) and we have simple zeros at s = −2, −4, −6, · · · (and possibly zeros at other points as well).
The main argument we’ll make about these is that we can use our coefficients Ck from earlier to get all the negative integer values of ζ: recall that $\zeta(k) = -\frac{C_k B_k}{2k}$ for even k ≥ 2, where Bk is the kth Bernoulli number, and the functional equation tells us that

\[ \zeta(s) = \frac{\pi^{-(1-s)/2}\, \Gamma((1-s)/2)}{\pi^{-s/2}\, \Gamma(s/2)}\, \zeta(1-s), \]

so taking s = k gives $\zeta(1-k) = \frac{2\zeta(k)}{C_k} = -\frac{B_k}{k}$.
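
To see the constants at work, here is a small Python sketch comparing $-C_k B_k/(2k)$ against a partial sum of the series for ζ(k), and evaluating $2\zeta(k)/C_k$. The Bernoulli values B_2 = 1/6, B_4 = −1/30, B_6 = 1/42 are hard-coded, and the function names and truncation length are our own illustrative choices.

```python
import math

# Standard values of the even-index Bernoulli numbers B_2, B_4, B_6.
BERNOULLI = {2: 1 / 6, 4: -1 / 30, 6: 1 / 42}

def C(k):
    """The constant C_k = (-2*pi*i)^k / Gamma(k) = (-2*pi*i)^k / (k-1)!."""
    return (-2j * math.pi) ** k / math.factorial(k - 1)

def zeta_partial(k, terms=100000):
    """Partial sum of sum_{n>=1} 1/n^k, used only as a numerical reference."""
    return sum(1.0 / n ** k for n in range(1, terms + 1))

def zeta_from_bernoulli(k):
    """zeta(k) = -C_k * B_k / (2k) for even k >= 2."""
    return (-C(k) * BERNOULLI[k] / (2 * k)).real

def zeta_at_negative(k):
    """zeta(1 - k) = 2 * zeta(k) / C_k for even k >= 2."""
    return (2 * zeta_from_bernoulli(k) / C(k)).real
```

With these definitions, `zeta_at_negative(2)` evaluates to approximately −1/12 and `zeta_at_negative(4)` to approximately 1/120.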
We can also generalize this zeta function using the Dirichlet character:

Definition 268
For a Dirichlet character χ mod N, define the L-function

\[ L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} \]

for Re(s) > 1.

A similar product argument shows that

\[ L(s, \chi) = \prod_{p} (1 - \chi(p)\, p^{-s})^{-1}, \]

and this means that, for example,

\[ L(s, 1_N) = \zeta(s) \prod_{p \mid N} (1 - p^{-s}) \]

(remember that 1N is 1 whenever our integer is relatively prime to N and 0 otherwise). So at least for the trivial character 1N, we can meromorphically continue the L-function: L(s, 1N) will have a meromorphic continuation to the entire complex plane, where there is only a simple pole at s = 1 with residue $\prod_{p \mid N} (1 - p^{-1}) = \frac{\phi(N)}{N}$.
But whenever χ is not the character 1N, we actually have better behavior: we can find functional equations for non-principal Dirichlet characters, and the extensions will turn out to be entire. So the key point is that ζ and L(s, χ) are meromorphic with some nice properties.
We’ll quickly talk about the last family of functions here:

Definition 269
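The modified zeta functions are defined via

\[ \zeta_+^n(k) = \sum_{\substack{m=1 \\ m \equiv n \bmod N}}^{\infty} \frac{1}{m^k}, \qquad \zeta^n(k) = \sideset{}{'}\sum_{m \equiv n \bmod N} \frac{1}{m^k} \]

(where we’re allowed to sum over negative m in the second sum). Also, define

\[ \zeta_+^n(k, \mu) = \sum_{\substack{m=1 \\ m \equiv n \bmod N}}^{\infty} \frac{\mu(m)}{m^k}. \]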

We care about these because they showed up in our equations between our Eisenstein series $E_k^v$ and $G_k^v$, and in fact this gives us a relation between $\zeta_+^n$ and $\zeta_+^n(k, \mu)$. We can write our modified zeta functions as a sum of L-functions: we have that

\[ \frac{1}{\phi(N)} \sum_{\chi \in \hat{G}_N} \chi(n^{-1})\, L(s, \chi) = \zeta_+^n(s) \]

by using the orthogonality relations, and therefore $\zeta_+^n$ also has a meromorphic continuation to the complex plane, which again only has a simple pole at 1. This gives us the continuation of $\zeta_+^n(s, \mu)$ by defining linear relations between vectors of $E_k^v$ and $G_k^v$ and using a fair bit of linear algebra: the punchline is that we can write $\zeta_+^n(s, \mu)$ in terms of L-functions, so we can meromorphically continue it as well. Finally,

\[ \zeta^n(s) = \zeta_+^n(s) + (-1)^{-s}\, \zeta_+^{-n}(s), \]

so we can find a meromorphic extension, and in fact $\zeta^n$ is entire (we just need to check that there isn’t a pole at s = 1). One value that we will care about later on is

\[ \zeta^n(1) = \frac{\pi i}{N} + \frac{\pi}{N} \cot\frac{\pi n}{N}. \]

21 April 30, 2020

Diamond and Shurman 4.5 – Nikhil Reddy


Recall that we defined the (original) Eisenstein series Gk(τ) and Ek(τ) by summing $\frac{1}{(c\tau+d)^k}$ over all pairs of integers (c, d) (except for (0, 0)), and we can also extract out a factor of ζ(k) by only considering coprime (c, d) to define the normalized Eisenstein series. We checked that both of these converge absolutely when k ≥ 3, and we’ll assume that throughout this section. (Notice that Gk and Ek are both identically zero in this definition whenever k is odd, so these are really only defined for even k.)
Our next idea was to define the alternate series

\[ E_k^v(\tau) = \varepsilon_N \sum_{\substack{(c,d) \equiv v \bmod N \\ \gcd(c,d) = 1}} \frac{1}{(c\tau+d)^k}, \qquad G_k^v(\tau) = \sum_{(c,d) \equiv v \bmod N} \frac{1}{(c\tau+d)^k} \]

for a vector v of additive order N. (Remember that the overline means we reduce mod N.) These play nicely with each other as well, and notice that taking N = 1 gives us the original Ek and Gk.

We also proved that the $G_k^v$ are linear combinations of the $E_k^v$, given by

\[ G_k^v(\tau) = \frac{1}{\varepsilon_N} \sum_{n \in (\mathbb{Z}/N\mathbb{Z})^*} \zeta_+^n(k)\, E_k^{n^{-1}v}(\tau), \]

where $\zeta_+^n(k)$ is the modified Riemann zeta function. We’ll see a return of the Dirichlet characters as well: these are the homomorphisms GN → C∗, and the important idea is that a character has a conductor (which is the periodicity as a function Z → C): a primitive character mod N is one with conductor N. (And remember that an overline over a character is a conjugate, not a reduction mod N.) Recall that when χ is primitive, we have

\[ \sum_{n=0}^{N-1} \chi(n)\,\mu_N^{nm} = \bar{\chi}(m)\, g(\chi), \]

and therefore the Gauss sum is nonzero for a primitive character. This is helpful for us when we decompose the space

\[ M_k(\Gamma_1(N)) = \bigoplus_{\chi} M_k(N, \chi) \]

into χ-eigenspaces: we’re going to be studying those eigenspaces in this lecture.
We’ll keep doing a bit more review so the calculations make sense: last week, we showed the weight-k operator satisfies

\[ (E_k^v[\gamma]_k)(\tau) = E_k^{v\gamma}(\tau). \]

Since the $G_k^v$ are linear combinations of multiple $E_k^v$, the same equation holds for them as well:

\[ (G_k^v[\gamma]_k)(\tau) = G_k^{v\gamma}(\tau). \]

We’ll use this throughout the lecture.

Proposition 270
The Eisenstein series $G_k^{(0,d)}$ for any $d \in (\mathbb{Z}/N\mathbb{Z})^*$ lives in Mk(Γ1(N)).

Proof. Any element γ ∈ Γ1(N) acts as the identity on vectors v = (0, d), so the above equation tells us that

\[ (G_k^v[\gamma]_k)(\tau) = G_k^v(\tau), \]

which is the characteristic property for a modular form in Mk(Γ1(N)).

Our goal from here will be to construct a basis of the Eisenstein space by using pairs of primitive characters. Our first step will be to try to build an element of Mk(Γ0(N)) from our elements $G_k^{(0,d)}$:

Proposition 271
The Eisenstein series

\[ \sum_{d \in (\mathbb{Z}/N\mathbb{Z})^*} G_k^{(0,d)} \]

is an element of Mk(Γ0(N)).

Proof. Notice that applying the operator [γ]k for γ ∈ Γ0(N) means we’re just summing over the group in a different order (using ddγ instead of d), so we have the same sum again when we apply that operator.

Proposition 272
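The Eisenstein series

\[ \sum_{d \in (\mathbb{Z}/N\mathbb{Z})^*} \bar{\chi}(d)\, G_k^{(0,d)} \]

is an element of the χ-eigenspace Mk(N, χ).

Proof. We’re still bringing the dγ into the vector, so the result under the weight-k operator is

\[ \sum_{d \in (\mathbb{Z}/N\mathbb{Z})^*} \bar{\chi}(d)\, G_k^{(0,dd_\gamma)}. \]

Multiplying in a factor of $\chi(d_\gamma)\bar{\chi}(d_\gamma) = 1$ gives us a χ(dγ) that comes out, and the rest is “summing over the group” again. Then we end up with

\[ \chi(d_\gamma) \sum_{d \in (\mathbb{Z}/N\mathbb{Z})^*} \bar{\chi}(dd_\gamma)\, G_k^{(0,dd_\gamma)} = \chi(d_\gamma) \sum_{d \in (\mathbb{Z}/N\mathbb{Z})^*} \bar{\chi}(d)\, G_k^{(0,d)}, \]

which is the defining condition of the χ-eigenspace.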

Now we can do the full construction as follows: start with two Dirichlet characters ψ mod u and φ mod v, such that
• uv = N,

• The product character satisfies (ψφ)(−1) = (−1)^k, and

• φ is primitive.
Then the definition we’re using is

\[ G_k^{\psi,\varphi}(\tau) = \sum_{c=0}^{u-1} \sum_{d=0}^{v-1} \sum_{e=0}^{u-1} \psi(c)\,\bar{\varphi}(d)\, G_k^{(cv,\, d+ev)}(\tau). \]

This is secretly pretty familiar: if we plug in u = 1, the sums over c and e disappear, and we’re just left with the series we just worked with in the φ-eigenspace in the proposition above. (And we define $G_k^v$ to be zero when v is not of order N.)
Let’s prove this is in the ψφ-eigenspace: notice that

\[ (cv,\, d+ev)\gamma = (c'v,\, d'+e'v), \]

where we change the coordinates via c′ = caγ, d′ = ddγ, e′ = (e + c′bγ)dγ, so we might as well sum over c′, d′, e′ instead. So the action under [γ]k yields

\[ (G_k^{\psi,\varphi}[\gamma]_k)(\tau) = \sum_{c'=0}^{u-1} \sum_{d'=0}^{v-1} \sum_{e'=0}^{u-1} \psi(c)\,\bar{\varphi}(d)\, G_k^{(c'v,\, d'+e'v)}(\tau), \]

and now we just need to change the c and d in the Dirichlet characters: we use the fact that

\[ \psi(c)\,\bar{\varphi}(d) = \bar{\psi}(a_\gamma)\,\psi(c')\,\varphi(d_\gamma)\,\bar{\varphi}(d'), \]

and because aγdγ ≡ 1 mod N, we can write $\bar{\psi}(a_\gamma) = \psi(d_\gamma)$, so this just means we end up with

\[ G_k^{\psi,\varphi}[\gamma]_k = (\psi\varphi)(d_\gamma)\, G_k^{\psi,\varphi}. \]

This gives us our first result:

Theorem 273
The Eisenstein series $G_k^{\psi,\varphi}$ lives in Mk(N, ψφ).

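Remember that we have a Fourier series for $G_k^v$, where the non-constant part looks like

\[ \frac{C_k}{N^k} \sum_{\substack{mn > 0 \\ n \equiv c_v \bmod N}} \operatorname{sgn}(m)\, m^{k-1}\, \mu_N^{d_v m}\, q_N^{nm}. \]

We can plug this into the formula for our Eisenstein series $G_k^{\psi,\varphi}$ to get a quadruple sum

\[ = \frac{C_k}{N^k} \sum_{c=0}^{u-1} \sum_{d=0}^{v-1} \sum_{e=0}^{u-1} \psi(c)\,\bar{\varphi}(d) \sum_{\substack{mn > 0 \\ n \equiv cv \bmod N}} \operatorname{sgn}(m)\, m^{k-1}\, \mu_N^{(d+ev)m}\, q_N^{mn}, \]

which we can group by summing roots of unity, using that uv = N:

\[ = \frac{C_k}{N^k} \sum_{c=0}^{u-1} \sum_{d=0}^{v-1} \psi(c)\,\bar{\varphi}(d) \sum_{\substack{mn > 0 \\ n \equiv cv \bmod N}} \operatorname{sgn}(m)\, m^{k-1}\, \mu_N^{dm} \left( \sum_{e=0}^{u-1} \mu_u^{em} \right) q_N^{mn}. \]

But the inner sum of roots of unity is zero unless u divides m (in which case it equals u), and also n is a multiple of v by definition. So we can get rid of one of the summations by replacing m → um and n → vn to get

\[ = \frac{C_k}{v^k} \sum_{c=0}^{u-1} \sum_{d=0}^{v-1} \psi(c)\,\bar{\varphi}(d) \sum_{\substack{mn > 0 \\ n \equiv c \bmod u}} \operatorname{sgn}(m)\, m^{k-1}\, \mu_v^{dm}\, q^{mn}, \]

and now because φ is primitive by assumption (so $\bar{\varphi}$ is as well), we can bring the $\bar{\varphi}(d)$ into the inner sum and use our proposition about Gauss sums: now we have

\[ = \frac{C_k\, g(\bar{\varphi})}{v^k} \sum_{c=0}^{u-1} \psi(c) \sum_{\substack{mn > 0 \\ n \equiv c \bmod u}} \operatorname{sgn}(m)\, \varphi(m)\, m^{k-1}\, q^{mn}, \]

and now the sum over c can be absorbed by summing over all n instead to get

\[ = \frac{C_k\, g(\bar{\varphi})}{v^k} \sum_{mn > 0} \operatorname{sgn}(m)\, \psi(n)\, \varphi(m)\, m^{k-1}\, q^{mn}, \]

and now the parity condition (ψφ)(−1) = (−1)^k tells us that the (m, n) and (−m, −n) terms are identical: we end up with

\[ = \frac{C_k\, g(\bar{\varphi})}{v^k} \cdot 2 \sum_{m, n > 0} \psi(n)\, \varphi(m)\, m^{k-1}\, q^{mn}, \]

and we can rewrite this in a way that looks like the σk−1 coefficients in the original Eisenstein series:

\[ = \frac{C_k\, g(\bar{\varphi})}{v^k} \cdot 2 \sum_{n=1}^{\infty} \Bigg( \sum_{\substack{m \mid n \\ m > 0}} \psi(n/m)\, \varphi(m)\, m^{k-1} \Bigg) q^n. \]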

We can then derive the following formula:

Theorem 274
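We can rewrite the Eisenstein series as

\[ G_k^{\psi,\varphi} = \frac{C_k\, g(\bar{\varphi})}{v^k}\, E_k^{\psi,\varphi}, \]

where

\[ E_k^{\psi,\varphi} = \delta(\psi)\, L(1-k, \varphi) + 2 \sum_{n=1}^{\infty} \sigma_{k-1}^{\psi,\varphi}(n)\, q^n, \]

where δ(ψ) is 1 when the character is identically 1 and 0 otherwise, and we have the generalized power sum

\[ \sigma_{k-1}^{\psi,\varphi}(n) = \sum_{\substack{m \mid n \\ m > 0}} \psi(n/m)\, \varphi(m)\, m^{k-1}. \]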
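
The generalized power sum is easy to compute directly from its definition. In the Python sketch below, characters are passed as plain functions on the integers; `trivial` (the character mod 1) and `chi4` (the nontrivial character mod 4) are illustrative choices of ours, not examples taken from the text.

```python
def sigma_twisted(n, k, psi, phi):
    """Generalized power sum: sum over positive m | n of psi(n/m) * phi(m) * m^(k-1)."""
    return sum(
        psi(n // m) * phi(m) * m ** (k - 1)
        for m in range(1, n + 1)
        if n % m == 0
    )

def trivial(n):
    """The trivial character mod 1, identically 1."""
    return 1

def chi4(n):
    """The nontrivial Dirichlet character mod 4 (an odd character)."""
    return {0: 0, 1: 1, 2: 0, 3: -1}[n % 4]
```

With both characters trivial this reduces to the usual divisor sum: `sigma_twisted(6, 4, trivial, trivial)` equals 1 + 8 + 27 + 216 = 252. The pair (`trivial`, `chi4`) with k = 3 satisfies the parity condition, since chi4(−1) = −1 = (−1)^3.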

So now we’ll move to constructing the basis elements:

Definition 275
Let AN,k be the set of triples (ψ, φ, t) such that ψ, φ are primitive characters mod u and v with (ψφ)(−1) = (−1)^k, and t is a positive integer such that tuv | N. Let BN,k be the set of pairs (ψ′, φ′) such that ψ′ and φ′ are (not necessarily primitive) characters modulo u′, v′ with u′v′ = N and (ψ′φ′)(−1) = (−1)^k.

These sets are in bijection, because we can send

\[ (\psi, \varphi, t) \mapsto (\psi_{tu},\, \varphi_{N/(tu)}) \]

(lifting ψ to modulus tu and φ to modulus N/(tu)) and

\[ (\psi', \varphi') \mapsto (\psi, \varphi, u'/u), \]

where ψ, φ are the primitive characters underlying ψ′ and φ′, with conductors u and v (the v′/v is implied because we can determine it from the value of N). This is important because BN,k is easier to work with – we don’t need to worry about primitive characters – and to first order, the size of BN,k is

\[ \frac{1}{2} \sum_{d \mid N} \phi(d)\, \phi(N/d), \]

and this seems to be also the dimension of Ek(Γ1(N)). This turns out to be true for all N ≥ 4, and thus it makes sense that we can use the set AN,k to make a basis (using our $G_k^{\psi,\varphi}$ functions):

Theorem 276
Let N be a positive integer and k ≥ 3. Define $E_k^{\psi,\varphi,t}(\tau) = E_k^{\psi,\varphi}(t\tau)$: then the set of $E_k^{\psi,\varphi,t}$ with (ψ, φ, t) ∈ AN,k forms a basis of Ek(Γ1(N)).
In particular, the set of $E_k^{\psi,\varphi,t}$ with ψφ = χ also gives us a basis for the χ-eigenspace Ek(N, χ).

In particular, choosing χ = 1 also gives us a basis of the Eisenstein space for Γ0(N).

Diamond and Shurman 4.6 – Christian Altamirano
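In this section, we’ll talk about the Eisenstein spaces for k = 2. We’ll begin by constructing a basis for E2(Γ(N)): the Weierstrass ℘ function

\[ \wp(z) = \frac{1}{z^2} + \sideset{}{'}\sum_{\omega \in \Lambda} \left( \frac{1}{(z-\omega)^2} - \frac{1}{\omega^2} \right) \]

helps us define the function (where we use the lattice Z + Zτ)

\[ f_2^v(\tau) = \frac{1}{N^2}\, \wp_\tau\!\left( \frac{c_v \tau + d_v}{N} \right), \]

which is a weakly modular function of weight 2 with respect to Γ(N). We’ll try and expand that now: this yields

\[ f_2^v(\tau) = \frac{1}{(c_v\tau + d_v)^2} + \frac{1}{N^2} \sideset{}{'}\sum_{(c,d) \in \mathbb{Z}^2} \left( \frac{1}{\left( \frac{c_v\tau + d_v}{N} - c\tau - d \right)^2} - \frac{1}{(c\tau + d)^2} \right). \]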

Our goal is to compute the Fourier coefficients here. One formula that we’ll use is the Fourier series of the following
sum:

Proposition 277
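For any τ ∈ H and k ≥ 2, we have

\[ \sum_{d \in \mathbb{Z}} \frac{1}{(\tau + d)^k} = C_k \sum_{m=1}^{\infty} m^{k-1} q^m, \]

and similarly if −τ ∈ H, we have

\[ \sum_{d \in \mathbb{Z}} \frac{1}{(\tau + d)^k} = -C_k \sum_{m=-\infty}^{-1} m^{k-1} q^m. \]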
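
The first formula can be tested numerically by truncating both sides. The Python sketch below assumes k = 4 and a sample point τ = 0.3 + 0.8i in the upper half-plane; the cutoffs and function names are arbitrary choices of ours, and the agreement is only up to truncation error.

```python
import cmath
import math

def lattice_side(tau, k, cutoff=2000):
    """Truncation of sum_{d in Z} 1 / (tau + d)^k to |d| <= cutoff."""
    return sum(1 / (tau + d) ** k for d in range(-cutoff, cutoff + 1))

def fourier_side(tau, k, terms=60):
    """Truncation of C_k * sum_{m >= 1} m^(k-1) q^m with q = exp(2*pi*i*tau)."""
    C_k = (-2j * math.pi) ** k / math.factorial(k - 1)
    q = cmath.exp(2j * math.pi * tau)
    return C_k * sum(m ** (k - 1) * q ** m for m in range(1, terms + 1))
```

For k = 4 the neglected tail of the lattice sum is of order 1/cutoff^3, so the two sides should agree to many digits at `tau = 0.3 + 0.8j`.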

We’ll use the same notation as in section 4.2, since there will be a lot of computation.

Theorem 278
We have

\[ f_2^v(\tau) = G_2^v(\tau) - \frac{1}{N^2}\, G_2(\tau), \]

where G2(τ) is the regular Eisenstein series of weight 2 with Fourier expansion $2\zeta(2) + 2C_2 \sum_{n=1}^{\infty} \sigma(n) q^n$ and

\[ G_2^v(\tau) = \delta(c_v)\, \zeta^{d_v}(2) + \frac{C_2}{N^2} \sum_{n=1}^{\infty} \sigma_1^v(n)\, q_N^n. \]

Proof. We’ll split the sum for $f_2^v(\tau)$ into three parts: c = 0, c > 0, and c < 0. When c = 0, we’ll also add the constant term $\frac{1}{(c_v\tau + d_v)^2}$. Then we end up with a sum

\[ \frac{1}{N^2} \sum_{d \in \mathbb{Z}} \frac{1}{\left( \frac{c_v\tau + d_v}{N} - d \right)^2} - \frac{1}{N^2} \sideset{}{'}\sum_{d \in \mathbb{Z}} \frac{1}{d^2}. \]

The right sum is just $\frac{1}{N^2}\, 2\zeta(2)$, and the left sum depends on cv: if cv = 0, we just get $\zeta^{d_v}(2)$, and otherwise the imaginary part of cvτ + dv is positive (because we take cv between 0 and N), so we can use the Fourier expansion in the proposition above using (cvτ + dv)/N instead of τ. This yields

\[ \frac{1}{N^2}\, C_2 \sum_{m=1}^{\infty} m\, (q')^m, \]

where $q' = e^{2\pi i (c_v\tau + d_v)/N}$. Then q′ is simply related to qN, so we can plug that in: this means that the c = 0 contribution is just

\[ \delta(c_v)\, \zeta^{d_v}(2) + (1 - \delta(c_v))\, \frac{C_2}{N^2} \sum_{m=1}^{\infty} m\, \mu_N^{d_v m}\, q_N^{c_v m} - \frac{1}{N^2}\, 2\zeta(2). \]

For the c > 0 case, we want to figure out

\[ \frac{1}{N^2} \sum_{c > 0} \sum_{d \in \mathbb{Z}} \left( \frac{1}{\left( \frac{c_v\tau + d_v}{N} - c\tau - d \right)^2} - \frac{1}{(c\tau + d)^2} \right). \]

We can apply the proposition for the second term again because Im(cτ) > 0, and we end up with

\[ \frac{1}{N^2} \sum_{c > 0} C_2 \sum_{m=1}^{\infty} m\, (q')^m, \]

where $q' = e^{2\pi i c\tau} = q^c$. Reordering the sum by summing over n = mc instead, this becomes

\[ = \frac{C_2}{N^2} \sum_{n=1}^{\infty} \sigma(n)\, q^n. \]

For the first term of this sum, we have a negative imaginary part, so we expand using the second formula of the proposition. The eventual result is that the c > 0 contribution looks like

\[ \frac{C_2}{N^2} \sum_{n=1}^{\infty} \Bigg( \sum_{\substack{m < 0 \\ m \mid n \\ n/m \equiv c_v \bmod N}} \operatorname{sgn}(m)\, m\, \mu_N^{d_v m} \Bigg) q_N^n - \frac{C_2}{N^2} \sum_{n=1}^{\infty} \sigma(n)\, q^n. \]

And the c < 0 case is similar – the right sum is the same and we need to be careful about the sign of the imaginary part.
Summing everything together, we do end up with the result that we wanted – for instance, things cancel out between c = 0 and c < 0.

We’ll now move on to the “correction terms” for these series: remember that the Eisenstein series G2(τ) is only weight-2 invariant when we add the correction term $-\frac{\pi}{\operatorname{Im}(\tau)}$. So something similar occurs here when we correct our $G_2^v$ expression, because

\[ f_2^v(\tau) = G_2^v(\tau) - \frac{1}{N^2}\, G_2(\tau) \]

is weight-2 invariant.

Definition 279
Define the weight-2 modular form

\[ g_2^v(\tau) = G_2^v(\tau) - \frac{\pi}{N^2 \operatorname{Im}(\tau)}. \]

To show holomorphicity everywhere, we can instead use the alternative method of proving the coefficients to be polynomially bounded:

Theorem 280
The coefficients of $g_2^v$ are bounded by $Cn^2$ for some constant C.

Proof. We can make an easy bound: all coefficients just look like

\[ |\sigma_1^v(n)| \le \sum_{\substack{m \mid n \\ n/m \equiv c_v \bmod N}} \left| \operatorname{sgn}(m)\, m\, \mu_N^{d_v m} \right| \le \sum_{|m| \le n} |m| \le Cn^2. \]

So we do indeed have a modular form in our above definition.


With this, we’ll move on to calculating a basis for E2(Γ(N)) and E2(Γ1(N)): remember that the dimension of the Eisenstein space for k = 2 is ε∞ − 1, so we can get a basis by taking the cusp representatives and taking differences to remove one basis element.
It turns out that E2(Γ(N)) has basis

\[ \{ g_2^{v_1} - g_2^{v_2},\ \cdots,\ g_2^{v_{\varepsilon_\infty - 1}} - g_2^{v_{\varepsilon_\infty}} \}. \]

We can also change these differences so that we use linear combinations over G2v instead of g2v , as long as the
corrections match up:

Theorem 281
The space E2(Γ(N)) can also be written as

\[ E_2(\Gamma(N)) = \left\{ \sum_v a_v G_2^v : \sum_v a_v = 0 \right\}. \]

For E2(Γ1(N)), we’ll define the Eisenstein series $G_2^{\psi,\varphi}(\tau)$ and $E_2^{\psi,\varphi}(\tau)$ analogously to how we did in the previous lecture for $G_k^{\psi,\varphi}(\tau)$ and $E_k^{\psi,\varphi}(\tau)$.
Whenever ψ or φ is nontrivial here, the coefficients of the $G_2^v$ in $G_2^{\psi,\varphi}$ sum to zero, and also note that $G_2^{\psi,\varphi} \in M_2(N, \psi\varphi)$ and that

\[ G_2^{\psi,\varphi}(\tau) = \frac{C_2\, g(\bar{\varphi})}{v^2}\, E_2^{\psi,\varphi}(\tau). \]

But when ψ, φ are both trivial, we won’t get a modular form because of the correction term. But we can still use that case to get a modular form:

Lemma 282
For an integer t, the function

\[ C_2 \left( E_2^{1_1,1_1}(\tau) - t E_2^{1_1,1_1}(t\tau) \right) = G_{2,t}(\tau) = G_2(\tau) - t G_2(t\tau) \]

is a modular form.

Proof. It suffices to show that

\[ C_2\, E_2^{1_1,1_1}(\tau) = G_2(\tau) \]

for any τ, and this is true because plugging in the trivial characters into the definition of E2 yields

\[ L(-1, 1_1) + 2 \sum_{n=1}^{\infty} \sigma(n) q^n = \zeta(-1) + 2 \sum_{n=1}^{\infty} \sigma(n) q^n, \]

and G2(τ)/C2 is also of this form (we just need to check that 2ζ(2) = C2 ζ(−1) so that the constants match up).

This means that we can finally define a basis for E2(Γ1(N)): much like the last section, let AN,2 be the set of triples (ψ, φ, t) such that ψ, φ are primitive characters mod u and v with (ψφ)(−1) = 1, and t is an integer such that 1 < tuv | N. (We don’t want t = u = v = 1 here, so we’re excluding the triple corresponding to the correction in the above lemma.)

Definition 283
For any (ψ, φ, t) ∈ AN,2, define

\[ E_2^{\psi,\varphi,t}(\tau) = \begin{cases} E_2^{1_1,1_1}(\tau) - t E_2^{1_1,1_1}(t\tau) & \psi = \varphi = 1_1, \\ E_2^{\psi,\varphi}(t\tau) & \text{otherwise.} \end{cases} \]

And this set turns out to be the basis for E2 (Γ1 (N)): in fact, the subsets

\[ \{ E_2^{\psi,\varphi,t} : (\psi, \varphi, t) \in A_{N,2},\ \psi\varphi = \chi \} \]

will form bases for the χ-eigenspaces E2 (N, χ).

22 May 5, 2020

Diamond and Shurman 4.7 – Michael Tang


We’ll be discussing the Bernoulli numbers and Hurwitz zeta function – this is mostly preparation for the next lecture.
This will give a few formulas that are important about weight-1 Eisenstein series.
As a roadmap, the first part of this talk will be about Bernoulli numbers (derived from the context of power sums), Bernoulli polynomials, and the Bernoulli numbers of a Dirichlet character. Then in the second part, we will do some
sketchy complex analysis to generalize both ζ and the modified ζ function from previous lectures, and we’ll relate them
to Bernoulli numbers with a formula.

Definition 284
The kth power sum is defined as

\[ S_k(n) = \sum_{m=0}^{n-1} m^k = 0^k + 1^k + \cdots + (n-1)^k, \]

where we define $0^0 = 1$ and $0^k = 0$ otherwise.

We’re zero-indexing here so that computation later will be a bit more convenient.

Definition 285
The generating function for the kth power sum is the exponential generating function

\[ S(n, t) = \sum_{k=0}^{\infty} S_k(n)\, \frac{t^k}{k!}. \]

(Here, we should think of t as our variable and n as a parameter.)

Lemma 286
We have

\[ S(n, t) = \frac{e^{nt} - 1}{e^t - 1}. \]

Proof. We have

\[ S(n, t) = \sum_{k=0}^{\infty} S_k(n)\, \frac{t^k}{k!} = \sum_{k=0}^{\infty} \sum_{m=0}^{n-1} m^k\, \frac{t^k}{k!}, \]

and swapping the order of summation turns this into

\[ \sum_{m=0}^{n-1} \sum_{k=0}^{\infty} \frac{(mt)^k}{k!}, \]

and now the inner sum is the Taylor series for $e^{mt}$. Evaluating the subsequent geometric series yields the result.

We’ll split this fraction up as

\[ S(n, t) = \frac{e^{nt} - 1}{t} \cdot \frac{t}{e^t - 1}. \]

This is because both factors here have a limit as t approaches 0, and we can expand both of these more easily.

Definition 287
The Bernoulli numbers are the coefficients of the exponential generating function

\[ \frac{t}{e^t - 1} = \sum_{k=0}^{\infty} B_k\, \frac{t^k}{k!}. \]

Substituting this in,

\[ S(n, t) = \frac{e^{nt} - 1}{t} \sum_{j=0}^{\infty} B_j\, \frac{t^j}{j!}, \]

and now we substitute in the Taylor series for $e^{nt}$, reparameterize the exponent of t after simplification, and we end up with

\[ = \sum_{k=0}^{\infty} \left( \frac{1}{k+1} \sum_{j=0}^{k} \binom{k+1}{j} B_j\, n^{k+1-j} \right) \frac{t^k}{k!}. \]

But remember that S(n, t) is a generating function, so the coefficients must match up: this expression involving Bernoulli numbers must give us the power sum Sk(n). We’ll take a part of this expression:

Definition 288
The Bernoulli polynomials are defined as

\[ B_k(X) = \sum_{j=0}^{k} \binom{k}{j} B_j\, X^{k-j}. \]

(Notation-wise, Bk denotes the number, and Bk(X) denotes the polynomial.) So we can rewrite

\[ S_k(n) = \frac{1}{k+1} \sum_{j=0}^{k} \binom{k+1}{j} B_j\, n^{k+1-j} = \frac{1}{k+1} \left( B_{k+1}(n) - B_{k+1} \right) \]

(where we need to subtract off the last term of the sum in the definition of the Bernoulli polynomial because of the limits).

Example 289
We can look at the case k = 2.

Then we know that B0 = 1, B1 = −1/2, B2 = 1/6, and B3 = 0 by looking at the Taylor expansion of $\frac{t}{e^t - 1}$, so

\[ B_3(X) - B_3 = B_0 X^3 + 3 B_1 X^2 + 3 B_2 X = X^3 - \frac{3}{2} X^2 + \frac{1}{2} X. \]

Substituting in X = n + 1 and dividing by 3 means that we have a formula for the sum of the first n squares – this naturally generalizes to finding a formula for the sum of kth powers.
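
The whole chain (Bernoulli numbers, Bernoulli polynomials, power sums) can be reproduced with exact rational arithmetic. The Python sketch below generates the Bk from the recurrence $\sum_{j=0}^{k} \binom{k+1}{j} B_j = 0$ for k ≥ 1, which is the formula for Sk(1) = 0 read backwards; the function names are ours for illustration.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] with B_1 = -1/2, via the S_k(1) = 0 recurrence."""
    B = [Fraction(1)]
    for k in range(1, count):
        B.append(-sum(comb(k + 1, j) * B[j] for j in range(k)) / (k + 1))
    return B

def bernoulli_poly(k, x, B):
    """Evaluate B_k(x) = sum_j C(k, j) * B_j * x^(k-j) for a Fraction x."""
    return sum(comb(k, j) * B[j] * x ** (k - j) for j in range(k + 1))

def power_sum(k, n):
    """S_k(n) = 0^k + 1^k + ... + (n-1)^k, computed as (B_{k+1}(n) - B_{k+1}) / (k+1)."""
    B = bernoulli_numbers(k + 2)
    return (bernoulli_poly(k + 1, Fraction(n), B) - B[k + 1]) / (k + 1)
```

For instance, `power_sum(2, 5)` gives 0 + 1 + 4 + 9 + 16 = 30.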

Definition 290
Let ψ : Z/uZ → C be a function (which doesn’t need to be a Dirichlet character, but that’s the case we care about).
The Bernoulli numbers of ψ, Bk,ψ, are the constants such that

\[ \sum_{k=0}^{\infty} B_{k,\psi}\, \frac{t^k}{k!} = \sum_{c=0}^{u-1} \psi(c)\, \frac{t e^{ct}}{e^{ut} - 1}. \]

This may seem a bit unintuitive, but it has nice properties. If we plug in the trivial character ψ = 1_1 (so u = 1), then Bk,ψ = Bk because the right hand side simplifies to $\frac{t}{e^t - 1}$. And in fact, we can solve for Bk,ψ in terms of the Bernoulli polynomials Bk(X):

\[ B_{k,\psi} = u^{k-1} \sum_{c=0}^{u-1} \psi(c)\, B_k\!\left( \frac{c}{u} \right). \]

This is proved by swapping the order of summation again (plug in the generating function for the Bernoulli polynomials Bk(c/u) on the right-hand side and do some more algebra). We’ll need the k = 1 case in the next lecture, which says that

\[ B_{1,\psi} = \sum_{c=0}^{u-1} \psi(c) \left( \frac{c}{u} - \frac{1}{2} \right). \]
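
The formula for Bk,ψ in terms of Bernoulli polynomials is straightforward to evaluate exactly. The sketch below is self-contained; `chi4`, the nontrivial character mod 4, is our own illustrative choice of ψ, and all function names are assumptions rather than notation from the text.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] with the convention B_1 = -1/2."""
    B = [Fraction(1)]
    for k in range(1, count):
        B.append(-sum(comb(k + 1, j) * B[j] for j in range(k)) / (k + 1))
    return B

def bernoulli_poly(k, x, B):
    """Evaluate B_k(x) = sum_j C(k, j) * B_j * x^(k-j) for a Fraction x."""
    return sum(comb(k, j) * B[j] * x ** (k - j) for j in range(k + 1))

def generalized_bernoulli(k, psi, u):
    """B_{k,psi} = u^(k-1) * sum_{c=0}^{u-1} psi(c) * B_k(c/u)."""
    B = bernoulli_numbers(k + 1)
    total = sum(psi(c) * bernoulli_poly(k, Fraction(c, u), B) for c in range(u))
    return Fraction(u) ** (k - 1) * total

def chi4(c):
    """The nontrivial Dirichlet character mod 4 (illustrative choice)."""
    return {0: 0, 1: 1, 2: 0, 3: -1}[c % 4]
```

For `chi4` the k = 1 formula gives (1/4 − 1/2) − (3/4 − 1/2) = −1/2, and with the trivial character mod 1 the function returns the ordinary Bk.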
Let’s move on:

Definition 291
Let r ∈ (0, 1]. The Hurwitz zeta function is defined for all Re(s) > 1 by

\[ \zeta(s, r) = \sum_{n=0}^{\infty} \frac{1}{(r+n)^s}. \]

This converges absolutely for Re(s) > 1 by the same argument as for ζ. Notice that when we plug in r = 1, this yields the usual Riemann zeta function (notice that the Hurwitz zeta function starts the sum from 0 instead of 1), and when we plug in a rational number r = d/N, we have $\zeta(s, r) = N^s \zeta_+^d(s)$ (both of these can be directly checked).
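
The relation at rational r holds term by term, since $(d/N + n)^{-s} = N^s (d + nN)^{-s}$, so it can be checked on partial sums. A minimal Python sketch, with truncation lengths and function names chosen by us:

```python
def hurwitz_partial(s, r, terms):
    """Partial sum of the Hurwitz zeta function: sum_{n=0}^{terms-1} (r + n)^(-s)."""
    return sum(1 / (r + n) ** s for n in range(terms))

def zeta_plus_partial(s, d, N, terms):
    """Partial sum of the modified zeta function over m = d, d+N, d+2N, ... (1 <= d <= N)."""
    return sum(1 / (d + n * N) ** s for n in range(terms))
```

Taking r = 1 in `hurwitz_partial` reproduces the partial sums of the Riemann zeta series, which is the other special case mentioned above.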

Theorem 292
Let ψ ≠ 1 be a Dirichlet character mod u. Then the L-series satisfies

\[ L(1-k, \psi) = \sum_{n=1}^{\infty} \frac{\psi(n)}{n^{1-k}} = -\frac{B_{k,\psi}}{k}. \]

In other words, we will be able to evaluate the L-function for Dirichlet characters at non-positive integers.

Proof sketch. First, we can relate the L-function to the Hurwitz zeta function:

\[ \sum_{c=1}^{u} \psi(c)\, \zeta\!\left( 1-k, \frac{c}{u} \right) = u^{1-k} \sum_{c=1}^{u} \psi(c)\, \zeta_+^c(1-k) = u^{1-k} L(1-k, \psi), \]

where we’ve used the formula for ζ(s, r) at rational r. We’ll work more with the left hand side here: lots of complex analysis gives us the analytic continuation

\[ \zeta(s, r) = -\frac{\Gamma(1-s)}{2\pi i} \int_{\gamma_\varepsilon} \frac{z e^{rz}}{e^z - 1}\, z^{s-1}\, \frac{dz}{z} \]

for all s ∈ C, where γε (for ε ∈ R) is a contour which goes from −∞ to −ε, does a counterclockwise circle, and goes back to −∞. Showing this requires the use of the Mellin transform, which we’ll talk about later on.
Substituting this into our above formula yields

\[ u^{1-k} L(1-k, \psi) = -\frac{\Gamma(k)}{2\pi i} \int_{\gamma_\varepsilon} \sum_{c=1}^{u} \psi(c)\, \frac{z e^{(c/u) z}}{e^z - 1}\, \frac{dz}{z^{k+1}}. \]

Then if we apply the Cauchy integral formula to the integral around the circle, we indeed end up with the desired result (the integrals along the line segments in γε cancel out).

So we now have a formula for our L-series on the nonpositive integers, and again we’ll care about the k = 1 case: this reduces to
L(0, ψ) = −B1,ψ.

What’s interesting here is that the series L(0, ψ) is technically supposed to diverge, but we assign a value using this analytic continuation.

Diamond and Shurman 4.8 – Zack Chroman


The goal of this section is to continue to find concrete bases, but this time for the weight-1 eigenspaces E1 (N, χ) and
E1 (Γ1 (N)). We’ll mimic the k ≥ 3 case as much as possible, but this is the most ugly case. Our approach will be to
define a function that’s almost the Weierstrass ℘ function, which will help us define a weight-1 modular form g1 . This
will then extend to a function G1 which will allow us to define the basis elements G1ψ,ϕ and E1ψ,ϕ .

Definition 293
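The Weierstrass zeta function is defined as

\[ Z_\Lambda(z) = \frac{1}{z} + \sideset{}{'}\sum_{\omega \in \Lambda} \left( \frac{1}{z - \omega} + \frac{1}{\omega} + \frac{z}{\omega^2} \right). \]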

The proof that this converges is similar to the proof that the Weierstrass ℘ function converges, and it turns out that we have

\[ Z_\Lambda'(z) = -\wp(z). \]

This function will not be doubly periodic like ℘ is, but its derivative is doubly periodic (as already established), so we get the lattice constants

\[ \eta_1(\Lambda) = Z_\Lambda(z + \omega_1) - Z_\Lambda(z), \qquad \eta_2(\Lambda) = Z_\Lambda(z + \omega_2) - Z_\Lambda(z) \]

(these do not depend on z because their derivative is 0 – this is the same argument as usual).

Lemma 294
Let ω1, ω2 generate our lattice Λ. Then we have

\[ \eta_2(\Lambda)\,\omega_1 - \eta_1(\Lambda)\,\omega_2 = 2\pi i \]

when ω1/ω2 ∈ H.

Proof. The proof mimics the way we understand the Weierstrass ℘ function: we have a fundamental parallelogram P with boundary ∂P, and we can translate this by t so that there are no poles on the boundary. The residue formula tells us that

\[ \int_{t + \partial P} Z_\Lambda(z)\, dz = 2\pi i \]

because ZΛ has a single simple pole with residue 1 inside, but we can also break this up into parts as

\[ \int_0^{\omega_1} \left( Z_\Lambda(z + \omega_2 + t) - Z_\Lambda(z + t) \right) dz + \int_0^{\omega_2} \left( Z_\Lambda(z + t) - Z_\Lambda(z + \omega_1 + t) \right) dz \]

(one term for each pair of opposite sides of the parallelogram, traversed counterclockwise), and this is exactly η2(Λ)ω1 − η1(Λ)ω2.

We’re going to black box some facts: it turns out that when Λ = Λτ is generated by τ and 1, we have

\[ Z_{\Lambda_\tau}(z) = \eta_2(\Lambda_\tau)\, z - \pi i\, \frac{1 + e^{2\pi i z}}{1 - e^{2\pi i z}} - 2\pi i \sum_{n=1}^{\infty} \left( \frac{e^{2\pi i z} q^n}{1 - e^{2\pi i z} q^n} - \frac{e^{-2\pi i z} q^n}{1 - e^{-2\pi i z} q^n} \right), \]

where $q = e^{2\pi i \tau}$. We also have that η2(Λτ) = G2(τ), and therefore η1(Λτ) = τG2(τ) − 2πi by the previous lemma.
Now we can define our functions g1: let Λ be generated by ω1 and ω2. Fix an integer N and a vector v = (cv, dv) ∈ (Z/NZ)^2, and we define

\[ F_1^v(\Lambda) = Z_\Lambda\!\left( \frac{c_v \omega_1 + d_v \omega_2}{N} \right) - \frac{c_v \eta_1(\Lambda) + d_v \eta_2(\Lambda)}{N}. \]

We can check that the contributions cancel out if we add N to cv or dv, so this is well-defined. When we specialize to the case where Λ is generated by τ and 1, we also define

\[ g_1^v(\tau) = \frac{1}{N}\, Z_{\Lambda_\tau}\!\left( \frac{c_v \tau + d_v}{N} \right) - \frac{c_v \eta_1(\Lambda_\tau) + d_v \eta_2(\Lambda_\tau)}{N^2}. \]

Lemma 295
g1 is weight-1 invariant.

We’ll also black box this fact. Our short-term goal with this is to plug in the formula for ZΛτ (z) into this formula

122
cv τ +dv
here for g1 to extract more information. We’ll define z = N , q = e 2πiτ , qN = e 2πiτ /N , and µN = e 2πi/N as we
have in the past few lectures.
We’ll go step-by-step from here when we plug in the complicated formula for ZΛτ . It turns out that the first and
last terms cancel out nicely:
η2 (Λτ )(z) cv η1 (Λτ ) − dv η2 (Λτ ) 2πi cv
− = .
N N2 N2
We’ll now deal with the
\[ -\frac{\pi i}{N}\,\frac{1 + e^{2\pi i z}}{1 - e^{2\pi i z}} \]
term. In the case where c_v = 0, we just have z = d_v/N, and this simplifies to (π/N) cot(πd_v/N). Otherwise, we’ll expand as a geometric series to get
\[ \frac{1 + e^{2\pi i z}}{1 - e^{2\pi i z}} = 1 + 2\sum_{m=1}^{\infty} e^{2\pi i m (c_v\tau + d_v)/N} = 1 + 2\sum_{m=1}^{\infty} q_N^{c_v m}\mu_N^{d_v m}, \]
meaning we can just write
\[ -\frac{\pi i}{N}\,\frac{1 + e^{2\pi i z}}{1 - e^{2\pi i z}} = \delta(c_v)\,\frac{\pi}{N}\cot\frac{\pi d_v}{N} + (1 - \delta(c_v))\left(-\frac{\pi i}{N} + \frac{C_1}{N}\sum_{m=1}^{\infty} q_N^{c_v m}\mu_N^{d_v m}\right) \]

where δ(c_v) is the indicator function for c_v being zero, and C_1 = −2πi (to draw an analogy with the previous cases).
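
As a sanity check on this step, here is a minimal numerical sketch (not part of the lecture) that compares −πi(1 + e^{2πiz})/(1 − e^{2πiz}) with π cot(πz), and compares the closed form (1 + e^{2πiz})/(1 − e^{2πiz}) with the truncated geometric series. The function names and the sample point z are our own choices; the series check assumes Im(z) > 0 so that |e^{2πiz}| < 1.

```python
import cmath
import math

def cot_term(z):
    """Return -pi*i*(1 + e^{2 pi i z}) / (1 - e^{2 pi i z})."""
    w = cmath.exp(2j * math.pi * z)
    return -1j * math.pi * (1 + w) / (1 - w)

def geometric_side(z, terms=80):
    """Return 1 + 2*sum_{m=1}^{terms} e^{2 pi i m z}; needs Im(z) > 0 to converge."""
    w = cmath.exp(2j * math.pi * z)
    return 1 + 2 * sum(w ** m for m in range(1, terms + 1))

# Sample point in the upper half-plane (an arbitrary choice for illustration).
z = 0.3 + 0.2j
w = cmath.exp(2j * math.pi * z)
closed_form = (1 + w) / (1 - w)
cot_value = math.pi * cmath.cos(math.pi * z) / cmath.sin(math.pi * z)

print(abs(cot_term(z) - cot_value))            # difference of the two cotangent forms
print(abs(geometric_side(z) - closed_form))    # truncation error of the series
```

Both printed differences should be at the level of floating-point rounding.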
Finally, we need to deal with the other sums. First, we deal with
\[ \sum_{n=1}^{\infty} \frac{e^{-2\pi i z} q^n}{1 - e^{-2\pi i z} q^n}. \]

We can expand out the geometric series, replace q with q_N^N, and replace z with (c_v τ + d_v)/N, and we rearrange in the same way as the last sum to get to
\[ \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} e^{-2\pi i m d_v/N}\, q_N^{nmN - m c_v}. \]

The q_N exponent is the important part, so we can reparameterize as
\[ = \sum_{k=1}^{\infty}\ \sum_{\substack{\ell<0,\ \ell\mid k \\ k/\ell\equiv c_v}} \mu_N^{\ell d_v}\, q_N^{k} \]

where k = nmN − mc_v and ℓ = −m.


Now, we need to deal with
\[ \sum_{n=1}^{\infty} \frac{e^{2\pi i z} q^n}{1 - e^{2\pi i z} q^n}, \]
and we get to the same kind of intermediate statement
\[ \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} e^{2\pi i m d_v/N}\, q_N^{nmN + m c_v}. \]

But this time, we overcount some extra terms, so we need to correct that: it only happens for the n = 0 term, which doesn’t appear in our original sum. This yields
\[ = \left(\sum_{k=1}^{\infty}\ \sum_{\substack{m>0,\ m\mid k \\ k/m\equiv c_v}} \mu_N^{m d_v}\, q_N^{k}\right) - (1 - \delta(c_v))\sum_{m=1}^{\infty} \mu_N^{m d_v}\, q_N^{c_v m}. \]

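And now we can write out all of our terms together: putting together the δ(c_v) terms between all of our calculations will make things simplify further, and we end up with
\[ g_1^v(\tau) = \frac{2\pi i\, c_v}{N^2} + \delta(c_v)\,\zeta^{d_v}(1) + \frac{C_1}{N}\sum_{k=0}^{\infty}\sigma_0^v(k)\, q_N^k = G_1^v(\tau) - \frac{C_1}{N}\left(\frac{c_v}{N} - \frac{1}{2}\right), \]
where
\[ \sigma_0^v(k) = \sum_{\substack{m\mid k \\ k/m\equiv c_v}} \mu_N^{m d_v} \]
and
\[ G_1^v(\tau) = \delta(c_v)\,\zeta^{d_v}(1) + \frac{C_1}{N}\sum_{k=1}^{\infty}\sigma_0^v(k)\, q_N^k. \]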

So we’ve now defined g1 and G1 , and because the Fourier coefficients for g1 are bounded, this is actually a modular
form with respect to Γ(N).
From here, we can define the series with respect to Dirichlet characters as well: if uv = N, ψ is a primitive character mod u, and φ is a character mod v, we can define
\[ G_1^{\psi,\varphi}(\tau) = \sum_{c=0}^{u-1}\sum_{d=0}^{v-1}\sum_{e=0}^{u-1} \psi(c)\,\varphi(d)\, g_1^{(cv,\,d+ev)}(\tau) \]
and we’ll define
\[ E_1^{\psi,\varphi}(\tau) = \delta(\varphi)L(0,\psi) + \delta(\psi)L(0,\varphi) + 2\sum_{n=1}^{\infty}\sigma_0^{\psi,\varphi}(n)\, q^n. \]

We’re claiming that these objects are actually the same up to a constant:
\[ G_1^{\psi,\varphi} = \frac{C_1\, g(\varphi)}{v}\, E_1^{\psi,\varphi}, \]
where g(φ) is the Gauss sum. In fact, we will establish a result similar to that of previous classes:

Theorem 296
Let A_{N,1} be the set of triples ({ψ, φ}, t) (unordered pairs here) such that tuv | N, (ψφ)(−1) = −1, and ψ, φ are both primitive (mod u and v, respectively). Define E_1^{ψ,φ,t}(τ) = E_1^{ψ,φ}(tτ). Then {E_1^{ψ,φ,t} : ({ψ, φ}, t) ∈ A_{N,1}} forms a basis for the Eisenstein space of Γ_1(N), and for any character χ,

{E_1^{ψ,φ,t} : ({ψ, φ}, t) ∈ A_{N,1}, ψφ = χ}

forms a basis for the χ-eigenspace.

Proof. Everything except the constant terms actually works out exactly like our earlier proofs, because G_1 and E_1 are the same as the k ≥ 3 definitions except for the constants. But finding this constant is not very easy. We’ll break down the constant term for ∑ g_1 into the sums for G_1 (which behaves like G_k for larger k) and (g_1 − G_1). Summing over c, d, e means we can simplify the (g_1 − G_1) term to
\[ -\frac{C_1}{N}\sum_{c=0}^{u-1}\psi(c)\left(\frac{cv}{N} - \frac{1}{2}\right)\sum_{d=0}^{v-1}\varphi(d)\sum_{e=0}^{u-1} 1. \]

But now the inner two sums are zero unless we have φ being the trivial character 1, in which case v = 1 and this all simplifies to
\[ -C_1\,\delta(\varphi)\sum_{c=0}^{N-1}\psi(c)\left(\frac{cv}{N} - \frac{1}{2}\right) = -\frac{C_1\, g(\varphi)}{v}\,\delta(\varphi)\,B_{1,\psi} = \frac{C_1\, g(\varphi)}{v}\,\delta(\varphi)\,L(0,\psi) \]
by using properties from Michael’s lecture.
The idea here is that when φ = 1, we can multiply by things like g(φ) and v which are actually just 1, and then we can simplify to a form that we want.
On the other hand, the triple sum constant term has already been computed: it’s going to be
\[ \psi(0)\sum_{d=0}^{v-1}\varphi(d)\,\zeta^{d}(1). \]

This is similarly zero except when ψ is the 1 character, and we’ll write this as a limit as s approaches 1 of
\[ \sum_{d=0}^{v-1}\varphi(d)\,\zeta^{d}(s) = \sum_{d}\ {\sum_{n\equiv d \bmod v}}'\varphi(n)\,n^{-s} = (1 - (-1)^{-s})\,L(s,\varphi) = 2\,\delta(\psi)\,L(1,\varphi). \]

Note that we have a functional equation for the L-function, so L(1, φ) can be written in terms of L(0, φ) (in general, we can write L(k, φ) in terms of L(1 − k, φ)). So we end up with a total constant term
\[ \frac{C_1\, g(\varphi)}{v}\bigl(\delta(\varphi)L(0,\psi) + \delta(\psi)L(0,\varphi)\bigr). \]

And this is exactly the constant scaling that we wanted when we defined E_1^{ψ,φ}.

23 May 7, 2020

Diamond and Shurman 4.9 – Dhruv Rohatgi


We’ll be discussing the Fourier and Mellin transforms, and there won’t really be very much number theory in this lecture. The idea is that we’ve claimed a lot of holomorphic and meromorphic functions can be analytically continued: we proved this using a functional equation for the Gamma function, but we didn’t really do it rigorously for the zeta function. So we’ll be discussing the ξ function, defined as

\[ \xi(s) = \pi^{-s/2}\,\Gamma(s/2)\,\zeta(s), \]

and we’ll show that ξ has a meromorphic continuation to all of C, which will show that ζ does as well.
Recall that the (ℓ-dimensional) theta function
\[ \theta(\tau,\ell) = \sum_{n\in\mathbb{Z}^\ell} e^{\pi i |n|^2\tau} \]

is holomorphic on H, because the series converges uniformly on every compact subset of H (each of which is bounded away from the real axis). We’ll use Poisson summation to prove a functional equation, which will allow us to construct and understand ξ.

Theorem 297
For all t > 0 and integer ℓ > 0, we have some symmetry along the imaginary axis:
\[ \theta(it,\ell) = \sum_{n\in\mathbb{Z}^\ell} e^{-\pi|n|^2 t} = t^{-\ell/2}\sum_{n\in\mathbb{Z}^\ell} e^{-\pi|n|^2/t} = t^{-\ell/2}\,\theta(i/t,\ell). \]

We’ll get a functional equation

\[ \theta(-1/\tau,\ell) = (-i\tau)^{\ell/2}\,\theta(\tau,\ell) \]

as a corollary of this result.
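
Before proving it, we can at least test the symmetry numerically. The following is a minimal sketch, not part of the lecture: it uses the observation that the sum over Z^ℓ factors as the ℓ-th power of the one-dimensional sum, and it truncates that sum at a cutoff we chose ourselves.

```python
import math

def theta(t, ell=1, cutoff=60):
    """theta(i t, ell) for real t > 0, computed as (sum_{|n| <= cutoff} exp(-pi n^2 t))^ell."""
    one_dim = sum(math.exp(-math.pi * n * n * t) for n in range(-cutoff, cutoff + 1))
    return one_dim ** ell

# Compare theta(i t, ell) with t^{-ell/2} * theta(i / t, ell) at a few sample points.
for ell in (1, 2, 3):
    for t in (0.3, 1.0, 1.7):
        lhs = theta(t, ell)
        rhs = t ** (-ell / 2) * theta(1 / t, ell)
        print(ell, t, lhs, rhs)
```

The two printed columns should agree to floating-point precision for every pair (ℓ, t).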


In order to prove this functional equation, we’ll need to introduce the Fourier transform:

Definition 298
The Fourier transform of an absolutely integrable f : R^ℓ → C is defined to be
\[ \hat f(y) = \int_{\mathbb{R}^\ell} f(x)\, e^{-2\pi i\langle x, y\rangle}\,dx. \]

Proposition 299
For “sufficiently nice” f : R^ℓ → C, we have
\[ \sum_{n\in\mathbb{Z}^\ell} f(n) = \sum_{m\in\mathbb{Z}^\ell} \hat f(m). \]

Lemma 300
The Fourier transform of a Gaussian is another Gaussian with a scaling factor: if f(x) = e^{−πt|x|^2} is a function from R^ℓ → R, then
\[ \hat f(y) = t^{-\ell/2}\, e^{-\pi|y|^2/t}. \]

Proof. This is a “contour shifting” proof:
\[ \hat f(y) = \int_{\mathbb{R}^\ell} e^{-\pi t|x|^2}\, e^{-2\pi i\langle x, y\rangle}\,dx = e^{-\pi|y|^2/t}\int_{\mathbb{R}^\ell} e^{-\pi t|x + iy/t|^2}\,dx. \]

But integrating over a shifted axis is the same as the regular integral of a Gaussian (by Cauchy’s theorem), and we can evaluate it to get the desired result.
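
For a quick numerical confirmation in dimension ℓ = 1, here is a small sketch (our own, not from the lecture) that approximates the Fourier integral by a Riemann sum on a truncated interval. The half-width and step size are arbitrary choices that happen to be more than accurate enough for a Gaussian.

```python
import cmath
import math

def gaussian_hat(y, t, half_width=10.0, step=0.001):
    """Approximate the 1-D Fourier transform of exp(-pi t x^2) at y by a Riemann sum."""
    count = int(round(half_width / step))
    total = 0j
    for j in range(-count, count + 1):
        x = j * step
        total += math.exp(-math.pi * t * x * x) * cmath.exp(-2j * math.pi * x * y)
    return total * step

t, y = 0.8, 0.6
expected = t ** (-0.5) * math.exp(-math.pi * y * y / t)  # the value claimed in Lemma 300
print(gaussian_hat(y, t), expected)
```

The imaginary part of the numerical value should be negligible, and the real part should match the lemma.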

Applying this lemma and the Poisson summation formula gives us exactly what we want to prove the above theorem.

Definition 301
Let f : R_+ → C be a function. Then the Mellin transform of f is
\[ g(s) = \int_0^\infty f(t)\, t^{s-1}\,dt. \]

This is nice because it has an inverse (and other nice properties), but we mostly care about it for motivation.
Notably, if we pick f(t) = (1/2)(θ(it) − 1), then
\[ g(s) = \frac{1}{2}\int_0^\infty (\theta(it) - 1)\, t^{s-1}\,dt. \]

Lemma 302
When Re(s) > 1/2, g(s) is well-defined, and

\[ g(s) = \pi^{-s}\,\Gamma(s)\,\zeta(2s) = \xi(2s). \]

Proof. We check that we don’t have “problem points” at t → 0 (since θ(it) might explode) or t → ∞ (since t^{s−1} might explode). Break up the integral into a part from 0 to 1 and a part from 1 to ∞. The latter decays exponentially because |θ(it) − 1| ≤ 2∑_{n=1}^∞ e^{−πn^2 t}, and to understand the former, use the functional equation to find that

\[ \theta(it) = t^{-1/2}\,\theta(i/t), \]

so θ(i/t) goes to 1 as t → 0, meaning θ(it) is asymptotically t^{−1/2}. This means that when we multiply it by t^{s−1} for any Re(s) > 1/2, this integral does indeed converge.
To show that this is related to the ξ function, we can write out the definition of the theta function:
\[ g(s) = \frac{1}{2}\int_0^\infty (\theta(it) - 1)\, t^{s-1}\,dt = \int_0^\infty \sum_{n=1}^{\infty} e^{-\pi n^2 t}\, t^{s-1}\,dt, \]

where we’ve summed over positive integers instead of all integers and removed the factor of 2. We can then use the dominated convergence theorem to swap the sum and integral to get
\[ = \sum_{n=1}^{\infty}\int_0^\infty e^{-\pi n^2 t}\, t^{s-1}\,dt. \]

Substituting u = πn^2 t, the integral becomes independent of n, and we end up with
\[ = \sum_{n=1}^{\infty} (\pi n^2)^{-s}\int_0^\infty e^{-u}\, u^{s-1}\,du, \]

which miraculously simplifies to π^{−s}ζ(2s)Γ(s), as desired.

So we’ve shown that we have another expression for ξ on the domain where it’s defined, and now we can show that g extends meromorphically to C, which is easier than working with ξ directly. We again split the integral into its two pieces, keeping the overall factor of 1/2 from the definition of g:
\[ g(s) = \frac{1}{2}\left(\int_0^1 (\theta(it) - 1)\, t^{s-1}\,dt + \int_1^\infty (\theta(it) - 1)\, t^{s-1}\,dt\right). \]

The second integral is already holomorphic, so we don’t need to worry about it. The first integral is essentially going to look asymptotically like
\[ \approx \int_0^1 (t^{-1/2} - 1)\, t^{s-1}\,dt = -\frac{1}{1/2 - s} - \frac{1}{s}, \]
and notice that the function on the right side is indeed meromorphic everywhere on C. So we can analyze in terms of an “error term”: whenever Re(s) > 1/2, we can first integrate the polynomial term
\[ \int_0^1 (\theta(it) - 1)\, t^{s-1}\,dt = \int_0^1 \theta(it)\, t^{s-1}\,dt - \frac{1}{s}, \]
and then do a u-substitution t = 1/u, together with θ(i/u) = u^{1/2}θ(iu), to get
\[ g(s) = \frac{1}{2}\left(\int_1^\infty (\theta(it) - 1)\,(t^{-s-1/2} + t^{s-1})\,dt - \frac{1}{s} - \frac{1}{1/2 - s}\right). \]

But this extra term is meromorphic, and that’s the same thing as saying that g (and therefore ξ) continues meromorphically. And notice that there’s a symmetry in this expression: g(s) = g(1/2 − s), and since g(s) = ξ(2s), this means that ξ(s) = ξ(1 − s). So we’ve shown that ξ extends meromorphically to all of C, with only simple poles at 0 and 1, with a simple functional equation. This allows us to understand the poles of ζ pretty clearly as well (the only one is at s = 1).
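
To see the continued expression in action, here is a rough numerical sketch (ours, not from the lecture). It evaluates (1/2)(∫_1^∞ (θ(it) − 1)(t^{−s−1/2} + t^{s−1}) dt − 1/s − 1/(1/2 − s)), with the factor 1/2 coming from f(t) = (1/2)(θ(it) − 1), using Simpson’s rule on a truncated interval, and compares it with π^{−s}Γ(s)ζ(2s) at s = 1 and s = 2, where ζ(2) = π²/6 and ζ(4) = π⁴/90 are known. The truncation point and panel count are assumptions chosen for illustration.

```python
import math

def theta_minus_one(t, cutoff=12):
    """theta(i t) - 1 = 2 * sum_{n >= 1} exp(-pi n^2 t), truncated; intended for t >= 1."""
    return 2.0 * sum(math.exp(-math.pi * n * n * t) for n in range(1, cutoff + 1))

def g_continued(s, upper=14.0, panels=4000):
    """Evaluate the continued formula for g(s) with Simpson's rule on [1, upper]."""
    def integrand(t):
        return theta_minus_one(t) * (t ** (-s - 0.5) + t ** (s - 1.0))
    h = (upper - 1.0) / panels          # panels must be even for Simpson's rule
    total = integrand(1.0) + integrand(upper)
    for j in range(1, panels):
        total += (4 if j % 2 else 2) * integrand(1.0 + j * h)
    integral = total * h / 3.0
    return 0.5 * (integral - 1.0 / s - 1.0 / (0.5 - s))

print(g_continued(1.0), math.pi / 6)          # pi^{-1} * Gamma(1) * zeta(2)
print(g_continued(2.0), math.pi ** 2 / 90)    # pi^{-2} * Gamma(2) * zeta(4)
print(g_continued(0.3), g_continued(0.2))     # symmetry g(s) = g(1/2 - s)
```

The same function can be evaluated at points such as s = 0.3 where the original integral diverges, which is the whole point of the continuation.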

Diamond and Shurman 4.10 – Swapnil Garg


In this lecture, we’ll meromorphically extend the Eisenstein series using the Mellin transform we just introduced. The
idea here is that in Serre, we discussed the Eisenstein series for SL2 (Z) for all even weight k ≥ 4, and similarly we do
this for k ≥ 3 for congruence subgroups (where the weight doesn’t necessarily have to be even). Eisenstein series also
exist for k = 1, 2, but they’re harder to deal with because they don’t converge. There’s a lot more bashing that we
have to do, but we’ll look at these series directly by adding a complex parameter s. We will then analytically continue
our series to s = 0, which is theoretically what our series should look like.

Definition 303
Let v be a vector in (Z/NZ)^2 of order N, let k be a positive integer, and let ε_N be defined as usual (1/2 for N = 1, 2 and 1 otherwise). Let (c_v, d_v) be a lift of v to Z^2. Then for any τ = x + iy ∈ H and s ∈ C, define the augmented series via
\[ E_k^v(\tau,s) = \varepsilon_N \sum_{\substack{(c,d)\equiv v \bmod N \\ \gcd(c,d)=1}} \frac{\operatorname{Im}(\tau)^s}{(c\tau+d)^k\,|c\tau+d|^{2s}}. \]

Note that whenever Re(k + 2s) > 2, this series converges absolutely, and because it converges uniformly on
compact sets, this makes the series analytic on that particular half-plane.
We’re going to rewrite this series using weight-k operators: we’ll generalize them to two parameters by letting

(f [γ]k )(τ, s) = j(γ, τ )−k f (γ(τ ), s),

where j(γ, τ) is still cτ + d. We showed earlier on that some functions f satisfy f[γ]_k = f, meaning that they are weakly modular of weight k. So if we let δ be an element of SL_2(Z) with bottom row (c_v, d_v), and we let P_+ be the positive part of the parabolic subgroup of SL_2(Z), meaning we consider the elements
\[ \begin{bmatrix} 1 & n \\ 0 & 1 \end{bmatrix}, \]
where n is an integer, we have the following result:

Proposition 304
The Eisenstein series can be rewritten as
\[ E_k^v(\tau,s) = \varepsilon_N \sum_{\gamma\in(P_+\cap\Gamma(N))\backslash\Gamma(N)\delta} \operatorname{Im}(\tau)^s\,[\gamma]_k \]

(where we sum over the orbits of P_+ ∩ Γ(N) inside Γ(N)δ).

Proof. Compare this expression to the original definition of E_k^v(τ, s). First, we can check that for any
\[ \gamma = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]
in SL_2(Z), we have
\[ \operatorname{Im}(\gamma(\tau)) = \frac{\operatorname{Im}(\tau)}{|c\tau+d|^2}, \]
so the Im(τ)^s and |cτ + d|^{2s} factors in the definition both come from Im(γ(τ))^s. Two matrices in Γ(N)δ will have the same bottom row if and only if they are in the same orbit of P_+ ∩ Γ(N), and Γ(N)δ only contains matrices with bottom row (c, d) ≡ v. This means that we know what the weight-k operator does:

\[ \operatorname{Im}(\tau)^s\,[\gamma]_k = (c\tau+d)^{-k}\operatorname{Im}(\gamma(\tau))^s = \frac{\operatorname{Im}(\tau)^s}{(c\tau+d)^k\,|c\tau+d|^{2s}}, \]

and plugging this back in and looking at what we sum over (each orbit corresponds to a specific vector (c, d)) yields the result.

This yields the following formula, which should look familiar:

Lemma 305
We have
\[ (E_k^v[\gamma]_k)(\tau,s) = E_k^{v\gamma}(\tau,s). \]

Proof. This is the same as the proof in Section 4.2: we just sum over the orbits in a different order and gain an extra factor.

In particular, for any γ ∈ Γ(N), it is true that (E_k^v[γ]_k)(τ, s) = E_k^v(τ, s) (because vγ = v). To proceed, we can look at the non-normalized Eisenstein series
\[ G_k^v(\tau,s) = \sum_{(c,d)\equiv v \bmod N} \frac{\operatorname{Im}(\tau)^s}{(c\tau+d)^k\,|c\tau+d|^{2s}}. \]

Remember that G and E are basically the same – there are some linear combinations – so if we can meromorphically
continue one, we can meromorphically continue the other. So we’ll only look at the case k = 0 and N = 1 today,
which is very nice, and we’ll see the general case next time.
So now we can look at the modified theta function
\[ \vartheta(\gamma) = \sum_{n\in\mathbb{Z}^2} e^{-\pi|n\gamma|^2} \]

where γ ∈ GL_2(R).

Lemma 306
For a function f ∈ L^1(R^2), a matrix γ ∈ SL_2(R), and r > 0, the Fourier transform of the function φ(x) = f(xγr) (for x ∈ R^2) is φ̂(x) = r^{−2} f̂(xγ^{−T}/r), where γ^{−T} denotes the inverse transpose of γ.

We’re skipping over this proof for now, since it’s just a fact about Fourier transforms. So now if we use Poisson summation and use the function f(x) = e^{−π|x|^2}, we find that
\[ r\sum_{n\in\mathbb{Z}^2} f(n\gamma r) = r^{-1}\sum_{n\in\mathbb{Z}^2} f(n\gamma^{-T}/r) \]
for any r > 0 and γ ∈ SL_2(R). So now if we let
\[ S = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}, \]
then Sγ^{−T} = γS, and |x| and |xS| are the same. Since f is a Gaussian function, this means f(x) = f(xS), and S is just a 90 degree rotation of our lattice, so we sum over the same thing. Thus we get the following result:

Corollary 307
Using the fact that ϑ(γ) = ∑_{n∈Z^2} e^{−π|nγ|^2}, we have the transformation law
\[ r\,\vartheta(\gamma r) = \frac{1}{r}\,\vartheta\!\left(\frac{\gamma}{r}\right). \]
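
The transformation law is easy to test numerically. The sketch below is ours, not from the lecture: it sums e^{−π|nγr|²} over a finite box of row vectors n ∈ Z², for one hand-picked matrix γ of determinant 1, and compares rϑ(γr) with ϑ(γ/r)/r. The matrix, the values of r, and the cutoff are illustrative assumptions.

```python
import math

def vartheta(gamma, scale, cutoff=40):
    """Sum of exp(-pi |n * gamma * scale|^2) over row vectors n in Z^2 with |n_i| <= cutoff."""
    (a, b), (c, d) = gamma
    total = 0.0
    for n1 in range(-cutoff, cutoff + 1):
        for n2 in range(-cutoff, cutoff + 1):
            # Row vector times matrix: (n1, n2) * [[a, b], [c, d]].
            x = (n1 * a + n2 * c) * scale
            y = (n1 * b + n2 * d) * scale
            total += math.exp(-math.pi * (x * x + y * y))
    return total

# A sample matrix with determinant 1.3 * (1.08 / 1.3) - 0.4 * 0.2 = 1.
gamma = ((1.3, 0.4), (0.2, 1.08 / 1.3))
for r in (0.7, 1.0, 1.6):
    print(r, r * vartheta(gamma, r), vartheta(gamma, 1 / r) / r)
```

For each r the last two printed numbers should agree to many digits.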

Now we look at something similar to Dhruv’s lecture: consider the Mellin transform of ϑ(γt^{1/2}) − 1,
\[ g(s,\gamma) = \int_0^\infty (\vartheta(\gamma t^{1/2}) - 1)\, t^s\,\frac{dt}{t} = \int_0^\infty {\sum_{n\in\mathbb{Z}^2}}' e^{-\pi|n\gamma|^2 t}\, t^s\,\frac{dt}{t} \]

(note that letting γ = iI makes this the same as the previous lecture), where subtracting the 1 in the first expression removes the point at the origin (which is why we have a ∑′). The transformation law tells us that ϑ(γt^{1/2}) must be asymptotically proportional to 1/t near t = 0, because our function goes to 1 at infinity. That means that our integral for g(s, γ) converges at t = 0 as long as Re(s) > 1. But the integral also converges for all s as t → ∞ (again, by exponential suppression). So we can move the sum outside the integral because of convergence, and changing variables with t replacing π|nγ|^2 t yields our gamma function again:
\[ g(s,\gamma) = {\sum_{n\in\mathbb{Z}^2}}' (\pi|n\gamma|^2)^{-s}\int_0^\infty e^{-t}\, t^s\,\frac{dt}{t} = \pi^{-s}\,\Gamma(s){\sum_{n\in\mathbb{Z}^2}}' |n\gamma|^{-2s}. \]

So now we can connect this to the function G_0(τ, s): for any τ = x + iy in the upper half-plane, define
\[ \gamma_\tau = \frac{1}{\sqrt{y}}\begin{bmatrix} y & x \\ 0 & 1 \end{bmatrix}, \]
which sends i to τ. Since |(c, d)γ_τ|^2 = |cτ + d|^2/y (just by computation), we find that
\[ g(s,\gamma_\tau) = \pi^{-s}\,\Gamma(s){\sum_{(c,d)\in\mathbb{Z}^2}}' \frac{y^s}{|c\tau+d|^{2s}}, \]
which is exactly the expression π^{−s}Γ(s)G_0(τ, s) for the augmented Eisenstein series when k = 0 and N = 1. A bit of algebra analogous to the last section (a change of variables), plus the transformation law, yields
\[ \int_0^1 (\vartheta(\gamma t^{1/2}) - 1)\, t^s\,\frac{dt}{t} = \int_1^\infty (\vartheta(\gamma t^{1/2}) - 1)\, t^{1-s}\,\frac{dt}{t} - \frac{1}{s} - \frac{1}{1-s}. \]

But notice that this integral converges quickly, with no problem points, which shows that g(s, γ) only has simple poles at s = 0 and 1, and we have g(s, γ) = g(1 − s, γ). So

\[ \pi^{-s}\,\Gamma(s)\,G_0(\tau,s) = \pi^{s-1}\,\Gamma(1-s)\,G_0(\tau,1-s), \]

and we’ve defined our weight-0 Eisenstein series (in a simple case) with a meromorphic continuation. This function is SL_2(Z)-invariant and of weight 0, so it is weakly modular.

24 May 12, 2020

Diamond and Shurman 4.10 continued – Shreyas Balaji


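We’ll start with some definitions from last lecture: we have
\[ \vartheta(\gamma) = \sum_{n\in\mathbb{Z}^2} e^{-\pi|n\gamma|^2}, \qquad \gamma\in GL_2(\mathbb{R}), \]
\[ \gamma_\tau = \frac{1}{\sqrt{y}}\begin{bmatrix} y & x \\ 0 & 1 \end{bmatrix} \in SL_2(\mathbb{R}), \qquad \tau = x + iy, \]
\[ G_k^v(\tau,s) = \sum_{(c,d)\equiv v \bmod N} \frac{y^s}{(c\tau+d)^k\,|c\tau+d|^{2s}}. \]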

Last time, we worked with these objects by considering the function ϑ(γt^{1/2}) − 1, and we considered its Mellin transform g(s, γ), which is equal to π^{−s}Γ(s)G_0(τ, s) when γ = γ_τ and which satisfies
\[ g(s,\gamma) = \int_1^\infty (\vartheta(\gamma t^{1/2}) - 1)\,(t^s + t^{1-s})\,\frac{dt}{t} - \frac{1}{s} - \frac{1}{1-s}, \]

which gives a meromorphic extension in s to the full complex plane. Our goal today is to extend this logic to higher weights k and levels N.
We’ll introduce some notation: let G be (Z/NZ)^2 (we’ll be treating these group elements as row vectors), let μ_N = e^{2πi/N}, and we’ll be considering arbitrary functions a : G → C instead of vectors (they don’t need to be homomorphisms). We’ll let ⟨·, ·⟩ denote the standard inner product, and we’ll also define
\[ S = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}. \]
Note that S^T = −S, which is nice for inner products which we’ll be working with throughout this lecture.

Definition 308
The Fourier transform of a function a : G → C is given by
\[ \hat a(v) = \frac{1}{N}\sum_{w\in G} a(w)\,\mu_N^{-\langle w,\, vS\rangle}. \]

Proposition 309
We can invert the Fourier transform to get
\[ a(u) = \frac{1}{N}\sum_{v\in G} \hat a(v)\,\mu_N^{\langle u,\, vS\rangle}. \]


Proof. There is some basis vector e_j such that μ_N^{⟨u, e_j S⟩} ≠ 1 as long as u is nonzero, and then we can “sum over the group” in two ways by the transformation v → v + e_j, showing that the sum ∑_{v∈G} μ_N^{⟨u, vS⟩} will be 0 for all u ≠ 0. (And when u = 0, that sum is N^2.) So we can plug in the definition for â(v) and swap the order of summation, which will yield the result.
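
The inversion formula can be checked directly on a small group. The following sketch is ours, not from the lecture: it represents a function a : G → C as a dictionary keyed by pairs (w_1, w_2) with entries in {0, ..., N − 1}, and it uses the fact that for row vectors vS = (v_2, −v_1), so ⟨w, vS⟩ = w_1v_2 − w_2v_1. The helper names and the sample function are hypothetical.

```python
import cmath
import math
from itertools import product

def symplectic_pairing(w, v):
    """<w, vS> for row vectors with S = [[0, -1], [1, 0]], so that vS = (v2, -v1)."""
    return w[0] * v[1] - w[1] * v[0]

def fourier(a, N):
    """a_hat(v) = (1/N) * sum_w a(w) * mu_N^{-<w, vS>}, returned as a dict on G."""
    G = list(product(range(N), repeat=2))
    mu = cmath.exp(2j * math.pi / N)
    return {v: sum(a[w] * mu ** (-symplectic_pairing(w, v)) for w in G) / N for v in G}

def inverse_fourier(a_hat, N):
    """a(u) = (1/N) * sum_v a_hat(v) * mu_N^{<u, vS>}, returned as a dict on G."""
    G = list(product(range(N), repeat=2))
    mu = cmath.exp(2j * math.pi / N)
    return {u: sum(a_hat[v] * mu ** symplectic_pairing(u, v) for v in G) / N for u in G}

N = 5
a = {w: w[0] + 2j * w[1] + w[0] * w[1] for w in product(range(N), repeat=2)}
recovered = inverse_fourier(fourier(a, N), N)
print(max(abs(recovered[w] - a[w]) for w in a))  # should be at rounding level
```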

Definition 310
The harmonic polynomial for a positive integer k, denoted h_k, is defined as

\[ h_k(c,d) = (-i)^k\,(c+id)^k \]

for (c, d) ∈ R^2.

Definition 311
Define the theta function ϑ_k^v : GL_2(R) → C to be
\[ \vartheta_k^v(\gamma) = \sum_{n\in\mathbb{Z}^2} h_k\bigl((v/N+n)\gamma\bigr)\, e^{-\pi|(v/N+n)\gamma|^2}. \]

For notation’s sake, let f(x) = e^{−π|x|^2} be the Gaussian function and f_k(x) = h_k(x)e^{−π|x|^2} be the Schwartz function.

There’s lots of useful facts we’ll need before doing some algebraic manipulation:
• The Fourier transform of the Schwartz function fk is (−i )k fk .

• Define φk (x) = fk (xγr ) for some r > 0: then the Fourier transform is (−i )k r −2 fk (xγ −T r −1 ).

• Expanding out the definition of h_k and defining z(x) = c + id for x = (c, d), we can write

h_k(x) = (z(xS))^k.

• This means that h_k(xS) = (−i)^k h_k(x), and f_k(xS) = (−i)^k f_k(x).

• Finally, Sγ −T = γS and S T = −S.


So now we have a lot of algebra to get through: we expand out rϑ_k^v(γr) from first definitions and write it in terms of the Schwartz function and then φ_k (by definition) to get
\[ r\,\vartheta_k^v(\gamma r) = r\sum_{n\in\mathbb{Z}^2} f_k\bigl((v/N+n)\gamma r\bigr) = r\sum_{n\in\mathbb{Z}^2} \varphi_k(v/N+n). \]

Recall the Poisson summation formula: applying it to our function (where we use the Fourier transform) yields
\[ = r\sum_{n\in\mathbb{Z}^2} \hat\varphi_k(n)\, e^{2\pi i\langle n,\, v/N\rangle} = (-i)^k\, r^{-1}\sum_{n\in\mathbb{Z}^2} f_k(n\gamma^{-T} r^{-1})\, e^{2\pi i\langle n,\, v/N\rangle}. \]

Substituting n → nS and swapping around a few terms yields
\[ = (-i)^k\, r^{-1}\sum_{n\in\mathbb{Z}^2} (-i)^k f_k(n\gamma r^{-1})\,\mu_N^{-\langle n,\, vS\rangle}, \]

and from here, we break up our sum by group elements in G:
\[ = (-1)^k\, r^{-1}\sum_{w\in G}\ \sum_{\substack{n\in\mathbb{Z}^2 \\ n\equiv w \bmod N}} f_k(n\gamma r^{-1})\,\mu_N^{-\langle w,\, vS\rangle}, \]

and now this is exactly the sum that we want in the definition of our theta function: some more algebra yields
\[ = (-1)^k\, r^{-1}\sum_{w\in G} \vartheta_k^w(\gamma N r^{-1})\,\mu_N^{-\langle w,\, vS\rangle}. \]

If we now look at ϑ_k^v as a function of v, then our last sum looks like N times the Fourier transform. Adding that factor back in yields
\[ r\,\vartheta_k^v(\gamma r) = (-1)^k N r^{-1}\,\hat\vartheta_k^v(\gamma N r^{-1}). \]

And since r is arbitrary, we send r → N^{1/2}r, and we get the final identity

\[ \boxed{\, r\,\vartheta_k^v(\gamma N^{1/2} r) = (-1)^k r^{-1}\,\hat\vartheta_k^v(\gamma N^{1/2} r^{-1}) \,}. \]

Proposition 312
Let a : G → C be a function. Then the sum-theta function
\[ \Theta_k^a(\gamma) = \sum_{v\in G} \bigl(a(v) + (-1)^k\hat a(-v)\bigr)\,\vartheta_k^v(\gamma N^{1/2}) \]

satisfies rΘ_k^a(γr) = r^{−1}Θ_k^a(γr^{−1}).

To show this, we’ll need a lemma that relates the Fourier transforms with swapping v → −v :

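Lemma 313
For any function a : G → C, we have
\[ \sum_{v\in G} a(v)\,\vartheta_k^v(\gamma) = \sum_{v\in G} \hat a(-v)\,\hat\vartheta_k^v(\gamma), \]
\[ \sum_{v\in G} \hat a(-v)\,\vartheta_k^v(\gamma) = \sum_{v\in G} a(v)\,\hat\vartheta_k^v(\gamma). \]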

Proof of lemma. To do this, we can prove the general statement
\[ \sum_{x\in G} a(x)\,b(x) = \sum_{y\in G} \hat a(-y)\,\hat b(y) \]

by writing out the Fourier transforms and swapping the order of summation. Then substituting in b = ϑ_k^v or ϑ̂_k^v yields the desired result.
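
The general statement ∑ a(x)b(x) = ∑ â(−y)b̂(y) can also be checked numerically on a small group. This is a self-contained sketch of our own: functions on G are dictionaries keyed by pairs in {0, ..., N − 1}², the pairing ⟨w, vS⟩ is written out as w_1v_2 − w_2v_1, and the two sample functions are arbitrary.

```python
import cmath
import math
from itertools import product

def transform(a, N):
    """a_hat(v) = (1/N) * sum_w a(w) * mu_N^{-(w1*v2 - w2*v1)} as a dict on G."""
    G = list(product(range(N), repeat=2))
    mu = cmath.exp(2j * math.pi / N)
    return {v: sum(a[w] * mu ** (-(w[0] * v[1] - w[1] * v[0])) for w in G) / N for v in G}

def pairing_identity_sides(a, b, N):
    """Return (sum_x a(x) b(x), sum_y a_hat(-y) b_hat(y))."""
    G = list(product(range(N), repeat=2))
    a_hat, b_hat = transform(a, N), transform(b, N)
    left = sum(a[x] * b[x] for x in G)
    right = sum(a_hat[((-y[0]) % N, (-y[1]) % N)] * b_hat[y] for y in G)
    return left, right

N = 4
a = {w: 1 + w[0] - 3j * w[1] for w in product(range(N), repeat=2)}
b = {w: (w[0] * w[1]) + 0.5j * w[0] for w in product(range(N), repeat=2)}
left, right = pairing_identity_sides(a, b, N)
print(left, right)
```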

Proof of Proposition 312. Write out the definition of sum-theta function, apply the boxed identity above, and then
use the lemma to move factors of −1 between the a functions.

This is a generalization of what was shown in the last lecture. Note that Θ_k^a(γr) rapidly converges to 0 as r → ∞, so the following Mellin transform of Θ_k^a, denoted
\[ g_k^a(s,\gamma) = \int_0^\infty \Theta_k^a(\gamma t^{1/2})\, t^s\,\frac{dt}{t}, \]

converges as t → 0 for all s because of the proposition, which essentially allows us to “swap” t^{1/2} with t^{−1/2}.
More specifically, the integral converges as t → 0, as well as t → ∞. So we can expand the sum, swap the sum and the integral, expand out the expression for the theta function, and then factor out all the terms that don’t depend on t, using the trick where we sum over all n ≡ v mod N instead of all n ∈ Z^2. The final result we find is that
\[ g_k^a(s,\gamma) = \pi^{-k/2-s}\, N^s\,\Gamma(k/2+s)\sum_{v\in G}\bigl(a(v) + (-1)^k\hat a(-v)\bigr){\sum_{n\equiv v \bmod N}}' h_k(n\gamma)\,|n\gamma|^{-k-2s}. \]

Now setting γ = γ_τ and n = (c, d) yields h_k(nγ_τ) = (cτ̄ + d)^k / y^{k/2} (directly from the definition of h_k), and we also have that |nγ_τ|^{−k−2s} = y^{k/2+s}/|cτ + d|^{k+2s}. Since (cτ̄ + d)^k/|cτ + d|^{2k} = (cτ + d)^{−k}, we can substitute this in and simplify to find that the Mellin transform of the sum-theta function (specializing to γ_τ) is

\[ g_k^a(s,\gamma_\tau) = \pi^{-k/2-s}\,\Gamma(k/2+s)\, N^s\, y^{k/2}\, G_k^a(\tau, s-k/2), \]

where G_k^a is the sum of Eisenstein series given by
\[ G_k^a(\tau,s) = \sum_{v\in G}\bigl(a(v) + (-1)^k\hat a(-v)\bigr)\, G_k^v(\tau,s). \]

And now we do the same thing with analytic continuation: applying our above proposition on the Mellin transform integral lets us relate values of Θ_k^a between the regions (0, 1] and [1, ∞), and the transformation yields
\[ g_k^a(s,\gamma) = \int_0^\infty \Theta_k^a(\gamma t^{1/2})\, t^s\,\frac{dt}{t} = \int_1^\infty \Theta_k^a(\gamma t^{1/2})\,(t^s + t^{1-s})\,\frac{dt}{t}, \]

which has symmetry under s → (1 − s). And this last integral is entire in s, so combining everything gives us the final theorem of the section:

Theorem 314
For any positive integer N, let G = (Z/NZ)^2. Construct G_k^a(τ, s) as before: then for any integer k and any point τ = x + iy ∈ H,
\[ (\pi/N)^{-s}\,\Gamma(|k|/2+s)\, G_k^a(\tau, s-k/2) \]

has a meromorphic continuation in s to all of C which is entire when k ≠ 0 and has simple poles at 0, 1 for k = 0.

(Specifically, we’ve proved the k > 0 case, and the others follow by a similar argument.)

Diamond and Shurman 4.11 – Anton Trygub


We’ll make this presentation as short as possible to leave some time for celebration at the end of the class. Basically,
we’ll introduce a new theta function and show a few important properties.

Lemma 315
Let d be a cubefree positive integer. The equation x^3 ≡ d mod p has 3 solutions if p ≡ 1 mod 3 and d is a nonzero cube mod p, 0 solutions if p ≡ 1 mod 3 and d is not a cube mod p, and 1 solution if p ≡ 2 mod 3 or p | 3d.

Proof. The result is clear for p = 3 or p | d (because 0^3, 1^3, 2^3 are all different mod 3, and x = 0 is the only solution in the latter case). Otherwise, let g be a primitive root mod p: writing d = g^k and x = g^y, the problem reduces to counting the solutions y of 3y ≡ k mod (p − 1), and the result follows.
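
The case analysis in the lemma is easy to test by brute force. Here is a short sketch of our own that counts the solutions of x³ ≡ d mod p for a few small primes and compares the count with the set of values the lemma allows; the list of primes and the sample values of d are arbitrary choices.

```python
def cube_root_count(d, p):
    """Number of x in {0, ..., p-1} with x^3 congruent to d mod p."""
    return sum(1 for x in range(p) if (pow(x, 3, p) - d) % p == 0)

def predicted_counts(d, p):
    """Set of solution counts that Lemma 315 allows for the prime p."""
    if p % 3 == 2 or (3 * d) % p == 0:
        return {1}
    return {0, 3}  # p = 1 mod 3 and p does not divide d: either 3 cube roots or none

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43]
for p in primes:
    print(p, cube_root_count(2, p), predicted_counts(2, p))
```

For example, 2 is a cube mod 31 (since 2^10 ≡ 1 mod 31), so the count there is 3, while 2 is not a cube mod 7, so the count there is 0.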

We’ll denote e(z) = e^{2πiz} and tr(x) = x + x^* throughout the next few results. We’ll also use the notation A = Z[μ_3], α = i√3, B = (1/α)A (note that A ⊂ B ⊂ (1/3)A as lattices). Some useful facts to remember are that if we write x = x_1 + x_2 μ_3 for integers x_1, x_2,

\[ |x|^2 = x_1^2 - x_1x_2 + x_2^2, \qquad |x+y|^2 = |x|^2 + |y|^2 + \operatorname{tr}(xy^*). \]

Definition 316
Let N be a positive integer, and let u ∈ (1/3)A/NA. Then we have the theta function
\[ \theta^u(\tau,N) = \sum_{n\in A} e\!\left(N\left|\frac{u}{N}+n\right|^2\tau\right) \]

for τ ∈ H.

Our first result tells us about this function when we move along the real axis:

Lemma 317
We have
\[ \theta^u(\tau+1,N) = e\!\left(\frac{|u|^2}{N}\right)\theta^u(\tau,N) \]
for any positive integer N and u ∈ B/NA.

Proof. We have that
\[ N\left|\frac{u}{N}+n\right|^2 \equiv \frac{|u|^2}{N} \mod \mathbb{Z} \]
by expanding out u = u_1 + u_2 μ_3 and n = n_1 + n_2 μ_3. Since e(a + b) = e(a)e(b) by definition, replacing τ with τ + 1 in the definition will give us an extra factor of e(N|u/N + n|^2), and this simplifies using our first observation because e is 1-periodic in the real direction.
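
Here is a small numerical sketch (ours, not from the lecture) of this translation law. It models A = Z[μ_3] by complex numbers n_1 + n_2μ_3, truncates the sum over A to a box, and tests two sample elements of B = (1/α)A; the cutoff, τ, N, and the choices of u are illustrative assumptions.

```python
import cmath
import math

MU3 = cmath.exp(2j * math.pi / 3)   # primitive cube root of unity
ALPHA = 1j * math.sqrt(3)           # alpha = i * sqrt(3)

def e(z):
    """e(z) = exp(2 pi i z)."""
    return cmath.exp(2j * math.pi * z)

def theta_u(u, tau, N, cutoff=12):
    """Truncated sum of e(N |u/N + n|^2 tau) over n = n1 + n2*mu_3 with |n1|, |n2| <= cutoff."""
    total = 0j
    for n1 in range(-cutoff, cutoff + 1):
        for n2 in range(-cutoff, cutoff + 1):
            n = n1 + n2 * MU3
            total += e(N * abs(u / N + n) ** 2 * tau)
    return total

tau, N = 0.15 + 0.8j, 2
for u in (1 / ALPHA, (2 + MU3) / ALPHA):   # two sample elements of B
    lhs = theta_u(u, tau + 1, N)
    rhs = e(abs(u) ** 2 / N) * theta_u(u, tau, N)
    print(abs(lhs - rhs))
```

Both printed differences should be at rounding level.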

The next result is a “scaling:”

Lemma 318
We have
\[ \theta^u(\tau,N) = \sum_{\substack{v\in B/(dNA) \\ v\equiv u \bmod NA}} \theta^v(d\tau, dN) \]

for any positive integer N, u ∈ B/NA, and positive integer d.

Proof. Let n = r + dm, and substitute into the definition:
\[ \theta^u(\tau,N) = \sum_{n\in A} e\!\left(N\left|\frac{u}{N}+n\right|^2\tau\right) = \sum_{r\in A/dA}\ \sum_{m\in A} e\!\left(dN\left|\frac{u+rN}{dN}+m\right|^2 d\tau\right) \]

where we’ve introduced a factor of 1/d inside the absolute value square but canceled this out with the two factors of d outside. But now the sum over m can be rewritten as a theta function as well (with v = u + rN), which yields what we want.

Finally, we describe how θ transforms under the “action” of S:

Lemma 319
We have
\[ \theta^u\!\left(-\frac{1}{\tau}, N\right) = \frac{-i\tau}{N\sqrt{3}}\sum_{v\in B/NA} e\!\left(-\frac{\operatorname{tr}(uv^*)}{N}\right)\theta^v(\tau,N) \]

for any positive integer N and u ∈ B/NA.

This result takes a bit more effort to prove, so we’ll leave it as an exercise.
